# Automata and Graph Theory for Adaptive Programming

The theory of adaptive programming is pretty well understood as far as traversal specifications and their semantics and compilation algorithms are concerned.

One overall style rule in the theory of adaptive programming is to rely on "standard" results from automata, formal language and graph theory instead of inventing everything from scratch. In other words, we use a reduction approach by reducing AP problems to "standard" problems such that a solution to the "standard" problem yields a solution to the AP problem. This is an effective way of theory reuse widely practiced in mathematics and theoretical computer science.

What are "standard" results on which the theory of AP rests?

## Traversing a graph guided by a regular expression

```REGULAR PATH
Instance:
Regular expression R and graph G
Question:
Is there a directed path p in G satisfying R,
where the concatenation of edge labels comprising p
is in the language denoted by R.
```

REGULAR PATH is in P. See:

```@ARTICLE{mendelzon:siam,
AUTHOR = "Alberto Mendelzon and Peter Wood",
TITLE = "Finding regular simple paths in graph databases",
JOURNAL = "SIAM Journal on Computing",
YEAR = 1995,
PAGES = "1235-1258",
MONTH = "",
VOLUME = 24,
NUMBER = 6
}
@ARTICLE{hunt-rosen-szym:equiv76,
AUTHOR = "H.B. Hunt and D.J. Rosenkrantz and T.G. Szymanski",
TITLE = "On the equivalence, containment, and covering problems for the regular
and context-free languages",
JOURNAL = "Journal of Computer and System Sciences",
YEAR = 1976,
PAGES = "222-268",
VOLUME = 12
}
```

There are two ways to prove the theorem:

Proof 1: Rely on the equivalence between directed, labeled graphs and NFAs. A graph G with nodes x and y can be viewed as an NFA with initial state x and final state y. Construct the intersection graph I of G and an NFA M accepting L(R). There is a path from x to y satisfying R iff there is a path in I from (x,s0) to (y,sf), for s0 the start state of M and some final state sf in M. All this is polynomial [hunt-rosen-szym:equiv76].

Proof 2: Tarjan provides a polynomial algorithm for constructing a regular expression Rxy which represents the set of all paths between two nodes x and y in a given graph. The alphabet used here is the set of edge labels in the graph. Note that the set of paths may be infinite or exponential in the size of the graph yet it can be described succinctly by a regular expression. There is a path from x to y satisfying R iff the intersection of L(R) and L(Rxy) is nonempty.

Why is REGULAR PATH important to AP? A class graph corresponds to a graph G and a strategy graph corresponds to a regular expression R. (Question: Does this work only if the constraint map predicate is not too complex?) To compile an adaptive program we need to find all paths in G from some node x to some node y satisfying R.

Another application of REGULAR PATH is at the object graph level: Given an object graph OG and a regular expression R representing a strategy graph, we need to determine whether we can traverse from the root of OG to a given node in OG following R.

## Intersection of NFAs

Intersection of non-deterministic finite automata plays a key role in both polynomial algorithms for solving the REGULAR PATH problem. Therefore, NFA intersection is also fundamental to AP.

We denote by PathSet[X](Y) the set of traversals of graph X which are allowed by automaton Y. Given a strategy graph SG and a class graph CG, the PathSet[CG](SG) is represented essentially by the intersection of the automata for CG and SG. We call this automaton "traversal graph". Given a traversal graph TG and an object graph OG, the PathSet[OG](TG) is represented essentially by the intersection of the OG and TG. "essentially" is used here because class graphs and traversal graphs may contain abstract classes and subclass edges which require special treatment.

It is useful to mention that FINITE STATE AUTOMATA INTERSECTION is PSPACE-complete:

```FINITE STATE AUTOMATA INTERSECTION
Instance: Sequence A1, ... ,An of DFAs.
Question: Is there a string accepted by each automaton Ai for i = 1 ... n.
```

Solvable in polynomial time for fixed n.

## Viewing a context-free grammar as a regular set

A source of confusion: There seems to be an apparent contradiction between the following two statements: 1. A class dictionary defines a context-free language. 2. A class dictionary can be viewed as an automaton defining a regular set. This is not a contradiction because in case 1. the language is defined by viewing the class dictionary as a set of rewrite rules while in the second case it is defined in terms of paths through the class graph. In the 2. case, the language consists of paths which express navigation through object graphs which conform to the context-free structure of the class dictionary.

```Example: (by Doug Orleans)
>The class graph
>
>    S = A [S] B.
>    A = .
>    B = .
>
>This accepts object graphs of the form
>
>               S
>              /|\
>             A S B
>              /|\
>             A S B
>               :
>               :
>               S
>              / \
>             A   B
>
>whose prefix depth-first-traversal visits {SA}^n B^n, which violates
>the pumping lemma. I.e., the language us context-sensitive.

The NFA corresponding to

>    S = A [S] B.
>    A = .
>    B = .

is

q0 S q1
q1 A q2
q1 S q1
q1 B q3

If the source state is q0 and the target state q2,
the regular expression is S*A.

If the strategy is S any* A, the intersection automaton is:

q0 S q1
q1 A q2
q1 S q1
```

## Equivalence of traversal strategies

It is important that we can check inequivalence of two strategies efficiently. Since strategies can be expressed as NFAs (non-deterministic finite state automata), the following results seem to make it expensive to compare strategies:
```(from Gary and Johnson)
FINITE STATE AUTOMATON INEQUIVALENCE
Instance: Two non-deterministic finite state automata A1 and A2
having the same input alphabet Sigma.
Question: Do A1 and A2 recognize different languages?

REGULAR EXPRESSION INEQUIVALENCE
Instance: Regular expressions E1 and E2.
Question: Do E1 and E2 represent different languages?

PSPACE-complete even if the alphabet contains two elements and A2 (E2)
is the trivial automaton (regular expression) recognizing Sigma*.
```
But checking of inequivalence of strategies is polynomial if the constraint maps are not too complicated (say expressed with bypassing and only-through as in Demeter/Java). Strategies are an important abstraction of NFAs (or regular expressions) which have a tractable inequivalence problem. If we would use NFAs or regular expressions to express traversals, the situation would be more complex. One informal reason why strategies are more tractable than regular expressions is that they clearly separate positive information (strategy graph structure) from negative information (constraint maps).

This page is based on: Paper on Traversal Strategies and personal communications with Yannis Smaragdakis and Alberto Mendelzon.

Karl Lieberherr, College of Computer and Information Science, Northeastern University
demeter@ccs.neu.edu
Fall 2009.