Traversal Strategies

The viewgraphs for this material are in:

http://www.ccs.neu.edu/research/demeter/course/f97/lectures/powerpoint/lec7.pptTraversal strategies, for short, strategies, are a layer of abstraction above the class graph layer. In adaptive programming we use three kinds of graphs to reason about programs: we have object graphs and class graphs, familiar from object-oriented programming, and a third layer called the strategy graph layer unique to adaptive programming.

Strategy graphs are often small, much smaller than the class graphs. The purpose of strategy graphs is to define traversals of objects through a set of paths in a class graph. A strategy graph may be viewed as an independent entity which defines vaguely how objects should be traversed. The description is vague since the class graph is left open. However, as soon as a class graph is provided, a strategy defines a very precise set of paths.

Strategies are usually developed in the context of a class graph but they are written in such a way that they will work for many other class graphs as well.

Strategy graphs are a useful concept similar to the concept of a grammar or a finite state machine. Strategy graphs are helpful to define program fragments which collaborate with other program fragments. A grammar defines a parser and a strategy defines a "traverser" in combination with a class graph and an object graph. Strategies are a little more complex than a grammar since they are labeled graphs whose labels refer to two other graphs. A grammar is simpler since it can be viewed as a graph whose labels are tokens.

In most cases, a strategy in connection with a class graph can be understood in terms of a subgraph of the class graph. The subgraph describes a group of collaborating classes. You think of this subgraph also as a traversal scope which in turn determines how objects are going to be traversed. The strategy is the blueprint of a trip which a a visitor will perform. The blueprint gives only the milestones of the trip and forbids to go through certain hazardous areas. The blue print is then implemented in a detailed map which will tell the visitor very precisely how to travel. In some cases, the blue print must be viewed as a set of paths instead of as a subgraph.

Indeed, the visitor metaphor used above is matching reality. Traversals only traverse objects and we need a mechanism to specify what needs to be done on top of the traversal. Therefore, each traversal gets one or more visitors sent along. But the purpose of this chapter is just to focus on strategies.

- Line graph notation
It allows to express graphs which are a line. An example is:

S1 = from BookKeeping via Taxes via Business to LineItem

Note: naming of strategies not yet supported.If it is not necessary to name a strategy, we introduce it directly in class BookKeeping:

BookKeeping { ... via Taxes via Business to LineItem ... }

In this form, the

*"from BookKeeping"*is not needed. Between the milestones it is also possible to bypass "hazardous areas". An example is:BookKeeping { ... via Taxes via Business bypassing HomeOffice to LineItem ... }

- Strategy graph notation
This notation is more general than the line graph notation and it allows to express any strategy graph.

The bookkeeping example would be expressed as:

{BookKeeping -> Taxes Taxes -> Business bypassing HomeOffice Business -> LineItem } ...

The line graph notation is intuitive and self-explaining. It does not require us to repeat node names to express a line graph, for example. On the other hand, the strategy graph notation is more general. Therefore, we use both notations. The strategy graph notation we use to explain the "details". When we introduce a feature of the line graph notation we show the translation into the strategy graph notation to explain the detailed meaning.

We start with very simple strategies and gradually introduce more complex ones.

Single edge strategies are the most fundamental strategies and they are powerful enough to express any group of collaborating classes which you need for a specific class graph. So why don't we express all strategies as single edge strategies? It turns out that some groups of collaborating classes can be specified much more succinctly as strategies which are not single edge.

Single edge strategies are the building blocks of general strategies. Once you understand single edge strategies you have mastered strategies pretty well although multi-edge strategies can lead in rare cases to complications which Demeter will handle automatically for you. The explanation here is for the common case, however Demeter also handles the uncommon case. Indeed, Demeter will always correctly compile your strategies into Java code and you will not have to worry about any exceptions. You should, however, be aware that strategies are not as easy to implement for multi-edge strategies as the following description will make you believe. For single-edge strategies the following description leads to a correct implementation and you will not have to worry about any lurking pitfalls. We call such strategies ``no-pitfall'' strategies. The pitfalls show up only in certain multi-edge strategies.

The simplest line strategy is of the form
*"from Container to Weight"*
in line graph notation and of the form
`{ Container -> Weight }`

in strategy graph notation. The meaning is to find all Weight-objects contained in a Container-object. The group of collaborating classes consists of all classes ``between'' Container and Weight. How do you find those classes given a class graph?

The procedure is quite simple: To find the collaborating classes for ``from A to B" in a class graph C:

- Reverse all inheritance edges and call them subclass edges.
- Flatten all inheritance by expanding all common parts to concrete subclasses.
- Find all classes reachable from A and color them red including the edges traversed.
- Find all classes from which B is reachable and color them blue including the edges traversed.
- The group of collaborating classes is the set of classes and edges colored both red and blue.

The graph of collaborating classes we get this way is called the propagation graph for ``from A to B'' and C. The propagation graph is used as control agent for traversing A-objects. When we traverse an A-object and we get to an X-object, the edges outgoing from class X in the propagation graph will determine which objects we visit from the X-objects. If class X has parts p and q in the propagation graph, then we can only visit the p and q parts of the X-object.

Computing propagation graphs for ``from A to B" is the fundamental computation, called the From-To computation for, implementing strategies. Indeed, general strategies are also implemented using this fundamental From-To computation.

You might get confused from all this talking about From-To computations. You would like to see the group of collaborating classes for your class graph and a ``from A to B" strategy. To allow you to get visual feedback for your strategies, the GUI of Demeter/Java, called AP Studio, allows you to draw your class graph and to point and click in this graph to define your strategy. The preview feature of AP Studio will highlight the selected group of collaborating classes. AP Studio is described in this manual on page ??.

Single edge strategies may contain a set of classes which must be bypassed as you go from A to B. You think of the bypassed classes to be taken out of the class graph (together with the edges incident with those classes) before the From-To computation is applied.

An example of a single edge strategy with bypassing in line graph notation is:

S2 = from BusRoute bypassing Bus to Person

If it is not necessary to name the strategy, we introduce it directly in class BusRoute:

BusRoute { ... bypassing Bus to Person ... }

In the strategy graph notation the same strategy is written as:

{ BusRoute -> Person bypassing Bus }

In the general case, bypassing may contain a set of classes, using the curly brace syntax

bypassing {X, Y, Z}

We need to address the meaning of bypassing when source or target are bypassed.

from A bypassing A to Bmeans to delete all edges ingoing into A from the class graph. (currently a different implementation.)

from A bypassing B to Bmeans to delete all edges outgoing from B from the class graph. This equivalent to from A to-stop B. (currently a different implementation.)

from A bypassing A to Ameans to delete all edges outgoing from and ingoing into A from the class graph. We have to stay at A.

We leave now the territory of single edge strategies. The star graph strategies are multi target strategies and are, like the single edge strategies, no-pitfall strategies. Sometimes you need to visit objects of a set of classes instead of only one. For example, for a Company-object, you need to find all Customer and all SalesAgent objects. In the line graph notation you write

S3 = from Company bypassing {...} to {Customer, SalesAgent}

or, if there is no need to name the strategy

Company { ... bypassing {...} to {Customer, SalesAgent} ... }

In the strategy graph notation, we write:

{Company -> Customer bypassing {...} Company -> SalesAgent bypassing {...} }or

{Company -> {Customer,SalesAgent} bypassing {...} }

We need to explain the approximate meaning of multi-edge strategies. For most cases the following union technique gives the correct result: Consider the decomposition of the strategy graph into its individual edges and compute the propagation graph for each edge with respect to the given class graph. This process has been explained in the section on single edge strategies. To get the propagation graph for the multi-edge strategy we take the union of the propagation graphs of the single edge strategies. In other words, we merge all the propagation graphs together. The union graph is called the propagation graph for the multi-edge strategy and is used to control traversals like in the single edge case.

The meaning is approximate since information may be lost when the propagation paths are merged. Demeter will always generate the correct code for your traversal strategies.

A basic join strategy joins two single edge strategies together. In line graph notation we write:

S4 = from Company bypassing {...} through Customer to Address

or

Company { ... bypassing {...} through Customer to Address ... }

*through* may be replaced by *via*.

In strategy graph notation:

{Company -> Customer bypassing {...} Customer -> Address }

A join strategy may have multiple join points. An example in line graph notation is:

S5 = from Company through {Secretary, Manager} to Salary

This means that we are interested in the salaries of secretaries or managers. In strategy graph notation:

{ Company -> Secretary, Company -> Manager, Secretary -> Salary, Manager -> Salary }

So far we expressed strategies by only referring to classes in the class graph. We call such strategies class-only strategies. Class-only strategies have the advantage that they do not reveal details about the parts of the classes. Class-only strategies should be used whenever possible.

When edge-controlled strategies are absolutely needed, we can refer to edges as follows

-> A,b,B construction edge from A to B labeled b => A,B subclass edge from A to B

A set of edges is put into curly braces and they are separated by commas:

{-> A,b,B , -> X,y,Y}

A situation where an edge-controlled strategy is needed is illustrated by the following class graph:

A = <b1> B <b2> B <b3> B. B = C. C = .

We want the propagation graph

A = <b1> B. B = C. C = .

We can use either a positive or a negative edge-controlled strategy. The positive edge-controlled strategy would be in line graph notation (currently not implemented)

from A through -> A,b1,B to C

or in strategy graph notation

{ A -> A A -> B only-through -> A,b1,B B -> C }

only-through means that we are allowed to go only through the edge indicated. Only-through is a negative constraint since it bypasses all the other edges. Only-through is the complement of bypassing.

The negative edge-controlled strategy is in strategy graph notation

{A -> C bypassing {-> A,b2,B , -> A,b3,B}}

Only-through may also be used with nodes.

{A -> C only-through Bdeletes from the class graph all edges not incident with B.

Suppose we want to find inside an A-object all objects except the ones reachable through B-objects. This can be expressed by the strategy

from A to * bypassing B

or in strategy graph notation

{A -> * bypassing B}

* is the wildcard symbol. It may be used both for class names as well as for edge names. The wildcard symbol adds additional expressiveness to the notation since we can talk about class names and edge names without knowing them.

from A to *defines a traversal which visits all sub objects contained in an A-object.

Demeter/Java provides a universal traversal, called universal_trv0 (name may change in future), which does a depth-first traversal of entire objects. You may reuse the universal traversal to avoid generating unnecessary traversal code.

Recursion in the class graph may lead to undesired loops in the propagation graph. For example, if we want to find the top-level companies in a conglomerate of companies, the strategy

from Conglomerate to Company

will find all Company-objects. To get only the top-level company objects, we use instead

from Conglomerate to-stop Companywhich is equivalent to

from Conglomerate bypassing {-> Company,*,* , => Company,* } to Company

In strategy graph notation

{Conglomerate -> Company bypassing {-> Company,*,* , => Company,*} }

The meaning of to-stop is that all edges outgoing from the targets are bypassed.

A strategy defines a path set in the class graph. For example,

{A -> B B -> C}defines the set of all paths from A through B to C. Such paths may be of the form

{A->B bypassing {A,B,C} B->C bypassing {A,B,C}}which excludes any intermediate A's and B's and C's. Notice that the source A and B and targets B and C are, of course, not excluded.

We call such strategies WYSIWYG (what you see is what you get) strategies. In a WYSIWYG strategy all class graph nodes which appear as labels of strategy graph nodes are bypassed. We use the term WYSIWYG since those strategies eliminate surprise paths and they tell explicitly how the paths will look like. If a WYSIWYG strategy defines a propagation graph with a cycle then the strategy itself must contain a cycle. Indeed, we get what we see in the strategy.

In class graphs with cycles, we have to be careful with surprise paths.

Consider the following class graph:

Person = <sisters> PersonList <brothers> PersonList Status. Status : Single | Married. Single = . Married = <marriedTo> Person. PersonList ~ {Person}.

To compute the spouse of a person, we use

Person { Person get_spouse() bypassing Person through Married bypassing Person to Person // through Married to-stop Person {return_val = this;} // return_val initialized with null }

Notice that the *bypassing Person* is needed otherwise we would get also the brothers and sisters of the spouse as well as the original person we started with.

Person = Brothers Sisters Status. Status : Single | Married. Single = . Married = <marriedTo> Person. Brothers ~ {Person}. Sisters ~ {Person}.

In-laws are relatives by marriage. To print the brothers and sisters in-law of a person we use either of the following two the strategies: (first one not implemented yet)

{ Person -> Married bypassing Person Married -> spouse:Person bypassing Person spouse:Person -> Brothers bypassing Person spouse:Person -> Sisters bypassing Person Brothers -> brothers_in_law:Person bypassing Person Sisters -> sisters_in_law:Person bypassing Person } from Person bypassing Person through Married bypassing Person through Person bypassing Person through {Brothers, Sisters} bypassing Person to Person

This in-laws example shows an important feature of strategies (not yet implemented in Demeter/Java): we can use strategy graphs to define traversal states. So far we have ignored this possibility. *spouse:Person* means that if during the traversal we arrive here, the person is a spouse.

An example of a general strategy graph is:

{ A -> B //negative constraint 1 B -> D //negative constraint 2 A -> C //negative constraint 3 C -> D //negative constraint 4 }

A negative constraint is either of the bypassing or the only-through form.

{A -> B bypassing -> *,y,*}

means to go from A to B but not through a construction edge labeled y.

{A -> B only-through -> A,b2,B}

means to go from A to B but the only edge we are allowed to use is the construction edge from A to B labeled b2. All other edges from A to B are implicitly bypassed. Only-through is the complement of bypassing.

Either write your class graphs so that they don't contain self-loops (a construction edge from a class X back to class X) or avoid using strategies to traverse through self-loops. The reason is that self-loops create an infinite loop unless you control the traversal appropriately with wrappers. A self-loop is very easy to traverse manually.

To be written. See viewgraphs for now.