Chapter 30. Debugging Techniques for Stratego/XT

Table of Contents

30.1. Debugging Stratego
30.1.1. Writing readable code
30.1.2. Debugging Stratego code
30.1.3. Common Pitfalls
30.2. Debugging XT compositions
30.3. Debugging SDF definitions

Even in Stratego/XT, it is not entirely uncommon for developers to produce erroneous code from time to time. This chapter will walk you through the tools and techniques available for hunting down bugs, and tips for how to avoid them in the first place.

30.1. Debugging Stratego

Both the Stratego language paradigm and its syntax are rather different from most other languages. Knowing how to use the unique features of Stratego properly, in the way we have described in this manual, goes a long way towards avoiding future maintenance problems.

30.1.1. Writing readable code

One important practical aspect of using language constructs is expressing their syntax in a readable manner. The intention behind the code should be apparent to its readers. Judicious use of whitespace is vital in making Stratego code readable, partly because its language constructs are less block-oriented than those of most Algol derivatives.

The most basic unit of transformation in Stratego is the rewrite rule. The following suggests how rules should be written to maximize readability.

  EvalExpr:
  Expr(Plus(x), Plus(y)) -> Expr(z)
  where
    <addS> (x,y) => z

Rules are composed using combinators in strategies. One of the combinators is composition, written ;. It is important to realize that ; is not a statement terminator, as found in imperative languages. Therefore, we suggest writing a series of strategies and rules joined by composition as follows:

  
  eval-all =
      EvalExpr
    ; s1
    ; s2

Both rules and strategies should be documented, using xDoc. At the very least, the type of term expected and the type of term returned should be specified by the @type attribute. Also take care to specify the arity of lists and tuples, if this is fixed.

 /**
  * @type A -> B
  */
  foo = ...
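
As a hedged illustration of documenting a fixed arity, the sketch below annotates a strategy that expects a pair of strings. The name add-pair and the descriptive sentence are hypothetical; only the @type attribute and addS appear in this chapter.

 /**
  * Adds two integers given in string form.
  *
  * @type (String, String) -> String
  */
  add-pair = addS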

Inline rules are handy syntactic sugar that should be used with care. Mostly, inline rules are small enough to fit a single line. When they are significantly longer than one line, it is recommended to extract them into a separate, named rule.

  strat = 
    \ x -> y where is-okay(|x) => y \
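
When an inline rule outgrows a single line, one possible refactoring is sketched below. The names SimplifyPlus and simplify-all and the constructors Plus and Int are hypothetical and only serve the illustration; addS is the library strategy used earlier in this section.

  // Extracted from an inline rule that no longer fit on one line.
  SimplifyPlus:
  Plus(Int(x), Int(y)) -> Int(z)
  where
    <addS> (x, y) => z

  // The call site stays short and names the intent.
  simplify-all = map(try(SimplifyPlus))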

Formatting concrete syntax depends very much on the language being embedded, so we will provide no hard and fast rules for how to do this.

Formatting of large terms should be done in the style output by pp-aterm.

30.1.2. Debugging Stratego code

The Stratego/XT environment does not feature a proper debugger yet, so the best low-level debugging aids are provided by the library, in the form of two kinds of strategies, namely debug and a family based around log.

The debug strategy will print the current term to stderr. It will not alter anything. While hunting down a bug in your code, it is common to sprinkle debug calls liberally in areas of code which are suspect:

  foo = 
      debug
    ; bar
    ; debug
    ; baz

Sometimes, you need to add additional text to your output, or do additional formatting. In this case, an idiom with where and id is used:

  foo = 
     where(<debug> [ "Entered foo : ", <id> ])
     ; bar
     ; where(<debug> [ "After bar : ", <id> ])
     ; baz

The where prevents the current term from being altered by the construction of your debugging text, and id is used to retrieve the current term before the where clause. If, as in this example, you only need to prepend a string before the current term, you should rather use debug(s), as shown next.

  foo =
    debug(!"Entered foo : ")
    ; bar
    ; debug(!"After bar : ")
    ; baz

The use of debug is an effective, but very intrusive approach. A more disciplined regime has been built on top of the log(|severity, msg) and lognl(|severity, msg) strategies (see Chapter 25 for details on log and lognl). The higher-level strategies you should focus on are fatal-err-msg(|msg), err-msg(|msg), warn-msg(|msg) and notice-msg(|msg).

It is recommended that you insert calls to these strategies at places where your code detects potential and actual problems. During normal execution of your program, the output from the various -msg strategies is silenced. Provided you allow Stratego to deal with the I/O and command line options, as explained in Chapter 26, the user (or the developer doing debugging) can use the --verbose option to adjust which messages they want printed as part of program execution. This way, you get adjustable levels of tracing output, without having to change your code by inserting and removing calls to debug, and without having to recompile.
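
A minimal sketch of this regime follows. It assumes, as their use for tracing suggests but which is not verified here, that the -msg strategies succeed and leave the current term unchanged. The name eval-or-warn is hypothetical; EvalExpr is the rule from Section 30.1.1.

  // Try the rule; if it does not apply, report this and keep the term.
  eval-or-warn =
       EvalExpr
    <+ warn-msg(|"EvalExpr did not apply; term left unchanged")

Unlike a debug call, the warning can stay in the code permanently, since it is only printed when the verbosity level selected with --verbose includes warnings.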

30.1.3. Common Pitfalls

Some types of errors seem to be more common than others. Awareness of these will help you avoid them in your code.

Strategy and Rule Overloading.  The way Stratego invokes strategies and rules may be a bit unconventional to some people. We have already seen that the language allows overloading on name, i.e. you can have multiple strategies with the same name, and also multiple rules with the same name. You can even have rules and strategies which share a common name. When invoking a name, say s, all rules and strategies with that name will be considered. During execution the alternatives are tried in some order, until one succeeds. The language does not specify the order in which the alternatives will be tried.

 Eval:
 If(t, e1, e2) -> If(t, e1', e2')
 where
     <simplify> e1 => e1'
   ; <simplify> e2 => e2'
 
 Eval:
 If(False, e1, e2) -> e2

When Eval is called, execution may never end up in the second case, even though the current term is an If term whose condition subterm is just the False term.

If you want to control the order in which a set of rules should be tried, you must name each alternative rule differently, and place them behind a strategy that specifies the priority, e.g.:

  SimplifyIf:
  If(t, e1, e2) -> If(t, e1', e2')
  where
     <simplify> e1 => e1'
   ; <simplify> e2 => e2'

  EvalIfCond:
  If(False, e1, e2) -> e2
  
  Eval = EvalIfCond <+ SimplifyIf

Combinator Precedence.  The precedence of the composition operator (;) is higher than that of the choice operators (<+, +, >+). This means that the expression s1 <+ s2 ; s3 should be read as s1 <+ (s2 ; s3), and similarly for non-deterministic choice (+) and right choice (>+). See Section 15.3 for a more detailed treatment.
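
The effect of this precedence can be sketched with the rules from the overloading example. The names eval-a and eval-b are hypothetical.

  // Read as EvalIfCond <+ (SimplifyIf ; debug(...)):
  // the trace is printed only when SimplifyIf was the alternative applied.
  eval-a = EvalIfCond <+ SimplifyIf ; debug(!"simplified : ")

  // Explicit parentheses make the trace apply to either alternative.
  eval-b = (EvalIfCond <+ SimplifyIf) ; debug(!"evaluated : ")

When in doubt, adding the parentheses costs nothing and documents the intended grouping.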

Guarded Choice vs if-then-else.  The difference between if s1 then s2 else s3 end and s1 < s2 + s3 (guarded choice) is whether or not the result of s1 is passed on to the branches. For if-then-else, s2 (or s3) will be applied to the original term, that is, the effects of s1 on the current term are undone before proceeding to the branches. With the guarded choice, this does not happen for s2: when s1 succeeds, s2 is applied to the result of s1. Refer to Section 15.3.2 for details.
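
The sketch below contrasts the two constructs on the same condition. The names unwrap, with-if and with-guard are hypothetical, and the results in the comments follow from the description above rather than from a recorded session.

  // The condition rewrites Some(x) to x, and fails on any other term.
  unwrap = \ Some(x) -> x \

  // if-then-else: the chosen branch sees the original term.
  //   <with-if> Some(1)  would yield  (Some(1), "then")
  with-if =
    if unwrap then !(<id>, "then") else !(<id>, "else") end

  // Guarded choice: the then-branch sees the result of unwrap.
  //   <with-guard> Some(1)  would yield  (1, "then")
  with-guard =
    unwrap < !(<id>, "then") + !(<id>, "else")

In both variants the else-branch is applied to the original term, since unwrap has failed and therefore produced no result.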

Variable Scoping.  Stratego enforces a functional style, with scoped variables. Once a variable has been initialized to a value inside a given scope, it cannot be changed. Variables are immutable. Any attempt at changing the value inside this scope will result in a failure. This is generally a Good Thing, but may at times be the cause of subtle coding errors. Consider the code below:

stratego> <map(\ x -> y where !x => y \)> [1]
[1]
stratego> <map(\ x -> y where !x => y \)> [1,1,1,1]
[1,1,1,1]
stratego> <map(\ x -> y where !x => y \)> [1,2,3,4]
command failed

Apparently, the map expression works for a singleton list and for a list whose elements are all equal, but not for a list with four different elements. Why? Let us break this conundrum into pieces and attack it piece by piece.

First, the inline rule \ x -> y where !x => y \ will be applied to each element in the list, by map. For each element, it will bind x to the element, then build x and assign the result to y. Thus, for each element in the list, we will assign this element to y. This explains why it works for lists with only one element; we never reassign to y. But why does it work for lists of four equal elements? Because the rule about immutability is not violated: we do not change the value of y by reassigning the same value to it, so Stratego allows us to do this.

But why does the binding of y survive from one element to the next? We clearly stated that we want a local rule here. The gotcha is that Stratego separates control of scopes from the local rules. A separate scoping construct, {y: s}, must be used to control the scoping of variables. If no scoping has been specified, the scope of a variable will be that of its enclosing named strategy. Thus, the code above must be written:

stratego> <map({y: \ x -> y where !x => y \})> [1,2,3,4]
[1,2,3,4]

It may be a bit surprising that this works. We have not said anything about x, so logically, we should not be able to change this variable either. The difference between x and y is that x is a pattern variable. Its lifetime is restricted to the local rule. At first glance, this may seem a bit arbitrary, but after you code a bit of Stratego, it will quickly feel natural.
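
An alternative to the explicit scope is to extract the inline rule into a named rule. The sketch below assumes that each application of a named rule gets fresh bindings for all of its variables, which follows from the scoping rule stated above. The names Copy and copy-all are hypothetical.

  // Both x and y are local to each application of Copy.
  Copy:
  x -> y
  where
    !x => y

  // Expected to succeed on [1,2,3,4], like the {y: ...} version.
  copy-all = map(Copy)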

30.2. Debugging XT compositions

The XT component model is based on Unix pipes. Debugging XT compositions can therefore be done using many of the familiar Unix command line tools.

Checking XTC registrations.  Whenever you call XTC components using xtc-transform, the location of the component you are calling is looked up in a component registry. When invoking a component fails, it may be because the component you are calling has been removed. Checking the registrations inside a component registry is done using the xtc command:

# xtc -r /usr/local/apps/dryad/share/dryad/XTC q -a
dryad (0.1pre11840) : /usr/local/apps/dryad/dryad
dryad.m4 (0.1pre11840) : /usr/local/apps/dryad/share/dryad/dryad.m4
...

The -r option is used to specify which registry you want to inspect. The path given to -r must be the XTC registry file of your installed program transformation system that you built with Stratego/XT. By default, xtc will work on the Stratego/XT XTC repository, and only list the components provided by Stratego/XT. This is seldom what you want.

XTC registries are hierarchical. The XTC repository of your project imports (refers back to) the other projects you used in your build process, such as Stratego/XT itself. The component list you get from xtc when giving it your repository is therefore a full closure of all components visible to transformations in your project.

Now that you know how to obtain the paths for all XT components, it is easy to determine that they actually exist at the locations recorded, and that the access rights are correct.

Programs such as strace may also be useful at the lowest level of debugging, to see which parameters are passed between components, whether a given component is located correctly, and whether execution of a given component succeeds.

Format Checking.  Each component in a system built with Stratego/XT accepts a term, definable by some grammar, and outputs another term, also definable by a (possibly the same) grammar. During debugging of XT compositions, it is useful to check that the data flowing between the various components actually conform to the defined grammars. The grammar in question may not have been defined yet, but you are highly encouraged to define it. See Chapter 8 for how to define regular tree grammars.

Once you have a formal declaration of your data, in the form of a regular tree grammar, you can insert calls to format-check between your XT components to verify data correctness, i.e. the correctness of the terms.

  ast2il = 
      xtc-transform(!"format-check", !["--rtg", "language-ast.rtg"])
    ; xtc-transform(!"ast2il")
    ; xtc-transform(!"format-check", !["--rtg", "language-il.rtg"])

The ast2il component transforms from the abstract syntax tree representation of a given language to an intermediate language (IL). format-check is used to verify that the AST passed to ast2il is well-formed, and that the result obtained from ast2il is also well-formed.

Tool Debugging Options.  Most of the XT tools accept a common set of options useful when debugging. These include --keep, for adjusting the amount of intermediate results you want to keep as separate files on disk after transformation, --verbose for adjusting the level of debugging information printed by the tool, and --statistics for displaying runtime statistics.

30.3. Debugging SDF definitions

The SDF toolkit comes with some very useful debugging aids. The first is the sdfchecker command, which will analyze your SDF definition and offer a list of issues it finds. You do not need to invoke sdfchecker directly. It is invoked by sdf2table by default, whenever you generate a parse table from a syntax definition. Be advised that the issues pointed to by sdfchecker are not always errors. Nonetheless, it is usually prudent to fix them.

The other SDF debugging tool is the visamb command. visamb is used to display ambiguities in parse trees. Its usage is detailed in the command reference (visamb).

Pitfalls with Concrete Syntax.  Doing transformations with concrete syntax in Stratego, as explained in Chapter 19, depends on the correct placement of .meta files. When creating, splitting, moving or removing Stratego source files (.str files), it is important that you bring along the accompanying .meta files.

Another thing to be aware of with concrete syntax is the presence of reserved meta variables. Typically, x, xs, e, t and f have a reserved meaning inside the concrete syntax fragments as being meta variables, i.e. variables in the Stratego language, not in the object language.

A final stumbling block is the general problem of ambiguities in the syntax definition. While SDF allows you to write ambiguous grammars, and sglr accepts these gracefully, you are not allowed to have ambiguous syntax fragments in your Stratego code. In cases where the Stratego compiler (strc) fails due to ambiguous fragments, you can run parse-stratego on your source code to see exactly which parts are ambiguous. The visamb tool should then be applied to the output from parse-stratego to visualize the ambiguities.