Transforming Java with Stratego

Martin Bravenboer

Delft University of Technology

Table of Contents

1. Getting started with Java-Front
1.1. Basics
1.2. Example: Add Blocks
1.2.1. Getting Used to Stratego
1.2.2. The Real Job
1.2.3. Source to Source Program
1.2.4. Conclusion
1.3. Example: Java Generation with Concrete Syntax
1.3.1. Basic Concrete Syntax Skills
1.3.2. The Structure is Out There
1.3.3. Using Anti-Quotation
1.3.4. More Variability
1.3.5. Using Meta Variables
2. Getting started with Dryad
2.1. Linking with the Dryad Library
2.1.1. Compilation in Automake Package
2.1.2. Standalone Compilation at Command-line
2.1.3. Explanation
2.2. Dryad on Mac OS X
2.2.1. Installing Dryad using Nix
2.2.2. Dryad on Mac OS X 10.3

Chapter 1. Getting started with Java-Front

JavaFront is a package that adds support for transforming Java programs to Stratego/XT. The main things you need to know if you want to use JavaFront for Stratego/XT are:

  • Stratego

  • some knowledge of SDF (Syntax Definition Formalism) is useful

1.1. Basics

A basic Java to Java transformation will be a pipeline:

$ parse-java -i Foo.java | ./your-transformation | pp-java

Your transformation operates on the abstract syntax of Java, represented in the ATerm format. The pp-aterm tool (part of Stratego/XT) can be used to inspect this representation:

$ parse-java -i Foo.java | pp-aterm

The parse-java tool will parse the input with a parser for Java version 1.5 (aka J2SE 5.0).

1.2. Example: Add Blocks

Let's have a look at a real transformation on Java implemented in Stratego and using JavaFront. This example will show how to implement a basic transformation in Stratego with concrete or abstract syntax.

The following program does not use blocks in the if and else branch of the if-else construct. I'm rather fundamentalistic about using blocks in these constructs, so I'm going to implement a transformation that adds blocks at the places where they belong.

public class Foo
{
  public static void main(String[] ps)
  {
    if(ps.length == 0)
      System.err.println("No arguments");
    else
      System.err.println(ps.length + " arguments");
  }
}
    

1.2.1. Getting Used to Stratego

Before we really start with the interesting stuff, let's make sure that we can compile a transformation tool that does nothing at all. First, this tool reads input from stdin or a file specified with the -i option, next it does nothing with the abstract syntax tree, and last it writes the program to stdout or a file specified with -o. The Stratego library contains a strategy that does all this: io-wrap. It takes a strategy argument that will be applied to the term that has been read from the input. The 'do nothing' strategy in Stratego is called id, so we provide this strategy for now. The resulting module is:

module add-block
imports libstratego-lib
strategies
  main =
    io-wrap(id)
    

Save this module in a file add-block.str and compile it with the Stratego compiler, linking with the standard library:

  $ strc -i add-block.str $(strcflags stratego-lib)

The result is an executable file add-block. We can use this executable to set up our first pipeline:

  $ parse-java -i Foo.java | ./add-block | pp-java

This pipeline first parses the Java file Foo.java to an abstract syntax tree, then it applies our add-block tool (which does nothing) and last it pretty-prints the abstract syntax tree to ordinary Java syntax.

1.2.2. The Real Job

Now it's about time to do something useful in our transformation tool. We need to implement a rewrite rule that wraps the statement of an if-then construct in a block, but only if it is not itself a block. Of course we also have to handle the if-then-else construct, but that is more of the same.

First we need to know how the if is represented in abstract syntax. We don't want to dive into the syntax definition yet, so let's just parse a simple Java class:

class Foo
{
  static
  {
    if(x) 
      foo();

    if(x)
    {
      foo();
    }
  }
}    

You can get a nice, structured view of the abstract syntax tree by passing the output of parse-java to the pp-aterm tool:

$ parse-java -i Foo.java | pp-aterm

This reveals that the body of the static initializer is represented as:

[ If(
    ExprName(Id("x"))
  , ExprStm(Invoke(Method(MethodName(Id("foo"))), []))
  )
, If(
    ExprName(Id("x"))
  , Block(
      [ExprStm(Invoke(Method(MethodName(Id("foo"))), []))]
    )
  )
]

As you can see, a block is represented as a Block (how surprising!). Now we can implement a rewrite rule that puts a block in the then branch of an if-then construct:

   AddBlock:
     If(c, stm) -> If(c, Block([stm]))
    

This rewrite rule still needs to be applied. We can do this with a simple topdown traversal, where we try to apply this rule at every node in the abstract syntax tree. The topdown strategy is readily available in the Stratego Library that we already import as libstratego-lib. We also need to import the JavaFront library, which defines the Java language. This module, called libjava-front, is available in your installation of JavaFront. Therefore you should instruct the compiler to use JavaFront. The complete implementation is:

module add-block
imports libstratego-lib libjava-front
strategies

  main =
    io-wrap(add-block)

  add-block = 
    topdown(try(AddBlock))

rules
  
  AddBlock:
    If(c, stm) -> If(c, Block([stm]))
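
To see what topdown(try(AddBlock)) does, it helps to look at how these two library strategies can be defined. The following sketch gives equivalent definitions under the hypothetical names my-try and my-topdown, chosen here only to avoid clashing with the library versions. It is meant as an illustration of the behaviour, not as a copy of the library source.

module traversal-sketch
imports libstratego-lib
strategies

  // my-try(s): apply s; if s fails, keep the term unchanged (id).
  my-try(s) =
    s <+ id

  // my-topdown(s): apply s to the current node first, then
  // recursively to all direct subterms of the result.
  my-topdown(s) =
    s; all(my-topdown(s))

Because try never fails, the traversal visits every node, and AddBlock only changes the nodes that match its left-hand side. Note that the traversal continues into the term produced by the rule, so the statement inside a freshly created Block is visited as well.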
    

Compile the module with the following command.

$ strc -i add-block.str $(strcflags stratego-lib java-front)

Now apply the program to the test program we have used before:

class Foo
{
  static
  {
    if(x) 
      foo();

    if(x)
    {
      foo();
    }
  }
}

$ parse-java -i Foo.java | ./add-block | pp-java

The result is:

class Foo
{
  static
  {
    if(x)
    {
      foo();
    }
    if(x)
    {
      {
        foo();
      }
    }
  }
}
    

But ... that's not what we intended to achieve with our tool! The second block is now in yet another block, which is rather ugly. So, we need to extend our tool to skip if statements that already use a block. To this end, we add a condition to the rewrite rule that checks if the stm is not yet a block. The new rule is:

AddBlock:
  If(c, stm) -> If(c, Block([stm]))
    where <not(?Block(_))> stm
    

If you compile and run your new program, then you'll see that the result is exactly what we want to have.
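
For reference, the pieces above can be put together as follows, extended with a sketch for the if-then-else construct. The if-then-else part rests on an assumption: that it is represented as an If term with three arguments (condition, then branch, else branch). Check this with pp-aterm before relying on it. ToBlock is a hypothetical helper rule introduced only for this sketch.

module add-block
imports libstratego-lib libjava-front
strategies

  main =
    io-wrap(add-block)

  add-block =
    topdown(try(AddBlock))

rules

  // if-then: wrap the branch unless it is already a block.
  AddBlock:
    If(c, stm) -> If(c, Block([stm]))
    where <not(?Block(_))> stm

  // if-then-else: ASSUMPTION, a three-argument If term.
  // Each branch is wrapped separately; try leaves existing blocks alone.
  AddBlock:
    If(c, stm1, stm2) -> If(c, <try(ToBlock)> stm1, <try(ToBlock)> stm2)

  // Hypothetical helper: wrap a single statement that is not a block.
  ToBlock:
    stm -> Block([stm])
    where <not(?Block(_))> stm

With this sketch an else branch that is itself an if statement is also wrapped, so an else-if chain becomes else { if ... }. If you prefer to keep such chains flat, the condition of ToBlock has to exclude If terms as well.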

1.2.3. Source to Source Program

The current program requires the user to invoke parse-java and pp-java before and after the real transformation. With very minor effort, it is possible to include the parsing and pretty-printing of Java in the program itself. The JavaFront library provides a strategy io-java2java-wrap, which is a variant of the strategy io-wrap. The difference is that io-java2java-wrap(s) parses the input file using the Java parser before invoking your strategy s and afterwards pretty-prints the result.

module add-block
imports libstratego-lib libjava-front
strategies

  main =
    io-java2java-wrap(add-block)

  ...
	  

Because this strategy is part of the JavaFront library, you can still compile the Stratego program with the same command:

$ strc -i add-block.str $(strcflags stratego-lib java-front)

Your source to source transformation tool can now be invoked in the following way:

$ ./add-block -i Foo.java

1.2.4. Conclusion

In this tiny example you have learned how to implement a very basic Java transformation in Stratego with abstract syntax. At this point, it might be a useful exercise to add support for different statements, such as the if-then-else, switch and for. I'm sure you can think of many more Java transformations to do next. Have a lot of fun!

1.3. Example: Java Generation with Concrete Syntax

In this example I will show how to use concrete syntax for Java inside your Stratego programs. Moreover, you will learn how to use concrete object syntax in general, since using concrete syntax for object languages is a basic feature of Stratego/XT.

So, what's the point of using concrete syntax? If you have already implemented some Java transformations using abstract syntax, then you will have noticed this yourself: using abstract syntax requires in-depth knowledge of this representation. Also, the abstract syntax fragments can be quite verbose and don't show clearly what the code actually stands for.

Fortunately, I have fooled you by letting you implement your transformations in abstract syntax first. Stratego allows you to embed the concrete syntax of the object language. Why then did I show you the abstract syntax based implementations first? Well, it is important to realize what the underlying mechanism of the transformation is. If you are only using concrete syntax, then you might think that you are not transforming a structured representation of your Java program. However, this is actually still the case when you are using concrete syntax.

1.3.1. Basic Concrete Syntax Skills

In this first example we will implement a hello generator. The generator takes the name of a person and generates a Java program that welcomes this person. But let's start with the basic Stratego compilation skills by generating just a static hello world program. The following program shows the implementation. The concrete syntax for Java is denoted between the |[ and ]| symbols. Usually we specify the kind of syntax that is produced, in this case a compilation unit, before the quotation. Not doing this might result in ambiguities. For example, the Java fragment in this program could be parsed as a complete compilation unit or just a class declaration.

module gen-hello-world
imports
  libstratego-lib
  libjava-front

strategies

  main =
    output-wrap(generate)

  generate =
    !compilation-unit |[
       public class HelloWorld
       {
         public static void main(String[] ps)
         {
           System.err.println("Hello world!");
         }
       }
     ]|
    

Notice that this program uses an output-wrap instead of an io-wrap. The output-wrap strategy doesn't provide the argument for input, which we don't need in this example.

To compile this program you need to create a meta file. In this file you tell the compiler what syntax is used in your Stratego program. The name of this file should be gen-hello-world.meta and its content is:

Meta([
  Syntax("Stratego-Java-15")
])

Now you're ready to compile the hello world generator. The compiler needs to know where to look for the syntax definition of Java in Stratego. The command strcflags java-front will take care of that.

$ strc -i gen-hello-world.str $(strcflags stratego-lib java-front)

If you invoke the ./gen-hello-world program, then you'll see that the program indeed produces an abstract syntax tree. To produce a file HelloWorld.java in concrete syntax, use the following command:

$ ./gen-hello-world | pp-java -o HelloWorld.java

The file HelloWorld.java can now be compiled and executed.

1.3.2. The Structure is Out There

It is important to realize that the concrete syntax in the Stratego program is processed to a structured representation at compile-time. You can observe this by making a typo in the program. For example, forget the semicolon after the println invocation.

The pp-stratego tool (part of Stratego/XT) can be used to show the Stratego program in abstract syntax. Thus, pp-stratego shows how your program with concrete syntax translates into a plain Stratego program. pp-stratego is a very useful tool if you need to debug a Stratego program with concrete syntax, and it is motivating to apply it right now to our generator to have a look at all code that we didn't need to write.

$ pp-stratego -i gen-hello-world.str $(strcflags java-front)

You will see a rather large abstract syntax tree, which you obviously would not like to write by hand. The pp-stratego tool is especially useful in debugging somewhat smaller pieces of embedded concrete syntax, where you are using anti-quotations or meta variables, which we will discuss next.
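
As a small illustration of the correspondence that pp-stratego reveals, consider the statement foo(); whose abstract syntax was shown in the Add Blocks example. The sketch below states the same match twice, once in concrete and once in abstract syntax. It assumes that the embedding offers a bstm quotation for block-level statements; this is an assumption, so consult Stratego-Java-15.sdf if the compiler rejects it. The module name is-foo-call and both strategy names are hypothetical, and the module needs a matching is-foo-call.meta file, just like the generator.

module is-foo-call
imports libstratego-lib libjava-front
strategies

  // Concrete syntax: matches the Java statement foo();
  // ASSUMPTION: bstm is the quotation for block-level statements.
  is-foo-call-concrete =
    ?bstm |[ foo(); ]|

  // Abstract syntax: the term shown by pp-aterm in the Add Blocks example.
  is-foo-call-abstract =
    ?ExprStm(Invoke(Method(MethodName(Id("foo"))), []))

Running pp-stratego on this module should show the first strategy in a form that resembles the second one.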

1.3.3. Using Anti-Quotation

Next, we want to make the generator more flexible by producing programs that can say anything you want. The message will be passed to the program as a Java expression and the generator will wrap it in a complete Java program. To incorporate the message in the generated code, we need to escape from the embedded Java code to the Stratego code. In JavaFront you can escape from Java to Stratego in two ways: by using a meta variable or an anti-quotation.

Let's first have a look at anti-quotation. In this example we want to escape to Stratego in a Java expression. The anti-quotation defined for an escape at a Java expression is ~e: (or ~expr:). It is used in the following program to insert the term bound to Stratego variable msg.

module gen-print
imports
  libstratego-lib
  libjava-front

strategies

  main =
    io-wrap(generate)

rules

  generate :
    msg ->
      compilation-unit |[
        public class Print
        {
          public static void main(String[] ps)
          {
            System.err.println(~expr:msg);
          }
        }
     ]|
    

Notice that main now uses the io-wrap strategy, since the program should now accept input. The generate strategy is now implemented as a rewrite rule because we need to rewrite an expression to a full program. Such a rewriting of one term to another, defined by patterns for the input and the output term, can be expressed concisely in a rewrite rule.
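
To make the relation between rules and strategies concrete, the following sketch writes the same generate as a plain strategy: a match of the input against the variable msg, followed by a build of the output term. It is an alternative spelling for illustration and would replace the rule in gen-print.str; the rule remains the more readable form.

  // Same behaviour as the generate rule, written as a strategy:
  // match the input against msg, then build the output term.
  generate =
    {msg:
      ?msg
    ; !compilation-unit |[
        public class Print
        {
          public static void main(String[] ps)
          {
            System.err.println(~expr:msg);
          }
        }
      ]|
    }

The braces introduce a local scope for msg, which a rewrite rule does implicitly for the variables of its patterns.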

The compilation command is still the same. This time the filename is gen-print.str. Don't forget to create a gen-print.meta file, otherwise the compiler will report syntax errors in your program.

$ strc -i gen-print.str $(strcflags stratego-lib java-front)

How should our new generator be invoked? The input of the generator should be a Java expression. The parse-java tool has a -s (or --start-symbol) flag that allows you to specify the symbol that should be parsed. The following composition creates a Java program with a message provided at the command-line. You can also store the expression in a file and use the -i option of parse-java to parse from a file.

$ echo "\"I Like to Quote\"" | parse-java -s Expr | ./gen-print | pp-java

1.3.4. More Variability

In the current version of our generator the name of the class is fixed. That is, all generated programs will have the name Print. Next, we will make our generator a little bit more flexible by parameterizing it with the name of the class. First, we will use a mock strategy that returns the name of the class. After this, we'll make it a real command-line argument of the generator. Thus, we will also learn how to handle command-line arguments in Stratego.

  get-class-name =
    !"NextGenPrint"
    

The String that is returned by the get-class-name strategy should be embedded in the generated program. The name of the class in a class declaration is an identifier (Id). Let's have a look at how a class declaration is represented in abstract syntax by parsing some test input:

$ echo "class Foo {}" | parse-java -s TypeDec | pp-aterm
ClassDec(
  ClassDecHead([], Id("Foo"), None, None, None)
, ClassBody([])
)

As you can see, the name of the class is represented as Id("Foo"). This construct corresponds to the non-terminal Id in the Java syntax definition. "Foo" itself corresponds to ID, which is part of the lexical syntax of Java.

The string that we want to embed in this program should therefore be inserted as an ID. The anti-quotation for an ID is ~x:, so the solution is class ~x:name, where the name variable should be bound somewhere. We use the where clause of the rewrite rule for this. The strategy get-class-name => name invokes the strategy get-class-name and binds the new current term to the variable name (s => p is equivalent to s; ?p). The following program lists the full solution.

module gen-print
imports
  libstratego-lib
  libjava-front

strategies

  main =
    io-wrap(generate)

  get-class-name =
    !"NextGenPrint"

rules

  generate :
    msg ->
      compilation-unit |[
        public class ~x:name
        {
          public static void main(String[] ps)
          {
            System.err.println(~expr:msg);
          }
        }
     ]|
     where
       get-class-name => name
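
The equivalence of s => p and s; ?p can be checked with two small rules that could be added to the module above. The rule names ClassNameArrow and ClassNameMatch are hypothetical and exist only for this comparison; both rules pair the class name with the input term.

rules

  // Variant 1: bind the result of get-class-name with =>.
  ClassNameArrow :
    msg -> (name, msg)
    where get-class-name => name

  // Variant 2: the same binding spelled out as "apply, then match".
  ClassNameMatch :
    msg -> (name, msg)
    where get-class-name; ?name

Applied to the same input, both rules should produce the same pair, since the second where clause is just the first one written out.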
    

The invocation of the generator is still the same, since we have not implemented support for a command-line argument. Command-line arguments are passed as the current term to the main strategy of a Stratego program. The util/config/options module of the Stratego library provides some abstractions for processing the arguments. We have been using this module all the time: io-wrap and output-wrap are defined in options and implement support for the -i and -o options.

The following code adds support for a command-line argument.

  main =
    io-wrap(class-name-option, generate)

  class-name-option =
    ArgOption("--name"
    , set-class-name
    , !"--name n         Generate a class with name n"
    )

  set-class-name =
    <set-config> ("class-name", <id>)

  get-class-name =
    <get-config> "class-name"
    <+ <fatal-error> ["gen-print: you must specify a class name!"]
    

First of all, notice that io-wrap now gets two arguments: a strategy for processing an additional option and the strategy that performs the real transformation in the program, generate. The class-name-option invokes the ArgOption strategy, which is used for processing options that consist of two parts: a name, typically starting with - or --, and a value. The first argument of ArgOption determines whether this option is applicable. It is usually just the key of the option. If the option is applicable, then the actual value will be passed to the second argument of ArgOption. Our implementation in set-class-name just puts the value into a global configuration table. The third argument should return a string that will be shown if the user needs help. If you have compiled the program, then you will see that ./gen-print --help shows information about our new option.

An example invocation of our generator in a single pipeline:

$ echo "\"Oh, How Sweet\"" | parse-java -s Expr | ./gen-print --name "Dusted" | pp-java
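
For reference, the fragments of this section assemble into the following module. Nothing new is introduced here apart from the comments; it is the earlier gen-print program with the mock get-class-name replaced by the option handling code.

module gen-print
imports
  libstratego-lib
  libjava-front

strategies

  main =
    io-wrap(class-name-option, generate)

  // Register the --name option; its value is handed to set-class-name.
  class-name-option =
    ArgOption("--name"
    , set-class-name
    , !"--name n         Generate a class with name n"
    )

  // Store the class name in the global configuration table.
  set-class-name =
    <set-config> ("class-name", <id>)

  // Look the name up again, or abort if --name was not given.
  get-class-name =
    <get-config> "class-name"
    <+ <fatal-error> ["gen-print: you must specify a class name!"]

rules

  generate :
    msg ->
      compilation-unit |[
        public class ~x:name
        {
          public static void main(String[] ps)
          {
            System.err.println(~expr:msg);
          }
        }
     ]|
     where
       get-class-name => name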

We have now shown how to use anti-quotation in your Java transformations. Of course, JavaFront provides many more anti-quotations. In the future we will give an overview of all anti-quotations in this manual. Until then, please use the source code of the embedding: Stratego-Java-15.sdf. The production rules in this syntax definition correspond to (anti-)quotations.

1.3.5. Using Meta Variables

In the introduction of the anti-quotation section we mentioned that there are actually two ways of escaping from embedded Java code to the Stratego level. The first escape mechanism is anti-quotation, which is denoted by ~ followed by some identifier for the anti-quotation. The second way, which I will explain in this section, is to use meta variables. Meta variables are identifiers that have a special meaning in embedded Java code. For example, the identifier e refers to a Stratego variable that is bound to a Java expression. The variable x refers to an ID and bstm refers to a block-level statement. These meta variables can be used in embedded Java code without any additional syntax. Notice that you should not use these identifiers as names for Java level variables!

As an example, let's change our implementation of the generate rule to use meta variables.

  generate :
    e ->
      compilation-unit |[
        public class x
        {
          public static void main(String[] ps)
          {
            System.err.println(e);
          }
        }
     ]|
     where
       get-class-name => x
    

Notice that the use of meta variables imposes a restriction on the names of variables: in anti-quotations you are free to choose any variable name you want, but meta variables have a fixed form. If you need more than one variable of the same kind, for example for two expressions, then you can add a number to the name (e.g. e1 and e2).
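
As a minimal sketch of numbered meta variables, the following rule uses e1 and e2 in a single pattern to rewrite a negated equality test into an inequality test. The rule is written in the same unprefixed quotation style as the other expression rules in this section and is an illustration, not part of JavaFront.

  // Two expression meta variables in one pattern:
  // !(e1 == e2) is rewritten to e1 != e2.
  Simplify :
    |[ !(e1 == e2) ]| -> |[ e1 != e2 ]|

Both e1 and e2 are bound by the match on the left-hand side and reused on the right-hand side, without any anti-quotation syntax.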

The conciseness of meta variables becomes clear in the following example, which applies some minor optimizations to a Java program. The rules are implemented in two variants: using an anti-quotation and using a meta variable.

module java-simple-opt
imports
  libstratego-lib
  libjava-front

strategies

  main =
    io-wrap(optimize)

  optimize =
    innermost(Simplify)

rules

  Simplify :
    |[ 0 + e ]| -> e

  Simplify :
    |[ ~e:e + 0 ]| -> e

  Simplify :
    |[ 1 * e ]| -> e

  Simplify :
    |[ ~e:e * 1 ]| -> e
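
Meta variables also make it possible to restate the AddBlock rule of the first example in concrete syntax. The sketch below assumes that the embedding provides a stm quotation and a stm meta variable for statements (with the usual numbering, stm1); both are assumptions that should be checked against Stratego-Java-15.sdf. As with every module that uses concrete syntax, a .meta file is required.

  // AddBlock in concrete syntax.
  // ASSUMPTION: stm is available as quotation and as meta variable.
  AddBlock :
    stm |[ if (e) stm1 ]| -> stm |[ if (e) { stm1 } ]|
    where <not(?Block(_))> stm1

The condition still inspects the abstract syntax term bound to stm1, which shows that concrete and abstract syntax can be mixed freely in one rule.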
    

Chapter 2. Getting started with Dryad

Work in Progress

This chapter is work in progress. Not all parts have been finished yet. The latest revision of this manual may contain more material. Refer to the online version.

Dryad is a collection of tools for developing transformation systems for Java source and bytecode.

2.1. Linking with the Dryad Library

For some applications, you might want to link with the Dryad library. This library has a few dependencies, such as libjvm, that make linking a bit more involved than it should be. Fortunately, you don't have to know all these details.

2.1.1. Compilation in Automake Package

In an autoxt-based Automake package, you can use the variable DRYAD_LIBS in the Makefile.am. This variable contains all the required linker flags, including platform specific ones.

2.1.2. Standalone Compilation at Command-line

At the command-line, the preferred way of compilation is:

$ strc -i const-prop.str $(strcflags dryad java-front)

The strcflags include the Stratego includes (-I) of these packages and special linker options required to use the Dryad library. This way of compilation works on all supported platforms, since it reuses the information the configure script of Dryad has figured out about the platform you are running on.

If you haven't seen strcflags before: it is an alias for the invocation of pkg-config. You can define it using the following command. Of course, you can also use the longer pkg-config variant in the invocation of strc.

$ alias strcflags="pkg-config --variable=strcflags "

Make sure that Dryad is in the PKG_CONFIG_PATH. You can check if it is by invoking the following command. This will print a bunch of strc options. If it prints nothing, then dryad is not on the path and you can extend it by defining the PKG_CONFIG_PATH.

$ echo $(strcflags dryad)

$ export PKG_CONFIG_PATH=$dryadprefix/lib/pkgconfig:$PKG_CONFIG_PATH

2.1.3. Explanation

If you don't use the suggested ways of linking, then you probably get the following message:

$ ./const-prop -i Foo.java 
./const-prop: error while loading shared libraries: libjvm.so: cannot
open shared object file: No such file or directory

This can be solved in several ways, for example by setting the LD_LIBRARY_PATH, or by adding the runtime path of the libjvm library to the executable. The latter is what is done by the previously suggested solutions.

2.2. Dryad on Mac OS X

Dryad supports Mac OS X if the JDK 5.0 is installed. You need to configure 5.0 as the default JVM in the preferences, or you can set an environment variable for this:

$ export JAVA_JVM_VERSION="1.5"

If you get an UnsupportedClassVersionError, then there is something wrong with this configuration.

For Dryad, there is no need to manipulate the Current and CurrentJDK symbolic links in /System/Library/Frameworks/JavaVM.framework/Versions, which is often suggested on the Internet. In fact, this will not affect the default JVM at all for Dryad, which starts the JVM as a library using JNI, not from the command-line.

2.2.1. Installing Dryad using Nix

Users of Dryad in Nix have to install the JDK 5.0 as well: it is not included in the dependencies of Dryad in Nix. They also have to configure this JDK as the default.

2.2.2. Dryad on Mac OS X 10.3

The JDK 5.0 is not officially supported on Mac OS X 10.3, but the Java features Dryad uses work with an installation of JDK 5.0 on Mac OS X 10.3. For this, you can copy the installation of the JDK on a Mac OS X 10.4 machine to Mac OS X 10.3. Copy the directory /System/Library/Frameworks/JavaVM.framework/Versions/1.5.0 to Mac OS X 10.3 and create a symbolic link /System/Library/Frameworks/JavaVM.framework/Versions/1.5 to this directory. We advise you not to make this the global default JVM: it is safer to set the JAVA_JVM_VERSION to 1.5 for Dryad sessions only, since the 1.5.0 installation will not work for most other Java applications.