Table of Contents
Table of Contents
This chapter shows how to define a syntax definition in SDF and how to derive a number of artifacts from such a definition, i.e., a parser, a pretty-printer, and a Stratego signature. The chapter also introduces TIL, a Tiny Imperative Language, which is used in many of the examples.
TIL is a tiny imperative language designed for the demonstration of language and transformation definition formalisms. The language has two data-types, integers and strings. A TIL program consists of a list of statements, which can be variable declarations, assignments, I/O instructions, and control-flow statements. Statements can use expressions to compute values from integer and string constants, and the values of variables.
The following example gives an impression of the language.
Figure 2.1. file: til/xmpl/test1.til
// TIL program computing the factorial var n; n := readint(); var x; var fact; fact := 1; for x := 1 to n do fact := x * fact; end write("factorial of "); writeint(n); write(" is "); writeint(fact); write("\n");
This section shows a modular definition of the syntax of TIL, the generation of a parse table from that definition, and its use for parsing program texts into abstract syntax trees.
The following files define the syntax of TIL in the syntax definition formalism SDF. SDF is a modular formalism that supports the definition of lexical and context-free syntax. Modularity entails that a syntax definition can be composed from multiple modules, and that such modules can be reused in multiple syntax definitions. Also, syntax definitions of different languages can be combined.
Module TIL-layout
defines the syntax of
LAYOUT
, the symbols that occur between every two
context-free non-terminals. This is the way to introduce the
whitespace and comments for a language. The definition uses character classes to
indicate classes of characters. For instance, whitespace consists
of spaces, tabs, newlines, and carriage returns. Likewise,
comments consist of two slashes followed by zero or more
characters which are not newlines or carriage returns. The follow restriction
prevents ambiguities in parsing layout.
Figure 2.2. file: til/syn/TIL-layout.sdf
module TIL-layout exports lexical syntax [\ \t\n\r] -> LAYOUT "//" ~[\n\r]* -> LAYOUT context-free restrictions LAYOUT? -/- [\ \t\n\r]
Module TIL-literals
defines the syntax of
identifiers, integer constants, and string literals. Note again
the use of character
classes to indicate collections of characters and regular expressions to
indicate lists of zero or more (*
), or one or more
(+
) elements. String characters
(StrChar
) are any characters other than double
quote, backslash, or newline, and escaped versions of these
characters.
Figure 2.3. file: til/syn/TIL-literals.sdf
module TIL-literals exports sorts Id Int String StrChar lexical syntax [A-Za-z][A-Za-z0-9]* -> Id [0-9]+ -> Int "\"" StrChar* "\"" -> String ~[\"\\\n] -> StrChar [\\][\"\\n] -> StrChar lexical restrictions Id -/- [A-Za-z0-9] Int -/- [0-9]
Module TIL-expressions
defines the syntax of
expressions that compute a value. Basic expressions are
identifiers (variables), integer constants, and string literals.
More complex expressions are obtained by the arithmetic and
relational operators.
The constructor attributes of productions (e.g.,
cons("Add")
) indicates the constructor to be used in
the construction of an abstract syntax tree from a parse tree.
Ambiguities of and between productions are solved by means of
associativity and priority declarations.
Figure 2.4. file: til/syn/TIL-expressions.sdf
module TIL-expressions imports TIL-literals exports sorts Exp context-free syntax "true" -> Exp {cons("True")} "false" -> Exp {cons("False")} Id -> Exp {cons("Var")} Int -> Exp {cons("Int")} String -> Exp {cons("String")} Exp "*" Exp -> Exp {cons("Mul"),assoc} Exp "/" Exp -> Exp {cons("Div"),assoc} Exp "%" Exp -> Exp {cons("Mod"),non-assoc} Exp "+" Exp -> Exp {cons("Add"),assoc} Exp "-" Exp -> Exp {cons("Sub"),left} Exp "<" Exp -> Exp {cons("Lt"),non-assoc} Exp ">" Exp -> Exp {cons("Gt"),non-assoc} Exp "<=" Exp -> Exp {cons("Leq"),non-assoc} Exp ">=" Exp -> Exp {cons("Geq"),non-assoc} Exp "=" Exp -> Exp {cons("Equ"),non-assoc} Exp "!=" Exp -> Exp {cons("Neq"),non-assoc} Exp "&" Exp -> Exp {cons("And"),assoc} Exp "|" Exp -> Exp {cons("Or"),assoc} "(" Exp ")" -> Exp {bracket} context-free priorities {left: Exp "*" Exp -> Exp Exp "/" Exp -> Exp } > {left: Exp "+" Exp -> Exp Exp "-" Exp -> Exp } > {non-assoc: Exp "<" Exp -> Exp Exp ">" Exp -> Exp Exp "<=" Exp -> Exp Exp ">=" Exp -> Exp Exp "=" Exp -> Exp Exp "!=" Exp -> Exp } > Exp "&" Exp -> Exp > Exp "|" Exp -> Exp
Module TIL-statements
defines the syntax of
statements, i.e., instructions to be executed. The assignment
statement assigns the value of the right-hand side expression to
the variable in the left-hand side. The read
and
write
statements read a value from standard input or
write a value to standard output. The control-flow constructs use
lists of statements.
Figure 2.5. file: til/syn/TIL-statements.sdf
module TIL-statements imports TIL-expressions TIL-types exports sorts Stat context-free syntax "var" Id ";" -> Stat {cons("Declaration")} "var" Id ":" Type ";" -> Stat {cons("DeclarationTyped")} Id ":=" Exp ";" -> Stat {cons("Assign")} "begin" Stat* "end" -> Stat {cons("Block")} "if" Exp "then" Stat* "end" -> Stat {cons("IfThen")} "if" Exp "then" Stat* "else" Stat* "end" -> Stat {cons("IfElse")} "while" Exp "do" Stat* "end" -> Stat {cons("While")} "for" Id ":=" Exp "to" Exp "do" Stat* "end" -> Stat {cons("For")}
Module TIL-calls
defines the syntax of function and
procedure calls. Even though TIL does not have function
definitions, it is useful to be able to call primitive functions
and procedures.
Figure 2.6. file: til/syn/TIL-calls.sdf
module TIL-calls imports TIL-expressions TIL-statements exports context-free syntax Id "(" {Exp ","}* ")" -> Exp {cons("FunCall")} Id "(" {Exp ","}* ")" ";" -> Stat {cons("ProcCall")}
Module TIL
(Figure 2.7)
defines the syntax of the complete language by importing the
modules above, and defining a Program
as a list of
statements. In addition, the module introduces a
start-symbol
. This is the sort that a parser will
start parsing with. There may be multiple start symbols.
Figure 2.7. file: til/syn/TIL.sdf
module TIL imports TIL-layout TIL-literals TIL-expressions TIL-statements TIL-calls exports sorts Program context-free syntax Stat* -> Program {cons("Program")} context-free start-symbols Program
The following maak
script first collects the modules
of the syntax definition and then generates a parse table.
The result of pack-sdf is a `definition'
file (as opposed to a single module file, as we saw above) that
contains all modules imported by the main module,
TIL.sdf
in this case (Figure 2.9).
The parser generator sdf2table creates a
parse table from a syntax definition. Note the use of the
-m
flag to indicate the main module from which to
generate the table. The parse table (TIL.tbl
) is a
file in ATerm format, that is interpreted by the sglri tool to parse text files.
Figure 2.8. file: til/syn/maak
#! /bin/sh -e # collect modules pack-sdf -i TIL.sdf -o TIL.def # generate parse table sdf2table -i TIL.def -o TIL.tbl -m TIL
Figure 2.9. file: til/syn/TIL.def
definition module TIL-calls imports TIL-expressions TIL-statements exports context-free syntax Id "(" {Exp ","}* ")" -> Exp {cons("FunCall")} Id "(" {Exp ","}* ")" ";" -> Stat {cons("ProcCall")} module TIL-types imports TIL-literals exports sorts Type context-free syntax Id -> Type {cons("TypeName")} module TIL-statements imports TIL-expressions TIL-types exports sorts Stat context-free syntax "var" Id ";" -> Stat {cons("Declaration")} "var" Id ":" Type ";" -> Stat {cons("DeclarationTyped")} Id ":=" Exp ";" -> Stat {cons("Assign")} "begin" Stat* "end" -> Stat {cons("Block")} "if" Exp "then" Stat* "end" -> Stat {cons("IfThen")} "if" Exp "then" Stat* "else" Stat* "end" -> Stat {cons("IfElse")} "while" Exp "do" Stat* "end" -> Stat {cons("While")} "for" Id ":=" Exp "to" Exp "do" Stat* "end" -> Stat {cons("For")} module TIL-expressions imports TIL-literals exports sorts Exp context-free syntax "true" -> Exp {cons("True")} "false" -> Exp {cons("False")} Id -> Exp {cons("Var")} Int -> Exp {cons("Int")} String -> Exp {cons("String")} Exp "*" Exp -> Exp {cons("Mul"),assoc} Exp "/" Exp -> Exp {cons("Div"),assoc} Exp "%" Exp -> Exp {cons("Mod"),non-assoc} Exp "+" Exp -> Exp {cons("Add"),assoc} Exp "-" Exp -> Exp {cons("Sub"),left} Exp "<" Exp -> Exp {cons("Lt"),non-assoc} Exp ">" Exp -> Exp {cons("Gt"),non-assoc} Exp "<=" Exp -> Exp {cons("Leq"),non-assoc} Exp ">=" Exp -> Exp {cons("Geq"),non-assoc} Exp "=" Exp -> Exp {cons("Equ"),non-assoc} Exp "!=" Exp -> Exp {cons("Neq"),non-assoc} Exp "&" Exp -> Exp {cons("And"),assoc} Exp "|" Exp -> Exp {cons("Or"),assoc} "(" Exp ")" -> Exp {bracket} context-free priorities {left: Exp "*" Exp -> Exp Exp "/" Exp -> Exp } > {left: Exp "+" Exp -> Exp Exp "-" Exp -> Exp } > {non-assoc: Exp "<" Exp -> Exp Exp ">" Exp -> Exp Exp "<=" Exp -> Exp Exp ">=" Exp -> Exp Exp "=" Exp -> Exp Exp "!=" Exp -> Exp } > Exp "&" Exp -> Exp > Exp "|" Exp -> Exp module TIL-literals exports sorts Id Int String StrChar lexical syntax [A-Za-z][A-Za-z0-9]* -> Id [0-9]+ -> Int "\"" StrChar* "\"" -> String ~[\"\\\n] -> StrChar [\\][\"\\n] -> StrChar lexical restrictions Id -/- [A-Za-z0-9] Int -/- [0-9] module TIL-layout exports lexical syntax [\ \t\n\r] -> LAYOUT "//" ~[\n\r]* -> LAYOUT context-free restrictions LAYOUT? -/- [\ \t\n\r] module TIL imports TIL-layout TIL-literals TIL-expressions TIL-statements TIL-calls exports sorts Program context-free syntax Stat* -> Program {cons("Program")} context-free start-symbols Program
The sglri tool parses a text file given a parse table generated by sdf2table. The result is an abstract syntax term in the ATerm format. In order to inspect this term it is useful to `pretty-print' it using the pp-aterm tool. Compare the resulting term with the program in Figure 2.1.
Figure 2.10. file: til/xmpl/parse-test
# parse input file sglri -p ../syn/TIL.tbl -i test1.til -o test1.ast # `pretty-print' abstract syntax term pp-aterm -i test1.ast -o test1.atil
Figure 2.11. file: til/xmpl/test1.atil
Program( [ Declaration("n") , Assign("n", FunCall("readint", [])) , Declaration("x") , Declaration("fact") , Assign("fact", Int("1")) , For( "x" , Int("1") , Var("n") , [Assign("fact", Mul(Var("x"), Var("fact")))] ) , ProcCall("write", [String("\"factorial of \"")]) , ProcCall("writeint", [Var("n")]) , ProcCall("write", [String("\" is \"")]) , ProcCall("writeint", [Var("fact")]) , ProcCall("write", [String("\"\\n\"")]) ] )
The result of parsing a text is an abstract syntax tree represented by means of an ATerm. When produced by a parser, one can be sure that an ATerm has the right format, since it was derived directly from a parse tree. However, terms can also be produced by other components, e.g., be the result of a transformation. In those cases it may worthwhile to check that the term is well-formed according to some schema. In Stratego/XT tree schemas are described by Regular Tree Grammars (RTGs). Stratego signatures are used within Stratego programs to verify some aspects of Stratego programs. RTGs and signatures can be derived automatically from a syntax definition in SDF.
The following maak
scripts derives from a syntax
definition first an RTG, and from the RTG a Stratego signature.
Figure 2.12. file: til/sig/maak
#! /bin/sh -e # generate regular tree grammar from syntax definition sdf2rtg -i ../syn/TIL.def -o TIL.rtg -m TIL # generate Stratego signature from regular tree grammar rtg2sig -i TIL.rtg -o TIL.str
A regular tree grammar defines well-formedness rules for a set of trees (or terms). The following regular tree grammar has been generated from the syntax definition of TIL and precisely describes the abstract syntax trees of TIL programs.
Figure 2.13. file: til/sig/TIL.rtg
regular tree grammar start Program productions ListStarOfStat0 -> ListPlusOfStat0 ListStarOfStat0 -> <nil>() ListStarOfStat0 -> <conc>(ListStarOfStat0,ListStarOfStat0) ListPlusOfStat0 -> <conc>(ListStarOfStat0,ListPlusOfStat0) ListPlusOfStat0 -> <conc>(ListPlusOfStat0,ListStarOfStat0) ListPlusOfStat0 -> <conc>(ListPlusOfStat0,ListPlusOfStat0) ListPlusOfStat0 -> <cons>(Stat,ListStarOfStat0) ListStarOfExp0 -> ListPlusOfExp0 ListStarOfExp0 -> <nil>() ListStarOfExp0 -> <conc>(ListStarOfExp0,ListStarOfExp0) ListPlusOfExp0 -> <conc>(ListStarOfExp0,ListPlusOfExp0) ListPlusOfExp0 -> <conc>(ListPlusOfExp0,ListStarOfExp0) ListPlusOfExp0 -> <conc>(ListPlusOfExp0,ListPlusOfExp0) ListPlusOfExp0 -> <cons>(Exp,ListStarOfExp0) ListStarOfStrChar0 -> <string> ListPlusOfStrChar0 -> <string> Program -> Program(ListStarOfStat0) Stat -> ProcCall(Id,ListStarOfExp0) Exp -> FunCall(Id,ListStarOfExp0) Stat -> For(Id,Exp,Exp,ListStarOfStat0) Stat -> While(Exp,ListStarOfStat0) Stat -> IfElse(Exp,ListStarOfStat0,ListStarOfStat0) Stat -> IfThen(Exp,ListStarOfStat0) Stat -> Block(ListStarOfStat0) Stat -> Assign(Id,Exp) Stat -> DeclarationTyped(Id,Type) Stat -> Declaration(Id) Type -> TypeName(Id) Exp -> Or(Exp,Exp) Exp -> And(Exp,Exp) Exp -> Neq(Exp,Exp) Exp -> Equ(Exp,Exp) Exp -> Geq(Exp,Exp) Exp -> Leq(Exp,Exp) Exp -> Gt(Exp,Exp) Exp -> Lt(Exp,Exp) Exp -> Sub(Exp,Exp) Exp -> Add(Exp,Exp) Exp -> Mod(Exp,Exp) Exp -> Div(Exp,Exp) Exp -> Mul(Exp,Exp) Exp -> String(String) Exp -> Int(Int) Exp -> Var(Id) Exp -> False() Exp -> True() StrChar -> <string> String -> <string> Int -> <string> Id -> <string>
Algebraic signatures are similar to regular tree grammars. Stratego requires signatures for the declaration of term constructors to be used in transformation programs. The following Stratego signature is generated from the regular tree grammar above, and thus describes the constructors of TIL abstract syntax trees.
Figure 2.14. file: til/sig/TIL.str
module TIL signature constructors Program : List(Stat) -> Program ProcCall : Id * List(Exp) -> Stat For : Id * Exp * Exp * List(Stat) -> Stat While : Exp * List(Stat) -> Stat IfElse : Exp * List(Stat) * List(Stat) -> Stat IfThen : Exp * List(Stat) -> Stat Block : List(Stat) -> Stat Assign : Id * Exp -> Stat DeclarationTyped : Id * Type -> Stat Declaration : Id -> Stat TypeName : Id -> Type FunCall : Id * List(Exp) -> Exp Or : Exp * Exp -> Exp And : Exp * Exp -> Exp Neq : Exp * Exp -> Exp Equ : Exp * Exp -> Exp Geq : Exp * Exp -> Exp Leq : Exp * Exp -> Exp Gt : Exp * Exp -> Exp Lt : Exp * Exp -> Exp Sub : Exp * Exp -> Exp Add : Exp * Exp -> Exp Mod : Exp * Exp -> Exp Div : Exp * Exp -> Exp Mul : Exp * Exp -> Exp String : String -> Exp Int : Int -> Exp Var : Id -> Exp False : Exp True : Exp : String -> String : String -> Int : String -> Id signature constructors Some : a -> Option(a) None : Option(a) signature constructors Cons : a * List(a) -> List(a) Nil : List(a) Conc : List(a) * List(a) -> List(a)
The well-formedness of a term with respect to a regular tree grammar can be checked using the format-check tool. When well-formed the tool prints the type of the term. If not it indicates, which subterms cannot be typed. The following examples illustrate checking of a well-formed and non well-formed term.
Figure 2.15. file: til/xmpl/fc-test1
format-check --rtg ../sig/TIL.rtg -i test1.atil -s Program 2> test1.atil.fc
Figure 2.17. file: til/xmpl/test1-wrong.atil
Program( [ Declaration("fact") , Assig("fact", Int("1")) , Assign("fact", Mul("x", Var("fact"))) ] )
Figure 2.18. file: til/xmpl/fc-test2
format-check --rtg ../sig/TIL.rtg -i test1-wrong.atil -s Program 2> test1-wrong.atil.fc
Figure 2.19. file: til/xmpl/test1-wrong.atil.fc
error: cannot type Assig("fact",Int("1")) inferred types of subterms: typed "fact" as String, Int, Id, <string> typed Int("1") as Exp error: cannot type Mul("x",Var("fact")) inferred types of subterms: typed "x" as String, Int, Id, <string> typed Var("fact") as Exp
After transforming a program we need to turn it into a program
text again. Unparsing is the reverse of parsing and turns an
abstract syntax tree into a text. Pretty-printing is unparsing
with an attempt at creating a readable program text.
There is a direct correspondence between abstract syntax trees
and the program text from which they were produced defined by the
syntax definition. This can be used in the other direction as well
to generate a pretty-printer from a syntax definition.
The following maak
script generates from the TIL
syntax definition a pretty-print table TIL.pp
using
ppgen and a parenthesizer using sdf2parenthesize. The latter is a Stratego
program, which is compiled using the Stratego compiler strc.
Figure 2.20. file: til/pp/maak
#! /bin/sh -e # generate pretty-print table from syntax definition ppgen -i ../syn/TIL.def -o TIL.pp # generate program to insert parentheses sdf2parenthesize -i ../syn/TIL.def -o til-parenthesize.str -m TIL --lang TIL # compile the generated program strc -i til-parenthesize.str -I ../sig -m io-til-parenthesize -la stratego-lib
A pretty-print table defines a mapping from abstract syntax trees to expressions in the Box Formatting Language. The following pretty-print table is generated from the syntax definition for TIL. It is a default table and only ensures that the program text resulting from pretty-printing is syntactically correct, not that it is actually pretty.
Figure 2.21. file: til/pp/TIL.pp
[ FunCall -- _1 KW["("] _2 KW[")"], FunCall.2:iter-star-sep -- _1 KW[","], ProcCall -- _1 KW["("] _2 KW[")"] KW[";"], ProcCall.2:iter-star-sep -- _1 KW[","], TypeName -- _1, Declaration -- KW["var"] _1 KW[";"], DeclarationTyped -- KW["var"] _1 KW[":"] _2 KW[";"], Assign -- _1 KW[":="] _2 KW[";"], Block -- V [V vs=2 [KW["begin"] _1] KW["end"]], Block.1:iter-star -- _1, IfThen -- KW["if"] _1 KW["then"] _2 KW["end"], IfThen.2:iter-star -- _1, IfElse -- KW["if"] _1 KW["then"] _2 KW["else"] _3 KW["end"], IfElse.2:iter-star -- _1, IfElse.3:iter-star -- _1, While -- KW["while"] _1 KW["do"] _2 KW["end"], While.2:iter-star -- _1, For -- KW["for"] _1 KW[":="] _2 KW["to"] _3 KW["do"] _4 KW["end"], For.4:iter-star -- _1, True -- KW["true"], False -- KW["false"], Var -- _1, Int -- _1, String -- _1, Mul -- _1 KW["*"] _2, Div -- _1 KW["/"] _2, Mod -- _1 KW["%"] _2, Add -- _1 KW["+"] _2, Sub -- _1 KW["-"] _2, Lt -- _1 KW["<"] _2, Gt -- _1 KW[">"] _2, Leq -- _1 KW["<="] _2, Geq -- _1 KW[">="] _2, Equ -- _1 KW["="] _2, Neq -- _1 KW["!="] _2, And -- _1 KW["&"] _2, Or -- _1 KW["|"] _2, Program -- _1, Program.1:iter-star -- _1 ]
A pretty-print table is applied using the ast2text tool, which translates an abstract
syntax term to text given a pretty-print table. (In fact,
ast2text is a composition of ast2abox and
abox2text.) The pp-test1
script shows how to use a pretty-print table. The result of
unparsing the AST for the test1.til
program is
clearly not very pretty, but it is a syntactically correct TIL
program. This is tested by the subsequent commands in the script,
which parse the test1.txt1
program and compare the
resulting AST to the original AST. It turns out that the two
programs have the exact same AST.
Figure 2.22. file: til/xmpl/pp-test1
# abstract syntax term to text ast2text -p ../pp/TIL.pp -i test1.atil -o test1.txt1 # test unparsed code sglri -p ../syn/TIL.tbl -i test1.txt1 | pp-aterm -o test1.atxt1 # test if the terms are the same diff test1.atil test1.atxt1
Figure 2.23. file: til/xmpl/test1.txt1
var n ; n := readint ( ) ; var x ; var fact ; fact := 1 ; for x := 1 to n do fact := x * fact ; end write ( "factorial of " ) ; writeint ( n ) ; write ( " is " ) ; writeint ( fact ) ; write ( "\n" ) ;
To get more readable programs after pretty-printing we adapt the generated pretty-print table using constructs from the Box Formatting Language to indicate how each construct should be formatted.
Figure 2.24. file: til/pp/TIL-pretty.pp
[ FunCall -- H hs=0 [_1 KW["("] H [_2] KW[")"]], FunCall.2:iter-star-sep -- H hs=0 [_1 KW[","]], ProcCall -- H hs=0 [_1 KW["("] H [_2] KW[")"] KW[";"]], ProcCall.2:iter-star-sep -- H hs=0 [_1 KW[","]], Declaration -- H hs=0 [H [KW["var"] _1] KW[";"]], DeclarationTyped -- H hs=0 [H [KW["var"] _1 KW[":"] _2] KW[";"]], Assign -- H hs=0 [H [_1 KW[":="] _2] KW[";"]], Read -- H hs=0 [H [KW["read"] _1] KW[";"]], Write -- H hs=0 [H [KW["write"] _1] KW[";"]], Block -- V [V is=2 [KW["begin"] V [_1]] KW["end"]], Block.1:iter-star -- _1, IfThen -- V [V is=2 [H [KW["if"] _1 KW["then"]] V [_2]] KW["end"]], IfThen.2:iter-star -- _1, IfElse -- V [V is=2 [H [KW["if"] _1 KW["then"]] V [_2]] V is=2 [KW["else"] V [_3]] KW["end"]], IfElse.2:iter-star -- _1, IfElse.3:iter-star -- _1, While -- V [V is=2 [H [KW["while"] _1 KW["do"]] _2] KW["end"]], While.2:iter-star -- _1, For -- V [V is=2 [H [KW["for"] _1 KW[":="] _2 KW["to"] _3 KW["do"]] _4] KW["end"]], For.4:iter-star -- _1, True -- KW["true"], False -- KW["false"], Var -- _1, Int -- _1, String -- _1, Mul -- H hs=1 [_1 KW["*"] _2], Div -- H hs=1 [_1 KW["/"] _2], Mod -- H hs=1 [_1 KW["%"] _2], Add -- H hs=1 [_1 KW["+"] _2], Sub -- H hs=1 [_1 KW["-"] _2], Lt -- H hs=1 [_1 KW["<"] _2], Gt -- H hs=1 [_1 KW[">"] _2], Leq -- H hs=1 [_1 KW["<="] _2], Geq -- H hs=1 [_1 KW[">="] _2], Equ -- H hs=1 [_1 KW["="] _2], Neq -- H hs=1 [_1 KW["!="] _2], And -- H hs=1 [_1 KW["&"] _2], Or -- H hs=1 [_1 KW["|"] _2], Program -- V [_1], Program.1:iter-star -- _1, Parenthetical -- H hs=0 ["(" _1 ")"], TypeName -- _1 ]
Using the same procedure as before, but using the adapted pretty-print table (see til/xmpl/pp-test2) we not get a program that is much closer to the original.
Figure 2.25. file: til/xmpl/test1.txt2
var n; n := readint(); var x; var fact; fact := 1; for x := 1 to n do fact := x * fact; end write("factorial of "); writeint(n); write(" is "); writeint(fact); write("\n");
The til-parenthesize
program generated by sdf2parenthesize is a simple rewrite system with
rules that add a Parenthetical
constructor around
subtrees that have a priority or associativity conflict with
their parent node.
The implementation in til/pp/til-parenthesize.str is not of interest here.
The program is used before applying the pretty-print table, as
illustrated with the following example. The
test2.txt1
program is produced without introducing
parentheses and clearly has a different meaning than the original
program. The test2.txt2
program has parentheses in
the right places.
Figure 2.26. file: til/xmpl/pp-test3
# pretty-print without parentheses ast2text -p ../pp/TIL-pretty.pp -i test2.atil -o test2.txt1 # add parentheses to abstract syntax term ../pp/til-parenthesize -i test2.atil | pp-aterm -o test2.atil.par # pretty-print without parentheses ast2text -p ../pp/TIL-pretty.pp -i test2.atil.par -o test2.txt2
Figure 2.27. file: til/xmpl/test2.atil
Program( [ IfElse( Equ( Mul(Var("x"), Add(Var("y"), Int("10"))) , Int("34") ) , [ Assign( "x" , Div(Var("x"), Sub(Var("y"), Int("1"))) ) ] , [ Assign( "x" , Mul(Var("x"), Div(Var("y"), Int("3"))) ) ] ) ] )
Figure 2.28. file: til/xmpl/test2.txt1
if x * y + 10 = 34 then x := x / y - 1; else x := x * y / 3; end
Figure 2.29. file: til/xmpl/test2.atil.par
Program( [ IfElse( Equ( Mul( Var("x") , Parenthetical(Add(Var("y"), Int("10"))) ) , Int("34") ) , [ Assign( "x" , Div( Var("x") , Parenthetical(Sub(Var("y"), Int("1"))) ) ) ] , [ Assign( "x" , Mul( Var("x") , Parenthetical(Div(Var("y"), Int("3"))) ) ) ] ) ] )
Figure 2.30. file: til/xmpl/test2.txt2
if x * (y + 10) = 34 then x := x / (y - 1); else x := x * (y / 3); end
Given an SDF syntax definition we can produce a parser, a format
checker, a parenthesizer, and a pretty-printer. Together these
tools form the basic ingredients of a transformation
pipeline. The composition in til-process
shows how
the tools are combined. This composition is not so useful as it
only turns a program in a pretty-printed version of itself. In
the next chapters we'll see how this pipeline can be extended
with transformation tools.
Figure 2.31. file: til/xmpl/til-process
# a parse, check, parenthesize, pretty-print pipeline sglri -p ../syn/TIL.tbl -i $1 |\ format-check --rtg ../sig/TIL.rtg |\ ../pp/til-parenthesize |\ ast2text -p ../pp/TIL-pretty.pp -o $2
Table of Contents
Given the infrastructure presented in the previous chapter we can create a
basic pipeline for performing transformations on abstract syntax
trees. What is missing is the actual transformations to
perform. In the Stratego/XT setting, we use the Stratego language
for defining transformations on abstract syntax trees.
In this chapter we show how to define a collection of basic term
rewrite rules, and how to combine these using the standard
innermost
strategy to simplify TIL programs.
Stratego is a modular programming languages and supports the definition of named rewrite rules. These rules can be used in many different transformations by importing the module defining them and using their names as arguments of rewriting strategies.
Module til-eval
imports the signature of TIL, which
defines the constructors of the abstract syntax of TIL. Then the
module defines rewrite rules for evaluating constructs that are
(partially) constant. For instance, the addition of two integer
constants can be reduced to the integer constant of their sum.
Figure 3.1. file: til/sim/til-eval.str
module til-eval imports TIL rules EvalAdd : Add(Int(i), Int(j)) -> Int(<addS>(i,j)) EvalAdd : Add(String(i), String(j)) -> String(<conc-strings>(i,j)) EvalSub : Sub(Int(i), Int(j)) -> Int(<subtS>(i,j)) EvalMul : Mul(Int(i), Int(j)) -> Int(<mulS>(i,j)) EvalDiv : Div(Int(i), Int(j)) -> Int(<divS>(i,j)) EvalMod : Mod(Int(i), Int(j)) -> Int(<modS>(i,j)) EvalLt : Lt(Int(i), Int(j)) -> <compare(ltS)>(i,j) EvalGt : Gt(Int(i), Int(j)) -> <compare(gtS)>(i,j) EvalLeq : Leq(Int(i), Int(j)) -> <compare(leqS)>(i,j) EvalGeq : Geq(Int(i), Int(j)) -> <compare(geqS)>(i,j) EvalEqu : Equ(Int(i), Int(j)) -> <compare(eq)>(i,j) EvalNeq : Neq(Int(i), Int(j)) -> <compare(not(eq))>(i,j) compare(s) = if s then !True() else !False() end EvalOr : Or(True(), e) -> True() EvalOr : Or(False(), e) -> e EvalAnd : And(True(), e) -> e EvalAnd : And(False(), e) -> False() AddZero : Add(e, Int("0")) -> e AddZero : Add(Int("0"), e) -> e MulOne : Mul(e, Int("1")) -> e MulOne : Mul(Int("1"), e) -> e EvalS2I : FunCall("string2int", [String(x)]) -> Int(x) where <string-to-int> x EvalI2S : FunCall("int2string", [Int(i)]) -> String(i) EvalIf : IfElse(False(), st1*, st2*) -> Block(st2*) EvalIf : IfElse(True(), st1*, st2*) -> Block(st1*) EvalWhile : While(False(), st*) -> Block([])
Another common kind of rules are desugaring rules, which define a
language construct in terms of other language constructs. Module
til-desugar
defines rules ForToWhile
and IfThenToIfElse
. The ForToWhile
rule
rewrites For
loops to a While
loop,
which requires the introduction of a new variable to hold the
value of the upperbound. The IfThenToIfElse
transforms if-then
statements into
if-then-else
statements.
Figure 3.2. file: til/sim/til-desugar.str
module til-desugar imports TIL rules ForToWhile : For(x, e1, e2, st*) -> Block([ DeclarationTyped(y, TypeName("int")), Assign(x, e1), Assign(y, e2), While(Leq(Var(x), Var(y)), <conc>(st*, [Assign(x, Add(Var(x), Int("1")))]) ) ]) where new => y IfThenToIfElse : IfThen(e, st*) -> IfElse(e, st*, []) DefaultDeclaration : Declaration(x) -> DeclarationTyped(x, TypeName("int")) WriteInt : ProcCall("writeint", [e]) -> ProcCall("write", [FunCall("int2string", [e])]) ReadInt : FunCall("readint", []) -> FunCall("string2int", [FunCall("read", [])]) BinOpToFunCall : op#([e1,e2]){t*} -> FunCall(op, [e1,e2]){t*} where <is-bin-op> op FunCallToBinOp : FunCall(op, [e1,e2]){t*} -> op#([e1,e2]){t*} where <is-bin-op> op is-bin-op = "Add" <+ "Mul" <+ "Subt" <+ "Div" <+ "Mod" <+ "Geq" <+ "Leq" <+ "Gt" <+ "Lt" <+ "Equ" <+ "Neq"
Module til-simplify
is a basic Stratego program. It
imports the til-eval
and til-desugar
modules, and the Stratego standard library (liblib
),
which defines standard rewriting strategies and additional
utilities for I/O and such.
The io-til-simplify
strategy of
til-simplify
represents the entry point for the
program when it is invoked. It uses the io-wrap
strategy to parse command-line arguments, read the input term,
and write the output term.
The til-simplify
strategy defines the actual
transformation, which is a complete normalization of the input
term with respect to the rewrite rules introduced above. The
normalization strategy chosen here is innermost
,
which exhaustively applies its argument strategy starting with
the innermost nodes of the tree.
Figure 3.3. file: til/sim/til-simplify.str
module til-simplify imports liblib til-eval til-desugar strategies io-til-simplify = io-wrap(til-simplify) til-simplify = innermost( ForToWhile <+ IfThenToIfElse <+ AddZero <+ MulOne <+ EvalAdd <+ EvalMul <+ EvalSub <+ EvalDiv <+ EvalLeq <+ EvalGeq <+ EvalEqu <+ EvalNeq <+ DefaultDeclaration <+ ReadInt <+ WriteInt )
Stratego programs are compiled to executable programs by the
Stratego compiler strc. The -I
option is used to indicate that some modules imported by this
program (TIL.str
) resides in the ../sig
directory. The -la
option is used to link the
separately compiled Stratego library.
Figure 3.4. file: til/sim/maak
#! /bin/sh -e # compile til-simplify strc -i til-simplify.str -I ../sig -la stratego-lib -m io-til-simplify
A compiled Stratego program is an ordinary executable. When the
io-wrap
strategy was used the program has a
-i
option for indicating the input file, and a
-o
option to indicate the output file. Thus, the
program reads the ATerm in the input file, transforms it, and
writes the resulting ATerm back to the output file.
The following test script shows how the til/xmpl/test1.til file is simplified, and the result pretty-printed.
Figure 3.5. file: til/xmpl/simplify-test
# simplify program ../sim/til-simplify -i test1.atil -o test1.sim.atil # check result format-check --rtg ../sig/TIL.rtg -i test1.sim.atil # parentheses ../pp/til-parenthesize -i test1.sim.atil -o test1.sim.par.atil # pretty-print ast2text -p ../pp/TIL-pretty.pp -i test1.sim.par.atil -o test1.sim.txt
Table 3.1. files: til/xmpl/test1.til, til/xmpl/test1.sim.txt
before | after |
---|---|
// TIL program computing the factorial var n; n := readint(); var x; var fact; fact := 1; for x := 1 to n do fact := x * fact; end write("factorial of "); writeint(n); write(" is "); writeint(fact); write("\n"); | var n : int; n := string2int(read()); var x : int; var fact : int; fact := 1; begin var a_0 : int; x := 1; a_0 := n; while x <= a_0 do fact := x * fact; x := x + 1; end end write("factorial of "); write(int2string(n)); write(" is "); write(int2string(fact)); write("\n"); |
Table of Contents
This chapter is work in progress. Not all parts have been finished yet. The latest revision of this manual may contain more material. Refer to the online version.
renaming of bound variables
Figure 4.1. file: til/renaming/til-rename-vars.str
module til-rename-vars imports TIL liblib strategies io-til-rename-vars = io-wrap(til-rename-vars) til-rename-vars = Var(RenameVar) <+ Assign(RenameVar, til-rename-vars) // <+ Read(RenameVar) <+ For(RenameVar, til-rename-vars, til-rename-vars, til-rename-vars) <+ RenameDeclaration <+ Block({| RenameVar : map(til-rename-vars) |}) <+ all(til-rename-vars) RenameDeclaration : Declaration(x) -> Declaration(y) where <newname> x => y ; rules( RenameVar : x -> y ) RenameDeclaration : DeclarationTyped(x, t) -> DeclarationTyped(y, t) where <newname> x => y ; rules( RenameVar : x -> y )
Figure 4.2. file: til/xmpl/rename-test
sglri -p ../syn/TIL.tbl -i test1.til |\ ../renaming/til-rename-vars |\ ast2text -p ../pp/TIL-pretty.pp -o test1.rn.txt
Table 4.1. files: til/xmpl/test1.til, til/xmpl/test1.rn.txt
before | after |
---|---|
// TIL program computing the factorial var n; n := readint(); var x; var fact; fact := 1; for x := 1 to n do fact := x * fact; end write("factorial of "); writeint(n); write(" is "); writeint(fact); write("\n"); | var n0; n0 := readint(); var x0; var fact0; fact0 := 1; for x0 := 1 to n0 do fact0 := x0 * fact0; end write("factorial of "); writeint(n0); write(" is "); writeint(fact0); write("\n"); |
Table of Contents
This chapter is work in progress. Not all parts have been finished yet. The latest revision of this manual may contain more material. Refer to the online version.
propagation of variable types and type annotation of expressions and statements
Figure 5.1. file: til/tc/til-typecheck.str
module til-typecheck imports liblib til-typecheck-stats strategies io-til-typecheck = io-wrap(typecheck-program) typecheck-program = Program(typecheck-stats)
Figure 5.2. file: til/tc/til-typecheck-exp.str
module til-typecheck-exp imports TIL strategies typecheck-exp(typecheck-var) = bottomup(try( typecheck-var <+ TypecheckAdd <+ TypecheckMul <+ TypecheckSub <+ TypecheckDiv <+ TypecheckMod <+ TypecheckLeq <+ TypecheckGeq <+ TypecheckLt <+ TypecheckGt <+ TypecheckEqu <+ TypecheckNeq <+ TypecheckOr <+ TypecheckAnd <+ TypecheckInt <+ TypecheckString <+ TypecheckBool )) rules typeof : e{t*} -> t where <fetch-elem(is-type)> t* => t is-type = ?TypeName(_) TypecheckInt : Int(i) -> Int(i){TypeName("int")} TypecheckString : String(x) -> String(x){TypeName("string")} TypecheckBool : True() -> True(){TypeName("bool")} TypecheckBool : False() -> False(){TypeName("bool")} TypecheckAdd : Add(e1, e2) -> Add(e1, e2){TypeName("string")} where <typeof> e1 => TypeName("string") ; <typeof> e2 => TypeName("string") TypecheckAdd : Add(e1, e2) -> Add(e1, e2){TypeName("int")} where <typeof> e1 => TypeName("int") ; <typeof> e2 => TypeName("int") TypecheckMul : Mul(e1, e2) -> Mul(e1, e2){TypeName("int")} where <typeof> e1 => TypeName("int") ; <typeof> e2 => TypeName("int") TypecheckSub : Sub(e1, e2) -> Sub(e1, e2){TypeName("int")} where <typeof> e1 => TypeName("int") ; <typeof> e2 => TypeName("int") TypecheckDiv : Div(e1, e2) -> Div(e1, e2){TypeName("int")} where <typeof> e1 => TypeName("int") ; <typeof> e2 => TypeName("int") TypecheckMod : Mod(e1, e2) -> Mod(e1, e2){TypeName("int")} where <typeof> e1 => TypeName("int") ; <typeof> e2 => TypeName("int") TypecheckLt : Lt(e1, e2) -> Lt(e1, e2){TypeName("bool")} where <typeof> e1 => TypeName("int") ; <typeof> e2 => TypeName("int") TypecheckGt : Gt(e1, e2) -> Gt(e1, e2){TypeName("bool")} where <typeof> e1 => TypeName("int") ; <typeof> e2 => TypeName("int") TypecheckLeq : Leq(e1, e2) -> Leq(e1, e2){TypeName("bool")} where <typeof> e1 => TypeName("int") ; <typeof> e2 => TypeName("int") TypecheckGeq : Geq(e1, e2) -> Geq(e1, e2){TypeName("bool")} where <typeof> e1 => TypeName("int") ; <typeof> e2 => TypeName("int") TypecheckEqu : Equ(e1, e2) -> Equ(e1, e2){TypeName("bool")} where <typeof> e1 => TypeName("int") ; <typeof> e2 => TypeName("int") TypecheckNeq : Neq(e1, e2) -> Neq(e1, e2){TypeName("bool")} where <typeof> e1 => TypeName("int") ; <typeof> e2 => TypeName("int") TypecheckAnd : And(e1, e2) -> And(e1, e2){TypeName("bool")} where <typeof> e1 => TypeName("bool") ; <typeof> e2 => TypeName("bool") TypecheckOr : Or(e1, e2) -> Or(e1, e2){TypeName("bool")} where <typeof> e1 => TypeName("bool") ; <typeof> e2 => TypeName("bool")
Figure 5.3. file: til/tc/til-typecheck-stats.str
module til-typecheck-stats imports til-typecheck-var strategies typecheck-block = Block(typecheck-stats) typecheck-stats = {| TypeOf : map(typecheck-stat) |} typecheck-stat = typecheck-block <+ typecheck-declaration <+ Assign(id, typecheck-exp) ; try(TypecheckAssign) <+ IfElse(typecheck-exp, typecheck-stats, typecheck-stats) ; TypecheckIf <+ IfThen(typecheck-exp, typecheck-stats) ; TypecheckIf <+ While(typecheck-exp, typecheck-stats) ; TypecheckWhile <+ ProcCall(id, map(typecheck-exp)) ; typecheck-proccall <+ For(id, typecheck-exp, typecheck-exp, typecheck-stats) ; TypecheckFor <+ debug(!"unknown statement: ") ; <exit> 1 TypecheckIf : IfElse(e, st1*, st2*) -> IfElse(e, st1*, st2*){TypeName("void")} where <typeof> e => TypeName("bool") TypecheckIf : IfThen(e, st*) -> IfThen(e, st*){TypeName("void")} where <typeof> e => TypeName("bool") TypecheckWhile : While(e, st*) -> While(e, st*){TypeName("void")} where <typeof> e => TypeName("bool") TypecheckFor : For(x, e1, e2, st*) -> For(x, e1, e2, st*){TypeName("void")} where <TypeOf> x => TypeName("int") ; <typeof> e1 => TypeName("int") ; <typeof> e2 => TypeName("int")
Figure 5.4. file: til/tc/til-typecheck-var.str
module til-typecheck-var imports til-typecheck-exp strategies typecheck-declaration = ?Declaration(x) ; rules( TypeOf+x : x -> TypeName("int") ) typecheck-declaration = ?DeclarationTyped(x, t) ; rules( TypeOf+x : x -> t ) TypecheckVar : Var(x) -> Var(x){t} where <TypeOf> x => t TypecheckAssign : Assign(x, e) -> Assign(x, e){TypeName("void")} where <typeof> e => t ; <TypeOf> x => t strategies // expressions with variables and calls typecheck-exp = typecheck-exp(TypecheckVar <+ typecheck-funcall) typecheck-funcall = TypecheckRead <+ TypecheckS2I <+ TypecheckI2S <+ TypecheckB2S typecheck-proccall = TypecheckWrite strategies // built-in functions TypecheckS2I : FunCall("string2int", [e]) -> FunCall("string2int", [e]){TypeName("int")} where <typeof> e => TypeName("string") TypecheckI2S : FunCall("int2string", [e]) -> FunCall("int2string", [e]){TypeName("string")} where <typeof> e => TypeName("int") TypecheckB2S : FunCall("bool2string", [e]) -> FunCall("bool2string", [e]){TypeName("string")} where <typeof> e => TypeName("bool") TypecheckRead : FunCall("read", []) -> FunCall("read", []){TypeName("string")} TypecheckRead : FunCall("readint", []) -> FunCall("readint", []){TypeName("int")} strategies // built-in procedures TypecheckWrite : ProcCall("write", [e]) -> ProcCall("write", [e]){TypeName("void")} where <typeof> e => TypeName("string") TypecheckWrite : ProcCall("writeint", [e]) -> ProcCall("writeint", [e]){TypeName("void")} where <typeof> e => TypeName("int")
Figure 5.5. file: til/xmpl/typecheck-test1
# typecheck test1.til after parsing ../tc/til-typecheck -i test1.atil | pp-aterm > test1.tc
Table 5.1. files: til/xmpl/test1.til, til/xmpl/test1.tc
before | after |
---|---|
// TIL program computing the factorial var n; n := readint(); var x; var fact; fact := 1; for x := 1 to n do fact := x * fact; end write("factorial of "); writeint(n); write(" is "); writeint(fact); write("\n"); | Program( [ Declaration("n") , Assign("n", FunCall("readint", []){TypeName("int")}){TypeName("void")} , Declaration("x") , Declaration("fact") , Assign("fact", Int("1"){TypeName("int")}){TypeName("void")} , For( "x" , Int("1"){TypeName("int")} , Var("n"){TypeName("int")} , [Assign("fact", Mul(Var("x"){TypeName("int")}, Var("fact"){TypeName("int")}){TypeName("int")}){TypeName("void")}] ){TypeName("void")} , ProcCall("write", [String("\"factorial of \""){TypeName("string")}]){TypeName("void")} , ProcCall("writeint", [Var("n"){TypeName("int")}]){TypeName("void")} , ProcCall("write", [String("\" is \""){TypeName("string")}]){TypeName("void")} , ProcCall("writeint", [Var("fact"){TypeName("int")}]){TypeName("void")} , ProcCall("write", [String("\"\\n\""){TypeName("string")}]){TypeName("void")} ] ) |
Table of Contents
Interpreting programs does not seem a typical application of program transformation. However, in partial evaluation a program is evaluated as far as possible. Also in simpler transformations, evaluating a part of a program is a common operation. Furthermore, in order to execute programs in our TIL language, it is useful have an interpreter.
This chapter shows how to construct an interpreter for typical elements of an imperative language, such as expressions, variable access constructs, and control-flow statements. The specification uses advanced features of Stratego such as traversal strategies, pattern match operations, match-project, and dynamic rewrite rules. So don't be surprised if you don't get it all on a first reading.
The interpreter developed in this chapter requires the parsing and simplification infrastructure from the previous chapters. That is, the interpreter operates on the abstract syntax tree of a program after simplification.
Expressions can be evaluated using a simple bottom-up traversal
of the AST. The bottomup
traversal strategy does
just that; it applies its argument strategy first to the leaves
of the tree, and then to each level above. The try
strategy makes its argument strategy always succeeding. Thus, the
eval-exp
strategy performs a bottomup traversal of
the AST of an expression, applying the Eval
rules
from module sim/til-eval. The
eval-exp
strategy is parameterized with a strategy
for evaluating variables; their values depend on prior
assignments in the program.
Figure 6.1. file: til/run/til-eval-exp.str
module til-eval-exp imports til-eval strategies eval-exp(eval-var) = bottomup(try( eval-var <+ EvalAdd <+ EvalMul <+ EvalSub <+ EvalDiv <+ EvalMod <+ EvalLeq <+ EvalGeq <+ EvalLt <+ EvalGt <+ EvalEqu <+ EvalNeq <+ EvalOr <+ EvalAnd <+ EvalS2I <+ EvalI2S ))
The essence of an imperative language is the manipulation of
values in the store, held by variables. In TIL variables are
introduced by the Declaration
construct. The values
of variables is set, or changed by the Assign
assignment statement, and the Read
input
statement. The Write
statement prints the value of
an expression. The interpreter uses the dynamic rewrite rule
EvalVar to store the mappings from variables to their values.
When encountering a variable declaration, the current scope
is labeled with the name of that variable, and its mapping is
undefined, reflecting the fact that the variable doesn't have a
value after being declared. Note that the scope of a variable in
TIL is the rest of the current block.
When encountering an assignment statement the
EvalVar
rule for the variable being assigned is
updated. Thus, after an assignment to a variable x
,
EvalVar
rewrites that variable to its value.
Similarly, the Read
input statement reads the next
line from the stdin
stream, decides whether it
represents an integer or a string, and defines the
EvalVar
rule for the variable.
Finally, the eval-exp
strategy is now defined in
terms of the parameterized eval-exp
strategy from
module run/til-eval-exp
using the EvalVar
rule as strategy for evaluating
variables. In addition, the VarUndefined
strategy is
provided to catch variables that are used before being assigned
to.
Figure 6.2. file: til/run/til-eval-var.str
module til-eval-var imports til-eval-exp strategies eval-declaration = (?Declaration(x) <+ ?DeclarationTyped(x, t)) ; rules( EvalVar+x :- Var(x) ) eval-assign = Assign(?x, eval-exp => val) ; rules(EvalVar.x : Var(x) -> val) EvalRead : FunCall("read", []) -> String(x) where stdin-stream; read-text-line => x eval-write = ?ProcCall("write", [<eval-exp>]) ; ?String(<try(un-double-quote); try(unescape)>) ; <fprint>(<stdout-stream>, [<id>]) VarUndefined = ?Var(<id>) ; fatal-err(|<concat-strings>["variable ", <id>, " used before being defined"]) eval-exp = eval-exp(EvalVar <+ VarUndefined <+ EvalRead)
What remains is the interpretation of control-flow statements. A
block statements simply entails the execution of each statement
in the block in order. Any variables declared within the block
are local, and shadow variables with the same name in outer
blocks. For this purpose a dynamic rule scope {| EvalVar :
... |}
is used to restrict the scope of
EvalVar
rules to the block. The statements in a
block are evaluated by map
ping the
eval-stat
strategy over the list of statements.
For the execution of the if-then-else
construct,
first the condition is evaluated. The EvalIf
rule
then evaluates the construct, reducing it to one of its
branches. The resulting block statement is then evaluated by
eval-stat
.
The while
statement is evaluated by transforming the
loop to an if-then-else
statement, with a
Goto
at the end of the body. The dynamic rule
EvalGoto
maps the goto statement for the new label
to the if-then-else
statement.
Figure 6.3. file: til/run/til-eval-stats.str
module til-eval-stats imports til-eval-var signature constructors Goto : String -> Stat strategies eval-block = ?Block(<eval-stats>) eval-stats = {| EvalVar : map(eval-stat) |} eval-stat = //debug(!"eval-stat: "); ( eval-assign <+ eval-write <+ eval-declaration <+ eval-block <+ eval-if <+ eval-while <+ EvalGoto //) eval-if = IfElse(eval-exp, id, id) ; EvalIf ; eval-stat eval-while = ?While(e, st*) ; where(new => label) ; where(<conc>(st*, [Goto(label)]) => st2*) ; rules( EvalGoto : Goto(label) -> <eval-stat>IfElse(e, st2*, []) ) ; <eval-stat> Goto(label)
To complete the interpreter we define an interpretation strategy
for the Program
constructor, and a main strategy
that takes care of I/O.
The program is compiled in the usual way, taking into account the
include paths for the ../sig
and ../sim
directories that contain the TIL signature and evaluation rules,
respectively.
Figure 6.4. file: til/run/til-run.str
module til-run imports libstrategolib til-eval-stats strategies io-til-run = io-wrap(eval-program) eval-program = ?Program(<eval-stats>) ; <exit> 0
Figure 6.5. file: til/run/maak
#! /bin/sh -e # compile til-simplify strc -i til-run.str -I ../sig -I ../sim -la stratego-lib -m io-til-run
Now it is time to try out the interpreter at our test1.til program that computes the factorial of its input. Note that the interpreter operates on a parsed and simplified version of the program; not directly on the text file.
Figure 6.6. file: til/xmpl/run-test1
# run test1.til after parsing and simplification echo "10" | ../run/til-run -i test1.sim.atil > test1.run
Table of Contents
This chapter is work in progress. Not all parts have been finished yet. The latest revision of this manual may contain more material. Refer to the online version.
This chapter shows examples of data-flow transformations
Figure 7.1. file: til/opt/til-opt-lib.str
module til-opt-lib imports liblib TIL til-eval strategies contains(|x) = // should be in stratego-lib oncetd(?x) elem-of(|xs) = // should be in stratego-lib ?x; where(<fetch(?x)> xs) get-var-names = collect(?Var(<id>)) get-var-dependencies = collect(?Var(<!(<id>,<id>)>)) EvalExp = EvalAdd <+ EvalMul <+ EvalSub <+ EvalDiv <+ EvalMod <+ EvalLt <+ EvalGt <+ EvalEqu <+ EvalNeq is-var = ?Var(_) is-value = ?Int(_) <+ ?String(_) is-pure-exp = ?Var(_) <+ ?Int(_) <+ ?String(_) <+ is-bin-op#([is-pure-exp, is-pure-exp]) is-bin-op = ?"Or" <+ ?"And" <+ ?"Neq" <+ ?"Equ" <+ ?"Gt" <+ ?"Lt" <+ ?"Sub" <+ ?"Add" <+ ?"Mod" <+ ?"Div" <+ ?"Mul"
Figure 7.2. file: til/xmpl/propconst-test2
sglri -p ../syn/TIL.tbl -i propconst-test2.til |\ ../sim/til-simplify |\ ../opt/til-propconst |\ ast2text -p ../pp/TIL-pretty.pp -o propconst-test2.txt
Table 7.1. files: til/xmpl/propconst-test2.til, til/xmpl/propconst-test2.txt
before | after |
---|---|
var x; var y; var z; var a; var b; z := readint(); x := 1; // constant value y := 2; // not constant x := 3; // override a := x + 4; // compute constant y := x + z; // not constant if y then z := 8; x := z - 5; // constant else x := a - 4; // constant z := a + z; // z is not constant end b := a + z; // a still constant, z not z := a + x; writeint(b + z); | var x : int; var y : int; var z : int; var a : int; var b : int; z := string2int(read()); x := 1; y := 2; x := 3; a := 7; y := 3 + z; if y then z := 8; x := 3; else x := 3; z := 7 + z; end b := 7 + z; z := 10; write(int2string(b + 10)); |
Figure 7.3. file: til/opt/til-propconst.str
module til-propconst imports til-opt-lib strategies io-til-propconst = io-wrap(propconst) propconst = PropConst <+ propconst-declaration <+ propconst-assign <+ propconst-block <+ propconst-if <+ propconst-while <+ all(propconst); try(EvalExp) propconst-block = Block({| PropConst : map(propconst) |}) propconst-declaration = (?Declaration(x) <+ ?DeclarationTyped(x, t)) ; rules( PropConst+x :- Var(x) ) propconst-assign = Assign(?x, propconst => e) ; if <is-value> e then rules( PropConst.x : Var(x) -> e ) else rules( PropConst.x :- Var(x) ) end propconst-if = IfElse(propconst, id, id) ; (EvalIf; propconst <+ IfElse(id, propconst, id) /PropConst\ IfElse(id,id,propconst)) propconst-while = ?While(_, _) ; (/PropConst\* While(propconst, propconst))
Figure 7.4. file: til/xmpl/copyprop-test2
sglri -p ../syn/TIL.tbl -i copyprop-test2.til |\ ../sim/til-simplify |\ ../opt/til-copyprop |\ ast2text -p ../pp/TIL-pretty.pp -o copyprop-test2.txt
Table 7.2. files: til/xmpl/copyprop-test2.til, til/xmpl/copyprop-test2.txt
before | after |
---|---|
var x : int; var y : int; x := string2int(read()); y := x; y := y + 1; x := y; write(int2string(x)); | var x : int; var y : int; x := string2int(read()); y := x; y := x + 1; x := y; write(int2string(y)); |
Figure 7.5. file: til/opt/til-copyprop.str
module til-copyprop imports til-opt-lib strategies io-til-copyprop = io-wrap(copyprop) copyprop = CopyProp <+ copyprop-declaration <+ copyprop-assign <+ copyprop-block <+ copyprop-if <+ copyprop-while <+ all(copyprop) copyprop-block = Block({| CopyProp : map(copyprop) |}) copyprop-declaration = (?Declaration(x) <+ ?DeclarationTyped(x, t)) ; new-CopyProp(|x, x) copyprop-assign = Assign(?x, copyprop => e) ; undefine-CopyProp(|x) ; where( innermost-scope-CopyProp => z ) ; if <?Var(y)> e then rules( CopyProp.z : Var(x) -> Var(y) depends on [(x,x),(y,y)] ) end copyprop-if = IfElse(copyprop, id, id) ; (IfElse(id, copyprop, id) /CopyProp\ IfElse(id,id,copyprop)) copyprop-while = ?While(_, _) ; (/CopyProp\* While(copyprop, copyprop)) innermost-scope-CopyProp = get-var-names => vars ; innermost-scope-CopyProp(elem-of(|vars))
Figure 7.6. file: til/opt/til-copyprop-rev.str
module til-copyprop-rev imports til-opt-lib strategies io-til-copyprop-rev = io-wrap(copyprop-rev) copyprop-rev = CopyPropRev <+ copyprop-rev-declaration <+ copyprop-rev-assign <+ copyprop-rev-block <+ copyprop-rev-if <+ copyprop-rev-while <+ all(copyprop-rev) copyprop-rev-block = Block({| CopyPropRev : map(copyprop-rev) |}) copyprop-rev-declaration = (?Declaration(x) <+ ?DeclarationTyped(x, t)) ; new-CopyPropRev(|x, x) copyprop-rev-assign = Assign(?x, copyprop-rev => e) ; undefine-CopyPropRev(|x) ; where( innermost-scope-CopyPropRev => z ) ; if <?Var(y)> e then rules( CopyPropRev.z : Var(y) -> Var(x) depends on [(x,x),(y,y)] ) end copyprop-rev-if = IfElse(copyprop-rev, id, id) ; (IfElse(id, copyprop-rev, id) /CopyPropRev\ IfElse(id,id,copyprop-rev)) copyprop-rev-while = ?While(_, _) ; (/CopyPropRev\* While(copyprop-rev, copyprop-rev)) innermost-scope-CopyPropRev = get-var-names => vars ; innermost-scope-CopyPropRev(elem-of(|vars))
Figure 7.7. file: til/xmpl/cse-test2
sglri -p ../syn/TIL.tbl -i cse-test2.til |\ ../sim/til-simplify |\ ../opt/til-cse |\ ast2text -p ../pp/TIL-pretty.pp -o cse-test2.txt
Table 7.3. files: til/xmpl/cse-test2.til, til/xmpl/cse-test2.txt
before | after |
---|---|
var a; var b; var x; a := readint(); b := readint(); x := a + b; writeint(a + b); a := 23; x := a + b; writeint(a + b); | var a : int; var b : int; var x : int; a := string2int(read()); b := string2int(read()); x := a + b; write(int2string(x)); a := 23; x := a + b; write(int2string(x)); |
Figure 7.8. file: til/opt/til-cse.str
module til-cse imports til-opt-lib strategies io-til-cse = io-wrap(cse) cse = CSE <+ cse-declaration <+ cse-assign <+ cse-if <+ cse-block <+ cse-while <+ all(cse) cse-block = Block({| CSE : map(cse) |}) cse-declaration = (?Declaration(x) <+ ?DeclarationTyped(x, t)) ; new-CSE(|x, x) cse-assign = Assign(?x, cse => e) ; undefine-CSE(|x) ; if <not(contains(|Var(x)))> e then where( innermost-scope-CSE => z ) ; where( get-var-dependencies => deps ) ; rules( CSE.z : e -> Var(x) depends on deps ) end cse-if = IfElse(cse, id, id) ; (IfElse(id, cse, id) /CSE\ IfElse(id,id,cse)) cse-while = ?While(_, _) ; (/CSE\* While(cse, cse)) innermost-scope-CSE = get-var-names => vars ; innermost-scope-CSE(elem-of(|vars))
Figure 7.9. file: til/opt/til-forward-subst.str
module til-forward-subst imports til-opt-lib strategies io-til-forward-subst = io-wrap(forward-subst) forward-subst = ForwardSubst <+ forward-subst-declaration <+ forward-subst-assign <+ forward-subst-if <+ forward-subst-block <+ forward-subst-while <+ all(forward-subst) forward-subst-block = Block({| ForwardSubst : map(forward-subst) |}) forward-subst-declaration = (?Declaration(x) <+ ?DeclarationTyped(x, t)) ; new-ForwardSubst(|x, x) forward-subst-assign = Assign(?x, forward-subst => e) ; undefine-ForwardSubst(|x) ; if <not(is-var); is-pure-exp; not(contains(|Var(x)))> e then where( innermost-scope-ForwardSubst => z ) ; where( get-var-dependencies => deps ) ; rules( ForwardSubst.z : Var(x) -> e depends on deps ) end forward-subst-if = IfElse(forward-subst, id, id) ; (IfElse(id, forward-subst, id) /ForwardSubst\ IfElse(id,id,forward-subst)) forward-subst-while = ?While(_, _) ; (/ForwardSubst\* While(forward-subst, forward-subst)) innermost-scope-ForwardSubst = get-var-names => vars ; innermost-scope-ForwardSubst(elem-of(|vars))
Figure 7.10. file: til/xmpl/dce-test2
sglri -p ../syn/TIL.tbl -i dce-test2.til |\ ../sim/til-simplify |\ ../opt/til-dce |\ ast2text -p ../pp/TIL-pretty.pp -o dce-test2.txt
Table 7.4. files: til/xmpl/dce-test2.til, til/xmpl/dce-test2.txt
before | after |
---|---|
var x; var y; var z; var a; var b; z := readint(); x := 1; y := 2; x := 3; a := 7; y := 3 + z; if y then z := 8; x := 3; else x := 3; z := 7 + z; end b := 7 + z; z := 10; writeint(b + 10); | var y : int; var z : int; var b : int; z := string2int(read()); y := 3 + z; if y then z := 8; else z := 7 + z; end b := 7 + z; write(int2string(b + 10)); |
Figure 7.11. file: til/opt/til-dce.str
module til-dce imports TIL til-eval liblib rules ElimDecl : [Declaration(x) | st*] -> st* where <not(VarUsed)> Var(x) ElimDecl : [DeclarationTyped(x, t) | st*] -> st* where <not(VarUsed)> Var(x) ElimAssign : Assign(x, e) -> Block([]) where <not(VarNeeded)> Var(x) ElimIf : IfElse(e, [], []) -> Block([]) strategies io-til-dce = io-wrap(dce-program) dce-stat = ElimAssign <+ dce-assign <+ dce-proccall <+ dce-if; try(ElimIf) <+ dce-block <+ dce-while dce-program = Program(dce-stats) dce-block = Block(dce-stats) dce-stats = dce-stats-decl <+ dce-stats-other <+ [] dce-stats-decl = (?[Declaration(x) | _] <+ ?[DeclarationTyped(x, t) | _]) ; {| VarNeeded, VarUsed : rules( VarNeeded+x :- Var(x) VarUsed+x :- Var(x) ) ; [id | dce-stats] ; try(ElimDecl) |} dce-stats-other = [not(?Declaration(_) <+ ?DeclarationTyped(_, _)) | dce-stats] ; [dce-stat | id] ; try(?[Block([]) | <id>]) dce-assign = ?Assign(x, _) ; rules( VarNeeded.x :- Var(x) ) ; Assign(id, declare-var-needed) dce-proccall = ProcCall(id, map(declare-var-needed)) declare-var-needed = alltd({x : ?Var(x) ; rules( VarNeeded.x : Var(x) VarUsed.x : Var(x) ) }) dce-if = ?IfElse(_,_,_) ; (IfElse(id,dce-stats,id) /VarNeeded,VarUsed\ IfElse(id,id,dce-stats)) ; IfElse(declare-var-needed, id, id) dce-while = ?While(_, _) ; (\VarNeeded,VarUsed/* While(declare-var-needed, dce-stats))
Figure 7.12. file: til/opt/maak
#! /bin/sh -e INCL="-I ../sig -I ../sim" # compile forward-subst strc ${INCL} -i til-forward-subst.str -la stratego-lib -m io-til-forward-subst -O 2 # compile dce strc ${INCL} -i til-dce.str -la stratego-lib -m io-til-dce -O 2 # compile cse strc ${INCL} -i til-cse.str -la stratego-lib -m io-til-cse # compile copyprop strc -i til-copyprop.str ${INCL} -la stratego-lib -m io-til-copyprop # compile copyprop-rev strc -i til-copyprop-rev.str ${INCL} -la stratego-lib -m io-til-copyprop-rev # compile propconst strc -i til-propconst.str ${INCL} -la stratego-lib -m io-til-propconst
This chapter is work in progress. Not all parts have been finished yet. The latest revision of this manual may contain more material. Refer to the online version.
Sometimes it is useful to extend a language in order to ease transformations.
Figure 9.1. file: til/til-eblock/TIL-eblocks.sdf
module TIL-eblocks imports TIL exports context-free syntax "begin" Stat* "return" Exp ";" Stat* "end" -> Exp {cons("EBlock")}
Figure 9.2. file: til/til-eblock/til-eblock-desugar.str
module til-eblock-desugar imports liblib TIL-eblocks strategies io-til-eblock-desugar = io-wrap(til-eblock-desugar) til-eblock-desugar = bottomup(try(FunCallToEBlock)) ; innermost( EBlockEBlock <+ NoStatements <+ MovePostStatementsInFront <+ HoistEBlockFromFunCall <+ HoistEBlockFromBinOp <+ HoistEBlockFromAssign <+ HoistEBlockFromProcCall // <+ HoistEBlockFromWrite <+ HoistEBlockFromIf <+ HoistEBlockFromWhile <+ HoistEBlockFromFor ) ; bottomup(try(flatten-block)) rules FunCallToEBlock : FunCall(f, e*) -> EBlock([Declaration(x), Assign(x, FunCall(f, e*))], Var(x), []) where new => x NoStatements : EBlock([], e, []) -> e MovePostStatementsInFront : EBlock(st1*, e, st2*@[_|_]) -> EBlock([Declaration(x), st1*, Assign(x, e), st2*], Var(x), []) where new => x EBlockEBlock : EBlock(st1*, EBlock(st2*, e, st3*), st4*) -> EBlock([st1*, st2*], e, [st3*, st4*]) HoistEBlockFromFunCall : FunCall(f, e1*) -> EBlock(st1*, FunCall(f, e2*), []) where <collect-eblocks> e1* => (st1*, e2*) HoistEBlockFromBinOp : op#([e1, e2]) -> EBlock(st1*, op#([e1', e2']), []) where <is-bin-op> op ; <collect-eblocks> [e1, e2] => (st1*, [e1', e2']) is-bin-op = ?"Or" <+ ?"And" <+ ?"Neq" <+ ?"Equ" <+ ?"Gt" <+ ?"Lt" <+ ?"Sub" <+ ?"Add" <+ ?"Mod" <+ ?"Div" <+ ?"Mul" HoistEBlockFromAssign : Assign(x, EBlock(st1*, e, st2*)) -> Block([st1*, st2*, Assign(x, e)]) HoistEBlockFromProcCall : ProcCall(f, e1*) -> Block([st*, ProcCall(f, e2*)]) where <collect-eblocks> e1* => (st*, e2*) // HoistEBlockFromWrite : // Write(EBlock(st1*, e, st2*)) -> Block([st1*, Write(e), st2*]) HoistEBlockFromIf : IfElse(EBlock(st1*, e, st2*), st3*, st4*) -> Block([st1*, IfElse(e, [st2*,st3*], [st2*,st4*])]) HoistEBlockFromIf : IfThen(EBlock(st1*, e, st2*), st3*) -> Block([st1*, IfElse(e, [st2*,st3*], st2*)]) HoistEBlockFromWhile : While(EBlock(st1*, e, st2*), st3*) -> Block([st1*, While(e, [st2*,st3*,st1*])]) HoistEBlockFromFor : For(x, e1, e2, st1*) -> Block([st2*, For(x, e1', e2', st1*)]) where <collect-eblocks> [e1, e2] => (st2*, [e1', e2']) collect-eblocks = fetch(?EBlock(_,_,_)) ; map({where(new => x); !([Declaration(x), Assign(x, <id>)], Var(x))}) ; unzip ; (concat, id) flatten-block = Block( foldr(![], \ (Block(st2*), st3*) -> [st2*, st3*] \ <+ \ (st, st*) -> [st | st*] \ ) ; partition(?Declaration(_)) ; conc )
desugar expresion blocks, then perform simplification, copy propagation, constant propagation, and finally dead code elimination.
Figure 9.3. file: til/xmpl/eblock-desugar-test2
#! /bin/sh -v sglri -p ../til-eblock/TIL-eblocks.tbl -i eblock-desugar-test2.til |\ ../sim/til-simplify |\ ../renaming/til-rename-vars |\ ../til-eblock/til-eblock-desugar -o eblock-desugar-test2.des.til ast2text -p ../pp/TIL-pretty.pp \ -i eblock-desugar-test2.des.til \ -o eblock-desugar-test2.txt ../opt/til-copyprop-rev -i eblock-desugar-test2.des.til |\ ../opt/til-forward-subst |\ ../opt/til-copyprop |\ ../opt/til-propconst |\ ../opt/til-dce |\ ast2text -p ../pp/TIL-pretty.pp -o eblock-desugar-test2.cp.txt
Table 9.1. files: til/xmpl/eblock-desugar-test2.til, til/xmpl/eblock-desugar-test2.txt
before | after |
---|---|
write( begin var x; x := read(); return x + begin x := readint(); return x; x := x * 2; end + 1; x := x + 1; write(x); end ); | begin var k_0; var j_0; var a_0; var h_0; var f_0; var g_0; var e_0; var c_0; var d_0; var b_0; var i_0; var x0 : int; a_0 := read(); x0 := a_0; f_0 := x0; b_0 := read(); d_0 := b_0; c_0 := string2int(d_0); x0 := c_0; e_0 := x0; x0 := x0 * 2; g_0 := e_0; h_0 := f_0 + g_0; i_0 := 1; j_0 := h_0 + i_0; x0 := x0 + 1; write(x0); k_0 := j_0; write(k_0); end |
Table 9.2. files: til/xmpl/eblock-desugar-test2.txt, til/xmpl/eblock-desugar-test2.cp.txt
before | after |
---|---|
begin var k_0; var j_0; var a_0; var h_0; var f_0; var g_0; var e_0; var c_0; var d_0; var b_0; var i_0; var x0 : int; a_0 := read(); x0 := a_0; f_0 := x0; b_0 := read(); d_0 := b_0; c_0 := string2int(d_0); x0 := c_0; e_0 := x0; x0 := x0 * 2; g_0 := e_0; h_0 := f_0 + g_0; i_0 := 1; j_0 := h_0 + i_0; x0 := x0 + 1; write(x0); k_0 := j_0; write(k_0); end | begin var a_0; var c_0; var b_0; a_0 := read(); b_0 := read(); c_0 := string2int(b_0); write(c_0 * 2 + 1); write(a_0 + c_0 + 1); end |