Chapter 2. Syntax Definition and Pretty-Printing

Table of Contents

2.1. TIL: a Tiny Imperative Language
2.2. Syntax Definition
2.2.1. Modules
2.2.2. Parse Table Generation
2.2.3. Parsing Programs
2.3. Term Format
2.3.1. Regular Tree Grammars
2.3.2. Signatures
2.3.3. Format Checking
2.4. Pretty-Printing
2.4.1. Pretty-Print Table
2.4.2. Applying Pretty-Print Tables
2.4.3. Adapting the Pretty-Print Table
2.4.4. Restoring Parentheses
2.5. A Complete Pipeline

2.1. TIL: a Tiny Imperative Language

This chapter shows how to define a syntax definition in SDF and how to derive a number of artifacts from such a definition, i.e., a parser, a pretty-printer, and a Stratego signature. The chapter also introduces TIL, a Tiny Imperative Language, which is used in many of the examples.

TIL is a tiny imperative language designed for the demonstration of language and transformation definition formalisms. The language has two data-types, integers and strings. A TIL program consists of a list of statements, which can be variable declarations, assignments, I/O instructions, and control-flow statements. Statements can use expressions to compute values from integer and string constants, and the values of variables.

The following example gives an impression of the language.

Figure 2.1. file: til/xmpl/test1.til

// TIL program computing the factorial

var n;
n := readint();
var x;
var fact;
fact := 1;
for x := 1 to n do
  fact := x * fact;
end
write("factorial of ");
writeint(n);
write(" is ");
writeint(fact);
write("\n");

2.2. Syntax Definition

This section shows a modular definition of the syntax of TIL, the generation of a parse table from that definition, and its use for parsing program texts into abstract syntax trees.

2.2.1. Modules

The following files define the syntax of TIL in the syntax definition formalism SDF. SDF is a modular formalism that supports the definition of lexical and context-free syntax. Modularity entails that a syntax definition can be composed from multiple modules, and that such modules can be reused in multiple syntax definitions. Also, syntax definitions of different languages can be combined.

2.2.1.1. Layout

Module TIL-layout defines the syntax of LAYOUT, the symbols that occur between every two context-free non-terminals. This is the way to introduce the whitespace and comments for a language. The definition uses character classes to indicate classes of characters. For instance, whitespace consists of spaces, tabs, newlines, and carriage returns. Likewise, comments consist of two slashes followed by zero or more characters which are not newlines or carriage returns. The follow restriction prevents ambiguities in parsing layout.

Figure 2.2. file: til/syn/TIL-layout.sdf

module TIL-layout
exports 
  lexical syntax
    [\ \t\n\r]    -> LAYOUT
    "//" ~[\n\r]* -> LAYOUT
  context-free restrictions
    LAYOUT? -/- [\ \t\n\r]

2.2.1.2. Literals

Module TIL-literals defines the syntax of identifiers, integer constants, and string literals. Note again the use of character classes to indicate collections of characters and regular expressions to indicate lists of zero or more (*), or one or more (+) elements. String characters (StrChar) are any characters other than double quote, backslash, or newline, and escaped versions of these characters.

Figure 2.3. file: til/syn/TIL-literals.sdf

module TIL-literals
exports 
  sorts Id Int String StrChar
  lexical syntax
    [A-Za-z][A-Za-z0-9]* -> Id
    [0-9]+               -> Int
    "\"" StrChar* "\""   -> String
    ~[\"\\\n]            -> StrChar
    [\\][\"\\n]          -> StrChar
  lexical restrictions
    Id  -/- [A-Za-z0-9]
    Int -/- [0-9]    

2.2.1.3. Expressions

Module TIL-expressions defines the syntax of expressions that compute a value. Basic expressions are identifiers (variables), integer constants, and string literals. More complex expressions are obtained by the arithmetic and relational operators. The constructor attributes of productions (e.g., cons("Add")) indicates the constructor to be used in the construction of an abstract syntax tree from a parse tree. Ambiguities of and between productions are solved by means of associativity and priority declarations.

Figure 2.4. file: til/syn/TIL-expressions.sdf

module TIL-expressions
imports TIL-literals
exports 
  sorts Exp
  context-free syntax
    "true"       -> Exp {cons("True")}
    "false"      -> Exp {cons("False")}
    Id           -> Exp {cons("Var")}
    Int          -> Exp {cons("Int")}
    String       -> Exp {cons("String")}
    Exp "*" Exp  -> Exp {cons("Mul"),assoc}
    Exp "/" Exp  -> Exp {cons("Div"),assoc}
    Exp "%" Exp  -> Exp {cons("Mod"),non-assoc}
    Exp "+" Exp  -> Exp {cons("Add"),assoc}
    Exp "-" Exp  -> Exp {cons("Sub"),left}
    Exp "<" Exp  -> Exp {cons("Lt"),non-assoc}
    Exp ">" Exp  -> Exp {cons("Gt"),non-assoc}
    Exp "<=" Exp -> Exp {cons("Leq"),non-assoc}
    Exp ">=" Exp -> Exp {cons("Geq"),non-assoc}
    Exp "=" Exp  -> Exp {cons("Equ"),non-assoc}
    Exp "!=" Exp -> Exp {cons("Neq"),non-assoc}
    Exp "&" Exp  -> Exp {cons("And"),assoc}
    Exp "|" Exp  -> Exp {cons("Or"),assoc}
    "(" Exp ")"  -> Exp {bracket}
  context-free priorities
    {left: 
       Exp "*" Exp -> Exp 
       Exp "/" Exp -> Exp }
  > {left: 
       Exp "+" Exp -> Exp 
       Exp "-" Exp -> Exp }
  > {non-assoc: 
       Exp "<" Exp  -> Exp 
       Exp ">" Exp  -> Exp
       Exp "<=" Exp -> Exp 
       Exp ">=" Exp -> Exp
       Exp "="  Exp -> Exp 
       Exp "!=" Exp -> Exp }
  > Exp "&" Exp  -> Exp
  > Exp "|" Exp  -> Exp

2.2.1.4. Statements

Module TIL-statements defines the syntax of statements, i.e., instructions to be executed. The assignment statement assigns the value of the right-hand side expression to the variable in the left-hand side. The read and write statements read a value from standard input or write a value to standard output. The control-flow constructs use lists of statements.

Figure 2.5. file: til/syn/TIL-statements.sdf

module TIL-statements
imports TIL-expressions TIL-types
exports 
  sorts Stat
  context-free syntax
    "var" Id ";"			                    -> Stat {cons("Declaration")}
    "var" Id ":" Type ";"			            -> Stat {cons("DeclarationTyped")}
    Id ":=" Exp ";"                             -> Stat {cons("Assign")}
    "begin" Stat* "end"                         -> Stat {cons("Block")}
    "if" Exp "then" Stat* "end"                 -> Stat {cons("IfThen")}
    "if" Exp "then" Stat* "else" Stat* "end"    -> Stat {cons("IfElse")}
    "while" Exp "do" Stat* "end"                -> Stat {cons("While")}
    "for" Id ":=" Exp "to" Exp "do" Stat* "end" -> Stat {cons("For")}


2.2.1.5. Function and Procedure Calls

Module TIL-calls defines the syntax of function and procedure calls. Even though TIL does not have function definitions, it is useful to be able to call primitive functions and procedures.

Figure 2.6. file: til/syn/TIL-calls.sdf

module TIL-calls
imports TIL-expressions TIL-statements
exports
  context-free syntax
    Id "(" {Exp ","}* ")"     -> Exp  {cons("FunCall")}
    Id "(" {Exp ","}* ")" ";" -> Stat {cons("ProcCall")}

2.2.1.6. Programs

Module TIL (Figure 2.7) defines the syntax of the complete language by importing the modules above, and defining a Program as a list of statements. In addition, the module introduces a start-symbol. This is the sort that a parser will start parsing with. There may be multiple start symbols.

Figure 2.7. file: til/syn/TIL.sdf

module TIL
imports TIL-layout TIL-literals TIL-expressions TIL-statements TIL-calls
exports 
  sorts Program
  context-free syntax
    Stat* -> Program {cons("Program")}

  context-free start-symbols Program

2.2.2. Parse Table Generation

The following maak script first collects the modules of the syntax definition and then generates a parse table. The result of pack-sdf is a `definition' file (as opposed to a single module file, as we saw above) that contains all modules imported by the main module, TIL.sdf in this case (Figure 2.9). The parser generator sdf2table creates a parse table from a syntax definition. Note the use of the -m flag to indicate the main module from which to generate the table. The parse table (TIL.tbl) is a file in ATerm format, that is interpreted by the sglri tool to parse text files.

Figure 2.8. file: til/syn/maak

#! /bin/sh -e

# collect modules 
pack-sdf -i TIL.sdf -o TIL.def

# generate parse table
sdf2table -i TIL.def -o TIL.tbl -m TIL

Figure 2.9. file: til/syn/TIL.def

definition
module TIL-calls
imports TIL-expressions TIL-statements
exports
  context-free syntax
    Id "(" {Exp ","}* ")"     -> Exp  {cons("FunCall")}
    Id "(" {Exp ","}* ")" ";" -> Stat {cons("ProcCall")}

module TIL-types
imports TIL-literals
exports 
 sorts Type
  context-free syntax
    Id -> Type {cons("TypeName")}

module TIL-statements
imports TIL-expressions TIL-types
exports 
  sorts Stat
  context-free syntax
    "var" Id ";"			                    -> Stat {cons("Declaration")}
    "var" Id ":" Type ";"			            -> Stat {cons("DeclarationTyped")}
    Id ":=" Exp ";"                             -> Stat {cons("Assign")}
    "begin" Stat* "end"                         -> Stat {cons("Block")}
    "if" Exp "then" Stat* "end"                 -> Stat {cons("IfThen")}
    "if" Exp "then" Stat* "else" Stat* "end"    -> Stat {cons("IfElse")}
    "while" Exp "do" Stat* "end"                -> Stat {cons("While")}
    "for" Id ":=" Exp "to" Exp "do" Stat* "end" -> Stat {cons("For")}

module TIL-expressions
imports TIL-literals
exports 
  sorts Exp
  context-free syntax
    "true"       -> Exp {cons("True")}
    "false"      -> Exp {cons("False")}
    Id           -> Exp {cons("Var")}
    Int          -> Exp {cons("Int")}
    String       -> Exp {cons("String")}
    Exp "*" Exp  -> Exp {cons("Mul"),assoc}
    Exp "/" Exp  -> Exp {cons("Div"),assoc}
    Exp "%" Exp  -> Exp {cons("Mod"),non-assoc}
    Exp "+" Exp  -> Exp {cons("Add"),assoc}
    Exp "-" Exp  -> Exp {cons("Sub"),left}
    Exp "<" Exp  -> Exp {cons("Lt"),non-assoc}
    Exp ">" Exp  -> Exp {cons("Gt"),non-assoc}
    Exp "<=" Exp -> Exp {cons("Leq"),non-assoc}
    Exp ">=" Exp -> Exp {cons("Geq"),non-assoc}
    Exp "=" Exp  -> Exp {cons("Equ"),non-assoc}
    Exp "!=" Exp -> Exp {cons("Neq"),non-assoc}
    Exp "&" Exp  -> Exp {cons("And"),assoc}
    Exp "|" Exp  -> Exp {cons("Or"),assoc}
    "(" Exp ")"  -> Exp {bracket}
  context-free priorities
    {left: 
       Exp "*" Exp -> Exp 
       Exp "/" Exp -> Exp }
  > {left: 
       Exp "+" Exp -> Exp 
       Exp "-" Exp -> Exp }
  > {non-assoc: 
       Exp "<" Exp  -> Exp 
       Exp ">" Exp  -> Exp
       Exp "<=" Exp -> Exp 
       Exp ">=" Exp -> Exp
       Exp "="  Exp -> Exp 
       Exp "!=" Exp -> Exp }
  > Exp "&" Exp  -> Exp
  > Exp "|" Exp  -> Exp


module TIL-literals
exports 
  sorts Id Int String StrChar
  lexical syntax
    [A-Za-z][A-Za-z0-9]* -> Id
    [0-9]+               -> Int
    "\"" StrChar* "\""   -> String
    ~[\"\\\n]            -> StrChar
    [\\][\"\\n]          -> StrChar
  lexical restrictions
    Id  -/- [A-Za-z0-9]
    Int -/- [0-9]

module TIL-layout
exports 
  lexical syntax
    [\ \t\n\r]    -> LAYOUT
    "//" ~[\n\r]* -> LAYOUT
  context-free restrictions
    LAYOUT? -/- [\ \t\n\r]

module TIL
imports TIL-layout TIL-literals TIL-expressions TIL-statements TIL-calls
exports 
  sorts Program
  context-free syntax
    Stat* -> Program {cons("Program")}

  context-free start-symbols Program

2.2.3. Parsing Programs

The sglri tool parses a text file given a parse table generated by sdf2table. The result is an abstract syntax term in the ATerm format. In order to inspect this term it is useful to `pretty-print' it using the pp-aterm tool. Compare the resulting term with the program in Figure 2.1.

Figure 2.10. file: til/xmpl/parse-test

# parse input file
sglri -p ../syn/TIL.tbl -i test1.til -o test1.ast

# `pretty-print' abstract syntax term
pp-aterm -i test1.ast -o test1.atil

Figure 2.11. file: til/xmpl/test1.atil

Program(
  [ Declaration("n")
  , Assign("n", FunCall("readint", []))
  , Declaration("x")
  , Declaration("fact")
  , Assign("fact", Int("1"))
  , For(
      "x"
    , Int("1")
    , Var("n")
    , [Assign("fact", Mul(Var("x"), Var("fact")))]
    )
  , ProcCall("write", [String("\"factorial of \"")])
  , ProcCall("writeint", [Var("n")])
  , ProcCall("write", [String("\" is \"")])
  , ProcCall("writeint", [Var("fact")])
  , ProcCall("write", [String("\"\\n\"")])
  ]
)

2.3. Term Format

The result of parsing a text is an abstract syntax tree represented by means of an ATerm. When produced by a parser, one can be sure that an ATerm has the right format, since it was derived directly from a parse tree. However, terms can also be produced by other components, e.g., be the result of a transformation. In those cases it may worthwhile to check that the term is well-formed according to some schema. In Stratego/XT tree schemas are described by Regular Tree Grammars (RTGs). Stratego signatures are used within Stratego programs to verify some aspects of Stratego programs. RTGs and signatures can be derived automatically from a syntax definition in SDF.

The following maak scripts derives from a syntax definition first an RTG, and from the RTG a Stratego signature.

Figure 2.12. file: til/sig/maak

#! /bin/sh -e

# generate regular tree grammar from syntax definition
sdf2rtg -i ../syn/TIL.def -o TIL.rtg -m TIL

# generate Stratego signature from regular tree grammar
rtg2sig -i TIL.rtg -o TIL.str 

2.3.1. Regular Tree Grammars

A regular tree grammar defines well-formedness rules for a set of trees (or terms). The following regular tree grammar has been generated from the syntax definition of TIL and precisely describes the abstract syntax trees of TIL programs.

Figure 2.13. file: til/sig/TIL.rtg

regular tree grammar
  start Program
  productions
    ListStarOfStat0    -> ListPlusOfStat0
    ListStarOfStat0    -> <nil>()
    ListStarOfStat0    -> <conc>(ListStarOfStat0,ListStarOfStat0)
    ListPlusOfStat0    -> <conc>(ListStarOfStat0,ListPlusOfStat0)
    ListPlusOfStat0    -> <conc>(ListPlusOfStat0,ListStarOfStat0)
    ListPlusOfStat0    -> <conc>(ListPlusOfStat0,ListPlusOfStat0)
    ListPlusOfStat0    -> <cons>(Stat,ListStarOfStat0)
    ListStarOfExp0     -> ListPlusOfExp0
    ListStarOfExp0     -> <nil>()
    ListStarOfExp0     -> <conc>(ListStarOfExp0,ListStarOfExp0)
    ListPlusOfExp0     -> <conc>(ListStarOfExp0,ListPlusOfExp0)
    ListPlusOfExp0     -> <conc>(ListPlusOfExp0,ListStarOfExp0)
    ListPlusOfExp0     -> <conc>(ListPlusOfExp0,ListPlusOfExp0)
    ListPlusOfExp0     -> <cons>(Exp,ListStarOfExp0)
    ListStarOfStrChar0 -> <string>
    ListPlusOfStrChar0 -> <string>
    Program            -> Program(ListStarOfStat0)
    Stat               -> ProcCall(Id,ListStarOfExp0)
    Exp                -> FunCall(Id,ListStarOfExp0)
    Stat               -> For(Id,Exp,Exp,ListStarOfStat0)
    Stat               -> While(Exp,ListStarOfStat0)
    Stat               -> IfElse(Exp,ListStarOfStat0,ListStarOfStat0)
    Stat               -> IfThen(Exp,ListStarOfStat0)
    Stat               -> Block(ListStarOfStat0)
    Stat               -> Assign(Id,Exp)
    Stat               -> DeclarationTyped(Id,Type)
    Stat               -> Declaration(Id)
    Type               -> TypeName(Id)
    Exp                -> Or(Exp,Exp)
    Exp                -> And(Exp,Exp)
    Exp                -> Neq(Exp,Exp)
    Exp                -> Equ(Exp,Exp)
    Exp                -> Geq(Exp,Exp)
    Exp                -> Leq(Exp,Exp)
    Exp                -> Gt(Exp,Exp)
    Exp                -> Lt(Exp,Exp)
    Exp                -> Sub(Exp,Exp)
    Exp                -> Add(Exp,Exp)
    Exp                -> Mod(Exp,Exp)
    Exp                -> Div(Exp,Exp)
    Exp                -> Mul(Exp,Exp)
    Exp                -> String(String)
    Exp                -> Int(Int)
    Exp                -> Var(Id)
    Exp                -> False()
    Exp                -> True()
    StrChar            -> <string>
    String             -> <string>
    Int                -> <string>
    Id                 -> <string>

2.3.2. Signatures

Algebraic signatures are similar to regular tree grammars. Stratego requires signatures for the declaration of term constructors to be used in transformation programs. The following Stratego signature is generated from the regular tree grammar above, and thus describes the constructors of TIL abstract syntax trees.

Figure 2.14. file: til/sig/TIL.str

module TIL

signature
  constructors
    Program          : List(Stat) -> Program
    ProcCall         : Id * List(Exp) -> Stat
    For              : Id * Exp * Exp * List(Stat) -> Stat
    While            : Exp * List(Stat) -> Stat
    IfElse           : Exp * List(Stat) * List(Stat) -> Stat
    IfThen           : Exp * List(Stat) -> Stat
    Block            : List(Stat) -> Stat
    Assign           : Id * Exp -> Stat
    DeclarationTyped : Id * Type -> Stat
    Declaration      : Id -> Stat
    TypeName         : Id -> Type
    FunCall          : Id * List(Exp) -> Exp
    Or               : Exp * Exp -> Exp
    And              : Exp * Exp -> Exp
    Neq              : Exp * Exp -> Exp
    Equ              : Exp * Exp -> Exp
    Geq              : Exp * Exp -> Exp
    Leq              : Exp * Exp -> Exp
    Gt               : Exp * Exp -> Exp
    Lt               : Exp * Exp -> Exp
    Sub              : Exp * Exp -> Exp
    Add              : Exp * Exp -> Exp
    Mod              : Exp * Exp -> Exp
    Div              : Exp * Exp -> Exp
    Mul              : Exp * Exp -> Exp
    String           : String -> Exp
    Int              : Int -> Exp
    Var              : Id -> Exp
    False            : Exp
    True             : Exp
                     : String -> String
                     : String -> Int
                     : String -> Id


signature
  constructors
    Some : a -> Option(a)
    None : Option(a)


signature
  constructors
    Cons : a * List(a) -> List(a)
    Nil  : List(a)
    Conc : List(a) * List(a) -> List(a)

2.3.3. Format Checking

The well-formedness of a term with respect to a regular tree grammar can be checked using the format-check tool. When well-formed the tool prints the type of the term. If not it indicates, which subterms cannot be typed. The following examples illustrate checking of a well-formed and non well-formed term.

Figure 2.15. file: til/xmpl/fc-test1

format-check --rtg ../sig/TIL.rtg -i test1.atil -s Program 2> test1.atil.fc

Figure 2.16. file: til/xmpl/test1.atil.fc


Figure 2.17. file: til/xmpl/test1-wrong.atil

Program(
  [ Declaration("fact")
  , Assig("fact", Int("1"))
  , Assign("fact", Mul("x", Var("fact")))
  ]
)

Figure 2.18. file: til/xmpl/fc-test2

format-check --rtg ../sig/TIL.rtg -i test1-wrong.atil -s Program 2> test1-wrong.atil.fc

Figure 2.19. file: til/xmpl/test1-wrong.atil.fc

error: cannot type Assig("fact",Int("1"))
    inferred types of subterms: 
    typed "fact" as String, Int, Id, <string>
    typed Int("1") as Exp
error: cannot type Mul("x",Var("fact"))
    inferred types of subterms: 
    typed "x" as String, Int, Id, <string>
    typed Var("fact") as Exp

2.4. Pretty-Printing

After transforming a program we need to turn it into a program text again. Unparsing is the reverse of parsing and turns an abstract syntax tree into a text. Pretty-printing is unparsing with an attempt at creating a readable program text. There is a direct correspondence between abstract syntax trees and the program text from which they were produced defined by the syntax definition. This can be used in the other direction as well to generate a pretty-printer from a syntax definition. The following maak script generates from the TIL syntax definition a pretty-print table TIL.pp using ppgen and a parenthesizer using sdf2parenthesize. The latter is a Stratego program, which is compiled using the Stratego compiler strc.

Figure 2.20. file: til/pp/maak

#! /bin/sh -e

# generate pretty-print table from syntax definition
ppgen -i ../syn/TIL.def -o TIL.pp

# generate program to insert parentheses
sdf2parenthesize -i ../syn/TIL.def -o til-parenthesize.str -m TIL --lang TIL

# compile the generated program
strc -i til-parenthesize.str -I ../sig -m io-til-parenthesize -la stratego-lib

2.4.1. Pretty-Print Table

A pretty-print table defines a mapping from abstract syntax trees to expressions in the Box Formatting Language. The following pretty-print table is generated from the syntax definition for TIL. It is a default table and only ensures that the program text resulting from pretty-printing is syntactically correct, not that it is actually pretty.

Figure 2.21. file: til/pp/TIL.pp

[
   FunCall                  -- _1 KW["("] _2 KW[")"],
   FunCall.2:iter-star-sep  -- _1 KW[","],
   ProcCall                 -- _1 KW["("] _2 KW[")"] KW[";"],
   ProcCall.2:iter-star-sep -- _1 KW[","],
   TypeName                 -- _1,
   Declaration              -- KW["var"] _1 KW[";"],
   DeclarationTyped         -- KW["var"] _1 KW[":"] _2 KW[";"],
   Assign                   -- _1 KW[":="] _2 KW[";"],
   Block                    -- V  [V vs=2 [KW["begin"] _1] KW["end"]],
   Block.1:iter-star        -- _1,
   IfThen                   -- KW["if"] _1 KW["then"] _2 KW["end"],
   IfThen.2:iter-star       -- _1,
   IfElse                   -- KW["if"] _1 KW["then"] _2 KW["else"] _3 KW["end"],
   IfElse.2:iter-star       -- _1,
   IfElse.3:iter-star       -- _1,
   While                    -- KW["while"] _1 KW["do"] _2 KW["end"],
   While.2:iter-star        -- _1,
   For                      -- KW["for"] _1 KW[":="] _2 KW["to"] _3 KW["do"] _4 KW["end"],
   For.4:iter-star          -- _1,
   True                     -- KW["true"],
   False                    -- KW["false"],
   Var                      -- _1,
   Int                      -- _1,
   String                   -- _1,
   Mul                      -- _1 KW["*"] _2,
   Div                      -- _1 KW["/"] _2,
   Mod                      -- _1 KW["%"] _2,
   Add                      -- _1 KW["+"] _2,
   Sub                      -- _1 KW["-"] _2,
   Lt                       -- _1 KW["<"] _2,
   Gt                       -- _1 KW[">"] _2,
   Leq                      -- _1 KW["<="] _2,
   Geq                      -- _1 KW[">="] _2,
   Equ                      -- _1 KW["="] _2,
   Neq                      -- _1 KW["!="] _2,
   And                      -- _1 KW["&"] _2,
   Or                       -- _1 KW["|"] _2,
   Program                  -- _1,
   Program.1:iter-star      -- _1
]

2.4.2. Applying Pretty-Print Tables

A pretty-print table is applied using the ast2text tool, which translates an abstract syntax term to text given a pretty-print table. (In fact, ast2text is a composition of ast2abox and abox2text.) The pp-test1 script shows how to use a pretty-print table. The result of unparsing the AST for the test1.til program is clearly not very pretty, but it is a syntactically correct TIL program. This is tested by the subsequent commands in the script, which parse the test1.txt1 program and compare the resulting AST to the original AST. It turns out that the two programs have the exact same AST.

Figure 2.22. file: til/xmpl/pp-test1

# abstract syntax term to text
ast2text -p ../pp/TIL.pp -i test1.atil -o test1.txt1

# test unparsed code
sglri -p ../syn/TIL.tbl -i test1.txt1 | pp-aterm -o test1.atxt1

# test if the terms are the same
diff test1.atil test1.atxt1

Figure 2.23. file: til/xmpl/test1.txt1

var n ; n := readint (  ) ; var x ; var fact ; fact := 1 ; for x := 1 to n do
fact := x * fact ; end write ( "factorial of " ) ; writeint ( n ) ; write (
" is " ) ; writeint ( fact ) ; write ( "\n" ) ;

2.4.3. Adapting the Pretty-Print Table

To get more readable programs after pretty-printing we adapt the generated pretty-print table using constructs from the Box Formatting Language to indicate how each construct should be formatted.

Figure 2.24. file: til/pp/TIL-pretty.pp

[
   FunCall                  -- H hs=0 [_1 KW["("] H  [_2] KW[")"]],
   FunCall.2:iter-star-sep  -- H hs=0 [_1 KW[","]],
   ProcCall                 -- H hs=0 [_1 KW["("] H  [_2] KW[")"] KW[";"]],
   ProcCall.2:iter-star-sep -- H hs=0 [_1 KW[","]],
   Declaration              -- H hs=0 [H  [KW["var"] _1] KW[";"]],
   DeclarationTyped         -- H hs=0 [H  [KW["var"] _1 KW[":"] _2] KW[";"]],
   Assign                   -- H hs=0 [H  [_1 KW[":="] _2] KW[";"]],
   Read                     -- H hs=0 [H  [KW["read"] _1] KW[";"]],
   Write                    -- H hs=0 [H  [KW["write"] _1] KW[";"]],
   Block                    -- V  [V is=2 [KW["begin"] V  [_1]] KW["end"]],
   Block.1:iter-star        -- _1,
   IfThen                   -- V  [V is=2 [H  [KW["if"] _1 KW["then"]] V  [_2]] KW["end"]],
   IfThen.2:iter-star       -- _1,
   IfElse                   -- V  [V is=2 [H  [KW["if"] _1 KW["then"]] V  [_2]] V is=2 [KW["else"] V  [_3]] KW["end"]],
   IfElse.2:iter-star       -- _1,
   IfElse.3:iter-star       -- _1,
   While                    -- V  [V is=2 [H  [KW["while"] _1 KW["do"]] _2] KW["end"]],
   While.2:iter-star        -- _1,
   For                      -- V  [V is=2 [H  [KW["for"] _1 KW[":="] _2 KW["to"] _3 KW["do"]] _4] KW["end"]],
   For.4:iter-star          -- _1,
   True                     -- KW["true"],
   False                    -- KW["false"],
   Var                      -- _1,
   Int                      -- _1,
   String                   -- _1,
   Mul                      -- H hs=1 [_1 KW["*"] _2],
   Div                      -- H hs=1 [_1 KW["/"] _2],
   Mod                      -- H hs=1 [_1 KW["%"] _2],
   Add                      -- H hs=1 [_1 KW["+"] _2],
   Sub                      -- H hs=1 [_1 KW["-"] _2],
   Lt                       -- H hs=1 [_1 KW["<"] _2],
   Gt                       -- H hs=1 [_1 KW[">"] _2],
   Leq                      -- H hs=1 [_1 KW["<="] _2],
   Geq                      -- H hs=1 [_1 KW[">="] _2],
   Equ                      -- H hs=1 [_1 KW["="] _2],
   Neq                      -- H hs=1 [_1 KW["!="] _2],
   And                      -- H hs=1 [_1 KW["&"] _2],
   Or                       -- H hs=1 [_1 KW["|"] _2],
   Program                  -- V  [_1],
   Program.1:iter-star      -- _1,
   Parenthetical            -- H hs=0 ["(" _1 ")"],
   TypeName                 -- _1
]

Using the same procedure as before, but using the adapted pretty-print table (see til/xmpl/pp-test2) we not get a program that is much closer to the original.

Figure 2.25. file: til/xmpl/test1.txt2

var n;
n := readint();
var x;
var fact;
fact := 1;
for x := 1 to n do
  fact := x * fact;
end
write("factorial of ");
writeint(n);
write(" is ");
writeint(fact);
write("\n");

2.4.4. Restoring Parentheses

The til-parenthesize program generated by sdf2parenthesize is a simple rewrite system with rules that add a Parenthetical constructor around subtrees that have a priority or associativity conflict with their parent node. The implementation in til/pp/til-parenthesize.str is not of interest here. The program is used before applying the pretty-print table, as illustrated with the following example. The test2.txt1 program is produced without introducing parentheses and clearly has a different meaning than the original program. The test2.txt2 program has parentheses in the right places.

Figure 2.26. file: til/xmpl/pp-test3

# pretty-print without parentheses
ast2text -p ../pp/TIL-pretty.pp -i test2.atil -o test2.txt1

# add parentheses to abstract syntax term
../pp/til-parenthesize -i test2.atil | pp-aterm -o test2.atil.par

# pretty-print without parentheses
ast2text -p ../pp/TIL-pretty.pp -i test2.atil.par -o test2.txt2

Figure 2.27. file: til/xmpl/test2.atil

Program(
  [ IfElse(
      Equ(
        Mul(Var("x"), Add(Var("y"), Int("10")))
      , Int("34")
      )
    , [ Assign(
          "x"
        , Div(Var("x"), Sub(Var("y"), Int("1")))
        )
      ]
    , [ Assign(
          "x"
        , Mul(Var("x"), Div(Var("y"), Int("3")))
        )
      ]
    )
  ]
)

Figure 2.28. file: til/xmpl/test2.txt1

if x * y + 10 = 34 then
  x := x / y - 1;
else
  x := x * y / 3;
end

Figure 2.29. file: til/xmpl/test2.atil.par

Program(
  [ IfElse(
      Equ(
        Mul(
          Var("x")
        , Parenthetical(Add(Var("y"), Int("10")))
        )
      , Int("34")
      )
    , [ Assign(
          "x"
        , Div(
            Var("x")
          , Parenthetical(Sub(Var("y"), Int("1")))
          )
        )
      ]
    , [ Assign(
          "x"
        , Mul(
            Var("x")
          , Parenthetical(Div(Var("y"), Int("3")))
          )
        )
      ]
    )
  ]
)

Figure 2.30. file: til/xmpl/test2.txt2

if x * (y + 10) = 34 then
  x := x / (y - 1);
else
  x := x * (y / 3);
end

2.5. A Complete Pipeline

Given an SDF syntax definition we can produce a parser, a format checker, a parenthesizer, and a pretty-printer. Together these tools form the basic ingredients of a transformation pipeline. The composition in til-process shows how the tools are combined. This composition is not so useful as it only turns a program in a pretty-printed version of itself. In the next chapters we'll see how this pipeline can be extended with transformation tools.

Figure 2.31. file: til/xmpl/til-process

# a parse, check, parenthesize, pretty-print pipeline

sglri -p ../syn/TIL.tbl -i $1           |\
format-check --rtg ../sig/TIL.rtg       |\
../pp/til-parenthesize                  |\
 ast2text -p ../pp/TIL-pretty.pp -o $2