Table of Contents
This chapter is work in progress. Not all parts have been finished yet. The latest revision of this manual may contain more material. Refer to the online version.
Format-check analyzes whether the input ATerm conforms to the format that is specified in the RTG (Regular Tree Grammar).
Format-check verifies that the input ATerm is part of the language defined in the RTG. If this is not the case, then the ATerm contains format errors. Format-check can operate in three modes: plain, visualize and XHTML.
The plain mode is used if the other modes are not enabled. In the plain mode format errors are reported and no result is written the the output (stdout or a file). Hence, if format-check is included in a pipeline, then the following tool will probably fail. If the input term is correct, then it is written to the output.
The visualize mode is enabled with the --vis
option. In visualize mode format errors are reported and in a
pretty-printed ATerm will be written to the output. All
innermost parts of the ATerm that cause format errors are
printed in red, if your terminal supports control characters
for colors. If you want to browse through the ATerm with less,
then you should use the -r
flag.
The XHTML mode is enabled with the --xhtml
option. In XHTML mode format errors are reported and an report
in XHTML will be written to the output. This report shows the
parts of the ATerm that are not formatted correctly. Also,
moving with your mouse over the nodes of ATerm, will show the
non-terminals that have be inferred by format-check (do not use
IE6. Firefox or Mozilla is recommended).
Format-check reports all innermost format errors. That is, only the deepest format errors are reported. A format error is reported by showing the ATerm that is not in the correct format, and the inferred types of the children of the ATerm. In XHTML and visualize mode a format error of term in a list is presented by a red comma and term. This means that a type has been inferred for the term itself, but that it is not expected at this point in the list. If only the term is red, then no type could be inferred for the term itself.
In all modes format-check succeeds (exit code 0) if the ATerm contains no format errors. If the term does contain format errors, then format-check fails (exit code 1).
Consider the RTG generated in the example of sdf2rtg:
regular tree grammar start Exp productions Exp -> Minus(Exp,Exp) Exp -> Plus(Exp,Exp) Exp -> Mod(Exp,Exp) Exp -> Div(Exp,Exp) Exp -> Mul(Exp,Exp) Exp -> Int(IntConst) Exp -> Var(Id) IntConst -> <string> Id -> <string>
The ATerm
Plus(Var("a"), Var("b"))
is part of the language defined by this RTG. Therefore, format-check succeeds:
$ format-check --rtg Exp.rtg -i exp1.trm Plus(Var("a"),Var("b")) $
Note that format-check outputs the input term in this case. In this way format-check can be used in a pipeline of tools. On the other hand, the ATerm
Plus(Var("a"), Var(1))
is not part of the language defined by this RTG. Therefore, the invocation of format-check fails. Format-check also reports which term caused the failure:
$ format-check --rtg Exp.rtg -i exp2.trm error: cannot type Var(1) inferred types of subterms: typed 1 as <int> $
In large ATerms it might be difficult to find the incorrect
subterm. To help with this, format-check supports the
--vis
argument. If this argument is used, then the
result will pretty-printed (in the same way as pp-aterm) and the incorrect parts
will be printed in red.
For example, consider this term:
Plus(Mul(Int(1), Var("a")), Minus(Var("b"), Div(1, Var("c"))))
format-check will visualize the errors in this ATerm:
The XHTML mode shows the errors in red as well. Moreover, if you move your moves over the terms, then you'll see the inferred types of the term.
$ format-check --rtg Exp.rtg -i exp3.trm --xhtml -o foo.html
You can now view the resulting file in your browser. You need a decent browser (Firefox or Mozilla. no IE).
The abstract syntax of a programming language or data format can
be described by means of an algebraic
signature. A signature declares for each constructor
its arity m
, the sorts of its arguments S1 *
... * Sm
, and the sort of the resulting term
S0
by means of a constructor declaration c:S1 *
... * Sm -> S0
. A term can be validated against a
signature by a format checker.
Signatures can be derived automatically from syntax definitions.
For each production A1...An -> A0 {cons(c)}
in a
syntax definition, the corresponding constructor declaration is
c:S1 * ... * Sm -> S0
, where the Si
are
the sorts corresponding to the symbols Aj
after
leaving out literals and layout sorts. Figure 8.1 shows the signatures of statement and
expression constructors for the example language from this
chapter. The modules have been derived automatically from the
syntax definitions in ??? and
Figure 6.1.
Figure 8.1. Signature for statement and expression constructors, automatically derived from a syntax definition.
module Statements signature constructors FunDec : Id * List(Id) * Exp -> FunDec FunDecs : List(FunDec) -> Dec VarDec : Id * Exp -> Dec Call : Var * List(Exp) -> Exp Let : List(Dec) * List(Exp) -> Exp While : Exp * Exp -> Exp If : Exp * Exp * Exp -> Exp Seq : List(Exp) -> Exp Assign : Var * Exp -> Exp Gt : Exp * Exp -> Exp Eq : Exp * Exp -> Exp Minus : Exp * Exp -> Exp Plus : Exp * Exp -> Exp Times : Exp * Exp -> Exp Uminus : Exp -> Exp Int : IntConst -> Exp : Var -> Exp Var : Id -> Var : String -> IntConst : String -> Id signature constructors Some : a -> Option(a) None : Option(a) signature constructors Cons : a * List(a) -> List(a) Nil : List(a) Conc : List(a) * List(a) -> List(a)
sdf2rtg -i m.def -o m.rtg -m M
.
sdf2rtg
derives from an SDF syntax definition
m.def
a regular tree grammar for the module
M
and all modules it imports.
rtg2sig -i m.def -o m.rtg
.
rtg2sig
generates from a regular tree grammar a
Stratego signature.