Compute the type of an expression that includes both arithmetic and boolean operators. (The type is INT, BOOL, or ERROR.)
CFG Translation rules
=== =================
exp -> exp + exp if ((exp2.trans == INT) and (exp3.trans == INT)
then exp1.trans = INT
else exp1.trans = ERROR
exp -> exp and exp if ((exp2.trans == BOOL) and (exp3.trans == BOOL)
then exp1.trans = BOOL
else exp1.trans = ERROR
exp -> exp == exp if ((exp2.trans == exp3.trans) and (exp2.trans != ERROR))
then exp1.trans = BOOL
else exp1.trans = ERROR
exp -> true exp.trans = BOOL
exp -> false exp.trans = BOOL
exp -> int exp.trans = INT
exp -> ( exp ) exp1.trans = exp2.trans
Input
=====
( 2 + 2 ) == 4
Annotated Parse Tree
====================
exp (BOOL)
/|\
/ | \
(INT) exp == exp (INT)
/|\ |
/ | \ 4
/ | \
( exp )
(INT)
/|\
/ | \
(INT) exp + exp (INT)
| |
2 2
Building an Abstract-Syntax Tree
So far, our example syntax-directed translations have produced simple values (an int or a type) as the translation of an input. In practice however, we want the parser to build an abstract-syntax tree as the translation of an input program. But that is not really so different from what we've seen so far; we just need to use tree-building operations in the translation rules instead of, e.g., arithmetic operations.
The AST vs the Parse Tree
First, let's consider how an abstract-syntax tree (AST) differs from a parse tree. An AST can be thought of as a condensed form of the parse tree:
* Operators appear at internal nodes instead of at leaves.
* "Chains" of single productions are collapsed.
* Lists are "flattened".
* Syntactic details (e.g., parentheses, commas, semi-colons) are omitted.
In general, the AST is a better structure for later stages of the compiler because it omits details having to do with the source language, and just contains information about the essential structure of the program.
Below is an example of the parse tree and the AST for the expression 3 * (4 + 2) (using the usual arithmetic-expression grammar that reflects the precedences and associativities of the operators). Note that the parentheses are not needed in the AST because the structure of the AST defines how the subexpressions are grouped.
Parse Tree Abstract Syntax Tree
========== ====================
exp *
| / \
term 3 +
/|\ / \
term * factor 4 2
| /|\
| / | \
factor ( exp )
| /|\
3 exp + term
| |
term factor
| |
factor 2
|
4
For constructs other than expressions, the compiler writer has some choices when defining the AST -- but remember that lists (e.g., lists of declarations lists of statements, lists of parameters) should be flattened, that operators (e.g., "assign", "while", "if") go at internal nodes, not at leaves, and that syntactic details are omitted.
For example:
Input Parse Tree
===== ==========
{ ____ methodBody _________
x = 0; / / \ \
while (x<10) { { declList stmtList }
x = x+1; | / \
} epsilon stmtList stmt___
y = x*2; / \ / | \ \
} stmtList stmt ID = exp ;
/ | \ (y) / | \
stmtList stmt ... exp * term
| / | | \ | |
epsilon ID = exp ; term factor
(x) | | |
INTLITERAL factor INT
(0) | (2)
ID
(x)
AST
===
methodBody
/ \
declList stmtList
/ | \
assign while assign
/ \ ... / \
ID INT ID *
(x) (0) (y) / \
ID INT
(x) (2)
Note that in the AST there is just one stmtList node, with a list of three children (the list of statements has been "flattened"). Also, the "operators" for the statements (assign and while) have been "moved up" to internal nodes (instead of appearing as tokens at the leaves). And finally, syntactic details (curly braces and semi-colons) have been omitted.