
Compiler Construction (CS F363)

Prof.R.Gururaj
BITS Pilani CS&IS Dept.
Hyderabad Campus
Syntax Directed Translator-2
(Ch.2 of T1)
Introduction

Analysis phase:
breaks up the source program into pieces and produces an
internal representation called intermediate code.
Synthesis phase: translates the intermediate code into the target
program.

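The analysis phase is organized around the syntax of the language being compiled.

Syntax: the proper form of programs.

Semantics: the meaning of programs.

Context-free grammars (CFGs), often written in BNF (Backus-Naur Form), are used to specify the syntax of languages.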

Prof.R.Gururaj CSF363 Compiler Construction BITS Pilani, Hyderabad Campus


Program Code and TAC



Syntax Directed Translation

Besides specifying the syntax of a program, a grammar can be used to help guide the translation of the program.

The translation technique that is based on the grammar is known as Syntax Directed Translation.



Two forms of intermediate code

Syntax tree: represents the hierarchical syntactic structure of the source program. The parser produces it.

Three-address code: a linear sequence of simple instructions. The intermediate code generator produces it.

Syntax Definition

CFGs are used to specify the syntax of a language.

Ex: the if…else statement in Java.

stmt → if ( expr ) stmt else stmt

Grammar Definition:
1. A set of terminals (tokens)
2. A set of nonterminals, NTs (syntactic variables)
3. A set of productions
4. A designated start NT



Ex: Expressions consisting of digits separated by + and - operators.

list → list + digit
list → list - digit
list → digit
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Derivation: the process of deriving a valid string of terminals from the start NT, by repeatedly replacing an NT with the body of one of its productions.

Parsing: taking a string of terminals and figuring out how it can be derived from the start NT of the grammar.
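
As a rough illustration of the two definitions above, the following Python sketch stores the productions of the list/digit grammar as data and replays a leftmost derivation of 9-5+2. The names GRAMMAR, leftmost_step, and derive are made up for this example and do not come from the slides.

```python
# Productions of the list/digit grammar, keyed by nonterminal (NT).
GRAMMAR = {
    "list": [["list", "+", "digit"], ["list", "-", "digit"], ["digit"]],
    "digit": [[str(d)] for d in range(10)],
}

def leftmost_step(form, body):
    """Replace the leftmost NT in the sentential form with `body`."""
    for i, sym in enumerate(form):
        if sym in GRAMMAR:
            if body not in GRAMMAR[sym]:
                raise ValueError("%r is not a production body of %s" % (body, sym))
            return form[:i] + body + form[i + 1:]
    raise ValueError("no NT left to expand")

def derive(choices, start="list"):
    """Apply the chosen production bodies in order; return every sentential form."""
    form = [start]
    steps = [form]
    for body in choices:
        form = leftmost_step(form, body)
        steps.append(form)
    return steps

# Leftmost derivation of 9-5+2.
steps = derive([
    ["list", "+", "digit"],  # list => list + digit
    ["list", "-", "digit"],  #      => list - digit + digit
    ["digit"],               #      => digit - digit + digit
    ["9"], ["5"], ["2"],     # replace the digits from left to right
])
for form in steps:
    print(" ".join(form))
```

Parsing is the reverse problem: given only the final string, recover a sequence of production choices like the one passed to derive.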



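Parse Tree

• Root: labeled by the start NT
• Leaf: labeled by a terminal (or ε)
• Interior node: labeled by an NT
• Yield of the tree: the string formed by the leaves, read from left to right

Ex: the parse tree for 9-5+2 according to the previous grammar for expressions.

Ambiguity and ambiguous grammars: a grammar is ambiguous if some string of terminals has more than one parse tree.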
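
The sketch below, in Python, builds the parse tree for 9-5+2 under the list/digit grammar and computes its yield. To illustrate ambiguity it also builds two trees under an assumed grammar string → string + string | string - string | digit, which is a common textbook example and is not taken from these slides. All helper names (node, digit, tree_yield, evaluate) are hypothetical.

```python
def node(label, *children):
    """A parse-tree node is (label, children); a leaf has no children."""
    return (label, list(children))

def digit(d, parent="digit"):
    return node(parent, node(d))

def tree_yield(t):
    """Concatenate the leaf labels from left to right."""
    label, children = t
    if not children:
        return label
    return "".join(tree_yield(c) for c in children)

def evaluate(t):
    """Evaluate the expression according to the grouping the tree imposes."""
    label, children = t
    if not children:
        return int(label)
    if len(children) == 1:
        return evaluate(children[0])
    left, op, right = children
    l, r = evaluate(left), evaluate(right)
    return l + r if op[0] == "+" else l - r

# Parse tree for 9-5+2 under list -> list + digit | list - digit | digit.
parse_tree = node("list",
                  node("list",
                       node("list", digit("9")),
                       node("-"),
                       digit("5")),
                  node("+"),
                  digit("2"))

# Two parse trees for the same string under the assumed ambiguous grammar.
s9, s5, s2 = (digit(d, "string") for d in "952")
left_grouping = node("string", node("string", s9, node("-"), s5), node("+"), s2)
right_grouping = node("string", s9, node("-"), node("string", s5, node("+"), s2))

print(tree_yield(parse_tree), evaluate(parse_tree))
print(evaluate(left_grouping), evaluate(right_grouping))
```

Both trees for the assumed grammar have the same yield but group the operators differently, (9-5)+2 versus 9-(5+2), which is why an ambiguous grammar is a problem for translation.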



Syntax Directed Translation


Syntax Directed Translation is done by attaching rules or program fragments to the productions of a grammar.

Ex: expr → expr1 + term

Translate expr1;
Translate term;
Handle +;
Attributes

An attribute is any quantity associated with a programming construct.
Ex: type, number of instructions, location of the first instruction, etc.

We extend the notion of attributes to the grammar symbols (terminals and NTs) that represent programming constructs.



SDT schemes

A translation scheme is a notation for attaching program fragments to the productions of a grammar.

The program fragments are executed when the production is used during syntax analysis.

The combined result of all these fragment executions, in the order induced by the syntax analysis, produces the translation.



Postfix notation

We look at an example that translates expressions from infix notation to postfix notation.

In postfix notation the operator follows its operands: 9-5+2 becomes 95-2+.



Syntax Directed Definition

A syntax-directed definition:

1. Associates a set of attributes with each grammar symbol.

2. Associates with each production a set of semantic rules for computing the values of the attributes of the symbols appearing in that production.

3. In the running example, we associate an attribute t with each NT symbol.

An attribute is said to be synthesized if its value at a parse-tree node is determined by the attribute values at the children of that node. Synthesized attributes can be evaluated during a single bottom-up traversal of the parse tree.

Here t holds a string: the postfix form of the expression generated by the NT.



SDT for Infix to Postfix translation

Grammar: expr → expr + term
         expr → expr - term
         expr → term
         term → 0 | 1 | … | 9

String: 9-5+2
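
The slide does not list the semantic rules, so the Python sketch below assumes the usual textbook ones: expr.t = expr1.t || term.t || op for the two operator productions, expr.t = term.t, and term.t = the digit itself. The tuple encoding of the parse tree and the function name attr_t are choices made for this example only.

```python
# Parse-tree encoding (hypothetical):
#   ("term", d)                  for term -> d
#   ("expr", term)               for expr -> term
#   ("binop", op, expr1, term)   for expr -> expr1 op term, op in {+, -}

def attr_t(node):
    """Compute the synthesized attribute t bottom-up (children before parent)."""
    kind = node[0]
    if kind == "term":        # term.t = the digit
        return node[1]
    if kind == "expr":        # expr.t = term.t
        return attr_t(node[1])
    _, op, expr1, term = node
    # expr.t = expr1.t || term.t || op
    return attr_t(expr1) + attr_t(term) + op

# Parse tree for 9-5+2.
tree = ("binop", "+",
        ("binop", "-", ("expr", ("term", "9")), ("term", "5")),
        ("term", "2"))

print(attr_t(tree))  # prints 95-2+
```

Each call returns only after the attribute values of the children are known, which matches the single bottom-up traversal described for synthesized attributes.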



Another SDT approach

In the previous syntax-directed definition, the string representing the translation of the NT at the head of each production is the concatenation of the translations of the NTs in the production body, in the same order as they appear in the body, with some additional strings (here, the operators) in between.

Such a translation can also be implemented by printing only the additional strings, in the order they appear in the definition. This needs no manipulation of strings. It only requires that the program fragments be executed in the right order.

We use a depth-first traversal of the tree.



procedure visit(Node N)
{
    for (each child C of N, from left to right)
    {
        visit(C);
    }
    evaluate semantic rules at node N;
}

Ex: infix to postfix using the second approach.
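
A minimal Python sketch of this second approach follows. It assumes that each print action is attached to the parse tree as an extra leaf whose label starts with "print:", which is an encoding invented for this example. The visit function mirrors the procedure above: children first, left to right, then the action at the node itself.

```python
def n(label, *children):
    """A tree node is (label, children)."""
    return (label, list(children))

def term(d):
    # term -> d { print(d) }
    return n("term", n(d), n("print:" + d))

def visit(node, emit):
    label, children = node
    for child in children:           # visit children from left to right
        visit(child, emit)
    if label.startswith("print:"):   # then run the action at this node, if any
        emit(label[len("print:"):])

# Parse tree for 9-5+2 with actions embedded:
#   expr -> expr + term { print('+') }
#   expr -> expr - term { print('-') }
#   expr -> term
tree = n("expr",
         n("expr",
           n("expr", term("9")),
           n("-"), term("5"), n("print:-")),
         n("+"), term("2"), n("print:+"))

out = []
visit(tree, out.append)
postfix = "".join(out)
print(postfix)  # prints 95-2+
```

No strings are concatenated at the interior nodes. The output is produced piece by piece, purely as a side effect of running the actions in depth-first order.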



Parsing

Parsing is the process of determining how a string of terminals can be generated by a grammar.

• Top-down parsing
• Bottom-up parsing

Software tools for generating parsers directly from grammars often use the bottom-up approach.



Top-down parsing

Recursive-descent parsing is a top-down method of syntax analysis in which a set of recursive procedures is used to process the input.

One procedure is associated with each NT of the grammar.

A simple form of recursive-descent parsing is predictive parsing, which relies on a look-ahead symbol.



Look-ahead approach

The look-ahead symbol unambiguously determines the flow of control through the procedure body.

The sequence of procedure calls during the analysis of an input string implicitly defines a parse tree for the input.

Predictive parsing relies on information about the first symbols that can be generated by a production body.

We use the notion of FIRST(α): the set of terminals that can appear as the first symbol of a string generated from α, where α is an NT or a production body.



The FIRST sets must be considered if there are two productions A → r and A → s.

Predictive parsing requires that FIRST(r) and FIRST(s) be disjoint.

The look-ahead symbol can then be used to decide which production to use: r if the look-ahead is in FIRST(r), s if it is in FIRST(s).
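
The following Python sketch shows a predictive recursive-descent translator from infix to postfix. The expr grammar used earlier is left-recursive, which a recursive-descent parser cannot handle directly, so the sketch assumes the usual transformed grammar expr → term rest, rest → + term rest | - term rest | ε, term → 0 | … | 9. That transformation, and the names Parser and infix_to_postfix, are assumptions of this example and are not shown in the slides.

```python
DIGITS = "0123456789"

class Parser:
    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.out = []

    @property
    def lookahead(self):
        """Current input symbol, or None at end of input."""
        return self.text[self.pos] if self.pos < len(self.text) else None

    def match(self, expected):
        if self.lookahead != expected:
            raise SyntaxError("expected %r at position %d" % (expected, self.pos))
        self.pos += 1

    def expr(self):                  # expr -> term rest
        self.term()
        self.rest()

    def rest(self):
        # FIRST(+ term rest) = {+} and FIRST(- term rest) = {-} are disjoint,
        # so the look-ahead alone selects the production.
        if self.lookahead in ("+", "-"):
            op = self.lookahead
            self.match(op)
            self.term()
            self.out.append(op)      # action: print the operator after its operands
            self.rest()
        # otherwise rest -> epsilon: consume nothing

    def term(self):                  # term -> digit { print(digit) }
        la = self.lookahead
        if la is not None and la in DIGITS:
            self.match(la)
            self.out.append(la)
        else:
            raise SyntaxError("digit expected at position %d" % self.pos)

def infix_to_postfix(text):
    parser = Parser(text)
    parser.expr()
    if parser.lookahead is not None:
        raise SyntaxError("unexpected input at position %d" % parser.pos)
    return "".join(parser.out)

print(infix_to_postfix("9-5+2"))  # prints 95-2+
```

There is one method per NT, and the chain of calls expr, term, rest, term, rest, … traces out the parse tree without ever building it.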



Syntax Tree

(Abstract) Syntax Tree (AST): a data structure that helps in designing a translator.
In an AST for an expression, each interior node represents an operator, and the children of the node represent the operands of that operator.

More generally, any programming construct can be handled by making up an operator for the construct and treating the syntactically meaningful components of that construct as its operands.

Syntax Tree: interior nodes represent programming constructs.
Parse Tree: interior nodes represent NTs.
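
As a small sketch of this idea in Python, the class below represents every AST node as an operator with a tuple of operands. The Node class, the show helper, and the made-up "while" operator are illustrative assumptions, not definitions from the slides.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Node:
    op: str                            # operator, or the leaf's name/constant
    operands: Tuple["Node", ...] = ()  # empty for leaves

def show(node):
    """Render the tree in prefix form, e.g. (+ (- 9 5) 2)."""
    if not node.operands:
        return node.op
    return "(" + " ".join([node.op] + [show(c) for c in node.operands]) + ")"

# AST for 9-5+2: interior nodes are operators, leaves are operands.
expr_ast = Node("+", (Node("-", (Node("9"), Node("5"))), Node("2")))

# A statement handled the same way: a made-up 'while' operator whose
# operands are the condition and the body.
cond = Node("<", (Node("i"), Node("n")))
body = Node("=", (Node("i"), Node("+", (Node("i"), Node("1")))))
while_ast = Node("while", (cond, body))

print(show(expr_ast))
print(show(while_ast))
```

Unlike the parse tree for 9-5+2, this tree has no nodes for list, expr, term, or digit. Only the operators and operands remain.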



Lexical Analysis

A sequence of input characters that comprises a single token is called a lexeme.

<id, “count”>

id.lexeme = “count”

Here “count” is the actual lexeme comprising this instance of the token id.



Process

Removal of white space.
Grouping of digits into integers.
Storing the strings (lexemes) in a string table.

A string table can be implemented as a hash table.

Reserved words are entered into the table initially, so that they can be told apart from identifiers.
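
The steps above can be sketched in Python as follows. The set of reserved words, the token names "num" and "id", and the function name tokenize are assumptions made for this example. A Python dict plays the role of the hash table.

```python
RESERVED = ("if", "else", "while")  # hypothetical reserved words
DIGITS = "0123456789"

def tokenize(text):
    """Return a list of (token name, attribute) pairs."""
    # String table, pre-loaded with the reserved words.
    words = {w: (w, w) for w in RESERVED}
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():                      # remove white space
            i += 1
        elif ch in DIGITS:                    # group digits into an integer
            j = i
            while j < len(text) and text[j] in DIGITS:
                j += 1
            tokens.append(("num", int(text[i:j])))
            i = j
        elif ch.isalpha() or ch == "_":       # identifier or reserved word
            j = i
            while j < len(text) and (text[j].isalnum() or text[j] == "_"):
                j += 1
            lexeme = text[i:j]
            if lexeme not in words:           # first occurrence: enter as id
                words[lexeme] = ("id", lexeme)
            tokens.append(words[lexeme])
            i = j
        else:                                 # any other character is its own token
            tokens.append((ch, ch))
            i += 1
    return tokens

print(tokenize("count = count + 31"))
```

Because the reserved words are already in the table when scanning starts, the lookup for "if" returns the reserved-word token and never creates an id entry for it.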



Symbol table

Symbol tables are data structures that hold information about source-program constructs.

The information is collected incrementally by the analysis phases and used by the synthesis phases to generate the target code.
The role of a symbol table is to pass information from declarations to uses.

Info stored: the identifier, its type, its position in storage, and other relevant information.

Symbol tables typically need to support multiple declarations of the same identifier within a program.
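
One common way to support multiple declarations of the same identifier is to keep one table per scope and chain each table to the table of the enclosing scope. The Python sketch below assumes that design. The class name Env, its methods, and the stored fields are illustrative choices.

```python
class Env:
    """Symbol table for one scope, linked to the enclosing scope's table."""

    def __init__(self, prev=None):
        self.table = {}
        self.prev = prev

    def put(self, name, info):
        """Record a declaration in the current scope."""
        self.table[name] = info

    def get(self, name):
        """Find the innermost declaration of `name`, or None if undeclared."""
        env = self
        while env is not None:
            if name in env.table:
                return env.table[name]
            env = env.prev
        return None

# { int x; int y; { bool x; ... x ... y ... } ... x ... }
outer = Env()
outer.put("x", {"type": "int", "offset": 0})
outer.put("y", {"type": "int", "offset": 4})
inner = Env(outer)
inner.put("x", {"type": "bool", "offset": 8})

print(inner.get("x")["type"])  # prints bool (inner declaration hides the outer one)
print(inner.get("y")["type"])  # prints int (found in the enclosing scope)
print(outer.get("x")["type"])  # prints int
```

A use of an identifier is resolved by get, which is how information flows from the declaration to the use.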



Intermediate Code

There are two kinds of intermediate representations:

1. Trees: parse trees and syntax trees

2. Linear representations: three-address code

Both are representations of the programming constructs in the source program.



Static Checking

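Syntactic checking: constraints on the form of the program that go beyond the grammar, e.g. that an identifier is declared at most once in a scope.

Type checking: ensures that operators are applied to operands of the expected types. It also covers coercion (implicit type conversion) and overloading (one operator symbol with different meanings depending on operand types).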



Three-address code

We walk the syntax tree to generate three-address code.

Format: x = y op z, where x, y, and z are names, constants, or compiler-generated temporaries, and op is an operator.
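
A minimal Python sketch of such a walk is shown below, for expressions only. It assumes syntax trees encoded as nested tuples (op, left, right) with strings at the leaves, and temporaries named t1, t2, …; both conventions, and the function name gen_tac, are made up for this example.

```python
import itertools

def gen_tac(ast):
    """Return (list of three-address instructions, name holding the result)."""
    code = []
    temps = itertools.count(1)

    def walk(node):
        if isinstance(node, str):    # leaf: a name or constant is its own address
            return node
        op, left, right = node
        y = walk(left)               # code for the operands is generated first
        z = walk(right)
        x = "t%d" % next(temps)      # fresh temporary for this node's value
        code.append("%s = %s %s %s" % (x, y, op, z))
        return x

    result = walk(ast)
    return code, result

# Syntax tree for a - b + c, i.e. (a - b) + c.
code, result = gen_tac(("+", ("-", "a", "b"), "c"))
for line in code:
    print(line)
```

Each interior node of the tree produces exactly one instruction of the form x = y op z, emitted after the instructions for its operands.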



Summary

• Grammar
• Parsing
• Three-address code
• Syntax tree
• SDT: rules, action body
• Lexical analyzer
• Symbol table
• Intermediate code
