VI Sem (CSE)
Course Outline
Compiler phases Lexical Analysis, Syntax Analysis,
Preliminaries Required
Basic knowledge of programming languages.
Basic knowledge of FSA and CFG.
programming assignments.
Textbook:
Lecture Outline
Language Processors
The Structure of a Compiler
Language Processors
A compiler
[Figure: source program -> Compiler -> target program; input -> Target Program -> output]
An interpreter
Much slower program execution than a compiled target program
Better error diagnostics than a compiler
[Figure: source program, input -> Interpreter -> output]
[Figure: Translator -> intermediate program; intermediate program, input -> Virtual Machine -> output]
[Figure: Preprocessor -> modified source program -> Compiler -> target assembly program -> Assembler -> relocatable machine code -> Linker/Loader -> target machine code; the Linker/Loader also takes library files and relocatable object files]
[Figure: the six steps from editing a program to executing it]
1. Edit (Disk)
2. Preprocess (Preprocessor, Disk)
3. Compile (Compiler, Disk)
4. Link (Linker, Disk)
5. Load (Loader, Disk -> Primary Memory)
6. Execute (CPU, Primary Memory)
[Figure: compilation flow]
Source (.c) file on disk (Format: text)
Compiler attempts to translate the program into machine code
Success: new object (*.obj) files (Format: binary)
Failure: list of errors; correct syntax errors in a revised source file and compile again
New object files and other object (*.obj) files -> Executable (*.exe, *.out) file (Format: binary)
Executable program in memory, Input data -> Results
Introduction to Compilers
[Figure: Source program -> Compiler -> Target Program; the Compiler also produces error messages. Label: Diverse & Varied]
Synthesis
Back end
Constructing the target program from the intermediate representation and
Classifications of Compilers
Compilers Viewed from Many Perspectives
Construction: Single Pass, Multiple Pass
Functional: Load & Go, Debugging, Optimizing
actions
[Figure: 1. Lexical Analyzer -> 2. Syntax Analyzer -> 3. Semantic Analyzer -> 4. Intermediate Code Generator -> 5. Code Optimizer -> 6. Code Generator -> Target Program; the Symbol-table Manager and the Error Handler are connected to every phase]
1, 2, 3 : Analysis - Our Focus
4, 5, 6 : Synthesis
Phases of A Compiler
[Figure: Source Program -> Lexical Analyzer -> Syntax Analyzer -> Semantic Analyzer -> Intermediate Code Generator -> Code Optimizer -> Code Generator -> Target Program]
The Model
The TWO Fundamental Parts:
Hierarchical Analysis:
Semantic Analysis:
Lexical Analyzer
Lexical Analyzer reads the source program character by
assignment operator
identifier
add operator
a number
constructs).
A (Deterministic) Finite State Automaton can be used in
the implementation of a lexical analyzer.
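As a minimal sketch of how a deterministic finite automaton can drive a lexical analyzer, the Python below simulates a small transition table and keeps the longest match. The token kinds (identifier, number, assignment operator, add operator) come from the slide; the state names and the helpers `classify` and `tokenize` are hypothetical and chosen only for illustration.

```python
# Hypothetical DFA-driven scanner; state and function names are illustrative.

def classify(ch):
    """Map a character to the input class used by the transition table."""
    if ch.isalpha():
        return "letter"
    if ch.isdigit():
        return "digit"
    return ch  # punctuation is its own input class

# TRANSITIONS[state][input class] -> next state
TRANSITIONS = {
    "start": {"letter": "ident", "digit": "num", ":": "colon", "+": "plus"},
    "ident": {"letter": "ident", "digit": "ident"},
    "num": {"digit": "num"},
    "colon": {"=": "assign"},
    "plus": {},
    "assign": {},
}

# Accepting states and the token kind each one recognizes.
ACCEPTING = {
    "ident": "identifier",
    "num": "number",
    "assign": "assignment operator",
    "plus": "add operator",
}

def tokenize(text):
    """Return (kind, lexeme) pairs using longest-match DFA simulation."""
    tokens = []
    i = 0
    while i < len(text):
        if text[i].isspace():
            i += 1
            continue
        state, j = "start", i
        last_accept = None  # (end index, token kind) of the longest match so far
        while j < len(text):
            nxt = TRANSITIONS[state].get(classify(text[j]))
            if nxt is None:
                break
            state, j = nxt, j + 1
            if state in ACCEPTING:
                last_accept = (j, ACCEPTING[state])
        if last_accept is None:
            raise ValueError(f"lexical error at position {i}: {text[i]!r}")
        end, kind = last_accept
        tokens.append((kind, text[i:end]))
        i = end
    return tokens
```

Calling `tokenize("newval := oldval + 12")` yields an identifier, an assignment operator, an identifier, an add operator, and a number, in that order. A table-driven design keeps the scanner loop fixed while new token kinds are added as rows in `TRANSITIONS`.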
grammar (CFG).
The rules in a CFG are mostly recursive.
A syntax analyzer checks whether a given program satisfies
the rules implied by a CFG.
If it does, the syntax analyzer creates a parse tree for the
given program.
What is a Grammar?
Grammar is a Set of Rules Which Govern the
statement is an
assignment statement, or while statement, or if statement, or ...
assignment statement is an
identifier := expression ;
expression is an
(expression), or expression + expression, or expression * expression, or number, or identifier, or ...
[Figure: parse tree of an assignment statement (identifier := expression) whose right-hand side is initial + rate * 60; initial and rate are identifiers and 60 is a number]
Syntax Analyzer
A Syntax Analyzer creates the syntactic structure
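[Figure: parse tree for newval := oldval + 12. The root assgstmt has the children identifier (newval), :=, and expression; that expression has the children expression (identifier oldval) and expression (number 12)]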
Parsing Techniques
Depending on how the parse tree is created, there are different
parsing techniques.
These parsing techniques are categorized into two groups:
Top-Down Parsing:
Construction of the parse tree starts at the root, and proceeds towards
the leaves.
Efficient top-down parsers can be easily constructed by hand.
Recursive Predictive Parsing, Non-Recursive Predictive Parsing (LL
Parsing).
Bottom-Up Parsing:
Construction of the parse tree starts at the leaves, and proceeds towards
the root.
Normally efficient bottom-up parsers are created with the help of some
software tools.
Bottom-up parsing is also known as shift-reduce parsing.
Operator-Precedence Parsing: simple, restrictive, easy to implement.
LR Parsing: a much more general form of shift-reduce parsing (LR, SLR, LALR).
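To make top-down parsing concrete, here is a small hand-written recursive predictive parser in Python. It is only a sketch under stated assumptions: the ambiguous expression rules from the grammar slide are rewritten into the layered form expr -> term { + term }, term -> factor { * factor }, factor -> ( expr ) | number | identifier, the trailing `;` is omitted, and the parser returns a compressed tree of nested tuples rather than a full parse tree. The names `Parser` and `tokenize` are hypothetical.

```python
import re

# One token per match: a number, an identifier, or an operator/parenthesis.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(:=|[+*()]))")

def tokenize(text):
    """Split the input into (kind, lexeme) pairs, ending with an eof marker."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        number, ident, op = m.groups()
        if number is not None:
            tokens.append(("number", number))
        elif ident is not None:
            tokens.append(("identifier", ident))
        else:
            tokens.append((op, op))
        pos = m.end()
    tokens.append(("eof", ""))
    return tokens

class Parser:
    """One method per nonterminal; one token of lookahead picks the rule."""

    def __init__(self, text):
        self.tokens = tokenize(text)
        self.i = 0

    def peek(self):
        return self.tokens[self.i][0]

    def expect(self, kind):
        tok_kind, lexeme = self.tokens[self.i]
        if tok_kind != kind:
            raise SyntaxError(f"expected {kind}, found {tok_kind}")
        self.i += 1
        return lexeme

    def assgstmt(self):
        # assgstmt -> identifier := expr
        name = self.expect("identifier")
        self.expect(":=")
        tree = (":=", name, self.expr())
        self.expect("eof")
        return tree

    def expr(self):
        # expr -> term { + term }
        node = self.term()
        while self.peek() == "+":
            self.expect("+")
            node = ("+", node, self.term())
        return node

    def term(self):
        # term -> factor { * factor }
        node = self.factor()
        while self.peek() == "*":
            self.expect("*")
            node = ("*", node, self.factor())
        return node

    def factor(self):
        # factor -> ( expr ) | number | identifier
        if self.peek() == "(":
            self.expect("(")
            node = self.expr()
            self.expect(")")
            return node
        if self.peek() == "number":
            return ("number", self.expect("number"))
        return ("identifier", self.expect("identifier"))
```

For example, `Parser("position := initial + rate * 60").assgstmt()` builds the tree from the root downward, and the layering of `expr` over `term` is what makes `*` bind tighter than `+`.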
[Figure: Compressed Tree for position, initial, rate and 60, shown before and after the Conversion Action that wraps 60 in inttoreal]
Semantic Analyzer
A semantic analyzer checks the source program for
The type of the identifier newval must match the type of the
expression (oldval + 12).
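A check like the one above can be sketched in a few lines of Python. This is an illustration only, with assumptions that are not on the slide: expressions are nested tuples such as `("+", left, right)`, every literal is an integer, a mixed int/real operation has type real, and the declared types live in a plain dictionary. The function names `type_of` and `check_assignment` are hypothetical.

```python
def type_of(node, symbols):
    """Return 'int' or 'real' for an expression tree (hypothetical encoding)."""
    kind = node[0]
    if kind == "number":
        return "int"  # assumption: all literals are integers, like 12
    if kind == "identifier":
        name = node[1]
        if name not in symbols:
            raise TypeError(f"undeclared identifier {name}")
        return symbols[name]
    # Binary operator: the result is real if either operand is real.
    left = type_of(node[1], symbols)
    right = type_of(node[2], symbols)
    return "real" if "real" in (left, right) else "int"

def check_assignment(target, expr, symbols):
    """Raise TypeError unless the target's type matches the expression's type."""
    if target not in symbols:
        raise TypeError(f"undeclared identifier {target}")
    declared = symbols[target]
    actual = type_of(expr, symbols)
    if declared != actual:
        raise TypeError(f"{target} is {declared} but the expression is {actual}")
    return declared
```

With `symbols = {"newval": "int", "oldval": "int"}`, checking `newval := oldval + 12` succeeds; declaring `newval` as real instead makes the same call raise a `TypeError`.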
Intermediate Code Generation
should be represented as
t1 := z * w
t2 := y + t1
x := t2
Given the syntax tree or the DAG of the graphical representation,
we can easily derive three-address code for assignments such as
the one above.
In fact, three-address code is a linearization of the tree.
Three-address code is useful because it is close to machine
language, simple, and easy to optimize.
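The linearization can be sketched as a post-order walk over the tree. In the Python below, the tuple encoding `(op, left, right)`, the function name `gen_tac`, and the temporary names `t1`, `t2`, ... are assumptions made for illustration; the input tree is the one implied by the three instructions on the slide.

```python
import itertools

def gen_tac(target, expr):
    """Linearize an expression tree into three-address code for target := expr."""
    code = []
    counter = itertools.count(1)

    def emit(node):
        # Leaves are plain strings (names or constants) and need no code.
        if isinstance(node, str):
            return node
        op, left, right = node
        left_place = emit(left)     # visit children first (post-order)
        right_place = emit(right)
        temp = f"t{next(counter)}"  # fresh temporary for this operator
        code.append(f"{temp} := {left_place} {op} {right_place}")
        return temp

    code.append(f"{target} := {emit(expr)}")
    return code

for line in gen_tac("x", ("+", "y", ("*", "z", "w"))):
    print(line)
```

Each interior node of the tree becomes exactly one instruction, so the order of the emitted lines is simply the order in which the walk finishes each subtree.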
Code Optimization
Why
Reduce the programmer's burden
Target
Reduce execution time
Reduce space
Sometimes, these are tradeoffs
Types
Intermediate code level
Assembly level
Scope
Peephole analysis
Local analysis
Global analysis
Inter-procedural analysis
Techniques
Constant propagation
Constant folding
Algebraic simplification, strength reduction
Copy propagation
Common subexpression elimination
Unreachable code elimination
Dead code elimination
Loop Optimization
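Two of the techniques listed above, constant folding and algebraic simplification, can be sketched on an expression tree. The Python below is an illustration under assumptions: expressions are nested tuples `(op, left, right)`, constants are Python integers, variables are strings, and only `+` and `*` are supported. The function name `fold` is hypothetical.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}

def fold(node):
    """Return an equivalent tree with constant subexpressions evaluated."""
    if not isinstance(node, tuple):
        return node  # leaf: an int constant or a variable name
    op, left, right = node
    left, right = fold(left), fold(right)  # simplify the operands first
    if isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right)        # constant folding
    if op == "*" and 1 in (left, right):   # algebraic simplification: x * 1 = x
        return right if left == 1 else left
    if op == "+" and 0 in (left, right):   # algebraic simplification: x + 0 = x
        return right if left == 0 else left
    return (op, left, right)
```

For instance, `fold(("+", "x", ("*", 2, 30)))` returns `("+", "x", 60)`: the multiplication is done once at compile time instead of on every execution.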
Code Generation
Must generate code executable by target machine
Most complex phase of compiler
Code generation typically starts from an intermediate
representation.
The machine code generator should translate all the
instructions in the intermediate representation to assembly
language.
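[Figure: translation of an assignment through the phases. The Symbol Table holds position, initial and rate (id1, id2, id3); errors can be reported from every phase]
Input to the semantic analyzer:
id1 := id2 + id3 * 60
Output of the semantic analyzer:
id1 := id2 + id3 * inttoreal(60)
Output of the intermediate code generator (3 address code), passed to the code optimizer:
temp1 := inttoreal(60)
temp2 := id3 * temp1
temp3 := id2 + temp2
id1 := temp3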
Supporting Phases/Activities for Analysis
Symbol Table Creation / Maintenance
Contains Info (storage, type, scope, args) on Each
Meaningful Token, Typically Identifiers
Data Structure Created / Initialized During Lexical Analysis
Utilized / Updated During Later Analysis & Synthesis
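A symbol table along these lines might be sketched as a stack of dictionaries, one per scope. The Python below is only an illustration: the record fields mirror the info listed above (storage, type, scope, args), while the class names `SymbolInfo` and `SymbolTable` and their methods are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class SymbolInfo:
    name: str
    type: Optional[str] = None     # filled in during later analysis
    storage: Optional[int] = None  # e.g. an offset assigned during synthesis
    scope: int = 0                 # nesting depth where the name was entered
    args: list = field(default_factory=list)

class SymbolTable:
    def __init__(self):
        self.scopes = [{}]  # the innermost scope is the last element

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def insert(self, name):
        """Create the entry on first sight, as a lexical analyzer would."""
        current = self.scopes[-1]
        if name not in current:
            current[name] = SymbolInfo(name, scope=len(self.scopes) - 1)
        return current[name]

    def lookup(self, name):
        """Search from the innermost scope outward; None if not found."""
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None
```

A lexical analyzer would call `insert` for each identifier it meets, and later phases would call `lookup` and update fields such as `type` on the returned record.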
Error Handling
Detection of Different Errors Which Correspond to All
Phases
What Kinds of Errors Are Found During the Analysis Phase?
What Happens When an Error Is Found?