
INTRODUCTION TO LEX & YACC

Introduction to Compilers

1. Compiler : a compiler is a program that takes a program in one language (the source program) as input and translates it into an equivalent program in another language (the target program).

Input                              Output
Source Program --> COMPILER --> Target Program

1.1. Compiler : Analysis-Synthesis Model

Compilation is done in 2 parts:

Analysis Part (Source Program -> Intermediate Code)
Synthesis Part (Intermediate Code -> Target Code)

PHASES OF COMPILER

Source Program
  -> Lexical Analyzer             \
  -> Syntax Analyzer               | Analysis Phase
  -> Semantic Analyzer            /
  -> Intermediate Code Generator  \
  -> Code Optimizer                | Synthesis Phase
  -> Code Generator               /
Target Machine Code

Symbol Table Management and Error Detection and Handling interact with all phases.

Analysis Part :
1. Lexical Analysis (Scanning or Tokenization) : source program to tokens
2. Syntax Analysis (Parsing) : tokens to a hierarchical structure
3. Semantic Analysis (Determining Meaning) : the meaning of the source string

Example showing the analysis steps
Ex.: total = count + rate*10

Lexical Analysis:
Tokens :
1. identifier total
2. the assignment symbol =
3. identifier count
4. the plus sign +
5. identifier rate
6. the multiplication sign *
7. the constant number 10

Syntax Analysis (parse tree):

        =
      /   \
  total    +
         /   \
    count     *
            /   \
        rate     10

Semantic Analysis (type checking inserts a conversion):

        =
      /   \
  total    +
         /   \
    count     *
            /   \
        rate    int_to_float
                    |
                   10

Intermediate code generation:
t1 := int_to_float(10)
t2 := rate * t1
t3 := count + t2
total := t3

Code Optimization
Improves the intermediate code; this is necessary to produce faster-executing code.

Code Generation
The intermediate code is translated into a sequence of machine instructions:
MOV  rate, R1
MUL  #10.0, R1
MOV  count, R2
ADD  R2, R1
MOV  R1, total

Symbol Table Management
Stores the identifiers used in the program along with their type, their scope, information about the storage allocated for them, and information about the subroutines used in the program.

Error Detection and Handling

As programs are written by human beings, they cannot be free from errors. Errors are detected in each phase and reported to the error handler: the syntax analyzer detects syntax errors; the semantic analyzer detects errors of the type-mismatch kind, etc.

ROLE OF LEXICAL ANALYZER

Input String -> Lexical Analyzer <-> Parser -> Syntax tree
(the parser demands tokens; the lexical analyzer returns tokens)

1. It produces a stream of tokens
2. It eliminates blanks and comments
3. It generates the symbol table
4. It keeps track of line numbers
5. It reports the errors encountered while generating the tokens

Tokens, Patterns, and Lexemes :
Tokens : categories of input strings, e.g. identifiers, keywords, constants
Patterns : the set of rules (REs) that describe the tokens
Lexemes : sequences of characters in the source program that match the pattern of a token. Ex: int, i, num, ans, choice

Input Buffering : for storing the input string and scanning it in the LA

bp (begin pointer)
 |
 i n t   i , j ; i = i + 1 ;
 |
fp (forward pointer)

bp and fp are used to keep track of the portion of the input scanned.
Two schemes of buffering:
1. One-buffer scheme
2. Two-buffer scheme

Specification of Tokens : REs are used

Strings and Languages
- The length of a string s is denoted by |s|
- The empty string is denoted by ε
- The empty set of strings is denoted by φ
- Prefix of a string, suffix of a string, substring, subsequence of a string
Operations on Languages
- Union
- Concatenation
- Kleene closure of L
- Positive closure of L
- Regular expressions

Recognition of Tokens :
A token is represented by a pair :
(Token Type, Token Value)

Example showing the encoding of tokens

Token       Code   Value
if          1      -
else        2      -
while       3      -
for         4      -
identifier  5      pointer to symbol table
constant    6      pointer to symbol table
<           7      1
<=          7      2
>           7      3
>=          7      4

Further operator tokens continue the encoding:

Token   Code   Value
!=      7      5
(       8      1
)       8      2
+       9      1
=       10     -

Corresponding Symbol Table

Location Counter   Type         Value
100                identifier   a
105                constant
107                identifier   i
110                constant
