
Unit I

Introduction to Compiling

12/07/21 Department of Computer Science, ER&DCI Institute of Technology

Programming languages
• Humans use human languages to communicate with each other
  • English, Malayalam, Hindi, etc.
• Humans use programming languages to communicate with computers
  • Java, Fortran, C

Computer Organization

High level language

Compiler

Operating Systems

Hardware

Introduction

• What is a compiler? (original meaning)
  • A compiler reads a program written in the source language and translates it into an equivalent program in machine language.

Source program -> Compiler -> Machine code
                     |
                     v
               Error messages

Introduction

• What is a compiler? (broader meaning)
  • A compiler reads a program written in the source language and translates it into an equivalent program in the target language.

Source program -> Compiler -> Target program
                     |
                     v
               Error messages

What Do Compilers Do?

• Source language:
  High-level programming languages, query languages
• Target language:
  Machine language, typesetting commands, VLSI layout, …

Compiler technology can be broadly applied.

Classifications of Compilers
• Compilers viewed from many perspectives
  • Construction: single pass, multiple pass
  • Functional: load & go, debugging, optimizing
• However, all of them rely on the same basic tasks to accomplish their actions

Traditional Compiler Structure

Source:
    public class Foo {
        int maxargs(Stm s)
        …}

Target:
    load R0,R8
    call label4

Compilers have “phases”:

• each phase has an input and an output
• each phase transforms its input code into output code
• phases are typically classified as “early,” “middle,” and “late,” and each group accomplishes a different kind of transformation

Analysis and Synthesis

• Two parts to compilation
  • Analysis: this part breaks up the source program into pieces for syntax and semantic analysis, e.g. parsing.
    The analysis portion is widely used in many tools.
  • Synthesis: this part constructs the desired target program, e.g. code generation.
    The synthesis portion can be very complex, the optimizer being one example.

Notes
• Today there are many software tools that help with the analysis part. This was not the case in the early days. Some analysis is also important in:
  • Structure / syntax-directed editors: force “syntactically” correct code to be entered
  • Pretty printers: produce a standardized layout of the program structure (i.e., blank space, indenting, etc.)
  • Static checkers: a “quick” compilation to detect rudimentary errors
  • Interpreters: “real-time” execution of code a “line at a time”

Notes
• Compilation is not limited to programming language applications
  • Text formatters
    • LaTeX and TROFF are languages whose commands format text
  • Silicon compilers
    • Textual / graphical: take input and generate a circuit design
  • Database query processors
    • Database query languages are also programming languages
    • Input is compiled into a set of operations for accessing the database

Language-Processing System

Source Program
      |
      v
1  Pre-Processor
      |
      v
2  Compiler
      |
      v
3  Assembler
      |
      v
4  Relocatable Machine Code
      |
      v
5  Loader / Link-Editor  <-  Library, relocatable object files
      |
      v
Executable

Analysis of source program
• Analysis consists of 3 phases
  • Linear analysis (or lexical analysis)
    • Left-to-right scan to identify tokens
      token: a sequence of characters having a collective meaning
  • Hierarchical analysis (or syntax analysis)
    • Hierarchical grouping of tokens into meaningful collections
  • Semantic analysis
    • Checks to ensure the meaningfulness of the program

Lexical Analysis Example

Pay := Base + Rate * 60

• Lexical analysis: the characters are grouped into seven tokens:
  • Pay, Base, Rate are identifiers
  • := is the assignment symbol
  • + and * are operators
  • 60 is a number
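
The left-to-right scan can be sketched in a few lines of Python. This is a minimal illustration rather than a production lexer: the token class names (`IDENT`, `NUMBER`, `ASSIGN`, `OP`) and the function name `tokenize` are assumptions made for this example, not names from the slides.

```python
import re

# Token classes from the slide; the class names are illustrative choices.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("ASSIGN", r":="),
    ("OP", r"[+*]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Scan text left to right and return a list of (kind, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":  # white space separates tokens but is not one
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


tokens = tokenize("Pay := Base + Rate* 60")
for kind, lexeme in tokens:
    print(kind, lexeme)
```

For the statement on the slide this yields seven tokens: three identifiers, the assignment symbol, two operators and one number.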

Syntax Analysis Example

Pay := Base + Rate * 60

• The seven tokens are grouped into a parse tree
• That is, they are grouped into grammatical phrases that the compiler uses to synthesize output

Assignment stmt
    identifier (Pay)
    :=
    expression
        expression
            identifier (Base)
        +
        expression
            Rate * 60

Nodes of the tree are constructed using a grammar for the base language.

• For example, some rules form part of the definition of expressions:
  • Any identifier is an expression
  • Any number is an expression
  • If expression1 and expression2 are expressions, then so are
    expression1 + expression2,
    expression1 * expression2, etc.

What is a Grammar?
• A grammar is a set of rules which govern the interdependencies and structure among the tokens

statement             is an assignment statement, or while statement, or if statement, or ...
assignment statement  is an identifier := expression ;
expression            is a (expression), or expression + expression, or expression * expression, or number, or identifier, or ...
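
One way to turn rules like these into a working parser is recursive descent. The sketch below is only an illustration under stated assumptions: the grammar on the slide does not say which operator binds tighter, so the code assumes the usual convention that `*` takes precedence over `+` and that both are left-associative, and it treats the trailing `;` as optional. The function name `parse_assignment` and the tuple-based tree layout are hypothetical choices for this example.

```python
import re


def parse_assignment(text):
    """Parse 'identifier := expression' into a nested-tuple parse tree."""
    # Simplified scanning: characters outside these patterns are silently skipped.
    tokens = re.findall(r":=|[A-Za-z_]\w*|\d+|[+*();]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expect(symbol):
        if peek() != symbol:
            raise SyntaxError(f"expected {symbol!r}, found {peek()!r}")
        advance()

    def factor():
        # A factor is a number, an identifier, or a parenthesized expression.
        token = advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token == "(":
            node = expression()
            expect(")")
            return node
        if token.isdigit():
            return ("number", token)
        if token[0].isalpha() or token[0] == "_":
            return ("identifier", token)
        raise SyntaxError(f"unexpected token {token!r}")

    def term():
        # Assumption: '*' binds tighter than '+'.
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def expression():
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    target = advance()
    if target is None or not (target[0].isalpha() or target[0] == "_"):
        raise SyntaxError("assignment must start with an identifier")
    expect(":=")
    tree = (":=", ("identifier", target), expression())
    if peek() == ";":  # the terminating ';' is treated as optional here
        advance()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


tree = parse_assignment("Pay := Base + Rate* 60")
print(tree)
```

Each grammar rule becomes one function, and the nesting of the returned tuples mirrors the shape of the parse tree.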

Semantic Analysis Example

Pay := Base + Rate * 60

• Checks for semantic errors and gathers type information for code generation.

Before type checking:
:=
    pay
    +
        base
        *
            rate
            60

After type checking (int-to-real conversion inserted):
:=
    pay
    +
        base
        *
            rate
            int-to-real
                60

Semantic Analysis

• Most important activity in this phase:
  • Type checking: legality of operands
• Many different situations:

Real := int + char ;

A[int] := A[real] + int ;
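
A type checker of this kind can be sketched as a recursive walk over the syntax tree. The example below is a simplified illustration, not a full checker: it assumes only `int` and `real` may be mixed (the `int` operand is widened with an `inttoreal` node), and that any other mix, such as `int + char`, is an error. The symbol table contents, the tuple tree layout and the function name `check` are assumptions made for this example.

```python
# Hypothetical symbol table: Pay, Base and Rate are assumed to be reals.
SYMBOLS = {"Pay": "real", "Base": "real", "Rate": "real", "c": "char", "n": "int"}


def check(node, symbols):
    """Return (typed_tree, type), inserting inttoreal where an int meets a real."""
    kind = node[0]
    if kind == "number":
        return node, "int"
    if kind == "identifier":
        name = node[1]
        if name not in symbols:
            raise TypeError(f"undeclared identifier {name!r}")
        return node, symbols[name]

    # Remaining node kinds are binary: ':=', '+', '*'.
    left, left_type = check(node[1], symbols)
    right, right_type = check(node[2], symbols)

    if kind == ":=":
        if left_type == "real" and right_type == "int":
            right = ("inttoreal", right)
        elif left_type != right_type:
            raise TypeError(f"cannot assign {right_type} to {left_type}")
        return (kind, left, right), left_type

    if left_type != right_type:
        # Only int/real mixes are repaired; anything else is rejected.
        if {left_type, right_type} != {"int", "real"}:
            raise TypeError(f"illegal operands for {kind}: {left_type}, {right_type}")
        if left_type == "int":
            left = ("inttoreal", left)
        else:
            right = ("inttoreal", right)
        return (kind, left, right), "real"
    return (kind, left, right), left_type


tree = (":=", ("identifier", "Pay"),
        ("+", ("identifier", "Base"),
         ("*", ("identifier", "Rate"), ("number", "60"))))
typed_tree, result_type = check(tree, SYMBOLS)
print(typed_tree)
```

With these assumptions, the constant 60 is wrapped in an `inttoreal` node because it is multiplied by a real, while an expression such as `n + c` (int plus char) raises a `TypeError`.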

Intermediate Code Generation Example
• A compiler may produce an explicit intermediate code representing the source program.
• Intermediate code is generally machine (architecture) independent, but its level is close to that of machine code.

Three-address code, a kind of intermediate language:

temp1 := inttoreal(60)
temp2 := id3 * temp1
temp3 := id2 + temp2
id1 := temp3
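
Three-address code like this can be produced by a post-order walk of the typed syntax tree, creating one temporary per operation. The sketch below is illustrative only: the tuple tree layout, the `names` mapping from source identifiers to `id1`..`id3`, and the function name `generate_tac` are assumptions made for this example.

```python
import itertools


def generate_tac(tree, names):
    """Translate a typed syntax tree into a list of three-address instructions."""
    code = []
    counter = itertools.count(1)

    def new_temp():
        return f"temp{next(counter)}"

    def visit(node):
        kind = node[0]
        if kind == "number":
            return node[1]
        if kind == "identifier":
            return names[node[1]]
        if kind == "inttoreal":
            operand = visit(node[1])
            temp = new_temp()
            code.append(f"{temp} := inttoreal({operand})")
            return temp
        # Binary operator: evaluate both operands, then emit one instruction.
        left = visit(node[1])
        right = visit(node[2])
        temp = new_temp()
        code.append(f"{temp} := {left} {kind} {right}")
        return temp

    _, target, expr = tree
    value = visit(expr)
    code.append(f"{names[target[1]]} := {value}")
    return code


# Hypothetical symbol-table entries: Pay -> id1, Base -> id2, Rate -> id3.
names = {"Pay": "id1", "Base": "id2", "Rate": "id3"}
tree = (":=", ("identifier", "Pay"),
        ("+", ("identifier", "Base"),
         ("*", ("identifier", "Rate"), ("inttoreal", ("number", "60")))))
tac = generate_tac(tree, names)
print("\n".join(tac))
```

Because operands are visited before their operator, the conversion of 60 is emitted first and the final copy into `id1` last.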

Properties of intermediate forms

1. Each instruction has at most one operator in addition to the assignment operator
2. The compiler must generate a temporary name to hold the value computed by each instruction
3. Some three-address instructions can have fewer than three operands

Code Optimization Example
• The code optimizer improves the code produced by the intermediate code generator in terms of time and space.
• It improves the intermediate code so that faster-running machine code results.

temp1 := id3 * 60.0
id1 := id2 + temp1
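
The improvement from four instructions to two can be reproduced with three small passes: converting the integer constant at compile time, removing the final copy, and renumbering the remaining temporaries. The sketch below is a simplified illustration and not a general optimizer. It assumes each temporary is used exactly once (true for code generated from an expression tree), and the tuple encoding `(dest, op, arg1, arg2)` and the function name `optimize` are assumptions made for this example.

```python
def optimize(code):
    """Apply constant conversion, copy elimination and temp renumbering.

    Each instruction is a tuple (dest, op, arg1, arg2) where op is
    'inttoreal', 'copy', '+' or '*'. Unused argument slots hold None.
    """
    # Pass 1: perform inttoreal on integer constants at compile time.
    constants = {}
    folded = []
    for dest, op, a, b in code:
        a = constants.get(a, a)
        b = constants.get(b, b)
        if op == "inttoreal" and a.isdigit():
            constants[dest] = f"{a}.0"
            continue
        folded.append((dest, op, a, b))

    # Pass 2: drop 'x := t' when t was computed by the previous instruction.
    # Assumption: each temporary is used only once.
    merged = []
    for dest, op, a, b in folded:
        if op == "copy" and merged and merged[-1][0] == a and a.startswith("temp"):
            _, prev_op, prev_a, prev_b = merged.pop()
            merged.append((dest, prev_op, prev_a, prev_b))
        else:
            merged.append((dest, op, a, b))

    # Pass 3: renumber the surviving temporaries from temp1 upward.
    renamed = {}
    result = []
    for dest, op, a, b in merged:
        a = renamed.get(a, a)
        b = renamed.get(b, b)
        if dest.startswith("temp"):
            renamed[dest] = f"temp{len(renamed) + 1}"
            dest = renamed[dest]
        result.append((dest, op, a, b))
    return result


code = [
    ("temp1", "inttoreal", "60", None),
    ("temp2", "*", "id3", "temp1"),
    ("temp3", "+", "id2", "temp2"),
    ("id1", "copy", "temp3", None),
]
optimized = optimize(code)
for dest, op, a, b in optimized:
    print(f"{dest} := {a} {op} {b}")
```

Real optimizers rely on data-flow analysis to decide when such rewrites are safe; here the single-use assumption stands in for that analysis.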

Code Generation
• Produces the target language for a specific architecture.
• The target program is normally a relocatable object file containing the machine code.

movf id3, fr2
movf id2, fr1
mulf #60.0, fr2
addf fr2, fr1
movf fr1, id1
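
A very small code generator for this two-operand instruction style might look as follows. It is a sketch under several assumptions: instructions have the form `op source, destination`, only two floating-point registers (`fr1`, `fr2`) exist, literals carry a `#` prefix, and the input uses a hypothetical tuple encoding `(dest, op, arg1, arg2)`. The function name `generate_target` is also an assumption. The generator emits each load just before its use, so it produces the same five instructions in a slightly different order.

```python
def generate_target(code):
    """Emit two-operand float instructions of the form 'op source, destination'."""
    opcodes = {"+": "addf", "*": "mulf"}
    free = ["fr1", "fr2"]  # hypothetical register pool; pop() hands out fr2 first
    location = {}          # temporary name -> register currently holding it
    output = []

    def operand(name):
        if name in location:
            return location[name]
        # Literals get the '#' immediate prefix.
        return f"#{name}" if name[0].isdigit() else name

    for dest, op, a, b in code:
        reg = free.pop()  # raises IndexError if the two registers are exhausted
        output.append(f"movf {operand(a)}, {reg}")
        output.append(f"{opcodes[op]} {operand(b)}, {reg}")
        # Release registers of temporaries consumed by this instruction.
        for name in (a, b):
            if name in location:
                free.append(location.pop(name))
        if dest.startswith("temp"):
            location[dest] = reg  # keep the temporary in its register
        else:
            output.append(f"movf {reg}, {dest}")  # store to a named variable
            free.append(reg)
    return output


code = [
    ("temp1", "*", "id3", "60.0"),
    ("id1", "+", "id2", "temp1"),
]
target = generate_target(code)
print("\n".join(target))
```

Keeping `temp1` in a register avoids a store and a reload between the multiplication and the addition.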

The Phases of a Compiler

Source Program
      |
      v
Analysis:   Lexical -> Syntax -> Semantic
      |
      v
Synthesis:  IR code gen -> Optimizer -> Target code gen
      |
      v
Target Program

Source Program
    Pay := base + rate * 60

lexical
    id1 := id2 + id3 * 60

syntax
    :=
        id1
        +
            id2
            *
                id3
                60

semantic
    :=
        id1
        +
            id2
            *
                id3
                inttoreal
                    60

IR code gen
    temp1 = inttoreal(60)
    temp2 = id3 * temp1
    temp3 = id2 + temp2
    id1 = temp3

optimizer
    temp1 = id3 * 60.0
    id1 = id2 + temp1

target code gen (Target Program)
    movf id3, fr2
    movf id2, fr1
    mulf #60.0, fr2
    addf fr2, fr1
    movf fr1, id1