
EXPERIMENT-1

Aim: To study LEX tools.


A lexical analyzer is a specification of tokens in the form of regular expressions. We can
automatically generate the lexical analyzer by designing the existing tool called LEX
tools.
If the compiler writer doesn’t have a LEX tool then the manual process should be
followed for the design of the lexical analyzer.

A LEX source program is a specification of lexical analyzer consisting of set of regular


expressions along with the actions performed for each regular expression.

An action is the piece of code which is to be executed whenever token specified by the
corresponding regular expression is recognized. The output of the LEX is the lexical
analyzer program constructed from the LEX source specification.

Unlike most programming languages a LEX source program doesn’t contain the details
of the computation. The data inputted is in the form of transition tables. The transition
table is that part of LEX output that is directly taken from LEX input.

LEX Source LEX Lexical analyzer


Compiler

Lexical
Input Stream of tokens
analyzer

The lexical analyzer L is the transition table plus the program to simulate the finite
automaton expressed as the transition table.
LEX source Program

Auxiliary Definitions Transition rules


D 1 = R1 P1 {A1}
D 2 = R2 P2 {A2}
. .
. .
. .
D n = Rn Pn {An}

Ri = Regular expressions.
Di = Distinct names assigned to those regular expressions.
Pi = Patterns of regular expressions
Ai = Actions to be performed

The transition rules of a LEX program are the statements of the given form where each P i
is a regular expression called the pattern over the alphabets consisting of auxiliary
definitions. The pattern describes the form of tokens.

Each Ai is a program fragment describing what actions the lexical analyzer should
perform when the corresponding token is found. The Ai's are written in a conventional
programming language. To create the lexical analyzer, each of the Ai's must be compiled
into machine code.

The lexical analyzer created by LEX behaves in the following manner: L reads its input
one character at a time until it has found the longest prefix to the input which matches
one of the regular expressions Pi.
Once L has found that prefix, L removes it from the input and places it in a buffer called
the token.

Letter A/B/……/Z Translation rules


Digits 0/1/……/9 BEGIN
END
IF
THEN
ELSE
LOOPS

Look Ahead Operator

Certain programming languages requires look-ahead feature in order to specify the


lexical analyzer correctly. It is denoted by ‘/’. For example, consider the statement DO10I
= 105. In some programming languages, this statement can be treated as an assignment
statement, but by using the look-ahead operator the keyword DO can be distinguished, i.e.
DO/10I = 105.

Implementation of lexical analyzer

A lexical analyzer can be built from its specification. The lexical analyzer behaves roughly
as a finite automaton. The idea is to construct an NFA ‘N’ for each token pattern P in the
translation rules.

NFA1
Є NFA2
Є
NFA3
Є

Є NFAn

Link these NFAs together with a new start state. Next we convert this NFA to DFA using
subset construction.
There are several cases to consider while carrying out this construction:

• There are several different accepting states i.e. the accepting state of each Ni
indicates that its own token has been found.

• When we convert to DFA, the subsets we construct may contain several different
finite states but final DFA must contain a single accepting state i.e. the state with
the longest prefix which matches with the given pattern.

• After reaching the final state the lexical analyzer must continue to simulate the
DFA until it reaches at a state where no next state for the current input symbol.
We call this state as a termination.

• On reaching at the termination point it is necessary to review the states of the


DFA. Each such state represents a subset of NFA state. The single final state
recognized that the token has been found if none of the states of the DFA includes
any of the final states of the NFA then an error condition is found e.g. Suppose we
have a following LEX program.
Auxiliary Definitions: - (NONE)
Transition Rules: - a
abb
a*b+

State a b Token found


01347 247 8 None
247 7 58 a
8 - 8 a*b+
7 7 8 None
58 - 68 a*b+
68 - 8 abb
EXPERIMENT-2

AIM: To check whether a string belongs to the grammar or not.

#include<iostream.h>
#include<conio.h>
#include<string.h>
/*
 * Checks whether an input string belongs to the grammar generating
 * strings with an equal number of 'a's and other symbols, i.e. the
 * string is accepted iff #('a' or 'A') == #(everything else).
 */
void main()
{
char a[10];int count1=0,count2=0;
clrscr();
cout<<"enter the string";
cin>>a;                            // NOTE: no bounds check; input longer than 9 chars overflows a[]
int l=strlen(a);
for(int i=0;i<l;i++)
{
if(a[i]=='A' || a[i]=='a')         // count 'a' in either case
count1++;
else                               // every other character counts as the second symbol
count2++;
}
if(count1==count2)                 // equal counts -> string is in the language
cout<<"grammar is accepted";
else
cout<<"grammar is not accepted";   // fixed typo: was "grammer"
getch();
}
OUTPUT:
EXPERIMENT-3

AIM : To generate parse tree.

#include<iostream.h>
#include<conio.h>
#include<string.h>
#include<graphics.h>

// Reads a production for the start symbol S, then one production for each
// non-terminal (capital letter) appearing in it, and draws a two-level
// parse tree using Turbo C's BGI graphics library.
void main()
{
clrscr();
// s   : right-hand side of the production for S
// a   : right-hand sides for each non-terminal found in s
// b,cq: one-character scratch strings for outtext() (char + '\0')
// z   : the non-terminals (capital letters) collected from s
char s[5],a[5][5],b[5][2],cq[5][2],z[5];
int p=0;
cout<<"\nEnter the productions for S -> ";
cin>>s;
int len=strlen(s);
cout<<"\nProductions for S -> "<<s;
// Collect every capital letter of s into z[]; p counts them.
// NOTE(review): scans all 5 cells even past the terminator, so it relies
// on the trailing garbage not being capital letters -- TODO confirm.
for(int i=0;i<5;i++)
if((s[i]>='A')&&(s[i]<='Z'))
{ z[p]=s[i]; p++; }
int o=p;                         // number of non-terminals collected
cout<<"\nValue: "<<o;
// Ask for one production per collected non-terminal.
for(p=0;p<o;p++)
{
cout<<"\nEnter the productions for "<<z[p]<<" -> ";
cin>>a[p];
cout<<"\nProduction for "<<z[p]<<" -> "<<a[p];
}
clrscr();
// Enter BGI graphics mode (driver auto-detected, drivers in c:\tc\bgi).
int driver,mode;
driver=DETECT;
initgraph(&driver,&mode,"c:\\tc\\bgi");
outtextxy(320,50,"S");           // root node of the tree
int count;
// g: index of next stored production; h: horizontal offset for level-2
// branches; y: x-coordinate of the current level-1 child.
int g=0,h=-100,c,u,y=200;
p=0;
for(u=0;u<len;u=u+1)
{
y=y+50;                          // place next child 50 px to the right
lineto(y,100);                   // edge from current position down to the child
b[u][0]=s[u];                    // build a 1-char string for outtext()
b[u][1]='\0';
outtext(*(b+u));                 // label the child with the symbol s[u]
// If this child is a non-terminal, draw its own children from a[g].
if((s[u]>='A')&&(s[u]<='Z'))
{
count=strlen(a[g]);
a[g][count+1]='\0';
c=0;
for(int q=0;q<count;q++)
{
h=h+50;
moveto(y,100);                   // each branch starts at the parent node
linerel(h,100);                  // relative edge down to the level-2 child
cq[p][0]=*(*(a+g)+c);            // 1-char string: c-th symbol of a[g]
cq[p][1]='\0';
outtext(cq[p]);
++p;
++c;
}
g++;
}
moveto(320,50);                  // back to the root for the next child edge
}
getch();                         // wait for a key before leaving graphics mode
}

Output:
EXPERIMENT-4

AIM: To compute FIRST of non terminal.

//computation of first.

#include<iostream.h>
#include<conio.h>
/*
 * Reads n productions, each given as a left-hand symbol q and the first
 * two right-hand symbols w and e, then reports FIRST() for each
 * production.  Only the direct cases FIRST = '+', '*' or '(' are
 * resolved; anything else needs the chain rule.
 */
void main()
{
int n;char q[10],w[10],e[10];          // was [3]: 1-based indexing below overflowed for n>=3
clrscr();
cout<<"enter the no. of prod";
cin>>n;
if(n>9) n=9;                           // clamp so the 1-based indexing stays in bounds
for(int i=1;i<=n;i++)
{
cout<<"enter "<<i<<"prod : \n";        // fixed typo: was "pord"
cout<<"enter left symbol \n";
cin>>q[i];
cout<<"enter the right first symbol \n";   // fixed typos: was "fisrt symabol"
cin>>w[i];
cout<<"enter the next \n";
cin>>e[i];
}
cout<<"the prod are :\n\n\n";
for(i=1;i<=n;i++)
{
cout<<"prod "<<i<<": "<<q[i]<<"->"<<w[i]<<e[i]<<"\n";   // was "pord 1:" for every production
}
for(i=1;i<=n;i++) //recognition of +,*,(.
{
if(w[i]=='+')
cout<<"\nfirst("<<q[i]<<")"<<"="<<"+";
else if(w[i]=='*')
cout<<"\nfirst("<<q[i]<<")"<<"="<<"*";
else if(w[i]=='(')
cout<<"\nfirst("<<q[i]<<")"<<"="<<"(";
else
cout<<"chain rule";
}
getch();                               // was missing, along with main's closing brace
}
EXPERIMENT-5

AIM: Construction of NFA from regular expression

Regular Expressions

Just as finite automata are used to recognize patterns of strings, regular expressions are
used to generate patterns of strings. A regular expression is an algebraic formula whose
value is a pattern consisting of a set of strings, called the language of the expression.

Operands in a regular expression can be:

• characters from the alphabet over which the regular expression is defined.
• variables whose values are any pattern defined by a regular expression.
• epsilon which denotes the empty string containing no characters.
• null which denotes the empty set of strings.

Operators used in regular expressions include:

• Union: If R1 and R2 are regular expressions, then R1 | R2 (also written as R1 U


R2 or R1 + R2) is also a regular expression.

L(R1|R2) = L(R1) U L(R2).

• Concatenation: If R1 and R2 are regular expressions, then R1R2 (also written as


R1.R2) is also a regular expression.

L(R1R2) = L(R1) concatenated with L(R2).

• Kleene closure: If R1 is a regular expression, then R1* (the Kleene closure of R1)
is also a regular expression.

L(R1*) = epsilon U L(R1) U L(R1R1) U L(R1R1R1) U ...

Closure has the highest precedence, followed by concatenation, followed by union.

NonDeterministic Finite Automata

• A nondeterministic finite automata (NFA) allows transitions on a symbol from


one state to possibly more than one other state.
• Allows e-transitions from one state to another whereby we can move from the
first state to the second without inputting the next character.
• In a NFA, a string is matched if there is any path from the start state to an
accepting state using that string.

Converting a regular expression to a NFA - Thompson's Algorithm

We will use the rules which defined a regular expression as a basis for the construction:

1. The NFA representing the empty string is:

2. If the regular expression is just a character, eg. a, then the corresponding NFA is :

3. The union operator is represented by a choice of transitions from a node; thus a|b
can be represented as:

4. Concatenation simply involves connecting one NFA to the other; eg. ab is:

5. The Kleene closure must allow for taking zero or more instances of the letter from
the input; thus a* looks like:
EXPERIMENT-6

AIM: To check whether a grammar is left Recursion and remove left


Recursion.

#include<iostream.h>
#include<conio.h>
/*
 * Checks a production of the form  S -> pK / L  for immediate left
 * recursion (S == p) and, if present, prints the equivalent
 * right-recursive grammar:
 *     S  -> L S'
 *     S' -> K S' / epsilon
 */
void main()
{
char k[5],l[5],s,p;
clrscr();
cout<<"enter the start symbol\n";
cin>>s;
cout<<"enter the first symbol at right\n";
cin>>p;
cout<<"enter the remaining at right\n";
cin>>k;
cout<<"enter the string after /\n";
cin>>l;
cout<<"prod is\n"<<s<<"->"<<p<<k<<"/"<<l;
if(s==p)        // left recursive iff the RHS starts with the LHS symbol
{
cout<<"\n grammar is left recursive";
cout<<"\n grammar after recursion removal\n\n";   // fixed typo: was "rcursion"

cout<<s<<"->"<<l<<s<<"'";
cout<<"\n\n";
cout<<s<<"'"<<"->"<<k<<s<<"'"<<"/"<<"ep";         // "ep" stands for epsilon

}
else
{
cout<<"grammar is not left recursive";
}
getch();
}
OUTPUT:
EXPERIMENT-7

AIM:To remove left factoring.

#include<iostream.h>
#include<conio.h>
/*
 * Checks a production of the form  A -> K L / N M  for left factoring
 * (common prefix K == N) and, if present, prints the left-factored
 * grammar:
 *     A  -> K A'
 *     A' -> L / M
 */
void main()
{
char A,K,L,M,N;
clrscr();
cout<<"enter the value for start symbol :";   // fixed typo: was "symabol"
cin>>A;
cout<<"enter the Alpha before \\ : \n";       // was "\ " - invalid escape, backslash was dropped
cin>>K;
cout<<"enter the Beta value :\n ";
cin>>L;
cout<<"enter the Alpha after \\ : \n";        // was "\ " - invalid escape, backslash was dropped
cin>>N;
cout<<"enter the Gama value :\n ";
cin>>M;
cout<<"prod is\n"<<A<<"->"<<K<<L<<"/"<<N<<M;

if(K==N)        // common prefix -> the grammar needs left factoring
{
cout<<"\n\n grammar is left factored";
cout<<"\n\n after removal of left factoring :\n\n";
cout<<A<<"->"<<K<<A<<"\'";
cout<<"\n\n";
cout<<A<<"\'"<<"->"<<L<<"/"<<M;
}
else
{
cout<<"\n\n grammar is not left factored";    // was silent when there was no common prefix
}
getch();
}
OUTPUT:
EXPERIMENT-8

AIM: To show all the operations of a stack.

//push operation.
#include<stdio.h>
#include<conio.h>
#include<stdlib.h>            /* for exit() */
#define MAX 5                 /* was "#define MAX 5;" - the stray ';' broke any use of MAX */
void push();
void pop();
int top=-1;                   /* index of the top element; -1 == empty */
int stack[MAX];
int item;

/* Menu driver: 1 = push, 2 = pop, 3 = quit. */
void main()
{
int ch;
do
{
clrscr();
printf("-------1.Push-------\n");
printf("-------2.pop--------\n");
printf("enter the choice");
scanf("%d",&ch);
switch(ch)
{
case 1:
push();
break;

case 2:
pop();
break;

case 3:
exit(0);

default:
printf("u have entered a wrong choice");
}
}while(ch!=3);                /* was while(a!=3) with 'a' never assigned (infinite loop) */
}

/* Reads an item and pushes it; reports overflow when the stack is full. */
void push()
{
if(top==MAX-1)                /* was stack[top]==5: tested the stored VALUE, not fullness */
printf("overflow");
else
{
printf("enter the item:");
scanf("%d",&item);
top=top+1;
stack[top]=item;
}
}

/* Pops and prints the top item; reports underflow when the stack is empty. */
void pop()
{
if(top==-1)
printf("underflow");
else
{
item=stack[top];
top=top-1;
printf("%d",item);
getch();
}
}
OUTPUT
Experiment – 10
AIM : To show read, write operation on file.

// file operation
#include<stdio.h>
#include<conio.h>
#include<stdlib.h>
/*
 * Writes up to 5 integers read from the keyboard into INTEGER.DAT
 * (stopping early on the sentinel -99), then reopens the file and
 * echoes its contents back to the screen.
 */
void main()

{
FILE *fp;
int i,num;
clrscr();
fp=fopen("INTEGER.DAT","w");
if(fp==NULL)
{
printf("error opening file");
exit(1);
}
for(i=0;i<5;i++)
{
scanf("%d",&num);
if(num==-99)                  /* sentinel value: stop reading early */
break;
putw(num,fp);
}
fclose(fp);
fp=fopen("INTEGER.DAT","r");
if(fp==NULL)
{
printf("error opening file");
exit(1);
}
/* NOTE: getw()==EOF cannot distinguish end-of-file from a stored -1 */
while((num=getw(fp)) !=EOF)
{
printf("%d",num);
}
fclose(fp);                   /* was inside the loop: the file was closed on the
                                 first iteration and then read from while closed */
getch();
}                             /* closing brace of main was missing */

Output :
