
1. Write a program to check whether a string belongs to the grammar or not.

Algorithm:

The tokens are coded as '2' for +, '3' for *, '4' for an opening bracket, '5' for a closing bracket and '6' for a
number. The reductions in step 13 correspond to the grammar E -> E+E | E*E | (E) | number.

1. Start.
2. Declare two character arrays str[] and token[], and integer variables a = 0, b = 0, c, d.
3. Input the string from the user.
4. Repeat steps 5 to 10 until str[a] == '\0'.
5. If str[a] == '(' or str[a] == '{' then token[b] = '4', b++.
6. If str[a] == ')' or str[a] == '}' then token[b] = '5', b++.
7. If isdigit(str[a]) then repeat a++ while isdigit(str[a]); afterwards a--, token[b] = '6', b++.
8. If str[a] == '+' then token[b] = '2', b++.
9. If str[a] == '*' then token[b] = '3', b++.
10. a++.
11. token[b] = '\0'. Print the token string generated for the input. b = 0.
12. Repeat steps 13 to 17 until token[b] == '\0'.
13. If (token[b] == '6' and token[b+1] == '2' and token[b+2] == '6') or (token[b] == '6' and token[b+1] == '3'
and token[b+2] == '6') or (token[b] == '4' and token[b+1] == '6' and token[b+2] == '5') then do steps 14 to 16,
otherwise do step 17.
14. token[b] = '6', c = b + 1.
15. Repeat until token[c+2] == '\0': token[c] = token[c+2], c++.
16. token[c] = '\0'. Print token. b = 0.
17. b++. Print token.
18. Compare token with "6" and store the result in d.
19. If d == 0 then print that the string is in the grammar. Else print that the string is not in the grammar.
20. Stop.

Program:

#include <stdio.h>
#include <ctype.h>
#include <string.h>

int main(void)
{
    int a = 0, b = 0, c, d;
    char str[64], tok[64]; /* at most one token per input character */

    printf("Input the expression = ");
    if (fgets(str, sizeof str, stdin) == NULL)
        return 1;
    str[strcspn(str, "\n")] = '\0'; /* strip the trailing newline */

    /* Tokenize: '4' = ( or {, '5' = ) or }, '6' = number, '2' = +, '3' = * */
    while (str[a] != '\0') {
        if ((str[a] == '(') || (str[a] == '{')) {
            tok[b] = '4';
            b++;
        }
        if ((str[a] == ')') || (str[a] == '}')) {
            tok[b] = '5';
            b++;
        }
        if (isdigit((unsigned char)str[a])) {
            /* skip over the whole number and emit a single token */
            while (isdigit((unsigned char)str[a])) {
                a++;
            }
            a--;
            tok[b] = '6';
            b++;
        }
        if (str[a] == '+') {
            tok[b] = '2';
            b++;
        }
        if (str[a] == '*') {
            tok[b] = '3';
            b++;
        }
        a++;
    }
    tok[b] = '\0';
    puts(tok);

    /* Reduce E+E, E*E and (E) to E until no reduction applies */
    b = 0;
    while (tok[b] != '\0') {
        if (((tok[b] == '6') && (tok[b + 1] == '2') && (tok[b + 2] == '6')) ||
            ((tok[b] == '6') && (tok[b + 1] == '3') && (tok[b + 2] == '6')) ||
            ((tok[b] == '4') && (tok[b + 1] == '6') && (tok[b + 2] == '5'))) {
            tok[b] = '6';
            c = b + 1;
            /* shift the rest of the tokens two places to the left */
            while (tok[c + 2] != '\0') {
                tok[c] = tok[c + 2];
                c++;
            }
            tok[c] = '\0';
            puts(tok);
            b = 0; /* restart the scan from the beginning */
        } else {
            b++;
            puts(tok);
        }
    }

    d = strcmp(tok, "6");
    if (d == 0) {
        printf("It is in the grammar.\n");
    } else {
        printf("It is not in the grammar.\n");
    }
    return 0;
}
Output:
2. Practice of Lex in compiler writing.
Flex (Fast Lexical Analyzer Generator)
Flex (fast lexical analyzer generator) is a tool for generating lexical analyzers (scanners or lexers),
written in C by Vern Paxson around 1987. It is used together with the Berkeley Yacc or GNU Bison parser
generators. Flex and Bison are more flexible than Lex and Yacc and produce faster code.
Bison produces a parser from the input file provided by the user. Flex generates the function yylex() from
a .l file, and the parser calls yylex() each time it needs the next token from the input stream.
Note: The function yylex() is the main Flex function; it runs the Rules Section. Flex source files are saved
with the extension .l.

Installing Flex on Ubuntu:


sudo apt-get update
sudo apt-get install flex
Note: If the update command has not been run on the machine for a while, run it first so that the package
index is current. Otherwise apt may try to install an older version that does not work with the other installed
packages or is no longer available.
The following steps describe how Flex is used:

Step 1: An input file named lex.l, written in the Lex language, describes the lexical analyzer to be generated.
The Lex compiler transforms lex.l into a C program, in a file that is always named lex.yy.c.
Step 2: The C compiler compiles lex.yy.c into an executable file called a.out.
Step 3: The executable a.out takes a stream of input characters and produces a stream of tokens.
Program Structure:
In the input file, there are 3 sections:
1. Definition Section: The definition section contains declarations of variables, regular definitions and
manifest constants. C code in this section is enclosed in “%{ %}” brackets. Anything written inside these
brackets is copied directly to the file lex.yy.c.
Syntax:
%{
// Definitions
%}
2. Rules Section: The rules section contains a series of rules of the form "pattern action". The pattern must
be unindented, and the action must begin on the same line, in {} brackets. The rules section is enclosed in “%% %%”.
Syntax:
%%
pattern action
%%
Examples: The table below shows what some patterns match.

PATTERN      IT CAN MATCH
[0-9]        any digit between 0 and 9
[0+9]        either 0, + or 9
[0,9]        either 0, ',' or 9
[0 9]        either 0, ' ' or 9
[-09]        either -, 0 or 9
[-0-9]       either - or any digit between 0 and 9
[0-9]+       one or more digits between 0 and 9
[^a]         any character except a
[^A-Z]       any character except the upper case letters
a{2,4}       either aa, aaa or aaaa
a{2,}        two or more occurrences of a
a{4}         exactly 4 a's, i.e. aaaa
.            any character except newline
a*           0 or more occurrences of a
a+           1 or more occurrences of a
[a-z]        any lower case letter
[a-zA-Z]     any alphabetic letter
w(x|y)z      wxz or wyz


3. User Code Section: This section contains C statements and additional functions. These functions can also
be compiled separately and linked with the lexical analyzer.
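The entries of the pattern table above can be tried out from plain C. The sketch below is only an approximation: it uses the POSIX regex.h extended syntax, which agrees with Flex on the simple patterns listed but is not the Flex matcher itself, and the helper name full_match is an assumption made for illustration.

```c
#include <regex.h>
#include <stdio.h>

/* Returns 1 if the whole of `text` matches `pattern` (POSIX extended
   syntax), 0 if it does not, and -1 if the pattern is too long or
   fails to compile. */
int full_match(const char *pattern, const char *text)
{
    char anchored[256];
    regex_t re;
    int n, matched;

    /* Anchor the pattern so that it must cover the entire text. */
    n = snprintf(anchored, sizeof anchored, "^(%s)$", pattern);
    if (n < 0 || (size_t)n >= sizeof anchored)
        return -1;
    if (regcomp(&re, anchored, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    matched = (regexec(&re, text, 0, NULL, 0) == 0);
    regfree(&re);
    return matched;
}
```

For example, full_match("a{2,4}", "aaa") returns 1, while full_match("a{2,4}", "aaaaa") returns 0.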
Basic Program Structure:
%{
// Definitions
%}

%%
Rules
%%

User code section

How to run the program:


To run the program, first save it with the extension .l or .lex. Then run the commands below in a
terminal.
Step 1: lex filename.l or lex filename.lex, depending on the extension the file was saved with
Step 2: gcc lex.yy.c
Step 3: ./a.out
Step 4: Provide input to the program if it requires any
Note: Press Ctrl+D or use some rule to stop taking input from the user.
Example 1: Count the number of capital letters in a string
/*** Definition Section has one variable
which can be accessed inside yylex()
and main() ***/
%{
int count = 0;
%}

/*** Rule Section has three rules: the first rule
matches capital letters, the second rule
matches any other character except newline, and the
third rule stops scanning at the end of the line ***/
%%
[A-Z] {printf("%s capital letter\n", yytext);
       count++;}
.     {printf("%s not a capital letter\n", yytext);}
\n    {return 0;}
%%

/*** Code Section prints the number of
capital letters present in the given input ***/
int yywrap(){ return 1; }
int main(){

    // Explanation:
    // yywrap() - called at end of input; returning 1 means there is no more input
    /* yyin - the file pointer from which
       the input is read (stdin by default) */
    /* yylex() - this is the main flex function
       which runs the Rule Section */
    // yytext is the text of the current match

    // Uncomment the lines below
    // to take input from a file
    // FILE *fp;
    // char filename[50];
    // printf("Enter the filename: \n");
    // scanf("%49s",filename);
    // fp = fopen(filename,"r");
    // yyin = fp;

    yylex();
    printf("\nNumber of capital letters "
           "in the given input - %d\n", count);

    return 0;
}
Output:
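To make clear what the generated scanner in Example 1 does, the same counting can be written by hand in C. This is only a sketch of the behaviour of the three rules, not the code Flex produces, and the function name count_capitals is an assumption made for illustration.

```c
#include <stddef.h>

/* Counts the capital letters in `line`, stopping at the first newline
   or at the end of the string, like the rules [A-Z], . and \n above. */
int count_capitals(const char *line)
{
    int count = 0;
    size_t i;

    for (i = 0; line[i] != '\0' && line[i] != '\n'; i++) {
        if (line[i] >= 'A' && line[i] <= 'Z')
            count++;
    }
    return count;
}
```

Characters after the first newline are ignored, because the \n rule returns from yylex() at that point.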
3. Write a YACC program to recognize strings of { a^n b | n ≥ 5 }.

Explanation:
Yacc (short for “yet another compiler compiler”) is the standard parser generator for the Unix operating
system. An open source program, yacc generates the code for a parser in the C programming language. The name
is usually rendered in lowercase but is occasionally seen as YACC or Yacc.
Examples:
Input: ab
Output: invalid string

Input: aaaaab
Output: valid string

Input: aabb
Output: invalid string

Input: aaaaaaab
Output: valid string

Input: aaaaaabb
Output: invalid string
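The language in the examples above is simple enough to check by hand, which is useful for testing the Yacc version. The function below is a minimal sketch, assuming the same case-insensitive treatment of a and b as the lexer that follows; the name is_valid is hypothetical, and unlike the parser it also accepts input without a trailing newline.

```c
/* Returns 1 if `s` is five or more a's followed by exactly one b
   (optionally followed by a newline), otherwise 0. */
int is_valid(const char *s)
{
    int n = 0;

    while (*s == 'a' || *s == 'A') { /* count the leading a's */
        n++;
        s++;
    }
    if (*s != 'b' && *s != 'B')      /* exactly one b must follow */
        return 0;
    s++;
    if (*s == '\n')                  /* tolerate the line terminator */
        s++;
    return *s == '\0' && n >= 5;
}
```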
Lexical Analyzer Source Code:
%{
/* Definition section */
#include "y.tab.h"
%}

/* Rule Section */
%%
[aA] {return A;}
[bB] {return B;}
\n {return NL;}
. {return yytext[0];}
%%

int yywrap()
{
return 1;
}
Parser Source Code:
%{
/* Definition section */
#include<stdio.h>
#include<stdlib.h>
int yylex(void);
int yyerror(const char *msg);
%}

%token A B NL

/* Rule Section */
%%
stmt: A A A A A S B NL {printf("valid string\n");
exit(0);}
;
S: S A
|
;
%%

/* called by the parser on a syntax error */
int yyerror(const char *msg)
{
printf("invalid string\n");
exit(0);
}

//driver code
int main(void)
{
printf("enter the string\n");
yyparse();
return 0;
}
Output:
4. Write a LEX program to count the number of printf and scanf calls in a given
program file and replace them with writef and readf respectively.
/* Declaration section: include the header files and declare the
counters used in this program */

%{
#include<stdio.h>
int sf=0,pf=0;
%}

/* Rules section */

%%
"scanf" { sf++; fprintf(yyout,"readf");} /* replace scanf with readf */
"printf" { pf++; fprintf(yyout,"writef");} /* replace printf with writef */

%%

int main()
{
yyin=fopen("open.c","r"); // input file open.c
yyout=fopen("new.c","w"); // output file new.c with the replacements
if(yyin==NULL || yyout==NULL)
{
printf("Could not open open.c or new.c\n");
return 1;
}
yylex();
fclose(yyout);

// number of printf and scanf calls in the file
printf("Number of scanfs=%d\nNumber of Printf's=%d\n",sf,pf);

return 0;
}

How to run this program

Save the file as "ass1.6.l".

Open a terminal and run "flex ass1.6.l"; this generates the C file lex.yy.c.
Run "cc lex.yy.c -o ass1.6 -ll" to build the executable.
Run "./ass1.6"; it prints the counts and writes the rewritten program to new.c.

"open.c"

#include<stdio.h>

main()
{

int a;
printf("enter the number for sum\n");

scanf("%d",&a);
}

"new.c"

#include<stdio.h>

main()
{

int a;
writef("enter the number for sum\n");

readf("%d",&a);

}
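The same count-and-replace behaviour can be sketched in plain C on an in-memory string, which makes it easy to test without the open.c and new.c files. This is an illustrative stand-in for the two Lex rules, not generated code; the function name rewrite_calls and its parameters are assumptions.

```c
#include <stddef.h>
#include <string.h>

/* Copies `src` to `dst` (capacity `cap`), replacing every "printf"
   with "writef" and every "scanf" with "readf". The counts are stored
   in *pf and *sf. Returns 0 on success, -1 if `dst` is too small. */
int rewrite_calls(const char *src, char *dst, size_t cap, int *pf, int *sf)
{
    size_t out = 0;

    *pf = 0;
    *sf = 0;
    if (cap == 0)
        return -1;
    while (*src != '\0') {
        const char *rep = NULL; /* replacement text, if a rule matches */
        size_t skip = 1;        /* input characters consumed */
        size_t n;

        if (strncmp(src, "printf", 6) == 0) {
            rep = "writef"; skip = 6; ++*pf;
        } else if (strncmp(src, "scanf", 5) == 0) {
            rep = "readf"; skip = 5; ++*sf;
        }
        n = rep ? strlen(rep) : 1;
        if (out + n >= cap)     /* keep room for the terminator */
            return -1;
        memcpy(dst + out, rep ? rep : src, n);
        out += n;
        src += skip;
    }
    dst[out] = '\0';
    return 0;
}
```

Like the Lex rules, this matches the names anywhere in the text, so an identifier such as sprintf would also be rewritten.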
5. Write a program to check whether a grammar is left recursive and remove
left recursion.
Left recursion :

#include<iostream>
#include<iomanip>
#include<cstring>
using namespace std;

struct production
{
char l;     // left-hand side non-terminal
char r[20]; // right-hand side
int rear;   // length of the right-hand side
};
struct production prod[20],pr_new[40];

int p=0,b=0,d,f,q,n,flag=0;
char terminal[20],nonterm[20],alpha[20];
char x,epsilon='^';

int main()
{
int k,c;

cout<<"Enter the number of terminals: ";
cin>>d;
cout<<"Enter the terminal symbols for your production: ";
for(k=0;k<d;k++)
{
cin>>terminal[k];
}

cout<<"\nEnter the number of non-terminals: ";
cin>>f;
cout<<"Enter the non-terminal symbols for your production: ";
for(k=0;k<f;k++)
{
cin>>nonterm[k];
}

// one new symbol is used for each non-terminal, in the same order
cout<<"\nEnter the number of new symbols (one per non-terminal): ";
cin>>q;
cout<<"Enter the new symbols for your production: ";
for(k=0;k<q;k++)
{
cin>>alpha[k];
}
cout<<"\nEnter the number of productions: ";
cin>>n;
for(k=0;k<=n-1;k++)
{
cout<<"Enter the "<< k+1<<" production: ";
cin>>prod[k].l;
cout<<"->";
// leave room for the new symbol appended later
cin>>setw(sizeof prod[k].r-1)>>prod[k].r;
prod[k].rear=strlen(prod[k].r);
}

for(int m=0;m<f;m++)
{
x=nonterm[m];
// flag=1 if some production of x is immediately left recursive
for(int j=0;j<n;j++)
{
if((prod[j].l==x)&&(prod[j].r[0]==prod[j].l))
flag=1;
}
for(int i=0;i<n;i++)
{
if((prod[i].l==x)&&(prod[i].r[0]!=x)&&(flag==1))
{
// A->beta becomes A->beta A'
pr_new[b].l=x;
for(c=0;c<prod[i].rear;c++)
pr_new[b].r[c]=prod[i].r[c];
pr_new[b].r[c]=alpha[p];
pr_new[b++].r[c+1]='\0';
}
else if((prod[i].l==x)&&(prod[i].r[0]==x)&&(flag==1))
{
// A->A alpha becomes A'->alpha A'
pr_new[b].l=alpha[p];
for(c=0;c<=prod[i].rear-2;c++)
pr_new[b].r[c]=prod[i].r[c+1];
pr_new[b].r[c]=alpha[p];
pr_new[b++].r[c+1]='\0';
}
else if((prod[i].l==x)&&(prod[i].r[0]!=x)&&(flag==0))
{
// not left recursive: copy the production unchanged
pr_new[b].l=prod[i].l;
strcpy(pr_new[b].r,prod[i].r);
b++;
}
}
if(flag==1)
{
// add A'->epsilon once for each left-recursive non-terminal
pr_new[b].l=alpha[p];
pr_new[b].r[0]=epsilon;
pr_new[b++].r[1]='\0';
}
flag=0;
p++;
}

cout<<"\n\n*******************************************";
cout<<"\n AFTER REMOVING LEFT RECURSION ";
cout<<"\n*******************************************"<<endl;
for(int s=0;s<=b-1;s++)
{
cout<<"Production "<<s+1<<" is: ";
cout<<pr_new[s].l;
cout<<"->";
cout<<pr_new[s].r;
cout<<endl;
}

return 0;
}
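The transformation applied by the program above is the standard rule for immediate left recursion: A -> A alpha | beta becomes A -> beta A' and A' -> alpha A' | ^. The sketch below applies that rule to a single non-terminal in plain C so that it can be tested without interactive input. The function names, the parameter layout and the use of a one-character fresh symbol are assumptions for illustration, and the sketch assumes at least one alternative that is not left recursive.

```c
#include <stddef.h>
#include <string.h>

/* Appends `s` to the string in `out` (capacity `cap`).
   Returns 0 on success, -1 if it does not fit. */
static int append(char *out, size_t cap, const char *s)
{
    size_t used = strlen(out);
    size_t n = strlen(s);

    if (used + n >= cap)
        return -1;
    memcpy(out + used, s, n + 1);
    return 0;
}

/* Rewrites the alternatives alts[0..n-1] of non-terminal `lhs`, using
   `fresh` as the new non-terminal and '^' as epsilon. Returns 1 if
   left recursion was removed, 0 if there was none, -1 on overflow. */
int remove_left_recursion(char lhs, char fresh, const char *alts[], int n,
                          char *out, size_t cap)
{
    char head[4] = { lhs, '-', '>', '\0' };
    char tail[4] = { fresh, '-', '>', '\0' };
    char fresh_s[2] = { fresh, '\0' };
    int i, recursive = 0, first = 1;

    if (cap == 0)
        return -1;
    out[0] = '\0';
    for (i = 0; i < n; i++)
        if (alts[i][0] == lhs)
            recursive = 1;

    /* A -> beta A' for every beta (or the alternatives unchanged) */
    if (append(out, cap, head))
        return -1;
    for (i = 0; i < n; i++) {
        if (alts[i][0] == lhs)
            continue;
        if (!first && append(out, cap, "|"))
            return -1;
        if (append(out, cap, alts[i]))
            return -1;
        if (recursive && append(out, cap, fresh_s))
            return -1;
        first = 0;
    }
    if (!recursive)
        return 0;

    /* A' -> alpha A' for every alpha, followed by the epsilon case */
    if (append(out, cap, "\n") || append(out, cap, tail))
        return -1;
    for (i = 0; i < n; i++) {
        if (alts[i][0] != lhs)
            continue;
        if (append(out, cap, alts[i] + 1) || append(out, cap, fresh_s)
            || append(out, cap, "|"))
            return -1;
    }
    if (append(out, cap, "^"))
        return -1;
    return 1;
}
```

For E -> E+T | T with the fresh symbol Z, the output buffer holds the two lines E->TZ and Z->+TZ|^.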
6. Write a program to perform left factoring.
For an LL(1) parser in compiler design, even a context-free grammar that is unambiguous and not left recursive
may still fail to be LL(1). One reason is that two productions of the same non-terminal begin with a common
prefix; left factoring removes such prefixes.

What is Left Factoring ?

Consider part of a grammar,

E->aE+bcD
E->aE+cBD

Here, the grammar is not left recursive and is unambiguous, but both productions begin with the common prefix
aE+, so the parser cannot choose between them by looking at the next input symbol alone.

How to resolve ?

If

E->aB | aC | aD | ............

then,

E->aX
X->B | C | D |...........

So, the above grammar becomes:

E->aE+X
X->bcD | cBD

Program :

#include<stdio.h>
#include<string.h>
int main()
{
char gram[40],part1[40],part2[40],modifiedGram[40],newGram[42];
int i,j=0,k=0,pos=0;
printf("Enter Production : A->");
if(fgets(gram,sizeof gram,stdin)==NULL)
return 1;
gram[strcspn(gram,"\n")]='\0';
if(strchr(gram,'|')==NULL)
{
printf("Expected two alternatives separated by '|'\n");
return 1;
}
/* split the input into the two alternatives */
for(i=0;gram[i]!='|';i++,j++)
part1[j]=gram[i];
part1[j]='\0';
for(j=++i,i=0;gram[j]!='\0';j++,i++)
part2[i]=gram[j];
part2[i]='\0';
/* copy the longest common prefix, stopping at the first mismatch */
for(i=0;part1[i]!='\0'&&part1[i]==part2[i];i++)
{
modifiedGram[k]=part1[i];
k++;
pos=i+1;
}
if(pos==0)
{
printf("No common prefix, nothing to factor\n");
return 0;
}
/* the remainders of both alternatives become the alternatives of X */
for(i=pos,j=0;part1[i]!='\0';i++,j++){
newGram[j]=part1[i];
}
newGram[j++]='|';
for(i=pos;part2[i]!='\0';i++,j++){
newGram[j]=part2[i];
}
modifiedGram[k]='X';
modifiedGram[++k]='\0';
newGram[j]='\0';
printf("\n A->%s",modifiedGram);
printf("\n X->%s\n",newGram);
return 0;
}
7. Write a YACC & LEX program to identify valid if and if-else statements.
Program:

(Lex Program: ift.l)


alpha [A-Za-z]
digit [0-9]
%%
[ \t\n]
if return IF;
then return THEN;
else return ELSE;
{digit}+ return NUM;
{alpha}({alpha}|{digit})* return ID;
"<=" return LE;
">=" return GE;
"==" return EQ;
"!=" return NE;
"||" return OR;
"&&" return AND;
. return yytext[0];
%%

(Yacc Program: ift.y)

%{
#include <stdio.h>
#include <stdlib.h>
int yylex(void);
int yyerror(const char *msg);
%}
%token ID NUM IF THEN LE GE EQ NE OR AND ELSE
%right '='
%left AND OR
%left '<' '>' LE GE EQ NE
%left '+''-'
%left '*''/'
%right UMINUS
%left '!'
%%

S : ST {printf("Input accepted.\n");exit(0);};
ST : IF '(' E2 ')' THEN ST1';' ELSE ST1';'
| IF '(' E2 ')' THEN ST1';'
;
ST1 : ST
|E
;
E : ID'='E
| E'+'E
| E'-'E
| E'*'E
| E'/'E
| E'<'E
| E'>'E
| E LE E
| E GE E
| E EQ E
| E NE E
| E OR E
| E AND E
| ID
| NUM
;
E2 : E'<'E
| E'>'E
| E LE E
| E GE E
| E EQ E
| E NE E
| E OR E
| E AND E
| ID
| NUM
;

%%

#include "lex.yy.c"

int main(void)
{
printf("Enter the exp: ");
yyparse();
return 0;
}

Output:

students@cselab-desktop:~$ lex ift.l
students@cselab-desktop:~$ yacc ift.y
students@cselab-desktop:~$ gcc y.tab.c -ll -ly
students@cselab-desktop:~$ ./a.out
Enter the exp: if(a==1) then b=1; else b=2;
Input accepted.
students@cselab-desktop:~$
