
ASSIGNMENT NO: 24

Problem Statement:

Write an ambiguous CFG to recognize an infix expression and implement a parser that recognizes the infix expression using YACC.

Theory:

Lex generates C code for a lexical analyzer, or scanner. Lex uses patterns that match strings in the input and converts the strings to tokens. Lex itself doesn't produce an executable program; instead it translates the lex specification into a file containing a C routine called yylex().

Whenever the lexer returns a token to the parser, if the token has an associated value, the lexer must store the value in yylval before returning. In more complex parsers, yacc defines yylval as a union and puts the definition in y.tab.h.

Yacc generates C code for a syntax analyzer, or parser. Yacc uses grammar rules that allow it to analyze tokens from Lex and create a syntax tree. The yacc command converts a context-free grammar specification into a set of tables for a simple automaton that executes an LALR(1) parsing algorithm. The grammar can be ambiguous; specified precedence rules are used to break ambiguities.

You must compile the output file, y.tab.c, with a C language compiler to produce a yyparse function. This function must be loaded with the yylex lexical analyzer, as well as with the main subroutine and the yyerror error-handling subroutine (you must provide these subroutines). The lex command is useful for creating lexical analyzers usable by the yyparse subroutine. Simple versions of the main and yyerror subroutines are available through the yacc library, liby.a. Also, yacc can be used to generate C++ output.

%left '+' '-' and %left '*' '/': Each of these declarations defines a level of precedence. They tell yacc that + and - are left associative and at the lowest precedence level, and that * and / are left associative and at a higher precedence level.
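To make the effect of a %left declaration concrete, the following minimal C sketch evaluates a chain of operators from a single precedence level with both possible groupings. It is only an illustration and not code that yacc generates; the names apply, eval_left and eval_right are hypothetical helpers introduced here.

```c
#include <stddef.h>

/* Apply one binary operator. Only '+', '-', '*' and '/' are handled in this
   sketch; the caller is assumed to avoid division by zero. */
static int apply(char op, int lhs, int rhs) {
    switch (op) {
    case '+': return lhs + rhs;
    case '-': return lhs - rhs;
    case '*': return lhs * rhs;
    default:  return lhs / rhs;
    }
}

/* Left-associative grouping: ((v0 op0 v1) op1 v2) ...
   n is the number of operands; ops holds the n - 1 operators. */
int eval_left(const int *vals, const char *ops, size_t n) {
    int acc = vals[0];
    for (size_t i = 1; i < n; i++)
        acc = apply(ops[i - 1], acc, vals[i]);
    return acc;
}

/* Right-associative grouping: v0 op0 (v1 op1 (v2 ...)) */
int eval_right(const int *vals, const char *ops, size_t n) {
    int acc = vals[n - 1];
    for (size_t i = n - 1; i > 0; i--)
        acc = apply(ops[i - 1], vals[i - 1], acc);
    return acc;
}
```

For the operands 1, 2, 3 and the operators "-+", eval_left computes (1 - 2) + 3 and eval_right computes 1 - (2 + 3). The first grouping is the one that %left selects.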

Various functions used:


1) yylex() - The scanner entry point. As soon as a call to yylex() is encountered, the scanner starts scanning the source program.
2) yywrap() - Called when the scanner encounters end of file. If yywrap() returns 0, the scanner continues scanning; if it returns 1, the end of the input has been reached.
3) yyparse() - The parser entry point. yyparse() calls yylex() every time it wants to obtain a token from the input.
4) yyerror() - The first thing the parser does when it performs the Error action is to call yyerror(). The function is passed one operand: a character string describing the type of error that just took place.
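As a rough illustration of how these routines fit together, here is a hand-written C sketch of the scanner side. It is not what lex generates: a generated scanner reads from yyin, while this one reads from a string through the hypothetical helper set_input. The token code NUMBER and the counter error_count are also assumptions made for the example; yacc normally assigns token codes in y.tab.h.

```c
#include <ctype.h>
#include <stdio.h>

#define NUMBER 258              /* hypothetical token code for this sketch */

int yylval;                     /* value of the most recent token */
static const char *src = "";    /* hypothetical input buffer */
static int error_count = 0;     /* hypothetical error counter */

/* Hypothetical helper: point the scanner at a new input string. */
void set_input(const char *text) { src = text; }

/* Return the next token code, or 0 when the input is exhausted. */
int yylex(void) {
    while (*src == ' ' || *src == '\t')
        src++;                              /* skip blanks */
    if (*src == '\0')
        return 0;                           /* end of input */
    if (isdigit((unsigned char)*src)) {
        yylval = 0;                         /* store the value before returning */
        while (isdigit((unsigned char)*src))
            yylval = yylval * 10 + (*src++ - '0');
        return NUMBER;
    }
    return (unsigned char)*src++;           /* operators are their own token codes */
}

/* Returning 1 tells the scanner that there is no further input. */
int yywrap(void) { return 1; }

/* Called by the parser with a string describing the error. */
void yyerror(const char *msg) {
    error_count++;
    fprintf(stderr, "parse error: %s\n", msg);
}
```

With the input "12 + 3", successive calls to yylex() return NUMBER (with yylval set to 12), '+', NUMBER (with yylval set to 3), and finally 0.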

Lex and Yacc:

[Figure: Lex and Yacc pipeline. Labels: input char stream, token stream, parsed input.]

Grammar: A grammar G is a tuple G = (V, T, P, S) where
1. V is the (finite) set of variables (or non-terminals or syntactic categories). Each variable represents a language, i.e., a set of strings.
2. T is a finite set of terminals, i.e., the symbols that form the strings of the language being defined.
3. P is a set of production rules that represent the recursive definition of the language.
4. S is the start symbol that represents the language being defined. Other variables represent auxiliary classes of strings that are used to help define the language of the start symbol.

CFG (Context Free Grammar): A grammar G = (V, T, P, S) is said to be a context-free grammar if every production rule in P has the following form:

If α → β is a production rule, then α ∈ V (i.e. a single occurrence of a non-terminal) and β ∈ (V ∪ T)* (i.e. zero or more occurrences of non-terminals or terminals).
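The tuple definition and the context-free condition can be sketched as a small C data structure with a checking function. This is only an illustration under simplifying assumptions: every symbol is a single character, and the names production, grammar, in_set and is_context_free are hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/* One production rule: lhs -> rhs, each symbol being one character. */
struct production {
    const char *lhs;
    const char *rhs;
};

/* G = (V, T, P, S) */
struct grammar {
    const char *variables;              /* V: set of non-terminals */
    const char *terminals;              /* T: set of terminals */
    const struct production *rules;     /* P: production rules */
    size_t rule_count;
    char start;                         /* S: start symbol */
};

/* True if c is a member of the symbol set (the NUL byte is never a member). */
static bool in_set(const char *set, char c) {
    return c != '\0' && strchr(set, c) != NULL;
}

/* Check that every rule has a single variable on the left-hand side and
   only symbols from V or T on the right-hand side (possibly none). */
bool is_context_free(const struct grammar *g) {
    if (!in_set(g->variables, g->start))
        return false;
    for (size_t i = 0; i < g->rule_count; i++) {
        const struct production *p = &g->rules[i];
        if (strlen(p->lhs) != 1 || !in_set(g->variables, p->lhs[0]))
            return false;
        for (const char *s = p->rhs; *s != '\0'; s++)
            if (!in_set(g->variables, *s) && !in_set(g->terminals, *s))
                return false;
    }
    return true;
}
```

For example, a grammar with V = "E", T = "+*n", start symbol 'E' and the rules E -> E+E, E -> E*E, E -> n passes the check, while a rule whose left-hand side is "aE" does not.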

Ambiguous CFG:
- A context-free grammar with more than one parse tree for some expression is called ambiguous.
- Ambiguity is dangerous, because it can affect the meaning of expressions; e.g. in the expression 1 - 2 + 3, it matters whether - has precedence over + or vice versa: (1 - 2) + 3 = 2, but 1 - (2 + 3) = -4.
- To address this problem, parser generators (like YACC) allow the language designer to specify the operator associativity and precedence.

Infix Expression:
Consider the simple arithmetic expression A+B. We have three possibilities for the positioning of the operator:
1. before the operands, as +AB, which is called Prefix notation or Prefix Expression
2. between the operands, as A+B, which is called Infix notation or Infix Expression
3. after the operands, as AB+, which is called Postfix notation or Reverse Polish notation

BISON PARSER ALGORITHM

1) As Bison reads tokens, it pushes them onto a stack. The stack is called the parser stack. Pushing a token is traditionally called shifting.

2) For example, suppose the infix calculator has read '1 + 5 *', with a '3' to come. The stack will have four elements, one for each token that was shifted.

3) But the stack does not always have an element for each token read. When the last n tokens and groupings shifted match the components of a grammar rule, they can be combined according to that rule. This is called reduction.

For example, if the infix calculator's parser stack contains this:

1 + 5 * 3

and the next input token is a newline character, then the last three elements can be reduced to 15 via the rule:

E: E '*' E;

Then the stack contains just these three elements:

1 + 15

At this point, another reduction can be made, resulting in the single value 16. Then the newline token can be shifted.

The parser tries, by shifts and reductions, to reduce the entire input down to a single grouping whose symbol is the grammar's start symbol. This kind of parser is known in the literature as a bottom-up parser. It is a simple shift/reduce parser.

Example: 10+5*2-1/1
1) Reduce by E->E*E: 5 * 2 = 10
2) Reduce by E->E+E: 10 + 10 = 20
3) Reduce by E->E/E: 1 / 1 = 1
4) Reduce by E->E-E: 20 - 1 = 19
Answer = 19
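The shift and reduce steps above can be imitated with a small hand-written C sketch. It is an operator-precedence shift/reduce evaluator and not the table-driven LALR(1) automaton that yacc or Bison generates. It assumes non-negative integer operands and no parentheses, and the names prec, reduce, parse_infix and MAX_DEPTH are hypothetical.

```c
#include <ctype.h>

#define MAX_DEPTH 64    /* hypothetical stack limit for this sketch */

/* Precedence levels mirroring  %left '+' '-'  followed by  %left '*' '/'.
   Anything else, including end of input, gets level 0. */
static int prec(char op) {
    if (op == '+' || op == '-') return 1;
    if (op == '*' || op == '/') return 2;
    return 0;
}

/* Reduce by E -> E op E: pop two values and one operator, push the result.
   Returns 0 on success, -1 on stack underflow or division by zero. */
static int reduce(int *vals, int *nvals, char *ops, int *nops) {
    if (*nvals < 2 || *nops < 1) return -1;
    int rhs = vals[--*nvals];
    int lhs = vals[--*nvals];
    int result;
    switch (ops[--*nops]) {
    case '+': result = lhs + rhs; break;
    case '-': result = lhs - rhs; break;
    case '*': result = lhs * rhs; break;
    default:
        if (rhs == 0) return -1;
        result = lhs / rhs;
        break;
    }
    vals[(*nvals)++] = result;
    return 0;
}

/* Parse and evaluate an infix expression. Returns 0 and stores the value
   in *result on success, or -1 on a syntax error. */
int parse_infix(const char *s, int *result) {
    int vals[MAX_DEPTH];
    char ops[MAX_DEPTH];
    int nvals = 0, nops = 0, expect_operand = 1;

    for (;;) {
        while (*s == ' ') s++;
        if (expect_operand) {
            if (!isdigit((unsigned char)*s) || nvals == MAX_DEPTH) return -1;
            int v = 0;
            while (isdigit((unsigned char)*s)) v = v * 10 + (*s++ - '0');
            vals[nvals++] = v;                      /* shift a number */
            expect_operand = 0;
        } else {
            char look = *s;                         /* lookahead token */
            int at_end = (look == '\0' || look == '\n');
            if (!at_end && prec(look) == 0) return -1;
            /* Left associativity: reduce while the operator on the stack
               binds at least as tightly as the lookahead. */
            while (nops > 0 && prec(ops[nops - 1]) >= prec(look))
                if (reduce(vals, &nvals, ops, &nops) != 0) return -1;
            if (at_end) break;
            if (nops == MAX_DEPTH) return -1;
            ops[nops++] = look;                     /* shift the operator */
            s++;
            expect_operand = 1;
        }
    }
    *result = vals[0];
    return 0;
}
```

Called on "10+5*2-1/1", the sketch performs the reductions in the same order as the example (E*E, E+E, E/E, E-E) and stores 19 in *result. Called on "1 + 5 * 3" followed by a newline, it stores 16.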

CONCLUSION:
Thus we have studied infix expression recognition using yacc.
