
ASSIGNMENT NO : 25

Title: Three Address Code Generation Using LEX and YACC

Problem Statement: Write a program using Lex and Yacc to generate intermediate code in the form of three address code (quadruples) for statements using the assignment operator.

Theory:
Lex itself doesn't produce an executable program; instead it translates the lex specification into a file containing a C routine called yylex().

%left '+' '-' and %left '*' '/': Each of these declarations defines a level of precedence. They tell yacc that + and - are left associative and at the lowest precedence level, and that * and / are left associative and at a higher precedence level.

Whenever the lexer returns a token to the parser, if the token has an associated value, the lexer must store the value in yylval before returning. In more complex parsers, yacc defines yylval as a union and puts the definition in y.tab.h.

LEX:
The lexical analyser is the first phase of a compiler. LEX generates C code for a lexical analyzer, or scanner. The scanner converts the input string into a sequence of tokens. Each token describes the class or category of a piece of the input, for example identifier, keyword or constant. A lex specification file is created with the extension .l (pronounced "dot l"). This file is given to LEX to produce lex.yy.c. lex.yy.c is a C program that is the actual lexical analyzer; it contains a tabular representation of the transition diagrams for the regular expressions in the specification file. The lexemes are then recognized using these tables.

YACC:
YACC generates C code for a syntax analyzer, or parser. YACC uses grammar rules that allow it to analyze tokens from LEX and create a syntax tree.


A syntax tree imposes a hierarchical structure on tokens. For example, operator precedence and associativity are apparent in the syntax tree.
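How precedence and associativity show up in the tree can be illustrated with a small sketch. This is not what yacc generates; it is a hand-written precedence-climbing routine in plain C that prints the grouping a parser with %left '+' '-' and %left '*' '/' would build. The names group, parse_expr and prec are made up for this illustration, and the input is assumed to be well formed, with single-character operands and no spaces.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define BUF 128   /* size of every output buffer used below */

static const char *src;   /* cursor into the expression being parsed */

/* Precedence levels mirroring %left '+' '-' followed by %left '*' '/'. */
static int prec(char op) {
    if (op == '+' || op == '-') return 1;
    if (op == '*' || op == '/') return 2;
    return 0;   /* not a binary operator (also covers end of string) */
}

/* Parse operators of precedence >= min_prec (min_prec is always >= 1)
   and write the fully parenthesized form to out (at least BUF bytes). */
static void parse_expr(int min_prec, char *out) {
    char left[BUF];
    snprintf(left, sizeof left, "%c", *src++);        /* single-letter operand */
    while (prec(*src) >= min_prec) {
        char op = *src++;
        char right[BUF], joined[2 * BUF + 4];
        /* prec(op) + 1 makes operators of equal precedence group to the left */
        parse_expr(prec(op) + 1, right);
        snprintf(joined, sizeof joined, "(%s%c%s)", left, op, right);
        snprintf(left, sizeof left, "%s", joined);
    }
    snprintf(out, BUF, "%s", left);
}

/* Entry point: out must have room for at least BUF bytes. */
void group(const char *expr, char *out) {
    assert(expr != NULL && out != NULL && strlen(expr) > 0);
    src = expr;
    parse_expr(1, out);
}
```

Calling group("a-b-c", buf) gives ((a-b)-c), which shows left associativity, and group("a+b*c", buf) gives (a+(b*c)), which shows that * binds tighter than +.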


Thus, by building the syntax tree, the parser finds errors, if any. It must also recover from errors so that processing of the rest of the input can continue.

Functions Used:

yylex(): The lexical analyzer function. It recognizes tokens from the input stream and returns them to the parser.

yytext: Array of, or pointer to (see man), char where lex places the current token's lexeme. The string is automatically null-terminated. It can be modified but not lengthened.

yywrap(): Used for multiple input files. Specifies what to do when the EOF is recognised.

yyparse(): The parser function. It returns an integer value: zero on success, or non-zero if it is unable to parse the token sequence provided by yylex().

%token [<type>] token [number] [token [number]... ]: Lists tokens or terminal symbols to be used in the rest of the input file. This line is needed for tokens that do not appear in other % definitions. If type is present, the C type for all tokens on this line is declared to be the type referenced by type. If a positive integer number follows a token, that value is assigned to the token.

%left [<type>] token [number] [token [number]... ]: Indicates that each token is an operator, that all tokens in this definition have equal precedence, and that a succession of the operators listed in this definition is evaluated left to right.

%right [<type>] token [number] [token [number]... ]: Indicates that each token is an operator, that all tokens in this definition have equal precedence, and that a succession of the operators listed in this definition is evaluated right to left.

%type <type> symbol [symbol ...]: Defines each symbol as data type type, to resolve ambiguities. If this construct is present, yacc performs type checking; otherwise it assumes all symbols to be of type integer.

%union union-def: Defines the yylval global variable as a union, where union-def is a standard C definition in the format: { type member ; [type member ; ...] } At least one member should be an int. Any valid C data type can be defined, including structures. When you run yacc with the -d option, the definition of yylval is placed in the y.tab.h file and can be referred to in a lex input file.
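The interplay between yylex() and yylval can be sketched in plain C. The block below is a hand-written stand-in, not output of lex or yacc: YYSTYPE_SKETCH imitates what a declaration such as %union { int ival; char name; } might produce, yylex_sketch plays the role of yylex(), and the token codes NUMBER and LETTER are given arbitrary values above 255 (yacc would normally assign them in y.tab.h). All of these names are assumptions made for the illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Token codes; in a real build these come from y.tab.h (values here are arbitrary). */
enum { NUMBER = 258, LETTER = 259 };

/* Stand-in for the union a %union declaration would define. */
typedef union {
    int  ival;   /* value of a NUMBER token */
    char name;   /* identifier carried by a LETTER token */
} YYSTYPE_SKETCH;

YYSTYPE_SKETCH yylval_sketch;        /* plays the role of yylval */
static const char *input_sketch;     /* remaining input text */

/* Point the scanner at a new input string. */
void scan_from(const char *text) {
    assert(text != NULL);
    input_sketch = text;
}

/* Plays the role of yylex(): returns a token code, or 0 at end of input.
   The token's value is stored in yylval_sketch before returning. */
int yylex_sketch(void) {
    while (*input_sketch == ' ' || *input_sketch == '\t')
        input_sketch++;
    if (*input_sketch == '\0')
        return 0;
    if (isdigit((unsigned char)*input_sketch)) {
        char *end;
        yylval_sketch.ival = (int)strtol(input_sketch, &end, 10);
        input_sketch = end;
        return NUMBER;
    }
    if (isalpha((unsigned char)*input_sketch)) {
        yylval_sketch.name = *input_sketch++;
        return LETTER;
    }
    return *input_sketch++;   /* operators are returned as their character code */
}
```

After scan_from("a = b + 42"), successive calls return LETTER, '=', LETTER, '+', NUMBER and finally 0, with yylval_sketch holding the matching value after each LETTER or NUMBER.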

Intermediate code:
The intermediate representation should have two important properties: it should be easy to produce, and easy to translate into the target program. Intermediate representation can have a variety of forms. One of these forms is three address code, which is like the assembly language for a machine in which every location can act like a register.

Three address code consists of a sequence of instructions, each of which has at most three operands. In the first pass of the compiler, the source program is converted into intermediate code. The second pass converts the intermediate code to target code. Intermediate code generation is done by the intermediate code generation phase. It takes its input from the front end, which consists of lexical analysis, syntax analysis and semantic analysis, generates intermediate code, and passes it to the code generator.

Necessity of intermediate code:
- Lexical analysis, syntax analysis and semantic analysis are machine independent and can be reused.
- Every machine has its own set of characteristics.
- Without an intermediate code, M programming languages and N target machines require M*N compilers to be designed; with a shared intermediate code, M front ends and N back ends suffice.

Fig. shows the position of the intermediate code generator in the compiler. Although some source code can be converted directly to target code, intermediate code has some advantages:
a. Target code can be generated for any machine just by attaching a new machine as the back end. This is called retargeting.
b. It is possible to apply machine independent code optimization. This helps in generating more efficient code.

Here the intermediate representation is in the form of three address code. Other forms of intermediate representation are the syntax tree, postfix notation and the Directed Acyclic Graph (DAG). The semantic rules for generating a syntax tree and three address code are almost the same. Intermediate representations can be graphical (syntax tree, DAG) or linear (postfix notation, three address code).

Three address code


Most instructions of three address code are of the form
a = b op c
where b and c are operands and op is an operator. The result of applying op to b and c is stored in a. The operator op can be +, -, * or /; here op is assumed to be a binary operator. The operands b and c represent addresses in memory, or they can be constants or literal values with no runtime address. The result a can be an address in memory or a temporary variable.

Example: a = b * c + 10
The three address code will be
t1 = b * c
t2 = t1 + 10
a = t2
Here t1 and t2 are temporary variables used to store the intermediate results.

Data Structure
Three address code is represented as a record structure with fields for the operator and operands. These records can be stored as an array or a linked list. The most common implementations of three address code are quadruples, triples and indirect triples.

Quadruples
A quadruple has four fields in its record structure: one field for the operator op, two fields for the operands or arguments arg1 and arg2, and one field for the result res.
res = arg1 op arg2
Example: a = b + c
b is represented as arg1, c as arg2, + as op and a as res. Unary operators like - do not use arg2. Operators like param use neither arg2 nor res. For conditional and unconditional jump statements, res is the target label. arg1, arg2 and res are pointers to the symbol table or literal table entries for the names.

Example: a = -b * d + c + (-b) * d
The three address code for the above statement is as follows:
t1 = - b
t2 = t1 * d
t3 = t2 + c
t4 = - b
t5 = t4 * d
t6 = t3 + t5
a = t6
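The quadruple record described above can be sketched in C as follows. This is one possible layout, not a prescribed one: the fields hold names directly as short strings, whereas the text describes them as pointers into a symbol table or literal table. The names Quad, emit, new_temp, print_quads and build_example are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_QUADS 64

/* One quadruple: res = arg1 op arg2. */
typedef struct {
    char op[8];
    char arg1[16];
    char arg2[16];   /* empty for unary operators and plain copies */
    char res[16];
} Quad;

static Quad quads[MAX_QUADS];   /* quadruples stored as an array */
static int  quad_count = 0;
static int  temp_count = 0;

/* Write a fresh temporary name (t1, t2, ...) into buf and return buf. */
const char *new_temp(char *buf, size_t cap) {
    snprintf(buf, cap, "t%d", ++temp_count);
    return buf;
}

/* Append a quadruple to the array and return its index. */
int emit(const char *op, const char *arg1, const char *arg2, const char *res) {
    assert(quad_count < MAX_QUADS);
    Quad *q = &quads[quad_count];
    snprintf(q->op,   sizeof q->op,   "%s", op);
    snprintf(q->arg1, sizeof q->arg1, "%s", arg1);
    snprintf(q->arg2, sizeof q->arg2, "%s", arg2);
    snprintf(q->res,  sizeof q->res,  "%s", res);
    return quad_count++;
}

/* Print the quadruple array as a table. */
void print_quads(void) {
    printf("%-6s %-4s %-6s %-6s %-6s\n", "INDEX", "OP", "ARG1", "ARG2", "RESULT");
    for (int i = 0; i < quad_count; i++)
        printf("(%d)    %-4s %-6s %-6s %-6s\n",
               i, quads[i].op, quads[i].arg1, quads[i].arg2, quads[i].res);
}

/* Quadruples for the example a = b * c + 10. */
void build_example(void) {
    char t1[16], t2[16];
    emit("*", "b", "c",  new_temp(t1, sizeof t1));
    emit("+", t1,  "10", new_temp(t2, sizeof t2));
    emit("=", t2,  "",   "a");
}
```

Calling build_example() followed by print_quads() lists three records, one per three address instruction. A linked list would work equally well; the array is used here only because it makes the index column trivial.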

Quadruples for the above example are as follows:

INDEX   OPERATOR   ARG1   ARG2   RESULT
(0)     -          b             t1
(1)     *          t1     d      t2
(2)     +          t2     c      t3
(3)     -          b             t4
(4)     *          t4     d      t5
(5)     +          t3     t5     t6
(6)     =          t6            a

Triples
Triples use only three fields in the record structure: one field for the operator and two fields for the operands, named arg1 and arg2. The value of a temporary is referred to by the position of the statement that computes it, not by a named location as in quadruples.

Indirect Triples
The indirect triple representation keeps a separate list of pointers to triples; the statement order is given by this list of pointers rather than by the list of triples itself.
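The difference from quadruples can be made concrete with a small C sketch. Here an operand is either a name (variable or constant) or a reference to the position of an earlier triple, so no temporary names are needed. The types Operand and Triple and the functions add_triple, format_operand and build_triples_example are hypothetical names chosen for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_TRIPLES 64

typedef enum { OPD_NONE, OPD_NAME, OPD_REF } OperandKind;

typedef struct {
    OperandKind kind;
    const char *name;   /* valid when kind == OPD_NAME (variable or constant) */
    int         ref;    /* valid when kind == OPD_REF: index of another triple */
} Operand;

typedef struct {
    char    op;
    Operand arg1, arg2;
} Triple;

static Triple triples[MAX_TRIPLES];
static int    triple_count = 0;

static Operand name_opd(const char *n) { Operand o = { OPD_NAME, n, -1 }; return o; }
static Operand ref_opd(int index)      { Operand o = { OPD_REF, NULL, index }; return o; }

/* Append a triple and return its position, which later triples can refer to. */
int add_triple(char op, Operand arg1, Operand arg2) {
    assert(triple_count < MAX_TRIPLES);
    triples[triple_count].op   = op;
    triples[triple_count].arg1 = arg1;
    triples[triple_count].arg2 = arg2;
    return triple_count++;
}

/* Render an operand as text: a name as-is, a reference as "(index)". */
void format_operand(Operand o, char *buf, size_t cap) {
    if (o.kind == OPD_REF)       snprintf(buf, cap, "(%d)", o.ref);
    else if (o.kind == OPD_NAME) snprintf(buf, cap, "%s", o.name);
    else                         snprintf(buf, cap, "%s", "");
}

/* Triples for a = b * c + 10:
   (0) * b   c
   (1) + (0) 10
   (2) = a   (1)  */
void build_triples_example(void) {
    int mul = add_triple('*', name_opd("b"), name_opd("c"));
    int sum = add_triple('+', ref_opd(mul), name_opd("10"));
    add_triple('=', name_opd("a"), ref_opd(sum));
}
```

An indirect-triple variant could be layered on top of this by keeping a separate array of indices into triples that records the statement order, so statements can be reordered without renumbering the references.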

ALGORITHM:
1. Start.
2. Give the input arithmetic expression to the lexical analyser.
3. To recognize the tokens, pass the input to the yylex function.
4. Declare the tokens Number and Letter.
5. Declare the associativity of the operators.
6. In the rules section, define the patterns for the various operators.
7. Handle end of file through the yywrap() function.
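The overall flow of the steps above can be approximated in plain C. The sketch below is a stand-in only: it replaces the generated yylex() and yyparse() with a hand-written recursive-descent parser, and it emits three address code as text rather than filling a quadruple table. The function names (generate, expr, term, factor, binary, append, read_name) are assumptions made for this illustration, and only assignment statements with +, -, *, /, unary minus and parentheses are handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define NAME_LEN 16

static const char *p;      /* cursor into the input statement */
static char code[1024];    /* generated code, one instruction per line */
static int  ntemp;         /* counter for temporaries t1, t2, ... */

static void skip(void) { while (*p == ' ') p++; }

/* Append formatted text to the generated code. */
static void append(const char *fmt, ...) {
    size_t used = strlen(code);
    va_list ap;
    va_start(ap, fmt);
    vsnprintf(code + used, sizeof code - used, fmt, ap);
    va_end(ap);
}

/* Read an identifier or number into place. */
static void read_name(char *place) {
    int n = 0;
    skip();
    while (isalnum((unsigned char)*p) && n < NAME_LEN - 1) place[n++] = *p++;
    place[n] = '\0';
}

/* Emit "tN = left op right" and make left name the new temporary. */
static void binary(char *left, char op, const char *right) {
    char t[NAME_LEN];
    snprintf(t, sizeof t, "t%d", ++ntemp);
    append("%s = %s %c %s\n", t, left, op, right);
    strcpy(left, t);
}

static void expr(char *place);

/* factor : '(' expr ')' | '-' factor | name | number */
static void factor(char *place) {
    skip();
    if (*p == '(') {
        p++;
        expr(place);
        skip();
        if (*p == ')') p++;
    } else if (*p == '-') {
        char inner[NAME_LEN];
        p++;
        factor(inner);
        snprintf(place, NAME_LEN, "t%d", ++ntemp);
        append("%s = - %s\n", place, inner);
    } else {
        read_name(place);
    }
}

/* term : factor { ('*' | '/') factor }, grouped left to right */
static void term(char *place) {
    factor(place);
    for (skip(); *p == '*' || *p == '/'; skip()) {
        char op = *p++, right[NAME_LEN];
        factor(right);
        binary(place, op, right);
    }
}

/* expr : term { ('+' | '-') term }, grouped left to right */
static void expr(char *place) {
    term(place);
    for (skip(); *p == '+' || *p == '-'; skip()) {
        char op = *p++, right[NAME_LEN];
        term(right);
        binary(place, op, right);
    }
}

/* Translate "name = expression" and return the generated code. */
const char *generate(const char *stmt) {
    char target[NAME_LEN], place[NAME_LEN];
    p = stmt;
    code[0] = '\0';
    ntemp = 0;
    read_name(target);
    skip();
    assert(*p == '=');   /* only assignment statements are handled */
    p++;
    expr(place);
    append("%s = %s\n", target, place);
    return code;
}
```

For example, generate("a = b * c + 10") returns the three lines t1 = b * c, t2 = t1 + 10 and a = t2. In the lex and yacc version, binary corresponds to the action attached to a rule such as expr : expr '+' expr, and the layering of term and factor is what the %left declarations express.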

CONCLUSION:
Thus we have studied the use of lex and yacc to generate intermediate code in the form of three address code for assignment statements.
