
Lecture on Compiler Design

Chapter 8: Intermediate Code Generation

By:
Md. Amjad Hossain
Lecturer, Dept. of CSE, KUET
Intermediate Code
What?
- Simple, machine-independent code.
- The front end of the compiler generates the intermediate code (I.C.), from which the back end generates the target
program.
Benefits:
- Retargeting is facilitated: a different back end can be attached to an existing front end.
- A machine-independent code optimizer can be applied to the intermediate code (though it sits in the back end).

Position of intermediate code generator:

Parser -> Static Checker -> Intermediate Code Generator -> (intermediate code) -> Code Generator
Intermediate Languages

Intermediate representations:
Syntax trees, postfix notation and three-address code. The semantic rules for generating all three are similar.

1. Graphical representation:
- syntax tree and DAG – natural hierarchical structure of a source program.
- Example: a := b * - c + b * - c

Syntax tree:
assign
  a
  +
    *
      b
      uminus
        c
    *
      b
      uminus
        c

DAG:
assign
  a
  +  (both operands point to the same * node)
    *
      b
      uminus
        c



Postfix notation for the expression (from syntax tree):


a b c uminus * b c uminus * + assign

Syntax-Directed Definition for Assignment Statements:

PRODUCTION        Semantic Rule

S → id := E       { S.nptr = mknode('assign', mkleaf(id, id.entry), E.nptr) }

E → E1 + E2       { E.nptr = mknode('+', E1.nptr, E2.nptr) }

E → E1 * E2       { E.nptr = mknode('*', E1.nptr, E2.nptr) }

E → - E1          { E.nptr = mkunode('uminus', E1.nptr) }

E → ( E1 )        { E.nptr = E1.nptr }

E → id            { E.nptr = mkleaf(id, id.entry) }


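Two representations of the syntax tree (for a DAG, return a pointer to an existing node
whenever possible instead of constructing a new node):

(a) Linked records: each node is a record holding the operator and pointers to its children.
    The root is assign; its children are id a and +; each operand of + is a * node
    with children id b and uminus, and each uminus has the child id c.

(b) Array of records: each node is referred to by its position in the array.

      op      left  right
 0    id      b
 1    id      c
 2    uminus  1
 3    *       0     2
 4    id      b
 5    id      c
 6    uminus  5
 7    *       4     6
 8    +       3     7
 9    id      a
10    assign  9     8      (start)
11    …       …     …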
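The DAG rule quoted above (return an existing node instead of building a new one) can be sketched as follows. This is only an illustration under assumptions: the Python names `nodes` and `index`, the one-argument `mkleaf`, and the idea of keying nodes by their (op, left, right) signature are not taken from the slides.

```python
# Nodes live in a list and are referred to by position, as in the array
# representation above.
nodes = []      # each entry: (op, left, right)
index = {}      # (op, left, right) signature -> position in nodes

def mknode(op, left=None, right=None):
    """Return the position of a node, reusing an identical existing node."""
    key = (op, left, right)
    if key not in index:           # build a new node only when none exists
        index[key] = len(nodes)
        nodes.append(key)
    return index[key]

def mkleaf(name):
    """Leaf for an identifier (simplified: no symbol-table entry)."""
    return mknode("id", name)

def term():
    """Build the subexpression b * - c."""
    return mknode("*", mkleaf("b"), mknode("uminus", mkleaf("c")))

# a := b * - c + b * - c
rhs = mknode("+", term(), term())  # the second term() reuses the first
root = mknode("assign", mkleaf("a"), rhs)

for i, (op, left, right) in enumerate(nodes):
    print(i, op, left, "" if right is None else right)
```

With sharing, the expression needs 7 nodes instead of the 11 rows of the tree, and both operands of + are position 3.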
Three Address Code
Three address code:

• A sequence of statements of the general form x := y op z

• No built-up arithmetic expressions are allowed; only one operator appears on the right-hand
side of a statement. So, the expression
x := y + z * w might be translated as:
t1 := z * w
t2 := y + t1
x := t2

• Each statement usually contains three addresses, two for the operands and
one for the result.
• In fact, three-address code is a linearization of the tree.
Example of 3-address code
Three-address code from a syntax tree / DAG:
- Three-address code is a linearized representation of a syntax tree or DAG in
which explicit names correspond to the interior nodes of the graph.

Three-address code for the syntax tree of a := b * - c + b * - c:

t1 := - c
t2 := b * t1
t3 := - c
t4 := b * t3
t5 := t2 + t4
a := t5

Three-address code for the DAG of a := b * - c + b * - c:

t1 := - c
t2 := b * t1
t5 := t2 + t2
a := t5
Types of Three-Address Statements
Assignment statement: x := y op z
Unary assignment statement: x := op z
Copy statement: x := z
Unconditional jump: goto L
Conditional jump: if x relop y goto L
Stack operations: push / pop
More advanced:
Procedure call:
param x1
param x2
...
param xn
call p, n
Indexed assignments:
x := y[i]
x[i] := y
Address and pointer assignments:
x := &y
x := *y
*x := y
Syntax-Directed Translation into 3-address code

• First deal with assignments.

• Use attributes:
– E.place: the name that will hold the value of E.
  Identifiers are assumed to already have the place attribute defined.
– E.code: holds the three-address statements that
  evaluate E (this is the "translation" attribute).
• Use the function newtemp, which returns a new temporary
variable that we can use.
• Use the function gen to generate a single three-address
statement given the necessary information (variable
names and operations).
Syntax-Directed Definition for 3-address code
PRODUCTION        Semantic Rule
S → id := E       { S.code = E.code || gen(id.place ':=' E.place) }
E → E1 + E2       { E.place = newtemp;
                    E.code = E1.code || E2.code
                             || gen(E.place ':=' E1.place '+' E2.place) }
E → E1 * E2       { E.place = newtemp;
                    E.code = E1.code || E2.code
                             || gen(E.place ':=' E1.place '*' E2.place) }
E → - E1          { E.place = newtemp;
                    E.code = E1.code
                             || gen(E.place ':=' 'uminus' E1.place) }
E → ( E1 )        { E.place = E1.place; E.code = E1.code }
E → id            { E.place = id.place; E.code = '' }

e.g. a := b * - (c+d)
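The example a := b * - (c+d) can be worked through with a small Python sketch of the definition above. It is only a sketch under assumptions: the tuple encoding of expressions and the function names `translate` and `assign` are hypothetical, E.place and E.code are returned as a pair, and the temporary is allocated after the operands are translated, so the numbering may differ from a hand translation.

```python
import itertools

_counter = itertools.count(1)

def newtemp():
    """Return a fresh temporary name t1, t2, ..."""
    return f"t{next(_counter)}"

def translate(node):
    """Return (E.place, E.code) for an expression tree.

    Hypothetical encoding: a leaf is an identifier string; interior nodes
    are ("+", l, r), ("*", l, r) or ("uminus", e).
    """
    if isinstance(node, str):                  # E -> id
        return node, []
    op = node[0]
    if op == "uminus":                         # E -> - E1
        p1, c1 = translate(node[1])
        place = newtemp()
        return place, c1 + [f"{place} := uminus {p1}"]
    p1, c1 = translate(node[1])                # E -> E1 + E2 | E1 * E2
    p2, c2 = translate(node[2])
    place = newtemp()
    return place, c1 + c2 + [f"{place} := {p1} {op} {p2}"]

def assign(target, expr):
    """S -> id := E"""
    place, code = translate(expr)
    return code + [f"{target} := {place}"]

# a := b * - (c + d)
result = assign("a", ("*", "b", ("uminus", ("+", "c", "d"))))
print("\n".join(result))
```

The sketch prints four statements: t1 := c + d, t2 := uminus t1, t3 := b * t2, a := t3. The parenthesized production E → ( E1 ) needs no case of its own here because parentheses are already absorbed by the tree structure.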
What about things that are not assignments?
• E.g. while statements of the form “while E do S”
(interpreted as while the value of E is not 0 do S)

Extension to the previous syntax-directed definition:

PRODUCTION
S → while E do S1

Semantic Rule
S.begin = newlabel;
S.after = newlabel;
S.code = gen(S.begin ':')
|| E.code
|| gen('if' E.place '=' '0' 'goto' S.after)
|| S1.code
|| gen('goto' S.begin)
|| gen(S.after ':')

Layout of the generated code:
S.begin:  E.code
          if E.place = 0 goto S.after
          S1.code
          goto S.begin
S.after:
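The while rule amounts to wrapping E.code and S1.code between two fresh labels. A minimal Python sketch, assuming E.code, E.place and S1.code have already been computed and are passed in as plain lists and strings; the names `gen_while` and the label format L1, L2, ... are hypothetical:

```python
import itertools

_labels = itertools.count(1)

def newlabel():
    """Return a fresh label name L1, L2, ..."""
    return f"L{next(_labels)}"

def gen_while(e_code, e_place, s1_code):
    """S -> while E do S1, given E.code, E.place and S1.code."""
    begin, after = newlabel(), newlabel()
    return ([f"{begin}:"]
            + e_code
            + [f"if {e_place} = 0 goto {after}"]   # leave the loop when E is 0
            + s1_code
            + [f"goto {begin}"]                    # re-evaluate E
            + [f"{after}:"])

# while i do i := i - 1   (E is just the identifier i, so E.code is empty)
code = gen_while([], "i", ["t1 := i - 1", "i := t1"])
print("\n".join(code))
```

For this input the sketch emits L1:, then if i = 0 goto L2, the two body statements, goto L1 and finally L2:.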
Implementations of 3-address statements
- Can be implemented as records with fields for the operator and the operands.
- Such representations are quadruples, triples and indirect triples.
Quadruples: records with 4 fields: op, arg1, arg2 and result.

Three-address code:
t1 := - c
t2 := b * t1
t3 := - c
t4 := b * t3
t5 := t2 + t4
a := t5

Quadruples:
      op      arg1  arg2  result
(0)   uminus  c           t1
(1)   *       b     t1    t2
(2)   uminus  c           t3
(3)   *       b     t3    t4
(4)   +       t2    t4    t5
(5)   :=      t5          a

- arg1, arg2 and result are pointers to the symbol-table entries for the names represented by
these fields.
- Temporary names must be entered into the symbol table as they are created.
Implementations of 3-address statements (contd…)

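Triples:

Three-address code:
t1 := - c
t2 := b * t1
t3 := - c
t4 := b * t3
t5 := t2 + t4
a := t5

      op      arg1  arg2
(0)   uminus  c
(1)   *       b     (0)
(2)   uminus  c
(3)   *       b     (2)
(4)   +       (1)   (3)
(5)   assign  a     (4)

- Temporary names are not entered into the symbol table; a temporary value is referred to by the
position of the statement that computes it.
- So only three fields are needed: op, arg1 and arg2.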
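The relation between the two forms can be sketched in Python by converting the quadruples for a := b * - c + b * - c into triples. This is an illustration under assumptions: the tuple layout and the names `to_triples` and `show` are hypothetical, and the sketch assumes each temporary is defined exactly once and that ':=' only ever targets a program variable.

```python
# Quadruples (op, arg1, arg2, result) for a := b * - c + b * - c.
quads = [
    ("uminus", "c",  None, "t1"),
    ("*",      "b",  "t1", "t2"),
    ("uminus", "c",  None, "t3"),
    ("*",      "b",  "t3", "t4"),
    ("+",      "t2", "t4", "t5"),
    (":=",     "t5", None, "a"),
]

def to_triples(quads):
    """Replace each temporary by the number of the triple that computes it."""
    where = {}                        # temporary name -> triple number

    def ref(arg):
        return where.get(arg, arg)    # program names are left unchanged

    triples = []
    for op, arg1, arg2, result in quads:
        if op == ":=":                # copy becomes (assign, target, source)
            triples.append(("assign", result, ref(arg1)))
        else:
            where[result] = len(triples)
            triples.append((op, ref(arg1), ref(arg2)))
    return triples

def show(arg):
    """Format an operand: triple numbers in parentheses, names as-is."""
    if arg is None:
        return ""
    return f"({arg})" if isinstance(arg, int) else arg

triples = to_triples(quads)
for n, (op, arg1, arg2) in enumerate(triples):
    print(f"({n})", op, show(arg1), show(arg2))
```

The result field disappears: the six triples refer to earlier results by number, for example (4) + (1) (3).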
Other types of 3-address statements
• Ternary operations such as
x[i] := y and x := y[i]
require two or more entries in the triple structure, e.g.

x[i] := y
      op      arg1  arg2
(0)   []=     x     i
(1)   assign  (0)   y

x := y[i]
      op      arg1  arg2
(0)   =[]     y     i
(1)   assign  x     (0)


Implementations of 3-address statements, III

• Indirect Triples: a list of pointers to triples is kept, rather than a list of the triples themselves.

      statement
(0)   (14)
(1)   (15)
(2)   (16)
(3)   (17)
(4)   (18)
(5)   (19)

      op      arg1  arg2
(14)  uminus  c
(15)  *       b     (14)
(16)  uminus  c
(17)  *       b     (16)
(18)  +       (15)  (17)
(19)  assign  a     (18)


Comparison of Representations
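- Quadruples give immediate access to the location of a temporary variable through the symbol table.
- An optimizing compiler often needs to reorder statements. This is easy for
quadruples and indirect triples; with plain triples, moving a statement changes its number, so every reference to it must be updated.
- Indirect triples can save some space when identical triples are shared {combine (14) and (16) / (15) and (17)}.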
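The space saving can be sketched in Python with a statement list that points into a shared pool of triples. This is only an illustration: the names `pool`, `statements` and `add` are hypothetical, numbering starts at 14 to match the table, and the sketch shares textually identical triples without checking whether an operand was reassigned in between, which a real compiler would have to verify.

```python
BASE = 14            # first triple number, as in the table
pool = []            # the triple table; each distinct triple is stored once
statements = []      # execution order: a list of pointers into the pool

def add(op, arg1, arg2=None):
    """Append a statement, reusing an identical triple if one exists."""
    triple = (op, arg1, arg2)
    if triple not in pool:
        pool.append(triple)
    ptr = BASE + pool.index(triple)
    statements.append(ptr)
    return ptr

# a := b * - c + b * - c
t1 = add("uminus", "c")
t2 = add("*", "b", t1)
t3 = add("uminus", "c")      # same triple as t1, so no new entry
t4 = add("*", "b", t3)       # same triple as t2 once t3 equals t1
t5 = add("+", t2, t4)
add("assign", "a", t5)

print(statements)
print(len(pool))
```

Six statements are listed, but only four triples are stored. Reordering the program would only permute `statements`; the numbers used inside the triples stay valid.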
Dealing with Procedures
P → procedure id ';' block ';'
Semantic Rule
begin = newlabel;
Enter the begin label into the symbol-table entry of the procedure name.
P.code = gen(begin ':') || block.code ||
gen('pop' return_address) || gen('goto' return_address)

S → call id
Semantic Rule
Look up the procedure name in the symbol table and find its begin label,
called proc_begin.
return = newlabel;
S.code = gen('push' return) || gen('goto' proc_begin) || gen(return ':')
Declarations: computing the types and relative addresses
of declared names
offset: a global variable that keeps the next available relative address for the declarations
in the procedure. It is initialized to 0 and then incremented by the size of each
declared variable.
enter(name, type, offset): creates a symbol-table entry for name and
records its type and relative address.

PRODUCTION                    Semantic Rule

P → M D                       { }
M → ε                         { offset := 0 }
D → id : T                    { enter(id.entry, T.type, offset);
                                offset := offset + T.width }
T → char                      { T.type = char; T.width = 4 }
T → integer                   { T.type = integer; T.width = 4 }
T → array [ num ] of T1       { T.type = array(1..num.val, T1.type);
                                T.width = num.val * T1.width }
T → ^T1                       { T.type = pointer(T1.type);
                                T.width = 4 }
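The offset bookkeeping can be sketched in Python. Assumptions are flagged here and in the comments: the tuple encoding of types and the names `width` and `layout` are hypothetical, several declarations are assumed to be processed one after another, and a pointer is given width 4 as the width of the pointer type itself.

```python
# Hypothetical encoding of types: "char", "integer",
# ("array", num, elem_type) and ("pointer", target_type).
def width(t):
    """T.width for the type expressions used in the rules above."""
    if t in ("char", "integer"):
        return 4                       # both widths are 4 in the slides
    if t[0] == "array":
        return t[1] * width(t[2])      # num.val * T1.width
    if t[0] == "pointer":
        return 4
    raise ValueError(f"unknown type: {t!r}")

def layout(declarations):
    """Process 'id : T' declarations in order; return {name: (type, offset)}."""
    symtab = {}
    offset = 0                          # M -> epsilon { offset := 0 }
    for name, t in declarations:
        symtab[name] = (t, offset)      # enter(name, T.type, offset)
        offset += width(t)              # offset := offset + T.width
    return symtab

table = layout([
    ("c", "char"),
    ("a", ("array", 10, "integer")),
    ("p", ("pointer", "integer")),
    ("n", "integer"),
])
for name, (t, off) in table.items():
    print(name, off)
```

The four names receive the relative addresses 0, 4, 44 and 48: the array of ten integers occupies 40 bytes starting at offset 4.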
Addressing array elements

• w: width of an array element; low: lower bound of the subscript;
base: relative address of the storage allocated for the array
• The i-th element of the array is at location
base + (i - low) × w
or, equivalently, i × w + (base - low × w)

The subexpression (base - low × w) is calculated at compile time
and saved in the symbol table.

Example: with base = 200, low = 1932 and w = 4, the record for 1934 is at 200 + (1934 - 1932) × 4 = 208
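The split between compile-time and run-time work can be sketched in Python using the numbers of the example (base 200, lower bound 1932, width 4). The function name `make_addresser` and the closure are illustrative assumptions; a compiler would store the constant in the symbol table rather than in a closure.

```python
def make_addresser(base, low, w):
    """Return a function i -> relative address of A[i] (one dimension)."""
    c = base - low * w            # known at compile time, stored with the array
    return lambda i: i * w + c    # only this part is left for run time

# Values from the example: base = 200, low = 1932, w = 4.
address = make_addresser(base=200, low=1932, w=4)
print(address(1934))
```

This prints 208, agreeing with the direct calculation 200 + (1934 - 1932) × 4.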


A two-dimensional array may be stored in one of two forms: row-major order (row by row) or column-major order (column by column).
For a two-dimensional array A stored in row-major order, the relative
address of A[i1, i2] can be calculated as follows:
base + ((i1 - low1) × n2 + i2 - low2) × w
low1 and low2 are the lower bounds on i1 and i2, and
n2 = high2 - low2 + 1 is the number of values i2 can take.

After separating out the part that can be precalculated at compile time:

((i1 × n2) + i2) × w + (base - ((low1 × n2) + low2) × w)

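Generalization: the relative address of A[i1, i2, …, ik] in row-major order is
((…((i1 × n2 + i2) × n3 + i3)…) × nk + ik) × w
+ base - ((…((low1 × n2 + low2) × n3 + low3)…) × nk + lowk) × w
where nj = highj - lowj + 1; the second line can be precalculated at compile time.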
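For k dimensions, the row-major address can be evaluated with a running multiply-and-add over the dimensions, which reduces to the two-dimensional formula above when k = 2. A minimal Python sketch; the function names `address` and `fold`, the parameter names and the sample array are assumptions made for illustration:

```python
def address(indices, lows, highs, base, w):
    """Relative address of A[i1, ..., ik] stored in row-major order."""
    sizes = [hi - lo + 1 for lo, hi in zip(lows, highs)]   # n1 .. nk

    def fold(values):
        # ((v1 * n2 + v2) * n3 + v3) ... * nk + vk
        acc = values[0]
        for v, n in zip(values[1:], sizes[1:]):
            acc = acc * n + v
        return acc

    c = base - fold(lows) * w        # compile-time part
    return fold(indices) * w + c     # run-time part

# Hypothetical 2 x 3 array with both lower bounds 1, base 0 and width 4.
print(address((2, 3), lows=(1, 1), highs=(2, 3), base=0, w=4))
```

For the last element of this six-element array the sketch prints 20, that is, five elements of width 4 past the base.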


Boolean Expressions
E → E or E | E and E | not E | ( E ) | id relop id | true | false
relop: <, <=, ==, >, >=, !=

Two methods of representing the value of a boolean expression:

- Numerical
- Flow of control
Numerical representation (0 or 1):
Three-address code for the expression a or b and not c:
t1 := not c
t2 := b and t1
t3 := a or t2
- A relational expression such as a < b is
equivalent to if a < b then 1 else 0. The three-address
code for it, using arbitrary statement numbers
as labels:

100: if a < b goto 103
101: t := 0
102: goto 104
103: t := 1
104:
Boolean Expressions – Generating three address code

emit(): places three-address statements into an
output file.
nextstat: gives the index of the next three-address
statement in the output sequence.
newtemp: returns a new temporary variable t1, t2, t3, …
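How emit, nextstat and newtemp work together can be sketched in Python for a single relational expression. This is a sketch under assumptions: the output is kept in a list instead of a file, numbering starts at 100 as in the a < b example, and the function `relop` is a hypothetical stand-in for the semantic rule of E → id relop id.

```python
import itertools

START = 100               # arbitrary first statement number
code = []                 # output sequence of three-address statements
_temps = itertools.count(1)

def nextstat():
    """Index of the next three-address statement to be emitted."""
    return START + len(code)

def emit(text):
    """Append one three-address statement to the output sequence."""
    code.append(text)

def newtemp():
    """Return a fresh temporary name t1, t2, ..."""
    return f"t{next(_temps)}"

def relop(id1, op, id2):
    """E -> id1 relop id2: leave 1 in E.place if the test holds, else 0."""
    place = newtemp()
    emit(f"if {id1} {op} {id2} goto {nextstat() + 3}")  # jump to 'place := 1'
    emit(f"{place} := 0")
    emit(f"goto {nextstat() + 2}")                      # skip 'place := 1'
    emit(f"{place} := 1")
    return place

relop("a", "<", "b")
for n, line in enumerate(code, START):
    print(f"{n}: {line}")
```

The jump targets are computed relative to nextstat, so the sketch reproduces statements 100 to 103 of the a < b example, with t1 in place of t.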
[Table: production rules and semantic rules for generating three-address code for boolean expressions]
[Figure: example]
Flow of control statements

S -> if E then S1
| if E then S1 else S2
| while E do S1
