
© Theo Ruys

Vertalerbouw HC6
Code Generation

VB HC6 http://fmt.cs.utwente.nl/courses/vertalerbouw/

Michael Weber
Theo Ruys
University of Twente
Department of Computer Science
Formal Methods & Tools
room: INF 5037
phone: 3716
email: michaelw@cs.utwente.nl

Overview of Lecture 6

•  Ch 7 – Code Generation
   7.1 Code selection
   7.2 A code generation algorithm
   7.3 Constants and variables
   7.4 Procedures and functions (HC7)
   7.5 Case study – Triangle compiler
VB HC 6 Ch. 7 - Code Generation 3

Compiler Phases

Source Program
  → Syntax Analysis (Ch. 4)
  → AST
  → Contextual Analysis (Ch. 5)
  → Decorated AST
  → Code Generation (Ch. 7)
       •  code selection
       •  storage allocation
       •  register allocation
  → Object Code

Ch. 6: Run-Time Organization
Ch. 8: Interpreter (Abstract) Machine

Code Generation

•  In week 3 (laboratory) we have seen that we can walk the
   AST to interpret the source program directly.

•  A code generator also walks the AST but generates
   object code instructions, which
   –  can directly be written to a file/standard output (laboratory week 4)
   –  can be stored in a temporary array of instructions and
      written to a file after walking the complete AST

The language Calc has a monolithic block structure; all variables
are global and can be allocated on the global stack (relative to SB).


Code Selection

•  A compiler translates a program from a high-level language into an
   equivalent program in a low-level language.
   –  Source and target program must be semantically equivalent.

    let                     PUSH 2
      var x: integer;
      var y: integer
    in begin
      y := 2;               LOADL 2
                            STORE(1) 1[SB]
      x := 7;               LOADL 7
                            STORE(1) 0[SB]
      printint(y);          LOAD(1) 1[SB]
                            CALL putint
      printint(x)           LOAD(1) 0[SB]
                            CALL putint
    end                     POP 2
                            HALT

Code generation is concerned with the semantics of the language.

Code Templates (1)
translation rules (vertaalregels)

•  The translation of source language to target language is
   defined inductively over the (abstract) syntax of the
   language.
   Given a phrase of the source language, we specify the
   sequence of corresponding target code instructions using
   the translation of its sub-phrases.

•  Code functions:
    run       : Program     → Instruction*
    execute   : Command     → Instruction*
    evaluate  : Expression  → Instruction*
    fetch     : V-name      → Instruction*
    assign    : V-name      → Instruction*
    elaborate : Declaration → Instruction*

   Instruction* is a sequence of target instructions.
   Naturally, the semantics of a language dictates the code generation,
   i.e., the code functions.

Code Templates (2)

Summary of code functions for Triangle and TAM

class        code function  effect of the generated code
Program      run P          Run the program P and then halt, starting and finishing
                            with an empty stack.
Command      execute C      Execute the command C, possibly updating variables, but
                            neither expanding nor contracting the stack.
Expression   evaluate E     Evaluate the expression E, pushing its result on the stack
                            top, but having no other effects.
V-name       fetch V        Push the value of the constant or variable named V on the
                            stack.
V-name       assign V       Pop a value from the stack top, and store it in the variable
                            named V.
Declaration  elaborate D    Elaborate the declaration D, expanding the stack to make
                            space for any constants and variables declared therein.

Code Templates (3)

•  Sequential Command: C1 ; C2
   –  Semantics: the sequential command C1 ; C2 is executed as
      follows: first C1 is executed then C2 is executed.
   –  execute [C1 ; C2]
        = execute [C1]
          execute [C2]
      Code template for C1 ; C2: the code to execute
      C1 ; C2 consists of the code to execute C1
      followed by the code to execute C2.

•  Assignment Command: V := E
   –  Semantics: the expression E is evaluated to yield a value;
      the variable identified by V is updated with this value.
   –  execute [V := E]
        = evaluate [E]
          assign [V]
      assign [V] = STORE a
      where a is the address of variable V.

Code Templates (4)

•  if-then-else Command: if E then C1 else C2
   –  Semantics: the expression E is evaluated; if its value is true, then C1 is
      executed; if its value is false, then C2 is executed.
   –  execute [if E then C1 else C2]
        =        evaluate [E]
                 JUMPIF(0) Lelse
                 execute [C1]
                 JUMP Lfi
          Lelse: execute [C2]
          Lfi:

A code template specifies the object code to which
a phrase is translated, in terms of the object code
to which its subphrases are translated.

Code Templates (5)

•  while Command: while E do C
   –  Semantics: the expression E is evaluated; if its value is true, then C is
      executed, and then the while-command is executed again; if its value is false,
      then execution of the while-command is completed.
   –  execute [while E do C]
        = Lwhile: evaluate [E]
                  JUMPIF(0) Lend
                  execute [C]
                  JUMP Lwhile
          Lend:
      Watt & Brown use a slightly different code template.

•  let-in Command: let D in C
   –  Semantics: the declaration D is elaborated; then C is executed, in the
      environment of the block command overlaid by the bindings produced by D.
   –  execute [let D in C]
        = elaborate [D]
          execute [C]
          POP(0) s       only if s > 0
      where s = amount of storage allocated by D.
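The templates for the sequential, assignment, if-then-else and while commands can be read as functions that take the code of the sub-phrases and return the code of the whole phrase. The following is a minimal sketch of that reading, not part of the Triangle compiler: the class `CodeTemplates`, its method names and the numbered symbolic labels are hypothetical, and instructions are plain strings rather than TAM instruction records.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Code templates as plain functions: sub-phrase code in, phrase code out. */
final class CodeTemplates {
    private int nextLabel = 0; // hypothetical counter that keeps labels unique

    /** Concatenates instruction sequences in the given order. */
    @SafeVarargs
    static List<String> concat(List<String>... parts) {
        List<String> out = new ArrayList<>();
        for (List<String> part : parts) out.addAll(part);
        return out;
    }

    /** execute [C1 ; C2] = execute [C1], execute [C2] */
    List<String> sequential(List<String> c1, List<String> c2) {
        return concat(c1, c2);
    }

    /** execute [V := E] = evaluate [E], assign [V] */
    List<String> assign(String v, List<String> e) {
        return concat(e, Arrays.asList("STORE " + v));
    }

    /** execute [if E then C1 else C2] */
    List<String> ifThenElse(List<String> e, List<String> c1, List<String> c2) {
        int k = nextLabel++;
        String lElse = "Lelse" + k, lFi = "Lfi" + k;
        return concat(e, Arrays.asList("JUMPIF(0) " + lElse),
                      c1, Arrays.asList("JUMP " + lFi, lElse + ":"),
                      c2, Arrays.asList(lFi + ":"));
    }

    /** execute [while E do C] */
    List<String> whileDo(List<String> e, List<String> c) {
        int k = nextLabel++;
        String lWhile = "Lwhile" + k, lEnd = "Lend" + k;
        return concat(Arrays.asList(lWhile + ":"), e,
                      Arrays.asList("JUMPIF(0) " + lEnd), c,
                      Arrays.asList("JUMP " + lWhile, lEnd + ":"));
    }

    public static void main(String[] args) {
        // while i>0 do i:=i-2
        CodeTemplates t = new CodeTemplates();
        List<String> cond = Arrays.asList("LOAD i", "LOADL 0", "CALL gt");
        List<String> body = t.assign("i", Arrays.asList("LOAD i", "LOADL 2", "CALL sub"));
        for (String line : t.whileDo(cond, body)) System.out.println(line);
    }
}
```

Symbolic labels keep the sketch close to the templates; a real encoder replaces them by instruction addresses.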

Code Templates (6)

    execute [while E do C]
      = Lwhile: evaluate [E]
                JUMPIF(0) Lend
                execute [C]
                JUMP Lwhile
        Lend:

•  Example: while i>0 do i:=i-2

    execute [while i>0 do i:=i-2]
      evaluate [i>0]        50: LOAD i
                            51: LOADL 0
                            52: CALL gt
                            53: JUMPIF(0) 59
      execute [i:=i-2]      54: LOAD i
                            55: LOADL 2
                            56: CALL sub
                            57: STORE i
                            58: JUMP 50
                            59:

•  There are often several ways to generate target code for an expression.
   Sometimes we can get more efficient code for special cases.

•  Example: evaluate [i+1]

    general template:       special case:
      LOAD i                  LOAD i
      LOADL 1                 CALL succ
      CALL add

   More efficient code for the special case “+1”.
   Often used for inlining constant denotations.

Code Generator: input

Decorated AST → Code Generator → TAM object code

    let
      var n: Integer;
      var c: Char
    in begin
      c := ‘&’;
      n := n+1
    end

    Program
      LetCmd
        SequentialDecl
          VarDecl
            Ident. n
            SimpleT
              Ident. Integer
          VarDecl
            Ident. c
            SimpleT
              Ident. Char
        SequentialCmd
          AssignCmd
            SimpleVar :char
              Ident c
            CharExpr :char
              Char-Lit ‘&’
          AssignCmd
            SimpleVar :int
              Ident n
            BinaryExpr :int
              VnameExpr :int
                SimpleVar :int
                  Ident n
              Op +
              IntExpr :int
                Int-Lit 1
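Choosing between the general template and the special case for evaluate [i+1] comes down to a test on the operand during code selection. A minimal sketch, assuming instructions are represented as strings and the right operand is an integer literal; `AddSelection` and `evaluateAdd` are hypothetical names that do not occur in the Triangle compiler:

```java
import java.util.Arrays;
import java.util.List;

final class AddSelection {
    /** evaluate [V + n] for a variable V and an integer literal n. */
    static List<String> evaluateAdd(String v, int n) {
        if (n == 1) {
            // Special case "+1": one primitive call replaces LOADL 1 / CALL add.
            return Arrays.asList("LOAD " + v, "CALL succ");
        }
        // General template: push both operands, then call the add primitive.
        return Arrays.asList("LOAD " + v, "LOADL " + n, "CALL add");
    }

    public static void main(String[] args) {
        System.out.println(evaluateAdd("i", 1));
        System.out.println(evaluateAdd("i", 2));
    }
}
```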
TAM object code (1)

    package TAM;

    public class Instruction {
      public int op; // op-code (LOADop, LOADAop, etc.)
      public int n;  // length field
      public int r;  // register field (SBr, LBr, L1r, etc.)
      public int d;  // operand field
    }

W&B (p. 260) uses byte-variables.
TAM.Instruction.java, however, uses int-variables.

    public class Machine {
      public static final byte // op-codes (Table C.2)
        LOADop = 0, LOADAop = 1, ...;

      public static final byte // register numbers (Table C.1)
        CBr = 0, CTr = 1, PBr = 2, PTr = 3, ...;

      private static Instruction[] code = new Instruction[1024];
    }

    public class Interpreter {
      ...
    }

Interpreter: an implementation of the Triangle Abstract Machine.

TAM object code (2)

    package Triangle.CodeGenerator;

    public class Encoder implements Visitor {
      /** Append an instruction to the object program. */
      private void emit(int op, int n, int r, int d) {
        Instruction nextInstr = new Instruction();
        if (n > 255) {
          reporter.reportRestriction(
              "length of operand can't exceed 255 words");
          n = 255; // to allow code generation to continue
        }
        nextInstr.op = op;
        nextInstr.n = n;
        nextInstr.r = r;
        nextInstr.d = d;
        if (nextInstrAddr == Machine.PB) {
          reporter.reportRestriction(
              "too many instructions for code segment");
        } else {
          Machine.code[nextInstrAddr] = nextInstr;
          nextInstrAddr++;
        }
      }

      // Address (within Machine.code) of the next instruction.
      private short nextInstrAddr = 0;
    }

Again, W&B use bytes for the parameters of emit, but the
Triangle source code uses int-variables.

AST Class Hierarchy (HC3)

    AST
      Declaration
        SeqDecl
        ConstDecl
        VarDecl
      Command
        AssignCmd
        CallCmd
        IfCmd
        WhileCmd
        SequentialCmd
      Expression
        IntegerExpr
        VnameExpr
        UnaryExpr
        BinaryExpr

Encoder (1)

When using a compiler generator (like ANTLR), one
only has to specify the code templates (in tree parser):
the generator will generate the “visiting” methods.

    public class Encoder implements Visitor {
      public Object visitProgram(Program prog, Object arg) {
        prog.C.visit(this, arg);
        emit(Machine.HALTop, 0, 0, 0);
        return null;
      }
      ...
    }

Code Generator as Visitor

phrase class  visitor method   behaviour of the visitor method
Program       visitProgram     generate code as specified by run[P]
Command       visit..Cmd       generate code as specified by execute[C]
Expression    visit..Expr      generate code as specified by evaluate[E]
V-name        visit..Vname     return “entity description” for the visited variable
                               or constant name (i.e. use the “decoration”).
Declaration   visit..Decl      generate code as specified by elaborate[D]
Type-Den      visit..TypeDen   return the size of the type


Encoder (2)

•  For variables we use two distinct code generation
   methods: fetch and assign.
   These two methods deal with the scope information of the variables.

    public class Encoder implements Visitor {
      ...
      public void encodeFetch(Vname name) {
        // as specified by fetch code template ...
      }
      public void encodeAssign(Vname name) {
        // as specified by assign code template ...
      }
    }

These methods are not implemented as visitor methods but as separate
methods of the class Encoder.

Encoder (3)

    execute [V := E] = evaluate [E]
                       assign [V]

    AssignCmd
      V
      E

    public Object visitAssignCmd(AssignCmd cmd, Object arg) {
      cmd.E.visit(this, arg);
      encodeAssign(cmd.V);
      return null;
    }

    execute [I ( E )] = evaluate [E]
                        CALL p

    CallCmd
      I
      E

    public Object visitCallCmd(CallCmd cmd, Object arg) {
      cmd.E.visit(this, arg);
      short p = ...; // address of primitive routine for name cmd.I
      emit(Instruction.CALLop,
           Instruction.SBr,
           Instruction.PBr, p);
      return null;
    }

The real encoder for the CallCmd is much more
complex due to parameter passing and scoping.

Encoder (4)

    evaluate [E1 op E2] = evaluate [E1]
                          evaluate [E2]
                          CALL p

    BinaryExpr
      E1
      O
      E2

    public Object visitBinaryExpression(
        BinaryExpression expr, Object arg) {
      expr.E1.visit(this, arg);
      expr.E2.visit(this, arg);
      short p = ...; // address for expr.O operation
      emit(Instruction.CALLop,
           Instruction.SBr,
           Instruction.PBr, p);
      return null;
    }

•  Visiting methods for LetCmd, IfCmd, WhileCmd are more complex:
   –  LetCmd involves scope information
   –  IfCmd and WhileCmd are complicated due to jumps

Encoder (5)

    execute [while E do C] = Lwhile: evaluate [E]
                                     JUMPIF(0) Lend
                                     execute [C]
                                     JUMP Lwhile
                             Lend:

•  Backwards jumps are easy: the address of the target has
   already been generated and is known.

•  Forward jumps are harder: when the jump is generated the
   target is not yet generated so its address is not (yet) known.

•  Solution: backpatching
   1.  Emit jump with dummy address (e.g. simply 0).
   2.  Remember the address where the jump instruction
       occurred.
   3.  When the target label is reached, go back and patch the
       jump instruction.

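The three backpatching steps apply equally to the if-then-else template, which needs two forward jumps (to Lelse and to Lfi). The sketch below illustrates this under simplifying assumptions: the op-code constants, the class `BackpatchDemo` and the helper `emitPlaceholder` are hypothetical, and each sub-phrase is represented only by the number of instructions it would generate.

```java
import java.util.ArrayList;
import java.util.List;

final class BackpatchDemo {
    // Hypothetical op-codes for this sketch; they are not TAM's real numbering.
    static final int OTHER = 0, JUMP = 1, JUMPIF = 2;

    static final class Instr {
        final int op;
        int d; // operand: the jump target for JUMP and JUMPIF
        Instr(int op, int d) { this.op = op; this.d = d; }
    }

    final List<Instr> code = new ArrayList<>();

    /** Appends an instruction and returns the address it was stored at. */
    int emit(int op, int d) {
        code.add(new Instr(op, d));
        return code.size() - 1;
    }

    /** Stands in for the code of a sub-phrase: emits 'length' non-jump instructions. */
    void emitPlaceholder(int length) {
        for (int i = 0; i < length; i++) emit(OTHER, 0);
    }

    /** Patches the jump at jumpAddr to target the next instruction to be emitted. */
    void patchToHere(int jumpAddr) {
        code.get(jumpAddr).d = code.size();
    }

    /** if E then C1 else C2, each sub-phrase given only by its code length. */
    void encodeIf(int eLen, int c1Len, int c2Len) {
        emitPlaceholder(eLen);             // evaluate [E]
        int jumpToElse = emit(JUMPIF, 0);  // steps 1 and 2: dummy target, remember address
        emitPlaceholder(c1Len);            // execute [C1]
        int jumpToFi = emit(JUMP, 0);      // second forward jump, also with a dummy target
        patchToHere(jumpToElse);           // step 3: Lelse is reached
        emitPlaceholder(c2Len);            // execute [C2]
        patchToHere(jumpToFi);             // step 3: Lfi is reached
    }

    public static void main(String[] args) {
        BackpatchDemo g = new BackpatchDemo();
        g.encodeIf(3, 2, 4);
        System.out.println("JUMPIF at 3 -> " + g.code.get(3).d);
        System.out.println("JUMP   at 6 -> " + g.code.get(6).d);
    }
}
```

With code lengths 3, 2 and 4, the JUMPIF at address 3 is patched to 7 (the start of C2) and the JUMP at address 6 is patched to 11 (the first address after C2).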
Encoder (6)

    execute [while E do C] = Lwhile: evaluate [E]
                                     JUMPIF(0) Lend
                                     execute [C]
                                     JUMP Lwhile
                             Lend:

    WhileCmd
      E
      C

    public Object visitWhileCmd(WhileCmd cmd, Object arg) {
      short lwhile = nextInstrAddr;
      cmd.E.visit(this, arg);
      short jump2end = nextInstrAddr;
      emit(Instruction.JUMPIFop, 0, Instruction.CBr, 0);
      cmd.C.visit(this, arg);
      emit(Instruction.JUMPop, 0, Instruction.CBr, lwhile);
      short lend = nextInstrAddr;
      code[jump2end].d = lend; // backpatching
      return null;
    }

Constants and Variables (1)

•  In Mini-Triangle, the LetCmd is where declarations appear.

    execute [let D in C] = elaborate [D]
                           execute [C]
                           POP(0) s       only if s > 0
    where s = amount of storage allocated by D.

•  How to “elaborate a declaration”?
   •  Variables (and unknown constants) are
      given a memory location relative to SB.
      Remember: Mini-Triangle has a flat memory-model
      (no user-defined procedures).
   •  When a scope is closed, the locations of the
      old scope can then be popped.

    fetch [V]  = LOAD(1) d[SB]
    assign [V] = STORE(1) d[SB]
    where d is the address of V relative to SB

Constants and Variables (2)

    let
      const b ~ 10;
      var i: Integer
    in
      i := i*b

    Program
      LetCmd
        SequentialDecl
          ConstDecl
            Ident. b
            Int.Expr
              Int.Lit. 10
          VarDecl
            Ident. i
            Integer
        AssignCmd
          SimpleVar
            Ident i
          BinaryExpr
            VnameExpr
              SimpleVar
                Ident i
            Op *
            VnameExpr
              SimpleVar
                Ident b

    b  known value:    size = 1, value = 10
    i  known address:  size = 1, address: 4[SB] (for example)

Constants and Variables (3)

•  known value and known address:

    let
      const b ~ 10;
      var i: Integer        PUSH 1           ; make room for i
    in
      i := i*b              LOAD(1) 4[SB]
                            LOADL 10
                            CALL mult
                            STORE(1) 4[SB]
                            POP(0) 1

•  unknown value and known address:

    let
      var x: Integer        PUSH 1           ; room for x
    in let
      const y ~ 365 + x     PUSH 1           ; room for y
                            LOADL 365
                            LOAD(1) 5[SB]    ; load x
                            CALL add         ; 365+x
                            STORE(1) 6[SB]   ; y ~ 365+x
    in putint(y)            LOAD(1) 6[SB]
                            CALL putint
                            POP(0) 1
                            POP(0) 1

    x  known address:  address = 5
    y  unknown value:  address = 6

   The two final POP(0) 1 instructions are not really needed!


Constants and Variables (4)

•  Code generator: declarations and applied occurrences:
   –  When a declaration of identifier id is encountered, the code
      generator binds id to a newly created entity description.
      –  known value: just record the value + its size
      –  known address: record the address + reserve space
   –  When an applied occurrence of identifier id is encountered, the
      code generator consults the entity description bound to id, and
      translates the applied occurrence w.r.t. the entity.

    known value       const declaration using a literal
    unknown value     const declaration using an expression
    known address     variable declaration
    unknown address   argument address bound to a var-parameter

Constants and Variables (5)

    public abstract class RuntimeEntity {
      public short size;
      ...
    }
    public class KnownValue extends RuntimeEntity {
      public short value;
      ...
    }
    public class UnknownValue extends RuntimeEntity {
      public short address;
      ...
    }
    public class KnownAddress extends RuntimeEntity {
      public short address;
      ...
    }

    public abstract class AST {
      public RuntimeEntity entity; // Mostly used within Declaration.
      ...
    }

Static Storage Allocation (1)

•  Example: global variables

    let                        var  size  address
      var a: Integer;          a    1     0[SB]
      var b: Boolean;          b    1     1[SB]
      var c: Integer           c    1     2[SB]
    in begin
      ...
    end

   Note on the sizes: for TAM this is the case,
   but not for “real machines”.

•  Example: nested blocks

    let var a: Integer         var  size  address
    in begin                   a    1     0[SB]
      ...                      b    1     1[SB]
      let var b: Boolean;      c    1     2[SB]
          var c: Integer       d    1     1[SB]
      in ...
      let var d: Integer
      in ...
    end

   Note that variable d “reuses” the
   location of b of the previous scope.

Static Storage Allocation (2)

•  The code generator must keep track of how much
   storage has been allocated at each point in the source
   program.
   –  We use the extra argument Object arg to the visiting
      methods to pass the current amount of storage in use.
   –  Furthermore, we let a visiting method return an Object
      with the extra amount of storage it needed.
   –  We encode both numbers in a Short-object.

    public Object visitXYZ(XYZ xyz, Object arg)

    return value:  if not null, a Short-object with
                   the extra storage needed.
    arg:           Short-object with the current
                   amount of storage so far.

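The nested-blocks table can be reproduced by threading the current storage level gs through the declarations, as the visiting methods do with their arg parameter. The following is a small sketch under the slide's assumption that every variable occupies one word; `BlockAllocator` and `declare` are hypothetical names, and plain `int` values stand in for the Short-objects.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class BlockAllocator {
    /** Address (relative to SB) assigned to each declared variable. */
    final Map<String, Integer> address = new LinkedHashMap<>();

    /**
     * Elaborates declarations of one-word variables starting at address gs
     * and returns the extra storage they need.
     */
    int declare(int gs, String... names) {
        for (int i = 0; i < names.length; i++) address.put(names[i], gs + i);
        return names.length;
    }

    public static void main(String[] args) {
        BlockAllocator alloc = new BlockAllocator();
        int gs = 0;
        int outer = alloc.declare(gs, "a");     // let var a
        alloc.declare(gs + outer, "b", "c");    // first nested let
        // The first nested block is closed here and its storage is popped,
        // so the second nested block starts at the same address again.
        alloc.declare(gs + outer, "d");         // second nested let
        System.out.println(alloc.address);      // {a=0, b=1, c=2, d=1}
    }
}
```

Because the storage level passed to the second nested block is again gs + 1, d receives address 1, the location previously used by b.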
Static Storage Allocation (3)

    elaborate [var I : T] = PUSH s      where s = size of T

    VarDecl
      I
      T

    public Object visitVarDecl(VarDecl decl, Object arg) {
      short gs = shortValueOf(arg);           // next free address
      short s = shortValueOf(decl.T.visit(this, null));
      decl.entity = new KnownAddress(s, gs);  // remember the size and
                                              // address of the variable
      emit(Instruction.PUSHop, 0, 0, s);
      return new Short(s);
    }

    elaborate [D1 ; D2] = elaborate [D1]
                          elaborate [D2]

    SeqDecl
      D1
      D2

    public Object visitSeqDecl(SeqDecl decl, Object arg) {
      short gs = shortValueOf(arg);
      short s1 = shortValueOf(decl.D1.visit(this, new Short(gs)));
      short s2 = shortValueOf(decl.D2.visit(this,
                                  new Short((short) (gs + s1))));
      // gs + s1 and s1 + s2 have type int, so they are cast back to short
      return new Short((short) (s1 + s2));
    }

Static Storage Allocation (4)

    execute [let D in C] = elaborate [D]
                           execute [C]
                           POP(0) s       only if s > 0
    where s = amount of storage allocated by D.

    LetCmd
      D
      C

    public Object visitLetCmd(LetCmd cmd, Object arg) {
      short gs = shortValueOf(arg);
      short s = shortValueOf(cmd.D.visit(this, new Short(gs)));
      cmd.C.visit(this, new Short((short) (gs + s)));
      if (s > 0)
        emit(Instruction.POPop, 0, 0, s);
      return null;
    }

    private static short shortValueOf(Object obj) {
      return ((Short) obj).shortValue();
    }

Static Storage Allocation (5)

    let
      var x: Integer;
      var y: Integer
    in
      x := y

    Program
      LetCmd
        SequentialDecl        receives gs, returns s1+s2
          VarDecl             receives gs, returns s1
            Ident. x
            Integer
          VarDecl             receives gs+s1, returns s2
            Ident. y
            Integer
        AssignCmd
          ....

    public Object visitSeqDecl(SeqDecl decl,
                               Object arg) {
      short gs, s1, s2;
      gs = sVal(arg);
      s1 = sVal(decl.D1.visit(this, new Short(gs)));
      s2 = sVal(decl.D2.visit(this,
                    new Short((short) (gs + s1))));
      return new Short((short) (s1 + s2));
    }

    sVal = shortValueOf

Static Storage Allocation (6)

    fetch [I] = LOADL v           where v = value bound to I
    fetch [I] = LOAD(s) d[SB]     where d is address bound to I
                                  and s = size(type of I)

    public void encodeFetch(Vname name, short s) {
      // Note that the visit of a Vname does not return a Short!
      RuntimeEntity entity =
          (RuntimeEntity) name.visit(this, null);
      if (entity instanceof KnownValue) {
        short v = ((KnownValue) entity).value;
        emit(Instruction.LOADLop, 0, 0, v);
      } else {
        short d = (entity instanceof UnknownValue) ?
            ((UnknownValue) entity).address :
            ((KnownAddress) entity).address;
        emit(Instruction.LOADop, s, Instruction.SBr, d);
      }
    }

A bad style of Object-Oriented design. One would expect
that a RuntimeEntity-object knows how to fetch itself.
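One way to let an entity "know how to fetch itself" is to move the case distinction of encodeFetch into an abstract method that each entity class overrides. This is only a sketch of that design idea, not the Triangle implementation: the classes `Entity`, `LiteralEntity` and `StoredEntity` are hypothetical stand-ins for RuntimeEntity, KnownValue and UnknownValue/KnownAddress, and instructions are returned as strings instead of being emitted.

```java
/** Hypothetical stand-in for RuntimeEntity with a polymorphic fetch. */
abstract class Entity {
    final int size;
    Entity(int size) { this.size = size; }

    /** Returns the instruction that pushes this entity's value on the stack. */
    abstract String fetch();
}

/** Constant whose value is known at compile time: fetch [I] = LOADL v. */
final class LiteralEntity extends Entity {
    final int value;
    LiteralEntity(int size, int value) { super(size); this.value = value; }
    @Override String fetch() { return "LOADL " + value; }
}

/** Variable or computed constant stored at d[SB]: fetch [I] = LOAD(s) d[SB]. */
final class StoredEntity extends Entity {
    final int address;
    StoredEntity(int size, int address) { super(size); this.address = address; }
    @Override String fetch() { return "LOAD(" + size + ") " + address + "[SB]"; }
}

final class FetchDemo {
    public static void main(String[] args) {
        Entity b = new LiteralEntity(1, 10); // const b ~ 10
        Entity i = new StoredEntity(1, 4);   // var i at 4[SB]
        System.out.println(b.fetch());
        System.out.println(i.fetch());
    }
}
```

With this arrangement the encoder no longer needs instanceof tests; adding a new kind of entity means adding a class rather than extending a conditional.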