
SEMANTIC PROCESSING

Chuen-Liang Chen

Department of Computer Science and Information Engineering
National Taiwan University
Taipei, TAIWAN

Chuen-Liang Chen, NTUCS&IE / 1


Action symbols
• determine when semantic routines are called
• grammar with action symbols:
 1. <program> → #start begin <statement list> end
 2. <statement list> → <statement> { <statement> }
 3. <statement> → <ident> := <expression> #assign ;
 4. <statement> → read ( <id list> ) ;
 5. <statement> → write ( <expr list> ) ;
 6. <id list> → <ident> #read_id { , <ident> #read_id }
 7. <expr list> → <expression> #write_expr { , <expression> #write_expr }
 8. <expression> → <primary> { <add op> <primary> #gen_infix }
 9. <primary> → ( <expression> )
10. <primary> → <ident>
11. <primary> → INTLITERAL #process_literal
12. <add op> → + #process_op
13. <add op> → - #process_op
14. <ident> → ID #process_id
15. <system goal> → <program> SCANEOF #finish
• possibly with some modifications to the grammar



Semantic record
• keeps the semantic information associated with a grammar symbol

#define MAXIDLEN 33
typedef char string[MAXIDLEN];

/* for operators */
typedef struct operator {
    enum op { PLUS, MINUS } operator;
} op_rec;

/* for <primary> and <expression> */
enum expr { IDEXPR, LITERALEXPR, TEMPEXPR };
typedef struct expression {
    enum expr kind;
    union {
        string name;  /* for IDEXPR, TEMPEXPR */
        int val;      /* for LITERALEXPR */
    };
} expr_rec;



Parser + semantic routines
• parsing procedure without semantic routines:

void expression(void)
{
    token t;
    /* <expression> ::= <primary> { <add op> <primary> } */
    primary();
    for (t = next_token(); t == PLUSOP || t == MINUSOP; t = next_token()) {
        add_op();
        primary();
    }
}

• the same procedure with semantic routines:

void expression(expr_rec *result)
{
    expr_rec left_operand, right_operand;
    op_rec op;
    /* <expression> ::= <primary> { <add op> <primary> #gen_infix } */
    primary(&left_operand);
    while (next_token() == PLUSOP || next_token() == MINUSOP) {
        add_op(&op);
        primary(&right_operand);
        left_operand = gen_infix(left_operand, op, right_operand);
    }
    *result = left_operand;
}

• QUIZ: where is the syntactic structure?



Semantics - meaning
• syntax : semantics = structure : meaning
• implementation of "meaning" -- attributes attached to each node of the (abstract) syntax tree
• operations on "meaning" --
  ▪ understand
    – associate semantic information (attributes) with each node
    – initially only some nodes (usually the leaves) carry attributes; they are propagated until the tree is "decorated"
  ▪ check that the program is "meaningful"
    – checking static semantics
    – a check may depend only on attributes, or also on structure
  ▪ interpret
    – generating code (intermediate representation or final output of the compiler)



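Derivation tree vs. abstract syntax tree
[Figure: two pairs of trees, derivation tree on the left and abstract syntax tree on the right.
 Assignment statement: the derivation tree rooted at <assign stmt> (id := <exp>, expanded through <prim> and <term> nodes down to const and id leaves) corresponds to the abstract syntax tree := ( id, + ( * ( const, id ), id ) ).
 If statement: the derivation tree <if stmt> with children if <cond> then <stmts> endif corresponds to the abstract syntax tree if-then-endif ( <cond>, <stmts> ).]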



Brief example of semantic processing
• example -- Y := 3 * X + I
• abstract syntax tree (the numbers are processing steps):
  :=                        check: step 13
  ├─ id (Y, real)           step 1
  └─ + (T2, real)           check: step 9, result: step 12
     ├─ * (T1, real)        check: step 4, result: step 7
     │  ├─ const (3, int)   step 2
     │  └─ id (X, real)     step 3
     └─ id (I, int)         step 8
• output:
  step 5    ( 3, int ) ⇒ ( 3.0, real )
  step 6    3.0 * X → T1
  step 10   ( I, int ) ⇒ ( II, real )
  step 11   T1 + II → T2
  step 14   T2 → Y
• post-order traversal
• after step 7, the lowest-level nodes are no longer needed
• usually, the tree encountered at any moment is not the whole tree
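
The post-order decoration in this example can be sketched in C as follows. This is a minimal illustration, not the course's implementation: the `node` structure, `decorate`, `coerce_to_real`, `emit`, and `demo_assignment` are hypothetical names, and a converted identifier is written as `float(I)` instead of the slide's `II`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum type { T_INT, T_REAL };
enum kind { K_CONST, K_ID, K_ADD, K_MUL, K_ASSIGN };

struct node {
    enum kind kind;
    enum type type;            /* attribute: preset on leaves, computed on inner nodes */
    char place[32];            /* where the value lives: name, literal or temporary */
    struct node *left, *right; /* NULL for leaves */
};

static char output[256];       /* generated operations, one per line */
static int next_temp = 1;

/* Append "lhs op rhs -> dst", or "lhs -> dst" when op is NULL. */
static void emit(const char *lhs, const char *op, const char *rhs, const char *dst)
{
    size_t used = strlen(output);
    if (op)
        snprintf(output + used, sizeof output - used, "%s %s %s -> %s\n", lhs, op, rhs, dst);
    else
        snprintf(output + used, sizeof output - used, "%s -> %s\n", lhs, dst);
}

/* Convert an int operand to real (assumed notation: "3.0", "float(I)"). */
static void coerce_to_real(struct node *n)
{
    char tmp[32];
    if (n->type == T_REAL)
        return;
    if (n->kind == K_CONST)
        snprintf(tmp, sizeof tmp, "%s.0", n->place);
    else
        snprintf(tmp, sizeof tmp, "float(%s)", n->place);
    strcpy(n->place, tmp);
    n->type = T_REAL;
}

/* Post-order traversal: children are decorated before their parent. */
static void decorate(struct node *n)
{
    if (n->left == NULL)               /* leaf: attributes already present */
        return;
    decorate(n->left);
    decorate(n->right);
    if (n->kind == K_ASSIGN) {         /* check: source must match the target type */
        if (n->left->type == T_REAL)
            coerce_to_real(n->right);
        n->type = n->left->type;
        emit(n->right->place, NULL, NULL, n->left->place);
        return;
    }
    if (n->left->type != n->right->type) {  /* check: mixed operands become real */
        coerce_to_real(n->left);
        coerce_to_real(n->right);
    }
    n->type = n->left->type;
    snprintf(n->place, sizeof n->place, "T%d", next_temp++);
    emit(n->left->place, n->kind == K_ADD ? "+" : "*", n->right->place, n->place);
}

/* Build the tree for Y := 3 * X + I and return the generated text. */
const char *demo_assignment(void)
{
    struct node y   = { K_ID,     T_REAL, "Y", NULL, NULL };
    struct node c3  = { K_CONST,  T_INT,  "3", NULL, NULL };
    struct node x   = { K_ID,     T_REAL, "X", NULL, NULL };
    struct node i   = { K_ID,     T_INT,  "I", NULL, NULL };
    struct node mul = { K_MUL,    T_INT,  "",  &c3,  &x };
    struct node add = { K_ADD,    T_INT,  "",  &mul, &i };
    struct node asg = { K_ASSIGN, T_INT,  "",  &y,   &add };
    output[0] = '\0';
    next_temp = 1;
    decorate(&asg);
    return output;
}
```

Calling `demo_assignment()` yields three lines, `3.0 * X -> T1`, `T1 + float(I) -> T2` and `T2 -> Y`, which mirror output steps 6, 11 and 14 of the example.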



Semantic processing techniques
• semantic record -- representation of meaning
• semantic routine -- executor of semantic processing
  ▪ when to call?
  ▪ do what?
  ▪ semantic stack
• communication among semantic routines
  – local variables and parameters (for a non-table-driven parser)
  – semantic stack (for a table-driven parser)



Semantic record (1/2)
• representation of attributes
• passed as parameters among semantic routines
• a unified declaration is required when records are passed through the semantic stack
• example --

#define MAXIDLEN 33
typedef char string[MAXIDLEN];

typedef struct operator {
    enum op { PLUS, MINUS } operator;
} op_rec;

enum expr { IDEXPR, LITERALEXPR, TEMPEXPR };
typedef struct expression {
    enum expr kind;
    union {
        string name;  /* for IDEXPR and TEMPEXPR */
        int val;      /* for LITERALEXPR */
    };
} expr_rec;

enum semantic_record_kind { OPREC, EXPRREC, ERROR };
typedef struct sem_rec {
    enum semantic_record_kind record_kind;
    union {
        op_rec op_record;      /* OPREC */
        expr_rec expr_record;  /* EXPRREC */
        /* empty variant */    /* ERROR */
    };
} semantic_record;

Semantic record (2/2)
• examples of semantic records

symbol   record_kind   variant       value
3        EXPRREC       LITERALEXPR   00000011
+        OPREC         PLUS
X        EXPRREC       IDEXPR        X
3+X      EXPRREC       TEMPEXPR      T1



Semantic routines (1/7)
• action symbols in the grammar
  ▪ the same for top-down and bottom-up parsers, except for where they are triggered
• for top-down parsing
  ▪ may appear anywhere in a production rule, owing to the predictive nature of the parser
  ▪ pushed onto the parse stack when the production rule is predicted
  ▪ executed and popped off the parse stack when on top
• for bottom-up parsing
  ▪ can appear only after a production rule is fully recognized, i.e., at the very end of the right-hand side
    – a state represents all possible partially matched production rules
  ▪ some grammar rules therefore have to be rewritten
    – <stmt> → if <exp> #start_if then <stmts> endif #finish_if
      is rewritten as
      <stmt> → <if_head> then <stmts> endif #finish_if
      <if_head> → if <exp> #start_if      /* called a semantic hook */
  ▪ Yacc does this rewriting automatically



Semantic routines (2/7)
• example grammar with parameterized action symbols
<program> → #start begin <statement list> end
<statement list> → <statement> <statement tail>
<statement tail> → <statement> <statement tail> | λ
<statement> → <ident> := <expression> ; #assign($1,$3)
<statement> → read ( <id list> ) ;
<statement> → write ( <expr list> ) ;
<id list> → <ident> #read_id($1) <id tail>
<id tail> → , <ident> #read_id($2) <id tail> | λ
<expr list> → <expression> #write_expr($1) <expr tail>
<expr tail> → , <expression> #write_expr($2) <expr tail> | λ
<expression> → <primary> #copy($1,$2) <primary tail> #copy($2,$$)
<primary tail> → <add op> <primary> #gen_infix($$,$1,$2,$3) <primary tail> #copy($3,$$)
<primary tail> → λ
<primary> → ( <expression> ) #copy($2,$$)
<primary> → <ident> #copy($1,$$)
<primary> → INTLITERAL #process_literal($$)
<add op> → PLUSOP #process_op($$)
<add op> → MINUSOP #process_op($$)
<ident> → ID #process_id($$)
<system goal> → <program> $ #finish



Semantic routines (3/7)
• example semantic routines

#include <assert.h>
#include <stdio.h>   /* sscanf */
#include <string.h>  /* strcpy */

/* <primary> → ( <expression> ) #copy($2,$$) */
void copy(semantic_record *source, semantic_record *dest)
{
    /* Copy information from one part of the semantic stack to another. */
    *dest = *source;
}

/* <ident> → ID #process_id($$) */
void process_id(semantic_record *id_record)
{
    /* Declare ID and build the corresponding semantic record. */
    check_id(token_buffer);
    id_record->record_kind = EXPRREC;
    id_record->expr_record.kind = IDEXPR;
    strcpy(id_record->expr_record.name, token_buffer);
}

Semantic routines (4/7)

/* <primary> → INTLITERAL #process_literal($$) */
void process_literal(semantic_record *id_record)
{
    /* Convert literal to a numeric representation and build semantic record. */
    id_record->record_kind = EXPRREC;
    id_record->expr_record.kind = LITERALEXPR;
    sscanf(token_buffer, "%d", &id_record->expr_record.val);
}

/* <add op> → PLUSOP #process_op($$) */
void process_op(semantic_record *op)
{
    /* Produce operator descriptor. */
    op->record_kind = OPREC;
    if (current_token == PLUSOP)
        op->op_record.operator = PLUS;
    else
        op->op_record.operator = MINUS;
}



Semantic routines (5/7)

/* <primary tail> → <add op> <primary> #gen_infix($$,$1,$2,$3) <primary tail> #copy($3,$$) */
void gen_infix(const semantic_record e1,
               const semantic_record op,
               const semantic_record e2,
               semantic_record *result)
{
    assert(e1.record_kind == EXPRREC);
    assert(op.record_kind == OPREC);
    assert(e2.record_kind == EXPRREC);
    /* Result is an expr_rec with temp variant set. */
    result->record_kind = EXPRREC;
    result->expr_record.kind = TEMPEXPR;
    /* Generate code for infix operation.
     * Get result temp and semantic record for result. */
    strcpy(result->expr_record.name, get_temp());
    generate(extract(op), extract(e1), extract(e2), result->expr_record.name);
}
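
The routines above call `get_temp()`, `extract()` and `generate()` without showing them. The sketch below is one plausible shape for these helpers, written only to make the calling convention concrete: the bodies, the opcode spellings `Add`/`Sub`, the `T1, T2, ...` temporary names, the `code_buffer` output area, and the `demo_infix` driver are all assumptions, not taken from the slides.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAXIDLEN 33
typedef char string[MAXIDLEN];

typedef struct operator { enum op { PLUS, MINUS } operator; } op_rec;
enum expr { IDEXPR, LITERALEXPR, TEMPEXPR };
typedef struct expression {
    enum expr kind;
    union { string name; int val; };
} expr_rec;
enum semantic_record_kind { OPREC, EXPRREC, ERROR };
typedef struct sem_rec {
    enum semantic_record_kind record_kind;
    union { op_rec op_record; expr_rec expr_record; };
} semantic_record;

static char code_buffer[256];   /* hypothetical output area */
static int temp_count;

/* Return a fresh temporary name (assumed naming scheme: T1, T2, ...). */
const char *get_temp(void)
{
    static string name;
    snprintf(name, sizeof name, "T%d", ++temp_count);
    return name;
}

/* Text form of a record: opcode, identifier/temporary name, or literal.
 * The record arrives by value, so its text is copied into rotating static
 * buffers; up to four results can be alive in one generate() call. */
const char *extract(semantic_record r)
{
    static char text[4][MAXIDLEN];
    static int slot;
    char *s;
    if (r.record_kind == OPREC)
        return r.op_record.operator == PLUS ? "Add" : "Sub";
    s = text[slot = (slot + 1) % 4];
    if (r.expr_record.kind == LITERALEXPR)
        snprintf(s, MAXIDLEN, "%d", r.expr_record.val);
    else
        snprintf(s, MAXIDLEN, "%s", r.expr_record.name);
    return s;
}

/* Append one instruction; empty trailing operands are omitted. */
void generate(const char *op, const char *a1, const char *a2, const char *a3)
{
    size_t used = strlen(code_buffer);
    snprintf(code_buffer + used, sizeof code_buffer - used, "%s %s%s%s%s%s\n",
             op, a1, a2[0] ? "," : "", a2, a3[0] ? "," : "", a3);
}

/* Hypothetical driver: performs the same calls as gen_infix for 3 + X. */
const char *demo_infix(void)
{
    semantic_record e1 = { EXPRREC }, op = { OPREC }, e2 = { EXPRREC };
    semantic_record result = { EXPRREC };
    e1.expr_record.kind = LITERALEXPR;
    e1.expr_record.val = 3;
    op.op_record.operator = PLUS;
    e2.expr_record.kind = IDEXPR;
    strcpy(e2.expr_record.name, "X");
    result.expr_record.kind = TEMPEXPR;
    code_buffer[0] = '\0';
    temp_count = 0;
    strcpy(result.expr_record.name, get_temp());
    generate(extract(op), extract(e1), extract(e2), result.expr_record.name);
    return code_buffer;
}
```

With these assumed helpers, `demo_infix()` produces the single line `Add 3,X,T1`. Copying into static buffers keeps the by-value `extract(e1)` call style of the slides valid, since returning a pointer into the parameter itself would dangle.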



Semantic routines (6/7)

/* <statement> → <ident> := <expression> ; #assign($1,$3) */
void assign(const semantic_record target, const semantic_record source)
{
    assert(target.record_kind == EXPRREC);
    assert(target.expr_record.kind == IDEXPR);
    assert(source.record_kind == EXPRREC);
    /* Generate code for assignment. */
    generate("Store", extract(source), target.expr_record.name, "");
}

/* <id list> → <ident> #read_id($1) <id tail> */
void read_id(const semantic_record in_var)
{
    assert(in_var.record_kind == EXPRREC);
    assert(in_var.expr_record.kind == IDEXPR);
    /* Generate code for read. */
    generate("Read", in_var.expr_record.name, "Integer", " ");
}

/* <expr tail> → , <expression> #write_expr($2) <expr tail> | λ */
void write_expr(const semantic_record out_expr)
{
    assert(out_expr.record_kind == EXPRREC);
    generate("Write", extract(out_expr), "Integer", " ");
}
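
Semantic routines (7/7)
• tracing example (the tokens ID, INT, PLUS, ID carry Y, 3, +, X)
  abbreviations: #a = #assign, #i = #process_id, #l = #process_literal, #o = #process_op, #c = #copy, #g = #gen_infix; the numbers after the nonterminals refer to the semantic records listed below

<stmt> → <id>1 := <exp>9 ; #a($1,$3)
  <id>1 → ID #i($$)
  <exp>9 → <p>2 #c($1,$2) <pt>3,8 #c($2,$$)
    <p>2 → INT #l($$)
    <pt>3,8 → <ao>4 <p>6 #g($$,$1,$2,$3) <pt>7 #c($3,$$)
      <ao>4 → PLUS #o($$)
      <p>6 → <id>5 #c($1,$$)
        <id>5 → ID #i($$)
      <pt>7 → λ

record     record_kind   variant       value
1          EXPRREC       IDEXPR        Y
2, 3       EXPRREC       LITERALEXPR   00000011
4          OPREC         PLUS
5, 6       EXPRREC       IDEXPR        X
7, 8, 9    EXPRREC       TEMPEXPR      T1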



Semantic stack
• the place where semantic routines exchange information
• need not be treated as an abstract stack
  ▪ action-controlled: controlled by the action routines
  ▪ parser-controlled: controlled by the parser driver
• action-controlled semantic stack
  ▪ the stack interface is open to all semantic action routines
  ▪ disadvantages:
    1. difficult to change
    2. action routines have to manipulate the stack
  ▪ QUIZ: detailed implementation
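
One possible shape of an action-controlled semantic stack is sketched below. It is an illustration under simplifying assumptions, not the intended quiz answer: the record is reduced to a short text field, and `push`, `pop`, `process_symbol`, `stack_depth` and `generated_code` are hypothetical names. Note how every action routine pushes and pops explicitly, which is the second disadvantage listed above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define STACK_MAX 64

typedef struct { char text[16]; } sem_rec;  /* simplified semantic record */

static sem_rec stack[STACK_MAX];
static int sp;                 /* number of records currently on the stack */
static int temp_count;
static char code[256];         /* generated code, one instruction per line */

static void push(sem_rec r) { assert(sp < STACK_MAX); stack[sp++] = r; }
static sem_rec pop(void)    { assert(sp > 0); return stack[--sp]; }

static void emit(const char *op, const char *a1, const char *a2, const char *a3)
{
    size_t used = strlen(code);
    snprintf(code + used, sizeof code - used, "%s %s%s%s%s%s\n",
             op, a1, a2[0] ? "," : "", a2, a3[0] ? "," : "", a3);
}

/* Stands in for #process_id, #process_literal and #process_op:
 * each of them pushes one record built from the current token. */
void process_symbol(const char *token_text)
{
    sem_rec r;
    snprintf(r.text, sizeof r.text, "%s", token_text);
    push(r);
}

/* #gen_infix: pop right operand, operator and left operand; push the result. */
void gen_infix(void)
{
    sem_rec right = pop();
    sem_rec op = pop();
    sem_rec left = pop();
    sem_rec result;
    snprintf(result.text, sizeof result.text, "T%d", ++temp_count);
    emit(strcmp(op.text, "+") == 0 ? "Add" : "Sub", left.text, right.text, result.text);
    push(result);
}

/* #assign: pop the source expression, then the target identifier. */
void assign(void)
{
    sem_rec source = pop();
    sem_rec target = pop();
    emit("Store", source.text, target.text, "");
}

int stack_depth(void) { return sp; }
const char *generated_code(void) { return code; }
```

For `A := BB - 314`, the call sequence `process_symbol("A")`, `process_symbol("BB")`, `process_symbol("-")`, `process_symbol("314")`, `gen_infix()`, `assign()` leaves the stack empty and generates `Sub BB,314,T1` followed by `Store T1,A`.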



Tracing example (1/2)

Step  Remaining Input            Parse Stack                        Action
(1)   begin A:=BB-314+A; end $   <s.g.>                             Predict 22
(2)   begin A:=BB-314+A; end $   <p> $                              Predict 1
(3)   begin A:=BB-314+A; end $   begin <s.l.> end $                 Match
(4)   A:=BB-314+A; end $         <s.l.> end $                       Predict 2
(5)   A:=BB-314+A; end $         <s> <s.t.> end $                   Predict 5
(6)   A:=BB-314+A; end $         ID := <e> ; <s.t.> end $           Match
(7)   :=BB-314+A; end $          := <e> ; <s.t.> end $              Match
(8)   BB-314+A; end $            <e> ; <s.t.> end $                 Predict 14
(9)   BB-314+A; end $            <p> <p.t.> ; <s.t.> end $          Predict 18
(10)  BB-314+A; end $            ID <p.t.> ; <s.t.> end $           Match
(11)  -314+A; end $              <p.t.> ; <s.t.> end $              Predict 15
(12)  -314+A; end $              <a.o.> <p> <p.t.> ; <s.t.> end $   Predict 21
(13)  -314+A; end $              - <p> <p.t.> ; <s.t.> end $        Match



Example of shift-reduce parsing (3/3)
• grammar G0
  1. <program> → begin <stmts> end $
  2. <stmts> → SimpleStmt ; <stmts>
  3. <stmts> → begin <stmts> end ; <stmts>
  4. <stmts> → λ
• tracing steps
Step  Parse Stack      Remaining Input                         Action
(1)   0                begin SimpleStmt ; SimpleStmt ; end $   Shift 1
(2)   0,1              SimpleStmt ; SimpleStmt ; end $         Shift 5
(3)   0,1,5            ; SimpleStmt ; end $                    Shift 6
(4)   0,1,5,6          SimpleStmt ; end $                      Shift 5
(5)   0,1,5,6,5        ; end $                                 Shift 6
(6)   0,1,5,6,5,6      end $                                   Reduce 4   /* goto(6,<stmts>) = 10 */
(7)   0,1,5,6,5,6,10   end $                                   Reduce 2   /* goto(6,<stmts>) = 10 */
(8)   0,1,5,6,10       end $                                   Reduce 2   /* goto(1,<stmts>) = 2 */
(9)   0,1,2            end $                                   Shift 3
(10)  0,1,2,3          $                                       Accept
• QUIZ: compare the LL and LR parse stacks



Parser-controlled semantic stack
• for LR parser
  ▪ example -- Y := 3 * X + I, with "Y := 3 * X +" already scanned
    – grammar -- <S> → id := <E> #assign
                 <E> → <E> + <T> #add
    – semantic information that is still useful --
      1. the location of Y
      2. the location of the temporary that stores 3*X
    – parse stack -- id := <E> +
  ▪ O.K.; the semantic stack can be combined with the parse stack
  ▪ QUIZ: how to combine?
• for LL parser
  ▪ example (continued)
    – parse stack -- <T> #add #assign
  ▪ a new technique is needed
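
For the LR case, one way to combine the two stacks is to keep a semantic entry next to every parse-stack entry, so that the driver alone pushes on a shift and pops on a reduce. The following sketch shows only that bookkeeping, with no LR tables; `shift`, `reduce`, `action_fn`, `sem_depth`, `generated` and the simplified `sem_rec` are hypothetical, and a reduction without an action is assumed to default to `$$ = $1`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct { char place[16]; } sem_rec;   /* simplified semantic record */

/* An action sees $$ as lhs and $1..$n as rhs[0]..rhs[n-1]. */
typedef void (*action_fn)(sem_rec *lhs, const sem_rec *rhs);

static sem_rec sem_stack[64];
static int sem_top;            /* entries in use; moves in step with the parse stack */
static char code[256];
static int temp_count;

/* Driver: a shift pushes the token's record alongside the new state. */
void shift(const char *token_text)
{
    assert(sem_top < 64);
    snprintf(sem_stack[sem_top].place, sizeof sem_stack[0].place, "%s", token_text);
    sem_top++;
}

/* Driver: a reduce by a rule with rhs_len symbols passes $1..$n to the
 * action, pops them, and pushes $$ in their place. */
void reduce(int rhs_len, action_fn action)
{
    sem_rec lhs = { "" };
    const sem_rec *rhs = &sem_stack[sem_top - rhs_len];
    assert(rhs_len >= 0 && rhs_len <= sem_top);
    if (rhs_len > 0)
        lhs = rhs[0];                  /* assumed default: $$ = $1 */
    if (action)
        action(&lhs, rhs);
    sem_top -= rhs_len;
    sem_stack[sem_top++] = lhs;
}

/* <E> -> <E> + <T> #add */
void add(sem_rec *lhs, const sem_rec *rhs)
{
    size_t used = strlen(code);
    snprintf(lhs->place, sizeof lhs->place, "T%d", ++temp_count);
    snprintf(code + used, sizeof code - used, "Add %s,%s,%s\n",
             rhs[0].place, rhs[2].place, lhs->place);
}

/* <S> -> id := <E> #assign */
void assign(sem_rec *lhs, const sem_rec *rhs)
{
    size_t used = strlen(code);
    (void)lhs;                         /* a statement carries no value */
    snprintf(code + used, sizeof code - used, "Store %s,%s\n",
             rhs[2].place, rhs[0].place);
}

int sem_depth(void) { return sem_top; }
const char *generated(void) { return code; }
```

Parsing `Y := X + I` then amounts to the driver calls `shift("Y")`, `shift(":=")`, `shift("X")`, two unit reductions `reduce(1, NULL)`, `shift("+")`, `shift("I")`, `reduce(1, NULL)`, `reduce(3, add)` and `reduce(3, assign)`. The action routines never touch the stack pointers themselves.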



LL parser-controlled semantic stack (1/11)
• LL driver

void lldriver(void)
{
    int left_index = -1, right_index = -1;
    int current_index, top_index;

    /* Push the Start Symbol onto an empty parse stack. */
    push(s);
    /* Initialize the semantic stack. */
    current_index = 0; top_index = 1;
    while (! stack_empty()) {
        /* Let a be the current input token. */
        X = pop();
        if (is_nonterminal(X) && T[X][a] == X → Y1 . . . Ym) {
            /* Expand nonterminal */
            Push EOP(left, right, current, top_index) on the parse stack;
            Push Ym . . . Y1 on the parse stack;
            left_index = current_index;
            right_index = top_index;
            top_index += m;   /* m is # of non-action symbols */
            current_index = right_index;
        } else if (is_terminal(X) && X == a) {
            Place token information from scanner in sem_stack[current_index];
            current_index++;
            scanner(&a);      /* Get next token */
        } else if (X == EOP) {
            Restore left, right, current, top_index from the EOP symbol;
            /* Move to next symbol in RHS of previous production */
            current_index++;
        } else if (is_action_symbol(X))
            Call Semantic Routine corresponding to X;
        else
            /* Process syntax error */
    }
}
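
The index bookkeeping of the driver can be isolated into three small operations, which makes the EOP values in the following trace easy to check. This is a sketch: `sem_indices`, `initial_indices`, `expand`, `match_terminal`, `restore` and `rhs_index` are hypothetical names, and the mapping of `$k` to `right_index + k - 1` (with `$$` at `left_index`) is inferred from the trace slides, not stated by them.

```c
#include <assert.h>

/* The four indices kept by the driver; an EOP record stores a copy of them. */
typedef struct { int left, right, current, top; } sem_indices;

/* State after "current_index = 0; top_index = 1" in the driver. */
sem_indices initial_indices(void)
{
    sem_indices s = { -1, -1, 0, 1 };
    return s;
}

/* Predict a production whose RHS has m non-action symbols.
 * Returns the EOP record that the driver pushes on the parse stack. */
sem_indices expand(sem_indices *s, int m)
{
    sem_indices saved = *s;
    s->left = s->current;      /* the LHS record is the symbol being expanded */
    s->right = s->top;         /* RHS records start at the first free slot */
    s->top += m;
    s->current = s->right;
    return saved;
}

/* Match a terminal: its record is stored at current, then current advances. */
void match_terminal(sem_indices *s)
{
    s->current++;
}

/* Pop an EOP record: restore the enclosing production's indices and
 * move past the nonterminal that has just been completed. */
void restore(sem_indices *s, sem_indices saved)
{
    *s = saved;
    s->current++;
}

/* Assumed mapping: semantic stack position of $k in the active production. */
int rhs_index(const sem_indices *s, int k)
{
    return s->right + k - 1;
}
```

Replaying the predictions of the trace with `m` = 2, 3, 2, 4, 1 (and one `match_terminal` for `begin`) returns the saved records (-1,-1,0,1), (0,1,1,3), (1,3,4,6), (4,6,6,8) and (6,8,8,12), the same values that appear as EOP entries in the parse stack.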



LL parser-controlled semantic stack (2/11)
• action: predict <system goal> → <program> $ #finish
parse stack       semantic stack    indices
<program>         (free)            top_index
$                 $
#finish           <program>         right_index, current_index
EOP(-1,-1,0,1)    <system goal>     left_index

LL parser-controlled semantic stack (3/11)
• action: predict <program> → #start begin <statement list> end
parse stack       semantic stack    indices
#start
"begin"           (free)            top_index
<stmt list>       "end"
"end"             <stmt list>
EOP(0,1,1,3)      "begin"           right_index, current_index
$                 $
#finish           <program>         left_index
EOP(-1,-1,0,1)    <system goal>

LL parser-controlled semantic stack (4/11)
• action: do #start; match begin
parse stack       semantic stack    indices
-                 (free)            top_index
<stmt list>       "end"
"end"             <stmt list>       current_index
EOP(0,1,1,3)      "begin"           right_index
$                 $
#finish           <program>         left_index
EOP(-1,-1,0,1)    <system goal>

LL parser-controlled semantic stack (5/11)
• action: predict <statement list> → <statement> <statement tail>
parse stack       semantic stack    indices
-                 (free)            top_index
<statement>       <stmt tail>
<stmt tail>       <statement>       right_index, current_index
EOP(1,3,4,6)      "end"
"end"             <stmt list>       left_index
EOP(0,1,1,3)      "begin"
$                 $
#finish           <program>
EOP(-1,-1,0,1)    <system goal>

LL parser-controlled semantic stack (6/11)
• action: predict <statement> → <ident> := <expression> ; #assign($1,$3)
parse stack       semantic stack    indices
<ident>           (free)            top_index
":="              ";"
<expression>      <expression>
";"               ":="
#assign($1,$3)    <ident>           right_index, current_index
EOP(4,6,6,8)      <stmt tail>
<stmt tail>       <statement>       left_index
EOP(1,3,4,6)      "end"
"end"             <stmt list>
EOP(0,1,1,3)      "begin"
$                 $
#finish           <program>
EOP(-1,-1,0,1)    <system goal>

LL parser-controlled semantic stack (7/11)
• action: predict <ident> → ID #process_id($$)
parse stack       semantic stack    indices
ID
#proc_id($$)      (free)            top_index
EOP(6,8,8,12)     ID                right_index, current_index
":="              ";"
<expression>      <expression>
";"               ":="
#assign($1,$3)    <ident>           left_index
EOP(4,6,6,8)      <stmt tail>
<stmt tail>       <statement>
EOP(1,3,4,6)      "end"
"end"             <stmt list>
EOP(0,1,1,3)      "begin"
$                 $
#finish           <program>
EOP(-1,-1,0,1)    <system goal>

LL parser-controlled semantic stack (8/11)
• action: match ID
parse stack       semantic stack    indices
#proc_id($$)      (free)            top_index, current_index
EOP(6,8,8,12)     ID                right_index
":="              ";"
<expression>      <expression>
";"               ":="
#assign($1,$3)    <ident>           left_index
EOP(4,6,6,8)      <stmt tail>
<stmt tail>       <statement>
EOP(1,3,4,6)      "end"
"end"             <stmt list>
EOP(0,1,1,3)      "begin"
$                 $
#finish           <program>
EOP(-1,-1,0,1)    <system goal>

LL parser-controlled semantic stack (9/11)
• action: do #proc_id; restore EOP(6,8,8,12)
parse stack       semantic stack    indices
-                 (free)            top_index
":="              ";"
<expression>      <expression>
";"               ":="              current_index
#assign($1,$3)    <ident>           right_index
EOP(4,6,6,8)      <stmt tail>
<stmt tail>       <statement>       left_index
EOP(1,3,4,6)      "end"
"end"             <stmt list>
EOP(0,1,1,3)      "begin"
$                 $
#finish           <program>
EOP(-1,-1,0,1)    <system goal>

LL parser-controlled semantic stack (10/11)
• action: match :=
parse stack       semantic stack    indices
-                 (free)            top_index
-                 ";"
<expression>      <expression>      current_index
";"               ":="
#assign($1,$3)    <ident>           right_index
EOP(4,6,6,8)      <stmt tail>
<stmt tail>       <statement>       left_index
EOP(1,3,4,6)      "end"
"end"             <stmt list>
EOP(0,1,1,3)      "begin"
$                 $
#finish           <program>
EOP(-1,-1,0,1)    <system goal>

LL parser-controlled semantic stack (11/11)
• action: predict <expression> → <primary> #copy($1,$2) <primary tail> #copy($2,$$)
parse stack       semantic stack    indices
<primary>         (free)            top_index
#copy($1,$2)      <primary tail>
<primary tail>    <primary>         right_index, current_index
#copy($2,$$)      ";"
EOP(6,8,10,12)    <expression>      left_index
";"               ":="
#assign($1,$3)    <ident>
EOP(4,6,6,8)      <stmt tail>
<stmt tail>       <statement>
EOP(1,3,4,6)      "end"
"end"             <stmt list>
EOP(0,1,1,3)      "begin"
$                 $
#finish           <program>
EOP(-1,-1,0,1)    <system goal>
• QUIZ: space complexity
• QUIZ: how to improve?
• QUIZ: comparison, LL vs. LR, action-controlled vs. parser-controlled

Intermediate code
• separates high-level, language-dependent realizations from low-level, machine-dependent realizations
• forms of intermediate code
  ▪ postfix notation
  ▪ three-address code
    – 1 opcode + 2 input operands + 1 result operand
  ▪ abstract syntax tree (DAG)
  ▪ QUIZ: comparison
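
To make the comparison concrete, the sketch below derives two of these forms, postfix notation and three-address code, from the same abstract syntax tree. The `ast` structure, the function names, and the use of the operator character as opcode are illustrative assumptions only.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* A tiny abstract syntax tree: op == 0 marks a leaf holding a name. */
struct ast {
    char op;
    const char *name;
    struct ast *left, *right;
};

/* Append a token to out, separated from earlier tokens by one space. */
static void append(char *out, size_t size, const char *token)
{
    if (out[0] != '\0')
        strncat(out, " ", size - strlen(out) - 1);
    strncat(out, token, size - strlen(out) - 1);
}

/* Postfix notation: both operands first, then the operator. */
void to_postfix(const struct ast *n, char *out, size_t size)
{
    char op[2];
    if (n->op == 0) {
        append(out, size, n->name);
        return;
    }
    to_postfix(n->left, out, size);
    to_postfix(n->right, out, size);
    op[0] = n->op;
    op[1] = '\0';
    append(out, size, op);
}

static char temps[16][8];      /* names of the temporaries created so far */
static int temp_count;

/* Three-address code: one (opcode, input, input, result) tuple per inner
 * node. Returns the name that holds the value of the subtree. */
const char *to_three_address(const struct ast *n, char *out, size_t size)
{
    const char *l, *r;
    size_t used;
    if (n->op == 0)
        return n->name;
    l = to_three_address(n->left, out, size);
    r = to_three_address(n->right, out, size);
    assert(temp_count < 16);
    snprintf(temps[temp_count], sizeof temps[0], "t%d", temp_count + 1);
    used = strlen(out);
    snprintf(out + used, size - used, "(%c, %s, %s, %s)\n",
             n->op, l, r, temps[temp_count]);
    return temps[temp_count++];
}
```

For the tree of `a + b * c`, `to_postfix` yields `a b c * +`, while `to_three_address` yields the tuples `(*, b, c, t1)` and `(+, a, t1, t2)`. Postfix keeps intermediate values implicit on an evaluation stack; three-address code names every intermediate result.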



Example tuple language (1/3)
• an alternative representation of three-address code, with a varying number of operands

ADDI, ADDF, SUBI, SUBF, MULTI, MULTF, DIVI, DIVF, MOD, REM,
EXPI, EXPF, AND, OR, XOR, EQ, NE, GT, GE, LT, LE
            RESULT := ARG1 OP ARG2
UMINUS      ARG2 := -ARG1
NOT         ARG2 := not ARG1
ASSIGN      ARG3 := ARG1, size is ARG2
FLOAT       ARG2 := FLOAT(ARG1) [ARG1 is an integer]
ADDRESS     ARG2 := the address of ARG1
RANGETEST   abort execution if ARG3 < ARG1 or ARG3 > ARG2
LABEL       ARG1 is used to label the next tuple
JUMP        jump to tuple labeled ARG1
JUMP0       jump to ARG2 if ARG1 = 0
JUMP1       jump to ARG2 if ARG1 = 1
CASEJUMP    ARG1 is case selector expression
CASELABEL   ARG1 is a case statement label
CASERANGE   ARG1 is lower bound of label range,
            ARG2 is upper bound of range



Example tuple language (2/3)

CASEEND     no arguments, end of case statement
PROCENTRY   enter subprogram at nesting level ARG1
PROCEXIT    exit subprogram at nesting level ARG1
STARTCALL   ARG1 is temporary to reference activation record
REFPARAM    ARG1 is actual parameter
            ARG2 is parameter offset
            ARG3 is reference to activation record
COPYIN      ARG1 is actual parameter
            ARG2 is parameter offset
            ARG3 is reference to activation record
COPYOUT     ARG1 is actual parameter
            ARG2 is parameter offset
            ARG3 is reference to activation record
COPYINOUT   ARG1 is actual parameter
            ARG2 is parameter offset
            ARG3 is reference to activation record
PROCJUMP    ARG1 is subprogram start address (a label)
            ARG2 is reference to activation record



Example tuple language (3/3)
• example

source program:
begin
    read(A,B);
    if A > B then
        C := A + 5;
    else
        C := B + 5;
    end if;
    write(2 * (C - 1));
end

tuples:
(READI, A)
(READI, B)
(GT, A, B, t1)
(JUMP0, t1, L1)
(ADDI, A, 5, C)
(JUMP, L2)
(LABEL, L1)
(ADDI, B, 5, C)
(LABEL, L2)
(SUBI, C, 1, t2)
(MULTI, 2, t2, t3)
(WRITEI, t3)
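
To check that the tuple sequence has the same meaning as the source program, it can be run by a very small interpreter. The sketch below is an illustration only: the `tuple` layout, the name-to-value table and the `run` function are assumptions, and only the opcodes used in this example are handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

enum opcode { READI, WRITEI, GT, ADDI, SUBI, MULTI, JUMP, JUMP0, LABEL };

struct tuple { enum opcode op; const char *arg1, *arg2, *arg3; };

/* The tuples of the example, in order. */
static const struct tuple program[] = {
    { READI,  "A",  NULL, NULL },
    { READI,  "B",  NULL, NULL },
    { GT,     "A",  "B",  "t1" },
    { JUMP0,  "t1", "L1", NULL },
    { ADDI,   "A",  "5",  "C"  },
    { JUMP,   "L2", NULL, NULL },
    { LABEL,  "L1", NULL, NULL },
    { ADDI,   "B",  "5",  "C"  },
    { LABEL,  "L2", NULL, NULL },
    { SUBI,   "C",  "1",  "t2" },
    { MULTI,  "2",  "t2", "t3" },
    { WRITEI, "t3", NULL, NULL },
};
enum { PROGRAM_LEN = sizeof program / sizeof program[0] };

struct var { const char *name; int value; };
static struct var vars[16];
static int var_count;

/* Find a variable or temporary by name, creating it (as 0) on first use. */
static int *lookup(const char *name)
{
    for (int i = 0; i < var_count; i++)
        if (strcmp(vars[i].name, name) == 0)
            return &vars[i].value;
    assert(var_count < 16);
    vars[var_count].name = name;
    vars[var_count].value = 0;
    return &vars[var_count++].value;
}

/* An operand is either an integer literal or a name. */
static int value_of(const char *arg)
{
    return isdigit((unsigned char)arg[0]) ? atoi(arg) : *lookup(arg);
}

static int find_label(const char *label)
{
    for (int i = 0; i < PROGRAM_LEN; i++)
        if (program[i].op == LABEL && strcmp(program[i].arg1, label) == 0)
            return i;
    assert(!"undefined label");
    return -1;
}

/* Run the tuples with the two READI inputs; returns the WRITEI value. */
int run(int first_input, int second_input)
{
    int inputs[2] = { first_input, second_input };
    int next_input = 0, written = 0;
    var_count = 0;
    for (int pc = 0; pc < PROGRAM_LEN; pc++) {
        const struct tuple *t = &program[pc];
        switch (t->op) {
        case READI:  assert(next_input < 2);
                     *lookup(t->arg1) = inputs[next_input++]; break;
        case WRITEI: written = value_of(t->arg1); break;
        case GT:     *lookup(t->arg3) = value_of(t->arg1) > value_of(t->arg2); break;
        case ADDI:   *lookup(t->arg3) = value_of(t->arg1) + value_of(t->arg2); break;
        case SUBI:   *lookup(t->arg3) = value_of(t->arg1) - value_of(t->arg2); break;
        case MULTI:  *lookup(t->arg3) = value_of(t->arg1) * value_of(t->arg2); break;
        case JUMP:   pc = find_label(t->arg1); break;
        case JUMP0:  if (value_of(t->arg1) == 0) pc = find_label(t->arg2); break;
        case LABEL:  break;           /* labels only mark a position */
        }
    }
    return written;
}
```

With inputs A = 7 and B = 3 the then-branch runs and `run(7, 3)` returns 22, which is 2 * ((7 + 5) - 1). With A = 3 and B = 10 the JUMP0 tuple transfers control to L1 and `run(3, 10)` returns 28.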

