07 Semant

SEMANTIC PROCESSING
Chuen-Liang Chen
Department of Computer Science and Information Engineering
National Taiwan University
Taipei, TAIWAN
Chuen-Liang Chen, NTUCS&IE / 1

Action symbols
• to determine when to call semantic routines
1. <program> → #start begin <statement list> end
2. <statement list> → <statement> { <statement> }
3. <statement> → <ident> := <expression> #assign ;
4. <statement> → read ( <id list> ) ;
5. <statement> → write ( <expr list> ) ;
6. <id list> → <ident> #read_id { , <ident> #read_id }
7. <expr list> → <expression> #write_id { , <expression> #write_id }
8. <expression> → <primary> { <add op> <primary> #gen_infix }
9. <primary> → ( <expression> )
10. <primary> → <ident>
11. <primary> → INTLITERAL #process_literal
12. <add op> → + #process_op
13. <add op> → - #process_op
14. <ident> → ID #process_id
15. <system goal> → <program> SCANEOF #finish
• possibly, with some modifications

Semantic record
 to keep semantic information associated with grammar symbol
 #define MAXIDLEN 33
typedef char string[MAXIDLEN];
/* for operators */
typedef struct operator {
    enum op { PLUS, MINUS } operator;
} op_rec;
/* for <primary> and <expression> */
enum expr { IDEXPR, LITERALEXPR, TEMPEXPR };
typedef struct expression {
    enum expr kind;
    union {
        string name; /* for IDEXPR, TEMPEXPR */
        int val;     /* for LITERALEXPR */
    };
} expr_rec;

Parser + semantic routines
/* parser only */
void expression(void)
{
    token t;
    /* <expression> ::= <primary> { <add op> <primary> } */
    primary();
    for (t = next_token(); t == PLUSOP || t == MINUSOP; t = next_token()) {
        add_op();
        primary();
    }
}
/* parser with semantic routines */
void expression(expr_rec *result)
{
    expr_rec left_operand, right_operand;
    op_rec op;
    /* <expression> ::= <primary> { <add op> <primary> #gen_infix } */
    primary(&left_operand);
    while (next_token() == PLUSOP || next_token() == MINUSOP) {
        add_op(&op);
        primary(&right_operand);
        left_operand = gen_infix(left_operand, op, right_operand);
    }
    *result = left_operand;
}
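The parser with semantic routines can be tried out with a small stand-in scanner. The following is a minimal sketch, not the course's code: it assumes every token is a single character (a letter or digit for <primary>, '+' or '-' for <add op>), and the names input, code_buf, temp_count, and compile_expression are hypothetical helpers introduced only for this illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAXIDLEN 33

/* Simplified record: every operand is represented by its name. */
typedef struct { char name[MAXIDLEN]; } expr_rec;

const char *input;          /* remaining input, one character per token */
char code_buf[256];         /* generated tuples, one per line */
int temp_count;             /* number of temporaries allocated so far */

/* Hypothetical scanner helper: skip blanks, then peek at the next token. */
char next_token(void)
{
    while (*input == ' ')
        input++;
    return *input;
}

/* <primary> ::= ID | INTLITERAL, reduced here to one letter or digit */
void primary(expr_rec *result)
{
    char t = next_token();
    assert(isalnum((unsigned char)t));
    result->name[0] = t;
    result->name[1] = '\0';
    input++;
}

/* #gen_infix: allocate a temporary and append one tuple to code_buf */
expr_rec gen_infix(expr_rec left, char op, expr_rec right)
{
    expr_rec temp;
    char line[4 * MAXIDLEN];
    snprintf(temp.name, sizeof temp.name, "T%d", ++temp_count);
    snprintf(line, sizeof line, "(%s, %s, %s, %s)\n",
             op == '+' ? "ADD" : "SUB", left.name, right.name, temp.name);
    strncat(code_buf, line, sizeof code_buf - strlen(code_buf) - 1);
    return temp;
}

/* <expression> ::= <primary> { <add op> <primary> #gen_infix } */
void expression(expr_rec *result)
{
    expr_rec left_operand, right_operand;
    primary(&left_operand);
    while (next_token() == '+' || next_token() == '-') {
        char op = *input++;             /* stands in for add_op(&op) */
        primary(&right_operand);
        left_operand = gen_infix(left_operand, op, right_operand);
    }
    *result = left_operand;
}

/* Hypothetical entry point: parse src and return the generated tuples. */
const char *compile_expression(const char *src, expr_rec *result)
{
    input = src;
    code_buf[0] = '\0';
    temp_count = 0;
    expression(result);
    return code_buf;
}
```

Calling compile_expression("A + 3 - B", &r) leaves two tuples in code_buf, one per operator, and r names the temporary that holds the final value. The control flow is unchanged from the parser-only version; only the parameters and the gen_infix call were added.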
 QUIZ: where is syntatic structure?

Semantics - meaning
 syntax : semantics = structure : meaning
 implementation of “meaning” --
attribute attached to each node of (abstract) syntax tree
 operations on “meaning” --
 understand
– associating semantic information (attribute) to each node

– initially, on some nodes (leaves, usually)
propagation until “decorated”
 check “meaningful”
– checking static semantics
– may depend only on attributes, or also on the structure
 interpret
– generating code
(intermediate representation or final output of compiler)
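The "understand" and "check" operations can be sketched as a post-order walk that decorates each node with a type attribute. This is an illustrative sketch only: the node layout, the type_t enumeration, and the rule "mixed int/real operands yield real after one conversion" are assumptions made for the example, not definitions taken from the slides.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { TY_UNKNOWN, TY_INT, TY_REAL } type_t;

typedef struct node {
    char op;                    /* '+', '*', ... or 0 for a leaf */
    type_t type;                /* the attribute attached to this node */
    struct node *left, *right;  /* NULL for leaves */
} node;

/* Decorate the tree rooted at n in post-order.
 * Leaves must already carry their type (the initial attributes).
 * Returns the number of int-to-real conversions that were needed. */
int decorate(node *n)
{
    int conversions;
    assert(n != NULL);
    if (n->op == 0) {                       /* leaf: attribute is given */
        assert(n->type != TY_UNKNOWN);
        return 0;
    }
    conversions = decorate(n->left) + decorate(n->right);
    if (n->left->type == n->right->type) {  /* check: operands agree */
        n->type = n->left->type;
    } else {                                /* mixed: convert the int side */
        n->type = TY_REAL;
        conversions++;
    }
    return conversions;
}
```

For an expression such as 3 * X + I with X real and I integer, the walk first decorates the leaves' parents and then the root, so the multiplication and the addition both end up real, with one conversion each.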

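Derivation tree vs. abstract syntax tree
[Figure: derivation tree (left) and abstract syntax tree (right) for two constructs]
assignment statement -- the derivation tree expands <assign stmt> into id := <exp> and continues through <prim> and <term> down to const and id leaves; the abstract syntax tree keeps only the := , + and * nodes and the id and const leaves
if statement -- the derivation tree is <if stmt> → if <cond> then <stmts> endif; the abstract syntax tree is an if-then-endif node with children <cond> and <stmts>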

Brief example of semantic processing
• example -- Y := 3 * X + I
• abstract syntax tree (each node with its attribute; bracketed numbers give the processing step)
:=  [check: 13]
  id (Y, real)  [1]
  + (T2, real)  [check: 9, decorated: 12]
    * (T1, real)  [check: 4, decorated: 7]
      const (3, int)  [2]
      id (X, real)  [3]
    id (I, int)  [8]
• output
( 3, int ) → ( 3.0, real )  [5]
3.0 * X → T1  [6]
( I, int ) → ( II, real )  [10]
T1 + II → T2  [11]
T2 → Y  [14]

• post-order traversal
• after step 7, the nodes at the lowest level are no longer needed
• usually, the tree encountered is not the whole tree

Semantic processing techniques
• semantic record -- representation of meaning
• semantic routine -- executor for semantic processing
  • when to call?
  • what to do?
• semantic stack
• communication among semantic routines
– local variables, parameters (for non-table-driven parser)
– semantic stack (for table-driven parser)

Semantic record (1/2)
• representation for attribute
• parameters among semantic routines
• a unified declaration is required when records are passed through the semantic stack
• example --
#define MAXIDLEN 33
typedef char string[MAXIDLEN];

typedef struct operator {
    enum op { PLUS, MINUS } operator;
} op_rec;

enum expr { IDEXPR, LITERALEXPR, TEMPEXPR };
typedef struct expression {
    enum expr kind;
    union {
        string name; /* for IDEXPR and TEMPEXPR */
        int val;     /* for LITERALEXPR */
    };
} expr_rec;

enum semantic_record_kind { OPREC, EXPRREC, ERROR };
typedef struct sem_rec {
    enum semantic_record_kind record_kind;
    union {
        op_rec op_record;      /* OPREC */
        expr_rec expr_record;  /* EXPRREC */
        /* empty variant */    /* ERROR */
    };
} semantic_record;
Semantic record (2/2)
symbol   record_kind   variant       contents
3        EXPRREC       LITERALEXPR   00000011
+        OPREC         PLUS
X        EXPRREC       IDEXPR        X
3+X      EXPRREC       TEMPEXPR      T1
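The four records pictured above can be built in code. The sketch below repeats the slide's declarations so that it is self-contained (it relies on C11 anonymous unions, as the slide code does); the constructor functions make_literal, make_name, and make_op are hypothetical conveniences added for illustration.

```c
#include <assert.h>
#include <string.h>

#define MAXIDLEN 33
typedef char string[MAXIDLEN];

typedef struct operator { enum op { PLUS, MINUS } operator; } op_rec;

enum expr { IDEXPR, LITERALEXPR, TEMPEXPR };
typedef struct expression {
    enum expr kind;
    union { string name; int val; };
} expr_rec;

enum semantic_record_kind { OPREC, EXPRREC, ERROR };
typedef struct sem_rec {
    enum semantic_record_kind record_kind;
    union { op_rec op_record; expr_rec expr_record; };
} semantic_record;

/* Hypothetical constructor: record for an integer literal. */
semantic_record make_literal(int val)
{
    semantic_record r;
    memset(&r, 0, sizeof r);
    r.record_kind = EXPRREC;
    r.expr_record.kind = LITERALEXPR;
    r.expr_record.val = val;
    return r;
}

/* Hypothetical constructor: record for an identifier or a temporary. */
semantic_record make_name(enum expr kind, const char *name)
{
    semantic_record r;
    assert(kind == IDEXPR || kind == TEMPEXPR);
    assert(strlen(name) < MAXIDLEN);
    memset(&r, 0, sizeof r);
    r.record_kind = EXPRREC;
    r.expr_record.kind = kind;
    strcpy(r.expr_record.name, name);
    return r;
}

/* Hypothetical constructor: record for an operator. */
semantic_record make_op(enum op which)
{
    semantic_record r;
    memset(&r, 0, sizeof r);
    r.record_kind = OPREC;
    r.op_record.operator = which;
    return r;
}
```

With these, make_literal(3), make_op(PLUS), make_name(IDEXPR, "X"), and make_name(TEMPEXPR, "T1") correspond to the rows for 3, +, X, and 3+X. The record_kind tag is what lets a routine check which variant of the union is valid before reading it.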

Semantic routines (1/7)
• action symbols in grammar
• the same for top-down and bottom-up parsers, except for where they are triggered
• for top-down parsing
  • may appear anywhere in a production rule, due to the predictive nature of the parser
  • pushed onto the parse stack when the production rule is predicted
  • executed and popped off the parse stack when it reaches the top
• for bottom-up parsing
  • can appear only after a production rule is fully recognized, i.e., at the very end of the right-hand side
– state -- all possible partially matched production rules
  • rewriting of some grammar rules is required
– <stmt> → if <exp> #start_if then <stmts> endif #finish_if
  is rewritten as
  <stmt> → <if_head> then <stmts> endif #finish_if
  <if_head> → if <exp> #start_if /* called semantic hook */
  • Yacc does the rewriting automatically

(2/7)
• example grammar with parameterized action symbols
<program> → #start begin <statement list> end
<statement list> → <statement> <statement tail>
<statement tail> → <statement> <statement tail> | λ
<statement> → <ident> := <expression> ; #assign($1,$3)
<statement> → read ( <id list> ) ;
<statement> → write ( <expr list> ) ;
<id list> → <ident> #read_id($1) <id tail>
<id tail> → , <ident> #read_id($2) <id tail> | λ
<expr list> → <expression> #write_expr($1) <expr tail>
<expr tail> → , <expression> #write_expr($2) <expr tail> | λ
<expression> → <primary> #copy($1,$2) <primary tail> #copy($2,$$)
<primary tail> → <add op> <primary> #gen_infix($$,$1,$2,$3) <primary tail> #copy($3,$$)
<primary tail> → λ
<primary> → ( <expression> ) #copy($2,$$)
<primary> → <ident> #copy($1,$$)
<primary> → INTLITERAL #process_literal($$)
<add op> → PLUSOP #process_op($$)
<add op> → MINUSOP #process_op($$)
<ident> → ID #process_id($$)
<system goal> → <program> $ #finish

(3/7)
 example semantic routines
#include <assert.h>
#include <stdio.h>   /* sscanf */
#include <string.h>  /* strcpy */

/* <primary> → ( <expression> ) #copy($2,$$) */
void copy(semantic_record *source, semantic_record *dest)
{
    /* Copy information from one part of the Semantic Stack to another */
    *dest = *source;
}

/* <ident> → ID #process_id($$) */
void process_id(semantic_record *id_record)
{
    /* Declare ID & build corresponding semantic record */
    check_id(token_buffer);
    id_record->record_kind = EXPRREC;
    id_record->expr_record.kind = IDEXPR;
    strcpy(id_record->expr_record.name, token_buffer);
}
(4/7)
/* <primary> INTLlTERAL #process_literal($$) */

void process_literal(semantic_record *id_record)
{
/* Convert literal to a numeric representation and build semantic record. */
id_record->record_kind = EXPRREC;
id_record->expr_record.kind = LITERALEXPR;
sscanf(token_buffer, "%d", &id_record->expr_record.val);
}
c
/* <add op> PLUSOP #process_op($$) */
void process_op(semantic_record *op)
{
/* Produce operator descriptor. */
op->record_kind = OPREC;
if (current_token == PLUSOP)
op->op_record.operator = PLUS;
else
op->op_record.operator = MINUS;
}

(5/7)
/* <primary tail> → <add op> <primary> #gen_infix($$,$1,$2,$3) <primary tail> #copy($3,$$) */
void gen_infix(const semantic_record e1,
               const semantic_record op,
               const semantic_record e2,
               semantic_record *result)
{
    assert(e1.record_kind == EXPRREC);
    assert(op.record_kind == OPREC);   /* was "=", which assigns instead of comparing */
    assert(e2.record_kind == EXPRREC);
    /* Result is an expr_rec with temp variant set. */
    result->record_kind = EXPRREC;
    result->expr_record.kind = TEMPEXPR;
    /* Generate code for infix operation.
     * Get result temp and semantic record for result. */
    strcpy(result->expr_record.name, get_temp());
    generate(extract(op), extract(e1), extract(e2), result->expr_record.name);
}
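The routine above calls get_temp, extract, and generate, which the slides do not define. The sketch below supplies one possible version of each so that gen_infix can be compiled and run. Everything about these helpers is an assumption made for illustration: the "T1, T2, ..." naming of temporaries, the "Add"/"Sub" opcode strings, the "op a,b,c" output format, and the code_out buffer are not taken from the slides.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAXIDLEN 33
typedef char string[MAXIDLEN];
typedef struct operator { enum op { PLUS, MINUS } operator; } op_rec;
enum expr { IDEXPR, LITERALEXPR, TEMPEXPR };
typedef struct expression {
    enum expr kind;
    union { string name; int val; };
} expr_rec;
enum semantic_record_kind { OPREC, EXPRREC, ERROR };
typedef struct sem_rec {
    enum semantic_record_kind record_kind;
    union { op_rec op_record; expr_rec expr_record; };
} semantic_record;

char code_out[512];   /* hypothetical output buffer standing in for a file */

/* Assumed behavior: append one instruction "op a,b,c" to code_out. */
void generate(const char *op, const char *a, const char *b, const char *c)
{
    char line[4 * MAXIDLEN + 8];
    snprintf(line, sizeof line, "%s %s,%s,%s\n", op, a, b, c);
    strncat(code_out, line, sizeof code_out - strlen(code_out) - 1);
}

/* Assumed behavior: return the next unused temporary name T1, T2, ... */
char *get_temp(void)
{
    static int max_temp;
    static char name[MAXIDLEN];
    snprintf(name, sizeof name, "T%d", ++max_temp);
    return name;
}

/* Assumed behavior: render a record as operand text. A small pool of
 * buffers lets several extract() results coexist in one generate() call. */
char *extract(semantic_record r)
{
    static char pool[4][MAXIDLEN];
    static int next;
    char *buf = pool[next];
    next = (next + 1) % 4;
    if (r.record_kind == OPREC)
        strcpy(buf, r.op_record.operator == PLUS ? "Add" : "Sub");
    else if (r.expr_record.kind == LITERALEXPR)
        snprintf(buf, MAXIDLEN, "%d", r.expr_record.val);
    else
        strcpy(buf, r.expr_record.name);
    return buf;
}

/* gen_infix as on the slide, now linkable against the helpers above. */
void gen_infix(const semantic_record e1, const semantic_record op,
               const semantic_record e2, semantic_record *result)
{
    assert(e1.record_kind == EXPRREC);
    assert(op.record_kind == OPREC);
    assert(e2.record_kind == EXPRREC);
    result->record_kind = EXPRREC;
    result->expr_record.kind = TEMPEXPR;
    strcpy(result->expr_record.name, get_temp());
    generate(extract(op), extract(e1), extract(e2), result->expr_record.name);
}
```

The rotating buffer pool in extract is a design choice forced by the call pattern: gen_infix evaluates three extract() calls as arguments of a single generate() call, so a single static buffer would be overwritten before it is read.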

(6/7)
/* <statement> → <ident> := <expression> ; #assign($1,$3) */
void assign(const semantic_record target, const semantic_record source)
{
    assert(target.record_kind == EXPRREC);
    assert(target.expr_record.kind == IDEXPR);
    assert(source.record_kind == EXPRREC);
    /* Generate code for assignment. */
    generate("Store", extract(source), target.expr_record.name, "");
}

/* <id list> → <ident> #read_id($1) <id tail> */
void read_id(const semantic_record in_var)
{
    assert(in_var.record_kind == EXPRREC);
    assert(in_var.expr_record.kind == IDEXPR);
    /* Generate code for read. */
    generate("Read", in_var.expr_record.name, "Integer", " ");
}

/* <expr tail> → , <expression> #write_expr($2) <expr tail> | λ */
void write_expr(const semantic_record out_expr)
{
    assert(out_expr.record_kind == EXPRREC);
    generate("Write", extract(out_expr), "Integer", " ");
}
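(7/7)
• tracing example (the records below correspond to Y := 3 + X)
(abbreviations: <p> = <primary>, <pt> = <primary tail>, <ao> = <add op>; #a = #assign, #i = #process_id, #c = #copy, #l = #process_literal, #o = #process_op, #g = #gen_infix; the numbers after a symbol are the semantic record numbers it holds)
<stmt> → <id>1 := <exp>9 ; #a($1,$3)
  <id>1 → ID #i($$)
  <exp>9 → <p>2 #c($1,$2) <pt>3,8 #c($2,$$)
    <p>2 → INT #l($$)
    <pt>3,8 → <ao>4 <p>6 #g($$,$1,$2,$3) <pt>7 #c($3,$$)
      <ao>4 → PLUS #o($$)
      <p>6 → <id>5 #c($1,$$)
        <id>5 → ID #i($$)
      <pt>7 → λ
record     record_kind   variant       contents
1          EXPRREC       IDEXPR        Y
2,3        EXPRREC       LITERALEXPR   00000011
4          OPREC         PLUS
5,6        EXPRREC       IDEXPR        X
7,8,9      EXPRREC       TEMPEXPR      T1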

Semantic stack
• the place to exchange information among semantic routines
• not necessarily treated as an abstract stack
• action-controlled : controlled by action routines
• parser-controlled : controlled by the parser driver
• action-controlled semantic stack
• opens the stack interface to all semantic action routines
• disadvantages:
1. difficult to change
2. action routines have to manipulate the stack
 QUIZ: detailed implementation

Tracing example (1/2)
Step   Remaining Input             Parse Stack                           Action
(1)    begin A:=BB-314+A; end $    <s.g.>                                Predict 22
(2)    begin A:=BB-314+A; end $    <program> $                           Predict 1
(3)    begin A:=BB-314+A; end $    begin <s.l.> end $                    Match
(4)    A:=BB-314+A; end $          <s.l.> end $                          Predict 2
(5)    A:=BB-314+A; end $          <s> <s.t.> end $                      Predict 5
(6)    A:=BB-314+A; end $          ID := <e> ; <s.t.> end $              Match
(7)    :=BB-314+A; end $           := <e> ; <s.t.> end $                 Match
(8)    BB-314+A; end $             <e> ; <s.t.> end $                    Predict 14
(9)    BB-314+A; end $             <p> <p.t.> ; <s.t.> end $             Predict 18
(10)   BB-314+A; end $             ID <p.t.> ; <s.t.> end $              Match
(11)   -314+A; end $               <p.t.> ; <s.t.> end $                 Predict 15
(12)   -314+A; end $               <a.o.> <p> <p.t.> ; <s.t.> end $      Predict 21
(13)   -314+A; end $               - <p> <p.t.> ; <s.t.> end $           Match

Example of shift-reduce parsing (3/3)
 grammar G0
1. <program> → begin <stmts> end $
2. <stmts> → SimpleStmt ; <stmts>
3. <stmts> → begin <stmts> end ; <stmts>
4. <stmts> → λ
 tracing steps
Step   Parse Stack        Remaining Input                         Action
(1)    0                  begin SimpleStmt ; SimpleStmt ; end $   Shift 1
(2)    0,1                SimpleStmt ; SimpleStmt ; end $         Shift 5
(3)    0,1,5              ; SimpleStmt ; end $                    Shift 6
(4)    0,1,5,6            SimpleStmt ; end $                      Shift 5
(5)    0,1,5,6,5          ; end $                                 Shift 6
(6)    0,1,5,6,5,6        end $                                   Reduce 4   /* goto(6,<stmts>) = 10 */
(7)    0,1,5,6,5,6,10     end $                                   Reduce 2   /* goto(6,<stmts>) = 10 */
(8)    0,1,5,6,10         end $                                   Reduce 2   /* goto(1,<stmts>) = 2 */
(9)    0,1,2              end $                                   Shift 3
(10)   0,1,2,3            $                                       Accept
 QUIZ: compare LL and LR parse stack

Parser-controlled semantic stack
 for LR parser
 example -- Y := 3 * X + I
(scanned so far: Y := 3 * X +)
– grammar -- <S> → id := <E> #assign
  <E> → <E> + <T> #add
– still useful semantic information --
1. the location of Y
2. the location of the temporary that stores 3*X
– parse stack -- id := <E> +
 O.K. ; can be combined with parse stack
 QUIZ: how to combine?
 for LL parser
 example (continued)
– parse stack -- <T> #add #assign

 need new technique

LL parser-controlled semantic stack (1/11)
• LL driver
void lldriver(void)
{
    int left_index = -1, right_index = -1;
    int current_index, top_index;
    /* Push the Start Symbol onto an empty parse stack. */
    push(s);
    /* Initialize the semantic stack. */
    current_index = 0; top_index = 1;
    while (! stack_empty() ) {
        /* Let a be the current input token. */
        X = pop();
        if (is_nonterminal(X) && T[X][a] == X → Y1 . . . Ym) {
            /* Expand nonterminal */
            Push EOP(left, right, current, top_index) on the parse stack;
            Push Ym . . . Y1 on the parse stack;
            left_index = current_index;
            right_index = top_index;
            top_index += m;  /* m is # of non-action symbols */
            current_index = right_index;
        } else if (is_terminal(X) && X == a) {
            Place token information from scanner in sem_stack[current_index];
            current_index++;
            scanner(&a);  /* Get next token */
        } else if (X == EOP) {
            Restore left, right, current, top_index from the EOP symbol;
            /* Move to next symbol in RHS of previous production */
            current_index++;
        } else if (is_action_symbol(X))
            Call Semantic Routine corresponding to X;
        else
            /* Process syntax error */
    }
}

(2/11)
• action: predict <system goal> → <program> $ #finish
parse stack semantic stack
<program> top_index
$ $
#finish <program> right_index, current_index
EOP(-1,-1,0,1) <system goal> left_index
(3/11)
• action: predict <program> → #start begin <statement list> end
#start
“begin” top_index
<stmt list> "end"
"end" <stmt list>
EOP(0,1,1,3) "begin" right_index, current_index
$ $
#finish <program> left_index
EOP(-1,-1,0,1) <system goal>
(4/11)
 action: do #start; match begin
top_index
<stmt list> "end"
"end" <stmt list> current_index
EOP(0,1,1,3) "begin" right_index
$ $
#finish <program> left_index
(5/11)
• action: predict <statement list> → <statement> <statement tail>
top_index
<statement> <stmt tail>
<stmt tail> <statement> right_index, current_index
EOP(1,3,4,6) "end"
"end" <stmt list> left_index
EOP(0,1,1,3) "begin"
$ $
#finish <program>
(6/11)
• action: predict <statement> → <ident> := <expression> ; #assign($1,$3)
<ident> top_index
":=" ";"
<expression> <expression>
";" ":="
#assign($1,$3) <ident> right_index, current_index
EOP(4,6,6,8) <stmt tail>
<stmt tail> <statement> left_index
EOP(1,3,4,6) "end"
"end" <stmt list>
$ $
#finish <program>
(7/11)
• action: predict <ident> → ID #process_id($$)
ID
#proc_id($$) top_index
EOP(6,8,8,12) ID right_index, current_index
":=" ";"
";" ":="
#assign($1,$3) <ident> left_index
<stmt tail> <statement>
EOP(1,3,4,6) "end"
"end" <stmt list>
$ $
#finish <program>
(8/11)
 action: match ID
#proc_id($$) top_index, current_index

EOP(6,8,8,12) ID right_index
":=" ";"
";" ":="
#assign($1,$3) <ident> left_index
<stmt tail> <statement>
EOP(1,3,4,6) "end"
"end" <stmt list>
$ $
#finish <program>
(9/11)
 action: do #proc_id; restore EOP(6,8,8,12)
top_index
":=" ";"
";" ":=" current_index
#assign($1,$3) <ident> right_index
EOP(1,3,4,6) "end"
"end" <stmt list>
$ $
#finish <program>
(10/11)
• action: match :=
top_index
";"
<expression> <expression> current_index
";" ":="
#assign($1,$3) <ident> right_index
EOP(1,3,4,6) "end"
"end" <stmt list>
$ $
#finish <program>
(11/11)
• action: predict <expression> → <primary> #copy($1,$2) <primary tail> #copy($2,$$)
<primary> top_index
#copy($1,$2) <primary tail>
<primary tail> <primary> right_index, current_index
#copy($2,$$) ";"
EOP(6,8,10,12) <expression> left_index
";" ":="
#assign($1,$3) <ident>
EOP(4,6,6,8) <stmt tail>
<stmt tail> <statement>
EOP(1,3,4,6) "end"
"end" <stmt list>
EOP(0,1,1,3) "begin"
$ $
#finish <program>
• QUIZ: space complexity
• QUIZ: how to improve?
• QUIZ: comparison, LL vs. LR, action-controlled vs. parser-controlled
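The index bookkeeping traced in slides (2/11) to (11/11) can be replayed in isolation. The sketch below models only the four indices and the EOP records; it does not parse anything. The names sem_indices, predict, match_terminal, and end_of_production are hypothetical, and keeping the EOP records on their own array (rather than on the parse stack, as the driver does) is a simplification that works because EOP records are pushed and popped in last-in, first-out order.

```c
#include <assert.h>

typedef struct { int left, right, current, top; } sem_indices;

#define MAX_EOP 64
sem_indices eop_stack[MAX_EOP];   /* saved EOP(left, right, current, top) */
int eop_depth;

/* Index values right after the driver's initialization. */
sem_indices initial_indices(void)
{
    sem_indices s = { -1, -1, 0, 1 };
    return s;
}

/* Predict a production whose right-hand side has m non-action symbols. */
void predict(sem_indices *s, int m)
{
    assert(eop_depth < MAX_EOP);
    eop_stack[eop_depth++] = *s;   /* Push EOP(left, right, current, top) */
    s->left = s->current;
    s->right = s->top;
    s->top += m;
    s->current = s->right;
}

/* Match a terminal: its record is stored, then current moves on. */
void match_terminal(sem_indices *s)
{
    s->current++;
}

/* EOP reached: restore the saved indices and step past the nonterminal. */
void end_of_production(sem_indices *s)
{
    assert(eop_depth > 0);
    *s = eop_stack[--eop_depth];
    s->current++;
}
```

Replaying the slides means calling predict with m = 2, 3, 2, 4, 1 for the first five predictions (with a match_terminal after the second for begin); the saved records then read EOP(-1,-1,0,1), EOP(0,1,1,3), EOP(1,3,4,6), EOP(4,6,6,8), and EOP(6,8,8,12), as in the figures.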
Intermediate code
• to separate high-level language-dependent realizations from low-level machine-dependent realizations
• forms of intermediate codes
  • postfix notation
  • three-address code
– 1 opcode + 2 input operands + 1 result operand
  • abstract syntax tree (DAG)
 QUIZ: comparison

Example tuple language (1/3)
• an alternative representation of three-address code
• varying number of operands
ADDI, ADDF, SUBI, SUBF, MULTI, MULTF, DIVI, DIVF, MOD, REM,
EXPI, EXPF, AND, OR, XOR, EQ, NE, GT, GE, LT, LE
            RESULT := ARG1 OP ARG2
UMINUS      ARG2 := -ARG1
NOT         ARG2 := not ARG1
ASSIGN      ARG3 := ARG1, size is ARG2
FLOAT       ARG2 := FLOAT(ARG1) [ARG1 is an integer]
ADDRESS     ARG2 := the address of ARG1
RANGETEST   abort execution if ARG3 < ARG1 or ARG3 > ARG2
LABEL       ARG1 is used to label the next tuple
JUMP        jump to tuple labeled ARG1
JUMP0       jump to ARG2 if ARG1 = 0
JUMP1       jump to ARG2 if ARG1 = 1
CASEJUMP    ARG1 is case selector expression
CASELABEL   ARG1 is a case statement label
CASERANGE   ARG1 is lower bound of label range,
            ARG2 is upper bound of range

(2/3)
CASEEND     no arguments, end of case statement
PROCENTRY   enter subprogram at nesting level ARG1
PROCEXIT    exit subprogram at nesting level ARG1
STARTCALL   ARG1 is temporary to reference activation record
REFPARAM    ARG1 is actual parameter
            ARG2 is parameter offset
            ARG3 is reference to activation record
COPYIN      ARG1 is actual parameter
            ARG2 is parameter offset
COPYOUT     ARG1 is actual parameter
COPYINOUT   ARG1 is actual parameter
PROCJUMP    ARG1 is subprogram start address (a label)

(3/3)
 example
source program:
begin
    read(A,B);
    if A > B then
        C := A + 5;
    else
        C := B + 5;
    end if;
    write(2 * (C - 1));
end
generated tuples:
(READI, A)
(READI, B)
(GT, A, B, t1)
(JUMP0, t1, L1)
(ADDI, A, 5, C)
(JUMP, L2)
(LABEL, L1)
(ADDI, B, 5, C)
(LABEL, L2)
(SUBI, C, 1, t2)
(MULTI, 2, t2, t3)
(WRITEI, t3)
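To see that the tuple sequence above behaves like the source program, it can be run through a tiny interpreter. The sketch below handles only the opcodes that appear in this example. The tuple struct, the string-based operands, the run function, and the convention that READI takes its values from an array are all assumptions made for illustration, not part of the tuple language as defined on the slides.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

typedef enum { READI, WRITEI, ADDI, SUBI, MULTI, GT, JUMP, JUMP0, LABEL } opcode;

typedef struct {
    opcode op;
    const char *arg1, *arg2, *arg3;   /* names, integer literals, or labels */
} tuple;

#define MAXVARS 16
struct { const char *name; int value; } vars[MAXVARS];
int nvars;

/* Return the storage cell for a variable or temporary, creating it at 0. */
int *lookup(const char *name)
{
    int i;
    for (i = 0; i < nvars; i++)
        if (strcmp(vars[i].name, name) == 0)
            return &vars[i].value;
    assert(nvars < MAXVARS);
    vars[nvars].name = name;
    vars[nvars].value = 0;
    return &vars[nvars++].value;
}

/* An operand is either a decimal literal or a name. */
int value_of(const char *arg)
{
    return isdigit((unsigned char)arg[0]) ? atoi(arg) : *lookup(arg);
}

/* Index of the LABEL tuple carrying the given label. */
int find_label(const tuple *code, int n, const char *label)
{
    int i;
    for (i = 0; i < n; i++)
        if (code[i].op == LABEL && strcmp(code[i].arg1, label) == 0)
            return i;
    assert(!"undefined label");
    return n;
}

/* Execute n tuples. READI consumes input[] in order; the return value is
 * the last value passed to WRITEI. */
int run(const tuple *code, int n, const int *input)
{
    int pc = 0, written = 0;
    nvars = 0;
    while (pc < n) {
        const tuple *t = &code[pc++];
        int a, b;
        switch (t->op) {
        case READI:  *lookup(t->arg1) = *input++; break;
        case WRITEI: written = value_of(t->arg1); break;
        case ADDI: case SUBI: case MULTI: case GT:
            a = value_of(t->arg1);
            b = value_of(t->arg2);
            *lookup(t->arg3) = t->op == ADDI ? a + b
                             : t->op == SUBI ? a - b
                             : t->op == MULTI ? a * b
                             : a > b;
            break;
        case JUMP:   pc = find_label(code, n, t->arg1); break;
        case JUMP0:  if (value_of(t->arg1) == 0)
                         pc = find_label(code, n, t->arg2);
                     break;
        case LABEL:  break;
        }
    }
    return written;
}

/* The tuple sequence from the slide. */
const tuple example_program[] = {
    {READI, "A", NULL, NULL},    {READI, "B", NULL, NULL},
    {GT, "A", "B", "t1"},        {JUMP0, "t1", "L1", NULL},
    {ADDI, "A", "5", "C"},       {JUMP, "L2", NULL, NULL},
    {LABEL, "L1", NULL, NULL},   {ADDI, "B", "5", "C"},
    {LABEL, "L2", NULL, NULL},   {SUBI, "C", "1", "t2"},
    {MULTI, "2", "t2", "t3"},    {WRITEI, "t3", NULL, NULL},
};
```

Running example_program with inputs A = 7, B = 3 takes the then branch (C = 12) and writes 22; with A = 2, B = 3 the JUMP0 is taken, the else branch sets C = 8, and 14 is written.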
