
The Symbol Table

- Used during all phases of compilation.
- Maintains information about many source language constructs.
- Incrementally constructed and expanded during the analysis phases.
- Used directly in the code generation phases.
- Efficient storage and access are important in practice (but we won't worry about efficiency; we'll just use a linked list).
- May or may not be constructed during lexical and syntax analysis, depending on the compiler.

Constructing the Symbol Table

There are three main operations to be carried out on the symbol table:
- determining whether a string has already been stored
- inserting an entry for a string
- deleting a string when it goes out of scope

This requires three functions:
- lookup(s): returns the index of the entry for string s, or -1 if there is no entry
- insert(s,t): adds a new entry for string s (of token t), and returns its index
- delete(s): deletes s from the table (or, typically, hides it)
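
The three functions can be sketched over a linked list, which is the storage these notes say they will use. This is a minimal illustration under stated assumptions, not a definitive implementation: the Entry fields, the ID_T value and the name delete_entry (standing in for delete) are all hypothetical, lookup returns -1 when nothing is found, and "deleting" only hides the newest matching entry.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { ID_T = 256 };            /* hypothetical token code for identifiers */

/* One table entry; the field names are illustrative, not from the slides. */
typedef struct Entry {
    char *name;                 /* the string s */
    int token;                  /* the token t */
    int index;                  /* insertion order, starting at 0 */
    int hidden;                 /* set by delete_entry instead of freeing */
    struct Entry *next;         /* next older entry */
} Entry;

static Entry *head = NULL;      /* newest entry first */
static int next_index = 0;

/* Return the index of the newest visible entry for s, or -1 if none. */
int lookup(const char *s) {
    assert(s != NULL);
    for (Entry *e = head; e != NULL; e = e->next)
        if (!e->hidden && strcmp(e->name, s) == 0)
            return e->index;
    return -1;
}

/* Add a new entry for s (of token t); return its index, or -1 if out of memory. */
int insert(const char *s, int t) {
    assert(s != NULL);
    Entry *e = malloc(sizeof *e);
    if (e == NULL)
        return -1;
    e->name = malloc(strlen(s) + 1);
    if (e->name == NULL) {
        free(e);
        return -1;
    }
    strcpy(e->name, s);
    e->token = t;
    e->index = next_index++;
    e->hidden = 0;
    e->next = head;             /* push on the front of the list */
    head = e;
    return e->index;
}

/* Hide the newest visible entry for s, so an older one becomes visible again. */
void delete_entry(const char *s) {
    assert(s != NULL);
    for (Entry *e = head; e != NULL; e = e->next)
        if (!e->hidden && strcmp(e->name, s) == 0) {
            e->hidden = 1;
            return;
        }
}
```

Because new entries go on the front of the list, lookup always meets the most recent declaration of a name first, and hiding it exposes the previous one. Entries are never freed in this sketch, so their indices stay valid for later phases.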

Scope

- In most high-level languages, variables and functions have restricted scope, i.e. they can only be accessed in specific areas of the source code.
- The scope of any particular variable may be global, or within a specific code file, or in a file after its declaration, or within specific code blocks.
- In C, blocks are files, function declarations and compound statements (between "{" and "}"). Also, structures and unions can be considered to be blocks.
- In languages with restrictive scoping rules, it is possible to construct the symbol table during lexical analysis.

    {L}+    { entry = lookup(yytext);
              if (entry == -1)        /* i.e. new ID_T */
                  insert(yytext, ID_T);
            }

Scoping Rules
- In block-structured languages, the same variable name can be used in different places to refer to different objects.
- We can no longer simply check whether the name has already been entered in the table, as the current use may be a new declaration.

    int i;                /* i is globally accessible */

    int f1(int k) {       /* a new integer k, in f1 only */
        int j;            /* a new integer j, in f1 only */
        ...
        print i;          /* (the global variable) */
    }

    int f2() {
        int j;            /* a different j, in f2 only */
        ...
    }

One-pass symbol table construction


One possible method of constructing the symbol table during the first pass is shown below.

    Prog  -> Dec Prog
    Prog  -> Main
    Dec   -> VDec ;
    Dec   -> FDec
    VDec  -> int id
    FDec  -> SFDec Par ) { CStat }     { decr(stack); }
    SFDec -> int id (                  { incr(stack); }
    Par   ->
    Par   -> VDec
    Par   -> PList , VDec
    PList -> VDec
    PList -> VDec , PList

    {L}+    { entry = lookup(yytext,stack);
              if (entry == -1) insert(yytext,ID_T,stack);
            }

- The stack consists of entries of the form (nesting level, scope value). These two values are also stored as extra fields in each symbol table entry.
- last is the index of the last entry added to the symbol table.
- Initially, the stack is set to < (0,0) > and last to 0.
- insert associates the top of the stack with the new entry.
- lookup searches the table for a matching entry and obtains its nesting level. It then moves down the stack until it finds a stack entry with the same nesting level. If the table index is less than that stack entry's scope value, it ignores the table entry and continues searching the table. If no match is found, it returns -1.
- decr deletes the top element of the stack.
- incr adds a new element to the top of the stack, increments the nesting level, and assigns the last index as the scope value.
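
The stack operations described above can be sketched in C as follows. This is one possible reading of the description, not the notes' own code. It assumes fixed-size arrays, a global stack rather than a stack argument, a separate count of rows, and two hypothetical helpers: see_id (the lexer action) and build_example (which replays the actions for the example program with i, f1 and f2).

```c
#include <assert.h>
#include <string.h>

#define MAX_SYMS  64
#define MAX_DEPTH 16
enum { ID_T = 256 };                          /* hypothetical token code */

typedef struct { int nest, scope; } Scope;                     /* stack entry */
typedef struct { char str[32]; int token, nest, scope; } Sym;  /* table row */

static Sym   table[MAX_SYMS];
static int   count = 0;                       /* rows in use (bookkeeping assumption) */
static int   last  = 0;                       /* index of the last entry added */
static Scope stack[MAX_DEPTH] = { {0, 0} };   /* initially < (0,0) > */
static int   top = 0;                         /* position of the top stack entry */

/* incr: push a scope one level deeper, with last as its scope value. */
void incr(void) {
    assert(top + 1 < MAX_DEPTH);
    stack[top + 1].nest  = stack[top].nest + 1;
    stack[top + 1].scope = last;
    top++;
}

/* decr: delete the top element of the stack. */
void decr(void) {
    assert(top > 0);
    top--;
}

/* insert: add a row tagged with the top of the stack; return its index. */
int insert(const char *s, int token) {
    assert(count < MAX_SYMS);
    Sym *e = &table[count];
    strncpy(e->str, s, sizeof e->str - 1);
    e->str[sizeof e->str - 1] = '\0';
    e->token = token;
    e->nest  = stack[top].nest;
    e->scope = stack[top].scope;
    last = count++;
    return last;
}

/* lookup: newest matching row whose scope is still open, else -1. */
int lookup(const char *s) {
    for (int i = count - 1; i >= 0; i--) {
        if (strcmp(table[i].str, s) != 0)
            continue;
        for (int k = top; k >= 0; k--) {      /* move down the stack */
            if (stack[k].nest != table[i].nest)
                continue;
            if (i >= stack[k].scope)
                return i;                     /* declared in an open scope */
            break;                            /* index below scope value: ignore */
        }
    }
    return -1;
}

/* The lexer action: enter s only if no visible entry exists. */
int see_id(const char *s) {
    int entry = lookup(s);
    return entry != -1 ? entry : insert(s, ID_T);
}

/* Replay the actions for the example program; returns the entry
   that the use of i inside f1 resolves to. */
int build_example(void) {
    see_id("i");
    see_id("f1");  incr();                    /* int f1(  */
    see_id("k");
    see_id("j");
    int use_of_i = see_id("i");               /* print i; */
    decr();                                   /* }        */
    see_id("f2");  incr();                    /* int f2(  */
    see_id("j");                              /* f1's j is hidden, so a new row */
    decr();                                   /* }        */
    return use_of_i;
}
```

The comparison follows the text literally: an index below the scope value is ignored. That is sufficient for this grammar, because the entry whose index equals the scope value is always the function name, which is declared one nesting level up.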

Constructed symbol table

    int i;
    int f1(int k) {
        int j;
        ...
        print i;
    }
    int f2() {
        int j;
        ...
    }

    Index  Str  Nest  Scope  Atts
    0      i    0     0      ...
    1      f1   0     0      ...
    2      k    1     1      ...
    3      j    1     1      ...
    4      f2   0     0      ...
    5      j    1     4      ...

The changes in the stack are as follows (top on the right):

    Event  Last  Stack (Nest,Scope)
    start  0     (0,0)
    1      1     (0,0), (1,1)
    2      3     (0,0)
    3      4     (0,0), (1,4)
    4      5     (0,0)

Syntax trees and scope


    Prog
      VDec (int, id i)
      func (int, id f1)
        VDec (int, id k)
        VDec (int, id j)
        print (id i)
      func (int, id f2)
        VDec (int, id j)

- Many compilers simply build a syntax tree on the first pass (while carrying out lexical and syntax analysis).
- They then make a second pass, constructing the symbol table, checking data types, etc.
- It should be easier to determine the scope of the identifiers from the syntax tree.
- Multiple passes may be slower, but they can result in more natural grammars, and simpler translation and analysis routines.
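
A second pass of this kind might look roughly like the sketch below, which walks a small syntax tree and records each declaration with its nesting level. The Node layout, the Kind values and the names collect, declare and example_prog are all assumptions made for illustration; the tree is a hand-built version of the one for the example program with i, f1 and f2.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { PROG, VDEC, FUNC, PRINT } Kind;   /* hypothetical node kinds */

typedef struct Node {
    Kind kind;
    const char *name;             /* identifier declared or used; NULL for PROG */
    const struct Node *child;     /* first child */
    const struct Node *sibling;   /* next node at the same level */
} Node;

typedef struct { const char *name; int nest; } Decl;

#define MAX_DECLS 32
static Decl decls[MAX_DECLS];     /* declarations in the order they are met */
static int  ndecls = 0;

static void declare(const char *name, int nest) {
    assert(ndecls < MAX_DECLS);
    decls[ndecls].name = name;
    decls[ndecls].nest = nest;
    ndecls++;
}

/* Second pass: the recursion depth gives the nesting level directly. */
void collect(const Node *n, int nest) {
    for (; n != NULL; n = n->sibling) {
        switch (n->kind) {
        case PROG:  collect(n->child, nest);        break;
        case VDEC:  declare(n->name, nest);         break;
        case FUNC:  declare(n->name, nest);         /* the name is in the outer scope */
                    collect(n->child, nest + 1);    /* parameters and body are inside */
                    break;
        case PRINT: break;                          /* a use, not a declaration */
        }
    }
}

/* Hand-built tree for the example program. */
static const Node use_i  = { PRINT, "i",  NULL,    NULL    };
static const Node dec_j1 = { VDEC,  "j",  NULL,    &use_i  };
static const Node dec_k  = { VDEC,  "k",  NULL,    &dec_j1 };
static const Node dec_j2 = { VDEC,  "j",  NULL,    NULL    };
static const Node fn_f2  = { FUNC,  "f2", &dec_j2, NULL    };
static const Node fn_f1  = { FUNC,  "f1", &dec_k,  &fn_f2  };
static const Node dec_i  = { VDEC,  "i",  NULL,    &fn_f1  };
const Node example_prog  = { PROG,  NULL, &dec_i,  NULL    };
```

Calling collect(&example_prog, 0) fills decls in source order. No explicit scope stack is needed here, since the call stack of collect plays that role, which is one way to read the remark that scope is easier to determine from the tree.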
