Phases of a Compiler
Source Code
Lexical Analyzer
Syntax Analyzer
Semantic Analyzer
Supporting components, used by every phase:
Symbol Table Manager
Error Handler
Parsing Overview
What is syntax?
The way in which words are put together to form phrases, clauses,
or sentences.
The parser checks the stream of words (tokens) and their parts of speech
for grammatical correctness.
It determines whether the input is syntactically well formed.
It guides context-sensitive (semantic) analysis, such as type checking.
Finally, it builds an intermediate representation (IR) of the source program.
The parser ensures that the sentences of a programming language that make
up a program abide by the syntax of the language.
If there are errors, the parser detects them and reports them
accordingly.
A scanner based on regular expressions cannot detect syntax errors: it
classifies individual tokens but does not check how they are arranged.
Example
Parser input:
IF ID == ID
ID = INT
ELSE
ID = INT
Example
Parser output:
IF-THEN-ELSE
    ==
        ID
        ID
    =
        ID
        INT
    =
        ID
        INT
Example
Java expression
x == y ? 1 : 2
Parser input
ID == ID ? INT : INT
Parser output
?:
    ==
        ID
        ID
    INT
    INT
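The step from parser input to parser output in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: tokens are given as plain type strings exactly as on the slide, the tree is a nested tuple, and the function name `parse_conditional` is hypothetical. It handles only this one expression shape, not a full expression grammar.

```python
def parse_conditional(tokens):
    """Parse  operand == operand ? operand : operand  into a nested tuple."""
    pos = 0

    def expect(*kinds):
        # Consume the next token if its type is one of `kinds`, else report an error.
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] not in kinds:
            found = tokens[pos] if pos < len(tokens) else "end of input"
            raise SyntaxError(f"expected {' or '.join(kinds)}, found {found}")
        pos += 1
        return tokens[pos - 1]

    left = expect("ID", "INT")
    expect("==")
    right = expect("ID", "INT")
    expect("?")
    then_value = expect("ID", "INT")
    expect(":")
    else_value = expect("ID", "INT")
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]}")
    # Root is the conditional operator; its first child is the comparison.
    return ("?:", ("==", left, right), then_value, else_value)


tree = parse_conditional(["ID", "==", "ID", "?", "INT", ":", "INT"])
print(tree)  # ('?:', ('==', 'ID', 'ID'), 'INT', 'INT')
```

A token stream such as `ID == ? INT : INT` makes the same function raise `SyntaxError`, which is the error-reporting role of the parser described earlier.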
Phase              Input                     Output
Lexical Analyzer   Sequence of characters    Sequence of tokens
Parser             Sequence of tokens        Parse tree
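As a rough illustration of the first row, a lexical analyzer can be sketched with Python's `re` module. The token classes below (`INT`, `ID`, `OP`) and the function name `tokenize` are assumptions chosen for this example; a real language defines many more token types.

```python
import re

# Hypothetical token classes, for illustration only.
TOKEN_SPEC = [
    ("INT", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("OP", r"==|[=?:+*()]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Turn a sequence of characters into a list of (type, lexeme) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"illegal character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":      # whitespace is dropped, not emitted
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("x == y ? 1 : 2"))
print(tokenize("x == == y"))   # lexically fine, although syntactically wrong
```

The second call succeeds because every lexeme is a valid token on its own. Rejecting `x == == y` is the parser's job, not the scanner's.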
Scanners
Task: recognize language tokens
Implementation: DFA
Transition based on the next character
Parsers
Task: recognize language syntax (organization of tokens)
Implementation:
Top-down parsing
Bottom-up parsing
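The scanner side of this comparison, a DFA that moves on the next character, might look like the following table-driven sketch. The states (`start`, `in_id`, `in_int`) and the two token types are assumptions made for the example; they are not taken from the slides.

```python
def char_class(ch):
    """Map a character to the input class used by the transition table."""
    if ch.isalpha() or ch == "_":
        return "letter"
    if ch.isdigit():
        return "digit"
    return "other"


# Transition table: (state, character class) -> next state.
TRANSITIONS = {
    ("start", "letter"): "in_id",
    ("start", "digit"): "in_int",
    ("in_id", "letter"): "in_id",
    ("in_id", "digit"): "in_id",
    ("in_int", "digit"): "in_int",
}
# Accepting states and the token type each one recognizes.
ACCEPTING = {"in_id": "ID", "in_int": "INT"}


def classify(lexeme):
    """Run the DFA over the whole lexeme; return its token type or None."""
    state = "start"
    for ch in lexeme:
        state = TRANSITIONS.get((state, char_class(ch)))
        if state is None:          # no transition defined: the DFA rejects
            return None
    return ACCEPTING.get(state)


print(classify("count1"), classify("42"), classify("4x"))
```

Each step looks only at the current state and one character, which is why a DFA is enough for tokens but not for nested syntax.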
We need
A language for describing valid sequences of tokens
A method for distinguishing valid from invalid sequences of tokens
An acceptor mechanism that determines whether the input token stream
satisfies the syntax of the programming language.
Quiz
Key Idea
1.
2.
3.
4.
Types of derivations:
Left-most derivation        Right-most derivation
E ⇒ E + E                   E ⇒ E + E
  ⇒ E * E + E                 ⇒ E + id
  ⇒ id * E + E                ⇒ E * E + id
  ⇒ id * id + E               ⇒ E * id + id
  ⇒ id * id + id              ⇒ id * id + id
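The difference between the two derivation orders can be replayed mechanically. The sketch below assumes the grammar E → E + E | E * E | id (the * operator is implied by the later example string int * int + int) and uses a hypothetical helper `derive` that rewrites either the left-most or the right-most E at each step.

```python
def derive(productions, leftmost=True):
    """Apply each right-hand side to the left-most (or right-most) E.

    Returns the list of sentential forms, starting from the start symbol E.
    """
    form = ["E"]
    forms = [" ".join(form)]
    for rhs in productions:
        positions = [i for i, symbol in enumerate(form) if symbol == "E"]
        i = positions[0] if leftmost else positions[-1]
        form = form[:i] + rhs.split() + form[i + 1:]
        forms.append(" ".join(form))
    return forms


left = derive(["E + E", "E * E", "id", "id", "id"], leftmost=True)
right = derive(["E + E", "id", "E * E", "id", "id"], leftmost=False)
print(" => ".join(left))
print(" => ".join(right))
```

Both runs end in the same sentence, `id * id + id`. Only the order in which nonterminals are expanded differs.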
Left-most derivation:
E ⇒ E + E
  ⇒ E * E + E
  ⇒ id * E + E
  ⇒ id * id + E
  ⇒ id * id + id
Parse tree:
          E
        / | \
       E  +  E
     / | \   |
    E  *  E  id
    |     |
    id    id
Right-most derivation:
E ⇒ E + E
  ⇒ E + id
  ⇒ E * E + id
  ⇒ E * id + id
  ⇒ id * id + id
Parse tree:
          E
        / | \
       E  +  E
     / | \   |
    E  *  E  id
    |     |
    id    id
Example: Parse Tree
[Figure: parse tree of an expression over id, + and *]
CFG Ambiguity
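Consider E → E + E | E * E | ( E ) | int
The string int * int + int has two different parse trees.

Tree 1, grouping (int * int) + int:
          E
        / | \
       E  +  E
     / | \   |
    E  *  E  int
    |     |
   int   int

Tree 2, grouping int * (int + int):
          E
        / | \
       E  *  E
       |   / | \
      int E  +  E
          |     |
         int   int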
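The claim that int * int + int has two parse trees can be checked by counting. The sketch below assumes the grammar E → E + E | E * E | ( E ) | int; the function name `count_parse_trees` is hypothetical. It counts, for every token span, how many distinct trees derive it.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count parse trees for `tokens` under E -> E + E | E * E | ( E ) | int."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of parse trees deriving tokens[i:j] from E.
        total = 0
        if j - i == 1 and tokens[i] == "int":
            total += 1                                   # E -> int
        if j - i >= 3 and tokens[i] == "(" and tokens[j - 1] == ")":
            total += count(i + 1, j - 1)                 # E -> ( E )
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):
                total += count(i, k) * count(k + 1, j)   # E -> E op E, split at k
        return total

    return count(0, len(tokens))


print(count_parse_trees("int * int + int".split()))
```

A count above 1 for any string shows that the grammar is ambiguous. Here the two trees correspond to splitting the string at `*` or at `+`.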
CFG Ambiguity
Example of an unambiguous CFG:
Consider a CFG for the language PALINDROME.
S → aSa | bSb | a | b | ε
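A recognizer that follows the PALINDROME productions directly helps show why the grammar is unambiguous. The sketch assumes the grammar S → aSa | bSb | a | b | ε over the alphabet {a, b}; the function name `derives` is hypothetical.

```python
def derives(s):
    """Return True if S derives s under S -> aSa | bSb | a | b | epsilon."""
    if s == "":                          # S -> epsilon
        return True
    if s in ("a", "b"):                  # S -> a | b
        return True
    if s[0] == s[-1] and s[0] in "ab":   # S -> aSa | bSb
        return derives(s[1:-1])
    return False


for word in ["abba", "aba", "", "ab", "abc"]:
    print(repr(word), derives(word))
```

For any given string, at most one production can apply at each step, so every palindrome has exactly one parse tree.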
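The statement
if E1 then if E2 then S1 else S2
has two parse trees.

First form: the else belongs to the outer if
if E1 then
    if E2 then S1
else S2

        if
      / | \
    E1  if  S2
       /  \
      E2   S1

Second form: the else belongs to the inner if
if E1 then
    if E2 then S1
    else S2

      if
     /  \
   E1    if
       / | \
     E2  S1 S2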
Typically we want the second form because ELSE matches the closest
previously unmatched THEN.
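One common way to obtain the second form is a recursive-descent parser that consumes an else as soon as it sees one. The sketch below is a minimal illustration, assuming that conditions and simple statements are single tokens (E1, S1, and so on) and that the input is complete; the function name `parse_statement` is hypothetical.

```python
def parse_statement(tokens, pos=0):
    """Parse one statement starting at `pos`; return (tree, next_pos).

    Assumes well-formed, untruncated input: no bounds checks beyond 'else'.
    """
    if tokens[pos] != "if":
        return tokens[pos], pos + 1               # simple statement such as S1
    condition = tokens[pos + 1]
    if tokens[pos + 2] != "then":
        raise SyntaxError("expected 'then'")
    then_branch, pos = parse_statement(tokens, pos + 3)
    # Greedy rule: an 'else' seen here belongs to this, the innermost open 'if'.
    if pos < len(tokens) and tokens[pos] == "else":
        else_branch, pos = parse_statement(tokens, pos + 1)
        return ("if", condition, then_branch, else_branch), pos
    return ("if", condition, then_branch), pos


tree, _ = parse_statement("if E1 then if E2 then S1 else S2".split())
print(tree)  # ('if', 'E1', ('if', 'E2', 'S1', 'S2'))
```

The inner call returns only after it has taken the else, so the outer if never gets the chance to claim it.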