
PARSING

8/31/2012


In compiler design, parsing is the second stage, after lexical analysis. It is also called syntax analysis. The parser takes the stream of tokens generated by the lexical analyzer, checks whether it is grammatically correct, and generates a parse tree. The fundamental theory behind parsing is grammar theory.


CONTEXT FREE GRAMMAR

A CFG is a 4-tuple G = (N, T, P, S) where:

N is a set of non-terminals.
T is a set of terminals.
P is a set of productions (or rules) of the form A -> α, where A denotes a single non-terminal and α denotes a string of terminals and non-terminals.
S is the start symbol. If not specified, it is the non-terminal that appears on the left-hand side of the first production.
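
The definition above can be sketched as a small data structure. This is only an illustration: the class name Grammar, the method validate, and the toy grammar S -> a S b | ε are assumptions introduced here, not part of the slides.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Grammar:
    """Hypothetical container for a CFG G = (N, T, P, S)."""
    nonterminals: frozenset   # N
    terminals: frozenset      # T
    productions: tuple        # P: pairs (A, alpha), alpha a tuple of symbols
    start: str                # S

    def validate(self):
        """Check the well-formedness conditions stated in the definition."""
        if self.nonterminals & self.terminals:
            raise ValueError("N and T must be disjoint")
        if self.start not in self.nonterminals:
            raise ValueError("the start symbol must be a non-terminal")
        symbols = self.nonterminals | self.terminals
        for lhs, rhs in self.productions:
            if lhs not in self.nonterminals:
                raise ValueError(f"left-hand side {lhs!r} is not a single non-terminal")
            if any(sym not in symbols for sym in rhs):
                raise ValueError(f"unknown symbol in right-hand side {rhs!r}")
        return True

# Toy grammar (an assumption for illustration): S -> a S b | epsilon
toy = Grammar(
    nonterminals=frozenset({"S"}),
    terminals=frozenset({"a", "b"}),
    productions=(("S", ("a", "S", "b")), ("S", ())),
    start="S",
)
print(toy.validate())
```

An empty tuple stands for an ε right-hand side in this encoding.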


Parse trees
Parse trees are labeled trees characterized by the following:
The root is labeled by the start symbol.
Each leaf is labeled by a token or by ε.
Each interior node is labeled by a non-terminal.
If A is the non-terminal labeling some interior node and X1, X2, ..., Xn are the labels of the children of that node from left to right, then A ::= X1 X2 ... Xn is a production in the grammar.

AMBIGUITY AND UNAMBIGUITY

A word is said to be ambiguously derivable if there is more than one derivation for it, that is, if more than one distinct parse tree can be generated for that word.
Two kinds of derivation are important. A derivation is a leftmost derivation if the leftmost non-terminal is always the one chosen to be replaced. It is a rightmost derivation if the rightmost one is always chosen. Ambiguity is considered only among derivations of the same kind: a word is ambiguous when it has two different leftmost (or two different rightmost) derivations.



A grammar is said to be ambiguous if at least one word in its language is ambiguously derivable. A grammar is said to be unambiguous if every word derived from it has exactly one parse tree.


A language L is said to be unambiguous if at least one unambiguous grammar exists for it. A language L is said to be (inherently) ambiguous if all grammars for the language are ambiguous.

Programming language grammars must be unambiguous.


BOOLEAN EXPRESSIONS
The language of Boolean expressions can be defined in English as follows:
true is a Boolean expression.
false is a Boolean expression.
If expression1 and expression2 are Boolean expressions, then so are the following:
    expression1 OR expression2
    expression1 AND expression2
    NOT expression1
    ( expression1 )

Operator precedence:
    Low       ||
    Higher    &&
    Highest   !

Consider this simple CFG:
bexp -> TRUE
bexp -> FALSE
bexp -> bexp || bexp
bexp -> bexp && bexp
bexp -> ! bexp
bexp -> ( bexp )


CONTEXT FREE GRAMMAR FOR BOOLEAN EXPRESSIONS


Consider the following shorthand form of the CFG for Boolean expressions:
E -> E && E
E -> E || E
E -> ! E
E -> ( E )
E -> t
E -> f
E is a non-terminal and the start symbol. &&, ||, !, (, ), t and f are terminals.

Here are two different leftmost derivations of the word t && t && t.
The first one, corresponding to the first tree:
E => E && E => E && E && E => t && E && E => t && t && E => t && t && t
The second one, corresponding to the second tree:
E => E && E => t && E => t && E && E => t && t && E => t && t && t
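
The two derivations can be replayed step by step. This is a minimal sketch, assuming only the two productions E -> E && E and E -> t; the function name leftmost_derivation and the constants AND and T are introduced here for illustration.

```python
NONTERMINALS = {"E"}
AND = ("E", "&&", "E")   # right-hand side of E -> E && E
T = ("t",)               # right-hand side of E -> t

def leftmost_derivation(start, choices):
    """Apply each chosen right-hand side to the leftmost non-terminal."""
    form = [start]
    steps = [" ".join(form)]
    for rhs in choices:
        # index of the leftmost non-terminal in the current sentential form
        i = next(k for k, sym in enumerate(form) if sym in NONTERMINALS)
        form[i:i + 1] = rhs
        steps.append(" ".join(form))
    return steps

first = leftmost_derivation("E", [AND, AND, T, T, T])
second = leftmost_derivation("E", [AND, T, AND, T, T])

print(" => ".join(first))
print(" => ".join(second))
```

Both sequences always rewrite the leftmost non-terminal and end in the same word, yet they differ in their third sentential form, which is what makes the word ambiguous.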


A CFG is ambiguous if at least one word in the described language has more than one parse tree.

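The two parse trees for t && t && t (children indented under their parent, left to right):

First tree, grouping as (t && t) && t:
E
  E
    E
      t
    &&
    E
      t
  &&
  E
    t

Second tree, grouping as t && (t && t):
E
  E
    t
  &&
  E
    E
      t
    &&
    E
      t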
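
Ambiguity of a particular word can be confirmed by counting its parse trees. The sketch below assumes the shorthand grammar E -> E && E | E || E | ! E | ( E ) | t | f and counts trees over token spans with memoization; the function name count_parse_trees is hypothetical.

```python
from functools import lru_cache

def count_parse_trees(tokens):
    """Count the distinct parse trees deriving the token list from E."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # number of parse trees deriving tokens[i:j] from E
        if j <= i:
            return 0
        n = 0
        if j - i == 1 and tokens[i] in ("t", "f"):
            n += 1                                   # E -> t | f
        if tokens[i] == "!":
            n += count(i + 1, j)                     # E -> ! E
        if tokens[i] == "(" and tokens[j - 1] == ")":
            n += count(i + 1, j - 1)                 # E -> ( E )
        for k in range(i + 1, j - 1):
            if tokens[k] in ("&&", "||"):
                n += count(i, k) * count(k + 1, j)   # E -> E op E, split at k
        return n

    return count(0, len(tokens))

print(count_parse_trees("t && t && t".split()))
print(count_parse_trees("! t || f".split()))
```

Any result greater than 1 means the word has more than one parse tree, so the grammar is ambiguous.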

We construct an unambiguous version of the context-free grammar for Boolean expressions by making it reflect the following operator precedence conventions:
! (NOT) has the highest precedence.
&& (AND) has the next highest precedence.
|| (OR) has the lowest precedence.
For example, t || !f && t should be interpreted as t || ((!f) && t). As long as the grammar is unambiguous, you can choose whether or not to accept expressions that would need conventions about operator associativity to disambiguate them, like t && t && t.

Here is a version that assumes that the binary operators are non-associative:
E -> E1 || E1
E -> E1
E1 -> E2 && E2
E1 -> E2
E2 -> ! E2
E2 -> ( E )
E2 -> t
E2 -> f
Draw the derivation trees according to this unambiguous grammar for the following two expressions:
(i) ! t || f
(ii) (f || t) || ! f && t
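
One way to experiment with the non-associative grammar is a small recursive-descent parser, with one function per non-terminal. This is a sketch under assumptions made here: the input is already split into tokens, and a tree is a nested tuple (label, child, ...). The name parse is hypothetical.

```python
def parse(tokens):
    """Parse a token list with the non-associative grammar; return a nested tuple."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def parse_E():                     # E -> E1 || E1 | E1
        left = parse_E1()
        if peek() == "||":
            advance()
            return ("E", left, "||", parse_E1())
        return ("E", left)

    def parse_E1():                    # E1 -> E2 && E2 | E2
        left = parse_E2()
        if peek() == "&&":
            advance()
            return ("E1", left, "&&", parse_E2())
        return ("E1", left)

    def parse_E2():                    # E2 -> ! E2 | ( E ) | t | f
        tok = advance()
        if tok == "!":
            return ("E2", "!", parse_E2())
        if tok == "(":
            inner = parse_E()
            if advance() != ")":
                raise SyntaxError("expected )")
            return ("E2", "(", inner, ")")
        if tok in ("t", "f"):
            return ("E2", tok)
        raise SyntaxError(f"unexpected token {tok!r}")

    tree = parse_E()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

print(parse("! t || f".split()))
print(parse("( f || t ) || ! f && t".split()))
```

Because each binary operator may appear at most once per level, an input such as t && t && t is rejected with a SyntaxError rather than being given an arbitrary grouping.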


Parse tree for !t || f (children indented under their parent, left to right):

E
  E1
    E2
      !
      E2
        t
  ||
  E1
    E2
      f

Parse tree for (f || t) || !f && t (children indented under their parent, left to right):

E
  E1
    E2
      (
      E
        E1
          E2
            f
        ||
        E1
          E2
            t
      )
  ||
  E1
    E2
      !
      E2
        f
    &&
    E2
      t

ASSOCIATIVITY
The binary operators && and || are considered to be left-associative in most programming languages, i.e., an expression like t || t || t is interpreted as (t || t) || t.

Short Circuit


Making the production rules for the binary operators left-associative:
E -> E || E1
E -> E1
E1 -> E1 && E2
E1 -> E2
E2 -> ! E3
E2 -> E3
E3 -> ( E )
E3 -> t
E3 -> f
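
The left-recursive rules E -> E || E1 and E1 -> E1 && E2 cannot be turned directly into recursive functions, since parse_E would call itself forever. A common workaround, sketched here as an assumption and not taken from the slides, is to replace the left recursion by a loop that keeps wrapping the tree built so far as the left operand. The terminals are written t and f, and the names parse_left_assoc and show are hypothetical.

```python
def parse_left_assoc(tokens):
    """Parse with the left-associative grammar; return a nested tuple (label, child, ...)."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def parse_E():
        node = ("E", parse_E1())                      # E -> E1
        while peek() == "||":
            advance()
            node = ("E", node, "||", parse_E1())      # E -> E || E1
        return node

    def parse_E1():
        node = ("E1", parse_E2())                     # E1 -> E2
        while peek() == "&&":
            advance()
            node = ("E1", node, "&&", parse_E2())     # E1 -> E1 && E2
        return node

    def parse_E2():
        if peek() == "!":
            advance()
            return ("E2", "!", parse_E3())            # E2 -> ! E3
        return ("E2", parse_E3())                     # E2 -> E3

    def parse_E3():
        tok = advance()
        if tok in ("t", "f"):
            return ("E3", tok)                        # E3 -> t | f
        if tok == "(":
            inner = parse_E()
            if advance() != ")":
                raise SyntaxError("expected )")
            return ("E3", "(", inner, ")")            # E3 -> ( E )
        raise SyntaxError(f"unexpected token {tok!r}")

    tree = parse_E()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

def show(node):
    """Render a tree as text, parenthesizing every binary operator node."""
    if isinstance(node, str):
        return node
    kids = [show(k) for k in node[1:]]
    if len(kids) == 3 and kids[1] in ("||", "&&"):
        return "(" + " ".join(kids) + ")"
    return " ".join(kids)

print(show(parse_left_assoc("f || f || t".split())))   # ((f || f) || t)
```

The loop produces the same left-leaning tree as the left-recursive productions: the first two operands are grouped before the third is attached.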

Parse tree for f || f || t (children indented under their parent, left to right):

E
  E
    E
      E1
        E2
          E3
            f
    ||
    E1
      E2
        E3
          f
  ||
  E1
    E2
      E3
        t


THANK YOU..
