Nfa Dfa

Introduction (1)
Language Processors (E2.15) Review: Lexical analysis, regular expressions, finite state automata
Lecture 5: Lexical Analysis

Lexical analysis is concerned with converting an input stream of
characters into an output stream of tokens.
Tokens are the basic constituent elements of the language – for
example, the keywords “begin, for, if, then, else” for PASCAL.
Objectives We already saw that for a regular expression defining the
structure of the tokens of the language, there is a FSA that can be
To introduce non-deterministic finite automata (NFA), constructed to recognize whether a series of characters is a valid
and demonstrate how they can be converted to DFAs part of the language (a valid token)
--
To introduce Thomson’s algorithm for converting
The lexical analyser is usually a routine in the compiler, that when
regular expressions to NFA
it is called, reads the input stream and returns the next token that
To introduce the DFA state minimization algorithm it finds.
Tokens have associated attributes – we want them too!
E2.15 - Language Processors E2.15 - Language Processors

(Lect 5) 1 (Lect 5) 2
Introduction (2) Introduction (2)

Review: Lexical analysis as moving from start to final states of an FSA (1) Review: Lexical analysis as moving from start to final states of an FSA (2)
Lexical analysis can be viewed as the process of q0 < q1 = q2 Return(relop, LE)

starting from the start state, reading the input one
character at the time, and moving between states >
according to the input = q3 Return(relop, NE)
If we are in an accepting state when the input stream is >
terminated, then the input has been recognised as other
q4 * Return(relop, LT)
valid – appropriate messages and attributes are
returned. q5 Return(relop, EQ) Return(relop, GE)
Example, for the relational operator token (relop) in =
Pascal has the following regular expression: q2
relop < | <= | = | <> | > | >= q6 other q2 *

E2.15 - Language Processors
(Lect 5) 3 * = input retraction E2.15 - Language Processors
(Lect 5) Return(relop, GT)
4
1
NFA (1) NFA (1)
DFA and NFA (1) DFA and NFA (2)
Deterministic FSA allow only one transition from one Another difference between DFA and NFA is that NFA
state to another for a given input allow ε-transitions
Non-deterministic FSA allow more than one transition An ε-transition is a transition between states without
possibilities for a given input, reading any input
for example from q0, given a as input we can go either to Indicates equivalent states
q2, or q3
a q3 state input
state input
b ε
a b
q0 a q2 q0 {q2} {q1} b
q0 {q2, q3} q1 q0 q2
b
ε
….
…. q1 q1
(Lect 5) 5 (Lect 5) 6
NFA (2) NFA (2)

Equivalency (1) Equivalency (2) - examples
Ε-transitions elimination
NFA can be converted to a DFA (through a process b
q0 q2
known as subset construction) b
qA q2
The main idea: merge sets of states of the NFA in ε
single states in the DFA; adopt the viewpoint of the q1
input characters:
Any two states in the NFA that are connected by an ε-
transition are essentially the same, since we can move a q3 Non-deterministic transitions elimination
between them without consuming any characters => we can
represent them using only one state in the DFA q0 a q2 a
q0 qA
For any symbol that can result to multiple transitions we can
b
consider that we can have a single transition to a set of states b
(the union of all those states reachable by a transition on the q1
current symbol). This set can be combined into a single DFA q1
state.
(Lect 5) 7 (Lect 5) 8
2
NFA (3) NFA (3)
Subset construction algorithm (2)
Subset construction algorithm (1)
1. The start state of the DFA is the ε-closure of the start

The subset construction algorithm requires state of the NFA
the definition of two functions: 2. For the created state of (1) and for any created state
The ε-closure function takes as input a state and in (2) do:
returns the set of states reachable from it based For each possible input symbol:
on one or more ε-transitions. This set will always Apply the move function to the created state with the input
symbol
include the state itself For the resulting set of states, apply the ε-closure function;
The function move (State, Char) which takes a this might or might not result in a new set of states
state and a character and returns the set of states 3. Continue (2) until no more new states are being
reachable from it with one transition using this created.
character. 4. The final states of the DFA are the ones containing
The algorithm then proceeds as follows: any of the final states of the NFA
[Potential exponential expansion in the state space]
(Lect 5) 9 (Lect 5) 10
NFA (4) Regular Expressions to NFA

NFA – a useless concept? Thompson’s algorithm (1)
So, NFA can be reduced to DFA [which are Key idea: construct an NFA for each symbol and for
each operator (|, union, Kleene *); join the resulting
easier to program]! NFA with ε-transitions (in precedence order)
Why NFA?!
Instrumental for the automatic construction of DFA NFA representing the empty string NFA for symbol a
from regular expressions
ε a
RE -> DFA tough; instead we proceed as q0 q1 q0 q1
RE -> NFA -> DFA
ε-transitions a key element! NFA representing union: ab
Time for Thompson’s algorithm for converting a ε b
a RE to an NFA: q0 q1 q2 q3

(Lect 5) 11 (Lect 5) 12
3
Regular Expressions to NFA Regular Expressions to NFA
Thompson’s algorithm (2) Thompson’s algorithm (3) - Example
NFA representing a | b
Converting regular expression a(b|c)* to a NFA
a
ε q1 q2 ε
q0 We start with each symbol, e.g. a:
ε ε q3
b a
q4 q5 q0 q1 (and similarly for b and c)
NFA representing Kleene’ star: a* Then: b | c b

ε q1 q2 ε
ε a ε
q0 q4 q5 q3 q0
c q5
ε q3 q4
ε
ε
ε
(Lect 5) 13 (Lect 5) 14
Regular Expressions to NFA Regular Expressions to NFA

Thompson’s algorithm (3) – Example (2) Thompson’s algorithm (3) – Example (3)
…Converting regular expression a(b|c)* to a NFA (cont.) And finally: a(b|c)* can be constructed as
Then: (b | c)* can be constructed as

b ε
ε q1 q2
b ε
ε q1 q2 ε q6 ε
q8 q5 q7
ε ε c
q0 q6 q3 q4
c q5 q7 ε ε
ε q3 q4 ε
ε
a
q0 q9
ε ε
(Lect 5) 15 (Lect 5) 16
4
Regular Expressions to NFA Automatic construction (1)
Thompson’s algorithm (3) – Example (4) Overview of the process
?! But can’t we do much better by having instead of: Yes, but the algorithmic nature of Thompson’s
construction means that NFA can be created
b automatically from RE
ε q1 q2 ε
ε Subsequently they can be converted to DFA (the
q8 q6 ε q7 subset construction algorithm), for which its easier to
c q5
automatically generate code
ε q3 q4 There are algorithms for minimizing DFA (ensuring that
ε ε
ε the constructed DFA has the smallest number states
q0 a q9 possible)
b
Minimal source
RE NFA DFA code
a DFA
the equivalent: q0 q1
Automatic construction: process overview
E2.15 - Language Processors c E2.15 - Language Processors
(Lect 5) 17 (Lect 5) 18
Automatic construction (2) Automatic construction (3)

DFA minimization Code generation for the DFA
Start by assuming that all the states in the DFA are Once the DFA has been constructed, code to
equivalent simulate it can be generated.
Work through the states, putting different states in The state transition table defines function move (s,ch)
separate sets which returns the next state from s given ch as input
Two states are considered different if: character. Then:
One is a final state and the other one isn’t S:=S0
The transition function maps them to different states, based on the c:=nextchar();
same input character. while c <> eof do
1. The minimization algorithm starts by initially creating two sets of s:= move (s, c)
states, final and non-final. c:= nextchar();
end;
2. For each state-set created from (1), it examines the transitions
for each state and for each input symbol. It the transition is to a if s is in FinalStates then
different state for any two states, then they are put into different return “yes”;
state-sets. else return “no”;
3. Repeat until no new state sets are being created by 2.

(Lect 5) 19 (Lect 5) 20
5
Summary
Recommended Reading
NFA provide a useful intermediary point between RE
and DFA. Chapter 3 of Aho et al
Lexical analysers can be generated by hand, or
automatically by converting regular expressions to Section 2.1 of Grune et al.
NFA (Thompson’s algorithm), then the NFA to a DFA
(subset construction algorithm), minimizing the DFA
states, and converting into code.
Next lecture:
Lexical analysis using LEX

(Lect 5) 21 (Lect 5) 22

Nfa Dfa

Hochgeladen von

Dokumentinformationen

Originalbeschreibung:

Originaltitel

Copyright

Verfügbare Formate

Dieses Dokument teilen

Dokument teilen oder einbetten

Freigabeoptionen

Stufen Sie dieses Dokument als nützlich ein?

Sind diese Inhalte unangemessen?

Copyright:

Verfügbare Formate

Nfa Dfa

Hochgeladen von

Copyright:

Verfügbare Formate

Introduction (1)

Lecture 5: Lexical Analysis

E2.15 - Language Processors E2.15 - Language Processors

Introduction (2) Introduction (2)

Lexical analysis can be viewed as the process of q0 < q1 = q2 Return(relop, LE)

relop < | <= | = | <> | > | >= q6 other q2 *

NFA (2) NFA (2)

1. The start state of the DFA is the ε-closure of the start

NFA (4) Regular Expressions to NFA

E2.15 - Language Processors E2.15 - Language Processors

NFA representing Kleene’ star: a* Then: b | c b

Regular Expressions to NFA Regular Expressions to NFA

Then: (b | c)* can be constructed as

Automatic construction (2) Automatic construction (3)

3. Repeat until no new state sets are being created by 2.

E2.15 - Language Processors E2.15 - Language Processors

Das könnte Ihnen auch gefallen

Nfa Dfa

Hochgeladen von

Dokumentinformationen

Originalbeschreibung:

Originaltitel

Copyright

Verfügbare Formate

Dieses Dokument teilen

Dokument teilen oder einbetten

Freigabeoptionen

Stufen Sie dieses Dokument als nützlich ein?

Sind diese Inhalte unangemessen?

Copyright:

Verfügbare Formate

Nfa Dfa

Hochgeladen von

Copyright:

Verfügbare Formate

Introduction (1)

Lecture 5: Lexical Analysis

E2.15 - Language Processors E2.15 - Language Processors

Introduction (2) Introduction (2)

 Lexical analysis can be viewed as the process of q0 < q1 = q2 Return(relop, LE)

relop < | <= | = | <> | > | >= q6 other q2 *

NFA (2) NFA (2)

1. The start state of the DFA is the ε-closure of the start

NFA (4) Regular Expressions to NFA

E2.15 - Language Processors E2.15 - Language Processors

NFA representing Kleene’ star: a* Then: b | c b

Regular Expressions to NFA Regular Expressions to NFA

Then: (b | c)* can be constructed as

Automatic construction (2) Automatic construction (3)

3. Repeat until no new state sets are being created by 2.

E2.15 - Language Processors E2.15 - Language Processors

Das könnte Ihnen auch gefallen

Lexical analysis can be viewed as the process of q0 < q1 = q2 Return(relop, LE)