
AUTOMATA AND COMPILER DESIGN
B.Tech III IT B
sumalatha
UNIT - I
Formal Language and Regular Expressions:
Languages Definition
regular expressions
Regular sets
identity rules.

Finite Automata:
DFA
NFA
NFA with ε-transitions: significance
acceptance of languages
NFA to DFA conversion
minimization of DFA
Finite Automata with output
Moore and Mealy machines
Constructing finite Automata for a given
regular expressions
Conversion of Finite Automata to Regular
expressions.
What is Automata Theory?
Study of abstract computing devices, or
machines
Automaton = an abstract computing device
Note: A device need not even be a physical
hardware!
A fundamental question in computer science:
Find out what different models of machines can do
and cannot do
The theory of computation
Alan Turing (1912-1954)
Father of Modern Computer
Science

English mathematician

Studied abstract machines
called Turing machines
even before computers
existed

The Central Concepts of
Automata Theory
Alphabet
Strings (words)
Language
Alphabet
An alphabet is a finite, non-empty set of
symbols
We use the symbol Σ (sigma) to denote an
alphabet
Examples:

Binary: Σ = {0,1}

All lower case letters: Σ = {a,b,c,...,z}

Alphanumeric: Σ = {a-z, A-Z, 0-9}
Strings
A string or word is a finite sequence of
symbols chosen from Σ
Ex: The string 01011 is formed over the
alphabet {0,1}

Empty string is ε (or epsilon)

Length of a string w, denoted by |w|, is equal
to the number of (non-ε) characters in the string

E.g., x = 010100                |x| = 6
      x = 01 ε 0 ε 1 ε 00 ε     |x| = ?



Language

A language, L, is simply any set of strings over a fixed alphabet.

Alphabet                        Languages
{0,1}                           {0,10,100,1000,100000}
                                {0,1,00,11,000,111,...}
{a,b,c}                         {abc, aabbcc, aaabbbccc,...}
{A,...,Z}                       {TEE,FORE,BALL,...}
                                {FOR,WHILE,GOTO,...}
{A,...,Z,a,...,z,0,...,9,       {All legal C programs}
 +,-,...,<,>,...}

Special Languages: ∅   - the EMPTY LANGUAGE
                   {ε} - contains the ε string only
Regular Expressions
A declarative way to express the pattern of any string
over an alphabet
or
Regular expressions are an algebraic way to describe
languages.

If E is a regular expression, then L(E) is the language it
defines.

A language denoted by a regular expression is said to be a
regular set.



Ex: Let L be the set of letters and D be the set of digits:
L = {A, B, ..., Z, a, b, ..., z}
D = {0, 1, ..., 9}

Operations on these two languages are:
1. Union of L and D, written L ∪ D: the set of
letters and digits.
2. Concatenation of L and D, written LD: the set of
strings consisting of a letter followed by a digit.
3. L⁴: the set of all four-letter strings.
4. Kleene closure of L, written L*: the set of all
strings of letters, including the empty string ε
(denotes zero or more concatenations of L).
5. Positive closure of D, written D⁺: the set of all
strings of one or more digits (denotes one or more
concatenations of D).
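
The operations on L and D can be tried out on small finite sets. The following is a minimal sketch, assuming two-element stand-ins for the letter and digit sets; the helper names (`concat`, `power`, `kleene_upto`, `positive_upto`) are illustrative only, and the closures are truncated at a maximum number of concatenations because L* and D⁺ are infinite.

```python
def concat(X, Y):
    """Concatenation XY: every string of X followed by every string of Y."""
    return {x + y for x in X for y in Y}

def power(X, n):
    """X^n: n-fold concatenation of X with itself; X^0 is {""}."""
    result = {""}
    for _ in range(n):
        result = concat(result, X)
    return result

def kleene_upto(X, max_n):
    """Finite approximation of X*: union of X^0 .. X^max_n."""
    out = set()
    for n in range(max_n + 1):
        out |= power(X, n)
    return out

def positive_upto(X, max_n):
    """Finite approximation of X+: union of X^1 .. X^max_n."""
    out = set()
    for n in range(1, max_n + 1):
        out |= power(X, n)
    return out

# Small stand-ins (assumption) for the letter set L and the digit set D.
L = {"a", "b"}
D = {"0", "1"}

print(sorted(L | D))            # union
print(sorted(concat(L, D)))     # a letter followed by a digit
print(len(power(L, 4)))         # number of four-letter strings over L
print("" in kleene_upto(L, 2))  # the closure contains the empty string
```

With the full 52-letter set the same `power(L, 4)` call would produce 52⁴ strings, which is why the sketch uses a two-letter stand-in.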


Regular Expressions
Let Σ = {a, b}
1. The regular expression (a+b) denotes the set {a, b}
2. (a+b)(a+b) = {aa, ab, ba, bb}
3. a* = {ε, a, aa, aaa, ...}
4. (a+b)* = {ε, a, b, ab, ba, aaa, bbb, aba, aab, bbabb, ...}
5. (ab)* = {ε, ab, abab, ababab, ...}
6. a*b* = {ε, a, b, aaaa, bbb, aaabbbb, abbb, aaaaab, ...}
7. a*|b* = {ε, aaaa, bbbbbbb, ...}
8. a⁺ = {a, aa, aaa, ...}
9. a(a+b)* denotes any number of a's and b's, in any order, but the string
starts with a

Examples
1. 01* = {0, 01, 011, 0111, ..}
2. (01*)(01) = {001, 0101, 01101, 011101, ..}
3. (0+1)*
4. (0+1)*01(0+1)*
5. ((0+1)(0+1)+(0+1)(0+1)(0+1))*
6. ((0+1)(0+1))*+((0+1)(0+1)(0+1))*
7. (1+01+001)*(ε+0+00)

Regular Expression
Examples
1. All strings that start with tab or end with bat:

tab{A,...,Z,a,...,z}* | {A,...,Z,a,...,z}*bat

2. All strings in which the digits 1, 2, 3 exist in ascending
numerical order:

{A,...,Z}*1 {A,...,Z}*2 {A,...,Z}*3 {A,...,Z}*
Find regular expressions for:

1. The set of bit strings with even length
(00 + 01 + 10 + 11)*

2. The set of bit strings ending with a 0, not containing
11, and not the null string
(0 + 10)*(0 + 10) or (0 + 10)⁺

3. The set of bit strings containing an odd
number of 0s
1*01*(01*01*)*
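
One way to gain confidence in answers like these is to compare each expression against a direct test of the property on all short bit strings. The sketch below assumes Python's `re` syntax, where union is written `|` instead of `+`; the helper names and the length bound of 8 are arbitrary choices for illustration, and a bounded check is evidence rather than a proof.

```python
import re
from itertools import product

def bit_strings(max_len):
    """Yield every bit string of length 0..max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)

# (pattern in Python syntax, property the pattern is meant to capture)
CASES = [
    (r"(00|01|10|11)*", lambda s: len(s) % 2 == 0),
    (r"(0|10)*(0|10)",  lambda s: s.endswith("0") and "11" not in s),
    (r"1*01*(01*01*)*", lambda s: s.count("0") % 2 == 1),
]

def disagreements(pattern, has_property, max_len=8):
    """Strings on which the regular expression and the property disagree."""
    return [s for s in bit_strings(max_len)
            if bool(re.fullmatch(pattern, s)) != has_property(s)]

for pattern, has_property in CASES:
    print(pattern, disagreements(pattern, has_property))
```

An empty list for a pattern means no counterexample exists up to the chosen length.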
The following rules are used to
simplify regular expressions:
1. ∅ + R = R
2. εR = R
3. R + R = R
4. R*R* = R*
5. (R*)* = R*
6. ε + RR* = R*
7. (P+Q)* = (P*Q*)* = (P* + Q*)*
8. RR* = R⁺
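
Identity rules can be spot-checked by comparing the sets of strings that both sides match up to some length. This is a minimal sketch, assuming P = a, Q = b and R = ab as sample expressions and Python's `re` syntax (`|` for union); `language` is a hypothetical helper, and agreement on short strings does not replace an algebraic proof.

```python
import re
from itertools import product

def language(pattern, alphabet="ab", max_len=6):
    """All strings over the alphabet, up to max_len, matched by the pattern."""
    matched = set()
    for n in range(max_len + 1):
        for symbols in product(alphabet, repeat=n):
            s = "".join(symbols)
            if re.fullmatch(pattern, s):
                matched.add(s)
    return matched

# Rule 7 with P = a, Q = b: (P+Q)* = (P*Q*)* = (P*+Q*)*
rule7 = language(r"(a|b)*") == language(r"(a*b*)*") == language(r"(a*|b*)*")

# Rule 8 with R = ab: RR* = R+
rule8 = language(r"(ab)(ab)*") == language(r"(ab)+")

# Rule 5 with R = ab: (R*)* = R*
rule5 = language(r"((ab)*)*") == language(r"(ab)*")

print(rule7, rule8, rule5)
```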



Transition Diagrams


A language can be recognized using a diagrammatic representation
called a Transition Diagram.

The Transition Diagram is made up of a set of states and transitions
from one state to another.

The Transition Diagram(TD) has:

States : Represented by Circles
Transitions(actions) : Represented by Arrows between states
Start State : Beginning of a pattern (Arrowhead)
Final State(s) : End of pattern (Concentric Circles)

ε-transition
Each edge in a TD is labeled with the input character scanned by the machine.

The ε-transition in a TD is used to move from one state to the next state
without reading any input character.
[Figure: state A with an ε-labeled arrow to state B]



Regular Expression → Transition Diagram

1) a+b
2) ab
3) a*
4) (a+b)*
5) (ab)*

[Figure: one transition diagram per expression, over states q0, q1, q2 with edges labeled a, b, or a/b]

1) a⁺ = (aa*)
2) a*|b*
3) a*b*
4) a(a+b)*

[Figure: one transition diagram per expression, over states q0, q1, q2 with edges labeled a, b, or a/b]




Finite Automata
The generalized transition diagram is called a finite automaton.
Formally, a finite automaton is defined as a five-tuple

M = (Q, Σ, δ, q0, F)

where
Q    finite set of states
Σ    set of input symbols (an alphabet)
δ    transition function: specifies, for a given state and input
     symbol, where the transition goes. It maps
     δ(p, a) = q
     where p, q are states and
     a is an input symbol
q0   initial state
F    set of final states, F ⊆ Q

Finite automata are classified into two kinds:

Finite Automata
    Finite Automata without output: NFA, DFA
    Finite Automata with output: Mealy, Moore
Finite Automaton without output
(Language Recognizers)

[Figure: an input string is fed to a finite automaton (NFA or DFA), whose output is "Accept" or "Reject"]
Finite Automata : A recognizer that takes an input string and
determines whether it is a valid sentence of the
language.
Non-Deterministic : Has more than one alternative action for the
same input symbol.
Non-Deterministic Finite Automata (NFAs)
easily represent regular expressions, but are
somewhat less precise.


Deterministic : Has at most one action for a given input
symbol. Deterministic Finite Automata (DFAs)
require more complexity to represent regular
expressions, but offer more precision.

Both types are used to recognize the languages described by regular expressions.
Non-Deterministic Finite
Automata
An NFA is a mathematical model that consists of:
N = (Q, Σ, δ, q0, F)
Q, a set of states
Σ, the symbols of the input alphabet
δ, a transition function:
    δ(state, symbol) → set of states
    δ : Q × (Σ ∪ {ε}) → power(Q) (2^Q)
A state q0 ∈ Q, the start state
F ⊆ Q, a set of final or accepting states.
Representing NFAs
Transition Diagrams: number of states (circles), arcs, final states, ...
Transition Tables: more suitable representation within a computer
NFA - Example
Given the regular expression: (a|b)*abb

[Figure: NFA with start state 0 and final state 3; state 0 loops on a and b, 0 → 1 on a, 1 → 2 on b, 2 → 3 on b]

Q = {0, 1, 2, 3}
q0 = 0
F = {3}
Σ = {a, b}

EXAMPLE:
Input: ababb
δ(0, a) = 0
δ(0, b) = 0
δ(0, a) = 1
δ(1, b) = 2
δ(2, b) = 3
ACCEPT!
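
The trace follows one accepting path, but an NFA may have several choices at each step. A common way to simulate it is to track the set of all states it could be in. The sketch below is one possible encoding under that assumption; the dictionary `NFA_DELTA` is reconstructed from the transitions used in the trace, and the function name `nfa_accepts` is illustrative.

```python
# Transition function of the NFA for (a|b)*abb, as a map to sets of states.
NFA_DELTA = {
    (0, "a"): {0, 1},   # stay in 0, or guess that the final "abb" starts here
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "b"): {3},
}
START = 0
FINAL = {3}

def nfa_accepts(word):
    """Return True if some path labeled by word leads from START to a final state."""
    current = {START}
    for symbol in word:
        following = set()
        for state in current:
            # Missing entries mean "no transition", i.e. the empty set.
            following |= NFA_DELTA.get((state, symbol), set())
        current = following
    return bool(current & FINAL)

print(nfa_accepts("ababb"))  # the input from the example
print(nfa_accepts("abab"))   # does not end in abb
```

Tracking a set of states avoids backtracking: each input symbol is read once, whatever the number of alternative paths.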
NFA - Example
Given the regular expression: (a (b*c)) | (a (b | c⁺))

[Figure: two NFAs over states 1-7, one for a (b*c) and one for a (b | c⁺); edges labeled a, b, and c]
(a (b*c)) | (a (b | c⁺))

[Figure: combined NFA with a new start state 0 and ε-transitions into the NFAs for a (b*c) and a (b | c⁺), states 1-7]

Input: abbc
       abcc
Transition table:
A tabular representation used to represent a finite automaton is
called a transition table. In a transition table there is a row for each state and
a column for each input symbol.

The entry for row i and symbol a in the table is the set of states that can be
reached by a transition from state i on input a.

[Figure: NFA with states q0 and q1; edges labeled 0/1, 0, and 1]

Transition Table
states      Input symbol
            0           1
q0          {q0, q1}    q0
q1          ∅           q1
An NFA accepts an input string x if and only if there is some path in
the transition diagram, labeled by x, from the initial state to a final state.

[Figure: NFA with states q0, q1, q2; edges labeled 0/1, 0, and 1]

String: 001001
[Figure: tree of the state sequences the NFA can follow on 001001, over states q0, q1, q2]
accepted
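Thompson's construction of an NFA from
a Regular Expression

1. The NFA for ε
[Figure: start → i, with an ε-labeled edge from i to f]

2. NFA for an input symbol a
[Figure: start → i, with an a-labeled edge from i to f]

3. NFA for R1R2, where R1, R2 are two regular expressions
[Figure: start → i, the NFA for R1 followed by the NFA for R2, ending in f]

4. NFA for R1|R2, where R1, R2 are two regular expressions
[Figure: i has ε-edges into the NFAs for R1 and R2, each of which has an ε-edge to f]

5. NFA for R*
[Figure: start → i, the NFA for R between i and f, with four ε-edges]

6. NFA for (R1|R2)*
[Figure: the NFAs for R1 and R2 between i and f, joined by ε-edges]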
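The NFA for (a|b)*abb using Thompson's construction

1. r1 = a
[Figure: 2 → 3 on a]
2. r2 = b
[Figure: 4 → 5 on b]
3. r3 = r1|r2, i.e. a|b
[Figure: states 1-6; the NFAs for r1 and r2 joined by four ε-edges]
4. r4 = (r3)*, i.e. (a|b)*
[Figure: states 0-7, start state 0; the NFA for r3 with additional ε-edges for the closure]
5. r5 = abb
[Figure: 7 → 8 on a, 8 → 9 on b, 9 → 10 on b]
6. r6 = r4r5, i.e. (a|b)*abb
[Figure: states 0-10, start state 0; the NFA for r4 followed by the NFA for r5]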
Regular Expression: (a|b)*abb
NFA without ε moves
[Figure: NFA with states 0-3, start state 0; edges labeled a and b]
NFA with ε moves
[Figure: NFA with states 0-10, start state 0; edges labeled ε, a, and b]
Deterministic Finite
Automata
A DFA is defined as a five-tuple with the following properties:

D = (Q, Σ, δ, q0, F)

where the transition function δ defines a mapping Q × Σ → Q,

i.e. in a DFA

i) No state has an ε-transition
ii) There exists only one transition from a state on the same input symbol





A DFA specified as:
Q = {q0, q1}
Σ = {0, 1}
Start state = q0
F = {q1}
δ(q0, 0) = q1
δ(q0, 1) = q0
δ(q1, 0) = q1
δ(q1, 1) = q0

[Figure: transition diagram with states q0 and q1 and edges labeled 0 and 1]
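
Because a DFA has exactly one next state for each state and input symbol, simulating it is a single table lookup per symbol. The following is a minimal sketch of the two-state DFA given here; the dictionary layout and the function name `dfa_accepts` are assumptions made for illustration.

```python
# Transition function of the example DFA: (state, symbol) -> next state.
DELTA = {
    ("q0", "0"): "q1",
    ("q0", "1"): "q0",
    ("q1", "0"): "q1",
    ("q1", "1"): "q0",
}
START = "q0"
FINAL = {"q1"}

def dfa_accepts(word):
    """Run the DFA on word and report whether it stops in a final state."""
    state = START
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state in FINAL

for word in ["0", "10", "01", ""]:
    print(repr(word), dfa_accepts(word))
```

Reading the table, the machine is in q1 exactly when the last symbol read was 0, so it accepts the binary strings that end in 0.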
Example - DFA
Regular Expression: (a|b)*abb
DFA:
[Figure: DFA with states 0-3, start state 0; edges labeled a and b]
Conversion of an NFA (without ε-moves) into an equivalent DFA
For every NFA there exists an equivalent DFA.
Consider an NFA for the regular expression (a|b)*abb

[Figure: NFA with start state q0 and states q1, q2, q3; edges labeled a and b]

Q = {q0, q1, q2, q3}
Σ = {a, b}
Start state = q0
F = {q3}

Transition Table
        a           b
q0      {q0, q1}    q0
q1      ∅           q2
q2      ∅           q3
q3      ∅           ∅
The DFA transition table
            a           b
q0          {q0, q1}    q0
q1          ∅           q2
q2          ∅           q3
q3          ∅           ∅
{q0, q1}    {q0, q1}    {q0, q2}
{q0, q2}    {q0, q1}    {q0, q3}
{q0, q3}    {q0, q1}    q0

DFA Transition Diagram
[Figure: DFA over states q0, {q0,q1}, {q0,q2}, {q0,q3}, with edges labeled a and b]
The states q1, q2, q3 cannot be reached from the start state q0, so this part is eliminated.
NFA (with ε moves) to DFA

ε-closure(s): set of NFA states reachable from NFA state s on ε-transitions
alone.

move(T, a): set of NFA states to which there is a transition on input symbol a from
some NFA state s in T.

[Figure: NFA fragment with states 1-6; edges labeled ε, a, and b]

ε-closure(1) : {1, 2, 4}
ε-closure(3) : {3, 6}
ε-closure({3, 5}) : {3, 5, 6}

move(2, a) : {3}
move({1, 2, 3}, a) : {3}
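
Both operations are short graph computations. The sketch below assumes the six-state fragment has ε-edges 1 → 2, 1 → 4, 3 → 6, 5 → 6 and symbol edges 2 → 3 on a, 4 → 5 on b; that edge list is a reconstruction chosen to agree with the closures and moves listed, and the names `EPS`, `SYM`, `eps_closure` and `move` are illustrative.

```python
# Assumed edges of the six-state NFA fragment for a|b.
EPS = {1: {2, 4}, 3: {6}, 5: {6}}          # ε-transitions
SYM = {(2, "a"): {3}, (4, "b"): {5}}       # transitions on input symbols

def eps_closure(states):
    """States reachable from the given set using ε-transitions alone."""
    closure = set(states)
    stack = list(states)
    while stack:
        s = stack.pop()
        for t in EPS.get(s, ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure

def move(states, symbol):
    """States reachable from some state in the set by one transition on symbol."""
    reached = set()
    for s in states:
        reached |= SYM.get((s, symbol), set())
    return reached

print(sorted(eps_closure({1})))
print(sorted(eps_closure({3, 5})))
print(sorted(move({1, 2, 3}, "a")))
```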
NFA (with ε moves) to DFA
(subset construction)
Start with NFA: R.E.: (a | b)*abb
[Figure: NFA with states 0-10, start state 0; edges labeled ε, a, and b]

First we calculate: ε-closure(0) (i.e., state 0)
ε-closure(0) = {0, 1, 2, 4, 7} (all states reachable from 0
on ε-moves)
Let A = {0, 1, 2, 4, 7} be a state of the new DFA, D.
2nd, we calculate: a : ε-closure(move(A,a)) and
b : ε-closure(move(A,b))

a : ε-closure(move(A,a)) = ε-closure(move({0,1,2,4,7},a))
adds {3,8} (since move(2,a)=3 and move(7,a)=8)

From this we have: ε-closure({3,8}) = {1,2,3,4,6,7,8}
(since 3→6→1→4, 6→7, and 1→2 all by ε-moves)

Let B = {1,2,3,4,6,7,8} be a new state. Define Dtran[A,a] = B.

b : ε-closure(move(A,b)) = ε-closure(move({0,1,2,4,7},b))
adds {5} (since move(4,b)=5)

From this we have: ε-closure({5}) = {1,2,4,5,6,7}
(since 5→6→1→4, 6→7, and 1→2 all by ε-moves)

Let C = {1,2,4,5,6,7} be a new state. Define Dtran[A,b] = C.
3rd, we calculate for state B on {a,b}

a : ε-closure(move(B,a)) = ε-closure(move({1,2,3,4,6,7,8},a))
= {1,2,3,4,6,7,8} = B
Define Dtran[B,a] = B.

b : ε-closure(move(B,b)) = ε-closure(move({1,2,3,4,6,7,8},b))
= {1,2,4,5,6,7,9} = D
Define Dtran[B,b] = D.

4th, we calculate for state C on {a,b}

a : ε-closure(move(C,a)) = ε-closure(move({1,2,4,5,6,7},a))
= {1,2,3,4,6,7,8} = B
Define Dtran[C,a] = B.

b : ε-closure(move(C,b)) = ε-closure(move({1,2,4,5,6,7},b))
= {1,2,4,5,6,7} = C
Define Dtran[C,b] = C.

5th, we calculate for state D on {a,b}

a : ε-closure(move(D,a)) = ε-closure(move({1,2,4,5,6,7,9},a))
= {1,2,3,4,6,7,8} = B
Define Dtran[D,a] = B.

b : ε-closure(move(D,b)) = ε-closure(move({1,2,4,5,6,7,9},b))
= {1,2,4,5,6,7,10} = E
Define Dtran[D,b] = E.

Finally, we calculate for state E on {a,b}

a : ε-closure(move(E,a)) = ε-closure(move({1,2,4,5,6,7,10},a))
= {1,2,3,4,6,7,8} = B
Define Dtran[E,a] = B.

b : ε-closure(move(E,b)) = ε-closure(move({1,2,4,5,6,7,10},b))
= {1,2,4,5,6,7} = C
Define Dtran[E,b] = C.

This gives the transition table Dtran for the DFA:

Dstates     Input Symbol
            a       b
A           B       C
B           B       D
C           B       C
D           B       E
E           B       C

[Figure: DFA with states A, B, C, D, E, start state A; edges labeled a and b]
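
The hand calculation can be reproduced mechanically. The sketch below is one possible implementation of the subset construction, assuming the 11-state NFA for (a|b)*abb has the ε-edges and symbol edges listed in `EPS` and `SYM`; that edge list is reconstructed from the ε-closures and moves computed in the walkthrough, and all function names are hypothetical.

```python
from string import ascii_uppercase

# Assumed edges of the NFA for (a|b)*abb, states 0..10.
EPS = {0: {1, 7}, 1: {2, 4}, 3: {6}, 5: {6}, 6: {1, 7}}
SYM = {(2, "a"): {3}, (4, "b"): {5}, (7, "a"): {8}, (8, "b"): {9}, (9, "b"): {10}}
ALPHABET = "ab"

def eps_closure(states):
    """States reachable from the given set on ε-transitions alone."""
    closure, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for t in EPS.get(s, ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)

def move(states, symbol):
    """States reachable by one transition on symbol from some state in the set."""
    reached = set()
    for s in states:
        reached |= SYM.get((s, symbol), set())
    return reached

def subset_construction(start=0):
    """Return the DFA states (in discovery order) and Dtran as index pairs."""
    dstates = [eps_closure({start})]
    dtran = {}
    i = 0
    while i < len(dstates):            # dstates grows while we scan it
        for symbol in ALPHABET:
            target = eps_closure(move(dstates[i], symbol))
            if target not in dstates:
                dstates.append(target)
            dtran[(i, symbol)] = dstates.index(target)
        i += 1
    return dstates, dtran

def named_table():
    """Dtran with states named A, B, C, ... in discovery order."""
    _, dtran = subset_construction()
    return {(ascii_uppercase[i], sym): ascii_uppercase[j]
            for (i, sym), j in dtran.items()}

for (state, symbol), target in sorted(named_table().items()):
    print(state, symbol, "->", target)
```

Naming the states in discovery order happens to give the same letters A to E as the walkthrough, because the sketch processes input a before input b for each state.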
Minimizing the number of states in a
DFA

1. All states in the DFA are divided into two groups:
   Final states
   Non-final states

2. If, for every input symbol a, two states s and t in a group G have
transitions on a to the same group, then s and t stay in the
same group.
Otherwise, divide G and put s and t into different groups.

3. Repeat the division until there are no changes to the grouping.

DFA Minimization
(a|b)*abb
A, C, and E have the same transitions, so they could be merged.
But E is a final state and A and C are not.
Merge A and C.
DFA after minimizing the states

Dstates     Input Symbol
            a       b
A           B       A
B           B       D
D           B       E
E           B       A

[Figure: minimized DFA with states A, B, D, E, start state A; edges labeled a and b]
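
The grouping procedure can be sketched as repeated partition refinement. The code below is an illustrative implementation, assuming the five-state Dtran table for (a|b)*abb with E as the only final state; `minimize` is a hypothetical helper name, and this is the simple refinement loop rather than an optimized algorithm.

```python
# Dtran of the DFA for (a|b)*abb before minimization.
DTRAN = {
    "A": {"a": "B", "b": "C"},
    "B": {"a": "B", "b": "D"},
    "C": {"a": "B", "b": "C"},
    "D": {"a": "B", "b": "E"},
    "E": {"a": "B", "b": "C"},
}
FINALS = {"E"}

def minimize(dtran, finals):
    """Split states into groups until no group can be divided further."""
    states = set(dtran)
    # Step 1: final and non-final states start in different groups.
    partition = [g for g in (states - finals, states & finals) if g]
    while True:
        group_index = {s: i for i, g in enumerate(partition) for s in g}
        refined = []
        for group in partition:
            # Step 2: states stay together only if, on every symbol,
            # they move to the same group.
            buckets = {}
            for s in sorted(group):
                key = tuple(group_index[dtran[s][sym]] for sym in sorted(dtran[s]))
                buckets.setdefault(key, set()).add(s)
            refined.extend(buckets.values())
        # Step 3: stop when a pass produces no new split.
        if len(refined) == len(partition):
            return refined
        partition = refined

groups = minimize(DTRAN, FINALS)
print(sorted(sorted(g) for g in groups))
```

Each resulting group becomes one state of the minimized DFA; the group containing A and C corresponds to the merged state A in the table.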
Regular expression from
finite automata

[Figure: automaton with states q0 and q1; edges labeled 0 and 1]

To calculate the regular expression for this automaton, we apply the following procedure:
q0 = ε + q0 1 + q1 1        ...(1)
q1 = q0 0 + q1 0            ...(2)
Apply Arden's theorem to (2):
q1 = q0 0 0*
q1 = q0 0⁺                  ...(3)
Use (3) in (1):
q0 = ε + q0 1 + q0 0⁺ 1
Apply Arden's theorem:
q0 = (1 + 0⁺1)*             ...(4)
q1 = (1 + 0⁺1)* 0 0*

Arden's theorem:
If an equation has the form
R = Q + RP, where P and Q are two
regular expressions and P doesn't contain ε,
then
R = QP*
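
The two results can be sanity-checked against the automaton itself. The sketch below assumes the transitions implied by equations (1) and (2): q0 goes to q0 on 1 and to q1 on 0, and q1 goes to q1 on 0 and to q0 on 1. It uses Python `re` syntax, where `|` is union and `0+` means one or more 0s; the names and the length bound are illustrative choices.

```python
import re
from itertools import product

# Transitions read off equations (1) and (2) (an assumption of this sketch).
DELTA = {
    ("q0", "1"): "q0",
    ("q0", "0"): "q1",
    ("q1", "0"): "q1",
    ("q1", "1"): "q0",
}

R_Q0 = r"(1|0+1)*"      # expression (4): strings that lead to q0
R_Q1 = r"(1|0+1)*00*"   # strings that lead to q1

def state_after(word, start="q0"):
    """State the automaton is in after reading word."""
    state = start
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state

def mismatches(max_len=8):
    """Words where an expression disagrees with the state actually reached."""
    bad = []
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            word = "".join(bits)
            reached = state_after(word)
            if bool(re.fullmatch(R_Q0, word)) != (reached == "q0"):
                bad.append(word)
            if bool(re.fullmatch(R_Q1, word)) != (reached == "q1"):
                bad.append(word)
    return bad

print(mismatches())
```

An empty list means both expressions agree with the automaton on every word up to the chosen length.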


Finite automata with output
Mealy Machine: the output is associated with the transition (state and input symbol).
Moore Machine: the output is associated with the state.

Both machines are defined by a six-tuple (Q, Σ, Δ, δ, λ, q0)
where
Q    set of states
Σ    input symbols
Δ    output symbols
δ    transition function Q × Σ → Q
λ    output function (Q → Δ in Moore and Q × Σ → Δ
     in Mealy)
q0   initial state
Both machines are deterministic in nature.
There are no final states.
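Construction of Mealy and Moore machines to generate
an output that is the same as the binary input.

Mealy Machine
[Figure: single state q0 with a self-loop labeled 0/0, 1/1]

        Input 0         Input 1
        NS     O/P      NS     O/P
q0      q0     0        q0     1

Moore Machine
[Figure: states q0/0 and q1/1; edges labeled 0 and 1]

        O/P     NS on 0     NS on 1
q0      0       q0          q1
q1      1       q0          q1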
Design a Mealy machine which will increment the value of a given binary
number by 1.
Design a Mealy machine to obtain the 2's complement of a given binary
number.
Design a Moore machine to determine the value mod 3 of each binary string
treated as a binary integer.
In a Moore machine, if an input of length n is received, an output of n+1 bits is produced.
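
The difference in output length is easiest to see by running both identity machines on the same input. This is a minimal sketch, assuming the one-state Mealy table and the two-state Moore table for "output same as input"; the function names are hypothetical.

```python
def mealy_identity(bits):
    """Mealy machine: one output symbol per transition."""
    # (state, input) -> (next state, output)
    table = {("q0", "0"): ("q0", "0"), ("q0", "1"): ("q0", "1")}
    state, out = "q0", []
    for b in bits:
        state, symbol = table[(state, b)]
        out.append(symbol)
    return "".join(out)

def moore_identity(bits):
    """Moore machine: one output symbol per state visited, including the start state."""
    next_state = {("q0", "0"): "q0", ("q0", "1"): "q1",
                  ("q1", "0"): "q0", ("q1", "1"): "q1"}
    output = {"q0": "0", "q1": "1"}
    state = "q0"
    out = [output[state]]          # output of the initial state comes first
    for b in bits:
        state = next_state[(state, b)]
        out.append(output[state])
    return "".join(out)

print(mealy_identity("1011"))
print(moore_identity("1011"))
```

The Moore output starts with the output of the initial state, which accounts for the extra symbol; the remaining n symbols echo the input.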
Assignment - 1
1. Define the following
(a) Alphabet (b) String (c) Language
2. Explain in detail the closure properties and identity rules of regular sets.
3. Write the regular expressions for the following languages
(a) All strings of lowercase letters that contain the five vowels in order.
(b) All strings of lowercase letters in which the letters are in ascending lexicographic
order.
4. Construct NFA with ε-moves for the following regular expressions.
(i) (11+0)*(00+1)*
(ii) 10 + (0+11)0*1
(iii) (a + b)*(aa+bb)(a + b)*
5. Construct Minimum state DFA for the following regular languages.
(i) (0+1)*1(0+1)*
(ii) Let L be the set of all binary strings whose last two symbols are the same.
6. Obtain the regular expression for the following Finite Automata.
[Figure: finite automaton]
7. Construct Moore and Mealy machines that accept all binary strings as input
and produce output y if the string ends with two consecutive symbols of the same
type, n otherwise.
8. Prove (a*|b*)* and ((ε|a)b*)* are equal.
