Beruflich Dokumente
Kultur Dokumente
Regular Expressions
Deepak DSouza
Department of Computer Science and Automation
Indian Institute of Science, Bangalore.
18 August 2014
Regular Expressions Kleenes Theorem Equation-based alternate construction
Outline
1
Regular Expressions
2
Kleenes Theorem
3
Equation-based alternate construction
Regular Expressions Kleenes Theorem Equation-based alternate construction
Examples of Regular Expressions
Expressions built from a, b, , using operators +, , and .
(a
+ b
) c
Strings of only as or only bs, followed by a c.
(a + b)
abb(a + b)
b(a + b)(a + b)
3rd last letter is a b.
(b
ab
a)
( + 0 + 1 + + 111)
Regular Expressions Kleenes Theorem Equation-based alternate construction
Examples of Regular Expressions
Expressions built from a, b, , using operators +, , and .
(a
+ b
) c
Strings of only as or only bs, followed by a c.
(a + b)
abb(a + b)
b(a + b)(a + b)
3rd last letter is a b.
(b
ab
a)
( + 0 + 1 + + 111)
Regular Expressions Kleenes Theorem Equation-based alternate construction
Examples of Regular Expressions
Expressions built from a, b, , using operators +, , and .
(a
+ b
) c
Strings of only as or only bs, followed by a c.
(a + b)
abb(a + b)
b(a + b)(a + b)
3rd last letter is a b.
(b
ab
a)
( + 0 + 1 + + 111)
Regular Expressions Kleenes Theorem Equation-based alternate construction
Examples of Regular Expressions
Expressions built from a, b, , using operators +, , and .
(a
+ b
) c
Strings of only as or only bs, followed by a c.
(a + b)
abb(a + b)
b(a + b)(a + b)
3rd last letter is a b.
(b
ab
a)
( + 0 + 1 + + 111)
Regular Expressions Kleenes Theorem Equation-based alternate construction
Formal denitions
Syntax of regular expresions over an alphabet A:
r ::= | a | r + r | r r | r
where a A.
Semantics: associate a language L(r ) A
with regexp r .
L() = {}
L(a) = {a}
L(r + r
) = L(r ) L(r
)
L(r r
) = L(r ) L(r
)
L(r
) = L(r )
.
Question: Do we need in syntax?
No.
.
Regular Expressions Kleenes Theorem Equation-based alternate construction
Formal denitions
Syntax of regular expresions over an alphabet A:
r ::= | a | r + r | r r | r
where a A.
Semantics: associate a language L(r ) A
with regexp r .
L() = {}
L(a) = {a}
L(r + r
) = L(r ) L(r
)
L(r r
) = L(r ) L(r
)
L(r
) = L(r )
.
Question: Do we need in syntax?
No.
.
Regular Expressions Kleenes Theorem Equation-based alternate construction
Formal denitions
Syntax of regular expresions over an alphabet A:
r ::= | a | r + r | r r | r
where a A.
Semantics: associate a language L(r ) A
with regexp r .
L() = {}
L(a) = {a}
L(r + r
) = L(r ) L(r
)
L(r r
) = L(r ) L(r
)
L(r
) = L(r )
.
Question: Do we need in syntax?
No.
.
Regular Expressions Kleenes Theorem Equation-based alternate construction
Example: Semantics of regexp
(a
+ b
) c
+
{, a, b, aa, bb, . . .}
a b
{a} {b}
c
{c}
{, a, aa, . . .} {, b, bb, . . .}
{c, ac, bc, aac, bbc, . . .}
Regular Expressions Kleenes Theorem Equation-based alternate construction
Example: Semantics of regexp
(a
+ b
) c
+
{, a, b, aa, bb, . . .}
+ b
) c
+
{, a, b, aa, bb, . . .}
+ b
) c
+ {, a, b, aa, bb, . . .}
+ b
) c
+ {, a, b, aa, bb, . . .}
|
(p, w) = q}.
Then L(A) =
f F
L
sf
.
For X Q, dene L
X
pq
= {w A
|
(p, w) =
q via a path that stays in X except for rst and last states}
X
p q
Then L(A) =
f F
L
Q
sf
.
Regular Expressions Kleenes Theorem Equation-based alternate construction
DFA RE: Kleenes construction
p q
r
X
Advantage:
L
X{r }
pq
= L
X
pq
+ L
X
pr
(L
X
rr
)
L
X
rq
.
Regular Expressions Kleenes Theorem Equation-based alternate construction
DFA RE: Kleenes construction (2)
Method:
Begin with L
Q
sf
for each f F.
Simplify by using terms with strictly smaller Xs:
L
X{r }
pq
= L
X
pq
+ L
X
pr
(L
X
rr
)
L
X
rq
.
For base terms, observe that
L
{}
pq
=
{a | (p, a) = q} if p = q
{a | (p, a) = q} {} if p = q.
Exercise: convert NFA/DFAs below to REs:
s
b
a, b
f
b
a
b
a
e o
Regular Expressions Kleenes Theorem Equation-based alternate construction
DFA RE: Kleenes construction (2)
Method:
Begin with L
Q
sf
for each f F.
Simplify by using terms with strictly smaller Xs:
L
X{r }
pq
= L
X
pq
+ L
X
pr
(L
X
rr
)
L
X
rq
.
For base terms, observe that
L
{}
pq
=
{a | (p, a) = q} if p = q
{a | (p, a) = q} {} if p = q.
Exercise: convert NFA/DFAs below to REs:
s
b
a, b
f
b
a
b
a
e o
Regular Expressions Kleenes Theorem Equation-based alternate construction
DFA RE using system of equations
Aim: to construct a regexp for
L
q
= {w A
|
(q, w) F}.
Note that L(A) = L
s
.
Example:
b
a
b
a
e o
Set up equations to capture L
q
s:
x
e
= b x
e
+ a x
o
x
o
= a x
e
+ b x
o
+ .
Solution is a RE for each x, such that languages denoted by
LHS and RHS REs coincide.
Regular Expressions Kleenes Theorem Equation-based alternate construction
DFA RE using system of equations
Aim: to construct a regexp for
L
q
= {w A
|
(q, w) F}.
Note that L(A) = L
s
.
Example:
b
a
b
a
e o
Set up equations to capture L
q
s:
x
e
= b x
e
+ a x
o
x
o
= a x
e
+ b x
o
+ .
Solution is a RE for each x, such that languages denoted by
LHS and RHS REs coincide.
Regular Expressions Kleenes Theorem Equation-based alternate construction
Solutions to a system of equations
L
q
s are a solution to the system of equations
In general there could be many solutions to equations.
Consider x = A
x
e
x
o
b a
a b
x
e
x
o
a b
c d
(a + bd
c)
(a + bd
c)
bd
(d + ca
b)
ca
(d + ca
b)