REToC

Step-by-step Conversion of Regular Expressions to C Code
On the regular expression:
((a ⋅ b) | c)*
THOMPSON’S CONSTRUCTION
Convert the regular expression to an NFA.
Step 1: construct NFA for r1 = a.
((a ⋅ b) | c)*, where r1 is the subexpression a
r1: 1 --a--> 2
Step 2: construct NFA for r2 = b.
((a ⋅ b) | c)*, where r1 is the subexpression a and r2 is the subexpression b
r1: 1 --a--> 2
r2: 3 --b--> 4
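Step 3: construct NFA for r3 = r1 ⋅ r2.
((a ⋅ b) | c)*, where r3 is the subexpression (a ⋅ b)
r3: 1 --a--> 2 --b--> 4
The accepting state of r1 (2) and the start state of r2 (3) are merged into state 2.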
Step 4: construct NFA for r4 = c.
((a ⋅ b) | c)*, where r3 is the subexpression (a ⋅ b) and r4 is the subexpression c
r3: 1 --a--> 2 --b--> 4
r4: 5 --c--> 6
Step 5: construct NFA for r5 = r3 | r4.
((a ⋅ b) | c)*, where r5 is the subexpression ((a ⋅ b) | c)
r5:
7 --𝜀--> 1 --a--> 2 --b--> 4 --𝜀--> 8
7 --𝜀--> 5 --c--> 6 --𝜀--> 8
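Step 6: construct NFA for r5*.
r5*, with start state 9 and accepting state 10:
9 --𝜀--> 7 --𝜀--> 1 --a--> 2 --b--> 4 --𝜀--> 8 --𝜀--> 10
7 --𝜀--> 5 --c--> 6 --𝜀--> 8
8 --𝜀--> 7 (repeat r5)
9 --𝜀--> 10 (match the empty string)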
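Steps 1 to 6 can be sketched in C as one small helper per construction rule. This is a minimal sketch and not code from the slides: the names `Frag`, `sym`, `cat`, `alt`, `star` and `build_example` are hypothetical, and the edge list is a fixed-size array chosen only for brevity. Concatenation merges states as in Step 3, so the state numbers come out the same as in the diagrams (state 3 is left unused).

```c
#include <stdio.h>

#define MAX_EDGES 64
#define EPS '\0'                 /* label used for epsilon edges */

typedef struct { int from, to; char label; } Edge;
typedef struct { int start, accept; } Frag;    /* one NFA fragment */

static Edge edges[MAX_EDGES];    /* assumed large enough for this example */
static int n_edges = 0;
static int n_states = 0;

int new_state(void) { return ++n_states; }

void add_edge(int from, int to, char label) {
    edges[n_edges].from = from;
    edges[n_edges].to = to;
    edges[n_edges].label = label;
    n_edges++;
}

/* Steps 1, 2 and 4: a single symbol becomes start --c--> accept. */
Frag sym(char c) {
    Frag f;
    f.start = new_state();
    f.accept = new_state();
    add_edge(f.start, f.accept, c);
    return f;
}

/* Step 3: concatenation merges b's start state into a's accept state.
   Fragments built here never have edges entering their start state,
   so only outgoing edges need to be redirected. */
Frag cat(Frag a, Frag b) {
    Frag f;
    for (int i = 0; i < n_edges; i++)
        if (edges[i].from == b.start) edges[i].from = a.accept;
    f.start = a.start;
    f.accept = b.accept;
    return f;
}

/* Step 5: alternation adds a new start and accept joined by epsilon edges. */
Frag alt(Frag a, Frag b) {
    Frag f;
    f.start = new_state();
    f.accept = new_state();
    add_edge(f.start, a.start, EPS);
    add_edge(f.start, b.start, EPS);
    add_edge(a.accept, f.accept, EPS);
    add_edge(b.accept, f.accept, EPS);
    return f;
}

/* Step 6: star adds a new start and accept, a loop back, and a bypass. */
Frag star(Frag a) {
    Frag f;
    f.start = new_state();
    f.accept = new_state();
    add_edge(f.start, a.start, EPS);
    add_edge(f.start, f.accept, EPS);
    add_edge(a.accept, a.start, EPS);
    add_edge(a.accept, f.accept, EPS);
    return f;
}

/* Builds ((a . b) | c)* in the same order as Steps 1 to 6. */
Frag build_example(void) {
    Frag r1 = sym('a');
    Frag r2 = sym('b');
    Frag r3 = cat(r1, r2);
    Frag r4 = sym('c');
    Frag r5 = alt(r3, r4);
    return star(r5);
}

/* Prints every edge, one per line. */
void print_nfa(void) {
    for (int i = 0; i < n_edges; i++) {
        if (edges[i].label == EPS)
            printf("%d --eps--> %d\n", edges[i].from, edges[i].to);
        else
            printf("%d --%c--> %d\n", edges[i].from, edges[i].label, edges[i].to);
    }
}
```

Calling `build_example()` and then `print_nfa()` lists the three symbol edges and eight 𝜀 edges of the final NFA. The returned fragment has start state 9 and accepting state 10.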
SUBSET CONSTRUCTION
Convert the NFA to a DFA.
Draw transition table for DFA
[Figure: the NFA from Step 6, shown next to an empty transition table.]
The table has one row per DFA state, with these columns:
NFA States (Dstates)   DFA State   Next State on a   b   c
Add 𝜀-closure(9) as DFA start state
𝜀-closure(9) = {9,7,1,5,10}. This set is added to Dstates as DFA state A, the start state.
NFA States (Dstates)   DFA State   a   b   c
{9,7,1,5,10}           A
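The 𝜀-closure computation can be sketched in C as a fixed-point loop over the 𝜀 edges. This is only an illustration under stated assumptions: the `StateSet` bitmask type, the `BIT` macro and the `eps_edges` array are hypothetical names, and the edge list is read off the Step 6 NFA.

```c
#include <stdint.h>

#define N_EPS 8

/* Epsilon edges of the NFA from Step 6, as (from, to) pairs. */
static const int eps_edges[N_EPS][2] = {
    {9, 7}, {9, 10}, {7, 1}, {7, 5},
    {4, 8}, {6, 8},  {8, 7}, {8, 10}
};

/* A set of NFA states is a bitmask: bit s is set when state s is a member. */
typedef uint32_t StateSet;

#define BIT(s) ((StateSet)1u << (s))

/* Returns every state reachable from `set` using epsilon edges only. */
StateSet eps_closure(StateSet set) {
    StateSet closure = set;
    int changed = 1;
    while (changed) {                 /* repeat until no new state is added */
        changed = 0;
        for (int i = 0; i < N_EPS; i++) {
            StateSet from = BIT(eps_edges[i][0]);
            StateSet to = BIT(eps_edges[i][1]);
            if ((closure & from) && !(closure & to)) {
                closure |= to;
                changed = 1;
            }
        }
    }
    return closure;
}
```

With this encoding, `eps_closure(BIT(9))` has exactly the bits for states 9, 7, 1, 5 and 10 set, which is the set labelled A.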
Subset construction: algorithm
while (there is an unmarked state T in Dstates) {
    mark T;
    for (each input symbol a) {
        U = 𝜀-closure(move(T, a));
        Dtran[T, a] = U;
        if (U is not in Dstates)
            add U as unmarked state to Dstates;
    }
}
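The algorithm above might be implemented in C roughly as follows. This is a sketch under several assumptions that are not in the slides: NFA state sets are stored as bitmasks, the names `StateSet`, `nfa_move`, `find_or_add` and `subset_construction` are hypothetical, and an empty result is recorded as -1 (the "-" entries of the table) instead of being added as a dead state. Because states are appended to `dstates` in discovery order, "unmarked" simply means "index not yet visited by the outer loop".

```c
#include <stdint.h>

typedef uint32_t StateSet;             /* bit s set => NFA state s is in the set */
#define BIT(s) ((StateSet)1u << (s))

#define N_EPS 8
#define N_SYM_EDGES 3
#define N_SYMBOLS 3
#define MAX_DSTATES 16                 /* assumed large enough for this NFA */

typedef struct { int from; char label; int to; } SymEdge;

/* The NFA from Step 6: epsilon edges and labelled edges. */
static const int eps_edges[N_EPS][2] = {
    {9, 7}, {9, 10}, {7, 1}, {7, 5},
    {4, 8}, {6, 8},  {8, 7}, {8, 10}
};
static const SymEdge sym_edges[N_SYM_EDGES] = {
    {1, 'a', 2}, {2, 'b', 4}, {5, 'c', 6}
};
static const char symbols[N_SYMBOLS] = { 'a', 'b', 'c' };

static StateSet dstates[MAX_DSTATES];       /* Dstates, in discovery order */
static int dtran[MAX_DSTATES][N_SYMBOLS];   /* Dtran; -1 means no transition */
static int n_dstates = 0;

/* All states reachable from `set` through epsilon edges only. */
StateSet eps_closure(StateSet set) {
    StateSet closure = set;
    int changed = 1;
    while (changed) {
        changed = 0;
        for (int i = 0; i < N_EPS; i++) {
            if ((closure & BIT(eps_edges[i][0])) && !(closure & BIT(eps_edges[i][1]))) {
                closure |= BIT(eps_edges[i][1]);
                changed = 1;
            }
        }
    }
    return closure;
}

/* move(T, c): states reachable from a state in `set` by one edge labelled c. */
StateSet nfa_move(StateSet set, char c) {
    StateSet out = 0;
    for (int i = 0; i < N_SYM_EDGES; i++)
        if ((set & BIT(sym_edges[i].from)) && sym_edges[i].label == c)
            out |= BIT(sym_edges[i].to);
    return out;
}

/* Returns the index of `set` in dstates, appending it if it is new. */
int find_or_add(StateSet set) {
    for (int i = 0; i < n_dstates; i++)
        if (dstates[i] == set) return i;
    dstates[n_dstates] = set;
    return n_dstates++;
}

/* Fills dstates and dtran, starting from the closure of the NFA start state. */
void subset_construction(int nfa_start) {
    n_dstates = 0;
    find_or_add(eps_closure(BIT(nfa_start)));
    for (int t = 0; t < n_dstates; t++) {          /* visiting index t marks it */
        for (int k = 0; k < N_SYMBOLS; k++) {
            StateSet u = eps_closure(nfa_move(dstates[t], symbols[k]));
            dtran[t][k] = (u == 0) ? -1 : find_or_add(u);
        }
    }
}
```

After `subset_construction(9)`, `n_dstates` is 4 and indices 0 to 3 correspond to the states named A, B, C and D in the walkthrough.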
Mark state A
A = {9,7,1,5,10} is the only unmarked state in Dstates, so it is marked. Its transitions on a, b and c are computed next.
Compute 𝜀-closure(move(A, a))
move(A, a) = {2} and 𝜀-closure({2}) = {2}. This set is new, so it is added to Dstates as state B, and Dtran[A, a] = B.
Compute 𝜀-closure(move(A, b))
move(A, b) is empty, because no NFA state in A has a transition on b. Dtran[A, b] is left empty (-).
Compute 𝜀-closure(move(A, c))
move(A, c) = {6} and 𝜀-closure({6}) = {6,8,10,7,1,5}. This set is new, so it is added to Dstates as state C, and Dtran[A, c] = C.
Mark B
B = {2} is the next unmarked state in Dstates, so it is marked. Row A of the table is complete: a → B, b → -, c → C.
Compute 𝜀-closure(move(B, a))
move(B, a) is empty, so Dtran[B, a] is left empty (-).
Compute 𝜀-closure(move(B, b))
move(B, b) = {4} and 𝜀-closure({4}) = {4,8,7,1,5,10}. This set is new, so it is added to Dstates as state D, and Dtran[B, b] = D.
Compute 𝜀-closure(move(B, c))
move(B, c) is empty, so Dtran[B, c] is left empty (-).
Mark C
C = {6,8,10,7,1,5} is the next unmarked state in Dstates, so it is marked. Row B of the table is complete: a → -, b → D, c → -.
Compute 𝜀-closure(move(C, a))
move(C, a) = {2}, which is already in Dstates as state B, so Dtran[C, a] = B.
Compute 𝜀-closure(move(C, b))
move(C, b) is empty, so Dtran[C, b] is left empty (-).
Compute 𝜀-closure(move(C, c))
move(C, c) = {6} and 𝜀-closure({6}) = {6,8,10,7,1,5}, which is state C itself, so Dtran[C, c] = C.
Mark D
D = {4,8,7,1,5,10} is the last unmarked state in Dstates, so it is marked. Row C of the table is complete: a → B, b → -, c → C.
Compute 𝜀-closure(move(D, a))
move(D, a) = {2}, which is already in Dstates as state B, so Dtran[D, a] = B.
Compute 𝜀-closure(move(D, b))
move(D, b) is empty, so Dtran[D, b] is left empty (-). In the same way, move(D, c) = {6}, whose 𝜀-closure is state C, so Dtran[D, c] = C. No unmarked states remain, so the construction stops.
Draw DFA
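NFA States (Dstates)   DFA State   a   b   c
{9,7,1,5,10}           A           B   -   C
{2}                    B           -   D   -
{6,8,10,7,1,5}         C           B   -   C
{4,8,7,1,5,10}         D           B   -   C

DFA transitions (A is the start state; A, C and D are accepting because they contain NFA state 10):
A --a--> B, A --c--> C
B --b--> D
C --a--> B, C --c--> C
D --a--> B, D --c--> C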
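Before turning the DFA into control flow, it can help to see it as plain data. The following is a minimal table-driven sketch, not the approach the slides take: the names `ST_A` to `ST_D`, `dfa_table`, `accepting` and `match_table` are hypothetical, and the index arithmetic assumes the alphabet is exactly the consecutive characters a, b and c.

```c
enum { ST_A, ST_B, ST_C, ST_D, N_STATES };
enum { NONE = -1 };

/* Transition table of the DFA; columns are the inputs a, b, c. */
static const int dfa_table[N_STATES][3] = {
    /* A */ { ST_B, NONE, ST_C },
    /* B */ { NONE, ST_D, NONE },
    /* C */ { ST_B, NONE, ST_C },
    /* D */ { ST_B, NONE, ST_C },
};

/* A, C and D contain NFA state 10, so they are accepting. */
static const int accepting[N_STATES] = { 1, 0, 1, 1 };

/* Returns 1 if the whole string matches ((a.b)|c)*, otherwise 0. */
int match_table(const char *s) {
    int state = ST_A;
    for (; *s != '\0'; s++) {
        if (*s < 'a' || *s > 'c') return 0;    /* symbol outside the alphabet */
        state = dfa_table[state][*s - 'a'];
        if (state == NONE) return 0;           /* "-" entry: no transition */
    }
    return accepting[state];
}
```

For example, `match_table("abcab")` returns 1 and `match_table("a")` returns 0, because the input ends in the non-accepting state B.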
TRANSLATION TO C
Convert the DFA into C code.
int match(char* next) {
    goto A;
/* A: start state, accepting */
A:
    if (*next == '\0') return 1;
    if (*next == 'a') { next++; goto B; }
    if (*next == 'c') { next++; goto C; }
    return 0;
/* B: not accepting, only b is allowed */
B:
    if (*next == '\0') return 0;
    if (*next == 'b') { next++; goto D; }
    return 0;
/* C: accepting */
C:
    if (*next == '\0') return 1;
    if (*next == 'a') { next++; goto B; }
    if (*next == 'c') { next++; goto C; }
    return 0;
/* D: accepting */
D:
    if (*next == '\0') return 1;
    if (*next == 'a') { next++; goto B; }
    if (*next == 'c') { next++; goto C; }
    return 0;
}
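The translation is mechanical: each DFA state becomes a label, each table entry becomes one `if` with a `goto`, and the end-of-string test returns 1 only in accepting states. A small generator that prints such a matcher from the table might look like the sketch below. It is an illustration only; the names `append`, `emit_matcher`, `state_names`, `dfa_table` and `accepting` are hypothetical, and the output is written to a caller-supplied buffer.

```c
#include <stdarg.h>
#include <stdio.h>

#define N_STATES 4
#define N_SYMBOLS 3

static const char state_names[N_STATES] = { 'A', 'B', 'C', 'D' };
static const char symbols[N_SYMBOLS] = { 'a', 'b', 'c' };

/* Transition table of the DFA as state indices; -1 means no transition. */
static const int dfa_table[N_STATES][N_SYMBOLS] = {
    { 1, -1, 2 }, { -1, 3, -1 }, { 1, -1, 2 }, { 1, -1, 2 }
};
static const int accepting[N_STATES] = { 1, 0, 1, 1 };

/* Appends formatted text at offset *len; stops writing once buf is full. */
static void append(char *buf, size_t cap, size_t *len, const char *fmt, ...) {
    va_list ap;
    int n;
    if (*len >= cap) return;
    va_start(ap, fmt);
    n = vsnprintf(buf + *len, cap - *len, fmt, ap);
    va_end(ap);
    if (n > 0) *len += (size_t)n;
}

/* Writes a goto-style matcher into buf, one labelled block per DFA state.
   Returns the length needed; a value >= cap means the text was truncated. */
size_t emit_matcher(char *buf, size_t cap) {
    size_t len = 0;
    if (cap > 0) buf[0] = '\0';
    append(buf, cap, &len, "int match(char* next) {\n");
    append(buf, cap, &len, "    goto %c;\n", state_names[0]);
    for (int s = 0; s < N_STATES; s++) {
        append(buf, cap, &len, "%c:\n", state_names[s]);
        append(buf, cap, &len, "    if (*next == '\\0') return %d;\n", accepting[s]);
        for (int k = 0; k < N_SYMBOLS; k++) {
            if (dfa_table[s][k] < 0) continue;     /* skip "-" entries */
            append(buf, cap, &len, "    if (*next == '%c') { next++; goto %c; }\n",
                   symbols[k], state_names[dfa_table[s][k]]);
        }
        append(buf, cap, &len, "    return 0;\n");
    }
    append(buf, cap, &len, "}\n");
    return len;
}
```

Calling `emit_matcher(buf, sizeof buf)` with a buffer of a couple of kilobytes and printing `buf` gives a function with the same four labelled blocks as the hand-written matcher.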