Академический Документы
Профессиональный Документы
Культура Документы
regular expressions.
Here is a regular definition for the set of Pascal identifiers that is define as the set of strings of
letter and digits beginning with a letters.
letter → A | B | . . . | Z | a | b | . . . | z
digit → 0 | 1 | 2 | . . . | 9
id → letter (letter | digit)*
The regular expression id is the pattern for the Pascal identifier token and defines letter and
digit.
Where letter is a regular expression for the set of all upper-case and lower case letters in the
alphabet and digit is the regular for the set of all decimal digits.
The pattern for the Pascal unsigned token can be specified as follows:
digit → 0 | 1 | 2 | . . . | 9
digit → digit digit*
Optimal-fraction → . digits | ε
Optimal-exponent → (E (+ | - | ) digits) | ε
num → digits optimal-fraction optimal-exponent.
Operations
L U M = {s | s is in L or s is in M}
LM = {st | s is in L and t is in M}
The Kleene Closure of a language L is written as
Notations
If r and s are regular expressions denoting the languages L(r) and L(s), then
letter = [a – z] or [A – Z]
digit = 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 or [0-9]
sign = [ + | - ]
Decimal = (sign)?(digit)+
The only problem left with the lexical analyzer is how to verify the validity of a regular
expression used in specifying the patterns of keywords of a language. A well-accepted solution is
to use finite automata for verification.