
1. Lexical analyzer: reads the source program character by character.

A sequence of one or more characters is grouped into a single unit according to the character-group patterns (tokens) defined by the source language. The group of characters that forms a token is called the lexeme for that token. Each token produced is stored in the symbol table. A sequence of characters that does not follow any token pattern is reported as an unidentified token. Example: suppose the token pattern for an identifier I is I = letter (letter | digit)*. The lexeme ab2c is recognized as a token, while the lexemes 2abc and abC are not (2abc starts with a digit; abC is rejected only if "letter" means lowercase letters).
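
The identifier example above can be checked with a short Python sketch. It assumes "letter" means the lowercase letters a-z and "digit" means 0-9, which is the reading under which abC is rejected; the names `IDENTIFIER` and `is_identifier` are made up for illustration.

```python
import re

# Assumption: letter = a-z (lowercase only), digit = 0-9.
# The pattern encodes I = letter (letter | digit)*.
IDENTIFIER = re.compile(r"[a-z][a-z0-9]*")


def is_identifier(lexeme: str) -> bool:
    """Return True if the whole lexeme matches the identifier pattern."""
    return IDENTIFIER.fullmatch(lexeme) is not None


if __name__ == "__main__":
    for lexeme in ("ab2c", "2abc", "abC"):
        status = "token" if is_identifier(lexeme) else "unidentified token"
        print(lexeme, "->", status)
```

If uppercase letters should count as letters, the character classes would become [A-Za-z] and [A-Za-z0-9], and abC would then be accepted.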

https://hackernoon.com/lexical-analysis-861b8bfe4cb0

Applications of RE [1]

Two common applications of RE:

–Lexical analysis in compiler

–Finding patterns in text

https://docplayer.info/39754156-Ekspresi-reguler-definisi-notasi-ekspresi-regular-contoh-ekspresi-reguler-2.html

Lexical Analyzer [1]

•Recognize “tokens” in a program source code.

•The tokens can be variable names, reserved words, operators, numbers, ...etc.

•Each kind of token can be specified as an RE, e.g., a variable name is of the form [A-Za-z][A-Za-z0-9]*. We can then construct an ε-NFA to recognize it automatically.

•By putting all these ε-NFAs together, we obtain one that can recognize different kinds of tokens in the input string.

•We can convert this ε-NFA to NFA and then to DFA, and implement this DFA as a deterministic
program - the lexical analyzer.
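
The idea of combining one RE per token kind into a single recognizer can be sketched in Python as follows. This is only an illustration: Python's `re` module is a backtracking matcher, not a DFA built from an ε-NFA, and the token kinds, keyword list, and names (`TOKEN_SPEC`, `MASTER`, `tokenize`) are assumptions that do not come from the slides.

```python
import re

# Hypothetical token kinds. Order matters: earlier alternatives win,
# so keywords are tried before general identifiers.
TOKEN_SPEC = [
    ("KEYWORD", r"(?:if|else|while)\b"),
    ("IDENT", r"[A-Za-z][A-Za-z0-9]*"),   # variable name, as in the text
    ("NUMBER", r"[0-9]+"),
    ("OP", r"[+\-*/=<>]"),
    ("SKIP", r"\s+"),                      # whitespace, discarded
    ("UNKNOWN", r"."),                     # anything else
]

# One combined pattern: the union of all token REs, each in a named group.
MASTER = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC))


def tokenize(source: str) -> list[tuple[str, str]]:
    """Return (kind, lexeme) pairs for the given source text."""
    tokens = []
    for match in MASTER.finditer(source):
        kind = match.lastgroup  # name of the alternative that matched
        if kind != "SKIP":
            tokens.append((kind, match.group()))
    return tokens


if __name__ == "__main__":
    print(tokenize("while x1 < 42"))
```

Tools such as lex generate a table-driven DFA from a similar list of patterns, which avoids backtracking.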

Text Search [1]

•“grep” in Unix stands for “Global (search for) Regular Expression and Print”.

•Unix has its own notations for regular expressions:

–Dot “.” stands for “any character”.

–[a1a2...ak] stands for {a1, a2,...,ak}, e.g., [bcd12] stands for the set {b, c, d, 1, 2}.

–[x-y] stands for all characters from x to y in the ASCII sequence.

–| means “or”, i.e., + in our normal notation.

–* means “Kleene star”, as in our normal notation.

–? means “zero or one”, e.g., R? is ε + R.

–+ means “one or more”, e.g., R+ is RR*.

–{n} means “n copies of”, e.g., R{5} is RRRRR.

(You can find out more with “man grep” and “man regex”.)
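
The shorthand notations above can be compared with their expanded forms on a few sample strings. The sketch below uses Python's `re` module, which happens to share this syntax; it is not grep itself, and the helper names `accepts` and `agree` are hypothetical.

```python
import re

SAMPLES = ["", "a", "aa", "aaaa", "aaaaa", "aaaaaa", "b", "ab"]


def accepts(pattern: str, text: str) -> bool:
    """Return True if the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None


def agree(shorthand: str, expanded: str) -> bool:
    """Check that two patterns accept exactly the same sample strings."""
    return all(accepts(shorthand, s) == accepts(expanded, s) for s in SAMPLES)


if __name__ == "__main__":
    print(agree(r"a?", r"(?:|a)"))      # R? is ε + R
    print(agree(r"a+", r"aa*"))         # R+ is RR*
    print(agree(r"a{5}", r"aaaaa"))     # R{5} is RRRRR
    print(accepts(r"[bcd12]", "d"))     # set {b, c, d, 1, 2}
    print(accepts(r"[x-z]", "y"))       # range of characters
    print(accepts(r".", "#"))           # dot: any character
```

Agreement on a handful of samples is only a spot check, not a proof that the two expressions are equivalent.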

•We can use these notations to search for string patterns in text.

–For example, credit card numbers: [0-9]{16} | [0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{4}

–For example, phone numbers: [0-9]{8} | [0-9]{3}-[0-9]{5} | 852-[0-9]{8} | 852-[0-9]{3}-[0-9]{5}
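
As a rough sketch, the two patterns above can be tried out with Python's `re` module. The spaces around | in the notes are for readability only and are dropped here, since a space inside a real regular expression is a literal character. The `\b` word boundaries are an added assumption, so that a match is not cut out of a longer run of digits; the sample text and the names `CARD` and `PHONE` are invented.

```python
import re

# Credit card numbers: 16 digits, or four groups of 4 digits joined by "-".
CARD = re.compile(r"\b(?:[0-9]{16}|[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{4})\b")

# Phone numbers: the four alternatives from the notes, longest forms first.
PHONE = re.compile(
    r"\b(?:852-[0-9]{3}-[0-9]{5}|852-[0-9]{8}|[0-9]{3}-[0-9]{5}|[0-9]{8})\b"
)

if __name__ == "__main__":
    text = "Pay with 1234-5678-9012-3456 or call 852-234-56789 or 23456789."
    print(CARD.findall(text))
    print(PHONE.findall(text))
```

With grep, the same alternation would need extended syntax (`grep -E`), again without the spaces around |.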

What is lex? https://luv.asn.au/overheads/lex_yacc/index.html
