CST 8152 - Lexical Glossary
- We want to recognize a LANGUAGE.
- We write a GRAMMAR for the
language.
- The grammar is composed of PRODUCTIONS
containing TERMINAL symbols and NON-TERMINAL
symbols.
- Terminal symbols are TOKENS,
classified by a LEXICAL ANALYSER or SCANNER.
- The scanner reads the input and recognizes
character sequences called LEXEMES and classifies
them into tokens.
- The structure of the lexemes is based on PATTERNS.
- The patterns are often expressed using REGULAR
EXPRESSIONS and REGULAR DEFINITIONS.
- Regular expressions are equivalent to FINITE
AUTOMATA.
- Finite automata can be either DETERMINISTIC
or NON-DETERMINISTIC.
- Deterministic finite automata can be
constructed from non-deterministic finite automata;
however, the construction may require an exponential
number of states.
- We can compile a regular expression
into a recognizer by constructing the finite automaton to
which it is equivalent.
- Finite automata can be represented using STATE
MACHINES.
- State machines may be DETERMINISTIC
or NON-DETERMINISTIC.
- State machines may be expressed using TRANSITION
DIAGRAMS (deterministic) and TRANSITION GRAPHS
(non-deterministic).
- From the transition diagrams and graphs we
can construct TRANSITION TABLES that tell what the
next state is, based on the current state and the current
input character.
- Transition tables can be coded.