Program Grammar #1

This page last updated: Sunday September 27, 1998 01:07

Here are the productions for Program Grammar #1:

        program := stmt * TT_EOF
        stmt    := ( asst | ø ) ';'
        asst    := TT_ID '=' expr
        expr    := term ( ('+'|'-') term ) *
        term    := factor ( ('*'|'/') factor ) *
        factor  := TT_ID | TT_UINT | '(' expr ')'
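
Notes:  * means zero or more repetitions of the preceding item
        ? means zero or one occurrence of the preceding item
        | separates alternatives; ( ) groups items
        ø stands for epsilon, the empty string
        A character in quotes, such as '*', is a literal terminal, not a meta-symbol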

The above grammar has six non-terminal symbols: program, stmt, asst, expr, term, and factor.  Each of these non-terminal symbols will become a C function in the Parser.  All the terminal symbols (TT_ID, TT_UINT, TT_EOF, and each of the quoted characters) must be recognized, named, and returned as token types by the Scanner.

Follow the existing TokenType naming convention: Start each TokenType with the prefix "TT_", e.g. TT_PLUS.
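
The two paragraphs above can be sketched in C as follows. This is only a minimal recognizer written under assumptions, not the course's actual Scanner or Parser: the token names other than TT_ID, TT_UINT, TT_EOF and TT_PLUS are made up to fit the "TT_" convention, the helpers next_token(), expect() and parse() are hypothetical, and the sketch assumes that an identifier is a letter followed by letters or digits and that an unsigned integer is a run of digits. It only accepts or rejects the input; it builds nothing.

```c
#include <ctype.h>

/* Token types. TT_EOF, TT_ID, TT_UINT and TT_PLUS appear in the text;
   the other names are assumptions that follow the "TT_" convention. */
typedef enum {
    TT_EOF, TT_ID, TT_UINT, TT_ASSIGN, TT_SEMI, TT_PLUS, TT_MINUS,
    TT_STAR, TT_SLASH, TT_LPAREN, TT_RPAREN, TT_ERROR
} TokenType;

static const char *src;   /* remaining input text */
static TokenType tok;     /* current lookahead token */
static int ok;            /* cleared on the first syntax error */

/* Toy scanner (assumed lexical rules): classify the next token into tok. */
static void next_token(void) {
    unsigned char c;
    while (isspace((unsigned char)*src)) src++;
    c = (unsigned char)*src;
    if (c == '\0') { tok = TT_EOF; return; }
    if (isalpha(c)) {
        while (isalnum((unsigned char)*src)) src++;
        tok = TT_ID; return;
    }
    if (isdigit(c)) {
        while (isdigit((unsigned char)*src)) src++;
        tok = TT_UINT; return;
    }
    src++;
    switch (c) {
    case '=': tok = TT_ASSIGN; break;
    case ';': tok = TT_SEMI;   break;
    case '+': tok = TT_PLUS;   break;
    case '-': tok = TT_MINUS;  break;
    case '*': tok = TT_STAR;   break;
    case '/': tok = TT_SLASH;  break;
    case '(': tok = TT_LPAREN; break;
    case ')': tok = TT_RPAREN; break;
    default:  tok = TT_ERROR;  break;
    }
}

/* Consume the lookahead if it matches; otherwise record an error. */
static void expect(TokenType t) {
    if (tok == t) next_token(); else ok = 0;
}

static void expr(void);   /* forward declaration: factor calls expr */

/* factor := TT_ID | TT_UINT | '(' expr ')' */
static void factor(void) {
    if (tok == TT_ID || tok == TT_UINT) next_token();
    else if (tok == TT_LPAREN) { next_token(); expr(); expect(TT_RPAREN); }
    else ok = 0;
}

/* term := factor ( ('*'|'/') factor ) * */
static void term(void) {
    factor();
    while (ok && (tok == TT_STAR || tok == TT_SLASH)) { next_token(); factor(); }
}

/* expr := term ( ('+'|'-') term ) * */
static void expr(void) {
    term();
    while (ok && (tok == TT_PLUS || tok == TT_MINUS)) { next_token(); term(); }
}

/* asst := TT_ID '=' expr */
static void asst(void) { expect(TT_ID); expect(TT_ASSIGN); expr(); }

/* stmt := ( asst | ø ) ';'   An asst can only start with TT_ID. */
static void stmt(void) {
    if (tok == TT_ID) asst();
    expect(TT_SEMI);
}

/* program := stmt * TT_EOF */
static void program(void) {
    while (ok && tok != TT_EOF) stmt();
    expect(TT_EOF);
}

/* Returns 1 if text is a valid program under the grammar, 0 otherwise. */
int parse(const char *text) {
    src = text;
    ok = 1;
    next_token();
    program();
    return ok;
}
```

Note that each of the six non-terminals maps to exactly one function, and each * in the grammar turns into a while loop that tests the lookahead token. Calling parse("a = 1 + 2 * (b - 3); ;") should accept the input, while parse("a = ;") should reject it.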

Here are notes on some of the above symbols:

Epsilon is not a token.  It is a grammar meta-symbol, as are the other symbols described in the Notes, above. It stands for "nothing" (or the empty string). You can rewrite this production containing epsilon:

       stmt    := ( asst | ø ) ';'

as this equivalent production:

       stmt    := asst ';' | ø ';'

and then eliminate the empty string, ending up with this equivalent production:

       stmt    := asst ';' | ';'

which you can also write as this equivalent production:

       stmt    := asst ? ';'

All four of these productions generate the same set of strings.
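
To see how the rewriting affects the Parser code, here is a small sketch that codes stmt twice: once from the two-alternative form and once from the optional form. It is an illustration under assumptions, not the course's Parser: the token names TT_ASSIGN and TT_SEMI and the functions stmt_alternatives(), stmt_optional() and accepts() are hypothetical, the tokens come from an array that must end with TT_EOF instead of from a Scanner, and asst is simplified to TT_ID '=' TT_UINT to keep the example short.

```c
/* Assumed token names; only TT_EOF, TT_ID and TT_UINT come from the grammar. */
typedef enum { TT_EOF, TT_ID, TT_UINT, TT_ASSIGN, TT_SEMI } TokenType;

static const TokenType *tp;   /* points at the current lookahead token */

/* Simplified asst for this sketch only: TT_ID '=' TT_UINT.
   The real grammar allows a full expr on the right-hand side. */
static int asst(void) {
    if (tp[0] == TT_ID && tp[1] == TT_ASSIGN && tp[2] == TT_UINT) {
        tp += 3;
        return 1;
    }
    return 0;
}

/* stmt := asst ';' | ';'     (two explicit alternatives) */
int stmt_alternatives(void) {
    if (*tp == TT_SEMI) { tp++; return 1; }   /* second alternative */
    if (!asst()) return 0;                    /* first alternative  */
    if (*tp != TT_SEMI) return 0;
    tp++;
    return 1;
}

/* stmt := asst ? ';'         (optional asst, then one shared ';') */
int stmt_optional(void) {
    /* The asst is present only when the lookahead is TT_ID. */
    if (*tp == TT_ID && !asst()) return 0;
    if (*tp != TT_SEMI) return 0;
    tp++;
    return 1;
}

/* Run one stmt function on a TT_EOF-terminated token array.
   Returns 1 if a stmt was recognized and TT_EOF follows it directly. */
int accepts(int (*stmt)(void), const TokenType *tokens) {
    tp = tokens;
    return stmt() && *tp == TT_EOF;
}
```

Both functions should accept and reject exactly the same token sequences. Epsilon never shows up as a token test in either one: it is simply the case where the code for asst is skipped.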