CST 8152 - Assignment Four

Topic: A Predictive, Recursive-Descent Parser.

This page last updated: Sunday September 27, 1998 01:07

Deliverables for this assignment:

Purpose:

This assignment implements and tests a simple predictive, recursive-descent Parser. The input to the Parser is tokens generated by the Scanner (lexical analyser). The output of each call to the Scanner is a single Token that is checked by the Parser against Program Grammar #1.

This assignment only recognizes the tokens and checks them against the Grammar; it doesn't try to generate any arithmetic output.  (To do calculations, we will implement a value stack in the next assignment.)  The only output from the Parser is some statistical information and some statement information printed after each statement is recognized.

You may use the main() program given in the source directory to test your Parser (or, you may write your own driver).  The main() function of this assignment is a modification of that in the previous assignment. Instead of looping reading and printing lexemes using the scanner() function, we drive the Scanner using the functions of the predictive-parser, starting with the parser() function in the parser.c source file.

Instructions:

Part 0 (optional but highly recommended)
Get the MEM package and install it in a simple program.  Learn how it works.   Then install it in this assignment, and use it all subsequent assignments.  It will save you a lot of debugging time.
Part 1
Examine Program Grammar #1.  It first requires that the Scanner recognize some new tokens.  You will need to design a state transition diagram for your Scanner that handles all the new tokens.  Identify which accepting states need to push back a character.  Once you have the state diagram for your Scanner, here's what to do to enhance your Scanner from the previous assignment to recognize these tokens:
  1. Name any new states required to recognize the new Tokens and add the names to the State enum typedef in your Scanner.  Make sure you separate the accepting states from the non-accepting states.
  2. Fill in a transition table from your transition diagram.  Put non-accepting states across the top (columns) and character classes down the side (rows), since that makes the table fit best on paper when you turn it into C code.  Use the state names inside the table; do not use numbers!
  3. Copy your new transition table to the two-dimensional array used by the next_state() function in your Scanner.
  4. To the body of the main loop in scanner(), add code to save the appropriate lexeme characters in the lexeme buffer.
  5. Name the new Token types required by the grammar and add the names to the TokenType enum typedef.  (Make the corresponding changes in the tokenName[] table.)
  6. After the main loop in scanner() accepts, for each new accepting state, set the new recognized token type that was accepted.  Be on the alert for "put the character back" accepting states!  Most of the new token types do not require any push-back character.
  7. Test your new Scanner by linking it with the main driver from Assignment 3.  That driver will print all the different token types that your new Scanner recognizes.  Make sure it still recognizes all the identifiers correctly! If you feed the big a4test.txt test file to your program, the (also big) output should look something like this.
Part 2
Once your Scanner correctly recognizes all the new tokens, use the two-part method described in class and in the text (p.46 "Designing a Predictive Parser") to write code for a predictive parser for the Program Grammar #1.

Start with the given fixme-parser.c file skeleton from the source directory.  This file has suggested function templates for all the functions needed to parse the grammar.  Finish the algorithms and do the coding of the recursive-descent functions in parser.c so that your Parser recognizes the given grammar. You might want to refer to the examples from class, as mentioned above.

The key places you need to write code are marked in each source file with "//FIXME--" tags.  Look for them.  If you insert the correct code in these exact locations, you do not need to make any other changes to the source files.  The files with names having a prefix of "fixme-" contain the tags that you need to edit.  Rename the files to remove the "fixme-" prefixes.

You may use the main.c file from the source directory to drive and test your Parser, or you may write your own driver.
Part 3
Test your program on every type of input you can think of.  Document your testing strategy and submit a summary of what you did and why.  Try your program on the a4test.txt input file in the source directory and submit as part of your test output the first and last 20 lines of output from processing that file. Your output should resemble my sample output.

For incorrect input, make sure your Parser's error messages give enough detail to locate the error.  See my sample error messages for guidance.

Source directory

Copy these source files and modify them according to the assignment instructions.  The header files are examples only; you are not obligated to use them; however, you must divide up your program source into more than one module.

The output of running the Parser using the Assignment 4 main program and the large a4test.txt file as input is in the output.txt file.  The output of running the Scanner using the Assignment 3 main program and the a4test.txt file as input is in the (large) part1out.txt file.  Some sample error messages for syntax errors are in the errors.txt file.

The MEM package in this source directory has some minor modifications to reduce the number of compiler warnings it generates.


Ian D. Allen CST8152 Home Page

This page last updated: Sunday September 27, 1998 01:07