CST 8152 - Assignment Four

Topic: A Predictive, Recursive-Descent Parser.

This page last updated: Sunday September 27, 1998 01:07

Deliverables for this assignment:

Due Date: 12:00 noon, Friday March 27. There is no penalty for handing in this assignment one week late. The usual penalty applies for assignments more than one week late.
Hand in, on paper:
1. your state transition diagram and transition table for your enhanced Scanner that recognizes all the Tokens in the Program Grammar #1 grammar.
2. the fully documented source listing of your parser.c source file containing the Parser and its support functions.
3. just the TokenType enum from your scanner.h header file.
4. your test results, output, and documentation from testing your program.
Do not hand in anything other than the above items.
The submission must follow the online course submission standards, including an Assignment Submission Label and a Table of Contents. (The Label and Contents may be on the same exterior cover page.)
Testing: Include a description or listing of your input test file(s) (and output, if appropriate) showing how you tested your program. Don't kill a hundred trees; submit descriptions and short excerpts of your testing input and output if the files are large. The intent of your test submissions is to convince me that your program works. Short excerpts of actual program output are usually more convincing than mere words, so give me at least some sample output. Try your program on the a4test.txt input file in the source directory. In addition to your other testing, submit hard copy of the first and last 20 lines of output from processing that file.
No demonstration is required for this Lab; however, I am always pleased to see working code during lab hours.

Purpose:

This assignment implements and tests a simple predictive, recursive-descent Parser. The input to the Parser is tokens generated by the Scanner (lexical analyser). The output of each call to the Scanner is a single Token that is checked by the Parser against Program Grammar #1.

This assignment only recognizes the tokens and checks them against the Grammar; it doesn't try to generate any arithmetic output. (To do calculations, we will implement a value stack in the next assignment.) The only output from the Parser is some statistical information and some statement information printed after each statement is recognized.

You may use the main() program given in the source directory to test your Parser (or, you may write your own driver). The main() function of this assignment is a modification of that in the previous assignment. Instead of looping reading and printing lexemes using the scanner() function, we drive the Scanner using the functions of the predictive-parser, starting with the parser() function in the parser.c source file.

Instructions:

Here is a suggested order to tackle this assignment.
Be sure you take advantage of any code that I write for you!
Read the frequently asked questions file.

Part 0 (optional but highly recommended)

Get the MEM package and install it in a simple program. Learn how it works. Then install it in this assignment, and use it all subsequent assignments. It will save you a lot of debugging time.

Part 1

Examine Program Grammar #1. It first requires that the Scanner recognize some new tokens. You will need to design a state transition diagram for your Scanner that handles all the new tokens. Identify which accepting states need to push back a character. Once you have the state diagram for your Scanner, here's what to do to enhance your Scanner from the previous assignment to recognize these tokens:

Name any new states required to recognize the new Tokens and add the names to the State enum typedef in your Scanner. Make sure you separate the accepting states from the non-accepting states.
Fill in a transition table from your transition diagram. Put non-accepting states across the top (columns) and character classes down the side (rows), since that makes the table fit best on paper when you turn it into C code. Use the state names inside the table; do not use numbers!
Copy your new transition table to the two-dimensional array used by the next_state() function in your Scanner.
To the body of the main loop in scanner(), add code to save the appropriate lexeme characters in the lexeme buffer.
Name the new Token types required by the grammar and add the names to the TokenType enum typedef. (Make the corresponding changes in the tokenName[] table.)
After the main loop in scanner() accepts, for each new accepting state, set the new recognized token type that was accepted. Be on the alert for "put the character back" accepting states! Most of the new token types do not require any push-back character.
Test your new Scanner by linking it with the main driver from Assignment 3. That driver will print all the different token types that your new Scanner recognizes. Make sure it still recognizes all the identifiers correctly! If you feed the big a4test.txt test file to your program, the (also big) output should look something like this.

Part 2

Once your Scanner correctly recognizes all the new tokens, use the two-part method described in class and in the text (p.46 "Designing a Predictive Parser") to write code for a predictive parser for the Program Grammar #1.

Start with the given fixme-parser.c file skeleton from the source directory. This file has suggested function templates for all the functions needed to parse the grammar. Finish the algorithms and do the coding of the recursive-descent functions in parser.c so that your Parser recognizes the given grammar. You might want to refer to the examples from class, as mentioned above.

The key places you need to write code are marked in each source file with "//FIXME--" tags. Look for them. If you insert the correct code in these exact locations, you do not need to make any other changes to the source files. The files with names having a prefix of "fixme-" contain the tags that you need to edit. Rename the files to remove the "fixme-" prefixes.

You may use the main.c file from the source directory to drive and test your Parser, or you may write your own driver.

Part 3

Test your program on every type of input you can think of. Document your testing strategy and submit a summary of what you did and why. Try your program on the a4test.txt input file in the source directory and submit as part of your test output the first and last 20 lines of output from processing that file. Your output should resemble my sample output.

For incorrect input, make sure your Parser's error messages give enough detail to locate the error. See my sample error messages for guidance.

Source directory

Copy these source files and modify them according to the assignment instructions. The header files are examples only; you are not obligated to use them; however, you must divide up your program source into more than one module.

The output of running the Parser using the Assignment 4 main program and the large a4test.txt file as input is in the output.txt file. The output of running the Scanner using the Assignment 3 main program and the a4test.txt file as input is in the (large) part1out.txt file. Some sample error messages for syntax errors are in the errors.txt file.

The MEM package in this source directory has some minor modifications to reduce the number of compiler warnings it generates.

Ian D. Allen CST8152 Home Page

This page last updated: Sunday September 27, 1998 01:07