The Print and Dump Statements

The Toy Language has "print" and "dump" statements with arguments. This file describes them.

The "print" and "dump" tokens are reserved words in the grammar. See below for an easy way to recognize reserved words.

The Print statement

The Print statement prints the value of each of its arguments on the standard output, adding no blanks, commas, or newlines of its own, e.g.:

print "Hello World!\n";
print "Look!\nMultiple lines!\nJust the way C would do it!\n";
print "1 + 5 * 3 is ", 1+5*3, ", which isn't a big number.\n";
print "The value of 1000 times (30+1) is ", 1000*(30+1), ".\n";
print "The value of 1000 times 1000 is ", 1000*1000, ".\n";
print "Note how we print \"quoted\" strings properly.\n";
print "What happens when you add a number to a string?" + 32767;
print "Optionally, you can try" + " string concatenation" + " using plus.\n";

Each argument to Print (each expression) is printed correctly according to its type. No punctuation or blanks are added. Numbers print as numbers; strings print as strings.

The expression parser handles type checking and expression parsing errors; the Print statement merely pops each expression off the stack, checks its type, and prints it.
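
The type check described above might be sketched in C roughly as follows. This is only an illustration under stated assumptions: the Value struct, the V_INT and V_STRING tags, and the function names are hypothetical, not part of the Toy Language specification, and the sketch assumes the evaluated arguments have already been popped off the stack into an array in left-to-right order.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical value representation; the real expression parser
   may well use something different. */
typedef enum { V_INT, V_STRING } ValueType;

typedef struct {
    ValueType   type;
    long        ival;   /* valid when type == V_INT */
    const char *sval;   /* valid when type == V_STRING */
} Value;

Value int_value(long n)           { Value v = { V_INT, n, NULL }; return v; }
Value string_value(const char *s) { Value v = { V_STRING, 0, s }; return v; }

/* Print one value with no added blanks, commas, or newlines.
   Returns the number of characters written, or -1 for an unknown type. */
int print_value(FILE *out, const Value *v)
{
    switch (v->type) {
    case V_INT:    return fprintf(out, "%ld", v->ival);
    case V_STRING: return fprintf(out, "%s", v->sval);
    }
    return -1;
}

/* Print n evaluated arguments in left-to-right order.
   Returns the total number of characters written, or -1 on error. */
int print_statement(FILE *out, const Value *args, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        int written = print_value(out, &args[i]);
        if (written < 0)
            return -1;
        total += written;
    }
    return total;
}
```

Given the two values a parser would produce for print "1 + 5 * 3 is ", 1+5*3; this sketch writes 1 + 5 * 3 is 16 with nothing inserted between the string and the number.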

The Dump statement

The Dump statement is a symbol table debugging tool that takes optional arguments. With no arguments, it prints the contents of the entire symbol table. With arguments, it prints only the requested entries: a numeric argument selects an entry by number, and a string argument selects the entry whose symbol name matches the string, e.g.:

dump;
dump 1,2;
dump "abc","mysymbol","id3";
dump 1+1;

The argument is an expression, so arithmetic is possible inside each argument. (The expression parser will handle the arithmetic.) All the Dump statement need do is pop each expression off the stack, check its type, and perform the appropriate action on the symbol table.
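
One possible shape for that dispatch is sketched below in C. Everything named here is an assumption made for illustration: the Value struct, the type tags, and the three symtab_dump_* functions are hypothetical, and the symbol table functions are placeholder stubs that only announce what a real table would print.

```c
#include <stdio.h>
#include <stddef.h>

typedef enum { V_INT, V_STRING } ValueType;   /* hypothetical type tags */

typedef struct {
    ValueType   type;
    long        ival;   /* valid when type == V_INT */
    const char *sval;   /* valid when type == V_STRING */
} Value;

/* Placeholder stand-ins for the symbol table's public functions.
   A real table would print the matching entries instead. */
int symtab_dump_all(FILE *out)
{
    fprintf(out, "[whole table]\n");
    return 1;
}

int symtab_dump_index(FILE *out, long number)
{
    fprintf(out, "[entry number %ld]\n", number);
    return 1;
}

int symtab_dump_name(FILE *out, const char *name)
{
    fprintf(out, "[entry named %s]\n", name);
    return 1;
}

/* Execute Dump on n evaluated arguments, in left-to-right order.
   Returns the number of symbol table requests made,
   or -1 if an argument has a type that Dump cannot handle. */
int dump_statement(FILE *out, const Value *args, size_t n)
{
    int requests = 0;

    if (n == 0) {                       /* bare "dump;" */
        symtab_dump_all(out);
        return 1;
    }
    for (size_t i = 0; i < n; i++) {
        switch (args[i].type) {
        case V_INT:    symtab_dump_index(out, args[i].ival); break;
        case V_STRING: symtab_dump_name(out, args[i].sval);  break;
        default:       return -1;
        }
        requests++;
    }
    return requests;
}
```

Note that the arithmetic in dump 1+1; never reaches this code: by the time Dump runs, the argument is already the single integer value 2.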

The Dump output might look something like this, only nicer:

Symbol         Type     Value
------------   ------   ---------------------------------------------
abc            INT      32769
mysymbol       STRING   Hello World!
id3            INT      -65339

The list of output fields might also include other fields in your symbol table, such as the line number on which the symbol was last changed, or even the hex address of the strings in the table. Print whatever information you need to assure yourself that your symbol table is working correctly.

To preserve the integrity of the symbol table and protect its internal details from external view, make sure your parser does not directly access the symbol table. It must call a symbol table function to do the work. Only that symbol table function should know the inner workings of the symbol table data structures.
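
A minimal sketch of such an interface in C follows. It assumes a fixed-size array, entry numbers that start at 1, and hypothetical function names; your own table may be organized quite differently. The point is only that the data is declared static, so the parser can reach it solely through the public functions.

```c
#include <stdio.h>
#include <string.h>

#define SYMTAB_MAX   64
#define SYM_TEXT_LEN 64   /* longer names and strings are truncated here */

typedef enum { SYM_INT, SYM_STRING } SymType;

/* Private data: "static" hides these names from every other file. */
typedef struct {
    char    name[SYM_TEXT_LEN];
    SymType type;
    long    ival;
    char    sval[SYM_TEXT_LEN];
} Entry;

static Entry entries[SYMTAB_MAX];
static int   n_entries = 0;

/* Private: return the slot holding name, or -1 if it is absent. */
static int find_slot(const char *name)
{
    for (int i = 0; i < n_entries; i++)
        if (strcmp(entries[i].name, name) == 0)
            return i;
    return -1;
}

/* Private: find or create the slot for name; -1 if the table is full. */
static int slot_for(const char *name)
{
    int i = find_slot(name);
    if (i < 0 && n_entries < SYMTAB_MAX) {
        i = n_entries++;
        snprintf(entries[i].name, sizeof entries[i].name, "%s", name);
    }
    return i;
}

/* Private: print one entry as a row of the dump listing. */
static void dump_entry(FILE *out, const Entry *e)
{
    if (e->type == SYM_INT)
        fprintf(out, "%-12s   %-6s   %ld\n", e->name, "INT", e->ival);
    else
        fprintf(out, "%-12s   %-6s   %s\n", e->name, "STRING", e->sval);
}

/* Public: store an integer. Returns the entry number (from 1), 0 if full. */
int symtab_set_int(const char *name, long value)
{
    int i = slot_for(name);
    if (i < 0)
        return 0;
    entries[i].type = SYM_INT;
    entries[i].ival = value;
    return i + 1;
}

/* Public: store a string. Returns the entry number (from 1), 0 if full. */
int symtab_set_string(const char *name, const char *text)
{
    int i = slot_for(name);
    if (i < 0)
        return 0;
    entries[i].type = SYM_STRING;
    snprintf(entries[i].sval, sizeof entries[i].sval, "%s", text);
    return i + 1;
}

/* Public: dump one entry by number. Returns 1 if it exists, else 0. */
int symtab_dump_index(FILE *out, long number)
{
    if (number < 1 || number > n_entries)
        return 0;
    dump_entry(out, &entries[number - 1]);
    return 1;
}

/* Public: dump one entry by name. Returns 1 if it exists, else 0. */
int symtab_dump_name(FILE *out, const char *name)
{
    int i = find_slot(name);
    if (i < 0)
        return 0;
    dump_entry(out, &entries[i]);
    return 1;
}

/* Public: dump every entry. Returns the number of entries printed. */
int symtab_dump_all(FILE *out)
{
    for (int i = 0; i < n_entries; i++)
        dump_entry(out, &entries[i]);
    return n_entries;
}
```

With this arrangement you could later replace the array with a hash table or a tree without touching the parser, since the parser never sees the Entry struct.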

Implementing Print and Dump as reserved words

To have the words "print" and "dump" work as reserved words, the lexical analyser must recognize them and return the token types T_PRINT and T_DUMP to the parser. This is best handled by a simple string comparison at the point where an identifier is about to be returned: if the identifier is "print", return T_PRINT; if it is "dump", return T_DUMP; otherwise return T_ID as before.
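
As a rough sketch, the comparison could be a small table lookup in C, called just before the scanner returns an identifier. The token names T_ID, T_PRINT, and T_DUMP come from the text above; the enum itself, the table, and the function name classify_identifier are assumptions for illustration. The comparison here is case-sensitive.

```c
#include <string.h>

/* Token types; a real scanner will have many more than these three. */
typedef enum { T_ID, T_PRINT, T_DUMP } TokenType;

/* Table of reserved words. Adding a new reserved word later
   means adding one row here and nothing else. */
static const struct {
    const char *word;
    TokenType   type;
} reserved[] = {
    { "print", T_PRINT },
    { "dump",  T_DUMP  },
};

/* Given the text of a freshly scanned identifier, return the reserved
   word's token type if it is one, and T_ID otherwise. */
TokenType classify_identifier(const char *lexeme)
{
    for (size_t i = 0; i < sizeof reserved / sizeof reserved[0]; i++)
        if (strcmp(lexeme, reserved[i].word) == 0)
            return reserved[i].type;
    return T_ID;
}
```

Because the whole identifier is compared, a name such as "printer" or "dump2" is still returned as T_ID.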

Make a conscious decision as to whether your string comparison will be case-sensitive (Unix-style) or case-insensitive (DOS-style). Do you want all of "Print", "prInt", "PRINT", and "prinT" to be the same as "print"?

If your reserved words are case-insensitive, what about the identifiers in your language? Are they also case-insensitive?
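
If you do choose case-insensitive matching, one portable way to compare is sketched below. Standard C has no case-insensitive string comparison (strcasecmp is a POSIX extension), so this sketch writes its own using tolower from ctype.h. The function name equal_ignoring_case is hypothetical, and the sketch assumes plain single-byte characters.

```c
#include <ctype.h>

/* Return 1 if a and b are the same string when letter case is ignored,
   and 0 otherwise. */
int equal_ignoring_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        /* The unsigned char casts keep tolower well defined
           for characters with negative char values. */
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;   /* equal only if both strings ended together */
}
```

If identifiers are to be case-insensitive as well, the same function would have to be used for symbol table lookups, so that "Abc" and "abc" name the same entry.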