CST 8152 - Stacks

The parser of an interpreter may use a semantic expression value stack to hold the operands for expressions. The stack must hold both the value of the expression and also its type. (You may want to save other information, such as token line number, as well.)

The stack is a classic, Last In First Out (LIFO) data structure.

Purpose

The purpose of the expression value stack is to turn infix expressions such as "1 + 2" into postfix expressions where all the required operands are saved on the stack by the parser, then the operation is performed by popping the correct number of arguments off the stack and pushing back the single result value. For the Toy Grammar, the stack holds the operands of various arithmetic operations. The <expression> and <term> productions both pop pairs of arguments off the stack and push back the results of the associated operations.

Note that the productions <expression>, <term>, and <factor> all return exactly one new pushed element on the stack whenever they are used. Both <expression> and <term> combine into one result any number of operands separated by binary operators such as '+' and '*'. When the functions that implement each of these productions return, the expressions they parse have been reduced, pairwise, to one value on the stack. Thus, the topmost and first call to <expression> returns the single number that is the result of the whole expression.

A stack of structures

Here is an example of how one might start to code a stack of structures:

static int stkptr = 0;
static StackElt stack[STK_SIZE];

#define NEL(x) (sizeof(x)/sizeof(x[0]))  /* number of elements */

   void
push( StackElt *item ){
   if( stkptr >= NEL(stack) )
      error("overflow");         /* message and quit */
   stack[stkptr++] = *item;      /* structure copy */
}

   StackElt
pop(){
   if( stkptr <= 0 )
      error("underflow");        /* message and quit */
   return stack[--stkptr];       /* return entire element */
}

A proper implementation would return an error code instead of exiting the program when an error is encountered, unless the error was something fatal like "out of memory". Returning an error code is preferred to actually printing a message, since in some circumstances a condition such as "underflow" may not be an error. (For example: when cleaning out the stack and pop()ping off all the elements.) If the function always prints an error message, the calling function won't be able to use it in cases where it would be useful.

Returning structures or pointers

Returning the entire pushed structure results in more memory copying than returning a pointer to the pushed structure. The entire pushed structure is returned, instead of a pointer to the structure, because in some implementations of the stack (e.g. a linked list) the storage used by the popped element is free()d after the pop is done, so it cannot be pointed to.

Sometimes, to avoid returning an entire structure, one can return a pointer to static storage within a function, e.g.

   StackElt *                    /* returns a pointer */
alternatepop(){
   static StackElt tmp;          /* static storage */
   if( stkptr <= 0 )
      error("underflow");        /* message and quit */
   tmp = stack[--stkptr];        /* structure copy into static storage */
   return &tmp;                  /* return pointer to static storage */
}

This does not work well in an interpreter, because the Parser does several calls to pop() in a row, and the second and subsequent calls would overwrite the data in the static tmp buffer. All the returned pointers would point to the same tmp buffer; the data saved by the first calls would be lost unless the Parser saved a copy between calls to alternatepop(). To simplify the Parser, we do the copy in the pop() function by returning the entire structure contents.

Using the stack

Using the stack functions looks something like this:

StackElt tmp;
. . .
tmp.type = IT_INT ;
tmp.value.num = token.value.num;
push(&tmp);

tmp = pop();
if( tmp.type == IT_INT )
   printf("Popped an integer with value %d\n", tmp.value.num);
. . .

If the global token structure and the stack share the same type/value structure, the above code is even simpler, since we can push the structure directly without copying it first. Here is an example, where the global token structure and the stack use the same "Item" datatype:

typedef enum { IT_UNDEF, IT_NUM, IT_STR } ItemType;

/* Define a general union data typedef called "Item" */
typedef struct {
   ItemType type;       /* what data type is stored in the union? */
   union {
      int num;
      char *str;
   } value;
} Item;

/* Define the simplest global token structure */
typedef struct {
   TokenType type;      /* type of Token returned */
   Item item;           /* lexeme stored in Item structure */
} Token;

/* Define the semantic stack */
typedef Item StackElt;  /* stack is identical to Item */
. . .
Token token;
. . .
push(&token.item);      /* can push this directly now */

This "Item" data structure will be useful in many areas of the parser, including the lexical analyser, the semantic stack, and the symbol table.


Ian D. Allen CST 8152 Home Page