Programming a State Transition Table

This page last updated: Sunday September 27, 1998 01:07

A Scanner implementing a State Transition Diagram typically needs the following function:

   nextState = Next_State(currentState,character);

Given the current state of the DFA, currentState, and the current input character, character, return the next state of the DFA, nextState.

A convenient way to implement this function is by coding a two-dimensional Transition Table of states, with one table index being the current state and the other table index being the character class of the current input character.

A character class is a set of characters of the same type, e.g. "digits", "letters", etc.  All the characters in the same class index the same row or column in the table and are treated the same way by the Scanner.

In the example below, the states are the rows of the table, the character classes are the columns of the table.

Every row and column in a Transition Table should be unique:

If you find that two columns in your table are identical, it means that you should combine the two character classes into one class and get rid of the redundant column.

If you find that two rows in your table are identical, it means that you should combine those two states in your transition diagram into one state and get rid of the redundant state.

Here is an example:

/* These state numbers must match the row numbers in the
 * State Transition Table, below.
 */
typdef enum State { S_START=0, S_FOO, S_BAR, ..... } State;

   static State
Next_State(State currentState, int character){
   unsigned column = Column( character );

#define NEL(x) (sizeof(x)/sizeof(x[0])) /* number of elements macro */

   /* Make sure the indicies into the table are valid, to
    * catch programming errors that index outside the table.
    */
   if( currentState >= NEL(Table) || currentState < 0
      || column >= NEL(Table[0]) ){
           eprintf("Error: Invalid table index: current %d column %d\n",
                   currentState, column);
           abort();
           exit( EXIT_FAILURE );
           /*NOTREACHED*/
   }
   return Table[currentState][column];
}
/* Turn a character into a "class": a column index.  This could be
 * made part of the NextState() function, instead of being separate.
 */
   static int
Column(int ch){
   if( isascii(ch) && isdigit(ch) ) return 0; /* digit */
   if( isascii(ch) && isalpha(ch) ) return 1; /* alpha */
   switch( ch ){
   case '+': return 2;
   case '-': return 3;
   case EOF: return 4;
   default:  return 5;   /* "other" */
   }
}

/* The State Transition Table.
 * Note the labelling of the table rows and columns to match the State
 * numbers in the State enum and the column indicies defined above.
 */
static State Table[3][6] = {
           /*  digit,    alpha,     '+',       '-',      EOF,   other */
/*S_START*/{   S_FOO,  S_START, S_START,   S_START, S_ACCEPT, S_START },
/*S_FOO*/  {   S_FOO,    S_FOO, S_ACCEPT, S_ACCEPT, S_ACCEPT, S_ERROR },
/*S_BAR*/  { S_ERROR, S_ACCEPT, S_ACCEPT, S_ACCEPT, S_ACCEPT, S_ERROR }
};

Note that columns two and three in the above table (for the characters '+' and '-') are identical.  This means that these two characters are being treated identically by the Scanner.  The two columns should be combined into one single character class column named "plus or minus", and the Column() function fixed to produce the same index for both characters:

   switch( ch ){
   case '+':
   case '-': return 2; /* plus and minus are treated the same way */
   ...
   }

State Transition tables often fit better on a page for printing and editing if you swap the rows and columns, so the few states become the columns and the many character classes become the rows:

static State Table[5][3] = {
           /* S_START,    S_FOO,    S_BAR */
/*digit*/  {    S_FOO,    S_FOO,  S_ERROR },
/*alpha*/  {  S_START,    S_FOO, S_ACCEPT },
/*+|-*/    {  S_START, S_ACCEPT, S_ACCEPT },
/*EOF*/    { S_ACCEPT, S_ACCEPT, S_ACCEPT },
/*other*/  {  S_START,  S_ERROR,  S_ERROR }
}

If you do swap the rows and columns, make sure you swap the Table subscripts when you use the table!

return Table[row][currentState];