Panic-Mode Error Recovery

This page last updated: Sunday September 27, 1998 01:07

Error Handling in nested functions

Read Aho section 4.1 regarding panic-mode as a method of compiler error recovery.

A simple panic-mode error handling system requires that we return to a high-level parsing function when a parsing or lexical error is detected.  The high-level function re-synchronizes the input stream by skipping tokens until a suitable spot to resume parsing is found.  For a grammar that ends statements with semicolons, the semicolon becomes the synchronizing token.

We add error-handling code to all the parsing functions so that when they detect a parsing error, they return FALSE instead of exiting.  Each caller checks the return code of every parsing function it calls, and returns FALSE itself when any of them returns FALSE.  The high-level parsing function detects the FALSE return and re-synchronizes the input stream by skipping tokens.

All the parsing functions become Boolean functions. Each parsing function may succeed, in which case we continue parsing, or fail, in which case we stop parsing and return the failure indication to the calling function. For example:

static Boolean expression(void){
   CALL term(), return FALSE if it fails
   WHILE TYPEOFTOKEN is PLUS or is MINUS DO
      get the next token
      CALL term(), return FALSE if it fails
   END WHILE
   return TRUE           -- parsing succeeded so far
}

static Boolean factor(void){
   SWITCH TYPEOFTOKEN
   CASE ID:
      get the next token
      return TRUE        -- parsing succeeded so far
   CASE CONST:
      get the next token
      return TRUE        -- parsing succeeded so far
   CASE LEFTPAREN:
      get the next token
      CALL expression(), return FALSE if it fails
      CALL match(RIGHTPAREN), return FALSE if it fails
      return TRUE        -- parsing succeeded so far
   DEFAULT:
      break              -- unexpected token: report the error below
   END SWITCH

   eprintf("File %s Line %ld: Expecting %s, %s, or %s;"
      " found: %s '%s'",
         filename,
         LINENUMBER,
         tokenType(ID),
         tokenType(CONST),
         tokenType(LEFTPAREN),
         tokenType(TYPEOFTOKEN),
         LEXEMESTR );
   return FALSE
}
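
As a rough illustration, the pseudocode above might be written in C along the following lines. This is only a sketch under stated assumptions, not the course's actual code: the scanner is replaced by a hypothetical pre-tokenized array, the token names carry a TOK_ prefix (so that the end-of-input token does not clash with the EOF macro from <stdio.h>), term() is folded into expression() for brevity, parseExpression() is an invented entry point, and error messages print numeric token codes instead of calling tokenType().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical token kinds; the real scanner's names may differ. */
typedef enum {
    TOK_ID, TOK_CONST, TOK_PLUS, TOK_MINUS,
    TOK_LEFTPAREN, TOK_RIGHTPAREN, TOK_EOF
} TokenType;

/* Stand-in for the scanner: an array of tokens and a cursor. */
static const TokenType *tokens;
static size_t pos;

static TokenType typeOfToken(void) { return tokens[pos]; }
static void getNextToken(void) { if (tokens[pos] != TOK_EOF) pos++; }

static bool expression(void);

/* Consume the expected token, or report an error and fail. */
static bool match(TokenType expected) {
    if (typeOfToken() != expected) {
        fprintf(stderr, "Expecting token %d; found token %d\n",
                (int)expected, (int)typeOfToken());
        return false;
    }
    getNextToken();
    return true;
}

static bool factor(void) {
    switch (typeOfToken()) {
    case TOK_ID:
    case TOK_CONST:
        getNextToken();
        return true;                      /* parsing succeeded so far */
    case TOK_LEFTPAREN:
        getNextToken();
        if (!expression()) return false;  /* propagate failure upward */
        return match(TOK_RIGHTPAREN);
    default:
        fprintf(stderr, "Expecting ID, CONST, or '('; found token %d\n",
                (int)typeOfToken());
        return false;
    }
}

/* term() is omitted in this sketch; expression() calls factor() directly. */
static bool expression(void) {
    if (!factor()) return false;
    while (typeOfToken() == TOK_PLUS || typeOfToken() == TOK_MINUS) {
        getNextToken();
        if (!factor()) return false;
    }
    return true;
}

/* Parse one expression from a token array terminated by TOK_EOF. */
static bool parseExpression(const TokenType *input) {
    tokens = input;
    pos = 0;
    return expression();
}
```

Every call to a parsing function is followed by a test of its result, which is exactly the bookkeeping that the Boolean approach requires.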

Re-synchronizing the input by finding the semicolon

To complete the panic-mode error recovery, some upper-level parsing function must detect the failure to parse and skip forward until an appropriate re-synchronizing token is found.  Here is an example.

The function below is the top-level (root) parsing function for a parser that recognizes assignment statements and print statements that end in semicolons. On error, we call a panic() function to re-synchronize the input. We return the number of times that panic() was called, so that our calling function can print it.

Each parsing function is responsible for issuing an error message at the point where it detects a syntax error.

static int
doParsing(void){

   initialize errorcounter to zero

   WHILE TYPEOFTOKEN is not EOF DO
      SWITCH TYPEOFTOKEN
      CASE ID:        -- ID is in the FIRST set of assignment()
         returnStatus = assignment()
         break
      CASE PRINT:     -- PRINT is in the FIRST set of print()
         returnStatus = print()
         break
      CASE ...
          -- Other cases can go here, for other statement types
          break
      DEFAULT:
         eprintf("File %s Line %ld: Expecting %s or %s;"
            " found: %s '%s'",
               filename,
               LINENUMBER,
               tokenType(ID),
               tokenType(PRINT),
               tokenType(TYPEOFTOKEN),
               LEXEMESTR );
         returnStatus = FALSE
         break
      END SWITCH

      IF returnStatus is FALSE THEN
         CALL panic()
         increment errorcounter
      ENDIF
   END WHILE
   return errorcounter
}

A semicolon is a good re-synchronizing token for the grammars used in this course. Another good strategy would be to skip tokens and stop just before a reserved word that starts a statement.  For debugging purposes, we print the type and value of every token we skip over.

Once the semicolon is found, one more token of lookahead is read; this prepares the parser to resume parsing after the statement that contained the syntax error. The code looks like this:

static void
panic(void){
   WHILE TYPEOFTOKEN is not SEMICOLON and is not EOF DO
      eprintf("File %s Line %ld: Skipping over %s '%s'",
         filename, LINENUMBER, tokenType(TYPEOFTOKEN), LEXEMESTR);
      get next token
   END WHILE
   eprintf("File %s Line %ld: Skipped to %s '%s'\n",
      filename, LINENUMBER, tokenType(TYPEOFTOKEN), LEXEMESTR);
   IF TYPEOFTOKEN is SEMICOLON THEN
      get next token
   ENDIF
}
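
To see the top-level loop and panic() working together, here is a small self-contained C sketch. Its assumptions are not from the course code: the scanner is a hypothetical token array, token names carry a TOK_ prefix, the statement grammar is reduced to "ID = operand ;" and "PRINT operand ;", doParsing() takes the token array as a parameter, and messages print numeric token codes. The debugging output for skipped tokens is left out to keep it short.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical token kinds for a tiny statement grammar. */
typedef enum {
    TOK_ID, TOK_CONST, TOK_ASSIGN, TOK_PRINT, TOK_SEMICOLON, TOK_EOF
} TokenType;

/* Stand-in for the scanner: an array of tokens and a cursor. */
static const TokenType *tokens;
static size_t pos;

static TokenType typeOfToken(void) { return tokens[pos]; }
static void getNextToken(void) { if (tokens[pos] != TOK_EOF) pos++; }

static bool match(TokenType expected) {
    if (typeOfToken() != expected) {
        fprintf(stderr, "Expecting token %d; found token %d\n",
                (int)expected, (int)typeOfToken());
        return false;
    }
    getNextToken();
    return true;
}

static bool operand(void) {
    if (typeOfToken() == TOK_ID || typeOfToken() == TOK_CONST) {
        getNextToken();
        return true;
    }
    fprintf(stderr, "Expecting ID or CONST; found token %d\n",
            (int)typeOfToken());
    return false;
}

/* assignment: ID ASSIGN operand SEMICOLON */
static bool assignment(void) {
    return match(TOK_ID) && match(TOK_ASSIGN) && operand()
        && match(TOK_SEMICOLON);
}

/* print: PRINT operand SEMICOLON */
static bool print(void) {
    return match(TOK_PRINT) && operand() && match(TOK_SEMICOLON);
}

/* Skip to the next semicolon (or end of input), then step past it. */
static void panic(void) {
    while (typeOfToken() != TOK_SEMICOLON && typeOfToken() != TOK_EOF)
        getNextToken();
    if (typeOfToken() == TOK_SEMICOLON)
        getNextToken();
}

/* Parse all statements; return the number of times panic() was called. */
static int doParsing(const TokenType *input) {
    int errorCounter = 0;
    tokens = input;
    pos = 0;
    while (typeOfToken() != TOK_EOF) {
        bool ok;
        switch (typeOfToken()) {
        case TOK_ID:    ok = assignment(); break;
        case TOK_PRINT: ok = print();      break;
        default:
            fprintf(stderr, "Expecting ID or PRINT; found token %d\n",
                    (int)typeOfToken());
            ok = false;
            break;
        }
        if (!ok) {
            panic();
            errorCounter++;
        }
    }
    return errorCounter;
}
```

Note that a statement with a missing semicolon causes panic() to skip the following statement as well, since the next semicolon found belongs to that statement. This loss of input is the usual price of panic-mode recovery.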

Advanced Topic: Non-local GOTOs in C Language

Handling errors the Boolean way shown above, where we must test the return code of every function and propagate failures all the way back up a nested call sequence, complicates the code and makes it harder to read and maintain. An alternative implementation might use the setjmp() and longjmp() library functions to return directly to the error-handling code, without requiring all the intermediate functions to return TRUE/FALSE.
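
A minimal sketch of how that alternative could look follows. It is an illustration only, under assumptions that are not from the course code: a hypothetical token array replaces the scanner, token names carry a TOK_ prefix, the grammar is reduced to expressions followed by semicolons, and syntaxError() and recoveryPoint are invented names. The parsing functions return void; any of them can abandon the whole statement by calling longjmp().

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical token kinds; the real scanner's names may differ. */
typedef enum {
    TOK_ID, TOK_CONST, TOK_PLUS, TOK_LEFTPAREN, TOK_RIGHTPAREN,
    TOK_SEMICOLON, TOK_EOF
} TokenType;

static const TokenType *tokens;   /* stand-in for the scanner */
static size_t pos;
static jmp_buf recoveryPoint;     /* set by doParsing() before each statement */

static TokenType typeOfToken(void) { return tokens[pos]; }
static void getNextToken(void) { if (tokens[pos] != TOK_EOF) pos++; }

/* Report the error, then jump straight back to setjmp() in doParsing(). */
static void syntaxError(const char *expected) {
    fprintf(stderr, "Expecting %s; found token %d\n",
            expected, (int)typeOfToken());
    longjmp(recoveryPoint, 1);    /* does not return */
}

static void match(TokenType expected, const char *name) {
    if (typeOfToken() != expected)
        syntaxError(name);
    getNextToken();
}

static void expression(void);

static void factor(void) {
    switch (typeOfToken()) {
    case TOK_ID:
    case TOK_CONST:
        getNextToken();
        break;
    case TOK_LEFTPAREN:
        getNextToken();
        expression();             /* no return code to test */
        match(TOK_RIGHTPAREN, "')'");
        break;
    default:
        syntaxError("ID, CONST, or '('");
        break;                    /* not reached */
    }
}

static void expression(void) {
    factor();
    while (typeOfToken() == TOK_PLUS) {
        getNextToken();
        factor();
    }
}

/* Skip to the next semicolon (or end of input), then step past it. */
static void panic(void) {
    while (typeOfToken() != TOK_SEMICOLON && typeOfToken() != TOK_EOF)
        getNextToken();
    if (typeOfToken() == TOK_SEMICOLON)
        getNextToken();
}

/* Parse "expression ;" statements; return the number of recoveries. */
static int doParsing(const TokenType *input) {
    /* volatile is a precaution for a local that lives across setjmp() */
    volatile int errorCounter = 0;
    tokens = input;
    pos = 0;
    while (typeOfToken() != TOK_EOF) {
        if (setjmp(recoveryPoint) == 0) {
            expression();                 /* normal path */
            match(TOK_SEMICOLON, "';'");
        } else {
            panic();                      /* arrived here via longjmp() */
            errorCounter++;
        }
    }
    return errorCounter;
}
```

The trade-off is that longjmp() skips any cleanup the intermediate functions might have needed to do, so this style suits parsers whose functions hold no resources of their own.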

More on this, another day.