Updated: 2003-01-19 05:50

Improving and Fixing C Code

This page last updated: Sunday January 19, 2003 05:50

Here are some examples of poor (or wrong) code, with occasional suggested improvements. Most of these are taken from actual student assignments in the 4th term course in Compilers:

Walking off the end of arrays

This code is an example of a very common error:

    char x[SIZE];
    int cursize = 0;
 
    for(;;){
        x[cursize++] = 'a';    // put character in buffer, increment
        x[cursize] = '\0';     // WRONG!  keep buffer NUL terminated
        if( cursize == SIZE )
            break;             // exit loop when no more room
    }

The programmer writes one past the end of the array by failing to realize that you can't safely say x[SIZE]='\0' if x is declared as x[SIZE]. Testing that you have written past the end of the array after having made the error is too late; the program is doomed to fail once the memory is overwritten.

Another bad example:

/* The wrong way to strip trailing blanks */
strip( char *str ){
    char *p = strchr(str,'\0');   // find NUL at end of string
    --p;                          // back up in front of NUL
    while( isspace(*p) )
        *p-- = '\0';              // overwrite space with NUL    
}

	If the passed-in string contains only blanks, nothing in the above code stops the loop from decrementing the loop pointer right past the beginning of the string.
	If the passed-in string is empty, the loop pointer makes its first test on the character before the start of the string.

Either of these problems can lead to a program abort. Do not use a pointer to address memory that is before or after the end of the string; that memory may not be accessible.

Note that a pointer may safely contain an address that is before or after either end of the string. What is not permitted is to use that pointer to address memory. The following is a safe use of a pointer past the end of the string, since the pointer is moved inside the string before it is used to address memory:

char *p = strchr(str,'\0') + 1;   // one past end of string
while( p > str )
   *--p = 'x';                    // overwrite all chars with 'x'

Redundant return statements

Look over the following four lines of a compound IF statement:

    if ( ptr != NULL )
        return ptr;
    else
        return NULL;

and you'll see that it can be completely replaced with a single line that returns the same values:

    return ptr;

Here's another example:

    if( (ptr=malloc(size)) != NULL ){
        do things with ptr
        return ptr;
    } else
        return NULL;

The above code is also redundant. The compound IF can be replaced by a simple IF, and the duplicate return statements can be made into a single line:

    if( (ptr=malloc(size)) != NULL ){
        do things with ptr
    }
    return ptr;    /* may be NULL */

Another example:

    if( a == SOME_THING ){
        print something about a
        return a;
    }
    else {
        do something else
    }

The "else" branching structure can be removed, since the code behaves exactly the same way without it because of the "return" statement in the first part of the if statement:

    if( a == SOME_THING ){
        print something about a
        return a;
    }
    do something else

Test before you store or index an array

The incorrect code below stores the integer into the array before it tests to see if there is room for the item to be stored. This can fail if no space is left in the array, causing memory to be overwritten and the program to fail:

    ptr[used++] = i;    /* WRONG! stores before testing */
    if ( used == allocated ){
        ptr = myrealloc(ptr,++allocated * sizeof(*ptr));
    }

Always test to make space first, arrange for enough space if needed, and then store the item in the array.

A similar problem happens when looping up toward either end of an array. The following code is incorrect:

p = str + strlen(str);
while( *p != 'x' && p >= str )  /* WRONG */
   --p;

If 'x' is not found in the string, the loop indexes str[-1], which is off the end of the array and may cause a processor memory fault. The correct code does the test for the end of the array first, and the index second:

while( p >= str && *p != 'x' )  /* CORRECT */
   --p;

Don't cast that data item

Casting of pointers and values can cause problems in your code. Remove them and rework the code so that it works cleanly without the use of casts. Using an ANSI compiler, even the return from malloc() doesn't need to be cast.

An example of an incorrect use of casting:

    int ch;
    while( ((char)ch = getchar()) != '\n' ){  /* WRONG! */
    /* ... */

Here, the programmer is unaware that C always promotes characters to integers when used in expressions.

The programmer tells the compiler to chop the integer return value from getchar() down to character size, then the compiler has to widen it back to integer to compare it with the '\n' that is also widened to be an integer. Had the programmer left out the cast, the integer returned from getchar() would be directly usable in the comparison with the integer widening of '\n'.

EOF is an integer, not a character

The following code was intended to copy standard input to standard output, stopping only at end-of-file. It doesn't work on files containing the character 255 (0xFF):

    char ch;    /* WRONG! */
    
    while( (ch=getchar()) != EOF )
        putchar(ch);

On most architectures, characters are signed characters. They behave exactly like 8-bit integers, and they can be negative if the sign bit is turned on. The character 0xFF (decimal 255) is one such "negative" character, and when it is read from a file and stored into a character variable and then compared with the integer constant EOF (which is -1, or 0xFFFF), the comparison succeeds. The loop stops as soon as it reads the character 0xFF, not when it reaches end-of-file.

The character 0xFF and the integer 0xFFFF compare equal because the compiler widens the character 0xFF to the integer 0xFFFF by sign-extending the top bit into all the upper 8 bits of the 16-bit integer. Thus, 0xFF becomes 0xFFFF, and 0xFFFF == 0xFFFF and the loop stops. (Remember that C always widens characters into integers when performing arithmetic or comparisons.)

If the programmer had correctly stored the return value from getchar() into an integer, the character 0xFF would be stored as 0x00FF and it would not compare equal to 0xFFFF. If ch is declared correctly as an integer, the loop stops only when getchar() returns the integer 0xFFFF (EOF).

The following code won't work, for the same reason as above (EOF can't fit into a character):

char buf[MAX];
for( i=0; i<MAX; i++ ){
   buf[i] = getchar();
   if( buf[i] == EOF )   /* WRONG */
      break;
}

Using typedef

What was this programmer thinking?

    typedef struct Item {
        int i;
        double j;
    } Item;
    ...
    ptr = mymalloc(sizeof(struct Item));  /* Bad style */

The programmer defined an "Item" type to be an Item structure, then forgot to use the "Item" type in the sizeof() call, using the structure name instead.

If you use typedef to define a new data type, remember to use it everywhere: sizeof(Item).

Less than what?

Each of the following two if statements has exactly the same meaning. Which is simpler and easier to understand?

    char x[SIZE];
    
    if( i <= SIZE-1 )
        x[i] = '\0';
 
    if( j < SIZE )
        x[j] = '\0';

Stop on fatal errors

This programmer did an excellent job of detecting and reporting the error when opening a file:

    fd = fopen(file, "rt");
 
    if( fd == NULL )
        fprintf(stderr, "'%s': Cannot open for reading; reason: %s\n",
            file, strerror(errno));
 
    while( (ch=getc(fd)) != EOF )
        putchar(ch);

Unfortunately, the programmer did not remember to stop the program after this fatal error by using exit() or return(). For some unexplained reason, the program continues on, incorrectly using the NULL pointer in the call to getc().

If a fatal error happens, stop. Do not blunder on as if nothing unusual has happened.

Don't use garbage in stack variables

This programmer forgot that local variables have undefined, garbage values in them:

    int ch;
    
    while( ch != '\n' && ch != EOF ){   /* WRONG! */
        putchar(ch);
        ch = getchar();
    }

If the programmer is lucky, ch will only have a simple garbage value that prints an ugly blotch on the terminal screen. One day though, ch will have the random garbage value -1 (EOF) or the value 10 ('\n'), and the loop won't even execute at all.

Rewrite the loop so that the character is obtained before testing it, and so that only good ASCII characters are sent to the terminal. (Don't make the mistake of sending the EOF integer (-1) to the terminal! EOF is not a printable character.)

Another example of uninitialized storage:

Node *e;
e->next = NULL;         /* WRONG! */

The node e points to garbage. Using a pointer to garbage as a structure pointer variable will cause memory errors.

Compound logical operations

This programmer made sure to assign to the local variable before testing its value:

    int ch;
    
    do {
        ch = getchar();
        putchar(ch);
    } while( ch != EOF || ch != '\n' );  /* WRONG! */

Unfortunately, the programmer made two common mistakes.

At some point, getchar() will return the integer EOF (-1). Sending this -1 to the terminal screen as a character using putchar() is not a good thing to do. (Storing it in a buffer or character string is also incorrect.)
The logic in the while loop is wrong. If ch is not EOF, the first half of the condition is TRUE and the loop continues. If the character is EOF, the first half is FALSE and the second half must be TRUE, and the loop continues. No matter what character is read, the loop continues and never stops.

One way to arrive at the correct logical test is to consider when we want the loop to stop. We want the loop to stop when we reach end-of-file or when the character read is the newline, that is, stop when: ch == EOF || ch == '\n'. If this is the stopping condition, we need its logical opposite for the going condition in the while() loop. We can either invert the whole condition by using the logical NOT operator:

while( ! (ch == EOF || ch == '\n') );

or we can use deMorgan's theorem to invert the separate halves, giving:

while( ch != EOF && ch != '\n' );

Both logical conditions are the same; either is correct.

Don't close files that are not successfully opened

This programmer printed great error messages:

    fd = fopen(file,"rt");
    if( fd != NULL ){
        process(file, fd);
    }
    else{
        fprintf(stderr,"'%s': Cannot open for reading; reason: %s",
            file, strerror(errno));
    }
    fclose(fd);   /* WRONG! */

The problem with this code is that it will incorrectly use fclose() on a NULL pointer if the file does not open properly. Don't close files that are not actually open. (Move the fclose() right after the process() function call, where we know that fd is not NULL.)

Don't point to local storage in a function

This programmer created a function to turn an integer into a hexadecimal character string:

    #define HEXBUFSIZE 20   /* character width to print a hex number */
        static char *
    makehex( int i ){
        char hex[HEXBUFSIZE];
    
        sprintf(hex,"0x%x", i);
        
        return hex;         /* WRONG! return pointer to hex string */
    }

This programmer doesn't know how local variables work in C. The local variable hex goes away when this function returns. The pointer returned by the function will be useless, since it will point to stack storage that will sooner-or-later be re-used and overwritten. (Good compilers may tell you about this error.)

The solution here might be to use malloc() to get the storage for the string, in which case the calling program would have to remember to free() it; or, to use static storage that isn't allocated on the run-time stack, in which case the calling program needs to take other precautions. See the next section on using static storage.

Using static storage in functions

Given this improved re-implementation of the makehex() function used above:

    #define HEXBUFSIZE 20   /* character width to print a hex number */
        static char *
    makehex( int i ){
        static char hex[HEXBUFSIZE];  /* note use of STATIC */
    
        sprintf(hex,"0x%x", i);
        
        return hex;         /* return pointer to static hex string */
    }

What is the output of this print statement?

    printf( "%d is %s and %d is %s\n",
        1, makehex(1), 2, makehex(2) );

The answer depends on the order in which your compiler evaluates arguments to functions; but, it is one of the following two incorrect possibilities:

1 is 0x1 and 2 is 0x1 (wrong!)
1 is 0x2 and 2 is 0x2 (wrong!)

The expected output, 1 is 0x1 and 2 is 0x2, can't happen. Before printf() can be called, both calls to makehex() have to be made to build the argument list for printf(). Because both calls to makehex() print into the same static storage buffer and return the address of the same static storage to printf(), that storage either has 0x1 in it (if makehex(1) was computed last), or it has 0x2 in it (if makehex(2) was computed last).

Watch out for static storage.

Writing error messages: Whose fault is it?

Here is a typical junior programmer error message:

    if( x <= 0 )
        printf("***\007\007 ILLEGAL INPUT -- RETYPE ***\n");

Junior programmers often write error messages that *** YELL AT THEIR USERS!!! ***, make loud beeping noises, and blame the users for mistakes. They don't tell the users what input the program actually received, what is wrong with the input, nor do they offer examples of what correct input might be.

Here's what senior programmers have learned to say:

    #define MIN 0
    if( x <= MIN ){
        printf("%s: Unrecognized input number %d\n", argv[0], x);
        printf("Please enter an integer greater than %d\n", MIN);
    }

	The error message prints the name of the program issuing the error (argv[0]).
	The error message does NOT YELL AT THE USER. The text is lower-case. No BEL characters are included that make loud noises. No exclamation marks.
	The error message echoes back the data value received by the program, so that typing errors can be noticed by the user of the program. (The user might have accidentally typed -5 instead of 15.)
	The error message tells what the correct input might be, using variables or manifests (MIN) to ensure that the same numbers are used in the test as in the error message.

Most importantly, the error message takes the point of view that the input was "unrecognized" by the program; that is, that the program is at fault for not being clever enough to recognize what the user meant. The message does not lay the blame on the user.

Always write your error messages from the point of view that the user knows best, and any input that doesn't meet the input criteria show faults in the program, not in the user.

Using pointers that point nowhere

This is a common error:

    Foo *ptr;            /* pointer contains garbage */

    ptr->thing = value;  /* WRONG!  trying to use garbage */

You must not use a pointer until you point it at something. The above code takes the value of ptr (garbage) and then offsets from that garbage memory location a few bytes to the thing structure element. It then copies value into that memory.

Remember, C treats ptr->thing the same way as (*ptr).thing, which may make it clearer why you must have a valid value in ptr before you use it.

With luck, your program will die immediately when you use an uninitialized pointer. Without luck, you will overwrite some part of memory that you won't discover until later.

A pointer variable is no different from an integer variable. You must set it before you use it.

Freeing the wrong memory

This program demonstrates stripping leading blanks from a string:

#include <stdlib.h>
#include <stdio.h>
#include <ctype.h>
#include <string.h>

#define SIZE 100

   int
main(void){
   char *ptr = my_malloc(SIZE);  /* cover function for malloc */

   strncpy(ptr,"          leading blanks", SIZE);

   while( isspace(*ptr) )
   	++ptr;

   printf("No blanks: '%s'\n", ptr);

   free(ptr);  /* WRONG! */

   return 0;

}

Note how the programmer has received a particular pointer value from malloc(); but, has passed a different pointer value to free(). Perhaps the program will die right away; perhaps it won't die until much later. (What it will not do is free only part of the memory.)

The only values you may safely pass to free() are non-NULL values you got from malloc(). If you need to adjust the value of the pointer returned by malloc(), make sure you save a copy for later use by free().

Freeing the same memory twice

Some programmers confuse pointers and the memory the pointers point to. The system library function malloc() allocates memory and returns the address of that memory. Your program stores that address into one or more pointers:

   char *ptr1 = malloc(SIZE);  /* one block of memory allocated here  */
   char *ptr2 = ptr1;          /* ptr2 points to the *same* memory    */
   char *ptr3;

   ptr3 = ptr2;                /* ptr3 also points to the same memory */

   free(ptr3);   /* this frees the block of memory pointed to by ptr3 */
   free(ptr1);   /* WRONG!  tries to free the same block twice */
   free(ptr2);   /* WRONG!  tries to free the same block a third time */

In this program fragment, there is only one block of memory allocated, yet the programmer is trying to free that block three times. All three pointers contain the same address; all three pointers therefore point to the same memory. Any one of the three pointers would do in a call to free(). Only the first call to free() works; the other two attempt to free the same block over again and may cause your program to abort.

What does free do?

Many C programmers forget what the argument to free() is. It is a memory address, probably 2 or 4 bytes long. That address is the start of the block of memory returned by, say, malloc(), realloc(), or strdup().

To free memory, you pass the address of the start of the memory. You do not pass the entire block of memory:

Blort *p = malloc(sizeof(Blort));  // p contains a 4-byte address

free(p);                 // correct; copies address in p down into free()
free(&p);                // WRONG! passes address of p, not address in p
free(*p);                // WRONG! tries to copy whole structure to free()

Since arguments in C language are passed by value, it is a copy of the 4-byte address in p that is sent into free(). When free() has freed the memory at that address, the 4-byte pointer p remains untouched and unchanged; it still has the old memory address in it. If you said free(p) again, it would again take a copy of the old memory address, send it down to free(), and free() would try to free that memory a second time. (This would most likely cause your program to die.)

Writing cover functions (e.g. my_malloc)

As an experienced programmer, you now always check the return of your malloc(), realloc(), and strdup() function calls to make sure they are not the NULL pointer. For example:

struct Foo *foop = malloc(sizeof(struct Foo));
if( foop == NULL ){
    eprintf("Out of memory getting %lu bytes.  Program exiting.",
        (unsigned long)sizeof(struct Foo) );
    exit(RET_FAILURE);
}

foop->buffer = malloc(MY_SIZE);
if( foop->buffer == NULL ){
    eprintf("Out of memory getting %lu bytes.  Program exiting.",
        (unsigned long)MY_SIZE );
    exit(RET_FAILURE);
}

But, when you do this, your code has much duplicated error checking in it. The error checking takes up more space than the actual algorithm.

To make the code simpler, write a cover function for each of these library routines that does the error checking. Here is the cover function for malloc():

/* Cover function for malloc().  Returns a good pointer,
 * or else the function prints a message and exits the program.
 */
    static void *
my_malloc( size_t size /*use same type as library function malloc*/ ){
    void *ptr = malloc( size );

    if( ptr == NULL ){
        eprintf("Out of memory getting %lu bytes.  Program exiting.",
            (unsigned long)size );
        exit(RET_FAILURE);
    }
    return ptr;
}

The other cover functions, for realloc() and strdup(), would look similar to my_malloc(). Their arguments are exactly the same number and type of arguments as the library functions that they cover.

Now, your code is much simpler and easier to read, since the error checking doesn't dominate your code:

struct Foo *foop = my_malloc(sizeof(struct Foo));

foop->buffer = my_malloc(MY_SIZE);

Using cover functions that cause your program to exit is only appropriate if the type of error is serious enough that your program should exit on error. You shouldn't do this for functions whose return values you want to check, e.g. fopen(). (Having your program exit simply because it couldn't open the given file name would be rude.)

How wide are your arguments to printf?

The width of the typedef size_t argument to malloc() depends on the compilation environment. Sometimes size_t is an unsigned integer; sometimes it is an unsigned long integer. This makes it difficult to print in a straightforward manner:

size_t size = MY_SIZE;
printf("1. The value is %u\n", size);  /* WRONG!  NON PORTABLE! */
printf("2. The value is %lu\n", size); /* WRONG!  NON PORTABLE! */
printf("3. The value is %lu\n", (unsigned long)size); /* GOOD */

The first printf() statement fails if size_t is wider than an unsigned integer; it doesn't match the '%u' that expects an unsigned integer. The second printf() statement fails if size_t is narrower than an unsigned long integer (as it is on many PC architectures); it doesn't match the '%lu' that expects an unsigned long integer. The third statement always works, because it casts the size to an unsigned long integer so that it always matches the '%lu' of the printf() call.

In general, you must only use arguments to printf/scanf and similar functions if you know how wide they are, so that the width of the arguments match the '%' format specifiers used in the format string. Here is another example:

printf("1. The value is %u\n", MY_CONST);  /* WRONG!  NON PORTABLE! */
printf("2. The value is %lu\n", MY_CONST); /* WRONG!  NON PORTABLE! */
printf("3. The value is %lu\n", (unsigned long)MY_CONST); /* GOOD */

For the same reasons as above, the first two printf() statements may fail depending on the width of the constant MY_CONST. MY_CONST might be defined by the programmer to be any of 1, 32767, 65535, or 123456L (a long int). Only if the width of the constant matches the width of the printf() format will the print statement work. By casting the argument to a known width (an unsigned long integer), it is guaranteed always to match the '%lu' specification in the printf(). If you don't do the cast, some values of MY_CONST will cause the print statement to fail.

Another example:

char *a = "A string.";     /* a string with a blank */
char *b = strchr(a,' ');   /* find first blank */
printf("First word is %.*s\n", b-a, a);   /* WRONG!  NON PORTABLE! */
printf("First word is %.*s\n", (int)(b-a), a); /* GOOD */

On many architectures, subtracting pointers yields a long integer result. The field width specifier to match the '*' in the printf call expects an integer. The result is to mis-align the following string pointer, and usually cause a fault. This happens in the following example, too, if integers and long integers are different sizes:

long int foo = 9;  char *bar = "hello";
printf("Number %d string %s\n", foo, bar);    /* WRONG! */
printf("Number %ld string %s\n", foo, bar);   /* GOOD */

The programmer forgot to say '%ld' in the first printf, causing printf to pick up only two of the four bytes of the "foo" argument. When printf went to get the four-byte pointer for the string, it picked up the remaining two bytes of the integer and then the next two bytes from the pointer. The result is usually a fault.

Writing code that always works isn't easy.

C Language Modules

To hide data and functions in C, you have to put the data and the functions that use them in the same source file and use the static keyword to confine the scope of the identifiers to the source file.

Consider this main program that uses a module (a source file) named module.c containing the public functions init() and dump():

File main.c:

#include "module.h"          /* prototypes for public functions */
    int
main(void){
    init(123, "Some data");  /* pass in the initializing data */
    doStuff();               /* call a function in the module */
    term();                  /* call a function in the module */
    return EXIT_SUCCESS;
}

File module.c:

#include "module.h"          /* make sure prototypes match */

/* Here are some static data, local to this source file.
 * The "static" keyword protects these names.
 * These static items are only accessible within this source file.
 */
static int localint;             /* local storage */
static char *localstr;           /* local storage */
static int localfunc( int i );   /* forward prototype declaration */

/* This initialization function brings in data from outside the
 * module and saves it in variables local to this source file.
 * This function is "public"; it is not declared "static", so
 * it can be called from outside the module.  A prototype for
 * this function must be put in "module.h".
 * It uses a static utility function that is not visible and
 * not callable outside of this source file.  
 */
    void
init( int num, char *str ){
    localint = localfunc(num);   /* call utility function */
    localstr = str;              /* save the data here */
}

/* This termination function undoes anything done by init().
 * This function is "public"; it is not declared "static", so
 * it can be called from outside the module.  A prototype for
 * this function must be put in "module.h".
 */
    void
term( void ){
    localint = -1;
    localstr = NULL;             /* return ptr to NULL state */
}

/* This function uses the static data saved by the init() function.
 * This function is "public"; it is not declared "static", so
 * it can be called from outside the module.  A prototype for
 * this function must be put in "module.h".
 */
    void
doStuff( void ){
    printf("The saved values are %d and %s\n",
        localint, localstr);
}

/* This function is hidden in this file by the "static" keyword.
 * It is "private": only functions in this file may use it.
 * This function must *NOT* be prototyped in "module.h".
 */
    static int
localfunc( int num ){
    return num * 2;
}

Most modules require an initialization function that initializes or allocates data structures used by the module. In the above example, the initialization function brings external data into the module and saves it for later use by another public function in the module.

If the initialization function does things that need later undoing, the termination function does the undoing. Typical uses for the termination function are:

writing out statistics or completed data structures

freeing dynamic memory allocated by init()

closing any files opened by init()

restoring the state of data structures so that init() may be called again

The data structures and local utility functions used by the source file module are protected from access by functions outside the module by declaring them as "static". The "static" keyword confines the names to the one source file in which they are compiled. Only the "public" functions have prototypes in the "module.h" include file. The "module.h" include file is included in module.c to make sure that the prototypes for the public functions all match the definitions at compile time.

Understand your Compiler and Linker

Here are some common errors and warnings that people get when compiling their projects:

Warning BUFFER.CPP 42: Possibly incorrect assignment in function

The compiler is warning you about bad style in a logical condition, such as:

    if( str = (char *)malloc(.....) ){
The compiler thinks perhaps you meant to say the following, but made a typing mistake:
    if( str == (char *)malloc(.....) ){
Variable str is not a Boolean; it's a pointer. Don't write the code as if str were a Boolean value. Make the test for non-NULL explicit and the compiler warning goes away:
    if( (str=(char *)malloc(.....)) != NULL ){
The above code makes it clear that str is a pointer. Both the compiler and other readers of your code won't complain about it.

Linker Error: Undefined symbol _bf_free in module MAIN.C

Your main.c file has code that calls a function named bf_free that has no visible definition in any of your source files. The code compiles, so you probably have the correct function prototypes included; but, the linker can't find the compiled function body to link to.

For a function definition to be visible to the main.c source file, the function body has to be either:

local to the main.c source file, or

global to all source files

Some people forget that the static keyword hides a function definition and makes it local to the source file in which it is compiled.

What is wrong with writing: `printf(prompt)`?

What is wrong with this code?

func( char *prompt ){ printf(prompt); ... }

The above code works fine until the day someone calls it with a prompt containing a printf format specification tag, perhaps similar to this:

func("Please enter the number of %s in your address:");

The prompt string gets passed to printf, which interprets the %s as a string format argument, for which there is no corresponding string:

printf("Please enter the number of %s in your address:");

Your program will either fault or print garbage. The correct, and only safe way to print a string using printf is this way:

printf("%s", prompt);

An alternate way to print the prompt string is not to use printf at all:

puts(prompt);

A final observation is that prompts should always appear on the terminal and not be redirected into output files when command line redirection is being used, so the prompt really should be sent to the standard error unit, not to standard output:

fprintf(stderr,"%s",prompt); fputs(prompt,stderr);

Mutually recursive #include files

Some students have #include files that include each other recursively. For example, you might have a file of string function prototypes or data structures that use some buffer data types, and a file of buffer function prototypes or data structures that use some of the string data types. Thus, the file mystring.h might need to #include "buffer.h", and the file buffer.h might need to #include "mystring.h".

This doesn't work.

At worst, the preprocessor will get into a loop trying to read either of the two files. At best, if you remember to put #ifndef _BUFFER_H_ in front of the buffer.h file and #ifndef _MYSTRING_H_ in front of the mystring.h file, then one of the two files will be only partially processed when the other file is #included, causing the #included file to generate errors about unknown data types and invalid declarations.

Rework your source code to eliminate this mutual recursion.

sizeof() vs. strlen()

The following code compiles but doesn't work:

char *p; char *q = "Hello There"; p = (char *)malloc(sizeof(q)+1); /* WRONG */ strcpy(p,q);

The sizeof() operator gives the compile-time size of a C-language data structure. It does not compute the run-time length of a string. The above code allocates five bytes of memory (four for the size of the pointer, plus one), then dies trying to copy 13 bytes of a string into it.

Don't confuse sizeof() with strlen().

sizeof() and malloc()

The following code compiles but doesn't work:

char *p; int numbytes = 100; p = (char *)malloc(sizeof(numbytes)); /* WRONG */ strcpy(p,"You are confused about what sizeof() does");

The argument to malloc() should be the number of bytes of memory to reserve. The sizeof() operator returns the compile-time size of its argument. The size of numbytes is the size of a C-language integer; that is 2 bytes on most PC architectures and 4 bytes elsewhere. Thus, the above code allocates only 2 or 4 bytes of memory and then copies a long string into that memory. This overwrites and corrupts the memory pool. The programmer won't likely discover the error until the next malloc() or free() is attempted, at which point the program will die because of the corrupt memory pool.

Be clear about where to use sizeof().

Modifying string constants

This code doesn't work:

char *msg = "The file name is: "; strcat(msg,"BIGBEN.TXT");

The pointer msg points to a small area of memory containing the text "The file name is: " followed by the NUL byte ('\0'). The strcat function tries to add 10 more bytes to this string, catastrophically overwriting whatever was in memory after the string and possibly causing a processor fault.

You can only use string functions to move bytes into memory that you have already reserved.

This attempt to fix the problem may or may not work:

char *msg = "The file name is: \0xxxxxxxxxxxxxxxxxxxxxxx"; strcat(msg,"BIGBEN.TXT");

Now, the pointer msg points to enough memory to hold the 10 new bytes after the first NUL character; but, this memory is part of a "string constant" compiled by the compiler. Many modern compilers put these constants into the read-only text segment of your executing process, making it impossible to overwrite any of this string. A processor fault may result.

If you need to build a string of variable length, use dynamic memory.