MEM.TXT - Revised by Ian! D. Allen Original MEM.TXT obtained from www.snippets.org dated 05-Jul-1997 Clarified by idallen@freenet.carleton.ca 05-Jun-1998 Walter Bright's MEM Package --------------------------- WHAT IS MEM?: ------------- MEM is a set of functions used for debugging C pointers and memory allocation problems. Quoting Walter, "Symptoms of pointer bugs include: hung machines, scrambled disks, failures that occur once-in-10,000 iterations, irreprodu- cible results, and male pattern baldness." After writing MEM for use in developing his own compiler and tools, he reported that its use reduced pointer bugs by as much as 75%. MEM is simple to add to existing programs and adds little or no overhead. USING MEM: ---------- Most of the MEM functions exactly parallel the standard library memory allocation functions that have similar names. To implement MEM in your program, follow these steps: 1. Do a global search-and-replace of the following functions in all of your source files: malloc() -> mem_malloc() calloc() -> mem_calloc() realloc() -> mem_realloc() free() -> mem_free() strdup() -> mem_strdup() Do not forget to change all the function calls! Any you leave out will cause your program to generate spurious memory errors. 1.1 Go back and make sure you got ALL the function calls. 2. At the beginning of main(), add the following two lines: mem_init(); atexit(mem_term); If you don't know what the standard library function atexit() does, see the manual for your compiler. If your compiler doesn't implement atexit(), see the next point: 2.1 If you don't want your program complaining about allocated memory when it exits abmormally and doesn't free all the memory, don't use atexit(). Instead, place calls to mem_term() just before your program exits successfully (e.g. at the end of main()). Using an explicit call to mem_term(), instead of using atexit(), will mean that memory will only be checked when mem_term() is called, not any time the program exits. 3. Edit mem.h and add this line to the top of the file, if it isn't already there: #define MEM_DEBUG 1 This turns on maximum debugging. Details are given below. 4. In the header section of each of your C files, add this: #include "mem.h" to every .C file which calls any of the above memory functions. 4.1 Go back and make sure you got EVERY source file that uses a mem_* function. 5. (Optional) Place calls to mem_check() at strategic points in your program, to detect memory problems. You may want to run your program a few times first, to see where the problems might be. See item #9 in the "What MEM Does" section, below. 6. The final step is to recompile your program, and compile and link in the MEM.C source file at the same time. Add the MEM.C source file to your compilation unit and make sure the MEM package *.H files are where the compiler can find them. Recompile your program. Some compilers generate warnings when compiling the MEM.C file. 7. Run your program. Your memory allocation errors will be output whenever mem_term() is called. USING MEM_DEBUG: ---------------- MEM has 2 modes of operation, debugging and non-debugging. Use debugging mode during program development, then turn debugging off for final production code. Maximum debugging is turned on by defining the MEM_DEBUG macro: #define MEM_DEBUG 1 If the macro is defined, debugging is on; if undefined, debugging is off. When debugging is turned off, the MEM functions become trivial wrappers for the standard functions, incurring virtually no overhead. Very little memory checking is done. WHAT MEM DOES: -------------- 1. ISO/ANSI verification: When Walter wrote MEM, compiler compliance with ANSI standards was still quite low. MEM verifies ISO/ANSI compliance for situations such as passing NULL or size 0 to allocation/reallocation functions. 2. Logging of all allocations and frees: All MEM's functions pass the __FILE__ and __LINE__ arguments. During alloca- tion, MEM makes an entry into a linked list and stores the file and line information in the list for whichever allocation or free function is called. This linked list is the backbone of MEM. When MEM detects a bug, it tells you where to look in which file to begin tracking the problem. 3. Verification of frees: Since MEM knows about all allocations, when a pointer is freed, MEM can verify that the pointer was allocated originally. Additionally, MEM will only allow a pointer to be freed once. Freed data is overwritten with a non-zero known value, flushing such problems as continuing to reference data after it's been freed. The value written over the data is selected to maximize the probability of a segment fault or assertion failure if your application references it after it's been freed. MEM obviously can't directly detect "if" instances such as... mem_free(p); if (p) ... ...but by guaranteeing that `p' points to garbage after being freed, code like this will hopefully never work and will thus be easier to find. 4. Detection of pointer over- and under-run: Pointer overrun occurs when a program stores data past the end of a buffer, e.g. p = malloc(strlen(s)); /* No space for terminating NUL */ strcpy(p,s); /* Terminating NUL clobber memory */ Pointer underrun occurs when a program stores data before the beginning of a buffer. This error occurs less often than overruns, but MEM detects it anyway. MEM does this by allocating a little extra at each end of every buffer, which is filled with a known value, called a sentinel. MEM detects overruns and underruns by verifying the sentinel value when the buffer is freed. 5. Dependence on values in buffer obtained from malloc(): When obtaining a buffer from malloc(), a program may develop erroneous and creeping dependencies on whatever random (and sometimes repeatable) values the buffer may contain. The mem_malloc() function prevents this by always setting the data in a buffer to a known non-zero value before returning its pointer. This also prevents another common error when running under MS-DOS which doesn't clear unused memory when loading a program. These bugs are particularly nasty to find since correct program operation may depend on what was last run! 6. Realloc problems: Common problems when using realloc() are: 1) depending on realloc() *not* shifting the location of the buffer in memory, and 2) depending on finding certain values in the uninitialized region of the realloc'ed buffer. MEM flushes these out by *always* moving the buffer and stomping on values past the initialized area. 7. Memory leak detection: Memory "leaks" are areas that are allocated but never freed. This can become a major problem in programs that must run for long periods without interrup- tion (e.g. BBS's). If there are leaks, eventually the program will run out of memory and fail. Another form of memory leak occurs when a piece of allocated memory should have been added to some central data structure, but wasn't. MEM find memory leaks by keeping track of all allocations and frees. When mem_term() is called, a list of all unfreed allocations is printed along with the files and line numbers where the allocations occurred. 8. Pointer checking: Sometimes it's useful to be able to verify that a pointer is actually pointing into free store. MEM provides a function to do this: mem_checkptr(void *p); Call this wherever you would like to verify the validity of a pointer. 9. Consistency checking: Occasionally, even MEM's internal data structures get clobbered by a wild pointer. When this happens, you can track it down by sprinkling your code temporarily with calls to: mem_check(); mem_check() performs a consistency check on the free store. 10. Out of memory handling: MEM can be set using mem_setexception() (see MEM.H) to handle out-of-memory conditions in any one of several predefined ways: 1. Present an "Out of memory" message and terminate the program. 2. Abort the program with no message. 3. Mimic ISO/ANSI and return NULL. 4. Call a user-specified function, perhaps involving virtual memory or some other "emergency reserve". 5. Retry (be careful to avoid infinite loops!) 11. Companion techniques: Since MEM presets allocated and stomps on freed memory, this facilitates adding your own code to add tags to your data structures when debugging. If the structures are invalid, you'll know it because MEM will have clobbered your verification tags. SUMMARY: -------- Since it is, in the final analysis, a software solution, MEM is fallible. As the saying goes, "Nothing is foolproof because fools are so ingenious." Walter himself readily acknowledges that there are circumstances where your code can do sufficient damage to MEM's internal data structures to render it useless. The good news is such circumstances are few and far between. For most memory debugging, MEM is a highly reliable and valuable addition to your C programming toolchest. HISTORY: -------- The files, MEM.H and MEM.C which constitute the MEM package were originally published in the March-April 1990 edition of Micro Cornucopia magazine, now sadly out of print. The files as they appear in SNIPPETS have been edited somewhat to remove compiler dependencies and correct minor discrepancies. For those who don't already know, Walter Bright is the author of Datalight Optimum-C, the original optimizing C compiler for PC's. Through a succession of sales and acquisitions plus continual improvement by Walter,, his compiler became Zortech C++ and is now sold as Symantec C++. As such, it is the only major PC compiler which can claim single authorship. It also compiles faster than most other compilers and is still a market leader in its optimization technology. Like many other library and ancillary functions unique to Walter's compilers, the MEM package was originally something he wrote for his own use. As noted above, he published it only once but it has been included as an unheralded "freebie" in Walter's compilers for the past several years. Walter was kind enough to grant permission for its inclusion in SNIPPETS beginning with the April, '94 release. PORTING MEM: ------------ Included in the MEM package is TOOLKIT.H, which isolates compiler and environmental dependencies. It should work as-is for most PC compilers and the Microsoft compiler for SCO Unix. Other environments may be customized by writing your own HOST.H file, using the existing definitions in TOOLKIT.H as examples and modifying the values to match your system. Using these techniques, the MEM package has been used successfully on Amigas, Macs, VAXes, and many other non-DOS systems.