-------------------------------------------------
Strings without NUL and Avoiding buffer overflows
-------------------------------------------------
-Ian! D. Allen - idallen@idallen.ca

Background:

Read the news.  Every week some Internet client or server software is
compromised by a "buffer overflow", where data is written past the end
of a buffer into adjacent memory and the resulting corruption lets the
attacker take over the machine.

Internet-facing programs have to be robust and well-written.  An
Internet-visible server hands some amount of control of your machine to
anyone anywhere on the planet who wants to connect to it.  The slightest
programming error on your part will be used to take down your server or
compromise it so that it can be used to attack others.

My goal is to help you to write small but solid Internet client/server
programs that cannot be exploited by crackers.  That means zero
tolerance for memory errors and buffer overflows.

Handling strings that don't end in NUL '\0'
-------------------------------------------

If you use the low-level Unix system routines read() or recv() (or their
cover functions readall() and friends), the buffers you get back don't
have NUL ('\0') bytes on the end.  (Of course you do get back the exact
length instead.)

That affects all of the string handling routines, including sprintf().
Most of the string routines will keep looking through memory until they
find a NUL byte, even if that means reading past the end of your buffer
and causing your program to fault and die.

You can specify a printf/sprintf format string to pick off a length of
bytes where the NUL might be missing:

    printf("%.*s",len,buf);   /* the "*" picks up the current value of "len" */

The value picked up by the "*" must be an int, so cast "len" if it has
some other type (such as size_t or ssize_t).

* Use strncpy() or memcpy() or memmove() instead of strcpy().
  (Note that strncpy() does not add a NUL if the source fills the whole
  count; you have to add the NUL yourself.)
* Use memchr() instead of index() or strchr().

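As a minimal sketch of the suggestions above, here is one way to handle a
counted buffer such as the one read() or recv() gives you.  The helper
names counted_to_cstring(), first_line_length() and print_counted() are
made up for this illustration; they are not part of any library.  The
sketch assumes you want an over-long input refused rather than silently
truncated.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy srclen bytes (no NUL on the end) into dst
 * and add the NUL.  Returns 0 on success, or -1 if the bytes plus the
 * NUL will not fit in dstsize bytes. */
int counted_to_cstring(char *dst, size_t dstsize,
                       const char *src, size_t srclen)
{
    assert(dst != NULL && src != NULL);  /* NULL here is a programming error */
    if (srclen >= dstsize)
        return -1;                /* refuse rather than truncate silently */
    memcpy(dst, src, srclen);     /* copies exactly srclen bytes */
    dst[srclen] = '\0';           /* now the str*() routines are safe on dst */
    return 0;
}

/* Hypothetical helper: length of the first line in buf, not counting
 * the '\n'.  memchr() never looks past buf[len-1].  Returns len if
 * there is no newline in the buffer. */
size_t first_line_length(const char *buf, size_t len)
{
    const char *nl;

    assert(buf != NULL);
    nl = memchr(buf, '\n', len);
    return nl ? (size_t)(nl - buf) : len;
}

/* Hypothetical helper: print at most len bytes of buf using "%.*s".
 * Returns what fprintf() returns, or -1 if len does not fit in an int. */
int print_counted(FILE *out, const char *buf, size_t len)
{
    assert(out != NULL && buf != NULL);
    if (len > INT_MAX)
        return -1;                /* the "*" argument must be an int */
    return fprintf(out, "%.*s\n", (int)len, buf);
}
```

A caller would pass the byte count returned by read() or recv() as the
length argument, after checking that the count is greater than zero.
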
Limit the number of bytes copied
--------------------------------

A big problem with sprintf() is that it doesn't have any way for you to
specify the size of the *output* buffer, and thus it isn't safe to use
unless the format string and all the rest of your surrounding code is
safe (and that's risky to assume).

This kind of sprintf() programming is wrong:

    char buf1[256];
    char buf2[256];
    ...
    len = sprintf(buf2, "%s: %s", somestring, buf1);   /* OVERFLOW */

You must make buf2 large enough to hold all of buf1 plus the length of
whatever stuff might be in "somestring", plus the few bytes between the
strings.  How big should that be?  You can't easily make *sure* that
this holds, now and through all future code and format modifications:

    strlen(somestring)+strlen(buf1)+format < sizeof(buf2)

That makes sprintf() awkward to use safely - it doesn't know when to
stop.  If "somestring" gets longer, or if you make the format string a
bit longer, it will overflow buf2.  Your code is waiting for a buffer
overflow to happen, now or in future.

You might tie the size of buf2 to the size of buf1, so that it is always
larger, to preclude future maintenance problems (a very good idea,
though not sufficient):

    char buf1[256];
    char buf2[sizeof(buf1)+80];

...but how do you know that "+80" will always be enough?  You can't know.

For buffer safety (not overflowing the output buffer), you can replace
sprintf() with snprintf(), and check to make sure that all the data fit
in the given output buffer size.  (Read the man page on how you will
know if snprintf() truncated the output!)

See also the C FAQ: http://c-faq.com/
                 or http://www.faqs.org/faqs/C-faq/faq/
Section 12.21 deals with the sprintf() problem and snprintf() solution.

To limit the amount of data moved by printf/sprintf, you can also
replace all your %s formats with %.*s formats, so you can limit how much
data they pick up.  You will still have to find a way to detect that
%.*s didn't read all the data, and you still have to make sure the
output buffer will hold the sum of all the data copied.

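Here is a minimal sketch of the snprintf() replacement, combined with a
%.*s format for the counted data.  The function name format_pair() and
its parameters are made up for this illustration.  It assumes a C99
snprintf(), which returns the number of bytes the full output would have
needed (not counting the NUL); a negative return is treated as an error,
which also covers some older libraries that returned -1 on truncation.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: format "label: text" into out, where text is a
 * counted buffer of textlen bytes that may have no NUL on the end.
 * Returns the number of bytes written (not counting the NUL), or -1 if
 * the result did not fit in outsize bytes or could not be formatted. */
int format_pair(char *out, size_t outsize, const char *label,
                const char *text, size_t textlen)
{
    int n;

    assert(out != NULL && label != NULL && text != NULL);
    if (outsize == 0 || textlen > INT_MAX)
        return -1;                /* the "*" argument must be an int */

    /* snprintf() never writes more than outsize bytes, NUL included */
    n = snprintf(out, outsize, "%s: %.*s", label, (int)textlen, text);

    if (n < 0 || (size_t)n >= outsize) {
        out[0] = '\0';            /* don't leave a truncated result behind */
        return -1;
    }
    return n;
}
```

Because the size passed in is the size of the real output buffer (for
example, sizeof(buf2) at the call site), making the label or the format
longer later on turns into a detected error instead of an overflow.
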
Programs that operate over the Internet *MUST NOT* allow buffer overflows. They must not be written so that simple modifications to the code (software maintenance) will trigger buffer overflows.