---------------------------
Notes on coding the readall() function, including pseudocode
---------------------------
-Ian! D. Allen - idallen@idallen.ca

Most of these notes are edited versions of comments posted to the
mailing list regarding programming the readall() function.  Pseudocode
is also included.

---

Just as your server child process currently loops reading lines from
the client socket (until EOF or error) using read(), you replace read()
with readall() (only for reading sockets).

Using readall() instead of read() ensures that you get single lines
back from the socket, not partial lines or multiple lines (as is
possible with read() or recv() on a socket).

Since your server now knows that each call returns only a single line,
the server can process each line separately, check for leading /msg,
distribute the line to all the clients, etc.  This makes the server's
job easier.

Note that if readall() fetches data that contains no newline from the
client socket, either the client has died without sending the newline
or else the client sent a line longer than the buffer you passed in to
readall().  You have to decide how to handle both cases.

If the line is too long, you can't simply throw away that part of it
and loop back for another readall(), since you'll fetch more of the
same line, not a new line.  To flush or throw away a line that is too
long to fit in the buffer, you have to keep reading (keep calling
readall()) until you finally find the ending newline of the very long
line.  Then you know you are re-synchronized and ready to read the next
line.

Your server may be able to handle very long lines, passing all the data
to the clients; or, it may decide to throw long lines away, looping
until it finds a terminating newline.  You choose.

---

NOTE: You cannot use strcpy or memcpy to copy overlapping strings.
(Strings overlap if the place where the output bytes go overlaps the
place from where the input bytes will come.)  See the man pages.
The output of this command line may help you find a replacement
function that does allow overlapping:

    $ man -k memory | grep copy

NOTE: Use the extra compiler warning options I mailed to you!
And add one more: -O (Optimize) which is necessary to trigger flow
analysis.

    CFLAGS = -O -Wall -Wextra -Wshadow -Wstrict-prototypes \
             -Wmissing-prototypes -Wmissing-declarations \
             -Wmissing-field-initializers -Wredundant-decls -Wunreachable-code

    foo: foo.c
            gcc -o foo $(CFLAGS) foo.c

You may get some "false" warnings; but, you'll also catch some real
errors, such as the classic error "if(x=5){" instead of "if(x==5){".

---

Given this readall() call:

    ret = readall(fd,buffer,sizeof(buffer));

The readall function has to fetch into the passed buffer one of these:

 Case #1. a string of bytes ending in a newline
          (of length less than or equal to sizeof(buffer)), or
 Case #2. a string of exactly sizeof(buffer) bytes not ending in a
          newline, or
 Case #3. a string of less than sizeof(buffer) bytes not ending in a
          newline (only when EOF reached on the file descriptor)

The only time Case #2 happens is if the input line is so long that it
doesn't all fit in the passed buffer.  (The caller will have to collect
the very long line in pieces with repeated calls to readall() until
finally the newline is found.  Hopefully this won't happen very often.)

The only time Case #3 happens (short line, no newline) is on the last
call to readall() before EOF is detected.  The sending process may have
closed the descriptor without writing the final newline into it, so
readall() couldn't pack the caller's buffer full.  Your programs can
safely treat Case #3 as if it did have a newline on the end - the same
as Case #1.  Note that the next time you call readall() after Case #3,
you will get zero bytes back, indicating EOF.
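As a rough illustration of how a caller might tell the three cases
apart using only the buffer and the return value, here is a minimal
sketch in C.  The helper name classify_readall() and the enum names are
hypothetical (they are not part of the assignment); the sketch assumes
"ret" is the value returned by readall(fd,buffer,sizeof(buffer)) and
"len" is that same sizeof(buffer).

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical names, for illustration only. */
enum readall_case {
    RA_EOF_OR_ERROR = 0,   /* nothing returned: EOF (or error) */
    RA_FULL_LINE    = 1,   /* Case #1: data ends in a newline */
    RA_BUFFER_FULL  = 2,   /* Case #2: buffer packed full, no newline */
    RA_SHORT_AT_EOF = 3    /* Case #3: short data, no newline, EOF is next */
};

/* Decide which case a readall() result falls into. */
static enum readall_case
classify_readall(const char *buffer, ssize_t ret, size_t len)
{
    if (ret <= 0)
        return RA_EOF_OR_ERROR;
    assert((size_t)ret <= len);        /* readall never overfills the buffer */
    if (buffer[ret - 1] == '\n')
        return RA_FULL_LINE;
    if ((size_t)ret == len)
        return RA_BUFFER_FULL;
    return RA_SHORT_AT_EOF;
}
```

A caller that wants to discard an over-long line could keep calling
readall() for as long as the result is RA_BUFFER_FULL, and treat the
next RA_FULL_LINE (or RA_SHORT_AT_EOF) as the tail of that same line.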
*** When does your readall() need to actually do a read() to put bytes
*** into its private internal buffer?

Your readall() has to first try to satisfy Case #1 or Case #2 just
using the data in the private buffer.  Only if that doesn't work do you
have to read more data onto the end of your private buffer, then try
again.

You have to read() more bytes into the private buffer if you can't find
a newline in the data already in the private buffer and the number of
bytes in the private buffer is less than the sizeof(buffer) passed by
the caller.  You have to read() more bytes to fill out the private
buffer and see if the newline appears within the passed sizeof(buffer).

If the newline appears in your private buffer within the passed
sizeof(buffer), you copy that full line from your private buffer to the
caller and return the number of characters.  (This is Case #1, above.)

If the newline doesn't appear within sizeof(buffer) bytes, you have to
pack the caller's buffer with exactly sizeof(buffer) bytes of data.
(This is Case #2, above.)

Remember: You are trying to fulfill Case #1, #2, or #3 above, and that
means you can't stop doing read() until one of the conditions is
satisfied.  Find the newline, or else copy out sizeof(buffer) bytes, or
copy out fewer bytes because the stream is at EOF and you can't get any
more bytes from it.

(What happens if your private buffer is *smaller* than the buffer
supplied by your caller?  If your private buffer is smaller, you could
fill your private buffer full of bytes and still not have enough to
satisfy Case #1 or Case #2 above.  What do you do to handle this case?)

The only way readall() returns is satisfying Case #1, #2, or #3.

Your readall() will return zero bytes (EOF), when there are no bytes
left in its private buffer and the previous read() also returned EOF.
Many calls to readall() may need to happen to empty the lines out of
the private buffer even after read() sees EOF.
(You will need to remember that read() saw EOF - you are not allowed to
read() a file descriptor once you have detected EOF on it!)

---

Because readall() is a drop-in replacement for read() with internal
buffering, you can't use the same readall() on two different streams at
the same time.  A readall() can only be used to read from one file
descriptor, and that descriptor cannot change once you start using it.
For example, you can't use the same readall() code on both a socket
descriptor and on stdin - the internal buffering would mix up the two
input streams!

You could program readall() to have different internal buffers for
different file descriptors; but, that's more work than I intended.

You only need to use readall() on a single open network socket.  You
don't need to use it on standard input.  That means you only need one
readall() in your client, and one in each separate child process in
your server.

---

I've seen some incorrect EOF handling in your client/server code.  If
you don't handle EOF correctly, writing automated test scripts will be
a problem.  I wrote a new note file to cover how it should work, see
eof_handling.txt

    http://teaching.idallen.com/cst8165/06f/notes/eof_handling.txt

---

Here is a sample readall() pseudocode (my version; yours may differ).

Calling sequence: ret = readall(int fd, char *buf, size_t len)
"len" is usually just "sizeof(buf)"

The readall function has to fetch into the passed "buf" one of these:

 Case #1. a string of bytes ending in a newline
          (of length less than or equal to len), or
 Case #2. a string of exactly len bytes not ending in a newline, or
 Case #3. a string of less than len bytes not ending in a newline
          (only when EOF reached on the file descriptor)

START READALL ( fd, buf, len)

    // declare some static variables that remember values between calls
    // declare some local variables used in the algorithm below
    //
    static locbuf[???]      // our local internal buffer
    static locsiz = 0       // remember how many bytes are in locbuf between calls
    static seenEOF = FALSE  // remember if we have read EOF on fd
    bufp = buf              // set the buf output pointer to the start of output buf
    copyoutlen = 0          // how many bytes to copy when we break the main loop
    // declare any other variables

    // this loop has three breaks, one for each of Case #1, #2, #3
    // we first try to get data out of our internal buffer; if that fails,
    // we have to read more bytes into the internal buffer and try again
    //
    loop {
        // outroom counts the number of bytes of space available in output buf
        outroom = len - (bufp-buf)

        // lookmax limits how far ahead we will look in locbuf for a newline
        // (don't look farther than we have space to copy out!)
        lookmax = MIN(locsiz,outroom)

        // Case #1. We find a newline in our local buffer between 0 and lookmax
        //
        if ( local buffer contains a newline between 0 and lookmax )
            copyoutlen = length of locbuf up to and including newline
            break loop      // Case #1.

        // Case #2. We have enough bytes to pack the rest of the caller's buf
        //
        if ( locsiz >= outroom )
            copyoutlen = outroom    // fill the rest of the caller's buffer
            break loop      // Case #2.

        // if we still have a full buffer, our little locbuf must be *smaller*
        // than the room in the caller's output buf!  Copy all that we have to
        // the caller, advance output bufp, and then fall through to read more.
        //
        if ( locsiz == sizeof(locbuf) )
            copy all of locbuf to the caller's buf via bufp
            bufp += locsiz  // move our output pointer
            locsiz = 0
            // FALLTHROUGH - do not break loop!

        // We get here if we have to read some more bytes into our local buffer.
        // (locsiz is smaller than outroom and we haven't got a newline yet)
        // If we have already seen EOF on fd, we can't read any more bytes;
        // Case #3. EOF was read - just return whatever bytes we have left
        //
        if ( seenEOF is TRUE )
            copyoutlen = locsiz  // we know locsiz bytes will fit in output buf
            break loop      // Case #3.

        // We are now actually going to read() more bytes from the open fd;
        // read more bytes to after the end of the data already in locbuf
        //
        ret = read(fd, locbuf+locsiz, sizeof(locbuf)-locsiz)
        if ( ret <= 0 )
            if ret < 0
                perror the error  // but do not exit - treat error as EOF
            seenEOF = TRUE  // remember for later that we hit error/EOF on the fd
        else
            locsiz += ret   // we have more bytes in our local buffer
    } // end of loop

    // we can only get here, break out of the loop, for Case #1, #2, #3
    // copy bytes out to the caller and adjust our saved data structures
    //
    copy copyoutlen bytes from our local buffer to the caller buf via bufp
    bufp += copyoutlen      // move our output pointer
    locsiz -= copyoutlen    // account for the bytes we just copied
    move what data remains in our local buffer to the beginning of the buffer

    // if we are about to return zero (EOF), reinitialize the internal EOF flag;
    // this would allow us to open another stream and reuse this readall() on it
    //
    if ( bufp-buf is zero )
        seenEOF = FALSE     // reinitialize for subsequent use on another fd

    return bufp-buf  // return a count of all the bytes we copied to the caller

END READALL
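
For comparison, here is one possible translation of the pseudocode
above into C.  Treat it as a sketch under stated assumptions, not as
the required solution: the LOCBUF_SIZE value of 4096 is an arbitrary
choice (the pseudocode leaves the size as ???), the ssize_t return type
is an assumption, and a read() interrupted by a signal (EINTR) is
simply treated as an error/EOF, just like any other read() failure.
Note the use of memmove(), which allows overlapping copies, to slide
the leftover bytes to the front of the private buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define LOCBUF_SIZE 4096    /* assumed size; the notes leave it as ??? */

ssize_t readall(int fd, char *buf, size_t len)
{
    static char   locbuf[LOCBUF_SIZE]; /* private internal buffer */
    static size_t locsiz  = 0;         /* bytes currently held in locbuf */
    static int    seenEOF = 0;         /* nonzero once read() saw EOF/error */
    char  *bufp = buf;                 /* next free byte in caller's buf */
    size_t copyoutlen;                 /* bytes to copy out after the loop */

    assert(buf != NULL);

    for (;;) {
        size_t outroom = len - (size_t)(bufp - buf);
        size_t lookmax = locsiz < outroom ? locsiz : outroom;
        char  *nl = memchr(locbuf, '\n', lookmax);

        if (nl != NULL) {                   /* Case #1: newline found */
            copyoutlen = (size_t)(nl - locbuf) + 1;
            break;
        }
        if (locsiz >= outroom) {            /* Case #2: pack caller's buf */
            copyoutlen = outroom;
            break;
        }
        if (locsiz == sizeof(locbuf)) {     /* locbuf smaller than outroom */
            memcpy(bufp, locbuf, locsiz);
            bufp += locsiz;
            locsiz = 0;                     /* fall through to read more */
        }
        if (seenEOF) {                      /* Case #3: return what is left */
            copyoutlen = locsiz;
            break;
        }
        /* Append new bytes after the data already held in locbuf. */
        ssize_t ret = read(fd, locbuf + locsiz, sizeof(locbuf) - locsiz);
        if (ret <= 0) {
            if (ret < 0)
                perror("read");             /* treat error as EOF */
            seenEOF = 1;
        } else {
            locsiz += (size_t)ret;
        }
    }

    memcpy(bufp, locbuf, copyoutlen);       /* separate buffers: no overlap */
    bufp += copyoutlen;
    locsiz -= copyoutlen;
    memmove(locbuf, locbuf + copyoutlen, locsiz);   /* source and dest overlap */

    if (bufp == buf)
        seenEOF = 0;        /* returning EOF: allow reuse on another fd */
    return (ssize_t)(bufp - buf);
}
```

A caller would loop on "ret = readall(fd, buffer, sizeof(buffer))"
until ret is zero, handling each returned piece according to Case #1,
#2, or #3.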