-----------------------
Lab #01 for CST8165 due January 28, 2008 (Week 4)
-----------------------
-Ian! D. Allen - idallen@idallen.ca - www.idallen.com

Remember - knowing how to find out an answer is more important than
memorizing the answer. Learn to fish! RTFM! (Read The Fine Manual)

Global weight: 5% of your total mark this term.

Interim submission: Submit what you have done so far in lab on January 21.

Due date: before 16h00 Monday January 28

Interim submissions in Week 3: You will submit whatever progress you have
made on this assignment before the end of your weekly lab period in Week 3.

The deliverables for this exercise are to be submitted online in the
Linux Lab T127 using the "cstsubmit" method described in the exercise
description, below. No paper; no email; no FTP.

Late-submission date: I will accept without penalty exercises that are
submitted late but before 16h00 on Tuesday, January 29. After that
late-submission date, the exercise is worth zero marks.

Exercises submitted by the *due date* will be marked online and your marks
will be sent to you by email after the late-submission date.

Exercise Synopsis and Summary:

Copy (with proper credit to the original author and source) a TCP server
from the given sources. Modify and test the server.

- Remember how to use Makefiles
- Remember how to structure programs in multiple .c/.h files
- Learn to use the socket/bind/listen/accept system calls.
- Learn to create C preprocessor (cpp) output via "cc -E"

Coding and submission standards:

- provide File Headers (Program Headers) using my Assignment Label
- provide Function Headers documenting arguments and return values
- use Block Comments (see programming_style.txt)
- error messages must have the four-part format from programming_style.txt

Where to work:

Submissions must compile and run cleanly in the T127 Linux Lab, though
you are free to work on them anywhere you like.

Since this course has no textbook, use the Internet instead:

Background tutorial on using sockets:
  The Sockets Tutorial:
    http://www.cs.rpi.edu/academics/courses/fall96/sysprog/sockets/sock.html
    (alternate: http://www.linuxhowtos.org/C_C++/socket.htm )
  The server2.c code is explained line-by-line in the above web page.

Beej also has a line-by-line socket programming tutorial here:
    http://beej.us/guide/bgnet/

Helpful code:
------------

a. printf a fixed length string

   You can printf exactly 9 bytes from a buffer (no \0 needed) using:

       printf("%.9s",buf);     // print only 9 bytes from buf

   AND it gets better if you use '*' instead of 9 (more useful in this case):

       n = read(fd,buf, .... );
       ...
       printf("%.*s",n,buf);   // the "*" means pick up the current value of "n"

   This kind of printf can be useful for buffers that don't have \0 in them.
   (Note that this will *not* work to output binary data, since the \0 in
   binary data will stop both fprintf() and printf().)

   You can also output n bytes in a buffer directly using write(fd,buf,n).
   The write() system call handles binary data correctly. (printf does not)

   - Standard input (usually your keyboard) is Unix fd 0.
   - Standard output (usually your screen) is Unix fd 1.
   - Standard error (usually your screen) is Unix fd 2.

   WARNING! printf() stops printing when it hits a NUL byte ('\0').
   If you are passing binary data, do not use printf() for output.
   You may use fwrite() to write binary data.

b. proper use of buffer sizes

   Never do this:

       char buf[256];
       ...
       read(fd,buf,256);

   Do this:

       char buf[MYBUFSIZ];
       ...
       read(fd,buf,sizeof(buf))

   Buffer sizes must be set in only *one* place for easy maintenance.

c. equivalence of read() and recv(), write() and send() for sockets

   If you don't set any special TCP/IP flags in recv() or send(), the
   system calls recv() and read() are the same/equivalent for accessing
   sockets, as are the syscalls send() and write().
   You can't use the socket syscalls recv() or send() on file descriptors
   that are *not* sockets (even if the TCP/IP flags are zero); using
   read() and write() works for both sockets and ordinary files.

   All of the code in this lab uses plain read() and write() instead of
   recv() and send(). See also: "man 2 recv" and "man 2 send"

d. do not clobber errno before perror() or error()

   If you want to use [f]printf() before a call to perror() or error(),
   read the NOTES section of "man 3 errno" first!

   Rather than using [f]printf(), you could use snprintf() to construct
   the string you wish to pass to perror(). (Why is using [f]printf()
   before perror() not allowed; but, using snprintf() before perror()
   is allowed?)

   Do you know how to write va_list (variadic, varargs) functions?
   You might find such a function useful for printing error messages;
   since, it avoids the need for buffers and snprintf().
       http://www.gnu.org/software/libc/manual/html_node/Variadic-Functions.html

   The GNU-specific function error() takes a printf-like list of arguments
   and can optionally call perror() for you. I recommend it.

Recall that most any program or C function has a manual page: RTFM

Part I - Coding a TCP/IP server
-------------------------------

0) Go back and read the above sections you just skipped over.

1) Find the Sockets Tutorial:
       http://www.cs.rpi.edu/academics/courses/fall96/sysprog/sockets/sock.html
       (alternate: http://www.linuxhowtos.org/C_C++/socket.htm )
   The server2.c code is explained line-by-line in the Sockets Tutorial.

2) Copy the code for the second C server "server2.c" from the section
   "Enhancements to the server code". Credit your code sources in
   comments in the code after you have copied it. (No plagiarism!)
   See "man wget" for an easy way to download any URL to your
   current directory. Don't cut-and-paste - the tabs will be all wrong.
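
Helpful code item "d" above (do not clobber errno) can be sketched
roughly as follows. This is only an illustration under stated
assumptions: the helper name format_errmsg() is hypothetical and is not
part of server2.c or of any library. The idea is that the caller copies
errno right after the failing call, and the message is then built in
memory with vsnprintf() before anything is printed.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (an assumption for illustration only):
 * build "formatted prefix: reason" in out[], where the reason is the
 * text for saved_errno.  Returns the length of the string stored.
 */
static int format_errmsg(char *out, size_t outsz, int saved_errno,
                         const char *fmt, ...)
{
    va_list ap;
    int n, r;

    assert(out != NULL && outsz > 0);
    out[0] = '\0';

    va_start(ap, fmt);
    n = vsnprintf(out, outsz, fmt, ap);   /* formats into memory only */
    va_end(ap);
    if (n < 0)
        n = 0;                            /* formatting failed: no prefix */
    else if ((size_t)n >= outsz)
        n = (int)(outsz - 1);             /* prefix was truncated */

    /* Append the reason that the caller saved before doing anything else. */
    r = snprintf(out + n, outsz - (size_t)n, ": %s", strerror(saved_errno));
    if (r > 0)
        n += r;
    if ((size_t)n >= outsz)
        n = (int)(outsz - 1);             /* reason was truncated */
    return n;
}
```

A caller would copy errno into a local variable on the line immediately
after the failed system call, pass that copy as saved_errno, and then
write the finished string to standard error. The GNU error() function
recommended above does similar work for you.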

3) Create a Makefile target "all" for this code that compiles server2.c
   into "server2" using gcc and these gcc required options:

       CFLAGS = -g -O -Wall -Wextra -Wshadow -Wstrict-prototypes \
           -Wmissing-prototypes -Wmissing-declarations \
           -Wdeclaration-after-statement -Wmissing-field-initializers \
           -Wredundant-decls -Wunreachable-code -fstack-protector-all

   Note the spelling of "CFLAGS". Your Makefile needs that spelling.
   Strive for the minimal Makefile - don't include command lines for things
   that "make" will do for you automatically (such as automatically
   compile .o files from .c files using CFLAGS).

4) Create a Makefile target "clean" that removes the compiled files,
   leaving only the source file. Make sure that running "make clean"
   twice doesn't generate errors.

5) Fix the gcc compile warnings by adding the missing header files:
   - use the man pages to find the missing header lines for each function
   - fix the warning about signed/unsigned use of int
   - make the error() function static instead of global
   - you may still have one spurious "will never be executed" warning on
     the line containing htons() that won't go away, that you can ignore

6) Add this line to main():
       alarm(30*60);   /* kill program after 30 minutes */
   - this is in case you forget to kill the program yourself when you leave!
   - note the correct place to put this line.
   - test it using an alarm of 10 seconds instead of 30 minutes.
   - to see your background processes on Linux/Unix, use command: ps gx

7) Add code to check the return code of "listen()" and handle any error.
   Print a message on stderr if listen() fails and exit the program with
   a non-zero return code. Since listen() is a system call, you may use
   perror() to print the reason for the system call failure.

8) Fix "The zombie problem" using the AIX-style fix from the Tutorial.
   Use the AIX one-line fix, not the fix that uses a signal handler.
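
To see what "fixing the zombie problem" without a signal handler means
in practice, here is a small sketch. It assumes that ignoring SIGCHLD is
the kind of one-line fix meant in step 8 (check the Tutorial to
confirm), and the function name child_is_autoreaped() is hypothetical.
On Linux, when SIGCHLD is ignored, an exited child is not kept as a
zombie, so a later waitpid() for it fails with ECHILD.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo (not part of server2.c): returns 1 if an exited
 * child was reaped automatically, 0 if it was left for the parent to
 * wait for, and -1 if signal() or fork() failed.
 */
static int child_is_autoreaped(void)
{
    pid_t pid;

    if (signal(SIGCHLD, SIG_IGN) == SIG_ERR)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(0);               /* child: exit immediately */

    /*
     * Parent: with SIGCHLD ignored, waitpid() blocks until the child
     * is gone and then fails with ECHILD, because no zombie was kept.
     */
    if (waitpid(pid, NULL, 0) == -1 && errno == ECHILD)
        return 1;
    return 0;
}
```

You can compare this with what "ps gx" shows for your own server before
and after your fix: without the fix, finished children appear as defunct
processes until the parent exits.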

9) Pick up the setsockopt SO_REUSEADDR code fragment from section 4.2 of
       http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html
   and add it to your server before calling bind(). This will make the
   server re-use socket ports nicely and avoid "Address already in use".
   - note which variables you need to create to make this code work.
   - note the correct place to put this code.
   This Unix command shows active TCP connections: netstat -natp

10) Fix your server2.c to accept only *exactly* one port argument, not more,
    not less. Never ignore user input. Do not run the program if too
    many or too few arguments are given.

11) Fix your server2.c to convert INADDR_ANY to network byte order.
    Reference:
        http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html#bind
    Read the highlighted paragraphs above and below the one containing:
        "you might have seen that I didn't put INADDR_ANY into Network
        Byte Order! Naughty me."
    Note the change in the line "// use my IP address".

12) If you (accidentally) write on a closed socket, your program will be
    killed with SIGPIPE, without any error message. Ignore the SIGPIPE
    signal in your server. Add this code near the start of main():
        signal(SIGPIPE,SIG_IGN);
    This will prevent your server from being killed (with no error message)
    if you accidentally write onto an incorrect file descriptor.
    From man 7 socket:
        "When writing onto a connection-oriented socket that has been
        shut down (by the local or the remote end) SIGPIPE is sent to the
        writing process and EPIPE is returned."

13) Replace the deprecated bzero() function everywhere. (See "man bzero".)
    No programs should be using bzero() any more.

14) The current server only accepts one line and then exits. Verify that
    this is true before you continue, using the standard TCP client netcat:
        $ nc -v localhost
    e.g.
    you would first start your server in the background (or in a separate
    window) and then connect to it using netcat:
        $ ./server2 55555 &
        $ nc -v localhost 55555
        hi
        [...some server output prints here and netcat exits...]
        $ killall server2

Part II - making the server loop
--------------------------------

We will recode the server to loop reading all the lines from the client,
returning them all to the client (an echo server), as follows:

15) Rewrite the server2.c dostuff() function incrementally as follows:
    (Rename this dostuff() function to something meaningful!)

    a. Delete the server's "Here is the message" line.

    b. Replace the 18-byte write() statement by writing all the bytes
       received from the client back to the client (to the same socket).
       Only write the bytes that were actually read from the client.

    c. Recode dostuff() to eliminate and remove the need for bzero/memset()
       Zeroing out the entire buffer is unnecessary and wasteful. Recode it.

    d. Using the pseudocode developed in class, recode dostuff() to loop
       reading/writing lines from/to a connected client until either
       EOF (zero bytes read) or error. Make sure the loop handles
       binary data. (Don't use printf and friends.)

    e. Issue the following message on standard error when EOF is seen:
           XXX: EOF reading from unit NNN
       where XXX is the program name, and NNN is the I/O descriptor number
       on which EOF is detected. (You can use the GNU error() function to
       write this message.)

    f. Issue the following message on standard error just before the
       server's forked client process exits:
           XXX: done with unit NNN
       where XXX is the program name, and NNN is the I/O descriptor number
       used by the client. (You can use the GNU error() function to
       write this message.)

    Make sure the code continues to test for errors on all system calls.

16) When writing to a network socket, the write() may write fewer bytes
    than you ask it to write.
    Reference:
        http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html#sendrecv
        http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html#sendall
    In server2.c, replace the write() with a function sendall() that loops
    sending bytes until either all the bytes are sent, or error.
    You can get the sendall() function from here:
        http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html#sendall

17) Copy (with credit) that sendall() function from beej and fix the
    missing-initialization bug that happens if "*len" is zero.

18) If you want to generalize sendall() to write to a file descriptor
    that is not a socket (e.g. to your screen), you must ensure that
    sendall() uses write() and not send(). Both send() and recv() only
    work on sockets; write() and read() work on both sockets and ordinary
    file descriptors. Replace send() with write() for maximum portability.

19) Put sendall() in its own file sendall.c and create sendall.h
    Reference: Class Notes file header_files.txt
    For full marks, do not put unnecessary #includes in your *.h files.

20) Update your Makefile to handle sendall.c and sendall.h
    Make sure that your server2 is rebuilt if any of its source files
    change. Test this!

Part III - cleaning up the error() messages
-------------------------------------------

Internet-facing programs need to validate all input and give good error
messages.

21) Read the Notes file programming_style.txt
    Pay attention to the required four-part format of error messages.

    Improve all the error messages to have the four-part format described
    in Notes file programming_style.txt. *Every* error message must be
    output on standard error and be prefixed by the actual name of the
    program. Do not hard-code the program name into the code; use the
    actual name from the command line. If you rename the binary, the
    error message name must also change. (The GNU error() function does
    this for you, automatically.)

    System call errors must have the correct system call error text output
    (as would be output by perror() or any related functions).
    Non-system-call errors must *not* try to call perror().

    The GNU extension function "error()" is useful for printing these
    four-part error messages; though, it is not portable and would have
    to be fixed for a non-GNU environment. See "man 3 error".

    Remove the existing error() function from the code and convert all the
    error messages to use the multi-argument GNU error() function instead.
    Supply the "errno" argument only for system calls that fail.

    Examples of using GNU error():

        if( systemcall(port) < 0 )
            error(1,errno,"ERROR in systemcall using port %d",port);
        if( argc > 1 )
            error(1,0,"This program does not accept arguments");
        if( frob > 2*fritz )
            error(0,0,"Warning: frob %d larger than double fritz %d",
                frob, fritz);

22) Verify that your program works using the standard TCP client netcat:
        $ nc -v localhost
    e.g. you would start your server in the background and then connect:
        $ ./server2 55555 &
        $ nc -v localhost 55555
        hi
        hi
        (lines that you type should echo back to you via the server)
        ^C (interrupt netcat)
        $ killall server2
    You should also test that binary data passes through your server
    correctly, using shell I/O redirection to netcat.

Part IV - looking at C pre-processor output
-------------------------------------------

The server2.c code we are working on has a "will never be executed"
warning on code that is clearly being executed. To find out the cause
of this warning, we need to look at what changes the C preprocessor has
made to our source file before the compiler sees it and compiles it.

You must have a working Makefile that can build server2. Consult your
instructor for assistance in getting a broken Makefile to work before
you continue.

23) Add the following variable near the top of the Makefile:
        CF2 = $(CFLAGS) -E

24) Find the explanation of what the "-E" option does in the gcc man page.
    Read it.
    Note that the output of "cc -E" is preprocessor output - it is text
    C code. The output is not an object file and is not a binary
    executable! Why are there no "#include", "#define", or "#ifdef"
    statements in the output of "cc -E"?

25) To the bottom of your Makefile, add a new Makefile target "serverX.c"
    with dependency "server2.c" that runs gcc with the $(CF2) options
    to pre-process source file server2.c into output source file
    "serverX.c". Unlike Part I, where you could use the default Makefile
    actions to call the gcc compiler, in this case you *must* specify
    the full gcc compile line, including specifying the output file,
    so that you can pass the $(CF2) options to gcc and capture the
    preprocessor output.

26) Add serverX.c to the list of files to be cleaned in "make clean".
    Make sure that running "make clean" twice doesn't generate errors.
    (Test it!)

27) Run "make serverX.c" to run the gcc pre-processor to create source file
    "serverX.c" from server2.c using the $(CF2) flags. Confirm that all
    the CFLAGS options are used, plus the new -E option. Confirm that
    the created file "serverX.c" is C source code - the output of running
    the C preprocessor on the server2.c source file.

28) Look at the bottom 60 lines of the serverX.c source file, where the
    serv_addr.sin_port and htons() code gives "unreachable" warnings.
    (You saw this htons() warning "will never be executed" in the previous
    lab - it was the only warning you couldn't get rid of.)

    Note what a mess the htons() code has expanded into after going through
    the gcc pre-processor! The resulting unreachable code in the "if"
    statement is the reason this line generates an unreachable code warning.
    You would never know this unless you looked at the output of the C
    pre-processor. Looking at pre-processor output (using -E on Unix) is
    often the only way to know what code you are actually compiling, and
    why some compile errors and warnings happen.
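
As a rough picture of how a one-line call such as htons() can turn into
code with a branch that is never executed, consider the sketch below.
It is NOT the text of the real system header: the names SWAP16_CONST,
swap16_runtime, MY_SWAP16 and swap_port are hypothetical, and the shape
of the macro is only an assumption made for illustration. It relies on
__builtin_constant_p(), a GCC (and Clang) extension that the compiler
resolves at compile time.

```c
#include <stdint.h>

/* Hypothetical stand-ins for two code paths inside a byte-swap macro. */
#define SWAP16_CONST(x) \
    ((uint16_t)((((x) >> 8) & 0xff) | (((x) & 0xff) << 8)))

static inline uint16_t swap16_runtime(uint16_t x)
{
    return (uint16_t)((x >> 8) | (x << 8));
}

/*
 * Simplified sketch of an "optimized" byte-swap macro.  The compiler
 * decides __builtin_constant_p() at compile time, so for any one use
 * of the macro, one side of the conditional can never be executed.
 */
#define MY_SWAP16(x) \
    (__builtin_constant_p(x) ? SWAP16_CONST(x) : swap16_runtime(x))

/* One short source line; much more text after the preprocessor runs. */
static uint16_t swap_port(uint16_t port)
{
    return MY_SWAP16(port);
}
```

Running a file like this through "cc -E" shows the single MY_SWAP16(port)
line replaced by the whole conditional expression, which is the same kind
of expansion you are asked to find for htons() in serverX.c.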

29) Experiment temporarily with gcc options to find out which compiler
    flag is causing the htons() code to be expanded into the "unreachable
    code" line. (Hint: Removing one gcc option will cause htons() to be
    left unexpanded and untouched. Which option?) Do not remove the
    -Wunreachable-code warning as you experiment.

30) After you have identified the gcc option that causes htons() to be
    expanded, restore the gcc options to their standard form, except,
    *remove* the "-g" option from the standard CFLAGS list.

31) Add a new binary Makefile target "serverX" that depends on "serverX.c".
    Rely on the default "make" rules to build this target from source.
    Your Makefile will now have these five targets:
        all server2 clean serverX.c serverX

32) Add the serverX binary to the list of files to be cleaned in
    "make clean". Make sure that running "make clean" twice doesn't
    generate errors.

33) Run "make serverX" to compile and create binary file "serverX".
    Run "make all" or "make server2" to create binary file "server2".
    Confirm that most of the CFLAGS options are used in both cases.
    Make sure that "-g" is *not* used for this step.

34) Run "cmp -l -b server2 serverX" and note the one-byte difference in
    the two binary files. Why is that byte different?

    To help understand the output of "cmp -l -b", try this:
        $ echo "one two three" >a
        $ echo "one abc three" >b
        $ cmp -l -b a b
    Some system man pages don't document the -b option to "cmp" - try
    "cmp --help" or look up other man pages on the Internet by searching
    for "man cmp". Old versions of cmp use "-c" instead of "-b".

35) Put back the full set of CFLAGS options to gcc, including -g.
    (Include all the options mentioned earlier, including -g and -O.)
    Rebuild your binaries using the full set of options.

Part V - The Tests
------------------

Here are some tests to run now to make sure your Makefile is correct.
A later section shows how to run these tests in a "script" session.

Put back the full set of CFLAGS options to gcc given at the start of this
Lab. Include all the options, including -g and -O.

36) $ make clean
    - anything created by the Makefile should be removed here
    - are all the objects created by the Makefile removed?
    - typing "make clean" twice should not generate any error messages,
      even if everything has been removed already

37) $ make clean all ; make all ; make all
    - is server2 built just once?

38) $ make clean ; make ; make ; make
    - is server2 built just once?

39) $ make clean all ; sleep 2 ; touch server2.c ; make all ; make all
    - is server2 built exactly twice?

40) $ make clean serverX.c ; make serverX.c
    - is serverX.c built just once?

41) $ make clean serverX ; make serverX
    - is serverX.c built from server2.c?
    - is serverX built just once?

42) $ make clean serverX ; sleep 2 ; touch server2.c ; make serverX serverX
    - is serverX built exactly twice?

Capturing the tests using a script session
------------------------------------------

43) When you are ready to demonstrate that your program works, start a
    "script" session (man script) using my script cover and with output
    file "testing.txt" (note the pathname to my script cover):
        $ ~alleni/bin/script testing.txt
    This will save your terminal command lines and output into the given
    file name. Make sure you use my cover script, as shown above.

    Note: Your output file "testing.txt" will not contain the full results
    of your script session until you exit the subshell started by the
    script command. When you exit the subshell, script will tell you
    the name of the output file. See the next few steps:

44) Run the above Makefile tests #36-#42.

45) Test your error() code by starting your server2 using a privileged
    port argument number greater than zero and less than 1024 (e.g. 123).
    Make sure you see the bind() error message:
        Permission denied
    Confirm that the above system call error text appears and that it
    appears on standard error (not on standard output).
    You must also show the user the invalid port number as part of the
    error message.
    Why don't you have permission to bind to ports less than 1024 on Unix?

46) Start your server in the background, as you did earlier.
    Use netcat ("man nc") to connect to your server and demonstrate
    that input from a binary file goes via your server and returns
    accurately:
        $ ./server2 55555 &
        $ nc localhost 55555 </bin/bash >out
        $ cmp /bin/bash out     # should be no differences
    On some Unix systems, you may need the "-q 5" option to netcat to
    enable it to pass on EOF to the remote server.

47) After testing, kill your background server using killall. (See above.)

48) Exit the script session subshell. Script will tell you the name of
    your script output file:
        $ exit
        Script done, file is testing.txt
        $ less testing.txt
    You may now examine your saved script session. Make sure your session
    doesn't contain anything except test data - no vim sessions or other
    junk.

49) Fix the inconsistent indentation in all your source files. One quick
    way to do this is the command "indent -kr -i8 *.c"; but, you may
    want to add more options to treat comments differently.
    Review: http://lxr.linux.no/source/Documentation/CodingStyle

50) Using the given examples in programming_style.txt, add internal
    comments and proper file and function header comments to all the
    source files and functions. (The Assignment Label is not a
    programming header comment - add a proper file header.) You must fix
    or add comments to all the code, even if you didn't write the code
    yourself. (If you use code from other sources, you have to bring the
    coding style up to the standard of your company.) The function
    headers are suggested minimums. If you have a header that provides
    more detail, you may use it.

51) Fix the comments in the source file(s). Remove useless comments.
    Add useful block-style comments. Fix any indentation and formatting
    problems.
    See programming_style.txt

52) In a new file named "README.txt", enter these three things:

    a) Completion: Summarize briefly the results of your testing. If any
       of your Makefile tests failed, or if the netcat test did not work
       properly, document which tests failed. If everything worked,
       say so.

    b) Objectives: Comment on the assignment objectives vs. the course
       outline and indicate if, in your opinion, the assignment is
       relevant to the course and contributes to your learning of the
       course material.

    c) Difficulty Level: Using a scale from 0-5, where 0 is "easy",
       document how difficult the assignment was, and how much time you
       spent doing it.

Documentation
-------------

53) Document the code. Did you bring all source files up to programming
    and comment standards? Are comments relevant? Is indentation
    consistent?
    Reference: Notes file programming_style.txt

54) I haven't asked for any separate User Manual for using the server2
    program. Include brief syntax/usage information (including how to run
    the server program) in the comments at the top of server2.c.

55) Submit the files using the exact file names given below.

Submission
----------

Submission Standards:

A. At the top of each and every submitted file, as comments, create an
   Exterior Assignment Submission label following the example you will
   find under the "Assignment Standards" button on my teaching home page
   teaching.idallen.com . (The Teaching home page is not the same as
   the Course home page.)

   For full marks, follow the directions for the label exactly.
   The label has exactly 7 lines, plus an optional Comments line.
   The spelling of the label fields on the seven lines must be exactly
   as shown (machine readable). The spelling must be exact. Exact!

   Every file must have a submission label; use comments as needed.
   Make sure that adding the label doesn't break a source or make file.

B. For material you copy from other sources, credit the author and source.
   (You may only copy code with written permission from your instructor.)
   Failure to acknowledge the source of copied material and claiming you
   wrote it yourself will constitute academic fraud (plagiarism).

C. Using the cstsubmit command:
   Reference Class Notes file: cstsubmit.txt

   Submit these files for marking as Exercise 01 using the following
   *single* cstsubmit command line, including all the given file names:

       $ ~alleni/bin/cstsubmit 01 \
           README.txt server2.c sendall.c sendall.h testing.txt Makefile

   See the Class Notes file cstsubmit.txt for details on using cstsubmit.

   All file names must be spelled *exactly* as given above.
   Always submit *all* files any time you submit.
   Incorrect and past-cut-off-time submissions are worth zero marks.

   This "cstsubmit" program will copy the selected files to me for
   marking. Always submit *all* your files at the same time.
   Do not delete your copies after you submit; keep them.

   Verify that you submitted all your files, using this command line:

       $ ~alleni/bin/cstsubmit 01 -list

   Note that the digit '1' and the letter 'l' (lower-case 'L') are
   different. The digit '0' and the letter 'O' are also different.
   Do not confuse these characters on your screen.

   You may redo this exercise and re-submit your results as many times
   as you like; but, you must always submit *all* your exercise files
   every time. The "-delete" option of cstsubmit will delete the most
   recent submission you have made. I will mark only the most recent
   submission that is submitted before the final hand-in cutoff date.

   For Exercise 01, always use "01" as the first argument to "cstsubmit".
   Always submit *all* the files each time you submit an exercise.

P.S. Did you spell all the assignment label fields and file names correctly?