-----------------------
Lab #01 for CST8165 due January 28, 2008 (Week 4)
-----------------------
-Ian! D. Allen - idallen@idallen.ca - www.idallen.com

Remember - knowing how to find out an answer is more important than
memorizing the answer.  Learn to fish!  RTFM!  (Read The Fine Manual)

Global weight: 5% of your total mark this term.
Interim submission: Submit what you have done so far in lab on January 21.
Due date: before 16h00 Monday January 28

Interim submissions in Week 3:
    You will submit whatever progress you have made on this assignment
    before the end of your weekly lab period in Week 3.

The deliverables for this exercise are to be submitted online in the
Linux Lab T127 using the "cstsubmit" method described in the exercise
description, below.  No paper; no email; no FTP.

Late-submission date: I will accept without penalty exercises that
are submitted late but before 16h00 on Tuesday, January 29.
After that late-submission date, the exercise is worth zero marks.

Exercises submitted by the *due date* will be marked online and your
marks will be sent to you by email after the late-submission date.

Exercise Synopsis and Summary:
    Copy (with proper credit to the original author and source) a TCP
    server from the given sources.  Modify and test the server.
    - Remember how to use Makefiles
    - Remember how to structure programs in multiple .c/.h files
    - Learn to use the socket/bind/listen/accept system calls.
    - Learn to create C preprocessor (cpp) output via "cc -E"

Coding and submission standards:
 - provide File Headers (Program Headers) using my Assignment Label
 - provide Function Headers documenting arguments and return values
 - use Block Comments (see programming_style.txt)
 - error messages must have the four-part format from programming_style.txt

Where to work:
    Submissions must compile and run cleanly in the T127 Linux Lab,
    though you are free to work on them anywhere you like.

Since this course has no textbook, use the Internet instead:
Background tutorial on using sockets: The Sockets Tutorial:
    http://www.cs.rpi.edu/academics/courses/fall96/sysprog/sockets/sock.html
    (alternate: http://www.linuxhowtos.org/C_C++/socket.htm )
    The server2.c code is explained line-by-line in the above web page.

    Beej also has a line-by-line socket programming tutorial here:
    http://beej.us/guide/bgnet/

Helpful code:
------------

a. printf a fixed length string

    You can printf exactly 9 bytes from a buffer (no \0 needed) using:
       printf("%.9s",buf); // print only 9 bytes from buf
    AND it gets better if you use '*' instead of 9 (more useful in this case):
       n = read(fd,buf, .... );
       ...
       printf("%.*s",n,buf); // the "*" means pick up the current value of "n"
    This kind of printf can be useful for buffers that don't have \0 in them.
    (Note that this will *not* work to output binary data, since the
    \0 in binary data will stop both fprintf() and printf().)

    You can also output n bytes in a buffer directly using write(fd,buf,n).
    The write() system call handles binary data correctly.  (printf does not)
     - Standard input (usually your keyboard) is Unix fd 0.
     - Standard output (usually your screen) is Unix fd 1.
     - Standard error (usually your screen) is Unix fd 2.

   WARNING! printf() stops printing when it hits a NUL byte ('\0').
   If you are passing binary data, do not use printf() for output.
   You may use fwrite() to write binary data.
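
    A minimal sketch of the difference, for illustration only.  The helper
    names print_text_bytes() and write_raw_bytes() are hypothetical; they
    are not part of server2.c or of any library:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Print at most n bytes of buf as text using printf().
 * Output stops early at the first '\0' byte, so this is NOT binary safe.
 * Returns the number of characters printf() reports having printed.
 */
static int print_text_bytes(const char *buf, int n)
{
	assert(buf != NULL && n >= 0);
	return printf("%.*s", n, buf);
}

/*
 * Write exactly n raw bytes of buf to the stream out using fwrite().
 * NUL bytes are written like any other byte, so this is binary safe.
 * Returns the number of bytes fwrite() reports having written.
 */
static size_t write_raw_bytes(FILE *out, const char *buf, size_t n)
{
	assert(out != NULL && buf != NULL);
	return fwrite(buf, 1, n, out);
}
```

    Given a five-byte buffer with a \0 in the middle, the printf() version
    stops at the \0 while the fwrite() version writes all five bytes.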

b. proper use of buffer sizes

    Never do this:   char buf[256]; ... read(fd,buf,256);
    Do this:         char buf[MYBUFSIZ]; ... read(fd,buf,sizeof(buf))

    Buffer sizes must be set in only *one* place for easy maintenance.
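
    A small sketch of the idea, assuming a hypothetical helper named
    read_one_chunk() (the name and the value 256 are illustrative only):

```c
#include <assert.h>
#include <unistd.h>

#define MYBUFSIZ 256		/* the one and only place the size is set */

/*
 * Read once from fd into a local buffer whose size comes from MYBUFSIZ.
 * Returns whatever read() returns: a byte count, 0 on EOF, -1 on error.
 */
static ssize_t read_one_chunk(int fd)
{
	char buf[MYBUFSIZ];

	assert(fd >= 0);
	/* sizeof(buf) follows the array declaration automatically */
	return read(fd, buf, sizeof(buf));
}
```

    Note that sizeof() only gives the buffer size when applied to the
    array itself; applied to a pointer it gives the size of the pointer.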

c. equivalence of read() and recv(), write() and send() for sockets
    
    If you don't set any special TCP/IP flags in recv() or send(), the
    system calls recv() and read() are the same/equivalent for accessing
    sockets, as are the syscalls send() and write().  You can't use the
    socket syscalls recv() or send() on file descriptors that are *not*
    sockets (even if the TCP/IP flags are zero); using read() and write()
    works for both sockets and ordinary files.

    All of the code in this lab uses plain read() and write() instead
    of recv() and send().  See also:  "man 2 recv" and "man 2 send"

d. do not clobber errno before perror() or error()
    
    If you want to use [f]printf() before a call to perror() or error(),
    read the NOTES section of "man 3 errno" first!  Rather than using
    [f]printf(), you could use snprintf() to construct the string you
    wish to pass to perror().  (Why is using [f]printf() before perror()
    not allowed; but, using snprintf() before perror() is allowed?)

    Do you know how to write va_list (variadic, varargs) functions?
    You might find such a function useful for printing error messages;
    since, it avoids the need for buffers and snprintf().

    http://www.gnu.org/software/libc/manual/html_node/Variadic-Functions.html

    The GNU-specific function error() takes a printf-like list of
    arguments and can optionally append the system error text (as
    perror() would) for you.  I recommend it.
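
    One possible shape for such a variadic function is sketched below.
    This is an illustration only: format_sys_error() is a hypothetical
    name, and saving errno in a local variable is a different technique
    from the snprintf()-then-perror() approach described above.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a printf-style message into out (of size outsz), then append
 * ": " and the system error text for the errno value in effect on entry.
 * Returns the total length written, or the vsnprintf() result if the
 * message failed to format or was truncated.
 */
static int format_sys_error(char *out, size_t outsz, const char *fmt, ...)
{
	int saved = errno;	/* copy errno before calling anything else */
	va_list ap;
	int len;

	assert(out != NULL && outsz > 0 && fmt != NULL);
	va_start(ap, fmt);
	len = vsnprintf(out, outsz, fmt, ap);
	va_end(ap);
	if (len < 0 || (size_t)len >= outsz)
		return len;	/* error or truncated: nothing more fits */
	return len + snprintf(out + len, outsz - (size_t)len,
			      ": %s", strerror(saved));
}
```

    The resulting string could then be written to standard error in one
    call, without any risk that an intermediate call has changed errno.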

Recall that most any program or C function has a manual page: RTFM

Part I - Coding a TCP/IP server
-------------------------------

0)  Go back and read the above sections you just skipped over.

1)  Find the Sockets Tutorial:
    http://www.cs.rpi.edu/academics/courses/fall96/sysprog/sockets/sock.html
    (alternate: http://www.linuxhowtos.org/C_C++/socket.htm )
    The server2.c code is explained line-by-line in the Sockets Tutorial.

2)  Copy the code for the second C server "server2.c" from the section
    "Enhancements to the server code".  Credit your code sources in
    comments in the code after you have copied it.  (No plagiarism!)

    See "man wget" for an easy way to download any URL to your current
    directory.  Don't cut-and-paste - the tabs will be all wrong.

3)  Create a Makefile target "all" for this code that compiles
    server2.c into "server2" using gcc and these gcc required options:

    CFLAGS = -g -O -Wall -Wextra -Wshadow -Wstrict-prototypes \
          -Wmissing-prototypes -Wmissing-declarations \
          -Wdeclaration-after-statement -Wmissing-field-initializers \
          -Wredundant-decls -Wunreachable-code -fstack-protector-all

    Note the spelling of "CFLAGS".  Your Makefile needs that spelling.
    Strive for the minimal Makefile - don't include command lines
    for things that "make" will do for you automatically (such as
    automatically compile .o files from .c files using CFLAGS).

4)  Create a Makefile target "clean" that removes the compiled files,
    leaving only the source file.  Make sure that running "make clean"
    twice doesn't generate errors.

5)  Fix the gcc compile warnings by adding the missing header files:
    - use the man pages to find the missing header lines for each function
    - fix the warning about signed/unsigned use of int
    - make the error() function static instead of global
    - you may still have one spurious "will never be executed" warning
      on the line containing htons() that won't go away; you can ignore it

6)  Add this line to main():  alarm(30*60); /* kill program after 30 minutes */
    - this is in case you forget to kill the program yourself when you leave!
    - note the correct place to put this line.
    - test it using an alarm of 10 seconds instead of 30 minutes.
    - to see your background processes on Linux/Unix, use command: ps gx

7)  Add code to check the return code of "listen()" and handle any error.

    Print a message on stderr if listen() fails and exit the program
    with a non-zero return code.  Since listen() is a system call,
    you may use perror() to print the reason for the system call failure.

8)  Fix "The zombie problem" using the AIX-style fix from the Tutorial.
    Use the AIX one-line fix, not the fix that uses a signal handler.

9)  Pick up the setsockopt SO_REUSEADDR code fragment from section 4.2 of
    http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html and add
    it to your server before calling bind().  This will make the server
    re-use socket ports nicely and avoid "Address already in use".
    - note which variables you need to create to make this code work.
    - note the correct place to put this code.

    This Unix command shows active TCP connections:  netstat -natp
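
    A sketch of how the option might be wrapped in a function is shown
    below.  The function name allow_port_reuse() is hypothetical, and
    this is not Beej's exact fragment; check it against the reference.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Ask the kernel to allow a listening port to be re-bound promptly.
 * Intended to be called after socket() and before bind().
 * Returns 0 on success, or -1 with errno set by setsockopt().
 */
static int allow_port_reuse(int sockfd)
{
	int yes = 1;	/* option value: non-zero turns the option on */

	assert(sockfd >= 0);
	return setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR,
			  &yes, sizeof(yes));
}
```

    As with every system call in this lab, the return value should be
    checked and any failure reported.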

10) Fix your server2.c to accept only *exactly* one port argument, not
    more, not less.  Never ignore user input.  Do not run the program
    if too many or too few arguments are given.

11) Fix your server2.c to convert INADDR_ANY to network byte order.
    Reference: http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html#bind
    Read the highlighted paragraphs above and below the one containing:
    "you might have seen that I didn't put INADDR_ANY into Network Byte
    Order! Naughty me."  Note the change in the line "// use my IP address".
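
    For illustration, a sketch that fills in a socket address with both
    fields in network byte order (fill_any_address() is a hypothetical
    helper name, not something from the Tutorial or from Beej):

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Fill in *addr to accept connections on every local interface at the
 * given port.  The address and the port are both converted to network
 * byte order before being stored.
 */
static void fill_any_address(struct sockaddr_in *addr, unsigned short port)
{
	assert(addr != NULL);
	memset(addr, 0, sizeof(*addr));
	addr->sin_family = AF_INET;
	addr->sin_addr.s_addr = htonl(INADDR_ANY);	/* long: htonl() */
	addr->sin_port = htons(port);			/* short: htons() */
}
```

    INADDR_ANY happens to be zero, so the conversion does not change its
    value; converting it anyway keeps the code correct for other addresses.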

12) If you (accidentally) write on a closed socket, your program will be
    killed with SIGPIPE, without any error message.  Ignore the SIGPIPE
    signal in your server.  Add this code near the start of main():

       signal(SIGPIPE,SIG_IGN);

    This will prevent your server from being killed (with no error
    message) if you accidentally write onto a socket that has been closed.

    From man 7 socket:

      "When  writing onto a connection-oriented socket that has been shut
       down (by the local or the remote end) SIGPIPE is sent to the writing
       process and  EPIPE  is returned."
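
    The effect can be seen in the sketch below, which is an illustration
    only.  The helper name write_or_errno() is hypothetical, and in a
    real server the signal() call belongs once near the start of main(),
    not inside a write helper.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Ignore SIGPIPE, then try to write n bytes of buf to fd.
 * Returns 0 if write() succeeds, otherwise the errno value that
 * write() set (EPIPE if the other end has gone away).
 */
static int write_or_errno(int fd, const void *buf, size_t n)
{
	assert(fd >= 0 && buf != NULL);
	signal(SIGPIPE, SIG_IGN);	/* do not die on a broken pipe */
	if (write(fd, buf, n) < 0)
		return errno;
	return 0;
}
```

    Writing to a pipe whose read end is closed behaves the same way as
    writing to a shut-down socket: with SIGPIPE ignored, the process
    survives and the write fails with EPIPE, which can then be reported.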

13) Replace the deprecated bzero() function everywhere.  (See "man bzero".)
    No programs should be using bzero() any more.

14) The current server only accepts one line and then exits.  Verify that
    this is true before you continue, using the standard TCP client netcat:

      $ nc -v localhost <portnum>

    e.g. you would first start your server in the background (or in a
    separate window) and then connect to it using netcat:

      $ ./server2 55555 &
      $ nc -v localhost 55555
      hi
      [...some server output prints here and netcat exits...]
      $ killall server2

Part II - making the server loop
--------------------------------

We will recode the server to loop reading all the lines from the client,
returning them all to the client (an echo server), as follows:

15) Rewrite the server2.c dostuff() function incrementally as follows:
    (Rename this dostuff() function to something meaningful!)

      a. Delete the server's "Here is the message" line.

      b. Replace the 18-byte write() statement by writing all the bytes
         received from the client back to the client (to the same socket).
         Only write the bytes that were actually read from the client.

      c. Recode dostuff() to remove the need for bzero()/memset().
         Zeroing out the entire buffer is unnecessary and wasteful.  Recode it.

      d. Using the pseudocode developed in class, recode dostuff() to
         loop reading/writing lines from/to a connected client until
         either EOF (zero bytes read) or error.  Make sure the loop
	 handles binary data.  (Don't use printf and friends.)

      e. Issue the following message on standard error when EOF is seen:

           XXX: EOF reading from unit NNN

	 where XXX is the program name, and NNN is the I/O descriptor
	 number on which EOF is detected.  (You can use the GNU error()
	 function to write this message.)

      f. Issue the following message on standard error just before the
         server's forked client process exits:

           XXX: done with unit NNN

	 where XXX is the program name, and NNN is the I/O descriptor
	 number used by the client.  (You can use the GNU error()
	 function to write this message.)

    Make sure the code continues to test for errors on all system calls.

16) When writing to a network socket, the write() may write fewer
    bytes than you ask it to write.  Reference:

       http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html#sendrecv
       http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html#sendall

    In server2.c, replace the write() with a function sendall() that
    loops sending bytes until either all the bytes are sent, or error.
    You can get the sendall() function from here:

       http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html#sendall

17) Copy (with credit) that sendall() function from beej and fix the
    missing-initialization bug that happens if "*len" is zero.

18) If you want to generalize sendall() to write to a file descriptor
    that is not a socket (e.g. to your screen), you must ensure that
    sendall() uses write() and not send().  Both send() and recv()
    only work on sockets; write() and read() work on both sockets and
    ordinary file descriptors.  Replace send() with write() for maximum
    portability.
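
    The general shape of a loop that copes with short writes is sketched
    below.  This is not Beej's sendall(): the name write_fully() and its
    interface are hypothetical, and the retry on EINTR is an addition
    that the lab does not ask for.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Keep calling write() until all len bytes of buf have been written
 * to fd.  Works on sockets and on ordinary file descriptors.
 * Returns 0 on success, or -1 on error with errno set by write().
 */
static int write_fully(int fd, const char *buf, size_t len)
{
	size_t done = 0;	/* number of bytes written so far */

	assert(fd >= 0 && (buf != NULL || len == 0));
	while (done < len) {
		ssize_t n = write(fd, buf + done, len - done);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted: try again */
			return -1;
		}
		done += (size_t)n;	/* may be fewer than requested */
	}
	return 0;
}
```

    Because the counter is initialized before the loop, a request to
    write zero bytes simply returns success without calling write().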

19) Put sendall() in its own file sendall.c and create sendall.h
    Reference: Class Notes file header_files.txt
    For full marks, do not put unnecessary #includes in your *.h files.

20) Update your Makefile to handle sendall.c and sendall.h
    Make sure that your server2 is rebuilt if any of its source files change.
    Test this!

Part III - cleaning up the error() messages
-------------------------------------------

Internet-facing programs need to validate all input and give good error
messages.

21) Read the Notes file programming_style.txt
    Pay attention to the required four-part format of error messages.
    Improve all the error messages to have the four-part format described
    in Notes file programming_style.txt.

    *Every* error message must be output on standard error and be prefixed
    by the actual name of the program.  Do not hard-code the program
    name into the code; use the actual name from the command line.
    If you rename the binary, the error message name must also change.
    (The GNU error() function does this for you, automatically.)

    System call errors must have the correct system call error text
    output (as would be output by perror() or any related functions).
    Non-system-call errors must *not* try to call perror().

    The GNU extension function "error()" is useful for printing these
    four-part error messages; though, it is not portable and would have
    to be fixed for a non-GNU environment.  See "man 3 error".

    Remove the existing error() function from the code and convert all
    the error messages to use the multi-argument GNU error() function
    instead.  Supply the "errno" argument only for system calls that fail.
    Examples of using GNU error():

    if( systemcall(port) < 0 )
    	error(1,errno,"ERROR in systemcall using port %d",port);
    if( argc > 1 )
    	error(1,0,"This program does not accept arguments");
    if( frob > 2*fritz )
    	error(0,0,"Warning: frob %d larger than double fritz %d", frob, fritz);

22) Verify that your program works using the standard TCP client netcat:

      $ nc -v localhost <portnum>

    e.g. you would start your server in the background and then connect:

      $ ./server2 55555 &
      $ nc -v localhost 55555
      hi
      hi   (lines that you type should echo back to you via the server)
      ^C   (interrupt netcat)
      $ killall server2

      You should also test that binary data passes through your server
      correctly, using shell I/O redirection to netcat.

Part IV - looking at C pre-processor output
-------------------------------------------

The server2.c code we are working on has a "will never be executed"
warning on code that is clearly being executed.  To find out the cause
of this warning, we need to look at what changes the C preprocessor has
made to our source file before the compiler sees it and compiles it.

You must have a working Makefile that can build server2.  Consult your
instructor for assistance in getting a broken Makefile to work before
you continue.

23) Add the following variable near the top of the Makefile:
    
    CF2 = $(CFLAGS) -E

24) Find the explanation of what the "-E" option does in the gcc man
    page.  Read it.  Note that the output of "cc -E" is preprocessor
    output - it is text C code.  The output is not an object file and
    is not a binary executable!  Why are there no "#include", "#define",
    or "#ifdef" statements in the output of "cc -E"?

25) To the bottom of your Makefile, add a new Makefile target "serverX.c"
    with dependency "server2.c" that runs gcc with the $(CF2) options to
    pre-process source file server2.c into output source file "serverX.c".

    Unlike Part I, where you could use the default Makefile actions to
    call the gcc compiler, in this case you *must* specify the full gcc
    compile line, including specifying the output file, so that you can
    pass the $(CF2) options to gcc and capture the preprocessor output.

26) Add serverX.c to the list of files to be cleaned in "make clean".
    Make sure that running "make clean" twice doesn't generate errors.
    (Test it!)

27) Run "make serverX.c" to run the gcc pre-processor to create source file
    "serverX.c" from server2.c using the $(CF2) flags.  Confirm that all
    the CFLAGS options are used, plus the new -E option.  Confirm that
    the created file "serverX.c" is C source code - the output of running
    the C preprocessor on the server2.c source file.

28) Look at the bottom 60 lines of the serverX.c source file, where the
    serv_addr.sin_port and htons() code gives "unreachable" warnings.
    (You saw this htons() warning "will never be executed" in Part I,
    step 5 - it was the only warning you couldn't get rid of.)

    Note what a mess the htons() code has expanded into after going
    through the gcc pre-processor!  The resulting unreachable code in
    the "if" statement is the reason this line generates an unreachable
    code warning.  You would never know this unless you looked at the
    output of the C pre-processor.  Looking at pre-processor output
    (using -E on Unix) is often the only way to know what code you are
    actually compiling, and why some compile errors and warnings happen.

29) Experiment temporarily with gcc options to find out which compiler
    flag is causing the htons() code to be expanded into the "unreachable
    code" line.  (Hint: Removing one gcc option will cause htons() to
    be left unexpanded and untouched.  Which option?)  Do not remove
    the -Wunreachable-code warning as you experiment.

30) After you have identified the gcc option that causes htons() to be
    expanded, restore the gcc options to their standard form, except,
    *remove* the "-g" option from the standard CFLAGS list.

31) Add a new binary Makefile target "serverX" that depends on "serverX.c".
    Rely on the default "make" rules to build this target from source.

    Your Makefile will now have these five targets:

       all  server2  clean  serverX.c  serverX

32) Add the serverX binary to the list of files to be cleaned in "make clean".
    Make sure that running "make clean" twice doesn't generate errors.
    
33) Run "make serverX" to compile and create binary file "serverX".
    Run "make all" or "make server2" to create binary file "server2".
    Confirm that most of the CFLAGS options are used in both cases.
    Make sure that "-g" is *not* used for this step.

34) Run "cmp -l -b server2 serverX" and note the one-byte difference
    in the two binary files.  Why is that byte different?

    To help understand the output of "cmp -l -b", try this:
       $ echo "one two three" >a
       $ echo "one abc three" >b
       $ cmp -l -b a b

    Some system man pages don't document the -b option to "cmp" - try
    "cmp --help" or look up other man pages on the Internet by searching
    for "man cmp".  Old versions of cmp use "-c" instead of "-b".

35) Put back the full set of CFLAGS options to gcc, including -g.
    (Include all the options mentioned earlier, including -g and -O.)
    Rebuild your binaries using the full set of options.

Part V - The Tests
------------------

Here are some tests to run now to make sure your Makefile is correct.
A later section shows how to run these tests in a "script" session.
Put back the full set of CFLAGS options to gcc given at the start of
this Lab.  Include all the options, including -g and -O.

36) $ make clean
    - anything created by the Makefile should be removed here
    - are all the objects created by the Makefile removed?
    - typing "make clean" twice should not generate any error messages,
      even if everything has been removed already

37) $ make clean all ; make all ; make all
    - is server2 built just once?

38) $ make clean ; make ; make ; make
    - is server2 built just once?

39) $ make clean all ; sleep 2 ; touch server2.c ; make all ; make all
    - is server2 built exactly twice?

40) $ make clean serverX.c ; make serverX.c
    - is serverX.c built just once?

41) $ make clean serverX ; make serverX
    - is serverX.c built from server2.c?
    - is serverX built just once?

42) $ make clean serverX ; sleep 2 ; touch server2.c ; make serverX serverX
    - is serverX built exactly twice?

Capturing the tests using a script session
------------------------------------------

43) When you are ready to demonstrate that your program works, start a
    "script" session (man script) using my script cover and with output
    file "testing.txt" (note the pathname to my script cover):

       $ ~alleni/bin/script testing.txt

    This will save your terminal command lines and output into the given
    file name.  Make sure you use my cover script, as shown above.

    Note: Your output file "testing.txt" will not contain the full
    results of your script session until you exit the subshell started
    by the script command.  When you exit the subshell, script will tell
    you the name of the output file.  See the next few steps:

44) Run the above Makefile tests #36-#42.

45) Test your error() code by starting your server2 using a privileged
    port argument number greater than zero and less than 1024 (e.g. 123).
    
    Make sure you see the bind() error message: Permission denied
    Confirm that the above system call error text appears and that it
    appears on standard error (not on standard output).  You must also
    show the invalid port number to the user as part of the error message.

    Why don't you have permission to bind to ports less than 1024 on Unix?

46) Start your server in the background, as you did earlier.  Use
    netcat ("man nc") to connect to your server and demonstrate that
    input from a binary file goes via your server and returns accurately:

      $ ./server2 55555 &
      $ nc localhost 55555 </bin/bash >out
      $ cmp /bin/bash out     # should be no differences

    On some Unix systems, you may need the "-q 5" option to netcat to
    enable it to pass on EOF to the remote server.

47) After testing, kill your background server using killall.  (See above.)

48) Exit the script session subshell.
    Script will tell you the name of your script output file:

       $ exit
       Script done, file is testing.txt
       $ less testing.txt

    You may now examine your saved script session.  Make sure your
    session doesn't contain anything except test data - no vim sessions
    or other junk.

49) Fix the inconsistent indentation in all your source files.  One quick
    way to do this is the command "indent -kr -i8 *.c"; but, you may
    want to add more options to treat comments differently.

    Review: http://lxr.linux.no/source/Documentation/CodingStyle
    
50) Using the given examples in programming_style.txt, add internal
    comments and proper file and function header comments to all
    the source files and functions.  (The Assignment Label is not a
    programming header comment - add a proper file header.)

    You must fix or add comments to all the code, even if you didn't
    write the code yourself.  (If you use code from other sources, you
    have to bring the coding style up to the standard of your company.)

    The function headers are suggested minimums.  If you have a header
    that provides more detail, you may use it.

51) Fix the comments in the source file(s).  Remove useless comments.
    Add useful block-style comments.  Fix any indentation and
    formatting problems.  See programming_style.txt

52) In a new file named "README.txt", enter these three things:
    a) Completion: Summarize briefly the results of your testing.  If any
       of your Makefile tests failed, or if the netcat test did not work
       properly, document which tests failed.  If everything worked, say so.
    b) Objectives: Comment on the assignment objectives vs. the course
       outline and indicate if, in your opinion, the assignment is
       relevant to the course and contributes to your learning of the
       course material.
    c) Difficulty Level: Using a scale from 0-5, where 0 is "easy",
       document how difficult the assignment was, and how much time you
       spent doing it.

Documentation
-------------

53) Document the code.  Did you bring all source files up to programming and
    comment standards?  Are comments relevant?  Is indentation consistent?

    Reference: Notes file programming_style.txt

54) I haven't asked for any separate User Manual for using the server2
    program.  Include brief syntax/usage information (including how
    to run the server program) in the comments at the top of server2.c.

55) Submit the files using the exact file names given below.

Submission
----------

Submission Standards:

A.  At the top of each and every submitted file, as comments, create an
    Exterior Assignment Submission label following the example you will
    find under the "Assignment Standards" button on my teaching home page
    teaching.idallen.com .  (The Teaching home page is not the same as
    the Course home page.)

    For full marks, follow the directions for the label exactly.
    The label has exactly 7 lines, plus an optional Comments line.

    The spelling of the label fields on the seven lines must be exactly
    as shown (machine readable).  The spelling must be exact.  Exact!

    Every file must have a submission label; use comments as needed.
    Make sure that adding the label doesn't break a source or make file.

B.  For material you copy from other sources, credit the author and source.
    (You may only copy code with written permission from your instructor.)
    Failure to acknowledge the source of copied material and claiming
    you wrote it yourself will constitute academic fraud (plagiarism).

C.  Using the cstsubmit command: Reference Class Notes file: cstsubmit.txt
    Submit these files for marking as Exercise 01 using the following
    *single* cstsubmit command line, including all the given file names:

    $ ~alleni/bin/cstsubmit 01 \
       README.txt server2.c sendall.c sendall.h testing.txt Makefile

    See the Class Notes file cstsubmit.txt for details on using cstsubmit.
    All file names must be spelled *exactly* as given above.
    Always submit *all* files any time you submit.
    Incorrect and past-cut-off-time submissions are worth zero marks.

    This "cstsubmit" program will copy the selected files to me for
    marking.  Always submit *all* your files at the same time.  Do not
    delete your copies after you submit; keep them.

    Verify that you submitted all your files, using this command line:

	$ ~alleni/bin/cstsubmit 01 -list

    Note that the digit '1' and the letter 'l' (lower-case 'L') are different.
    The digit '0' and the letter 'O' are also different.
    Do not confuse these characters on your screen.

    You may redo this exercise and re-submit your results as many times as you
    like; but, you must always submit *all* your exercise files every time.

    The "-delete" option of cstsubmit will delete the most recent submission
    you have made.  I will mark only the most recent submission that is
    submitted before the final hand-in cutoff date.

    For Exercise 01, always use "01" as the first argument to "cstsubmit".
    Always submit *all* the files each time you submit an exercise.

P.S. Did you spell all the assignment label fields and file names correctly?