-------------------------
Week 04 Notes for CST8165
-------------------------
-Ian! D. Allen - idallen@idallen.ca - www.idallen.com

Remember - knowing how to find out an answer is more important than
memorizing the answer.  Learn to fish!  RTFM!  (Read The Fine Manual)

Midterm test coming Week 5 - see the questions in the Notes files

Interim submission for Lab 3 required on October 2 - see Lab #3.

Review:
 - you know the basic inputs/outputs of the Unix syscalls:
   socket,bind,listen,accept,read/recv,write/send,close
 - you can program a forking TCP/IP echo server that accepts multiple
   clients and reads and writes one line

Q: Give the PDL for a forking "echo" server that creates a new process
   for each new client connection that reads one line and writes one line
   back to the client

Notes to read:
    programming_style.txt
    deep_indentation.txt 
    buffer_overflows.txt
    eof_handling.txt
    header_files.txt
    makefiles.txt
    Optional (but useful): myerror.c.txt  

Reminder for assignments:
  programming_style.txt
   - keep lines less than 80 characters
   - indentation is critical
  header_files.txt
   - know what goes in header files, and what #includes too

Sending EOF from the keyboard - ^D
----------------------------------

The truth about keyboard ^D and end-of-file (EOF)

Typing the character ^D at your keyboard does not actually signal EOF to
a process.  What ^D does is act somewhat like pushing the RETURN key,
in that whatever characters have been typed since the last RETURN
are sent to the receiving process that is reading your keyboard.
The ^D character itself is never sent.

Unlike pushing the RETURN key, using ^D does not send a newline - it
simply flushes the characters.  If there are no characters to flush,
e.g. ^D is typed right after starting your process or right after pushing
RETURN, then the ^D sends zero characters to the process.  When a process
reads zero characters, it interprets that to mean EOF.

If you type a few characters on your keyboard and then type ^D instead
of RETURN, the ^D sends those few characters to the process without a
newline on the end.  (The ^D character itself is never sent.)  If you
then immediately type a second ^D, the second ^D sends zero characters,
and the process interprets that read of zero characters as EOF.  The ^D
means "send now", and if you send zero bytes, that's interpreted as EOF.
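
A minimal sketch of how a process sees this, assuming the terminal is in
its usual line-at-a-time mode: each RETURN or ^D completes one read(), and
a read() that returns zero bytes is treated as EOF.  The function name
count_reads_until_eof() is made up for this illustration.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper: count how many read() calls return data before
 * a read of zero bytes (EOF) ends the loop.  Returns -1 on read error. */
int count_reads_until_eof(int fd)
{
    char buf[256];
    int count = 0;
    ssize_t numread;

    assert(fd >= 0);
    while ((numread = read(fd, buf, sizeof buf)) > 0) {
        count++;            /* one or more bytes arrived; not EOF yet */
    }
    /* numread == 0 means EOF; numread == -1 means a read error */
    return (numread == 0) ? count : -1;
}
```

Called on fd 0 with the keyboard as input, typing a few characters then ^D
twice should produce one read with data followed by the zero-byte read.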

Writing to Network sockets
--------------------------

Writes to sockets can be incomplete!  The system may not write all the
bytes you asked for - write() or send() may return a count that is
smaller than the size you asked to send.  You need to loop to send the
remaining bytes.

 - http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html#sendall
   - can "*len" be replaced by "len" in this function?
   - BUG: what value is returned by sendall() if *len <= 0 ?
   - why can't sendall() just return the number of bytes written?
   - if you want to generalize sendall() to write to a file descriptor
     that is not a socket, you must ensure that sendall() uses write()
     and not send().  send() only works for sockets; write() works for
     anything (including sockets, pipes, fifos, files, etc.)
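
The generalization in the last point can be sketched as below.  This is
not Beej's sendall(); the name writeall() and its argument list are
assumptions made for this illustration, and it does not retry writes that
are interrupted by signals.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: write all len bytes of buf to fd using write(),
 * so it works on sockets, pipes, fifos, and files.
 * Returns 0 on success or -1 on error; in both cases *written is set
 * to the number of bytes actually written. */
int writeall(int fd, const char *buf, size_t len, size_t *written)
{
    size_t total = 0;

    assert(buf != NULL && written != NULL);
    while (total < len) {
        ssize_t n = write(fd, buf + total, len - total);
        if (n == -1) {
            *written = total;   /* some bytes may already be out */
            return -1;
        }
        total += (size_t)n;     /* partial write: loop for the rest */
    }
    *written = total;
    return 0;
}
```

The separate return value and byte count let the caller tell "failed after
writing some bytes" apart from "failed before writing anything".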

Q: T/F writes to network sockets may only write some of the requested bytes

Q: T/F the sendall() function may write some bytes but still return -1
   indicating an error

Q: why does the sendall() function need both a return value and a
   pass-by-reference number of bytes written?

Q: under what circumstances will sendall() indicate a positive number
   of bytes written but still return -1 indicating failure?

Note: Reading from network sockets can also be incomplete!  How do you
know you have received "everything" sent by a client?  We'll talk about
handling that later.

Coding
------

Checking for internal consistency using the assert() macro:
  http://www.cs.utah.edu/dept/old/texinfo/glibc-manual-0.02/library_28.html

 "When you're writing a program, it's often a good idea to put in checks
  at strategic places for "impossible" errors or violations of basic
  assumptions. These checks are helpful in debugging problems due to
  misunderstandings between different parts of the program."
  - mentions the assert() macro that will abort your program, printing
    the file and line number where the abort happened
  - use assert() ("man assert") to find bugs in your program
    - e.g.  assert( numread > 0 );
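
A small sketch of assert() guarding an "impossible" condition.  The
function count_newlines() is made up for this illustration; it assumes
its caller has already handled read errors and EOF, so a non-positive
numread here would be a bug in the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count newline characters in the first numread
 * bytes of buf.  The caller must pass a positive numread. */
size_t count_newlines(const char *buf, long numread)
{
    size_t count = 0;
    long i;

    assert(buf != NULL);
    assert(numread > 0);    /* aborts with file and line if false */
    for (i = 0; i < numread; i++) {
        if (buf[i] == '\n')
            count++;
    }
    return count;
}
```

Compiling with -DNDEBUG turns assert() into nothing, so never put code
with needed side effects inside the assert() expression.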

Q: What purpose does using the assert() macro have in a program?

VARIADIC/VARARGS functions take a variable number of arguments (e.g. printf)

    Do you know how to write va_list (variadic, varargs) functions?
    I highly recommend converting myerror() to varargs, and also creating
    a varargs myperror() that prints but does not exit (or modify
    myerror() not to exit and add exit() calls where needed).  The two
    functions can share the same varargs base function vmyperror(fmt,ap)
    that takes a va_list argument, in the manner of vfprintf().
    You might find such functions useful for printing error messages,
    since they avoid the need for buffers and snprintf().  Details:

    http://www.gnu.org/software/libc/manual/html_node/Variadic-Functions.html

    See the Notes file: myerror.c.txt
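
A minimal sketch of the shared-base-function pattern described above.
It is not the code in myerror.c.txt: the names vlogmsg(), logmsg(), and
logfatal() are made up, and the FILE * argument is an assumption added
so the output stream can be chosen (pass stderr for error messages).

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Base function in the manner of vfprintf(): takes a va_list.
 * Prints the formatted message plus a newline; no buffer needed. */
int vlogmsg(FILE *out, const char *fmt, va_list ap)
{
    int n;

    assert(out != NULL && fmt != NULL);
    n = vfprintf(out, fmt, ap);
    fputc('\n', out);
    return n;               /* characters formatted, or negative */
}

/* Print a message and return to the caller. */
int logmsg(FILE *out, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vlogmsg(out, fmt, ap);
    va_end(ap);
    return n;
}

/* Print a message and exit with a failure status. */
void logfatal(FILE *out, const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    vlogmsg(out, fmt, ap);
    va_end(ap);
    exit(EXIT_FAILURE);
}
```

A call might look like: logmsg(stderr, "bind port %d failed", port);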

Q: Why use VARIADIC/VARARGS functions for error messages?

Error messages should only show information from the command line
if that information is relevant to the error:

  - errors from bind() and connect() can depend on command line arguments
    - the error messages must include the user's supplied arguments
  - errors from socket() and listen() have nothing to do with the command line
    - the error messages do not need to show command line arguments

Q: T/F accept() syscall errors are related to the command line arguments

Our client and server processes only read and write single lines.
We should fix them to write *all* the lines.

Writing a looping process that reads one fd and writes to another fd:

  - know before you code: what are the terminating conditions for the loop?
  - for clients/servers you need to terminate on these three conditions:
    - when reading the fd: (1) break loop on errors, and (2) break loop on EOF
    - when writing the fd: (3) break loop on errors
  - start coding a loop with "while(1)" - don't worry about putting any
    conditions in the while loop test at the top until you have finished
    writing the loop body.
    Perhaps the loop will be cleaner if each of the terminating conditions
    uses "break" in the body of the loop and you keep "while(1)" at the top?

    /* this loop has three terminating conditions, as given above */
    WHILE 1
       numread = CALL READ to get some data into a fixed-size buffer
       IF read error THEN print error message and break loop  /* (1) */
       IF end-of-file THEN print EOF message and break loop   /* (2) */
       ASSERT( numread > 0 )      /* man 3 assert */
       WRITE from buffer the numread bytes of data that was read
       IF write error THEN print error message and break loop /* (3) */
    END WHILE

    The "WRITE from buffer" should use the sendall() function, to make
    sure all the bytes are written.  Only write the number of bytes
    "numread" that were read; don't write the whole buffer!
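
The PDL above might be coded roughly as follows.  This is a sketch under
assumptions: the name copy_fd(), the buffer size, and the message texts
are made up, and it uses an inline write() loop for partial writes and
"return -1" instead of "break" for the two error conditions.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

#define BUFSIZE 4096    /* assumed fixed buffer size */

/* Hypothetical helper: copy infd to outfd until EOF or error.
 * Returns 0 on EOF, -1 on a read or write error. */
int copy_fd(int infd, int outfd)
{
    char buf[BUFSIZE];

    while (1) {
        ssize_t done = 0;
        ssize_t numread = read(infd, buf, sizeof buf);

        if (numread == -1) {            /* (1) read error */
            perror("read");
            return -1;
        }
        if (numread == 0) {             /* (2) EOF is not an error */
            fprintf(stderr, "EOF\n");
            break;
        }
        assert(numread > 0);
        /* write only the numread bytes read, not the whole buffer */
        while (done < numread) {
            ssize_t n = write(outfd, buf + done, (size_t)(numread - done));
            if (n == -1) {              /* (3) write error */
                perror("write");
                return -1;
            }
            done += n;                  /* partial write: keep going */
        }
    }
    return 0;
}
```

For example, copy_fd(0, 1) would copy standard input to standard output
until EOF.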

Q: Write the PDL for any process that wishes to read data from one
   place and write it to another place.
   - what are the three terminating conditions for the loop?

Q: T/F "EOF" is an error condition that should be followed by perror()

Q: How many of the above loops are coded in an echo server (server2.c)?

Q: How many of the above loops are coded in a client (client.c)
   that reads from your keyboard and writes to a server, and reads
   from the server and writes to your screen?

References to Notes files (required reading):
---------------------------------------------

    programming_style.txt
    deep_indentation.txt 
    buffer_overflows.txt
    eof_handling.txt
    header_files.txt
    makefiles.txt
    screendumps.txt
    Optional (but useful): myerror.c.txt

*** End of material covered in first midterm test on October 2 ***