-------------------------
Week 03 Notes for CST8165
-------------------------
-Ian! D. Allen - idallen@idallen.ca - www.idallen.com

Remember - knowing how to find out an answer is more important than
memorizing the answer.  Learn to fish!  RTFM!  (Read The Fine Manual)

Q: what IP address is this (as a dotted quad):   int ipaddr = -1;  ?

Q: T/F the opposite of  x > 0  is  x < 0   ?

Q: T/F   if(x>0)  is the same as  if(!(x<0)) ?

Helpful code:
------------

a. printf size

    You can printf at most 9 bytes from a buffer (no \0 needed) using:
       printf("%.9s",buf); // print up to 9 bytes from buf; stops early at a \0
    AND it gets better if you use '*' instead of 9 (more useful in this case):
       n = read(fd,buf, .... );
       ...
       printf("%.*s",n,buf); // the "*" means pick up the current value of "n"
    The value passed for "*" must be an int, and n must be checked for an
    error return (n < 0) before it is used as a length.
    This kind of printf can be useful for buffers that don't have \0 in them.

    You can also output n bytes in a buffer directly using write(fd,buf,n).
     - standard input (usually your keyboard) is Unix fd 0.
     - standard output (usually your screen) is Unix fd 1.
     - standard error (usually your screen) is Unix fd 2.
     - the first unit you open yourself in your program is usually fd 3.
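
    A minimal sketch of both output methods, assuming a buffer that has
    no terminating \0.  The helper names print_bytes() and write_bytes()
    are hypothetical, chosen only for this illustration:

```c
#include <stdio.h>
#include <unistd.h>

/* Print the first n bytes of buf with printf(); buf needs no \0.
 * Returns what printf() returns: the number of bytes printed.
 * Note: printf() still stops early if it meets a \0 inside buf. */
int print_bytes(const char *buf, int n)
{
    return printf("%.*s", n, buf);   /* the precision must be an int */
}

/* Output n bytes of buf directly to fd with write(); \0 bytes are
 * written like any other byte.  Returns what write() returns. */
ssize_t write_bytes(int fd, const char *buf, size_t n)
{
    return write(fd, buf, n);
}
```

    printf() output is buffered and write() output is not, so when both
    go to fd 1 call fflush(stdout) between them to keep the order.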

b. buffer size

    Never do this:   char buf[256]; ... read(fd,buf,256);
    Do this:         char buf[OUT_BUFSIZE]; ... read(fd,buf,sizeof(buf))

    Buffer sizes must be set in only *one* place for easy maintenance.
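
    A short sketch of the one-place rule.  The value 256 and the helper
    name read_once() are assumptions made for this example only:

```c
#include <unistd.h>

#define OUT_BUFSIZE 256   /* the ONE place the size is set (value assumed) */

/* Read once from fd into a local buffer sized by OUT_BUFSIZE.
 * sizeof(buf) gives the array size only because buf is a true array
 * declared in this scope; it would give the size of a pointer if buf
 * were a "char *" function parameter. */
ssize_t read_once(int fd)
{
    char buf[OUT_BUFSIZE];
    return read(fd, buf, sizeof(buf));   /* never asks for more than fits */
}
```

    Changing OUT_BUFSIZE now changes both the buffer and the read()
    limit together; no second number needs to be kept in step.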


Client/Server Programming
-------------------------

References:
  Sockets Tutorial: http://www.cs.rpi.edu/courses/sysprog/sockets/sock.html
  Sockets programming: http://beej.us/guide/bgnet/
  Diagram: http://community.borland.com/article/0,1410,26022,00.html

Review: low-level Unix I/O system calls: open,read/recv,write/send,close
Review: using perror() after a system call fails
 - see last week's notes

Four new Unix networking system calls: socket,bind,listen,accept
 - man 2 socket
 - man 2 bind
 - man 2 listen
 - man 2 accept
 Sockets Tutorial: http://www.cs.rpi.edu/courses/sysprog/sockets/sock.html

equivalence of read() and recv(), write() and send() for sockets:

    For socket programming, you may see recv() used instead of read() and
    send() instead of write().  Both work equally well; recv() and send()
    allow socket options to be passed using an extra parameter.  Warning:
    read/write work on any type of file descriptor (sockets, files, pipes,
    devices, etc.) while recv/send *only* work on sockets.
     - man 2 recv
     - man 2 send

    If you don't set any special TCP/IP flags in recv() or send(), the
    system calls recv() and read() are the same/equivalent for accessing
    sockets, as are the syscalls send() and write().  You can't use the
    socket syscalls recv() or send() on file descriptors that are *not*
    sockets (even if the TCP/IP flags are zero); using read() and write()
    works for both sockets and ordinary files.
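
    The equivalence can be tried out with a small sketch.  To run without
    a network it assumes a connected pair of Unix-domain stream sockets
    from socketpair() instead of a TCP/IP connection; the function names
    are hypothetical:

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Send with write() and receive with recv(), then send with send() and
 * receive with read(), all on sockets and all with flags of zero.
 * Returns the total number of bytes received, or -1 on any failure. */
int mixed_socket_io(void)
{
    int sv[2];
    char buf[16];
    ssize_t a = -1, b = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    if (write(sv[0], "abc", 3) == 3)              /* write() ...          */
        a = recv(sv[1], buf, sizeof(buf), 0);     /* ... pairs with recv() */
    if (a >= 0 && send(sv[0], "de", 2, 0) == 2)   /* send() ...           */
        b = read(sv[1], buf, sizeof(buf));        /* ... pairs with read() */

    close(sv[0]);
    close(sv[1]);
    return (a < 0 || b < 0) ? -1 : (int)(a + b);
}

/* Call recv() on a pipe, which is not a socket.
 * Returns the errno from the failed recv(), 0 if recv() unexpectedly
 * worked, or -1 if the pipe could not be set up. */
int recv_on_pipe_errno(void)
{
    int p[2];
    char c;
    int err = 0;

    if (pipe(p) < 0)
        return -1;
    if (write(p[1], "x", 1) == 1 && recv(p[0], &c, 1, 0) < 0)
        err = errno;              /* save errno right after the failure */
    close(p[0]);
    close(p[1]);
    return err;
}
```

    On a pipe, recv() is expected to fail with ENOTSOCK even though the
    flags argument is zero; read() on the same fd would succeed.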

Q: When can write() be used in place of send() in accessing a socket fd?
   (see "man 2 send")  (Note: You cannot use send() in place of
   write(), unless you are writing to a socket!)

Q: When can read() and recv() be interchanged in accessing a socket fd?

Q: Can recv() and send() be used on non-sockets?

Diagram: http://community.borland.com/article/0,1410,26022,00.html

Q:  In one column list in flow-chart form the Unix system calls made to
    set up a TCP/IP server that loops accepting clients, forking children
    that each read one packet, write one packet, and exit.  In a parallel
    column list the system calls used in a TCP/IP client that sends one
    packet and receives one packet then exits.  Connect the two columns
    with arrows, showing the relationship of the system calls and the
    direction of data travel.

    Google for more TCP/IP client server socket examples and tutorials:
     - http://www.perl.com/doc/manual/html/pod/perlipc.html
     - http://beej.us/guide/bgnet/

Socket programming is similar to low-level Unix file I/O using
open/read/write/close
 - the Unix socket() and accept() system calls return small integer file
   descriptors, just as open() does
 - socket descriptors are just like file descriptors
   - you can use them with read() and write()
     (many socket programs use the equivalent recv() and send())
 - see the simple non-forking sockets server examples:
   http://www.cs.rpi.edu/courses/sysprog/sockets/sock.html
   http://www.cs.rpi.edu/courses/sysprog/sockets/server.c
    - read the explanation of the code in the above socket tutorial
    - note that you should replace the deprecated bzero() with memset()
      - see "man bstring"
 - for our Lab 1 and Lab 2 we use this forking server2.c code:
   http://www.cs.rpi.edu/courses/sysprog/sockets/server2.c
    - a fork()ing server that handles multiple connections
    - the child only reads one single line from a connection, then exits
    - this code does not correctly detect or handle EOF
    - this code inefficiently uses bzero()

The usual order of four network system calls to initialize a TCP/IP server:
 - 1. socket(), 2. bind(), 3. listen(), 4. accept()
 - most TCP/IP servers loop calling accept() to receive multiple connections
   - server may fork() separate child processes to deal with each connection
   - each connection may loop reading/writing the accepted socket,
     to read/write multiple lines from/to the incoming connection
 - the rpi.edu "server.c" only accepts one connection and then exits
 - the rpi.edu "server2.c" loops and fork()s, accepting many connections
   - each connection reads one line and exits
   - this server code does not correctly detect or handle EOF
 - how would you modify server2.c to create "server3.c" to read/write
   multiple lines for each connection?
    - add a loop in the child function dostuff() (rename this function!)
    - remember to check for EOF after read() or recv()
    - recode the function not to need bzero()
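
The four-call sequence above might be sketched as follows.  This is not
the rpi.edu code: the function names make_listener() and accept_client()
are hypothetical, and the backlog of 5 is an arbitrary choice.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Steps 1-3: create a TCP socket, bind it to portno on all local
 * addresses, and start listening.  Returns the listening fd, or -1
 * with errno set by the call that failed. */
int make_listener(unsigned short portno)
{
    struct sockaddr_in serv_addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);            /* 1. socket() */

    if (fd < 0)
        return -1;

    memset(&serv_addr, 0, sizeof(serv_addr));            /* not bzero() */
    serv_addr.sin_family = AF_INET;
    serv_addr.sin_addr.s_addr = htonl(INADDR_ANY);
    serv_addr.sin_port = htons(portno);

    if (bind(fd, (struct sockaddr *)&serv_addr,
             sizeof(serv_addr)) < 0 ||                   /* 2. bind()   */
        listen(fd, 5) < 0) {                             /* 3. listen() */
        int saved_errno = errno;     /* close() might overwrite errno */
        close(fd);
        errno = saved_errno;
        return -1;
    }
    return fd;
}

/* Step 4: block until one client connects.
 * Returns the fd from accept(), or -1 with errno set. */
int accept_client(int listenfd)
{
    struct sockaddr_in cli_addr;
    socklen_t clilen = sizeof(cli_addr);

    return accept(listenfd, (struct sockaddr *)&cli_addr, &clilen); /* 4. */
}
```

A server would call make_listener() once and then loop calling
accept_client(), optionally calling fork() after each accepted connection.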

Sending byte data on the network: Big Endian / Little Endian:
 - what does the function call htons(portno) do?
   - it puts the short integer "portno" into network byte order
 - http://www.cs.rpi.edu/courses/sysprog/sockets/byteorder.html
 - http://www.netrino.com/Publications/Glossary/Endianness.php
 - http://www.rdrop.com/~cary/html/endian_faq.html
 - http://www.unixpapa.com/incnote/byteorder.html
 - "network byte order" is Big Endian (send the most significant byte first)
   - Motorola 680x0, mainframe, and Sun Sparc hardware are big-endian
   - Intel/AMD x86 hardware (e.g. your PC) is little-endian
   - little-endian hardware incurs a byte-swap penalty handling network traffic
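
A small sketch that makes the byte order visible by looking at the raw
bytes in memory.  The function names are hypothetical:

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Copy the in-memory bytes of htons(port) into out[0] and out[1].
 * On any host, out[0] is then the most significant byte, which is
 * the byte that goes out on the network first. */
void port_wire_bytes(uint16_t port, unsigned char out[2])
{
    uint16_t net = htons(port);        /* host order -> network order */
    memcpy(out, &net, sizeof(net));    /* inspect the bytes as stored */
}

/* Returns 1 if this host is little-endian, 0 if it is big-endian. */
int host_is_little_endian(void)
{
    uint16_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);         /* lowest-addressed byte */
    return first == 1;
}
```

For port 0x1234 the two bytes come out as 0x12 then 0x34 on both kinds
of hardware; on a big-endian host htons() simply has nothing to swap.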

Unix read/recv and write/send system call return values:
- for low-level I/O syscalls such as read() and write() that return an integer:
  - a return of less than zero means an error
    - the error reason is put in errno; use perror() to print it
  - a return of zero bytes means EOF when reading via read() or recv()
    - no more reading can be done after EOF is seen
    - the contents of the read() buffer are undefined after EOF; don't use it
    - EOF is not an error - errno is not set
    - do not call perror() after EOF
  - a return of zero means nothing was written when writing with write/send
    - this is not an error: you may need to loop to write everything
    - do not call perror() after writing zero bytes; try again
    - see the sendall() function below under "writing to network sockets"
  - a return of > 0 means you did read or write (some of) the data
    - you may not have read or written *all* of the data!
    - see the sendall() function below under "writing to network sockets"
- EOF is not an error - never call perror() after seeing EOF
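
The three cases above could be handled in one loop as in this sketch.
The name copy_until_eof() and the COPY_BUFSIZE value are assumptions made
for the example; it works on any pair of file descriptors:

```c
#include <stdio.h>
#include <unistd.h>

#define COPY_BUFSIZE 256   /* hypothetical size, set in one place */

/* Copy everything from infd to outfd until EOF on infd.
 * Returns the total bytes copied, or -1 after a failed syscall. */
long copy_until_eof(int infd, int outfd)
{
    char buf[COPY_BUFSIZE];
    long total = 0;

    for (;;) {
        ssize_t n = read(infd, buf, sizeof(buf));
        ssize_t done = 0;

        if (n < 0) {
            perror("read");        /* error: errno is valid right now */
            return -1;
        }
        if (n == 0)
            break;                 /* EOF: not an error, no perror() */

        while (done < n) {         /* n > 0: writing may take several tries */
            ssize_t w = write(outfd, buf + done, (size_t)(n - done));

            if (w < 0) {
                perror("write");
                return -1;
            }
            done += w;             /* w may be less than we asked for */
        }
        total += n;
    }
    return total;
}
```

Only the first n bytes of buf are ever used after a read(), and the
buffer is never touched after the zero return that signals EOF.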

Note that a *successful* Unix system call may or may not change errno:
 - see "man 3 errno"
 - errno is only set for sure after a system call *fails*
 - errno is *undefined* after a successful syscall
 - Thus, you cannot test errno to know if a system call failed
 - Thus, perror() is only meaningful immediately after the failed syscall.
 - If you call other functions that make syscalls (e.g. printf()), they
   may overwrite errno and you will lose the preceding syscall error.
 - The following perror() is incorrect, since printf() may overwrite errno:

    n = write(...);                          /* the syscall we want to check */
    printf("%d bytes written\n", n);         /* may overwrite errno */
    if ( n < 0 ) perror("write failed");     /* WRONG CHECK OF ERRNO */

 - Read the NOTES section of "man errno" for how to save/restore errno
   across a call to another system call, e.g. printf()
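
 - A sketch of one way to save and restore errno around the printf().
   The function name checked_write() is hypothetical:

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Write to fd, report the result with printf(), and still hand
 * perror() the errno that write() set.  Returns the errno value from
 * a failed write(), or 0 if write() succeeded. */
int checked_write(int fd, const char *buf, size_t len)
{
    ssize_t n = write(fd, buf, len);   /* the syscall we want to check */
    int saved_errno = errno;           /* save at once, before anything else */

    printf("write returned %ld\n", (long)n);   /* may overwrite errno */

    if (n < 0) {
        errno = saved_errno;           /* restore before perror() */
        perror("write failed");
        return saved_errno;            /* perror() itself may change errno */
    }
    return 0;
}
```

   The saved copy is only looked at when n < 0, because errno is
   undefined after a successful call.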

============================================================================

Q:  What do htons()/htonl() do and why are they necessary?

Q: what are the basic inputs and return values of the Unix syscalls:
   socket,bind,listen,accept,read/recv,write/send,close

Q:  What is the purpose/inputs/return of the socket() syscall?

Q:  What is the purpose/inputs/return of the bind() syscall?

Q:  What is the purpose/inputs/return of the listen() syscall?

Q:  What is the meaning of the small integer second parameter to listen()?

Q:  What is the purpose/inputs/return of the accept() syscall?

Q:  T/F the socket() and accept() syscalls return file descriptors that
    can be used directly with standard I/O functions fread/fwrite/fclose

Q:  T/F the successful accept() system call returns a socket file descriptor
    that is a copy of the socket file descriptor that is its first argument

Q:  T/F usually the fd to be returned by the first call to socket() or
    open() in your Unix program will be fd 3 (why or why not?)

Q:  why doesn't the first call to socket() or open() in a Unix program
    return file descriptor 1?

Q:  what is the small integer value usually returned by the first
    successful call to accept() in a TCP/IP server program?
    (Hint: accept() is called *after* socket())

Q:  T/F "network byte order" is Big Endian

Q:  T/F a Big Endian processor stores the Big End (most significant
    byte) of a number in the first (lowest) memory location

Q:  T/F a Little Endian processor sends the Little End (least significant
    byte) of a number first over a byte-stream communications channel

Q:  in a memory dump that shows bytes numbered in ascending order from
    left-to-right on the page, which Endian order shows multi-byte
    quantities as written "backwards" ?

Q: Why can't I use a printf() before calling perror?

Q: Which port numbers can only be bound to by the super-user on Unix/Linux?
   What is the IANA name for this reserved-for-super-user port range?
   (Not all operating systems restrict access to these low-numbered ports.)
   Ref: http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html#bind
   See the paragraph: "Another thing to watch out for when calling
   bind(): don't go underboard with your port numbers. ..."

Q: Under what circumstances can one omit a call to bind() a socket?
   What happens when a server calls accept() using such an unbound socket?
   What happens when a client calls connect() using such an unbound socket?
   Give an example of a client application that uses unbound sockets.
   Ref: http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html#bind
   Ref: http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html#connect

Q: What happens if you forget to call bind() before you call listen()
   in a server?  Does your server program fail to start up?  Can clients
   connect to a server with an unbound socket?  (Why/how, or why not?)

Q: True/False - after a call to accept() you have *two different* open
   socket file descriptors.

Q: True/False - if you close the socket descriptor that is the return
   value from an accept(), you also close the original socket()
   descriptor (and vice-versa - they are the same descriptor).

Q: You want a server to accept only a single client:
   True/False - after the accept() call, you can close the original
   socket file descriptor (the first argument to accept()) and use only
   the socket descriptor returned by the accept() call.

Pedantic Coding
---------------

We used htons(portno) but not htonl(INADDR_ANY) - why?
  - Look at:
    http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html#bind
    and find the paragraph starting "If you are into noticing little things,
    you might have seen that I didn't put INADDR_ANY into Network Byte
    Order! Naughty me.".  Read the fix; fix your own code.

  - If you don't fix this, then when you later use some other value than
    INADDR_ANY here, your code will break.  The code is wrong; fix it now!
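
  - A sketch of why the missing htonl() goes unnoticed.  INADDR_ANY is
    zero, and zero has the same bytes in either order.  INADDR_LOOPBACK
    (127.0.0.1) stands in here for "some other value"; the function name
    addr_wire_bytes() is hypothetical:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Store a host-order IPv4 address into a sockaddr_in the correct way,
 * then copy out its four bytes in the order they sit in memory. */
void addr_wire_bytes(uint32_t host_order_addr, unsigned char out[4])
{
    struct sockaddr_in sa;

    memset(&sa, 0, sizeof(sa));
    sa.sin_addr.s_addr = htonl(host_order_addr);   /* network byte order */
    memcpy(out, &sa.sin_addr.s_addr, 4);
}
```

    With htonl() the bytes for INADDR_LOOPBACK are 127, 0, 0, 1 on any
    host.  Without it, a little-endian host would store 1, 0, 0, 127.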

Q: Why isn't the short int AF_INET put into network byte order?

    my_addr.sin_family = AF_INET;                // host byte order
    my_addr.sin_addr.s_addr = htonl(INADDR_ANY); // long, network byte order
    my_addr.sin_port = htons(MYPORT);            // short, network byte order

  - "man bind" refers us to "man 7 ip" which contains these lines:

    sa_family_t    sin_family; /* address family: AF_INET */
    u_int32_t      s_addr;     /* address in network byte order */
    u_int16_t      sin_port;   /* port in network byte order */

   "Note that the address and the port are always stored in network
    byte order.  In particular, this means that you need to call htons(3)
    on the number that is assigned to a port. All address/port manipulation
    functions in the standard library work in network byte order."

 - the sin_family is never sent over the network; it doesn't have to
   be in network byte order

Q: Why doesn't the sin_family = AF_INET need to use htonl() or htons()?

See "man bind" for the correct cast to use on the second argument to
bind() and connect():

   "The only purpose of this structure is to cast the structure
    pointer passed in my_addr in order to avoid compiler warnings."

Q: Do you mean AF_INET or PF_INET?  I see both - which is correct?
 - from "man socket"
   "The manifest constants used under 4.x BSD for protocol families are
    PF_UNIX, PF_INET, etc., while AF_UNIX etc. are used for address
    families. However, already the BSD man page promises: "The protocol
    family generally is the same as the address family", and subsequent
    standards use AF_* everywhere."

Q: T/F PF_INET and AF_INET are effectively the same thing everywhere

References to Notes files (required reading):
-------------------------

    programming_style.txt
    header_files.txt
    makefiles.txt
    screendumps.txt