-------------------------
Week 03 Notes for CST8165
-------------------------
-Ian! D. Allen  -  idallen@idallen.ca  -  www.idallen.com

Remember - knowing how to find out an answer is more important than
memorizing the answer.  Learn to fish!  RTFM!  (Read The Fine Manual)

Q: what IP address is this (as a dotted quad):  int ipaddr = -1; ?
Q: T/F the opposite of x > 0 is x < 0 ?
Q: T/F if(x>0) is the same as if(!(x<0)) ?

Helpful code:
------------

a. printf size

   You can printf at most 9 bytes from a buffer (no \0 needed) using:

      printf("%.9s",buf);    // print only 9 bytes from buf

   AND it gets better if you use '*' instead of 9 (more useful in this case):

      n = read(fd,buf, .... );
      ...
      printf("%.*s",n,buf);  // the "*" means pick up the current value of "n"

   The value picked up by "*" must be an int; cast the return value of
   read() if you store it in some other type.  Output stops early if a
   \0 byte appears in the first n bytes.

   This kind of printf can be useful for buffers that don't have \0 in them.
   You can also output n bytes in a buffer directly using write(fd,buf,n).

   - standard input (usually your keyboard) is Unix fd 0.
   - standard output (usually your screen) is Unix fd 1.
   - standard error (usually your screen) is Unix fd 2.
   - the first file descriptor you open yourself in your program is
     usually fd 3.

b. buffer size

   Never do this:   char buf[256]; ... read(fd,buf,256);
   Do this:         char buf[OUT_BUFSIZE]; ... read(fd,buf,sizeof(buf))

   Buffer sizes must be set in only *one* place for easy maintenance.
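A minimal sketch of tips (a) and (b) together, assuming a POSIX system.
The names IN_BUFSIZE, format_bytes() and echo_once() are made up for this
illustration; they are not from the course code or the lab servers.

```c
#include <stdio.h>
#include <unistd.h>

#define IN_BUFSIZE 256  /* hypothetical name: the ONE place the size is set */

/* Copy at most n bytes of buf (which need not contain a \0) into out as
 * a \0-terminated string.  Returns what snprintf() returns. */
int format_bytes(char *out, size_t outsize, const char *buf, int n)
{
    return snprintf(out, outsize, "%.*s", n, buf);
}

/* Read once from fd and echo what was read to standard output.
 * Returns the read() result: <0 error, 0 EOF, >0 number of bytes read. */
ssize_t echo_once(int fd)
{
    char buf[IN_BUFSIZE];
    ssize_t n = read(fd, buf, sizeof(buf));  /* sizeof, not a repeated 256 */

    if (n > 0) {
        printf("%.*s", (int)n, buf);         /* "*" needs an int argument */
        fflush(stdout);
    }
    return n;
}
```

Calling echo_once(0) would echo one read() from standard input.  If the
data may contain \0 bytes, write(1,buf,n) is the safer choice, since the
printf above stops at the first \0.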

Client/Server Programming
-------------------------

References:
   Sockets Tutorial: http://www.cs.rpi.edu/courses/sysprog/sockets/sock.html
   Sockets programming: http://beej.us/guide/bgnet/
   Diagram: http://community.borland.com/article/0,1410,26022,00.html

Review: low-level Unix I/O system calls: open,read/recv,write/send,close
Review: using perror() after a system call fails - see last week's notes

Four new Unix networking system calls: socket,bind,listen,accept
  - man 2 socket
  - man 2 bind
  - man 2 listen
  - man 2 accept

Equivalence of read() and recv(), write() and send() for sockets:

   For socket programming, you may see recv() used instead of read() and
   send() instead of write().  Both work equally well; recv() and send()
   allow socket options to be passed using an extra parameter.
   Warning: read/write work on any type of file descriptor (sockets,
   files, pipes, devices, etc.) while recv/send *only* work on network
   sockets.
   - man 2 recv
   - man 2 send

   If you don't set any special TCP/IP flags in recv() or send(), the
   system calls recv() and read() are equivalent for accessing sockets,
   as are the syscalls send() and write().  You can't use the socket
   syscalls recv() or send() on file descriptors that are *not* sockets
   (even if the TCP/IP flags are zero); using read() and write() works
   for both sockets and ordinary files.

Q: When can write() be used in place of send() in accessing a socket fd?
   (see "man 2 send")  (Note: You cannot use send() in place of write(),
   unless you are writing to a socket!)
Q: When can read() and recv() be interchanged in accessing a socket fd?
Q: Can recv() and send() be used on non-sockets?

Diagram: http://community.borland.com/article/0,1410,26022,00.html

Q: In one column list in flow-chart form the Unix system calls made to
   set up a TCP/IP server that loops accepting clients, forking children
   that each read one packet, write one packet, and exit.
   In a parallel column list the system calls used in a TCP/IP client
   that sends one packet and receives one packet then exits.  Connect the
   two columns with arrows, showing the relationship of the system calls
   and the direction of data travel.

Google for more TCP/IP client server socket examples and tutorials:
  - http://www.perl.com/doc/manual/html/pod/perlipc.html
  - http://beej.us/guide/bgnet/

Socket programming is similar to low-level Unix file I/O using
open/read/write/close
  - the Unix socket() and accept() system calls return small integer file
    descriptors, just as open() does
  - socket descriptors are just like file descriptors - you can use them
    with read() and write() (many socket programs use the equivalent
    recv() and send())
  - see the simple non-forking sockets server examples:
       http://www.cs.rpi.edu/courses/sysprog/sockets/sock.html
       http://www.cs.rpi.edu/courses/sysprog/sockets/server.c
    - read the explanation of the code in the above socket tutorial
    - note that you should replace the deprecated bzero() with memset()
      - see "man bstring"
  - for our Lab 1 and Lab 2 we use this forking server2.c code:
       http://www.cs.rpi.edu/courses/sysprog/sockets/server2.c
    - a fork()ing server that handles multiple connections
    - the child only reads one single line from a connection, then exits
    - this code does not correctly detect or handle EOF
    - this code inefficiently uses bzero()

The usual order of four network system calls to initialize a TCP/IP server:
  - 1. socket(), 2. bind(), 3. listen(), 4.
    accept()
  - most TCP/IP servers loop calling accept() to receive multiple connections
  - server may fork() separate child processes to deal with each connection
  - each connection may loop reading/writing the accepted socket, to
    read/write multiple lines from/to the incoming connection
  - the rpi.edu "server.c" only accepts one connection and then exits
  - the rpi.edu "server2.c" loops and fork()s, accepting many connections
    - each connection reads one line and exits
    - this server code does not correctly detect or handle EOF
  - how would you modify server2.c to create "server3.c" to read/write
    multiple lines for each connection?
    - add a loop in the child function dostuff() (rename this function!)
    - remember to check for EOF after read() or recv()
    - recode the function not to need bzero()

Sending byte data on the network: Big Endian / Little Endian:
  - what does the function call htons(portno) do?
    - it puts the short integer "portno" into network byte order
    - http://www.cs.rpi.edu/courses/sysprog/sockets/byteorder.html
    - http://www.netrino.com/Publications/Glossary/Endianness.php
    - http://www.rdrop.com/~cary/html/endian_faq.html
    - http://www.unixpapa.com/incnote/byteorder.html
  - "network byte order" is Big Endian (send the most significant byte first)
  - Motorola 680x0, mainframe, and Sun Sparc hardware are big-endian
  - Intel/AMD x86 hardware (e.g.
    your PC) is little-endian
  - little-endian hardware incurs a byte-swap penalty handling network traffic

Unix read/recv and write/send system call return values:
  - for low-level I/O syscalls such as read() and write() that return an
    integer:
    - a return of less than zero means an error
      - the error reason is put in errno; use perror() to print it
    - a return of zero bytes means EOF when reading via read() or recv()
      - no more reading can be done after EOF is seen
      - the contents of the read() buffer are undefined after EOF; don't
        use it
      - EOF is not an error - errno is not set - do not call perror()
        after EOF
    - a return of zero means nothing was written when writing with
      write/send
      - this is not an error: you may need to loop to write everything
      - do not call perror() after writing zero bytes; try again
      - see the sendall() function below under "writing to network sockets"
    - a return of > 0 means you did read or write (some) of the data
      - you may not have read or written *all* of the data!
      - see the sendall() function below under "writing to network sockets"
  - EOF is not an error - never call perror() after seeing EOF

Note that a *successful* Unix system call may or may not change errno:
  - see "man 3 errno"
  - errno is only set for sure after a system call *fails*
  - errno is *undefined* after a successful syscall
  - Thus, you cannot test errno to know if a system call failed
  - Thus, the perror() function is only usable on the most recent syscall.
  - If you execute other syscalls (e.g. using printf()), they may overwrite
    errno and you will lose the preceding syscall error.
  - The following perror() is incorrect, since printf() may overwrite errno:

       n = write(...);                       /* the syscall we want to check */
       printf("%d bytes written\n", n);      /* another call; may change errno */
       if ( n < 0 ) perror("write failed");  /* WRONG CHECK OF ERRNO */

  - Read the NOTES section of "man errno" for how to save/restore errno
    across a call to another system call, e.g.
    printf()

============================================================================

Q: What do htons()/htonl() do and why are they necessary?

Q: what are the basic inputs and return values of the Unix syscalls:
   socket,bind,listen,accept,read/recv,write/send,close

Q: What is the purpose/inputs/return of the socket() syscall?
Q: What is the purpose/inputs/return of the bind() syscall?
Q: What is the purpose/inputs/return of the listen() syscall?
Q: What is the meaning of the small integer second parameter to listen()?
Q: What is the purpose/inputs/return of the accept() syscall?

Q: T/F the socket() and accept() syscalls return file descriptors that
   can be used directly with standard I/O functions fread/fwrite/fclose

Q: T/F the successful accept() system call returns a socket file
   descriptor that is a copy of the socket file descriptor that is its
   first argument

Q: T/F usually the fd to be returned by the first call to socket() or
   open() in your Unix program will be fd 3 (why or why not?)

Q: why doesn't the first call to socket() or open() in a Unix program
   return file descriptor 1?

Q: what is the small integer value usually returned by the first
   successful call to accept() in a TCP/IP server program?
   (Hint: accept() is called *after* socket())

Q: T/F "network byte order" is Big Endian

Q: T/F a Big Endian processor stores the Big End (most significant byte)
   of a number in the first (lowest) memory location

Q: T/F a Little Endian processor sends the Little End (least significant
   byte) of a number first over a byte-stream communications channel

Q: in a memory dump that shows bytes numbered in ascending order from
   left-to-right on the page, which Endian order shows multi-byte
   quantities as written "backwards" ?

Q: Why can't I use a printf() before calling perror()?

Q: Which port numbers can only be bound to by the super-user on Unix/Linux?
   What is the IANA name for this reserved-for-super-user port range?
   (Not all operating systems restrict access to these low-numbered ports.)
   Ref: http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html#bind
   See the paragraph: "Another thing to watch out for when calling bind():
   don't go underboard with your port numbers. ..."

Q: Under what circumstances can one omit a call to bind() a socket?
   What happens when a server calls accept() using such an unbound socket?
   What happens when a client calls connect() using such an unbound socket?
   Give an example of a client application that uses unbound sockets.
   Ref: http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html#bind
   Ref: http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html#connect

Q: What happens if you forget to call bind() before you call listen() in
   a server?  Does your server program fail to start up?  Can clients
   connect to a server with an unbound socket?  (Why/how, or why not?)

Q: True/False - after a call to accept() you have *two different* open
   socket file descriptors.

Q: True/False - if you close the socket descriptor that is the return
   value from an accept(), you also close the original socket()
   descriptor (and vice-versa - they are the same descriptor).

Q: You want a server to accept only a single client: True/False - after
   the accept() call, you can close the original socket file descriptor
   (the first argument to accept()) and use only the socket descriptor
   returned by the accept() call.

Pedantic Coding
---------------

We used htons(portno) but not htonl(INADDR_ANY) - why?
  - Look at:
       http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html#bind
    and find the paragraph starting "If you are into noticing little
    things, you might have seen that I didn't put INADDR_ANY into Network
    Byte Order!  Naughty me.".  Read the fix; fix your own code.
  - If you don't fix this, then when you later use some other value than
    INADDR_ANY here, your code will break.  The code is wrong; fix it now!

Q: Why isn't the short int AF_INET put into network byte order?

     my_addr.sin_family = AF_INET;                 // host byte order
     my_addr.sin_addr.s_addr = htonl(INADDR_ANY);  // long, network byte order
     my_addr.sin_port = htons(MYPORT);             // short, network byte order

  - "man bind" refers us to "man 7 ip" which contains these lines:

       sa_family_t    sin_family; /* address family: AF_INET */
       u_int32_t      s_addr;     /* address in network byte order */
       u_int16_t      sin_port;   /* port in network byte order */

    "Note that the address and the port are always stored in network
    byte order.  In particular, this means that you need to call htons(3)
    on the number that is assigned to a port.  All address/port
    manipulation functions in the standard library work in network byte
    order."

  - the sin_family is never sent over the network; it doesn't have to be
    in network byte order

Q: Why doesn't the sin_family = AF_INET need to use htonl() or htons()?

See "man bind" for the correct cast to use on the second argument to
bind() and connect():
    "The only purpose of this structure is to cast the structure pointer
    passed in my_addr in order to avoid compiler warnings."

Q: Do you mean AF_INET or PF_INET?  I see both - which is correct?
  - from "man socket"
    "The manifest constants used under 4.x BSD for protocol families are
    PF_UNIX, PF_INET, etc., while AF_UNIX etc. are used for address
    families.  However, already the BSD man page promises: "The protocol
    family generally is the same as the address family", and subsequent
    standards use AF_* everywhere."

Q: T/F PF_INET and AF_INET are effectively the same thing everywhere

References to Notes files (required reading):
-------------------------

  programming_style.txt
  header_files.txt
  makefiles.txt
  screendumps.txt
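
Closing sketch for the "Pedantic Coding" section above: one way the
byte-order rules for the three sockaddr_in fields might be collected in a
single helper.  The function name fill_server_addr() is made up for this
illustration and is not part of server.c or server2.c; the sketch assumes
a POSIX system and uses memset() in place of the deprecated bzero().

```c
#include <string.h>
#include <stdint.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Fill *addr for a TCP/IP server listening on any local interface.
 * "port" is given in host byte order, e.g. a number parsed from argv. */
void fill_server_addr(struct sockaddr_in *addr, uint16_t port)
{
    memset(addr, 0, sizeof(*addr));             /* replaces bzero() */
    addr->sin_family = AF_INET;                 /* host byte order; never sent */
    addr->sin_addr.s_addr = htonl(INADDR_ANY);  /* long, network byte order */
    addr->sin_port = htons(port);               /* short, network byte order */
}
```

The filled structure would then be passed to bind() with the cast shown
in "man bind", e.g. bind(sockfd, (struct sockaddr *)&addr, sizeof(addr)),
where sockfd is whatever descriptor socket() returned.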