================================= Programming Conventions and Style ================================= -Ian! D. Allen - idallen@idallen.ca Programming as Literature ------------------------- When writing programs, you are not simply trying to "get it to work"; but, you are also (and most importantly in an academic setting) practicing and demonstrating good programming techniques. These techniques will make it possible for other people to read and understand your programs after you have written them. Software maintenance costs far exceed the costs of developing the original code. Employers tell us that programming techniques are more important than the list of which languages you know. Do you have good style? Good programming techniques include such things as: * Correct documentation (using guidelines given here) - internal documentation (labelling, function headers, etc.) - external documentation (man pages, user guides, testing, etc.) * Writing clean, simple, robust code: "Less code is better code"! - checking the return status of important functions and syscalls - choosing the correct loop and branch control structures - rewriting the algorithm to minimize duplicated code - making it right before you make it clever or fast * Using correct indentation, style, and mnemonic names - variable and file names reflect their use in the algorithm * Using correct modularization and scope (i.e. local vs. global variables) * Correct use of arguments and/or prompting for user input - some programs will prompt the user for input if no arguments are given - some programs will read standard input if no file names are given * Correct use of exit codes on success/failure of the program * Presenting accurate and useful help if user makes an error: 1. error messages must appear on standard error, not on standard output 2. error text must contain the name of the program issuing the error (use argv[0] or $0, never hard-code the name) 3. always echo the input that is in error back to the user, and 4. give a hint as to the valid input that was expected, e.g.: "myprog: Age 23 is not between 12 and 18" These criteria will be assessed in marking your programs. You must test the return codes of important functions and syscalls. Your program must exit with a non-zero status code on a serious error. Prompts and error messages must be sent to Standard Error (cerr, stderr, System.err), not to Standard Output (cout, stdout, System.out). Internal Documentation - Comments --------------------------------- Internal documentation is documentation placed in the same file as the actual code you are writing, in the form of comments. You must have internal documentation for any code you write. See the section below on comment style. Coding style ------------ A program written in an idiosyncratic style is hard to read and hard to maintain. You have to write code that matches the style of the code in your programming environment. In the absence of an existing code base, your style must match the published standards for your course or program of study. Lines in source files should be kept to 80 characters or less. This makes it easy to do side-by-side revision comparisons between files. (For a large real-world example of 80-column style, see the Linux kernel coding standard: http://lxr.linux.no/source/Documentation/CodingStyle ) Don't put the name of a program inside the program, since the name will be wrong if you rename the executable. Use argv[0] or $0 instead. Program arguments and execution syntax - man page style ------------------------------------------------------- The "man page style" for program execution syntax uses {} to surround mandatory items and [] to surround optional items. A program with one mandatory item that is either a file or a directory and one optional item that is one or more userids would have a Syntax line: $0 {file|directory} [userid...] Multiple optional arguments must nest: $0 arg [ opt1 [ opt2 [ opt3 ] ] ] which means that if you want to specify opt3 you have to specify opt1 and opt2 before it. All are shown as optional. Typical Four-Part Algorithm Structure ------------------------------------- Functonally, most Unix commands typically have this form: 1) Input Stage - verify the correct number of command-line arguments - optionally prompt for any missing arguments - complain about missing or extra arguments and exit non-zero 2) Validation Stage - validate the input: make sure numbers are numbers, files are files, directories are directories, etc. - test to make sure arguments are not empty or null strings - test to make sure things exist and are accessible - test to make sure things are readable/writable 3) Processing Stage - run your algorithm on your validated input - solve the problem; generate the results - servers may stay at this stage "forever" until killed 4) Output Stage - display the output Don't mix up code among the four stages. Keep them distinct: Do not duplicate your Processing Stage or Validation Stage for various combinations of Input arguments. Where possible, collect all the input first; then Validate it *once*, then write your Processing code *once*. Do not duplicate your Output Stage for various combinations of Input arguments. Generate your results and display the results *once*, at the end of the program. Where possible, don't duplicate similar kinds of output statements all over your code. Write one output statement that uses variables to contain the variable part of the output. Less code is better code. Prompts and Error Messages -------------------------- Issue prompts on standard error before reading any input from the user; otherwise, the user won't know what to enter, or when. The prompt should explain exactly what kind of input is expected; don't just say "Enter input". Menus are also considered to be part of the prompts, and must also appear on standard error. Always issue prompts and error messages on stderr (not on stdout), so that the prompts and error messages don't get redirected into output files. After reading input, it is often a good idea to echo the input back to the user, so that they know what was entered. This is called "echoing the input back to the user" and improves the readability of test runs. Optional: Your program doesn't need to prompt the user to enter input if standard input is detected to be coming from something that is not a terminal keyboard (e.g. it might be coming from a pipe or redirected from a file). See "man isatty". Error Messages -------------- Error messages must obey these four rules: 1. error messages must appear on standard error, not standard output 2. give the name of the program that is issuing the message (from argv[0]) 3. state what input was expected (e.g. "expecting one file name") 4. display what the user actually entered (e.g. "found 3 (a b c)" *Never* say just "illegal input" or "invalid input" or "too many". Always specify how many is "too many" or "too few": myprog: Expecting 3 file names; found 4 (a b c d) myprog: Student age 4 is not between 12 and 18 myprog: Modify days -2 is less than zero After detecting an error and issuing a message, the usual thing to do is to exit the program with a non-zero return code. Don't keep processing bad data! Commenting Programs - no useless comments ----------------------------------------- Comments should add to a programmer's understanding of the code. They don't comment on the syntax or language mechanism used in the code; since, both these things are obvious to programmers who know the language. (Don't comment that which is obvious to anyone who knows the language.) "Programmer" comments deal with what the line of code means in the *algorithm* used (the "why"), not with syntax or how the language *works*. Comment on "why" this code is here, not "how" it works. "Teacher" or "Instructor" comments talk about how the *language* works, not about what the code means in the algorithm. Do not include Instructor comments in your own code - I already know what the language means. Thus: Do not use comments that state things that relate to the syntax or language mechanism used and are obvious to a programmer, e.g. # THESE COMMENTS BELOW ARE OBVIOUS AND NOT HELPFUL COMMENTS: # x=23 # set x to 23 <== OBVIOUS; NOT HELPFUL open(file,mode) # open the file <== OBVIOUS; NOT HELPFUL close(file) # close the file <== OBVIOUS; NOT HELPFUL x=sqrt(y) # take the sqrt of y <== OBVIOUS; NOT HELPFUL list.head = NULL # set head of list to NULL <== etc. Better, useful programmer-style comments: loop=argc # initialize loop index to max num arguments open(acctlog,READ_ONLY) # open the payroll accounting log r/o stdev=sqrt(variance) # get stdev of student test scores close(pipe) # child process must close this side of pipe Do not copy "instructor-style" comments into your code. Instructor-style comments are put on lines of code by teachers to explain the language and syntax functions to people unfamiliar with programming (e.g. to students of the language). Instructor-style comments are "obvious" comments to anyone who knows how to program; they should never appear in your own programs (unless you become an instructor!). Code Commenting Style --------------------- Comments should appear on their own lines, indented to match the code to which they apply. In rare cases you can add comments to the ends of lines; but, doing so restricts the length of the comments and may make your code harder to edit and maintain by other people. Comments should be grouped in blocks, ahead of blocks of related code to which the comments apply. Do not alternate comments and single lines of code, or put many comments off to the right of single lines of code. Doing so makes the code hard to read. Block comments and code are easier to read. (Occasional long lines and end-line comments are tolerated.) Here is a Before/After example: ------------------------------------------------------------------------------ ----- Before ----------------------------------------------------------------- ------------------------------------------------------------------------------ /* This function take a command (starts with /) and parse and executes it */ void parsecmd(char *msg, size_t size, int fdmax, char ip_addr[][SIZE_IP],int sender){ int startofmsg; // int i; int sent=0; // This variable is used to track if the msg was sent or not /*The fact that MAX_ARG is 256 this sets a limit of 255 for nicknames */ char arg[MAX_ARG]; // Used to store IP in case of /msg or new nickname in case of /nick char pvt_msg[MAX_MSG]; // The pvt msg that will sent to a client if( strncasecmp("/msg",msg,SIZE_MSG) == 0){//Check if the first 5 characters represent the /msg (message) command if((startofmsg=striparg(arg,msg+SIZE_MSG,size-SIZE_MSG))==NOARG){//Get the IP and where the msg starts sprintf(pvt_msg,"ERR:Your /msg messsage was mal-formulated. The message was not sent\n"); sendall(sender,pvt_msg,strlen(pvt_msg));// return error msg to the sender return; } startofmsg += SIZE_MSG; // This gives us where the msg starts in src // For syntax reason, we assume that there's a \n at the end of this line // Also every new line must start with CHT: or it won't be reconized sprintf(pvt_msg,"CHT:Private message from %s:\nCHT:%.*s",ip_addr[sender-STD_FDS-1],size-startofmsg,msg+startofmsg);// Make the private msg for(i=0; iclient chat msg protocol prefix #define ERR_PREFIX "ERR:" // server->client error protocol prefix /* * Recognize the leading /command word on a string and process the line. * cstr: the string to parse from the attached client * clen: length of cstr (cstr may not end in \0) * fdmax: file descriptor max, used by /msg to loop over connected clients * iplist: list of known IP addresses for each connected client fd * clifd: fd of the client that sent cstr */ void parsecmd(char *cstr, size_t clen, int fdmax, char iplist[][SIZE_IP], int clifd) { char pvt_msg[MAX_MSG]; // msg that will sent to a client #define IS_COMMAND(cmd,str) (strncasecmp(cmd,str,sizeof(cmd)-1) == 0) /* Command: /msg [message text] * Send the message text to the given client */ if( IS_COMMAND("/msg",cstr) ){ int fd; // file descriptor int startofmsg; // index of byte after IP address int sent = 0; // was message sent to any client char ipaddr[MAX_ARG]; // parsed IP address after /msg /* Chop out string immediately following /msg into ipaddr. * Bug: doesn't check that /msg is followed by whitespace: * /msgxxxx puts xxxx in IP and ignores actual IP * Bug: striparg can't check for buffer overflow of ipaddr * Bug: ipaddr may end up with \n on the end from striparg; * so, "/msg 127.0.0.1\n" fails to match 127.0.0.1 and * compensating code had to be added below */ startofmsg = striparg(ipaddr, cstr+SIZE_MSG, clen-SIZE_MSG); if( startofmsg == NOARG ){ sprintf(pvt_msg, "%sYour /msg messsage was " "mal-formulated. The message was not sent\n", ERR_PREFIX); sendall(clifd, pvt_msg, strlen(pvt_msg)); return; } /* Copy the client IP address and message into pvt_message. * Have to add back SIZE_MSG bytes to get actual offset * of message start inside cstr so that client message * starts at: cstr+startofmsg * Bug: no check for buffer overflow of pvt_msg */ startofmsg += SIZE_MSG; sprintf(pvt_msg, "%sPrivate message from %s:\n%s%.*s", MSG_PREFIX, iplist[clifd-STD_FDS-1], MSG_PREFIX, clen-startofmsg, cstr+startofmsg); /* Loop over all clients trying to match ipaddr to IP list. * Send the message to all matches and note success. */ for( fd = 0; fd < fdmax; fd++ ){ if( strcmp(iplist[fd], ipaddr) == 0 ){ sendall(fd+STD_FDS+1,pvt_msg,strlen(pvt_msg)); sent = 1; } } if( sent == 0 ){ /* Message was never sent: send error back to client. * Bug: have to compensate for bug where ipaddr * ends in \n (see striparg Bug above) */ if( ipaddr[strlen(ipaddr)-1] == '\n' ) ipaddr[strlen(ipaddr)-1] = ' '; sprintf(pvt_msg, "%sNo one was found with the IP: %s\n", ERR_PREFIX, ipaddr); sendall(clifd, pvt_msg, strlen(pvt_msg)); } } /* Future /command recognition code would go here: * else if( IS_COMMAND("/nick",cstr) ) */ else { // Bug: Doesn't return what client typed back to client sprintf(pvt_msg, "%sCommand unrecongnised\n", ERR_PREFIX); sendall(clifd, pvt_msg, strlen(pvt_msg)); } return; }