-----------------------
Lab #05 for CST8165 due April 14, 2008 (Week 14)
-----------------------
-Ian! D. Allen  -  idallen@idallen.ca  -  www.idallen.com

Remember - knowing how to find out an answer is more important than
memorizing the answer.  Learn to fish!  RTFM!  (Read The Fine Manual)

Global weight: 7% of your total mark this term.

Interim submission: Submit what you have done so far in lab on April 7, 2008.

Due date: before 16h00 Monday, April 14 (Week 14)

Interim submission in Week 13: You will submit whatever progress you have
made on this assignment at the end of your weekly lab period in Week 13.

The deliverables for this exercise are to be submitted on-line in the
Linux Lab T127 using the "cstsubmit" method described in the exercise
description, below.  No paper; no email; no FTP.

Late-submission date: I will accept without penalty exercises that are
submitted late but before 16h00 on Thursday, April 17.  (*NEW DATE*)
After that late-submission date, the exercise is worth zero marks.

Exercises submitted by the *due date* will be marked on-line and your
marks will be sent to you by email after the late-submission date.

You will submit whatever progress you have made in-lab on April 7.
The program does not have to work.  Submit whatever you have done.
If you have code that works at the interim submission lab, I am available
in the lab to pre-test your code and check for possible bugs.

Exercise Synopsis
-----------------

Modify an existing Java-based HTTP server to serve Pig Latin.
Document it as you modify it.
Test it thoroughly using an automated test script.

Coding and submission standards:
 - provide File Headers (Program Headers) using my Assignment Label
 - provide Function Headers documenting arguments and return values
 - use Block Comments (see programming_style.txt)
 - error messages must have the four-part format from programming_style.txt

Where to work: Submissions must make, compile, and run cleanly in the T127
Linux Lab, though you are free to work on them anywhere you like.

Resources / Documentation
-------------------------

See week12notes.txt : "Coding an HTTP server (Java)"

Template from which to start coding a simple HTTP server:
    http://www.brics.dk/ixwt/examples/FileServer.java
    (A copy named FileServer.java.txt is in the course notes.)

Using the sample code (above), you will implement a basic HTTP RFC 2616
server with Java class name "PigLatinHTTP" that handles the two methods
that MUST BE supported by a general-purpose HTTP server.
(RFC 2616 Section 5.1.1 p.36)

The 145-line FileServer code (above) is a good starting point; but, note
its many flaws (including the lack of comments, which you will have to add
to the code).  This HTTP server does not adhere to the HTTP RFC in many
respects - it only reads a single Request line from a client and then
closes the connection.  (You don't need to fix this particular error.)

Your PigLatinHTTP server must accept HTTP protocol versions 1.0 and 1.1 in
requests from clients.  (The sample code already does this.)

Your server does *not* have to implement parts of HTTP not mentioned in
this lab.  In particular, you do *not* need to implement these more
advanced HTTP features:

 - NO persistent connections
 - NO continuation lines
 - NO URIs with blanks or escapes (e.g.
   + or %20)
 - NO reading header lines from a client (only read one client request line)
 - NO checking for a mandatory "Host:" field (even for HTTP 1.1)

The sample HTTP server does not conform to the HTTP client Request
protocol; the server only reads one Request line per client and then closes
the connection; this is okay.  The server need only read a single request
line from a client.  The given sample code already does this; only one line
is read from a client.

If in doubt about what you need to implement, ask your instructor.

Important HTTP Server Programming Notes:

 - Remember to return lines to your client ending in CR+LF, not just LF
 - Remember that error messages should appear on stderr, not stdout:
   Use System.err.println() not System.out.println().
   You can get the program name string using PigLatinHTTP.class.getName()
   or PigLatinHTTP.class.getSimpleName().
 - Make sure you use the ".equals(String)" method to compare Java strings.

Code Quality and Portability
----------------------------

A. Your code must work on any machine, no matter what its byte ordering.
   Always convert bytes between host and network byte order.

B. You must include useful comment blocks in all your code, even code you
   obtain from the Internet (except for the supplied PigLatinTranslator.java
   file that you do not need to change or document).  Code submitted without
   your added useful comments will not be marked.  Credit the sources of any
   code you are allowed to copy from Internet sources.  (Any other copying
   requires permission.)
   Reference: Notes file programming_style.txt

C. Functions that are not used outside the source file must be declared
   with local scope.  This prevents global name space pollution, misuse by
   other modules, and documents to the reader that this function is local
   to this file.

D. If you find yourself duplicating code, refactor your source to eliminate
   the duplication.  Perhaps you need to create a function to handle the
   duplicate code in one place?
   Move code that is common to both the IF and ELSE clauses to either
   before or after the statement.  Don't write code more than once!

E. Marks are awarded for readability and elegance, not just correctness.
   If your code can't be read, you're useless in a team project.

Manual and Automated Testing - autotest_http.sh
----------------------------

This section applies to testing your HTTP server.

To test your HTTP server, you don't need a real HTTP client.  Indeed, a
real HTTP client (such as a web browser) will hide most of the Response
lines that you need to see when testing your HTTP server.  We will use the
"netcat" command to act as our HTTP client, and a script (provided) to
drive multiple tests.

Using two terminal windows, we can run both our HTTP server and a netcat
HTTP client simultaneously.  In one window, we can start our HTTP server on
localhost port 55555.  In another window, we can start our netcat HTTP
client and tell it to connect to localhost port 55555.  We can then type
HTTP client Requests into the client window, which will be read by our HTTP
server running in the server window.

To test an HTTP Request line, we can "fake" a Request from an HTTP client
to our HTTP server by putting the line we want the fake client to send in a
text file and using the file as standard input to netcat:

   $ java PigLatinHTTP 55555 . &       # start our HTTP server
   [...]
   $ nc -v localhost 55555 /tmp/foo.txt
 * $ nc -v localhost 55555
   localhost [127.0.0.1] 55555 (?) open
 * GET /foo.txt HTTP/1.0
   HTTP/1.0 200 OK
   Content-Type: text/plain
   Date: Mon Mar 19 12:57:21 EDT 2007
   Server: IXWT FileServer 1.0

   hi

   Note that this server does not need a blank line after the GET request -
   it only reads a single line.  (This is wrong for HTTP; but, you don't
   need to fix it for this assignment.)

4) Implement a 30-minute time-out for your server; so, it doesn't hang
   around forever.
   Apply method setSoTimeout(30*60*1000) to your open server socket, and
   catch the SocketTimeoutException and exit while your program is looping.
   Test this: Set the time-out to 5 seconds and make sure your code catches
   the time-out exception and exits cleanly.

5) Reduce the number of global variables.  Modify the class code to remove
   all the global class variables except port and wwwhome.  Move the
   removed variables into the function that uses them and pass them as
   arguments to all other functions that need them.

6) Make the remaining two class global variables "private".
   This class has no public global variables.

7) Hide the private methods of this class.  Make all the methods in this
   class except main() and PigLatinHTTP() "private".

8) Modify the processRequest() method to return one or two strings on any
   errors in parsing or accessing the browser Request.  Do not call
   errorReport() - return the error message string(s) instead.
   Return null if everything worked.  Why two strings?  Read on:

9) On return from processRequest() use errorReport() to process any
   non-null error message string(s) returned.  This will be the only place
   in your program where errorReport() will be called; all other uses of
   errorReport() will have been replaced by returning one or two strings of
   error message text.

   When you return two strings from processRequest(), you can pass both
   strings to errorReport() for output.  One string should contain the
   mandatory RFC Status-Code and (short) Reason Phrase from the RFC section
   6.1.1 p.40-41.  The other (longer) string may contain more
   (RFC-optional) explanatory material.

   Why return two strings?  One string contains the Response Status-Code
   and short Reason Phrase (Section 6.1.1 p.39) that is returned as the
   first *header* line of a server Response; the other string is a more
   detailed error message that would be returned in the *body* of the
   Response message.  One header string; one body string.

   For example, the first string might be the Status-Code and Reason Phrase
   "404 Not Found" and the second string might be
   "/tmp/nosuch - The requested URL was not found on this server".
   Your server might generate this Response message using these two strings
   (this is just an example - you may format the message more clearly):

      HTTP/1.0 404 Not Found

      404 Not Found - /tmp/nosuch - The requested URL was not found on this server.

      /tmp/nosuch - The requested URL was not found on this server.

      /tmp/nosuch - The requested URL was not found on this server.

      IDAllen IXWT PigLatinHTTP 1.0 at localhost Port 55555

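The "one header string; one body string" idea might be sketched as below.
This is only an illustration under stated assumptions: the class name
ErrorResponseSketch and the helper formatError() are hypothetical and are
not part of the supplied FileServer code, and the plain-text body layout is
just one possible choice.  Note that every line ends in CR+LF.

```java
public class ErrorResponseSketch {
    /** HTTP line terminator: carriage return followed by line feed. */
    private static final String CRLF = "\r\n";

    /**
     * Build an error Response from the two error strings.
     * (Hypothetical helper for illustration; not from the supplied code.)
     *
     * @param status Status-Code and Reason Phrase, e.g. "404 Not Found"
     * @param detail longer explanation for the body; may be null
     * @return the full Response text, every line ending in CR+LF
     */
    public static String formatError(String status, String detail) {
        // The body repeats the status and adds the detail, if there is any.
        String body = (detail == null) ? status : status + " - " + detail;

        StringBuilder sb = new StringBuilder();
        sb.append("HTTP/1.0 ").append(status).append(CRLF); // header string
        sb.append("Content-Type: text/plain").append(CRLF);
        sb.append(CRLF);                                    // end of headers
        sb.append(body).append(CRLF);                       // body string
        return sb.toString();
    }

    public static void main(String[] args) {
        // The two strings that a request handler might hand back on error.
        String[] err = {
            "404 Not Found",
            "/tmp/nosuch - The requested URL was not found on this server"
        };
        System.out.print(formatError(err[0], err[1]));
    }
}
```

Keeping the Status-Code and Reason Phrase in one string and the explanation
in another means the first Response line never contains anything except what
the RFC requires there.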
   The errorReport() function supplied by the original code doesn't handle
   the input strings very well; it currently accepts three strings but
   always concatenates the first two "code" and "title" together; so, why
   not just pass in two strings instead of three, or use "code" to index a
   table to find the text Reason Phrase?

   According to the RFC, the Status-Code and Reason Phrase (e.g. "404 Not
   Found") should be the only strings printed in the first Response line
   from your server (as shown in the first line in the above example).
   The longer string (named "msg") may be printed as further explanatory
   text in the body of the message seen by the browser (as shown above).

   RFC Section 6.1.1 p.40-41 has a list of possible Status-Codes and Reason
   Phrases that you should review for possible use by your server.

10) Modify the processRequest() method to reduce indentation levels, using
    "return" where appropriate.  Now that the method returns on error, you
    can remove many "else" clauses and shift all that code left, reducing
    indentation and making the program easier to read.  For examples of how
    to reduce indentation, see Class Notes file deep_indentation.txt

11) Fix the request parsing.  It is broken and also not very "liberal":

    - If the client exits (^C) before issuing any request, the HTTP server
      generates a Null Pointer exception.  Fix this!
    - "GET//foo.txt HTTP/1.0" should fail as bad syntax, but does not.
    - "GET   /foo.txt   HTTP/1.0" could be accepted "liberally"; but, it
      fails.

    BE LIBERAL in the syntax you accept in client requests to your server.
    For example: Allow extra white-space after the method and before the
    HTTP version string.

    You may assume for this lab that the URI in the client request does not
    contain any embedded whitespace.  (Blanks would usually be escaped as
    %20; however, your server doesn't need to handle blanks or escapes in
    URIs.)

    Improve the server's URL parsing section to fix the errors and also
    allow extra blanks before and after the URL.
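A minimal sketch of liberal Request-line parsing for step 11 follows.  It is
an illustration only: the class name RequestLineSketch and the method
parseRequestLine() are hypothetical names, not part of the supplied code.
The sketch simply returns null for a bad line; mapping that result onto an
error Status-Code and message is left to your own design.

```java
public class RequestLineSketch {
    /**
     * Split one Request line into its three fields, tolerating extra
     * white-space before, between, and after the fields.
     * (Hypothetical helper for illustration; not from the supplied code.)
     *
     * @param line the raw line read from the client; null means the
     *             client closed the connection before sending a request
     * @return a three-element array {method, URI, version}, or null if
     *         the line does not contain exactly three fields
     */
    public static String[] parseRequestLine(String line) {
        if (line == null) {
            return null;    // client went away; nothing to parse
        }
        // trim() removes leading and trailing blanks; split("\\s+")
        // treats any run of white-space as one field separator.
        String[] fields = line.trim().split("\\s+");
        if (fields.length != 3) {
            return null;    // "GET//foo.txt HTTP/1.0" has only two fields
        }
        return fields;
    }

    public static void main(String[] args) {
        String[] samples = {
            "GET /foo.txt HTTP/1.0",
            "  GET   /foo.txt   HTTP/1.0  ",
            "GET//foo.txt HTTP/1.0"
        };
        for (String s : samples) {
            String[] f = parseRequestLine(s);
            System.out.println("[" + s + "] : "
                + (f == null ? "bad syntax" : f[0] + " " + f[1] + " " + f[2]));
        }
    }
}
```

Checking for null before doing anything else is what avoids the Null Pointer
exception when the client exits without sending a request.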
    Develop tests that demonstrate that your server is more liberal in its
    parsing.

    Hints: The "split()" method of String is useful for parsing into an
    array.  The "trim()" method of String is useful for removing leading
    and trailing whitespace.

12) Implement the following server Response header fields:

       Server:         (pick a name for your PigLatinHTTP server)
       Content-Length: (the length in bytes of the file being served)
       Content-Type:   (the MIME type of the file being returned)
       Date:           (the current date - RFC section 14.18)
       Last-Modified:  (the last modified date of the file being served - 14.29)

    Some of the above fields are already implemented for you.

    Java note: The java.io.File class has methods to return some of the
    above required information.  The "lastModified()" and "length()"
    methods of File will be useful.  The Date(long) constructor is useful
    for converting the lastModified() value (milliseconds since the epoch)
    into a printable date string.

13) Continue to use the guessContentTypeFromName method to generate your
    Content-Type field.  (The sample code already does this.)

14) Major Change: If the content type being requested is guessed to be
    "text/plain", return the "Pig Latin" version of the text in the file,
    not the regular text.  You will find a PigLatinTranslator.java file in
    the course notes.  See the comments at the top of the
    PigLatinTranslator.java file for how to use it in your PigLatinHTTP
    class.  You can try the translator stand-alone by building and running
    it:

       $ javac PigLatinTranslator.java
       $ java PigLatinTranslator &1 | tee test_out.txt

    Usage (run all tests; no prompting; no display on screen; use "tail -f"):

       $ ./autotest_http.sh test_out.txt 2>&1

    If you don't use "tee", then in a separate window you can run
    "tail -f test_out.txt" to see the progress of a test script in writing
    to the test_out file.

    Remember to edit autotest_http.sh.txt to update the incomplete list of
    tests at the bottom before you run it.

24) After you have run your tests, edit, title, and number each test output
    to match the test titles and numbers in the script.  (Of course, you
    may set up your testing script to do this for you.)

25) Note: Nothing in your code or test structure can include absolute
    pathnames (except to well-known system files); in particular, you must
    not reference your own home directory or directories that are not
    public in your test scripts.  Anyone must be able to run your tests on
    any Linux machine.

Assignment Review - review.txt
-----------------

In a file named review.txt fill in the following information:

26) Progress: Document how much of the lab you completed and submitted in
    your interim submission at the end of Week 13.  Give the Step Numbers.

27) Completion: For each of the steps in this assignment, document whether
    you completed the step.  Any comments?

28) Objectives: Comment on the assignment objectives vs. the course outline
    and indicate if, in your opinion, the assignment is relevant to the
    course and contributes to your learning of the course material.

29) Difficulty Level: Using a scale from 0-10, where 0 is "easy", document
    how difficult the assignment was, and how much time you spent doing it.

Submission and Marking Scheme
-----------------------------

Submit the files using the exact file names given below.

Make sure you submit *all* the files needed to compile your server at the
same time!  You can't do partial submissions.

Submission Standards (see earlier labs for details):

A. At the top of each and every submitted file, as comments, create an
   Exterior Assignment Submission label.  This label identifies the file;
   it is not a substitute for proper documentation in the file.  Your file
   will still need comments and function headers.

B. For material you copy from other sources, credit the author and source.
   If the comments in the source you copy are not sufficient, you must fix
   and add to them, just as if you wrote the code yourself.
   (You do not need to add comments to the PigLatinTranslator.java code.)

C. Submit all your source files for marking as Exercise 05 using a *single*
   cstsubmit command line (always submit all files together):

      $ ~alleni/bin/cstsubmit 05 \
            PigLatinHTTP.java PigLatinTranslator.java \
            autotest_http.sh test_out.txt review.txt

   You must submit at least the files shown.  Submit all the files
   necessary to test and run your HTTP server.  Do not submit object files,
   Java bytecode, or binary files.

   Note: Nothing in your code or test structure can include absolute
   pathnames (except to well-known system files); in particular, you must
   not reference your own home directory.  Your code must build and run
   anywhere.

D. To be marked, the files named above must have the exact names given.
   Code submitted without your added useful comments will not be marked.
   Add proper comments to *all* the code except the supplied
   PigLatinTranslator.java file.

E. If you aren't sure if you've submitted all the necessary files for your
   project, change to a new empty directory and use cstsubmit to fetch your
   submission back, expand it, and run your test script.  It should work.

F. All files submitted must be named correctly and have assignment headers.

Marking Scheme
--------------

I) Your Automated Testing : 70%
   - Your test plan is most of your mark.  Even if your code doesn't work
     perfectly, you can write a comprehensive test script.
   - File names: autotest_http.sh, and output in test_out.txt
   - Organize and number the test sections logically in your script
     and in the output file test_out.txt
   - Document what should happen for every testable type of good and bad
     input.
     - see the list of tests given earlier
   - Submit your testing output file named "test_out.txt"
   - You must use and enhance the "autotest_http.sh" automated testing
     script for at least some of your tests.
     (You may need to do manual testing too.)
   - The HTTP RFC 2616 sets down the rules for your server behaviour; but,
     remember that you don't have to implement the tricky stuff.
     Ask me if you have any doubt about what to implement.

II) Coding Style: 30%
   - Source File: PigLatinHTTP.java
     - you must add comments to this file, even for parts of the code
       you did not write or modify
   - see note file programming_style.txt
   - are the comments you add useful in understanding what the code does?
     - no "useless comments" - see programming_style.txt
     - comments are in block form, not excessively interleaved with code
     - sparing use of end-of-line comments (stay within 80 columns!)
   - do the error messages appear on stderr and contain full information,
     including the name of the program issuing them?
     - never say "too many" - print the limit
     - never say "not enough" - tell exactly what was expected
     - user input should never be called "illegal" unless it's against
       the law
   - is this code easy to read and understand?
     - neat, organized, well-spaced
     - is the indentation consistent (Unix tabs are every 8 - "man expand")
   - is this code easy to change and maintain?
     - no "magic numbers" - document your constants and offsets
     - no duplicate code (uses functions for common actions, repeated code)
     - all input presumed hostile and handled safely
     - all function and system call return codes checked
     - "less code is better code"
     - "be liberal in what you accept"

For full marks, all files submitted must be named correctly and have
assignment headers.