-----------------------
Lab #06 for CST8165 due March 19, 2007 (Week 11)
-----------------------
-Ian! D. Allen - idallen@idallen.ca - www.idallen.com

Remember - knowing how to find out an answer is more important than
memorizing the answer.  Learn to fish!  RTFM!  (Read The Fine Manual)

Global weight: 7% of your total mark this term.
Due date: before 10h00 AM Monday March 19 (Week 11)

Interim submissions: in your Lab periods in Week 9 and 10
  You will submit whatever progress you have made on this assignment
  before the end of your lab periods in Week 9 and Week 10.

The on-line deliverables for this exercise are to be submitted on-line in
the Linux Lab T127 using the "cstsubmit" method described in the exercise
description, below.  No paper; no email; no FTP.

Late-submission date: I will accept without penalty exercises that are
submitted late but before 12h00 (noon) on Tuesday, March 20.  After that
late-submission date, the exercise is worth zero marks.

Interim work (whatever you have) must be submitted in Week 9 and Week 10.

Exercises submitted by the *due date* will be marked on-line and your
marks will be sent to you by email after the late-submission date.

Code submitted without your added useful comments (no matter what its
source) will not be marked.  Add useful comments to *all* submitted code.

Exercise Synopsis:
------------------
    Starting from a given Perl template, develop a basic "raw" SMTP
    client that sends a text message and handles server response codes.
    Document it as you write it.  Test it thoroughly using an
    automated test script.  Write a "man page" for it.

Where to work:
--------------
    Submissions must run cleanly in the T127 Linux Lab, though you are
    free to develop and work on them anywhere you like.

    If you develop elsewhere, make sure the code works on the Linux Lab
    machines as well; that's where I test it!

Resources
---------
    Start with the working template smtpclient_v1.pl that is available
    in the course notes.  Remove all the "Perl:" comments and related
    code from this example.  Fix the problems marked with XXX comments
    (and update the comments) as you go.

    Built-in Perl functions are explained in "man perlfunc".  In the man
    page pager ("less"), you can search the man page for a function,
    e.g. in "man perlfunc" try typing this search pattern:   /^ *join

    Use the Net::Telnet module for Perl SMTP client-server connections:

       "man Net::Telnet" (may not work in Linux Lab - see below)
       http://search.cpan.org/search?module=Net::Telnet

       On most systems (except the Linux Lab) "man Net::Telnet" gives you
       a man page on using the Net::Telnet Perl module.  I've placed the
       missing man page in the course Notes under perl_net_telnet.txt

    Documentation on the SMTP protocol is in RFC 2821:

       http://tools.ietf.org/html/rfc2821
       ftp://ftp.rfc-editor.org/in-notes/rfc2821.txt
       Sample SMTP session: course Notes file smtp_session.txt

Exit codes for mail clients
---------------------------
    SMTP clients are often used by other programs to send email, and
    those other programs expect to know whether or not the sending of
    the mail worked by checking the command exit status.  (In Perl,
    "exit(23)" will cause your program to exit with exit status 23.)

    Mail clients (including this one) must exit with exit codes conforming
    to the list of exit statuses given in <sysexits.h>.  In Perl, you
    will have to create variables containing the exit status numbers
    you want to use in your code, using the numbers from <sysexits.h>.

    On a temporary or permanent error in your client, have the client
    exit with an appropriate exit value chosen from the numbers
    in <sysexits.h>, e.g. exit($EX_USAGE) for a syntax problem.
    You will need to define variables such as $EX_USAGE in your Perl
    program.
    
    Choose the program exit status based on the type of error.
    All temporary errors (and only temporary errors that may be retried)
    should make your program exit with status $EX_TEMPFAIL.

    Note: Your program cannot use the die() function to exit, since
    die() exits with a value not authorized by <sysexits.h>.  You must
    always use warn() followed by exit().
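
    A minimal sketch of one way to set this up in Perl follows.  The
    numbers are the standard values from <sysexits.h>; the helper name
    fail() is only an illustration and is not part of the template:

```perl
use strict;
use warnings;

# Exit statuses copied by hand from <sysexits.h>; Perl does not
# predefine them.  Only the ones this sketch expects to need are listed.
my $EX_OK          = 0;     # successful termination
my $EX_USAGE       = 64;    # command line usage error
my $EX_DATAERR     = 65;    # data format error
my $EX_NOHOST      = 68;    # host name unknown
my $EX_UNAVAILABLE = 69;    # service unavailable
my $EX_SOFTWARE    = 70;    # internal software error
my $EX_IOERR       = 74;    # input/output error
my $EX_TEMPFAIL    = 75;    # temporary failure; retry may succeed
my $EX_PROTOCOL    = 76;    # remote error in protocol

# fail() is a hypothetical helper: it does the warn() and the exit()
# in one place so that no caller is tempted to use die().
sub fail {
    my ($status, $message) = @_;
    warn "$0: $message\n";      # program name first, on stderr
    exit($status);
}
```

    A call might then look like:  fail($EX_USAGE, "missing --to email");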

Code Quality and Portability
----------------------------
    You must include useful comment blocks ahead of the code you write or
    modify.  Code submitted without your added useful comments will not
    be marked.

    Marks are awarded for readability and elegance, not just correctness.  
    If your code can't be read, you're useless in a team project.

Manual and Automated Testing
----------------------------
    This section applies to testing your SMTP client.

    To test SMTP responses to your client, you don't need a real SMTP
    server.  You can use the "netcat" command to listen on a port and
    make it act as a fake SMTP server:

       $ nc -v -l localhost 55555             # Red Hat systems
       $ nc -v -l -p 55555 localhost          # Debian systems
    
    Using two windows, you can run both your client and your fake SMTP
    server simultaneously.  In one window, you can start the fake SMTP
    server on localhost port 55555.  In another window, you can start
    your SMTP client and tell it to connect to localhost port 55555.
    You can then type SMTP server responses into the server window;
    they will be read by your client running in the client window.
    (You may need to extend the time-out values in your client to wait
    long enough for you to be able to type the server responses.)

    To test a series of SMTP responses automatically, you can "fake" a
    whole set of responses from an SMTP server to your client by putting
    the lines you want the fake server to send in a text file and using
    the file as standard input to your fake SMTP server:

       $ nc -v -l localhost 55555 <file       # Red Hat systems
       $ nc -v -l -p 55555 localhost <file    # Debian systems

    If you start the above fake SMTP server and then connect your SMTP
    client to port 55555, your client will read, one-by-one, the lines
    from the given file used as input to "netcat", allowing you to test
    your client against a whole list of SMTP server responses.

    You will find in the notes as "autotest_smtp.sh.txt" an automated test
    script that creates netcat servers and tests your client against them.
    The script has a few tests already programmed into it.  You need
    to update the given script to organize these tests and add your
    own tests.  Sample output is in sample_smtp_test_out.txt

    For best results using the autotest_smtp.sh script, add a "sleep 1"
    in your SMTP client after every write to the server, so that the
    fake SMTP server has a chance to echo the line onto the screen before
    your client goes on to do the next line.

Coding the client - smtpclient.pl
-----------------

1)  Read the above sections that you just skipped over, or else you're
    going to be very confused by the rest of this document.

2)  This Perl client must be named "smtpclient.pl".   Spelling counts.

    The client will connect to a given SMTP server on a given port
    and send a text message (read from STDIN) to a given email address.

3)  A working template smtpclient_v1.pl is available in the course notes.
    Remove all the "Perl:" comments and related code from this example.

    Fix the problems marked with XXX comments (and update the comments)
    as you understand and work through the rest of this assignment.

4)  The Perl client must be enhanced to accept and validate the following
    command line argument options: 

       --to  email
       --from  email
       --smtpserver  hostname 
       --port  portnum

    You can handle the argument parsing manually, or check out a nice
    argument parsing library that will help you: man Getopt::Long

    Your Perl script will be run as follows (option spelling counts):

       $ ./smtpclient.pl --to user@domain.com --from sender@domain.com \
          --smtpserver mail.domain.com  --port 25

    The program must run as described above (command line option names
    etc.)  as I am using a script to test this.  Spelling counts.

5)  The command line arguments may appear in any order.  The argument
    parsing library Getopt::Long does this automatically.  Print a
    good error message and Usage message and exit with an appropriate
    <sysexits.h> error code if argument parsing fails.

6)  Use a default port of 25 if one is not given.

7)  Use a default SMTP host of "localhost" if one is not given.

8)  Use a default "from" of the current userid running the script
    ( in Perl use $ENV{USER} ) with domain "@algonquincollege.com" added.
    (Remember to escape the '@' as '\@' inside double-quoted Perl strings.)

9)  Do not use any default for a missing --to userid.  If the "to" is
    missing, print an error message, a Usage message and exit the program
    with an appropriate <sysexits.h> error code.
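
    Steps 4 through 9 could be sketched roughly as follows, assuming
    Getopt::Long is used.  The helper name parse_args(), the error
    wording, and the 'unknown' fallback for a missing $ENV{USER} are
    assumptions made for this illustration, not part of the template:

```perl
use strict;
use warnings;
use Getopt::Long;

# parse_args() is a hypothetical helper.  It takes the command line
# words and returns (\%opt, \@errors); it never exits by itself, so the
# caller decides how to warn() and which <sysexits.h> status to use.
sub parse_args {
    my @words = @_;
    my $user  = defined $ENV{USER} ? $ENV{USER} : 'unknown';
    my %opt   = (
        to         => undef,                    # no default allowed
        from       => $user . '@algonquincollege.com',
        smtpserver => 'localhost',
        port       => 25,
    );
    my @errors;
    {
        # Getopt::Long reads @ARGV and reports problems using warn();
        # localize both so the messages can be collected.
        local @ARGV = @words;
        local $SIG{__WARN__} = sub { push @errors, @_ };
        GetOptions(
            'to=s'         => \$opt{to},
            'from=s'       => \$opt{from},
            'smtpserver=s' => \$opt{smtpserver},
            'port=s'       => \$opt{port},
        );
        push @errors, "unexpected argument: $_\n" for @ARGV;
    }
    push @errors, "missing required option: --to email\n"
        unless defined $opt{to};
    return (\%opt, \@errors);
}
```

    The main program would call parse_args(@ARGV) once, before doing
    anything else, and on a non-empty error list print the errors and
    a Usage message with warn() and then exit with the EX_USAGE value.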

10) The "from" and "to" email userid syntax from the command line should
    be gently validated as best you can.  (Must be printable ASCII,
    contain exactly one '@', no blanks, etc.)  The allowed domain
    characters are given in the RFC Section 4.1.2.  Do try to catch
    obvious errors; don't go overboard on this!  Do something simple
    and come back later if you have time.
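
    One simple way to do this gentle validation is sketched below.  The
    helper name valid_address() is hypothetical.  The sketch accepts
    only letters, digits, hyphens and dots in the domain, and it
    rejects address literals such as [10.0.0.1], which a fuller
    implementation might choose to accept:

```perl
use strict;
use warnings;

# valid_address() is a hypothetical helper: return 1 if $addr looks
# like local-part@domain, otherwise 0.  Deliberately simple.
sub valid_address {
    my ($addr) = @_;
    return 0 unless defined $addr;

    # Printable ASCII only: no blanks, no control characters.
    return 0 unless $addr =~ /\A[\x21-\x7E]+\z/;

    # Exactly one '@' with something on each side of it.
    my @parts = split /\@/, $addr, -1;
    return 0 unless @parts == 2;
    my ($local, $domain) = @parts;
    return 0 if $local eq '' || $domain eq '';

    # Each dot-separated domain label must start and end with a
    # letter or digit and may contain hyphens in between.
    for my $label (split /\./, $domain, -1) {
        return 0
            unless $label =~ /\A[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?\z/;
    }
    return 1;
}
```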

11) Also gently validate the syntax of the smtp server argument and
    port number.  Make sure the port argument is alphanumeric (no
    funny chars).  Make the port name/number all lower-case, since
    upper-case names may not match /etc/services names.  Validation isn't
    as important for server and port since the Net::Telnet module will
    tell you immediately if the server or port doesn't exist.
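
    A short sketch of these two checks follows.  The helper names
    clean_port() and valid_server() are hypothetical, and the numeric
    range test (1 to 65535) is an extra check added for illustration:

```perl
use strict;
use warnings;

# clean_port() is a hypothetical helper: return the lower-cased port
# name or number, or undef (in scalar context) if it is not usable.
sub clean_port {
    my ($port) = @_;
    return unless defined $port && $port =~ /\A[A-Za-z0-9]+\z/;
    $port = lc $port;               # /etc/services names are lower-case
    if ($port =~ /\A[0-9]+\z/) {
        # Extra check (an assumption): TCP ports are 1 through 65535.
        return if $port < 1 || $port > 65535;
    }
    return $port;
}

# valid_server() is a hypothetical helper: allow only letters, digits,
# hyphens and dots, starting and ending with a letter or digit.
sub valid_server {
    my ($host) = @_;
    return 0 unless defined $host;
    return $host =~ /\A[A-Za-z0-9](?:[A-Za-z0-9.-]*[A-Za-z0-9])?\z/ ? 1 : 0;
}
```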

12) Following the Four-Part Algorithm structure given in Notes file
    programming_style.txt, ensure that all your command line argument
    parsing and validation is done *before* you use any of the arguments
    in the main part of your program.  Do not mix argument parsing and
    evaluation with the rest of your code; do it first.

13) Replace all the die() functions in the sample code with warn()
    followed by the appropriate choice of exit() code from <sysexits.h>.
    Your program must always exit with a <sysexits.h> exit status.

14) Write the missing three steps that will supply the SMTP envelope
    addresses and message body and send the message (read from STDIN) to
    the remote SMTP server.

15) After each client command line sent to the SMTP server, your
    client must verify that the command was accepted by the server.
    Check the SMTP response code from the server.
    
    RFC 2821 Section 4.2 explains how a client must handle the server
    responses.  Section 4.2.1 documents the basic behaviour of "an
    unsophisticated SMTP client"; start by writing code to implement
    this basic functionality.

    If you have time, come back and refine the code to examine the
    second digit of the response to provide better error messages.
    Section 4.3.2 lists the possible responses to each client message.
    (Note that there are also three codes that may be returned after *any*
    SMTP command, in "unusual circumstances".)

    On a temporary or permanent error from the SMTP server, have the
    client print the SMTP response line and exit the program with an
    appropriate exit value chosen from the numbers in <sysexits.h>.
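
    The first-digit handling of Section 4.2.1 could be sketched like
    this.  The helper names and the particular mapping from reply class
    to exit status are assumptions for illustration; which status is
    "appropriate" for a permanent error is your own design decision:

```perl
use strict;
use warnings;

my $EX_OK          = 0;     # values from <sysexits.h>
my $EX_UNAVAILABLE = 69;
my $EX_TEMPFAIL    = 75;
my $EX_PROTOCOL    = 76;

# reply_code() is a hypothetical helper: return the three-digit code
# at the start of a server reply line, or undef if the line is not a
# well-formed SMTP reply.
sub reply_code {
    my ($line) = @_;
    return unless defined $line;
    return unless $line =~ /\A([1-5][0-9][0-9])(?:[ -]|\r?\n?\z)/;
    return $1;
}

# exit_status_for() is a hypothetical helper: choose an exit status
# from the first digit only.  The caller must still check that a 2yz
# or 3yz reply is the one expected for the command just sent.
sub exit_status_for {
    my ($code) = @_;
    return $EX_PROTOCOL unless defined $code;
    my $first = substr($code, 0, 1);
    return $EX_OK          if $first eq '2' || $first eq '3';
    return $EX_TEMPFAIL    if $first eq '4';    # transient: may retry
    return $EX_UNAVAILABLE if $first eq '5';    # permanent failure
    return $EX_PROTOCOL;                        # 1yz is not expected
}
```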

16) If the SMTP server limits the size of messages in the SMTP options
    given in response to EHLO, extract and save this limit for later use.
    Treat a limit of zero as no limit.  I've done much of the work
    of parsing the value for you; you just have to use it.  The Perl
    function length($str) returns the length of a string.
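
    The template already does most of this parsing.  Purely as an
    independent illustration, extracting the SIZE value from a list of
    EHLO reply lines might look like the sketch below; the helper name
    size_limit() is hypothetical:

```perl
use strict;
use warnings;

# size_limit() is a hypothetical helper: given the reply lines sent by
# the server after EHLO, return the advertised SIZE limit in bytes.
# A return value of 0 means "no limit" (no SIZE line, SIZE with no
# number, or SIZE 0).
sub size_limit {
    my @lines = @_;
    for my $line (@lines) {
        if ($line =~ /\A250[ -]SIZE(?:[ ]+([0-9]+))?\s*\z/i) {
            return defined $1 ? $1 + 0 : 0;
        }
    }
    return 0;
}
```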

17) The message text to send will come from standard input.  (Of course,
    you may use shell input redirection from a pipe or file, or type
    it at the keyboard.)  A sample Perl program that prompts and reads
    standard input is in the course Notes under read_stdin.pl.txt

    You do *not* have to verify the semantic content or header lines of
    the message read from standard input (RFC2822).  Your program will
    accept anything and pass it to the SMTP server.  But note:

18) For the message being sent, you must follow (1) the period escaping
    rules given in RFC2821 Section 4.5.2, (2) the length limits given
    for SMTP text lines in section 4.5.3.1, and (3) any message maximum
    SIZE limit given by the server in response to EHLO.

    If you detect an input line longer than the RFC-defined SMTP limit,
    print a good error message to the user, but let the line pass anyway.

    The precise meaning of the EHLO SIZE extension is defined here:
    http://tools.ietf.org/html/rfc1870 (see section 3, p. 2).
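
    The period escaping and the line length check could be sketched as
    below.  The helper name prepare_line() is hypothetical.  The sketch
    assumes the text line limit of 1000 characters counts the trailing
    CRLF but not the extra period added by the escaping:

```perl
use strict;
use warnings;

my $MAX_TEXT_LINE = 1000;   # RFC 2821 text line limit, CRLF included
my $CRLF_LENGTH   = 2;      # the CR and LF that end each line

# prepare_line() is a hypothetical helper.  Given one line read from
# STDIN, it returns the line ready to send (without any line ending)
# and a flag that is 1 if the line is longer than the SMTP limit.
sub prepare_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;               # drop the local line ending
    my $too_long =
        (length($line) + $CRLF_LENGTH > $MAX_TEXT_LINE) ? 1 : 0;
    $line =~ s/\A\./../;                # leading '.' becomes '..'
    return ($line, $too_long);
}
```

    The caller would warn() about a set flag and send the line anyway.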

19) When you detect that you cannot send another line of data to the
    SMTP server because you have reached the overall message size limit,
    print a good error message to the user and discard/ignore the rest
    of the message that is oversize.  Send the partial message.
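
    A sketch of the size bookkeeping follows.  The helper name
    lines_within_limit() is hypothetical.  It assumes each line costs
    its own length plus two bytes for CRLF, counted before any period
    escaping, which is how this sketch reads RFC 1870:

```perl
use strict;
use warnings;

my $CRLF_LENGTH = 2;    # each line is sent with a trailing CR and LF

# lines_within_limit() is a hypothetical helper.  Given the server
# limit in bytes (0 means no limit) and the message lines without line
# endings, it returns a reference to the lines that fit and a flag
# that is 1 if the rest of the message had to be discarded.
sub lines_within_limit {
    my ($limit, @lines) = @_;
    my $total = 0;
    my @keep;
    for my $line (@lines) {
        my $cost = length($line) + $CRLF_LENGTH;
        if ($limit > 0 && $total + $cost > $limit) {
            return (\@keep, 1);         # stop: next line will not fit
        }
        $total += $cost;
        push @keep, $line;
    }
    return (\@keep, 0);
}
```

    A real client reading STDIN line by line would keep the running
    total in the same way instead of holding the whole message in
    memory; the list form here only keeps the example short.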

20) If you find yourself duplicating code, re-factor your source and
    create a function to handle the duplicate code in one place.

Code submitted without your added useful comments (no matter what
its source) will not be marked.  Add comments to *all* the code.

Test Plan - README.txt
---------

21) Develop a written Test Plan named "README.txt" for your client.
    See the sample_smtp_README.txt file in the course Notes.

    The object of a Test Plan is to prove to the reader that your code
    works under most possible combinations of inputs (or at least as
    many as you can reasonably test).

    Make the file easy to read.  Lines must be shorter than 80 columns.

    At minimum you need to test these major areas in your client:

    - command line parsing and argument validation
    - connections to local and remote SMTP servers
    - handling of four major categories of SMTP response codes
    - other tests?

    Each test documented in your README.txt file will have:

    0) unique test number (also must appear in test_out.txt)
    a) purpose of this test
    b) parameters/input used to perform the test
    c) expected results
    d) actual results
    
    If you use an automated test script for parts of your test plan,
    you may refer the reader to that script file and its output, by
    test number.  You don't need to duplicate test information between
    the various files as long as you cross-reference it; having it in
    one file is sufficient if it is properly cross-referenced by test
    number in all the other files.  The README.txt, autotest_smtp.sh,
    and test_out.txt files must contain the same test numbers.

    See the autotest_smtp.sh.txt file in the course Notes.
    See the sample_smtp_README.txt file in the course Notes.
    See the sample_smtp_test_out.txt file in the course Notes.

    Partial test plans are worth partial marks.

22) Organize the tests in your README.txt by category.  Use headings.
    Make the file easy to read.  Lines must be shorter than 80 columns.
    Every test must have a unique name or number that appears in both
    the README.txt file and the test_out.txt testing results file.

23) Implement the test plan.  Run your tests.  Record the output of
    applying your test plan in the output file "test_out.txt".  To run
    multiple tests, you may copy (with credit) and modify the automated
    testing script "autotest_smtp.sh.txt" available in the Notes area.
    See the section above on Automated Testing.

    Redirect the output of your test script "autotest_smtp.sh" into
    test_out.txt and submit it with your files for marking.  Using a shell
    script to run most of your tests will make your testing much easier.

    Usage (prompting and displaying copy of tests on screen):

	$ ./autotest_smtp.sh 2>&1 | cat -v | tee test_out.txt

    Usage (run all tests; no prompting; no display on screen; use "tail -f"):

	$ ./autotest_smtp.sh </dev/null 2>&1 | cat -v >test_out.txt

    If you don't use "tee", then in a separate window you can run "tail
    -f test_out.txt" to see the progress of a test script in writing to
    the test_out file.

24) After you have run your tests, edit, title, and number each test
    output to match the test titles and numbers in your README.txt file.

    You don't have to copy the test output into the README.txt file if
    you can refer to the test output in test_out.txt by name or number.

Writing the "man page" - smtpclient.txt
----------------------

25) Write a text "man page" file named smtpclient.txt for your smtpclient.pl
    program that has the following standard man page headings:

    NAME
    SYNOPSIS
    DESCRIPTION
    ENVIRONMENT
    AUTHOR
    REPORTING BUGS
    COPYRIGHT
    SEE ALSO

    Use "man date" as your model for each of these sections.
    Lines must be shorter than 80 columns.

    (Optional Note: Unix man pages are actually written using a mark-up
    language named "troff", "nroff", or "groff" that is processed
    to create the on-line text man pages you see with the "man"
    command.  You will find the nroff source to "man date" in the file
    /usr/share/man/man1/date.1.gz and you are free to optionally write
    your own man page by editing and modifying that nroff source format.
    Once you have your page written in nroff format, you can format
    it using "man ./file" where "./file" is the pathname to your man
    page source.  If the argument to man has a slash, it is taken to
    be an nroff source file to format.  Writing in nroff source format
    is optional.)

Assignment Review - review.txt
-----------------

In a file named review.txt fill in the following information:

26) Completion: For each of the 28 steps in the assignment, document
    whether you completed the step.

27) Objectives: Comment on the assignment objectives vs. the course
    outline and indicate if, in your opinion, the assignment is relevant
    to the course and contributes to your learning of the course material.

28) Difficulty Level: Using a scale from 0-5, where 0 is "easy", document
    how difficult the assignment was, and how much time you spent doing it.

Submission and Marking Scheme
-----------------------------

Submission Standards (see earlier labs for details):

A.  At the top of each and every submitted file, as comments, create an
    Exterior Assignment Submission label.  This label identifies the
    file; it is not a substitute for proper documentation in the file.
    Your file will still need comments and function headers.

B.  For material you copy from other sources, credit the author and source.
    If the comments in the source you copy are not sufficient, you must
    fix and add to them, just as if you wrote the code yourself.

C.  Submit all your source files for marking as Exercise 06 using a
    *single* cstsubmit command line (always submit all files together):

      $ ~alleni/bin/cstsubmit 06 smtpclient.pl smtpclient.txt \
          README.txt test_out.txt autotest_smtp.sh review.txt

    You must submit six files.  Submit all the files necessary to test
    and run your SMTP client.  Do not submit object files or binary files.

    Note: Nothing in your code or test structure can include absolute
    pathnames (except to well-known system files); in particular, you
    must not reference your own home directory.

D.  To be marked, the files named above must have the exact names given.
    Code submitted without your added useful comments (no matter what its
    source) will not be marked.  Add proper comments to *all* the code.

E.  If you aren't sure if you've submitted all the necessary files for your
    project, change to a new empty directory and use cstsubmit to fetch your
    submission back, expand it, and run your test script.  It should work.

F.  All files submitted must be named correctly and have assignment headers.

Marking Scheme
--------------

I) User Manual - "man page" in smtpclient.txt : 10%
  - Man page file name:  smtpclient.txt
  - follow the heading format given in "man date"
  - How do you use your SMTP client program?
  - What are the inputs and outputs?
  - What exit status is returned?
  - What environment variables or external data (e.g. hostname) are used?
  - What other programs are similar to this or useful in conjunction with this?

II) Your Automated Testing and README.txt : 60%
  - Your test plan is most of your mark.  Even if your code doesn't
    work, you can write a comprehensive test plan.
  - File names:  README.txt, autotest_smtp.sh, and output in test_out.txt
  - Number the sections in README.txt and test_out.txt; refer to them in
    your testing, and vice-versa (cross reference both files).
  - Your autotest_smtp.sh and its output must refer to the numbered
    sections in your README file.
  - Your README.txt file must refer to the numbered sections in your
    autotest_smtp.sh and its test_out.txt output file.
  - Document what should happen for every possible type of good and bad input.
     - command line parsing and argument validation
     - connections to local and remote SMTP servers
     - handling of four major categories of SMTP response codes
     - other tests? 
  - Submit your testing output file named "test_out.txt"
  - You must use and enhance the "autotest_smtp.sh" automated testing script.
  - The SMTP RFC 2821 sets down the rules for your client behaviour.

III) Coding Style: 30%
  - see note file programming_style.txt
  - are the comments you add useful in understanding what the code does?
  - no "useless comments" - see programming_style.txt
  - comments are in block form, not excessively interleaved with code
  - sparing use of end-of-line comments (stay within 80 columns!)
  - do the error messages appear on stderr and contain full information,
    including the name of the program issuing them?
    - never say "too many" - print the limit
    - never say "not enough" - tell exactly what was expected
    - user input should never be called "illegal" unless it's against the law
  - is this code easy to read and understand?
    - neat, organized, well-spaced
    - is the indentation consistent? (Unix tabs are every 8 - "man expand")
  - is this code easy to change and maintain?
    - no "magic numbers" - document your constants and offsets
    - no duplicate code (uses functions for common actions, repeated code)
  - all input presumed hostile and handled safely
  - all function and system call return codes checked
  - all exit statuses chosen from <sysexits.h>
    - no use of die() in this program!
  - "less code is better code"
  - "be liberal in what you accept"