-------------------------
Week 12 Notes for CST8165
-------------------------
-Ian! D. Allen - idallen@idallen.ca

Remember - knowing how to find out an answer is more important than
memorizing the answer.  Learn to fish!  RTFM!  (Read The Fine Manual)

Keep up on your readings (Course Outline: average 4 hours/week homework)

[************************************************************]
[************************************************************]
[*** Students should be taking their own notes in class   ***]
[*** and updating them with my published summaries.       ***]
[************************************************************]
[************************************************************]

===============================================================================

Testing - black box vs. white box, "behavioral" vs. "structural"
-------
 - I don't have time to read and test all your code; you have to do it

  http://www.scism.sbu.ac.uk/law/Section5/chap3/s5c3p23.html

   "White box testing is concerned only with testing the software
    product, it cannot guarantee that the complete specification
    has been implemented. Black box testing is concerned only with
    testing the specification, it cannot guarantee that all parts
    of the implementation have been tested. Thus black box testing
    is testing against the specification and will discover faults of
    omission, indicating that part of the specification has not been
    fulfilled. White box testing is testing against the implementation
    and will discover faults of commission, indicating that part of the
    implementation is faulty. In order to fully test a software product
    both black and white box testing are required."

  http://www.faqs.org/faqs/software-eng/testing-faq/section-13.html

   "One has to use a mixture of different methods so that they aren't
    hindered by the limitations of a particular one.  Some call this
    "gray-box" or "translucent-box" test design, but others wish we'd
    stop talking about boxes altogether."

Looking at the FileServer white-box style:
  http://www.brics.dk/ixwt/examples/FileServer.java

  - what tests exercise every line of code, especially each of the exceptions?
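
A minimal sketch of the two styles, in Python rather than the Java of
FileServer.  The function parse_port and its one-line specification are
made up for illustration; they are not part of FileServer.java.  The
black-box cases come from reading the spec only; the white-box cases come
from reading the code and aiming one test at each exception path.

```python
def parse_port(text):
    """Spec (hypothetical): return the TCP port number in `text` as an int;
    raise ValueError if it is not an integer in the range 1..65535."""
    try:
        port = int(text)
    except (TypeError, ValueError):
        raise ValueError("not an integer: %r" % (text,))
    if port < 1 or port > 65535:
        raise ValueError("out of range: %d" % port)
    return port


def raises_value_error(arg):
    """Helper: True if parse_port(arg) raises ValueError."""
    try:
        parse_port(arg)
    except ValueError:
        return True
    return False


# Black-box tests: derived from the specification only (boundary values).
black_box = [
    parse_port("1") == 1,
    parse_port("65535") == 65535,
    raises_value_error("0"),
    raises_value_error("65536"),
]

# White-box tests: chosen by reading the code, one per exception path.
white_box = [
    raises_value_error(None),    # int() raises TypeError
    raises_value_error("abc"),   # int() raises ValueError
    raises_value_error("-5"),    # parses, then fails the range check
]
```

Neither list alone is enough: the black-box list never feeds in None,
and the white-box list would not notice if the spec also demanded
something the code never attempts.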

Q: What is the difference between white-box and black-box testing of a
    piece of code?  Give the advantages and disadvantages of each method,
    especially with regard to testing the specification.

==============================================================================

see Notes: Mail Systems Terminology - mail_systems_terms.txt

===============================================================================

Protocols - Reading Mail - Post Office Protocol (POP)
-----------------------------------------------------
  http://tools.ietf.org/html/rfc1939   (23 pages)
   - note the "Errata" link
   - version 3:  RFC 1081 -> 1225 -> 1460 -> 1725 -> 1939 
     updated by RFC 1957 (one page observation RTFM!) and 2449 (extensions)
   - specified to use TCP port 110 (Section 3)
   - POP is supposed to stay *SIMPLE* (use IMAP for everything else - Section 1)
   - example:  http://tools.ietf.org/html/rfc1939#page-19

  http://tools.ietf.org/html/rfc2449   (19 pages - CAPA extension)
  - on extending POP3 (RFC 2449 intro and section 7):
   "This extension to the POP3 protocol is to be used by a server to
    express policy descisions taken by the server administrator.  It is
    not an endorsement of implementations of further POP3 extensions
    generally.  It is the general view that the POP3 protocol should stay
    simple, and for the simple purpose of downloading email from a mail
    server.  If more complicated operations are needed, the IMAP protocol
    [RFC 2060] should be used.

    Future extensions to POP3 are in general discouraged, as POP3's
    usefulness lies in its simplicity.  POP3 is intended as a download-
    and-delete protocol; mail access capabilities are available in IMAP
    [IMAP4].  Extensions which provide support for additional mailboxes,
    allow uploading of messages to the server, or which deviate from
    POP's download-and-delete model are strongly discouraged and unlikely
    to be permitted on the IETF standards track.

    Clients MUST NOT require the presence of any extension for basic
    functionality, with the exception of the authentication commands"

Q: Why are extensions to POP3 discouraged?

Section 3 - Basic Operation
 - eight case-insensitive 3-4 character command keywords (section 3)
 - traditional CRLF line terminators
 - single space separators
 - arguments only up to 40 characters (!) - very short lines
 - only two status indicators: +OK and -ERR (upper case)
   - no way to distinguish between temporary and permanent failure
   - no way to distinguish "not now" from "not implemented"
 - multi-line responses terminated by a single period on a line
   - leading periods are doubled and then must be removed (like SMTP)
   - called "byte-stuffing" (Section 3 page 3)
 - a state-oriented protocol
   AUTHORIZATION -> TRANSACTION -> UPDATE
   - must authenticate before issuing transactions
   - update happens *after* the client issues QUIT, not on a bare disconnect
 - MUST NOT time out before 10 minutes (section 3 page 4)
   - a time-out does not trigger an UPDATE - messages marked for
     deletion are not removed
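
The byte-stuffing rule above can be sketched in a few lines of Python.
The function names stuff, unstuff and is_ok are my own, not from the
RFC, and the sketch works on whole strings instead of reading from a
socket.

```python
def is_ok(status_line):
    """True for a +OK status line, False for anything else (e.g. -ERR)."""
    return status_line.startswith("+OK")


def stuff(lines):
    """Server side: encode a list of text lines as a POP3 multi-line body."""
    out = []
    for line in lines:
        if line.startswith("."):
            line = "." + line        # double the leading period
        out.append(line + "\r\n")
    out.append(".\r\n")              # terminator: a lone period on a line
    return "".join(out)


def unstuff(wire):
    """Client side: decode a multi-line body back into a list of lines."""
    lines = []
    for line in wire.split("\r\n"):
        if line == ".":
            return lines             # terminator reached
        if line.startswith("."):
            line = line[1:]          # remove the extra leading period
        lines.append(line)
    raise ValueError("missing terminating '.' line")
```

Without the doubling, a message line consisting of a single period would
look like the end of the response and the rest of the message would be
misread as status lines.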

Q: T/F, unlike most Internet protocols, POP3 only requires LF on line ends.
Q: T/F, the POP protocol has different exit codes for temporary and
    permanent failures.
Q: How does the POP protocol handle multi-line server responses (e.g.
   when fetching a message)?
Q: What is "byte-stuffing" with respect to POP3?
Q: Name and describe what happens in each of the three states of a
   POP3 connection.  What triggers the entry into each state?
Q: T/F, if a POP3 client drops the connection, the server skips the
   UPDATE phase.

Authorization/Authentication State (Section 4 page 4)
 - each AUTHORIZATION method is optional; but, you must use at least one (!)
 - RFC defines cleartext USER and PASS or APOP methods
 - RFC says "there is no single authentication mechanism that is required
   of all POP3 servers" (!) but Section 9 lists USER and PASS as
   "Minimal POP3 Commands", implying they are required
 - APOP uses md5 and a shared secret
   - see p.16 - you can calculate this digest in Linux via:
    $ echo -n '<1896.697170952@dbc.mtview.ca.us>tanstaaf' | md5sum
    c4c9334bac560ecc979e58001b3e22fb  -
 - neither USER/PASS nor APOP encrypts the full connection...
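
The same APOP digest can be computed in Python with the standard hashlib
module.  This is only a sketch: the helper name apop_digest is my own,
and the timestamp and shared secret are the example values used in the
md5sum command above.

```python
import hashlib


def apop_digest(timestamp, secret):
    """MD5 of the server's greeting timestamp followed by the shared
    secret, as lower-case hex."""
    data = (timestamp + secret).encode("ascii")
    return hashlib.md5(data).hexdigest()


digest = apop_digest("<1896.697170952@dbc.mtview.ca.us>", "tanstaaf")
```

The timestamp changes on every connection, so a captured digest cannot
simply be replayed, but the rest of the session still travels in the
clear.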

Q: T/F, the USER and PASS POP commands set up an encrypted connection.

  http://tools.ietf.org/html/rfc1734 - POP3 AUTH command
    "the client may request authentication types in decreasing order of
     preference, with the USER/PASS or APOP command as a last resort."  (p.2)

    "A protection mechanism provides integrity and privacy protection
     to the protocol session.  If a protection mechanism is negotiated,
  *  it is applied to all subsequent data sent over the connection.
     The protection mechanism takes effect immediately following the CRLF
     that concludes the authentication exchange for the client, and the
     CRLF of the positive response for the server.  Once the protection
     mechanism is in effect, the stream of command and response octets is
     processed into buffers of ciphertext.  Each buffer is transferred
     over the connection as a stream of octets prepended with a four
     octet field in network byte order that represents the length of
     the following data."  (p.2)
 - QUIT is also allowed in Authorization State (Section 4 p.5)
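
The framing described in the RFC 1734 quote (a four-octet length in
network byte order in front of each buffer) can be sketched with the
standard struct module.  The names frame and unframe are my own, and the
bytes passed in stand in for whatever ciphertext the negotiated
mechanism would really produce; no encryption is done here.

```python
import struct


def frame(ciphertext):
    """Prepend a four-octet, big-endian (network order) length field."""
    return struct.pack("!I", len(ciphertext)) + ciphertext


def unframe(stream):
    """Split one framed buffer off the front of `stream`.

    Returns (buffer, rest_of_stream)."""
    if len(stream) < 4:
        raise ValueError("incomplete length field")
    (length,) = struct.unpack("!I", stream[:4])
    if len(stream) < 4 + length:
        raise ValueError("incomplete buffer")
    return stream[4:4 + length], stream[4 + length:]
```

The length prefix is needed because ciphertext can contain any octet
values, so CRLF can no longer be used to find the end of a line.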

Q: How does POP3 "protection" affect data transfer between client and server?

SASL: Simple Authentication and Security Layer
 - usable via the CAPA extension http://tools.ietf.org/html/rfc2449 (19 pages)
 - see also: SASL use in SMTP http://tools.ietf.org/html/rfc2554 (11 pages)

Authorization State
 - all authorization methods are optional; but, one must be supported

Transaction State
 - Must handle: STAT, LIST, RETR, DELE, NOOP, RSET, QUIT

Update State (can only be entered from Transaction State)
 - entered *only* via QUIT, never by hangup or disconnect
 - no commands
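
A toy model of the three states, to make the QUIT-versus-hangup rule
concrete.  The class Pop3Session and its method names are made up for
these notes; there is no networking and no real maildrop, only the state
transitions.

```python
AUTHORIZATION, TRANSACTION, UPDATE, CLOSED = (
    "AUTHORIZATION", "TRANSACTION", "UPDATE", "CLOSED")


class Pop3Session:
    """Toy model of the POP3 state transitions."""

    def __init__(self):
        self.state = AUTHORIZATION
        self.marked = set()     # message numbers marked by DELE
        self.deleted = set()    # messages actually removed in UPDATE

    def login_ok(self):
        """A successful USER/PASS or APOP exchange."""
        if self.state != AUTHORIZATION:
            raise RuntimeError("already authenticated")
        self.state = TRANSACTION

    def dele(self, msg):
        """DELE only marks the message; nothing is removed yet."""
        if self.state != TRANSACTION:
            raise RuntimeError("DELE only valid in TRANSACTION state")
        self.marked.add(msg)

    def quit(self):
        """QUIT: from TRANSACTION, pass through UPDATE and then close."""
        if self.state == TRANSACTION:
            self.state = UPDATE
            self.deleted = set(self.marked)   # deletions happen only here
        self.state = CLOSED

    def disconnect(self):
        """Hangup or time-out: UPDATE is skipped, marks are discarded."""
        self.marked.clear()
        self.state = CLOSED
```

A QUIT issued while still in the AUTHORIZATION state closes the session
without passing through UPDATE, which is why quit() checks the state
first.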

Section 8: Scaling and Operational Considerations
 - people using POP stores as permanent message archives
  "When these facilities are used in this way by casual users, there has
   been a tendency for already-read messages to accumulate on the server
   without bound.  This is clearly an undesirable behavior pattern from
   the standpoint of the server operator.  This situation is aggravated
   by the fact that the limited capabilities of the POP3 do not permit
   efficient handling of maildrops which have hundreds or thousands of
   messages."

Q: T/F, POPmail scales well to handle hundreds or thousands of messages.

Section 11: Message Format
  "It is important to note that the octet count for a message on
   the server host may differ from the octet count assigned to that
   message due to local conventions for designating end-of-line."
   - the size of the message in the file system may not match the size
     transmitted over the wire (especially on Unix/Linux systems, which
     store line ends as a single LF while POP3 sends CRLF)
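
A small sketch of why the two counts differ, assuming the message is
stored Unix-style with LF line ends and each line end is sent as CRLF.
The helper name wire_size and the sample message are made up; byte-
stuffing and the final "." terminator line are deliberately not counted.

```python
def wire_size(stored):
    """Octet count of a stored message once every line end is CRLF."""
    normalized = stored.replace(b"\r\n", b"\n")   # tolerate mixed input
    return len(normalized) + normalized.count(b"\n")


stored = b"Subject: hi\n\nline one\nline two\n"
on_disk = len(stored)
on_wire = wire_size(stored)
```

Here the difference is exactly one octet per line, so a server that
reports sizes has to count lines and cannot just report the file size.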

Q: Give the minimal set of POP3 commands needed to retrieve and delete
   one message on a POP3 server.

=============================================================================

Protocols - Reading Mail - Internet Message Access Protocol (IMAP)
------------------------------------------------------------------
  http://tools.ietf.org/html/rfc3501   (108 pages)

  - RFC 1730 -> 2060 -> 3501
    updated by RFC 4466 (collected extensions)
    updated by RFC 4469 (CATENATE extension)
    updated by RFC 4551 (conditional STORE, etc.)

  "The Internet Message Access Protocol, Version 4rev1 (IMAP4rev1)
   allows a client to access and manipulate electronic mail messages on
   a server.  IMAP4rev1 permits manipulation of mailboxes (remote
   message folders) in a way that is functionally equivalent to local
   folders.  IMAP4rev1 also provides the capability for an offline
   client to resynchronize with the server."

 - requires any reliable data stream, e.g. TCP  (TCP port 143)

 - too many pages to read!

Q: T/F, both POPmail and IMAP permit remote folders.
Q: Why are most advances in reading email done through changes to IMAP rather
    than changes to POP3?

=============================================================================

Internet Mail Consortium - extensive email archives by topic
  http://www.imc.org/

Current Draft Protocols - Stopping SPAM
---------------------------------------
Overview:
  http://mipassoc.org/csv/CSV-Intro-03dc.pdf

E-mail Authentication
  http://en.wikipedia.org/wiki/E-mail_authentication
   "Ensuring a valid identity on an e-mail has become a vital first
    step in stopping spam, forgery, fraud, and even more serious
    crimes. An essential second step will be ensuring the entity has a
    good reputation. Unfortunately, the Simple Mail Transfer Protocol
    (SMTP) that handles most e-mail today was designed in an era when
    users of the Internet were mostly honest techies who expected
    others to be equally honest. This article will explain how e-mail
    identities are forged and the steps that are being taken now to
    prevent it."

"Limiting Unsolicited Bulk Email (UBE)"
http://www.imc.org/imc-spam/
   "IMC's members have expressed a strong interest in helping to come
    up with solutions to the problem of unsolicited bulk email (UBE),
    better known as "spam". The use and abuse of UBE is spreading
    rapidly, and many Internet users are complaining loudly about the
    very negative effects it has on them."

Anti-Spam Recommendations for SMTP MTAs
http://tools.ietf.org/html/rfc2505
 - footnote mentions the Monty Python origin of the term "spam"
 - done at SMTP level:
   "Our basic assumption is that refuse/accept is handled at the SMTP
    layer and that an MTA that decides to refuse a message should do so
    while still in the SMTP dialogue. First, this means that we do not
    have to store a copy of a message we later decide to refuse and
    second, our responsibility for that message is low or none - since we
    have not yet read it in, we leave it to the sender to handle the
    error."

Q: Give two reasons why refusing spam during the SMTP dialog (refusing
   to accept the email) is a Good Thing.

 - suggests using 4xx temporary fail codes, but warns of a side effect:
  "However, 4xx Temporary Errors may have unexpected interaction with
   MX-records. If the receiving domain has several MX records and the
   lowest preference MX-host refuses to receive mail with a "451" Response
   Code, the sending host may choose to - and often will - use the next
   host on the MX list.  [...] Our intent was to make the offending
   mail stay at the offending sender's host and fill up his mqueue disk,
   but it all ended up at our friend, the next lowest preference MX-host."
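
The MX interaction in the quote can be simulated without any network.
This is a simplified sketch: the function deliver, the host names under
example.com and the canned reply codes are all hypothetical, and real
sending MTAs differ in how they treat each reply code.

```python
def deliver(mx_records, responses):
    """Try MX hosts in order of increasing preference number.

    mx_records: list of (preference, host) pairs
    responses:  dict mapping host -> SMTP reply code it would give
    Returns the host that accepted the message, or None if none did."""
    for _pref, host in sorted(mx_records):
        code = responses[host]
        if 200 <= code < 300:
            return host       # accepted: this host now owns the message
        if 500 <= code < 600:
            return None       # permanent failure: sender gives up
        # 4xx temporary failure: fall through and try the next MX host
    return None               # every host deferred; sender retries later


mx = [(20, "backup.example.com"), (10, "mail.example.com")]
spam_lands_on = deliver(mx, {"mail.example.com": 451,
                             "backup.example.com": 250})
```

With a 451 from the primary, the message ends up on the backup MX host,
which is exactly the outcome the RFC authors did not intend.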

Q: What is a major drawback to refusing spam using SMTP Temporary Errors?