-------------------------
Week 13 Notes for CST8165
-------------------------
-Ian! D. Allen - idallen@idallen.ca

Remember - knowing how to find out an answer is more important than
memorizing the answer.  Learn to fish!  RTFM!  (Read The Fine Manual)

-------------------
INDEX to this file:
 - Protocols - Post Office Protocol (POP)
 - Protocols - Reading Mail - Internet Message Access Protocol (IMAP)
 - Current Draft Protocols - Stopping SPAM (Part 1)
-------------------

Comment:
  $ nc -v outmail.algonquincollege.com smtp
  Connection to outmail.algonquincollege.com 25 port [tcp/smtp] succeeded!
  220 mail4.algonquincollege.com -- Server ESMTP (Sun Java System Messaging Server 6.2-7.02 (built Jun 13 2006))
  quit
  quit
  quit
  ...

  - the connection appears to hang after the banner and to ignore every
    command, because the Sun server demands CR+LF line ends, not the
    bare LF line ends sent by "nc" (the Sun server is RFC-compliant, but
    not very liberal in what it accepts!)
  - the fix is to enter ^V<CR><CR> (CTRL-V followed by pushing the
    RETURN key twice) at the end of each line:

  $ nc -v outmail.algonquincollege.com smtp
  Connection to outmail.algonquincollege.com 25 port [tcp/smtp] succeeded!
  220 mail4.algonquincollege.com -- Server ESMTP (Sun Java System Messaging Server 6.2-7.02 (built Jun 13 2006))
  quit^V^M
  221 2.3.0 Bye received. Goodbye.
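
Sketch: a client program can avoid the hang by sending CRLF itself.
The helper below is a minimal illustration (the name to_crlf is made up
for these notes, not part of any library); it converts bare LF line ends
to CRLF without doubling any CRLF that is already present.

```python
def to_crlf(data: bytes) -> bytes:
    """Convert bare LF line ends to CRLF, leaving existing CRLF alone."""
    # Collapse CRLF to LF first so an existing CRLF never becomes CR CR LF.
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


print(to_crlf(b"quit\n"))   # prints b'quit\r\n'
```

The result is what would be written to the socket in place of the raw
keyboard input that "nc" sends.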

Q: T/F, the Algonquin SMTP server violates the SMTP RFC by requiring CRLF
   on the end of each line.

Review:
  - SMTP walk-through with comments (was RFC821-based) by Dan Bernstein
  http://cr.yp.to/smtp.html

  RFC2822 - message format - http://cr.yp.to/immhf.html
   - "If you're a new implementor, you'll be shocked at how badly 822
      was designed."

  - RFC2821 standards process "incompetence" by editor Klensin
    http://cr.yp.to/smtp/klensin.html
     - group consensus about HELO/EHLO didn't make the final draft!

Q: T/F, RFC standards development has been a very organized process.

=============================================================================

Protocols - Post Office Protocol (POP)
--------------------------------------
  http://tools.ietf.org/html/rfc1939   (23 pages)
  http://tools.ietf.org/html/rfc2449   "CAPA extension"

 - version 3:  RFC 1081 -> 1225 -> 1460 -> 1725 -> 1939 
   updated by RFC 1957 (one page observation RTFM!) and 2449 (extensions)

 - specified to use TCP port 110 (Section 3)
 - POP is supposed to stay *SIMPLE* (use IMAP for everything else - Section 1)

Section 3 - Basic Operation
 - eight case-insensitive 3-4 character command keywords (section 3)
 - traditional CRLF line terminators
 - single space separators
 - arguments only up to 40 characters (!) - very short lines
 - only two status indicators: +OK and -ERR (upper case)
   - no way to distinguish between temporary and permanent failure
   - no way to distinguish "not now" from "not implemented"
 - multi-line responses terminated by a single period on a line
   - leading periods are doubled and then must be removed (like SMTP)
   - called "byte-stuffing" (Section 3)
 - a state-oriented protocol
   AUTHORIZATION -> TRANSACTION -> UPDATE
   - must authenticate before issuing transactions
   - update happens only *after* the client issues QUIT
 - MUST not time out before 10 minutes (section 3)
   - a time-out does not trigger an UPDATE - throws away updates

Q: T/F, unlike most Internet protocols, POP3 only requires LF on line ends.
Q: T/F, the POP protocol has different exit codes for temporary and
    permanent failures.
Q: How does the POP protocol handle multi-line server responses (e.g.
   when fetching a message)?
Q: What is "byte-stuffing" with respect to POP3?
Q: Name and describe what happens in each of the three states of a
   POP3 connection.  What triggers the entry into each state?
Q: T/F, if a POP3 client drops the connection, the server skips the
   UPDATE phase.

Authorization/Authentication State
 - each AUTHORIZATION method is optional, but you must use at least one (!)
 - RFC defines cleartext USER and PASS or APOP methods
 - RFC says "there is no single authentication mechanism that is required
   of all POP3 servers" (!) but Section 9 lists USER and PASS as
   "Minimal POP3 Commands", implying they are required
 - APOP uses MD5 and a shared secret
   - see p.16 - you can calculate this digest in Linux via:
    $ echo -n '<1896.697170952@dbc.mtview.ca.us>tanstaaf' | md5sum
    c4c9334bac560ecc979e58001b3e22fb  -
 - neither USER/PASS nor APOP encrypt the full connection...
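
Sketch: the same APOP digest computed in Python with the standard
hashlib module. The function name apop_digest is made up for these
notes; the timestamp and secret are the ones from the md5sum example
above.

```python
import hashlib


def apop_digest(timestamp: str, secret: str) -> str:
    """Return the APOP digest as lowercase hex."""
    # MD5 over the banner timestamp (angle brackets included) followed
    # immediately by the shared secret, with nothing in between.
    return hashlib.md5((timestamp + secret).encode("ascii")).hexdigest()


print(apop_digest("<1896.697170952@dbc.mtview.ca.us>", "tanstaaf"))
```

The digest goes on the wire; the shared secret itself never does.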

Q: T/F, the USER and PASS POP commands set up an encrypted connection.

  http://tools.ietf.org/html/rfc1734 - POP3 AUTH command
    "the client may request authentication types in decreasing order of
     preference, with the USER/PASS or APOP command as a last resort."  (p.2)

    "A protection mechanism provides integrity and privacy protection
     to the protocol session.  If a protection mechanism is negotiated,
     it is applied to all subsequent data sent over the connection.
     The protection mechanism takes effect immediately following the CRLF
     that concludes the authentication exchange for the client, and the
     CRLF of the positive response for the server.  Once the protection
     mechanism is in effect, the stream of command and response octets is
     processed into buffers of ciphertext.  Each buffer is transferred
     over the connection as a stream of octets prepended with a four
     octet field in network byte order that represents the length of
     the following data." (p.2)
 - QUIT is also allowed in Authorization State (Section 4 p.5)
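
Sketch: the length-prefixed buffers described in the quote above. Only
the framing is shown; the encryption itself depends on the negotiated
mechanism and is not modelled here. The names frame and unframe are
made up for these notes.

```python
import struct


def frame(ciphertext: bytes) -> bytes:
    """Prepend a four-octet length in network (big-endian) byte order."""
    return struct.pack("!I", len(ciphertext)) + ciphertext


def unframe(stream: bytes):
    """Split a run of concatenated frames back into ciphertext buffers."""
    buffers = []
    pos = 0
    while pos < len(stream):
        if pos + 4 > len(stream):
            raise ValueError("truncated length field")
        (length,) = struct.unpack_from("!I", stream, pos)
        pos += 4
        if pos + length > len(stream):
            raise ValueError("truncated buffer")
        buffers.append(stream[pos:pos + length])
        pos += length
    return buffers
```

Once protection is on, the receiver can no longer read the stream line
by line; it has to read a length, then exactly that many octets.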

Q: How does POP3 "protection" affect data transfer between client and server?

SASL: Simple Authentication and Security Layer
 - usable via the CAPA extension http://tools.ietf.org/html/rfc2449 (19 pages)
 - see also: SASL use in SMTP http://tools.ietf.org/html/rfc2554 (11 pages)

Transaction State
 - Must handle: STAT, LIST, RETR, DELE, NOOP, RSET, QUIT
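
Sketch: parsing the two POP3 status indicators and a STAT reply. The
function names are made up for these notes; the sample reply
"+OK 2 320" follows the STAT format in RFC 1939 (message count, then
maildrop size in octets).

```python
def parse_status(line: bytes):
    """Split a POP3 status line into (ok, text)."""
    line = line.rstrip(b"\r\n")
    if line.startswith(b"+OK"):
        return True, line[3:].strip()
    if line.startswith(b"-ERR"):
        return False, line[4:].strip()
    raise ValueError("not a POP3 status line: %r" % line)


def parse_stat(line: bytes):
    """Return (message_count, total_octets) from a STAT reply."""
    ok, text = parse_status(line)
    if not ok:
        raise RuntimeError(text.decode("ascii", "replace"))
    count, octets = text.split()[:2]
    return int(count), int(octets)


print(parse_stat(b"+OK 2 320\r\n"))   # prints (2, 320)
```

Note that -ERR carries only free text; the client cannot tell a
temporary failure from a permanent one.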

Update State (can only be entered from Transaction State)
 - entered *only* via QUIT, never by hangup or disconnect
 - no commands
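
Sketch: the three POP3 states as a tiny state machine. This is a
teaching model only; the class and method names are made up for these
notes. It shows that DELE only marks messages, that QUIT from the
Transaction State triggers the UPDATE, and that a lost connection
discards the marks.

```python
AUTHORIZATION = "AUTHORIZATION"
TRANSACTION = "TRANSACTION"
UPDATE = "UPDATE"
CLOSED = "CLOSED"


class Pop3Session:
    def __init__(self):
        self.state = AUTHORIZATION
        self.marked = set()     # message numbers marked by DELE
        self.removed = set()    # messages actually removed in UPDATE

    def login_ok(self):
        # Successful authentication moves the session forward.
        if self.state != AUTHORIZATION:
            raise RuntimeError("already authenticated")
        self.state = TRANSACTION

    def dele(self, number):
        if self.state != TRANSACTION:
            raise RuntimeError("DELE is only valid in TRANSACTION state")
        self.marked.add(number)

    def quit(self):
        # QUIT from TRANSACTION enters UPDATE and applies the deletions;
        # QUIT from AUTHORIZATION just closes the connection.
        if self.state == TRANSACTION:
            self.state = UPDATE
            self.removed = set(self.marked)
        self.state = CLOSED

    def connection_lost(self):
        # Hangup or time-out: no UPDATE, so the marks are thrown away.
        self.marked.clear()
        self.state = CLOSED
```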

Section 8: Scaling and Operational Considerations
 - people using POP stores as permanent message archives
  "When these facilities are used in this way by casual users, there has
   been a tendency for already-read messages to accumulate on the server
   without bound.  This is clearly an undesirable behavior pattern from
   the standpoint of the server operator.  This situation is aggravated
   by the fact that the limited capabilities of the POP3 do not permit
   efficient handling of maildrops which have hundreds or thousands of
   messages."

Q: T/F, POPmail scales well to handle hundreds or thousands of messages.

Section 11: Message Format
  "It is important to note that the octet count for a message on
   the server host may differ from the octet count assigned to that
   message due to local conventions for designating end-of-line."
   - the size of the message in the file system may not match the size
     transmitted over the wire (especially for Unix/Linux systems)
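
Sketch: why the two sizes differ on Unix/Linux. The sketch assumes the
message is stored with bare LF line ends only; the function name
wire_octets is made up for these notes. Each stored LF becomes CRLF on
the wire, so the size reported to the client is one octet larger per
line than the file size.

```python
def wire_octets(stored: bytes) -> int:
    """Octet count once every bare LF line end is counted as CRLF."""
    # Assumption: 'stored' contains no CR characters at all.
    return len(stored) + stored.count(b"\n")


stored = b"Subject: hi\n\nbody\n"
print(len(stored), wire_octets(stored))   # prints 18 21
```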

Linux Lab work (only works with on-campus access to 10.50.254.230):
  See RFC Section 10: Example POP3 Session

 - Send email to abcd0001@localhost.localdomain using SMTP server 10.50.254.230
   where abcd0001 is replaced by your Algonquin student userid.
   e.g.  $ ./smtpclient.pl -to alleni99@localhost.localdomain -from \
                idallen@idallen.ca -smtpserver 10.50.254.230 -port 25
 - fetch and delete the email using "nc" to the POP3 TCP port.
   - this POP3 server is liberal in accepting LF line ends!
   - login with your Algonquin userid
   - your password is the letter C followed by the last 7 digits of
     your Algonquin student number

Q: Give the minimal set of POP3 commands needed to retrieve and delete
   one message on a POP3 server.

=============================================================================

Protocols - Reading Mail - Internet Message Access Protocol (IMAP)
------------------------------------------------------------------
  http://tools.ietf.org/html/rfc3501   (108 pages)

  - RFC 1730 -> 2060 -> 3501
    updated by RFC 4466 (collected extensions)
    updated by RFC 4468 (CATENATE extension)
    updated by RFC 4551 (conditional STORE, etc.)

  "The Internet Message Access Protocol, Version 4rev1 (IMAP4rev1)
   allows a client to access and manipulate electronic mail messages on
   a server.  IMAP4rev1 permits manipulation of mailboxes (remote
   message folders) in a way that is functionally equivalent to local
   folders.  IMAP4rev1 also provides the capability for an offline
   client to resynchronize with the server."

 - requires any reliable data stream, e.g. TCP  (TCP port 143)

 - too many pages to read!

Q: T/F, both POPmail and IMAP permit remote folders.
Q: Why are most advances in reading email done through changes to IMAP rather
    than changes to POP3?

=============================================================================

Internet Mail Consortium - extensive email archives by topic
  http://www.imc.org/

Current Draft Protocols - Stopping SPAM (Part 1)
------------------------------------------------
Overview:
  http://mipassoc.org/csv/CSV-Intro-03dc.pdf

E-mail Authentication
  http://en.wikipedia.org/wiki/E-mail_authentication
   "Ensuring a valid identity on an e-mail has become a vital first
    step in stopping spam, forgery, fraud, and even more serious
    crimes. An essential second step will be ensuring the entity has a
    good reputation. Unfortunately, the Simple Mail Transfer Protocol
    (SMTP) that handles most e-mail today was designed in an era when
    users of the Internet were mostly honest techies who expected
    others to be equally honest. This article will explain how e-mail
    identities are forged and the steps that are being taken now to
    prevent it."

"Limiting Unsolicited Bulk Email (UBE)"
http://www.imc.org/imc-spam/
   "IMC's members have expressed a strong interest in helping to come
    up with solutions to the problem of unsolicited bulk email (UBE),
    better known as "spam". The use and abuse of UBE is spreading
    rapidly, and many Internet users are complaining loudly about the
    very negative effects it has on them."

Anti-Spam Recommendations for SMTP MTAs
http://tools.ietf.org/html/rfc2505
 - footnote mentions the Monty Python origin of the term "spam"
 - done at SMTP level:
   "Our basic assumption is that refuse/accept is handled at the SMTP
    layer and that an MTA that decides to refuse a message should do so
    while still in the SMTP dialogue. First, this means that we do not
    have to store a copy of a message we later decide to refuse and
    second, our responsibility for that message is low or none - since we
    have not yet read it in, we leave it to the sender to handle the
    error."

Q: Give two reasons why refusing spam during the SMTP dialog (refusing
   to accept the email) is a Good Thing.

 - suggests using 4xx temporary fail codes, but warns:
  "However, 4xx Temporary Errors may have unexpected interaction with
   MX-records. If the receiving domain has several MX records and the
   lowest preference MX-host refuses to receive mail with a "451" Response
   Code, the sending host may choose to - and often will - use the next
   host on the MX list.  [...] Our intent was to make the offending
   mail stay at the offending sender's host and fill up his mqueue disk,
   but it all ended up at our friend, the next lowest preference MX-host."
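
Sketch: the MX fallback behaviour described in the quote above. This is
a simulation, not a real mailer; the function name, host names and reply
codes are made up for these notes. The sender tries MX hosts in
ascending preference number and moves on whenever it gets a 4xx reply.

```python
def deliver(mx_records, try_host):
    """Return (host, code) for the first MX host that does not defer.

    mx_records: list of (preference, host) pairs.
    try_host:   callable taking a host name and returning its SMTP
                reply code as an int.
    """
    for preference, host in sorted(mx_records):
        code = try_host(host)
        if 400 <= code < 500:
            continue            # temporary failure: try the next MX host
        return host, code
    return None                 # every host deferred; mail stays queued


mx = [(20, "backup.example.org"), (10, "primary.example.org")]
replies = {"primary.example.org": 451, "backup.example.org": 250}
print(deliver(mx, replies.get))   # prints ('backup.example.org', 250)
```

The 451 from the primary does not keep the spam on the sender's disk;
it simply lands on the backup MX host instead.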

Q: What is a major drawback to refusing spam using SMTP Temporary Errors?

"A Set of Guidelines for Mass Unsolicited Mailings and Postings (spam*)"
http://www.ietf.org/html/rfc2635
 - Section 2 documents the Monty Python origin of the term "spam"

"No anti-UBM measure for SMTP-based Internet mail works"
  http://homepages.tesco.net/~J.deBoynePollard/FGA/smtp-anti-ubm-dont-work.html

History of anti-spam (Sep 04):
  http://www.circleid.com/posts/sender_id_a_tale_of_open_standards_and_corporate_greed_part_i/
  http://www.circleid.com/posts/sender_id_a_tale_of_open_standards_and_corporate_greed_part_ii/

Internet Architecture Board (IAB)
  Internet Research Task Force (IRTF)
    Anti-Spam Research Group (ASRG) 
 - chartered in March 2003
 - the chairs of the ASRG were reluctant to send the idea over to the
   IETF for standardization until it could be better determined that the
   idea actually had merit.
 - Meng Wong, CTO of PoBox.com, forks SPF in Summer 2003
 - ASRG started a dedicated subgroup (LMAP) to merge all the varied proposals
   - failed: trying to do engineering (IETF) instead of research (IRTF)
 - IETF created MARID - MTA Authorization Records In DNS
 - Eventually the SPF and Caller-ID proposals merged in May 2004, and
   the combination became known as Sender-ID.
 - Microsoft then revealed it applied for patents on the technology
   - SPF authors considered filing a defensive patent
   "Microsoft is claiming IPR in proposals that may very well not even be
    theirs, which evolved in an open discussion, is asking for a restrictive
    license, and refusing to consider the market truth - that most of email
    server software is FOSS and might not be able to use this standard."
   - http://new.openspf.org/blobs/spf-community-position

IETF MARID group - "MTA Authorization Records In DNS"
  http://en.wikipedia.org/wiki/MARID
  http://www.imc.org/ietf-mxcomp/  mailing list

  http://tools.ietf.org/html/rfc4408 Apr06 experimental
  http://new.openspf.org/RFC_4408/Errata
  http://new.openspf.org/Specifications  (history chart of RFC)
  http://tools.ietf.org/html/draft-ellermann-spf-options-01
  - SPF Version 1
  - applies to the MAIL FROM and HELO SMTP identities only
  - does not try to parse headers in message body
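
Sketch: splitting an SPF version 1 record into its mechanisms. This is
a simplified reading of the RFC 4408 record syntax, not a validator: it
does no DNS lookups, no macro expansion, and simply skips modifiers such
as redirect=. The function name parse_spf is made up for these notes;
the sample record is of the kind shown in RFC 4408.

```python
import re

# Optional qualifier, a name, then whatever follows the name.
TERM = re.compile(r"^([+\-~?]?)([A-Za-z][A-Za-z0-9._-]*)(.*)$")


def parse_spf(record: str):
    """Return a list of (qualifier, mechanism) pairs from an SPF record."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        raise ValueError("not an SPF version 1 record")
    mechanisms = []
    for term in terms[1:]:
        match = TERM.match(term)
        if match is None:
            raise ValueError("cannot parse term: %r" % term)
        qualifier, name, rest = match.groups()
        if rest.startswith("="):
            continue                      # a modifier (name=value): skipped
        # A missing qualifier means "+" (pass).
        mechanisms.append((qualifier or "+", name + rest))
    return mechanisms


print(parse_spf("v=spf1 +mx a:colo.example.com/28 -all"))
```

The receiving MTA evaluates the mechanisms left to right against the
connecting IP address for the MAIL FROM or HELO domain.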

  http://tools.ietf.org/html/rfc4406 Apr06 experimental
   - Sender ID (spf2.0) - Microsoft merger of SPF and Caller ID
   - same idea as other RFC 2822 layer protocols like DomainKeys
     Identified Mail (DKIM)
   - uses PRA - parses RFC2822 headers inside the message data

  http://tools.ietf.org/html/rfc4405 Apr06 experimental
   - "Responsible Submitter" SMTP EHLO extension "SUBMITTER"

  http://tools.ietf.org/html/rfc4407 Apr06 experimental
   - Purported Responsible Address (PRA)
   - used by Sender ID
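
Sketch: a simplified version of the PRA header selection. It assumes the
header precedence Resent-Sender, Resent-From, Sender, From and takes the
first non-empty address it finds. RFC 4407 has additional rules (for
example about Received headers between Resent- fields, and about
malformed messages) that this sketch leaves out, so treat it as an
illustration only. The function name is made up for these notes.

```python
from email import message_from_string
from email.utils import parseaddr

# Assumed precedence order; see the caveats in the text above.
PRA_HEADERS = ("Resent-Sender", "Resent-From", "Sender", "From")


def purported_responsible_address(raw_message: str):
    """Return the first non-empty address found in PRA_HEADERS order."""
    msg = message_from_string(raw_message)
    for name in PRA_HEADERS:
        for value in msg.get_all(name, []):
            address = parseaddr(value)[1]
            if address:
                return address
    return None   # no usable header: the message has no PRA


raw = (
    "From: Alice <alice@example.com>\n"
    "Sender: list-owner@example.org\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
print(purported_responsible_address(raw))   # prints list-owner@example.org
```

Unlike SPF version 1, this has to read the message headers, so the
check cannot happen before the SMTP DATA command.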

  - no mention of CSV Certified Server Validation in standards

- MARID shut down September 2004 (7 months!)
  http://www.imc.org/ietf-mxcomp/mail-archive/msg05054.html
   "Concluding a group without it having achieved its goals is never
    a pleasant prospect, and it is always tempting to believe that
    just a small amount of additional time and energy will cause
    consensus to emerge. After careful consideration, however, the
    working group chairs and area advisor have concluded that such
    energy would be better spent on gathering deployment experience."

- comments on MARID failure
  http://www.imc.org/ietf-mxcomp/mail-archive/msg05091.html
   "We need to recognize that the old design philosophy in SMTP 2821,
    a "relaxed internet spirit required for wide deployment with less
    emphasis with security" no longer applies today."

  http://www.imc.org/ietf-mxcomp/mail-archive/msg05055.html
   "Secondly, the co-chairs/AD allowed this working group to try and
    create a standard, instead of standardizing existing practices.
    It is far easier to reach a rough consensus on what people *ARE*
    doing than what people *SHOULD BE* doing.  Even if you don't like
    what people are doing, it is very useful to give clear descriptions
    of what is being done."

  http://new.openspf.org/Press_Release/2005-03-23
  - SPF rejects co-opting efforts by Microsoft Sender ID - March 2005

(continued next week...)