-------------------------
Week 13 Notes for CST8165
-------------------------
-Ian! D. Allen - idallen@idallen.ca - www.idallen.com

Remember - knowing how to find out an answer is more important than
memorizing the answer.  Learn to fish!  RTFM!  (Read The Fine Manual)

Keep up on your readings (Course Outline: average 4 hours/week homework)

=============================================================================

Level 6 Students April 9 deadline:

    Mxi Technologies Ltd.

    Mxi Technologies delivers meaningful improvements in labor
    productivity through integrated, intelligent aviation maintenance
    software. Dan Bourguigon, Sr. Customer Support Analyst, and Karen
    Turner, Talent Acquisition Specialist of Mxi Technologies Ltd. will
    be giving a presentation in T327 on:

       Wednesday, April 9th, 2008
       12:00--14:00
       T327

    Don't miss your chance to hear about Mxi Technologies and
    possibilities of employment with them.  Pre-registration is required.

    To receive your ticket please see Liz Hobbs in T307 before the event
    on Wednesday, April 9.

Liz Hobbs, Clerk Academic, 613-727-4723  ext 7686, hobbse@algonquincollege.com

=============================================================================

DNS (continued)

Resource Records (RRs): Type, Class, TTL, RData
-----------------------------------------------
  http://www.dns.net/dnsrd/rr.html

- resource records (RRs) of various types are stored in Name Servers
- most common look-up is for A (address) records - a "forward DNS look-up"
- a "reverse DNS look-up" turns an IP into a domain name via PTR records

Resource Record (RR) format:
  owner TTL class type RDATA

Example dump of an MX Resource Record:

  $ host -d -t mx idallen.ca.
  idallen.ca.             300     IN      MX      0 idallen.ca.
  ^-owner                 ^-TTL   ^-class ^-type  ^-RDATA

Owner: the actual domain or item being looked up, e.g. idallen.ca

TTL: "Time To Live" of this record (expiry and time-out)
     Every record has an individual time-out, after which the record data
     expires from the cache and must be fetched again from another name server.
     You can set the TTL very small for a device that changes address often.
     The TTL on the root name servers is huge.

Classes of records:
 - IN  (Internet system)
 - CH  (CHAOS system)
 - only the IN class is important in our Internet

Many record Types:  (see http://www.dns.net/dnsrd/rr.html )
 - A
 - CNAME
 - HINFO
 - MX
 - NS
 - PTR
 - SOA
 - TXT (see rfc1035)
 - SRV (not in rfc1034)
 - AAAA (not in rfc1034)
 - A6 (not in rfc1034)

  For this course, know these DNS record types:
    Q: What data is contained in a DNS type "A" record?
    Q: What data is contained in a DNS type "MX" record?
    Q: What data is contained in a DNS type "NS" record?
    Q: What data is contained in a DNS type "PTR" record?
    Q: Which is the preferred IPV6 record type?

RData - various types of data, depending on Type (see RFC p.13)
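
The five fields of a dumped RR can be pulled apart with a few lines of
Python.  This is only a sketch: the names ResourceRecord and parse_rr are
made up for illustration, and it assumes the one-line "owner TTL class
type RDATA" layout shown above with no fields omitted.

```python
from typing import NamedTuple


class ResourceRecord(NamedTuple):
    owner: str
    ttl: int
    rr_class: str
    rr_type: str
    rdata: str


def parse_rr(line: str) -> ResourceRecord:
    """Split one dumped RR line into owner, TTL, class, type and RDATA."""
    # The first four fields are separated by whitespace; everything after
    # the type is RDATA, whose layout depends on the record type.
    owner, ttl, rr_class, rr_type, rdata = line.split(None, 4)
    return ResourceRecord(owner, int(ttl), rr_class, rr_type, rdata.strip())


rr = parse_rr("idallen.ca.             300     IN      MX      0 idallen.ca.")
print(rr.owner, rr.ttl, rr.rr_class, rr.rr_type, rr.rdata)
```

For the MX record above, the RDATA field comes back as "0 idallen.ca."
(a preference number followed by a mail server name).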

Querying DNS for a specific record type using "host" or "dig":

    $ host -t a idallen.ca.
    idallen.ca has address 72.18.159.15
    $ host -t mx idallen.ca.
    idallen.ca mail is handled by 0 idallen.ca.
    $ host -t ns idallen.ca.
    idallen.ca name server ns2.totalchoicehosting.com.
    idallen.ca name server ns1.totalchoicehosting.com.

    # dig idallen.ca mx
    [...]
    ;; ANSWER SECTION:
    idallen.ca.         224     IN      MX      0 idallen.ca.

Q: use a Unix command to query a DNS server for A, MX, NS, PTR records

How do DNS lookups happen on Unix?
----------------------------------

Programs look up names through a "resolver" library.  On Unix/Linux, it starts with this file:

  $ cat /etc/resolv.conf 
  domain somedomain.ca
  nameserver 0.0.0.0
  nameserver 192.168.0.1
  nameserver 192.168.0.2

The /etc/resolv.conf file contains the default domain (the one that is
tacked onto the end of relative domain names) and the IP addresses of
name servers in which domain names can be looked up.  (0.0.0.0 is the
address of the local machine.)
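
As a rough illustration of what the resolver library reads from this
file, here is a Python sketch that extracts the default domain and the
name server list.  The function name parse_resolv_conf is hypothetical,
and the sketch handles only the "domain" and "nameserver" keywords shown
above (a real resolver understands more options).

```python
SAMPLE = """\
domain somedomain.ca
nameserver 0.0.0.0
nameserver 192.168.0.1
nameserver 192.168.0.2
"""


def parse_resolv_conf(text):
    """Return (default_domain, [name server IP strings]) from the file text."""
    domain = None
    nameservers = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and comment lines starting with '#' or ';'.
        if not fields or fields[0][0] in "#;":
            continue
        if fields[0] == "domain" and len(fields) > 1:
            domain = fields[1]
        elif fields[0] == "nameserver" and len(fields) > 1:
            nameservers.append(fields[1])
    return domain, nameservers


domain, servers = parse_resolv_conf(SAMPLE)
print(domain, servers)
```

The name servers are kept in file order, which is the order in which
they are tried.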

  $ host idallen.ca.
    idallen.ca has address 72.18.159.15              # "A" record
    idallen.ca mail is handled by 0 idallen.ca.      # "MX" record

  $ dig idallen.ca.
    ; <<>> DiG 9.3.2 <<>> idallen.ca.
    ;; global options:  printcmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 11389
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

    ;; QUESTION SECTION:
    ;idallen.ca.                    IN      A

    ;; ANSWER SECTION:
    idallen.ca.             14373   IN      A       72.18.159.15

    ;; Query time: 1 msec
    ;; SERVER: 205.211.30.21#53(205.211.30.21)
    ;; WHEN: Tue Nov 28 09:38:30 2006
    ;; MSG SIZE  rcvd: 44

Queries for PTR records are reversed and handled automatically by "host":

  $ host 72.18.159.15
  15.159.18.72.in-addr.arpa domain name pointer server320.tchmachines.com.

  $ host -t a 15.159.18.72.in-addr.arpa
  15.159.18.72.in-addr.arpa has no A record

  $ host -t ptr 15.159.18.72.in-addr.arpa
  15.159.18.72.in-addr.arpa domain name pointer server320.tchmachines.com.
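
The reversal that "host" does for you is simple enough to write out.
A minimal Python sketch (the function name reverse_name is made up here;
it handles IPv4 dotted-quad addresses only):

```python
import ipaddress


def reverse_name(ipv4: str) -> str:
    """Build the in-addr.arpa owner name used for a PTR look-up."""
    octets = ipv4.split(".")
    return ".".join(reversed(octets)) + ".in-addr.arpa"


name = reverse_name("72.18.159.15")
print(name)

# The Python standard library builds the same name:
assert name == ipaddress.ip_address("72.18.159.15").reverse_pointer
```

The octets are reversed so that the most general part of the address
sits on the right, next to in-addr.arpa, just as the most general part
of a domain name sits on the right.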

Q: what file is used by the Unix resolver library to configure DNS searches?
Q: why aren't name servers specified by name in /etc/resolv.conf ?

Tracing a query
---------------

Here is a trace of an iterative lookup for the A record for "www.idallen.ca.":
- name www.idallen.ca. is:
  - the name "www"
    - in subdomain .idallen
      - in subdomain .ca
        - in the ROOT domain (".")

Steps:
  1. Locate the IP addresses of the ROOT name servers (the NS records)
     (this info may be compiled in, or kept cached in a local file)
  2. Query some root name server for the .ca domain NS name server IP addrs.
  3. Query some .ca name server for the .idallen domain NS name server addrs.
  4. Query some .idallen name server for the A record IP address of "www".
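
The steps above walk the name from right to left.  This Python sketch
lists the zones whose name servers get asked, in order; it does no
network traffic at all, and the function name delegation_chain is
invented for illustration.  It assumes every label is its own zone, as
in the www.idallen.ca. example.

```python
def delegation_chain(fqdn: str) -> list:
    """List the zones queried, root first, for an iterative look-up."""
    labels = fqdn.rstrip(".").split(".")
    chain = ["."]                      # steps 1 and 2: start at the ROOT
    # Add one label at a time from the right, stopping before the
    # left-most label (the host name itself, e.g. "www").
    for i in range(len(labels) - 1, 0, -1):
        chain.append(".".join(labels[i:]) + ".")
    return chain


zones = delegation_chain("www.idallen.ca.")
print(zones)
```

For www.idallen.ca. the list is the ROOT, then ca., then idallen.ca.;
the final A record query goes to a name server for the last zone.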

Q: Given the domain www.idallen.ca, list the steps of an iterative DNS
   query that would resolve this domain to its IP address.

An example of a command that can do an iterative query:

    $ dig +trace www.idallen.ca.    (may not work at Algonquin due to blocking)
    ; <<>> DiG 9.3.1 <<>> +trace www.idallen.ca.
    ;; global options:  printcmd

*** 1. locate addresses of root NS ***
    .                       3600000 IN      NS      F.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      G.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      H.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      I.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      J.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      K.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      L.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      M.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      A.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      B.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      C.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      D.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      E.ROOT-SERVERS.NET.
    ;; Received 436 bytes from 127.0.0.1#53(127.0.0.1) in 1 ms

*** 2. locate addresses of .ca NS ***
    ca.                     172800  IN      NS      CA04.CIRA.ca.
    ca.                     172800  IN      NS      CA05.CIRA.ca.
    ca.                     172800  IN      NS      CA06.CIRA.ca.
    ca.                     172800  IN      NS      NS-EXT.ISC.ORG.
    ca.                     172800  IN      NS      CA01.CIRA.ca.
    ca.                     172800  IN      NS      CA02.CIRA.ca.
    ;; Received 284 bytes from 192.112.36.4#53(G.ROOT-SERVERS.NET) in 43 ms

*** 3. locate addresses of .idallen NS ***
    idallen.ca.             86400   IN      NS      ns2.totalchoicehosting.com.
    idallen.ca.             86400   IN      NS      ns1.totalchoicehosting.com.
    ;; Received 90 bytes from 192.228.28.9#53(CA04.CIRA.ca) in 80 ms

*** 4. look up A record for name "www" ***
    www.idallen.ca.         14400   IN      CNAME   idallen.ca.
    idallen.ca.             14400   IN      A       72.18.159.15
    idallen.ca.             86400   IN      NS      ns2.totalchoicehosting.com.
    idallen.ca.             86400   IN      NS      ns1.totalchoicehosting.com.
    ;; Received 136 bytes from 65.254.32.122#53(ns2.totalchoicehosting.com) in 43 ms

Since most DNS traffic is UDP, replies are designed to fit in one single
UDP packet (traditionally at most 512 bytes of DNS message).  (Full zone
transfers will use TCP.)  Only 13 ROOT name servers (A through M) exist
because the list of root NS records, with their addresses, has to fit
in one such packet.
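
The size of the root NS reply can be added up by hand.  The Python
sketch below does the arithmetic under stated assumptions: the reply
holds 13 NS records plus 13 A records, every server is named
X.ROOT-SERVERS.NET, and standard DNS name compression (2-byte pointers)
is used.  The constant and function names are our own.

```python
HEADER = 12                  # fixed-size DNS message header
QUESTION = 1 + 2 + 2         # root name (one zero byte) + QTYPE + QCLASS
RR_FIXED = 2 + 2 + 4 + 2     # TYPE + CLASS + TTL + RDLENGTH
POINTER = 2                  # compression pointer to an earlier name


def encoded_name_len(name: str) -> int:
    """Uncompressed length: a length byte per label plus a final zero."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    return sum(1 + len(label) for label in labels) + 1


def root_reply_size(servers: int) -> int:
    """Bytes in a reply listing `servers` root NS records and A records."""
    first = encoded_name_len("A.ROOT-SERVERS.NET.")   # written out in full
    later = 1 + 1 + POINTER      # one-letter label + pointer to the rest
    # Each NS record: owner "." (one byte) + fixed fields + server name.
    ns = servers * (1 + RR_FIXED) + first + (servers - 1) * later
    # Each A record: pointer to the server name + fixed fields + 4-byte IP.
    a = servers * (POINTER + RR_FIXED + 4)
    return HEADER + QUESTION + ns + a


size = root_reply_size(13)
print(size, size <= 512)
```

With 13 servers the total is 436 bytes, the same number "dig" reported
for the root NS reply in the trace above, and under the 512-byte limit.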

Q: Why aren't there more than 13 ROOT name servers?
Q: T/F, most DNS traffic uses UDP.
Q: Trace the DNS lookups needed to resolve www.idallen.com., starting
   with the root name servers.

Configuring Name Servers
------------------------

How do we get the address of the root name servers "."?

Unix/Linux keeps a copy of the ROOT server IP addresses in a local file.
That file name (usually "named.ca") is usually mentioned in the named.conf
file under "hints".  The BIND name server "named" also has a copy compiled
in (of course, the compiled-in copy may be outdated).

Q: How does a Unix/Linux system know the addresses of the ROOT name
   servers, to start an iterative DNS query?

The root server addresses are often kept in the Unix/Linux file /var/named/named.ca
 - unreadable in Linux lab, sorry
 - Use "dig @A.ROOT-SERVERS.NET . ns" to update this file if it's outdated.
 - but not at Algonquin (blocked, sorry again)

Unix/Linux DNS server package name is "BIND" - Berkeley Internet Name Domain
 - actual executable program name is "named", which is a bit confusing
 - see /etc/named.conf for the location of the "." domain "hints"

Q: What is the package name of the common Unix DNS server?
Q: What is the name server executable program name in that package?

You only need to find one working ROOT server, at which point you can
use it to find the current addresses of the rest.

Below is the config file for "named" from Linux Fedora Core 5.
Note the "type hint" file named.ca containing the ROOT name server info.
---------------------------------------------------------------------------
//
// named.conf for Red Hat caching-nameserver 
//

options {
        directory "/var/named";
        dump-file "/var/named/data/cache_dump.db";
        statistics-file "/var/named/data/named_stats.txt";
        /*
         * If there is a firewall between you and nameservers you want
         * to talk to, you might need to uncomment the query-source
         * directive below.  Previous versions of BIND always asked
         * questions using port 53, but BIND 8.1 uses an unprivileged
         * port by default.
         */
         // query-source address * port 53;
};

// 
// a caching only nameserver config
// 
controls {
        inet 127.0.0.1 allow { localhost; } keys { rndckey; };
};

zone "." IN {
        type hint;
        file "named.ca";
};

zone "localdomain" IN {
        type master;
        file "localdomain.zone";
        allow-update { none; };
};

zone "localhost" IN {
        type master;
        file "localhost.zone";
        allow-update { none; };
};

zone "0.0.127.in-addr.arpa" IN {
        type master;
        file "named.local";
        allow-update { none; };
};

zone "0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.ip6.arpa" IN
 {
        type master;
        file "named.ip6.local";
        allow-update { none; };
};

zone "255.in-addr.arpa" IN {
        type master;
        file "named.broadcast";
        allow-update { none; };
};

zone "0.in-addr.arpa" IN {
        type master;
        file "named.zero";
        allow-update { none; };
};
---------------------------------------------------------------------------

Finding domain owners - whois
-----------------------------

    $ whois climatechaos.ca
    Status:         EXIST                                             
    Registrar:      Tucows.com Co.                                    
    Registrar-no:   156                                               
    Registrant-no:  1599146                                           
    Domaine-no:     1599146                                           
    Subdomain:      climatechaos.ca                                   
    Renewal-Date:   2008/08/01                                        
    Date-Approved:  2006/08/01                                        
    Date-Modified:  2007/09/05                                        
    Organization:   ACT for the Earth                                 
    Description:    Climatechaos.ca is the website of Canada's emerging climate crisis coalitions.
    Admin-Name:     Dylan Penner                                      
    Admin-Title:                                                      
    Admin-Postal:   ACT for the Earth                                 
                    LWR-238 Queen St. West                            
                    Toronto ON M5V 1Z7 Canada                         
    Admin-Phone:    647-436-6398                                      
    Admin-Fax:                                                        
    Admin-Mailbox:  campaigns@actfortheearth.org                      
    Tech-Name:      Dylan Penner                                      
    Tech-Title:                                                       
    Tech-Postal:    ACT for the Earth                                 
                    238 Queen St. West                                
                    Toronto ON M5V 1Z7 Canada                         
    Tech-Phone:     647-436-6398                                      
    Tech-Fax:                                                         
    Tech-Mailbox:   campaigns@actfortheearth.org                      
    NS1-Hostname:   cns1.look.ca                                      
    NS2-Hostname:   cns2.look.ca                     

Q: What Unix/Linux command can find the owner/registrar of a domain name?

Resources
  http://directory.google.com/Top/Computers/Internet/Protocols/DNS/
  http://www.root-servers.org/
    - Ottawa has a copy of "F"

Probes and Tools
  http://www.dnsreport.com/
  http://www.dnsstuff.com/

Software
  http://www.isc.org/index.pl?/sw/bind/
  http://www.dns.net/dnsrd/

Problems
  http://en.wikipedia.org/wiki/Site_Finder  (3 weeks in September 2003)
  http://en.wikipedia.org/wiki/DNS_cache_poisoning

------------------------------------------------------------------------------

State of IPV6 and DNS
---------------------
  http://tools.ietf.org/html/rfc3363
  "Working group consensus as perceived by the chairs of the DNSEXT and
   NGTRANS working groups is that:
   a) AAAA records are preferable at the moment for production
      deployment of IPv6, and
   b) that A6 records have interesting properties that need to be better
      understood before deployment.
   c) It is not known if the benefits of A6 outweigh the costs and risks.

  "Thus, we are forced to conclude that indiscriminate use of long A6
   chains is likely to lead to increased user frustration."

Q: Which DNS record type is currently preferred for IPV6 addresses?
Q: How many bits are in an IPV6 address?

-----------------------------------------------------------------------------
see Notes: Mail Systems Terminology - mail_systems_terms.txt
-----------------------------------------------------------------------------

Sending electronic mail: SMTP
-----------------------------
  http://tools.ietf.org/html/rfc2821

- Remember: The protocol and ports used to send email (SMTP) are completely
  separate from the ports and protocols used to fetch email (POP3, IMAP)!

SMTP - Simple Mail Transfer Protocol - RFC821 -> RFC2821
 - April 2001 - 79 pages on top of TCP (95 pages) on top of IP (45 pages)
 - a "PUSH" protocol - sender initiates  (HTTP is "PULL" protocol)
 - http://tools.ietf.org/html/rfc2821
   "This document is a self-contained specification of the basic protocol
    for the Internet electronic mail transport.  It consolidates, updates
    and clarifies, but doesn't add new or change existing functionality
    of the following: RFC821, DNS, RFC1123"
 - did not add to or change RFC821; dropped obsolete items

Q: T/F RFC2821 replaced RFC821 and added new SMTP functionality

Algonquin SMTP server
---------------------

Algonquin network restrictions prevent access to other SMTP servers from
on campus.  You must connect to the Algonquin SMTP server to send email.
In strict conformance with RFC 2821, the Algonquin SMTP server accepts
only CR+LF line ends - you have to type ^V^M^M (CTRL-V RETURN RETURN)
at the end of every line to make it work.

  $ nc -v outmail.algonquincollege.com smtp
  Connection to outmail.algonquincollege.com 25 port [tcp/smtp] succeeded!
  220 mail4.algonquincollege.com -- Server ESMTP (Sun Java System Messaging Server 6.2-7.02 (built Jun 13 2006))
  quit
  quit
  quit
  ...

  - connection hangs after the banner and it appears that it doesn't accept
    any further commands, because the Sun server demands CR+LF line
    ends, not just LF line ends as given by "nc" (the Sun server is
    RFC-compliant, but not very liberal in what it accepts!)
  - the fix is to enter ^V<CR><CR> (CTRL-V followed by pushing the
    RETURN key twice) at the end of each line:

  $ nc -v outmail.algonquincollege.com smtp
  Connection to outmail.algonquincollege.com 25 port [tcp/smtp] succeeded!
  220 mail4.algonquincollege.com -- Server ESMTP (Sun Java System Messaging Server 6.2-7.02 (built Jun 13 2006))
  quit^V^M
  221 2.3.0 Bye received. Goodbye.
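
A program, unlike a person typing into "nc", can simply append the
CR+LF itself.  A minimal Python sketch (the helper name smtp_line is
hypothetical; it only builds the bytes and does not open a connection):

```python
def smtp_line(command: str) -> bytes:
    """Return one SMTP command line terminated by CR+LF."""
    # SMTP commands are ASCII text; "\r\n" is CR followed by LF.
    return command.encode("ascii") + b"\r\n"


quit_line = smtp_line("quit")
print(repr(quit_line))
```

These are the bytes a client would write to the socket in place of the
bare LF that "nc" sends when you press RETURN.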

Q: T/F, the Algonquin SMTP server violates the SMTP RFC by requiring CRLF
   on the end of each line.

SMTP vs. Message Format
-----------------------
 - the SMTP *protocol* does not define the format of the *message*
   - the *message* delivered by the *protocol* has its own description:
     RFC822 -> RFC2822 "Internet Message Format"  (51 pages)
   - http://tools.ietf.org/html/rfc2822
 - the content of the message (including To/From message header lines) is
   independent of the To/From used in the SMTP protocol!

Q: T/F The SMTP protocol RFC defines the format and headers of an email message

Protocol
--------

* SMTP is a readable ASCII protocol on top of TCP - not binary!
 - you can run it using "nc" or telnet to port 25
 - but you can't do it here at Algonquin College!
   - port 25 blocked leaving the College (must use College servers)
   - College servers implement long wait times before answering
     - to discourage spam programs that don't wait as long
 - SMTP wait times are documented in
   http://tools.ietf.org/html/rfc1123
   "Timeouts are an essential feature of an SMTP
    implementation.  If the timeouts are too long (or worse,
    there are no timeouts), Internet communication failures or
    software bugs in receiver-SMTP programs can tie up SMTP
    processes indefinitely.  If the timeouts are too short,
    resources will be wasted with attempts that time out part
    way through message delivery."

* a sample SMTP session: see Notes file smtp_session.txt

    Note the difference between the SMTP RFC2821 "envelope" FROM/TO lines
    and the RFC2822 Message From:/To: lines.  The Message From:/To:
    lines need not be related to the SMTP RFC2821 envelope FROM/TO
    lines, and application writers are warned not to try to link them:
    (RFC 2821 Section 7.2)

* Extending the original SMTP protocol "HELO" with "EHLO"
 - original SMTP "HELO" greeting had no protocol version number
   - no way to negotiate options or features
 - RFC1425 (1993) added the new EHLO greeting as an alternative to HELO,
   allowing extensions
   - http://tools.ietf.org/html/rfc1425
 - awkward way to do protocol versioning
 - latest version of extensions:  http://tools.ietf.org/html/rfc2821
 - SMTP extensions (must be registered with IANA)

  ABNF:  ehlo-cmd ::= "EHLO" SP domain CR LF

Q: Is the EHLO case-sensitive?
Q: Is the domain optional?

 - HELO vs. EHLO:    http://tools.ietf.org/html/rfc2821
   "Contemporary SMTP implementations MUST support the basic extension
    mechanisms.  For instance, servers MUST support the EHLO command even
    if they do not implement any specific extensions and clients SHOULD
    preferentially utilize EHLO rather than HELO."
 - response to EHLO:  http://tools.ietf.org/html/rfc2821
     "Normally, the response to EHLO will be a multiline reply.  Each line
      of the response contains a keyword and, optionally, one or more
      parameters.  Following the normal syntax for multiline replies,
      these keyworks follow the code (250) and a hyphen for all but the
      last line, and the code and a space for the last line."
 - the response to EHLO is a list of options that indicates what optional
   features this email server offers

Q: What SHOULD an SMTP client do if the server refuses EHLO?
   (RFC2821 section 2.2.1 p.7, section 3.2 p. 16)

* Even clever people argue about the interpretation of the RFC documents:
  - http://www.imc.org/ietf-smtp/old-archive/msg01782.html
   "Certain individuals have the impression that the correct response to a
    RSET is ``close the connection'', and insist that RFC-821 backs them up.
    That seems to be an unusually bizarre interpretation, but by golly
    they insist that they Following The Standard (TM).  It quickly became
    clear that attempting to reason with such individuals was hopeless."
  - http://www.imc.org/ietf-smtp/old-archive/msg01783.html
   "having just reread the text in 821, that construing RSET as a synonym
    for QUIT must require real creativity (or trying to think with one's
    head in a normally-uncomfortable position),"

- SMTP continuation syntax: every line but the last of a multi-line
  response contains a "-" immediately following the response number, e.g.

        $ nc -v localhost smtp
        localhost.home.idallen.ca [127.0.0.1] 25 (smtp) open
        220 elm.home.idallen.ca ESMTP Postfix (idallen@idallen.ca)
        EHLO idallen.ca
        250-elm.home.idallen.ca
        250-PIPELINING
        250-SIZE 10240000
        250-VRFY
        250-ETRN
        250-STARTTLS
        250 8BITMIME
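
A client can collect a multi-line reply by looking at the fourth
character of each line.  This Python sketch works on a list of reply
lines (CR+LF already removed) rather than a live socket; the function
name parse_reply is made up for illustration.

```python
def parse_reply(lines):
    """Return (code, [text, ...]) for one complete SMTP reply."""
    code = None
    texts = []
    for line in lines:
        this_code, separator, text = line[:3], line[3:4], line[4:]
        if code is None:
            code = this_code
        elif this_code != code:
            raise ValueError("reply code changed inside one reply")
        texts.append(text)
        # "-" after the code means more lines follow; anything else
        # (normally a space) marks the last line of the reply.
        if separator != "-":
            return int(code), texts
    raise ValueError("reply ended without a final line")


ehlo_reply = [
    "250-elm.home.idallen.ca",
    "250-PIPELINING",
    "250-SIZE 10240000",
    "250-VRFY",
    "250-ETRN",
    "250-STARTTLS",
    "250 8BITMIME",
]
code, features = parse_reply(ehlo_reply)
print(code, features)
```

After the first line (the server name), each remaining entry is one
extension keyword, optionally followed by parameters (e.g. SIZE 10240000).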

Q: How does a SMTP server indicate continuation lines in a reply?

* Reading RFC 2821 - the SMTP protocol
  http://tools.ietf.org/html/rfc2821

  The RFC is the final word on the protocol.

 - note allowed order of SMTP commands p.39
 - you cannot reject a message just because the HELO/EHLO name doesn't
   match the IP (Section 4.1.4)
 - note the structure of SMTP reply codes p.40

Q: What is the meaning of the first digit of an SMTP response code?
   1yz   Positive Preliminary reply (not used in standard SMTP)
   2yz   Positive Completion reply
   3yz   Positive Intermediate reply
   4yz   Transient Negative Completion reply
   5yz   Permanent Negative Completion reply
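
In code, the first digit is enough to pick the category.  A small Python
sketch of the table above (the names REPLY_CLASSES and reply_class are
our own):

```python
REPLY_CLASSES = {
    1: "Positive Preliminary",
    2: "Positive Completion",
    3: "Positive Intermediate",
    4: "Transient Negative Completion",
    5: "Permanent Negative Completion",
}


def reply_class(code: int) -> str:
    """Map a three-digit SMTP reply code to its first-digit category."""
    if not 100 <= code <= 599:
        raise ValueError("not a valid SMTP reply code: %r" % code)
    return REPLY_CLASSES[code // 100]


for code in (220, 250, 354, 451, 550):
    print(code, reply_class(code))
```

Because the table is keyed on the first digit only, a client written
this way still picks a sensible category for a code it has never seen.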

Q: Do SMTP protocol lines end in CR+LF or just LF? (RFC2821 p.12)

Q: Do Internet Message lines end in CR+LF or just LF?
   (RFC2821 p.12, RFC2822 p.17-18)

Q: SMTP commands are given as double-quoted upper-case strings in the
   RFC 2821.  Does this mean they must be upper-case?

Q:  T/F The space following the three-digit SMTP response code is mandatory
    and all clients MUST look for it, failing if it is not found.
    (RFC 2821 Section 4.2)

Q:  How must an SMTP client handle new response codes that it doesn't
    recognize?  (RFC 2821 Section 4.2, 4.3.2)

Q:  T/F SMTP clients can figure out how to proceed based on just the
    first digit of an SMTP reply code; they can usually ignore the rest.
    (RFC 2821 Section 4.2, 4.2.1, 4.3.2)

Q:  T/F You can queue up and send multiple commands to an SMTP server
    without waiting for any responses.  (RFC 2821 Section 4.3.1)

Looking at RFC 2821 Section 4.3.2, there are three codes that might be
returned by an SMTP server "if the corresponding unusual circumstances
are encountered".  Clients must be prepared to see these codes in response
to any SMTP request.

Q:  T/F SMTP clients only need to handle the fixed set of response codes
    listed in the RFC document.

Q:  Looking at RFC 2821 Section 4.5.2, how must clients handle the
    sending of email message lines that start with a period?

Q:  What is the maximum length of an email address (local-part plus
    domain), as passed through the SMTP protocol?  (RFC 2821 Section 4.5.3.1)

Q:  How long may an SMTP server delay before issuing the initial 220
    Message greeting?  (RFC 2821 Section 4.5.3.2)

Q:  Based on experience, what is the suggested policy for retrying failed
    attempts at sending a message?  (RFC 2821 Section 4.5.4.1)

Q:  Should programs attempt to relate the MAIL and RCPT (envelope)
    email addresses with the addresses (that may be) present in the
    headers of the message body?  (RFC 2821 Section 7.2)

http://teaching.idallen.com/cst8165/08w/notes/smtp_session.txt

Review of SMTP:
 - http://tools.ietf.org/html/rfc2821
 - Sample SMTP session (long and short) in Notes: smtp_session.txt
 - SMTP controls the "envelope" TO/FROM, not the message To:/From:
 - a text-based protocol, easily run using netcat.
 - 3-digit numeric response codes (know these five groups)
   - 1yz   Positive Preliminary reply (not used in standard SMTP)
   - 2yz   Positive Completion reply
   - 3yz   Positive Intermediate reply
   - 4yz   Transient Negative Completion reply
   - 5yz   Permanent Negative Completion reply

Q: Name the five main categories of SMTP server responses

Q:  T/F SMTP clients can figure out how to proceed based on just the
    first digit of an SMTP reply code; they can usually ignore the rest.
    (RFC 2821 Section 4.2, 4.2.1, 4.3.2)

SMTP MX records
---------------

How does a mail client know to which SMTP server to connect when sending
mail to a userid at some domain?   It looks up the domain MX records in
the DNS.

An SMTP client queries the DNS for a domain to obtain "MX" (mail
exchange) records that tell which machines accept SMTP mail for the domain:

    $ host -t mx algonquincollege.com
    algonquincollege.com mail is handled by 30 mailgate10.algonquincollege.com.
    algonquincollege.com mail is handled by 20 mailgate11.algonquincollege.com.

    $ host hotmail.com
    hotmail.com has address 64.4.32.7
    hotmail.com has address 64.4.33.7
    hotmail.com mail is handled by 5 mx2.hotmail.com.
    hotmail.com mail is handled by 5 mx3.hotmail.com.
    hotmail.com mail is handled by 5 mx4.hotmail.com.
    hotmail.com mail is handled by 5 mx1.hotmail.com.

    $ host idallen.ca
    idallen.ca has address 72.18.159.15
    idallen.ca mail is handled by 0 idallen.ca.
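
The number in front of each mail server name is the MX preference; per
the SMTP RFC, the lowest number is tried first.  This Python sketch
sorts the output of "host -t mx" into the order a mailer would try the
servers.  The function name sort_mx is hypothetical, and the sketch
assumes the "mail is handled by" output format shown above.

```python
def sort_mx(host_output: str):
    """Return [(preference, server), ...] with the most preferred first."""
    marker = " mail is handled by "
    records = []
    for line in host_output.splitlines():
        if marker in line:
            preference, server = line.split(marker, 1)[1].split()
            records.append((int(preference), server))
    # Lower preference numbers are tried first.
    return sorted(records)


output = """\
algonquincollege.com mail is handled by 30 mailgate10.algonquincollege.com.
algonquincollege.com mail is handled by 20 mailgate11.algonquincollege.com.
"""
mx_list = sort_mx(output)
print(mx_list)
```

Here mailgate11 (preference 20) is tried before mailgate10 (preference
30).  Lines that are not MX lines, such as "has address", are ignored.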

Q: How does an SMTP mailer know which computer to contact when sending
    mail to someone@domain.ca ?

* SMTP Walk-Through (old RFC 821 version) with comments by Dan Bernstein
  http://cr.yp.to/smtp.html
  - comments based on original RFC 821 not RFC 2821 (but often relevant)

  RFC2822 - message format - http://cr.yp.to/immhf.html
   - "If you're a new implementor, you'll be shocked at how badly 822
      was designed."

  - RFC2821 standards process "incompetence" by editor Klensin
    http://cr.yp.to/smtp/klensin.html
     - group consensus about HELO/EHLO didn't make the final draft! 
     - "What an incredible display of incompetence!"

Q: T/F RFC standards development has been a very organized process.

-------------------------------------------------------------------------

see Notes: Mail Systems Terminology - mail_systems_terms.txt

-------------------------------------------------------------------------

Protocols - Reading Mail - Post Office Protocol (POP)
-----------------------------------------------------
  http://tools.ietf.org/html/rfc1939   (23 pages)
  http://tools.ietf.org/html/rfc1957   (1 page observation)
  http://tools.ietf.org/html/rfc2449   (CAPA extensions)
   - note the "Errata" link
   - version 3:  RFC 1081 -> 1225 -> 1460 -> 1725 -> 1939 
     updated by RFC 1957 (one page observation RTFM!) and 2449 (extensions)
   - specified to use TCP port 110 (Section 3)
   - POP is supposed to stay *SIMPLE* (use IMAP for everything else: Section 1)
   - Section 10 example:  http://tools.ietf.org/html/rfc1939#page-19

  http://tools.ietf.org/html/rfc2449   "19 pages - CAPA extension"
  - on extending POP3 (RFC 2449 intro and section 7):
   "This extension to the POP3 protocol is to be used by a server to
    express policy descisions taken by the server administrator.  It is
    not an endorsement of implementations of further POP3 extensions
    generally.  It is the general view that the POP3 protocol should stay
    simple, and for the simple purpose of downloading email from a mail
    server.  If more complicated operations are needed, the IMAP protocol
    [RFC 2060] should be used.

    Future extensions to POP3 are in general discouraged, as POP3's
    usefulness lies in its simplicity.  POP3 is intended as a download-
    and-delete protocol; mail access capabilities are available in IMAP
    [IMAP4].  Extensions which provide support for additional mailboxes,
    allow uploading of messages to the server, or which deviate from
    POP's download-and-delete model are strongly discouraged and unlikely
    to be permitted on the IETF standards track.

    Clients MUST NOT require the presence of any extension for basic
    functionality, with the exception of the authentication commands"

Q: Why are extensions to POP3 discouraged?

RFC Section 3 - Basic Operation
 - eight case-insensitive 3-4 character command keywords (section 3)
 - traditional CRLF line terminators
 - single space separators
 - arguments only up to 40 characters (!) - very short lines
 - only two status indicators: +OK and -ERR (upper case)
   - no way to distinguish between temporary and permanent failure
   - no way to distinguish "not now" from "not implemented"
 - multi-line responses terminated by a single period on a line
   - leading periods are doubled and then must be removed (like SMTP)
   - called "byte-stuffing" or "dot-stuffing" (Section 3 page 3)
 - a state-oriented protocol
   AUTHORIZATION -> TRANSACTION -> UPDATE
   - must authenticate before issuing transactions
   - update happens *after* the client disconnects
 - MUST not time out before 10 minutes (section 3 page 4)
   - a time-out does not trigger an UPDATE - throws away updates
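
The dot-stuffing rule above takes only a few lines on each side.  A
Python sketch, working on lists of lines with the CRLF already removed
(the function names dot_stuff and dot_unstuff are made up here):

```python
def dot_stuff(lines):
    """Sender side: double any leading period, then add the terminator."""
    stuffed = [("." + line) if line.startswith(".") else line
               for line in lines]
    stuffed.append(".")          # a single period ends the response
    return stuffed


def dot_unstuff(lines):
    """Receiver side: stop at the lone period, strip the extra periods."""
    message = []
    for line in lines:
        if line == ".":
            return message
        message.append(line[1:] if line.startswith(".") else line)
    raise ValueError("missing terminating '.' line")


original = ["hello", ".", "..x", ".sig"]
on_the_wire = dot_stuff(original)
print(on_the_wire)
print(dot_unstuff(on_the_wire) == original)
```

A message line holding only a period travels as two periods, so it can
never be mistaken for the end of the response.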

Q: T/F, unlike most Internet protocols, POP3 only requires LF on line ends.
Q: T/F, the POP protocol has different exit codes for temporary and
    permanent failures.
Q: How does the POP protocol handle multi-line server responses (e.g.
   when fetching a message)?
Q: What is meant by "dot-stuffing" or "byte-stuffing"?
Q: Name and describe what happens in each of the three states of a
   POP3 connection.  What triggers the entry into each state?
Q: T/F, if a POP3 client drops the connection, the server skips the
   UPDATE phase.

Authorization/Authentication State (Section 4 page 4)
 - each AUTHORIZATION method is optional; but, you must use at least one (!)
 - RFC defines cleartext USER and PASS or APOP methods
 - RFC says "there is no single authentication mechanism that is required
   of all POP3 servers" (!) but Section 9 lists USER and PASS as
   "Minimal POP3 Commands", implying they are required
 - APOP uses md5 and a shared secret
   - see p.16 - you can calculate this digest in Linux via:
    $ echo -n '<1896.697170952@dbc.mtview.ca.us>tanstaaf' | md5sum
    c4c9334bac560ecc979e58001b3e22fb  -
 - neither USER/PASS nor APOP encrypt the full connection...
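
The same APOP digest can be computed in Python with the standard
hashlib module.  A minimal sketch using the RFC example values shown
above (the function name apop_digest is our own):

```python
import hashlib


def apop_digest(timestamp: str, secret: str) -> str:
    """MD5 of the server's timestamp banner followed by the shared secret."""
    data = (timestamp + secret).encode("ascii")
    return hashlib.md5(data).hexdigest()


digest = apop_digest("<1896.697170952@dbc.mtview.ca.us>", "tanstaaf")
print(digest)
```

The result matches the md5sum output above.  Only the digest crosses
the network, not the secret itself, but everything after the login is
still sent in the clear.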

Q: T/F, the USER and PASS POP commands set up an encrypted connection.

  http://tools.ietf.org/html/rfc1734 - POP3 AUTH command
    "the client may request authentication types in decreasing order of
     preference, with the USER/PASS or APOP command as a last resort."  (p.2)

    "A protection mechanism provides integrity and privacy protection
     to the protocol session.  If a protection mechanism is negotiated,
  *  it is applied to all subsequent data sent over the connection.
     The protection mechanism takes effect immediately following the CRLF
     that concludes the authentication exchange for the client, and the
     CRLF of the positive response for the server.  Once the protection
     mechanism is in effect, the stream of command and response octets is
     processed into buffers of ciphertext.  Each buffer is transferred
     over the connection as a stream of octets prepended with a four
     octet field in network byte order that represents the length of
     the following data." (p.2)
 - QUIT is also allowed in Authorization State (Section 4 p.5)

Q: How does POP3 "protection" affect data transfer between client and server?

SASL: Simple Authentication and Security Layer
 - usable via the CAPA extension http://tools.ietf.org/html/rfc2449 (19 pages)
 - see also: SASL use in SMTP http://tools.ietf.org/html/rfc2554 (11 pages)

Authorization State
 - all authorization methods are optional; but, one must be supported

Transaction State
 - Must handle: STAT, LIST, RETR, DELE, NOOP, RSET, QUIT

Update State (can only be entered from Transaction State)
 - entered *only* via QUIT, never by hangup or disconnect
 - no commands
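
The three states, and the difference between QUIT and a dropped
connection, can be modelled in a few lines.  This Python class is a toy
model only, not a real POP3 server: the class and method names are
hypothetical, and "CLOSED" is our own label rather than an RFC state.

```python
class Pop3Session:
    """Toy model of AUTHORIZATION -> TRANSACTION -> UPDATE."""

    def __init__(self):
        self.state = "AUTHORIZATION"
        self.marked = set()     # message numbers marked by DELE
        self.removed = set()    # message numbers removed during UPDATE

    def login_ok(self):
        # A successful USER/PASS or APOP moves us to TRANSACTION.
        if self.state != "AUTHORIZATION":
            raise RuntimeError("already authenticated")
        self.state = "TRANSACTION"

    def dele(self, number):
        # DELE only marks a message; nothing is removed yet.
        if self.state != "TRANSACTION":
            raise RuntimeError("DELE is only valid in TRANSACTION state")
        self.marked.add(number)

    def quit(self):
        # QUIT from TRANSACTION enters UPDATE and removes marked messages.
        # QUIT from AUTHORIZATION just closes the session.
        if self.state == "TRANSACTION":
            self.state = "UPDATE"
            self.removed = set(self.marked)
        self.state = "CLOSED"

    def connection_lost(self):
        # Hangup or time-out: no UPDATE, so the marks are thrown away.
        self.marked.clear()
        self.state = "CLOSED"


polite = Pop3Session()
polite.login_ok()
polite.dele(1)
polite.quit()

rude = Pop3Session()
rude.login_ok()
rude.dele(1)
rude.connection_lost()

print(polite.removed, rude.removed)
```

Only the session that ends with QUIT actually removes message 1; the
dropped session leaves the maildrop untouched.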

Section 8: Scaling and Operational Considerations
 - people using POP stores as permanent message archives
  "When these facilities are used in this way by casual users, there has
   been a tendency for already-read messages to accumulate on the server
   without bound.  This is clearly an undesirable behavior pattern from
   the standpoint of the server operator.  This situation is aggravated
   by the fact that the limited capabilities of the POP3 do not permit
   efficient handling of maildrops which have hundreds or thousands of
   messages."

Q: T/F, POPmail scales well to handle hundreds or thousands of messages.

Section 11: Message Format
  "It is important to note that the octet count for a message on
   the server host may differ from the octet count assigned to that
   message due to local conventions for designating end-of-line."
   - the size of the message in the file system may not match the size
     transmitted over the wire (especially for Unix/Linux systems)

Q: Give the minimal set of POP3 commands needed to retrieve and delete
   one message on a POP3 server.

=============================================================================

Protocols - Reading Mail - Internet Message Access Protocol (IMAP)
------------------------------------------------------------------
  http://tools.ietf.org/html/rfc3501   (108 pages)

  - RFC 1730 -> 2060 -> 3501
    updated by RFC 4466 (collected extensions)
    updated by RFC 4468 (CATENATE extension)
    updated by RFC 4551 (conditional STORE, etc.)

  "The Internet Message Access Protocol, Version 4rev1 (IMAP4rev1)
   allows a client to access and manipulate electronic mail messages on
   a server.  IMAP4rev1 permits manipulation of mailboxes (remote
   message folders) in a way that is functionally equivalent to local
   folders.  IMAP4rev1 also provides the capability for an offline
   client to resynchronize with the server."

 - requires any reliable data stream, e.g. TCP  (TCP port 143)

 - too many pages to read!

Q: T/F, both POPmail and IMAP permit remote folders.
Q: Why are most advances in reading email done through changes to IMAP rather
    than changes to POP3?