-------------------------
Week 13 Notes for CST8165
-------------------------
-Ian! D. Allen - idallen@idallen.ca - www.idallen.com

Remember - knowing how to find out an answer is more important than
memorizing the answer.  Learn to fish!  RTFM!  (Read The Fine Manual)

Keep up on your readings (Course Outline: average 4 hours/week homework)

=============================================================================

Level 6 Students April 9 deadline: Mxi Technologies Ltd.

Mxi Technologies delivers meaningful improvements in labor productivity
through integrated, intelligent aviation maintenance software.

Dan Bourguigon, Sr. Customer Support Analyst, and Karen Turner, Talent
Acquisition Specialist of Mxi Technologies Ltd. will be giving a
presentation in T327 on:

    Wednesday, April 9th, 2008    12:00--14:00    T327

Don't miss your chance to hear about Mxi Technologies and possibilities of
employment with them.

Pre-registration is required.  To receive your ticket please see Liz Hobbs
in T307 before the event on Wednesday, April 9.

Liz Hobbs, Clerk Academic, 613-727-4723 ext 7686, hobbse@algonquincollege.com

=============================================================================

DNS (continued)

Resource Records (RRs): Type, Class, TTL, RData
-----------------------------------------------
http://www.dns.net/dnsrd/rr.html

- resource records (RRs) of various types are stored in Name Servers
- most common look-up is for A (address) records - a "forward DNS look-up"
- a "reverse DNS look-up" turns an IP into a domain name via PTR records

Resource Record (RR) format:    owner  TTL  class  type  RDATA

Example dump of an MX Resource Record:

    $ host -d -t mx idallen.ca.
    idallen.ca.     300     IN      MX      0 idallen.ca.
    ^-owner         ^-TTL   ^-class ^-type  ^-RDATA

Owner: the actual domain or item being looked up, e.g.
    idallen.ca

TTL: "Time To Live" of this record (expiry and time-out)
    Every record has an individual time-out, after which the record data
    expires from the cache and must be fetched again from another name
    server.  You can set the TTL very small for a device that changes
    address often.  The TTL on the root name servers is huge.

Classes of records:
    - IN (Internet system)
    - CH (CHAOS system)
    - only the IN class is important in our Internet

Many record Types: (see http://www.dns.net/dnsrd/rr.html )
    - A, CNAME, HINFO, MX, NS, PTR, SOA, TXT (see rfc1035)
    - SRV (not in rfc1034)
    - AAAA (not in rfc1034)
    - A6 (not in rfc1034)

For this course, know these DNS record types:
Q: What data is contained in a DNS type "A" record?
Q: What data is contained in a DNS type "MX" record?
Q: What data is contained in a DNS type "NS" record?
Q: What data is contained in a DNS type "PTR" record?
Q: Which is the preferred IPV6 record type?

RData - various types of data, depending on Type (see RFC p.13)

Querying DNS for a specific record type using "host" or "dig":

    $ host -t a idallen.ca.
    idallen.ca has address 72.18.159.15

    $ host -t mx idallen.ca.
    idallen.ca mail is handled by 0 idallen.ca.

    $ host -t ns idallen.ca.
    idallen.ca name server ns2.totalchoicehosting.com.
    idallen.ca name server ns1.totalchoicehosting.com.

    $ dig idallen.ca mx
    [...]
    ;; ANSWER SECTION:
    idallen.ca.     224     IN      MX      0 idallen.ca.

Q: Use a Unix command to query a DNS server for A, MX, NS, PTR records.

How do DNS lookups happen on Unix?
----------------------------------
Programs use a "resolver" library.  On Unix/Linux, it starts with this file:

    $ cat /etc/resolv.conf
    domain somedomain.ca
    nameserver 0.0.0.0
    nameserver 192.168.0.1
    nameserver 192.168.0.2

The /etc/resolv.conf file contains the default domain (the one that is
tacked onto the end of relative domain names) and the IP addresses of the
name servers to query when looking up domain names.  (Here 0.0.0.0 means
the local machine.)

    $ host idallen.ca.
    idallen.ca has address 72.18.159.15             # "A" record
    idallen.ca mail is handled by 0 idallen.ca.     # "MX" record

    $ dig idallen.ca.
    ; <<>> DiG 9.3.2 <<>> idallen.ca.
    ;; global options:  printcmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 11389
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

    ;; QUESTION SECTION:
    ;idallen.ca.                    IN      A

    ;; ANSWER SECTION:
    idallen.ca.             14373   IN      A       72.18.159.15

    ;; Query time: 1 msec
    ;; SERVER: 205.211.30.21#53(205.211.30.21)
    ;; WHEN: Tue Nov 28 09:38:30 2006
    ;; MSG SIZE  rcvd: 44

For PTR look-ups, "host" automatically reverses the IP address and appends
in-addr.arpa to build the name to query:

    $ host 72.18.159.15
    15.159.18.72.in-addr.arpa domain name pointer server320.tchmachines.com.

    $ host -t a 15.159.18.72.in-addr.arpa
    15.159.18.72.in-addr.arpa has no A record

    $ host -t ptr 15.159.18.72.in-addr.arpa
    15.159.18.72.in-addr.arpa domain name pointer server320.tchmachines.com.

Q: What file is used by the Unix resolver library to configure DNS searches?
Q: Why aren't name servers specified by name in /etc/resolv.conf ?

Tracing a query
---------------
Here is a trace of an iterative lookup for the A record for "www.idallen.ca.":

- name www.idallen.ca. is:
    - the name "www"
    - in subdomain .idallen
    - in subdomain .ca
    - in the ROOT domain (".")

Steps:
1. Locate the IP addresses of the ROOT name servers (the NS records)
   (this info may be compiled in, or kept cached in a local file)
2. Query some root name server for the .ca domain NS name server IP addrs.
3. Query some .ca name server for the .idallen domain NS name server addrs.
4. Query some .idallen name server for the A record IP address of "www".

Q: Given the domain www.idallen.ca, list the steps of an iterative DNS query
   that would resolve this domain to its IP address.

An example of a command that can do an iterative query:

    $ dig +trace www.idallen.ca.    (may not work at Algonquin due to blocking)

    ; <<>> DiG 9.3.1 <<>> +trace www.idallen.ca.
    ;; global options:  printcmd

    *** 1. locate addresses of root NS ***
    .                       3600000 IN      NS      F.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      G.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      H.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      I.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      J.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      K.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      L.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      M.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      A.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      B.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      C.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      D.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      E.ROOT-SERVERS.NET.
    ;; Received 436 bytes from 127.0.0.1#53(127.0.0.1) in 1 ms

    *** 2. locate addresses of .ca NS ***
    ca.                     172800  IN      NS      CA04.CIRA.ca.
    ca.                     172800  IN      NS      CA05.CIRA.ca.
    ca.                     172800  IN      NS      CA06.CIRA.ca.
    ca.                     172800  IN      NS      NS-EXT.ISC.ORG.
    ca.                     172800  IN      NS      CA01.CIRA.ca.
    ca.                     172800  IN      NS      CA02.CIRA.ca.
    ;; Received 284 bytes from 192.112.36.4#53(G.ROOT-SERVERS.NET) in 43 ms

    *** 3. locate addresses of .idallen NS ***
    idallen.ca.             86400   IN      NS      ns2.totalchoicehosting.com.
    idallen.ca.             86400   IN      NS      ns1.totalchoicehosting.com.
    ;; Received 90 bytes from 192.228.28.9#53(CA04.CIRA.ca) in 80 ms

    *** 4. look up A record for name "www" ***
    www.idallen.ca.         14400   IN      CNAME   idallen.ca.
    idallen.ca.             14400   IN      A       72.18.159.15
    idallen.ca.             86400   IN      NS      ns2.totalchoicehosting.com.
    idallen.ca.             86400   IN      NS      ns1.totalchoicehosting.com.
    ;; Received 136 bytes from 65.254.32.122#53(ns2.totalchoicehosting.com) in 43 ms

Since most DNS traffic is UDP, it is optimized to fit in one single UDP
packet.  (Full zone transfers will use TCP.)  Only 13 ROOT name server
names exist because that is how many NS records (plus their addresses) fit
in a single 512-byte UDP DNS reply.

Q: Why aren't there more than 13 ROOT name servers?
Q: T/F, most DNS traffic uses UDP.
Q: Trace the DNS lookups needed to resolve www.idallen.com., starting with
   the root name servers.

Configuring Name Servers
------------------------
How do we get the address of the root name servers "."?

Unix/Linux keeps a copy of the ROOT server IP addresses in a local file.
That file name (usually "named.ca") is normally given in the named.conf
file under "hints".  The BIND name server "named" also has a copy compiled
in (of course, the compiled-in copy may be outdated).

Q: How does a Unix/Linux system know the addresses of the ROOT name servers,
   to start an iterative DNS query?

The root servers are often kept in Unix/Linux file name /var/named/named.ca
- unreadable in Linux lab, sorry
- Use "dig @A.ROOT-SERVERS.NET . ns" to update this file if it's outdated.
    - but not at Algonquin (blocked, sorry again)

Unix/Linux DNS server package name is "BIND" - Berkeley Internet Name Domain
- actual executable program name is "named", which is a bit confusing
- see /etc/named.conf for the location of the "." domain "hints"

Q: What is the package name of the common Unix DNS server?
Q: What is the name server executable program name in that package?

You only need to find one working ROOT server, at which point you can use
it to find the current addresses of the rest.

Below is the config file for "named" from Linux Fedora Core 5.
Note the "type hint" file named.ca containing the ROOT name server info.

---------------------------------------------------------------------------
//
// named.conf for Red Hat caching-nameserver
//
options {
        directory "/var/named";
        dump-file "/var/named/data/cache_dump.db";
        statistics-file "/var/named/data/named_stats.txt";
        /*
         * If there is a firewall between you and nameservers you want
         * to talk to, you might need to uncomment the query-source
         * directive below.  Previous versions of BIND always asked
         * questions using port 53, but BIND 8.1 uses an unprivileged
         * port by default.
         */
         // query-source address * port 53;
};
//
// a caching only nameserver config
//
controls {
        inet 127.0.0.1 allow { localhost; } keys { rndckey; };
};
zone "." IN {
        type hint;
        file "named.ca";
};
zone "localdomain" IN {
        type master;
        file "localdomain.zone";
        allow-update { none; };
};
zone "localhost" IN {
        type master;
        file "localhost.zone";
        allow-update { none; };
};
zone "0.0.127.in-addr.arpa" IN {
        type master;
        file "named.local";
        allow-update { none; };
};
zone "0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.ip6.arpa" IN {
        type master;
        file "named.ip6.local";
        allow-update { none; };
};
zone "255.in-addr.arpa" IN {
        type master;
        file "named.broadcast";
        allow-update { none; };
};
zone "0.in-addr.arpa" IN {
        type master;
        file "named.zero";
        allow-update { none; };
};
---------------------------------------------------------------------------

Finding domain owners - whois
-----------------------------

    $ whois climatechaos.ca
    Status:         EXIST
    Registrar:      Tucows.com Co.
    Registrar-no:   156
    Registrant-no:  1599146
    Domaine-no:     1599146
    Subdomain:      climatechaos.ca
    Renewal-Date:   2008/08/01
    Date-Approved:  2006/08/01
    Date-Modified:  2007/09/05
    Organization:   ACT for the Earth
    Description:    Climatechaos.ca is the website of Canada's emerging
                    climate crisis coalitions.
    Admin-Name:     Dylan Penner
    Admin-Title:
    Admin-Postal:   ACT for the Earth
                    LWR-238 Queen St. West
                    Toronto ON M5V 1Z7 Canada
    Admin-Phone:    647-436-6398
    Admin-Fax:
    Admin-Mailbox:  campaigns@actfortheearth.org
    Tech-Name:      Dylan Penner
    Tech-Title:
    Tech-Postal:    ACT for the Earth
                    238 Queen St. West
                    Toronto ON M5V 1Z7 Canada
    Tech-Phone:     647-436-6398
    Tech-Fax:
    Tech-Mailbox:   campaigns@actfortheearth.org
    NS1-Hostname:   cns1.look.ca
    NS2-Hostname:   cns2.look.ca

Q: What Unix/Linux command can find the owner/registrar of a domain name?
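
A minimal sketch in Python (not part of the original notes) of two ideas from
the DNS notes above: splitting a resource record line into its owner, TTL,
class, type and RDATA fields, and building the reversed in-addr.arpa name
that "host" constructs for a PTR look-up.  The function names parse_rr and
reverse_name are made up for illustration, and parse_rr assumes the simple
five-field text layout shown in the "host -d" dump, nothing more general.

```python
import ipaddress


def parse_rr(line):
    """Split one RR text line into owner, TTL, class, type and RDATA.

    Assumes the five-field layout from the notes; RDATA keeps any
    embedded spaces (for MX it is "preference exchange").
    """
    owner, ttl, rr_class, rr_type, rdata = line.split(None, 4)
    return {"owner": owner, "ttl": int(ttl), "class": rr_class,
            "type": rr_type, "rdata": rdata}


def reverse_name(ipv4):
    """Build the in-addr.arpa owner name queried for a PTR record."""
    # IPv4Address() validates the dotted-quad text before we reverse it.
    octets = str(ipaddress.IPv4Address(ipv4)).split(".")
    return ".".join(reversed(octets)) + ".in-addr.arpa"


rr = parse_rr("idallen.ca. 300 IN MX 0 idallen.ca.")
print(rr["owner"], rr["ttl"], rr["class"], rr["type"], rr["rdata"])
print(reverse_name("72.18.159.15"))
```

The second print line gives the same name that appears in the "host
72.18.159.15" output in the notes.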

Resources
    http://directory.google.com/Top/Computers/Internet/Protocols/DNS/
    http://www.root-servers.org/ - Ottawa has a copy of "F"
Probes and Tools
    http://www.dnsreport.com/
    http://www.dnsstuff.com/
Software
    http://www.isc.org/index.pl?/sw/bind/
    http://www.dns.net/dnsrd/
Problems
    http://en.wikipedia.org/wiki/Site_Finder (3 weeks in September 2003)
    http://en.wikipedia.org/wiki/DNS_cache_poisoning

------------------------------------------------------------------------------

State of IPV6 and DNS
---------------------
http://tools.ietf.org/html/rfc3363

   "Working group consensus as perceived by the chairs of the DNSEXT and
    NGTRANS working groups is that:
    a) AAAA records are preferable at the moment for production
       deployment of IPv6, and
    b) that A6 records have interesting properties that need to be better
       understood before deployment.
    c) It is not known if the benefits of A6 outweigh the costs and risks.

   "Thus, we are forced to conclude that indiscriminate use of long A6
    chains is likely to lead to increased user frustration."

Q: Which DNS record type is currently preferred for IPV6 addresses?
Q: How many bits are in an IPV6 address?

-----------------------------------------------------------------------------
see Notes: Mail Systems Terminology - mail_systems_terms.txt
-----------------------------------------------------------------------------

Sending electronic mail: SMTP
-----------------------------
http://tools.ietf.org/html/rfc2821

- Remember: The protocol and ports used to send email (SMTP) are completely
  separate from the ports and protocols used to fetch email (POP3, IMAP)!

SMTP - Simple Mail Transfer Protocol
- RFC821 -> RFC2821 - April 2001 - 79 pages
  on top of TCP (95 pages) on top of IP (45 pages)
- a "PUSH" protocol - sender initiates (HTTP is "PULL" protocol)
- http://tools.ietf.org/html/rfc2821
  "This document is a self-contained specification of the basic protocol
   for the Internet electronic mail transport.
   It consolidates, updates and clarifies, but doesn't add new or change
   existing functionality of the following: RFC821, DNS, RFC1123"
- did not add to or change RFC821; dropped obsolete items

Q: T/F RFC2821 replaced RFC821 and added new SMTP functionality

Algonquin SMTP server
---------------------
Algonquin network restrictions prevent access to other SMTP servers from
on campus.  You must connect to the Algonquin SMTP server to send email.

In strict conformance with RFC 2821, the Algonquin SMTP server accepts only
CR+LF line ends - you have to type ^V^M^M (CTRL-V RETURN RETURN) at the end
of every line to make it work.

    $ nc -v outmail.algonquincollege.com smtp
    Connection to outmail.algonquincollege.com 25 port [tcp/smtp] succeeded!
    220 mail4.algonquincollege.com -- Server ESMTP (Sun Java System Messaging Server 6.2-7.02 (built Jun 13 2006))
    quit
    quit
    quit
    ...

- the connection hangs after the banner and appears not to accept any
  further commands, because the Sun server demands CR+LF line ends, not
  just the LF line ends given by "nc" (the Sun server is RFC-compliant but
  not very liberal in what it accepts!)
- the fix is to type CTRL-V and then push the RETURN key twice at the end
  of each line: CTRL-V RETURN inserts a literal CR (^M) and the second
  RETURN sends the LF:

    $ nc -v outmail.algonquincollege.com smtp
    Connection to outmail.algonquincollege.com 25 port [tcp/smtp] succeeded!
    220 mail4.algonquincollege.com -- Server ESMTP (Sun Java System Messaging Server 6.2-7.02 (built Jun 13 2006))
    quit^V^M
    221 2.3.0 Bye received. Goodbye.

Q: T/F, the Algonquin SMTP server violates the SMTP RFC by requiring CRLF
   on the end of each line.

SMTP vs. Message Format
-----------------------
- the SMTP *protocol* does not define the format of the *message*
- the *message* delivered by the *protocol* has its own description:
  RFC822 -> RFC2822 "Internet Message Format" (51 pages)
  - http://tools.ietf.org/html/rfc2822
- the content of the message (including To/From message header lines)
  is independent of the To/From used in the SMTP protocol!

Q: T/F The SMTP protocol RFC defines the format and headers of an email
   message

Protocol
--------
* SMTP is a readable ASCII protocol on top of TCP - not binary!
  - you can run it using "nc" or telnet to port 25
  - but you can't do it here at Algonquin College!
    - port 25 blocked leaving the College (must use College servers)
    - College servers implement long wait times before answering
      - to discourage spam programs that don't wait as long
  - SMTP wait times are documented in http://tools.ietf.org/html/rfc1123
    "Timeouts are an essential feature of an SMTP implementation.  If the
     timeouts are too long (or worse, there are no timeouts), Internet
     communication failures or software bugs in receiver-SMTP programs can
     tie up SMTP processes indefinitely.  If the timeouts are too short,
     resources will be wasted with attempts that time out part way through
     message delivery."

* a sample SMTP session: see Notes file smtp_session.txt

Note the difference between the SMTP RFC2821 "envelope" FROM/TO lines and
the RFC2822 Message From:/To: lines.
The Message From:/To: lines need not be related to the SMTP RFC2821
envelope FROM/TO lines, and application writers are warned not to try to
link them (RFC 2821 Section 7.2).

* Extending the original SMTP protocol "HELO" with "EHLO"
  - original SMTP "HELO" greeting had no protocol version number
    - no way to negotiate options or features
  - RFC1425 (1993) introduced the new EHLO greeting alongside HELO,
    allowing extensions
    - http://tools.ietf.org/html/rfc1425
    - awkward way to do protocol versioning
  - latest version of extensions: http://tools.ietf.org/html/rfc2821
    - SMTP extensions (must be registered with IANA)

  ABNF:  ehlo-cmd ::= "EHLO" SP domain CR LF

  Q: Is the EHLO case-sensitive?
  Q: Is the domain optional?

  - HELO vs. EHLO: http://tools.ietf.org/html/rfc2821
    "Contemporary SMTP implementations MUST support the basic extension
     mechanisms.  For instance, servers MUST support the EHLO command even
     if they do not implement any specific extensions and clients SHOULD
     preferentially utilize EHLO rather than HELO."
  - response to EHLO: http://tools.ietf.org/html/rfc2821
    "Normally, the response to EHLO will be a multiline reply.  Each line
     of the response contains a keyword and, optionally, one or more
     parameters.  Following the normal syntax for multiline replies, these
     keyworks follow the code (250) and a hyphen for all but the last line,
     and the code and a space for the last line."
  - the response to EHLO is a list of options that indicates what optional
    features this email server offers

  Q: What SHOULD an SMTP client do if the server refuses EHLO?
     (RFC2821 section 2.2.1 p.7, section 3.2 p. 16)

* Even clever people argue about the interpretation of the RFC documents:
  - http://www.imc.org/ietf-smtp/old-archive/msg01782.html
    "Certain individuals have the impression that the correct response to
     a RSET is ``close the connection'', and insist that RFC-821 backs them
     up.  That seems to be an unusually bizarre interpretation, but by
     golly they insist that they Following The Standard (TM).
     It quickly became clear that attempting to reason with such
     individuals was hopeless."
  - http://www.imc.org/ietf-smtp/old-archive/msg01783.html
    "having just reread the text in 821, that construing RSET as a synonym
     for QUIT must require real creativity (or trying to think with one's
     head in a normally-uncomfortable position),"

  - SMTP continuation syntax: every line but the last of a multi-line
    response contains a "-" immediately following the response number, e.g.

    $ nc -v localhost smtp
    localhost.home.idallen.ca [127.0.0.1] 25 (smtp) open
    220 elm.home.idallen.ca ESMTP Postfix (idallen@idallen.ca)
    EHLO idallen.ca
    250-elm.home.idallen.ca
    250-PIPELINING
    250-SIZE 10240000
    250-VRFY
    250-ETRN
    250-STARTTLS
    250 8BITMIME

  Q: How does an SMTP server indicate continuation lines in a reply?

* Reading RFC 2821 - the SMTP protocol
  http://tools.ietf.org/html/rfc2821

  The RFC is the final word on the protocol.
  - note allowed order of SMTP commands p.39
    - a server must not refuse a message just because the HELO/EHLO name
      doesn't match the client's IP address
  - note the structure of SMTP reply codes p.40

  Q: What is the meaning of the first digit of an SMTP response code?
     1yz   Positive Preliminary reply (not used in standard SMTP)
     2yz   Positive Completion reply
     3yz   Positive Intermediate reply
     4yz   Transient Negative Completion reply
     5yz   Permanent Negative Completion reply

  Q: Do SMTP protocol lines end in CR+LF or just LF?  (RFC2821 p.12)
  Q: Do Internet Message lines end in CR+LF or just LF?
     (RFC2821 p.12, RFC2822 p.17-18)
  Q: SMTP commands are given as double-quoted upper-case strings in
     RFC 2821.  Does this mean they must be upper-case?
  Q: T/F The space following the three-digit SMTP response code is
     mandatory and all clients MUST look for it, failing if it is not
     found.  (RFC 2821 Section 4.2)
  Q: How must an SMTP client handle new response codes that it doesn't
     recognize?
     (RFC 2821 Section 4.2, 4.3.2)
  Q: T/F SMTP clients can figure out how to proceed based on just the first
     digit of an SMTP reply code; they can usually ignore the rest.
     (RFC 2821 Section 4.2, 4.2.1, 4.3.2)
  Q: T/F You can queue up and send multiple commands to an SMTP server
     without waiting for any responses.  (RFC 2821 Section 4.3.1)

  Looking at RFC 2821 Section 4.3.2, there are three codes that might be
  returned by an SMTP server "if the corresponding unusual circumstances
  are encountered".  Clients must be prepared to see these codes in
  response to any SMTP request.

  Q: T/F SMTP clients only need to handle the fixed set of responses
     listed in the RFC document.
  Q: Looking at RFC 2821 Section 4.5.2, how must clients handle the sending
     of email message lines that start with a period?
  Q: What is the maximum length of an email address (local-part plus
     domain), as passed through the SMTP protocol?
     (RFC 2821 Section 4.5.3.1)
  Q: How long may an SMTP server delay before issuing the initial 220
     Message greeting?  (RFC 2821 Section 4.5.3.2)
  Q: Based on experience, what is the suggested policy for retrying failed
     attempts at sending a message?  (RFC 2821 Section 4.5.4.1)
  Q: Should programs attempt to relate the MAIL and RCPT (envelope) email
     addresses with the addresses (that may be) present in the headers of
     the message body?  (RFC 2821 Section 7.2)

http://teaching.idallen.com/cst8165/08w/notes/smtp_session.txt

Review of SMTP:
- http://tools.ietf.org/html/rfc2821
- Sample SMTP session (long and short) in Notes: smtp_session.txt
- SMTP controls the "envelope" TO/FROM, not the message To:/From:
- a text-based protocol, easily run using netcat.
- 3-digit numeric response codes (know these five groups)
    - 1yz   Positive Preliminary reply (not used in standard SMTP)
    - 2yz   Positive Completion reply
    - 3yz   Positive Intermediate reply
    - 4yz   Transient Negative Completion reply
    - 5yz   Permanent Negative Completion reply

Q: Name the five main categories of SMTP server responses
Q: T/F SMTP clients can figure out how to proceed based on just the first
   digit of an SMTP reply code; they can usually ignore the rest.
   (RFC 2821 Section 4.2, 4.2.1, 4.3.2)

SMTP MX records
---------------
How does a mail client know to which SMTP server to connect when sending
mail to a userid at some domain?  It looks up the domain MX records in
the DNS.

An SMTP client queries the DNS for a domain to obtain "MX" (mail exchange)
records that tell which machines accept SMTP mail for the domain:

    $ host -t mx algonquincollege.com
    algonquincollege.com mail is handled by 30 mailgate10.algonquincollege.com.
    algonquincollege.com mail is handled by 20 mailgate11.algonquincollege.com.

    $ host hotmail.com
    hotmail.com has address 64.4.32.7
    hotmail.com has address 64.4.33.7
    hotmail.com mail is handled by 5 mx2.hotmail.com.
    hotmail.com mail is handled by 5 mx3.hotmail.com.
    hotmail.com mail is handled by 5 mx4.hotmail.com.
    hotmail.com mail is handled by 5 mx1.hotmail.com.

    $ host idallen.ca
    idallen.ca has address 72.18.159.15
    idallen.ca mail is handled by 0 idallen.ca.

Q: How does an SMTP mailer know which computer to contact when sending
   mail to someone@domain.ca ?

* SMTP Walk-Through (old RFC 821 version) with comments by Dan Bernstein
  http://cr.yp.to/smtp.html
  - comments based on original RFC 821 not RFC 2821 (but often relevant)

RFC2822 - message format
- http://cr.yp.to/immhf.html
    - "If you're a new implementor, you'll be shocked at how badly 822
      was designed."
- RFC2821 standards process "incompetence" by editor Klensin
  http://cr.yp.to/smtp/klensin.html
    - group consensus about HELO/EHLO didn't make the final draft!
    - "What an incredible display of incompetence!"

Q: T/F RFC standards development has been a very organized process.

-------------------------------------------------------------------------
see Notes: Mail Systems Terminology - mail_systems_terms.txt
-------------------------------------------------------------------------

Protocols - Reading Mail - Post Office Protocol (POP)
-----------------------------------------------------
http://tools.ietf.org/html/rfc1939 (23 pages)
http://tools.ietf.org/html/rfc1957 (1 page observation)
http://tools.ietf.org/html/rfc2449 (CAPA extensions)
- note the "Errata" link
- version 3: RFC 1081 -> 1225 -> 1460 -> 1725 -> 1939
  updated by RFC 1957 (one page observation RTFM!) and 2449 (extensions)
- specified to use TCP port 110 (Section 3)
- POP is supposed to stay *SIMPLE* (use IMAP for everything else: Section 1)
- Section 10 example: http://tools.ietf.org/html/rfc1939#page-19

http://tools.ietf.org/html/rfc2449 (19 pages - CAPA extension)
- on extending POP3 (RFC 2449 intro and section 7):
  "This extension to the POP3 protocol is to be used by a server to express
   policy descisions taken by the server administrator.  It is not an
   endorsement of implementations of further POP3 extensions generally.
   It is the general view that the POP3 protocol should stay simple, and
   for the simple purpose of downloading email from a mail server.  If more
   complicated operations are needed, the IMAP protocol [RFC 2060] should
   be used.

   Future extensions to POP3 are in general discouraged, as POP3's
   usefulness lies in its simplicity.  POP3 is intended as a download-
   and-delete protocol; mail access capabilities are available in IMAP
   [IMAP4].  Extensions which provide support for additional mailboxes,
   allow uploading of messages to the server, or which deviate from POP's
   download-and-delete model are strongly discouraged and unlikely to be
   permitted on the IETF standards track.
   Clients MUST NOT require the presence of any extension for basic
   functionality, with the exception of the authentication commands"

Q: Why are extensions to POP3 discouraged?

RFC Section 3 - Basic Operation
- eight case-insensitive 3-4 character command keywords (section 3)
- traditional CRLF line terminators
- single space separators
- arguments only up to 40 characters (!) - very short lines
- only two status indicators: +OK and -ERR (upper case)
    - no way to distinguish between temporary and permanent failure
    - no way to distinguish "not now" from "not implemented"
- multi-line responses terminated by a single period on a line
    - leading periods are doubled by the sender and the extra period must
      be removed by the receiver (like SMTP)
    - called "byte-stuffing" or "dot-stuffing" (Section 3 page 3)
- a state-oriented protocol AUTHORIZATION -> TRANSACTION -> UPDATE
    - must authenticate before issuing transactions
    - update happens only *after* the client sends QUIT
- MUST not time out before 10 minutes (section 3 page 4)
    - a time-out does not trigger an UPDATE - throws away updates

Q: T/F, unlike most Internet protocols, POP3 only requires LF on line ends.
Q: T/F, the POP protocol has different exit codes for temporary and
   permanent failures.
Q: How does the POP protocol handle multi-line server responses
   (e.g. when fetching a message)?
Q: What is meant by "dot-stuffing" or "byte-stuffing"?
Q: Name and describe what happens in each of the three states of a POP3
   connection.  What triggers the entry into each state?
Q: T/F, if a POP3 client drops the connection, the server skips the UPDATE
   phase.

Authorization/Authentication State (Section 4 page 4)
- each AUTHORIZATION method is optional, but you must use at least one (!)
- RFC defines cleartext USER and PASS or APOP methods
- RFC says "there is no single authentication mechanism that is required
  of all POP3 servers" (!)
  but Section 9 lists USER and PASS as "Minimal POP3 Commands", implying
  they are required
- APOP uses md5 and a shared secret - see p.16
    - you can calculate this digest in Linux via:
      $ echo -n '<1896.697170952@dbc.mtview.ca.us>tanstaaf' | md5sum
      c4c9334bac560ecc979e58001b3e22fb  -
- neither USER/PASS nor APOP encrypts the full connection...

Q: T/F, the USER and PASS POP commands set up an encrypted connection.

http://tools.ietf.org/html/rfc1734 - POP3 AUTH command
  "the client may request authentication types in decreasing order of
   preference, with the USER/PASS or APOP command as a last resort." (p.2)

  "A protection mechanism provides integrity and privacy protection to the
   protocol session.  If a protection mechanism is negotiated, it is
   applied to all subsequent data sent over the connection.  The protection
   mechanism takes effect immediately following the CRLF that concludes the
   authentication exchange for the client, and the CRLF of the positive
   response for the server.  Once the protection mechanism is in effect,
   the stream of command and response octets is processed into buffers of
   ciphertext.  Each buffer is transferred over the connection as a stream
   of octets prepended with a four octet field in network byte order that
   represents the length of the following data." (p.2)

- QUIT is also allowed in Authorization State (Section 4 p.5)

Q: How does POP3 "protection" affect data transfer between client and
   server?
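
A minimal sketch in Python (not part of the original notes) of the two
mechanisms just described: the APOP digest, which is the MD5 of the server's
timestamp banner followed by the shared secret, and the RFC 1734 framing in
which each ciphertext buffer is preceded by a four-octet length in network
byte order.  The function names apop_digest, frame and unframe are made up
for illustration; no real encryption is performed, the "ciphertext" is just
whatever bytes are passed in.

```python
import hashlib
import struct


def apop_digest(timestamp, secret):
    """MD5 hex digest of the timestamp banner followed by the secret."""
    return hashlib.md5((timestamp + secret).encode("ascii")).hexdigest()


def frame(ciphertext):
    """Prepend the buffer length as four octets in network byte order."""
    return struct.pack("!I", len(ciphertext)) + ciphertext


def unframe(data):
    """Split one length-prefixed buffer off the front of data.

    Returns (buffer, remaining_bytes).
    """
    (length,) = struct.unpack("!I", data[:4])
    return data[4:4 + length], data[4 + length:]


# Same input as the md5sum example in the notes.
print(apop_digest("<1896.697170952@dbc.mtview.ca.us>", "tanstaaf"))

wire = frame(b"abc") + frame(b"defgh")
first, rest = unframe(wire)
print(first, unframe(rest)[0])
```

The first line printed should match the md5sum output shown above, since
both hash the same string.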

SASL: Simple Authentication and Security Layer
- usable via the CAPA extension
  http://tools.ietf.org/html/rfc2449 (19 pages)
- see also: SASL use in SMTP
  http://tools.ietf.org/html/rfc2554 (11 pages)

Authorization State
- all authorization methods are optional, but one must be supported

Transaction State
- Must handle: STAT, LIST, RETR, DELE, NOOP, RSET, QUIT

Update State (can only be entered from Transaction State)
- entered *only* via QUIT, never by hangup or disconnect
- no commands

Section 8: Scaling and Operational Considerations
- people using POP stores as permanent message archives
  "When these facilities are used in this way by casual users, there has
   been a tendency for already-read messages to accumulate on the server
   without bound.  This is clearly an undesirable behavior pattern from
   the standpoint of the server operator.  This situation is aggravated by
   the fact that the limited capabilities of the POP3 do not permit
   efficient handling of maildrops which have hundreds or thousands of
   messages."

Q: T/F, POPmail scales well to handle hundreds or thousands of messages.

Section 11: Message Format
  "It is important to note that the octet count for a message on the
   server host may differ from the octet count assigned to that message
   due to local conventions for designating end-of-line."
- the size of the message in the file system may not match the size
  transmitted over the wire (especially for Unix/Linux systems)

Q: Give the minimal set of POP3 commands needed to retrieve and delete one
   message on a POP3 server.

=============================================================================

Protocols - Reading Mail - Internet Message Access Protocol (IMAP)
------------------------------------------------------------------
http://tools.ietf.org/html/rfc3501 (108 pages)
- RFC 1730 -> 2060 -> 3501
  updated by RFC 4466 (collected extensions)
  updated by RFC 4468 (CATENATE extension)
  updated by RFC 4551 (conditional STORE, etc.)

  "The Internet Message Access Protocol, Version 4rev1 (IMAP4rev1) allows a
   client to access and manipulate electronic mail messages on a server.
   IMAP4rev1 permits manipulation of mailboxes (remote message folders) in
   a way that is functionally equivalent to local folders.  IMAP4rev1 also
   provides the capability for an offline client to resynchronize with the
   server."

- requires any reliable data stream, e.g. TCP (TCP port 143)
- too many pages to read!

Q: T/F, both POPmail and IMAP permit remote folders.
Q: Why are most advances in reading email done through changes to IMAP
   rather than changes to POP3?