-------------------------
Week 12 Notes for CST8165
-------------------------
-Ian! D. Allen - idallen@idallen.ca

Remember - knowing how to find out an answer is more important than
memorizing the answer.  Learn to fish!  RTFM!  (Read The Fine Manual)

-------------------
INDEX to this file:
 - Java is now completely open source via GPL version 2
 - Review of midterm test answers
 - Changes to Lab 5 specification (require Last-Modified header)
 - Read the comment style example in programming_style.txt
 - Testing methods
 - Sniffing Browser HTTP Requests using nc instead of ethereal
 - RFC tools by IETF
 - Fetching a raw web page: wget
 - Mail Systems Terminology
 - Mail Transport History
 - Protocols - Post Office Protocol (to be continued...)
-------------------

News: Sun open-sources Java under the GPL on Sunday Nov 12, 2006 (yesterday)
    http://java.sun.com/
    http://news.google.ca/nwshp?ie=UTF-8&oe=UTF-8&hl=en&tab=wn&ncl=http://www.desktoplinux.com/news/NS3337915997.html
    http://community.java.net/javadesktop/
    http://news.bbc.co.uk/1/hi/technology/6144748.stm
    http://www.desktoplinux.com/news/NS3337915997.html
    http://www.eweek.com/article2/0,1895,2055770,00.asp?kc=EWNAVEMNL111306EOAD

Hand back midterm tests.
 - go over midterm test answers:
   http://teaching.idallen.com/cst8165/06f/notes/termtest2_answers.txt

Review
------

Q: How can I use nc to tell if an SMTP server is an "open relay"?
 - see last week's notes on "open relay"

    $ nc -v localhost smtp
    EHLO somedomain.ca
    MAIL FROM:<xxx>
    RCPT TO:<yyy>

 - connect to the server and see if you can use the server to send
   yourself an email (where "xxx" and "yyy" are both addresses that are
   foreign to the network on which the SMTP server resides)

SMTP MX records
---------------

Q: How does a mail client know to which SMTP server to connect when
   sending mail to a userid?
An SMTP client queries the DNS for a domain to obtain "MX" (mail exchange)
records that tell which machines accept mail for the domain:

    $ host hotmail.com
    hotmail.com has address 64.4.32.7
    hotmail.com has address 64.4.33.7
    hotmail.com mail is handled by 5 mx2.hotmail.com.
    hotmail.com mail is handled by 5 mx3.hotmail.com.
    hotmail.com mail is handled by 5 mx4.hotmail.com.
    hotmail.com mail is handled by 5 mx1.hotmail.com.

    $ host idallen.ca
    idallen.ca has address 72.18.159.15
    idallen.ca mail is handled by 0 idallen.ca.

Review changes to Lab 5 - Last-Modified: and Date:
--------------------------------------------------

    http://teaching.idallen.com/cst8165/06f/notes/lab05.txt
    http://teaching.idallen.com/cst8165/06f/notes/test_out3.txt

Comment style
-------------

http://teaching.idallen.com/cst8165/06f/notes/programming_style.txt
 - see the pair of example programs at the end of the file
 - if you wish to use an alternate commenting and indenting style, please
   provide me with a link to it and we'll discuss it
 - I'm open to you using any popular real-world programming style;
   I don't want you inventing your *own* style

Testing - black box vs. white box, "behavioral" vs. "structural"
-------

 - I don't have time to read and test all your code; you have to do it

http://www.scism.sbu.ac.uk/law/Section5/chap3/s5c3p23.html

   "White box testing is concerned only with testing the software product,
   it cannot guarantee that the complete specification has been
   implemented.  Black box testing is concerned only with testing the
   specification, it cannot guarantee that all parts of the implementation
   have been tested.  Thus black box testing is testing against the
   specification and will discover faults of omission, indicating that
   part of the specification has not been fulfilled.  White box testing is
   testing against the implementation and will discover faults of
   commission, indicating that part of the implementation is faulty.
   In order to fully test a software product both black and white box
   testing are required."
http://www.faqs.org/faqs/software-eng/testing-faq/section-13.html

   "One has to use a mixture of different methods so that they aren't
   hindered by the limitations of a particular one.  Some call this
   "gray-box" or "translucent-box" test design, but others wish we'd stop
   talking about boxes altogether."

Looking at Lab 5 white-box:
    http://www.brics.dk/ixwt/examples/FileServer.java
 - what tests exercise every line of code, especially each of the
   exceptions?

Q: What is the difference between white-box and black-box testing of a
   piece of code?  Give the advantages and disadvantages of each method,
   especially with regard to testing the specification.

Sniffing Browser HTTP Requests
------------------------------

To see what lines a browser sends to an HTTP server, you can use Ethereal
and trace a session; or, for a quick dump, just use netcat on a spare port
(e.g. 55555) and have the browser access http://localhost:55555/foobar :

    [ Start a fake HTTP server on a spare port, e.g. 55555 : ]
    $ nc -v -l -p 55555 localhost      # Debian/Ubuntu
    $ nc -v -l localhost 55555         # RedHat/Mandrake
    listening on [any] 55555 ...
    [ Start up your browser and connect to http://localhost:55555/foobar : ]
    connect to [127.0.0.1] from localhost [127.0.0.1] 40757
    GET /foobar HTTP/1.1
    Host: localhost:55555
    User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20060216 Debian/1.7.12-1.1ubuntu2
    Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
    Accept-Language: en-ca,en-us;q=0.9,en-gb;q=0.7,en;q=0.6,fr-ca;q=0.4,fr-fr;q=0.3,fr;q=0.1
    Accept-Encoding: gzip,deflate
    Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
    Keep-Alive: 300
    Connection: keep-alive

    [ At this point, you can type back a server reply to the browser : ]
    HTTP/1.1 200 this is my reply to the browser
    Content-Type: text/plain

    ab cd ef gh
    ^C    (interrupt)

 - your browser will show the above text

Programming an HTTP client
--------------------------

java.net references:
    http://java.sun.com/j2se/1.5.0/docs/api/java/net/package-summary.html

    URI uri = new URI("http://java.sun.com/");
    URL url = uri.toURL();
    InputStream in = url.openStream();

HTTP via java class:
    http://java.sun.com/javase/6/docs/api/java/net/URLConnection.html
    http://java.sun.com/j2se/1.5.0/docs/api/java/net/HttpURLConnection.html
 - obsolete reference to rfc2068 should now be rfc2616
 - not everyone is happy with java.net.HttpURLConnection:
   http://www.oaklandsoftware.com/product_http/overview.html
 - an alternate class (2001):
   http://www.innovation.ch/java/HTTPClient/urlcon_vs_httpclient.html

Sun tutorial on URL reading/writing
    http://java.sun.com/docs/books/tutorial/networking/urls/index.html
    http://java.sun.com/docs/books/tutorial/networking/urls/readingWriting.html
 - note the need to explicitly tell the URLConnection object that we want
   to write on the object using: connection.setDoOutput(true)

RFC tools by IETF
-----------------

http://tools.ietf.org/
 - html cross-linked pages - http://tools.ietf.org/html/
 - reading tools - Firefox plugin
 - difference tools - wdiff (word diff)
 - verification tools - ABNF to regexp
   converter

Fetching a raw web page: wget
-----------------------------

    wget http://idallen.com/
    wget -O output_file -S http://idallen.com/
    wget -O output_file --save-headers http://idallen.com/
    wget --header="Host: teaching.idallen.com" http://idallen.com/

Mail Systems Terminology
------------------------

 - common misconception: the place/protocol you use to fetch your email
   is the same place/protocol that you use to send your email
   - sending email uses SMTP
   - reading email uses POP3 or IMAP
   - they can be completely separate

http://wiki.mutt.org/?MailConcept

Q: T/F, unlike POP3, SMTP can be used to both send and receive email.

Q: T/F, unlike SMTP, POP3 can be used to both receive and send email.

 - may be completely different servers
 - though note POP-before-SMTP (SMTP-after-POP) requires coupling:
   http://tools.ietf.org/html/rfc2476 (section 3.3)

   "Requiring a POP [POP3] authentication (from the same IP address)
   within some amount of time (for example, 20 minutes) prior to the start
   of a message submission session has also been used, but this does
   impose restrictions on clients as well as servers which may cause
   difficulties.  Specifically, the client must do a POP authentication
   before an SMTP submission session, and not all clients are capable and
   configured for this.  Also, the MSA must coordinate with the POP
   server, which may be difficult.  There is also a window during which an
   unauthorized user can submit messages and appear to be a prior
   authorized user."

Q: Describe briefly how POP-before-SMTP works to authenticate an
   SMTP session.

MSA - Mail Submission Agent
   http://tools.ietf.org/html/rfc2476
   "acts as a submission server to accept messages from MUAs, and either
   delivers them or acts as an SMTP client to relay them to an MTA."
 - enforce policy (no open relay)
 - enforce standards (no forged headers, etc.)
 - enforce filtering (SpamAssassin, etc.)
 - may modify messages (section 8 of RFC)
   http://en.wikipedia.org/wiki/List_of_mail_servers#Mail_filtering

Q: Briefly describe the function of a mail system MSA.

MTA - Mail Transfer Agent (mail server, mail exchange server)
   "A process which conforms to [SMTP-MTA], which acts as an SMTP server
   to accept messages from an MSA or another MTA, and either delivers them
   or acts as an SMTP client to relay them to another MTA."

   http://en.wikipedia.org/wiki/Mail_transfer_agent
   "It receives messages from another MTA (relaying), a mail submission
   agent (MSA) that itself got the mail from a mail user agent (MUA), or
   directly from an MUA, thus acting as an MSA itself.  The MTA works
   behind the scenes, while the user usually interacts with the MUA.
   The delivery of e-mail to a user's mailbox typically takes place via a
   mail delivery agent (MDA); many MTAs have basic MDA functionality built
   in, but a dedicated MDA like procmail can provide more sophistication."
 - transfers email between machines (other MTAs) via SMTP
 - Internet-facing, open ports: security issues
 - sendmail, postfix, qmail, exim
   http://en.wikipedia.org/wiki/List_of_mail_servers#SMTP

Q: Briefly describe the function of a mail system MTA.

MDA - Mail Delivery Agent
   http://en.wikipedia.org/wiki/Mail_delivery_agent
   "A Mail Delivery Agent (MDA) is software that accepts incoming e-mail
   messages and distributes them to recipients' individual mailboxes (if
   the destination account is on the local machine), or forwards back to
   an SMTP server (if the destination is on a remote server).  A mail
   delivery agent is not necessarily a mail transfer agent (MTA), although
   on many systems the two functions are implemented by the same program."
 - Unix/Linux: /bin/mail, procmail

Q: Briefly describe the function of a mail system MDA.
MRA/MAA - Mail Retrieval Agent / Mail Access Agent
   http://tools.ietf.org/html/rfc1939 - POP3 port 110
   http://tools.ietf.org/html/rfc3501 - IMAP-V4-R1 port 143
 - often built-in to mail clients (MUAs)
 - can be stand-alone
   - e.g. fetchmail gets the mail; MUA reads mail from file system

Q: Briefly describe the function of a mail system MRA/MAA.

MUA - Mail User Agent (email client)
 - the user's interface to the protocols
 - usually gives access to functionality of both MTA and MRA/MAA
   - but may not itself implement any protocols (may read/write file
     system)

   http://en.wikipedia.org/wiki/Mail_user_agent
   "An e-mail client, also called a Mail User Agent (MUA), is a computer
   program that is used to read and send e-mail.  Originally, the MUA was
   intended to be a simple program to read the user's mail messages, which
   the mail delivery agent (MDA) in conjunction with the mail transfer
   agent (MTA) would transfer into a local mailbox.  The most important
   mailbox formats are mbox and Maildir.  These rather simple protocols
   for locally storing e-mails make import, export and backup of
   mailfolders quite easy.  E-mails to be sent would be handed over to the
   MTA, perhaps via a mail submission agent, therefore an MUA would not
   have to provide any transport-related functions.
   *Since the various Microsoft Windows versions intended for home use never
   *provided an MTA, most modern MUAs have to support protocols like POP3
   *and Internet Message Access Protocol (IMAP) to communicate with a remote
   *MTA located at the e-mail providers machine."
 - user-visible email clients of all descriptions
   - mutt, "mail", "Mail", "mailx", pine, elm
   - KMail, Eudora, MS Outlook
   - web-browser email (Netscape Messenger, Mozilla, Thunderbird)
   - webmail, Horde, SquirrelMail
   http://en.wikipedia.org/wiki/List_of_mail_servers#POP.2FIMAP

Q: Briefly describe the function of a mail system MUA.
Mail server comparison
----------------------

http://en.wikipedia.org/wiki/List_of_mail_servers
 - see comparison near bottom

 - PUSH protocols - sending email: MTA - SMTP
 - PULL protocols - reading email: MRA/MAA - POP3, IMAP

Single-user PCs often don't run separate MTA or MRA/MAA programs.
Your choice of mail reader (e.g. Pine, Elm, Outlook) itself PULLs your
incoming email from a remote server (acting as an MRA/MAA) and then PUSHes
your outgoing email to the remote server (acting as an MTA).

Q: What is the difference between a PUSH protocol and a PULL protocol?

Q: T/F, SMTP is a PUSH protocol.

Q: T/F, POP3 is a PUSH protocol.

Q: T/F, HTTP is a PUSH protocol.

A History of MTAs
-----------------

Q: Unix/Linux mail user agents didn't need to know how to talk to SMTP
   servers - you never had to configure your "outgoing mail" preferences.
   All the Windows MUAs need to be configured with a mail server.  Why?

I. Incoming - delivering your incoming email via SMTP:

 * Sending email into Unix/Linux machines:

   Unix/Linux was traditionally multi-user and ran its own MTA
   (e.g. sendmail) that accepted incoming SMTP connections.  Remote systems
   could use SMTP to drop off your email with your local MTA (sendmail),
   and the MTA would hand the email to an MDA (/bin/mail, procmail) to put
   it in your mailbox in the local file system.  Your MUA
   (e.g. /usr/ucb/Mail) would read the mail from your inbox (no need for
   POP3 or IMAP in any MUA).  There are a few different conventions for
   inbox formats so that many different MUAs can read your email, all
   without knowing POP or IMAP.
   - sendmail (running as root!) has had many security patches
   - the first Morris Internet worm (Nov 1988) used sendmail security holes
     - http://en.wikipedia.org/wiki/Morris_worm

   Q: Why don't many Unix MUAs need to know how to run POP or IMAP?

   Current single-user Unix/Linux PCs often have a local-only MTA that
   handles the sending and delivery of local on-machine email but doesn't
   accept SMTP from off-site.
   (Best to keep ports closed on Internet-facing machines!)

   On recent single-user Unix/Linux workstations, the MUAs mimic their
   Windows counterparts and include MRA/MAA features.  Your chosen MUA
   (e.g. Elm, Pine, Mutt) is responsible for fetching your email via POP3
   or IMAP (this is an MRA/MAA function); or, you use an intermediate
   MRA/MAA program such as "fetchmail" and your MUA reads the mail out of
   the local file system after the MRA/MAA has put it there.
   - no Internet-facing MTA means fewer open ports and fewer attacks
   - don't run an Internet-facing MTA if you don't need it

 * Sending email into MS Windows machines (or not):

   Windows had (has?) no MTA - you can't send an email to a Windows PC
   using SMTP.  Your personal MUA has to fetch the email itself via POP3
   or IMAP and keep a copy in the local file system.
   - no open ports for incoming email; no open port security issues

 * Note that MUAs that implement POP/IMAP typically store the email in
   the local file system in a format that only that MUA can handle.
   The concept of a common inbox format usable by different MUAs was lost.

   Q: T/F, the standards for inbox formats developed under Unix were
      adopted by MUAs on PCs, so that different MUAs can read the same
      inbox.

II. Outgoing - sending your outgoing email via SMTP:

 * Unix/Linux machines have traditionally each had their own MTA (sendmail)
   that could directly deliver email on the Internet using MX record
   lookup.  Every local MUA would put email into a directory where the MTA
   (sendmail) would eventually pick it up and transfer it, retrying as
   necessary.  No MUA needed to know how to do SMTP; only the MTA did that.

   You could optionally tell your machine's MTA not to send mail directly
   to its destination via SMTP over the Internet, but to use a remote
   "smart" MTA that could accept your outgoing email and figure out how to
   deliver it.  (You have to use such a "smart" host here at Algonquin,
   since you cannot connect to any off-campus SMTP servers.)
   The MTA on your machine would use SMTP to drop off the queued mail at
   the smart host, and the smart host would do the MX record lookup and
   final SMTP delivery.

   Since the local Unix MTAs were separately scheduled programs, you could
   queue email from a MUA into the file system even when your machine was
   not connected to the Internet.  The MUA or local MTA would queue up your
   email in the file system until your MTA was finally able to make a
   connection to deliver it off-machine.  (In the days of modems, the
   Internet connection was often made late at night when rates were lower.)

   Q: Why don't most Unix MUAs need to know SMTP?

   Current single-user Unix/Linux PCs now have MUAs that mimic their
   Windows counterparts - the MUAs ignore the file system and the local MTA
   and expect you to give the name of a remote "smart" MTA to which all
   email will be sent via SMTP for actual delivery.

   The Algonquin Linux lab has both types of mail systems:  Command-line
   email (e.g. the "mail" command) queues up mail for the local MTA
   (sendmail) to send.  (This is currently broken.)  GUI MUAs
   (e.g. Thunderbird, Mozilla) ignore the local file system and the local
   MTA and use a "smart" remote MTA (e.g. outmail.algonquincollege.com) to
   deliver the mail.  (This supposedly still works.)

 * MS Windows has no local MTA - no program exists whose job it is just to
   deliver queued email.  Each MUA has to know how to do its own SMTP
   connection and each MUA has to be configured (separately!) with the
   address of a smart MTA to which it connects.

   MUAs on Windows machines all contain networking code to drop off email
   at some "smart" MTA that does the actual delivery.  There is no local
   MTA queue and much duplication of SMTP code in all the MUAs.

   On Windows, it is up to each MUA to deal with what happens if the
   message being composed can't be dropped off right away at the remote
   smart MTA.  Better MUAs will queue the email for later transmission.
   Poor MUAs will tell you that your mail can't be sent.
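
   To make "drop off email at a smart MTA" concrete, here is a minimal
   sketch of the lines an MUA might send during one SMTP drop-off.  It is
   only an illustration: the class name SmtpDropOff, the method
   clientLines, and all host names and addresses are made up for this
   example.  It builds only the client side of the conversation; a real
   MUA must open a socket to the smart host, end each line with CRLF, and
   check the server's reply code after every command before continuing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch (not from any lab): the client-side lines of one
// SMTP drop-off at a "smart" MTA.  Server replies are not handled here.
class SmtpDropOff {

    // Build the command lines for a single message to a single recipient.
    static List<String> clientLines(String myName, String from, String to,
                                    List<String> message) {
        List<String> out = new ArrayList<>();
        out.add("EHLO " + myName);
        out.add("MAIL FROM:<" + from + ">");
        out.add("RCPT TO:<" + to + ">");
        out.add("DATA");
        for (String line : message) {
            // A message line that starts with "." gets an extra "." so it
            // cannot be mistaken for the end-of-message marker.
            out.add(line.startsWith(".") ? "." + line : line);
        }
        out.add(".");       // a single period on a line ends the message
        out.add("QUIT");
        return out;
    }

    public static void main(String[] args) {
        // All names and addresses below are placeholders.
        List<String> message = Arrays.asList(
            "Subject: test", "", "Hello from my MUA.");
        for (String line : clientLines("mypc.example.com",
                "me@example.com", "you@example.org", message)) {
            System.out.println(line);
        }
    }
}
```

   Every Windows MUA carries its own copy of logic like this (plus the
   socket and error handling), which is the duplication described above.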
   Q: Why do MUAs on Windows all need to know how to talk SMTP?

Protocols - Reading Mail - Post Office Protocol
-----------------------------------------------

http://tools.ietf.org/html/rfc1939 (23 pages)
http://tools.ietf.org/html/rfc2449 "CAPA extension"

 - version 3: RFC 1081 -> 1225 -> 1460 -> 1725 -> 1939
   updated by RFC 1957 (one page observation RTFM!) and 2449 (extensions)
 - on extending POP3 (RFC 2449 intro and section 7):

   "This extension to the POP3 protocol is to be used by a server to
   express policy descisions taken by the server administrator.  It is not
   an endorsement of implementations of further POP3 extensions generally.
   It is the general view that the POP3 protocol should stay simple, and
   for the simple purpose of downloading email from a mail server.  If
   more complicated operations are needed, the IMAP protocol [RFC 2060]
   should be used.

   Future extensions to POP3 are in general discouraged, as POP3's
   usefulness lies in its simplicity.  POP3 is intended as a
   download-and-delete protocol; mail access capabilities are available in
   IMAP [IMAP4].  Extensions which provide support for additional
   mailboxes, allow uploading of messages to the server, or which deviate
   from POP's download-and-delete model are strongly discouraged and
   unlikely to be permitted on the IETF standards track.

   Clients MUST NOT require the presence of any extension for basic
   functionality, with the exception of the authentication commands"

Q: Why are extensions to POP3 discouraged?

 - case-insensitive 3-4 character command keywords (section 3)
 - traditional CRLF line terminators
 - single space separators
 - arguments only up to 40 characters (!)
 - very short lines
 - multi-line responses terminated by a single period on a line
   - a leading period added by byte-stuffing is removed (like SMTP)
 - state-oriented protocol
   AUTHORIZATION -> TRANSACTION -> UPDATE
 - a server inactivity timer MUST be at least 10 minutes (section 3)
   - a time-out does not trigger an UPDATE

Q: T/F, unlike most Internet protocols, POP3 only requires LF on line ends.

Q: Name and describe what happens in each of the three states of a
   POP3 connection.  What triggers the entry into each state?
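
The multi-line response rule above (read until a line holding a single
period; strip one leading period from any other line that starts with one)
can be sketched in a few lines of Java.  This is only an illustration under
stated assumptions: the class name Pop3MultiLine and the method readBody
are made up for this example, and the reader is fed from a string here
instead of a real socket to a POP3 server.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not from any lab): decodes the body of a POP3
// multi-line response following the rules in RFC 1939 section 3.
class Pop3MultiLine {

    // Collect lines up to, but not including, the terminating "." line.
    static List<String> readBody(BufferedReader in) throws IOException {
        List<String> lines = new ArrayList<>();
        String line;
        while ((line = in.readLine()) != null) {   // readLine strips CRLF
            if (line.equals(".")) {
                return lines;                      // end of the response
            }
            if (line.startsWith(".")) {
                line = line.substring(1);          // undo byte-stuffing
            }
            lines.add(line);
        }
        throw new IOException("connection closed before the final '.' line");
    }

    public static void main(String[] args) throws IOException {
        // Sample data standing in for what a server would send after the
        // "+OK" status line of a RETR command.
        String wire = "Subject: test\r\n\r\n..starts with a dot\r\nbye\r\n.\r\n";
        BufferedReader in = new BufferedReader(new StringReader(wire));
        for (String s : readBody(in)) {
            System.out.println(s);
        }
    }
}
```

In a real client the BufferedReader would wrap the socket's input stream,
and readBody would be called only after a "+OK" status line for a command
that has a multi-line response (e.g. LIST, RETR, TOP).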