-------------------------
Week 14 Notes for CST8165
-------------------------
 - Ian! D. Allen - idallen@idallen.ca - www.idallen.com

Remember - knowing how to find out an answer is more important than
memorizing the answer.  Learn to fish!  RTFM!  (Read The Fine Manual)

-------------------
INDEX to this file:
 - Current Draft Protocols - Stopping SPAM (...continued)
   - SPF, SRS, Sender ID, CSV, BATV
   - DNS-based IP block lists - DNSBL
   - SpamAssassin RBL checking
 - Protocols - Telnet
 - Domain Name System - DNS
-------------------

(continued from last week)

EMail authentication proposals:
------------------------------

** SPF - sender policy framework
  http://www.openspf.org/
  http://new.openspf.org/Introduction  - how it works
  http://en.wikipedia.org/wiki/Sender_Policy_Framework

  - domains publish "txt" records in the DNS listing the valid IP
    addresses that are allowed to send email for that domain
    (some say this is an abuse of the purpose of the txt record)
    - these are called "SPF" records, e.g. "host -t txt idallen.ca"
      TXT "v=spf1 ip4:66.11.175.96/30 ?all"
      - IP address block 66.11.175.96/30 can send mail FROM idallen.ca
  - recipients of email can check the IP address of the connection and
    the envelope FROM address against the published SPF records
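
A minimal Python sketch of the recipient-side check described above,
assuming we only care about "ip4:" mechanisms (a real SPF checker must
also handle a, mx, ptr, include, redirect and the qualifiers).  The
function names are made up for illustration; the record is the
idallen.ca example from these notes.

```python
import ipaddress

def spf_ip4_networks(record):
    """Return the networks named by ip4: mechanisms in an SPF TXT record."""
    networks = []
    for term in record.split():
        # Drop a leading qualifier such as "?" or "~" if one is present.
        mechanism = term.lstrip("+-?~")
        if mechanism.startswith("ip4:"):
            networks.append(ipaddress.ip_network(mechanism[4:], strict=False))
    return networks

def ip_is_authorized(client_ip, record):
    """True if client_ip falls inside any ip4: block of the record."""
    address = ipaddress.ip_address(client_ip)
    return any(address in net for net in spf_ip4_networks(record))

RECORD = "v=spf1 ip4:66.11.175.96/30 ?all"
print(ip_is_authorized("66.11.175.97", RECORD))   # inside the /30 block
print(ip_is_authorized("66.11.175.100", RECORD))  # outside the /30 block
```

A /30 block covers four addresses (here .96 through .99), so the first
address matches and the second does not.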

Q: How does Sender Policy Framework (SPF) help validate incoming email?

  http://www.msexchange.org/tutorials/Sender-Policy-Framework.html
  http://www.msexchange.org/tutorials/Sender-Policy-Framework.html?printversion
   "If a spammer legitimately has an account in that domain, or owns
    the domain, they can still send email. This is a real problem so
    that some experts expect a massive growth in the registration of
    one-way domains for spammers to go around SPF and other
    techniques. But this is not a problem concerning the SPF
    implementation - it is a problem in general.

Q: Why doesn't SPF guarantee that the incoming email is not spam?

Big problem with SPF:
 - Forwarded mail fails the SPF check, since forwarding changes the IP
   address of the mail sender to one not authorized by the SPF records...

Q: What is the major problem with the SPF verification system?

 - The SPF people came up with a "Sender Rewriting Scheme" that would
   rewrite the envelope FROM addresses and permit forwarding; but,
   it breaks a long-standing tradition that MTAs must *never* modify
   envelope FROM addresses:

  http://lwn.net/Articles/187736/   - June 2006
   "The SPF folks have suggested solutions for these problems, but many of
    them require fundamental changes in how MTAs operate. The Sender Rewriting
    Scheme (SRS) proposal in particular breaks longstanding email tradition
    by having forwarding MTAs change the envelope sender as they forward
    email. Opponents of SPF not only argue that changing this tradition is
    a bad idea, but also that it is very unlikely to be widely implemented
    any time soon. Additionally, Mail User Agents (MUAs) would need to learn
    about SRS encoding in order to parse sender addresses for filtering
    email at the user end.

Q: What part of the SMTP protocol does SRS change?
Q: What is the major objection/impediment to adopting SPF with SRS?

  http://homepages.tesco.net/~J.deBoynePollard/FGA/smtp-spf-is-harmful.html
   "SPF is harmful. The architectural ramifications of it are so
    extensive and will have such significant changes on the ways that
    people can access and can use Internet mail, that it would
    actually be less costly to switch to an entirely new architecture
    such as IM2000 Internet mail than it would be to switch to SPF and
    deal with all of its consequences properly.  [...]  Most people
    who have analysed SPF in detail have come to the conclusion that
    it is a deeply flawed scheme that should be avoided outright.
    [...]
    It's trivial for malicious senders to create throwaway domain
    names with published DNS data, that they supply, declaring any
    SMTP Relay clients that they like to be (as far as SPF is
    concerned) "legitimate". And there's no shortage of such domain
    names to use once and then throw away.

Q: T/F, if an email passes the SPF test (the IP matches the FROM domain's
   SPF record), you know the message isn't spam.

  http://www.w3.org/Mail/spf/
   "Due to issues with SPF and mail forwarding, we intend to leave our
    SPF record in this state for the forseeable future, so our record
    is useful mainly for whitelisting. (mail with an 'SPF pass' status
    from w3.org is most likely legitimate, but other mail can be
    subject to more scrutiny, e.g. using heuristic-based filters.)

Q: T/F, if an email with an envelope address from a reputable domain
   (e.g. w3.org) passes the SPF test (the IP matches the envelope FROM
   domain's SPF record), you can be reasonably sure the FROM address is valid.

  - finding SPF records:

    $ host -t txt gmail.com
    gmail.com descriptive text "v=spf1 redirect=_spf.google.com"

Q: Give an example of a command-line that looks up the SPF record for
   domain foo.com.

    $ host -t txt _spf.google.com
    _spf.google.com descriptive text "v=spf1 ip4:216.239.56.0/23 ip4:64.233.160.0/19 ip4:66.249.80.0/20 ip4:72.14.192.0/18 ?all"

Q: In an SPF record, what does an IP address and net mask mean, e.g.
   "v=spf1 ip4:216.239.56.0/23"?

    $ dig dyndns.org txt
    [...]
    ;; ANSWER SECTION:
    dyndns.org.  86400 IN TXT "v=spf1 mx/24 a:lists.dyndns.com ip4:63.208.196.0/24 ptr:opensrs.net include:outbound.mailhop.org ~all"

  ALSO: aol.com, microsoft.com, hotmail.com, dyndns.org, LiveJournal.com,
  OReilly.com, SAP.com, Spamhaus.org, Symantec.com, Ticketmaster.com, w3.org

  http://new.openspf.org/SPF_vs_Sender_ID
  - comparison of the two approaches
   "SPF can be compared to other SMTP layer protocols like CSV/CSA.
    Sender ID can be compared to other RFC 2822 layer protocols like
    DomainKeys IM (DKIM).

Q: T/F, SPF operates at the SMTP protocol level (RFC2821) and does not
   change the email message body.
Q: T/F, Sender-ID operates on the Internet Message level (RFC2822).

** SRS - Sender Rewriting Scheme for SPF forwarding
  http://www.openspf.org/srs.html
  http://www.openspf.org/srspng.html  - how it works (messy!)
  - needed to pass SMTP FROM address through SPF forwarders
  - very invasive! Handling SRS needs MUA/MTA rewrites...
  - same goals as Sender ID but does not alter message body

Q: T/F, a serious impediment to the full adoption of SPF is the need
   to rewrite MUAs/MTAs to handle the SRS rewriting of envelope addresses.

** Sender ID = Caller ID + SPF (Microsoft patent issues)

  http://www.silicon.com/research/specialreports/thespamreport/0,39025001,39131378,00.htm
   "However, this Microsoft effort to push adoption of Sender ID is
    likely to fail, certainly with such a short deadline, said
    Jonathan Penn, an analyst at Forrester Research. "Hotmail is in no
    position to dictate that organisations adopt Sender ID," he said.

  http://www.circleid.com/posts/sender_id_a_tale_of_open_standards_and_corporate_greed_part_i/
   "Now forward twenty years to 2002: John Postel passed away, the
    Internet standards process is now done by the IETF under ISOC instead
    of DOD, Apple and IBM market shares are almost nil, the Soviet Union
    has collapsed, Linux and open source software now run majority of
    the Internet infrastructure, and the phrase "Evil Empire" is now
    used to refer to Microsoft. But the original two documents defining
    the email system of what has become a worldwide network now simply
    called "the Internet" are still here. Every single email message sent
    today still follows the original guidelines and format set out over
    20 years ago by two scientists working on behalf of the US Government.

  http://itmanagement.earthweb.com/columns/executive_tech/article.php/3444571
  - December 2004 comment on CSV alternative to Sender ID or Domain Keys:
   "Levine quotes from the text of the applications to show that
    Microsoft claims not just patent rights on anything similar to Sender
    ID, but also on spam filters that compute scores based on the content
    of messages. That's not the kind of patent that standards bodies
    have ever wanted anyone to have on an Internet protocol.

Q: T/F, the adoption of Sender ID is impeded by Microsoft patent issues.

** CSV - "Certified Server Validation"
  http://mipassoc.org/csv/
  - was called "Client SMTP Validation" 
    "CSV originally stood for "Client SMTP Validation". However,
     market(ing) feedback suggested that "Certified Server Validation"
     is more useful to folks who are trying to understand the nature of
     the service, without requiring that they be email geeks...
  
  http://mipassoc.org/csv/CSV-Comparison.html
  - compare CSV with SPF and Sender-ID (CSV is local and simpler)

  http://mipassoc.org/csv/draft-ietf-marid-csv-intro-02.html (expired Aug 05)
  http://tools.ietf.org/html/draft-crocker-csv-csa-00 (expired April 06)
  - drafts submitted to IETF (expired)
   "This specification defines a mechanism to permit session-time
    verification that a connecting SMTP client is authorized to request
    service as a mail transfer client.  The mechanism uses a DNS SRV
    [RFC2782] record as a basis for verifying that the associated domain
    name is authorized to act as an SMTP client.  The mechanism is small,
    simple and useful.

  http://itmanagement.earthweb.com/columns/executive_tech/article.php/3444571
  - article Dec 2004 - Hello 'Certified Server,' Goodbye Spam

Q: T/F, CSV can only certify the currently connecting SMTP client; it
   is not a complicated end-to-end validation system.

** DNA - "Domain Name Accreditation"
  http://tools.ietf.org/html/draft-ietf-marid-csv-dna-02  (expired Aug 05)
  http://tools.ietf.org/html/draft-ietf-marid-csv-intro-02  (expired Aug 05)

** BATV: Bounce Address Tag Validation
  http://mipassoc.org/batv/

** Compatible Low-level Email Authentication and Responsibility (CLEAR)
  http://mipassoc.org/clear/index.html
  - combine CSV and BATV

** DKIM: DomainKeys Identified Mail
  http://www.dkim.org/
  http://www.dkim.org/info/dkim-faq.html
  http://www.dkim.org/ietf-dkim.htm   - IETF submission
  - DKIM is claimed to be an upgrade of Yahoo's DomainKeys; DKIM was
    produced by an industry consortium in 2004. It merged and enhanced
    DomainKeys, from Yahoo! and Identified Internet Mail, from Cisco.
  - DKIM places its parametric information into RFC2822 header fields
    that are typically not shown to the recipient. Therefore DKIM can
    be entirely invisible to recipients.

Q: Does DKIM operate at the SMTP protocol level or at the Internet
   Message level?

SPAM solutions response form (checklist):
  http://craphound.com/spamsolutions.txt

Q: T/F, The current state of RFCs for spam filtering is: Let the market decide.
   (In 2006 there is no agreed-upon single standard for what is best.)

=============================================================================

DNS-based IP block lists - DNSBL
--------------------------------

http://en.wikipedia.org/wiki/DNSBL

http://www.spamhaus.org/dnsbl_function.html
   "A DNSBL (commonly referred to as a 'Blocklist') is a database
    served as a DNS Zone able to be queried in realtime by Internet
    mail servers for the purpose of obtaining an opinion on the
    origin of incoming email. The role of a DNSBL such as Spamhaus'
    SBL/XBL/PBL Advisory system is to provide an opinion, to anyone
    who asks, on whether a particular IP Address meets Spamhaus' own
    policy for acceptance of inbound email.

    Every Internet network that chooses to implement spam filtering
    is, by doing so, making a policy decision governing acceptance
    and handling of inbound email. The Receiver unilaterally makes
    the choices on whether to use DNSBLs, which DNSBLs to use, and
    what to do with an incoming email if the email message's
    originating IP Address is "listed" on the DNSBL. The DNSBL
    itself, like all spam filters, can only answer whether a
    condition has been met or not.

    Spamhaus does not tell a 3rd-party mail system what to do with an
    item of email, the 3rd-party mail system asks Spamhaus for an
    opinion and Spamhaus responds to that request with its opinion.
    In effect the receiving mail server asks the Spamhaus DNSBL "Does
    this Sender's IP Address exist on the Spamhaus database?", the
    Spamhaus DNSBL simply responds with a "Yes" if present or, if not
    present does not respond at all (no response means "we have no
    opinion on that IP Address").

http://www.de.sorbs.net/lookup.shtml
   "Since those initial 78,000 proxies, the SORBS DNSBL has grown to
    an astounding 3 million listed hosts (that's less than 0.07% of
    the possible addresses on the internet - statistics correct as
    of June 2004). SORBS has also expanded over time to include
    hacked and hijacked servers, formmail scripts, trojan
    infestations (particularly those with backdoors), and more
    recently made the move to pre-emptively list all dynamically
    allocated IP address space. 

   "The SORBS DNSBL is just list of numbers, nothing more, nothing
    less. The significance of these numbers is that they are related
    to hosts on the Internet whose condition/settings have included
    the particular vulnerabilities which we seek to eliminate, i.e.
    open relays, open proxies, etc.
    As a prospective user of the SORBS lists the most important
    question you need to ask yourself is: Do I understand the listing
    criteria for the list(s) I plan to use? 

Q: What does DNSBL stand for and how is a DNSBL implemented?
Q: T/F, DNSBLs actually block IP addresses.
Q: T/F, DNSBLs only offer opinions about IP addresses, they can't
   actually block anything.

Problems:
  http://en.wikipedia.org/wiki/DNSBL
   "Additionally, it may be tricky to get a mistakenly listed IP
    address removed. For example, to request removal from the DUL
    provided by dynablock.njabl.org, you are supposed to send an email
    to removals at mail.njabl.org [3] but that address is in turn
    protected by the same DUL you are asking to be removed from.

Q: Why might it be hard to get your IP address out of a DNSBL?

Comparison:
  http://en.wikipedia.org/wiki/Comparison_of_DNS_blacklists

How do you actually query the DNS?
----------------------------------

Web based interface:
  http://www.ioncannon.net/dnsbl/
  Try 81.66.180.11

http://en.wikipedia.org/wiki/DNSBL
  See "DNSBL Queries
    When a mail server receives a connection from a client, and wishes
    to check that client against a DNSBL (let's say,
    spammers.example.net), it does more or less the following:

http://sorbs.net/
http://www.dnsbl.us.sorbs.net/using.shtml   (return codes)
    $ host 11.180.66.81.dnsbl.sorbs.net
    11.180.66.81.dnsbl.sorbs.net has address 127.0.0.10

    $ host -t txt 11.180.66.81.dnsbl.sorbs.net
    11.180.66.81.dnsbl.sorbs.net descriptive text "Dynamic IP Addresses See: http://www.sorbs.net/lookup.shtml?81.66.180.11"
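
The query name used by "host" above can be built mechanically: reverse
the octets of the IP address and append the DNSBL zone.  A minimal
Python sketch follows; the function names are made up for illustration,
and dnsbl_lookup() needs network access and assumes the list answers
with a 127.0.0.x address when the IP is listed.

```python
import socket

def dnsbl_query_name(ip, zone):
    """Reverse the IPv4 octets and append the DNSBL zone name."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted-quad IPv4 address")
    return ".".join(reversed(octets)) + "." + zone

def dnsbl_lookup(ip, zone):
    """Return the A record answer if listed, or None if the name fails to resolve."""
    try:
        return socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return None  # no answer: the list has no opinion on this address

print(dnsbl_query_name("81.66.180.11", "dnsbl.sorbs.net"))
```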

http://njabl.org/use.html    (return codes)
   "Non-dial-up range entries will often have a descriptive TXT record
    which should indicate why the entry was added.

  Try 81.66.180.11

    $ host 11.180.66.81.combined.njabl.org
    11.180.66.81.combined.njabl.org has address 127.0.0.3

    $ host -t txt 11.180.66.81.combined.njabl.org
    11.180.66.81.combined.njabl.org descriptive text "Dynamic/Residential IP range listed by NJABL dynablock - http://njabl.org/dynablock.html"

http://www.spamcop.net/
http://www.spamcop.net/bl.shtml?81.66.180.11  (web access)

  Try 81.66.180.11

    $ host 11.180.66.81.bl.spamcop.net
    11.180.66.81.bl.spamcop.net has address 127.0.0.2

    $ host -t txt 11.180.66.81.bl.spamcop.net
    11.180.66.81.bl.spamcop.net descriptive text "Blocked - see http://www.spamcop.net/bl.shtml?81.66.180.11"

Q: If bl.spamcop.net is a DNSBL, give a command-line that would query
   this DNSBL for the address 1.2.3.4.
Q: If bl.spamcop.net is a DNSBL, give a command-line that would query
   this DNSBL for the reason that address 1.2.3.4 was put in the DNSBL.
Q: T/F, there is a standard for DNSBL responses, so that a given
   response always means the same thing from every DNSBL.

SpamAssassin RBL checking
-------------------------

SpamAssassin is a package that will look at an email message (RFC2822)
and analyse the message headers.  One service it can provide is
looking up mail relay addresses ("Received:" header lines) in DNSBLs:

X-Spam-Report:
 *  2.0 RCVD_IN_SORBS_DUL RBL: SORBS: sent directly from dynamic IP address
 *      [81.66.180.11 listed in dnsbl.sorbs.net]
 *  1.6 RCVD_IN_BL_SPAMCOP_NET RBL: Received via a relay in bl.spamcop.net
 *      [Blocked - see <http://www.spamcop.net/bl.shtml?81.66.180.11>]
 *  1.9 RCVD_IN_NJABL_DUL RBL: NJABL: dialup sender did non-local SMTP
 *      [81.66.180.11 listed in combined.njabl.org]

Received: from [81.66.180.11] (helo=m11.net81-66-180.noos.fr)
    by server320.tchmachines.com with esmtp (Exim 4.52) id 1GmGuf-00030Z-C3
    for idallen@idallen.ca; Mon, 20 Nov 2006 16:42:49 -0500
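
A rough Python sketch of pulling the relay IP address out of a
"Received:" header so that it can be looked up in a DNSBL.  This is not
SpamAssassin's own code; it assumes the address appears in square
brackets, as it does in the Exim header above.

```python
import re

# Matches a bracketed dotted-quad such as "[81.66.180.11]".
BRACKETED_IPV4 = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")

def relay_ips(received_header):
    """Return the bracketed IPv4 addresses found in one Received: header."""
    return BRACKETED_IPV4.findall(received_header)

HEADER = (
    "Received: from [81.66.180.11] (helo=m11.net81-66-180.noos.fr)\n"
    "    by server320.tchmachines.com with esmtp (Exim 4.52) id 1GmGuf-00030Z-C3\n"
    "    for idallen@idallen.ca; Mon, 20 Nov 2006 16:42:49 -0500"
)
print(relay_ips(HEADER))
```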

Q: What part of an email message does SpamAssassin extract and send to
   various DNSBLs for opinions?

=============================================================================

Protocols - Telnet
------------------
  Started with rfc0097 (Feb 1971!)
  - too many revisions to list!

  http://tools.ietf.org/html/rfc854   (1983, 15 pages)
  http://tools.ietf.org/html/rfc855   (options: 1983, 3 pages)
  http://tools.ietf.org/html/rfc856   (binary: 1983)
  ...etc...
  http://tools.ietf.org/html/rfc1097   (subliminal: April 1 1989)
  ...etc...
  http://tools.ietf.org/html/rfc4248   (URI: 2005)

Q: T/F, never believe anything you read in an April 1 RFC document.

Telnet URI (RFC submitted in 2005):
  http://tools.ietf.org/html/rfc4248
  - e.g.  telnet://<user>:<password>@<host>:<port>/
   "Few implementations handle the user name and password very well, if at all."

Q: Give the full URI for a "telnet" connection.
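
A small Python sketch that splits a telnet URI into its parts using
the standard urllib.parse module.  The function name and the sample
host are made up; falling back to port 23 when no port is given is an
assumption based on the usual TELNET port.

```python
from urllib.parse import urlsplit

def parse_telnet_uri(uri):
    """Split telnet://<user>:<password>@<host>:<port>/ into a dict."""
    parts = urlsplit(uri)
    if parts.scheme != "telnet":
        raise ValueError("not a telnet URI")
    return {
        "user": parts.username,      # None if the URI has no user part
        "password": parts.password,  # None if the URI has no password
        "host": parts.hostname,
        "port": parts.port or 23,    # assumed default TELNET port
    }

print(parse_telnet_uri("telnet://alice:secret@host.example.com:2323/"))
```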
 
 - "The purpose of the TELNET Protocol is to provide a fairly general,
    bi-directional, eight-bit byte oriented communications facility. 
 - "the symmetry of the TELNET model requires that there is
    an NVT at each end of the TELNET connection" p.6
 - "symmetry is an operating principle rather than an ironclad rule." p.4

Q: What is the purpose of the TELNET protocol?

 - a line-oriented "Network Virtual Terminal" with option negotiation
   - options start with the Interpret As Command (IAC) byte (255, 0xFF)
     - IAC must therefore be doubled when sent as part of data stream
   - this IAC option negotiation may confuse some applications,
     which is why netcat ("nc") is better for TCP/IP debugging
   - "The code set is seven-bit USASCII in an eight-bit field" (p.4)
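
A minimal Python sketch of the IAC doubling rule.  The function names
are made up, and unescape_iac() assumes the stream holds only escaped
data (no real TELNET commands mixed in).

```python
IAC = 0xFF  # "Interpret As Command" byte (255)

def escape_iac(data):
    """Double every IAC byte so it is read as data, not as a command."""
    return data.replace(bytes([IAC]), bytes([IAC, IAC]))

def unescape_iac(data):
    """Undo escape_iac on a stream that contains only escaped data."""
    return data.replace(bytes([IAC, IAC]), bytes([IAC]))

payload = bytes([0x41, 0xFF, 0x42])
wire = escape_iac(payload)
print(wire.hex())
print(unescape_iac(wire) == payload)
```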

Q: How does the TELNET protocol signal that an option is coming?
Q: How does the TELNET protocol transmit the IAC byte if it appears as
   part of the data stream?

 - TELNET command "Go Ahead" (GA) for old 2741 lockable keyboards (!)  p.5

 - some things don't map well to the data byte stream:
   - Interrupt Process (IP)  (this is mandatory if local system supports it)
   - Abort Output (AO)                  (optional)
   - Are You There (AYT)                (optional)
   - Erase previous Character (EC)      (optional)
   - Erase current Line (EL)            (optional)

How do you send an Interrupt or Abort command to a remote terminal?
 - don't want to add the command to the end of the queued data stream
 - don't want flow control to hold up IP, AO, AYT
 - use "out-of-band" socket data that goes "around" the main data stream:
   - TCP provides an "Urgent notification" (the urgent pointer) that the
     receiver sees without first reading through the queued data
   - "Synch signal consists of a TCP Urgent notification, coupled with
      the TELNET command DATA MARK (DM) [in the data stream]..." p.9
   - SYNCH causes client to throw away data and messages (except for IP,AO,AYT)
     until it finds the DM you put at the end of the data stream
   - the SYNCH mechanism discards all data (not TELNET commands)
     between the sender of the Synch and its recipient

Q: How do you send an Interrupt or Abort command to a remote terminal?

The TELNET NVT printer:
 - 95 USASCII graphics (codes 32 through 126)
 - a very few control characters
 - to send 8-bit data, you have to escape the IAC 255 byte by doubling it
 - though TELNET has an option to pass raw data, netcat is better
   since it doesn't need to escape the IAC byte

Telnet options (many, many following RFCs):
  http://tools.ietf.org/html/rfc855

Telnet protocol was eventually adopted for the FTP control stream (port 21).

Q: How does FTP use the TELNET protocol?

=============================================================================

Domain Name System - DNS
------------------------
  http://tools.ietf.org/html/rfc1034   (concepts; Nov 1987; 55 pages, index at end)
  http://tools.ietf.org/html/rfc1035   (implementation; Nov 1987)
  http://www.dns.net/dnsrd/rfc/rfc1035/rfc1035.html (annotated with pictures)

Additional:
  http://tools.ietf.org/html/rfc920    (Initial Set of Top Level Domains; October 1984)
  http://tools.ietf.org/html/rfc4343   (case sensitivity; January 2006)
  http://tools.ietf.org/html/rfc4033   (DNS security; March 2005)

- turning names into IP addresses, vice-versa, and more
- originally a big HOSTS.TXT file
   "Host name to address mappings were maintained by the Network
    Information Center (NIC) in a single file (HOSTS.TXT) which
    was FTPed by all hosts [RFC-952, RFC-953].  The total network
    bandwidth consumed in distributing a new version by this
    scheme is proportional to the square of the number of hosts in
    the network, and even when multiple levels of FTP are used,
    the outgoing FTP load on the NIC host is considerable.
    Explosive growth in the number of hosts didn't bode well for
    the future.   - http://tools.ietf.org/html/rfc1034

  - /etc/hosts still used for local non-DNS names on Unix
  - other config options determine whether local file check comes first or last
  - Windows also has a "hosts" file (LMHOSTS is the NetBIOS-name counterpart)

Q: What is the purpose of the Domain Name System?
Q: Give any four (of many) specific functions that can be performed by
    the Domain Name System (DNS)?
Q: In what file do Unix/Linux systems keep local non-DNS names?

Assumptions (2.3 p.3)
 - size proportional to number of hosts, then number of users
 - most of the data changes slowly, some isolated parts may change quickly
 - administrative divisions and boundaries have their own name servers
 - availability of local "trusted" name servers to do external referrals
 - access is more important than timely updates or consistency

Q: Give three (of five) assumptions made when DNS was designed.

Query styles:
 - iterative query: NS refers client to another NS  (blocked at Algonquin?)
   - the client has to query each new NS
 - recursive query: NS does lookup for client   (must be used at Algonquin)
   - the client just waits for the answer

Q: Describe and differentiate between the two types of DNS queries.

Elements (2.4 p.6)
 - Domain Name Space and Resource records
   - tree-structured name space and data
 - Name Servers hold complete information about a subset and may cache more
 - Resolvers are programs or libraries that query Name Servers
   - "directly accessible to user programs; hence no protocol is necessary"
   - this is the user-visible part
   - the resolver may have its own cache
   - Unix/Linux resolvers start with /etc/resolv.conf

Q: T/F, the DNS name space is flat.
Q: What is the function of the "name server" part of a DNS (not the "resolver")?
Q: What is the function of the "resolver" part of a DNS (not the "name server")?
Q: On Unix/Linux, what file is used by the resolver library to find
   name servers to query?
 
Rules
 - case-insensitive (but case is preserved) p.7
 - domain name components are *separated* by dots
 - "absolute" names end in the ROOT - a zero-length domain: idallen.ca.
 - "relative" names don't end in ROOT - no trailing dot: idallen.ca
 - longest domain is 255 characters (plus dots, which separate components)
 - subdomains are fully contained within domains (3.1):
   - For example, A.B.C.D is a subdomain of B.C.D, C.D, D, and " " (the root).
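
A short Python sketch of the naming rules above: case-insensitive
comparison, absolute versus relative names, and subdomain containment.
The helper names are made up, and treating a domain as contained in
itself is a choice made here for simplicity.

```python
def labels(name):
    """Split a domain name into lower-cased labels, ignoring a trailing dot."""
    return [label.lower() for label in name.rstrip(".").split(".") if label]

def is_absolute(name):
    """An absolute name ends with the zero-length ROOT label (trailing dot)."""
    return name.endswith(".")

def is_subdomain(child, parent):
    """True if child lies within parent; every name lies within the root."""
    child_labels, parent_labels = labels(child), labels(parent)
    if not parent_labels:
        return True  # parent is the root
    return child_labels[-len(parent_labels):] == parent_labels

print(is_absolute("idallen.ca."), is_absolute("idallen.ca"))
print(is_subdomain("A.B.C.D", "c.d"))
print(is_subdomain("A.B.C.D", "B.D"))
```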

Q: T/F, DNS records are case-sensitive.
Q: T/F, DNS records are converted to lower-case.
Q: Describe and differentiate between a DNS "absolute" and "relative" name.
Q: T/F, the dots in a DNS name are included in the 255 character name limit.

 - reverse DNS (ptr) lookups map into otherwise unused "IN-ADDR.ARPA."
   e.g. to reverse-look-up 1.2.3.4 you search for ptr in 4.3.2.1.in-addr.arpa.

   - some DNS software will automatically do the ptr look up for you:

      $ host 72.18.159.15
      15.159.18.72.in-addr.arpa domain name pointer server320.tchmachines.com.

      $ host -t ptr 15.159.18.72.in-addr.arpa.
      15.159.18.72.in-addr.arpa domain name pointer server320.tchmachines.com.

   - some software will not do the ptr look up for you:
     e.g. "dig" does not - you have to be explicit with the domain and type:

      $ dig 72.18.159.15                        # fails - NXDOMAIN
      $ dig 15.159.18.72.in-addr.arpa. ptr      # works
      $ dig -x 72.18.159.15                     # also works
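
The name to query for a PTR record can be built with Python's standard
ipaddress module.  A minimal sketch; the function name is made up, and
the trailing dot is added here to make the name absolute.

```python
import ipaddress

def ptr_query_name(ip):
    """Return the absolute in-addr.arpa name for a reverse DNS look-up."""
    return ipaddress.ip_address(ip).reverse_pointer + "."

print(ptr_query_name("1.2.3.4"))
print(ptr_query_name("72.18.159.15"))
```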

Q: What is a "reverse-DNS lookup"?
Q: How does a resolver look up the PTR record for IP address 1.2.3.4?

 - 3.5 p.11
   "The labels must follow the rules for ARPANET host names.  They must
    start with a letter, end with a letter or digit, and have as interior
    characters only letters, digits, and hyphen.  There are also some
    restrictions on the length.  Labels must be 63 characters or less.
   - but in 2006 we have many violations: 3com.com, etc.
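
A Python sketch of the strict label rule quoted above: start with a
letter, end with a letter or digit, interior letters/digits/hyphen, at
most 63 characters.  The function name is made up.  Note that "3com"
fails this strict test, which is the kind of violation mentioned above.

```python
import re

# One letter, then optionally up to 61 interior characters and a final
# letter or digit: 1 + 61 + 1 = 63 characters at most.
STRICT_LABEL = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")

def is_strict_label(label):
    """True if label follows the RFC 1034 section 3.5 preferred syntax."""
    return STRICT_LABEL.fullmatch(label) is not None

for candidate in ("idallen", "3com", "bad-", "a" * 64):
    print(candidate[:10], is_strict_label(candidate))
```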

Resource Records (RRs): Type, Class, TTL, RData
-----------------------------------------------
  http://www.dns.net/dnsrd/rr.html

- resource records of various types are stored in Name Servers
- most common look-up is for A (address) records - a "forward DNS look-up"
- a "reverse DNS look-up" turns an IP into a domain name via PTR records

Type:  (see http://www.dns.net/dnsrd/rr.html )
 - A
 - CNAME
 - HINFO
 - MX
 - NS
 - PTR
 - SOA
 - TXT (see rfc1035)
 - SRV (not in rfc1034)
 - AAAA (not in rfc1034)
 - A6 (not in rfc1034)

Q: What data is contained in a DNS type "A" record?
Q: What data is contained in a DNS type "MX" record?
Q: What data is contained in a DNS type "NS" record?
Q: What data is contained in a DNS type "PTR" record?
Q: Which DNS record type (may) hold domain SPF records?

Class:
 - IN  (Internet system)
 - CH  (CHAOS system)

TTL: time to live

RData - various  (see p.13)
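
A small Python sketch that splits one answer line, in the layout that
"dig" prints (name, TTL, class, type, RData), into named fields.  The
ResourceRecord tuple and function name are made up for illustration.

```python
from collections import namedtuple

ResourceRecord = namedtuple("ResourceRecord", "name ttl rr_class rr_type rdata")

def parse_rr_line(line):
    """Split a dig-style answer line into its five fields."""
    # RData may itself contain spaces (e.g. TXT), so split at most 4 times.
    name, ttl, rr_class, rr_type, rdata = line.split(None, 4)
    return ResourceRecord(name, int(ttl), rr_class, rr_type, rdata)

rr = parse_rr_line("idallen.ca.             14373   IN      A       72.18.159.15")
print(rr.name, rr.ttl, rr.rr_class, rr.rr_type, rr.rdata)
```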


Watching it work
----------------

Need to use a "resolver" library.  On Unix/Linux, it starts here:

  $ cat /etc/resolv.conf 
  search somedomain.ca
  nameserver 0.0.0.0
  nameserver 192.168.0.1
  nameserver 192.168.0.2
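
A Python sketch of how a resolver might read this file.  It handles
only the "nameserver" and "search" keywords shown above; the function
name is made up and real resolver libraries recognize more options.

```python
def parse_resolv_conf(text):
    """Collect nameserver addresses and the search list from resolv.conf text."""
    config = {"nameserver": [], "search": []}
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#;":
            continue  # skip blank lines and comments
        keyword, *values = line.split()
        if keyword == "nameserver" and values:
            config["nameserver"].append(values[0])
        elif keyword == "search":
            config["search"] = values
    return config

SAMPLE = """search somedomain.ca
nameserver 0.0.0.0
nameserver 192.168.0.1
nameserver 192.168.0.2
"""
print(parse_resolv_conf(SAMPLE))
```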

  $ host idallen.ca.
  idallen.ca has address 72.18.159.15              # "A" record
  idallen.ca mail is handled by 0 idallen.ca.      # "MX" record

  $ dig idallen.ca.
; <<>> DiG 9.3.2 <<>> idallen.ca.
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 11389
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;idallen.ca.                    IN      A

;; ANSWER SECTION:
idallen.ca.             14373   IN      A       72.18.159.15

;; Query time: 1 msec
;; SERVER: 205.211.30.21#53(205.211.30.21)
;; WHEN: Tue Nov 28 09:38:30 2006
;; MSG SIZE  rcvd: 44

Queries for PTR records are handled automatically by "host":

  $ host 72.18.159.15
  15.159.18.72.in-addr.arpa domain name pointer server320.tchmachines.com.

  $ host -t a 15.159.18.72.in-addr.arpa
  15.159.18.72.in-addr.arpa has no A record

  $ host -t ptr 15.159.18.72.in-addr.arpa
  15.159.18.72.in-addr.arpa domain name pointer server320.tchmachines.com.

Tracing a query
---------------

Here is a trace of an iterative lookup for the A record for "www.idallen.ca.":
- name www.idallen.ca. is:
  - the name "www"
    - in subdomain .idallen
      - in subdomain .ca
        - in the ROOT domain (".")

Q: Given the domain www.idallen.ca, list the steps of an iterative DNS
   query that would resolve this domain to its IP address.

Steps:
  1. Locate the IP addresses of the ROOT name servers (the NS records)
     (may be compiled in, or kept cached in a local file)
  2. Query some root name server for the .ca domain NS name server IP addrs.
  3. Query some .ca name server for the .idallen domain NS name server addrs.
  4. Query some .idallen name server for the A record IP address of "www".

An example of a command that can do an iterative query:

    $ dig +trace www.idallen.ca.    (may not work at Algonquin due to blocking)
    ; <<>> DiG 9.3.1 <<>> +trace www.idallen.ca.
    ;; global options:  printcmd

*** 1. locate addresses of root NS ***
    .                       3600000 IN      NS      F.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      G.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      H.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      I.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      J.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      K.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      L.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      M.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      A.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      B.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      C.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      D.ROOT-SERVERS.NET.
    .                       3600000 IN      NS      E.ROOT-SERVERS.NET.
    ;; Received 436 bytes from 127.0.0.1#53(127.0.0.1) in 1 ms

*** 2. locate addresses of .ca NS ***
    ca.                     172800  IN      NS      CA04.CIRA.ca.
    ca.                     172800  IN      NS      CA05.CIRA.ca.
    ca.                     172800  IN      NS      CA06.CIRA.ca.
    ca.                     172800  IN      NS      NS-EXT.ISC.ORG.
    ca.                     172800  IN      NS      CA01.CIRA.ca.
    ca.                     172800  IN      NS      CA02.CIRA.ca.
    ;; Received 284 bytes from 192.112.36.4#53(G.ROOT-SERVERS.NET) in 43 ms

*** 3. locate addresses of .idallen NS ***
    idallen.ca.             86400   IN      NS      ns2.totalchoicehosting.com.
    idallen.ca.             86400   IN      NS      ns1.totalchoicehosting.com.
    ;; Received 90 bytes from 192.228.28.9#53(CA04.CIRA.ca) in 80 ms

*** 4. look up A record for name "www" ***
    www.idallen.ca.         14400   IN      CNAME   idallen.ca.
    idallen.ca.             14400   IN      A       72.18.159.15
    idallen.ca.             86400   IN      NS      ns2.totalchoicehosting.com.
    idallen.ca.             86400   IN      NS      ns1.totalchoicehosting.com.
    ;; Received 136 bytes from 65.254.32.122#53(ns2.totalchoicehosting.com) in 43 ms

Since most DNS traffic is UDP, it is optimized to fit in one single
UDP packet.  (Full zone transfers will use TCP.)  Classic DNS limits a
UDP message to 512 bytes (RFC 1035), and only 13 ROOT name server NS
records, with their addresses, fit in a reply of that size.

Q: Why aren't there more than 13 ROOT name servers?
Q: T/F, most DNS traffic uses UDP.

Configuring Name Servers
------------------------

How do we get the address of the root name servers "."?

Unix/Linux keeps a copy in a local file.  The BIND name server "named"
also has a copy compiled in (may be outdated).

Q: How does a Unix/Linux system know the addresses of the ROOT name
   servers, to start an iterative DNS query?

Unix/Linux file name /var/named/named.ca   (unreadable in Linux lab)
 - Use "dig @A.ROOT-SERVERS.NET . ns" to update this file if it's outdated.
 - but not at Algonquin (blocked)
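
As a rough illustration of what the hints file holds, the Python sketch
below pulls the server names and IPv4 addresses out of hints-style text.
The SAMPLE text is a two-server excerpt written for this example (it is
not the full named.ca), and root_addresses() is a made-up helper name.

```python
# Pull server names and IPv4 addresses out of root-hints text.
# SAMPLE is a two-server excerpt for illustration, not a full hints file.

SAMPLE = """\
; comment lines start with a semicolon
.                        3600000  IN  NS  A.ROOT-SERVERS.NET.
A.ROOT-SERVERS.NET.      3600000      A   198.41.0.4
.                        3600000      NS  G.ROOT-SERVERS.NET.
G.ROOT-SERVERS.NET.      3600000      A   192.112.36.4
"""


def root_addresses(text):
    """Map each root server name to its IPv4 address from 'A' records."""
    found = {}
    for line in text.splitlines():
        line = line.split(";", 1)[0]      # drop comments
        fields = line.split()
        # An address record looks like: NAME TTL [IN] A ADDRESS
        if len(fields) >= 4 and fields[-2] == "A":
            found[fields[0].upper()] = fields[-1]
    return found


for name, addr in sorted(root_addresses(SAMPLE).items()):
    print(name, addr)
```

The NS lines name the servers for "." and the A lines supply the
addresses, so a resolver can send its first query without doing any
DNS lookup.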

Unix/Linux DNS program is "BIND" - Berkeley Internet Name Domain
 - actual program name is "named"
 - see /etc/named.conf for the location of the "." domain "hints"

You only need to find one working ROOT server, at which point you can
use it to find the current addresses of the rest.

Below is the config file for "named" from Linux Fedora Core 5.
Note the "type hint" file named.ca containing the ROOT name server info.
---------------------------------------------------------------------------
//
// named.conf for Red Hat caching-nameserver 
//

options {
        directory "/var/named";
        dump-file "/var/named/data/cache_dump.db";
        statistics-file "/var/named/data/named_stats.txt";
        /*
         * If there is a firewall between you and nameservers you want
         * to talk to, you might need to uncomment the query-source
         * directive below.  Previous versions of BIND always asked
         * questions using port 53, but BIND 8.1 uses an unprivileged
         * port by default.
         */
         // query-source address * port 53;
};

// 
// a caching only nameserver config
// 
controls {
        inet 127.0.0.1 allow { localhost; } keys { rndckey; };
};

zone "." IN {
        type hint;
        file "named.ca";
};

zone "localdomain" IN {
        type master;
        file "localdomain.zone";
        allow-update { none; };
};

zone "localhost" IN {
        type master;
        file "localhost.zone";
        allow-update { none; };
};

zone "0.0.127.in-addr.arpa" IN {
        type master;
        file "named.local";
        allow-update { none; };
};

zone "0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.ip6.arpa" IN
 {
        type master;
        file "named.ip6.local";
        allow-update { none; };
};

zone "255.in-addr.arpa" IN {
        type master;
        file "named.broadcast";
        allow-update { none; };
};

zone "0.in-addr.arpa" IN {
        type master;
        file "named.zero";
        allow-update { none; };
};
---------------------------------------------------------------------------
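
To see the shape of the zone statements at a glance, the Python sketch
below lists each zone with its type and file.  It is a simple pattern
match over a short excerpt of the file above, not a real named.conf
parser: it assumes "type" comes before "file" in every zone, as it does
here.  ZONE_RE and list_zones() are made-up names.

```python
import re

# Excerpt of the named.conf shown above, kept short for the example.
CONF = '''
zone "." IN {
        type hint;
        file "named.ca";
};

zone "localhost" IN {
        type master;
        file "localhost.zone";
        allow-update { none; };
};
'''

# Matches: zone "NAME" IN { type T; file "F";
ZONE_RE = re.compile(
    r'zone\s+"([^"]+)"\s+IN\s*\{\s*type\s+(\w+);\s*file\s+"([^"]+)";'
)


def list_zones(text):
    """Return (zone, type, file) tuples in the order they appear."""
    return ZONE_RE.findall(text)


for zone, ztype, zfile in list_zones(CONF):
    print(zone, ztype, zfile)
```

Only the "." zone has type "hint"; the other zones are type "master"
because this server holds the authoritative data for them itself.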


Finding domain owners
---------------------

$ whois idallen.ca.       (may not work at Algonquin due to blocking)

Status:         EXIST                                             
Registrar:      SIBERNAME INTERNET AND SOFTWARE TECHNOLOGIES INC. 
Registrar-no:   108                                               
Registrant-no:  445525                                            
Domaine-no:     445525                                            
Subdomain:      idallen.ca                                        
Renewal-Date:   2007/05/29                                        
Date-Approved:  2002/05/29                                        
Date-Modified:  2006/03/30                                        
Organization:   Ian D. Allen                                      
Description:                                                      
Admin-Name:     Ian! D. Allen                                     
Admin-Title:                                                      
Admin-Postal:   idallen.ca                                        
                22 Oak Street                                     
                Ottawa ON K1R 6S9 Canada                          
Admin-Phone:    +1-613-235-6216                                   
Admin-Fax:                                                        
Admin-Mailbox:  idallen@idallen.ca
Tech-Name:      Ian! D. Allen                                     
Tech-Title:                                                       
Tech-Postal:    idallen.ca                                        
                22 Oak Street                                     
                Ottawa ON K1R 6S9 Canada                          
Tech-Phone:     +1-613-235-6216                                   
Tech-Fax:                                                         
Tech-Mailbox:   idallen@idallen.ca
NS1-Hostname:   ns1.totalchoicehosting.com                        
NS2-Hostname:   ns2.totalchoicehosting.com                      

Q: What Unix/Linux command can find the owner/registrar of a domain name?
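
The whois output is plain "Key: value" text, so it is easy to pick
apart in a script.  The Python sketch below is one possible approach:
SAMPLE is a few lines copied from the output above, and parse_whois()
is a made-up helper name.  Other registries format their whois output
differently, so treat this as specific to the layout shown here.

```python
# Turn "Key: value" whois text into a dictionary.
# SAMPLE is a few lines copied from the output above.

SAMPLE = """\
Status:         EXIST
Registrar:      SIBERNAME INTERNET AND SOFTWARE TECHNOLOGIES INC.
Renewal-Date:   2007/05/29
Description:
NS1-Hostname:   ns1.totalchoicehosting.com
NS2-Hostname:   ns2.totalchoicehosting.com
"""


def parse_whois(text):
    """Return {field: value}; fields with no value map to ''."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and not line[0].isspace():   # skip indented continuations
            fields[key.strip()] = value.strip()
    return fields


info = parse_whois(SAMPLE)
print(info["Registrar"])
print(info["NS1-Hostname"], info["NS2-Hostname"])
```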

Resources
  http://directory.google.com/Top/Computers/Internet/Protocols/DNS/
  http://www.root-servers.org/
    - Ottawa hosts a copy of the "F" ROOT server

Probes and Tools
  http://www.dnsreport.com/
  http://www.dnsstuff.com/

Software
  http://www.isc.org/index.pl?/sw/bind/
  http://www.dns.net/dnsrd/

Problems
  http://en.wikipedia.org/wiki/Site_Finder  (3 weeks in September 2003)
  http://en.wikipedia.org/wiki/DNS_cache_poisoning