-------------------------
Week 04 Notes for CST8165
-------------------------
-Ian! D. Allen - idallen@idallen.ca

Remember - knowing how to find out an answer is more important than
memorizing the answer.  Learn to fish!  RTFM!  (Read The Fine Manual)

Midterm test coming

Comments on assignments:
  programming_style.txt
   - keep lines less than 80 characters
   - indentation is critical

Review:
 - you know the basic inputs/outputs of the Unix syscalls:
   socket,bind,listen,accept,read/recv,write/send,close
 - you know how to write the basic code/PDL of any process that reads a
   buffer from one place and writes that data to another place
 - you can program a TCP/IP echo server that accepts multiple clients
 - you can program a two-process TCP/IP client that talks to a server

Q: What is the successful return value of socket() used for?

Q: What is the function of bind()?

Q: What is the function of listen()?

Q: What is the function of accept()?
   What is the successful return value of accept() used for?

Q: Give the PDL for a forking "echo" server that creates new processes
   for each new client connection and echoes back to the client all
   input received.

Q: When can write() be used in place of send() in accessing a socket fd?
   (see "man 2 send")  (Note: The reverse is not always true: send()
   can replace write() only when the fd is a socket!)

Q: When can read() and recv() be interchanged in accessing a socket fd?

Q: Can recv() and send() be used on non-sockets?  (see lab 3)

Coding the Client
-----------------

I did a review of PDL for any process that reads from one place and
writes to another; see last week's notes.  The Client contains two
separate processes, each with one of these read/write loops.
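
A minimal sketch of one such read/write loop, written in Python rather
than the C used in the labs.  The name copy_loop and the buffer size are
assumptions made for illustration.  In the client, one process would
run this loop from keyboard to socket and the other from socket to screen.

```python
import os

def copy_loop(in_fd, out_fd, bufsize=4096):
    """Copy bytes from in_fd to out_fd until EOF; return bytes copied."""
    total = 0
    while True:
        buf = os.read(in_fd, bufsize)   # blocks until data or EOF
        if not buf:                     # empty read means EOF
            break
        while buf:                      # a write may be partial
            n = os.write(out_fd, buf)
            buf = buf[n:]
            total += n
    return total
```

The inner loop is there because a write may accept fewer bytes than it
was given, especially on a socket.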

Q: Give the PDL for a forking two-process "echo" client that sends
   keyboard input to a remote TCP/IP server and receives the echo
   of the input back and displays it on the screen.  Explain under
   what conditions one process needs to kill() the other process.
   Explain under what conditions one process needs to shut down the
   writing half of the socket that connects to the remote server.

TCP/IP References
-----------------

    http://www.tcpipguide.com/

Encapsulation - Protocol Layering
---------------------------------

- encapsulation as data moves down the stack, de-encapsulation moving up

Review the OSI network stack and Internet network stack:

  http://en.wikipedia.org/wiki/Internet_protocol_suite
  http://en.wikipedia.org/wiki/TCP/IP_model

  OSI Seven Layers
   - OSI divides into seven layers with standard names

  TCP/IP Four Layers (or Five Layers)
   - the Internet grew up with four layers; some say five now:
   - nobody has the names right any more
  "No document officially specifies the model; different names are given
   to the layers by different documents, and different numbers of layers
   are shown by different documents. There are versions of this model
   with four layers and with five layers."
  "In modern text books, the model has evolved into a five-layer
   version that splits Layer 1 into a Physical layer and a Network
   Access layer, corresponding to the physical layer and data link
   layer of the OSI model. The Internet or Internetworking layer is
   named Network layer."

- Internet TCP/IP has four/five (don't try to name them - they keep changing)
  - give examples of programs/protocols/methods at each layer
    see Figure 1 in http://tools.ietf.org/html/rfc791

Q: With respect to the original four-layer Internet protocol stack, what
   is the difference between an IP router and an Ethernet switch or hub?

Q: Show how data from an application (e.g. TFTP) is
   encapsulated/de-encapsulated as it moves down the four-layer TCP/IP
   stack, gets shipped over an Ethernet, passes through a switch, passes
   through an IP router, and is finally delivered to another application.
  - See Figure 2 in http://tools.ietf.org/html/rfc791
  - See http://en.wikipedia.org/wiki/Image:UDP_encapsulation.svg
  - See http://en.wikipedia.org/wiki/Image:IP_stack_connections.png

Your application data is passed to the computer's TCP/IP stack, which
wraps a TCP header around it (containing port information), then an IP
header around that (containing information such as source/destination
address).  That wrapped packet is passed down to the network hardware,
which wraps your packet with hardware framing bits that will get it out
your network card, onto the network, and into the next network card.
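
The wrapping can be sketched with toy headers in Python.  These are NOT
the real wire formats: the transport header here holds only the two
port numbers and the IP header only the two addresses.  All function
names and sample values are assumptions made for illustration.

```python
import socket
import struct

def wrap_transport(data, src_port, dst_port):
    # Toy transport header: only the two 16-bit port numbers.
    return struct.pack("!HH", src_port, dst_port) + data

def wrap_ip(segment, src_ip, dst_ip):
    # Toy IP header: only the two 32-bit addresses.
    return socket.inet_aton(src_ip) + socket.inet_aton(dst_ip) + segment

def wrap_ethernet(packet, src_mac, dst_mac):
    # Ethernet framing: destination MAC, source MAC, type 0x0800 (IPv4).
    return dst_mac + src_mac + struct.pack("!H", 0x0800) + packet

def unwrap_all(frame):
    # De-encapsulate on the way back up: strip one header per layer.
    packet = frame[14:]     # drop 6+6+2 bytes of Ethernet header
    segment = packet[8:]    # drop 4+4 bytes of toy IP header
    return segment[4:]      # drop 2+2 bytes of toy port header
```

Each layer only adds (or strips) its own header and treats everything
inside as opaque data.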

Your Ethernet card has a unique MAC address that is used at the Ethernet
level to pass packets around.

Also: "2.2. Low level Nonsense and Network Theory" in
  http://beej.us/guide/bgnet/output/htmlsingle/bgnet.html

This "packetization" of your data across the Internet may be visible
to your application.  Packets may be dropped, fragmented, arrive late,
or arrive out-of-sequence (and no amount of money can change that on
the public Internet, at least until the big telcos get their way).

Q: T/F IP packets arrive in the order in which they are sent.

Q: T/F IP packets are reliable.

Dotted Quad (Dotted Decimal) structure
--------------------------------------

IP addresses are part network number and part host number depending on
how you divide up the 32 bits, e.g. address 1.2.3.4 might be host number
4 on network 1.2.3 (a /24 network), or it might be host 3.4 on network
1.2 (a /16 network), or host 2.3.4 on network 1 (a /8 network).

The division of bits doesn't have to be on an 8-bit (one-byte)
boundary, so you will find nets given as 1.2.3.0/25
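
The split is just a bitwise AND with a mask built from the prefix
length.  A minimal Python sketch (the name split_address is made up for
illustration; the host part is returned as a plain integer):

```python
import socket
import struct

def split_address(dotted, prefix_len):
    """Return (network, host_number) for a dotted-quad address."""
    addr = struct.unpack("!I", socket.inet_aton(dotted))[0]
    # prefix_len one-bits at the top, zero-bits below
    mask = (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF
    network = socket.inet_ntoa(struct.pack("!I", addr & mask))
    host = addr & ~mask & 0xFFFFFFFF
    return network, host
```

With this sketch, split_address("1.2.3.4", 24) gives network 1.2.3.0 and
host 4, while split_address("1.2.3.200", 25) gives network 1.2.3.128
and host 72.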

Some nice properties apply to a "network" of hosts, including limiting
of traffic and being able to direct traffic to a large number of hosts
by using just the network number:

    http://www.ralphb.net/IPSubnet/ipaddr.html
    http://www.networkcomputing.com/netdesign/1122ipr-full.html

Q: Why not just put all the machines on the same network?

In traditional routing, sub-networks and hosts are not allowed to use
numbers that are either all-zeroes or all-ones.  All-ones addresses
are interpreted as broadcast addresses for their networks - packets
sent to these addresses are processed by every node on the network.
(All-zeroes used to be broadcast addresses 20 years ago.)

Q: What happens if you send an ICMP echo "ping" to a network broadcast
   address?

Q: Suppose you forged your IP source address and then sent a ping to a
   network broadcast address with a large number of hosts on it?
   http://www.webopedia.com/TERM/S/smurf.html

IP Routing
----------

When an application's machine wants to send a packet on the network,
the low-level network hardware (which knows nothing about IP addresses)
needs to know "the next stop" hardware network interface for the packet.
Either the packet is destined directly for a host on one of the attached
networks (often a machine is on only one single network); or, the packet
has to be sent off to the network card of a "gateway" machine on the
local network, and the gateway machine will know where to forward it
(to another hardware network card on another network, and so on...).

http://tools.ietf.org/html/rfc950
 - section 2.2 shows code fragment used in IP routing and subnet routing
 - note that the IP address and IP mask are unique to each network interface
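
The local-or-gateway decision comes down to one masked comparison.
A minimal Python sketch in the spirit of the RFC 950 fragment (the name
next_hop and the single-gateway assumption are made up for illustration;
a real kernel consults a full routing table):

```python
import ipaddress

def next_hop(dest, my_addr, my_mask, gateway):
    """Return the IP address the packet should be handed to next."""
    d = int(ipaddress.IPv4Address(dest))
    a = int(ipaddress.IPv4Address(my_addr))
    m = int(ipaddress.IPv4Address(my_mask))
    if (d & m) == (a & m):
        return dest        # same network: deliver directly
    return gateway         # different network: hand to the gateway
```

Note that the mask changes the answer: with mask 255.255.255.128, a host
at 192.168.1.10 must send packets for 192.168.1.200 to the gateway.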

Either way, your system has to send the IP packet, encapsulated for
the local network hardware (e.g. Ethernet).  That encapsulation - the
finding out of the network card MAC (Media Access Control) address -
is often assisted by a low-level networking protocol such as ARP (Address
Resolution Protocol).  http://en.wikipedia.org/wiki/MAC_address
http://www.dcs.gla.ac.uk/~lewis/networkpages/m05s05IPForwarding.htm

Q: What Linux command shows you your interfaces and their IP addresses
   and network masks?  A: ifconfig (may be under /sbin or /usr/sbin)

Q: What Linux command shows you your main IP routing tables?
   A: ip route  (or: ip route list table main)

Q: How does my computer know if an IP address is on the local network?

Q: How does my machine know what to do with an IP packet if the packet
IP address isn't on the local network?

Q: Does my computer have routing tables for the Internet?  Does my
machine know how a packet will travel to Google.ca ?

Subnetting
----------

http://www.bergen.org/ATC/Course/InfoTech/Coolip/
RFC: http://tools.ietf.org/html/rfc950

Subnetting is the process of being handed a network address and being
able to subdivide it into subnets, correctly deciding how many bits to
use for the subnet and how many bits to leave for the host addresses.

See the examples in: http://www.bergen.org/ATC/Course/InfoTech/Coolip/

Q: Know how to divide a network into subnets.

Figure 14:

   "Notice how sequential subnet numbers do not appear to be sequential
    when expressed in dotted-decimal notation. This can cause a great
    deal of misunderstanding and confusion since everyone believes
    that dotted-decimal notation makes it much easier for human users
    to understand IP addressing. In this example, the dotted-decimal
    notation obscures rather than clarifies the subnet numbering scheme!"

Q: Given an IP address and network mask, determine the network prefix
(the /nn number), the network number, and the broadcast address.
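
To check your hand calculations, Python's standard ipaddress module
can do the same arithmetic.  A minimal sketch (the function name
describe is made up for illustration):

```python
import ipaddress

def describe(ip, mask):
    """Return (prefix length, network number, broadcast address)."""
    iface = ipaddress.ip_interface(f"{ip}/{mask}")
    net = iface.network
    return net.prefixlen, str(net.network_address), str(net.broadcast_address)
```

For example, describe("192.168.1.130", "255.255.255.128") returns
(25, "192.168.1.128", "192.168.1.255").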

Q: Given an IP network address, apply subnetting to the address to supply
a certain number of subnets, or a certain number of hosts.

Q: What is the maximum number of hosts you can have (avoiding the all-zero
and all-one networks and hosts) for a Class C address and a 4-bit subnet?
(How many usable subnets are available with four bits?  How many
usable hosts can reside on each of those sub-networks?  Multiply.) Answer: 196
http://www.ralphb.net/IPSubnet/example.html
http://www.ralphb.net/IPSubnet/restr.html
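
The arithmetic behind that answer, as a small Python sketch (function
names are made up for illustration; the "minus two" is the traditional
rule that excludes all-zeroes and all-ones):

```python
def usable(bits):
    # Traditional rule: all-zeroes and all-ones values are reserved.
    return 2 ** bits - 2

def max_hosts(host_field_bits, subnet_bits):
    # Bits left over for hosts after borrowing subnet_bits for subnets.
    host_bits = host_field_bits - subnet_bits
    return usable(subnet_bits) * usable(host_bits)
```

A Class C address has an 8-bit host field, so max_hosts(8, 4) is
14 subnets times 14 hosts, which is 196.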

Q: What is the next available subnet address after this one 192.168.1.0/24 ?
   Answer: 192.168.2.0 (/24)
   - add one to the network part of the 32-bit number

Q: What is the next available subnet address after this one 192.168.1.0/25 ?
   Answer: 192.168.1.128 (/25)
   - add one to the network part of the 32-bit number

Q: What is the lowest usable host address in the 192.168.1.128/25 network?
   Answer: 192.168.1.129 (/25)
   - avoid the all-zero host address 192.168.1.128 (/25)
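
The three answers above can be checked with Python's standard ipaddress
module.  A minimal sketch (the function names are made up for
illustration):

```python
import ipaddress

def next_subnet(cidr):
    """Add one to the network part: skip over this subnet's addresses."""
    net = ipaddress.ip_network(cidr)
    nxt = net.network_address + net.num_addresses
    return f"{nxt}/{net.prefixlen}"

def lowest_host(cidr):
    """First address after the all-zero host address."""
    net = ipaddress.ip_network(cidr)
    return str(net.network_address + 1)
```

next_subnet("192.168.1.0/25") returns "192.168.1.128/25" and
lowest_host("192.168.1.128/25") returns "192.168.1.129".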

CIDR (supernetting)
-------------------

http://www.bergen.org/ATC/Course/InfoTech/Coolip/
- diagrams of bits for traditional Class A,B,C networking

http://tools.ietf.org/html/rfc1518
- the CIDR proposal

http://www.ipprimer.com/addressing.cfm

   "Although RFC 1812 came out in June of 1995(!), most certification
    tests still test you on the RFC 950 rules, for (in my opinion)
    one of the following reasons:

    * Their software still follows RFC 950 rules (this is rare.)
    * Since RFC 1812 simplifies things significantly, there's not
      enough material to test on. Test items from RFC 950 are added
      as "filler".
    * They are ignorant of the fact that the material on their tests has
      been out of date for more than five years.
    * They are mean-spirited, perniciously forcing you to learn material
      that will never be relevant to your job."

Originally, IP addresses were classified strictly as Class A, B, C
depending on the size of the network part.  Class A addresses used the top
8 bits for the network number; Class B used 16 bits; Class C used 24 bits.
The top few bits of an IP address decided whether an address was A, B,
or C.
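
The "top few bits" test can be sketched in Python.  Class A addresses
start with bit 0, Class B with bits 10, and Class C with bits 110.
The names address_class and DEFAULT_MASK are made up for illustration.

```python
def address_class(dotted):
    """Classify a dotted-quad address by the top bits of its first byte."""
    first = int(dotted.split(".")[0])
    if (first & 0x80) == 0x00:      # 0xxxxxxx
        return "A"
    if (first & 0xC0) == 0x80:      # 10xxxxxx
        return "B"
    if (first & 0xE0) == 0xC0:      # 110xxxxx
        return "C"
    return "other"                  # not one of the A, B, C classes

# Implied network masks under the classful rules.
DEFAULT_MASK = {"A": "255.0.0.0", "B": "255.255.0.0", "C": "255.255.255.0"}
```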

Q: Give an example (network and mask) of a Class A,B,C address.

   "Although the original intent of having Classes was to allow for flexible
    addressing, experience showed that the hard boundary of the three Classes
    actually made the addressing less flexible.  For example, if a site
    connecting to the Internet needed to address 300 hosts, then a Class C
    network wouldn't be adequate and a Class B would need to be assigned.
    This resulted in poor utilization of the assigned address space and
    caused a faster-than-necessary rate of consumption of the available IP
    address space."
    http://www3.ietf.org/proceedings/99jul/I-D/draft-ietf-idr-aggregation-tutorial-01.txt

When the number of IP numbers started to run scarce, the Internet changed
to using an arbitrary number of bits:

   "CIDR removed the idea of Classes from IP.  Instead of having networks
    with an implied number of bits referring to network/host, there are
    "prefixes" with an associated mask explicitly identifying which bits
    refer to network/host.  For example, the prefix "38.245.76.0" with
    a mask of "255.255.255.0" has 24 bits of network and 8 bits of host
    (i.e., it can address the same number of hosts as a Class C network
    even though the prefix is in the Class A range).  The CIDR paradigm
    prefers the term "prefix" over "network" because it's more clear that
    no Class is being implied.  Another way to write this example prefix is
    "38.245.76.0/24", meaning that the mask contains 24 1s in the high-order
    portion of the mask."
    http://www3.ietf.org/proceedings/99jul/I-D/draft-ietf-idr-aggregation-tutorial-01.txt
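
A minimal Python sketch of the two ideas in the quote: a prefix length
is just another way to write a mask, and adjacent prefixes can be
aggregated into a shorter one (supernetting).  It uses the standard
ipaddress module; the function names are made up for illustration.

```python
import ipaddress

def prefix_to_mask(prefix_len):
    """Turn a /nn prefix length into a dotted-quad mask."""
    return str(ipaddress.ip_network(f"0.0.0.0/{prefix_len}").netmask)

def aggregate(cidrs):
    """Merge prefixes that can be covered by a single shorter prefix."""
    nets = [ipaddress.ip_network(c) for c in cidrs]
    return [str(n) for n in ipaddress.collapse_addresses(nets)]
```

prefix_to_mask(24) returns "255.255.255.0".  The pair 192.168.0.0/24
and 192.168.1.0/24 aggregates to 192.168.0.0/23, but 192.168.1.0/24
and 192.168.2.0/24 do not merge because they do not share a 23-bit
prefix.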

CIDR throws out all the traditional classes and subnetting:

   "The solution is simple: someone just issued an edict saying "forget
    everything you learned, we won't bother with those rules any more".
    There's even a command to tell the routers themselves that they should
    ignore the rules - "ip classless" When you break the rules like this,
    and allow netmasks that end in all 0's or all 1's, it's called "CIDR"
    - Classless InterDomain Routing." http://www.gtoal.com/subnet.html
 
Unix Tools
----------

Network tools (RTFM):

- ifconfig, "ifconfig eth0"
  - show MAC, IP address, and network mask of each network interface
  - ifconfig may be in /sbin which may not be in your default $PATH
- ip route (or "netstat -r -n" or "route -n")
  - show IP routing tables, including route to default gateway
- arp, "arp -a"
  - show known (cached) MAC addresses on local net
- traceroute
  - using small, increasing TTLs, find the route of an outgoing packet
  - may be blocked at Algonquin College
- tcpdump (privileged under Linux - needs root permissions)
  - show the raw network activity on a network card
- ethereal, since renamed Wireshark (privileged under Linux - needs root)
  - show the raw network activity on a network card (GUI)

Q: What Unix command shows the MAC, IP address, and network mask of
   each network interface?

Q: What Unix command shows the machine's routing tables?

Q: What Unix command shows the machine's ARP MAC address tables?

Q: What Unix command traces the route a packet takes to a remote host?

Getting a machine on the net - five network parameter requirements
------------------------------------------------------------------

At minimum, your machine needs two network parameters to be a good
network citizen:

   1. an IP address assigned to at least one connected network card
   2. a network mask or prefix length, so you know which IP addresses
      are on the local net and which are not

Q: What are the two minimum network parameters needed to allow your
   machine to talk on the local network?

If you want to talk to more than your local network, you also need:

   3. the IP address of a gateway machine (for off-net access)

Q: What are the three minimum network parameters needed to allow your
   machine to talk to machines not on your local network?

If you want to use names instead of IP addresses, you need:

   4. addresses of DNS server(s) to resolve host names
   5. a host name for your machine (fully qualified with a domain name)

You can program your machine with all or some of these things directly
(static addressing); or, you can have your machine broadcast a request
to see if some other machine on the network has its configuration info:
DHCP, BOOTP(old), RARP

The Unix "hostname" command shows and sets the machine host name.

The Unix "ifconfig" command shows and sets IP addresses and network
masks on interfaces.

A "gateway" machine is a machine on your local network to which packets
will be sent if your machine doesn't know where else to send them.
Without a gateway, your machine can only communicate with other machines
on the local network segment (the local ARP domain).  The "arp" command
shows the current kernel table listing known MAC addresses on the local
network.  The "route" command shows you your routing tables, including
the "default" route to your gateway machine.

You can run your machine without defining any DNS servers, in which case
you will have to use IP addresses (not names) for all hosts.  The file
/etc/resolv.conf ("man resolv.conf") contains definitions of your domain
name and your DNS servers.
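
A minimal Python sketch that pulls the domain name and DNS servers out
of resolv.conf-style text.  It handles only the "domain" and
"nameserver" keywords and is not a full parser; the function name is
made up for illustration.  See "man resolv.conf" for the full format.

```python
def parse_resolv_conf(text):
    """Return (domain, [nameserver addresses]) from resolv.conf text."""
    domain = None
    servers = []
    for line in text.splitlines():
        if line.startswith(("#", ";")):   # comment lines
            continue
        fields = line.split()
        if len(fields) < 2:               # blank or incomplete line
            continue
        if fields[0] == "nameserver":
            servers.append(fields[1])
        elif fields[0] == "domain":
            domain = fields[1]
    return domain, servers
```

To try it on your own machine, pass it the contents of the file, e.g.
parse_resolv_conf(open("/etc/resolv.conf").read()).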

The network broadcast address can be calculated from the IP address and
the network mask: set all the host bits to one.

Q: What Internet network access is possible without a DNS server?

Q: What Internet network access is possible without a gateway machine?

Q: What Internet network access is possible without a network mask?

Q: I want my computer to talk to another computer on the same network
   as mine.  What minimum network configuration do I need?

Q: I want my computer to talk to another computer on a different
   network from mine.  What minimum network configuration do I need?