-------------------------
Week 08 Notes for CST8165
-------------------------
-Ian! D. Allen - idallen@idallen.ca - www.idallen.com

Remember - knowing how to find out an answer is more important than
memorizing the answer.  Learn to fish!  RTFM!  (Read The Fine Manual)

Keep up on your readings (Course Outline: average 4 hours/week homework)

Review:
------
- host and dig
- looping echo server and client
- reading from network sockets (how to get all the data?)
- RFC, IETF
- four (maybe five) layers: Application, Transport, Network, Physical

IP - Internet Protocol
----------------------
- http://tools.ietf.org/html/rfc791 (45 pages, Sep 1981)
- layer 2 of the 4 (or 5) layer stack:
    4 - application layer (programs)
    3 - TCP/UDP (transport/host-to-host layer)
    2 - IP (Internet/gateway layer), ICMP
    1 - Network/hardware layer (e.g. Ethernet, ARP, MAC addresses)
        (Layer 1 may be split into Physical/Network Access)

The Internet four (or five) layer stack has IP at layer 2.  Below IP are
one (or two) layers; above IP are another two layers.
- Figure 2: http://www.garykessler.net/library/tcpip.html#arch

At or near the bottom, below IP, is the Network layer (e.g. Ethernet)
- ARP converts between Ethernet hardware (MAC) and IP addresses
- ARP is part of both "layers" (IETF doesn't like the "layers" concept)
  http://www.garykessler.net/library/tcpip.html#ARP

Most everything on the Internet starts with just plain IP, "the Internet's
most basic protocol" (http://www.freesoft.org/CIE/Topics/79.htm):

* Internet layer - IP
  - IP has no port information; only IP addresses
    Figure 4: http://tools.ietf.org/html/rfc791#section-3.1
    Figure 4: http://www.garykessler.net/library/tcpip.html#IP
  - simple
    http://www.freesoft.org/CIE/Topics/79.htm
  - large amounts of data may be "fragmented" into multiple IP packets
    - the IP Identification field tags the fragments of one packet so
      they can be re-assembled later (the Fragment Offset field gives
      their order)
    - this was later determined to be a Very Bad Idea
    - fragmentation is now considered harmful, difficult to get right, etc.
    - more on this later (below)

Compare protocol complexities:
- IP RFC791 is 45 pages
- TCP and UDP are on "top" of IP (means packets go *inside* IP packets)
- UDP RFC768 is only 3 more pages on top of IP (unreliable)
- TCP RFC793 is 85 more pages on top of IP (reliable)
- DCCP RFC4340 is 125 pages on top of IP (!!)

Q: T/F packets get larger as they move down the protocol stack from
   Layer 4 (Application) down to the Physical media.

* Overloading the IP network:
  http://www.africonnect.com/tcpip_tut.htm
  "The IP protocol does not guarantee delivery, or that packets will
  arrive in the proper sequence. [...]
  "Rather than simply discarding all newly arriving packets, the routers
  are programmed to discard packets in a random fashion to prevent buffer
  overflow.  This is best implemented in a "fair" way so that the data
  stream having the largest volume suffers the largest number of dropped
  packets."

Q: True/False - the IP packet header contains port numbers

Q: Looking at RFC791 Figure 4, what is the longest total length
   theoretically possible for an IP packet?

Q: Looking at RFC791 Figure 4, what is the largest time-to-live value
   possible?

Q: What does ARP stand for and how is it used in Internet networking?

Q: What happens to packets when the Internet gets overloaded?
   How do routers recover from an overload?

ICMP - Internet Control Message Protocol
----------------------------------------
- same layer as IP (layer 2)
Ref: http://www.freesoft.org/CIE/Topics/81.htm

Q: Is the delivery of ICMP messages guaranteed?

Q: What is ICMP used for on the Internet (name two of four functions)?
     Announce network errors (e.g. unreachable), network congestion (quench)
     Announce time-outs (zero TTL)
     Troubleshooting (ICMP echo)

Q: What popular program uses ICMP echo packets?
   http://www.freesoft.org/CIE/Topics/53.htm

Q: How does traceroute use ICMP to map a packet route?
   http://www.freesoft.org/CIE/Topics/54.htm

Q: Traceroute is not reliable.  What can go wrong (describe two things)?
   http://www.freesoft.org/CIE/Topics/54.htm

Layer Three: TCP and UDP - port numbers
---------------------------------------
Just above the IP layer is the Transport layer (layer 3).
UDP and TCP add "port" numbers to IP, for host-to-host communication.
- http://www.garykessler.net/library/tcpip.html#transport

Just above layer 3 Transport (TCP/UDP) is the Application layer (4)
- this is the part where you get to write the application code
- SMTP, HTTP, POP3, etc.

The two major Linux/Unix socket types are UDP (SOCK_DGRAM) and TCP
(SOCK_STREAM).  Both extend IP addressing with the concept of "ports".
Reference:
http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html#twotypes

- UDP is essentially raw IP plus port numbers; still unreliable
  See the RFC: http://tools.ietf.org/html/rfc768 (only 3 pages!)
  - used in DNS and TFTP
  - UDP is message-oriented - discrete chunks (datagrams), unreliable
- TCP is like streaming UDP with reliable transmission added
  See the RFC: http://tools.ietf.org/html/rfc793 (85 pages!)
  - TCP is stream-oriented - arbitrary byte stream, reliable
    http://www.tcpipguide.com/free/t_TCPDataHandlingandProcessingStreamsSegmentsandSequ.htm

Most UDP/TCP Port Numbers have to be Registered with IANA
- IANA: Internet Assigned Numbers Authority
- Master IANA List of ports: http://www.iana.org/assignments/port-numbers
- ports are in three ranges: "Well Known", "Registered", "Dynamic/Private"
- you SHOULD NOT use a "Well Known" or "Registered" port without
  first registering it with IANA.
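
The three port ranges can be sketched in a few lines.  This is only an
illustration, written in Python for brevity; the helper name
port_range_name is made up for this example, and the boundaries are the
ones published in the IANA port-number list referenced above.

```python
# Port ranges as published in the IANA port-number registry.
WELL_KNOWN = range(0, 1024)            # 0 through 1023
REGISTERED = range(1024, 49152)        # 1024 through 49151
DYNAMIC_PRIVATE = range(49152, 65536)  # 49152 through 65535


def port_range_name(port):
    """Return the IANA range name for a TCP/UDP port (hypothetical helper)."""
    if port in WELL_KNOWN:
        return "Well Known"
    if port in REGISTERED:
        return "Registered"
    if port in DYNAMIC_PRIVATE:
        return "Dynamic/Private"
    # Port numbers are 16-bit fields, so anything else cannot appear
    # in a TCP or UDP header.
    raise ValueError("port numbers are 16 bits: 0 through 65535")
```

For example, port_range_name(80) names the range that HTTP lives in, and a
port picked for a lab exercise would normally come from the
"Dynamic/Private" range.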

Major service port numbers (often seen in trace output):
http://www.tcpipguide.com/free/t_TCPCommonApplicationsandServerPortAssignments.htm
- port numbers are given names in the Linux/Unix file /etc/services
- see also the master list at http://www.iana.org/assignments/port-numbers

  * TCP       20       ftp-data
  * TCP       21       ftp (control)
    TCP       22       SSH
    TCP       23       telnet
  * TCP       25       SMTP (sending mail only)
  * UDP/TCP   53       domain (DNS)
    UDP       67-68    DHCP
  * TCP       80       HTTP (WWW)
  * TCP       110      POP3 (receiving mail only)
    TCP       113      ident (identifying incoming TCP connections)
    TCP       119      NNTP (Network News)
    UDP/TCP   123      NTP (Network Time)
    UDP/TCP   137-139  Microsoft netbios (SMB) (Samba)
    TCP       443      HTTPS (secure WWW)
    UDP/TCP   445      Microsoft-DS
    UDP/TCP   631      Internet Printing Protocol (IPP - CUPS)

The "*" protocols are the ones most important in this course.

On Unix/Linux, individual network servers/daemons (e.g. ssh, http) may
have individual start-up scripts, or they may run on demand out of the
master "inetd" or "xinetd" super-servers.

Socket Options for UDP/TCP/IP
-----------------------------
As an application programmer, what control does your application have
over the lower-level TCP/IP layers in Unix/Linux?
- you can set options on the sockets you open that affect the TCP/IP stack
- "man 7 socket" setsockopt(2) and getsockopt(2)
    - SO_KEEPALIVE
    - SO_RCVTIMEO and SO_SNDTIMEO (useful in port scanning)
    - SO_BINDTODEVICE
    - SO_REUSEADDR (you already used this in labs)
    - SO_DONTROUTE
    - SO_BROADCAST
    - SO_LINGER
    - SO_PRIORITY

The SO_SNDTIMEO option can be used in a port scanner to reduce the amount
of time that the program waits for a reply from a blocked port (a port
that sends back neither a TCP reset nor an ICMP error, so the scanner
never sees "connection refused") or from a machine that does not exist.

Q: What function calls are available to C programmers to set options
   on sockets?  Give two examples of the kind of options you can set.

Q: The TCP/UDP header contains port numbers.  Why aren't the source and
   destination addresses also in the TCP/UDP header?

Q: Why is the UDP RFC 3 pages but the TCP RFC is 85 pages?

Q: What port numbers lie in the "Well Known" range?

Q: T/F your Internet application can use any port it wants outside of
   the "Well Known" range

Understanding UDP
-----------------
Ref: http://tools.ietf.org/html/rfc768 (only 3 pages!)

http://www.freesoft.org/CIE/RFC/1122/72.htm
"The User Datagram Protocol UDP [UDP:1] offers only a minimal transport
service -- non-guaranteed datagram delivery -- and gives applications
direct access to the datagram service of the IP layer.  UDP is used by
applications that do not require the level of service of TCP or that
wish to use communications services (e.g., multicast or broadcast
delivery) not available from TCP.

UDP is almost a null protocol; the only services it provides over IP are
checksumming of data and multiplexing by port number.  Therefore, an
application program running over UDP must deal directly with end-to-end
communication problems that a connection-oriented protocol would have
handled -- e.g., retransmission for reliable delivery, packetization and
reassembly, flow control, congestion avoidance, etc., when these are
required.  The fairly complex coupling between IP and TCP will be
mirrored in the coupling between UDP and many applications using UDP."

- a very thin layer added inside an IP packet
- like raw IP, UDP is unreliable, no retransmission: "fire and forget"
- adds "ports" to IP and little else: any reliability or retransmission
  work has to be done by the application (as is done by TCP)
  - recall that the TCP RFC is 85 pages; that's an indication of how hard
    it would be to make your application turn UDP into a reliable protocol!
- a big user of UDP is basic DNS queries and replies
  (DNS zone transfers use TCP, as do replies too large for UDP; most
  everything else is UDP)

Q: What four fields are added to raw IP by a UDP packet header?

To ensure that a UDP packet arrives at the right destination, the checksum
in UDP includes a "pseudo-header" that includes some of the IP header
information.
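
The pseudo-header idea can be sketched as follows.  This is a minimal
illustration in Python, not a real UDP implementation: the function names
are made up for this example, only IPv4 is handled, and the field layout
(source address, destination address, a zero byte, protocol number 17,
UDP length) follows the description in RFC768.

```python
import socket
import struct

UDP_PROTOCOL = 17  # IP protocol number for UDP


def ones_complement_sum16(data):
    """16-bit one's complement sum; odd-length data is zero-padded."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = total % 0x10000 + total // 0x10000  # end-around carry
    return total


def pseudo_header(src_ip, dst_ip, udp_length):
    """Build the 12-byte IPv4 pseudo-header; it is never transmitted."""
    return struct.pack("!4s4sBBH",
                       socket.inet_aton(src_ip), socket.inet_aton(dst_ip),
                       0, UDP_PROTOCOL, udp_length)


def build_udp_segment(src_ip, dst_ip, src_port, dst_port, payload):
    """Return UDP header + payload with the checksum field filled in."""
    length = 8 + len(payload)  # the UDP header is 8 bytes
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    total = ones_complement_sum16(
        pseudo_header(src_ip, dst_ip, length) + header + payload)
    # A computed checksum of zero is sent as all ones (RFC768).
    checksum = (~total & 0xFFFF) or 0xFFFF
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload


def udp_checksum_ok(src_ip, dst_ip, segment):
    """Receiver-side check, using the addresses from the IP header."""
    total = ones_complement_sum16(
        pseudo_header(src_ip, dst_ip, len(segment)) + segment)
    return total == 0xFFFF
```

A segment built for one destination address fails the check if it is
verified against a different destination address, even though the UDP
header and data are untouched.  That is the sense in which the
pseudo-header protects against mis-delivered packets.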

See page 2 in: http://tools.ietf.org/html/rfc768

A very brief history of the development of the pseudo-header, and how the
NSA messed things up by preventing encryption:
http://www.postel.org/pipermail/end2end-interest/2005-February/004616.html
http://www.postel.org/pipermail/end2end-interest/2005-February/004617.html

Q: What is the purpose of the UDP or TCP "pseudo header"?

Understanding TCP
-----------------
Ref: http://tools.ietf.org/html/rfc793 (85 pages!)

http://www.ssfnet.org/Exchange/tcp/tcpTutorialNotes.html
http://www4.informatik.uni-erlangen.de/Projects/JX/Projects/TCP/tcpstate.html

"TCP provides a connection oriented, reliable, byte stream service.  The
term connection-oriented means the two applications using TCP must
establish a TCP connection with each other before they can exchange data.
It is a full duplex protocol, meaning that each TCP connection supports a
pair of byte streams, one flowing in each direction.  TCP includes a
flow-control mechanism for each of these byte streams that allows the
receiver to limit how much data the sender can transmit.  TCP also
implements a congestion-control mechanism."

Q: Does TCP include flow-control and/or congestion control?

Q: Can a TCP connection be one-way or must it always be two-way?

Q: What purpose does the "pseudo header" serve in calculating a checksum?
   http://tools.ietf.org/html/rfc793 page 16-17
   http://www.postel.org/pipermail/end2end-interest/2005-February/004617.html
   http://www.postel.org/pipermail/end2end-interest/2005-February/004616.html

Handshaking: 3-way open, 4-way close, including SYN, ACK, FIN, etc.
- http://www.garykessler.net/library/tcpip.html#connect
"This three-way handshake is sometimes referred to as an exchange of
"syn, syn/ack, and ack" segments.  It is important for a number of
reasons.  For individuals looking at packet traces, recognition of the
three-way handshake is how to find the start of a connection.
For firewalls, proxy servers, intrusion detectors, and other systems, it
provides a way of knowing the direction of a TCP connection setup since
rules may differ for outbound and inbound connections."

Q: Outline the TCP flags used in the basic TCP 3-way handshake.
   Clearly indicate which is server and which is client.

You can attack some servers by doing many partial handshakes and
exhausting resources:
- syn flood attack: http://www.vijaymukhi.com/vmis/tcp.htm

Q: How does a syn-flood attack work?

UDP and TCP packet header
-------------------------
The TCP header is much more complex than the UDP header
http://www.ssfnet.org/Exchange/tcp/tcpTutorialNotes.html#TH
- has to handle issues dealing with reliability, flow control, congestion
- note the peculiar TCP/UDP pseudo-header for checksums
  - UDP and TCP checksums include the source and destination IP addresses!
  - UDP pseudo-header: http://tools.ietf.org/html/rfc768 page 2
  - TCP pseudo-header: http://tools.ietf.org/html/rfc793 page 16-17
  http://www.postel.org/pipermail/end2end-interest/2005-February/004617.html
  http://www.postel.org/pipermail/end2end-interest/2005-February/004616.html

Q: What purpose does the UDP "pseudo header" serve in calculating a
   checksum?

Q: What purpose does the TCP "pseudo header" serve in calculating a
   checksum?

Q: T/F TCP and UDP include the IP layer packet source and destination
   addresses in their checksum calculations.

TCP state transition diagram
----------------------------
http://www4.informatik.uni-erlangen.de/Projects/JX/Projects/TCP/tcpstate.html
http://www.ssfnet.org/Exchange/tcp/tcpTutorialNotes.html#ST

- client and server both start in the CLOSED state (top of diagram)
- graph arrows are labelled with the event that triggers the transition,
  optionally followed by "/" and what is sent in response; the event is
  either an incoming packet with a flag set (e.g. ACK, FIN) or a
  deliberate change to another state (e.g. "Passive Open", "Close",
  "Send").

* The "three-way handshake" for a non-simultaneous connection opening:
  - a server is sitting in the LISTEN state
  - a client does an "active open"
  1. client sends: SYN, moves to SYN_SENT
  2. server sends: SYN,ACK, moves to SYN_RCVD
  3. client sends: ACK, moves to ESTABLISHED
  4. server receives ACK and moves to ESTABLISHED
  Now both processes are in the "ESTABLISHED" state.

Q: Give the three-way TCP handshake, showing the role of client and server

* A *simultaneous* TCP connection opening:
  1. both systems send SYN and move to SYN_SENT
  2. both receive SYN, send SYN,ACK (RFC793 diagram has an error) and
     move to SYN_RCVD
  3. both systems receive SYN,ACK and move to ESTABLISHED
  - RFC1122 4.2.2.8 says RFC793 has an error on what is sent on the
    transition from SYN_SENT directly to SYN_RCVD: should be sending
    SYN,ACK, not SYN
    http://tools.ietf.org/html/rfc1122
  - the corrections suggested by RFC1122 appear to break the simultaneous
    open; one has to interpret the "ACK" transition as "ACK or SYN,ACK"

Q: Looking at the TCP state transition diagram, into which state will
   a program move if it is currently in state SYN_SENT and it receives
   a TCP packet with just the SYN flag set?  When it makes that state
   transition, what flags will it set in the next outgoing packet?

- be familiar with interpreting a TCP state diagram in RFC793
  - three-way handshake for an asymmetric (non-simultaneous) open
  - trace a simultaneous open in RFC793
    - the corrections suggested by RFC1122 appear to break simultaneous
      open; one has to interpret the "ACK" transition as "ACK or SYN,ACK"
- RFC1122 section 4.2.2.10 says: http://tools.ietf.org/html/rfc1122
  "It sometimes surprises implementors that if two applications attempt
  to simultaneously connect to each other, only one connection is
  generated instead of two.  This was an intentional design decision;
  don't try to "fix" it."

Q: T/F When two systems attempt simultaneous connections with each other,
   you end up with two separate TCP streams.
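
The connection-opening part of the state diagram can be written down as a
lookup table and traced by hand or by program.  The following is a small
sketch in Python covering only the arrows listed in these notes; the event
names, the table layout, and the helpers step and trace are made up for
this illustration, and the "rcv SYN,ACK" arrow out of SYN_RCVD encodes the
"ACK or SYN,ACK" reading described above rather than anything drawn in
RFC793.

```python
# (current state, event) -> (flags sent in response, next state)
# Only the connection-opening arrows from the notes are included.
TRANSITIONS = {
    ("CLOSED", "passive open"): (None, "LISTEN"),
    ("CLOSED", "active open"): ("SYN", "SYN_SENT"),
    ("LISTEN", "rcv SYN"): ("SYN,ACK", "SYN_RCVD"),
    ("SYN_SENT", "rcv SYN,ACK"): ("ACK", "ESTABLISHED"),
    # Simultaneous open: sends SYN,ACK, per the RFC1122 correction.
    ("SYN_SENT", "rcv SYN"): ("SYN,ACK", "SYN_RCVD"),
    ("SYN_RCVD", "rcv ACK"): (None, "ESTABLISHED"),
    # Assumption from the notes: treat SYN,ACK like ACK in SYN_RCVD.
    ("SYN_RCVD", "rcv SYN,ACK"): (None, "ESTABLISHED"),
}


def step(state, event):
    """Follow one arrow; return (flags_sent, next_state)."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no arrow for {event!r} in state {state}") from None


def trace(events, state="CLOSED"):
    """Follow a sequence of events; return each (flags_sent, state) visited."""
    path = []
    for event in events:
        sent, state = step(state, event)
        path.append((sent, state))
    return path
```

Tracing ["active open", "rcv SYN,ACK"] follows the client side of the
three-way handshake, ["passive open", "rcv SYN", "rcv ACK"] follows the
server side, and ["active open", "rcv SYN", "rcv SYN,ACK"] follows either
end of a simultaneous open.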
Lab work: Testing all three processes in a TCP echo client/server application.