-------------------------
Week 02 Notes for NET2003
-------------------------
-Ian! D. Allen - idallen@idallen.ca - www.idallen.com

Remember - knowing how to find out an answer is more important than
memorizing the answer.  Learn to fish!  RTFM!  (Read The Fine Manual)

Midterm test dates are posted on the Course Home Page.

Lab #01 is available in the Class Notes area.

Comments on lab work
 - read the whole question before starting to answer it
   - the hints that make life easier are at the end of the question

Review:
 - you can log in to the course Linux server
 - you have a basic knowledge of the VIM editor
 - you have finished Lab 1 (simple editing using VIM)

On Learning VIM: See the image learning_curves.jpg in the Class Notes.

Course Linux Server
-------------------
 - Ubuntu 7.10 Linux distribution (October 2007)
 - Ubuntu is based on the Debian Linux distribution

The Internet
------------
http://en.wikipedia.org/wiki/Internet

   "The Internet is a worldwide, publicly accessible series of
   interconnected computer networks that transmit data by packet switching
   using the standard Internet Protocol (IP).  It is a "network of
   networks" that consists of millions of smaller domestic, academic,
   business, and government networks, which together carry various
   information and services, such as electronic mail, online chat, file
   transfer, and the interlinked Web pages and other documents of the
   World Wide Web."

 - the Internet is not just the WWW (HTTP)
   - HTTP is just one of many, many Internet protocols!
   - but Algonquin College blocks most non-HTTP traffic
     - in particular, the SMTP port (25) is blocked to external sites
     - blocks are "drop packet", not "refuse packet" types; they time out
 - the Internet was not developed as a proprietary system
   - standards-based vs. product-based
   - based on defined protocols, not on vendor products or implementations
   - nobody pays license fees to use TCP/IP, SMTP, HTTP, etc.
   - Tim Berners-Lee doesn't get royalties for your web site
 - why do companies still write web pages that only work in one browser?
   - e.g. Algonquin Blackboard
   - http://www.anybrowser.org/campaign/
   - the mistake of designing for a vendor's product, not for an
     international standard protocol

Role of Unix (now Linux or BSD) and the Internet:
------------------------------------------------
 - WWW slashes are "forward" slashes because the WWW grew up on open-source
   Unix machines.  (DOS/Windows came much later, and was closed-source.)
 - text-based Internet protocols pre-date XML (everything is text in Unix)
   - Unix was full of tools to deal with text and text files
   - an "ethereal" or "netcat" text dump of most Internet protocols is
     often very readable (no binary junk)
 - Be aware of the history and importance of Open Source in the development
   of the Internet and its protocols (e.g. RFC).  The Internet could not
   have evolved under a closed-source, pay-per-view business model.
   (Don't let it head that way!)
 - Internet development was Open Source:
   - "FLOSS": Free/Libre Open Source Software (or "FOSS" in the USA)
   - open-source discussions occur with source code samples

Is the Internet smart about content?
-----------------------------------
 - The Internet is dumb.  It wasn't designed to give priority to different
   owners of packet traffic.  The intelligence is "at the edges" of the net.
 - Some say you could implement the Internet using two cans and a string;
   or, even using carrier pigeons:
   - pigeons: http://tools.ietf.org/html/1149  (1 April 1990)
   - pigeons: http://www.blug.linux.no/rfc1149/

Net Neutrality - not for long?
--------------
 - Like the downtown streets at rush hour, the Internet doesn't (yet) pass
   traffic based on how much money you have.  You can't get higher priority
   by paying more; though, this may change (on the Internet) in the next
   year or two if the backbone carriers have their way.
 - http://www.digital-copyright.ca/taxonomy/term/396

   * AT&T blocks Pearl Jam's Bush slam : Pearl Jam calls for Net Neutrality

     A Salon article discusses how AT&T unilaterally censored political
     speech at a Pearl Jam concert:

       The band says the company's actions highlight the need for action
       on "network neutrality" -- the fight for regulations prohibiting
       broadband firms from making decisions about what content is and is
       not allowed on their networks.  AT&T is currently fighting network
       neutrality, helping the NSA spy on Americans, and developing a way
       for Hollywood to police the Internet.

   * Rogers Must Come Clean on Traffic Shaping:

     Michael Geist's weekly Law Bytes column (Toronto Star version,
     Homepage version) focuses on Rogers, a leading Canadian ISP, actively
     engaging in "traffic shaping", a process that limits the amount of
     bandwidth available for certain applications.  Although this was
     initially limited to peer-to-peer file sharing applications, there is
     mounting speculation that the practice may be affecting basic
     functionality such as email and the use of virtual private networks.

The Internet - who owns it?  who controls it?
------------
 - IP and port address space is coordinated by ICANN/IANA
   - Internet Corporation for Assigned Names and Numbers: icann.org
   - Internet Assigned Numbers Authority http://www.iana.org/
 - Internet Engineering Task Force (IETF): http://www.ietf.org/
   - Motto: "Rough consensus and running code."

     "When I was studying Physics the quickest way to end an argument was
     to show the explanation in mathematics (albeit a lot of handwaving
     mathematics!).  Most software developers on the otherhand do not grok
     math, however they surely do grok code.  Therefore if you could
     explain your arguments through code then you would have improved your
     odds of getting your message through."
     http://www.manageability.org/blog/stuff/rest-explained-in-code/view

   "Be liberal in what you accept, and conservative in what you send"
   (Jon Postel, TCP/IP developer)

   * But: "If we were all conservative in what we do, then we wouldn't do
     much that is new, or different.  This would seem to retard progress.
     Of course, the same would be true in protocols so perhaps we need a
     "where possible" qualifier."
     http://www.aaronsw.com/weblog/000776

 - Internet standards: ARPAnet Request for Comment - RFC
     http://tools.ietf.org/html/
     IP:   http://tools.ietf.org/html/791   (45 pages)
     UDP:  http://tools.ietf.org/html/768   (3 pages on top of IP)
     TCP:  http://tools.ietf.org/html/793   (85 pages on top of IP)
     SMTP: http://tools.ietf.org/html/2821  (79 pages on top of TCP)
     TCP tutorial: http://tools.ietf.org/html/1180

 * Who controls handing out the IP numbers and port numbers?
   - the Internet Corporation for Assigned Names and Numbers (ICANN)
     through its operating unit the Internet Assigned Numbers Authority
     (IANA)
       "Dedicated to preserving the central coordinating functions of the
       global Internet for the public good."
       ICANN: http://www.icann.org/
       IANA:  http://www.iana.org/
   - IANA delegates to a few Regional Internet Registries (RIRs) to
     distribute the large blocks of IP addresses
       http://www.iana.org/ipaddress/ip-addresses.htm
       http://www.iana.org/assignments/ipv4-address-space
     - e.g. American Registry for Internet Numbers (ARIN) IP address list
       http://www.arin.net/
   - special IP addresses (historical and current) are documented in RFC3330
       http://tools.ietf.org/html/3330
     - note: hosts on this net are allocated: 0.0.0.0/8
     - note the important RFC1918 private address space:
         10.0.0.0     -  10.255.255.255   (10/8 prefix)
         172.16.0.0   -  172.31.255.255   (172.16/12 prefix)
         192.168.0.0  -  192.168.255.255  (192.168/16 prefix)
       "the Internet does not inherently protect against abuse of these
       addresses; if you expect (for instance) that all packets from the
       10.0.0.0/8 block originate within your subnet, all border routers
       should filter such packets that originate from elsewhere.  Attacks
       have been mounted that depend on the unexpected use of some of
       these addresses."
   - IANA TCP/UDP port list (see RFC4340 for the three big divisions):
       http://www.iana.org/assignments/port-numbers
     - Well Known Ports are those from 0 through 1023
       - only Unix privileged (root) programs can bind to these ports
     - Registered Ports are those from 1024 through 49151
       - note that 65536 - 16384 = 49152  (2**16 - 2**14 = 49152)
     - Dynamic and/or Private Ports are those from 49152 through 65535
     - a shorter Unix/Linux specific copy of this file is kept in
       /etc/services
     - to register a new port, see [RFC4340], Section 19.9
       http://tools.ietf.org/html/rfc4340#section-19.9

============================================================================

Q: T/F the Internet is patented; companies pay royalties to use the
   WWW and IP protocols

Q: T/F you can pay more to have your data packets given priority on the
   global Internet

Q: What organization is the ultimate authority on IP addresses and ports?
   Give the full name.

Q: What organization is delegated to manage IP addresses in North America?
   Give the full name.

Q: What does "Be liberal in what you accept, and conservative in what you
   send" mean?

Q: What does the acronym "FLOSS" mean?

Q: What do the initials RFC mean with regard to Internet standards
   documents?

Q: Give the three RFC1918 private address space blocks and their masks

Q: What is the last IP address in the RFC1918 block 172.16.0.0/12 ?

Q: Is 172.15.0.0 a RFC1918 private address?

Q: Is 172.17.0.0 a RFC1918 private address?

Q: What is the last (highest) private address in the RFC1918 10.0.0.0 block?

Q: What is the last (highest) private address in the RFC1918 172.16.0.0
   block?

Q: What is the last (highest) private address in the RFC1918 192.168.0.0
   block?

Q: T/F the Internet will not route RFC1918 private addresses

Q: T/F Special address block 0.0.0.0 is reserved for hosts on your local
   network.  [see RFC3330]

Q: T/F IP address 0.0.0.0 is not a valid address.  [see RFC3330]

Q: What Unix/Linux file is used to turn "smtp" into "25" when you do
     $ telnet localhost smtp
     $ nc -v localhost smtp

Q: Name and give the port ranges of the three RFC4340 divisions of ports
   ( http://tools.ietf.org/html/rfc4340#section-19.9 )

Q: Which port numbers can only be bound to by the super-user on Unix/Linux?
   What is the IANA name for this reserved-for-super-user port range?
   (Not all operating systems restrict access to these low-numbered ports.)
   Ref: http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html#bind
   See the paragraph: "Another thing to watch out for when calling bind():
   don't go underboard with your port numbers. ..."

============================================================================

Motivations for learning shell script (command line) programming
----------------------------------------------------------------
1) Six out of 9 third year co-op students in BIT/NET said they used
   scripting on the job (and needed to know more!).

2) See the shell script /etc/init.d/apache2 that starts the Apache web
   server.  Can you read and modify this script?

3) The network manager thinks someone is abusing their server login
   account.  Can you generate an instant sorted list of the top 10 logins
   to the machine?
   It's one line of script:

     $ last | awk '{print $1}' | sort | uniq -c | sort -nr | head

   (Generate a list of logins; print the first (userid) field on each line;
   sort the userids; count adjacent identical lines; sort the counts in
   reverse numeric order; pick the top ten.)

Linux basics & Linux command line interface (CLI) tour
------------------------------------------------------
Unix is an O/S designed by programmers for programmers
 - command-line driven (programmers didn't use the GUI)
 - things work silently (no confirmation)
 - messages appear only when things fail
 - most command names are cryptic abbreviations!
 - like vim, the commands are hard to learn but are short and easy to use

Using the Unix Shell
 - up-arrow repeats previous commands in shell command history
 - TAB will complete command names and pathnames (but not arguments)
 - "exit" will cause the shell to exit, possibly logging you out if there
   are no other shells running on that connection

Pagination commands
 - several commands paginate output, e.g. more, less, pg
 - you "pipe" the output of verbose commands into these programs, e.g.
     $ ls -l /usr/bin | less
     $ last | more
 - pagination is not built in to each command, as it is in DOS
   - one set of commands does all the pagination; it isn't part of other
     programs
 - searching can be done using / in MORE and LESS (and VIM)
 - space bar goes down by pages; return goes down by lines
 - in more or less, type h or ? at the prompt to get a help screen

Manual pages
 - using the man command, man -k, and apropos (man_page_RTFM.txt)
 - man pages are displayed using LESS on Linux
   - as with less, you can use / to search the man pages for words

Entering console or command-line command text - the terminal driver
 - your terminal is two devices, keyboard and screen, loosely coupled
   - many programs can write on your screen at the same time
 - control chars (unprintables) syntax:
     ^X means CTRL-X for the 32 ASCII characters from @ to A to Z to [ to _
        - see "man ascii" and "man latin1"
     ^? by convention means the unprintable DEL character
        (does not mean CTRL-?)
 - use the backspace key to erase one char in Unix/Linux
   - but Unix has to know what character the backspace key sends!
   - do not use the back-arrow key to erase!  use the backspace key
     - VIM and some other programs use back-arrow to move, not erase
 - terminal driver line edit characters: ^H ^? ^W ^U
   - ^H or ^? - erase previous character (backspace)
   - ^W - erase most recent word on the line (same as vim)
   - ^U - erase entire line (same as vim)
 - other control characters:
   - ^R - redraw line (in case overwritten by background program output)
   - ^C - interrupt the current (foreground) process
   - ^D - send EOF (end of input) to program from keyboard
   - ^L - often clear/redraw screen (in bash shell, less, more, and vim)
   - ^Z - suspend/stop (not kill) current process temporarily; use the
          built-in shell command fg to restart the process

If your backspace key isn't recognized by Unix/Linux, you can fix it:
 - see Notes file terminal.txt
     $ stty erase '^?'
     $ stty erase '^H'
 - the quote characters protect the argument to stty from interpretation
   by the shell

Important: Most Unix commands that take file names as arguments will read
standard input (usually your keyboard) if no file names are given.  Unlike
the shell, the commands will *not* prompt when reading your keyboard.
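
A minimal sketch of the "reads standard input if no file names are given"
rule, assuming a POSIX shell.  The sample words and the variable name
"sorted" are made up for illustration.  A pipe stands in for the keyboard
here, so nothing has to be typed and no ^D is needed:

```shell
# sort is given no file names, so it reads standard input.
# printf supplies three lines on the pipe; when printf exits, sort sees EOF,
# which is what ^D would signal if the input came from the keyboard.
sorted=$(printf 'pear\napple\nfig\n' | sort)

# Show the result: the three words, one per line, in sorted order.
echo "$sorted"
```

Run plain "sort" with no pipe to see the keyboard case: it waits silently,
with no prompt, until you type some lines and then ^D.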

Some new useful command names:
 - "sort" sorts all its argument files (together) to standard output
   - the files themselves are not changed
 - "cat" catenates all its argument files (one after the other) to stdout
   - the files themselves are not changed
 - "stty erase X" sets your backspace (erase) character to X

Differentiate between EOF (^D) and Interrupting Processes (^C)
 - ^D and ^C are not the same
   - ^C kills the process and it doesn't finish what it was doing
 - many programs read your keyboard if you don't give them any files
     $ sort   - note difference between ^C and ^D
     $ wc     - note difference between ^C and ^D
     $ cat    - less difference, since cat doesn't buffer to a terminal

Another terminal control character: ^Z
 - suspends/stops (does not kill) the current process
 - allows you to suspend VIM (or something else) so you can do other work
 - use the built-in shell command fg to resume the process again
 - use the built-in shell command jobs to see suspended processes that
   were created from the current shell

For most programs that talk to your screen to work (e.g. VIM, LESS), Unix
needs to know what kind of terminal emulation your screen is using (e.g.
vt100, xterm, ansi, etc.).  Sometimes you have to set this explicitly:
see Notes file terminal.txt

Unix file system notes - see Notes file pathnames.txt
 - Unix pathnames use slashes / not backslashes \
 - slashes *separate* pathname components
 - the first directory to the left of the leftmost slash is the ROOT
   directory that has no name (often incorrectly called "/" because
   calling it "" is awkward)
 - "absolute" pathnames start with a slash (preceded by the empty ROOT)
 - "relative" pathnames do not start with a slash
   - but note that a leading tilde "~" contains a hidden slash,
     e.g. ~idallen !
   - but note that shell variables such as $HOME may also contain slashes,
     e.g. $HOME/foo --> /home/idallen/foo --> absolute pathname
 - there are no "drive letters" in Unix
   - hardware can be mounted anywhere in the file system tree

See Notes file miscellaneous.txt
 - learn various ways of getting out of programs
 - Unix line ends are \n newline (ASCII LF) characters (not \r or \r\n)
   - "echo a | wc -c" counts 2 characters, not just one!
   - in vim: set fileformat=unix or dos or mac

VIM Tips
--------
 - finding and deleting lines that match a pattern:
     :g/pattern/ d
 - finding and deleting lines that do not match a pattern:
     :v/pattern/ d
 - deleting all leading digits from every line:
     :%s/^\d\+//
 - deleting all leading spaces from every line:
     :%s/^ *//
 - deleting all leading digits followed by spaces from every line:
     :%s/^\d\+ *//

Commands introduced so far:

    1 bash
    2 cal (9 1752)
    1 cat (-s)
    1 cd
    1 cp (-u)
    1 date
    1 dir (DOS alias - use ls instead)
    1 echo (shell built-in and external)
    2 exit (shell built-in)
    2 ifconfig
    1 less
    1 ls ( -l -a -d )
    1 man (-k)
    1 more
    2 mv
    2 passwd
    1 pwd (shell built-in and external)
    2 rm ( -r -f )
    2 sort (-n -r -u)
    2 stty ( erase '^H' erase '^?' )
    1 vi, vim (-r)
    2 wc ( -l -w -c )
    2 wget [-O outfile] {URL}
    1 which
    1 who

============================================================================

Q: T/F like DOS, each Unix program has its own pagination option

Q: how do you reach the help screen in the "less" pagination program?

Q: how do you search for a word when using "less" or "vim"?

Q: In most Unix shells, how do you "get back" or repeat the last command?

Q: How can you ask the shell to auto-complete a file name?

Q: What built-in shell command causes the shell to terminate?

Q: T/F Linux manual pages are displayed using the "less" pagination program

Q: How would you search forward for the word "TCP" when looking at a
   man page?

Q: T/F most Unix/Linux commands ask for confirmation of serious actions

Q: when talking to the terminal driver in Unix/Linux:
   - how do you erase a character?  a word?  a line?
   - how do you redraw the current line of input?
   - how do you interrupt the current process?
   - how do you send EOF from the keyboard?
   - how do you clear/redraw the screen?

Q: what does the ^Z keyboard signal do to a process?

Q: T/F the ^Z keyboard signal terminates a process, similar to ^C

Q: T/F the Unix/Linux keyboard is connected to your terminal screen so that
   characters from the keyboard go directly to the screen and then off to
   Unix/Linux

Q: how do I fix my backspace character if it is echoing ^? characters?

Q: how do I fix my backspace character if it is echoing ^H characters?

Q: for most commands that take file names, what happens if you don't give
   the command any file names?

Q: T/F all commands that read your keyboard issue a prompt for input first

Q: what does this command do:  sort file1 file2 file3

Q: what does this command do:  cat file1 file2 file3

Q: how would I set my backspace character to be ^E (CTRL-E)?

Q: what is the difference between ^C and ^D ?  between ^C and ^Z ?

Q: What shell variable contains the current terminal type?

Q: What is the name of the top (root) of the Unix/Linux directory tree?

Q: What is an absolute path?  a relative path?

Q: T/F an absolute path is dependent on the current directory

Q: T/F a relative path always refers to the same file

Q: T/F most Unix/Linux commands ask for confirmation of serious actions

Q: How many characters are counted here:  echo abc | wc -c

Q: If a Unix text file contains 10 lines, each with one letter on it,
   what is the overall size of the file (in bytes)?

Q: If a Windows/DOS text file contains 10 lines, each with one letter on it,
   what is the overall size of the file (in bytes)?

Q: From the given current directory of /tmp/idallen, which of the following
   command lines is the same as "cp file1 file2"?

    $ pwd
    /tmp/idallen

    $ cp ./file1 ./file2
    $ cp ././././././file1 ././././././file2
    $ cp file1 ../idallen/file2
    $ cp ../idallen/file1 file2
    $ cp ../idallen/file1 ../idallen/file2
    $ cp ../../tmp/idallen/file1 file2
    $ cp file1 ../../tmp/idallen/file2
    $ cp ../../tmp/idallen/file1 ../../tmp/idallen/file2
    $ cp ./././././../../tmp/idallen/file1 ./././././../../tmp/idallen/file2
    $ cp /tmp/idallen/file1 /tmp/idallen/file2
    $ cp /tmp/../tmp/idallen/file1 /tmp/idallen/../idallen/file2
    $ cp file1 /tmp/idallen/../../tmp/idallen/file2
    $ cp /file1 /file2
    $ cp file1/. file2/.
    $ cp idallen/file1 idallen/file2
    $ cp idallen/../file1 idallen/../file2
    $ cp ./idallen/file1 ./idallen/file2
    $ cp tmp/idallen/file1 tmp/idallen/file2
    $ cp tmp/../idallen/file1 tmp/../idallen/file2
    $ cp ./tmp/idallen/file1 ./tmp/idallen/file2
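
One way to check your answers to the question above, sketched under some
assumptions: a scratch directory made by mktemp stands in for /tmp, and
same_from is a hypothetical helper written for this example (it is not a
standard command).  The helper relies on the -ef file test, which bash and
most other shells provide although POSIX does not require it.  It reports
whether two pathnames name the same existing file when both are resolved
from a given current directory:

```shell
# Scratch tree standing in for /tmp/idallen; only this one temporary
# directory is created.
top=$(mktemp -d)
mkdir "$top/idallen"
echo hello > "$top/idallen/file1"

# same_from DIR PATH1 PATH2   (hypothetical helper, not a standard command)
# Prints "same" if PATH1 and PATH2, both resolved from current directory
# DIR, name the same existing file; prints "different" if they do not.
# Prints nothing if DIR cannot be entered.
same_from() {
    (
        cd "$1" || exit 1
        if [ "$2" -ef "$3" ]; then echo same; else echo different; fi
    )
}

same_from "$top/idallen" file1 ./file1             # prints: same
same_from "$top/idallen" file1 ../idallen/file1    # prints: same
same_from "$top/idallen" file1 idallen/file1       # prints: different
```

The third pathname is different because, from inside idallen, it looks for
a second directory named idallen underneath the current one.  The same
check works for the destination pathname once file2 exists.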