-----------------------
Lab #05 for NET2003 due February 12, 2007
-----------------------
-Ian! D. Allen - idallen@idallen.ca

Remember - knowing how to find out an answer is more important than
memorizing the answer. Learn to fish! RTFM! (Read The Fine Manual)

Global weight: 3% of your total mark this term.
Due date: before 10h00 Monday February 12

The deliverables for this lab exercise are to be submitted online on
the Course Linux Server using the "netsubmit" method described in the
lab exercise description, below. No paper; no email; no FTP.

Late-submission date: I will accept without penalty lab exercises that
are submitted late but before 12h00 (noon) on Wednesday, February 14.
After that late-submission date, the lab exercise is worth zero marks.

Lab exercises submitted by the *due date* will be marked online and
your marks will be sent to you by email after the late-submission date.

Lab Synopsis
------------

Use Unix/Linux commands to extract data from an ADSL modem log file to
produce some graphs of phone line quality. Extract information about
line reliability and down-time.

NOTE: For full marks, keep your lines shorter than 80 columns in this
course. Short lines allow for easy printing and side-by-side comparison
of files on a screen.

Where to work: Easy access to Course Notes: See Lab #1.
Folding long lines: See last week's lab.

----------------------------------------
Lab Details (on the Course Linux Server)
----------------------------------------

Make sure you are working on the Course Linux Server. You can do some
of this work on Knoppix; however, none of the Course Notes are on
Knoppix and your files will disappear when you log off unless you copy
them to the Course Linux Server.

*) For full marks, keep your lines shorter than 80 columns in this
   course. Short lines allow for easy printing and side-by-side
   comparison of files on a screen.
*) You may find it useful to create separate directories in your Linux
   server account to store the files for each lab exercise, e.g.

      $ mkdir lab5
      $ cd lab5
      $ vim lab05script.sh

Part I - lab05script.sh
------

You will build an executable shell script containing a commented list
of Unix commands to be executed.

Make sure you are working on the Course Linux Server. You can do some
of this work on Knoppix; however, none of the Course Notes are on
Knoppix and your files will disappear when you log off unless you copy
them to the Course Linux Server.

In shell scripts, lines starting with octothorpe characters ('#') are
ignored as comment lines.

Steps 1-10 build the same script template as you used for the last
assignment. You may copy that template rather than recreating it.

1.  Fetch a copy of the argv.sh.txt file from the course notes
    directory (copy the file) and rename your copy to be:
    lab05script.sh
    Make the file executable using the chmod command (note the plus):

      $ chmod +x lab05script.sh

2.  Verify that the copied script executes correctly and without error:

      $ ./lab05script.sh a b c
      ./lab05script.sh: This script has 3 command line arguments.
      Argument 0 is [./lab05script.sh]
      Argument 1 is [a]
      Argument 2 is [b]
      Argument 3 is [c]

3.  At the top of the lab05script.sh file, fix the existing Assignment
    Submission label comment to be your own label. Make sure the lines
    of the label all start with the shell comment character '#'.

4.  Delete the line (line 2) starting with
    "# Display on standard error ...".

5.  At the top, under "Syntax", replace "[args...]" with
    "(no arguments)" so that the line reads:

      # $0 (no arguments)

    This script takes no command line arguments.

6.  At the top, under "Purpose", replace the paragraph with a short
    description of what this script does (come back to this - see
    below). Don't forget this step!

7.  Change the line starting with PATH= to be this exact line below by
    adding "/sbin" and ":" to the start of the PATH string:

      PATH=/sbin:/bin:/usr/bin ; export PATH

    Do not add any blanks or quotes to this line. Make sure you see the
    two characters ":" and ";" as different. (Adding /sbin to the PATH
    allows the shell to find executable commands in the /sbin directory
    as well as in /bin and /usr/bin .)

    At this point, you might re-execute the script to make sure you
    haven't introduced any typing errors. The output should be the same
    as above. There should be no error messages on the screen.

8.  Delete everything below and after the line "umask 022" in the file.

9.  Add a blank line after "umask 022".

10. Delete all the "I!" comment lines. Never submit these lines in your
    own scripts - they are "instructor" comments from me to you.

11. You will now add, to the bottom of this executable script, at least
    three lines for every section in this question. The first line will
    echo the question number onto the screen in the exact format shown.
    The second (or more) line(s) will be shell comment(s) that will not
    print on the screen. The last line will be a shell command line or
    pipeline (one or more commands) that will do the action required by
    the question. The output from the command(s) will appear on the
    screen, if the command generates output. (Many Unix/Linux commands
    work silently and have no output on the screen.)

    For example, if asked to "display a count of the non-hidden names
    in the /bin directory" you would echo the question number, enter at
    least one comment line (less than 80 characters), and enter the
    correct Unix/Linux command line (one or more commands) into the
    bottom of the script file like this:

      echo "--- Question 11y"
      # Count the non-hidden names in the /bin directory
      ls /bin | wc -l

    Use the exact format and syntax shown for the echo line that shows
    the Question number. (Use the cut-and-paste features of vim to copy
    the line for the next question; don't re-type it every time!)
    You may have more than one comment line; each line must be less
    than 80 columns.

    You may want to open up another login window and work with the
    shell interactively to test the commands you come up with. Copy
    each working command into the shell script as you debug it in the
    interactive window, and run the script after each addition.

    As you write the script, leave one blank line between each question
    you answer, to make the script source more readable. The result
    will be groups of three or more lines, separated from one another
    by one blank line.

    Test your script periodically (after each edit) by saving it and
    running it from the command line (perhaps in a different window):

      $ ./lab05script.sh

    Make sure there are no error messages. Run the script after each
    edit you add; that way you debug only the few lines you added.

    NOTE: Do not use any "change directory" commands in your script
    unless the instructions ask for it explicitly. Do not add commands
    or create output that is not required. Do only the commands listed,
    in the order listed.

    If you can't answer a question: echo the question number, add the
    comment line, but leave out the actual command itself.

    Here are the individual questions to add to your executable script.
    Find executable commands that look up and produce the given output.
    All the command names needed are given in this lab (below) or are
    already known to you via the Notes file: unix_command_list.txt

11 a) Count the number of lines/words/characters in the output of the
      "date" command; but, throw away standard output. No output should
      appear on your screen. Do not throw away standard error output.
      If this command fails with an error, check the spelling of your
      PATH line. (Hint: throwing away output is covered in Notes file
      "redirection.txt".)

   b) Copy the Notes file "speedstats.gnuplot.txt" into "gnuplot.txt".
      Use an option in the copy that only copies the file if the source
      file is newer than the destination file or when the destination
      file is missing.
      File gnuplot.txt can be read by the "gnuplot" command. The plot
      file will tell gnuplot to read input data from file
      "speedstats.txt" and create a plot image in output file
      "speedtouch.png". We will use this file below.

   c) Uncompress the Notes file "speedstats.txt.gz" into "stats.txt".
      The stats.txt file contains one (very long) log line per minute.
      Each line in the log file was created by running the "adsl info"
      command on my SpeedTouch modem, replacing all newlines in the
      output with TAB characters, and putting the current date at the
      front of the line. There is one log line per minute.

   d) With one command count the lines/words/chars in gnuplot.txt and
      stats.txt
      The gnuplot.txt file should contain 34 lines.
      The stats.txt file should contain 25,000 lines (25,000 minutes).

   e) Extract the last line of stats.txt and change its TAB characters
      ('\t') back into newline characters ('\n') and put the resulting
      output into a file named "format.txt". Display the number of
      lines in format.txt (43 lines). This file shows you the format of
      the raw log data from my home ADSL modem. The second field on
      every long log line is the date; the third field is the time.

   f) Count the number of lines in stats.txt that contain the date
      '2007-02-07' (1440 lines).

   g) Count the number of lines in stats.txt that contain the string
      'DOWN' (6,782 lines). These are minutes where the modem failed to
      synch up with the central office - the line was down.

   h) Display the first 2 lines in stats.txt that contain the string
      'DOWN'.

   i) Extract the last 2 lines in stats.txt that contain the string
      'DOWN' and translate all the TAB characters back to newline
      characters.

   j) Display the count of lines in stats.txt that contain 'DOWN' and
      that also contain '2007-02-07' (305 lines).

   k) Extract all lines from stats.txt containing the date '2007-02-07'
      into file "speedstats.txt" and then run the gnuplot command with
      the plot file "gnuplot.txt" as its only argument.
      The gnuplot will take a few seconds to plot the data from
      "speedstats.txt" into output file "speedtouch.png".
      Rename speedtouch.png to be 2007-02-07.png
      (If you are logged in from a machine running X11, you can check
      that the plot worked by giving the image file name to the "xli"
      or "display" commands.)

   l) Following the preceding method, create output plot image file
      day.png containing only the last 24 hours of log entries.

   m) Create output plot image file week.png containing only the last
      7 days of log entries. (The gnuplot of this many lines may take
      20-90 seconds to complete - be patient.)

   n) Create output plot image file down.png containing only the
      minutes where the line was down.

   o) Create output plot image file wednesday_down.png containing only
      the minutes where the line was down on the date 2007-02-07.

   p) Display the checksums of all files ending in extension ".png" in
      the current directory. If you've done everything correctly, you
      will have these checksums:

        33246    13 2007-02-07.png
        22861    13 day.png
        09860     8 down.png
        62818     8 wednesday_down.png
        30272    18 week.png

   q) The command "uniq -c" will count the number of consecutive
      occurrences of the same lines on standard input and prefix each
      line with the count. In the format.txt file you created earlier,
      you can see that the "Modemstate" line gives the current status
      of the ADSL line - it will say "up" or "DOWN". From the stats.txt
      file, extract and display the top five "Modemstate" lines with
      the largest counts of consecutive occurrences. When you get it
      right, your pipeline will output these five lines:

        632 Modemstate : up
        594 Modemstate : up
        582 Modemstate : up
        576 Modemstate : up
        561 Modemstate : up

      This shows that the longest continuous period where the line
      stayed up was 632 minutes (10.5 hours).
      (Hint: You must ensure that each Modemstate entry from the
      stats.txt log file is on its own line, just as it appears in
      format.txt, before you can extract it or apply other
      line-oriented commands such as uniq to it.
      You will also need options to sort lines numerically and in
      reverse order.)

   r) Extract the top five "Modemstate" lines showing the largest
      counts of consecutive line down-time. This is complicated by the
      fact that there is different text after the word DOWN, depending
      on what the modem was doing while the line was down. We don't
      care about that text; we only care that the line was DOWN.
      However, "uniq -c" will see the different text at the end of the
      line and count the lines as different, even though we want all
      the consecutive DOWN lines to be counted as one unit. RTFM for
      uniq and find a way to tell uniq to compare no more than a fixed
      number of characters at the start of each line. Set the number so
      that uniq only compares up to the opening parenthesis on each
      DOWN line. When you get it right, your pipeline will output these
      five lines:

        11 Modemstate : DOWN ( Line in Initializing mode )
        11 Modemstate : DOWN ( Line in Initializing mode )
        11 Modemstate : DOWN ( Line in Configures mode )
        10 Modemstate : DOWN ( Line in Initializing mode )
        10 Modemstate : DOWN ( Line in Configures mode )

      This shows that the longest period of continuous line down-time
      was 11 minutes.
      (Hint: You have to apply uniq to count sequences of identical
      consecutive Modemstate log lines *before* you extract just the
      DOWN lines, since doing it the other way around will result in
      all the lines being identical consecutive DOWN lines.)

   s) Output the count of how many times the line was down for exactly
      one minute. (1,748 times)
      (Hint: The line was down for one minute if "uniq -c" says that
      the count of consecutive occurrences of a line containing DOWN
      is 1. To find, and count, lines that contain the digit 1, the
      digit 1 must be surrounded on both sides by a blank (' 1 '), so
      that you don't find numbers such as 12 or 21.)

   t) Output the count of how many times the line was up for exactly
      one minute.
      (2,726 times)
      (Hint: You can select lines that contain the "up" pattern, or you
      can select lines that *do not* contain the "DOWN" pattern. Both
      give the same answer.)

   u) Count the number of lines in stats.txt that *do not* contain the
      words "up" or "DOWN". (73 lines)
      (Hint: Look for the keyword "invert" in the grep man page.)

   v) Display the last line in stats.txt that *does not* contain the
      words "up" or "DOWN". (The line is dated 2007-02-07 19:49:01 and
      mentions "Unable to connect".)

   w) Remove the stats.txt file.

   x) Last command: Cause the script to exit.

    All the command names needed are given in this lab (above) or are
    already known to you via the Notes file: unix_command_list.txt

12. How to work: Execute your script as you add each of the above
    command lines. Make sure the script generates the correct output
    and no error messages when you execute it:

      $ ./lab05script.sh

13. Verify the script format. Above each of the lines in your working
    script, make sure you have added a line that echoes the question
    number in the exact format given. Under that echo line, make sure
    you have comment lines (one or more, each less than 80 characters)
    explaining in your own words what the command line that follows the
    comment does. Each answer will look similar to this:

      echo "--- Question 11z"
      # Exit the script with a zero (success) status
      exit 0

    In your script source, leave one blank line between each question
    you answer, to make the script readable. (You may also echo blank
    lines to the screen to separate the output sections, if you wish.)

14. Go back and finish the Purpose section comment at the top of the
    script. Scripts without your added comments will not be marked.

NOTE: For full marks, keep your lines shorter than 80 columns in this
course!

Submission
----------

15. Submit the above script file (see below).

Submission Standards: See Lab #1 for details.

A.  Make sure all files contain an Exterior Assignment Submission
    label. For full marks, lines must be shorter than 80 columns.

B.  Submit your files for marking as Lab 05 using the following
    *single* netsubmit command line exactly as given here:

      $ netsubmit 05 lab05script.sh

    Always submit *all* files at the same time for every submission.
    Files submitted under the wrong names are worth zero marks.

P.S. Did you spell all the assignment label fields and file names
correctly?
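
Appendix (illustration only, not part of the deliverable): the sketch
below may help you see how two of the tools named in question 11
behave before you apply them to the real log. It is a minimal example
under stated assumptions: the data is invented toy input (not taken
from stats.txt), and the function names toy_record, split_fields,
toy_states and longest_runs are hypothetical names chosen for this
sketch. It is not a solution to any question.

```shell
#!/bin/sh
# Toy data only: these values are invented for illustration and are
# not taken from stats.txt.

# One made-up log record whose three fields are joined by TABs.
toy_record() {
    printf 'alpha\tbeta\tgamma\n'
}

# tr turns each TAB into a newline, so every field lands on its own
# line and line-oriented commands can then be applied to it.
split_fields() {
    toy_record | tr '\t' '\n'
}

# Six made-up status lines: a run of 3, a run of 1, then a run of 2.
toy_states() {
    printf '%s\n' up up up DOWN up up
}

# uniq -c prefixes each run of identical consecutive lines with its
# length; sort -nr then orders the runs by that count, largest first.
longest_runs() {
    toy_states | uniq -c | sort -nr
}

split_fields
longest_runs
```

Note that the two runs of "up" are reported separately (3 and 2),
because uniq only merges lines that are next to each other.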