-----------------------
Exercise #5 for CST8129 due October 12, 2005
-----------------------
-Ian! D. Allen  -  idallen@idallen.ca

Remember - knowing how to find out an answer is more important than
memorizing the answer.  Learn to fish!  RTFM!  (Read The Fine Manual)

Global weight: 3% of your total mark this term

Due date: Before the end of your Lab period on Wednesday, October 12.

The online deliverables for this exercise are to be submitted online via
the T127 Linux Lab using the submit method described in the exercise
description, below.  No paper; no email; no FTP.

Late-submission date: I will accept without penalty online exercises
that are submitted late but before 16h00 (4pm) on Thursday, October 13.
After that late-submission date, the exercise is worth zero marks.

Exercises submitted by the *due date* will be marked online and your
marks will be sent to you by email after the late-submission date.

This exercise is due before the end of your Lab period on October 12.

Exercise Synopsis:  Marks: 3%

    Write a shell script to data-mine from a web page.
    Create a PDL algorithm to sort three integers.

Where to work:

    Do your Unix command line work on any WT127 workstation.  (You may
    login to the workstation remotely.)  The files you work on will
    remain in your account after you log off.

    Do not erase your files after submission; always keep a spare copy
    of your exercises.

    WARNING: Do not attempt this exercise on a Windows machine - the
    text file format is different.  You must connect to and work on
    Unix/Linux.  Note that you may connect to a lab workstation *from*
    a Windows machine (using PuTTY); however, you may not use the
    Windows machine itself to do your work.  Use the vim editor on the
    Linux machine.

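The file-format difference mentioned in the warning above is the line
ending: Windows editors usually end each line with a carriage return plus
a line feed (CR LF), while Unix/Linux uses a line feed alone.  As a
minimal sketch (the helper name count_cr and the scratch file names are
made up for illustration and are not part of the exercise), one way to
check a file for stray carriage returns is:

```shell
#!/bin/sh
# Sketch: count carriage-return (CR) characters in a text file.
# count_cr is a hypothetical helper name, not part of the exercise.
count_cr() {
    # Delete every byte except CR, then count the bytes that are left.
    tr -cd '\r' < "$1" | wc -c | tr -d ' '
}

# Demonstration on two scratch files (names are illustrative only).
tmpdir=$(mktemp -d)
printf 'umask 022\n'   > "$tmpdir/unix.txt"      # Unix line ending (LF)
printf 'umask 022\r\n' > "$tmpdir/windows.txt"   # Windows line ending (CR LF)

unix_count=$(count_cr "$tmpdir/unix.txt")
windows_count=$(count_cr "$tmpdir/windows.txt")
echo "unix.txt: $unix_count CR characters"
echo "windows.txt: $windows_count CR characters"

rm -r "$tmpdir"
```

A file written with vim on the Linux machine would normally give a count
of 0; a nonzero count suggests the file picked up Windows line endings
somewhere along the way.
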
Location of the course notes on the Lab workstations:

    You can find a copy of all the course Notes files on any Lab
    workstation under directory:

        ~alleni/public_html/teaching/cst8129/05f/notes/

    You can copy files from this directory to your own account for
    modification or study, if you like.  (To avoid plagiarism charges,
    you must credit any material that you copy and submit unchanged as
    your own work.)

Location of the textbook CDROM files on the Lab workstations:

    The CDROM files for the Quigley textbook are available in the WT127
    Lab under the directory:  /home/cst8129/

Exercise Preparation:

A. Know where to find an online copy of all the course Notes on the Lab
   workstations.  (See above.)  You can get a copy of this exercise from
   the course notes.

B. Complete the online Course Notes readings.

Any questions?  See me in a lab or post questions to the Discussion
news group (on the top left of the Course Home Page).

---------------------------------------------
Part I Exercise Details (in the T127 Linux Lab)
---------------------------------------------

1. Create a file named "exercise05script.sh" with these four lines in it:

    #!/bin/sh -u
    PATH=/bin:/usr/bin ; export PATH
    LC_COLLATE=C ; export LC_COLLATE
    umask 022

   The lines must be at the left margin, with no leading or trailing
   blanks or blank lines.  The word count and checksum of the resulting
   file will be:

    $ wc exercise05script.sh
      4  12  89 exercise05script.sh
    $ sum exercise05script.sh
    35197     1

2. Make the file executable:

    chmod +x exercise05script.sh

   Make sure the file executes without errors:

    ./exercise05script.sh

   (There will be no output from the file yet.)

3. Add your Assignment Label to the file as comment lines, below the
   /bin/sh line and above the PATH line.  Make sure the first line of
   the script remains the shell interpreter line (as given above).

4. Copy the block of questions below into the end of the script file and
   add octothorpe comment characters ("#") in front of all the lines.

   Your file will now be in these sections, in this exact order:

    - shell interpreter line (comment)
    - Assignment Label (comments)
    - set PATH, LC_COLLATE, and umask
    - Questions 5-15 (comments)

   Execute the script file and make sure there are no errors and no
   output.  (You only added comment lines - the file should produce no
   output.)

Under each numbered question below, add commands, one by one, that will
do the steps below, in order.  You must make sure each command works at
the command line before you copy it into the script file and then test
it by executing the file.  (Hint: Work with two or three shell windows
open.)  Do not create any temporary files.  Use pipes to connect
commands.

5. Remove recursively any directory named "web5".  (Hint: Find an option
   to have the command ignore nonexistent files so that you don't get an
   error message if the directory isn't there.)

6. Create a new directory named "web5" in the current directory.

7. Change directories to make web5 the current directory.

9. Show on the screen the full pathname of the current directory.

10. Fetch the unformatted raw HTML page for the Ottawa weather into the
    current directory under the name "weather.html".
    (URL: http://weatheroffice.ec.gc.ca/city/pages/on-118_metric_e.html )

11. Echo the phrase "Count of <" followed by a count (a single number)
    of the number of lines that contain a left angle bracket < in the
    weather.html file.

12. Echo the phrase "Compressability of HTML file using GZIP" followed
    by two numbers:
    a) the size in bytes (a single number) of the weather.html file
    b) the size in bytes (a single number) of the file after having
       processed it using the GZIP program

13. Fetch the *formatted* HTML page for the Ottawa weather into the
    current directory under the name "weather.txt".  (Note the different
    extension.)  The formatted page will contain no HTML.

14. Echo the phrase "Compressability of TEXT file using GZIP" followed
    by two numbers:
    a) the size in bytes (a single number) of the weather.txt file
    b) the size in bytes (a single number) of the file after having
       processed it using the GZIP program

15. From the formatted weather file, use commands (one or more
    pipelines) to data-mine the file to produce the output in the
    following format:

      Observed at: Ottawa Macdonald-Cartier Int'l Airport 12 October 2005
      Temperature
      11 °C

    The temperature should be the current temperature, not necessarily
    the value given above.  Only the above three lines should appear in
    the output - no others.

    BONUS MARK: Make the temperature appear on the same line as the word
    "Temperature", e.g.:

      Temperature 11 °C

Execute your file and make sure there are no errors:

    $ ./exercise05script.sh

---------------------------------------------
Part II Exercise Details (in the T127 Linux Lab)
---------------------------------------------

1. Create a file named "exercise05text.txt" with your assignment label
   at the top of the file as comments.

2. Below your label, write a PDL algorithm (plain text) that will take
   three integers, I1, I2, and I3, compare them (pairwise) and output
   them in sorted order.  E.g. if I1=33, I2=11, and I3=99 the algorithm
   should output: 11 33 99

   The algorithm must work for any integers, not just the above example.
   (Hint: Use a small set of nested IF statements.)

Submission
----------

Submit the finished and labelled files for marking using the following
Linux command line:

    $ ~alleni/bin/copy exercise05script.sh exercise05text.txt

This program will copy the selected files to me for marking.  You can
copy the files more than once.  Only the most recent copies will be
marked.  Always submit both files for marking at the same time.

This exercise is due at the end of your lab period today.

P.S. Did you spell all the label fields and file names correctly?
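
One way to double-check the starter file from Part I, steps 1 and 2, is
sketched below.  It assumes a scratch directory from mktemp so that
nothing in your account is touched, and it builds the file with a
here-document purely for illustration (the exercise has you use vim);
the variable names are likewise just for this sketch.

```shell
#!/bin/sh
# Sketch: build the four-line starter file in a scratch directory and
# confirm the line, word, and byte counts that wc should report.
tmpdir=$(mktemp -d)
script="$tmpdir/exercise05script.sh"

# The quoted 'EOF' keeps the shell from expanding anything in the text.
cat > "$script" <<'EOF'
#!/bin/sh -u
PATH=/bin:/usr/bin ; export PATH
LC_COLLATE=C ; export LC_COLLATE
umask 022
EOF

# Read from standard input so wc prints only the number, no file name.
lines=$(wc -l < "$script" | tr -d ' ')
words=$(wc -w < "$script" | tr -d ' ')
bytes=$(wc -c < "$script" | tr -d ' ')
echo "lines=$lines words=$words bytes=$bytes"   # lines=4 words=12 bytes=89

# Step 2: make the file executable and run it; it should print nothing.
chmod +x "$script"
output=$("$script")
echo "output='$output'"

rm -r "$tmpdir"
```

The three counts should match the "4 12 89" line shown for wc in step 1.
The sum checksum from step 1 can be compared by hand in the same way;
this sketch does not check it.
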