==========================
CST8129 Term Assignment #3
==========================
-IAN! idallen@ncf.ca

Due:          18:00 (6pm) Friday December 6, 2002
Marks:        10%
Late penalty: 50% per day

Purpose: practice writing a real shell script to do real work

Hand in format: online submission only - no paper, no diskettes

All scripts described below must be written to conform to the script
writing checklist: script_checklist.txt and to the script style given
in: script_style.txt.

All user input (command line arguments or input via "read") must be
fully validated before being used in expressions. Do not process bad
input!

Echo user input (including command line arguments given) back to the
user. This is usually a good idea both for debugging your script and
giving the user feedback on what data the script is actually processing.

Avoid Linux-only commands and command options. The same script should
work without modification on both Linux and ACADUNIX, where possible.
(In particular, do *not* use Bash 2.x shell syntax!)

Test your scripts. The sample inputs and output shown below are not a
complete test suite. I will try to find test cases using glob patterns
and blanks that will make your scripts abort or misbehave.

Scripts without *useful* block comments will be severely penalized.
(See the file script_style.txt for a description of good comment style.)

------------------
Hand in directory:
------------------

Completed scripts must have permissions "read-write-execute-only" for
you, "read-only" permissions for group, no permissions for other people.

The following directory is ready to receive your completed scripts:

    ~alleni/cst/assignment03/xxxxnnnn/

where xxxxnnnn is your Algonquin userid (e.g. abcd0001).

When you have completed a script, copy it into the above directory:

    cp myscript.sh ~alleni/cst/assignment03/xxxxnnnn/myscript.sh

Replace "myscript.sh" with the actual script name (given below).
Files with the wrong name or wrong Unix permissions will be penalized.

----------------------------------------------------------
Write this executable script named "40_weather_grabber.sh"
----------------------------------------------------------

Syntax: $0 [ city_name | airport_code ]

Purpose: Write a script to fetch the current weather for a Canadian city
from an Environment Canada Web page and display it.

The script you write will expect zero or one command-line arguments that
will be the city name or 3-letter (upper case!) airport code of a
Canadian city (e.g. "Ottawa" or "YOW"). The output will look like this:

    The Current Weather for Ottawa (YOW)
    Temperature: 3C
    Pressure:    101.7 kPa
    Visibility:  16 km
    Humidity:    93%
    Dew Point:   2C
    Wind:        SSW 9 km/h

Or like this (Wind Chill instead of Dew Point):

    The Current Weather for Montréal (YUL)
    Temperature: -7C
    Pressure:    102.1 kPa
    Visibility:  24 km
    Humidity:    68%
    Wind Chill:  -14
    Wind:        WNW 19 km/h

Specifications
--------------

Print an error message and exit with status 2 if there is more than one
argument given on the command line.

Prompt for and read the city if it is missing from the command line.
If the user responds with an empty city (blank line), issue a message
saying that the default city of "Ottawa" is being used, and use Ottawa.

The Environment Canada web weather service expects city airport codes,
not city names. If the city argument given by the user is not already
3 upper-case letters, you will have to turn the city name into the city
code. (If it is already 3 upper-case letters, no conversion is needed;
use the code directly.)
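
The argument handling described in the Specifications above could be
sketched roughly as follows. This is only one possible approach, not a
model answer: the function names is_city_code and choose_city and the
variable name city are invented for this illustration, and the script
checklist and style files still apply to anything you hand in.

```shell
#!/bin/sh
# Sketch only: is_city_code and choose_city are hypothetical names.

# Force the C locale so that [A-Z] means exactly the 26 ASCII capitals.
LC_ALL=C
export LC_ALL

# Succeed (return 0) only if $1 is exactly three upper-case letters.
# The case pattern is matched against the quoted value, so glob
# characters and blanks in the input cannot expand or split.
is_city_code () {
    case "$1" in
        [A-Z][A-Z][A-Z]) return 0 ;;
        *)               return 1 ;;
    esac
}

# Set the variable "city" from the command-line arguments passed in.
# More than one argument: error message on stderr, exit status 2.
# No argument: prompt and read; a blank answer selects Ottawa.
choose_city () {
    if [ "$#" -gt 1 ]; then
        echo "$0: Expecting one City argument, found $# ($*)" 1>&2
        exit 2
    fi
    if [ "$#" -eq 1 ]; then
        city=$1
    else
        printf '%s' 'Enter a Canadian City or 3-letter City Code: '
        read city || city=
    fi
    if [ -z "$city" ]; then
        city=Ottawa
        echo "Using default city of '$city'."
    fi
    echo "Now retrieving weather for '$city'."
}
```

A script would call choose_city "$@" once near the top and then test the
result with is_city_code "$city" to decide whether a name-to-code
conversion is needed.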

Write and call a shell function to do the conversion from city name to
3-character city code (airport code):

City Name to City (airport) Code Conversion Function
----------------------------------------------------

Function input (one argument): name of a Canadian city
Function output: (on standard output) 3-letter airport code for the city
Return code:     0 if the city name was found and converted
                 1 if the city name was not found (no output)
Error messages:  none (only return 1 on failure to convert)

The name of the city, coming from the user, may be upper-case,
lower-case, or mixed-case. Your function may wish to translate all the
upper-case letters in the name to lower-case before trying to decide
which 3-letter airport code goes with it.

The function must recognize and be able to produce city codes for at
least these three Canadian cities: Montreal (YUL), Ottawa (YOW), and
Toronto (YYZ). Add others as you see fit.

The output of the function (the city code) is on standard output. Save
the city code output by the function into a variable. Echo the city
name, and city code returned by the function, back to the user.

Now that we have the correct city code, we can connect to Environment
Canada and fetch the weather web page for that city code.

To fetch the Environment Canada web page for a city, use the Unix command
"wget" (Web Get) to fetch the correct page into a temporary file under
the /tmp/ directory. Use the current process ID somewhere in the file
name and make sure you have read the Notes file "less_code.txt" first.

Using wget (RTFM)
-----------------

The wget command fetches a URL. You can experiment with it:

    $ wget http://idallen.com/index.html
    $ more index.html
    $ wget ftp://ftp.rs.internic.net/domain/named.root
    $ more named.root

The URL for weather for a city code that you should give to wget is this:

    http://weatheroffice.ec.gc.ca/scripts/citygen.pl?client=ECCDN_e&city=XXX

where XXX is the 3-letter city code (e.g. YOW).
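
As an illustration of the conversion function specified above, here is a
minimal sketch. The names city_to_code and weather_url are invented for
this example; only the three required cities are included, and the URL
is the one given above with the city code substituted for XXX.

```shell
#!/bin/sh
# Sketch only: city_to_code and weather_url are hypothetical names.

# Convert a city name ($1, any mix of case) to its airport code.
# Prints the code on standard output and returns 0 if the name is known;
# prints nothing and returns 1 otherwise. Never prints error messages.
city_to_code () {
    # Fold to lower case so one pattern matches every capitalization.
    lower=$(printf '%s\n' "$1" | tr '[A-Z]' '[a-z]')
    case "$lower" in
        montreal) echo YUL ;;
        ottawa)   echo YOW ;;
        toronto)  echo YYZ ;;
        *)        return 1 ;;
    esac
    return 0
}

# Print the Environment Canada URL for the city code given as $1.
weather_url () {
    echo "http://weatheroffice.ec.gc.ca/scripts/citygen.pl?client=ECCDN_e&city=$1"
}
```

A caller might save the result and test the return code in one step:

    code=$(city_to_code "$city") || echo "no code for '$city'" 1>&2

The caller, not the function, is responsible for the error message and
for exiting.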

The default is for wget to put the web page into the current directory
under an appropriate name derived from the URL you give it - this is not
what we want. We want the page put into a unique temporary file under
/tmp/. Fortunately, wget has a Download Option that lets you write the
output document to a file you specify. Use the option.

As wget works, it prints logging and status messages. These messages
are useful to see while we are writing the script; but, when the script
is working, we don't want to see them. When the script is working, you
can simply redirect the output of wget to /dev/null; or, you can use the
Logging Option to wget that turns off wget's logging output. Use the
option to turn off logging output, when your script is finally working.

After executing wget, test the return code to make sure it is zero and
also make sure that the downloaded web page (saved in the temp file) is
not empty. (Print appropriate error messages and exit non-zero if
necessary.)

The downloaded web page contains not only the weather information but
also all the HTML tags that format the page for the web. Looking for
information in this file is easier if you first pre-process the file as
follows:

Pre-processing the downloaded web page
--------------------------------------

1. Delete all CR characters from the temp file.

   Some web pages contain lines that end in CR+LF (carriage return and
   line feed), not just LF. The extra CRs at the end of every line can
   cause problems when we try to pick off the weather information.
   (Unix expects files to have just LF at the end.) Delete all the CR
   characters. (Hint: The file "echo_commands.txt" contains a command
   that translates away undesired characters.)

2. Remove all HTML tags <...stuff...> from the temp file.

   None of the weather information we want is actually contained inside
   any of the HTML tags. Remove all the tags from the file.
   (Hint: An HTML tag starts with '<' and contains zero or more
   characters that are not '<' or '>', ending in '>'.)

3. Change &deg; and &nbsp; in the temp file.

   Since the file is HTML, certain characters are expressed as HTML
   character-escape sequences, e.g. "&deg;" (degrees) and "&nbsp;"
   (non-breaking space). Remove all "&deg;" strings from the file.
   Change all strings "&nbsp;" into single spaces.

The resulting file is easier to work with, since it has less HTML
formatting in it.

If the city code we gave to Environment Canada is not recognized, the
downloaded web page will contain the word "error" in it somewhere. Look
for this word (match upper or lower case) anywhere in the temp file - if
you find it, print an appropriate error message and exit non-zero.

Now for the best part - extracting the weather information from the
downloaded information in the temp file:

Extracting information from the downloaded web page
---------------------------------------------------

Examine what is left of the web page in the temp file. (You will have
removed all the HTML tags before this.) Locate the lines containing the
following current weather information:

    1. The name of the city for which the weather report is issued
    2. The current temperature
    3. The current pressure
    4. The current visibility
    5. The current humidity
    6. The current dew point or current wind chill (* see below *)
    7. The current wind

You must write Unix commands in your script that will extract these
pieces of information from the temp file and save the results into
variables, for later output. For each of the pieces of information:

    a. locate the line containing the information
       (e.g. find the line containing "Visibility:")
    b. extract the data from the end of the line
       (e.g. extract the "16 km" visibility data that appears after the
       word "Visibility:" on that line)
    c. echo just the extracted data to standard output

What you write here is up to you.
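
To make the pre-processing and the three extraction steps (a, b, c)
above concrete, here is one possible sketch. The function names
clean_page and get_field are invented for this illustration. It assumes
each label and its data sit on the same line after pre-processing, and
that no HTML tag is split across two lines (a line-at-a-time "sed" will
not remove a tag that is). Check both assumptions against the real page.

```shell
#!/bin/sh
# Sketch only: clean_page and get_field are hypothetical names.

# Pre-process a downloaded page: HTML on standard input, plain text on
# standard output.
#   tr  : delete every CR (octal 015) character
#   sed : remove <...> tags, remove "&deg;", turn "&nbsp;" into a space
clean_page () {
    tr -d '\015' |
    sed -e 's/<[^<>]*>//g' \
        -e 's/&deg;//g' \
        -e 's/&nbsp;/ /g'
}

# get_field LABEL FILE
# Find the first line of FILE that matches LABEL (a basic regular
# expression containing no "/" characters, e.g. 'Visibility:') and print
# the text that follows the label, with surrounding spaces removed.
# Returns 1, with a message on standard error, if no line matches.
get_field () {
    line=$(grep "$1" "$2" | sed 1q)
    if [ -z "$line" ]; then
        echo "$0: Cannot find '$1' in the weather page" 1>&2
        return 1
    fi
    printf '%s\n' "$line" | sed -e "s/^.*$1 *//" -e 's/ *$//'
}
```

Each extraction then becomes a single line, for example:

    visibility=$(get_field 'Visibility:' "$tmpfile")

where tmpfile is assumed to hold the name of the pre-processed temp
file. A caller that expects a field to be missing sometimes can discard
the message by adding 2>/dev/null inside the parentheses and testing the
return code itself.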

You can use anything you have learned in the course to mine the data
from this downloaded web page.

At minimum, each of the data extractions should *work* correctly,
locating the correct line in the web page and extracting the information
on that line. Best marks will be given to clear, concise code with
minimal repetition.

If you can't find the field or information for some field in the file,
print a useful error message explaining what you were looking for and
move on to the next piece of data - do not exit the script.

(Yes, to avoid seven-times or worse code duplication, you can write one
shell function that, given an appropriate regular expression, can find a
field and output the rest of the data on the line containing that field
name, and print a nice error message if it isn't found.)

* Dew Point and Wind Chill *

It appears that the data for dew point and wind chill never appear at
the same time on the same page. Some pages have one; some have the
other. Don't show any error message to the user unless you can't find
either of them. (It is not an error for just one of them to be missing.)
You have to look for both and print whichever one you find (with the
correct label, of course). See the examples, below.

Each of the data extractions should be saved into its own shell variable.

After you have done all the extractions, remove the temp file.

Output
------

Output the data in the shell variables on lines in this aligned format:

    The Current Weather for Ottawa (YOW)
    Temperature: 3C
    Pressure:    101.7 kPa
    Visibility:  16 km
    Humidity:    93%
    Dew Point:   2C
    Wind:        SSW 9 km/h

The city name "Ottawa" comes from the web page, not from the user's
input. (The user may have given only a city code; the output must have
the city name, not the code.)

As always, exact spelling counts.

Only print the dew point if you found it in the weather web page.
Only print the wind chill if you found it in the weather web page.

Testing
-------

The following sample test runs are not exhaustive.

***
$ ./40_weather_grabber.sh YOW
Now retrieving weather for 'YOW'.
Using City Code 'YOW'.
The Current Weather for Ottawa (YOW)
Temperature: 3C
Pressure:    101.7 kPa
Visibility:  16 km
Humidity:    93%
Dew Point:   2C
Wind:        SSW 9 km/h

***
$ ./40_weather_grabber.sh Montreal
Now retrieving weather for 'Montreal'.
City Name 'Montreal' has City Code 'YUL'.
The Current Weather for Montréal (YUL)
Temperature: -7C
Pressure:    102.1 kPa
Visibility:  24 km
Humidity:    68%
Wind Chill:  -14
Wind:        WNW 19 km/h

***
$ ./40_weather_grabber.sh tOrOnTo
Now retrieving weather for 'tOrOnTo'.
City Name 'tOrOnTo' has City Code 'YYZ'.
The Current Weather for Toronto (YYZ)
Temperature: 2C
Pressure:    101.8 kPa
Visibility:  15 km
Humidity:    95%
Dew Point:   2C
Wind:        WSW 7 km/h

***
$ ./40_weather_grabber.sh
Enter a Canadian City or 3-letter City Code: ottawa
Now retrieving weather for 'ottawa'.
City Name 'ottawa' has City Code 'YOW'.
The Current Weather for Ottawa (YOW)
Temperature: 3C
Pressure:    101.7 kPa
Visibility:  16 km
Humidity:    93%
Dew Point:   2C
Wind:        SSW 9 km/h

***
$ ./40_weather_grabber.sh
Enter a Canadian City or 3-letter City Code:     <== blank line entered here
Using default city of 'Ottawa'.
Now retrieving weather for 'Ottawa'.
City Name 'Ottawa' has City Code 'YOW'.
The Current Weather for Ottawa (YOW)
Temperature: 3C
Pressure:    101.7 kPa
Visibility:  16 km
Humidity:    93%
Dew Point:   2C
Wind:        SSW 9 km/h

***
$ ./40_weather_grabber.sh nosuchplace
Now retrieving weather for 'nosuchplace'.
./40_weather_grabber.sh: Cannot find the City Code for 'nosuchplace'.

***
$ ./40_weather_grabber.sh a b c d
./40_weather_grabber.sh: Expecting one City argument, found 4 (a b c d)

***
$ ./40_weather_grabber.sh ABC
Now retrieving weather for 'ABC'.
Using City Code 'ABC'.
City weather page error - Erreur de chargement de la page météo par ville
An error loading this page has occurred.
This error may have occurred due to a recent restructuring of WeatherOffice.com.
./40_weather_grabber.sh: Error fetching weather for City Code 'ABC'.

(The unlabelled error messages above come from inside the web page -
they are displayed because we let the "grep" command that looks for
errors print the lines it finds on standard output.)

--------------------------------------------------------------------------

Bonus (+3%)
-----

If the Wind Chill data doesn't end in a "C", add one.

The web page http://www.hotels-near-airports.com/canadian_airports.htm
contains a list of Canadian cities and airport city codes. Rewrite the
city code conversion shell function from 40_weather_grabber.sh to fetch
the above web page, process it, and look up the city name and return the
corresponding airport code, as found on the web page. (This means the
function will no longer contain any built-in cities or codes itself - it
will always go to the above web page to look up the city and return the
corresponding airport code, if found on the web page.)

If the user enters just "Montreal" as a city, return the code for
"Montreal - Dorval". If the user enters just "Toronto" as a city,
return the code for "Toronto - Pearson".
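
For the first bonus item (adding a missing "C" to the Wind Chill data),
a small helper along these lines would do. The name add_celsius is
invented for this sketch; it only looks at the last character of its
argument and does not check that the rest is a number.

```shell
#!/bin/sh
# Sketch only: add_celsius is a hypothetical name.

# Print $1 with a trailing "C" added unless it already ends in "C".
# An empty argument prints nothing and returns 1, so a missing wind
# chill is never turned into a bare "C".
add_celsius () {
    case "$1" in
        '') return 1 ;;
        *C) printf '%s\n' "$1" ;;
        *)  printf '%sC\n' "$1" ;;
    esac
}
```

With the sample data shown earlier, add_celsius -14 prints -14C, and a
value that already ends in "C" is printed unchanged.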