==========================
CST8129 Term Assignment #2
==========================
-IAN! idallen@ncf.ca

Due: 12:00 (12 noon) Tuesday November 19, 2002
Marks: 8%
Late penalty: 50% per day

Purpose:
 - practice writing real shell scripts using Regular Expressions
   (material in Chapters 2, 3, 8, 9)

Hand in format: online submission only - no paper, no diskettes

All scripts described below must be written to conform to the script
writing checklist: script_checklist.txt and to the script style given
in: script_style.txt.

All user input (command line arguments or input via "read") must be
fully validated before being used in expressions. Do not process bad
input!

Echo user input (including command line arguments given) back to the
user. This is usually a good idea both for debugging your script and
giving the user feedback on what data the script is actually processing.

Avoid Linux-only commands and command options. The same script should
work without modification on both Linux and ACADUNIX, where possible.
(In particular, do *not* use Bash 2.x shell syntax!)

Test your scripts. The sample inputs and output shown below are not a
complete test suite. I will try to find test cases using glob patterns
and blanks that will make your scripts abort or misbehave.

Scripts without *useful* block comments will be severely penalized.
(See the file script_style.txt for a description of good comment style.)

------------------
Hand in directory:
------------------

Completed scripts must have permissions "read-write-execute-only" for
you, "read-only" permissions for group, no permissions for other people.

The following directory is ready to receive your completed scripts:

    ~alleni/cst/assignment02/xxxxnnnn/

where xxxxnnnn is your Algonquin userid (e.g. abcd0001).

When you have completed a script, copy it into the above directory:

    cp myscript.sh ~alleni/cst/assignment02/xxxxnnnn/myscript.sh

Replace "myscript.sh" with the actual script name (given below).
Files with the wrong name or wrong Unix permissions will be penalized.

--------------------
Write these scripts:
--------------------

--) Write this executable script named "30_script_validator.sh" (7 marks)

    Syntax: $0 [ scriptname ]

    Purpose: Write a script to validate some aspects of a properly-written
    shell script.

    The script you write will expect exactly one command-line pathname
    argument that will be a shell script file to process. Prompt for and
    read the argument if it is missing. Print an error message and exit
    with status 2 if there is more than one argument given on the command
    line.

    Check (validate) the pathname argument before trying to read and
    process it; if the pathname is not a non-empty, readable, executable,
    plain file, issue an error message and exit the script with status 1.
    Do not process a bad pathname.

    Perform a series of tests on this file, as specified below. Write
    shell functions to perform each test. (An example is given below.)
    Each shell function should take an argument that is the file pathname
    of the script that is to be tested. If the test fails, the function
    will print an error message and return a non-zero status. (Do not
    exit the script!) Each function must return status 0 if its
    particular test succeeds and non-zero otherwise.

    For example, you will write the following one-argument testing
    function based on this description:

    1) This testing function looks for the "Student Name" comment that
       should be in the top 20 lines of the file whose name is given as
       the first argument to the function. If the comment line is found,
       the function validates it further.
    To select the "Student Name" line, we select the first 20 lines of
    the script file (first argument) and run the lines through the
    following egrep extended regexp:

      - start at the beginning of the line
      - allow any number of blanks around the comment char "#"
      - allow one or more blanks between the two words Student Name
      - accept upper or lower case leading letters on the words
      - let the single trailing colon on Name: be optional

    If egrep finds the line based on the above regular expression, the
    function will do the following further tests on the line found:

      - make sure the Student Name is followed by at least one blank,
        followed by at least two letters (assumed to be part of a name).

    Return status 0 if all the tests pass, non-zero otherwise.

    Here is the code that you would write to implement the above testing
    function (this code comes directly from the above description):

        TestStudentName () {
            # Select the first 20 lines of the file and use egrep on them.
            # Put the egrep regexp into a variable so we can use it twice
            # without writing it twice - do not duplicate code.
            # Save the output of the egrep in variable $line for later use.
            #
            regexp='^ *# *[Ss]tudent +[Nn]ame:?'
            line=$( head -20 "$1" | egrep "$regexp" )

            # See if the egrep found the line (test for zero size string).
            #
            if [ -z "$line" ] ; then
                echo 1>&2 "$0: '$1': Missing Student Name comment"
                return 1   # line not found - return bad status (not exit)
            fi

            # We get here if the egrep pattern did find the comment line.
            # The line we found is stored in shell variable "$line".
            # Do further tests on the line:
            # See if the line we found has at least two letters following.
            # This re-uses the same regexp pattern from above and adds to it.
            # Note the correct use of double/single quoting in the pattern.
            # Note that we echo the $line as part of the error message.
            #
            try=$( echo "$line" | egrep "$regexp"' +[a-zA-Z][a-zA-Z]' )
            if [ -z "$try" ] ; then
                echo 1>&2 "$0: '$1': Missing letters after Student Name: $line"
                return 2   # return bad status (do not exit)
            fi

            return 0   # no errors - must be a valid line - return good status
        }

    All the testing functions will follow the above order of operation:
    First, the function must try to find the basic keywords used in the
    line for which it is looking. Second, perform some validations on the
    text that is supposed to follow the keywords. Each testing function
    prints an error message and returns a non-zero status when it detects
    an error. (Do not exit the script on error!)

    You would write the above function and then use it in an IF statement
    in your script as follows (assuming the argument pathname is in the
    variable $file):

        if ! TestStudentName "$file" ; then
            ... insert code to count the error here ...
        fi

    Your finished program will start with a set of function definitions,
    each preceded by a block comment describing the purpose of the
    function. After all the function definitions will come the same
    number of IF statements (similar to the one above). Each IF statement
    will execute one of the functions and check its return status.

    Below are descriptions of more testing functions that you must write.
    Each function should search the script file for a line containing the
    given pattern and, if found, validate the rest of the line.

    Be precise - "Name:*" would be an incorrect regexp for use in the
    above function, because it matches "Name::::", which is not in the
    specification for the function. (The specification says the colon is
    "optional", which means "zero or one", not "zero or more".) Match
    exactly what is described in the specifications for each function.

    Define each testing function before you use it in the script.
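    The difference between an optional colon and a repeated colon can be
    seen in a small experiment. The following is only a sketch: the sample
    lines and the helper function Matches are made up for illustration, and
    "grep -E" is used as the POSIX spelling of egrep.

```shell
#!/bin/sh
# Sketch only: compare an optional colon (:?) with a repeated colon (:*).
# The helper name "Matches" and both sample lines are hypothetical.

strict='^ *# *[Ss]tudent +[Nn]ame:? +[a-zA-Z][a-zA-Z]'   # zero or one colon
loose='^ *# *[Ss]tudent +[Nn]ame:* +[a-zA-Z][a-zA-Z]'    # zero or more colons

# Matches prints "accept" or "reject" for a regexp ($1) and a line ($2).
Matches () {
    if printf '%s\n' "$2" | grep -E "$1" >/dev/null ; then
        echo accept
    else
        echo reject
    fi
}

good='# Student Name: Pat Example'
bad='# Student Name:::: Pat Example'

echo "strict, good line: $( Matches "$strict" "$good" )"
echo "strict, bad line:  $( Matches "$strict" "$bad" )"
echo "loose,  bad line:  $( Matches "$loose" "$bad" )"
```

    The strict pattern accepts the good line and rejects the line with four
    colons; the loose pattern accepts the line with four colons. Note that
    the rejection depends on the text required after the colon: a pattern
    that simply ends in ":?" is not anchored on the right and would still
    find "Name:" inside "Name::::".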
    The top part of your script will be all your function definitions;
    the bottom part will be IF statements, similar to the one shown
    above, each using one of the function definitions on the existing
    pathname argument. (Do not process nonexistent arguments!)

    Every time a testing function returns a non-zero status, count it as
    an error. (Do not exit the script after a testing function - just
    count each error and move on to the next IF statement that uses the
    next testing function.)

    You are to write the following testing functions, one at a time, and
    add them to the script. Write each function, make sure it works, and
    then add the next one. Start by using the TestStudentName function
    code and IF statement given above, then add more functions, one at a
    time. Test your script after each new function.

    Copy the Student Name function code given above and modify it to work
    for each new function that you write. Follow the same two-part order
    of operation (described earlier) in each function that you write.

    Be generous with white space, punctuation, and upper/lower case
    letters in what you accept in your regular expressions for comments;
    be more strict with tests that apply to shell syntax (e.g. the #!
    line).

    Here are the descriptions of the testing and validation functions.
    Write one function per numbered test below; invent and use your own
    good function names:

    1) A function to test for and validate the Student Name comment line.
       (See the description and TestStudentName code already given above.)

    2) A function to test for the Algonquin Email Address comment line.
       Look for and reject (with an error message) yahoo or hotmail in
       the email address.
       (Bonus mark: If the email address contains "@", make sure the "@"
       is followed by "algonquincollege.com" and nothing else. Hint: Look
       for @ in the line first. If you find it, then look for
       @algonquincollege.com and if that fails, print an error message.)

    3) A function to test for the Student Number comment line.
       Make sure the student number contains only 9 digits, with any
       number of optional blanks or hyphens between sets of three digits.
       Make sure there is nothing else on the line but the student number.
       Good line: # Student Number: 123 456 789
       Bad line:  # Student Number: 123 456 789 junk
       Bad line:  # Student Number: 12 3456 789

    4) A function to test for the Course Number comment line.
       Make sure that the course number on the line is CST8129, with
       optional spaces or hyphens between CST and 8129 and optional
       upper/lower case letters for CST.
       Make sure there is nothing else on the line but the course number.
       Good line: # Course Number: CST 8129
       Bad line:  # Course Number: CST 8229
       Bad line:  # Course Number: csd8129

    5) A function to test for the Lab Section Number comment line.
       Validate the section number as one of 011, 012, 013, or 014.
       You can make the leading zero optional.
       Make sure there is nothing else on the line but the lab section
       number.
       Good line: # Lab Section Number: 011
       Bad line:  # Lab Section Number: 110
       Bad line:  # Lab Section Number: 010

    6) A function to validate the first line of the script as being a
       valid shell line in this course. Accept #!/bin/bash or #!/bin/sh
       at the start of the line, followed by one or more blanks followed
       by "-u" at the very end of the line.
       (Partial egrep regexp hint: /bin/(ba)?sh )
       Make sure you are testing only the first line of the script!
       If the first line is not valid, also echo it in your error message.

    7) A function to search for a valid PATH= line anywhere in the script.
       If you find a line containing "PATH=", construct a validating
       egrep regular expression that will match the line, in this order:
         - beginning of line
         - any number of blanks, tabs (whitespace)
         - optional: word "export" followed by one or more blanks
           (Note: "optional" is a synonym for "zero or 1 occurrences")
         - the exact string: PATH=/bin:/usr/bin
         - zero or more blanks
         - a comment character "#", semicolon ";", or, end-of-line
       Echo the invalid PATH line in your error message.
       Good line: export PATH=/bin:/usr/bin # this is a comment
       Bad line:  PATH=/bin:/user/bin
       Bad line:  path=/bin:/usr/bin

    8) A function to search for a valid umask line anywhere in the script.
       When you find the line containing "umask", construct a validating
       egrep regular expression that will match the line, in this order:
         - beginning of line
         - any number of blanks, tabs (whitespace)
         - the exact word: umask
         - one or more blanks
         - a digit zero
         - two or three (not four) more digits from the set 0-7
         - zero or more blanks
         - a comment character, semicolon, or end-of-line
       Echo the invalid umask line in your error message.
       Good line: umask 022 # this is a comment
       Bad line:  umask 22
       Bad line:  unmask 022

    Write a series of IF statements that uses each of your functions and
    counts the error if the function returns a non-zero status. (See my
    example IF statement, above.)

    At the end of the script, after you have made all of the above tests
    and counted all the errors that might exist, exit the script with the
    following exit status:

      - exit 0 if the argument file passed all of the tests without error
      - exit 1 if the argument was not a non-empty, readable, executable
        file
      - exit 2 if more than one argument was given to the script
      - if the argument file had errors, exit with a value that is the
        number of errors plus 10, e.g. exit 15 for a count of 5 errors.
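    The counting and the "errors plus 10" rule above can be sketched as
    follows. This is an illustration under assumptions, not a required
    layout: AlwaysPass, AlwaysFail, and FinalStatus are hypothetical names,
    the pathname "somefile" is a placeholder that is never opened, and the
    $(( )) arithmetic assumes a POSIX shell (expr is the older alternative).

```shell
#!/bin/sh
# Sketch only: count failed tests and turn the count into an exit status.
# AlwaysPass and AlwaysFail are hypothetical stand-ins for real testing
# functions; the pathname "somefile" is a placeholder and is never opened.

AlwaysPass () { return 0 ; }
AlwaysFail () { echo 1>&2 "$0: '$1': pretend error" ; return 1 ; }

# FinalStatus prints the exit status for a given error count ($1):
# 0 for no errors, otherwise the count plus 10.
FinalStatus () {
    if [ "$1" -eq 0 ] ; then
        echo 0
    else
        echo $(( $1 + 10 ))
    fi
}

file='somefile'
errors=0

# One IF statement per testing function; count each failure and move on.
if ! AlwaysPass "$file" ; then
    errors=$(( errors + 1 ))
fi
if ! AlwaysFail "$file" ; then
    errors=$(( errors + 1 ))
fi
if ! AlwaysFail "$file" ; then
    errors=$(( errors + 1 ))
fi

status=$( FinalStatus "$errors" )
echo "errors=$errors status=$status"
# A real script would finish with:  exit "$status"
```

    With one passing and two failing stand-ins, the sketch prints
    "errors=2 status=12" on standard output (the pretend error messages go
    to standard error).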
    Example runs of this script might look like this (the comment lines
    beside the exit codes were added to explain the return value):

        $ ./30_script_validator.sh 30_script_validator.sh
        $ echo $?
        0     # the script should process itself without errors!

        $ ./30_script_validator.sh a b c
        ./30_script_validator.sh: only 1 path argument allowed, you entered 3 (a b c)
        $ echo $?
        2     # invalid calling syntax

        $ ./30_script_validator.sh /dev/null
        ./30_script_validator.sh: /dev/null is not a file
        $ echo $?
        1     # non-file argument supplied

        $ ./30_script_validator.sh badcourse.sh
        ./30_script_validator.sh: 'badcourse.sh': bad format Course Number in: # Course Number CST8229
        $ echo $?
        11    # one error means exit status 10+1 = 11

        $ ./30_script_validator.sh badfirst.sh
        ./30_script_validator.sh: 'badfirst.sh': incorrect first line of script: #/bin/bash -u
        $ echo $?
        11    # one error means exit status 10+1 = 11

        $ echo not much >j
        $ ./30_script_validator.sh j
        ./30_script_validator.sh: 'j': Missing Student Name comment
        ./30_script_validator.sh: 'j': Missing Algonquin Email Address comment
        ./30_script_validator.sh: 'j': Missing Student Number comment
        ./30_script_validator.sh: 'j': Missing Course Number comment
        ./30_script_validator.sh: 'j': Missing Lab Section Number comment
        ./30_script_validator.sh: 'j': incorrect first line of script: not much
        ./30_script_validator.sh: 'j': missing PATH= line
        ./30_script_validator.sh: 'j': missing umask line
        $ echo $?
        18    # 8 errors means exit status 10+8 = 18

        $ echo 'PATH=/bin:/user/bin' >j
        $ ./30_script_validator.sh j
        ./30_script_validator.sh: 'j': Missing Student Name comment
        ./30_script_validator.sh: 'j': Missing Algonquin Email Address comment
        ./30_script_validator.sh: 'j': Missing Student Number comment
        ./30_script_validator.sh: 'j': Missing Course Number comment
        ./30_script_validator.sh: 'j': Missing Lab Section Number comment
        ./30_script_validator.sh: 'j': incorrect first line of script: PATH=/bin:/user/bin
        ./30_script_validator.sh: 'j': unrecognized PATH= line: PATH=/bin:/user/bin
        ./30_script_validator.sh: 'j': missing umask line
        $ echo $?
        18    # 8 errors means exit status 10+8 = 18

    The above sample test runs are not exhaustive. Test all eight of your
    functions to make sure they work with various input files containing
    various errors.

    The script file "validate.sh" (under Notes) may also be useful to you
    as an example of writing and using these testing functions.

--) Write this executable script named "31_multi_validator.sh" (1 mark)

    Loop for all command line arguments and call the above
    30_script_validator.sh script with each argument. (Do not bother
    validating the pathnames before passing them to the
    30_script_validator.sh script, since that script already does
    argument validation for you.)

    Process and total up the exit statuses of the validator script after
    each execution to use for the statistics, below:

    After having processed all the command line arguments with the
    validator script, print your collected statistics on how the
    arguments were processed by the script:

      - a count of how many pathnames were found invalid (unreadable, etc.)
        (how many times the validator script exited with code 1)
      - a count of how many files were processed with no errors found
        (how many times the validator script exited with code 0)
      - a count of how many files had any errors found
        (how many times the validator script exited with a code > 10)
      - a count of the total number of errors in all the files processed
        (recall that the 30_script_validator.sh script exits with the
        number of errors plus 10, if there are errors - add up the total
        number of errors)

    Note that you can write and test this script even if you don't have a
    working 30_script_validator.sh script. You can create a "dummy"
    script that does nothing but exit with the desired return status, and
    call that instead of 30_script_validator.sh.
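
    The "dummy" approach might look like the sketch below. Everything in it
    is an assumption made for illustration: the dummy script name, the
    variable names, and the use of mktemp (common on Linux and BSD but not
    specified by POSIX). The dummy simply exits with the number it is given,
    so each loop "argument" here is really the status we want back.

```shell
#!/bin/sh
# Sketch only: tally the exit statuses of a stand-in validator.
# The dummy script, its location, and the variable names are hypothetical.

tmpdir=$( mktemp -d ) || exit 1
dummy="$tmpdir/dummy_validator.sh"

# The dummy does nothing but exit with the status passed as its argument.
cat >"$dummy" <<'EOF'
#!/bin/sh
exit "$1"
EOF

invalid=0       # validator exited with 1
clean=0         # validator exited with 0
witherrors=0    # validator exited with a code greater than 10
total=0         # sum of (exit code - 10) over files with errors

for arg in 0 1 13 0 11 ; do
    # A real script would run ./30_script_validator.sh "$arg" here.
    sh "$dummy" "$arg"
    status=$?
    if [ "$status" -eq 0 ] ; then
        clean=$(( clean + 1 ))
    elif [ "$status" -eq 1 ] ; then
        invalid=$(( invalid + 1 ))
    elif [ "$status" -gt 10 ] ; then
        witherrors=$(( witherrors + 1 ))
        total=$(( total + status - 10 ))
    fi
done

rm -rf "$tmpdir"
echo "invalid=$invalid clean=$clean witherrors=$witherrors total=$total"
```

    For the five sample statuses shown, the sketch prints
    "invalid=1 clean=2 witherrors=2 total=4". Saving $? into a variable
    immediately after the call matters, because every later command
    (including the tests in the IF statement) overwrites it.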