Week 7-8 Notes for DAT2330 - Ian Allen
Supplement to Text, outlining the material done in class and lab.

Remember - knowing how to find out the answer is more important than
memorizing the answer.  Learn to fish!  RTFM![*]
([*] Read The Fine Manual)


Look under the "Notes" button on the course web page for these
study notes for the Linux textbook:

    chapter11.txt (Shell Script Programming [not all])


Complete these Floppix Labs on http://floppix.ccai.com/

    Floppix Lab 26: BASH Scripts

Note: An early version of "bash" is also available on ACADAIX.  


In this file:

   * Shell Script Programming
   * Shell Script Header
   * Commenting Shell Programs
   * Code Commenting Style
   * Testing Return Codes of Commands
   * Complementing Return Status
   * Writing and Debugging Shell Scripts
   * Naming Shell Scripts

========================
Shell Script Programming
========================

You are computer programmers.  When writing programs (scripts are
programs), you are not simply trying to "get it to work"; you are
also (and most importantly) practicing and demonstrating good programming
techniques.

Employers tell us that programming techniques are more important than
the question of which languages you know.

Good programming techniques include such things as:

   1. Correct documentation (using guidelines given here)
   2. Writing clean, simple, robust code: "Less code is better code"!
   3. Using correct indentation, style, and mnemonic names
   4. Using correct modularization and scope (i.e. local vs. global variables)
   5. Correct use of arguments and/or prompting for user input
   6. Correct use of exit codes on success/failure
   7. Choosing the correct control structures
   8. Presenting accurate and useful help if user makes an error
   9. Checking the return status of all commands used in the script

These criteria will be assessed in marking your programs (shell scripts).

===================
Shell Script Header
===================

All Shell Scripts in this course must contain a set of comment lines
similar to the following (my explanatory remarks are on the right):

    #!/bin/sh -u                               <== specify interpreter
    #     $0 [ args... ]                       <== show arguments and syntax
    # This script echoes its arguments.        <== give purpose of script
    # Ian Allen - idallen@ncf.ca               <== author and email
    # Mon Jun 11 17:06:04 EDT 2001             <== date (optional)

    export PATH=/bin:/usr/bin                  <== set correct PATH
    umask 022                                  <== correct file creation mask

    echo "$@"                                  <== your script goes here
    status=$?                                  <== save the exit status

    exit $status                               <== set exit status on exit

Don't put the name of the shell script inside the shell script, since
the name will be wrong if you rename the script.  Use $0 instead.

You will need to set variable $status to the appropriate exit value for
your script.  (Zero means "okay", non-zero means "something went wrong".)
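
A minimal sketch of one way to set $status, assuming a hypothetical
function named check_readable (it is not part of the course template).
It starts with a status of zero and changes it to non-zero if any
argument fails, so that one bad argument does not hide the others:

```shell
# check_readable is a hypothetical name used only for illustration.
# It reports every unreadable argument and remembers that one failed.
check_readable () {
    status=0
    for arg do
        if ! test -r "$arg" ; then
            echo 1>&2 "$0: '$arg' is missing or unreadable"
            status=1
        fi
    done
    return "$status"
}

if check_readable /dev/null /no/such/file ; then
    echo "all arguments are readable"
else
    echo "at least one argument failed"
fi
```

In a real script the last line would be: exit "$status"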

Choose PATH to include the system directories that contain the commands
your script needs.  Directories /bin and /usr/bin are almost always
necessary.  System scripts may need /sbin and /usr/sbin.  GUI programs
will need the X11 directories.  Choose appropriately.

Choose the umask to permit or restrict access to files and directories
created by the shell script.  "022" is a customary value, allowing read
and execute by group and others.  "077" is used in high-security scripts,
since it blocks all permissions for group and others.
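
The effect of the two umask values can be seen with a small experiment.
This is only a sketch: the directory and file names are made up for
illustration, and each umask is set inside a subshell so that it does
not affect the rest of the script:

```shell
demo=/tmp/umaskdemo$$               # hypothetical scratch directory
mkdir "$demo" || exit 1

( umask 022 ; >"$demo/shared" )     # subshell: umask change stays local
( umask 077 ; >"$demo/private" )

ls -l "$demo"
```

On a typical Linux system, "shared" is created with mode rw-r--r-- and
"private" with mode rw-------.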

You must test the return codes of all commands inside the script.
Your script must exit non-zero if anything inside the script fails.
(Read more on how to do this, below.)

Prompts and error messages should be sent to Standard Error, not to
Standard Output (Linux text p.308).
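
One way to keep messages off Standard Output is to send them all through
a small function.  This is a sketch; the function name errmsg is
hypothetical and not something the textbook defines:

```shell
# errmsg is a hypothetical helper: it prints its arguments on
# Standard Error, prefixed with the name of the script.
errmsg () {
    echo 1>&2 "$0: $*"
}

errmsg "this line goes to Standard Error"
echo "this line goes to Standard Output"
```

If the output of such a script is redirected into a file or a pipe, the
error message still appears on the terminal instead of being mixed into
the data.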

Variables must be quoted correctly to prevent unexpected special character
expansion by the shell.
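
A short sketch of why the quoting matters, assuming a Bourne-style shell
(sh or bash) with the default word separators.  The function name
count_args and the file name are made up for illustration:

```shell
# count_args is a hypothetical helper that reports how many
# arguments it received.
count_args () {
    echo "$#"
}

name="my file.txt"       # a value containing a blank

count_args $name         # unquoted: the shell splits the value in two
count_args "$name"       # quoted: the value stays one argument
```

The unquoted use passes two arguments ("my" and "file.txt"); the quoted
use passes one.  A command such as "rm" would act on the wrong files in
the unquoted case.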

If you prepare a template file containing the above script model,
you can copy and use it to begin your scripts on tests and save time
(and avoid errors and omissions).  Do not include my remarks.

=========================
Commenting Shell Programs
=========================

Comments should add to a programmer's understanding of the code.
They don't comment on the syntax or language mechanism used in the code,
since both these things are obvious to programmers who know the language.
(Don't comment that which is obvious to anyone who knows the language.)

Programmer comments deal with what the line of code means in the
*algorithm* used, not with syntax or how the language *works*.

Thus: Do not use comments that state things that relate to the syntax or
language mechanism used and are obvious to a programmer, e.g.

    # THESE ARE OBVIOUS AND NOT HELPFUL COMMENTS:
    x=$#               # set x to $#           <== OBVIOUS; NOT HELPFUL
    date >x            # put date in x         <== OBVIOUS; NOT HELPFUL
    test "$a" = "$b"   # see if $a equals $b   <== OBVIOUS; NOT HELPFUL
    cp /dev/null x     # copy /dev/null to x   <== OBVIOUS; NOT HELPFUL

Better, programmer-style comments:

    loop=$#                  # initialize loop index to max num arguments
    date >tmpdate            # put account starting date in temp file
    test "$itm" = "$arg"     # see if search list item matches command arg
    cp /dev/null tmpdate     # reset account starting date to empty

Do not copy "instructor-style" comments into your code.  Instructor-style
comments are put on lines of code by teachers to explain the language
and syntax functions to people unfamiliar with programming (e.g. to
students of the language).  Instructor-style comments are "obvious"
comments to anyone who knows how to program; they should never appear
in your own programs (unless you become an instructor!).

=====================
Code Commenting Style
=====================

Comments should be grouped in blocks, ahead of blocks of related code
to which the comments apply, e.g.

    # Set standard PATH and secure umask for accounting file output.
    #
    PATH=/bin:/usr/bin ; export PATH
    umask 077

    # Verify that arguments exist and are non-empty.
    #
    NARGS=3
    if test "$#" -ne $NARGS ; then
        echo 1>&2 "$0: Expecting $NARGS arguments; you gave: $#"
        exit 1
    fi
    for arg do
        if ! test -s "$arg" -a -f "$arg" ; then
            echo 1>&2 "$0: arg '$arg' is missing, empty, or a directory"
            exit 1
        fi
    done

Do not alternate comments and single lines of code!
This makes the code hard to read:

    # THIS IS A BAD EXAMPLE OF COMMENTS MIXED WITH CODE !!
    # Set a standard PATH plus system admin directories
    PATH=/bin:/usr/bin:/sbin:/usr/sbin
    # export the PATH for other programs
    export PATH
    # Set secure umask for accounting file output.
    umask 077
    # create empty lock file
    >lockfile || exit $?
    # attempt to create link to lock file
    ln lockfile lockfile.tmp || exit $?
    # copy password file in case of error
    cp -p /etc/passwd /tmp/savepasswd$$ || exit $?
    # remove guest account (can't quick-check return code on grep)
    grep -v '^guest:' /etc/passwd >lockfile.tmp
    # copy new file back to the password file's file system
    cp lockfile.tmp /etc/passwd.tmp || exit $?
    # fix the mode to be readable
    chmod 444 /etc/passwd.tmp || exit $?
    # use mv to do atomic update of passwd file
    mv /etc/passwd.tmp /etc/passwd || exit $?
    # remove lock file
    rm lockfile.tmp lockfile
    # THIS IS A BAD EXAMPLE OF COMMENTS MIXED WITH CODE !!

Block comments and code are easier to read.  Here is a block-comment
version of the above code:

    # Set and export standard PATH plus system admin directories.
    # Secure umask protects files.
    #
    export PATH=/bin:/usr/bin:/sbin:/usr/sbin
    umask 077

    # Create empty lock file and attempt to create link to lock file.
    #
    >lockfile || exit $?
    ln lockfile lockfile.tmp || exit $?

    # Copy password file in case of error.
    #
    cp -p /etc/passwd /tmp/savepasswd$$ || exit $?

    # Remove guest account.  (Can't quick-check return code on grep.)
    # Copy new file back to the password file's file system.
    # Fix the mode to be readable.
    # Use mv to do atomic update of passwd file.
    # Remove the lock file.
    #
    grep -v '^guest:' /etc/passwd >lockfile.tmp
    cp lockfile.tmp /etc/passwd.tmp || exit $?
    chmod 444 /etc/passwd.tmp || exit $?
    mv /etc/passwd.tmp /etc/passwd || exit $?
    rm lockfile.tmp lockfile

Why can't you exit the script if grep returns a non-zero status code?

================================
Testing Return Codes of Commands
================================

Just as you would never use a C Library function without checking its
return code, you must never use commands in important shell scripts
without at least a minimal checking of their return codes.  At minimum,
the shell script should exit non-zero if a command fails unexpectedly:

    grep -v '^guest:' /etc/passwd >lockfile.tmp
    cp lockfile.tmp /etc/passwd.tmp || exit $?
    chmod 444 /etc/passwd.tmp || exit $?
    mv /etc/passwd.tmp /etc/passwd || exit $?
    rm lockfile.tmp lockfile || exit $?

The shell conditional execution syntax "||" is used here to test the
return codes of the commands on the left and execute the command on
the right if the command on the left returns a bad status (non-zero).
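
A tiny sketch of the "||" behaviour, using the standard commands "true"
(always succeeds) and "false" (always fails).  The variable names are
made up for illustration:

```shell
ran_after_success=no
ran_after_failure=no

true  || ran_after_success=yes    # left side succeeded: right side skipped
false || ran_after_failure=yes    # left side failed: right side runs

echo "after success: $ran_after_success; after failure: $ran_after_failure"
```

Only the second assignment happens, because only "false" returns a
non-zero status.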

Some commands naturally return a non-zero exit status even when they are
doing what you expect (e.g. grep might not find what you were looking for
- this might be okay), and cannot be tested using this simple method.
Do not exit the script after a "grep" or "cmp" command returns non-zero!
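
For such commands, save the status and decide what each value means.
Here is a sketch for "cmp", which returns 0 if the files are the same,
1 if they differ, and a larger value if the comparison could not be done.
The function name same_files and the temporary file names are
hypothetical:

```shell
# same_files is a hypothetical helper: it treats status 1 from cmp
# as a normal answer and only larger values as real errors.
same_files () {
    cmp -s "$1" "$2"
    status=$?
    case "$status" in
    0)  echo "same" ;;
    1)  echo "different" ;;
    *)  echo 1>&2 "$0: cmp '$1' '$2' failed; status $status"
        return "$status" ;;
    esac
}

echo "one" >/tmp/cmpdemo$$.a
echo "two" >/tmp/cmpdemo$$.b
same_files /tmp/cmpdemo$$.a /tmp/cmpdemo$$.a
same_files /tmp/cmpdemo$$.a /tmp/cmpdemo$$.b
rm /tmp/cmpdemo$$.a /tmp/cmpdemo$$.b
```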

More complex testing
--------------------

Unfortunately, simply exiting non-zero doesn't tell the user of the
script which script contained the command that failed:

    $ ./myscript
    cp: /etc/passwd.tmp: No space left on device

If this script were being run in the background along with several other
scripts also containing similar commands, or if this script were being
run by a system daemon or delayed execution scheduler (atd or crond),
we wouldn't know from which script the actual "cp" error message came.

More work is needed to produce a truly useful error message.

The full and proper way to handle non-zero return codes in scripts is
by using error messages that contain the script name.  This means you
need "if" statements around *every* command that might fail!  This is
probably overkill for most hobby scripts, but it is necessary for
systems programming:

    grep -v '^guest:' /etc/passwd >lockfile.tmp
    status="$?"
    if [ "$status" -ne 1 -a "$status" -ne 0 ] ; then
        # grep returns 2 on serious error
        echo 1>&2 "$0: grep guest /etc/passwd failed; status $status"
        exit 1
    fi
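    # Save each status before testing it: inside "if ! command", the
    # value of $? is the complemented status (zero), not the real one.
    cp lockfile.tmp /etc/passwd.tmp
    status="$?"
    if [ "$status" -ne 0 ] ; then
        echo 1>&2 "$0: cp lockfile.tmp /etc/passwd.tmp failed; status $status"
        exit 1
    fi
    chmod 444 /etc/passwd.tmp
    status="$?"
    if [ "$status" -ne 0 ] ; then
        echo 1>&2 "$0: chmod /etc/passwd.tmp failed; status $status"
        exit 1
    fi
    mv /etc/passwd.tmp /etc/passwd
    status="$?"
    if [ "$status" -ne 0 ] ; then
        echo 1>&2 "$0: mv /etc/passwd.tmp /etc/passwd failed; status $status"
        exit 1
    fi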
    rm lockfile.tmp lockfile

Now, the script tells you its name in the error message:

    $ ./myscript
    cp: /etc/passwd.tmp: No space left on device
    ./myscript: cp lockfile.tmp /etc/passwd.tmp failed; status 2

Now it's easy to tell from which script the above "cp" error came.

Making your scripts detect errors and issue clear error messages is
tedious but not difficult.  Adding all the error checking makes the
code much longer and harder to read and modify.  If the script isn't
doing anything important, simply exiting after a failed command may be
sufficient; it only adds a few words to each line of the script.

For a system script that must detect errors under all conditions
(including "too many processes", "file system full", etc.), you must
have all the additional error checking.  The reward is a script that
won't let you down when things go wrong, and that will tell you exactly
what the problem is when one develops.

===========================
Complementing Return Status
===========================

Any command or command pipeline's return status can be complemented
(reversed from good to bad or bad to good) using a leading "!" before
the command, e.g.

    $ false
    $ echo $?
    1

    $ ! false
    $ echo $?
    0

    $ grep nosuchxxx /etc/passwd
    $ echo $?
    1

    $ ! grep nosuchxxx /etc/passwd
    $ echo $?
    0

This is useful in shell scripts to simplify this:

    if grep "$var" /etc/passwd ; then
        : do nothing
    else
        echo 1>&2 "$0: Cannot find '$var' in /etc/passwd"
    fi

to this:

    if ! grep "$var" /etc/passwd ; then
        echo 1>&2 "$0: Cannot find '$var' in /etc/passwd"
    fi

The "!" prefix is also useful in turning "while" loops into "until"
loops or vice-versa.
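
A sketch of the equivalence, assuming a POSIX shell that supports
$(( )) arithmetic.  The variable names and the limit of 3 are made up
for illustration:

```shell
# Count up to a hypothetical limit, two equivalent ways.
limit=3

count=0
while ! test "$count" -ge "$limit" ; do      # loop while NOT finished
    count=$((count + 1))
done

n=0
until test "$n" -ge "$limit" ; do            # loop until finished
    n=$((n + 1))
done

echo "while loop stopped at $count; until loop stopped at $n"
```

Both loops stop when the counter reaches the limit.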

===================================
Writing and Debugging Shell Scripts
===================================

The Number One rule of writing shell scripts is:

    Start Small and Add One Line at a Time!

Students who write a 10- or 100-line script and then try to test it
all at once usually run out of time.  An unmatched quote at the start
of a script can eat the entire script until the next matching quote!

Start your script with the Script Header (name of interpreter, PATH,
umask, comments) and the single command "date".  If that doesn't work,
you know something fundamental is wrong, and you only have a few lines
of code that you need to debug.  (Is your interpreter correct? your PATH?)

Add to this simple script one or two lines at a time, so that when an
error occurs you know it must be in the last line or two that you added.

Do not add 10 lines to a script!  You won't know what you did wrong!

You can ask the shell to show you the lines of the script it is reading
and executing by using the "-v" or "-x" (or both) option to the shell:

    $ sh -v -u ./myscript arg1 arg2 ...

    $ sh -x -u ./myscript arg1 arg2 ...

The "-v" option displays the command lines as they are read by the shell
(without any shell expansion).  The "-x" option displays the command
lines as they are passed to the commands being executed, after the shell
has done all the command line expansion and processing.

These options will allow you to see the commands as they execute, and
may help you locate errors in your script.  (Double-quote your variables!)
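
Bourne-style shells also let you turn tracing on for only part of a
script, using the built-in commands "set -x" and "set +x".  The sketch
below assumes such a shell; the function name trace_part is hypothetical:

```shell
# trace_part is a hypothetical function used only for illustration.
trace_part () {
    greeting="hello"
    set -x                      # start tracing here
    echo "$greeting world"
    set +x                      # stop tracing
    echo "this command is not traced"
}

trace_part
```

The trace lines are written to Standard Error with a leading "+", so they
do not mix with the Standard Output of the script.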

Of course you can use -v and -x with an interactive shell too:

    $ sh -v
    $ echo $SHELL
    echo $SHELL
    /usr/bin/ksh
    $ echo *
    echo *
    a b c d
    
    $ sh -x
    $ echo $SHELL
    + echo /usr/bin/ksh
    /usr/bin/ksh
    $ echo *
    + echo a b c d
    a b c d

    $ sh -v -x
    $ echo $SHELL
    echo $SHELL
    + echo /usr/bin/ksh
    /usr/bin/ksh
    $ echo *
    echo *
    + echo a b c d
    a b c d

Remember that if you use a shell to read a shell script ("sh scriptname"),
instead of executing it directly ("./scriptname"), the shell will
treat all the comments at the start of the shell script as comments.
In particular, the comment that specifies the interpreter to use when
executing the script ("#!/bin/sh -u") will be ignored, as will all of
the options listed beside that interpreter.

Only by actually *executing* the script will you cause the Unix kernel
to use the interpreter and options given on the first line of the script.
For example:

    $ cat test
    #!/bin/sh -u
    echo 1>&2 "$0: This is '$undefined'"

    $ ./test
    ./test: undefined: unbound variable

    $ sh test
    test: This is ''

    $ sh -u test
    test: undefined: unbound variable

    $ csh test
    Bad : modifier in $ ( ).

All shells treat #-lines as comments and ignore them.  Only the Unix
kernel treats #! specially, and only for executable scripts.

====================
Naming Shell Scripts
====================

Often, you will want to put an example of how to run a shell script
inside the shell script as a comment.  You might create a script called
"doexec" and write it as follows:

    #!/bin/sh -u
    #
    #     doexec [ files... ]
    #
    # This script sets execute permissions on all its arguments.
    # -IAN! idallen@ncf.ca
    # Mon Jun 11 23:02:38 EDT 2001

    PATH=/bin:/usr/bin ; export PATH
    umask 022

    if test "$#" -eq 0 ; then
        echo 1>&2 "$0: No arguments; nothing done"
        status=0
    else
        if chmod +x "$@" ; then
            status=0   # it worked
        else
            status="$?"
            echo 1>&2 "$0: chmod exit status: $status"
            echo 1>&2 "$0: Could not change mode of some argument: $*"
        fi
    fi
    exit "$status"

You would execute this script by typing:

    $ ./doexec filename1 filename2 filename3...

This comment line in the doexec script:

    #     doexec [ files... ]

tells the reader that the script name is "doexec" and the files are the
(optional) arguments to the script.

But what if you rename the script to be something other than "doexec"?

    $ mv doexec fixperm
    $ ./fixperm foo bar

The use of "$0" in the echo line for the error message ensures that the
shell will print the actual script name in the error message, but the
comment in the script is now wrong, since the program name is no longer
"doexec".  I don't want to have to edit the script and make a change
such as "doexec" to "fixperm" every time I change the name of the script.

The solution is never to put the actual name of a script inside the
script, even as a comment.  Wherever you refer to the name of the script,
even in a comment, use the "$0" convention instead.  So, the comment
changes from:

    #     doexec [ files... ]

to be:

    #     $0 [ files... ]

In the comment, "$0" just means "whatever the name of this script is",
without my having to actually write the script name.  I don't want to use
the actual script name, because I might change it.  Since the line is a
comment, ignored by the shell, the shell will never actually expand that
"$0" to be the real name of the script; it's just a convenient way of
specifying the program name without actually naming it inside the script.

Never put the name of a program inside the program; it might change!
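
The renaming problem can be tried out with a throwaway script.  This is
a sketch only: the directory name is made up, and the script is run with
"sh -u" rather than executed directly so that the example does not depend
on execute permissions in /tmp:

```shell
# The directory and script names here are made up for illustration.
dir=/tmp/namedemo$$
mkdir "$dir" || exit 1

cat >"$dir/doexec" <<'EOF'
#!/bin/sh -u
#     $0 [ files... ]
echo 1>&2 "$0: No arguments; nothing done"
EOF

sh -u "$dir/doexec"                 # message names .../doexec
mv "$dir/doexec" "$dir/fixperm"
sh -u "$dir/fixperm"                # same file; message names .../fixperm
```

Nothing inside the file was edited, yet both the error message and the
"$0" comment remain correct after the rename.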