Shell Script Problems – arithmetic, syntax, test, boolean, etc.

Ian! D. Allen – www.idallen.com

Fall 2016 - September to December 2016 - Updated 2018-11-29 14:31 EST

1 Avoiding Common Script Problems

There are many ways to make mistakes in script programming. Here are some warnings about common errors.

2 Writing too much code to test

Shells are not particularly good about giving helpful error messages when shell scripts contain errors. For example, having a missing or non-integer argument to the test command may produce a vague error message:

bash$ test 1 -eq
bash: test: 1: unary operator expected

sh$ test 1 -eq ""
sh: 1: test: Illegal number:

A common mistake when writing a new shell script is to write too many lines of code, run the new script, and then get too many error messages. Because you wrote so many lines, you don’t know which line contains the error.

Create shell scripts a few lines at a time, testing the script after you add each line or two so you know where the errors lie.

Running the script using a shell with the debug options -x or -v set may also be helpful:

$ bash -u -x ./myscript.sh
$ bash -u -v ./myscript.sh
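
When only part of a script is suspect, tracing can be limited to that region with the shell's set -x and set +x commands instead of tracing the whole run. This is a minimal sketch; the variable names count and status are made up for illustration:

```shell
#!/bin/sh -u
# Trace only the suspect region; "count" and "status" are made-up names.
count=3
status="unknown"
set -x                          # start echoing each command to stderr
if [ "$count" -gt 0 ] ; then
    status="positive"
fi
set +x                          # stop tracing
echo "status is $status"
```

The trace lines go to standard error, so they do not mix with the normal output of the script.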

3 Scripts don’t do arithmetic

Don’t forget that shells aren’t designed to do arithmetic. They find and run commands, and it is commands that you must use in if statements.

This if statement below is wrong thinking; it forgets to use the test helper program to compare the numbers:

#!/bin/sh -u
if $# -gt 0 ; then      # WRONG WRONG WRONG
    echo "Number of arguments is $#"
fi

The above if line will have the shell expand the variable $# into some number and then try to execute that number as a command. The error message will not be very helpful and will depend on the number of arguments:

$ ./myscript.sh a b c
./myscript.sh: 2: ./myscript.sh: 3: not found

$ ./myscript.sh a b c d e f g h
./myscript.sh: 2: ./myscript.sh: 8: not found

You can see the problem if you use the -x option to the shell:

$ bash -ux ./myscript.sh a b c d
+ 4 -gt 0
./myscript.sh: line 2: 4: command not found

The shell if keyword must always be followed by a command name. Always remember to code some command name after if:

if test $# -gt 0 ; then      # RIGHT: use test helper command to compare
if [ $# -gt 0 ] ; then       # RIGHT (syntactic sugar for test helper)
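
Because if only looks at the exit status of the command that follows it, the same pattern works with any command, not just test. A small sketch, using a hypothetical function named report_args:

```shell
# report_args is a hypothetical helper, not part of any standard tool.
report_args () {
    if [ $# -gt 0 ] ; then              # "[" is the command that if runs
        echo "Number of arguments is $#"
    else
        echo "No arguments"
    fi
}

report_args a b c
report_args

# if accepts any command; "true" and "false" only set an exit status.
if true ; then echo "true succeeded" ; fi
if false ; then echo "not printed" ; else echo "false failed" ; fi
```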

4 Don’t mix square brackets and command names

Conversely, don’t double up on the command name you use after if:

$ if [ fgrep foo /etc/passwd ] ; then date ; fi   # WRONG WRONG WRONG
[: foo: binary operator expected.

The above syntactic-sugar line is equivalent to this (also incorrect) line:

$ if test fgrep foo /etc/passwd ; then date ; fi  # WRONG WRONG WRONG
test: foo: binary operator expected.

The command name after the if, above, is test. The test command will see three command-line arguments (fgrep, foo, and /etc/passwd) and complain that the middle one isn’t an operator. You can’t use both commands test and fgrep at the same time.

If you want to test the return status of fgrep, then fgrep must be the command name that immediately follows the if keyword:

$ if fgrep foo /etc/passwd ; then date ; fi  # RIGHT

Don’t put command names inside square brackets.

5 The [...] square bracket syntax for test needs surrounding blanks

The if keyword must always be followed by a command name, and that command name is exactly one square bracket [ in the syntactic-sugar form of the test helper command.

The following code does not work because blanks are missing around the first square bracket, making it into an unknown command named [1 or [!:

$ if [1 -eq 1 ] ; then echo "ALWAYS USE BLANKS" ; fi
sh: [1: command not found

$ if [! -r /etc/passwd ] ; then echo "ALWAYS USE BLANKS" ; fi
sh: [!: command not found

The shell sees [1 and [! as two-character command names that don’t exist. The following incorrect statement fails for the same reason:

$ if [a=b] ; then echo "ALWAYS USE BLANKS" ; fi
bash: [a=b]: command not found

Square brackets are not punctuation! Always use blanks around [ and ]:

$ if [ 1 -eq 1 ] ; then ...          # RIGHT! surround with blanks
$ if [ ! -r /etc/passwd ] ; then ... # RIGHT! surround with blanks

The arguments to the test command must always be separate command line arguments. This next line fails because of the missing blank before the required closing square bracket:

$ if [ 1 -eq 1] ; then echo "ALWAYS USE BLANKS" ; fi
[: missing `]'

The [ command requires the separate argument ] (a right square bracket) as its last command line argument; here it received 1] instead. The corrected line uses blanks:

$ if [ 1 -eq 1 ] ; then ...          # RIGHT! surround with blanks

Always surround all the test helper command arguments with blanks.
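
One way to convince yourself that [ is a command name and not punctuation is to ask the shell about it. The sketch below assumes a POSIX-style shell; the exact wording printed by type differs between shells, so no output is shown. The variable name bracket_is_command is made up for illustration:

```shell
# Ask the shell how it would run each name; the wording varies by shell.
type test
type [

# command -v exits with status zero when the name can be run as a command.
if command -v [ >/dev/null ; then
    bracket_is_command="yes"
else
    bracket_is_command="no"
fi
echo "Is [ a command? $bracket_is_command"
```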

6 Don’t forget blanks around test operators

The test helper command behaves differently depending on the number of arguments you pass to it:

$ test 1 -eq 2              # three arguments: operator in middle
$ test -f file              # two arguments: operator on left
$ test string               # one argument: -n assumed on left: -n string

If the test command has only one single command line argument, it defaults to using -n as the implied operator (test for non-empty string) on the one argument. The following one-argument tests are always TRUE, though they may not appear that way at first to human eyes:

if test a=b ; then          # WRONG!  THIS IS TRUE (good return code) !
if [ a=b ] ; then           # WRONG!  THIS IS TRUE (good return code) !

if test 1=2 ; then          # WRONG!  THIS IS TRUE (good return code) !
if [ 1=2 ] ; then           # WRONG!  THIS IS TRUE (good return code) !

if test 0 ; then ...        # WRONG!  THIS IS TRUE (good return code) !
if [ 0 ] ; then ...         # WRONG!  THIS IS TRUE (good return code) !

In all the above lines, the test command has only one command line argument (not counting the closing ] that the [ form requires and then discards). Since the single argument to test is not the empty string, test returns a good status and the if succeeds. The test command is defaulting to use an implied -n operator on the left. The shell is actually executing these tests for non-empty strings:

if test -n "a=b" ; then     # THIS IS ALWAYS TRUE (good return code) !
if [ -n "a=b" ] ; then      # THIS IS ALWAYS TRUE (good return code) !

if test -n "1=2" ; then     # THIS IS ALWAYS TRUE (good return code) !
if [ -n "1=2" ] ; then      # THIS IS ALWAYS TRUE (good return code) !

if test -n "0" ; then ...   # THIS IS ALWAYS TRUE (good return code) !
if [ -n "0" ] ; then ...    # THIS IS ALWAYS TRUE (good return code) !

All the above tests are always true, because the three-character strings a=b and 1=2 are not empty strings and never will be empty, and the single-character string 0 is also never the empty string.

If you want to perform equality tests, you must separate each argument by blanks so that test sees three separate arguments, not just one:

if test a = b ; then        # correct 3-argument syntax
if [ a = b ] ; then         # correct 3-argument syntax

if test 1 = 2 ; then        # correct 3-argument syntax
if [ 1 = 2 ] ; then         # correct 3-argument syntax

Always keep the arguments to test separated by blanks.
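
The same trap appears when variables are involved. The sketch below uses two made-up variables, a and b, to show the one-argument and three-argument forms giving different answers:

```shell
a="apple"
b="berry"

# One argument: the shell builds the single word apple=berry,
# so test only checks that the word is not empty.
if [ "$a"="$b" ] ; then no_blanks="true" ; else no_blanks="false" ; fi

# Three arguments: test really compares the two strings.
if [ "$a" = "$b" ] ; then with_blanks="true" ; else with_blanks="false" ; fi

echo "without blanks: $no_blanks"       # prints: without blanks: true
echo "with blanks:    $with_blanks"     # prints: with blanks:    false
```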

7 Don’t use redirection operators < or > for -lt less or -gt greater

Another common mistake, usually made by programmers accustomed to other programming languages, is to use shell redirection metacharacters < and > instead of the correct operators -lt and -gt in test numeric comparisons. Here are two identical mistakes:

if test 1 > 2 ; then ...    # THIS IS WRONG - must use -gt not >
if [ 1 > 2 ] ; then ...     # THIS IS WRONG - must use -gt not >

The above two lines have the shell first use redirection (>) to create a file named 2 and redirect the output of the test command into it. (The test command produces no output; the file remains empty.) The test command itself is left with only one single command line argument, the digit 1. With one argument and no operators, the test command returns success if the argument is not the empty string (test -n 1). The string 1 is never empty, so the above test, and the if, always succeeds.

The correct shell scripting form does not use the redirection syntax:

if test 1 -gt 2 ; then ...    # right syntax for "greater than"
if [ 1 -gt 2 ] ; then ...     # right syntax for "greater than"

Do not use shell redirection metacharacters inside test expressions!
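
The stray output file is easy to observe in a scratch directory. This is a sketch only: it assumes the mktemp -d command is available (it is common but not required by POSIX), and the variable names are made up for illustration:

```shell
# Work in a scratch directory so the stray file is easy to see and remove.
# mktemp -d is widely available but not required by POSIX.
startdir=$PWD
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1

if [ 1 > 2 ] ; then wrong="true" ; else wrong="false" ; fi
if [ 1 -gt 2 ] ; then right="true" ; else right="false" ; fi

# The redirection above created an empty file named 2.
if [ -f 2 ] ; then created="yes" ; else created="no" ; fi

echo "[ 1 > 2 ]   gives $wrong"             # prints: [ 1 > 2 ]   gives true
echo "[ 1 -gt 2 ] gives $right"             # prints: [ 1 -gt 2 ] gives false
echo "file named 2 created: $created"       # prints: file named 2 created: yes

cd "$startdir" || exit 1
rm -r "$scratch"
```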

8 The test string equality operator is = not ==

If you’re a programmer, you’re used to doing equality comparisons using the == operator. In shell programming the test command uses the string comparison operator = (one equals) and not == (two equals):

if [ "$1" = '--help' ] ; then ...      # correct syntax uses '='
if [ "$1" == '--help' ] ; then ...     # WRONG !

Some shells (e.g. bash) accept the incorrect == operator as well as = to compare strings, but the /bin/sh (a link to /bin/dash) shell on Ubuntu (the CLS) is not one of them:

bash$ [ a = b ]                         # correct syntax uses one '='
bash$ [ a == b ]                        # WRONG ! but bash allows it anyway

sh$ [ a = b ]                           # correct syntax uses one '='
sh$ [ a == b ]                          # WRONG ! causes error in /bin/sh
sh: 1: [: a: unexpected operator

Always use one single equals = to compare strings.

9 Don’t confuse an empty/null argument with a missing argument

The script command line below has no arguments:

$ ./example.sh

Given the above script command line, inside the script the value of $# (the number of arguments) is zero. The value of the first argument $1 (and all following arguments) is undefined.

These script command lines below both have a single empty or null string argument:

$ ./example.sh ''
$ ./example.sh ""

Given the above script command lines, inside the script the value of $# is one because the script has one argument. The first argument itself $1 is defined but has zero characters in it:

test -z "$1"        # this is TRUE inside the script
[ "$1" = '' ]       # this is TRUE inside the script

An argument with no characters in it is not the same thing as a missing argument.

An argument that is a space character is not null or empty. These script command lines below all have a single string argument that contains a space character:

$ ./example.sh ' '
$ ./example.sh " "
$ ./example.sh \                    # there is a space after the backslash 

test -z "$1"        # this is FALSE inside the script
[ "$1" = '' ]       # this is FALSE inside the script

test -n "$1"        # this is TRUE inside the script
[ "$1" = ' ' ]      # this is TRUE inside the script
[ "$1" = " " ]      # this is TRUE inside the script
[ "$1" = \   ]      # this is TRUE inside the script

Remember the difference between:

  1. A missing (undefined) argument.
  2. A defined but null (empty) argument.
  3. An argument containing a space (or many spaces).
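
The three cases can be told apart by checking $# before looking at $1. A minimal sketch, using a hypothetical function named classify_arg:

```shell
# classify_arg is a hypothetical helper that names the three cases.
classify_arg () {
    if [ $# -eq 0 ] ; then
        echo "missing"          # no argument at all; $1 is undefined
    elif [ -z "$1" ] ; then
        echo "empty"            # argument present but zero characters long
    else
        echo "non-empty"        # at least one character, even if only a space
    fi
}

classify_arg                    # prints: missing
classify_arg ''                 # prints: empty
classify_arg ' '                # prints: non-empty
```

Testing $# first means $1 is never expanded when it is undefined, which matters in scripts that run with the -u option.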

10 Don’t use confusing double negatives or double exit status inversions

The exit status negation operator ! may be used to the left of any single expression used inside the test command:

if test ! -r "$file" ; then ...
if [ ! -r "$file" ] ; then ...
if test ! -z "$string" ; then ...
if [ ! -z "$string" ] ; then ...

The test command uses the exclamation point operator ! to negate/invert/complement the exit status of a Boolean test. If you combine the negation operator of test with the shell return code negation operator that also uses !, you can end up with confusing or unreadable code:

if ! test ! -r file ; then            # CONFUSING
if ! [ ! "abc" != "def" ] ; then      # EVEN MORE CONFUSING

Don’t use confusing double-negative logic. Rework the expression to use only a single ! or none at all:

if test -r file ; then ...            # same expression as above: readable
if [ "abc" != "def" ] ; then ...      # same expression as above: readable

Keep the negation operator as an argument to test; don’t place it before the opening square bracket alias to negate the return code of test. To test if a file is non-existent or exists but is not readable:

if ! [ -r file ] ; then               # NO: correct but awkward (do not use)
if [ ! -r file ] ; then               # YES: correct and preferred

The shell return code negation operator ! is almost never used to negate the return code of the test command itself. Always use ! as an argument to the test command, inside the square brackets, never outside.

11 Don’t use shell Boolean operators && or || for -a AND or -o OR

Programmers used to C or Java sometimes write the Boolean operators && (AND) and || (OR) inside the test command, where they should be using -a or -o:

if [ $# != 1 -o -z "$1" ] ; then ...             # YES: correct shell syntax
if [ $# != 1 || -z "$1" ] ; then ...             # NO: incorrect C language syntax

The error messages for this incorrect use look like this:

$ if [ $# != 1 || -z "$1" ] ; then echo hi ; fi  # WRONG SYNTAX
[: missing `]'
bash: -z: command not found

The Bourne shell || and && operators separate shell commands in a manner similar to the semicolon ;. You cannot use them inside test expressions.

Use -a and -o to separate Boolean clauses to the test command:

if [ $# != 1 -o -z "$1" ] ; then echo hi ; fi      # RIGHT

Digression (optional reading):

You can use the shell || and && command separators between individual test commands if you make sure each test command is complete:

if [ $# != 1 ] || [ -z "$1" ] ; then echo hi ; fi  # valid but inefficient

Above, the || separates two different and complete test command executions. Rather than using the test command twice, you can simply join them into one using the correct test Boolean operator:

if [ $# != 1 -o -z "$1" ] ; then echo hi ; fi      # RIGHT

Less code is better code.

12 Don’t mix comparing strings and comparing numbers in test

The test helper command has six ways to compare numbers and two ways to compare strings. Don’t mix them up. In particular, don’t use the numeric operators to try to compare strings; the error message isn’t very obvious:

$ if [ "$1" -eq "" ] ; then echo "Empty string" ; fi
sh: [: Illegal number:

The string comparison operators are = and !=, not -eq and -ne.
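
The two kinds of comparison can also disagree about the same pair of values. In this sketch (the variable names are made up) the values are equal as integers but different as strings:

```shell
x="01"
y="1"

# Numeric comparison: both strings are read as the integer 1.
if [ "$x" -eq "$y" ] ; then numeric="equal" ; else numeric="different" ; fi

# String comparison: the two-character string 01 is not the string 1.
if [ "$x" = "$y" ] ; then string="equal" ; else string="different" ; fi

echo "as numbers: $numeric"     # prints: as numbers: equal
echo "as strings: $string"      # prints: as strings: different
```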

13 Opposites and false opposites in test

Boolean logic has some subtle consequences when applied to the operations performed by the test helper command.

13.1 True Boolean opposites: -n and -z, = and !=, -eq and -ne

The logical opposite of the test operator -n (is not an empty string) is -z (is an empty string), just as the opposite of = (string equality) is != (string inequality), and the opposite of -eq (integer equality) is -ne (integer not equal). These are all correct opposites.

13.2 Subtle Boolean opposites: -lt and -ge

The logical opposite of the test operator -lt (less than) is not -gt (greater than), it is -ge (greater than or equal to). (If you are not younger than your sister, you are either older or the same age.)

The opposite of the test operator -gt is not -lt, it is -le.
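
The equal case is where the false opposite shows up. A small sketch of the sister example, with made-up variable names and both ages set to the same value:

```shell
me=20
sister=20

# "not less than" and "greater than or equal" agree on every input.
if [ ! "$me" -lt "$sister" ] ; then not_lt="true" ; else not_lt="false" ; fi
if [ "$me" -ge "$sister" ]   ; then ge="true"     ; else ge="false"     ; fi

# "greater than" gives a different answer when the two values are equal.
if [ "$me" -gt "$sister" ]   ; then gt="true"     ; else gt="false"     ; fi

echo "! -lt: $not_lt   -ge: $ge   -gt: $gt"
# prints: ! -lt: true   -ge: true   -gt: false
```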

13.3 Files are not “not directories”

The test operators -f and -d are not opposites. If a pathname is not a file, it may or may not be a directory. It could be a directory or any number of other special file types under Unix/Linux. (/dev/null is a common example of a pathname that is not a directory or a plain file.)

You cannot replace the test ! -f with -d or vice-versa.
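
The /dev/null example can be checked directly. This sketch assumes a typical Unix/Linux system where /dev/null exists as a special device file; the variable names are made up:

```shell
path=/dev/null

if [ -e "$path" ] ; then exists="yes"  ; else exists="no"  ; fi
if [ -f "$path" ] ; then is_file="yes" ; else is_file="no" ; fi
if [ -d "$path" ] ; then is_dir="yes"  ; else is_dir="no"  ; fi

# On a typical system the pathname exists, yet both -f and -d fail.
echo "$path: exists=$exists plain-file=$is_file directory=$is_dir"
```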

13.4 Negating/inverting test pathname operators, e.g. ! -r

The test pathname operators all return success (zero) only if the pathname is accessible (all the directories can be traversed) AND the pathname exists AND if it has the given pathname property. This means that the negation/inversion of a pathname operation has to include the possibility that the pathname does not exist or that it can’t be accessed:

if [ -r file ] ; then ...    # succeed if pathname is accessible and readable
if [ ! -r file ] ; then ...  # succeed if pathname inaccessible, non-existent, or not readable

Inverting the status of most of the pathname operators means that the resulting test might succeed either because the pathname can’t be reached, OR the pathname doesn’t exist, OR because the pathname exists but fails the test. You need to apply more programming logic if you want to know that a pathname actually exists but is not, for example, readable:

if [ -e pathname -a ! -r pathname ] ; then ...  # if path exists AND path is *not* readable

Remember that inverting a pathname test may mean the inverted test succeeds because the pathname is not accessible or does not exist!

The opposite of “pathname is readable” is “pathname is not accessible, OR pathname does not exist, OR pathname is not readable”.

14 The multiple causes of failure of test pathname tests

If a test pathname operator (e.g. -r, -w, -x, -f, -d, -s, -e) succeeds, you also know that you have permission to traverse all the directories leading up to it and that the pathname actually exists.

If a test pathname operator fails, it may also fail because you have no permission to search one of the directories in the pathname, or because the pathname simply doesn’t exist. Without first testing if you can access the pathname and that it actually exists, the following error message is misleading:

if [ ! -r "$path" ] ; then
   echo 1>&2 "$0: '$path' is not readable"   # POOR ERROR MESSAGE
fi

While it is true that the pathname is not readable, the above error message is incomplete. You might not have permission to traverse all the directories in its pathname, or, the pathname might not even exist. Saying the overall pathname is not readable is true, but it is only part of the truth. A more accurate error message would be:

if [ ! -r "$path" ] ; then
   echo 1>&2 "$0: '$path' is inaccessible, missing, or not readable"
fi

If you want to be more specific in your error message about why the pathname is not readable, you need code to test for existence first:

if [ ! -e "$path" ] ; then
   echo 1>&2 "$0: '$path' does not exist or is not accessible by you"
else
   # the pathname exists and is accessible; test readability:
   if [ ! -r "$path" ] ; then
      echo 1>&2 "$0: '$path' exists but is not readable by you"
   fi
fi

The test for readability is now done only if the pathname exists and is accessible; if the test for readability fails, you know the (existing, accessible) pathname item is truly not readable. The error message is more accurate now.

Any time one of the test pathname operator tests fails, be accurate in your error message. State whether the failure is due to a missing or inaccessible pathname, or due to a failure of the actual test being performed on the (existing, accessible) pathname.

15 Multiple test expressions cloud error message

Be careful in if statements when testing multiple conditions at the same time that you do not make the failure error message unhelpful:

if [ $x -gt 0 -a -f "$file" -a $y -lt 27 -a -n "$string" ] ; then
   ... do something useful ...
else
   echo 1>&2 "$0: Error: ... what do you say here ??? ..."
fi

The ??? error message above would have to say what failed, and there are so many possibilities for failure that the message becomes unreadable. The error would have to read like this: Error: $x is <= 0 or $file is inaccessible, does not exist, or is not a file, or $y is >= 27, or '$string' is a null string. Which failure was it? Such a complex error message is not helpful to the users of your scripts!

Use separate tests and separate error messages for each test condition; don’t bunch them together using Boolean -a or -o operators:

# Split the huge condition into more readable error messages.
# Test each condition separately and exit if any condition fails.
#
if [ $x -le 0 ] ; then
   echo 1>&2 "$0: Error: x value $x is <= 0"
   exit 1
fi
if [ ! -f "$file" ] ; then
   echo 1>&2 "$0: Error: path '$file' is inaccessible, does not exist, or is not a file"
   exit 1
fi
if [ $y -ge 27 ] ; then
   echo 1>&2 "$0: Error: y value $y is >= 27"
   exit 1
fi
if [ -z "$string" ] ; then
   echo 1>&2 "$0: Error: string value '$string' is a null string"
   exit 1
fi
... all tests passed; now do something useful ...

16 Use Less Code

Less code is better code.

Consider this correct but amateur shell script code:

fgrep "foo" /etc/passwd >/dev/null
if [ $? -eq 0 ] ; then
    echo "I found foo in the password file"
fi

The programmer forgot that the if statement can directly test the return code of the command it executes. Calling up the test command to examine the shell variable for the return code of the previous command is superfluous. The “less code” version of the above amateur code is:

if fgrep "foo" /etc/passwd >/dev/null ; then
    echo "I found foo in the password file"
fi

A real pro might have read the manual page for fgrep and knows that fgrep has a --quiet (-q) option to suppress output, so the pro version becomes:

if fgrep -q "foo" /etc/passwd ; then
    echo "I found foo in the password file"
fi

Don’t write more code than you need to. Less code is better code.

Author: 
| Ian! D. Allen, BA, MMath  -  idallen@idallen.ca  -  Ottawa, Ontario, Canada
| Home Page: http://idallen.com/   Contact Improv: http://contactimprov.ca/
| College professor (Free/Libre GNU+Linux) at: http://teaching.idallen.com/
| Defend digital freedom:  http://eff.org/  and have fun:  http://fools.ca/
