% Shell Script Control Structures -- `if`, `then`, `else`, `elif`, `test`, `[...]`, `shift`, `while`, `do`, `done`, `case`, `esac` % Ian! D. Allen -- -- [www.idallen.com] % Fall 2015 - September to December 2015 - Updated 2019-04-09 02:13 EDT - [Course Home Page] - [Course Outline] - [All Weeks] - [Plain Text] Control Structures / Control Statements ======================================= **Control structures** (also called control statements) are lines you can place in a shell script to change the order that the shell executes the command lines in the shell script. This file explains how these statements might be useful. Reference: **Chapter 7. Conditional statements** Humans see and act on error messages ------------------------------------ Shell scripts are files of command lines that by default execute one-by-one from the first line of the file to the last line. If a human were to run the lines in a shell script by typing them into a shell manually, s/he might notice that a line failed, and thus would not proceed to do the other lines in the script. Let's use this three-line example shell script called `example.sh`: $ cat example.sh mkdir foo cd foo date >date.txt If these lines were typed by a human, it might look like this: $ mkdir foo mkdir: cannot create directory ‘foo’: File exists At this point, the human would stop. The human would not go on to type `cd foo` or any of the following commands, because the human notices the error of the failed command and does not proceed. > The `mkdir` command itself quietly returns a non-zero integer [Command Exit > Status] when it fails, though you as a human don't care about that because > you can't normally see it and you can read the error message. 
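You can make that normally invisible exit status visible at the command line by echoing `$?` right after the command. A minimal sketch, run in a throw-away temporary directory (an assumption, so the first `mkdir` can succeed):

```shell
#!/bin/sh
# Sketch: making the hidden exit status of mkdir visible with $?.
# Runs in a scratch directory created by mktemp (assumption for safety).
cd "$(mktemp -d)" || exit 1

mkdir foo                   # first attempt succeeds
echo "first mkdir exit status: $?"

mkdir foo 2>/dev/null       # second attempt fails: foo already exists
echo "second mkdir exit status: $?"
```

On GNU/Linux the failed `mkdir` typically exits with status 1, but scripts should only rely on zero versus non-zero, not on the exact failure value.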
Shell programs don't see or act on error messages ------------------------------------------------- A non-human shell program (such as `/bin/sh` or `/bin/bash`) reading the same three-line shell script would run all the commands in the script one after the other, even though the `mkdir` command failed on line 1. The shell program is not a human and it doesn't see or care about the error. The `cd` command on line 2 of the script would also fail, and then the `date` output would be written into `date.txt` in the current (wrong) directory, not into the new `foo` directory: $ sh -u example.sh mkdir: cannot create directory ‘foo’: File exists example.sh: 2: cd: can't cd to foo $ ls -l -rw-r--r-- 1 idallen idallen 0 Nov 19 15:25 foo -rw-r--r-- 1 idallen idallen 32 Nov 19 15:26 example.sh -rw-r--r-- 1 idallen idallen 29 Nov 19 15:26 date.txt Control Structures affect which commands execute ------------------------------------------------ **Control structures** (also called flow control statements) are lines you can place in shell scripts to change the order that the shell executes the command lines, so that the shell can do things such as avoid some commands when things go wrong. > Other control structures let the shell loop (repeat) a list of commands > multiple times, allowing the script to operate on multiple command line > arguments or until some exit status is satisfied. More on loops later. Adding an `if` statement to check the return code ------------------------------------------------- Here is the same script as above, rewritten with an added `if` control structure that checks the [Command Exit Status] of the `mkdir` and only does the other two commands if the `mkdir` was successful (has an exit status of zero): $ cat example.sh if mkdir foo ; then cd foo date >date.txt fi echo "This script is done." The `if` control statement in the script file runs the `mkdir` command and checks its return code.
If the return code is good (zero), the shell runs the two indented statements between the `then` and the closing `fi` that ends the `if` statement. If the exit status is bad (non-zero), the shell skips the two statements in the file by moving to the `fi` line and does not run them. The `if` statement automatically checks the return code of the `mkdir` command without printing the code on your screen. The `if` statement doesn't need to print the return status; it simply tests it and acts on it. (You can use `echo $?` to see a previous return code.) If we run this new version of the script, the shell does not execute the two commands inside the `if` statement if the `if` statement detects that the `mkdir` fails (with a non-zero exit code): $ sh -u example.sh mkdir: cannot create directory ‘foo’: File exists This script is done. $ ls -l -rw-r--r-- 1 idallen idallen 0 Nov 19 15:25 foo -rw-r--r-- 1 idallen idallen 81 Nov 19 15:36 example.sh > The shell does not print the return status of the commands it executes. If > you want to actually see the return status of a command, you need to have > the shell `echo` the value of the `$?` variable right after running the > command. (Remember that using `echo` this way will change the exit status! > Save the status if you need it after the `echo`.) Branch Conditional Shell Control Structures -- `if`, `then`, `else`, `fi` ========================================================================= Unix/Linux shells are designed to find and run commands. This means that the programming control structures of shells look at the [Command Exit Status] values of running commands. (Most other programming language control structures use mathematical expressions, not running commands.) A basic shell control structure is a **branch** statement that chooses which list of commands to run based on the success/failure of a command return code. The simple `if` ... `then` ... 
`fi` branch statement (no `else`) ---------------------------------------------------------------- The simplest control structure is the `if` statement that checks the return status of a command (or a list of commands) and only runs another command (or list of commands) if the return status of the original command is successful (zero): if testing_command_list ; then zero_command_list fi The *testing_command_list* is executed, and the return status is tested, then: - If the exit status is zero (the command succeeded), the commands in the *zero_command_list* are executed. - If the exit status is not zero (the command failed), the *zero_command_list* is not executed and the shell continues after `fi` with the rest of the shell script, if any. - Either command list can be multiple commands separated by newlines, semicolons, or logical `&&` or `||` operators. If the *testing_command_list* is multiple commands, only the return status of the *last* command is tested. Any list may include commands connected by pipelines. if false ; false ; false ; false ; true ; then echo "This is true" fi > After the shell keyword `if` comes a *testing_command_list*, not an > arithmetic or logical expression as would be found in most programming > languages. The shell executes commands; it does not do arithmetic. Here are some examples of what these simple `if` statements might look like inside shell scripts: #!/bin/sh -u if mkdir foo ; then cd foo date >date.txt fi #!/bin/sh -u if fgrep "$1" /etc/passwd ; then # search for first script argument echo "I found $1 in file /etc/passwd" fi #!/bin/sh -u if who | fgrep "idallen" ; then echo "idallen is online" echo "Hello Ian" | write idallen fi Things to notice about the syntax of this control structure: - The structure starts with `if` and ends with `fi` on a line by itself. The `fi` is `if` spelled backwards. - There is a semicolon separating the end of the *testing_command_list* from the keyword `then`.
- The commands of the *zero_command_list* are indented right several spaces (or one tab stop). - The `fi` keyword is lined up directly under the `if` keyword. - The *testing_command_list* may be a pipeline or a list of several commands. Only the exit status of the *last* command in a pipeline or list is checked by the shell. Always write `if` statements using the above form in this course. Negating/inverting a command exit status using exclamation mark `!` ------------------------------------------------------------------- Any single command's return status may be logically negated/inverted (turn success to failure or failure to success) by preceding the command name with an exclamation mark: $ cd $ echo $? 0 $ ! cd $ echo $? 1 $ rm nosuchfile rm: cannot remove ‘nosuchfile’: No such file or directory $ echo $? 1 $ ! rm nosuchfile rm: cannot remove ‘nosuchfile’: No such file or directory $ echo $? 0 Inverting the exit status of a command is useful in an `if` statement when you want to execute some commands when a command *fails*, not when it *succeeds*. The syntax is to precede the command name with an exclamation mark to invert the return status: if ! single_command ; then nonzero_command_list # list executes on command FAILURE, not SUCCESS fi For example, the same `mkdir` script used above could be rewritten as shown below, so that the indented commands in the `if` statement execute when the `mkdir` command *fails*, not when it *succeeds*: if ! mkdir foo ; then # note use of ! to invert mkdir exit status echo "$0: mkdir foo failed" exit 1 fi cd foo date >date.txt The leading exclamation mark `!` above turns a failure exit status from `mkdir` into a success status (and vice-versa) for the `if` statement. The leading `!` means that when `mkdir` *fails*, the non-zero exit status is inverted to a zero exit status and the *nonzero_command_list* executes and the script prints a failure message and exits. 
Conversely, the leading `!` means that when `mkdir` *succeeds*, the zero exit status is inverted to a non-zero exit status and the *nonzero_command_list* is *not* executed. The script does not exit and it continues after the `fi` line with the `cd` and `date` commands. ### Inverting an exit status hides the exit status value Inverting the exit status of a command that has different types of non-zero exit statuses (such as the `grep` and `fgrep` commands) will hide the difference between the exit statuses -- all types of non-zero exit status will be inverted to the same success exit status (zero): $ fgrep "nosuchstring" /etc/passwd $ echo $? 1 # exit status 1 means "string not found" $ fgrep "nosuchstring" nosuchfile fgrep: nosuchfile: No such file or directory $ echo $? 2 # exit status 2 means "error in pathname" $ ! fgrep "nosuchstring" /etc/passwd $ echo $? 0 # exit status 1 inverts to zero $ ! fgrep "nosuchstring" nosuchfile fgrep: nosuchfile: No such file or directory $ echo $? 0 # exit status 2 also inverts to zero Do not invert an exit status if you need to use it later or if you need to know the difference between different exit statuses: if ! fgrep "nosuchstring" nosuchfile ; then echo "fgrep failed with exit status $?" # WRONG: PRINTS ZERO EXIT STATUS! fi The above message always prints "exit status 0" on command failure, since the leading `!` always logically negates/inverts a non-zero exit code to a zero exit code that is then placed in the `$?` variable. The two-part `if` ... `then` ... `else` ... `fi` branch statement ----------------------------------------------------------------- We can include an `else` clause inside the `if` statement that will contain commands to run if the *testing_command_list* fails (has a non-zero exit status). 
This gives the shell a choice of which commands to run: if testing_command_list ; then zero_command_list else nonzero_command_list fi The *testing_command_list* is executed, and the return status is tested, then: - If the exit status is zero (the command succeeded), only the commands in the *zero_command_list* are executed. The shell skips over (does not run) the commands in the *nonzero_command_list* (after the `else` keyword). - If the exit status is not zero (the command failed), only the commands in the *nonzero_command_list* are executed. The shell skips over (does not run) the commands in the *zero_command_list* (before the `else` keyword). - Any of the command lists can be multiple commands separated by newlines, semicolons, or logical `&&` or `||` operators. If the *testing_command_list* is multiple commands, only the return status of the *last* command is tested. Any list may include commands connected by pipelines. > After the shell keyword `if` comes a *testing_command_list*, not an > arithmetic or logical expression as would be found in most programming > languages. If the exit status of the command list is zero (success), the > **zero** branch of the `if` is taken, otherwise a non-zero exit status > causes the **non-zero** branch to be taken. There are always exactly two > branches: zero, and non-zero, and only the commands in one of the branches > will execute, never both. Here are some examples of what these `if-else` statements might look like inside shell scripts: #!/bin/sh -u if mkdir foo ; then cd foo date >date.txt else echo 1>&2 "$0: Cannot create the foo directory; nothing done." exit 1 # exit the script non-zero (failure) fi echo "This script is done."
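Here is another sketch in the same `if-else` form, guarding a `cd` command (the `/tmp` directory is just an illustration):

```shell
#!/bin/sh -u
# Sketch: only announce success if the cd actually worked.
if cd /tmp ; then
    echo "Now working in /tmp"
else
    echo 1>&2 "$0: Cannot change directory to /tmp"
    exit 1                  # exit the script non-zero (failure)
fi
```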
#!/bin/sh -u if fgrep "$1" /etc/passwd ; then # search for first script argument state='found' # set variable indicating success else state='did not find' # set variable indicating failure fi echo "I $state the text '$1' in the file /etc/passwd" # use the variable Things to notice about the syntax of this control structure: - The structure starts with `if` and ends with `fi` on a line by itself. The `fi` is `if` spelled backwards. - There is a semicolon separating the end of the *testing_command_list* from the keyword `then`. - The `else` keyword is on a line by itself. It separates the *zero_command_list* from the *nonzero_command_list*. - The `fi` keyword is lined up directly under the `else` keyword that is lined up directly under the `if` keyword. - The commands of the *zero_command_list* and the *nonzero_command_list* are all indented right several spaces (or one tab stop) with respect to the `if`, `else`, and `fi` keywords. Always write `if-else` statements using the above form in this course. Using `if` ... `else` instead of `!` to preserve the exit status ---------------------------------------------------------------- If you want to know the exact non-zero exit status of a command, you can't use `!` in front of the command to negate/invert the status to zero: if ! fgrep "nosuchstring" nosuchfile ; then echo "fgrep failed with exit status $?" # WRONG: ALWAYS ZERO EXIT STATUS! fi You can use an `if-else` syntax to preserve the exit code for use in `$?`: if fgrep "nosuchstring" nosuchfile ; then : # do nothing on success else echo "fgrep failed with exit status $?" # show correct $? exit status exit 1 fi The shell built-in command `:` (which behaves like the `true` command) does nothing and always succeeds. Nested `if` statements: `if` statements inside other `if` statements -------------------------------------------------------------------- Often a shell script needs to test the return code of a command used in one of the branches of an `if` statement.
For example: #!/bin/sh -u # First try: search for first argument in passwd file # Second try: search for first argument in group file # if fgrep "$1" /etc/passwd ; then echo "I found '$1' in file /etc/passwd" else if fgrep "$1" /etc/group ; then echo "I found '$1' in file /etc/group" else echo "I did not find '$1' in /etc/passwd or /etc/group" echo "Please try again" fi fi echo "This script is done." A *nested* `if` statement is an `if` or `if-else` statement that is contained inside one of the two branches of an outer `if-else` statement, as shown in the example above. We inserted another complete `if-else` statement in the *nonzero_command_list* branch of the outer `if` statement. The second `if-else` statement searches for the first command line argument in the `/etc/group` file, but it only does the search if the argument was *not* found in the `/etc/passwd` file. We can nest another `if/else` statement into the script: #!/bin/sh -u # First try: search for first argument in passwd file # Second try: search for first argument in group file # Third try: search for first argument in networks file # if fgrep "$1" /etc/passwd ; then echo "I found '$1' in file /etc/passwd" else if fgrep "$1" /etc/group ; then echo "I found '$1' in file /etc/group" else if fgrep "$1" /etc/networks ; then echo "I found '$1' in file /etc/networks" else echo "I did not find '$1' in passwd, group, or networks" echo "Please try again" fi fi fi echo "This script is done." You can nest control statements as much as you like, though deeply nested structures can be hard to read and understand. Keep things simple! Using correct indentation in shell scripts ========================================== Inside shell scripts, pay careful attention to the indentation that makes statements easier for humans to read. The shell itself doesn't care about indentation, but humans do when reading the scripts.
At the shell command line, you can dispense with the indentation and type an `if/else` statement all on one line, if you like: $ if mkdir foo ; then echo OK ; else echo BAD ; fi Don't do the above one-line `if` statements inside shell scripts! Space out the statements using proper indentation inside scripts, and don't put multiple commands on the same line: #!/bin/sh -u # Using correct indentation is essential inside shell scripts if mkdir foo ; then echo OK else echo BAD fi Use comment lines to explain what your script is doing. The `test` helper program -- file tests, string tests, and integer expressions ============================================================================== Despite the command-oriented nature of the Unix/Linux shell, people often want shell scripts to make conditional decisions based on things that do not directly involve running commands: - The properties of file system objects, e.g. testing whether a path is readable or writable - String comparisons, e.g. testing whether a command-line argument is equal to `--help` - Numeric comparisons, e.g. testing whether the number of arguments is equal to `2` Since shell `if` statements can only act on the exit status of a command (or command list), we need a helper command to do the comparison work and set an exit status if we want to do any of the above three kinds of tests. In this course, we will follow the traditional shell syntax for doing the above three types of tests using a helper command named `test` to do the tests for us. This traditional syntax works in all Bourne-style shells, going back to the original Bourne shell of the late 1970s. (Recent Bourne-style shells have added syntax to allow arithmetic expressions to directly follow the `if` keyword; but, this is not universal and not all shells can do this.) The `test` helper command accepts blank-separated arguments to be tested or compared. It sets its own return code and exits depending on whether or not the supplied test comparison succeeded (exit zero) or not (exit non-zero).
Simple example: testing if a pathname is a file with `-f` --------------------------------------------------------- Here is the `test` helper program being used to test to see if pathnames `/bin/bash` and then `/bin/nosuchfile` are existing files: $ test -f "/bin/bash" $ echo $? 0 # zero means success (is a file) $ test -f "/bin/nosuchfile" $ echo $? 1 # non-zero means failure (is not a file) The `test` command normally has *no output*, unless something goes wrong. Because `test` only sets an exit status, it doesn't print anything on your screen. It only sets its return code, based on the tests you ask it to do. You can use `echo` to make the invisible exit status of `test` visible at the command line, by displaying the command exit status left in the shell's `$?` variable, as in the examples above. You can use the `test` command in an `if` conditional control structure and check its return status just as you would with any other command: $ cat example.sh #!/bin/sh -u if test -f "$1" ; then echo "File '$1' is a file" ls -l "$1" else echo "'$1' is not a file" fi Running the above script: $ ./example.sh /bin/bash File '/bin/bash' is a file -rwxr-xr-x 1 root root 1037464 Aug 31 2015 /bin/bash $ ./example.sh /bin/nosuchfile '/bin/nosuchfile' is not a file In the example above, the `test` program silently tests to see if the pathname argument given as the first script argument is an existing file. If it is, `test` returns a successful (zero) exit status and the *success* half of the `if` statement is executed; otherwise, `test` returns failure (non-zero) and the *failure* half of the `if` statement is executed. The `test` program itself has no output; it only sets a return code. > **Critical section**: There is a small period of time between the testing > of the existence of the filename and the use of the filename, and someone > could, in theory, remove the file in between the testing and the using, > causing `ls` to give an error.
This is unlikely to happen, but it's not > impossible. You must be wary of these possible faults in your programming > logic. Some versions of the `test` command are built-in to the shell ------------------------------------------------------------- The `test` program has a large number of tests and operations it can do that are essential in control statements in shell scripts. Because it is used so often and is so important, many shells (including the `bash`, `dash`, and Ubuntu `sh` shells) have a built-in version of `test` that is documented in the manual page for the shell. The shell built-in `test` may be slightly different from the external one documented in the `test` manual page. See the manual page for your `sh` shell for the most accurate documentation. The `test` command used by `sh` scripts under modern versions of Ubuntu Linux is the one in the `dash` shell manual page: see `man sh`. See the `test` documentation for the full list of things this command can do. We will concentrate on a few key tests. Three main categories of tests: pathname, string, numeric --------------------------------------------------------- The `test` helper command has three main categories of things it can test: 1. test pathname properties, e.g. `test -f` 2. test and compare strings, e.g. `test -z` 3. compare integers, e.g. `test 4 -lt 9` Some key examples from each category follow. Using `test` to test pathname properties, e.g. `test -f` -------------------------------------------------------- You can test most any property or attribute pertaining to an object in the file system.
These tests are commonly used: test -e "pathname" # true if pathname exists (any kind of path) test -f "pathname" # true if pathname is a file test -d "pathname" # true if pathname is a directory test -r "pathname" # true if pathname is readable test -w "pathname" # true if pathname is writable test -x "pathname" # true if pathname is executable test -s "pathname" # true if pathname has size larger than zero All the pathname tests also fail if the pathname does not exist or is not accessible (because some directory prevents access to the pathname). Here is a script example that tests if its first argument is accessible and is a file: #!/bin/sh -u if test -f "$1" ; then echo "Pathname '$1' is an accessible file" else echo "Pathname '$1' is inaccessible, missing, or not a file" fi - Note that if a `test` pathname operator fails, it may have failed because you have no permission to search one of the directories in the pathname, or because the pathname simply doesn't exist. - Note that the opposite of "is a file" is not "is a directory", since there are more things in the file system than just files and directories. (Check out the type of the `/dev/null` pathname that is neither a file nor a directory.) See the manual page for other less common types of pathname tests. Using `test` to compare text strings, e.g. `test -z` ---------------------------------------------------- Since the shell `if` keyword must be followed by a command name, to compare strings we must execute the `test` helper command to do the string comparison for us.
The `test` helper command will do the comparison of the strings given on its command line and set its *return status* depending on the result of the comparison: test -z "$1" # true if length of argument $1 is zero (empty string) test -n "$1" # true if length of argument $1 is not zero test "$1" = "foo" # true if argument $1 and foo are the same strings test "$1" != "foo" # true if argument $1 and foo are not the same strings Here is an example that tests if the first script argument is the text string `--help`: #!/bin/sh -u if test "$1" = '--help' ; then echo "Usage: $0 [pathname]" exit 3 fi To compare if a string is empty, remember to put double quotes around variable expansions: #!/bin/sh -u if test -z "$1" ; then echo "The first argument is an empty string." fi if test "$1" = '' ; then echo "The first argument is an empty string." fi #!/bin/sh -u if test -n "$1" ; then echo "The first argument is NOT an empty string." fi if test "$1" != '' ; then echo "The first argument is NOT an empty string." fi **Always surround variables with double quotes!** Warning: a single argument to `test` is a non-empty string test --------------------------------------------------------------- **WARNING:** Any time the `test` program is given exactly one argument, it assumes the argument is preceded by `-n` and it tests the single argument for a non-empty string, so these are all equivalent: test "$1" # true if length of argument $1 is not zero test -n "$1" # true if length of argument $1 is not zero test "$1" != '' # true if length of argument $1 is not zero Because it is not an error to forget to use `-n`, this single-argument default to the `-n` string test is the cause of many shell programming mistakes. In the example below, the command does not do what it looks like it does at first glance; it does not test equality between two strings: test "abc"="def" # success because "abc=def" is one non-empty string argument! 
If you use the above `test` expression in a script, it is a one-argument `test -n` command, not a three-argument string comparison! test -n "abc=def" # success because "abc=def" is one non-empty string argument! The resulting exit status will always be zero (success) because the single argument `abc=def` given to `test` is not an empty string! The correct way to test string equality is to add blanks around the equals sign to make sure the `test` command sees *three* separate arguments: test "abc" = "def" # correct way to test if string1 is equal to string2 Make sure you surround all `test` operators with blanks on both sides so that the `test` command sees separate arguments, or else everything will look like a one-argument string test and will always succeed: test "abc"="def" # WRONG! missing blank - always succeeds (exit zero) test -f/dev/null # WRONG! missing blank - always succeeds (exit zero) test -s/dev/null # WRONG! missing blank - always succeeds (exit zero) test -z"abc" # WRONG! missing blank - always succeeds (exit zero) Using `test` to compare integer numbers, e.g. `-eq` --------------------------------------------------- Since the shell `if` keyword must be followed by a command name, to compare numbers we must execute the `test` helper command to do the numeric comparison for us. The `test` helper command will do the numeric comparison of numbers given on its command line and set its *return status* depending on the result of the comparison. The `test` helper program can compare integers using one of six comparison operators.
There are only six possible combinations for comparing two integer numbers *n1* and *n2*: test n1 -eq n2 # true if n1 and n2 are the same numeric value test n1 -ne n2 # true if n1 and n2 are not the same numeric value test n1 -lt n2 # true if n1 is less than n2 test n1 -le n2 # true if n1 is less than or equal to n2 test n1 -gt n2 # true if n1 is greater than n2 test n1 -ge n2 # true if n1 is greater than or equal to n2 > The `test` helper does not traditionally use the familiar mathematical > operators for these comparisons. The `test` helper uses `-gt` rather than > the more familiar `>` because the `>` character is used for redirection by > the shell and it would have to be quoted to be hidden from the shell and > seen by `test`. Example use of a numeric test on the number of command line arguments: if test $# -ne 2 ; then echo 1>&2 "$0: Expecting two arguments; found $# ($*)" exit 1 fi These six numeric `test` operators only work on integers, not empty strings, letters, or other non-digits. If you try to compare a non-integer with one of these six operators, `test` issues an error message. The error message is slightly different depending on which shell is reading your shell script: sh$ test 1 -eq a sh: 1: test: Illegal number: a bash$ test 1 -eq a bash: test: a: integer expression expected bash$ test 1 -eq "" bash: test: : integer expression expected > **Boolean Logic Warning:** Note that the opposite of the condition "less > than" (**\<**) is "greater than or equal to" (**\>=**), not "greater than" > (**\>**). Don't make this common logic mistake! > **Syntax Warning:** If you forget to use the `-gt` syntax and use something > familiar such as `test 1 > 2`, the shell redirection syntax will > create an empty file named `2` and then run `test 1` whose result (a > one-argument `test` command) will always be successful. Don't do that! 
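The syntax warning above can be demonstrated at the command line. A sketch run in a throw-away directory (an assumption, since the mistaken form silently creates a file named `2`):

```shell
#!/bin/sh
# Sketch: correct -gt versus the mistaken ">" (shell output redirection).
cd "$(mktemp -d)" || exit 1     # scratch directory (assumption for safety)

test 1 -gt 2
echo "test 1 -gt 2 exit status: $?"    # non-zero: 1 is not greater than 2

test 1 > 2
echo "mistaken form exit status: $?"   # zero! a one-argument test of "1"
ls -l 2                                # the shell created an empty file named 2
```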
Comparing strings is not the same as comparing integers ------------------------------------------------------- Comparing numbers and comparing strings do not always give the same results: test 0 = 00 # fails because s1 not equal to s2 test 0 -eq 00 # succeeds because zero equals zero test 0 = " 0 " # fails because s1 not equal to s2 test 0 -eq " 0 " # succeeds because zero equals zero You will often see people use a string equality test on two numbers in shell scripts, where the correct test should be the numeric equality test: if test $# = 0 ; then ... # should use -eq for numbers if test $# -eq 0 ; then ... # always use -eq for numbers In most cases, the results are the same, but be careful of cases where the strings may differ but the numbers may be equal. To be safe, always use one of the numeric comparison operators for integers. Combining `test` expressions using logical *AND* `-a` and *OR* `-o` ------------------------------------------------------------------- You can test more than one thing in a `test` helper expression by separating each test expression using `-o` for logical **OR** and `-a` for logical **AND**: test -f "path" -a -s "path" # true if path is a file AND path is not empty test -d "path" -o -f "path" # true if path is a directory OR path is a file Remember that each test expression on either side of the **AND** or the **OR** must be a complete and valid test expression: test "$1" = 'dog' -o "$1" = 'cat' # correct use of two expressions separated by -o test "$1" = 'dog' -o 'cat' # WRONG - second expression is always true The last line, above, is the **OR** of these two test expressions: test "$1" = 'dog' test 'cat' The last expression -- a single-argument `test` expression -- is always true because the argument is not an empty string, so the logical **OR** of the above two expressions is also always true, which is probably not what you want.
### *AND* has higher precedence than *OR* When both **AND** and **OR** are present, the `-a` **AND** operator has higher precedence than the `-o` **OR** operator. (You can think of **AND** as being similar to multiplication and **OR** as being similar to addition in precedence.) ### Parentheses and operator precedence in `test` expressions You can create more complex combinations of **AND** and **OR** logic by using parentheses for grouping, but you must hide the parentheses from the shell using quoting. **Example A:** The following logic succeeds if the path is not empty and is either a directory or a file: # Example A - note the quoting of the parentheses test -s "path" -a \( -d "path" -o -f "path" \) Without parentheses, the conjunction `-a` **AND** operator has higher precedence than (binds more tightly than) the disjunction `-o` **OR** operator. The Example A expression above, when used without parentheses, has a completely different meaning because logical `-a` **AND** binds more tightly than logical `-o` **OR**. **Example B:** The expressions below are equivalent (and are not the same as the Example A expression above): # Example B - same - logical -a binds before -o test -s "path" -a -d "path" -o -f "path" test \( -s "path" -a -d "path" \) -o -f "path" # same as above The above Example B means "succeed if path is not empty and is a directory, **OR** succeed if path is a file". Without parentheses, the file could be empty, which was not true in Example A that used parentheses to group the `-o` expressions together. You can always use parentheses to make sure your logic binds the way you intend, instead of relying on default precedence. Negating/Inverting `test` expressions using exclamation mark `!` ---------------------------------------------------------------- You can negate/invert the exit status of any test expression by inserting an exclamation point at the start of the test expression, e.g. test ! -e "pathname" # true if pathname does *not* exist test !
-w "pathname" # true if pathname does *not* exist or is *not* writable test ! -s "pathname" # true if pathname does *not* exist or has size zero The `!` negation operator applies only to the single closest expression, not to the whole expression: test ! -f path -o -d path # if path is not a file OR if path is a directory In programming terms, the negation `!` operator has highest precedence over all the other Boolean operators. If you want to negate an entire expression, you have to put the expression in parentheses: test ! \( -f path -o -d path \) # if path is not a file or directory You can always use parentheses to make sure your logic binds the way you intend, instead of relying on default precedence. Two ways of negating/Inverting `test` expressions ------------------------------------------------- Negating a `test` expression using the exclamation mark `!` may sometimes have the same effect as using an exclamation mark at the start of the whole *testing_command_list* to negate the whole `test` command (as shown [above]), but negating a single `test` expression applies just to that expression in the `test` command, not to the exit status of the whole `test` command. These next two ways of negating (inverting) the exit status of a single `test` expression are the same. The first negation is done by the `test` command on the single expression; the second negation is done by the shell on the exit status of the `test` command: if test ! -e "path" ; then ... # true if pathname does *not* exist if ! test -e "path" ; then ... # same as above; done by shell In shell programming, we prefer the first syntax, with the negation happening inside the `test` expression. We rarely negate the entire `test` command (as in the second example, above). Use the first syntax. 
### Negating compound `test` expressions

These complex negations below are not equivalent, since the second one
negates the entire `test` exit code (both expressions) and not just the
first expression as in the first line:

    if test ! -e "path" -o -d "path" ; then ...   # only one expression is negated
    if ! test -e "path" -o -d "path" ; then ...   # WRONG! NOT THE SAME !

The second statement above is technically a negation of a disjunction
(**OR**), so we need to call upon De Morgan's laws to see what it really
means:

### De Morgan's Laws

If you know [**De Morgan's laws**], then you know that a negation of a
disjunction (**OR**) is the same as a conjunction (**AND**) of the negations
(and vice-versa), so these two lines are equivalent:

    if ! test -e "path" -o -d "path" ; then ...
    if test ! -e "path" -a ! -d "path" ; then ...

The De Morgan logic transform of the first statement above into the second
statement above looks like this, where `A` is `-e "path"` and `B` is
`-d "path"`:

    if not test (A or B) ; then
    if test (not A and not B) ; then   # De Morgan: !(A or B) --> (!A and !B)

These are also equivalent statements, with the first one being simpler and
easier to understand:

    if test ! -e "path" -o -d "path" ; then ...   # only one expression is negated
    if ! test -e "path" -a ! -d "path" ; then ...   # De Morgan

Shell programming prefers to put the negations inside the `test` helper
command, not in front of it. Keep things simple!

### Don't use negation `!` in front of `test` or `[`

In shell programming, we prefer to have the `test` command do the negations
inside its own expressions. We would rarely use a leading `!` on the exit
status of the `test` command, so we prefer the first statement in each pair
below:

    if test ! -e "path" ; then ...   # use this syntax (preferred)
    if ! test -e "path" ; then ...   # DO NOT USE THIS (bad form)
    if [ ! -e "path" ] ; then ...    # use this syntax (preferred)
    if ! [ -e "path" ] ; then ...
# DO NOT USE THIS (bad form) Using `test` with variables --------------------------- Scripts often use the `test` helper command to test the contents of variables, so one or both of the arguments is often a double-quoted variable to be expanded inside the script: if test "$1" = '/' ; then echo "Using ROOT directory" ; fi if test "$1" = "$2" ; then echo "Two arguments are identical" ; fi if test -r "$1" ; then echo "Argument $1 is a readable pathname" ; fi Here is an example script that tests the number of positional parameters (arguments) to the script via the `$#` variable: #!/bin/sh -u if test "$#" -eq 0 ; then echo "$0: The script has no arguments" fi if test "$#" -ge 1 ; then echo "$0: The first argument of $# is '$1'" fi if test "$#" -ge 2 ; then echo "$0: The second argument of $# is '$2'" fi if test "$#" -ge 3 ; then echo "$0: The script has three or more arguments: $#" fi Running the above script: $ ./example.sh one two three four ./example.sh: The first argument of 4 is 'one' ./example.sh: The second argument of 4 is 'two' ./example.sh: The script has three or more arguments: 4 Using `[...]` as a synonym for `test` *(syntactic sugar)* ========================================================= Someone in the Unix past decided that shell `if` and `while` control statements should look more like the statements found in programming languages. For example, the **Java** and **C** programming languages use parentheses around their conditional expressions: if ( x > 3 ) { /* programming languages such as Java or C */ Someone came up with the idea of making an alias for the `test` helper command that would be named `[` (left square bracket). The `test` command was rewritten so that, if it were called by the name `[`, it would ignore a final argument of `]` (right square bracket). 
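You can convince yourself that `[` really is a command name rather than punctuation by asking the shell about it. A minimal sketch, assuming a POSIX shell (the variable names are mine):

```shell
# "command -v" reports how the shell resolves a command name; it finds
# "[" just as it finds "test", because "[" is a command, not punctuation.
command -v '[' >/dev/null 2>&1 && bracket_is_command='yes' || bracket_is_command='no'

# The "[" command ignores its final "]" argument and otherwise
# behaves exactly like test:
[ 'dog' = 'dog' ] && bracket_says='equal' || bracket_says='not equal'
test 'dog' = 'dog' && test_says='equal' || test_says='not equal'

echo "[ is a command: $bracket_is_command"           # yes
echo "[ says: $bracket_says; test says: $test_says"  # both say equal
```

On many systems there is even a file `/usr/bin/[` alongside `/usr/bin/test`, though most shells use a faster built-in version of both.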
We could now replace this ugly `test` syntax: if test "$1" = 'yes' ; then echo "Argument $1 is 'yes'" fi with this more elegant bracket syntax, using `[` as an alias for `test`: if [ "$1" = 'yes' ] ; then echo "Argument $1 is 'yes'" fi This is still the `test` command, executing under the alias of the command name `[`, and ignoring the final argument `]`. Those square brackets look similar to the parentheses used in some programming languages; but, you must remember that they are *not* punctuation. Each one of those square brackets must be a separate blank-separated token to the shell, and that means both brackets must be surrounded by blanks on both sides (except that you don't actually need blanks before or after semicolons). Ever since that day, most shell scripts now use the square-bracket `[` form of `test` because it looks nicer. Students of the shell must remember that this square bracket `[` form is *not* punctuation; it is simply **syntactic sugar** that uses a command name alias for `test` that happens to be a square bracket. Use blanks around the brackets! > *Syntactic sugar* is a feature added to the syntax of a language that makes > it easier or more elegant for humans to use, but that does not increase the > power or range of things that can already be done. Using `[...]` instead of > the command name `test` is syntactic sugar. Example: Count attacks on SSH port ================================== Below is a fairly complete script to check how many times the SSH port was attacked on a particular date on the [Course Linux Server]. If no date is given on the command line, yesterday is assumed. The `test` helper program is used multiple times via its square-bracket alias syntax: #!/bin/sh -u # $0 [ date_string ] # Count the SSH attacks on the (optional) given date. # If no date given, count attacks on "yesterday". # The date_string is anything acceptable to the "date" command, e.g. # 'today', 'yesterday', 'mar 1', 'now + 2 weeks', etc. # -Ian! D. 
Allen - idallen@idallen.ca - www.idallen.com PATH=/bin:/usr/bin ; export PATH umask 022 # Check for too many arguments. if [ $# -gt 1 ] ; then echo 1>&2 "$0: Only expecting one optional date argument, found $# ($*)" echo 1>&2 "Usage: $0 'date_string'" exit 1 fi # Make sure we can read the SSH log file. AUTH=/var/log/auth.log if [ ! -r "$AUTH" ] ; then echo 1>&2 "$0: You are not allowed to read '$AUTH'" exit 1 fi # Make sure we can read the Kernel log file. KERN=/var/log/kern.log if [ ! -r "$KERN" ] ; then echo 1>&2 "$0: You are not allowed to read '$KERN'" exit 1 fi # If no arguments, default the date to yesterday. if [ $# -eq 0 ] ; then date=yesterday else date=$1 fi # Re-format the command line date to match the date format in the log files if ! logdate=$( date +"%b %e" --date="$date" ) ; then echo 1>&2 "$0: Could not understand date '$date' - nothing done" exit 1 fi echo "Checking for attacks on '$logdate'" # Extract and count lines on the given date that are attacking lines: authattacks=$( fgrep "$logdate " "$AUTH" | fgrep -c ": refused connect from " ) kernattacks=$( fgrep "$logdate " "$KERN" | fgrep -c " IDAhostsevil " ) attacks=$(( authattacks + kernattacks )) if [ "$attacks" -eq 0 ] ; then echo "No attacks recorded on '$logdate'" else echo "Attacks recorded on '$logdate': $attacks (auth.log=$authattacks kern.log=$kernattacks)" fi Running the above script: $ ./attacks.sh Checking for attacks on 'Mar 27' Attacks recorded on 'Mar 27': 1370 (auth.log=0 kern.log=1370) $ ./attacks.sh 'jan 1' Checking for attacks on 'Jan 1' Attacks recorded on 'Jan 1': 516 (auth.log=2 kern.log=514) $ ./attacks.sh 'crap' date: invalid date ‘crap’ ./attacks.sh: Could not understand date 'crap' - nothing done $ ./attacks.sh too many arguments ./attacks.sh: Only expecting one optional date argument, found 3 (too many arguments) Usage: ./attacks.sh 'date_string' The script isn't perfect, since it doesn't tell you if the date you supply isn't in the range of dates recorded in the log 
files. An out-of-range date simply produces no results from the logs: $ ./attacks.sh 'nov 1' # date in future Checking for attacks on 'Nov 1' No attacks recorded on 'Nov 1' With more script programming effort, we could check the date to make sure it was in the range of dates recorded in the log files. Other helper programs: `:`, `true` and `false` ============================================== The shell built-in commands `:` (colon) and `true` do nothing and always exit with a success (zero) return status. You can use either one as a place-holder in cases where you don't want to execute a real command, as in this example below when we want to echo the failure return status of a command but don't want to do anything if the command succeeds: if somecommand ; then : # do nothing on success else echo "somecommand: failed with exit status $?" # show exit status fi The value in `$?` would not be correct if we inverted the command exit status using `!`, as in this shorter but incorrect version: if ! somecommand ; then echo "somecommand: failed with exit status $?" # WRONG: ZERO EXIT STATUS! fi The above message always prints "status 0" on command failure, since the leading `!` always negates a non-zero exit code to a zero exit code that is then placed in the `$?` variable. The shell built-in command `false` does nothing and always exits with a failure (non-zero) exit status. It's the same as `! true` and I have no idea why anyone would use it, except as an example of a program that returns a non-zero exit code: $ false $ echo "Return code $?" 
Return code 1

Condensing nested `if`...`else` using `elif`
============================================

To improve script readability, a set of nested `if` statements can be
simplified using the `elif` keyword that combines `else` and `if` together:

Before:

    if [ "$1" -eq "0" ] ; then
        size='empty'
    else
        if [ "$1" -lt "10" ] ; then
            size='small'
        else
            if [ "$1" -lt "100" ] ; then
                size='medium'
            else
                size='large'
            fi
        fi
    fi
    echo "We classify '$1' as '$size'."

After combining every `else` with its following `if`:

    if [ "$1" -eq "0" ] ; then
        size='empty'
    elif [ "$1" -lt "10" ] ; then
        size='small'
    elif [ "$1" -lt "100" ] ; then
        size='medium'
    else
        size='large'
    fi
    echo "We classify '$1' as '$size'."

The above combined `elif` syntax has the same meaning as the nested `if`
statement above, but it is four lines shorter and the indentation is only
one level for the whole statement.

The multi-way `case` ... `esac` statement with GLOB patterns
============================================================

Often we want to see if a string (usually inside a variable) contains any
one of a list of different things. Doing this with `if` statements can be
tedious. Here is a tedious example:

    #!/bin/sh -u
    if [ "$1" = "dog" ] ; then
        kind='animal'
    elif [ "$1" = "cat" ] ; then
        kind='animal'
    elif [ "$1" = "goat" ] ; then
        kind='animal'
    elif [ "$1" = "pig" ] ; then
        kind='animal'
    elif [ "$1" = "apple" ] ; then
        kind='fruit'
    elif [ "$1" = "peach" ] ; then
        kind='fruit'
    elif [ "$1" = "plum" ] ; then
        kind='fruit'
    elif [ "$1" = "cherry" ] ; then
        kind='fruit'
    else
        kind='unknown'
    fi
    echo "We classify '$1' as '$kind'."

Running the above script:

    $ ./example.sh plum
    We classify 'plum' as 'fruit'.
    $ ./example.sh pig
    We classify 'pig' as 'animal'.

The above program works, but it's too long. We can do better.
The shell provides a simplified way of testing one string against multiple other strings using the `case`/`esac` statement that has this syntax: case "test-string" in patterns1 ) command_list1 ;; patterns2 ) command_list2 ;; patterns3 ) command_list3 ;; * ) # the "default" if nothing else matches command_list_default ;; esac - If unquoted, each *pattern* is a shell GLOB pattern to be matched against the *test-string* in order from top-to-bottom. The first match wins and the *command_list* closest to the matching pattern is executed. Any other subsequent matches are not executed. Only the one *command_list* is used. - It is not an error for no *pattern* to match. If no *pattern* matches, no *command_list* is executed and the `case` statement does nothing. - As when using GLOB patterns to match file names, each GLOB *pattern* must be a single unbroken word to the shell -- no spaces or word-breaking characters such as semicolon allowed. If you want to match spaces or other special characters in a GLOB pattern, quote them all, e.g. use the quoted pattern `"My Documents")` not `My Documents)` - The GLOB pattern `*` matches anything. If present in a `case` statement, it is always the last pattern in the `case` statement and always matches the *test-string* (if nothing else has matched first). Consider `*` as the **default** match if nothing else matches. - If a *command_list* is only one line, it is often placed adjacent to its associated pattern instead of on a separate line. See the example below. - Any *pattern* can actually be a list of patterns to match separated by **or** `|` characters, e.g. `'dog' | 'cat' | 'pig' )` - Unlike file system GLOB patterns, `case` statement GLOB patterns *do* match leading periods in the *test-string* (because the *test-string* could be any string, not just a file name). Below is the same example as before, eight lines shorter and much easier to read. 
None of the pattern matches are GLOB patterns in this example; they are all quoted fixed strings to be matched exactly against the first command line argument in the `$1` positional parameter: #!/bin/sh -u case "$1" in 'dog' ) kind='animal' ;; 'cat' ) kind='animal' ;; 'goat' ) kind='animal' ;; 'pig' ) kind='animal' ;; 'apple' ) kind='fruit' ;; 'peach' ) kind='fruit' ;; 'plum' ) kind='fruit' ;; 'cherry' ) kind='fruit' ;; * ) kind='unknown' ;; # the "default" if nothing else matches esac echo "We classify '$1' as '$kind'." Running the above script: $ ./example.sh plum We classify 'plum' as 'fruit'. $ ./example.sh pig We classify 'pig' as 'animal'. Since none of the patterns above contain any GLOB characters, we don't actually need to quote any of them, but quoting them does show that we are not using any GLOB matching here. Using `|` for multiple GLOB patterns in `case` statements --------------------------------------------------------- We can condense the script even more by using **or** `|` characters to put multiple GLOB patterns on the same line: #!/bin/sh -u case "$1" in 'dog' | 'cat' | 'goat' | 'pig' ) kind='animal' ;; 'apple' | 'peach' | 'plum' | 'cherry' ) kind='fruit' ;; * ) kind='unknown' ;; # the "default" match esac echo "We classify '$1' as '$kind'." This program is now about one-third the size of the equivalent program that used nested `if` statements to do the same thing. 
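The condensed `|` multi-pattern dispatch can be exercised directly by wrapping it in a small function. This is a sketch only; the function name `classify` and the argument words are mine, not part of the original script:

```shell
# Hypothetical wrapper around the case/esac logic shown above.
classify() {
    case "$1" in
        'dog' | 'cat' | 'goat' | 'pig' ) kind='animal' ;;
        'apple' | 'peach' | 'plum' | 'cherry' ) kind='fruit' ;;
        * ) kind='unknown' ;;   # the "default" match
    esac
}

classify plum ; plum_kind=$kind
classify pig  ; pig_kind=$kind
classify rock ; rock_kind=$kind
echo "plum: $plum_kind; pig: $pig_kind; rock: $rock_kind"
# plum: fruit; pig: animal; rock: unknown
```

Note that the first matching pattern wins and only one *command_list* runs; `rock` falls through to the `*` default.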
Example: Classify a pathname ---------------------------- Below is an example using GLOB patterns to identify the first command-line argument in the positional parameter variable `$1`: #!/bin/sh -u case "$1" in '' ) type='missing (empty)' ;; /* ) type='an Absolute Pathname' ;; */ ) type='a Relative Pathname ending in a slash' ;; */* ) type='a Relative Pathname in some directory' ;; *' '* ) type='a Relative Pathname with blank(s)' ;; * ) type='a Relative Pathname in the current directory' ;; # the "default" match esac echo "Pathname '$1' is $type" - Most cases are GLOB patterns, so the GLOB characters must not be hidden from the shell by quotes. Do not use quotes if you want GLOB patterns to work. - As with file name GLOB patterns, if you want a `case` statement GLOB pattern to match some text *anywhere* in the *test-string*, you need to put GLOB `*` characters at either end of the text: `*/*` - A trailing `*` in the pattern makes the text match only at the beginning of the *test-string*: `/*` - A leading `*` makes the text match only at the end: `*/` - A GLOB pattern must be a single shell *word*: If you want to match spaces or special characters in the *test-string*, quote them all in the GLOB pattern to hide them from the shell: `*' '*` Running the above script: $ ./example.sh '' Pathname '' is missing (empty) $ ./example.sh /etc/passwd Pathname '/etc/passwd' is an Absolute Pathname $ ./example.sh a/b/c/ Pathname 'a/b/c/' is a Relative Pathname ending in a slash $ ./example.sh foo/bar Pathname 'foo/bar' is a Relative Pathname in some directory $ ./example.sh "foo bar" Pathname 'foo bar' is a Relative Pathname with blank(s) $ ./example.sh foobar Pathname 'foobar' is a Relative Pathname in the current directory Example: Find the star/asterisk/`*` ----------------------------------- As when matching file names on a shell command line, GLOB pattern characters in `case` statements must be unquoted to behave as GLOB characters. 
Quoting hides the GLOB characters from the shell and makes them into
ordinary characters that must match the *test-string* exactly:

    #!/bin/sh -u
    case "$1" in
        '*' ) msg='a single asterisk (star)' ;;
        '*'* ) msg='an asterisk at the beginning of the argument' ;;
        *'*' ) msg='an asterisk at the end of the argument' ;;
        *'*'* ) msg='an asterisk in the middle of the argument' ;;
        * ) msg='an argument with no asterisk' ;;   # the "default" match
    esac
    echo "You entered $msg: $1"

Running the above script:

    $ ./example.sh '*'
    You entered a single asterisk (star): *
    $ ./example.sh '*foobar'
    You entered an asterisk at the beginning of the argument: *foobar
    $ ./example.sh 'foobar*'
    You entered an asterisk at the end of the argument: foobar*
    $ ./example.sh 'foo*bar'
    You entered an asterisk in the middle of the argument: foo*bar
    $ ./example.sh 'no star here'
    You entered an argument with no asterisk: no star here

Example: Matching numbers by counting digits
--------------------------------------------

Below is another example using more complex GLOB patterns to match the
number of digits inside a number, allowing a crude method for checking
number ranges:

    #!/bin/sh -u
    case "$1" in
        0 ) price='free' ;;                            # 0
        [1-9] ) price='cheap' ;;                       # 1...9
        [1-4][0-9] ) price='middle' ;;                 # 10...49
        [5-9][0-9] ) price='upper' ;;                  # 50...99
        [1-9][0-9][0-9] ) price='high' ;;              # 100...999
        [1-9][0-9][0-9][0-9] ) price='exorbitant' ;;   # 1000...9999
        * ) price='impossible' ;;                      # the "default" match
    esac
    echo "We classify '$1' as '$price'."

Running the above script:

    $ ./example.sh 37
    We classify '37' as 'middle'.
    $ ./example.sh 2348
    We classify '2348' as 'exorbitant'.
    $ ./example.sh crap
    We classify 'crap' as 'impossible'.

Using GLOB patterns to test numeric ranges has its limitations, and doesn't
always work as nicely as one would like, since we are comparing digits and
not evaluating and comparing numeric values:

    $ ./example.sh 037   # 037 is in range 10...49
    We classify '037' as 'impossible'.
$ ./example.sh 3.14   # 3.14 is in range 1...9
    We classify '3.14' as 'impossible'.

Shells find and run commands; they don't do mathematics very well.

Example: Validating an argument using a complemented character class
--------------------------------------------------------------------

Since GLOB patterns can also match characters that are *not* in a range by
adding `!` inside a character class, we can use GLOB patterns to detect
input that is, for example, not alphabetic:

    #!/bin/sh -u
    case "$1" in
        '' ) echo "Empty string" ;;
        *[![:alpha:]]* ) echo "Non-alphabetic: '$1'" ;;
        * ) echo "Alphabetic: '$1'" ;;
    esac

The above script has a `case` GLOB pattern that matches a non-alphabetic
character anywhere in the string, by using the complement (inverse) of a
POSIX character class named `[:alpha:]` that represents alphabetic
characters. If that pattern matches in the argument, we know the argument
has a non-alphabetic character in it somewhere:

    $ ./example.sh happy
    Alphabetic: 'happy'
    $ ./example.sh abc0def
    Non-alphabetic: 'abc0def'

You can use `case` statements and appropriate complemented POSIX character
classes to make sure that arguments contain only the types of characters you
want: letters, digits, spaces, printable characters, etc. For a list of
POSIX character classes, see [POSIX Character Classes].

### Detailed explanation of GLOB pattern `*[![:digit:]]*`

Below are steps to show you how you can understand what the complemented
POSIX character class means in this GLOB pattern: `*[![:digit:]]*`

1.  Match a digit `0` anywhere in the argument:

        case "$1" in
            *0* ) do something here ... ;;
        esac

2.  Match digits `0` or `1` anywhere in the argument:

        case "$1" in
            *[01]* ) do something here ... ;;
        esac

3.  Match digits `0` through `9` anywhere in the argument using a GLOB
    range (only valid for digits; don't use ranges for letters!):

        case "$1" in
            *[0-9]* ) do something here ... ;;
        esac

4.
Match digits `0` through `9` anywhere in the argument using a POSIX
    character class named `[:digit:]` that replaces the range `0-9`:

        case "$1" in
            *[[:digit:]]* ) do something here ... ;;
        esac

5.  Match any character that is *NOT* a digit by inserting `!` at the start
    of the character class:

        case "$1" in
            *[![:digit:]]* ) do something here ... ;;
        esac

Other useful POSIX class names you can use inside GLOB character classes:
`[:digit:] [:alpha:] [:alnum:] [:space:]`

For a list of POSIX character classes, see [POSIX Character Classes].

**Do not use letter ranges in character classes, e.g. `[a-z]`!**
Internationalization and localization features will cause incorrect matches.
Always use the POSIX class names for letter ranges inside character classes,
e.g. `[[:lower:]]` or `[[:upper:]]` or `[[:alpha:]]`

Loop Conditional Shell Control Structures -- `while`, `for`, `do`, `done`
=========================================================================

Reference: **Chapter 9. Repetitive tasks**

Unix/Linux shells are designed to find and run commands. This means that the
programming control structures of shells are based on the exit statuses of
running commands. A more complex shell control structure is a **while loop**
that repeats running a list of commands over and over based on the
success/failure of a command return code.

The `while` ... `do` ... `done` loop statement
----------------------------------------------

Recall the syntax of a simple `if` branching statement:

    if testing_command_list ; then
        zero_command_list
    fi

For example:

    if who | fgrep "$1" ; then
        echo "User '$1' is signed on"
    fi

The shell `while` loop is similar to a simple `if` statement, except that
the *zero_command_list* is executed by the shell over and over as long as
the *testing_command_list* succeeds.
Instead of `then` and `fi`, the *zero_command_list* of the `while` loop is
delimited by the new keywords `do` and `done`:

    while testing_command_list ; do
        zero_command_list
    done

For example:

    while who | fgrep "$1" ; do
        echo "User '$1' is still signed on"
        sleep 10
    done

As with the `if` statement, the *testing_command_list* is executed, and the
return status of the last command in the list is tested, then:

- While the exit status is zero (the command succeeded), all the commands in
  the *zero_command_list* are executed, over and over.
- If the exit status becomes non-zero (the testing command fails), the
  *zero_command_list* is not executed any more and the shell continues after
  `done` with the rest of the shell script, if any.
- Either command list can be multiple commands separated by newlines,
  semicolons, or logical `&&` or `||` operators. If the
  *testing_command_list* is multiple commands, only the return status of the
  *last* command is tested. Any list may include commands connected by
  pipelines.

If the *testing_command_list* never fails, the `while` loop will repeat the
*zero_command_list* over and over, forever. To avoid the loop repeating
forever, the *testing_command_list* should eventually return a non-zero
(failing) status.

Below are some examples of what these `while` loop statements might look
like inside shell scripts. Example use of an integer expression to add one
to a loop index variable to create 100 files with numbered names (without
error checking):

    #!/bin/sh -u
    # Create 100 files with names filename1.txt through filename100.txt
    # This version does no error checking.
    count=1
    while [ $count -le 100 ] ; do
        touch "filename$count.txt"
        count=$(( count + 1 ))
    done
    echo "Touched all the files."

Below is an improved version of the above script that checks for errors.
Note how the loop index starts at zero and is incremented before being used;
the `while` loop testing condition is modified to make sure the loop still
stops at file 100.
We keep a count of the number of errors:

    #!/bin/sh -u
    # Create 100 files with names filename1.txt through filename100.txt
    # This version does error checking.
    i=0
    errors=0
    while [ $i -lt 100 ] ; do
        i=$(( i + 1 ))
        name="filename$i.txt"
        if ! touch "$name" ; then
            echo 1>&2 "$0: Failed to touch '$name'"
            errors=$(( errors + 1 ))
        fi
    done
    echo "Number of names touched: $i errors: $errors"

Below is another example of a `while` loop using a shell pipeline as the
*testing_command_list*:

    #!/bin/sh -u
    # Loop while idallen is signed on, then exit.
    while who | fgrep "idallen" ; do
        echo "idallen is still online"
        sleep 60
    done
    echo "idallen just signed off"

As you can see in the example above, the *testing_command_list* being
executed can be a shell pipeline, just as in an `if` statement. Only the
exit status of the last command in a pipeline is used by the shell. As long
as the `fgrep` command succeeds in finding the string `idallen` in the
output of the `who` command (exit status zero), the loop will continue. When
the `fgrep` command does not find `idallen`, it returns a non-zero exit
status, the loop finishes, and the message `idallen just signed off` is
echoed.

Below is a similar script that works in the opposite way to the one above.
It uses the exclamation mark `!` negation operator to negate the return code
of the (last command in the) pipeline. It waits for `idallen` to sign on,
and loops waiting until he does. When he does sign on, the `fgrep` command
returns a success zero status that is inverted to failure by the exclamation
mark `!` and the script exits with the message `idallen just signed on`.

    #!/bin/sh -u
    # Loop while idallen is NOT signed on, then exit.
    while !
who | fgrep "idallen" ; do
        echo "idallen is not online yet"
        sleep 60
    done
    echo "idallen just signed on"

Below is a more general version of the same `while` loop script, using a
required command line argument as the *userid* to look for:

    #!/bin/sh -u
    # $0 userid
    # Loop until the userid signs on, then print a message and exit.
    #
    PATH=/bin:/usr/bin ; export PATH
    umask 022

    # Make sure we have exactly one command line argument.
    if [ $# -ne 1 ] ; then
        echo 1>&2 "$0: expecting one userid, found $# ($*)"
        echo 1>&2 "Usage: $0 userid"
        exit 1
    fi
    user="$1"

    # Loop while the $user is *NOT* signed on.
    while ! who | fgrep "$user" ; do
        echo "$0: $(date) $user is not online yet."
        sleep 60
    done
    echo "$0: $(date) $user just signed on."

The `for` ... `do` ... `done` loop statement
--------------------------------------------

Unlike the `while` loop that has to run a *testing_command_list* to know
when to continue the loop, the `for` loop does not execute any testing
command. It simply iterates over a fixed list of words, one at a time. There
are two kinds of `for` loops, one with an implicit list of words (the
command line arguments) and one with an explicit list of words:

    # implicit list of words comes from command line arguments
    for name do   # iterates over arguments $1 $2 $3 ...
        command_list
    done

    # explicit list of words is supplied before the semicolon
    for name in word1 word2 word3 ; do   # iterates over word1 word2 word3 ...
        command_list
    done

Using square brackets here to indicate optional text and `...` to indicate
repeated text, as in the SYNOPSIS section of `man` pages, we could write the
`for` syntax like this:

    for name [ in word... ; ] do
        command_list
    done

- The `name` is the name of a variable, called the **index** variable. The
  name is given here without a leading dollar sign!
- The optional `in` *word...* `;` is a list of words to iterate over, one at
  a time. The list must end with a semicolon.
- If the list of words is omitted, the command line argument positional parameters (`$1`, `$2`, etc.) are used for the list. The variable `$name` is set to the first *word* in the list and then the *command_list* is executed. Then `$name` is set to the second *word* and the *command_list* is executed again. This repeats for each word in the list until the list is exhausted. An example `for` loop with a list of three words to iterate over: #!/bin/sh -u for i in dog cow pig ; do echo "See the $i run!" done Running the above script: $ ./example.sh See the dog run! See the cow run! See the pig run! Another example without an explicit list of words uses the positional parameters (command line arguments `$1`, `$2`, etc.) as the list of words: #!/bin/sh -u for j do echo "See the $j run!" done Running the above script: $ ./example.sh man nose See the man run! See the nose run! The name of the index variable is arbitrary but should usually reflect the meaning of the items in the list of words: #!/bin/sh -u fruits= for newfruit in apple pear plum cherry apricot banana ; do fruits="$fruits $newfruit" done echo "So many:$fruits" Running the above script: $ ./example.sh So many: apple pear plum cherry apricot banana Often the shell script iterates over a list of file names on the command line: #!/bin/sh -u for file do if [ ! -r "$file" ] ; then echo "Cannot read '$file'" fi done Running the above script: $ ./example.sh /etc/* Cannot read '/etc/group-' Cannot read '/etc/gshadow' Cannot read '/etc/gshadow-' Cannot read '/etc/shadow' Cannot read '/etc/shadow-' Cannot read '/etc/sudoers' > Remember that the variable named at the start of a `for` loop does *not* > start with a dollar sign! You only use the dollar sign to expand the > variable inside the loop body. Exit a loop in the middle: `break` ---------------------------------- You can exit (break out of) a `while` or `for` loop in the middle, before the loop is finished, using the `break` statement. 
The `break` causes the shell to skip out of the loop to after the `done` statement:

    #!/bin/sh -u
    # Create 100 files with names filename1.txt through filename100.txt
    # This version does error checking and stops on any error.
    count=0
    while [ $count -lt 100 ] ; do
        next=$(( count + 1 ))
        name="filename$next.txt"
        if ! touch "$name" ; then
            echo 1>&2 "$0: Failed to touch '$name'"
            break
        fi
        count=$(( count + 1 ))    # could use count=$next
    done
    echo "Number of names touched successfully: $count"

> In the example above, note the use of two variables `$count` and `$next` so
> that when the loop exits the correct number of successfully touched files
> is printed. The `$count` variable is incremented only *after* a successful
> touch; the `$next` variable has to be incremented *before* the touch is
> attempted.

Another example:

    #!/bin/sh -u
    # Make backup copies of file names given as arguments.
    # Stop copying if any error occurs; don't ignore errors.
    count=0
    for file do
        if ! cp -p "$file" "$file.bak" ; then
            echo 1>&2 "$0: Failed to copy '$file' to '$file.bak'"
            break
        fi
        count=$(( count + 1 ))
    done
    echo "Number of files backed up: $count of $#"

Another example:

    #!/bin/sh
    # $0 [ filenames... ]
    # Create .bak copies of all the names.
    # Does not overwrite existing non-empty .bak copies.
    # Skips over missing or unreadable files.
    # Script terminates on copy error; does not continue.
    count=0
    status=0
    for name do
        if [ ! -e "$name" ] ; then
            echo "$0: '$name' is inaccessible or does not exist; skipped"
            status=1
        elif [ ! -f "$name" ] ; then
            echo "$0: '$name' is not a file; skipped"
            status=1
        elif [ ! -r "$name" ] ; then
            echo "$0: '$name' is not readable; skipped"
            status=1
        elif [ -s "$name.bak" ] ; then
            echo "$0: '$name.bak' already exists; skipped"
            status=1
        elif cp -p "$name" "$name.bak" ; then
            count=$(( count + 1 ))
            echo "$count. Backed up to $name.bak"
        else
            echo 1>&2 "$0: Could not copy '$name' to '$name.bak'"
            echo 1>&2 "$0: Script terminated"
            status=1
            break
        fi
    done
    echo "Copied: $count"
    exit "$status"

Return to the top of a loop: `continue`
---------------------------------------

You can skip directly back to the top of a `while` or `for` loop using the `continue` statement. The `continue` statement causes the shell to skip the rest of the statements in the loop and go directly back to the top of the loop. The top of a `while` loop is the *testing condition*; the top of a `for` loop selects the next word in the *word list*.

Below is the previous script rewritten to use `continue`. Each `if` statement is now separate, not linked to the previous one with `elif`. Only if we pass the first four `if` statements do we attempt the `cp`, and only if the `cp` fails do we print the error message and break out of the loop:

    #!/bin/sh
    # $0 [ filenames... ]
    # Create .bak copies of all the names.
    # Does not overwrite existing non-empty .bak copies.
    # Skips over missing or unreadable files.
    # Script terminates on copy error; does not continue.
    count=0
    status=0
    for name do
        if [ ! -e "$name" ] ; then
            echo "$0: '$name' is inaccessible or does not exist; skipped"
            status=1
            continue
        fi
        if [ ! -f "$name" ] ; then
            echo "$0: '$name' is not a file; skipped"
            status=1
            continue
        fi
        if [ ! -r "$name" ] ; then
            echo "$0: '$name' is not readable; skipped"
            status=1
            continue
        fi
        if [ -s "$name.bak" ] ; then
            echo "$0: '$name.bak' already exists; skipped"
            status=1
            continue
        fi
        if cp -p "$name" "$name.bak" ; then
            :
        else
            echo 1>&2 "$0: Could not copy '$name' to '$name.bak' (status $?)"
            echo 1>&2 "$0: Script terminated"
            status=1
            break
        fi
        count=$(( count + 1 ))
        echo "$count. Backed up to $name.bak"
    done
    echo "Copied: $count"
    exit "$status"

In the example below, we only use `touch` to touch a file (create a new file) if the file doesn't already exist. If the file already exists, we skip back to the top of the loop.
If any error occurs, we break out of the loop and end the script:

    #!/bin/sh -u
    # Create 100 files with names filename1.txt through filename100.txt
    # Don't touch any files that already exist.
    i=0
    new=0
    while [ $i -lt 100 ] ; do
        i=$(( i + 1 ))
        name="filename$i.txt"
        if [ -e "$name" ] ; then
            echo "Already created '$name'"
            continue
        fi
        if ! touch "$name" ; then
            echo 1>&2 "$0: Failed to create '$name'"
            break
        fi
        new=$(( new + 1 ))
    done
    echo "Number of new files touched: $new out of $i"

In the example below, we start a variable `count` at zero and then loop over all the arguments, counting how many are readable and adding one to the counter each time. When the loop finishes (all the arguments have been processed), we show the value of the counter:

    #!/bin/sh
    # $0 [ names... ]
    count=0
    for name do
        if [ ! -e "$name" ] ; then
            echo "$0: '$name' is inaccessible or does not exist"
            continue    # skip back to top of FOR loop
        fi
        if [ ! -f "$name" ] ; then
            echo "$0: '$name' is not a file"
            continue    # skip back to top of FOR loop
        fi
        if [ ! -r "$name" ] ; then
            echo "$0: '$name' is not readable"
            continue    # skip back to top of FOR loop
        fi
        count=$(( count + 1 ))    # add one to the counter
        echo "$count. A readable file: $name with lines: $(wc -l <"$name")"
    done
    echo "Number of readable files: $count"

> In the example above, note the use of a double-quoted variable `"$name"`
> used inside a command substitution inside a double-quoted string. The shell
> knows how to keep the two nested uses of double quotes separate.

Shifting positional parameters: `shift` so `$2` becomes `$1`
============================================================

Often we want a script to process any number of command line arguments. This means we don't know ahead of time how many arguments there will be. A nice way to do this is using the shell built-in `shift` command inside a script.

We already know that the command line arguments to a script are assigned to the positional parameter variables `$1`, `$2`, `$3`, etc.
The shell built-in `shift` command behaves as if the shell throws away the first command line argument and then re-assigns all the positional parameters using the (fewer) arguments that are left. This has the effect that the argument that used to be in `$2` is now in `$1` and what used to be in `$3` is now in `$2`, etc. All the positional parameters have *shifted* down by one:

    #!/bin/sh -u
    echo "'$1' is the first argument of $#: $*"
    shift
    echo "'$1' is the first argument of $#: $*"
    shift
    echo "'$1' is the first argument of $#: $*"
    shift
    echo "'$1' is the first argument of $#: $*"
    shift
    echo "'$1' is the first argument of $#: $*"

Running the above script:

    $ ./example.sh one two three four five six
    'one' is the first argument of 6: one two three four five six
    'two' is the first argument of 5: two three four five six
    'three' is the first argument of 4: three four five six
    'four' is the first argument of 3: four five six
    'five' is the first argument of 2: five six

Every time `shift` executes, the first command line argument disappears and all the other arguments move down one. We can use this to process a huge number of command line arguments by coding a shell **loop** statement. Let's look at a loop statement next.

This example loop uses the shell built-in `shift` command to shift all the command line arguments over and over until no arguments are left (until `$#` becomes zero):

    #!/bin/sh -u
    # Display all the command line arguments, no matter how many
    while [ $# -gt 0 ] ; do
        echo "'$1' is the first argument of $#: $*"
        shift
    done
    echo "All the arguments have been processed."

The `while` loop uses the `test` helper command to test that the number of command line arguments is greater than zero. When the test fails (the number of arguments is zero), the loop terminates and the script continues on to print its final exit message.

Running the above script:

    $ ./example.sh one two three four five six
    'one' is the first argument of 6: one two three four five six
    'two' is the first argument of 5: two three four five six
    'three' is the first argument of 4: three four five six
    'four' is the first argument of 3: four five six
    'five' is the first argument of 2: five six
    'six' is the first argument of 1: six
    All the arguments have been processed.

Shell functions
===============

Functions are named pieces of code that can be created inside the shell, passed parameters, and executed by the shell just as if they were commands with arguments. A shell function operates just like a miniature shell script:

    $ Foo () { echo "Hello $*" ; }
    $ Foo a b c
    Hello a b c

Like shell variables, shell functions are defined inside the shell itself and the definition is lost when the shell exits. (To have a function defined in all your interactive shells, put the definition in your `.bashrc` file.)

No `$PATH` lookup is needed to find and run a shell function, and functions are found before any `$PATH` search is done. If a shell function has the same name as a system command, the function is used, not the system command.

The formal syntax definition shows that the keyword `function` is optional when creating a function:

    [function] name () { list of commands }

Any parameters you pass to a function when calling it by name on the command line will become positional parameters inside that function (`$1`, `$2`, etc.), just as if the function were its own little shell script.

> The positional parameters (`$1` etc.) of a script are not available inside
> a function, because the function uses the positional parameter names to
> hold the arguments to the function. Inside a function, `$1` is the first
> argument to the function, not the first argument to the shell script
> containing the function.

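
A related point not shown above: a function also returns an exit status (the exit status of the last command it runs), so you can test a function directly in an `if` statement just like any other command. A minimal sketch, using a hypothetical `IsReadable` function name and assuming `/etc/passwd` exists and is readable on your system:

```shell
#!/bin/sh -u
# Hypothetical sketch: a function's exit status is the exit status of the
# last command it runs, so "if" can test a function just like a command.
IsReadable () {
    [ -r "$1" ]    # the test result becomes the function's exit status
}
if IsReadable /etc/passwd ; then    # assumes /etc/passwd is readable
    echo "/etc/passwd is readable"
else
    echo "/etc/passwd is not readable"
fi
```

This is why a function can be used anywhere a command can be used, including as the condition of an `if` or `while` statement.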
Below is an example showing that inside a function, `$1` is the first argument to the function, not the first argument to the shell script containing the function. (The backslash in `\$1` stops the shell from expanding that first `$1`, so it appears literally in the output.)

    #!/bin/sh -u
    Foo () {
        echo "The value of \$1 in the function is '$1'"
    }
    echo "The value of \$1 in the script is '$1'"
    Foo bar

Running the above script with the argument `Mom`:

    $ ./example.sh Mom
    The value of $1 in the script is 'Mom'
    The value of $1 in the function is 'bar'

Functions are most often defined and used inside shell scripts to hold in one place some code that you want to use over and over in your shell script. Without the function, you would have to repeat the code over and over, making the script longer and harder to maintain. Less code is better code!

Here is an example script where we centralize most of an `echo` statement and a sentence in some functions and only pass in to the function the two parameters that change:

    #!/bin/sh -u
    # Function to use the first two arguments in a sentence:
    English () {
        echo "You are a $1 human being. I $2 you."
    }
    # Function to use the first two arguments in a sentence as Yoda might say it:
    Yoda () {
        echo "Yoda say $1 human being you are. You I $2."
    }
    # Function to pass all its arguments to two other functions:
    Both () {
        English "$@"
        Yoda "$@"
    }
    Both nice adore
    Both melancholy "am indifferent to"
    Both terrible hate

Running the above script:

    $ ./example.sh
    You are a nice human being. I adore you.
    Yoda say nice human being you are. You I adore.
    You are a melancholy human being. I am indifferent to you.
    Yoda say melancholy human being you are. You I am indifferent to.
    You are a terrible human being. I hate you.
    Yoda say terrible human being you are. You I hate.

Using functions can make your shell scripts much smaller and easier to understand and maintain. Less code is better code!

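
The same centralizing idea applies to error and warning messages. As a small sketch (the `Warn` function name is hypothetical, not a standard command), one function can hold the `1>&2` redirection and the script-name prefix so that every warning in a script is produced the same way:

```shell
#!/bin/sh -u
# Hypothetical sketch: centralize repeated "echo 1>&2" warnings in one place.
Warn () {
    echo 1>&2 "$0: WARNING: $*"
}
Warn "disk is nearly full"
Warn "backups are overdue"
```

Changing the message format later means editing one function, not hunting down every `echo` in the script.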
Example: `ErrorExit` function to exit a script
----------------------------------------------

The example `ErrorExit` shell function below echoes all its arguments onto standard error, prints a usage message, and exits the script. Putting all this common code in the function makes the script shorter, easier to read, and easier to maintain.

    #!/bin/sh -u
    # This function prints an error message and exits the script:
    ErrorExit () {
        echo 1>&2 "$0: $*"
        echo 1>&2 "Usage $0 [filename]"
        exit 1
    }
    if [ $# -ne 1 ]; then
        ErrorExit "Expecting one filename argument; found $# ($*)"
    fi
    if [ "$1" = "" ]; then
        ErrorExit "Argument is an empty string"
    fi
    if [ ! -f "$1" ]; then
        ErrorExit "Argument is nonexistent, inaccessible, or not a file: '$1'"
    fi
    if [ ! -r "$1" ]; then
        ErrorExit "File is not readable: '$1'"
    fi
    if [ ! -s "$1" ]; then
        ErrorExit "File is empty: '$1'"
    fi
    if cp -p "$1" "$1.bak" ; then
        echo "Backed up '$1' to '$1.bak'."
    else
        echo 1>&2 "$0: Failed to copy '$1' to '$1.bak', cp exit status $?"
    fi

Running the above script:

    $ ./example.sh
    ./example.sh: Expecting one filename argument; found 0 ()
    Usage ./example.sh [filename]
    $ ./example.sh /bin
    ./example.sh: Argument is nonexistent, inaccessible, or not a file: '/bin'
    Usage ./example.sh [filename]

Functions can hold code in one place that would otherwise be repeated over and over in a script. **Less code is better code.**

Short-circuit command list operators: `&&` and `||`
===================================================

The `if` statement isn't the only way that the shell checks the return status of a command. If you separate two commands with `||` or `&&`, the shell checks the return status of the first command to decide whether to run the second command.
There are two versions:

    # Execute command_2 only if command_1 fails (returns non-zero)
    # command_1 || command_2

    # Execute command_2 only if command_1 succeeds (returns zero)
    # command_1 && command_2

These Boolean command operators are sometimes used inside scripts to avoid having to write an entire `if` statement to check a command return status:

    mkdir foo || exit $?            # exit script if mkdir fails
    [ -d bar ] || exit 1            # exit script if not a directory
    [ -s foo ] && mv foo foo.bak    # back up file only if not empty

Earlier we wrote this simple shell script:

    #!/bin/sh -u
    if mkdir foo ; then
        cd foo
        date >date.txt
    fi

Using the short-circuit operators, we could rewrite it this way:

    #!/bin/sh -u
    mkdir foo || exit $?
    cd foo || exit $?
    date >date.txt

Above, the `|| exit $?` after each command makes the script exit if the preceding command fails, using the exit code of the command that failed. We could also write the same script this way:

    #!/bin/sh -u
    mkdir foo && cd foo && date >date.txt

Above, the `&&` operator ensures that the `cd` command is executed only if the preceding `mkdir` succeeds, and the `date` command is executed only if the preceding `cd` succeeds (which means the `mkdir` also succeeded). The script detects errors and doesn't create the date output file:

    $ touch foo
    $ ./example.sh
    mkdir: cannot create directory ‘foo’: File exists

Unfortunately, the error message doesn't include the name of the script containing the command that is generating the error, so tracking down which script is generating the error is harder when you use `||` or `&&`.

Disadvantages of short-circuit operators `&&` and `||`
------------------------------------------------------

The disadvantages of using the short-circuit operators inside scripts are that the error messages don't include the script name or the error code, so it's often hard to know which script is having problems and what the problems might be.

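
One middle ground is to follow `||` with a `{ ...; }` command group (standard shell syntax, though not covered above), so the script can still print a message that names the script before exiting. A sketch of the earlier `mkdir`/`cd` script written this way; note that `$?` inside the group is expanded before the `echo` runs, so it still holds the failing command's exit status:

```shell
#!/bin/sh -u
# Sketch: "||" followed by a { ...; } command group still short-circuits,
# but lets the script print its own name and the failing status first.
mkdir foo || { echo 1>&2 "$0: mkdir foo failed with status $?" ; exit 1 ; }
cd foo || { echo 1>&2 "$0: cd foo failed with status $?" ; exit 1 ; }
date >date.txt
```

This is still more compact than a full `if`/`else` statement, at the cost of losing the exact failure status in the script's own exit code.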
Compare the short-circuit scripts above with this longer script below, which detects errors and prints the script name along with the command generating the error and the error code:

    #!/bin/sh -u
    if mkdir foo ; then
        :    # do nothing on success
    else
        status=$?
        echo 1>&2 "$0: mkdir foo failed with status $status"
        exit "$status"
    fi
    if cd foo ; then
        :    # do nothing on success
    else
        status=$?
        echo 1>&2 "$0: cd foo failed with status $status"
        exit "$status"
    fi
    date >date.txt

The above script gives detailed information about its name, which command failed, and the exact failure status code. The error messages are much better and tell you the script name containing the command that is failing:

    $ touch foo
    $ ./example.sh
    mkdir: cannot create directory ‘foo’: File exists
    ./example.sh: mkdir foo failed with status 1
    $ echo $?
    1

The short-circuit versions of the same script don't even give the script name, so tracking down which script had the error and what the error code was is harder. The short-circuit operators are useful for quick testing of command return codes, but they aren't suitable for a production script where you want to see the script name and good error messages. Use `&&` and `||` with caution.

Reading input in shell scripts -- `read`
========================================

The Bourne family of shells uses a built-in command named `read` to issue a prompt on standard error and read one single line of input from standard input. The input line is split up into words and assigned to one or more variables given on the command line:

    #!/bin/sh -u
    read -p "Enter a number: " num1
    read -p "Enter a number: " num2
    echo "The sum is $(( num1 + num2 ))"

Output:

    $ ./example.sh
    Enter a number: 10
    Enter a number: 5
    The sum is 15

The shell only issues the prompt if standard input is coming from a terminal (from a human via a keyboard). If input is coming from a file or a pipe, no prompt is needed:

    $ echo 10 >input
    $ echo 5 >>input
    $ ./example.sh