=================================================
Errors in Unix Shells by Example by Ellie Quigley
=================================================

Textbook Errata: Fourth Edition (2004-2005)

This file documents the errors and misinformation found in the above book.
Many of these errors were also in the First Edition (2000).
See the bottom of this file for errors and corrupt files on the CDROM.

----------------
Chapter 1 errors
----------------

p.9  - change "in the passwd file" to "in the file /etc/passwd".
p.9  - 1.4.1 The shell removes file redirection *before* checking the first
       token on the command line.  "date >foo" and ">foo date" both work.
p.10 - 7&8 happen at the same time, not sequentially
       (a variable substitution cannot contain a working command substitution)
p.10 - 1.4.2 change "the path variable" to "the PATH variable"
p.19 - Linux ignores setuid on shell scripts (because it is unsafe)
p.19 - the text uses octal permissions on p.19 but doesn't explain what
       they mean until p.20 (Table 1.1)
p.19 - the umask bits are not subtracted; they are a bit mask
       666 masked by 007 is 660 - not 659(base 10) or 657(base 8)
p.24 - Units 0,1,2 are only assigned to your terminal when you first log in;
       after that, the descriptors are inherited from the parent process
       by child processes.  If a parent disconnects a descriptor from the
       terminal, the same descriptor in a new child process of that parent
       will also be disconnected; the descriptor doesn't magically get
       reconnected to the terminal.
p.27 - the new syntax "2>errors" is used without explanation in the text
p.29-30 - the diagrams all use "dup" without explaining what it means
p.31 - The actual numbers of the signals (and sometimes the newer names)
       change between different versions of Unix/Linux.
     - There is no signal number "0" and no signal called "EXIT".
       (The shell "trap" command uses the number 0 to mean "on script exit";
       this is a shell feature and is *not* a Unix signal number.)
     - SIGSTP should be spelled SIGTSTP (Control-Z)
p.32 - The script example in Figure 1.15 is poor because it will hang
       waiting for input half-way down.  The "cat" command has no argument
       and is reading from stdin.  Change "cat" to "hostname" or "uptime".
     - add "-f" as an argument to the line "#!/bin/csh" in the script
       Fig 1.15 (this is done correctly at the top of Example 2.2 on p.39)
     - Item 1 of the Explanation: The script shown uses #!/bin/csh,
       not "what shell you want".
     - Item 3 of the Explanation is wrong - you usually have to put "./"
       in front of your script name to execute it out of the current
       directory, e.g. type "./doit" to the shell, not just "doit".

----------------
Chapter 2 errors
----------------

p.41 - always use the -u option to start scripts:  #!/bin/sh -u
       (check for undefined variables and spelling errors)
p.42 - remove ":." from search PATH (a security risk!)
     - double-quote all uses of variables to protect expansion from the shell
       e.g. echo "$variable_name"
p.43 - double-quote all uses of variables to protect expansion from the shell
       e.g. echo "$1 $2 $3", echo "$*", etc.
p.44 - NOTE: under "Operators", the Relational operators are *numbers only*
       (not for text strings)
p.44 - change "after the closing parenthesis" to "after a newline or
       semicolon" e.g.
           if [ expression ] ; then
               block of statements
           fi
p.45 - double-quote all uses of variables to protect expansion from the shell
     - add missing ";;" to default case (bottom right) in Table 2.2
     - after "then a block of statements" at page bottom add
       "that starts with the 'do' keyword,"
     - In the "for" loop, the keyword "in" followed by the word list
       are optional.  Leaving them off means the list of words will be
       taken from "$@" (the arguments).
p.47 - change "$person =~ root" to "$person = root" on line 13
     - to simplify the script, delete line 11 and move lines 19-20
       before line 16
p.50 - delete blank in "VARIABLE_NAME =value" under "Global Variables"
     - remove ":." from search PATH (a security risk!)
     - double-quote all uses of variables to protect expansion from the shell
       e.g. echo "$variable_name"

----------------
Chapter 3 errors (p.67-80)
----------------

p.67 - The introduction to this chapter talks as if grep, sed, and awk
       are covered in this chapter ("will be discussed in detail here").
       They are not; they are in separate chapters.
p.69 - Table 3.1 is treating regular expressions in the way they are used
       by grep to select lines, not as used by sed or vi to change the
       text matched by the regular expression.  Change the title of the
       table to be:
           Regular Expression Metacharacters when Used to Select Lines
           (e.g. in the grep family of commands)
p.77 - Line "1" confuses a zero '0' with an upper-case letter 'O'.
p.78 - the letter 's' is missing after 1,$ under the EXPLANATION:
           1,$s/\([Oo]ccur\)ence/\1rence/g
p.78 - EXAMPLE 3.13: The example fails to note that your cursor must be
       on the line containing "square and fair" for this substitution
       to work.  It would be better to prefix the substitution with the
       line range "1,$" so that the cursor position doesn't matter.
     - The example is silly, since one can much better say:
           :1,$s/square and fair/fair and square/g
       No registers are needed at all.  Use a better example!

----------------
Chapter 4 errors (p.81-124)
----------------

p.81 - No, the name "grep" traces back to the original Unix "ed" line
       editor, not to the Berkeley "ex".  "ed" predates "ex" by many years,
       and "ed" was based on Bell Labs "qed" from years before.  "qed" had
       the g/re/p syntax a decade before Berkeley.
p.82 - Section 4.1.2 confuses shell quoting with grep patterns.
       Grep patterns must be one single argument and therefore must be
       properly quoted to protect metacharacters from the shell.
       Single quotes provide full protection.  Of course, if the pattern
       doesn't contain any shell metacharacters (including blanks),
       it doesn't need quotes.
     - Grep changes the access times of the files being read - that
       affects its input files.
     - under FORMAT change it to:  grep pattern [ filename ... ]
       (the file names are optional - grep reads standard input too)
     - under EXPLANATION change "a standard input or a pipe" to
       "standard input, including a pipe,"
       A pipe *is* a form of standard input.
     - Grep sends its output to standard output, not (necessarily)
       to the screen.
p.84 - Section 4.1.3 is worded very badly.  It is confusing and misleading.
     - \< and \> are not basic regexp metacharacters - they only work
       in some programs or some versions of some programs.  Move them
       to the list of extended regular expression metacharacters.
p.83-84 - Table 4.1 is largely a duplication of 3.1 on p.70, 4.3 on p.101
       and 5.3 on p.132.  Having slightly different information repeated
       this way is confusing to the students.  It makes the book bigger
       without adding any useful information.
     - Table 4.1 uses '[^]' but Table 4.4 uses '[^ ]'.  Make them the same.
     - In many programs, including versions of grep, an asterisk can
       repeat the previous regular expression, not just the previous
       character, e.g. \(abc\)* matches: abcabcabc
     - For grep, the pattern ' *love' is identical to the pattern 'love'.
       Adding the zero or more blanks in front does nothing useful except
       make the pattern execute more slowly.  Both patterns match exactly
       the same lines.  (However, if used in a "sed" substitution, they
       would change different things in the line.)
p.95 - Extended regexp characters () {} work fine in most versions of egrep.
       (Do not put backslashes in front of them when used in egrep.)
p.96 - In many programs, including versions of grep, a question mark can
       make the whole previous regular expression optional, not just the
       previous character, e.g. (abc)? matches abc or nothing.
     - In many programs, including versions of grep, a plus can repeat
       the entire previous regular expression, not just the previous
       character, e.g. (abc)+ matches: abcabcabc
       Example 4.33 shows this correctly.
p.99 - Only one of the eight examples uses any extended regexp characters.
       This is a poor way to show the power of egrep over grep.
       The egrep section should only contain examples that use and
       demonstrate the extended regexp metacharacters.
p.100 - Few people consider \< \> \( \) to be "basic" regexp characters.
       They are extended characters (which is why they need backslashes).
       Basic regexp characters *never* need backslashes to turn them on.
p.101 - \< \> \( \) are all extended characters, not basic.  The author
       confuses things by including them here.  Move them to "extended".
p.102 - Note "a" in Table 4.7 has a word missing before "work".
      - Note "b" in Table 4.7 is wrong: most Unix egrep handle ( and ) just fine.
      - Note "c" in Table 4.7 is wrong: many Unix egrep handle { and } just fine.
      - Note "a" in Table 4.8 is wrong: most Unix egrep handle ( and ) just fine.
      - Note "b" in Table 4.8 is wrong: many Unix egrep handle { and } just fine.
p.103 - The POSIX [:alnum:] character class is explicitly *NOT* another
       way of saying A-Za-z0-9, except for North American ASCII character
       sets.  The POSIX class works portably, internationally, and includes
       *all* letters in the local language, not just ASCII characters.
p.114 - The -# option is incorrectly listed as: -#-
      - Option -C *must* be followed by an integer giving the number of
        lines.  It is not equivalent to "-2" unless you type: -C 2
p.115 - The -v option long form is spelled: --invert-match
p.119 - Option -C *must* be followed by an integer giving the number of
       lines.  Change "-C Patricia" to "-C 2 Patricia" in Example 4.61.

[... many more errors to come ...]
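
Several of the Chapter 4 notes above (grouping with ( ) in extended
patterns, \( \) in basic patterns, and -C needing a number) can be checked
directly.  The following is only a rough sketch: the sample strings are made
up for illustration and are not from the textbook, and it assumes a grep
that accepts -E and -C (GNU and BSD grep do; -C is not required by POSIX).

```shell
set -u   # same effect as starting the script with "#!/bin/sh -u"

# (abc)+ repeats the whole group, not just the character "c".
# -x makes the pattern match the entire line.
plus_hits=$(printf '%s\n' abc abcabcabc abcab | grep -E -x '(abc)+')

# x(abc)?y makes the whole group optional: "xy" and "xabcy" both match.
opt_hits=$(printf '%s\n' xy xabcy xabcabcy | grep -E -x 'x(abc)?y')

# Basic (non -E) patterns need backslashes on the parentheses.
star_hits=$(printf '%s\n' abcabcabc abcab | grep -x '\(abc\)*')

# -C takes a number: two lines of context on each side of the match.
context_hits=$(printf '%s\n' line1 line2 line3 Patricia line5 line6 line7 |
    grep -C 2 Patricia)

echo "$plus_hits"
echo "$opt_hits"
echo "$star_hits"
echo "$context_hits"
```

The patterns are single-quoted so the shell passes them to grep untouched,
and every variable is double-quoted when echoed, as recommended in the
Chapter 2 notes.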

-----------------
Appendix A errors
-----------------

p.1057 - "banner" doesn't exist in most Linux distributions
p.1058 - "bdiff" doesn't exist in most Linux distributions
p.1063 - "crypt" doesn't exist in most Linux distributions
p.1071 - "getopt" (not getopts) is what is found in most Linux distributions
p.1073 - "jsh" doesn't exist in most Linux distributions
p.1073 - "line" doesn't exist in most Linux distributions
p.1080 - "news" doesn't exist in most Linux distributions
p.1081 - "pack", "pcat", "unpack" don't exist in most Linux distributions
p.1083 - "pg" doesn't exist in most Linux distributions
p.1090 - "sed" is a "stream" editor, not a "streamlined" editor
p.1091 - "spell" doesn't exist in most Linux distributions
p.1093 - "tabs" doesn't exist in most Linux distributions
p.1095 - the syntax for test is "-gt" not "gt"
p.1095 - "timex" doesn't exist in most Linux distributions
p.1099 - "units" doesn't exist in most Linux distributions
p.1100 - "what" doesn't exist in most Linux distributions

------------
CDROM errors
------------

Many of the script files on the textbook CDROM are saved in DOS/Windows
text file format (with CR/LF line ends) and will not run if executed on
Linux/Unix that expects just LF line ends, e.g.

    chap06/Ex_6.167-6.193/awkchecker
    chap08/Ex_8.48-8.64/mainprog
    chap10/Ex_10.15-10.26/example10.21
    chap14/Ex_14.02-14.31/perm_check2
    chap14/Ex_14.02-14.31/tellme
    chap14/Ex_14.32-14.55/idcheck2

It appears as though someone edited these script files under Windows,
mangling them in the process.  You must copy these files and change the
format back to Unix (only LF line ends) to run them.

Most of the other text files, including data files, are also in
DOS/Windows format.  They will also have format problems on Unix and may
not work properly as input for the sample scripts.  In particular,
regular expressions that try to match '$' (end-of-line) will not work
properly in files that have DOS CR/LF line endings.
Many of the Quigley data files are damaged this way.

The "file" command will often warn you if a file has DOS CR/LF line ends:

    $ file chap06/Ex_6.001-6.054/lab3.data
    chap06/Ex_6.001-6.054/lab3.data: ASCII text, with CRLF line terminators

You can use the command "dos2unix" to convert these corrupted files back
to Unix format:

    dos2unix /tmp/fixed.txt
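
To see why '$' patterns fail on CR/LF files, and that removing the
carriage returns fixes it, something like the following sketch may help.
The temporary directory, file names and sample text are hypothetical (they
are not CDROM files), and "tr -d '\r'" is used here only as a stand-in for
dos2unix on systems where dos2unix is not installed; "mktemp -d" is assumed
to be available.

```shell
set -u

# Hypothetical scratch files for the demonstration.
tmpdir=$(mktemp -d)
crlf="$tmpdir/crlf.txt"
fixed="$tmpdir/fixed.txt"

# Build a two-line file with DOS CR/LF line ends.
printf 'fair and square\r\nsquare and fair\r\n' >"$crlf"

# 'square$' finds nothing: a carriage return sits between "square" and
# the newline.  grep -c prints 0 but exits non-zero, hence "|| true".
before=$(grep -c 'square$' "$crlf" || true)

# Delete every carriage return to get Unix (LF only) line ends.
tr -d '\r' <"$crlf" >"$fixed"

# Now the first line ends in "square" and the pattern matches it.
after=$(grep -c 'square$' "$fixed" || true)

echo "matching lines before conversion: $before"
echo "matching lines after conversion:  $after"

rm -rf "$tmpdir"
```

Writing the converted text to a second file leaves the original untouched,
which matches the advice above to copy the CDROM files before changing
their format.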