------------------------------------------------
Linux Shells by Example: Chapter 4 Reading Guide
------------------------------------------------
-IAN! idallen@ncf.ca

Here is a reading guide and some review questions for Chapter 4
"The Streamlined(sic) Editor".

Remember to read the text_errata.txt file (under Notes) and correct all
the mistakes in this Chapter before you read it.

Useful additional notes to read:
    regular_expressions.txt
    regular_expression_questions.txt

The data files for the examples in the textbook are under this directory:
    /home/alleni/cst/cdrom/chap04/

Note: The information in Table 4.3 (p.99) is partially duplicated in
Table 3.2 and Table 2.1 (p.44).

Warning: Do not confuse the meaning of metacharacters used in regular
expressions and those used in shell GLOB patterns.  The same characters
are used, but they often mean different things.

*) What to study in Chapter 4:

   Study well all of Sections 4.1 to 4.6.

   YES 4.7.1   p
   YES 4.7.2   d
   YES 4.7.3   s/re/string/
   YES 4.7.4   address ranges using comma /foo/,/bar/p
   YES 4.7.5   multiple -e
   NO  4.7.6   r - skip
   YES 4.7.7   w file
   NO  4.7.8   a - skip
   NO  4.7.9   i - skip
   NO  4.7.10  n - skip
   NO  4.7.11  y/chars/CHARS/ - skip
   YES 4.7.12  q
   NO  4.7.13  h and g - skip
   NO  4.7.14  h and x - skip
   NO  4.8     sed scripting - skip
   NO  4.8.1   sed scripting - skip
   YES 4.8.2   sed Review (all except last example using 'l')

*) How does sed work? (p.94)

*) Does sed operate on all the lines in a file at once, or only one line
   at a time? (p.94)

*) Can sed process a file from last line to first line, or only from
   first line to last line?  In other words, once sed has processed a
   line, can it "back up" and process the line that came before it, or
   must sed always move forward in the file, without backing up? (p.94)

*) If you leave the addressing numbers off the front of a sed
   expression, does it operate on only one line or does it operate on
   every line read from the file? (p.94,95)

*) In a sed line address, what does a dollar sign represent?
   (p.94)  (It means the same thing in VI addresses, too!)

*) True or False: This sed command deletes only lines 5 and 9:  -e '5,9d'

*) For this course, know the following sed commands and skip over the
   others (Table 4.1):

      d
      p
      q
      s/re/string/g
      s/re/string/p
      s/re/string/w file
      w file

   Commands that accept only zero or one preceding address:  q

      Example: sed -e '/idallen/q' /etc/passwd

   Commands that accept a preceding address range:  d p s w

      Example: sed -n -e '1,/idallen/p' /etc/passwd
      Example: sed -n -e '/idallen/,$w foo' /etc/passwd
      Example: sed -n -e '1,10s/idallen/alleni/p' /etc/passwd
      Example: sed -n -e '/:0:/w root' -e '/:[1-9][0-9]*:/w not' /etc/passwd

   Know these options (Table 4.2) and skip over the others:

      -e command
      -n
      -f

   The man page for sed will help you here.

*) What command syntax do I issue to tell sed to delete lines that
   contain a regexp pattern? (p.97)

*) What command syntax do I issue to tell sed to delete lines that
   do *not* contain a regexp pattern? (p.98)

*) p.97 Curly braces aren't used that often in sed, but they are useful
   for selecting a range of lines on which you want to do several other
   sed commands, which might themselves have address ranges.

   Below is an example of a sed command that uses curly braces.  Note
   that the entire expression is a single-quoted string to the shell.
   No shell processing will happen on any of the characters in the string.
   Here is the sed command using newlines to separate commands:

      $ sed -n -e '1,10{
          /root/p
          /root/!s/x/*/pg
        }' /etc/passwd

   Here is the same sed command using semicolons to separate commands:

      $ sed -n -e '1,10{ /root/p; /root/!s/x/*/pg; }' /etc/passwd

   Explanation:

   option -n: suppress default "copy through" output from sed
      - only lines that are explicitly printed will appear
      - if nothing is printed by sed "p" commands, no output

   option -e: the next argument will be the sed command expression
      (it is single quoted here to protect it from the shell)

   1,10{...}: select lines 1 to 10 and do the commands contained in the
      curly braces (which will only operate on lines 1-10).

   /root/p: find lines containing the regexp /root/ and print them
      (but only in lines 1-10, due to curly braces)

   /root/!s/x/*/pg: find lines that do *not* contain the regexp /root/,
      change all "x" to "*" on the whole line, then print the line
      only if the substitution succeeded
      (but only in lines 1-10, due to curly braces)

*) EXAMPLE 4.5,4.6: How would you select and display only lines
   containing words (a word is a string of non-blank characters) that
   start with the letters "n" or "s" and end with the letter "t"?
   (Make sure you suppress the default sed output in your answer.)
   Hint: [ ]* matches a string of blank characters.  You need to match
   a string of *non*-blank characters.  Invert the expression.

*) EXAMPLE 4.7,4.8: Can you specify a list of lines to delete, e.g.

      sed -e '1,3,5,7d' datafile

   (No, you can't.  The "d" command only accepts an address range, i.e.
   a start address and an end address, not a list of addresses.  How
   would you do the above using sed?  Hint: Multiple -e options.)

*) The following command prints the last line twice, because the
   default action for sed is to output every line it reads in:

      sed -e '$p' datafile

   Why doesn't the following command print the last line just once?

      sed -e '$d' datafile

*) Linux Tools Lab 2 Questions (p.132)

    1. skip
    2. use "-n" - it works everywhere
    3. use the man page
    4. skip
    5. do this
    6. do this
    7. do this
    8. do this
    9. do this
   10. do this
   11. do this (what regexp matches an entire line of characters?)
   12. do this
   13. do this (a blank line contains only zero or more spaces)
   14. skip
   15. skip
   16. skip

========================
sed substitution summary (p.103-105)
========================

Substitution command formats:

   s/regexp/chars/          # just like in VI - first match only
   s/regexp/chars/g         # just like in VI - globally on whole line
   s/regexp/chars/p         # print line only if substitution works
   s/regexp/chars/w file    # write line only if substitution works
   s/regexp/chars/gpw file  # combine all three only if substitution works!

In sed, the substitution can be followed by a few letters that indicate
additional commands to be performed on this line, only if the
substitution succeeds.  If the substitution fails, nothing happens.

==========================
Practice questions for sed
==========================

*) Do all the practice questions that use VI, using "sed" instead.
   (See the chapter02guide.txt file for many VI practice questions.)

   Remember: Never use shell redirection to redirect output into any
   file used as input on a command line - the shell will erase the file
   before the command can read it.

*) Examine the password file and do the following:

   1. copy every line containing :0: into a file named "roots"
   2. copy every line containing allen into a file named "ians"
   3. copy every line containing 0000 into a file named "zeroes"

   Long way: Use three grep commands and redirect output three times.

   Short way: Use one sed command line with three -e commands and write
   all three files at once using "w" commands (see Section 4.7.7 p.109).

*) Implement lab08exercise.txt using a series of "sed" command lines.

   You are to decode a file of text by writing a script that uses the
   "sed" editor to make a correct series of deletions, substitutions,
   and replacements, in a given order.
   Step 1:
   Run this "doright" program and save the output in a file named
   "right.txt" in your account somewhere:

      $ ~alleni/cst/lab08exercise/doright >right.txt

   The file "right.txt" should be 247 lines.

   Step 2:
   Construct a shell script containing a series of individual "sed"
   command lines to perform the following substitution edits on the
   resulting "right.txt" file.  Your file of command lines will look
   similar to this:

      #!/bin/bash -u
      ... shell script label and header goes here ...

      # start with a copy of the data file to be modified
      cp right.txt file1 || exit 1

      sed -e '...your command...' file1 >file2  || exit 1
      sed -e '...your command...' file2 >file3  || exit 1
      ... repeat similar lines until ...
      sed -e '...your command...' file8 >file9  || exit 1
      sed -e '...your command...' file9 >file10 || exit 1

      # display the final result on standard output
      cat file10

   Start with one sed command and gradually add the others until you
   have a working file that performs all of the editing of
   lab08exercise.txt correctly.

   Unless you are told otherwise, globally change *all* occurrences on
   each line, not just the first occurrence.  (Use the "g" global
   substitution suffix shown in Table 4.1 (p.97) and Examples 4.11
   and 4.14.)

   Step 3:
   Verify your work using the "diff" command.  Compare your edited file
   with the following file and ensure that there are no differences:

      $ diff file10 ~alleni/cst/lab08exercise/right-to-read.txt

   If your file is correctly edited, there will be no output from
   "diff".  Any differences will be sent to your screen.

*) Do these give the same answer?

   1. How many lines contain a character that is not the letter 'a'?
   2. How many lines do not contain the letter 'a'?

   If they differ, give an example of a line that one matches but the
   other does not.
   How long is the shortest line output by each command?

*) Do these command lines always give the same output?

   1. grep '[^a]'
   2. grep -v 'a'

   If they differ, give an example of a line that one matches but the
   other does not.
   How long is the shortest line output by each command?

*) Do these command lines always give the same output?

   1. grep '[^d][^o][^g]'
   2. grep -v 'dog'

   If they differ, give an example of a line that one matches but the
   other does not.
   How long is the shortest line output by each command?

*) Do these command lines always give the same output?

   1. sed -n -e '/[^a]/p'
   2. sed -n -e '/a/!p'

   If they differ, give an example of a line that one matches but the
   other does not.
   How long is the shortest line output by each command?
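
One way to explore the last few questions is to run each pair of
commands on a small test file and compare what comes out.  The sketch
below is only a starting point.  The four-line test file is made up for
illustration (it is not one of the textbook data files) and the variable
names are arbitrary.  Add your own lines, such as ones containing "dog",
to try the other pairs.

```shell
#!/bin/bash -u
# Hypothetical test data: these four lines (the last one is empty) are
# made up for illustration and do not come from the textbook files.
tmp=$(mktemp) || exit 1
printf '%s\n' 'aaa' 'abc' 'xyz' '' >"$tmp"

# Show which lines each grep command selects.
echo "grep '[^a]' selects:"
grep '[^a]' "$tmp"
echo "grep -v 'a' selects:"
grep -v 'a' "$tmp"

# Count the lines selected by each grep and by each matching sed command.
n_grep1=$(grep -c '[^a]' "$tmp")
n_grep2=$(grep -c -v 'a' "$tmp")
n_sed1=$(sed -n -e '/[^a]/p' "$tmp" | wc -l | tr -d ' ')
n_sed2=$(sed -n -e '/a/!p' "$tmp" | wc -l | tr -d ' ')
echo "line counts: $n_grep1 $n_grep2 $n_sed1 $n_sed2"

# Clean up the temporary file.
rm -f "$tmp"
```

Compare which lines each command prints, not just how many: two
commands can select the same number of lines without selecting the
same lines.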