========================
CST8129 Lab Exercise #10 (Week 13)
========================
-IAN! idallen@ncf.ca

Due:   demo in lab period only - no hand in
Marks: 3%
Late penalty: 100% after last lab on Friday this week.

Purpose:
 - review some of the work done so far (Chapter 2,3,4)

Demo procedure:

   When you are ready to demonstrate your work, put your name on the
   board at the front of the room under the column "first demo;
   section 01x".

   If there is time, I will see re-demos of the same work.  Put your
   name on the board under "second demo".  (Note: There may not be time
   for a second demo; make the first demo good.)

   Priority is given to people who are registered in a given lab
   section.  If you are trying to give a demo in a section that is not
   your registered lab section, you must wait until after all the
   registered students have done their demos and have had their
   questions answered.  Put your name on the board under "other
   sections demo".

--------------------------------------------------------------------------

Instructions for your demo:

You are to decode a file of text by writing a script that uses the "sed"
editor to make a correct series of deletions, substitutions, and
replacements, in a given order.

Step 1:

   Run this "dosed" program and save the output in a file named
   "right.txt" in your account somewhere:

        $ ~alleni/cst/lab10exercise/dosed >right.txt

   The file "right.txt" should be 247 lines or longer.
   (It will probably be longer.)

Step 2:

   Construct a shell script named "mysed.sh" containing a series of
   individual "sed" command lines to perform the following substitution
   edits on the resulting "right.txt" file.  Your file of command lines
   will look similar to this:

        cp right.txt file1 || exit 1
        sed -e '...your sed command...' file1 >file2 || exit 1
        sed -e '...your sed command...' file2 >file3 || exit 1
        ...etc...
        ...etc...
        ..etc...
        sed -e '...your sed command...' file10 >file11 || exit 1
        sed -e '...your sed command...' file11 >file12 || exit 1
        cat file12

   Start with one sed command in the script file and gradually add the
   others until you have a working file that performs all of the editing
   correctly.

   Unless you are told otherwise, globally change *all* occurrences on
   each line, not just the first occurrence.  (Use the "g" global
   substitution suffix shown in Table 4.1 (p.97) and Examples 4.11 and
   4.14.)

   Write these changes using sed command lines:

   1) Delete all lines that both start and end with the letters "the".
      (Note: All such lines are assumed to be longer than three
      characters.)  The lines must contain "the" at both ends.

   2) Remove the letters "the" and the single space that follows them
      from the beginning of every line.

   3) Change all backslashes to forward slashes.

   4) Replace every occurrence of two or more periods with one period.

   5) Replace every occurrence of two or more asterisks with one
      asterisk.

   6) On every line that does *not* contain a dash, remove all the
      non-vowel characters, if the line contains only non-vowel
      characters.  A line contains only non-vowel characters if every
      character on the line, from the start to the end, is a non-vowel
      character.  Replace an entire string of non-vowel characters with
      nothing.

   7) Find every line that contains the five lowercase letters "elvis"
      in that order (but with any number of any characters in between)
      and delete the first space on these lines.
      (e.g. find lines like this:
        "wwww exx xxlyy y yyyvzz zzzzia aaa asbb bb bb")
              ^     ^        ^       ^       ^
              E     L        V       I       S
      How many lines contain this secret pattern?
      ("grep" can count them.)

   8) Find every line that contains the word "and" surrounded by spaces
      on both sides and remove three digits from the end of those lines.

   9) Find every occurrence of three adjacent digits and reverse them.
      (e.g. 123456 would become 321654; but, zz12zz3zz would not
      change).

   A) On lines of text that contain only one double quote character,
      swap the single characters on either side of the double quote
      (e.g. cad"tog becomes cat"dog).
      Note: A line that contains only one double quote character
      contains only (possibly empty) strings of *non*-double-quote
      characters on either side of the double quote, right out to both
      the beginning and end of the line.  The characters beside the
      quote may be any character at all (but you know they won't be
      double quotes!).

   B) Find every occurrence of this sequence:
        - a string of one or more non-blank characters,
        - one punctuation character (a comma or period or semicolon),
        - one or more spaces,
        - another string of one or more non-blank characters
      and exchange the places of the two strings.
      (e.g. "Canada; O" would become "O; Canada").
      Both strings must contain only one or more non-blank characters.

Step 3:

   Verify your script output using the "diff" command.  Compare your
   edited file with the following original file and ensure that there
   are no differences:

        $ ~alleni/cst/lab10exercise/dosed >right.txt
        $ ./mysed.sh >myoutput
        $ diff myoutput ~alleni/cst/lab10exercise/right-to-read.txt

   If your file is correctly edited, there will be no output from
   "diff".  Any differences will be sent to your screen.

Step 4:

   Demo: When you have finished making the above substitutions, call
   over your instructor.  For full marks, you will be asked to
   demonstrate to your instructor how you performed any or all of the
   above substitutions and explain how the regular expressions work.
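
Practice sketch:

   The sed features that the steps above lean on (the "g" suffix, a
   negated line address, and back-references) can be tried on a
   throwaway file before touching "right.txt".  What follows is only a
   minimal sketch under stated assumptions: the sample text, the file
   names, and the three edits are all made up for illustration, and none
   of them is the answer to a numbered item.  It also assumes that the
   "mktemp" command is available on your system.

```shell
#!/bin/sh
# Practice sketch only: the sample text, file names, and edits below are
# hypothetical and are not the answers to the numbered lab items.

tmp=$(mktemp -d) || exit 1      # scratch directory for the practice files
trap 'rm -rf "$tmp"' EXIT       # remove the scratch directory on exit
cd "$tmp" || exit 1

printf '%s\n' 'one  two   three' 'skip-me  please' 'key:value' >file1 || exit 1

# "g" suffix: change every run of two or more spaces, not just the first run.
sed -e 's/ \{2,\}/ /g' file1 >file2 || exit 1

# Negated address: "!" applies the command only to lines NOT matching /-/.
sed -e '/-/!s/e/E/g' file2 >file3 || exit 1

# Back-references: \( \) remembers text; \1 and \2 put it back in a new order.
sed -e 's/\([^:]*\):\([^:]*\)/\2:\1/' file3 >file4 || exit 1

cat file4

# grep -c counts matching lines; diff is silent when the two files are identical.
grep -c 'E' file4
printf '%s\n' 'onE two thrEE' 'skip-me please' 'valuE:kEy' >expected || exit 1
diff file4 expected && echo "no differences"
```

   Run with "sh", this should print the three edited lines, then the
   count 2, then "no differences".  Keeping one sed command per line
   with its own intermediate file, as in the Step 2 layout, makes it
   easy to look at any fileN and see which edit went wrong.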