-------------------------
Week 14 Notes for CST8129
-------------------------
-Ian! D. Allen - idallen@idallen.ca

*** Keep up on your readings (Course Outline: average 5 hours/week homework)

Remember - knowing how to find out an answer is more important than
memorizing the answer.  Learn to fish!  RTFM!  (Read The Fine Manual)

Announcement: Final Exam Monday Dec 12 3 hours 11h30 - 14h30

-------
Review:
-------

More on sed: Using & in the RHS (replacement) of s/// (see Example 5.16 p. 138)

More extended regexp metacharacters (see "New with egrep" table p. 96):
  + - (extended) one or more  ->  x+    ->  xx*
  ? - (extended) zero or one  ->  xy?z  ->  x(y|)z  ->  (xz|xyz)

Chapter 4 - The grep Family: grep, egrep, fgrep
  grep family options: -v, -A, -B and -n, -l, -i, -c, -w
  Exercises: ALL
  Text errata: Q10 p.124 s/insensitive/sensitive/

Chapter 5 - sed, the Streamlined(sic) Editor
  sed options to know: -e, -n, -f
  addressing: /regexp/  1  $
       combo: /regexp/,23  23,/regexp/  23,$
  negation address range suffix: !
  sed commands: a c d i p q s
    WARNING: y command doesn't take ranges like tr does!
  s/// substitution flags: g p
  Exercises: ALL

Chapter 6 - The awk Utility
  On Linux awk and gawk are the same program, essentially equivalent to "nawk".
  Note: awk uses floating-point arithmetic, not integer!
   - be careful of testing equality with floating point!
   - echo hi | awk ' ((1/3)+1-1)*3 == 1 { print } '    <== NO OUTPUT!
  Read: 6.1.1 (p. 157) through 6.12.8 (p.202)
   - OMIT 6.4.2 (OFMT)
   - OMIT 6.4.3 (printf)
   - 6.6.3,6.10.4 not all versions of awk support a multiple character FS
      - avoid using multi-char FS or -F if you can
   - OMIT 6.11.2,6.12.7 ( ? : query conditional operator)
  built-in variables: NF NR
  field selection expressions:  $1 $2 $NF $(NF-1) $(1+3)
    in awk '$' does *NOT* mean "here is a variable"
     - '$' means "select the following field by number", e.g. $(1+1)
     - in awk think of $X as a function, e.g.: select_field( X )
  awk options: -F
    $ awk -Fx ' Boolean_Expression { action } ' [... files ...]
  some new versions of awk allow a regexp for the field delimiter:
    $ awk -F'[a-z]' ....
    (avoid using a regexp if you can - it doesn't work in all versions)
  simple Boolean Expressions look just like C language, with the
  addition of the '~' operator to do regexp matching:
    $1 == "hi"
    $1 < 5
    /foobar/
    $1 ~ /^[fF]oobar/
  compound expressions can use arithmetic, ||, and && like C Language:
    ( $1+$2 ) >= 500 && $3 ~ /^Moneybags/ { print }
  WARNING: awk silently treats non-numeric strings as zero in arithmetic!
    $ echo one two three | awk '{ print $1+$2+$3 }'
    0                                   <=== no error message!

----------
This week:
----------

Read: 6.13.1 (p. 203) through 6.26.7 (p.278)
   - OMIT 6.14 redirection and pipes
   - OMIT 6.15 pipes
   - OMIT 6.16.6 printf
   - OMIT 6.16.7 redirection and pipes
   - OMIT 6.16.8 pipes
   - OMIT 6.23 user-defined functions
   - OMIT 6.25.1 fixed fields (but read use of gsub)
   - OMIT 6.25.2 multi-line records
   - OMIT 6.25.3 form letters
   - OMIT 6.26.2 time functions
   - OMIT 6.26.4 getline
   - OMIT 6.26.6 user-defined functions

Brief notes:
  number and string constants - awk doesn't care what's in a variable:
    awk 'BEGIN { x = "010" ; print x   }'
    awk 'BEGIN { x = "010" ; print x+0 }'
  passing in command line values:  awk '{ print NF, x }' x=4
  creating new fields:  echo 23 67 | awk ' { $3 = $1 + $2 ; print } '
  built-in variables: NF, NR
  use of BEGIN and END
  if, else, while, break, continue, for, next, exit
  Functions:
    sub and gsub(regexp, str, [output])
    index(str, substr)   (starting at one)
    match(str, regexp)
    length and length(str)

----------
Next week:
----------

Combining quotes on shell command lines: Text p.987-988
    userid="idallen"
    WRONG awk -F: '$1 == idallen  {print $1, $5}' /etc/passwd
    WRONG awk -F: '$1 == $idallen {print $1, $5}' /etc/passwd
          awk -F: '$1 == "idallen" {print $1, $5}' /etc/passwd
          awk -F: '$1 == "'"$userid"'" {print $1, $5}' /etc/passwd
          awk -F: '$1 == x {print $1, $5}' x="$userid" /etc/passwd

AWK Arrays, associative:  for ( i in ARRAY ) { ... }

More AWK functions:
    substr(str, start, [leng])
    split(str, ARRAY, [fieldsep])
    math functions p.246
      truncation: int(float)
      rand() range [0,1) and srand(x)
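
Worked sketch for the "Next week" topics (not from the text): the commands
below try out passing a shell variable into awk, an associative array,
split(), substr() and int() together.  The sample data and the variable
names passwd_sample, fullname, shell_counts and pieces are made up for
illustration; the sample stands in for /etc/passwd so the output is the
same on any machine.

```shell
# Hypothetical sample data in /etc/passwd format (not a real system file).
passwd_sample='root:x:0:0:Super User:/root:/bin/bash
idallen:x:500:500:Ian Allen:/home/idallen:/bin/bash
guest:x:501:501:Guest User:/home/guest:/bin/sh'

userid="idallen"

# Pass the shell variable in as awk variable x; no nested quoting needed.
fullname=$(printf '%s\n' "$passwd_sample" |
    awk -F: '$1 == x { print $5 }' x="$userid")

# Associative array: count accounts per login shell (field 7).
# The order of "for (s in count)" is unspecified, so sort the output.
shell_counts=$(printf '%s\n' "$passwd_sample" |
    awk -F: '{ count[$7]++ }
             END { for (s in count) print s, count[s] }' | sort)

# split() fills an array and returns the number of pieces;
# substr() counts characters starting at 1; int() truncates toward zero.
pieces=$(echo "2005-12-12" |
    awk '{ n = split($0, part, "-")
           print n, part[1], substr(part[1], 3, 2), int(7/2) }')

echo "$fullname"
echo "$shell_counts"
echo "$pieces"
```

With this sample data the three echo commands should print "Ian Allen",
then "/bin/bash 2" and "/bin/sh 1", then "3 2005 05 3".  The x="$userid"
form is used here because it avoids the quote-inside-quote juggling of the
"'"$userid"'" form.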