-------------------------
Week 06 Notes for NET2003
-------------------------
-Ian! D. Allen - idallen@idallen.ca

Remember - knowing how to find out an answer is more important than
memorizing the answer.  Learn to fish!  RTFM!  (Read The Fine Manual)

Review
------

Unix/Linux file system is hierarchical, based at a nameless ROOT directory
    see Notes: pathnames.txt - Unix/Linux Pathnames

The shell always tries to match GLOB patterns against pathnames
    see Notes: glob_patterns.txt - GLOB patterns (wildcard pathname matching)

You can only redirect what you can see.
    see Notes: redirection.txt - Unix Shell I/O Redirection (including Pipes)

Why learn scripting and command line?
-------------------------------------

6 of 9 BIT/NET co-op students needed scripting on the job, and they want
more scripting courses added to BIT/NET.

- See /etc/init.d/httpd
- the http server dies every hour during a critical period
  - stay up all night, or
  - write a script to restart it if it is missing

    # loop forever; every 10 seconds restart httpd if no such process exists
    while : ; do
        if ! pgrep httpd ; then
            /etc/init.d/httpd restart
        fi
        sleep 10
    done

- tell the machine to do the work!

Folding long lines in scripts
-----------------------------

Instead of writing this unreadable long line in a script:

    wget -O - http://teaching.idallen.com/net2003/07w/notes/vi_basics.txt | cat -n | sort -nr | tac | cut -b 8- | tail -32

Break the line at any blank (just before a pipe is best) and end the
preceding line with a backslash.  Multiple commands on a pipeline may be
placed on separate lines for readability:

    wget -O - http://teaching.idallen.com/net2003/07w/notes/vi_basics.txt \
        | cat -n \
        | sort -nr \
        | tac \
        | cut -b 8- \
        | tail -32

No trailing blanks are allowed after the backslash; otherwise, the
backslash simply escapes the blank, not the newline.
The vim command ":set list" will show trailing blanks and tabs
(":set nolist" turns the display off again).
This vim search pattern finds blanks at line end:   / $

Q: How do you split long pipelines for maximum readability?
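
The folding rule can be tried without network access.  This is only a
minimal sketch: the four sample numbers fed in by printf are invented
for illustration, and any command that writes lines would do.  It runs
the same pipeline folded and unfolded and shows that the results match.

```shell
#!/bin/sh
# Sample input is invented for this sketch.
# Each folded line ends with a backslash (no trailing blanks after it)
# and the next line starts with the pipe, so every stage is easy to see.
folded=$(printf '%s\n' 30 4 100 21 \
    | sort -n \
    | tail -2 \
    | tr '\n' ' ')

# The same pipeline written on one line, for comparison.
unfolded=$(printf '%s\n' 30 4 100 21 | sort -n | tail -2 | tr '\n' ' ')

echo "folded:   [$folded]"
echo "unfolded: [$unfolded]"
```

The shell also continues a command when a line ends with the pipe
character itself, so that style needs no backslash at all.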
More on sorting:
----------------
    see Notes: miscellaneous.txt - Miscellaneous Unix/Linux Facts

You must use a numeric sort if you want numbers in order!

    $ ls -s | sort -nr

    $ seq 1 2 5 >a ; seq 10 20 50 >>a ; seq 2 2 6 >b ; seq 20 20 60 >>b
      - file a: 1,3,5,10,30,50    b: 2,4,6,20,40,60
    $ sort a ; sort b ; sort a a ; sort a b a b
    $ sort -n a ; sort -n b

Q: What option to sort generates a numeric sort based on a leading number?

Arguments, options, and file names
----------------------------------
    see Notes: arguments_and_options.txt - Options and Arguments on Command Lines

Note the difference between command names, options, and file names.

    $ wc -wc wc >ls ; ls -ls ls >wc

Q: How does the shell split a command line into arguments?

Commands for selecting parts of lines (fields)
----------------------------------------------
    see Notes: data_mining.txt - Using commands and pipes to "mine" and extract

cut - cut out part of the text on every line

    $ last | cut -c 1-8 | sort | uniq -c | sort -nr | head -5

awk - a programming language, useful for manipulating fields

    awk '{print $5}' - print the 5th blank-delimited field on every line
    $ last | awk '{print $1}' | sort | uniq -c | sort -nr | head -5

    awk '{print $5,$7}' - print the 5th and 7th blank-delimited fields on
                          every line
    $ last | awk '{print $1,$3}' | sort | uniq -c | sort -nr | head -5

Don't confuse the '$5' passed to awk with the '$5' used in shell scripts
to select the 5th command-line argument.

Differences between cut and awk:

1. The default field delimiter for "cut" is the TAB character, not spaces.
   - use the -d option to set a different delimiter
2. In "cut" (not awk), every delimiter counts: two adjacent delimiters
   enclose an empty field.

Usually "awk" does what your eyes expect when extracting fields.
Avoid "cut" for fields unless you know what you're doing.

Q: How do you print the third blank-delimited field on every line?
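
The difference between cut and awk shows up as soon as a line contains
two blanks in a row.  The following is a small sketch; the sample line
is invented and only imitates the look of "last" output.

```shell
#!/bin/sh
# Sample line invented for this sketch: note the TWO blanks after "alice".
line='alice  pts/0 10.0.0.5'

# cut counts every delimiter, so its second field is the empty
# string that sits between the two adjacent blanks.
cut_field=$(printf '%s\n' "$line" | cut -d ' ' -f 2)

# awk treats any run of blanks as a single separator.
awk_field=$(printf '%s\n' "$line" | awk '{print $2}')

# The third blank-delimited field, selected with awk.
third=$(printf '%s\n' "$line" | awk '{print $3}')

echo "cut -d ' ' -f 2 gives [$cut_field]"
echo "awk '{print \$2}' gives [$awk_field]"
echo "awk '{print \$3}' gives [$third]"
```

The single quotes around the awk program keep the shell from expanding
$2 and $3 as script arguments before awk ever sees them.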
To compare files:

    diff - compare text files and show differences
    cmp  - compare binary files and indicate the first different byte

Q: How do I compare text files?
Q: How do I compare binary files?

Commands for selecting and manipulating lines
---------------------------------------------
    see Notes: data_mining.txt - Using commands and pipes to "mine" and extract

Many commands read standard input and write standard output, allowing
them to be used in command pipelines to do data mining:

    Select lines from text streams:
        grep, awk, sed, head, tail, look, uniq, comm, diff
    Select fields in lines or parts of lines:
        awk, sed, cut
    Transform text (change characters or words in lines):
        awk, sed, tr

Mining for content in the course notes using "grep -l":

    $ grep -l ssh ~idallen/public_html/teaching/net2003/07w/notes/*.txt
    /home/idallen/public_html/teaching/net2003/07w/notes/knoppix_booting.txt
    /home/idallen/public_html/teaching/net2003/07w/notes/terminal.txt
    /home/idallen/public_html/teaching/net2003/07w/notes/unix_command_list.txt
    /home/idallen/public_html/teaching/net2003/07w/notes/vi_basics.txt
    /home/idallen/public_html/teaching/net2003/07w/notes/week01notes.txt
    /home/idallen/public_html/teaching/net2003/07w/notes/week03notes.txt

Q: Output only line 7 from a file.
Q: Output only the third blank-delimited field from line 7 from a file.
Q: Find all lines matching a given pattern in a file.
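
One possible way to practise these selections, assuming a throwaway
nine-line file built just for the purpose (the file contents and the
search pattern are invented for this sketch, and mktemp is assumed to
be available, as it is on Linux):

```shell
#!/bin/sh
# Build a temporary 9-line sample file; name and contents are invented.
tmp=$(mktemp)
i=1
while [ "$i" -le 9 ]; do
    echo "line $i field$i"
    i=$((i + 1))
done >"$tmp"

# Only line 7: -n suppresses sed's normal output, 7p prints line 7.
line7=$(sed -n '7p' "$tmp")

# Third blank-delimited field of line 7: NR is awk's current line number.
field3=$(awk 'NR == 7 {print $3}' "$tmp")

# All lines matching a pattern (here: last field is field3 or field5).
matches=$(grep 'field[35]$' "$tmp")

echo "$line7"
echo "$field3"
echo "$matches"
rm -f "$tmp"
```

A pipeline such as "head -7 file | tail -1" is another common way to
pick out line 7, provided the file has at least seven lines.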
Shell Quoting
-------------
    see Notes: quotes.txt - Unix/Linux Shell Command Line Quoting

The problem - the shell splits on blanks:

    $ grep -l GLOB pattern ~idallen/public_html/teaching/net2003/07w/notes/*.txt
    grep: pattern: No such file or directory
    /home/idallen/public_html/teaching/net2003/07w/notes/finding_files.txt
    /home/idallen/public_html/teaching/net2003/07w/notes/glob_patterns.txt
    /home/idallen/public_html/teaching/net2003/07w/notes/home_and_HOME.txt
    /home/idallen/public_html/teaching/net2003/07w/notes/lab03.txt
    /home/idallen/public_html/teaching/net2003/07w/notes/order_of_processing.txt
    /home/idallen/public_html/teaching/net2003/07w/notes/shell_basics.txt
    /home/idallen/public_html/teaching/net2003/07w/notes/week03notes.txt
    /home/idallen/public_html/teaching/net2003/07w/notes/week04notes.txt

The solution - hide the blank from the shell with quotes or a backslash:

    $ grep -l "GLOB pattern" ~idallen/public_html/teaching/net2003/07w/notes/*.txt
    /home/idallen/public_html/teaching/net2003/07w/notes/week03notes.txt
    /home/idallen/public_html/teaching/net2003/07w/notes/week04notes.txt

    $ grep -l GLOB\ pattern ~idallen/public_html/teaching/net2003/07w/notes/*.txt
    /home/idallen/public_html/teaching/net2003/07w/notes/week03notes.txt
    /home/idallen/public_html/teaching/net2003/07w/notes/week04notes.txt

Quoting the metacharacters, to stop the shell from helping you:

- quotes and backslashes hide things from the shell
- quotes delimit each argument; they tell the shell where it starts/ends
- quotes are not counted as part of the argument

    $ echo hi | wc ; echo "hi" | wc ; echo 'hi' | wc
          1       1       3
          1       1       3
          1       1       3

- two different strengths of quotes
  - single quotes hide all characters from the shell
  - double quotes hide everything *except* ", \, `, and $variable expansions
  - only single quotes stop backslash handling and $variable expansion
  - both kinds stop GLOB expansion and splitting on blanks
- you can also use backslash to hide single characters from the shell
- use argv.sh to see how the shell creates arguments (can also use ls -d)
  (fetch argv.sh.txt from the course Notes area and make it executable)
    Notes: argv.sh.txt - Display the individual arguments on the command line.

    $ echo " abc "' def ' " ghi " ' jkl '
    $ ./argv " abc "' def ' " ghi " ' jkl '

    $ mkdir empty ; cd empty ; touch a b c d
    $ ./argv ' * '
    $ ./argv '" * "'
    $ ./argv '"' * '"'
    $ ./argv '"'" * "'"'
    $ ./argv ' * ' * " * "
    $ ./argv \' * \'

Q: How do you use quotes to prevent GLOB expansion by the shell?
Q: How do you use quotes to prevent $variable expansion by the shell?

Shell $variable substitutions
-----------------------------
    see Notes: shell_variables.txt - Variables you should know

    #!/bin/sh -u
    echo "first argument is $1"
    echo "all arguments are $*"
    echo "number of arguments is $#"

Don't confuse the '$5' passed to awk with the '$5' used in shell scripts
to select the 5th command-line argument.

Q: How do you reference command line arguments inside a shell script?
Q: How do you set a shell variable?  How do you expand it?
Q: Why must you double-quote shell variables?

Command substitutions: treating commands as variables
-----------------------------------------------------
    see Notes: command_substitution.txt - Command Substitution - $(unix command)

See /etc/init.d/apache2 for a sample network start-up script that uses
both back-quotes ("`") and $(cmd) syntax in the same file:

    PIDFILE=`grep -i ^PidFile $i | tail -n 1 | awk '{print $2}'`
    CNT=$(expr $CNT + 1)

Q: How do you execute a command and save its output in a variable?

Where does the shell look for command names? $PATH
---------------------------------------------------
    see Notes: search_path.txt - Shell search PATH

New commands:

    which   - which $PATH entry contains this executable file name?
    whereis - where is this executable file in the system
              (including the man page)

"which" looks in $PATH; "whereis" does not.

Q: How do I change the directories where the shell looks for commands?
Q: Where does the shell look for the executable "date" program?
Q: What command shows which program will execute when I type "date"?

Shell order of expansion - what happens first?
----------------------------------------------
    see Notes: order_of_processing.txt - Order of Shell Command Line processing

CRITICAL POINT: Double-quote all your $variable and $(command) expansions!

    see Notes: data_mining.txt - Using commands and pipes to "mine" and extract
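
Why the double quotes matter can be seen by counting arguments.  This
is a rough sketch for sh or bash with the default IFS; count_args is a
hypothetical helper written only for this illustration, and the sample
values are invented.

```shell
#!/bin/sh
# count_args is a hypothetical helper: it prints how many arguments it got.
count_args () {
    echo "$#"
}

# A value containing blanks, invented for this sketch.
name='week 06 notes.txt'

# Unquoted: the shell splits the expanded value on blanks.
unquoted=$(count_args $name)

# Double-quoted: the expanded value stays one argument.
quoted=$(count_args "$name")

# The same applies to $(command) substitutions.
cmd_unquoted=$(count_args $(echo 'a  b'))
cmd_quoted=$(count_args "$(echo 'a  b')")

echo "unquoted=$unquoted quoted=$quoted"
echo "cmd_unquoted=$cmd_unquoted cmd_quoted=$cmd_quoted"
```

An unquoted expansion is also subject to GLOB matching after it is
split, which is a second reason to keep the double quotes.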