-------------------------
Week 04 Notes for NET2003
-------------------------
-Ian! D. Allen - idallen@idallen.ca - www.idallen.com

Remember - knowing how to find out an answer is more important than
memorizing the answer.  Learn to fish!  RTFM!  (Read The Fine Manual)

A 50-minute midterm test is at 10AM next Tuesday (Week 5).
A 50-minute lecture follows the midterm at 11AM as usual.

Why learn scripting and command line?
-------------------------------------

6 of 9 BIT/NET co-op students needed scripting on the job, and they want
more scripting courses added to BIT/NET.

- See /etc/init.d/httpd
- the http server dies every hour during critical period
  - stay up all night, or
  - write a script to restart it if it is missing:

        while : ; do
            if ! pgrep httpd ; then
                /etc/init.d/httpd restart
            fi
            sleep 10
        done

- tell the machine to do the work!

----------------------------------------------------------------------------

Review:
- you know what a shell is for
- you can navigate the file system using . and ..
- you can generate simple GLOB patterns for the shell
- you know the basic function of a list of Unix/Linux commands
- you can sort lines alphabetically or numerically
- you can see the modification time of the contents of a directory or
  of the directory itself

Unix/Linux file system is hierarchical, based at a nameless ROOT directory
   see Notes: pathnames.txt - Unix/Linux Pathnames
Q: Give the absolute/relative paths from current dir to HOME, /home, ROOT, etc.
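
A minimal sketch of absolute versus relative pathnames. It builds a
throw-away directory tree so nothing depends on your own system; the names
"top", "alice", and "docs" are made up for illustration and are not from the
course notes. An absolute pathname starts at ROOT (a leading slash); a
relative pathname starts at the current directory and may climb with "..".

```shell
# Build a small sandbox tree; "$top" is a hypothetical stand-in for ROOT.
top=$(mktemp -d)
mkdir -p "$top/home/alice/docs"
cd "$top/home/alice/docs" || exit 1

here=$(pwd -P)                         # absolute pathname of the current directory
rel_parent=$(cd .. && pwd -P)          # relative pathname: one level up
rel_home=$(cd ../.. && pwd -P)         # relative pathname: two levels up
abs_home=$(cd "$top/home" && pwd -P)   # absolute pathname to the same directory

echo "current dir : $here"
echo "../..       : $rel_home"
echo "absolute    : $abs_home"
```

Both the relative pathname ../.. and the absolute pathname should name the
same directory, so the last two lines printed should match.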
The shell always tries to match GLOB patterns against pathnames
   see Notes: glob_patterns.txt - GLOB patterns (wildcard pathname matching)

Course Notes to read
--------------------

    arguments_and_options.txt
    glob_patterns.txt
    miscellaneous.txt
    shell_prompt.txt
    unix_command_list.txt

----------------------------------------------------------------------------

Arguments, options, and file names
----------------------------------

see Notes: arguments_and_options.txt - Options and Arguments on Command Lines

Note the difference between command names, options, and file names.

    $ wc -wc wc >ls ; ls -ls ls >wc

Q: How does the shell split a command line into arguments?

----------------------------------------------------------------------------

More Commands: see Notes file unix_command_list.txt

 - keep a list of commands and their common uses
 - keep a one-line summary of what any of the listed commands do
 - know how to read man pages (man page syntax meaning)
 - here are some new command names and options (add these to your list):
   - "man -k" and "apropos" look up keywords in manual page titles
   - "mkdir" creates a new directory (but does not change to it)
   - "rmdir" deletes (removes) an empty directory
     - use "rm -r" to remove recursively a (non-empty) directory
   - "touch" creates a new empty file or updates its modify time to now
   - "hostname" tells (or sets) the name of this computer
   - "du" shows disk block usage under the current directory
   - "sum" shows a quick checksum of file contents (see also "md5sum")
     - two files with the same checksum *probably* have identical content
   - "file" tells what kind of file a pathname is
     - helpful if the file is a compressed or executable file
   - "find" lists recursively all the pathnames under the current directory
     - contrast it with "ls" that shows only the current directory
     - find has lots of other options to find things by name, size, etc.
     - see also "slocate" for a faster way to find pathnames
       See Class Notes: finding_files.txt
   - "grep" searches for a pattern inside (text) files
   - "head" shows lines at the head (top) of a file
   - "tail" shows lines at the end (tail) of a file
     - useful to see the end of a system log file
   - "w" also shows who is logged in to this machine (similar to "who")
   - "bash" starts a new copy of the shell, until you "exit"
     - shell scripts are executed by a new copy of the shell
   - "chmod +x filename" to make a (script) file executable
   - "last" shows a list of lines that are the recent logins to this machine
   - head(-5) tail(-5) apropos(man -k)
   - wc(-lwc) ls(-ld) grep(-v) sort(-nr) rm(-r)
   - "sort" sorts alphabetically (usually ASCII collating order)
     - need to use -n option to sort by leading number
     - other options change sort direction, etc.
     - see "sort" in Notes: miscellaneous.txt - Miscellaneous Unix/Linux Facts

You must use a numeric sort if you want numbers in order!

    $ ls -s | sort -nr
    $ seq 1 2 5 >a ; seq 10 20 50 >>a ; seq 2 2 6 >b ; seq 20 20 60 >>b
      - file a: 1,3,5,10,30,50   b: 2,4,6,20,40,60
    $ sort a ; sort b ; sort a a ; sort a b a b
    $ sort -n a ; sort -n b

Q: What option to sort generates a numeric sort based on a leading number?

 - "ls -ld" is needed to see the attributes of a directory (instead of
   seeing the attributes of the directory contents)
 - don't send binary (executable machine code) files to your terminal screen
   - check the contents with the "file" command first
   - to unscramble your screen, see Notes: miscellaneous.txt

============================================================================

Redirection of Input and Output by the shell
--------------------------------------------

see the course Notes file:
  redirection.txt - Unix Shell I/O Redirection (including Pipes)

The "standard input" of most programs is usually your keyboard.
  - this is "unit 0" to the shell:  cat 0<file
The "standard output" of most programs is usually your screen.
  - this is "unit 1" to the shell:  date 1>file
The "standard error" output of most programs is usually your screen.
  - this is "unit 2" to the shell:  ls * 2>file

You can easily tell the shell to redirect the standard input and/or output
of any command before it runs the command.

 * Redirection to and from single files:

 - if a program normally produces output on your screen, the shell can
   redirect that output to go to a file instead
   - but the program has to be one that would have produced output!
   - simple output redirection to a file:     sort /etc/passwd >out
 - if a program normally reads your keyboard, the shell can also redirect
   input to come from a single file instead
   - but the program has to be one that would have read the keyboard!
   - simple input redirection from a file:    sort <out
   - shells are designed to issue prompts before reading your keyboard
     - most other programs do not prompt you

 * Redirection between two or more programs (Unix pipes):

 - a pipe is shell redirection from one program into another program
   without the use of an intervening temporary file:
     COMPARE:   date > out ; wc out ; cp a b | wc
 - redirection is done first by the shell and is removed from command line
     $ echo a b c ; echo a b c | wc ; echo a b c >out
 - redirection is done by the shell *BEFORE* the command runs!
   - never redirect output into a file used as input!
     $ head /etc/passwd >out ; sort out
     $ head /etc/passwd >out ; sort out >out   # BAD! SHELL OVERWRITES OUT FIRST
     $ rm out >out     # no error message, even if out did not exist first
     $ date >out ; ls -l out >out              # BAD! SHELL OVERWRITES OUT FIRST
 - redirection is never passed to the command; it is never a command argument
 - never use redirection to direct output into an input file!
 - you can append to a file, instead of overwrite, using >>

The shell does not care where in the command line you put the file
redirection; it is always found, done, and removed before the command runs:

    $ date >wc ; >wc date

Note the special syntax for making standard error go to "the same place"
as standard output:
 - the 2>&1 must appear somewhere to the right of the >out on the line

    $ sum *.c >out 2>&1
    $ sum >out *.c 2>&1
    $ >out sum 2>&1 *.c
    $ >out 2>&1 sum *.c

You can "throw away" unwanted output by redirecting it to /dev/null

    $ find / 2>/dev/null    # suppress only the error messages
    $ cat * >/dev/null      # only show errors; throw away standard output

You can only redirect what you can see.

see Notes: redirection.txt - Unix Shell I/O Redirection (including Pipes)

Folding long lines in scripts
-----------------------------

Instead of writing this unreadable long line in a script:

    wget -O - http://teaching.idallen.com/net2003/08w/notes/vi_basics.txt | cat -n | sort -nr | tac | cut -b 8- | tail -32

Break the line on any spaces (just before pipes is best) and end the
preceding line with a backslash.  Multiple commands on a pipeline may be
placed on separate lines for readability:

    wget -O - http://teaching.idallen.com/net2003/08w/notes/vi_basics.txt \
    | cat -n \
    | sort -nr \
    | tac \
    | cut -b 8- \
    | tail -32

No trailing blanks are allowed after the backslash; otherwise, the backslash
simply escapes the blank, not the newline.  The vim command ":set list"
will show trailing blanks and tabs. (":set nolist")
This vim search pattern finds blanks at line end:  / $

Q: How do you split long pipelines for maximum readability?

Commands for selecting parts of lines (fields)
----------------------------------------------

 * see Notes: data_mining.txt - Using commands and pipes to "mine" and extract

The "last" command shows a list of lines that are the recent logins to
this machine.
Each login line has this format:

    idallen pts/9 cpe000c4185badf- Tue Jan 29 21:37 still logged in

We use this type of "last" command output in these examples:

cut - cut out part of the text on every line

   In this example we cut out just the first eight characters:

    $ last | cut -c 1-8 | sort | uniq -c | sort -nr | head -5

awk - a programming language, useful for manipulating fields
    - awk splits lines on blank-separated fields, the way you expect

   awk '{print $5}'
      - print the 5th blank-delimited field on every line

    $ last | awk '{print $1}' | sort | uniq -c | sort -nr | head -5

   awk '{print $5,$7}'
      - print the 5th and 7th blank-delimited fields on every line

    $ last | awk '{print $1,$3}' | sort | uniq -c | sort -nr | head -5

   Don't confuse the '$5' passed to awk with the '$5' used in shell scripts
   to select the 5th command-line argument.

Differences between cut and awk:

 1. The default field delimiter for "cut" is the TAB character, not spaces.
    - use the -d option to set a different delimiter
 2. In "cut" (not awk), every delimiter counts.

Usually "awk" does what your eyes expect when extracting fields.
Avoid "cut" for fields unless you know what you're doing.

Q: How do you print the third blank-delimited field on every line?
Q: How do "awk" and "cut" differ when splitting lines on blanks?
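
A small sketch of the "every delimiter counts" difference described above.
The sample line is made up in the style of "last" output (the host name
"example-host" and the exact padding are assumptions), and squeezing blanks
with "tr -s" is one possible workaround that these notes do not cover.

```shell
# A made-up login line; note the run of blanks after each field.
line='idallen  pts/9        example-host     Tue Jan 29 21:37   still logged in'

# awk treats any run of blanks as one separator, so field 2 is the terminal.
awk_field=$(printf '%s\n' "$line" | awk '{print $2}')

# cut counts every single blank as a delimiter, so "field 2" is the empty
# string that lies between the two blanks following "idallen".
cut_field=$(printf '%s\n' "$line" | cut -d ' ' -f 2)

# Squeezing repeated blanks with tr first makes cut agree with awk.
squeezed_field=$(printf '%s\n' "$line" | tr -s ' ' | cut -d ' ' -f 2)

echo "awk: [$awk_field]  cut: [$cut_field]  tr+cut: [$squeezed_field]"
```

With this sample line, awk and the tr+cut pipeline both pick out pts/9,
while plain cut returns an empty field.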