====================
The Unix/Linux Shell
====================
-IAN! idallen@idallen.ca

Some basic Shell concepts.

Contents:
 * What is a shell for?
 * Most (but not all) commands take what as arguments?
 * How does the shell help run commands?
 * What is a "Bourne" shell?  What is a "C" shell?
 * Basic Command Syntax
 * Shell command line aliases
 * Using commands as Filters
 * Command line order of processing

--------------------
What is a shell for?
--------------------

To find and run programs. ("Programs" are also called commands or
utilities.)

Shells also do programming kinds of things; but, that programming is
usually to aid in the finding and running of programs, not to do any
kind of substantial mathematical or business calculations.

Part of running a program involves supplying command line arguments to
that program; the shell helps with that, too.

---------------------------------------------------
Most (but not all) commands take what as arguments?
---------------------------------------------------

Most - but not all - Unix commands take pathnames as arguments. Much of
what people do online is manipulate files and directories.

"Pathnames" are names that might be file names or directory names.
(Unix also has names that are neither files nor directories, e.g. the
/dev/null pathname is a "character special" device.)

The shell has wildcard features to make matching pathnames easier.

-------------------------------------
How does the shell help run commands?
-------------------------------------

Command names are almost always the names of executable files.

Shells look for command names in various places, using a list of
directories stored in the $PATH environment variable.

Shells provide aliases and variables to save typing the same things
(commands or pathnames) over and over.

Shells provide wildcards (GLOB patterns) to generate lists of pathnames
as arguments for commands.

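The $PATH search mentioned above can be sketched in the shell itself.
This is only an illustration of the idea, not how any real shell is
implemented; the function name find_in_path is made up for this example.
Real shells also check aliases, functions, and built-in commands before
searching $PATH. (The standard "command -v NAME" will show you what the
shell would actually run.)

```shell
#!/bin/sh
# find_in_path NAME  (hypothetical helper, for illustration only)
# Print the first executable file called NAME found in the directories
# listed in $PATH.  The body runs in a subshell (note the parentheses),
# so the changes to IFS and "set -f" do not leak into the caller.
find_in_path() (
    name=$1
    set -f      # turn off GLOB expansion while we split $PATH
    IFS=:       # $PATH is a colon-separated list of directories
    for dir in $PATH; do
        [ -n "$dir" ] || dir=.    # an empty entry means current directory
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            exit 0                # first match wins; stop searching
        fi
    done
    exit 1                        # not found in any $PATH directory
)

find_in_path ls || echo "ls: not found in PATH"
```

The search stops at the first directory that has a match, which is why
the order of directories in $PATH matters.
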
Shells provide a "history" mechanism to recall and edit the commands
you have recently entered, to save retyping them.

Shells provide ways of completing command and file names, to save
typing.

-----------------------------------------------
What is a "Bourne" shell?  What is a "C" shell?
-----------------------------------------------

The shells sh, ksh, zsh, and bash (the "Bourne" shells) all have a
common ancestry. They are all derived from the original shell "sh"
written in the 1970's by Stephen Bourne. The programming features of
these shells (if statements, for loops, etc.) all look and work much
the same way. These are the best shells to study.

The shells csh and tcsh (the "C" shells) are similar to each other,
having a history dating back to Bill Joy at Berkeley in the late
1970's. Their syntax for programming is not the same as the Bourne
shells. We do not cover the C shell syntax in this course; these
shells are notoriously buggy.

--------------------
Basic Command Syntax
--------------------

Many Unix commands need both a VERB (what to do) and an OBJECT (what to
do it on).

The following attempts at Unix commands are incomplete:

    $ /etc/passwd       (missing VERB; what are you trying to DO?)
    $ cat               (missing OBJECT; catenate WHAT file?)

Remember to tell Unix both what you want to do and to what object you
wish to do it.

--------------------------
Shell command line aliases
--------------------------

Watch out for "helpful" system administrators who define aliases for
your shells when you log in. (This is true on most versions of Linux,
and on ACADUNIX.) The aliases may mislead you about how Unix commands
actually work. (For example, the "rm" command does *not* prompt you for
confirmation. On some systems, when you log in, "rm" is made to be an
alias for "rm -i", which *does* prompt.)

To avoid pre-defined aliases, sometimes you can start up a fresh copy of
the shell that has no aliases defined:

    $ alias
    [...many aliases may print here...]
    $ bash
    bash$ alias
    [...no more aliases here...]

The other thing you can do is execute "unalias -a" to remove all your
aliases for the current shell. You can put this into your shell
start-up file (e.g. .bashrc) to do it every time you start a new shell.

To define your own aliases, look up "aliases" in a Linux text index.
You must put your own alias definitions in a file to have them saved
between sessions (e.g. put them into your .bashrc file).

-------------------------
Using commands as Filters
-------------------------

Note that many Unix commands can act as filters - reading from stdin and
writing to stdout. With no file names on the command line, the commands
read from standard input and write to standard output. (You can
redirect both.)

    $ grep "/bin/sh" /etc/passwd | sort | head -5

The "sort" and "head" commands are acting as filters; they are reading
from stdin and writing to stdout. The "grep" command is technically not
a filter here - it is reading from the supplied argument pathname, not
from stdin.

If file names are given on the command line, the commands almost always
ignore standard input and only operate on the file names.

    $ grep "/bin/sh" /etc/passwd | sort | head -5 /etc/passwd

This is the same command line as the above example, except the "head"
command is now ignoring standard input and is reading from its filename
argument. The "grep" and "sort" commands are doing a lot of work for
nothing, since "head" is not reading the output of sort.

*** Commands ignore standard input if they are given file names to read. ***

If a command does read from file names supplied on the command line, it
is more efficient to let it open its own file name than to use "cat" to
open the file and feed the data to the command on standard input.
(There is less data copying done!)

Advice: Let commands open their own files; don't feed them with "cat".

Do this:

    $ head /etc/passwd
    $ sort /etc/passwd

Do not do this (wasteful of processes and I/O):

    $ cat /etc/passwd | head     # <- DO NOT DO THIS - INEFFICIENT
    $ cat /etc/passwd | sort     # <- DO NOT DO THIS - INEFFICIENT

Examples using pipes and filters:

Problem: "Count the number of each kind of shell in /etc/passwd."

    $ cut -d : -f 7 /etc/passwd | sort | uniq -c

    - the cut command picks out field 7 in the password file
    - the sort command puts all the shell names in order
    - the uniq command counts the adjacent names

Problem: "Count the number of each kind of shell in /etc/passwd and
display the results sorted in descending numeric order."

    $ cut -d : -f 7 /etc/passwd | sort | uniq -c | sort -nr

    - sort the above output numerically and in reverse

Problem: "Count the number of each kind of shell in /etc/passwd and
display the top two results sorted in descending numeric order."

    $ cut -d : -f 7 /etc/passwd | sort | uniq -c | sort -nr | head -2

    - pick off only the top two lines of the above output

--------------------------------
Command line order of processing
--------------------------------

The shell parses and changes a command line in a particular order. That
order is:

 1. quote processing and initial blank splitting into tokens and
    individual commands (splitting on semicolons and pipe characters)
 2. look for pathname input/output redirection
 3. look for variables (splitting unquoted variable contents on blanks!)
 4. look for GLOB patterns and match against pathnames

What this means is that you can't put a working pathname redirect inside
a variable, because the shell looks for redirection *before* the shell
looks for and expands variables:

    $ x="> out"
    $ echo hi $x          # <- the shell doesn't find any redirection
    hi > out

You can't put a working quote inside a variable, because the shell looks
for quotes before the shell looks for and expands variables:

    $ x="'"
    $ touch a b c
    $ echo $x * $x        # <- the shell doesn't see any quotes
    ' a b c '

Even if a pathname looks like a shell variable, it won't be expanded as
a variable by the shell because the shell looks for variables to expand
before it looks for and processes GLOB patterns against pathnames:

    $ touch '$x'
    $ x=foo
    $ echo *              # <- the shell doesn't find any variable to expand
    $x

The shell does the GLOB expansion *after* it has already done all the
other processing; none of the characters in a pathname are treated
specially (even blanks).

If a pathname contains blanks or other special shell metacharacters
(e.g. spaces, semicolons, parentheses), none of these characters will be
treated as special by the shell because the shell does all that special
character processing before it looks for and processes GLOB patterns:

    $ touch "file with spaces"
    $ rm file*            # <- shell does not see any spaces in the GLOB name

    $ touch "date ; who"  # <- filename containing blanks and semicolon
    $ echo *              # <- shell does not see any semicolon
    date ; who

The shell does the GLOB expansion *after* it has already done all the
blank and semicolon processing; none of the characters in a pathname are
treated specially (not blanks, not redirection, not semicolons).

What you must remember is that the reverse of all the above *is* true.

The most critical thing to remember is that if an unquoted variable
contains a GLOB pattern or spaces, the GLOB pattern *will* be expanded,
and the interpolated text *will* be split on blanks, after the unquoted
variable is expanded:

    $ x='*'
    $ touch a b c
    $ echo $x             # <- shell expands $x, then expands * GLOB
    a b c

    $ touch 'file with spaces'
    $ y='file with spaces'
    $ ls $y               # <- shell expands $y, then splits on blanks
    ls: file: No such file or directory
    ls: with: No such file or directory
    ls: spaces: No such file or directory

To prevent this, you must always double-quote all uses of variables:

    $ echo "$x"           # <- shell expands $x, quotes hide GLOB
    *
    $ ls "$y"             # <- shell expands $y, quotes hide blanks
    file with spaces

Inside double quotes, shell variables expand but GLOB patterns do not.
Inside double quotes, spaces are not seen by the shell - the text
interpolated by the variable expansion is not split on blanks.

Always double-quote your variables.
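
One way to see the difference for yourself is to count how many
arguments a command actually receives. The script below is a small
sketch, not part of the original notes: the helper function count_args
is made up for this example, and the script assumes your system has the
"mktemp -d" command for creating a scratch directory (most Linux systems
do). It works in an empty scratch directory so that the GLOB results
are predictable.

```shell
#!/bin/sh
# Work in an empty scratch directory so we know what * will match.
# (Assumes "mktemp -d" is available; it is common but not universal.)
old_pwd=$PWD
tmpdir=$(mktemp -d) || exit 1
cd "$tmpdir" || exit 1
touch a b c 'file with spaces'      # four pathnames in total

# count_args: hypothetical helper that prints how many arguments
# the shell passed to it after all expansion and splitting is done.
count_args() { echo "$#"; }

x='*'
y='file with spaces'

unquoted_glob=$(count_args $x)      # $x expands, then * matches 4 names
quoted_glob=$(count_args "$x")      # quotes hide the GLOB: 1 argument
unquoted_split=$(count_args $y)     # $y expands, then splits into 3 words
quoted_split=$(count_args "$y")     # quotes hide the blanks: 1 argument

echo "unquoted \$x: $unquoted_glob arguments"
echo "quoted   \$x: $quoted_glob argument"
echo "unquoted \$y: $unquoted_split arguments"
echo "quoted   \$y: $quoted_split argument"

# Clean up the scratch directory.
cd "$old_pwd" || cd /
rm -r "$tmpdir"
```

Only the double-quoted uses hand the command exactly one argument, which
is almost always what you intended.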