========================
Miscellaneous Unix Facts
========================
-Ian! D. Allen - idallen@idallen.ca

This file is a review of some basic Unix features.  It contains:

  1. Notes on GNU and Linux
  2. Basic Command Syntax
  3. Getting Out of Programs
  4. Unscrambling your Terminal
  5. EOF and Interrupting Processes
  6. TELNET to Unix from Windows
  7. Notes on FTP command syntax
  8. Locked Out of Unix/Linux Home
  9. Aliases
 10. Using commands as Filters
 11. Understanding the different types of "sort":
 12. Using "script" to create a session log
 13. Line endings on Unix, Windows, and Macintosh

----------------------
Notes on GNU and Linux
----------------------

GNU - Gnu's Not Unix
  - GNU is a Free Software Foundation (FSF) project
  - rewrote Unix as free (libre) software (the way it started out)
  - chief architect: Richard Stallman (original author of EMACS)

Linux "distribution" == Linux Kernel + GNU Utilities
  - The GNU kernel (named HURD) hasn't progressed far
  - So, we use the Linux kernel with all the Unix-compatible GNU
    utility software

--------------------
Basic Command Syntax
--------------------

Many Unix commands need both a VERB (what to do) and an OBJECT (what to
do it to).  The following attempts at Unix commands are wrong:

   $ /etc/passwd        (missing VERB; what are you trying to DO?)
   $ cat                (missing OBJECT; catenate WHAT file?)

Remember to tell Unix both what you want to do and to what object you
wish to do it.

-----------------------
Getting Out of Programs
-----------------------

Getting out of Unix programs; or, getting help:

Try various things such as:

   help
   ?
   quit
   Q
   exit
   x
   ^D       (CTRL-D)
   ^C       (CTRL-C)
   :q!      (used in VI to exit without saving)
   logout
   bye
   ESC      (the ESC key)
   .        (a period)
   ^\       (CTRL-backslash)

One of the above usually works.

Sometimes you can use ^Z (CTRL-Z) to "stop" the process temporarily, and
then type "kill %%" to kill it.  (Remember to kill it, or it will sit
there waiting forever.)

--------------------------
Unscrambling your Terminal
--------------------------

How to unscramble a terminal emulator that is in graphics character set
mode, where you see many special and line-drawing characters instead of
your typed text.  (This might happen after you accidentally use "cat" to
send a non-text file to your terminal screen, e.g. "cat /bin/ls" or
"cat file.gz".)

To Fix (you may not be able to read what you are typing!):

  - On a Linux system type:   $ setterm -reset
  - On ACADUNIX try:          $ reset
  - On either type:           $ echo ^V^O      (that's CTRL-V CTRL-O)

The above should switch your terminal emulator back to its normal
character set.  Practice this now, in case it happens to you during a
Lab quiz!

------------------------------
EOF and Interrupting Processes
------------------------------

To send an interrupt signal to a process that is running on your
terminal, use the Interrupt Character, usually CTRL-C (^C).
(You can program a different character; sometimes DEL is used.)

The Interrupt usually throws away any pending input/output for the
process and causes the process to terminate abnormally.  Use it with
caution.  Whatever the process was doing is left incomplete and
unfinished.  (Files will be incomplete.)

Your EOF character, signalling end of input, is usually CTRL-D (^D).
You must type this at the beginning of a line (or type it twice).
(You can program a different character; but, it is almost never done.)

Sending the EOF character tells the process that you are finished typing
at your terminal; but, it does not interrupt or terminate the process.
The process will finish whatever it is doing and exit cleanly.

Example showing how ^C interrupts a program before it finishes, and how
^D lets the same program finish normally:

   $ sort >out
   enter some lines of text and then
   interrupt the program and you will
   get an empty file
   ^C
   $ cat out
   $

   $ sort >out
   enter some lines of text and then
   use an EOF to signal end of file
   and sort will sort your data
   ^D
   $ cat out
   and sort will sort your data
   enter some lines of text and then
   use an EOF to signal end of file
   $

---------------------------
TELNET to Unix from Windows
---------------------------

I don't recommend using the Windows TELNET clients - they display the
screen poorly and students often forget to set up their windows
correctly, which can corrupt the files they edit.

If you use TELNET under Windows, you must configure your Windows TELNET
client to connect correctly to a Unix system:

  1) drag TELNET window to full size (it will not expand further)
  2) set terminal type: vt100
  3) set lines: 24

Review the Notes file: telnet_usage.html

-----------------------------
Locked Out of Unix/Linux Home
-----------------------------

If you find yourself unable to access your home directory, with
permission errors such as the following:

   $ ls
   ls: .: The file access permissions do not allow the specified action.

You have probably removed either read or execute permissions from your
home directory.  To restore these permissions for your userid, use this:

   $ chmod u+rwx $HOME

Details on the chmod command are available in the Unix manual pages.
The environment variable $HOME expands to be your home directory.

You must be able to read your directory, to see what file names are in
it.  You must have execute permission on a directory to pass through it
to any of its contents.  You need both read and execute for "ls ." to
work.

-------
Aliases
-------

Watch out for "helpful" system administrators who define aliases for
your shells when you log in.  (This is especially true on ACADUNIX!)
The aliases may mislead you about how Unix commands actually work.
(For example, the "rm" command does *not* prompt you for confirmation.
On some systems, "rm" is an alias for "rm -i", which *does* prompt.)

To avoid pre-defined aliases, start up a fresh copy of the shell:

   $ alias
   [...many ACADUNIX aliases print here...]
   $ bash
   bash$ alias
   [...no more aliases here...]

To define your own aliases, look up "aliases" in the Linux Text index.
You must put your alias definitions in a file to have them saved between
sessions (e.g. put them into your .profile or .bashrc files).

-------------------------
Using commands as Filters
-------------------------

Note that many Unix commands can act as filters - reading from stdin and
writing to stdout.  With no file names on the command line, the commands
read from standard input and write to standard output.  (You can
redirect both.)

   $ grep "/bin/sh" /etc/passwd | sort | head -5

The "sort" and "head" commands are acting as filters; they are reading
from stdin and writing to stdout.  The "grep" command is not acting as a
filter here - it is reading from the supplied argument pathname, not
from stdin.

If file names are given on the command line, the commands almost always
ignore standard input and only operate on the file names.

   $ grep "/bin/sh" /etc/passwd | sort | head -5 /etc/passwd

This is the same as the above example, except the "head" command is now
ignoring standard input and is reading from its filename argument.  The
"grep" and "sort" commands are doing a lot of work for nothing, since
"head" is not reading the output of sort.  Commands ignore standard
input if they are given file names to read.

If a command does read from file names supplied on the command line, it
is more efficient to let it open its own file name than to use "cat" to
open the file and feed the data to the command on standard input.
(There is less data copying done!)

Advice: Let commands open their own files; don't feed them with "cat".
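
The claim that a command ignores standard input once it is given a file
name can be checked with a small sketch like the one below.  It is only
an illustration: the temporary file (made with "mktemp", assumed to be
available on your system) and the variable names "as_filter" and
"with_file" are made up for this example and are not part of these notes.

```shell
# Sketch only: show that "head" ignores standard input when it is
# given a file name.  The temporary file and the variable names are
# hypothetical, chosen just for this illustration.
tmpfile=$(mktemp)
printf 'line from the file\n' >"$tmpfile"

# No file name: head acts as a filter and reads standard input.
as_filter=$(printf 'line from stdin\n' | head -n 1)

# File name given: head reads the file; the piped data is never read.
with_file=$(printf 'line from stdin\n' | head -n 1 "$tmpfile")

echo "as filter: $as_filter"
echo "with file: $with_file"
rm -f "$tmpfile"
```

The first echo prints "as filter: line from stdin" and the second prints
"with file: line from the file".  The data piped into the second "head"
is produced and then thrown away, which is the wasted work the advice
above is warning about.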

Do this:

   $ head /etc/passwd
   $ sort /etc/passwd

Do not do this (wasteful of processes and I/O):

   $ cat /etc/passwd | head    # DO NOT DO THIS - INEFFICIENT
   $ cat /etc/passwd | sort    # DO NOT DO THIS - INEFFICIENT

Problem: "Now, count the number of each kind of shell in /etc/passwd."

   $ cut -d : -f 7 /etc/passwd | sort | uniq -c

Problem: "Count the number of each kind of shell in /etc/passwd and
display the results sorted in descending numeric order."

   $ cut -d : -f 7 /etc/passwd | sort | uniq -c | sort -nr

Problem: "Count the number of each kind of shell in /etc/passwd and
display the top two results sorted in descending numeric order."

   $ cut -d : -f 7 /etc/passwd | sort | uniq -c | sort -nr | head -2

--------------------------------------------
Understanding the different types of "sort":
--------------------------------------------

The Unix sort command sorts lines by character, not by number.

Explain the difference in output of these two "sort" pipelines:

   $ list="1 11 2 22 3 33 4 44 3 33 2 22 1 11"
   $ echo "$list" | tr ' ' '\n' | sort
   $ echo "$list" | tr ' ' '\n' | sort -n

(The translate command "tr" is turning blanks into newlines so that the
numbers appear on separate lines on input to sort; sort only sorts
lines.)

Why is the sort output different in these two examples?  (RTFM)

--------------------------------------
Using "script" to create a session log
--------------------------------------

You can create a file log of everything you type and see on your screen
using the "script" command and giving it the name of a file into which
it will record your session:

   $ script saveme.txt
   Script started, file is saveme.txt
   $ date
   Thu Sep 23 02:19:59 EDT 2004
   $ echo hi there
   hi there
   $ who am i
   idallen  pts/1        Sep 19 20:07
   $ exit
   Script done, file is saveme.txt
   $

The file "saveme.txt" now contains a session log containing everything
you typed and everything that was printed on your screen.

You can use the vim editor to edit this file (to remove junk you don't
want) and then you can copy it to another machine for printing.  (See
the Notes file file_transfer.txt for information on moving files
around.)

Warning: If you use a full-screen editor such as vim inside a "script"
session, the recorded keystrokes and screen output will be a huge mess
when recorded in the session file.  Make sure you edit out this mess
before you print the file!  Don't use full-screen editors inside a
"script" session if you can avoid it.

The "script" command has an option to append to a session file instead
of overwriting it.  (RTFM)

See the BUGS section of the "script" manual page.  The command "col -b"
is useful for filtering out backspace characters in a file.  (The col
command only reads standard input; you cannot pass it file names on the
command line.)

--------------------------------------------
Line endings on Unix, Windows, and Macintosh
--------------------------------------------

C programmers will recognize that the line end character for Unix text
files is '\n' - an ASCII newline (NL) character.  (See "man ascii" for
details on the ASCII character set.)  Unix commands that count
characters in files and lines will also count the newline character at
the end of every line:

   $ echo hi | wc -c
   3
   $ echo hi >out ; echo ho >>out ; wc -c out
   6 out

Microsoft operating systems use two characters at the end of every line
of text - the NL is preceded by an ASCII carriage return (CR).  A
Windows text file containing the single line "hi" contains four
characters:

   h  i  CR  NL

A text file written on Unix contains only linefeed (LF, "\n") characters
at the ends of lines; Windows expects lines in text files to end in both
a carriage-return (CR, "\r") *and* a linefeed character.  This may
result in "staircasing" text if you send a Unix text file to a Windows
printer from inside some Windows programs (e.g. Notepad).

Older Apple computers (Macintosh systems running classic Mac OS, before
Mac OS X) use the single character CR instead of LF at the end of every
text line.  Mac OS X and later use LF, as Unix does.
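
The character counts described above can be checked from the shell.  The
following is a minimal sketch, not a definitive recipe: it assumes a
POSIX "printf" and "tr", and the variable names "unix_count",
"dos_count", and "fixed_count" are made up for this illustration.  It
writes the same line with Unix and with Windows line endings, counts the
bytes, and then strips the carriage return.

```shell
# Sketch only: count the bytes in the same one-word line written with
# Unix (LF) and Windows (CR LF) line endings, then strip the CR.
# The variable names are hypothetical, chosen just for this example.
# The $(( ... )) wrapper removes any padding some "wc" versions print.
unix_count=$(( $(printf 'hi\n' | wc -c) ))                   # h i LF
dos_count=$(( $(printf 'hi\r\n' | wc -c) ))                  # h i CR LF
fixed_count=$(( $(printf 'hi\r\n' | tr -d '\r' | wc -c) ))   # CR deleted

echo "unix=$unix_count dos=$dos_count fixed=$fixed_count"
```

This prints "unix=3 dos=4 fixed=3".  Deleting every CR with "tr -d '\r'"
is one simple way to turn a Windows text file into a Unix one; it is
shown here as an illustration and assumes the file contains no CR
characters that you want to keep.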