--------------------------------
Practice Unix/Linux Questions #2
--------------------------------
-IAN! idallen@idallen.ca

Remember to start shell scripts with four things:
  A - interpreter line:          #!/bin/sh -u
  B - search path setting:       PATH=/bin:/usr/bin ; export PATH
  C - umask setting:             umask 022
  D - character sort ordering:   LC_COLLATE=C ; export LC_COLLATE

1.  Write a script named labelcheck.sh that checks the spelling of each
    of the seven lines of the Exterior Assignment Label.  The script
    should contain Unix commands that look for the correctly spelled
    lines in a file named label.txt in the current directory.

    Hint: Write seven Unix commands.  Each command tries to find one of
    the seven label lines in the label.txt file.

2.  (*deleted*)

3.  Follow the directions in the file named README.txt under the
    directory named image in the home directory of idallen on the
    Course Linux server.

4.  Modify your CGI weather script to send the weather to your userid's
    logged-in terminal using the "write" command.

5.  Write a script to count the number of unique "words" in a file.
    (A word is separated by blanks or punctuation.)  This count will be
    much less than the actual total number of words in the file.
    You can try using the file /usr/share/games/freeciv/helpdata.txt
    as input.

    Hint: Translate blanks and punctuation characters into newlines.
    Take the resulting output and count the number of unique lines.
    (The helpdata.txt file has approximately 2,100 unique words.)

    Part II: Show the ten most frequently occurring words.
    (The helpdata.txt file has approximately 486 occurrences of "the".)

6.  Indiana University has an index of musicians at
        http://www.music.indiana.edu/music_resources/
    A list of artists starting with "S" is under:
        http://www.music.indiana.edu/music_resources/artistss.html
    Write a script to dump this page (in formatted form) and count how
    many lines contain the string "San" on the formatted page.
    (Note: Count only lines containing the artists' names in the page;
    do not count http links to the artists that may contain "San".)

    Hint: The answer (in October 2004) is 4 lines.  You may need to
    select lines that do not contain the pattern "http" to give an
    accurate count.

7.  You can search for the word "foo" using Yahoo search with the URL:
        http://ca.search.yahoo.com/search?p=foo
    Write a script to do a Yahoo search for the artist "Coldplay" and
    count how many lines contain the phrase "Rock and Pop".  Note the
    exact use of capital letters in the phrase!
    (The answer in February 2005 is 6 references.)
    Perform the same count for the artist "Britney Spears", and then
    for "Beatles".

8.  Produce a listing of the Unix password file showing only the userid
    field and the shell field, in sorted order.

9.  Write a command pipeline that shows only the name of the current
    month (e.g. October, November).

    Hint: What Unix command displays a calendar?  (You can also get
    this information if you know some fancy options to the Unix "date"
    command.)

10. On the Course Linux Server there is a file named passwd.OLD in the
    same directory as the Unix password file.  Concatenate this old
    file and the Unix password file together and display a list of the
    unique full names in UPPER CASE of students in NET2003.
    (There will be about 32 students in the list.  Do not display names
    in accounts not belonging to NET2003.)

11. Write a command pipeline to display a list of the unique permission
    strings (as output by "ls -l") for all pathnames (including hidden
    files) in the current directory.  (Do not output the "total" line
    produced by the "ls" command.)  Then use another pipeline to output
    the permission string that occurs most frequently, preceded by its
    count.  For the /bin directory on the Course Linux Server, the
    output would be:

        -r-xr-xr-x
        -rwsr-xr-x
        -rwxr-xr-x
        drwxr-xr-x
        lrwxrwxrwx

        70 -rwxr-xr-x

12. You can find the length of the longest line in a file using this
    method:

    (1) Change all the characters that are not newlines into periods.
        [You now have a file that only contains periods on every line.]
    (2) Sort the resulting output and extract one of the longest lines.
        [When you sort lines of identical characters, they sort by
        length.  Select one of these longest lines.]
    (3) Count the characters in the longest line.  The answer will be
        one less than the resulting count.  (Why is the answer one
        less?)

    What is the length of the longest line in the text file
    /usr/share/games/freeciv/helpdata.txt ?  (Answer: 78 minus one.)

    Hint: The "tr" command has an option that complements the first
    (source) character set, translating all characters that are *not*
    in the first set.  For example (RTFM):

        # Translate to X all characters that are NOT letters or newlines
        $ date | tr -c 'a-zA-Z\n' 'X'
        SunXOctXXXXXXXXXXXXXEDTXXXXX

13. For all non-hidden paths in the /bin directory, find out what type
    of file each is and produce a list of the unique first words of the
    type.  The output will look like this:

        Bourne
        ELF
        setuid
        symbolic

    Hint: The shell can easily generate all the pathnames under /bin.
    Have the shell pass that list of names to a command that can tell
    you what type of thing each name is, then extract just the first
    word of the type information from the resulting output lines.
    Process the list to remove duplicate lines.

14. The simplest use of the Unix "find" command (using only pathname
    arguments) is to give a recursive list of all the pathnames under
    a directory.  For all pathnames under the /var/www/ directory, show
    a list of the ten most frequently occurring pathname components
    along with their occurrence counts.  (A pathname component is the
    part between, before, or after a slash.)  The output on the Course
    Linux Server for path /var/www will be:

        287 www
        287 var
        287
        238 icons
         66 small
         25 error
         13 html
          8 perl
          6 addon-modules
          5 .modperl2

    Hint: Put each pathname component on its own line by changing the
    slashes in the pathnames to newlines.  Count the unique lines.

15. Write a Unix pipeline to print just your numeric Unix userid and
    nothing else.

    Hint: You can find this number in the output of the "id" command or
    by looking for your userid in the Unix password file.

16. In the /bin directory of commands, we suspect that some of the
    command files are actually the same program under different names.
    Use Unix tools to identify all the command files under /bin that
    actually contain the same programs.

    Hint: The "sum" command produces a quick checksum of the bytes in
    a file; different files will almost always give different
    checksums.  The shell can easily generate all the pathnames under
    /bin.  Have the shell pass that list of names to the checksum
    command and save the output in a temporary file.  Extract from that
    file just the column of checksums and send the column through a
    pipeline to produce a sorted count of how many times each checksum
    appears.  Some checksums appear more than once, indicating some
    program has more than one name under /bin.  (Checksum 46888 appears
    6 times.)  Use a program to look for each of the repeated checksums
    in the temporary file to find the lines of programs that have
    different names.  (For example, /bin/ed and /bin/red are the same
    program because they have the same checksum.)
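
    The steps in the hint for question 16 can be sketched roughly as
    follows.  This is one possible approach, not the official answer.
    It assumes that "sum" prints the checksum as the first field of
    each output line and that "mktemp" is available; the function name
    same_programs is made up for this illustration.

```shell
#!/bin/sh -u
PATH=/bin:/usr/bin ; export PATH
umask 022
LC_COLLATE=C ; export LC_COLLATE

# same_programs DIR  (hypothetical helper name, not from the question)
# Print groups of files directly under DIR that share a checksum.
same_programs () {
    tmp=$(mktemp) || return 1

    # The shell expands DIR/* into the pathnames.  "sum" complains about
    # directories on stderr; those messages are discarded.
    sum "$1"/* >"$tmp" 2>/dev/null || true

    # Field 1 of each line is the checksum.  Count how often each occurs,
    # then look up every repeated checksum in the temporary file.
    awk '{print $1}' "$tmp" | sort | uniq -c | sort -n |
    while read -r count checksum ; do
        if [ "$count" -gt 1 ] ; then
            echo "Checksum $checksum appears $count times:"
            grep "^$checksum " "$tmp"
        fi
    done

    rm -f "$tmp"
}

# Assumption: scan /bin unless another directory is given as argument 1.
same_programs "${1:-/bin}"
```

    Matching checksums only suggest that two files are the same; since
    "sum" produces a short checksum, a byte-by-byte comparison with
    "cmp" would be needed to confirm it.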