================================================
Unix Lab 6 (Regular Expressions and Data Mining)
================================================
- Prepared by Ian Allen - idallen@ncf.ca

Work along with your instructor solving the following problem:

   "The Director of Bean Counting wants to save money by closing down
   ACADIAX in the early morning hours.  You have to show that students
   are actively using the computer during this time."

Your instructor has used the Unix "last" command to save all login
sessions since late August in a file named ~alleni/lastout.txt on the
system.  (This file is 15,000 lines long!)

Work with your instructor doing this "data mining" on lastout.txt:

1) Show all the lines from the file that have these properties:

   a) Login sessions that started from 1am to 3:59am and that were
      still going at 5am to 7:59am.

   b) Login sessions originating from internal IP addresses starting
      with aip60, that started from 8pm to 10:59pm, and that lasted
      between 3 and 4 hours.

   c) Login sessions that were not from student accounts.
      (Student accounts look like "abcd0001".  Try one method using
      "grep -v" and another using a regular expression that matches
      only all-alphabetic userids.)

   d) Login sessions lasting less than an hour.

   e) Login sessions lasting less than ten minutes.

   f) Login sessions lasting less than five minutes.

   h) Login sessions beginning exactly at noon.

   i) Login sessions ending exactly at noon.

   j) Now, repeat all of the above questions for "ftp" sessions only.

2) Find out who logged in just before root on September 15.

3) Find out who the top ten users of the machine were (most number of
   login sessions).

4) Find out who the top ten non-student users of the machine were.

----------------------------------------
Part II - Other "data mining" questions:
----------------------------------------

II-1) How many userids in the passwd file are of the form aa000000 ?

II-2) How many userids in the passwd file are of the form aa0000 ?

II-3) How many userids in the passwd file contain the string "sh" at
      the start of the userid?

II-4) How many userids in the passwd file contain the string "sh"
      anywhere in the userid?

II-5) How many userids in the passwd file contain the string "sh" at
      the start or the end of the alphabetic part of the userid?

II-6) How many userids in the passwd file have sequence number one
      (and all variations of that number from 0001 to 000001)?
      (Be careful not to count userids that look like: aaa00017 !)
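
The kinds of patterns these questions call for can be tried out on a small
scale before running them against the real files.  The sketch below is only
an illustration under stated assumptions: the sample "last" lines and passwd
lines are hypothetical (the real lastout.txt is not reproduced in this lab,
and its column layout may differ), and the variable names are made up for
the example.  It uses only standard grep, awk, sort, uniq and head.

```shell
#!/bin/sh
# HYPOTHETICAL sample data: these lines only imitate a typical "last"
# layout and a typical passwd layout; they are not taken from ACADIAX.
tmp=$(mktemp -d)

cat > "$tmp/lastout.txt" <<'EOF'
abcd0001 pts/1        aip60-12         Mon Sep 15 02:10 - 06:30  (04:20)
abcd0001 pts/2        aip60-12         Mon Sep 15 12:00 - 12:04  (00:04)
alleni   pts/3        aip10-01         Mon Sep 15 09:15 - 09:50  (00:35)
root     console                       Mon Sep 15 10:00 - 10:05  (00:05)
wxyz0002 ftp          aip60-33         Tue Sep 16 20:30 - 23:45  (03:15)
EOF

cat > "$tmp/passwd" <<'EOF'
ab000001:x:1001:100::/home/ab000001:/bin/sh
cd0002:x:1002:100::/home/cd0002:/bin/sh
shaw0001:x:1003:100::/home/shaw0001:/bin/sh
aaa00017:x:1004:100::/home/aaa00017:/bin/sh
EOF

# 1c, method 1: throw away lines whose userid looks like "abcd0001".
non_student_v=$(grep -vc '^[a-z]\{4\}[0-9]\{4\} ' "$tmp/lastout.txt")

# 1c, method 2: keep only lines whose first field is all-alphabetic.
non_student_alpha=$(grep -c '^[a-zA-Z][a-zA-Z]* ' "$tmp/lastout.txt")

# 1e: duration under ten minutes, assuming it is the last field as (HH:MM).
under_ten=$(grep -c '(00:0[0-9])$' "$tmp/lastout.txt")

# 3: count sessions per userid, biggest first; keep only the top entry here.
top_user=$(awk '{print $1}' "$tmp/lastout.txt" | sort | uniq -c |
           sort -rn | head -1 | awk '{print $2}')

# II-1: exactly two letters, then exactly six digits, then the colon.
form_aa6=$(grep -c '^[a-z]\{2\}[0-9]\{6\}:' "$tmp/passwd")

# II-4: "sh" inside the userid only; [^:]* keeps the match in field one,
# so the "/bin/sh" at the end of every line is not counted.
sh_anywhere=$(grep -c '^[^:]*sh' "$tmp/passwd")

# II-6: letters, one or more zeros, a single 1, then the colon.
seq_one=$(grep -c '^[a-z][a-z]*00*1:' "$tmp/passwd")

echo "non-student (grep -v):      $non_student_v"
echo "non-student (alphabetic):   $non_student_alpha"
echo "under ten minutes:          $under_ten"
echo "top user:                   $top_user"
echo "form aa000000:              $form_aa6"
echo "sh anywhere in userid:      $sh_anywhere"
echo "sequence number one:        $seq_one"

rm -rf "$tmp"
```

Anchoring each pattern (with ^, with $, or against the field separator) is
what keeps a match from spilling into the wrong column; compare a plain
"grep -c sh" on the passwd sample, which would count every line because of
the shell field.  On the real files, check the actual column layout first
and adjust the patterns to suit.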