Difference between revisions of "CSC220 Lab 1 2010"


Revision as of 15:04, 7 September 2010


You can work in pairs on this lab if you wish. Otherwise work individually.

Login

  • Log in to a Linux machine near you using your CSC220a account.

Path-Related Questions

Question 1
What is the path of your account? In other words, which subdirectories does one have to traverse to reach your account?


Question 2
What other subdirectories are at the same level as yours in the directory tree?


You will notice that all user accounts are in a directory called Users. The older convention on a Linux system is to keep user accounts in a directory called home in the root directory.


Question 3
You will notice that some user accounts still exist in home. Which are they?

Questions about File-Searching

On Linux, the names of the supported printers are listed in a file called /etc/printcap. Look at its contents.


Question 1
Figure out a way to get only the entries from this file that represent printers that are in Ford Hall.

Filtering Log Files

Open a browser window and load up the following URL: http://maven.smith.edu/~hadoop/log.txt
Notice that it is a long log of the output of a research program I have been running recently. It contains a lot of information: some useful, some not.
You are going to get a copy of this file into your account. Instead of copying and pasting the text into a file, you are going to use a handy utility called wget. Wget is a Linux utility that lets you grab Web pages from Web sites without using a browser.
Try it:
 wget http://maven.smith.edu/~hadoop/log.txt
Check that the file is in your directory.


Question 1
How many lines of text does the file contain?


Question 2
List only the lines that report the real execution time. These lines look like this:
real	23m6.777s


Go ahead, list the real execution times. What is the shortest time recorded? The longest?


Question 3
List not only the real execution times, but also the lines of the form:
processing noTasks = 17240  maxNoTasks = 8,  splitSize = 33554432L
The output should look something like this:


processing noTasks = 862  maxNoTasks = 8  splitSize = 33554432L
real	22m11.284s
processing noTasks = 862  maxNoTasks = 16  splitSize = 33554432L
real	0m13.113s
processing noTasks = 1724  maxNoTasks = 8  splitSize = 33554432L
real	0m10.891s
processing noTasks = 1724  maxNoTasks = 16  splitSize = 33554432L
real	0m40.891s
processing noTasks = 80  maxNoTasks = 8  splitSize = 33554432L
real	2m54.601s
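
The filtering steps in this section can be tried on a small scale first. The following is only a sketch, not the expected answer: it builds a tiny stand-in file from lines quoted above (the "some unrelated output line" entry and the variable names are made up for illustration), then counts lines with wc and selects lines with grep.

```shell
# Build a tiny stand-in for log.txt from lines quoted in the lab text,
# plus one filler line (hypothetical) that the filters should drop.
sample=$(mktemp)
{
    printf 'processing noTasks = 862  maxNoTasks = 8  splitSize = 33554432L\n'
    printf 'some unrelated output line\n'
    printf 'real\t22m11.284s\n'
    printf 'processing noTasks = 862  maxNoTasks = 16  splitSize = 33554432L\n'
    printf 'real\t0m13.113s\n'
} > "$sample"

# Count every line in the file.
total=$(wc -l < "$sample")

# Keep only the lines that start with "real".
real_lines=$(grep '^real' "$sample")

# Keep the lines that start with either "processing" or "real"
# (-E turns on extended regular expressions, which allow the | alternation).
both=$(grep -E '^(processing|real)' "$sample")

echo "total lines: $total"
echo "$real_lines"
echo "$both"

rm -f "$sample"
```

Anchoring the pattern with ^ keeps a line from matching just because the word "real" appears somewhere in the middle of it.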

Pipes

Question 1
How many users have accounts on grendel (or beowulf)?


Question 2
Who are the three users who have modified their account most recently?


Question 3
Same question, but for the three users who have not modified (or touched) their account for the longest time.
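
As a warm-up for these questions, here is a minimal sketch of chaining commands with pipes. It does not look at the real accounts: it uses a scratch directory with three made-up entries (alice, bob, carol) whose timestamps are set by hand, so the directory and names are assumptions for illustration only.

```shell
# Hypothetical scratch directory with three entries and known timestamps.
demo=$(mktemp -d)
touch -t 200901010000 "$demo/alice"   # oldest
touch -t 201001010000 "$demo/bob"
touch -t 201009010000 "$demo/carol"   # newest

# ls -t sorts by modification time, newest first;
# head keeps only the first two lines of that listing.
newest=$(ls -t "$demo" | head -n 2)

# ls -tr reverses the order, so the oldest entry comes first.
oldest=$(ls -tr "$demo" | head -n 1)

# ls prints one name per line when writing to a pipe,
# so wc -l counts the entries.
count=$(ls "$demo" | wc -l)

echo "$newest"
echo "$oldest"
echo "$count"

rm -rf "$demo"
```

Each command in the pipe does one small job (list, sort, truncate, count), and the pipe passes the output of one to the input of the next.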


More challenging pipes

  • Use emacs to create the following program in your 220a account: stdouterr.py
  • Make your program executable
chmod +x stdouterr.py
  • Run your program
./stdouterr.py


Observe the long output. If you look closely at the listing, you will see that some lines contain error codes of the form
Error 404: blue screen alert!


Question 1
Run the program and filter its output so that you see only the lines containing error messages.


Question 2 (tricky)
How many lines are output by the program?


Question 3 (trickier)
Run the program and use commands that will display only the number of error messages.
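
One thing that can make these questions tricky is that a pipe carries only standard output, not standard error. The sketch below illustrates the difference with a made-up shell function, noisy, that writes to both streams. It is not stdouterr.py, and whether that program writes its error lines to standard error is an assumption you should check for yourself.

```shell
# Hypothetical stand-in for a program that writes to both streams.
# (This is NOT stdouterr.py; its behavior is assumed only for illustration.)
noisy() {
    echo "line on stdout"
    echo "Error 404: blue screen alert!" >&2
    echo "another line on stdout"
}

# A plain pipe sees standard output only.
# Standard error is discarded here so it does not clutter the terminal.
stdout_only=$(noisy 2>/dev/null | wc -l)

# 2>&1 sends standard error to the same place as standard output,
# so the pipe now sees all three lines.
merged=$(noisy 2>&1 | wc -l)

# Route standard error into the pipe and discard standard output,
# then let grep -c count the matching lines.
stderr_only=$(noisy 2>&1 >/dev/null | grep -c 'Error')

echo "$stdout_only $merged $stderr_only"
```

The order of the redirections matters in the last command: 2>&1 first points standard error at the pipe, and only then is standard output sent to /dev/null.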