CSC220 Lab 1 2010

Revision as of 14:14, 7 September 2010

Page under construction!

You can work in pairs on this lab if you wish. Otherwise work individually.

Login

  • Log in to a nearby Linux machine using your CSC220a account.

Question Group 1

  • What is the path of your account? In other words, which directories does one have to traverse from the root to reach your account?
  • What other subdirectories are at the same level as yours in the directory tree?
  • You will notice that all user accounts are in a directory called Users. The older convention on a Linux system is to keep user accounts in a directory called home in the root directory.
  • You will notice that some user accounts still exist in home. Which ones?
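
A few standard commands are enough to explore these questions. The lines below are only a sketch: they assume you are in a regular shell session with the HOME variable set, and the variable name parent is purely illustrative. What the listings show depends on how the lab machines are set up.

```shell
# Print the path of your home directory.
echo "$HOME"

# List the entries at the same level as your home directory.
parent=$(dirname "$HOME")
ls "$parent"

# List any accounts that still live under the older /home location,
# if that directory exists on this machine.
if [ -d /home ]; then
    ls /home
fi
```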

Question Group 2

  • On Linux, the names of the supported printers are listed in a file called /etc/printcap. Look at its contents.
  • Figure out a way to get only the entries from this file that represent printers that are in Ford Hall.
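
One possible approach uses grep, which prints only the lines of a file that match a pattern. The sketch below assumes that the Ford Hall entries contain the word "ford" somewhere in their text, which may not be how the building is actually labeled in the file, so look at the file first. The function name ford_printers is hypothetical.

```shell
# ford_printers [FILE]: print the lines of FILE (default /etc/printcap)
# that mention "ford". The pattern is an assumption about how Ford Hall
# printers are labeled; -i makes the match case-insensitive.
ford_printers() {
    grep -i 'ford' "${1:-/etc/printcap}"
}
```

Calling ford_printers with no argument searches /etc/printcap. If an entry spans several lines, only the lines containing the pattern are shown.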

Question Group 3

  • Open a browser window and load up the following URL: http://maven.smith.edu/~hadoop/log.txt
  • Notice that it is a log of the output of a research program I have been running recently. It is very long and contains a lot of information, some useful, some not.
  • You are going to get a copy of this file into your account. Instead of copying and pasting the text into a file, you are going to use a useful utility called wget. Wget is a Linux utility that allows you to grab Web pages from Web sites, without using a browser.
Try it:
 wget http://maven.smith.edu/~hadoop/log.txt
  • Check that the file is in your directory.
  • The first question is to display only the lines that report the real execution time. These lines look like this:
real	23m6.777s
  • Go ahead, list the real execution times. What is the shortest time recorded? The longest?
  • List not only the real execution times, but also the lines of the form:
processing noTasks = 17240  maxNoTasks = 8,  splitSize = 33554432L
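
The filtering steps above can be sketched with grep and awk. This is one possible approach, not the only one: it assumes the file was saved as log.txt, that the timing lines start with the word "real" followed by whitespace, and that the parameter lines start with "processing noTasks", as in the samples shown. The function names are hypothetical.

```shell
# real_times [FILE]: print only the "real" timing lines of FILE.
real_times() {
    grep '^real[[:space:]]' "${1:-log.txt}"
}

# real_and_params [FILE]: print the "real" lines together with the
# "processing noTasks = ..." lines, in their original order.
real_and_params() {
    grep -E '^real[[:space:]]|^processing noTasks' "${1:-log.txt}"
}

# real_seconds [FILE]: convert each "real XmY.YYYs" line to seconds and
# sort ascending, so the first line is the shortest time and the last
# line is the longest.
real_seconds() {
    real_times "${1:-log.txt}" |
        awk '{ split($2, t, "m"); sub(/s$/, "", t[2]); printf "%.3f\n", t[1] * 60 + t[2] }' |
        sort -n
}
```

For example, real_seconds log.txt | head -n 1 shows the shortest run in seconds, and replacing head with tail shows the longest.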