Difference between revisions of "CSC212 Lab 8 2014"

From dftwiki3
Revision as of 15:45, 15 October 2014

--D. Thiebaut (talk) 15:07, 15 October 2014 (EDT)



This lab is about programming shell scripts, using redirection, pipes, for-loops and various shell tools.


Connect to Linux


  • For this lab, we won't use Eclipse. We'll use the Terminal instead.
  • Connect to your 212a account on one of the Linux Mint machines, or, using your laptop, directly to beowulf2. If beowulf2 is unresponsive, you can connect from your laptop to any of these IP addresses, which belong to our Linux Mint machines in FH342 and FH345. In this case, replace beowulf2.csc.smith.edu with the machine's IP address (a group of 4 numbers separated by dots). Note that the machine must be ON, with Linux booted and running, before you can connect to it. You can connect to a machine even if somebody else is using it or is already connected to it.

This section is only visible to computers located at Smith College


Redirection


  • Every command and every program in the Linux operating system reads data from the standard-input, and outputs information to the standard-output and the standard-error. By default the standard-input, or stdin for short, is attached to the keyboard. By default, the standard-output (stdout for short) and the standard-error (stderr for short) are connected to the display.

  • Here is a short Java program that outputs to both stdout and stderr:


public class Demo1 {
	public static void main(String[] args) {
		System.out.println( "This goes to stdout" );
		System.err.println( "This goes to stderr" );
	}
}
Note the different streams used to print: System.out and System.err.
  • Compile the program with javac Demo1.java, then run it in the Terminal with java Demo1.
  • Note the output, two very similar lines.


Redirecting stdout to a file.


  • Let's run the program, but using the redirection symbol > to send stdout to a file:


java Demo1  >  Demo1.output


  • Notice that the program still prints one line on the screen, but only one. To see the other line, cat the file Demo1.output:

cat Demo1.output

You will see that the line sent to stdout has been captured in the file, while the line sent to stderr is the one that stayed on the screen.


The purpose of stderr is to let a program send error messages to the terminal, and to have these messages displayed there even when stdout has been redirected to a file. This is why output to stderr is typically reserved for error messages.
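To make the separation of the two streams concrete, here is a minimal sketch for a POSIX shell such as bash. The function demo is a hypothetical stand-in for java Demo1, the file names are made up for the example, and the 2> operator (which redirects stderr) goes slightly beyond what the lab itself uses.

```shell
# Hypothetical stand-in for "java Demo1": one line to each stream.
demo() {
    echo "This goes to stdout"
    echo "This goes to stderr" >&2
}

# Start clean so the sketch can be re-run.
rm -f demo1-sketch.output demo1-sketch.stdout demo1-sketch.stderr

# Only stdout is captured; the stderr line still reaches the terminal.
demo > demo1-sketch.output
cat demo1-sketch.output

# 2> redirects stderr, so each stream can go to its own file.
demo > demo1-sketch.stdout 2> demo1-sketch.stderr
cat demo1-sketch.stderr
```

The first cat shows only the stdout line, and the second shows only the stderr line.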


Redirecting outputs to an already-existing file


  • Try the command date, which is a typical Linux command:
date

  • Now capture the output of date to a file:
date  > Demo1.output
cat Demo1.output

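You will have noticed that the redirection didn't work: the shell refuses to redirect the output of a command to a file that already exists. This protection is the shell's noclobber option, which appears to be set on the lab accounts; without it, > silently overwrites the file. To force the redirection in csh or tcsh, add an ! after the > (in bash, the equivalent is >|):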
date >! Demo1.output
cat Demo1.output
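The same overwrite protection can be reproduced in a POSIX shell such as bash, where the syntax differs from the >! form shown above. This is a sketch under that assumption; the file name clobber-sketch.txt is made up for the example.

```shell
# Assumes a POSIX shell (bash, dash, ...); csh/tcsh use "set noclobber" and ">!".
rm -f clobber-sketch.txt
set -o noclobber                  # refuse to overwrite existing files with >

echo first > clobber-sketch.txt   # works: the file does not exist yet

# The second plain redirection is refused because the file now exists.
if ! (echo second > clobber-sketch.txt) 2>/dev/null; then
    echo "overwrite refused"
fi

echo third >| clobber-sketch.txt  # >| forces the overwrite
cat clobber-sketch.txt

set +o noclobber                  # restore the default behavior
```

The refused redirection leaves the file untouched, so the final cat shows the line written by the forced redirection.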


Redirecting the output of several commands to the same file


  • What if we wanted to store both the date and the output of Demo1 in Demo1.output? In this case we do not want the second redirection to erase the file. The shell uses >> to redirect and append to the end of a file.
  • Let's try it:


date >! Demo1.output
java Demo1  >> Demo1.output

  • Verify (using cat) that Demo1.output now contains a timestamp, followed by the stdout line of java Demo1.
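A small self-contained sketch of the difference between overwriting and appending, assuming a POSIX shell; echo stands in for java Demo1, and append-sketch.txt is a made-up file name.

```shell
rm -f append-sketch.txt

# > creates the file and writes the first line.
date > append-sketch.txt

# >> adds to the end of the file instead of replacing its contents.
echo "This goes to stdout" >> append-sketch.txt

cat append-sketch.txt
```

The file ends up with two lines: the date first, then the appended text.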


GREP


  • Grep is one of the most used and most useful Linux commands. It is also very fast. Its purpose is to find a string of characters in one or several files.
  • Let's try it and see if our file Demo1.java contains the word main:
grep main Demo1.java

  • or the word Demo2:
grep Demo2 Demo1.java

  • Notice that the output is plain: grep prints each line that contains the string we are looking for, or nothing at all if no line matches.
  • Let's give it something more interesting to do. For this, get a copy of the file ulysses.txt in my 212a account:
getcopy ulysses.txt

(getcopy is not a Linux command, just a command we created in the department for students to get files from their instructor's account. For you, getcopy will copy a file from my 212a account to your 212a-xx directory.)
  • Use grep to see if ulysses.txt contains the word orange:
grep orange ulysses.txt

You should get a list of lines taken from the 1.5-million-character book, all containing "orange". Notice how fast grep searches the whole book!
  • How about Orange (with an uppercase O)?
grep Orange ulysses.txt

  • If we had wanted to get all the lines containing either orange or Orange, we could have said:
grep -i orange ulysses.txt

Here "-i" means "ignore case": the pattern matches regardless of upper- or lower-case.
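The effect of -i can be checked on a tiny file without needing ulysses.txt. The sketch below assumes a POSIX shell; the file grep-sketch.txt and its three lines are made up for the example.

```shell
# Build a three-line sample file (hypothetical content).
printf '%s\n' 'an orange on the table' 'Orange blossoms' 'no fruit here' > grep-sketch.txt

grep orange grep-sketch.txt      # matches only the lowercase spelling
grep -i orange grep-sketch.txt   # matches both spellings
```

The first command prints one line and the second prints two.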


Reverse Grep


  • Grep can also report lines not containing a pattern. For example, if we wanted all the lines in Demo1.java not containing the word "print", we would write:


grep -v print Demo1.java
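To see -v side by side with a normal search, here is a sketch on a made-up three-line file (grep-v-sketch.txt is a hypothetical name), assuming a POSIX shell.

```shell
# Build a small sample file; only the middle line contains "print".
printf '%s\n' 'int x = 1;' 'System.out.println( x );' 'x = x + 1;' > grep-v-sketch.txt

grep print grep-v-sketch.txt      # the one line containing "print"
grep -v print grep-v-sketch.txt   # the two lines that do not contain it
```

Together the two commands account for every line of the file exactly once.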

Counting Lines


  • You can use wc (short for word count) to count the characters, words, and lines in a file.
  • Let's try it on Ulysses:
wc ulysses.txt

wc prints the number of lines first, then words, then characters: you will notice that Ulysses contains 32,663 lines, 264,965 words, and 1,520,798 characters.
  • If all you are interested in are lines, you can specify -l (minus ell) on the command line:
wc -l ulysses.txt
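The counts are easy to verify by hand on a small file. This sketch assumes a POSIX shell; wc-sketch.txt and its contents are made up for the example.

```shell
# Three lines, six words, 28 characters (newlines included).
printf '%s\n' 'one two' 'three' 'four five six' > wc-sketch.txt

wc wc-sketch.txt      # lines, words, characters, then the file name
wc -l wc-sketch.txt   # lines only
```

The exact spacing of the numbers varies between systems, but their order is always lines, words, characters.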


Challenge 1



How many lines in Ulysses contain the word "orange" or "Orange"?
Think about a solution using all the Linux commands and tools seen so far...
Think some more and try it...
If you have no idea, then highlight the white area below to get some hints...

You can grep for "orange" using the -i option, and redirect the output to a file, say orange.txt.
Then you can use wc -l to find the number of lines in orange.txt.