Difference between revisions of "CSC212 Lab 8 2014"
Latest revision as of 11:33, 16 October 2014
--D. Thiebaut (talk) 15:07, 15 October 2014 (EDT)
This lab is about programming in the shell, using redirection, pipes, for-loops and various shell tools. The shell used is tcsh, which is the default shell for student accounts at Smith.
Connect to Linux
- For this lab, we won't use Eclipse. We'll use the Terminal instead.
- Connect to your 212a-xx account on one of the Linux Mint machines, or, using your laptop, directly to beowulf2. If beowulf2 is unresponsive, you can connect from your laptop to any of these IP addresses, which belong to our Linux Mint machines in FH342 and FH345. In this case you should replace beowulf2.csc.smith.edu by a group of 4 numbers separated by dots. Note that the machine you are connecting to must be ON and have Linux booted and running before you can actually connect to it. You can connect to a Linux Mint machine even if somebody else is using it, or is already connected to it.
Redirection
- Every command and every program in the Linux operating system reads data from the standard-input, and outputs information to the standard-output and the standard-error. By default the standard-input, or stdin for short, is attached to the keyboard.
- By default, the standard-output (stdout for short) and the standard-error (stderr for short) are connected to the display.
- Here is a short Java program that outputs to both stdout and stderr:
public class Demo1 {
    public static void main(String[] args) {
        System.out.println( "This goes to stdout" );
        System.err.println( "This goes to stderr" );
    }
}
- Note the different streams used to print: System.out and System.err.
- Run the program in the Terminal.
- Note the output: two very similar lines. In some terminals the text sent to stderr will be red.
Redirecting stdout to a file.
- Let's run the program, but using the redirection symbol > to send stdout to a file:
java Demo1 > Demo1.output
- Notice that the program still outputs one line on the screen: the one that is sent to stderr. To see the other line, cat the file Demo1.output.
cat Demo1.output
- You will see that the other line, the one directed to stdout, has been captured in the file.
The purpose of stderr is to let a program send messages, typically error messages, to the terminal, and to have them displayed there even when stdout has been redirected to a file.
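One way to convince yourself that the two streams really are separate is to capture each of them in its own file. The sketch below assumes tcsh (bash accepts the same line). A small sh -c command stands in for java Demo1, and the file names lab8_out.txt and lab8_err.txt are made up for the illustration. Lines starting with # are comments and need not be typed.

```shell
# Start clean so that the shell's overwrite protection does not get in the way.
rm -f lab8_out.txt lab8_err.txt
# The inner ">" captures stdout; the outer ">&" captures what is left, i.e. stderr.
( sh -c 'echo "This goes to stdout"; echo "This goes to stderr" 1>&2' > lab8_out.txt ) >& lab8_err.txt
# Each file now holds exactly one of the two lines.
cat lab8_out.txt
cat lab8_err.txt
```

Nothing appears on the screen while the command runs, because both streams have been sent to files.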
Redirecting outputs to an already-existing file
- Try the command date, which is a typical Linux command:
date
- Now capture the output of date to a file:
date > Demo1.output
cat Demo1.output
- You will have noticed that the redirection didn't work: the shell refuses to overwrite a file that already exists. (This is how tcsh behaves when its noclobber variable is set, which appears to be the case for these accounts.) If you want to force the redirection, add an ! after the >:
date >! Demo1.output
cat Demo1.output
Redirecting the output of several commands to the same file
- What if we wanted to store both the date and the output of Demo1 to Demo1.output? In this case we do not want the second output to erase the file. Linux uses >> to indicate redirection/append-to-end-of file.
- Let's try it:
date >! Demo1.output
java Demo1 >> Demo1.output
- Verify (using cat) that Demo1.output now contains a time tag, and the output of java Demo1.
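If you would like to watch a file grow without touching Demo1.output, here is a minimal sketch of the same idea. The scratch file name lab8_append.txt is an assumption made for this example, and the lines starting with # are comments that need not be typed.

```shell
# Remove any stale copy so that ">" creates a fresh file.
rm -f lab8_append.txt
# ">" creates the file and writes the first line.
date > lab8_append.txt
# ">>" adds to the end of the file instead of replacing its contents.
echo "second entry" >> lab8_append.txt
# The file should now contain two lines: a time tag and the echoed text.
cat lab8_append.txt
```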
GREP
- Grep is one of the most used and useful Linux commands. It is also very very fast. Its purpose is to find a string of characters in one or several files.
- Let's try it and see if our file Demo1.java contains the word main:
grep main Demo1.java
- or the word Demo2:
grep Demo2 Demo1.java
- You notice that it is not very sophisticated. It just outputs the line that contains the string we're looking for, or nothing.
- Let's give it something more interesting to do. For this, get a copy of the file ulysses.txt in my 212a account:
getcopy ulysses.txt
- (getcopy is not a Linux command, just a command we created in the department for students to get files from their instructor's account. For you, getcopy will copy files from my 212a account to your 212a-xx directory.)
- Use grep to see if ulysses.txt contains the word orange:
grep orange ulysses.txt
- You should get a list of sentences taken from the 1.5 million character book, all containing "orange". Notice how fast grep searches the whole book!
- How about Orange (with an uppercase O)?
grep Orange ulysses.txt
- If we had wanted to get all the lines containing either orange or Orange, we could have said:
grep -i orange ulysses.txt
- Here "-i" means "ignore case": upper- and lower-case letters are treated as the same.
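To see the effect of -i on something small enough to check by eye, you can build a three-line test file first. This is only a sketch: the file name lab8_fruit.txt and its contents are invented for the illustration, and the lines starting with # are comments that need not be typed.

```shell
# Build a tiny sample file (hypothetical name and contents).
rm -f lab8_fruit.txt
echo "Orange peel" > lab8_fruit.txt
echo "an orange tree" >> lab8_fruit.txt
echo "green apple" >> lab8_fruit.txt
# Case-sensitive search: only "an orange tree" matches.
grep orange lab8_fruit.txt
# Case-insensitive search: "Orange peel" and "an orange tree" both match.
grep -i orange lab8_fruit.txt
```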
Reverse Grep
- Grep can also report lines not containing a pattern. For example, if we wanted all the lines in Demo1.java not containing the word "print", we would write:
grep -v print Demo1.java
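Because grep -v writes its result to stdout like any other grep, several of them can be applied one after the other to exclude more than one pattern. The sketch below uses a made-up file, lab8_code.txt, whose three lines only imitate Demo1.java. Lines starting with # are comments and need not be typed.

```shell
# Build a tiny sample file (hypothetical name and contents).
rm -f lab8_code.txt
echo "print to stdout" > lab8_code.txt
echo "declare main" >> lab8_code.txt
echo "print to stderr" >> lab8_code.txt
# Keep only the lines that do NOT contain "print": one line is left.
grep -v print lab8_code.txt
# Exclude a second pattern by running grep -v on the result of the first:
# only the stderr line survives.
grep -v main lab8_code.txt | grep -v stdout
```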
Counting Lines
- You can use wc, for word-count, to count characters, words, and lines in a file.
- Let's try it on Ulysses:
wc ulysses.txt
- You will notice that Ulysses contains 32,663 lines, 264,965 words, and 1,520,798 characters.
- If all you are interested in are lines, you can specify -l (minus ell) on the command line:
wc -l ulysses.txt
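The three numbers printed by wc are easier to trust once you have checked them on a file you can count by hand. Here is a small sketch, assuming a scratch file called lab8_wc.txt (a name made up for this example). Lines starting with # are comments and need not be typed.

```shell
# Build a two-line sample file (hypothetical name and contents).
rm -f lab8_wc.txt
echo "one two three" > lab8_wc.txt
echo "four five" >> lab8_wc.txt
# All three counts: 2 lines, 5 words, 24 characters (newlines included).
wc lab8_wc.txt
# Lines only; the file name is printed after the number.
wc -l lab8_wc.txt
# Reading from stdin instead: wc has no file name to print, only the number.
wc -l < lab8_wc.txt
```

The last form is handy when only the number is wanted, since wc cannot know the name of a file that reaches it through its standard-input.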
Challenge 1
How many lines in Ulysses contain the word "orange" or "Orange"?
Think about a solution using all the Linux commands and tools seen so far...
Think some more and try it...
If you have no idea, then highlight the white area below to get some hints...
You can grep for "orange" using the -i option, and redirect the output to a file, say orange.txt.
Then you can use wc -l to find the number of lines in orange.txt.
Pipes
- The solution for the challenge above was this:
grep -i orange ulysses.txt > orange.txt
wc -l orange.txt
- The first line redirects the output of grep into a file called orange.txt and the second line counts all the lines in orange.txt.
- There's another option that is more efficient, though, because it needs no intermediate file: pipe the output of grep to the input of wc. In other words, we make the standard-out of grep become the standard-in of wc. In Linux this is done with the | symbol (vertical bar):
grep -i orange ulysses.txt | wc -l
- Try it, and verify that you get the same result as before.
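The same comparison can be made on a file small enough to count by hand. The sketch below assumes two scratch files, lab8_pipe.txt and lab8_matches.txt, whose names and contents are invented for the illustration. Lines starting with # are comments and need not be typed.

```shell
# Build a three-line sample file (hypothetical name and contents).
rm -f lab8_pipe.txt lab8_matches.txt
echo "Orange peel" > lab8_pipe.txt
echo "an orange tree" >> lab8_pipe.txt
echo "green apple" >> lab8_pipe.txt
# Two steps, with an intermediate file.
grep -i orange lab8_pipe.txt > lab8_matches.txt
wc -l < lab8_matches.txt
# One step: the stdout of grep becomes the stdin of wc.
grep -i orange lab8_pipe.txt | wc -l
# grep can also do the counting itself with its -c option.
grep -c -i orange lab8_pipe.txt
```

All three approaches report 2 matching lines here. The pipe version leaves no extra file behind.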
A More Interesting Example
- We are going to be doing a bit of computer forensic work. This should be done on beowulf2, though, as it has a longer log of the login information for its users.
- On beowulf2, at the prompt, type the command last:
last
- This will list the last connection times for the users who have used beowulf2 since it was last rebooted.
- Try to use all the tools we have seen so far to answer the following questions (Hints are hidden in white spaces under the questions):
Challenge 2
- Question 1
- How many connections have users made to beowulf2 since its last reboot?
last | grep pts | wc
- Question 2
- How many connections were made on October 10?
last | grep "Oct 10" | wc
Challenge 3
How many times did 212a-xx users connect to the server? How many times did 231a-xx users? Try to generate 1-line answers for both.
last | grep "212a-" | wc
last | grep "231a-" | wc
or
last | grep "2..a-" | wc
Challenge 4
Which users who have used beowulf2 are neither 231a-xx nor 212a-xx accounts? You may get several lines of output, with a user name included in each line.
last | grep -v 231a | grep -v 212a
or, a very fancy version:
last | grep -v 231a | grep -v 212a | cut -d " " -f 1 | sort | uniq
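The "very fancy version" strings together three tools not seen earlier in the lab: cut keeps one field of each line, sort orders the lines, and uniq collapses identical neighbors. The sketch below runs them on a made-up file, lab8_last.txt, whose lines only imitate the output of last. Lines starting with # are comments and need not be typed.

```shell
# Build a sample that imitates the output of last (hypothetical name and contents).
rm -f lab8_last.txt
echo "suzanne pts/7 131.229.87.135 Tue Oct 14" > lab8_last.txt
echo "dthiebau pts/3 131.229.101.231 Wed Oct 15" >> lab8_last.txt
echo "suzanne pts/2 131.229.87.135 Mon Oct 13" >> lab8_last.txt
# Keep only the first space-delimited field, i.e. the user name.
cut -d " " -f 1 lab8_last.txt
# uniq removes only adjacent duplicates, so the names are sorted first.
cut -d " " -f 1 lab8_last.txt | sort | uniq
# With -c, uniq also reports how many times each name occurred.
cut -d " " -f 1 lab8_last.txt | sort | uniq -c
```

On this sample the second pipeline prints dthiebau and suzanne, one per line.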
Challenge 5
Linux also supports the commands tail and head to list the last lines or first lines of a file, respectively.
For example, to list the first 10 lines of the file ulysses.txt, one would type:
head -10 ulysses.txt
To get the last 20 lines:
tail -20 ulysses.txt
If you want to use pipes, you could execute these two commands as follows:
cat ulysses.txt | head -10
cat ulysses.txt | tail -20
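Since head and tail both read their standard-input, they can also be chained to pick out a single line from the middle of a file. Here is a minimal sketch on a made-up five-line file, lab8_lines.txt. It uses the -n spelling of the line count, which is the form described in the POSIX standard and means the same thing as -10 or -20 above. Lines starting with # are comments and need not be typed.

```shell
# Build a five-line sample file (hypothetical name and contents).
rm -f lab8_lines.txt
echo "line 1" > lab8_lines.txt
echo "line 2" >> lab8_lines.txt
echo "line 3" >> lab8_lines.txt
echo "line 4" >> lab8_lines.txt
echo "line 5" >> lab8_lines.txt
# First two lines of the file.
head -n 2 lab8_lines.txt
# Last line of the file.
tail -n 1 lab8_lines.txt
# Third line only: keep the first three lines, then the last of those.
head -n 3 lab8_lines.txt | tail -n 1
```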
- Question 1
- Strong with this knowledge, pipe together some Linux commands to see the very first time User dthiebau connected to beowulf2 (only 1 line should be output), and the last time the same user connected to beowulf2.
last | grep dthiebau | head -1
last | grep dthiebau | tail -1
- Question 2
- Use pipes and redirection, and possibly several commands, to create a text file containing the two lines you printed in Question 1.
last | grep dthiebau | head -1 > data.txt
last | grep dthiebau | tail -1 >> data.txt
For Loops
The shell you are using (known as tcsh) also supports for loops. The best way to figure out how they work is to use one right away. At the prompt type the following lines (the user input is in bold, the other text is automatically output by the shell):
foreach user ( dthiebau emendelo suzanne )
foreach? echo -n "Number of connections for User $user "
foreach? last | grep $user | wc -l
foreach? end
- Try it!
- Explanations
- Notice that 3 sets of output lines are printed, one for User dthiebau, one for User emendelo, and one for User suzanne. This is because the foreach loop repeats all the commands between foreach and end, each time substituting the next string inside the parentheses for the variable user. In the shell, when you declare a variable, you just use a name (in our case, user), and when you want to use its value, you put a $-sign in front of it (as in $user). The echo command is just a way of printing a string. The -n switch tells it not to go to the next line. And end marks the end of the for loop.
Challenge 6
Write a tcsh for-loop that will list the first and last times all three users (dthiebau, emendelo, suzanne) connected to beowulf2.
Here's an example of the type of output your loop should generate (Note that the output you get may well be different because the date and time when these users last connected to beowulf2 may have changed since this lab was put together, and beowulf2 may have been rebooted since then, as well.):
dthiebau pts/3 131.229.101.231 Wed Oct 15 09:19 - 17:32 (08:13)
dthiebau pts/5 131.229.101.231 Wed Oct 1 11:07 - 15:29 (2+04:21)
-----
emendelo pts/7 131.229.102.12 Tue Oct 7 13:49 - 14:12 (00:22)
emendelo pts/7 131.229.102.12 Tue Oct 7 13:49 - 14:12 (00:22)
-----
suzanne pts/7 131.229.87.135 Tue Oct 14 13:48 - 13:56 (00:08)
suzanne pts/7 131.229.87.135 Tue Oct 14 13:48 - 13:56 (00:08)
-----
foreach user ( dthiebau emendelo suzanne )
last | grep $user | head -1
last | grep $user | tail -1
echo "-----"
end
Note: the solutions to the challenges can be seen by highlighting the white areas under each challenge.