Hadoop/MapReduce Tutorials




These tutorials target the Hadoop/MapReduce Cluster in the CS Dept. at Smith College, as well as Amazon's EC2 and S3.








{| style="width:100%"
|- style="background:#eeeeff" valign="top"
! Tutorial
! Comments
|- valign="top"
| Tutorial #1
| Running WordCount written in Java on the Smith College Hadoop/MapReduce Cluster
|- style="background:#eeeeff" valign="top"
| Tutorial #1.1
| Creating timelines of the execution of tasks during the execution of a MapReduce program.
|- valign="top"
| Tutorial #2
| Running WordCount in Python on the Smith College Hadoop/MapReduce Cluster
|- style="background:#eeeeff" valign="top"
| Tutorial #2.1
| Running a streaming Python MapReduce program on XML files
|- valign="top"
| [http://cs.smith.edu/dftwiki/index.php/Hadoop_Tutorial_2.2_--_Running_C%2B%2B_Programs_on_Hadoop Tutorial #2.2]
| Running C++ programs under Hadoop Pipes
|- style="background:#eeeeff" valign="top"
| Tutorial #3
| Running Hadoop jobs on Amazon AWS
|- valign="top"
| Tutorial #3.1
| Uploading text to S3 and running Amazon's WordCount Java program on our own data.
|- style="background:#eeeeff" valign="top"
| Tutorial #3.2
| Compiling our own version of the Java WordCount program and uploading it to AWS.
|- valign="top"
| Tutorial #4
| Start a server on Amazon's EC2 infrastructure
|}
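For readers who want a point of reference before opening Tutorial #1 or Tutorial #3.2 (both of which revolve around the Java WordCount program), the following is a minimal sketch of a WordCount job written against Hadoop's org.apache.hadoop.mapreduce API. It follows the standard Apache Hadoop example rather than the exact code presented in the tutorials, so the class names and driver setup here are assumptions, not the tutorials' own listings.

<source lang="java">
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Minimal WordCount sketch (standard Apache Hadoop example, not the tutorial's exact code).
public class WordCount {

  // Mapper: splits each input line into tokens and emits (word, 1) pairs.
  public static class TokenizerMapper
       extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer (also usable as a combiner): sums the counts for each word.
  public static class IntSumReducer
       extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  // Driver: configures the job and points it at the input and output paths
  // given on the command line.
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
</source>

Once packaged into a jar, a program like this is typically launched with a command of the form <code>hadoop jar wordcount.jar WordCount inputDir outputDir</code>; the cluster-specific paths and submission details are what the individual tutorials cover.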