Difference between revisions of "CSC352 Notes 2013"

From dftwiki3

Revision as of 20:13, 15 December 2009

Setting up Eclipse for Hadoop

 start-all.sh
  • Set up Eclipse as explained in http://v-lad.org/Tutorials/Hadoop/17%20-%20set%20up%20hadoop%20location%20in%20the%20eclipse.html using these settings:
    • localhost
    • Map/Reduce Master: localhost, 9101
    • DFS Master: check "Use M/R Master host" (localhost), port 9000 (the port must match the one in the hdfs value in hadoop/conf/hadoop-site.xml, i.e. localhost:9000)
    • user name: hadoop-user
    • SOCKS proxy: host, 1080
  • Open DFS Locations
    • localhost
      • (2)
        • tmp(1)
          • hadoop-thiebaut (1)
            • mapred (1)
              • system (0)
        • user(1)
          • thiebaut (2)
            • hello.txt
            • readme.txt
  • Create the In directory in HDFS:
 hadoop fs -mkdir In
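
The port numbers typed into the Eclipse dialog have to agree with the ones Hadoop itself is configured with. As a rough illustration, the sketch below pulls both ports out of a hadoop-site.xml-style document using only the Java standard library. The class name SiteConfigCheck and the embedded sample XML are hypothetical; the sample assumes the file sets fs.default.name and mapred.job.tracker to the values quoted in the list above, which may differ on another installation.

```java
import java.io.ByteArrayInputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class SiteConfigCheck {

    // Hypothetical sample of hadoop/conf/hadoop-site.xml (values assumed from the notes).
    static final String SAMPLE =
          "<configuration>"
        + "<property><name>fs.default.name</name>"
        + "<value>hdfs://localhost:9000</value></property>"
        + "<property><name>mapred.job.tracker</name>"
        + "<value>localhost:9101</value></property>"
        + "</configuration>";

    // Returns the <value> of the <property> with the given <name>, or null if absent.
    static String property(String xml, String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element p = (Element) props.item(i);
                String n = p.getElementsByTagName("name").item(0).getTextContent().trim();
                if (n.equals(name)) {
                    return p.getElementsByTagName("value").item(0).getTextContent().trim();
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse configuration", e);
        }
    }

    // Extracts the port from "scheme://host:port" or plain "host:port".
    static int port(String value) {
        String withScheme = value.contains("://") ? value : "dummy://" + value;
        return URI.create(withScheme).getPort();
    }

    public static void main(String[] args) {
        System.out.println("DFS Master port: "
            + port(property(SAMPLE, "fs.default.name")));
        System.out.println("Map/Reduce Master port: "
            + port(property(SAMPLE, "mapred.job.tracker")));
    }
}
```

To check a real installation, the SAMPLE string would be replaced by the contents of the actual hadoop-site.xml file.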

Create a new project with Eclipse

Create a project as explained in http://v-lad.org/Tutorials/Hadoop/23%20-%20create%20the%20project.html

Project

  • Right-click on the blank space in the Project Explorer window and select New -> Project... to create a new project.
  • Select Map/Reduce Project from the list of project types as shown in the image below.
  • Press the Next button.
  • Project Name: HadoopTest
  • Use default location
  • Click on configure Hadoop location, browse, and select /Users/thiebaut/hadoop-0.19.1 (or wherever Hadoop is installed)
  • OK
  • Finish

Map/Reduce driver class

  • Right-click on the newly created Hadoop project in the Project Explorer tab and select New -> Other from the context menu.
  • Go to the Map/Reduce folder, select MapReduceDriver, then press the Next button as shown in the image below.
  • When the MapReduce Driver wizard appears, enter TestDriver in the Name field and press the Finish button. This will create the skeleton code for the MapReduce Driver.
  • Finish
  • Unfortunately the Hadoop plug-in for Eclipse is slightly out of step with the recent Hadoop API, so we need to edit the driver code a bit.
Find the following two lines in the source code and comment them out:
    conf.setInputPath(new Path("src"));
    conf.setOutputPath(new Path("out"));
Enter the following code immediately after the two lines you just commented out (see image below):
    conf.setInputFormat(TextInputFormat.class);
    conf.setOutputFormat(TextOutputFormat.class);

    FileInputFormat.setInputPaths(conf, new Path("In"));
    FileOutputFormat.setOutputPath(conf, new Path("Out"));

Notes on doing example in Yahoo Tutorial, Module 2

http://developer.yahoo.com/hadoop/tutorial/module2.html

   cd ../hadoop/examples/
   cat > HDFSHelloWorld.java        # type or paste the source, end with Ctrl-D
   mkdir hello_classes
   javac -classpath /Users/thiebaut/hadoop/hadoop-0.19.2-core.jar -d hello_classes HDFSHelloWorld.java
   emacs HDFSHelloWorld.java -nw    # edit the source, then recompile
   javac -classpath /Users/thiebaut/hadoop/hadoop-0.19.2-core.jar -d hello_classes HDFSHelloWorld.java
   ls
   ls hello_classes/
   jar -cvf helloworld.jar -C hello_classes/ .    # package the compiled classes
   ls
   ls -ltr
   find . -name "*" -print
   hadoop jar helloworld.jar HDFSHelloWorld       # run the class from the jar

   Hello, world!
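
The commands above do not show the source of HDFSHelloWorld.java, which was typed in with cat. As a local stand-in only, the sketch below performs the same kind of round trip (write a message to a file, read it back, print it) with plain java.io streams on the local file system instead of HDFS. The class name LocalHelloWorld and the method roundTrip are hypothetical, and the tutorial's program presumably goes through Hadoop's file system API rather than java.nio.file.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class LocalHelloWorld {

    static final String MESSAGE = "Hello, world!\n";

    // Writes MESSAGE to the given file, reads it back, and returns what was read.
    static String roundTrip(Path file) {
        try {
            // Start from a clean slate if the file is left over from an earlier run.
            Files.deleteIfExists(file);
            try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(file))) {
                out.writeUTF(MESSAGE);
            }
            try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
                return in.readUTF();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Local file only; nothing here touches HDFS.
        System.out.print(roundTrip(Paths.get("hello.txt")));
    }
}
```

Because this version needs no Hadoop jar, it compiles with a plain javac LocalHelloWorld.java and runs with java LocalHelloWorld, which makes it a quick way to check the Java toolchain before trying the HDFS version.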