CSC352 Notes 2013

Revision as of 19:13, 15 December 2009 by Thiebaut (talk | contribs) (Setting up Eclipse for Hadoop)

Setting up Eclipse for Hadoop

 start-all.sh
  • set up Eclipse:
http://v-lad.org/Tutorials/Hadoop/17%20-%20set%20up%20hadoop%20location%20in%20the%20eclipse.html
    • localhost
    • Map/Reduce Master: localhost, 9101
    • DFS Master: check "Use M/R Master host", localhost, 9000 (must match the port in the hdfs value in hadoop/conf/hadoop-site.xml, i.e. localhost:9000)
    • user name: hadoop-user
    • SOCKS proxy: host, 1080
  • Open DFS Locations
    • localhost
      • (2)
        • tmp(1)
          • hadoop-thiebaut (1)
            • mapred (1)
              • system (0)
        • user(1)
          • thiebaut (2)
            • hello.txt
            • readme.txt
  • make the In directory in HDFS:
hadoop fs -mkdir In
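
The two port numbers entered in the Eclipse dialog above have to agree with the values in hadoop/conf/hadoop-site.xml. As a rough sketch, the small Java program below reads that file and prints both ports so they can be compared with the dialog. It assumes the file uses the property names fs.default.name (for the hdfs value) and mapred.job.tracker, as Hadoop 0.19-era configurations usually do. The class name SiteConfigCheck and the built-in sample contents are made up for illustration and are not taken from an actual installation.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class SiteConfigCheck {

    // Sample contents used when no file path is given.
    // Assumption: the real hadoop-site.xml has the same two properties.
    static final String SAMPLE =
        "<configuration>"
      + "<property><name>fs.default.name</name><value>hdfs://localhost:9000</value></property>"
      + "<property><name>mapred.job.tracker</name><value>localhost:9101</value></property>"
      + "</configuration>";

    // Returns the <value> of the <property> whose <name> equals key, or null if absent.
    static String property(Document doc, String key) {
        NodeList props = doc.getElementsByTagName("property");
        for (int i = 0; i < props.getLength(); i++) {
            Element p = (Element) props.item(i);
            NodeList names = p.getElementsByTagName("name");
            NodeList values = p.getElementsByTagName("value");
            if (names.getLength() > 0 && values.getLength() > 0
                    && names.item(0).getTextContent().trim().equals(key)) {
                return values.item(0).getTextContent().trim();
            }
        }
        return null;
    }

    // Extracts the port from values such as "hdfs://localhost:9000" or "localhost:9101".
    static int port(String value) {
        String v = value.endsWith("/") ? value.substring(0, value.length() - 1) : value;
        return Integer.parseInt(v.substring(v.lastIndexOf(':') + 1));
    }

    public static void main(String[] args) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // Parse the file given on the command line, or fall back to the sample above.
        Document doc = (args.length > 0)
            ? builder.parse(new File(args[0]))
            : builder.parse(new ByteArrayInputStream(SAMPLE.getBytes(StandardCharsets.UTF_8)));

        String dfs = property(doc, "fs.default.name");
        String mr = property(doc, "mapred.job.tracker");
        System.out.println("DFS Master port: "
            + (dfs == null ? "not set" : String.valueOf(port(dfs))));
        System.out.println("Map/Reduce Master port: "
            + (mr == null ? "not set" : String.valueOf(port(mr))));
    }
}
```

Run without arguments, it reports the sample values (9000 and 9101). Run with the path to hadoop-site.xml as the only argument, it reports whatever that file contains.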

Create a new project with Eclipse

Create a project as explained in http://v-lad.org/Tutorials/Hadoop/23%20-%20create%20the%20project.html

Project

  • Right-click on the blank space in the Project Explorer window and select New -> Project... to create a new project.
  • Select Map/Reduce Project from the list of project types.
  • Press the Next button.
  • Project Name: HadoopTest
  • Use default location
  • Click on configure Hadoop location, browse, and select the Hadoop installation directory, e.g. /Users/thiebaut/hadoop-0.19.1
  • OK
  • Finish

Map/Reduce driver class

  • Right-click on the newly created Hadoop project in the Project Explorer tab and select New -> Other from the context menu.
  • Go to the Map/Reduce folder, select MapReduceDriver, then press the Next button.
  • When the MapReduce Driver wizard appears, enter TestDriver in the Name field and press the Finish button. This will create the skeleton code for the MapReduce Driver.
  • Unfortunately the Hadoop plug-in for Eclipse is slightly out of step with the recent Hadoop API, so we need to edit the driver code a bit.
Find the following two lines in the source code and comment them out:
    conf.setInputPath(new Path("src"));
    conf.setOutputPath(new Path("out"));
Enter the following code immediately after the two lines you just commented out:
    conf.setInputFormat(TextInputFormat.class);
    conf.setOutputFormat(TextOutputFormat.class);

    FileInputFormat.setInputPaths(conf, new Path("In"));
    FileOutputFormat.setOutputPath(conf, new Path("Out"));

Notes on doing the example in the Yahoo Tutorial, Module 2

http://developer.yahoo.com/hadoop/tutorial/module2.html

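   cd ../hadoop/examples/
   # type or paste the tutorial source, then end input with Ctrl-D
   cat > HDFSHelloWorld.java
   mkdir hello_classes
   # compile against the Hadoop core jar; class files go into hello_classes
   javac -classpath /Users/thiebaut/hadoop/hadoop-0.19.2-core.jar -d hello_classes HDFSHelloWorld.java
   # edit the source, then compile again
   emacs HDFSHelloWorld.java -nw
   javac -classpath /Users/thiebaut/hadoop/hadoop-0.19.2-core.jar -d hello_classes HDFSHelloWorld.java
   ls
   ls hello_classes/
   # package the compiled classes into a jar
   jar -cvf helloworld.jar -C hello_classes/ .
   ls
   ls -ltr
   find . -name "*" -print
   # run the HDFSHelloWorld class from the jar with Hadoop
   hadoop jar helloworld.jar HDFSHelloWorld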

   Hello, world!