CSC352 Notes 2013
Setting up Eclipse for Hadoop
- Java 1.6
- http://v-lad.org/Tutorials/Hadoop/03%20-%20Prerequistes.html
- download Eclipse 3.3.2 (Europa) from http://www.eclipse.org/downloads/packages/release/europa/winter
- Use Hadoop 0.19.1
- open eclipse and deploy (Mac)
- uncompress Hadoop 0.19.1
- copy the eclipse-plugin from the Hadoop directory to the plugins directory of Eclipse
- start hadoop on the Mac and follow directions from http://v-lad.org/Tutorials/Hadoop page:
start-all.sh
- setup eclipse
- localhost
- Map/Reduce Master: localhost, 9101
- DFS Master: check "Use M/R Master host" (localhost), port 9000 (must match the port in the hdfs value in hadoop/conf/hadoop-site.xml, i.e. localhost:9000)
- user name: hadoop-user
- SOCKS proxy: host, 1080
- Open DFS Locations
  - localhost
    - (2)
      - tmp (1)
        - hadoop-thiebaut (1)
          - mapred (1)
            - system (0)
      - user (1)
        - thiebaut (2)
          - hello.txt
          - readme.txt
- create the In directory in HDFS:
hadoop fs -mkdir In
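The port numbers entered in Eclipse have to agree with hadoop/conf/hadoop-site.xml. The sketch below is one possible way to check that, not part of the tutorial. It parses a sample configuration and prints the two ports. The XML string is a hypothetical example built from the values in these notes (the property names fs.default.name and mapred.job.tracker are the standard ones for this Hadoop generation, but the real file on your machine may contain different values), and the class name HadoopSiteCheck is made up for illustration. The code sticks to Java 1.6 features.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class HadoopSiteCheck {
    // Hypothetical sample of hadoop/conf/hadoop-site.xml (assumption: values
    // taken from the Eclipse settings in these notes, not from a real file).
    static final String SAMPLE =
          "<configuration>"
        + "<property><name>fs.default.name</name>"
        + "<value>hdfs://localhost:9000</value></property>"
        + "<property><name>mapred.job.tracker</name>"
        + "<value>localhost:9101</value></property>"
        + "</configuration>";

    // Returns the <value> of the <property> whose <name> matches, or null if absent.
    static String property(String xml, String name) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc =
                builder.parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element p = (Element) props.item(i);
                String n = p.getElementsByTagName("name").item(0)
                            .getTextContent().trim();
                if (n.equals(name)) {
                    return p.getElementsByTagName("value").item(0)
                            .getTextContent().trim();
                }
            }
            return null;
        } catch (Exception e) {
            throw new RuntimeException("could not parse configuration", e);
        }
    }

    // Extracts the number after the last ':' in values such as "hdfs://localhost:9000".
    static int port(String value) {
        return Integer.parseInt(value.substring(value.lastIndexOf(':') + 1));
    }

    public static void main(String[] args) {
        // Compare these against the DFS Master and Map/Reduce Master ports in Eclipse.
        System.out.println("DFS Master port: "
            + port(property(SAMPLE, "fs.default.name")));
        System.out.println("Map/Reduce Master port: "
            + port(property(SAMPLE, "mapred.job.tracker")));
    }
}
```

With the sample string above this prints 9000 and 9101. To check a real installation, the contents of hadoop-site.xml would be read into a String and passed to property() in place of SAMPLE.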
Create a new project with Eclipse
Create a project as explained in http://v-lad.org/Tutorials/Hadoop/23%20-%20create%20the%20project.html
Project
- Right-click on the blank space in the Project Explorer window and select New -> Project... to create a new project.
- Select Map/Reduce Project from the list of project types.
- Press the Next button.
- Project Name: HadoopTest
- Use default location
- Click on configure Hadoop location, browse, and select /Users/thiebaut/hadoop-0.19.1 (or wherever Hadoop is installed)
- Ok
- Finish
Map/Reduce driver class
- Right-click on the newly created Hadoop project in the Project Explorer tab and select New -> Other from the context menu.
- Go to the Map/Reduce folder, select MapReduceDriver, then press the Next button.
- When the MapReduce Driver wizard appears, enter TestDriver in the Name field and press the Finish button. This will create the skeleton code for the MapReduce Driver.
- Finish
- Unfortunately the Hadoop plug-in for Eclipse is slightly out of step with the recent Hadoop API, so we need to edit the driver code a bit.
- Find the following two lines in the source code and comment them out:
conf.setInputPath(new Path("src"));
conf.setOutputPath(new Path("out"));
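- Enter the following code immediately after the two lines you just commented out:
conf.setInputFormat(TextInputFormat.class);
conf.setOutputFormat(TextOutputFormat.class);
FileInputFormat.setInputPaths(conf, new Path("In"));
FileOutputFormat.setOutputPath(conf, new Path("Out"));
- If Eclipse does not add them automatically, import TextInputFormat, TextOutputFormat, FileInputFormat and FileOutputFormat from the org.apache.hadoop.mapred package at the top of the file.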
Notes on doing example in Yahoo Tutorial, Module 2
http://developer.yahoo.com/hadoop/tutorial/module2.html
cd ../hadoop/examples/
cat > HDFSHelloWorld.java
mkdir hello_classes
javac -classpath /Users/thiebaut/hadoop/hadoop-0.19.2-core.jar -d hello_classes HDFSHelloWorld.java
emacs HDFSHelloWorld.java -nw
javac -classpath /Users/thiebaut/hadoop/hadoop-0.19.2-core.jar -d hello_classes HDFSHelloWorld.java
ls
ls hello_classes/
jar -cvf helloworld.jar -C hello_classes/ .
ls
ls -ltr
find . -name "*" -print
hadoop jar helloworld.jar HDFSHelloWorld
Hello, world!