Difference between revisions of "CSC352 MapReduce/Hadoop Class Notes"

From dftwiki3
Line 71: Line 71:
 
<br />
 
<br />
  
==The Computation==
+
==The Computation: micro view==
  
 
The images in this section are mostly taken from the excellent Yahoo tutorial (Module 4) on MapReduce <ref name="YahooMapReduceTutorial4">Yahoo Tutorial, Module 4, MapReduce Basics http://developer.yahoo.com/hadoop/tutorial/module4.html</ref>
 
Line 89: Line 89:
 
<br />
 
<br />
  
 +
==Important Statements To Remember==
 +
 +
Taken from <ref name="hadoopGuide">[http://www.amazon.com/Hadoop-Definitive-Guide-Tom-White/dp/0596521979 Hadoop: The Definitive Guide], Tom White, O'Reilly Media, June 2009, ISBN 0596521979. The Web site for the book is http://www.hadoopbook.com/</ref>
 +
 +
* A '''MapReduce job''' is a unit of work submitted by a client.
 +
 +
 +
* Hadoop divides a '''job''' into '''tasks''', of which there are two kinds: '''map tasks''' and '''reduce tasks'''.
 +
 +
 +
* There are two types of '''nodes''': a '''jobTracker''' node, which oversees the execution of a '''job''', and '''taskTracker''' nodes, which execute '''tasks'''.
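
The statements above can be illustrated with a small sketch. This is a toy, single-process simulation in Python and not the Hadoop API: the function names <code>map_task</code>, <code>reduce_task</code>, and <code>run_job</code> are hypothetical, word counting is an assumed example job, and the tasks run one after the other rather than on separate taskTracker nodes.

```python
from collections import defaultdict


def map_task(split):
    """One map task: emit a (word, 1) pair for every word in its input split."""
    return [(word.lower(), 1) for line in split for word in line.split()]


def reduce_task(key, values):
    """One reduce task: sum all the counts collected for a single key."""
    return key, sum(values)


def run_job(lines, num_map_tasks=2):
    """Play the jobTracker role (hypothetical): divide a job into tasks, collect results."""
    # Divide the input into one split per map task.
    splits = [lines[i::num_map_tasks] for i in range(num_map_tasks)]

    # In Hadoop the tasks would be handed to taskTracker nodes and run in
    # parallel; in this sketch they simply run sequentially.
    grouped = defaultdict(list)
    for split in splits:
        for key, value in map_task(split):
            grouped[key].append(value)

    # One reduce call per distinct key.
    return dict(reduce_task(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(run_job(["the quick brown fox", "the lazy dog", "the end"]))
```

Here the call to <code>run_job</code> stands in for the client submitting a job, and each call to <code>map_task</code> or <code>reduce_task</code> stands in for one task.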
 +
 +
 +
==Computation: macro view==
 
<center>[[Image:MapReduceDataFlow.png]]</center>
 
 
<br />
 
<br />

Revision as of 17:33, 31 March 2010


This section is only visible to computers located at Smith College