Difference between revisions of "CSC352 MapReduce/Hadoop Class Notes"

From dftwiki3
Line 21: Line 21:
 
* '''Phoenix''' <ref name="phoenix">Evaluating MapReduce for Multi-core and Multiprocessor Systems  
 
* '''Phoenix''' <ref name="phoenix">Evaluating MapReduce for Multi-core and Multiprocessor Systems  
 
Colby Ranger, Ramanan Raghuraman, Arun Penmetsa, Gary Bradski, Christos Kozyrakis
 
Colby Ranger, Ramanan Raghuraman, Arun Penmetsa, Gary Bradski, Christos Kozyrakis
Computer Systems Laboratory, Stanford University, http://csl.stanford.edu/~christos/publications/2007.cmp_mapreduce.hpca.pdf</ref>  Phoenix [2] is another implementation of MapReduce, designed to make effective use of multi-core and multiprocessor systems. Written in C++, it was used by its authors to apply the MapReduce programming paradigm to dense integer matrix multiplication, linear regression, and principal component analysis on a matrix (among other applications). The following graph shows the speedup they achieved over a varying number of cores:
+
Computer Systems Laboratory, Stanford University, http://csl.stanford.edu/~christos/publications/2007.cmp_mapreduce.hpca.pdf</ref>   
 +
::Phoenix [2] is another implementation of MapReduce, designed to make effective use of multi-core and multiprocessor systems. Written in C++, it was used by its authors to apply the MapReduce programming paradigm to dense integer matrix multiplication, linear regression, and principal component analysis on a matrix (among other applications). The following graph shows the speedup they achieved over a varying number of cores:
 
<center>[[Image:MapReduceMultiCorePerformance.png]]</center>
 
<center>[[Image:MapReduceMultiCorePerformance.png]]</center>
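
To make the matrix-multiplication use case concrete, here is a minimal sketch of how a dense integer matrix product can be phrased as a map step and a reduce step. It is plain, sequential Python written for illustration only: it is not the Phoenix API and not the paper's implementation, and the names <code>map_phase</code>, <code>reduce_phase</code>, and <code>matmul_mapreduce</code> are hypothetical. On a multi-core runtime, the map calls and the per-key reductions are the pieces that could run in parallel.

```python
from collections import defaultdict


def map_phase(a, b):
    """Emit ((i, j), a[i][k] * b[k][j]) for every partial product."""
    for i, row in enumerate(a):
        for k, a_ik in enumerate(row):
            for j, b_kj in enumerate(b[k]):
                yield (i, j), a_ik * b_kj


def reduce_phase(pairs):
    """Sum all partial products that share the same output cell (i, j)."""
    sums = defaultdict(int)
    for key, value in pairs:
        sums[key] += value
    return dict(sums)


def matmul_mapreduce(a, b):
    """Multiply two dense integer matrices given as lists of rows."""
    cells = reduce_phase(map_phase(a, b))
    rows, cols = len(a), len(b[0])
    return [[cells.get((i, j), 0) for j in range(cols)] for i in range(rows)]
```

Each output cell is an independent key, so the reduce work splits cleanly across cores; that independence is what a shared-memory MapReduce runtime can exploit.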
  
Line 96: Line 97:
  
  
* A '''job''' contains
+
* A '''job''' contains<br /><br />
 
 
 
** the data<br /><br />
 
** the data<br /><br />
 
** a MapReduce program<br /><br />
 
** a MapReduce program<br /><br />
Line 113: Line 113:
  
 
* '''Hadoop creates one map task for each split'''
 
* '''Hadoop creates one map task for each split'''
 +
 +
 +
* A split is '''64 MB''' by default, which matches the default HDFS block size.
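
The two points above (one map task per split, 64 MB per split) give a quick way to estimate how many map tasks a single input file produces. The sketch below is a simplification, assuming one file, a fixed split size, and no special handling of a small final chunk; the function name <code>estimate_map_tasks</code> is hypothetical and is not part of Hadoop.

```python
DEFAULT_SPLIT_BYTES = 64 * 1024 * 1024  # 64 MB, the default split size noted above


def estimate_map_tasks(file_size_bytes, split_bytes=DEFAULT_SPLIT_BYTES):
    """Estimate the number of map tasks for one input file.

    Assumes one map task per split and that a partial final split
    still gets its own task (a simplification of Hadoop's real logic).
    """
    if file_size_bytes < 0 or split_bytes <= 0:
        raise ValueError("file size must be >= 0 and split size must be > 0")
    # Integer ceiling division: round up so a partial split counts as one task.
    return (file_size_bytes + split_bytes - 1) // split_bytes
```

For example, under these assumptions a 1 GB file yields 16 map tasks, and a 200 MB file yields 4 (three full splits plus one partial split).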
  
  

Revision as of 17:42, 31 March 2010

