Difference between revisions of "CSC352 MapReduce/Hadoop Class Notes"
Line 21: | Line 21: | ||
* '''Phoenix''' <ref name="phoenix">Evaluating MapReduce for Multi-core and Multiprocessor Systems | * '''Phoenix''' <ref name="phoenix">Evaluating MapReduce for Multi-core and Multiprocessor Systems | ||
Colby Ranger, Ramanan Raghuraman, Arun Penmetsa, Gary Bradski, Christos Kozyrakis∗ | Colby Ranger, Ramanan Raghuraman, Arun Penmetsa, Gary Bradski, Christos Kozyrakis∗ | ||
− | Computer Systems Laboratory, Stanford University, http://csl.stanford.edu/~christos/publications/2007.cmp_mapreduce.hpca.pdf</ref> Another implementation of MapReduce, named Phoenix [2], has been created to facilitate optimal use of multi-core and multiprocessor systems. Written in C++, they sought to use the MapReduce programming paradigm to perform dense integer matrix multiplication as well as performing linear regression and principal components analysis on a matrix (amongst other applications). The following graph shows the speedup they achieve over varying number of cores: | + | Computer Systems Laboratory, Stanford University, http://csl.stanford.edu/~christos/publications/2007.cmp_mapreduce.hpca.pdf</ref> |
+ | ::Another implementation of MapReduce, named Phoenix [2], was created to make efficient use of multi-core and multiprocessor systems. Written in C++, it applies the MapReduce programming paradigm to dense integer matrix multiplication, linear regression, and principal components analysis on a matrix (among other applications). The following graph shows the speedup its authors achieved over a varying number of cores: | ||
<center>[[Image:MapReduceMultiCorePerformance.png]]</center> | <center>[[Image:MapReduceMultiCorePerformance.png]]</center> | ||
Line 96: | Line 97: | ||
− | * A '''job''' contains | + | * A '''job''' contains<br /><br /> |
− | |||
** the data<br /><br /> | ** the data<br /><br /> | ||
** a MapReduce program<br /><br /> | ** a MapReduce program<br /><br /> | ||
Line 113: | Line 113: | ||
* '''Hadoop creates one map task for each split''' | * '''Hadoop creates one map task for each split''' | ||
+ | |||
+ | |||
+ | * A split is '''64 MB''' by default. | ||
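The two points above (one map task per split, 64 MB per split by default) give a quick way to estimate how many map tasks a job will start. The sketch below is only an approximation under a simple assumption: the input is one file cut into equal-size splits, with a smaller final split for any remainder. The helper name `estimated_map_tasks` and the constant `SPLIT_SIZE_BYTES` are hypothetical and are not part of Hadoop; the real split computation has additional rules that this ignores.

```python
# Assumption: 64 MB default split size, as stated in the notes.
SPLIT_SIZE_BYTES = 64 * 1024 * 1024


def estimated_map_tasks(input_size_bytes, split_size_bytes=SPLIT_SIZE_BYTES):
    """Hypothetical helper: estimate the number of map tasks for one input file.

    Assumes one map task per split and a final partial split for any remainder.
    """
    if input_size_bytes < 0:
        raise ValueError("input size must be non-negative")
    if split_size_bytes <= 0:
        raise ValueError("split size must be positive")
    # Integer ceiling division: a partial last split still gets its own map task.
    return (input_size_bytes + split_size_bytes - 1) // split_size_bytes


if __name__ == "__main__":
    one_gb = 1024 * 1024 * 1024
    print(estimated_map_tasks(one_gb))
```

Under this simple model, a 200 MB file yields three full splits plus one partial split, so four map tasks.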
Revision as of 17:42, 31 March 2010