CSC352 MapReduce/Hadoop Class Notes
Revision as of 12:12, 29 March 2010
Outline
- History
- Infrastructure
- Submitting a Job
- Smith Cluster
- Example 1: Java
- Example 2: Python
- Useful Commands
References
- Apache Hadoop Tutorial: http://hadoop.apache.org/common/docs/current/mapred_tutorial.html
- Wikipedia: http://en.wikipedia.org/wiki/MapReduce
History
- Introduced in 2004
- MapReduce is a patented software framework introduced by Google to support distributed computing on large data sets on clusters of computers. [Wikipedia]
- 2010: first conference, the First International Workshop on MapReduce and its Applications (MAPREDUCE'10) (http://graal.ens-lyon.fr/mapreduce/). Interesting tidbit: nobody from Google is on the planning committee; it is mostly INRIA.