Difference between revisions of "CSC352 MapReduce/Hadoop Class Notes"

From dftwiki3

Revision as of 12:12, 29 March 2010

Outline

  • History
  • Infrastructure
  • Submitting a Job
  • Smith Cluster
  • Example 1: Java
  • Example 2: Python
  • Useful Commands

References

History

  • Introduced in 2004
  • MapReduce is a patented software framework introduced by Google to support distributed computing over large data sets on clusters of computers. [wikipedia]
  • 2010: first dedicated conference, the First International Workshop on MapReduce and its Applications (MAPREDUCE'10) (http://graal.ens-lyon.fr/mapreduce/). Interesting tidbit: nobody from Google was on the planning committee; its members were mostly from INRIA.