CSC352 MapReduce/Hadoop Class Notes

From dftwiki3
Revision as of 12:12, 29 March 2010 by Thiebaut

Outline

  • History
  • Infrastructure
  • Submitting a Job
  • Smith Cluster
  • Example 1: Java
  • Example 2: Python
  • Useful Commands

References

History

  • Introduced in 2004
  • MapReduce is a patented[1] software framework introduced by Google to support distributed computing on large data sets on clusters of computers. [wikipedia]
  • 2010: first conference, the First International Workshop on MapReduce and its Applications (MAPREDUCE'10) (http://graal.ens-lyon.fr/mapreduce/). Interesting tidbit: nobody from Google was on the planning committee; its members were mostly from INRIA.