Difference between revisions of "CSC352 Class Page 2010"

From dftwiki3
  
 
===Literature===
 
* [[Image:hadoopOReilly.jpg | right |100px]] [http://www.amazon.com/Hadoop-Definitive-Guide-Tom-White/dp/0596521979 Hadoop: The Definitive Guide], Tom White, O'Reilly Media, June 2009, ISBN 0596521979
 
* Dean, J., and S. Ghemawat, [http://labs.google.com/papers/mapreduce-osdi04.pdf MapReduce: Simplified Data Processing on Large Clusters], Dec. 2004,  ([[media:MapReduce1204.pdf|cached copy]])
 
* Czajkowski, G., [http://googleblog.blogspot.com/2008/11/sorting-1pb-with-mapreduce.html Sorting 1 PB with MapReduce], Nov. 2008, ([[media:Sorting1PBWithMapReduce.pdf|cached copy]])
 
* Armbrust, M., ''et al.'', [http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.pdf Above the Clouds: A Berkeley View of Cloud Computing], Tech. Rep. UCB/EECS-2009-28, Feb. 2009 ([[media:AboveTheCloudsBerkeley.pdf|cached copy]])
 
 
===Class Material===
 
* [http://code.google.com/edu/submissions/uwspr2007_clustercourse/listing.html University of Washington: Problem Solving on Large Scale Clusters]:
 

Revision as of 11:12, 3 December 2009

Python Threads

XGrid Programming

Cloud Computing

References & Bibliography

Parallel Processing/Good background information

Python

XGrid

Cloud Computing

Literature

Class Material

The University of Washington ran an upper-division course on Distributed Computing with MapReduce in Spring 2007. Below you'll find the materials that were used for the class: five lectures in PowerPoint format and four lab exercises that students completed over the duration of the course, using a cluster running Hadoop.

Software/Web Links

Setting up a Hadoop cluster can be an all-day job. However, if you want to experiment with the platform right away, Google has created a virtual machine image with a preconfigured single-node instance of Hadoop.
The IBM MapReduce Tools for Eclipse Plug-in is a robust plug-in that brings Hadoop support to the Eclipse platform. Features include server configuration, support for launching MapReduce jobs and browsing the distributed file system. This setup assumes that you are running Eclipse (version 3.3 or above) on your computer.
A video from Cloudera on setting up Hadoop (not easy to follow).