Difference between revisions of "CSC352 Class Page 2010"

 
|}
 
|}
  
=Resources: References & Bibliography=
 
 
 
<!--onlysmith-->
 
==Parallel Processing/Good background information==
 
* Asanovic K. ''et al.'', [http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.pdf The Landscape of Parallel Computing Research: A View from Berkeley], Dec. 2006. ([[media:LandscapeParallelProcessingBerkeley1206.pdf|cached copy]])
 
* Xen
 
** Mauer, R., [http://www.linuxjournal.com/article/8812 Xen Virtualization and Linux Clustering], [http://www.linuxjournal.com Linux Journal] January 12th, 2006
 
** Barham P., ''et al.'', [[media:XenAndTheArtOfVirtualization_3.pdf | Xen and the Art of Virtualization]], University of Cambridge Computer Laboratory 15 JJ Thomson Avenue, Cambridge, UK, CB3 0FD
 
* AMD News
 
**  Hardwidge, B., [http://www.bit-tech.net/custompc/news/605374/amd-plans-supercomputer-with-1000-gpus.html AMD plans supercomputer with 1,000 GPUs], Jan. 2009, [http://www.bit-tech.net bit-tech.net] (or graphics goes to the clouds!)
 
** Halfacree G., [http://www.bit-tech.net/news/hardware/2009/11/17/amd-supercomputer-tops-top500-list/1 AMD supercomputer tops TOP500 list], November 2009, [http://www.bit-tech.net bit-tech.net] (or Intel gets a black eye!)
 
* Google University Code
 
** [http://code.google.com/edu/submissions/rutgers/index.html Lecture Notes] by Paul Krzyzanowski for a course on Distributed Computing at Rutgers.  Quite complete; they cover the basics of parallelism, RPC, synchronization, fault tolerance, security, and distributed file systems.
 
 
* [http://research.microsoft.com/en-us/collaboration/fourthparadigm/ The Fourth Paradigm: Data-Intensive Scientific Discovery], Microsoft Research, 2009.  [http://research.microsoft.com/en-us/collaboration/fourthparadigm/contents.aspx Table of Contents].  A superb collection of essays on different topics ([http://tango.csc.smith.edu/dftwiki/images/4thParadigmMicrosoftGray09.pdf Low-res cached copy]).  The main chapters are:
 
** Part 1: Earth and Environment
 
** Part 2: Health and Wellbeing
 
** Part 3: Scientific Infrastructure
 
** Part 4: Scholarly Communication
 
** Final Thoughts
 
 
* Xgrid
 
** Hughes, B., [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.60.7248&rep=rep1&type=pdf Building Computational Grids with Apple's XGrid Middleware],  ''ACM International Conference Proceeding Series'', Vol. 167, Hobart, Tasmania, Australia, 2006. ([[media:buildingComputationalGrids.pdf|cached copy]])
 
<!--** Kokaly M., et. al., [http://www.cas.mcmaster.ca/~downd/mgst/files/LPAS%20Paper.pdf MGST: A framework for performance evaluation of Desktop Grids], 2009 IEEE International Symposium on Parallel&Distributed Processing, Rome, Italy ([[Media:MGSTFrameworkPerformanceXGrid.pdf|cached copy]])-->
 
** Tsouloupas G, and M. Dikaiakos, [http://grid.ucy.ac.cy/reports/TR-04-5.pdf Characterization of Computational Grid Resources Using Low-Level Benchmarks], Second IEEE International Conference on e-Science and Grid Computing, Amsterdam, Netherlands, 2006 ([[Media:CharacterizationComputationalGridBenchmark.pdf|cached copy]])
 
 
==Python Threads==
 
<greenbox>
 
[[Image:smilingPython.png| right| 100px]]
 
* [http://python.org/ The main Python reference]
 
* [http://heather.cs.ucdavis.edu/~matloff/Python/PyThreads.pdf Norman Matloff and Francis Hsu's Tutorial] on Python Threads (University of California, Davis) ([[media:matlof_PythonTutorial.pdf|cached copy]])
 
* [http://linuxgazette.net/107/pai.html Understanding Threading in Python], Krishna G Pai, Linux Gazette, Oct. 2004
 
* [http://www.python.org/doc/2.3.5/lib/thread-objects.html Thread Objects] from [http://www.python.org Python.Org]
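
As a quick taste of what the tutorials above cover, here is a minimal sketch of the start/join pattern with Python's standard <code>threading</code> module. The function names <code>count_words</code> and <code>parallel_word_count</code> are made up for this illustration and do not come from any of the linked references; the sketch assumes Python 3.

```python
import threading


def count_words(lines, results, index):
    """Worker: count the words in its share of the lines."""
    # Each worker writes only to its own slot of the shared list,
    # so no lock is needed for this particular pattern.
    results[index] = sum(len(line.split()) for line in lines)


def parallel_word_count(lines, num_threads=2):
    """Split the lines among num_threads workers and add up their counts."""
    # Round-robin split: thread i gets lines i, i + num_threads, ...
    chunks = [lines[i::num_threads] for i in range(num_threads)]
    results = [0] * num_threads
    threads = [
        threading.Thread(target=count_words, args=(chunk, results, i))
        for i, chunk in enumerate(chunks)
    ]
    for t in threads:
        t.start()
    # join() blocks until the thread has finished its work.
    for t in threads:
        t.join()
    return sum(results)


if __name__ == "__main__":
    text = ["the quick brown fox", "jumps over", "the lazy dog"]
    print(parallel_word_count(text))
```

Because each thread writes to a separate slot and the main thread reads the results only after every <code>join()</code> has returned, the example sidesteps the locking issues the tutorials discuss; shared counters updated by several threads would need a <code>threading.Lock</code>.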
 
</greenbox>
 
 
==XGrid==
 
<bluebox>
 
[[Image:xgridLogo.png | right|100px]]
 
 
* What's an XGrid system?
 
** [http://developer.apple.com/mac/library/documentation/MacOSXServer/Conceptual/Xgrid_Programming_Guide/Overview/Overview.html#//apple_ref/doc/uid/TP40006246-CH2-SW1 XGrid Overview] from Apple
 
** [http://data.scl.utah.edu/fmi/xsl/stream/details.xsl?-recid=104&a::v=2212a4Eaya A Video] presentation of the XGrid (click on movie reel icon to start).
 
** A very good overview of the XGrid from [http://www.macdevcenter.com/pub/a/mac/2005/08/23/xgrid.html?page=1 macdevcenter.com]
 
* [http://tango.csc.smith.edu/classwiki/index.php/Xgrid_Programming Programming Examples, Setup, and References]
 
 
===General References===
 
 
* [http://images.apple.com/server/macosx/docs/Xgrid_Admin_and_HPC_v10.5.pdf XGrid Admin and High Performance Computing] document (PDF)
 
* [http://www.apple.com/macosx/features/xgrid/ Apple Xgrid]
 
*[http://lists.apple.com/faq/pub/xgrid_users/ Apple Xgrid FAQ]
 
*[http://www.macdevcenter.com/pub/a/mac/2005/08/23/xgrid.html?page=1 MacDevCenter]
 
*[http://www.macresearch.org/the_xgrid_tutorials_part_i_xgrid_basics MacResearch]
 
*[http://cmgm.stanford.edu/~cparnot/xgrid-stanford/index.html Stanford Xgrid]
 
* [http://www.macos.utah.edu/documentation/administration/xgrid/xgrid_presentation.html Utah Xgrid]
 
 
===Applications===
 
* [http://developer.apple.com/documentation/MacOSXServer/Conceptual/Xgrid_Programming_Guide/Introduction/chapter_1_section_1.html XGrid Programming Guide]
 
* [http://tango.csc.smith.edu/classwiki/images/a/a4/XGrid_An_Introduction_to_R.pdf An Introduction to R]
 
* [http://unu.novajo.ca/simple/archives/000024.html  POVray] on the XGrid
 
* [http://cmgm.stanford.edu/~cparnot/xgrid-stanford/index.html Stanford] Xgrid: One of the largest XGrid systems around.
 
* [http://www.macos.utah.edu/documentation/administration/xgrid/xgrid_presentation.html Utah] Xgrid: Lots of good stuff.
 
* [http://reference.wolfram.com/mathematica/guide/StandaloneMathematicaKernels.html Using the Mathematica Kernel].
 
 
</bluebox>
 
 
==Cloud Computing==
 
<blockquote>"''Failure is the defining difference between distributed and local programming''" <br>
 
Ken Arnold, CORBA Designer
 
</blockquote>
 
<tanbox>
 
__NOTOC__
 
===Literature===
 
* [[Image:hadoopOReilly.jpg | right |100px]] [http://www.amazon.com/Hadoop-Definitive-Guide-Tom-White/dp/0596521979 Hadoop: The Definitive Guide], Tom White, O'Reilly Media, June 2009, ISBN 0596521979.  The Web site for the book is http://www.hadoopbook.com/ (with the data used as examples in the book).
 
* Dean, J., and S. Ghemawat, [http://labs.google.com/papers/mapreduce-osdi04.pdf MapReduce: Simplified Data Processing on Large Clusters], Dec. 2004,  ([[media:MapReduce1204.pdf|cached copy]])
 
*  Czajkowski G., [http://googleblog.blogspot.com/2008/11/sorting-1pb-with-mapreduce.html  Sorting 1 PB with MapReduce], Nov. 2008, ([[media:Sorting1PBWithMapReduce.pdf|cached copy]])
 
* Armbrust M., ''et al.'', [http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.pdf Above the Clouds: A Berkeley View of Cloud Computing], Tech. Rep. UCB/EECS-2009-28, Feb. 2009 ([[media:AboveTheCloudsBerkeley.pdf|cached copy]])
 
* Olston C. ''et al.'', [[Media:pigLatinNotSoForeignLanguage.pdf |Pig Latin: A Not-So-Foreign Language for Data Processing]], SIGMOD’08, June 9–12, 2008, Vancouver, BC, Canada.
 
* Ghemawat S., H. Gobioff, and S.T. Leung, [http://labs.google.com/papers/gfs-sosp2003.pdf The Google File System], SOSP’03, October 19–22, 2003, Bolton Landing, New York, USA.
 
* [http://research.microsoft.com/en-us/collaboration/fourthparadigm/ The Fourth Paradigm: Data-Intensive Scientific Discovery], Microsoft Research, 2009.  [http://research.microsoft.com/en-us/collaboration/fourthparadigm/contents.aspx Table of Contents],  ([http://tango.csc.smith.edu/dftwiki/images/4thParadigmMicrosoftGray09.pdf Low-res cached copy]). 
 
** [http://research.microsoft.com/en-us/collaboration/fourthparadigm/4th_paradigm_book_part3_larus_gannon.pdf Multicore Computing and Scientific Discovery], by Larus and Gannon
 
** [http://research.microsoft.com/en-us/collaboration/fourthparadigm/4th_paradigm_book_part3_gannon_reed.pdf Parallelism and the Cloud], by Gannon and Reed
 
** [http://research.microsoft.com/en-us/collaboration/fourthparadigm/4th_paradigm_book_part3_hansen_johnson.pdf Visualization and Data-Intensive Science] by Hansen, Johnson, Pascucci, and Silva.
 
===Media Reports===
 
* Markoff, J., [[media:DelugeOfDataShapesNewEraInComputing.pdf | A Deluge of Data Shapes a New Era in Computing]], ''New York Times'', 12/15/09
 
 
===Class Material on the Web===
 
* [http://code.google.com/edu/submissions/mapreduce-minilecture/listing.html Google]'s series of 4 lectures on map-reduce, distributed file-system, and clustering algorithms.
 
* '''University of Washington''': [http://code.google.com/edu/submissions/uwspr2007_clustercourse/listing.html  Problem Solving on Large Scale Clusters]
 
* '''Brandeis University''': [http://www.cs.brandeis.edu/~cs147a/ Distributed Systems Course]
 
** [http://www.cs.brandeis.edu/~cs147a/lab/hadoop-intro/ Introduction to Hadoop Lab]
 
** [http://www.cs.brandeis.edu/~cs147a/lab/hadoop-singlenode/ Single Node setup Lab]
 
** [http://www.cs.brandeis.edu/~cs147a/lab/hadoop-example/ Hadoop Example Program Lab]
 
** [http://www.cs.brandeis.edu/~cs147a/lab/hadoop-cluster/ Hadoop Cluster Setup Lab]
 
* '''Google''': [http://code.google.com/edu/parallel/mapreduce-tutorial.html Introduction to Parallel Programming and MapReduce]
 
* '''U. C. Berkeley''': [http://code.google.com/edu/submissions/ucberkeley-parallelism/index.html Intro to Parallel Programming and Threading]
 
* '''California PolyTech''': [http://code.google.com/edu/submissions/capolytech-parallel-programming/ A lab on the NetFlix data set]
 
* '''New Mexico Tech''': [http://scl.cs.nmt.edu/~doshin/t/s09/cs589/index.html syllabus] ([[Media:UNewMexicoCloudComputingSyllabus.pdf|pdf]])
 
* '''U. Maryland''': [http://www.umiacs.umd.edu/~jimmylin/cloud-2008-Fall/index.html Syllabus], and [http://www.umiacs.umd.edu/~jimmylin/cloud9/docs/index.html Jimmy Lin's Cloud 9] page.
 
 
===Software/Web Links===
 
[[Image:HadoopCartoon.png | 100px | right]]
 
*[http://hadoop.apache.org/common/ Apache's Documentation on Hadoop Common]
 
**[http://hadoop.apache.org/common/docs/current/mapred_tutorial.html The Hadoop Tutorial] from Apache.  A "Must-Do!"
 
**[http://hadoop.apache.org/common/docs/current/streaming.html#Hadoop+Streaming Hadoop Streaming], which lets you use Hadoop with Python, for example.
 
* [http://developer.yahoo.com/hadoop/tutorial/ A Yahoo Tutorial] on Hadoop.  Another "Must-Do!"
 
* [http://v-lad.org/Tutorials/Hadoop/00%20-%20Intro.html A Hadoop-on-Eclipse] tutorial.  Written for the Windows platform, but works for Macs as well.  Best way to set up Eclipse!  You will need Eclipse 3.3.2 and Hadoop 0.19.1.
 
*[http://www.hadoopbook.com/ The Hadoop-Book] Web site.
 
*[http://wiki.apache.org/hadoop/FrontPage The Hadoop Wiki], the authoritative source on working with Hadoop. <font color="purple">Many examples in Java and Python</font>
 
** [http://wiki.apache.org/hadoop/WordCount WordCount]
 
** [http://wiki.apache.org/hadoop/PythonWordCount Python WordCount]
 
** [http://wiki.apache.org/hadoop/C%2B%2BWordCount C++ WordCount]
 
** [http://wiki.apache.org/hadoop/HadoopDfsReadWriteExample How to read and write to HDFS]
 
*[http://code.google.com/edu/parallel/tools/hadoopvm/index.html  Hadoop at Google]: A preconfigured single node instance available at Google.
 
* [http://www.michael-noll.com/wiki/Writing_An_Hadoop_MapReduce_Program_In_Python Writing the WordCount] in Python
 
*[http://code.google.com/edu/parallel/tools/hadoopvm/index.html Guide for setting up IBM's Eclipse Tools for Hadoop] (go to bottom of page)
 
:The IBM MapReduce Tools for Eclipse Plug-in is a robust plug-in that brings Hadoop support to the Eclipse platform. Features include server configuration, support for launching MapReduce jobs and browsing the distributed file system. This setup assumes that you are running Eclipse (version 3.3 or above) on your computer.
 
* [http://www.infosci.cornell.edu/hadoop/mac.html Guide] from Cornell for setting up Hadoop on a Mac.
 
*[http://www.cloudera.com/blog/2009/04/20/configuring-eclipse-for-hadoop-development-a-screencast/ Configuring Eclipse for Hadoop] A video from Cloudera on setting up Hadoop... not easy to follow...
 
* [https://trac.declarativity.net/browser/hadoop-0.19.1-bfs/src/examples/org/apache/hadoop/examples The source code for the examples] that come with the Hadoop 0.19.1 distribution.  Includes WordCount, WordCountAggregate, WordCountHistogram, PiEstimator, Join, and Grep, among others.
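
The WordCount and Hadoop Streaming links above all revolve around the same idea, which can be sketched in a few lines of Python. With Hadoop Streaming, the mapper and the reducer are ordinary programs that read lines on standard input and write tab-separated key/value lines on standard output, and the reducer receives its input sorted by key. The sketch below is an illustration only, not code taken from any of the linked pages: the names <code>mapper</code>, <code>reducer</code>, and <code>run_locally</code> are made up here, the two phases are kept in one file, and a plain <code>sorted()</code> call stands in for Hadoop's shuffle-and-sort step so the logic can be tried without a cluster.

```python
from itertools import groupby


def mapper(lines):
    """Map phase: emit one 'word<TAB>1' record per word."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reducer(sorted_records):
    """Reduce phase: add up the counts of each run of identical keys.

    Assumes the records arrive sorted by key, as they would from Hadoop.
    """
    pairs = (record.rstrip("\n").split("\t", 1) for record in sorted_records)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


def run_locally(lines):
    """Local stand-in for a job: map, sort (the 'shuffle'), then reduce."""
    return list(reducer(sorted(mapper(lines))))


if __name__ == "__main__":
    # In a real streaming job the mapper and reducer would be separate
    # scripts reading sys.stdin; a small in-memory sample is used here.
    sample = ["to be or not to be", "to see or not to see"]
    for record in run_locally(sample):
        print(record)
```

In an actual streaming job the two functions would live in two separate scripts, each looping over <code>sys.stdin</code> and printing its records; the tutorials linked above show how to submit such scripts to a cluster.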
 
 
===Videos===
 
* [http://code.google.com/edu/submissions/mapreduce-minilecture/listing.html Google]'s series of 4 lectures on map-reduce, distributed file-system, and clustering algorithms.
 
* [http://jez.blip.tv/file/245701/ A video of Tom White], author of O'Reilly's Hadoop guide, on BlipTV. White outlines the suite of projects centered around Hadoop (an open source Map/Reduce project).
 
* [http://www.cloudera.com/hadoop-training-basic Cloudera]'s collection of videos. 
 
** [http://www.cloudera.com/hadoop-training-basic Thinking At Scale]  <-- Start here!
 
** [http://www.cloudera.com/hadoop-training-basic MapReduce and HDFS]
 
** [http://www.cloudera.com/hadoop-training-basic Hadoop Ecosystem Tour]
 
** [http://www.cloudera.com/hadoop-training-basic Programming with Hadoop]
 
** [http://www.cloudera.com/hadoop-training-basic Introduction to Hive]
 
** [http://www.cloudera.com/hadoop-training-basic Introduction to Pig]
 
** [http://www.cloudera.com/hadoop-training-basic MapReduce Algorithms]
 
** [http://www.cloudera.com/hadoop-training-basic Training Exercises and Tutorials]
 
** [http://www.cloudera.com/hadoop-training-basic Getting Started with Hadoop]
 
** [http://www.cloudera.com/hadoop-training-basic Writing MapReduce Programs]
 
** [http://www.cloudera.com/hadoop-training-basic Hive Tutorial]
 
** [http://www.cloudera.com/hadoop-training-basic Pig Tutorial]
 
* [http://www.hulu.com/watch/116372/cnbc-originals-inside-the-mind-of-google CNBC's report: Inside the Mind of Google]. "The best way to watch “Inside the Mind of Google,” Maria Bartiromo’s report on the Internet giant Thursday on CNBC, is to not watch the first quarter of it." (from Neil Genzlinger's 12/02/09 [http://www.nytimes.com/2009/12/03/arts/television/03mind.html NYT review])
 
* [http://www.cloudcomputingcourse.com/SEM/?source=google&gclid=CLnOkojC0J0CFd9M5QodPz9qAg Short video] by consultant at http://www.stratoslearning.com (5 min) . Outlines a course on Cloud Computing.
 
** Part I: cloud fundamentals
 
** Part II: technology and barriers
 
** Part III: security
 
** Part IV: what options? players?
 
** Part V: Application, hands on
 
** Uses Amazon as test platform.
 
</tanbox>
 
 
[[CSC352_Notes | <font color="white">Notes</font>]]
 
<!--/onlysmith-->
 
  
 
<br />
 
<br />

Revision as of 07:44, 19 January 2010



Main Page | Syllabus | Schedule | Links & Resources



Python Threads

Week | Topics | Reading
Week 1 (1/25) | Tuesday, Thursday |
Week 2 (2/1) | Tuesday, Thursday |
Week 3 (2/8) | Tuesday, Thursday |

 

XGrid Programming

Week | Topics | Reading
Week x | Tuesday, Thursday |
Week x | Tuesday, Thursday |
Week x | Tuesday, Thursday |

 

Cloud Computing

Week | Topics | Reading
Week x | Tuesday, Thursday |
Week x | Tuesday, Thursday |
Week x | Tuesday, Thursday |

 














(c) D. Thiebaut 2009, Dept. Computer Science, Smith College.