Difference between revisions of "CSC352 Class Page 2013"
(→Weekly Schedule) |
|||
(176 intermediate revisions by the same user not shown) | |||
Line 6: | Line 6: | ||
<br /> | <br /> | ||
<br /> | <br /> | ||
− | <center>[[CSC352 2013|Main Page]]|[[CSC352 Syllabus 2013|Syllabus]]|[ | + | <center>[[CSC352 2013|Main Page]]|[[CSC352 Syllabus 2013|Syllabus]]|[http://cs.smith.edu/classwiki/index.php/CSC352_Project_Page_2013 Project Page] |
+ | | [https://piazza.com/smith/fall2013/csc352/home PIAZZA]</center> | ||
<br /> | <br /> | ||
<br /> | <br /> | ||
Line 35: | Line 36: | ||
** What is a process? | ** What is a process? | ||
** What is a thread? | ** What is a thread? | ||
+ | ---- | ||
+ | ---- | ||
---- | ---- | ||
---- | ---- | ||
Line 58: | Line 61: | ||
*** Regroup and gather statistics on the different machines in the classroom | *** Regroup and gather statistics on the different machines in the classroom | ||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
---- | ---- | ||
** Comments on '''bimonthly newsletter''' | ** Comments on '''bimonthly newsletter''' | ||
Line 82: | Line 77: | ||
**** The Official Google blog | **** The Official Google blog | ||
**** Review: Tom's Hardware | **** Review: Tom's Hardware | ||
− | **** Some of the sites listed in [ | + | **** Some of the sites listed in [https://rohidassanap.wordpress.com/2013/06/18/top-40-best-technology-news-websites-the-definitive-list/ this page's] top 40 list. |
*** Recommendation for news aggregator: [http://cloud.feedly.com/#welcome Feedly.com] | *** Recommendation for news aggregator: [http://cloud.feedly.com/#welcome Feedly.com] | ||
Line 107: | Line 102: | ||
*** Introduction to '''Speedup( ''N'' )''', where ''N'' is the number of threads, or the number of processors. | *** Introduction to '''Speedup( ''N'' )''', where ''N'' is the number of threads, or the number of processors. | ||
*** '''Amdahl's Law''' | *** '''Amdahl's Law''' | ||
− | <center>[[Image:AmdahlsLaw.jpg|450px | + | <center>[[Image:AmdahlsLaw.jpg|450px]]</center> |
+ | <!-- | ||
*** '''Processor Utilization''' | *** '''Processor Utilization''' | ||
<center>[[Image:ParallelProcessorUtilizationDefinition.gif]]</center> | <center>[[Image:ParallelProcessorUtilizationDefinition.gif]]</center> | ||
<br /> | <br /> | ||
− | <center>[[Image:ParallelProcessorUtilizationGraph.gif|450px | + | <center>[[Image:ParallelProcessorUtilizationGraph.gif|450px]]</center> |
<br /> | <br /> | ||
+ | --> | ||
*** A bit of Computer Architecture: Cores and Caches | *** A bit of Computer Architecture: Cores and Caches | ||
− | <center>[[Image:4CoreAndLevel123Caches.png| | + | <center>[[Image:4CoreAndLevel123Caches.png|500px]]<br /><br /><br /><br /> |
− | [[Image:4CoreAndLevel3CacheDie.jpg| | + | [[Image:4CoreAndLevel3CacheDie.jpg|500px]]<br /><br /><br /><br /> |
+ | [[Image:LatenciesInMemoryHierarchy.png|500px]] | ||
+ | </center> | ||
+ | (last slide taken from [http://www.cs.utexas.edu/users/mckinley/352/lectures/16.pdf www.cs.utexas.edu/users/mckinley/352/lectures/16.pdf]) | ||
<br /> | <br /> | ||
*** [[CSC352 Synchronization and Java Threads | Synchronizing Java Threads]] | *** [[CSC352 Synchronization and Java Threads | Synchronizing Java Threads]] | ||
+ | ***[[ CSC352: Using Bash, an example | Using Bash to run a program multiple times]] | ||
+ | ---- | ||
+ | ---- | ||
+ | ---- | ||
+ | ---- | ||
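A minimal sketch of the '''Speedup( ''N'' )''' idea introduced above, written in Python for brevity. It assumes the usual statement of Amdahl's Law, Speedup(N) = 1 / ((1 - p) + p/N), where p is the fraction of the running time that can be parallelized. The names <code>amdahl_speedup</code>, <code>speedup_limit</code> and <code>p</code> are illustrative and do not come from the class material.

```python
def amdahl_speedup(p: float, n: int) -> float:
    """Speedup(N) under Amdahl's Law.

    p is the (assumed) parallelizable fraction of the run time,
    n is the number of threads or processors.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    # The serial part (1 - p) is unchanged; the parallel part shrinks by n.
    return 1.0 / ((1.0 - p) + p / n)


def speedup_limit(p: float) -> float:
    """Upper bound on the speedup as n grows without limit."""
    return float("inf") if p == 1.0 else 1.0 / (1.0 - p)


if __name__ == "__main__":
    # Print the speedup for a program assumed to be 90% parallelizable.
    for n in (1, 2, 4, 8, 16):
        print(n, round(amdahl_speedup(0.9, n), 2))
```

With p = 0.9 the speedup can never exceed 10, no matter how many processors are added, which is the point the law makes about the serial fraction.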
* '''Thursday''' | * '''Thursday''' | ||
− | ** Discussion of [http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.pdf A View of | + | ** Discussion of [http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.pdf A View of Parallel Processing] from Berkeley. Prepare a 1- to 2-page summary of the paper in '''Latex'''. Hand in the summary in class. No summaries will be accepted after class. | ||
+ | <center>[[Image:AViewFromBerkeleyWordle.png|400px]]</center> | ||
+ | <br /> | ||
+ | ** Some topics taken from the paper: | ||
+ | *** Moore's Law: | ||
+ | **** Processor-DRAM gap increasing (graph taken from [http://www.cs.virginia.edu/stream/ www.cs.virginia]) | ||
+ | <center> | ||
+ | [[Image:MooresLawProcessorMemoryGap.gif]] | ||
+ | </center> | ||
+ | |||
+ | *** '''Barnes and Hut''' approach to N-Body problem<br /> | ||
+ | <center><videoflash>XAlzniN6L94</videoflash></center> | ||
+ | <br /> | ||
+ | *** '''Monte Carlo''': see the current [[CSC352_Homework_1_2013 | Homework #1]] and [[CSC352_Homework_1_Solution_2013|Solution]] | ||
+ | ***Example of Many-Core architecture: | ||
+ | <center> [[Image:ManyCoreArchitecture.jpg|400px]]</center><br /> | ||
+ | |||
+ | (Image taken from URL: http://www.altera.com/technology/system-design/articles/2012/multicore-many-core.html) | ||
+ | *** nanometers: where are we now? | ||
+ | <center>[[Image:nm_fabricationProcess.png]]</center> | ||
+ | <br />( Image taken from http://en.wikipedia.org/wiki/22_nanometer) | ||
+ | |||
+ | *** Ring network connecting cores in Intel architecture (http://semiaccurate.com/2012/08/28/intel-details-knights-corner-architecture-at-long-last/) | ||
+ | <center>[[Image:RingNetworkLinkingMultiCoreIntelArch.png|500px]]</center> | ||
+ | |||
+ | ** Short preparation for Maggie Lind's tour of the SCMA on Tuesday. Meeting place is entrance of SCMA. | ||
+ | *** The project falls within the field of [http://en.wikipedia.org/wiki/Culturomics Culturomics] | ||
+ | |||
---- | ---- | ||
− | * | + | * [[CSC352 Homework 1 2013| Homework #1]], due 9/19/13 and [[CSC352 Homework 1 Solution 2013 | Solution]] |
|| | || | ||
* [http://docs.oracle.com/javase/tutorial/essential/concurrency/sync.html Java Synchronization] methods | * [http://docs.oracle.com/javase/tutorial/essential/concurrency/sync.html Java Synchronization] methods | ||
Line 135: | Line 167: | ||
* '''Thursday''': | * '''Thursday''': | ||
** <font color="red">Newsletter #1 due today!</font> | ** <font color="red">Newsletter #1 due today!</font> | ||
+ | ** Organization and Anatomy of the ''View from Berkeley'' paper. | ||
+ | ** How to prepare a talk: [[Media:theButlerDidIt_0913.pdf | An example: "The Butler Did It!"]] (and [[Media:theButlerDidIt_0903.pdf| also this]]) | ||
+ | ** [[Media:CSC352_JavaSynchronization_presentation_0913.pdf | Java synchronization]], a presentation | ||
+ | ** [[CSC352 Java Threads: Producer-Consumer Lab| Producer-Consumer Lab]] | ||
---- | ---- | ||
− | * | + | * [[CSC352 Homework 2 2013| Homework 2]], will be due 9/28/13 at midnight. [[CSC352 Homework 2 2013 Solution| Solution programs]] |
|| | || | ||
+ | All the data structures of interest (concurrent non-blocking and blocking) can be found in the Oracle documentation. The information is a bit cryptic, but you need to get comfortable with it! | ||
+ | * [http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/BlockingQueue.html BlockingQueue] | ||
+ | * [http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ConcurrentLinkedQueue.html ConcurrentLinkedQueue] | ||
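As a rough illustration of what a blocking queue does, here is a minimal producer-consumer sketch. It uses Python's <code>queue.Queue</code> as a stand-in for Java's <code>BlockingQueue</code>; the two are analogous but not identical, and the names <code>producer</code>, <code>consumer</code>, <code>run_pipeline</code> and <code>SENTINEL</code> are made up for this example.

```python
import queue
import threading

SENTINEL = None  # hypothetical end-of-stream marker; items must not be None


def producer(q, items):
    """Put every item on the queue, then signal the end of the stream."""
    for item in items:
        q.put(item)       # blocks while the bounded queue is full
    q.put(SENTINEL)


def consumer(q, results):
    """Take items off the queue until the sentinel arrives."""
    while True:
        item = q.get()    # blocks while the queue is empty
        if item is SENTINEL:
            break
        results.append(item * item)


def run_pipeline(items, capacity=2):
    """Run one producer and one consumer over a bounded queue."""
    q = queue.Queue(maxsize=capacity)
    results = []
    p = threading.Thread(target=producer, args=(q, items))
    c = threading.Thread(target=consumer, args=(q, results))
    p.start()
    c.start()
    p.join()
    c.join()
    return results
```

The small capacity is deliberate: it forces the producer to wait for the consumer, which is the behavior that distinguishes a blocking queue from a non-blocking one.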
<!-- ================================================================== --> | <!-- ================================================================== --> | ||
Line 144: | Line 183: | ||
| Week 4 <br /> 9/24<br /> | | Week 4 <br /> 9/24<br /> | ||
|| | || | ||
− | * '''Tuesday''': <font color="magenta"> | + | * '''Tuesday''': <font color="magenta">Guest Lecture/Informal discussion with [http://en.wikipedia.org/wiki/Timothy_C._Draper Tim Draper]</font> | ||
− | * ''' | + | ** Some questions to start the conversation: |
+ | ** How has the cloud infrastructure changed entrepreneurship, if at all? | ||
+ | ** There is a whole ecosystem growing around the cloud services offered by Amazon and the other players: new companies offering services and using Amazon's AWS for example. What are some of the most interesting companies/ideas/technologies emerging that you have discovered or been involved with? | ||
+ | ** There are tremendous worries about the safety and privacy of data in the cloud. Is this an area of growth students should consider? | ||
+ | ** What other areas of growth do you see that students should keep in their sights? | ||
+ | ** If a graduating major is interested in joining a start-up company, what are the signs she should be looking for before joining such a group? | ||
+ | ** Some students are interested in a management track, starting at a big company and climbing fast. What is your advice for best preparing for this type of career? | ||
+ | ** What is the most exciting development in your eyes happening now with cloud technology? | ||
+ | ** It has been said that the 21st century is the century of the entrepreneur. Do you see this as true? | ||
+ | ** Companies rise and fall. Microsoft was once the place where all our majors wanted to go, the most prestigious company for programmers. Now it's Google and Facebook. Which company(ies) do you see as potential new meccas for programmers? | ||
+ | ** If somebody were to form a start-up with friends, say 10 people: Who/What/Where? Who should the people be? What field should they be experts in? Where should the company locate? | ||
+ | <br /> | ||
+ | <br /> | ||
+ | <center>[[Image:TimMelissaDraper.png|400px|link=http://www.smith.edu/video/investing-smith-entrepreneurs]]</center> | ||
+ | <br /> | ||
+ | ** Review of Homework 1 and its [[CSC352 Homework 1 Solution 2013 | Solution]]. | ||
+ | *** Understand static variables | ||
+ | *** don't use global random generators! | ||
+ | *** /usr/bin/time reports user CPU time summed over all threads, so for threaded applications it can approach the elapsed time multiplied by the # of cores | ||
+ | *** be sure to understand if you need the same random seed or a different seed in your experiments | ||
+ | *** create a different user on your laptop with no extra applications loaded in the background (e.g. Skype): less stress on the O.S. | ||
+ | ---- | ||
+ | ---- | ||
+ | ---- | ||
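The two remarks about random generators in the Homework 1 review can be sketched as follows. This is a hedged illustration in Python, not the homework solution: it assumes a Monte Carlo estimate of Pi as the workload, and the names <code>count_hits</code>, <code>estimate_pi</code> and <code>base_seed</code> are hypothetical. Each worker owns a private generator seeded explicitly, so no generator is shared and a run can be repeated exactly.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def count_hits(seed: int, samples: int) -> int:
    """Count random points of the unit square that fall inside the quarter circle."""
    rng = random.Random(seed)  # private generator, never shared between workers
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(num_workers: int, samples_per_worker: int, base_seed: int) -> float:
    """Estimate Pi with one task per worker, each using its own seed."""
    # Different seed per worker: identical seeds would repeat the same points.
    seeds = [base_seed + i for i in range(num_workers)]
    counts = [samples_per_worker] * num_workers
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        hits = list(pool.map(count_hits, seeds, counts))
    return 4.0 * sum(hits) / (num_workers * samples_per_worker)
```

Reusing the same <code>base_seed</code> reproduces an experiment exactly; changing it gives an independent run. Note that in CPython the threads do not execute this loop in parallel, so the sketch shows the structure of the program, not a speedup.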
+ | * '''Thursday''' <font color="magenta">Mountain Day!</font> | ||
+ | <br />[[Image:MountainDay.png|450px]]<br /> | ||
---- | ---- | ||
* | * | ||
Line 156: | Line 220: | ||
|width="15%"| Week 5 <br /> 10/1 | |width="15%"| Week 5 <br /> 10/1 | ||
|width="60%"| | |width="60%"| | ||
− | * '''Tuesday''' | + | * '''Tuesday''' (Grace Hopper Conference) |
− | * '''Thursday''' | + | ** Introduction to Packing [http://cs.smith.edu/dftwiki/images/CSC352_IntroductionToPacking.pdf pdf] and [http://cs.smith.edu/dftwiki/images/CSC352_IntroductionToPacking.ppt ppt] |
− | + | <center><videoflash>vDHFF4wjWYU</videoflash></center> | |
+ | ** Studying the [[CSC352 Red-Black Trees in Java | Red-Black Tree]] data-structure | ||
+ | *** Why is it not thread-safe? | ||
+ | *** How can we make it thread-safe? | ||
+ | *** Devise a test to verify that the modifications have resulted in a thread-safe class | ||
+ | *** [[Tutorial: Profiling Java Programs | Profiling Java applications ]] (introduction to Java's '''GC'''). | ||
+ | ---- | ||
+ | ---- | ||
+ | ---- | ||
+ | ---- | ||
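One possible shape for the "make it thread-safe, then devise a test" exercise, sketched in Python. To stay short it uses a plain unbalanced binary search tree instead of a Red-Black tree and guards every operation with a single lock; <code>LockedTree</code> and <code>stress_test</code> are illustrative names, and coarse locking is only one of several possible approaches.

```python
import random
import threading


class LockedTree:
    """Unbalanced binary search tree whose operations share one lock."""

    def __init__(self):
        self._root = None              # a node is a list: [key, left, right]
        self._lock = threading.Lock()

    def insert(self, key):
        with self._lock:               # only one thread modifies the tree at a time
            if self._root is None:
                self._root = [key, None, None]
                return
            node = self._root
            while True:
                if key == node[0]:
                    return             # duplicates are ignored
                side = 1 if key < node[0] else 2
                if node[side] is None:
                    node[side] = [key, None, None]
                    return
                node = node[side]

    def in_order(self):
        """Return all keys in sorted order (iterative traversal)."""
        with self._lock:
            out, stack, node = [], [], self._root
            while stack or node is not None:
                while node is not None:
                    stack.append(node)
                    node = node[1]
                node = stack.pop()
                out.append(node[0])
                node = node[2]
            return out


def stress_test(num_threads=4, keys_per_thread=500):
    """Insert disjoint key sets from several threads and check nothing was lost."""
    tree = LockedTree()
    total = num_threads * keys_per_thread

    def worker(tid):
        keys = list(range(tid, total, num_threads))
        random.Random(tid).shuffle(keys)   # shuffled to avoid a degenerate tree
        for k in keys:
            tree.insert(k)

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return tree.in_order() == list(range(total))
```

A test of this kind can show that a class is broken, but passing it does not prove thread safety; it only raises confidence.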
+ | * '''Thursday''' (Grace Hopper Conference) | ||
+ | ** <font color="red">Newsletter #2 due today. Please include 1 news item about some form of image collage, representation of many images in some form, hopefully digital. <u>Also, please use a Latex feature you haven't used in your first newsletter</u></font> | ||
+ | ** Elaborating a [[CSC352 Project Roadmap| roadmap]] for the final [http://cs.smith.edu/classwiki/index.php/CSC352_Project_Page_2013 Project] | ||
---- | ---- | ||
* | * | ||
Line 168: | Line 244: | ||
|| | || | ||
* '''Tuesday''' | * '''Tuesday''' | ||
− | * '''Thursday''': 50-min Presentation by Rocco Piccinino in | + | ** An Introduction to C ([[CSC352 Keynote Presentations 2013| keynote]]) (we stopped at "Arrays") [[Solutions to Introduction to C Presentation | C solutions for Intro. to C Exercises]] |
+ | * '''Thursday''': [http://libguides.smith.edu/content.php?pid=510405 50-min Presentation by Rocco Piccinino], Head of the Young Science Library, in FH345 on using the library resources for research. | ||
---- | ---- | ||
− | * | + | * [[CSC352 Homework 3 2013| Homework 3]] will be due 10/22. [[CSC352 Homework 3 Solution 2013| Solution Programs]] |
|| | || | ||
− | + | * [https://computing.llnl.gov/tutorials/mpi/ MPI] by Blaise Barney, at Lawrence Livermore National Laboratory: an excellent reference on MPI | |
<!-- ================================================================== --> | <!-- ================================================================== --> | ||
Line 181: | Line 258: | ||
* '''Tuesday''': <font color="magenta">Fall Break</font> | * '''Tuesday''': <font color="magenta">Fall Break</font> | ||
* '''Thursday''' | * '''Thursday''' | ||
+ | ** Introduction to the [[CSC352 Final Project 2013| Final Project]] | ||
+ | ** An Introduction to MPI ([[CSC352 Keynote Presentations 2013 | keynote]]). You may want to install MPI [[Install MPI on a MacBook| on your MacBook]]<br />(We stopped on Thursday on the '''MPI_Send()''' function.) | ||
+ | |||
---- | ---- | ||
* | * | ||
|| | || | ||
− | + | * [https://computing.llnl.gov/tutorials/mpi/ Super MPI-Introduction from the Lawrence Livermore Nat. Lab] | |
<!-- ================================================================== --> | <!-- ================================================================== --> | ||
|- style="background:#eeeeff" valign="top" | |- style="background:#eeeeff" valign="top" | ||
Line 191: | Line 271: | ||
|| | || | ||
* '''Tuesday''' | * '''Tuesday''' | ||
+ | ** <font color="goldenrod">Paper presentation</font>: [[Media:LearningFromTheSuccessOfMPI2002_WilliamGropp.pdf | Learning from the Success of MPI]], presented by '''Gavi''' ([[Learning From the Success of MPI Bibtex| Bibtex]]) | ||
+ | ** Hadoop0 accounts | ||
+ | ** Learn how to become '''rsync''' champions! | ||
+ | ** Continuation of the introduction to MPI ([[CSC352 Keynote Presentations 2013| keynote]]). We stopped on Thursday on the '''MPI_Send()''' function. | ||
+ | ** Code for the [[CSC352 MPI pi2.c program | pi2.c]] program computing Pi using summation of a series | ||
+ | ** <font color="red">Newsletter #3 due today!</font> | ||
+ | |||
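The idea behind computing Pi by summing a series across several processes can be sketched without MPI. The pi2.c program itself is not reproduced on this page, so the sketch below makes two assumptions: the series is the midpoint-rule sum for the integral of 4/(1+x^2) over [0, 1], a common choice in MPI tutorials, and the ranks are simulated with a plain Python loop. The names <code>partial_sum</code> and <code>compute_pi</code> are hypothetical.

```python
import math


def partial_sum(rank: int, size: int, n: int) -> float:
    """Sum the terms assigned to one rank: i = rank, rank + size, rank + 2*size, ..."""
    h = 1.0 / n                       # width of each of the n intervals
    total = 0.0
    for i in range(rank, n, size):
        x = h * (i + 0.5)             # midpoint of interval i
        total += 4.0 / (1.0 + x * x)
    return h * total


def compute_pi(size: int, n: int) -> float:
    """Combine the partial sums, as a reduction to the root process would."""
    return sum(partial_sum(rank, size, n) for rank in range(size))


if __name__ == "__main__":
    approx = compute_pi(4, 100000)
    print(approx, abs(approx - math.pi))
```

In an MPI version each rank would compute only its own <code>partial_sum</code> and the combination step would be a collective reduction; here the loop over ranks plays both roles.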
* '''Thursday''' | * '''Thursday''' | ||
− | + | ** Continuation of the introduction to MPI ([[CSC352 Keynote Presentations 2013| keynote]]) | |
+ | ** Introduction on how to operate a MySQL database ([[CSC352 Keynote Presentations 2013| keynote]]) | ||
+ | ** A project-oriented MPI example. Bring your Mac! | ||
---- | ---- | ||
− | * | + | * [[CSC352 Homework 4 2013| Homework 4]] on C and MPI. [[CSC352 Homework 4 Solutions| Solution programs]] |
|| | || | ||
| | ||
Line 203: | Line 292: | ||
|width="60%"| | |width="60%"| | ||
* '''Tuesday''' | * '''Tuesday''' | ||
+ | ** <font color="goldenrod">Paper presentation</font>: [[Media:ServerVirtualizationArchitectureAndImplementation2009.pdf | Server Virtualization Architecture and Implementation]] presented by Emily | ||
+ | ** A few words about newsletters | ||
+ | ** [[CSC352 Where is What on Hadoop0 2013| Where is What on Hadoop0]] | ||
+ | ** MySQL Exercises | ||
+ | ** [[Tutorial:_C_%2B_MySQL_%2B_MPI | Combining C, MySQL and MPI]]: combing through a lot of code | ||
+ | ** Project discussion | ||
* '''Thursday''' | * '''Thursday''' | ||
− | + | ** [[Tutorial: Create an MPI Cluster on the Amazon Elastic Cloud (EC2) | Creating an MPI Cluster on Amazon]], ([[CSC352 Keynote Presentations 2013| Accompanying keynote]]), followed by a second [[Computing Pi on an AWS MPI-Cluster| tutorial]] on computing Pi on a 10-node AWS cluster. | |
---- | ---- | ||
* | * | ||
Line 214: | Line 309: | ||
|| | || | ||
* '''Tuesday''': <font color="magenta">Otelia Cromwell Day</font> | * '''Tuesday''': <font color="magenta">Otelia Cromwell Day</font> | ||
− | * '''Thursday''' | + | * '''Thursday''': |
+ | ** <font color="goldenrod">Paper presentation</font>: [[Media:MapReduceDeanGhemawat_2004.pdf |MapReduce: Simplified Data Processing on Large Clusters]] presented by Sharon Pamela | ||
+ | ** <font color="red">Newsletter #4 due today!</font>. Please include at least one image, and at least one news item covering some form of project that could be related or influential for our own wiki-collage project. See [http://cs.smith.edu/dftwiki/index.php/Latex_and_Editing_Tools_to_write_an_Honors_Thesis this document on writing theses] for information about the inclusion of images in Latex. The end section has a good list of sites that have good coverage of Latex topics. There is also plenty of information on the Web about this subject. | ||
+ | ** Preparation for [[CSC352 Homework 5 2013 | Homework 5]]: attaching EBS volumes. We'll do a lab in class to [[Create_an_MPI_Cluster_on_the_Amazon_Elastic_Cloud_(EC2)#Creating_an_EBS_Volume | create ]] and [[Create_an_MPI_Cluster_on_the_Amazon_Elastic_Cloud_(EC2)#Attaching_the_EBS_Volume_to_the_Cluster | attach]] an EBS volume to your AWS cluster. | ||
+ | |||
---- | ---- | ||
− | * | + | * [[CSC352 Homework 5 2013 | Homework 5]] and [[CSC352 Homework 5 Solution 2013| Solution]] |
|| | || | ||
| | ||
Line 226: | Line 325: | ||
|width="60%"| | |width="60%"| | ||
* '''Tuesday''' | * '''Tuesday''' | ||
+ | ** <font color="goldenrod">Paper presentation</font>: [[Media:GeneralPurposeVsGPU_Comparison_Many_Cores_2010_Caragea.pdf |General-Purpose vs. GPU: Comparisons of Many-Cores on Irregular Workloads]], presented by Yoshie | ||
+ | *** '''Questions about the paper:''' | ||
+ | **** What kind of paper is this? Broad distribution? Research? Small group? | ||
+ | **** Organization? Abstract? Introduction? Definition of specialized terms? Early enough in the paper? | ||
+ | **** Are the contributions of the paper clear? Is the section on related research sufficient? | ||
+ | **** What is being compared? Similar machines? Hardware? Software? | ||
+ | **** Are authors partial? Do they have a stake? | ||
+ | **** How does the paper advance the state of research? | ||
+ | **** What does it tell us about the way computer systems evolve? | ||
+ | ** Thinking about the project | ||
+ | *** What do we know better about the overall project? What pieces have we looked at? | ||
+ | *** What is it we don't know? | ||
+ | *** Can we turn any of these questions into a project? | ||
+ | <br /> | ||
* '''Thursday''' | * '''Thursday''' | ||
− | + | ** A few comments on the Manager/Worker paradigm in MPI: it is not the only one. Many logical communication networks do not match the star pattern. | ||
+ | ** Continuation of Project Discussion | ||
+ | ** Quick Introduction to Hadoop/MapReduce ([[CSC352 Keynote Presentations 2013| Accompanying keynote]]) | ||
+ | ** [[Tutorial:_Creating_a_Hadoop_Cluster_on_Amazon_AWS | MapReduce lab on AWS]] | ||
+ | ** [[CSC352 Bash Script to Run Hadoop WordCount| Script to run Hadoop WordCount program]] on AWS | ||
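For readers who want to see the shape of WordCount before running it on a cluster, here is a single-process sketch of the map, shuffle and reduce steps in Python. It is not the Hadoop program used in the lab, and the function names <code>map_phase</code>, <code>shuffle</code>, <code>reduce_phase</code> and <code>word_count</code> are illustrative only.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word of one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group the emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all the values emitted for one key."""
    return key, sum(values)


def word_count(lines):
    """Run the three steps in sequence on a list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

On a real cluster the map calls run on different nodes and the shuffle moves data across the network; the logic of each step stays the same.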
---- | ---- | ||
* | * | ||
|| | || | ||
+ | Yahoo has some very good reading material on Hadoop. One reason is that they may be one of the largest users of AWS and of Hadoop. | ||
+ | ** [http://developer.yahoo.com/hadoop/tutorial/ Yahoo Developers Network]: Tutorial on Hadoop. All the chapters are worth reading! | ||
<!-- ================================================================== --> | <!-- ================================================================== --> | ||
Line 236: | Line 355: | ||
| Week 12 <br /> 11/19<br /> | | Week 12 <br /> 11/19<br /> | ||
|| | || | ||
− | * '''Tuesday''': <font color="magenta"> | + | * '''Tuesday''': |
− | * '''Thursday''': <font color="magenta">Tentative guest lecture: Nick Howe on CUDA and GPUs</font> | + | ** <font color="red">1 month to go (exactly) before the project is due (Dec. 19)!</font> |
+ | ** <font color="magenta">Student-directed work (DT @ INFOCOMP 2013)</font> | ||
+ | ** Finish the [[Tutorial:_Creating_a_Hadoop_Cluster_on_Amazon_AWS | MapReduce lab on AWS]] and make sure you do the [[Tutorial:_Creating_a_Hadoop_Cluster_with_StarCluster_on_Amazon_AWS#Challenge_.23_2 | Challenge 2]] part of the lab. | ||
+ | ** Food for thought: some videos<br />I suggest one of you connects her laptop to the projection system and you all watch these videos together. After each one, discuss it as a group. Take notes and be ready to share your comments during Thursday's class when we resume our regular schedule. | ||
+ | *** The Cave 2 Project at the University of Illinois: Just another hardware solution for presenting the user with a large number of pixels; in this case 27320 x 3072 pixels. ''Short, 3 minutes.'' | ||
+ | <center> | ||
+ | <videoflash>yf0sllpZx3w</videoflash> | ||
+ | </center> | ||
+ | <br /> | ||
+ | *** The Creators Projects video<br /> | ||
+ | ::::This video is not necessarily anything that can work for us, but it's just "food for thought": a different way an artist has come up with to make still pictures interesting to look at. ''Short, 6 minutes''. | ||
+ | <center> | ||
+ | <videoflash>rKmMaDBoZhs</videoflash> | ||
+ | </center> | ||
+ | <br /> | ||
+ | *** O'Reilly Radar Videos<br /> | ||
+ | [[Image:OReillyPerlBookCover.jpg|100px|right]] | ||
+ | ::: Tim O'Reilly is a visionary who figured out a long time ago that computer technology was an exploding field and he started a very successful line of books to support all new technology projects that were emerging and promising. The books all have animals on them and are uniquely easy to spot. O'Reilly now also has an on-line channel (O'Reilly Radar), and organizes conferences with top researchers and intellectuals in the field of computer science. | ||
+ | ::: The first video is with Doug Cutting, one of the creators of Hadoop. He makes some very good points about what Hadoop is, what it is good at, and what it might not be good at (Homework 5 lesson?). After Cutting you can skip the 2nd interview (about video technology) and zip to the 3rd interview with Jeremy Howard, at time-tag 13:47. Then learn about big data and analytics, and what is said of ''data scientists''. '' About 12 minutes total''. | ||
+ | <center> | ||
+ | <videoflash>BWBGQIq5zow</videoflash> | ||
+ | </center> | ||
+ | <br /> | ||
+ | ::: Good interview of Tim O'Reilly describing Web 2.0, and his view of a data-driven Internet. You may want to think about how our wikipedia data (images, stats) relate to what is said about data in the interview. ''About 8 minutes''. | ||
+ | <center> | ||
+ | <videoflash>FJ3TxeE_tHI</videoflash> | ||
+ | </center> | ||
+ | <br /> | ||
+ | ::: The next video filmed in June 2013 presents Bruno Fernandez-Ruiz of Yahoo, who speaks about Hadoop since 2005, Hadoop today, and what is ahead. An important type of data property Fernandez-Ruiz is interested in is ''timeliness'', which we haven't really looked at for our project, but you will see that it could apply easily to the dynamics of wikipedia. Some interesting statistics about the number of servers, the size of the HDFS they use, the number of processes are given. ''About 17 minutes''. | ||
+ | <center> | ||
+ | [[Image:LookingBeyondHadoop.png | 430px | link=http://fora.tv/2013/06/26/Hadoop_and_Continuous_Computing_Looking_Beyond_MapReduce ]] | ||
+ | </center> | ||
+ | ** If you have at least 25 minutes left before the class time is over, do the [[Tutorial:_Running_a_Python_version_of_WorkCount_on_an_AWS_cluster| MapReduce-Python lab]], without attempting the challenges at the end. We'll do these together. | ||
+ | |||
+ | <br /> | ||
+ | * '''Thursday''': | ||
+ | ** <font color="magenta">Tentative guest lecture: Nick Howe on CUDA and GPUs</font> | ||
+ | ** Some thoughts about INFOCOMP 2013 ([[CSC352 Keynote Presentations 2013| keynote]]) | ||
+ | ** Going over Homework #5 ([[CSC352 Walking a 2-Level Directory in C| Walking a 2-Level Directory in C]]) | ||
+ | |||
---- | ---- | ||
Line 248: | Line 406: | ||
|width="15%"| Week 13 <br /> 11/26 | |width="15%"| Week 13 <br /> 11/26 | ||
|width="60%"| | |width="60%"| | ||
− | * '''Tuesday''' | + | * '''Tuesday''': |
+ | ** <font color="red">No newsletter due</font> | ||
+ | ** <font color="goldenrod">Paper presentation</font>: [[Media:AViewOfCloudComputing_CACM_Apr2010.pdf| A View of Cloud Computing]] presented by Danaë. | ||
+ | ** 5-minute project presentations (everybody) | ||
+ | ** Instead of a newsletter, you may turn in today a [[CSC352 Project Introduction in Latex | draft of an introduction to your final project]]. If you have too much work this week, you can turn this in on 12/3. | ||
+ | ** [[Tutorial: A bit of Bash | A bit of Bash]] | ||
+ | ** The challenges of the [[Tutorial:_Running_a_Python_version_of_WorkCount_on_an_AWS_cluster| MapReducing in Python]] lab | ||
* '''Thursday''': <font color="magenta">Thanksgiving Break</font> | * '''Thursday''': <font color="magenta">Thanksgiving Break</font> | ||
Line 260: | Line 424: | ||
|| | || | ||
* '''Tuesday''' | * '''Tuesday''' | ||
+ | ** <font color="goldenrod">Paper presentation</font>: [[Media:unreasonableEffectivenessOfData2009_HalevyNorvigPereira.pdf | The Unreasonable Effectiveness of Data]] presented by Julia | ||
+ | ** Instead of a newsletter, you need to turn in a [[CSC352 Project Introduction in Latex | draft of an introduction to your final project]] (unless you submitted it last week). | ||
+ | |||
+ | ** The challenges of the [[Tutorial:_Running_a_Python_version_of_WorkCount_on_an_AWS_cluster| MapReducing in Python]] lab. We did Challenge #1 last time. We'll look at Challenges #2 and #3. | ||
+ | ** Some feedback on Homework #5 and one [[CSC352 Homework 5 Solution 2013| solution]]. | ||
+ | ** MapReduce task graphs | ||
+ | ---- | ||
+ | ---- | ||
* '''Thursday''' | * '''Thursday''' | ||
− | + | ** [[Hadoop_Tutorial_1.1_--_Generating_Task_Timelines | Distribution of Map and Reduce tasks over time]] | |
+ | ** Project work and discussion | ||
+ | ** 20-minute individual session (in class) to go over project, questions, setup, etc... | ||
---- | ---- | ||
− | * | + | * |
|| | || | ||
| | ||
Line 271: | Line 445: | ||
|width="15%"| Week 15 <br /> 12/10 | |width="15%"| Week 15 <br /> 12/10 | ||
|width="60%"| | |width="60%"| | ||
+ | [[Image:CSC352Row.jpg|150px|right]] | ||
* '''Tuesday''': <font color="lightblue">Last Day of Class</font> | * '''Tuesday''': <font color="lightblue">Last Day of Class</font> | ||
+ | ** 20-minute presentations of projects. Suggested outline: | ||
+ | *** The context: how your project fits in the overall picture | ||
+ | *** Has other similar work been done and documented before? | ||
+ | *** What you decided to do | ||
+ | **** The challenges | ||
+ | **** The choices | ||
+ | **** The target experiments | ||
+ | *** Preliminary results | ||
+ | *** Expected results | ||
+ | *** Possible directions for continuing research after the project | ||
---- | ---- | ||
− | + | An afternoon of packing circular crepes, including some imaginative variations... | |
+ | [[Image:PackingCrepes1.jpg|200px]][[Image:PackingCrepes2.jpg|200px]] | ||
+ | [[Image:PackingCrepes3.jpg|200px]] | ||
+ | [[Image:PackingCrepes4.jpg|200px]] | ||
+ | [[Image:PackingCrepes5.jpg|200px]] | ||
+ | [[Image:PackingCrepes6.jpg|200px]] | ||
|| | || | ||
Line 283: | Line 473: | ||
=Links and Resources= | =Links and Resources= | ||
+ | <br /> | ||
+ | ==Latex== | ||
+ | <br /> | ||
+ | * [http://www.youtube.com/playlist?list=PLCRFsOKSM7ePUBOfh3O-K5XZldM5uCPwk Latex tutorial (video)] | ||
+ | * [http://www.youtube.com/playlist?list=PLCRFsOKSM7eNGNghvT6QdzsDYwSTZxqjC How to write a thesis in Latex (video)] | ||
+ | * [http://www.youtube.com/playlist?list=PLCRFsOKSM7eO-WX2ENa5A5vtNx1kjPefY Presentations with Beamer (video)] | ||
+ | * [http://www.youtube.com/playlist?list=PLCRFsOKSM7eN6jPk0wSopXb37RKW93PM3 TikZ examples (video)] | ||
+ | <br /> | ||
+ | |||
+ | ==Smith Elements of Style== | ||
+ | <br /> | ||
+ | * [[media:SmithJacobsonCenterWritingPapers-1.pdf | "Writing Papers" from the Smith College Jacobson Center for writing]] | ||
<br /> | <br /> | ||
==On-Line Resources== | ==On-Line Resources== | ||
Line 295: | Line 497: | ||
<br /> | <br /> | ||
==Papers== | ==Papers== | ||
− | + | This is a tentative and non-exhaustive list of papers scheduled for reading this semester. | ||
− | + | ===Introduction=== | |
− | + | {| class="wikitable" style="width: 100%;" | |
+ | !width="600" | Paper | ||
+ | ! Pages | ||
+ | |- | ||
+ | | style="width: 90%;" | | ||
* [http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.pdf The Landscape of Parallel Computing Research: A View From Berkeley], 2006, still good! (very long paper) | * [http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.pdf The Landscape of Parallel Computing Research: A View From Berkeley], 2006, still good! (very long paper) | ||
+ | | | ||
+ | 50 | ||
+ | |- | ||
+ | | | ||
* [[Media:UpdateOnaViewFromBerkeley2010.pdf | Update on a view from Berkeley]], 2010. (short paper) | * [[Media:UpdateOnaViewFromBerkeley2010.pdf | Update on a view from Berkeley]], 2010. (short paper) | ||
− | + | | | |
+ | 2 | ||
+ | |} | ||
+ | |||
+ | ===General/Parallelism=== | ||
+ | |||
+ | {| class="wikitable" style="width: 100%;" | ||
+ | !width="600" | Paper | ||
+ | ! Pages | ||
+ | |- | ||
+ | | style="width: 90%;" | | ||
* [[Media:ParallelCOmputingWithPatternsAndFrameworks2010b.pdf | Parallel Computing with Patterns and Frameworks]], 2010, ''XRDS''. | * [[Media:ParallelCOmputingWithPatternsAndFrameworks2010b.pdf | Parallel Computing with Patterns and Frameworks]], 2010, ''XRDS''. | ||
− | + | | | |
− | + | 5 | |
− | + | |- | |
+ | | | ||
* [[Media:UnderstandingThroughputOrientedArchitectures2010.pdf | Understanding Throughput-Oriented Architectures]], CACM, 2010. | * [[Media:UnderstandingThroughputOrientedArchitectures2010.pdf | Understanding Throughput-Oriented Architectures]], CACM, 2010. | ||
− | + | | | |
+ | 7 | ||
+ | |- | ||
+ | | | ||
* [[Media:unreasonableEffectivenessOfData2009_HalevyNorvigPereira.pdf | The Unreasonable Effectiveness of Data]], by Halevy, Norvig, Pereira, IEEE Intelligent Systems, March 2009, Vol. 24, No. 2, pp. 8-12. | * [[Media:unreasonableEffectivenessOfData2009_HalevyNorvigPereira.pdf | The Unreasonable Effectiveness of Data]], by Halevy, Norvig, Pereira, IEEE Intelligent Systems, March 2009, Vol. 24, No. 2, pp. 8-12. | ||
+ | | | ||
+ | 5 | ||
+ | |} | ||
+ | |||
+ | ===MPI=== | ||
+ | {| class="wikitable" style="width: 100%;" | ||
+ | !width="600" | Paper | ||
+ | ! Pages | ||
+ | |- | ||
+ | |style="width: 90%;" | | ||
+ | * [[Media:LearningFromTheSuccessOfMPI2002_WilliamGropp.pdf | Learning from the Success of MPI]], by William D. Gropp, Argonne National Lab, 2002. | ||
+ | | | ||
+ | 11 | ||
+ | |} | ||
+ | |||
+ | ===GPUs=== | ||
+ | {| class="wikitable" style="width: 100%;" | ||
+ | !width="600" | Paper | ||
+ | ! Pages | ||
+ | |- | ||
+ | |style="width: 90%;" | | ||
+ | * [[Media:GeneralPurposeVsGPU_Comparison_Many_Cores_2010_Caragea.pdf |General-Purpose vs. GPU: Comparisons of Many-Cores on Irregular Workloads]], 2010 | ||
+ | | | ||
+ | 6 | ||
+ | |} | ||
+ | |||
+ | ===Virtualization=== | ||
+ | {| class="wikitable" style="width: 100%;" | ||
+ | !width="600" | Paper | ||
+ | ! Pages | ||
+ | |- | ||
+ | |style="width: 90%;" | | ||
+ | * [[Media:ServerVirtualizationArchitectureAndImplementation2009.pdf | Server Virtualization Architecture and Implementation]], ''XRDS'', 2009 | ||
+ | | | ||
+ | 5 | ||
+ | |} | ||
+ | |||
+ | ===Cloud=== | ||
+ | {|class = "wikitable" style="width: 100%;" | ||
+ | !width="600" | Paper | ||
+ | ! Pages | ||
+ | |- | ||
+ | |style="width: 90%;" | | ||
+ | * [[Media:NIST_Definition_Cloud_Computing_2010.pdf | The NIST Definition of Cloud Computing (Draft)]] (very short paper) | ||
+ | | | ||
+ | 1.5 | ||
+ | |- | ||
+ | | | ||
+ | * [[Media:AViewOfCloudComputing_CACM_Apr2010.pdf| A View of Cloud Computing]], 2010, By Armbrust, Michael and Fox, Armando and Griffith, Rean and Joseph, Anthony D. and Katz, Randy and Konwinski, Andy and Lee, Gunho and Patterson, David and Rabkin, Ariel and Stoica, Ion and Zaharia, Matei. | ||
+ | | | ||
+ | 9 | ||
+ | |- | ||
+ | | | ||
+ | * [[Media:MapReduceDeanGhemawat_2004.pdf |MapReduce: Simplified Data Processing on Large Clusters]], by Dean and Ghemawat, First published in OSDI 2004, also in Commun. ACM 51, 1 (January 2008), 107-113. | ||
+ | | | ||
+ | 13 | ||
+ | |- | ||
+ | | | ||
+ | * [[Media:NobodyGotFiredUsingHadoopOnCluster_2012.pdf| Nobody ever got fired for using Hadoop on a cluster]], Rowstron, Antony and Narayanan, Dushyanth and Donnelly, Austin and O'Shea, Greg and Douglas, Andrew | ||
+ | | | ||
+ | 5 | ||
+ | |- | ||
+ | | | ||
+ | * [[Media:BeyondHadoop_CACM_Mone_2013.pdf | Beyond Hadoop]], Gregory Mone, CACM, 2013. (short paper). | ||
+ | | | ||
+ | 2 | ||
+ | |} | ||
+ | |||
+ | ===Project-Related=== | ||
+ | {| class="wikitable" style="width: 100%;" | ||
+ | !width="600" | Paper | ||
+ | ! Pages | ||
+ | |- | ||
+ | |style="width: 90%;" | | ||
+ | * [[Media:XGridHadoopCloser2011.pdf | Processing Wikipedia Dumps: A Case-Study comparing the XGrid and MapReduce Approaches]], D. Thiebaut, Yang Li, Diana Jaunzeikare, Alexandra Cheng, Ellysha Raelen Recto, Gillian Riggs, Xia Ting Zhao, Tonje Stolpestad, and Cam Le T Nguyen, ''in proceedings of 1st Int'l Conf. On Cloud Computing and Services Science'' (CLOSER 2011), Noordwijkerhout, NL, May 2011. ([[Media:XGridHadoopFeb2011.pdf |longer version]]) | ||
+ | | | ||
+ | 8 | ||
+ | |} | ||
+ | |||
<p> | <p> | ||
<br /> | <br /> |
Latest revision as of 11:31, 31 January 2017
--D. Thiebaut (talk) 11:15, 9 August 2013 (EDT)
Weekly Schedule
Week | Topics | Reading
Week 1 9/3
Thread 1 ----------------------|====|-------------------------> time
Thread 2 ------------|====|-----------------------------------> time
Week 2 9/10
(last slide taken from http://www.cs.utexas.edu/users/mckinley/352/lectures/16.pdf)
(Image taken from URL: http://www.altera.com/technology/system-design/articles/2012/multicore-many-core.html)
Week 3 9/17
All the data structures of interest (concurrent non-blocking and blocking) can be found in the Oracle documentation. The information is a bit cryptic, but you need to get comfortable with it!
Week 4 9/24
Week 5 10/1
Week 6 10/8
Week 7 10/15
Week 8 10/22
Week 9 10/29
Week 10 11/5
Week 11 11/12
Yahoo has some very good reading material on Hadoop. One reason is that they may be one of the largest users of AWS and of Hadoop.
Week 12 11/19
Week 13 11/26
Week 14 12/3
Week 15 12/10
An afternoon of packing circular crepes, including some imaginative variations...
Links and Resources
Latex
- Latex tutorial (video)
- How to write a thesis in Latex (video)
- Presentations with Beamer (video)
- TikZ examples (video)
Smith Elements of Style
- "Writing Papers" from the Smith College Jacobson Center for writing
On-Line Resources
- Introduction to Parallel Processing, by Blaise Barney, Lawrence Livermore National Laboratory. A good read. Covers most of the important topics.
- Introduction to MPI, by Blaise Barney, Lawrence Livermore National Laboratory. Another short but excellent coverage of a topic in parallel processing, this time MPI.
- A 90-Minute Guide to Modern Microprocessors
Classics
- Designing and Building Parallel Programs, by Ian Foster. A relatively old reference (1995), with still good information.
Papers
This is a tentative and non-exhaustive list of papers scheduled for reading this semester.
Introduction
Paper | Pages |
---|---|
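The Landscape of Parallel Computing Research: A View From Berkeley, 2006, still good! (very long paper) | 50 |
Update on a view from Berkeley, 2010. (short paper) | 2 |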
General/Parallelism
Paper | Pages |
---|---|
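Parallel Computing with Patterns and Frameworks, 2010, XRDS. | 5 |
Understanding Throughput-Oriented Architectures, CACM, 2010. | 7 |
The Unreasonable Effectiveness of Data, by Halevy, Norvig, Pereira, IEEE Intelligent Systems, March 2009, Vol. 24, No. 2, pp. 8-12. | 5 |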
MPI
Paper | Pages |
---|---|
Learning from the Success of MPI, by William D. Gropp, Argonne National Lab, 2002. | 11 |
GPUs
Paper | Pages |
---|---|
General-Purpose vs. GPU: Comparisons of Many-Cores on Irregular Workloads, 2010 | 6 |
Virtualization
Paper | Pages |
---|---|
Server Virtualization Architecture and Implementation, XRDS, 2009 | 5 |
Cloud
Paper | Pages |
---|---|
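The NIST Definition of Cloud Computing (Draft) (very short paper) | 1.5 |
A View of Cloud Computing, 2010, by Armbrust, Fox, Griffith, Joseph, Katz, Konwinski, Lee, Patterson, Rabkin, Stoica, and Zaharia. | 9 |
MapReduce: Simplified Data Processing on Large Clusters, by Dean and Ghemawat, first published in OSDI 2004, also in Commun. ACM 51, 1 (January 2008), 107-113. | 13 |
Nobody ever got fired for using Hadoop on a cluster, by Rowstron, Narayanan, Donnelly, O'Shea, and Douglas. | 5 |
Beyond Hadoop, Gregory Mone, CACM, 2013. (short paper) | 2 |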
Project-Related
Paper | Pages |
---|---|
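Processing Wikipedia Dumps: A Case-Study comparing the XGrid and MapReduce Approaches, D. Thiebaut, Yang Li, Diana Jaunzeikare, Alexandra Cheng, Ellysha Raelen Recto, Gillian Riggs, Xia Ting Zhao, Tonje Stolpestad, and Cam Le T Nguyen, in proceedings of 1st Int'l Conf. On Cloud Computing and Services Science (CLOSER 2011), Noordwijkerhout, NL, May 2011. (longer version) | 8 |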