Difference between revisions of "CSC352 Class Page 2013"

(MPI)
(Weekly Schedule)
 
(114 intermediate revisions by the same user not shown)
Line 6: Line 6:
 
<br />
 
<br />
 
<br />
 
<br />
<center>[[CSC352 2013|Main Page]]|[[CSC352 Syllabus 2013|Syllabus]]|[[CSC352_Class_Page_2013#Links_and_Resources|Links & Resources]] | [https://piazza.com/smith/fall2013/csc352/home PIAZZA]</center>
+
<center>[[CSC352 2013|Main Page]]|[[CSC352 Syllabus 2013|Syllabus]]|[http://cs.smith.edu/classwiki/index.php/CSC352_Project_Page_2013 Project Page]
 +
| [https://piazza.com/smith/fall2013/csc352/home PIAZZA]</center>
 
<br />
 
<br />
 
<br />
 
<br />
Line 60: Line 61:
 
*** Regroup and gather statistics on the different machines in the classroom
 
*** Regroup and gather statistics on the different machines in the classroom
  
----
 
**Introduction to '''Latex'''
 
*** [http://cs.smith.edu/dftwiki/index.php/Tutorial:_Writing_a_Latex_paper_with_ShareLatex.com Tutorial #1] on Latex and ShareLatex
 
*** [http://cs.smith.edu/dftwiki/index.php/Latex_Skeleton_for_Simple_Articles_and_Tech_Reports Latex document template]
 
*** [[Latex Example: Bib File: Example Bib File]] for Latex paper (from ACM)
 
*** Learn how to find BibTex entries.  Example: [http://dl.acm.org/citation.cfm?id=1525689 The Unreasonable Effectiveness of Data] (go to ACM and click on BibTex link).
 
*** If you are considering working on an honors thesis, you might want to take a look at this [[Latex and Editing Tools to write an Honors Thesis|  page]] on writing Honors thesis with Latex.
 
<br />
 
 
----
 
----
 
** Comments on '''bimonthly newsletter'''
 
** Comments on '''bimonthly newsletter'''
Line 84: Line 77:
 
**** The Official Google blog
 
**** The Official Google blog
 
**** Review: Tom's Hardware
 
**** Review: Tom's Hardware
**** Some of the  sites listed in [http://tech.blorge.com/Structure:%20/2008/11/15/top-40-technology-news-sites-the-definitive-guide/ blorge.com]'s top-40 list.
+
**** Some of the  sites listed in [https://rohidassanap.wordpress.com/2013/06/18/top-40-best-technology-news-websites-the-definitive-list/ this page's] top 40 list.
 
*** Recommendation for news aggregator:  [http://cloud.feedly.com/#welcome Feedly.com]
 
*** Recommendation for news aggregator:  [http://cloud.feedly.com/#welcome Feedly.com]
  
Line 180: Line 173:
  
 
----
 
----
* [[CSC352 Homework 2 2013| Homework 2]], will be due 9/28/13 at midnight
+
* [[CSC352 Homework 2 2013| Homework 2]], will be due 9/28/13 at midnight.  [[CSC352 Homework 2 2013 Solution| Solution programs]]
 
||
 
||
 
All the data structures of interest (concurrent non-blocking and blocking) can be found in the Oracle documentation.  The information is a bit cryptic, but you need to get comfortable with it!
 
All the data structures of interest (concurrent non-blocking and blocking) can be found in the Oracle documentation.  The information is a bit cryptic, but you need to get comfortable with it!
Line 190: Line 183:
 
| Week 4 <br /> 9/24<br />
 
| Week 4 <br /> 9/24<br />
 
||
 
||
* '''Tuesday''': <font color="magenta">Tentative Guest Lecture/Informal discussion with by [http://en.wikipedia.org/wiki/Timothy_C._Draper Tim Draper]</font>
+
* '''Tuesday''': <font color="magenta">Guest Lecture/Informal discussion with [http://en.wikipedia.org/wiki/Timothy_C._Draper Tim Draper]</font>
 
** Some questions to start the conversation:
 
** Some questions to start the conversation:
 
** How has the cloud infrastructure changed entrepreneurship, if at all?   
 
** How has the cloud infrastructure changed entrepreneurship, if at all?   
Line 202: Line 195:
 
** Companies rise and fall.  Microsoft was once the place where all our majors wanted to go.  The most prestigious company for programmers.  Now it's Google, and Facebook.  Which company(ies) do you see as potential new meccas for programmers?
 
** Companies rise and fall.  Microsoft was once the place where all our majors wanted to go.  The most prestigious company for programmers.  Now it's Google, and Facebook.  Which company(ies) do you see as potential new meccas for programmers?
 
** If somebody were to form a start-up with friends.  Say 10 people.  Who/What/Where?  Who should the people be?  What field should they be experts in?  Where should the company locate?
 
** If somebody were to form a start-up with friends.  Say 10 people.  Who/What/Where?  Who should the people be?  What field should they be experts in?  Where should the company locate?
 +
<br />
 +
<br />
 +
<center>[[Image:TimMelissaDraper.png|400px|link=http://www.smith.edu/video/investing-smith-entrepreneurs]]</center>
 
<br />
 
<br />
 
** Review of Homework 1 and its  [[CSC352 Homework 1 Solution 2013 | Solution]].
 
** Review of Homework 1 and its  [[CSC352 Homework 1 Solution 2013 | Solution]].
Line 209: Line 205:
 
*** be sure to understand if you need the same random seed or a different seed in your experiments
 
*** be sure to understand if you need the same random seed or a different seed in your experiments
 
*** create a different user on your laptop with no extra applications loaded in the background (e.g. Skype): less stress on the O.S.
 
*** create a different user on your laptop with no extra applications loaded in the background (e.g. Skype): less stress on the O.S.
 +
 
----
 
----
 
----
 
----
Line 225: Line 222:
 
* '''Tuesday''' (Grace Hopper Conference)
 
* '''Tuesday''' (Grace Hopper Conference)
 
**  Introduction to Packing [http://cs.smith.edu/dftwiki/images/CSC352_IntroductionToPacking.pdf  pdf] and [http://cs.smith.edu/dftwiki/images/CSC352_IntroductionToPacking.ppt  ppt]
 
**  Introduction to Packing [http://cs.smith.edu/dftwiki/images/CSC352_IntroductionToPacking.pdf  pdf] and [http://cs.smith.edu/dftwiki/images/CSC352_IntroductionToPacking.ppt  ppt]
 +
<center><videoflash>vDHFF4wjWYU</videoflash></center>
 
** Studying the [[CSC352 Red-Black Trees in Java | Red-Black Tree]] data-structure
 
** Studying the [[CSC352 Red-Black Trees in Java | Red-Black Tree]] data-structure
 
*** Why is it not thread-safe?
 
*** Why is it not thread-safe?
 
*** How can we make it thread-safe?
 
*** How can we make it thread-safe?
 +
*** Devise a test to verify that the modifications have resulted in a thread-safe class
 
*** [[Tutorial: Profiling Java Programs | Profiling Java applications ]] (introduction to Java's '''GC''').
 
*** [[Tutorial: Profiling Java Programs | Profiling Java applications ]] (introduction to Java's '''GC''').
 
----
 
----
Line 235: Line 234:
 
* '''Thursday''' (Grace Hopper Conference)
 
* '''Thursday''' (Grace Hopper Conference)
 
** <font color="red">Newsletter #2 due today.  Please include 1 news item about some form of image collage, representation of many images in some form, hopefully digital.  <u>Also, please use a Latex feature you haven't used in your first newsletter</u></font>
 
** <font color="red">Newsletter #2 due today.  Please include 1 news item about some form of image collage, representation of many images in some form, hopefully digital.  <u>Also, please use a Latex feature you haven't used in your first newsletter</u></font>
 
+
** Elaborating a [[CSC352 Project Roadmap| roadmap]] for the final [http://cs.smith.edu/classwiki/index.php/CSC352_Project_Page_2013 Project]
 
----
 
----
 
*  
 
*  
Line 245: Line 244:
 
||
 
||
 
* '''Tuesday'''
 
* '''Tuesday'''
* '''Thursday''': 50-min Presentation by Rocco Piccinino in FH241 on using the library resources for research.
+
** An Introduction to C ([[CSC352 Keynote Presentations 2013| keynote]]) (we stopped at  "Arrays") [[Solutions to Introduction to C Presentation | C solutions for Intro. to C Exercises]]
 +
* '''Thursday''': [http://libguides.smith.edu/content.php?pid=510405 50-min Presentation by Rocco Piccinino],  Head of the Young Science Library, in FH345 on using the library resources for research.
  
 
----
 
----
*  
+
* [[CSC352 Homework 3 2013| Homework 3]] will be due 10/22. [[CSC352 Homework 3 Solution 2013| Solution Programs]]
 
||
 
||
&nbsp;
+
*  [https://computing.llnl.gov/tutorials/mpi/ MPI] by Blaise Barney,  at Lawrence Livermore National Laboratory: an excellent reference on MPI
  
 
<!-- ================================================================== -->
 
<!-- ================================================================== -->
Line 258: Line 258:
 
* '''Tuesday''': <font color="magenta">Fall Break</font>
 
* '''Tuesday''': <font color="magenta">Fall Break</font>
 
* '''Thursday'''
 
* '''Thursday'''
 +
**  Introduction to the  [[CSC352 Final Project 2013| Final Project]]
 +
** An Introduction to MPI ([[CSC352 Keynote Presentations 2013 | keynote]]).  You may want to install MPI [[Install MPI on a MacBook| on your MacBook]]<br />(We stopped on Thursday on the '''MPI_Send()''' function.)
 +
  
 
----
 
----
 
*  
 
*  
 
||
 
||
 
+
* [https://computing.llnl.gov/tutorials/mpi/ Super MPI-Introduction from the Lawrence Livermore Nat. Lab]
 
<!-- ================================================================== -->
 
<!-- ================================================================== -->
 
|- style="background:#eeeeff" valign="top"
 
|- style="background:#eeeeff" valign="top"
Line 268: Line 271:
 
||
 
||
 
* '''Tuesday'''
 
* '''Tuesday'''
 +
** <font color="goldenrod">Paper presentation</font>: [[Media:LearningFromTheSuccessOfMPI2002_WilliamGropp.pdf | Learning from the Success of MPI]], presented by '''Gavi''' ([[Learning From the Success of MPI Bibtex| Bibtex]])
 +
** Hadoop0 accounts
 +
** Learn how to become '''rsync''' champions!
 +
** Continuation of the introduction to MPI ([[CSC352 Keynote Presentations 2013| keynote]]).  We stopped on Thursday on the '''MPI_Send()''' function. 
 +
** Code for the [[CSC352 MPI pi2.c program | pi2.c]] program computing Pi using summation of a series
 +
** <font color="red">Newsletter #3  due today!</font>
 +
 
* '''Thursday'''
 
* '''Thursday'''
 
+
** Continuation of the introduction to MPI ([[CSC352 Keynote Presentations 2013| keynote]])
 +
** Introduction on how to operate a MySQL database ([[CSC352 Keynote Presentations 2013| keynote]])
 +
** A project-oriented MPI example.  Bring your Mac!
 
----
 
----
*  
+
* [[CSC352 Homework 4 2013| Homework 4]] on C and MPI.  [[CSC352 Homework 4 Solutions| Solution programs]]
 
||
 
||
 
&nbsp;
 
&nbsp;
Line 280: Line 292:
 
|width="60%"|
 
|width="60%"|
 
* '''Tuesday'''
 
* '''Tuesday'''
 +
** <font color="goldenrod">Paper presentation</font>: [[Media:ServerVirtualizationArchitectureAndImplementation2009.pdf | Server Virtualization Architecture and Implementation]] presented by Emily
 +
** A few words about newsletters
 +
** [[CSC352 Where is What on Hadoop0 2013| Where is What on Hadoop0]]
 +
** MySQL Exercises
 +
** [[Tutorial:_C_%2B_MySQL_%2B_MPI | Combining C, MySQL and MPI]]: combing through a lot of code
 +
** Project discussion
 
* '''Thursday'''
 
* '''Thursday'''
 
+
** [[Tutorial: Create an MPI Cluster on the Amazon Elastic Cloud (EC2) | Creating an MPI Cluster on Amazon]], ([[CSC352 Keynote Presentations 2013| Accompanying keynote]]), followed by a second [[Computing Pi on an AWS MPI-Cluster| tutorial]] on computing Pi on a 10-node AWS cluster.
 
----
 
----
 
*  
 
*  
Line 291: Line 309:
 
||
 
||
 
* '''Tuesday''': <font color="magenta">Otelia Cromwell Day</font>
 
* '''Tuesday''': <font color="magenta">Otelia Cromwell Day</font>
* '''Thursday'''
+
* '''Thursday''':
 +
** <font color="goldenrod">Paper presentation</font>:  [[Media:MapReduceDeanGhemawat_2004.pdf |MapReduce: Simplified Data Processing on Large Clusters]] presented by Sharon Pamela
 +
** <font color="red">Newsletter #4 due today!</font>  Please include at least one image, and at least one news item covering some form of project that could be related to, or influential for, our own wiki-collage project.  See [http://cs.smith.edu/dftwiki/index.php/Latex_and_Editing_Tools_to_write_an_Honors_Thesis this document on writing theses] for information about the inclusion of images in Latex.  The end section has a good list of sites that have good coverage of Latex topics.  There is also plenty of information on the Web about this subject.
 +
** Preparation for [[CSC352 Homework 5 2013 | Homework 5]]: attaching EBS volumes.  We'll do a lab in class to [[Create_an_MPI_Cluster_on_the_Amazon_Elastic_Cloud_(EC2)#Creating_an_EBS_Volume | create ]] and [[Create_an_MPI_Cluster_on_the_Amazon_Elastic_Cloud_(EC2)#Attaching_the_EBS_Volume_to_the_Cluster | attach]] an EBS volume to your AWS cluster.
 +
 
  
 
----
 
----
*  
+
* [[CSC352 Homework 5 2013 | Homework 5]] and [[CSC352 Homework 5 Solution 2013| Solution]]
 
||
 
||
 
&nbsp;
 
&nbsp;
Line 303: Line 325:
 
|width="60%"|
 
|width="60%"|
 
* '''Tuesday'''
 
* '''Tuesday'''
 +
** <font color="goldenrod">Paper presentation</font>: [[Media:GeneralPurposeVsGPU_Comparison_Many_Cores_2010_Caragea.pdf |General-Purpose vs. GPU: Comparisons of Many-Cores on Irregular Workloads]], presented by Yoshie
 +
*** '''Questions about the paper:'''
 +
**** What kind of paper is this?  Broad distribution?  Research?  Small group?
 +
**** Organization?  Abstract?  Introduction?  Definition of specialized terms?  Early enough in the paper?
 +
**** Are the contributions of the paper clear?  Is the section on related research sufficient?
 +
**** What is being compared?  Similar machines?  Hardware?  Software?
 +
**** Are authors partial?  Do they have a stake?
 +
**** How does the paper advance the state of research?
 +
**** What does it tell us about the way computer systems evolve? 
 +
** Thinking about the project
 +
*** What do we know better about the overall project?  What pieces have we looked at?
 +
*** What is it we don't know?
 +
*** Can we turn any of these questions into a project?
 +
<br />
 
* '''Thursday'''
 
* '''Thursday'''
 
+
** A few comments on the Manager/Worker paradigm in MPI: it is not the only one.  Many logical communication networks do not match a star pattern
 +
** Continuation of Project Discussion
 +
** Quick Introduction to Hadoop/MapReduce ([[CSC352 Keynote Presentations 2013| Accompanying keynote]])
 +
** [[Tutorial:_Creating_a_Hadoop_Cluster_on_Amazon_AWS | MapReduce lab on AWS]]
 +
** [[CSC352 Bash Script to Run Hadoop WordCount| Script to run Hadoop WordCount program]] on AWS
 
----
 
----
 
*  
 
*  
 
||
 
||
 +
Yahoo has some very good reading material on Hadoop.  One reason is that they may be one of the largest users of AWS and of Hadoop. 
 +
** [http://developer.yahoo.com/hadoop/tutorial/ Yahoo Developers Network]: Tutorial on Hadoop.  All the chapters are worth reading!
  
 
<!-- ================================================================== -->
 
<!-- ================================================================== -->
Line 313: Line 355:
 
| Week 12 <br /> 11/19<br />
 
| Week 12 <br /> 11/19<br />
 
||
 
||
* '''Tuesday''': <font color="magenta">Guest Lecture (DT @ INFOCOMP 2013)</font>
+
* '''Tuesday''':  
* '''Thursday''': <font color="magenta">Tentative guest lecture: Nick Howe on CUDA and GPUs</font>
+
** <font color="red">1 month to go (exactly) before the project is due (Dec. 19)!</font>
 +
** <font color="magenta">Student-directed work (DT @ INFOCOMP 2013)</font>
 +
** Finish the [[Tutorial:_Creating_a_Hadoop_Cluster_on_Amazon_AWS | MapReduce lab on AWS]] and make sure you do the [[Tutorial:_Creating_a_Hadoop_Cluster_with_StarCluster_on_Amazon_AWS#Challenge_.23_2 | Challenge 2]] part of the lab.
 +
** Food for thought: some videos<br />I suggest one of you connects her laptop to the projection system and you all watch these videos together.  After each one, discuss it as a group.  Take notes and be ready to share your comments during Thursday's class when we resume our regular schedule.
 +
*** The Cave 2 Project at the University of Illinois:  Just another hardware solution for presenting the user with a large number of pixels; in this case  27320 x 3072 pixels. ''Short, 3 minutes.''
 +
<center>
 +
<videoflash>yf0sllpZx3w</videoflash>
 +
</center>
 +
<br />
 +
*** The Creators Projects video<br />
 +
::::This video is not necessarily anything that can work for us, but it's just "food for thought."  Just a different way an artist has come up to make still pictures interesting to look at.  ''Short, 6 minutes''.
 +
<center>
 +
<videoflash>rKmMaDBoZhs</videoflash>
 +
</center>
 +
<br />
 +
***  O'Reilly Radar Videos<br />
 +
[[Image:OReillyPerlBookCover.jpg|100px|right]]
 +
::: Tim O'Reilly is a visionary who figured out a long time ago that computer technology was an exploding field and he started a very successful line of books to support all new technology projects that were emerging and promising.  The books all have animals on them and are uniquely easy to spot.  O'Reilly now also has an on-line channel (O'Reilly Radar), and organizes conferences with top researchers and intellectuals in the field of computer science.
 +
::: The first video is with Doug Cutting, one of the creators of Hadoop.  He makes some very good points about what Hadoop is, what it is good at, and what it might not be good at (Homework 5 lesson?).  After Cutting you can skip the 2nd interview (about video technology) and zip to the 3rd interview with Jeremy Howard, at time-tag 13:47.  Then learn about big data and analytics, and what is said of ''data scientists''. '' About 12 minutes total''.
 +
<center>
 +
<videoflash>BWBGQIq5zow</videoflash>
 +
</center>
 +
<br />
 +
::: Good interview of Tim O'Reilly describing Web 2.0, and his view of a data-driven Internet.  You may want to think about how our Wikipedia data (images, stats) relate to what is said about data in the interview.  ''About 8 minutes''.
 +
<center>
 +
<videoflash>FJ3TxeE_tHI</videoflash>
 +
</center>
 +
<br />
 +
::: The next video filmed in June 2013 presents Bruno Fernandez-Ruiz of Yahoo, who speaks about Hadoop since 2005, Hadoop today, and what is ahead.  An important type of data property Fernandez-Ruiz is interested in is ''timeliness'', which we haven't really looked at for our project, but you will see that it could apply easily to the dynamics of wikipedia.  Some interesting statistics about the number of servers, the size of the HDFS they use, the number of processes are given.  ''About 17 minutes''.
 +
<center>
 +
[[Image:LookingBeyondHadoop.png | 430px | link=http://fora.tv/2013/06/26/Hadoop_and_Continuous_Computing_Looking_Beyond_MapReduce ]]
 +
</center>
 +
** If you have at least 25 minutes left before the class time is over, do the [[Tutorial:_Running_a_Python_version_of_WorkCount_on_an_AWS_cluster| MapReduce-Python lab]], without attempting the challenges at the end.  We'll do these together.
 +
 
 +
<br />
 +
* '''Thursday''':  
 +
** <font color="magenta">Tentative guest lecture: Nick Howe on CUDA and GPUs</font>
 +
** Some thoughts about INFOCOMP 2013 ([[CSC352 Keynote Presentations 2013| keynote]])
 +
** Going over Homework #5 ([[CSC352 Walking a 2-Level Directory in C| Walking a 2-Level Directory in C]])
 +
 
  
 
----
 
----
Line 325: Line 406:
 
|width="15%"| Week 13 <br /> 11/26
 
|width="15%"| Week 13 <br /> 11/26
 
|width="60%"|
 
|width="60%"|
* '''Tuesday'''
+
* '''Tuesday''':
 +
** <font color="red">No newsletter due</font>
 +
** <font color="goldenrod">Paper presentation</font>:  [[Media:AViewOfCloudComputing_CACM_Apr2010.pdf| A View of Cloud Computing]] presented by Dana&euml;.
 +
** 5-minute project presentations (everybody)
 +
**  Instead of a newsletter, you may turn in today a [[CSC352 Project Introduction  in Latex | draft of an introduction to your final project]].  If you have too much work this week, you can turn this in on 12/3.
 +
** [[Tutorial: A bit of Bash | A bit of Bash]]
 +
** The challenges of the [[Tutorial:_Running_a_Python_version_of_WorkCount_on_an_AWS_cluster| MapReducing in Python]] lab
 
* '''Thursday''': <font color="magenta">Thanksgiving Break</font>
 
* '''Thursday''': <font color="magenta">Thanksgiving Break</font>
  
Line 337: Line 424:
 
||
 
||
 
* '''Tuesday'''
 
* '''Tuesday'''
 +
** <font color="goldenrod">Paper presentation</font>:  [[Media:unreasonableEffectivenessOfData2009_HalevyNorvigPereira.pdf | The Unreasonable Effectiveness of Data]] presented by Julia
 +
**  Instead of a newsletter, you need to turn in a [[CSC352 Project Introduction  in Latex | draft of an introduction to your final project]] (unless you submitted it last week).
 +
 +
** The challenges of the [[Tutorial:_Running_a_Python_version_of_WorkCount_on_an_AWS_cluster| MapReducing in Python]] lab.  We did Challenge #1 last time.  We'll look at Challenges #2 and #3.
 +
** Some feedback on Homework #5 and one [[CSC352 Homework 5 Solution 2013| solution]].
 +
** MapReduce task graphs
 +
----
 +
----
 
* '''Thursday'''
 
* '''Thursday'''
 
+
** [[Hadoop_Tutorial_1.1_--_Generating_Task_Timelines | Distribution of Map and Reduce tasks over time]]
 +
** Project work and discussion
 +
** 20-minute individual session (in class) to go over project, questions, setup, etc...
 
----
 
----
*  
+
*
 
||
 
||
 
&nbsp;
 
&nbsp;
Line 348: Line 445:
 
|width="15%"| Week 15 <br /> 12/10
 
|width="15%"| Week 15 <br /> 12/10
 
|width="60%"|
 
|width="60%"|
 +
[[Image:CSC352Row.jpg|150px|right]]
 
* '''Tuesday''': <font color="lightblue">Last Day of Class</font>
 
* '''Tuesday''': <font color="lightblue">Last Day of Class</font>
 +
** 20-minute presentations of projects.  Suggested outline:
 +
*** The context: how your project fits in the overall picture
 +
*** Has other similar work been done and documented before?
 +
*** What you decided to do
 +
**** The challenges
 +
**** The choices
 +
**** The target experiments
 +
*** Preliminary results
 +
*** Expected results
 +
*** Possible directions for continuing research after the project
  
 
----
 
----
*
+
An afternoon of packing circular crepes, including some imaginative variations...
 +
[[Image:PackingCrepes1.jpg|200px]][[Image:PackingCrepes2.jpg|200px]]
 +
[[Image:PackingCrepes3.jpg|200px]]
 +
[[Image:PackingCrepes4.jpg|200px]]
 +
[[Image:PackingCrepes5.jpg|200px]]
 +
[[Image:PackingCrepes6.jpg|200px]]
 
||
 
||
  
Line 360: Line 473:
  
 
=Links and Resources=
 
=Links and Resources=
 +
<br />
 +
==Latex==
 +
<br />
 +
* [http://www.youtube.com/playlist?list=PLCRFsOKSM7ePUBOfh3O-K5XZldM5uCPwk Latex tutorial (video)]
 +
* [http://www.youtube.com/playlist?list=PLCRFsOKSM7eNGNghvT6QdzsDYwSTZxqjC How to write a thesis in Latex (video)]
 +
* [http://www.youtube.com/playlist?list=PLCRFsOKSM7eO-WX2ENa5A5vtNx1kjPefY Presentations with Beamer (video)]
 +
* [http://www.youtube.com/playlist?list=PLCRFsOKSM7eN6jPk0wSopXb37RKW93PM3 TikZ examples (video)]
 +
<br />
 +
 +
==Smith Elements of Style==
 +
<br />
 +
* [[media:SmithJacobsonCenterWritingPapers-1.pdf | "Writing Papers" from the Smith College Jacobson Center for writing]]
 
<br />
 
<br />
 
==On-Line Resources==
 
==On-Line Resources==
Line 374: Line 499:
 
This is a tentative and non-exhaustive list of papers scheduled for reading this semester.
 
This is a tentative and non-exhaustive list of papers scheduled for reading this semester.
 
===Introduction===
 
===Introduction===
{| class="wikitable"
+
{| class="wikitable" style="width: 100%;"
 
!width="600" | Paper
 
!width="600" | Paper
 
! Pages
 
! Pages
 
|-
 
|-
 
| style="width: 90%;" |
 
| style="width: 90%;" |
* [http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.pdf The Landscape of Parallel Computing Research: A View From Berkely], 2006, still good! (very long paper)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
+
* [http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.pdf The Landscape of Parallel Computing Research: A View From Berkeley], 2006, still good! (very long paper)
 
|
 
|
 
50
 
50
Line 391: Line 516:
 
===General/Parallelism===
 
===General/Parallelism===
  
{| class="wikitable"
+
{| class="wikitable" style="width: 100%;"
 
!width="600" | Paper
 
!width="600" | Paper
 
! Pages
 
! Pages
Line 412: Line 537:
  
 
===MPI===
 
===MPI===
{| class="wikitable"
+
{| class="wikitable" style="width: 100%;"
 
!width="600" | Paper
 
!width="600" | Paper
 
! Pages
 
! Pages
Line 423: Line 548:
  
 
===GPUs===
 
===GPUs===
{| class="wikitable"
+
{| class="wikitable" style="width: 100%;"
 
!width="600" | Paper
 
!width="600" | Paper
 
! Pages
 
! Pages
 
|-
 
|-
 
|style="width: 90%;" |
 
|style="width: 90%;" |
* [[Media:GeneralPurposeVsGPU_Comparison_Many_Cores_2010_Caragea.pdf |General-Purpose vs. GPU: Comparisons of Many-Cores on Irregular Workloads]], 2010
+
* [[Media:GeneralPurposeVsGPU_Comparison_Many_Cores_2010_Caragea.pdf |General-Purpose vs. GPU: Comparisons of Many-Cores on Irregular Workloads]], 2010&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 
|
 
|
 
6
 
6
Line 434: Line 559:
  
 
===Virtualization===
 
===Virtualization===
{| class="wikitable"
+
{| class="wikitable" style="width: 100%;"
 
!width="600" | Paper
 
!width="600" | Paper
 
! Pages
 
! Pages
 
|-
 
|-
 
|style="width: 90%;" |
 
|style="width: 90%;" |
* [[Media:ServerVirtualizationArchitectureAndImplementation2009.pdf | Server Virtualization Architecture and Implementation]], xrds, 2009.
+
* [[Media:ServerVirtualizationArchitectureAndImplementation2009.pdf | Server Virtualization Architecture and Implementation]], xrds, 2009&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 
|
 
|
 
5
 
5
Line 445: Line 570:
  
 
===Cloud===
 
===Cloud===
{|class = "wikitable"
+
{|class = "wikitable" style="width: 100%;"
 
!width="600" | Paper
 
!width="600" | Paper
 
! Pages
 
! Pages
Line 476: Line 601:
  
 
===Project-Related===
 
===Project-Related===
{| class="wikitable"
+
{| class="wikitable" style="width: 100%;"
 
!width="600" | Paper
 
!width="600" | Paper
 
! Pages
 
! Pages

Latest revision as of 11:31, 31 January 2017

--D. Thiebaut (talk) 11:15, 9 August 2013 (EDT)




Main Page|Syllabus|Project Page | PIAZZA



Weekly Schedule

Week Topics Reading
Week 1
9/3
  • Tuesday
    • Syllabus
    • Introduction to final project
      • Approach
      • Programming
      • Testing
      • ==> paper (see 2011 paper for example).
    • Parallelism: going to the source: Interrupts!
      • 8086 type of interrupts (simplified)
      • Interrupt Vector
      • Interrupt Priority
      • Context Switch
      • Stack and Stack Frame
      • Global and Local Variables
    • What is a process?
    • What is a thread?




  • Thursday
    • Goals of multithreading:
      • Enhanced performance
      • Increased throughput
      • Greater user responsiveness
    • What should we remember 5 years from now?



    • Introduction to a graph we'll use all throughout the semester. The idea of threads
    Thread 1 ----------------------|====|-------------------------> time

    Thread 2 ------------|====|-----------------------------------> time


    • Multithreaded programming.

    • Comments on bimonthly newsletter
      • The format should be similar to the ACM Tech News format.
      • The header should contain a title, your name, the class (CSC352) and the date
      • Each paragraph should have a header with a title, the source of news, the date, and possibly a link to the full article.
      • The paragraph describing a news item should be between 3 and 10 lines, give or take.
      • Write 1 full page to 2 pages, depending on the richness of events in the technology field
      • Feel free to present N-1 topics with just 3 lines, and 1 topic which you highlight with a longer paragraph.
      • Topics: anything related to parallelism: computers, mobile platforms, cloud, companies, new software, new algorithms, conferences, people in the field, etc.
      • Good sources of information to get started:
      • Recommendation for news aggregator: Feedly.com

  • Homework: play with Latex. Find or adapt a document template for your needs (minimalist is the name of the game at this point), and start gathering news bits. First newsletter due Thursday Sept. 19th. The ACM Tech News format is a good and simple format to emulate.

Week 2
9/10
  • Tuesday
      • Introduction to measuring performance. Comparing execution times.
      • Introduction to Speedup( N ), where N is the number of threads, or the number of processors.
      • Amdahl's Law
AmdahlsLaw.jpg
      • A bit of Computer Architecture: Cores and Caches
4CoreAndLevel123Caches.png



4CoreAndLevel3CacheDie.jpg



LatenciesInMemoryHierarchy.png

(last slide taken from http://www.cs.utexas.edu/users/mckinley/352/lectures/16.pdf)
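
As a small illustration of Speedup( N ) and Amdahl's Law from Tuesday's topics, the sketch below computes the speedup bound for a program in which only part of the run time can be parallelized. It is a minimal example and not course material: the function names and the parallel_fraction parameter are hypothetical, chosen only for this illustration.

```python
def amdahl_speedup(n, parallel_fraction):
    """Upper bound on Speedup(N) predicted by Amdahl's Law.

    n: number of threads or processors (N in the notes above).
    parallel_fraction: share of the single-thread run time that can be
        parallelized, between 0 and 1 (hypothetical parameter name).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part is not sped up; the parallel part is divided by n.
    return 1.0 / (serial_fraction + parallel_fraction / n)


def measured_speedup(time_one_thread, time_n_threads):
    """Speedup(N) computed from two measured execution times."""
    return time_one_thread / time_n_threads


if __name__ == "__main__":
    # With 90% of the work parallelizable, the bound flattens out below 10.
    for n in (1, 2, 4, 8, 16):
        print(n, round(amdahl_speedup(n, 0.9), 2))
```

Comparing measured_speedup against amdahl_speedup for the same N is one way to judge how much of a program is effectively serial.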





  • Thursday
    • Discussion of A View of Parallel Processing from Berkeley. Prepare a 1- to 2-page summary of the paper in Latex. Hand in the summary in class. No summaries will be accepted after class.
AViewFromBerkeleyWordle.png


    • Some topics taken from the paper:
      • Moore's Law:

MooresLawProcessorMemoryGap.gif

      • Barnes and Hut approach to N-Body problem


ManyCoreArchitecture.jpg

(Image taken from URL: http://www.altera.com/technology/system-design/articles/2012/multicore-many-core.html)

      • nanometers: where are we now?
Nm fabricationProcess.png


( Image taken from http://en.wikipedia.org/wiki/22_nanometer)

RingNetworkLinkingMultiCoreIntelArch.png
    • Short preparation for Maggie Lind's tour of the SCMA on Tuesday. Meeting place is entrance of SCMA.
      • The subject of the project falls within the field of Culturomics




Week 3
9/17

All the data structures of interest (concurrent non-blocking and blocking) can be found in the Oracle documentation. The information is a bit cryptic, but you need to get comfortable with it!

Week 4
9/24
  • Tuesday: Guest Lecture/Informal discussion with Tim Draper
    • Some questions to start the conversation:
    • How has the cloud infrastructure changed entrepreneurship, if at all?
    • There is a whole ecosystem growing around the cloud services offered by Amazon and the other players: new companies offering services and using Amazon's AWS for example. What are some of the most interesting companies/ideas/technologies emerging that you have discovered or been involved with?
    • There are tremendous worries about the safety and privacy of data in the cloud. Is this an area of growth students should consider?
    • What other areas of growth do you see that students should keep in their sights?
    • If a graduating major is interested in joining a start-up company, what are the signs she should be looking for before joining such a group?
    • Some students are interested in a management track, starting at a big company and climbing fast. What is your advice for best preparing for this type of career?
    • What is the most exciting development in your eyes happening now with cloud technology?
    • It has been said that the 21st century is the century of the entrepreneur. Do you see this as true?
    • Companies rise and fall. Microsoft was once the place where all our majors wanted to go. The most prestigious company for programmers. Now it's Google, and Facebook. Which company(ies) do you see as potential new meccas for programmers?
    • If somebody were to form a start-up with friends. Say 10 people. Who/What/Where? Who should the people be? What field should they be experts in? Where should the company locate?





    • Review of Homework 1 and its Solution.
      • Understand static variables
      • Don't use global random generators!
      • /usr/bin/time reports CPU time summed over all threads, so for threaded applications the user time can be up to (# of cores) times the elapsed time
      • Be sure to understand whether you need the same random seed or a different seed in your experiments
      • Create a different user on your laptop with no extra applications loaded in the background (e.g. Skype): less stress on the O.S.
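
The two points above about random generators can be illustrated with a small sketch. Homework 1 and its solution are not reproduced here; this is only a Python illustration of the idea, and the names `worker` and `run_experiment` are made up for the example. Each thread gets its own private generator with an explicit seed, so nothing is shared between threads and you decide deliberately whether threads see the same stream or different ones.

```python
import random
import threading

def worker(seed, n, out, idx):
    # Each thread owns a private generator (hypothetical helper for this
    # sketch). Nothing is shared, so no locking is needed and the
    # sequence depends only on the seed.
    rng = random.Random(seed)
    out[idx] = [rng.randint(0, 99) for _ in range(n)]

def run_experiment(seeds, n=5):
    """Run one thread per seed and return each thread's sequence."""
    out = [None] * len(seeds)
    threads = [threading.Thread(target=worker, args=(s, n, out, i))
               for i, s in enumerate(seeds)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out

same = run_experiment([42, 42])      # same seed in both threads
different = run_experiment([1, 2])   # one distinct seed per thread
```

With the same seed, both threads produce identical sequences, and rerunning with the same list of seeds reproduces the whole experiment. A single global generator shared by all threads gives neither guarantee, because the values each thread sees depend on scheduling.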



  • Thursday: Mountain Day!




 

Week 5
10/1
  • Tuesday (Grace Hopper Conference)
    • Introduction to Packing (pdf and ppt)
    • Studying the Red-Black Tree data structure
      • Why is it not thread-safe?
      • How can we make it thread-safe?
      • Devise a test to verify that the modifications have resulted in a thread-safe class
      • Profiling Java applications (introduction to Java's GC).
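
The three Red-Black Tree questions above can be sketched in miniature. The class work is in Java and is not reproduced here; the Python below is only an illustration under loud assumptions: a sorted list stands in for the red-black tree, the fix is the simplest possible one (a single lock around every mutation), and the names `SortedSet`, `LockedSortedSet` and `stress_test` are invented for this example.

```python
import bisect
import threading

class SortedSet:
    """Stand-in for the red-black tree: a sorted list, NOT thread-safe."""
    def __init__(self):
        self.keys = []

    def insert(self, key):
        # Search and update are two separate steps; another thread can
        # modify the structure in between.
        i = bisect.bisect_left(self.keys, key)
        if i == len(self.keys) or self.keys[i] != key:
            self.keys.insert(i, key)

class LockedSortedSet(SortedSet):
    """Coarse-grained fix (an assumption of this sketch): one lock
    serializes every mutation."""
    def __init__(self):
        super().__init__()
        self._lock = threading.Lock()

    def insert(self, key):
        with self._lock:
            super().insert(key)

def stress_test(container, n_threads=4, per_thread=2000):
    """Insert disjoint key ranges concurrently, then check invariants."""
    def work(t):
        for k in range(t * per_thread, (t + 1) * per_thread):
            container.insert(k)

    threads = [threading.Thread(target=work, args=(t,))
               for t in range(n_threads)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    # Every key must be present exactly once and in sorted order.
    return container.keys == list(range(n_threads * per_thread))

ok = stress_test(LockedSortedSet())
```

Keep in mind that a passing stress test only shows that no violation was observed in that run. It does not prove thread safety, so the test should be repeated with more threads and more keys, and the locking argument should also be checked by reasoning about the code.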




  • Thursday (Grace Hopper Conference)
    • Newsletter #2 due today. Please include one news item about some form of image collage or other representation of many images, preferably digital. Also, please use a Latex feature you did not use in your first newsletter.
    • Elaborating a roadmap for the final Project

Week 6
10/8

  • MPI by Blaise Barney, at Lawrence Livermore National Laboratory: an excellent reference on MPI

Week 7
10/15
  • Tuesday: Fall Break
  • Thursday



Week 8
10/22
  • Tuesday
    • Paper presentation: Learning from the Success of MPI, presented by Gavi (Bibtex)
    • Hadoop0 accounts
    • Learn how to become rsync champions!
    • Continuation of the introduction to MPI (keynote). We stopped on Thursday at the MPI_Send() function.
    • Code for the pi2.c program computing Pi using summation of a series
    • Newsletter #3 due today!
  • Thursday
    • Continuation of the introduction to MPI (keynote)
    • Introduction on how to operate a MySQL database (keynote)
    • A project-oriented MPI example. Bring your Mac!
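
The idea behind Tuesday's pi2.c item, computing Pi by summing a series split across processes, can be sketched without MPI. The actual pi2.c is not reproduced here, and the series it uses is not stated on this page; the sketch assumes the common midpoint-rule sum for the integral of 4/(1+x^2) on [0,1]. The "ranks" are simulated by a plain loop in one Python process, and the names `partial_sum` and `estimate_pi` are made up for the example.

```python
import math

def partial_sum(rank, size, n):
    """Sum the terms i = rank, rank+size, rank+2*size, ... of the
    midpoint rule for the integral of 4/(1+x^2) over [0,1]."""
    h = 1.0 / n
    total = 0.0
    for i in range(rank, n, size):
        x = h * (i + 0.5)          # midpoint of the i-th interval
        total += 4.0 / (1.0 + x * x)
    return h * total

def estimate_pi(size=4, n=100_000):
    # In an MPI program each rank would compute only its own partial sum
    # and rank 0 would collect them (for instance with MPI_Send/MPI_Recv,
    # or with MPI_Reduce). Here the ranks are just loop iterations.
    partials = [partial_sum(rank, size, n) for rank in range(size)]
    return sum(partials)

pi_est = estimate_pi()
print(pi_est, abs(pi_est - math.pi))
```

The cyclic distribution (rank, rank+size, ...) gives every rank nearly the same number of terms, so the work is balanced regardless of how many ranks are used.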

 

Week 9
10/29

Week 10
11/5
  • Tuesday: Otelia Cromwell Day
  • Thursday:
    • Paper presentation: MapReduce: Simplified Data Processing on Large Clusters presented by Sharon Pamela
    • Newsletter #4 due today! Please include at least one image, and at least one news item covering a project that could be related to, or influential for, our own wiki-collage project. See this document on writing theses for information about the inclusion of images in Latex. The end section has a good list of sites that have good coverage of Latex topics. There is also plenty of information on the Web about this subject.
    • Preparation for Homework 5: attaching EBS volumes. We'll do a lab in class to create and attach an EBS volume to your AWS cluster.



 

Week 11
11/12
  • Tuesday
    • Paper presentation: General-Purpose vs. GPU: Comparisons of Many-Cores on Irregular Workloads, presented by Yoshie
      • Questions about the paper:
        • What kind of paper is this? Broad distribution? Research? Small group?
        • Organization? Abstract? Introduction? Definition of specialized terms? Early enough in the paper?
        • Are the contributions of the paper clear? Is the section on related research sufficient?
        • What is being compared? Similar machines? Hardware? Software?
        • Are authors partial? Do they have a stake?
        • How does the paper advance the state of research?
        • What does it tell us about the way computer systems evolve?
    • Thinking about the project
      • What do we now understand better about the overall project? Which pieces have we looked at?
      • What is it we don't know?
      • Can we turn any of these questions into a project?



Yahoo has some very good reading material on Hadoop. One reason is that they may be one of the largest users of AWS and of Hadoop.

Week 12
11/19
  • Tuesday:
    • 1 month to go (exactly) before the project is due (Dec. 19)!
    • Student-directed work (DT @ INFOCOMP 2013)
    • Finish the MapReduce lab on AWS and make sure you do the Challenge 2 part of the lab.
    • Food for thought: some videos
      I suggest that one of you connect her laptop to the projection system and that you all watch these videos together. After each one, discuss it as a group. Take notes and be ready to share your comments during Thursday's class when we resume our regular schedule.
      • The Cave 2 Project at the University of Illinois: Just another hardware solution for presenting the user with a large number of pixels; in this case 27320 x 3072 pixels. Short, 3 minutes.


      • The Creators Projects video
This video is not necessarily anything that can work for us; it's just "food for thought": a different way an artist has come up with to make still pictures interesting to look at. Short, 6 minutes.


      • O'Reilly Radar Videos
Tim O'Reilly is a visionary who figured out a long time ago that computer technology was an exploding field, and he started a very successful line of books to support the promising new technology projects that were emerging. The books all have animals on their covers and are easy to spot. O'Reilly now also has an on-line channel (O'Reilly Radar), and organizes conferences with top researchers and intellectuals in the field of computer science.
The first video is with Doug Cutting, one of the creators of Hadoop. He makes some very good points about what Hadoop is, what it is good at, and what it might not be good at (Homework 5 lesson?). After Cutting you can skip the 2nd interview (about video technology) and jump to the 3rd interview, with Jeremy Howard, at time-tag 13:47, to learn about big data and analytics, and what is said of data scientists. About 12 minutes total.


Good interview of Tim O'Reilly describing Web 2.0 and his view of a data-driven Internet. You may want to think about how our wikipedia data (images, stats) relate to what is said about data in the interview. About 8 minutes.


The next video, filmed in June 2013, presents Bruno Fernandez-Ruiz of Yahoo, who speaks about Hadoop since 2005, Hadoop today, and what is ahead. An important data property Fernandez-Ruiz is interested in is timeliness, which we haven't really looked at for our project, but you will see that it could easily apply to the dynamics of wikipedia. He gives some interesting statistics about the number of servers, the size of the HDFS they use, and the number of processes. About 17 minutes.


    • If you have at least 25 minutes left before the class time is over, do the MapReduce-Python lab, without attempting the challenges at the end. We'll do these together.




 

Week 13
11/26

Week 14
12/3
    • The challenges of the MapReducing in Python lab. We did Challenge #1 last time; we'll look at Challenges #2 and #3.
    • Some feedback on Homework #5 and one solution.
    • MapReduce task graphs
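
As a reminder of the three phases that the lab and the task graphs revolve around, here is a minimal single-process sketch. The lab and its challenges are not reproduced here; word counting is only assumed as a typical example, and the names `mapper`, `shuffle`, `reducer` and `map_reduce` are hypothetical. On Hadoop the shuffle step is performed by the framework between the map and reduce tasks.

```python
from collections import defaultdict

def mapper(line):
    """Map phase: emit a (word, 1) pair for every word on the line."""
    for word in line.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle phase: group all values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reducer(key, values):
    """Reduce phase: combine the values of one key into a single count."""
    return key, sum(values)

def map_reduce(lines):
    # Chain the three phases in one process, for illustration only.
    pairs = (pair for line in lines for pair in mapper(line))
    return dict(reducer(k, v) for k, v in shuffle(pairs).items())

counts = map_reduce(["the quick brown fox", "the lazy dog", "The end"])
print(counts)
```

Because each mapper call sees only one line and each reducer call sees only one key, both phases can be spread over many machines; only the shuffle needs to move data between them.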



 

Week 15
12/10
  • Tuesday: Last Day of Class
    • 20-minute presentations of projects. Suggested outline:
      • The context: how your project fits in the overall picture
      • Has other similar work been done and documented before?
      • What you decided to do
        • The challenges
        • The choices
        • The target experiments
      • Preliminary results
      • Expected results
      • Possible directions for continuing research after the project

An afternoon of packing circular crepes, including some imaginative variations...



Links and Resources


Latex



Smith Elements of Style



On-Line Resources


Classics



Papers

This is a tentative and non-exhaustive list of papers scheduled for reading this semester.

Introduction

Paper Pages

50

2

General/Parallelism

Paper Pages

5

7

5

MPI

Paper Pages
  • Learning from the Success of MPI, by William D. Gropp, Argonne National Lab, 2002.

11

GPUs

Paper Pages

6

Virtualization

Paper Pages

5

Cloud

Paper Pages

1.5

  • A View of Cloud Computing, 2010, by Michael Armbrust, Armando Fox, Rean Griffith, Anthony D. Joseph, Randy Katz, Andy Konwinski, Gunho Lee, David Patterson, Ariel Rabkin, Ion Stoica, and Matei Zaharia.

9

13

5

2

Project-Related

Paper Pages

8