CSC352 Project Page 2017


Latest revision as of 18:26, 23 April 2017

--D. Thiebaut (talk) 10:22, 26 January 2017 (EST)






Work Due


March 21, 2017, 1:00 p.m.


  • A 1-page PDF prepared in LaTeX, with
      • Name of the team member(s)
      • Name of the project
      • Parallel technology (hardware, software, other) you have picked
      • Date when you wish to present (conflicting dates will be resolved in class)


Note: You may submit a second proposal at a later time, but you need to have one submitted by 4/21 for this assignment to count.

Note: The grade will be A for submission, F for no submission.


May 12, 2017, 4:00 p.m.


  • A PDF of the project document. This document should follow the slides you presented in your oral presentation and include several examples that you have tested. It should also contain a graph showing the result of an experiment in which you measured some form of performance for varying values of a parameter defining your environment (number of threads, number of processors, problem size, different implementations of an algorithm, different languages, etc.).
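
As an illustration of the kind of experiment the graph could come from, here is a minimal Python sketch that times one workload for several values of a single parameter (here, the number of worker threads) and prints CSV rows ready for plotting. Everything in it is an assumption made for the example: the prime-counting workload, the function names `count_primes`, `run_with_workers`, `timed`, and `measure`, and the choice of Python threads. Because of CPython's global interpreter lock, this particular CPU-bound workload is unlikely to speed up with more threads; the point is the shape of the measurement loop, and you would substitute the technology you picked.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def count_primes(lo, hi):
    """Toy CPU-bound workload (hypothetical): count primes in [lo, hi)."""
    count = 0
    for n in range(max(lo, 2), hi):
        # n is prime if no d in 2..sqrt(n) divides it
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_with_workers(n_workers, limit=20000):
    """Split [0, limit) into n_workers chunks and count primes in each."""
    step = -(-limit // n_workers)  # ceiling division
    chunks = [(lo, min(lo + step, limit)) for lo in range(0, limit, step)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return sum(pool.map(lambda c: count_primes(*c), chunks))


def timed(fn, *args):
    """Return the wall-clock seconds taken by one call to fn(*args)."""
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start


def measure(param_values, repeats=3):
    """Return (parameter, best-of-repeats seconds) for each parameter value."""
    results = []
    for p in param_values:
        best = min(timed(run_with_workers, p) for _ in range(repeats))
        results.append((p, best))
    return results


if __name__ == "__main__":
    # One CSV row per parameter value; plot workers on x, seconds on y.
    print("workers,seconds")
    for workers, seconds in measure([1, 2, 4, 8]):
        print(f"{workers},{seconds:.4f}")
```

Taking the best of several repeats reduces noise from other activity on the machine; reporting the mean and spread instead is an equally reasonable choice.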


Project Contents


In the second half of the semester (or earlier if you wish), you will work on a project with the following components and requirements.
  1. Pick a partner to form a pair (singletons are allowed).
  2. Pick a topic from the list below.
  3. Write a LaTeX tutorial that introduces somebody with a CS background to the topic you've picked. Include an introduction that sets the background for your pick, a series of simple examples (all of which you must demonstrate you have run successfully), a section on performance, and an assessment of the future of the technology you covered. This document must contain a bibliography implemented with BibTeX.
  4. Give a 30-minute to 1-hour presentation to the class, with slides, introducing the class to the technology you have picked. This presentation should include (if possible) a mini-lab that allows everybody in class to run a few programs that use the technology you picked.


Non-Exhaustive List of Topics


The list below is taken from the Hadoop Ecosystem Table page:

Apache


  • Pig, Hive, JAQL, Storm, Flink, Apex, Pydoop, and others
  • NoSQL: HBase, Cassandra, Accumulo, Kudu, MongoDB
  • Machine Learning: Mahout, H2O
  • Other?

Misc

  • GPU: CUDA
  • Other?

Microsoft

  • Azure, and/or any component of its ecosystem.
  • Other?

Google

  • Google Compute Engine, Google App Engine
  • TensorFlow
  • Other?


Timing


  • You need to have a plan in place the first week after Spring Break. This plan includes:
      1. who your partner is
      2. what topic you have picked
      3. a date of presentation during the last 2-3 weeks of the semester.


  • We may have to meet during the lunch hour preceding the last lecture slots of the semester.


Project Teams


Team             Project                                                                 Date
Isaiah           Google App Engine                                                       4/25/17
Muriel           MEAN Stack on Heroku                                                    4/27/17
Grace            MongoDB                                                                 5/2/17 (lunch time)
Riley            HBase                                                                   5/2/17
Kathleen         AWS Spark (on Hadoop)                                                   5/2/17
Sam & Angie      Pig                                                                     5/4/17 (lunch time)
Lujun & Vega     Comparison of the H2O to Spark MLlib Machine Learning framework on EC2  5/4/17
Youyou & Zainab  TensorFlow                                                              5/4/17


Resources


  • You may find this page by Prof. Newhall at Swarthmore useful when you prepare your presentation.