Difference between revisions of "Tackling Big Data MIT Course"

From dftwiki3
(Notes)
(Certificate)
 
(11 intermediate revisions by the same user not shown)
Line 21: Line 21:
 
</onlydft>
 
</onlydft>
  
 +
 +
=Overall Syllabus=
 +
 +
==MODULES, TOPICS, AND FACULTY==
 +
 +
===Module One: Introduction and Use Cases===
 +
The introductory module aims to give a broad survey of Big Data
 +
challenges and opportunities and highlights applications as case
 +
studies.
 +
* Introduction: Big Data Challenges (Sam Madden)
 +
* Case Study: Transportation (Daniela Rus)
 +
* Case Study: Visualizing Twitter (Sam Madden)
 +
 +
===Module Two: Big Data Collection===
 +
The data capture module surveys approaches to data collection,
 +
cleaning, and integration.
 +
* Data Cleaning and Integration (Mike Stonebraker)
 +
* Hosted Data Platforms and the Cloud (Matei Zaharia)
 +
 +
===Module Three: Big Data Storage===
 +
The module on Big Data storage describes modern approaches
 +
to databases and computing platforms.
 +
* Modern Databases (Mike Stonebraker)
 +
* Distributed Computing Platforms (Matei Zaharia)
 +
* NoSQL, NewSQL (Sam Madden)
 +
 +
 +
===Module Four: Big Data Systems===
 +
The systems module discusses solutions for creating and deploying
 +
working Big Data systems and applications.
 +
* Multicore Scalability (Nickolai Zeldovich)
 +
* Security (Nickolai Zeldovich)
 +
* User Interfaces for Data (David Karger)
 +
 +
===Module Five: Big Data Analytics===
 +
The analytics module covers state-of-the-art algorithms for very
 +
large data sets and streaming computation.
 +
* Machine Learning Tools (Tommi Jaakkola)
 +
* Fast Algorithms I (Ronitt Rubinfeld)
 +
* Fast Algorithms II (Piotr Indyk)
 +
* Data Compression (Daniela Rus)
 +
* Case Study: Information Summarization (Regina Barzilay)
 +
* Applications: Medicine (John Guttag)
 +
* Applications: Finance (Andrew Lo)
 +
 +
Note: Schedule and faculty are subject to change without notice
  
 
=Notes=
 
=Notes=
 
<onlydft>
 
<onlydft>
 +
 
==Rus: Transportation==
 
==Rus: Transportation==
 
* Rus.  Transportation in Singapore.  Singapore is a small country with 16,000 taxis and a high number of loops embedded in its streets.  Can a sample of GPS-equipped taxis closely approximate the real traffic reported (every 15 minutes) by the loops, which are expensive to maintain?
 
* Rus.  Transportation in Singapore.  Singapore is a small country with 16,000 taxis and a high number of loops embedded in its streets.  Can a sample of GPS-equipped taxis closely approximate the real traffic reported (every 15 minutes) by the loops, which are expensive to maintain?
Line 49: Line 96:
 
<br />
 
<br />
 
----
 
----
 +
<br />
 +
=Complaints=
 +
* Quiz not satisfying
 +
* Forced to contribute to discussion: OK
 +
* Forced to comment on somebody else's comment: bad
 +
* Should be on each course, rather than on each unit
 +
* Plenty to complain about regarding UI
 +
[[Image:MITMoocComplaint1.jpg|300px]]
 +
[[Image:MITMoocComplaint2.jpg|300px]]
 +
[[Image:MITMoocComplaint3.jpg|300px]]
 +
[[Image:MITMoocComplaint4.jpg|300px]]
 
<br />
 
<br />
 
=Ideas for Presentation=
 
=Ideas for Presentation=
Line 57: Line 115:
 
* Simplicity of presentation hardware
 
* Simplicity of presentation hardware
 
* For UI, invent a system that lets users avoid having to program
 
* For UI, invent a system that lets users avoid having to program
 +
* Good to watch videos twice in a row.
 +
* HD essential for graphs
 +
* Office hours held 2-3 times by a few professors, using Cisco WebEx.  About 47 participants.  Audio only.
 +
<br />
 +
<center>[[Image:MITTacklingBigData_OfficeHours.png|700px]]</center>
 +
<br />
 +
<center>[[Image:MITTacklingBigData_Format.png|700px]]</center>
 +
<br />
 +
<center>[[Image:MITTacklingBigDataAssessment.png|700px]]</center>
 +
<br />
 +
<center>[[Image:MITTacklingBigDataOfficeHours.png|700px]]</center>
 +
 
</onlydft>
 
</onlydft>
 +
<br />
 +
<center>[[Image:MITEmailParticipationWorkshops.png|700px]]</center>
 +
<br />
 +
=Certificate=
 +
* [[Media:TacklingBigData_Certificate.pdf| Certificate]]
 +
<center>[[Image:TacklingBigData_Certificate.png|500px]]</center>

Latest revision as of 12:38, 23 April 2014

--D. Thiebaut (talk) 20:12, 3 March 2014 (EST)


Misc. Information


...

Login to EdX


...


Overall Syllabus

MODULES, TOPICS, AND FACULTY

Module One: Introduction and Use Cases

The introductory module aims to give a broad survey of Big Data challenges and opportunities and highlights applications as case studies.

  • Introduction: Big Data Challenges (Sam Madden)
  • Case Study: Transportation (Daniela Rus)
  • Case Study: Visualizing Twitter (Sam Madden)

Module Two: Big Data Collection

The data capture module surveys approaches to data collection, cleaning, and integration.

  • Data Cleaning and Integration (Mike Stonebraker)
  • Hosted Data Platforms and the Cloud (Matei Zaharia)

Module Three: Big Data Storage

The module on Big Data storage describes modern approaches to databases and computing platforms.

  • Modern Databases (Mike Stonebraker)
  • Distributed Computing Platforms (Matei Zaharia)
  • NoSQL, NewSQL (Sam Madden)


Module Four: Big Data Systems

The systems module discusses solutions for creating and deploying working Big Data systems and applications.

  • Multicore Scalability (Nickolai Zeldovich)
  • Security (Nickolai Zeldovich)
  • User Interfaces for Data (David Karger)

Module Five: Big Data Analytics

The analytics module covers state-of-the-art algorithms for very large data sets and streaming computation.

  • Machine Learning Tools (Tommi Jaakkola)
  • Fast Algorithms I (Ronitt Rubinfeld)
  • Fast Algorithms II (Piotr Indyk)
  • Data Compression (Daniela Rus)
  • Case Study: Information Summarization (Regina Barzilay)
  • Applications: Medicine (John Guttag)
  • Applications: Finance (Andrew Lo)

Note: Schedule and faculty are subject to change without notice

Notes


...


MITEmailParticipationWorkshops.png


Certificate

TacklingBigData Certificate.png