Difference between revisions of "CSC352 Project 3"

From dftwiki3
Line 8:
 
=The Big Picture=
 
 
<tanbox>
 
[[Image:cherries.jpg|right|50px]]
+ [[Image:cherriesXparent.gif|right|50px]]
 
Your project should present your answers to the following three questions:
 
 
* How should one attempt to process 5 million Wikipedia pages with MapReduce/Hadoop? What parameters control the execution time, and what is the best guess for the values they should be set to?

Revision as of 14:56, 17 April 2010


This project is an extension of Project #2, which is built on top of the Hadoop/MapReduce Tutorials. It is due on the last day of exams, at 4:00 p.m.


This section is only visible to computers located at Smith College