Difference between revisions of "CSC352 Problem of the Day"

From dftwiki3
[[Category:CSC352]][[Category:Hadoop]][[Category:MapReduce]]

Revision as of 08:17, 27 April 2010

Homework #4, Problem #2

  • Conditions:
    • Processing of wiki pages with Hadoop on a 6-PC cluster
    • The same Mapper and Reducer programs are used to process two different input folders



Number of files   Number of wiki pages   Number of categories   Execution time (seconds)
589               589                    832                    388
1                 117,617                51,120                 30.7
Ratio = 589/1     Ratio = 1/199.7        Ratio = 1/61.4         Ratio = 12.6/1
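
As a starting point for the discussion, the ratios in the last row can be recomputed directly from the two measured rows. The short Python sketch below does this and also derives one possible extra quantity, pages processed per second. The names `runs`, `many_small_files`, `one_large_file`, `ratio` and `pages_per_second` are hypothetical labels chosen for illustration, and the derived throughput is only one candidate for an additional column, not necessarily the intended answer.

```python
# Figures copied from the table above; the dictionary keys are hypothetical labels.
runs = {
    "many_small_files": {"files": 589, "pages": 589, "categories": 832, "seconds": 388.0},
    "one_large_file": {"files": 1, "pages": 117_617, "categories": 51_120, "seconds": 30.7},
}


def ratio(key):
    """Ratio of the many-files run to the single-file run for one column."""
    return runs["many_small_files"][key] / runs["one_large_file"][key]


def pages_per_second(run):
    """Derived throughput: wiki pages processed per second of execution time."""
    return run["pages"] / run["seconds"]


if __name__ == "__main__":
    # Recompute the ratio row of the table.
    for key in ("files", "pages", "categories", "seconds"):
        print(f"{key:>10}: ratio = {ratio(key):.4g}")
    # One candidate for an additional column (an assumption, not from the source).
    for name, run in runs.items():
        print(f"{name}: {pages_per_second(run):.1f} pages/s")
```

Expressing the measurements per page or per file, rather than as totals, makes the two runs directly comparable even though their input sizes differ by two orders of magnitude.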



  • Discuss these results
  • If you were to add another column to this table, what quantity would you add?
  • Identify the parties responsible for this surprising difference