Difference between revisions of "CSC352 MapReduce/Hadoop Class Notes"
Line 394: | Line 394: | ||
* This is illustrated and explained in Section 4 of [[Hadoop_Tutorial_1_--_Running_WordCount#Running_Your_Own_Version_of_WordCount.java Tutorial #1 | Tutorial #1: Compiling your own version of WordCount.java]] | * This is illustrated and explained in Section 4 of [[Hadoop_Tutorial_1_--_Running_WordCount#Running_Your_Own_Version_of_WordCount.java Tutorial #1 | Tutorial #1: Compiling your own version of WordCount.java]] | ||
− | =How does Hadoop compare on | + | =How does Hadoop on 6 compare to Linux on 1?= |
* This is very interesting! | * This is very interesting! | ||
Line 425: | Line 425: | ||
<br /> | <br /> | ||
<br /> | <br /> | ||
+ | |||
+ | =Debugging/Testing using Counters= | ||
+ | |||
+ | Section 6 of [[Hadoop_Tutorial_1_--_Running_WordCount#Counters | Tutorial #1]] shows how to create counters. Hadoop counters are special variables whose values are gathered from each task as it runs, accumulated across all tasks, and reported both during the computation and at its end. They are useful for counting quantities such as the amount of data processed or the number of tasks executed. | ||
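The accumulation described above can be sketched in plain Java, without any Hadoop dependency. This is only an illustration of the idea, not the Hadoop implementation: the class <code>CounterSketch</code>, the enum <code>TaskCounter</code>, and the methods <code>runTask</code>, <code>accumulate</code>, and <code>simulateJob</code> are all hypothetical names made up for this example. In a real job the framework does the gathering and summing for you; see the tutorial for the actual counter calls.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class CounterSketch {

    // Hypothetical counter names for this sketch; they are not part of Hadoop.
    enum TaskCounter { MAP_TASKS, REDUCE_TASKS }

    // Each simulated task keeps its own local counts and bumps one counter once,
    // the way a real task would increment a counter when it starts.
    static Map<TaskCounter, Long> runTask(TaskCounter kind) {
        Map<TaskCounter, Long> local = new EnumMap<>(TaskCounter.class);
        local.merge(kind, 1L, Long::sum);
        return local;
    }

    // Stand-in for the framework: add every task's local values into job-wide totals.
    static Map<TaskCounter, Long> accumulate(List<Map<TaskCounter, Long>> perTask) {
        Map<TaskCounter, Long> totals = new EnumMap<>(TaskCounter.class);
        for (Map<TaskCounter, Long> local : perTask) {
            for (Map.Entry<TaskCounter, Long> e : local.entrySet()) {
                totals.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return totals;
    }

    // Simulate a job with the given numbers of map and reduce tasks.
    static Map<TaskCounter, Long> simulateJob(int mapTasks, int reduceTasks) {
        List<Map<TaskCounter, Long>> perTask = new ArrayList<>();
        for (int i = 0; i < mapTasks; i++) {
            perTask.add(runTask(TaskCounter.MAP_TASKS));
        }
        for (int i = 0; i < reduceTasks; i++) {
            perTask.add(runTask(TaskCounter.REDUCE_TASKS));
        }
        return accumulate(perTask);
    }

    public static void main(String[] args) {
        // Report the accumulated totals at the end of the simulated job.
        System.out.println(simulateJob(4, 2));
    }
}
```

Because each task only touches its own local map, no task needs to know about the others; the totals exist only after the per-task values are summed, which is why counters suit debugging questions such as "how many map tasks actually ran?".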
+ | |||
+ | <br /> | ||
+ | <br /> | ||
+ | <greenbox> | ||
+ | [[Image:ComputerLogo.png|right |100px]] | ||
+ | ;Lab Experiment #4: | ||
+ | : [[Hadoop_Tutorial_1_--_Running_WordCount#Counters | Tutorial 1 on Counters]]. Create counters in your Java version of WordCount and count the number of Map tasks and the number of Reduce tasks. | ||
+ | |||
+ | </greenbox> | ||
+ | <br /> | ||
+ | <br /> | ||
+ | |||
=Running WordCount in Python= | =Running WordCount in Python= | ||
Line 432: | Line 448: | ||
<greenbox> | <greenbox> | ||
[[Image:ComputerLogo.png|right |100px]] | [[Image:ComputerLogo.png|right |100px]] | ||
− | ;Lab Experiment # | + | ;Lab Experiment #5: |
: [[Hadoop Tutorial 2 -- Running WordCount in Python | Tutorial 2]] on running Python programs with MapReduce/Hadoop. | : [[Hadoop Tutorial 2 -- Running WordCount in Python | Tutorial 2]] on running Python programs with MapReduce/Hadoop. | ||
Latest revision as of 08:16, 6 April 2010