Hadoop Tutorial 2.1 -- Streaming XML Files
Revision as of 21:13, 12 April 2010 by Thiebaut
This tutorial continues Hadoop Tutorial 2 -- Running WordCount in Python. It uses Hadoop streaming to process XML files as a block: each Map task receives a whole XML file and breaks it down into tuples.
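To make the idea concrete, here is a minimal sketch of what such a mapper could look like in Python. It assumes the job is configured so that the complete XML file arrives on the mapper's standard input, and it assumes a flat document whose root element has simple child elements. The function name `xml_to_tuples` and the tag names in the usage example are hypothetical and are not taken from the tutorial.

```python
#!/usr/bin/env python
# Sketch of a streaming mapper that treats its whole input as one XML file.
# Assumption: the entire file is delivered on stdin to a single Map task.
import sys
import xml.etree.ElementTree as ET


def xml_to_tuples(xml_text):
    """Parse one complete XML document and return a list of (tag, text) tuples,
    one per direct child of the root element (hypothetical flat layout)."""
    root = ET.fromstring(xml_text)
    tuples = []
    for child in root:
        # Collapse all whitespace: tabs and newlines inside a value would
        # break the tab-separated, one-record-per-line streaming format.
        text = " ".join((child.text or "").split())
        tuples.append((child.tag, text))
    return tuples


def main():
    xml_text = sys.stdin.read()
    if not xml_text.strip():
        return  # nothing to parse
    for key, value in xml_to_tuples(xml_text):
        # Streaming expects "key<TAB>value" lines on stdout.
        print("%s\t%s" % (key, value))


if __name__ == "__main__":
    main()
```

For example, `xml_to_tuples("<doc><title>Hello</title></doc>")` returns a single tuple pairing the `title` tag with its text. Parsing the whole document at once keeps the sketch short, but it only works if the file fits in the memory of one Map task.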
The Setup