Difference between revisions of "CSC352 Project 3"

(Web Server)
 
(14 intermediate revisions by the same user not shown)
Line 1: Line 1:
 
 
__TOC__
 
__TOC__
  
 
<bluebox>
 
<bluebox>
This is the extension of [[CSC352_Project_2 | Project #2]], which is built on top of the [[http://cs.smith.edu/dftwiki/index.php/Hadoop/MapReduce_Tutorials| Hadoop/Mapreduce Tutorials]].  It is due on the last day of Exams, at 4:00 p.m.
+
This is the extension of [[CSC352_Project_2 | Project #2]], which is built on top of the [[Hadoop/MapReduce_Tutorials| Hadoop/MapReduce Tutorials]].  It is due on the last day of Exams, at 4:00 p.m.
 
</bluebox>
 
</bluebox>
  
 +
<onlysmith>
 
=The Big Picture=
 
=The Big Picture=
 +
{|
 +
|
 
<tanbox>
 
<tanbox>
 
Your project should present your answers to the following three questions:
 
Your project should present your answers to the following three questions:
Line 13: Line 15:
 
* How does this compare to the execution time of the 5 million pages on an XGrid system?
 
* How does this compare to the execution time of the 5 million pages on an XGrid system?
 
</tanbox>
 
</tanbox>
 +
|
 +
[[Image:cherriesXparent.gif|right|100px]]
 +
|}
 +
<br />
  
 
=Assignment (same as for the XGrid Project)=
 
=Assignment (same as for the XGrid Project)=
Line 21: Line 27:
 
* Measure the execution time of the program
 
* Measure the execution time of the program
 
* Write a summary of the approach, following the guidelines presented in class (3/9, 3/11).
 
* Write a summary of the approach, following the guidelines presented in class (3/9, 3/11).
* Submit a pdf with your presentation, graphs, and analysis. Submit your programs, even if they are the same as the files you submitted for previous homework or projects. 
+
* Submit a pdf with your presentation, graphs, and analysis.  
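
One possible way to collect the execution times asked for above is a small wall-clock wrapper around the job launch command. This is only a sketch, not a required method: the helper name <code>time_command</code> is hypothetical, and the command shown is a stand-in that you would replace with your own job launch line.

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd (a list of strings) and return (elapsed_seconds, return_code).

    Hypothetical helper: it measures wall-clock time, which includes
    startup and scheduling overhead, not just compute time.
    """
    start = time.perf_counter()
    result = subprocess.run(cmd)
    elapsed = time.perf_counter() - start
    return elapsed, result.returncode


if __name__ == "__main__":
    # Placeholder command (an assumption); substitute the real job launch line.
    cmd = [sys.executable, "-c", "print('job placeholder')"]
    elapsed, code = time_command(cmd)
    print(f"exit code {code}, wall-clock time {elapsed:.2f} s")
```

Running each configuration several times and reporting the average (or the minimum) usually gives more trustworthy numbers for the graphs than a single run.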
  
    submit project3 file1
+
=Project Details=
    submit project3 file2
 
    ...
 
 
 
:'''Note''': You cannot submit directories with the '''submit''' command.  If you want to submit the contents of a whole directory, then proceed as follows:
 
 
 
    cd ''theDirectoryWhereAllTheFilesYouWantToSubmitReside''
 
    tar -czvf  ''yourFirstNameProject3.tgz'' *
 
    submit ''yourFirstNameProject3.tgz''
 
  
=Project Details=
 
 
==Accessing Wiki Pages==
 
==Accessing Wiki Pages==
  
Line 125: Line 122:
  
 
</pre></code>
 
</pre></code>
 +
 +
You are free to copy additional wiki pages from the local disk of Hadoop6 into HDFS. If you do, put them in the '''wikipages''' directory, and update the README_dft.txt file in the HDFS wikipages directory to say what you added and how to access it.  Thanks!
  
 
===Web Server===
 
===Web Server===
Line 130: Line 129:
 
Of course, all the pages are still available on XGridMac, as they were for Project 2.  It is up to you to decide whether it is worth writing MapReduce programs that gather the pages from the Web rather than from HDFS.
 
Of course, all the pages are still available on XGridMac, as they were for Project 2.  It is up to you to decide whether it is worth writing MapReduce programs that gather the pages from the Web rather than from HDFS.
  
</onlysmith>
 
  
==Submission==
+
=Submission=
  
 
Submit a pdf (and additional files if needed) as follows:
 
Submit a pdf (and additional files if needed) as follows:
  
   submit project2 project2.pdf
+
   submit project3 project3.pdf
 +
 
 +
Submit your programs, even if they are the same as the files you submitted for previous homework or projects. 
 +
 
 +
    submit project3 file1
 +
    submit project3 file2
 +
    ...
 +
 
 +
:'''Note''': You cannot submit directories with the '''submit''' command.  If you want to submit the contents of a whole directory, then proceed as follows:
 +
 
 +
    cd ''theDirectoryWhereAllTheFilesYouWantToSubmitReside''
 +
    tar -czvf  ''yourFirstNameProject3.tgz'' *
 +
    submit project3  ''yourFirstNameProject3.tgz''
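
If you would rather build the archive from a script and check its contents before submitting, the same bundling step can be sketched with Python's standard <code>tarfile</code> module instead of the <code>tar</code> command. The function name <code>make_archive</code> is hypothetical. Like the shell <code>*</code> pattern, this sketch skips hidden files.

```python
import os
import tarfile


def make_archive(directory, archive_path):
    """Pack the visible entries of `directory` into a gzip-compressed tar file.

    Hypothetical helper: returns the member names stored in the archive
    so they can be checked before submitting.
    """
    archive_abs = os.path.abspath(archive_path)
    with tarfile.open(archive_path, "w:gz") as tar:
        for name in sorted(os.listdir(directory)):
            full = os.path.join(directory, name)
            # Skip dotfiles (as the shell's * does) and the archive itself.
            if name.startswith(".") or os.path.abspath(full) == archive_abs:
                continue
            tar.add(full, arcname=name)
    # Reopen the archive and list what was actually stored.
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()
```

The resulting file is still submitted with the '''submit''' command as shown above.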
 +
 
 +
=Extra Credits=
 +
 
 +
Extra credit will be given for work done on AWS.  This could be the whole project, sections of it, or just a comparison on some of the input sets.
 +
</onlysmith>
  
 
<br />
 
<br />
Line 145: Line 160:
 
<br />
 
<br />
 
<br />
 
<br />
[[Category:CSC352]][[Category:Projects]][[Category:XGrid]]
+
[[Category:CSC352]][[Category:Project]][[Category:MapReduce]][[Category:XGrid]]

Latest revision as of 13:07, 18 November 2010


This is the extension of Project #2, which is built on top of the Hadoop/MapReduce Tutorials. It is due on the last day of Exams, at 4:00 p.m.


This section is only visible to computers located at Smith College