Difference between revisions of "CSC352 DT's Class Notes 2013"

From dftwiki3
 
* test with delay: http://xgridmac.dyndns.org/~thiebaut/swish-e/swishe.php?delay=20&search='local%20government'
 
Here, delay is the wait time in tenths of a second. It is an upper bound: the actual delay is random, between 0.1 sec and the specified integer times 0.1 sec.
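As a rough illustration of the delay parameter, here is a minimal Python sketch. It builds the test URL shown above and mimics the delay the server might pick. The function names are made up for this example, and the uniform distribution is an assumption: the notes only say the delay is random within the stated bounds.

```python
import random
from urllib.parse import urlencode, quote

# Test server URL taken from the notes above.
BASE_URL = "http://xgridmac.dyndns.org/~thiebaut/swish-e/swishe.php"


def build_query_url(search, delay=None):
    """Build a swish-e query URL (hypothetical helper, not part of the site)."""
    params = {}
    if delay is not None:
        params["delay"] = delay          # delay is given in tenths of a second
    params["search"] = search
    # quote_via=quote encodes spaces as %20, as in the example URL
    return BASE_URL + "?" + urlencode(params, quote_via=quote)


def sample_delay_seconds(delay, rng=random):
    """Mimic the server-side wait: random between 0.1 s and delay/10 s.

    Assumption: the draw is uniform; the notes do not say which
    distribution the server actually uses.
    """
    upper = max(delay, 1) / 10.0         # never go below the 0.1 s floor
    return rng.uniform(0.1, upper)


if __name__ == "__main__":
    print(build_query_url("'local government'", delay=20))
    print("simulated wait: %.2f s" % sample_delay_seconds(20))
```

With delay=20, the simulated wait falls somewhere between 0.1 and 2.0 seconds.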
==Project==

;Project 1:
:Threading in Python: given two lists of keywords, List1 and List2, retrieve the documents from a site (xgridmac.dyndns.org, Yahoo, Google) that match List1. Then filter the documents received and keep only those that contain most of the words in List2.
;Project 2:
:XGrid: process a gzipped XML dump of Wikipedia and break it up into individual pages (9 million or so of them)!
;Project 3:
:Map-Reduce: process Wikipedia pages and create an index of words and their associated categories.
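The fetch-and-filter idea of Project 1 might be sketched as follows. This is only an outline under stated assumptions: fetch_docs is a hypothetical function supplied by the caller (for example, a wrapper around a search request), "most of the words" is taken to mean more than half, and keywords in List2 are treated as single words.

```python
import re
import threading
from queue import Queue


def contains_most(text, keywords, threshold=0.5):
    """True if more than `threshold` of the keywords occur in text.

    Assumption: "most" means more than half; keywords are single words.
    """
    words = set(re.findall(r"\w+", text.lower()))
    hits = sum(1 for k in keywords if k.lower() in words)
    return hits > threshold * len(keywords)


def search_and_filter(list1, list2, fetch_docs):
    """Start one thread per List1 keyword and keep docs matching most of List2.

    fetch_docs(keyword) is hypothetical: it must return a list of
    document strings for that keyword.
    """
    results = Queue()                    # thread-safe collection of kept docs

    def worker(keyword):
        for doc in fetch_docs(keyword):
            if contains_most(doc, list2):
                results.put((keyword, doc))

    threads = [threading.Thread(target=worker, args=(k,)) for k in list1]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                         # wait until every fetch is done

    kept = []
    while not results.empty():
        kept.append(results.get())
    return kept


if __name__ == "__main__":
    # Stand-in data instead of a real network request.
    fake = {"a": ["red green blue", "red only"], "b": ["green blue yellow"]}
    print(search_and_filter(["a", "b"], ["red", "green", "blue"], fake.__getitem__))
```

Passing the fetch function in as a parameter keeps the threading logic separate from the choice of site, so the same code can be tried against canned data before pointing it at a real server.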
  
 
==Papers==
 

Revision as of 14:42, 24 January 2010

DFT Class Notes for CSC352


...