Difference between revisions of "CSC352 DT's Class Notes 2013"
__TOC__
<br />
<center>
'''PLEASE SEE [[CSC352_2017_DT%27s_Notes| THE 2017 NOTES PAGE FOR MORE INFO]]'''
</center>
<br />
==Threads==
* Good example with multiple ping processes: [http://www.wellho.net/solutions/python-python-threads-a-first-example.html]
* [[NQueens.py | NQueens.py]]
* [[threadedNQueens.py | threadedNQueens.py]]
* [[classExample1.py | classExample1.py]]
* [[classExample2.py | classExample2.py]]
* [[classExample3.py | classExample3.py]]
* [[serialPing.py | serialPing.py]]
* [[threadedPing.py | threadedPing.py]]
* [[searchKeywordsRetrieveEtexts.py | searchKeywordsRetrieveEtexts.py]]
* [[threadedSearchKeywordRetrieveEtexts.py | threadedSearchKeywordRetrieveEtexts.py]]
===Setting up documents and swish-e===
(Note: there are two other alternatives: Sphinx and Zend Lucene. Sphinx requires the data to be in XML form or in a MySQL database.)

* all operations are performed on xgridmac
* downloaded & installed swish-e. Install dir is ~thiebaut/research/swish-e/
* downloaded & installed www.etext.org in http://xgridmac.dyndns.org/~thiebaut/www_etext_org/
* swish-e index stored in ~thiebaut/Site/swish-e
* added [[www.etext.org_swish-e.php | swishe.php]] in ~thiebaut/Site/swish-e/
* test:
<code><pre>
cd
cd Site/swish-e
php swishe.php search=love
...
<br>
<br>rank: 20
<br>score: 809
<br>url: http://xgridmac.dyndns.org/~thiebaut/www_etext_org/Religious_357/Polyamory/Keys2LovingUnity.html
<br>link: <a href="http://xgridmac.dyndns.org/~thiebaut/www_etext_org/Religious_357/Polyamory/Keys2LovingUnity.html">link</a>
<br>file: Keys2LovingUnity.html
<br>offset: 47813
<br>
</pre></code>

* test on the Web at url http://xgridmac.dyndns.org/~thiebaut/swish-e/swishe.php?search=love
* test with delay: http://xgridmac.dyndns.org/~thiebaut/swish-e/swishe.php?delay=20&search='local%20government'
Here delay is the maximum number of tenths of a second to wait. It is an upper bound: the true delay is random, between 0.1 sec and delay × 0.1 sec.
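The two test URLs above can also be built from a script. The following is only a sketch: the helper names <code>build_search_url</code> and <code>delay_bounds</code> are made up for illustration and are not part of swishe.php, and the code assumes delay is an integer of at least 1. Note that it percent-encodes the query, so it does not reproduce the single quotes used in the second test URL.

```python
from urllib.parse import urlencode, quote

# Base URL copied from the test bullets above.
BASE_URL = "http://xgridmac.dyndns.org/~thiebaut/swish-e/swishe.php"


def build_search_url(query, delay=None):
    """Return the swishe.php URL for a query (hypothetical helper).

    delay, if given, is the bound expressed in tenths of a second.
    """
    params = {}
    if delay is not None:
        params["delay"] = int(delay)
    params["search"] = query
    # quote_via=quote encodes spaces as %20 rather than +
    return BASE_URL + "?" + urlencode(params, quote_via=quote)


def delay_bounds(delay):
    """Return (shortest, longest) wait in seconds for a delay value.

    Assumes delay >= 1, per the description in the notes.
    """
    return (0.1, delay / 10.0)
```

For instance, <code>build_search_url("local government", delay=20)</code> yields a URL whose request may take anywhere between the two values returned by <code>delay_bounds(20)</code>.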

==Project==

;Project 1:
:Threading in Python: given two lists of keywords, List1 and List2, retrieve docs from a site (xgridmac.dyndns.org, yahoo, google) that match List1. Filter the docs received and keep only those that contain most of the words in List2.
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
;Project 2:
:XGrid: process a gzipped XML dump of Wikipedia and break it up into individual pages (9 million or so of them)!
;Project 3:
:Map-Reduce: process Wikipedia pages and create an index of words and their associated categories
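As a rough illustration of Project 1, here is a minimal threaded sketch. It is not the code in threadedSearchKeywordRetrieveEtexts.py. Everything in it is an assumption made for the example: "most of the words" is taken to mean more than half, the function names are invented, and <code>fake_fetch</code> with its <code>FAKE_SITE</code> data stands in for a real request to the search page so that the sketch runs without a network.

```python
from concurrent.futures import ThreadPoolExecutor


def contains_most(text, keywords, threshold=0.5):
    """True if more than `threshold` of the keywords occur in text.

    Assumption: "most" means strictly more than half of List2.
    """
    words = set(text.lower().split())
    hits = sum(1 for k in keywords if k.lower() in words)
    return hits > threshold * len(keywords)


def search_and_filter(list1, list2, fetch, max_workers=4):
    """Fetch docs for each List1 keyword in parallel, filter with List2.

    fetch(keyword) must return a dict mapping url -> document text.
    """
    kept = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # one fetch per keyword, run concurrently by the thread pool
        for docs in pool.map(fetch, list1):
            for url, text in docs.items():
                if contains_most(text, list2):
                    kept[url] = text
    return kept


# Made-up data standing in for the real site (hypothetical).
FAKE_SITE = {
    "love": {"doc1.html": "love and peace in local government",
             "doc2.html": "love poems"},
    "war": {"doc3.html": "war and peace and government policy"},
}


def fake_fetch(keyword):
    """Stand-in for a network request; returns canned documents."""
    return FAKE_SITE.get(keyword, {})


if __name__ == "__main__":
    result = search_and_filter(["love", "war"],
                               ["peace", "government", "local"],
                               fake_fetch)
    print(sorted(result))
```

Threads suit this task because each fetch spends most of its time waiting on the network; the filtering step is cheap and is done in the main thread as results arrive.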
==Papers==
[[CSC352 Notes on A View From Berkeley| Notes]] on the "A View from Berkeley" paper
</onlydft>
[[Category:CSC352]][[Category:Class Notes]]