Difference between revisions of "CSC352 DT's Class Notes 2013"

(DFT Class Notes for CSC352)
 
(16 intermediate revisions by the same user not shown)
Line 4: Line 4:
 
__TOC__
 
  
 +
<br />
 +
<center>
 +
'''PLEASE SEE [[CSC352_2017_DT%27s_Notes| THE 2017 NOTES PAGE FOR MORE INFO]]'''
 +
</center>
 +
<br />
 
==Threads==
 
* good example with multiple ping processes: http://www.wellho.net/solutions/python-python-threads-a-first-example.html
+
* good example with multiple ping processes: [http://www.wellho.net/solutions/python-python-threads-a-first-example.html]
 +
* multi-core not used by python [http://smoothspan.wordpress.com/2007/09/14/guido-is-right-to-leave-the-gil-in-python-not-for-multicore-but-for-utility-computing/]
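
A minimal sketch of the threaded-ping idea from the first link, assuming a Unix-style <code>ping -c 1</code> command. The names <code>ping_host</code> and <code>ping_all</code> are hypothetical and are not taken from threadedPing.py. Each thread spends its time waiting on an external ping process, so the threads overlap even though Python itself does not run them on several cores.

```python
import subprocess
import threading


def ping_host(host):
    """Return True if one ping to host succeeds (assumes Unix-style 'ping -c 1')."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=5,
        )
    except (OSError, subprocess.TimeoutExpired):
        # ping is missing, not permitted, or took too long
        return False
    return result.returncode == 0


def ping_all(hosts, probe=ping_host):
    """Run probe(host) for every host, one thread per host; return {host: result}."""
    results = {}
    lock = threading.Lock()

    def worker(host):
        outcome = probe(host)
        with lock:  # protect the shared dictionary
            results[host] = outcome

    threads = [threading.Thread(target=worker, args=(h,)) for h in hosts]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait until every probe has finished
    return results
```

Calling <code>ping_all(["127.0.0.1", "localhost"])</code> returns a dictionary mapping each host to True or False. The <code>probe</code> parameter is only there so the threading logic can be tried without a network.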
 +
 
 
===Programs===
 
 
* [[NQueens.py | NQueens.py]]
 
 
* [[threadedNQueens.py | threadedNQueens.py ]]
 
 +
* [[classExample1.py | classExample1.py ]]
 +
* [[classExample2.py | classExample2.py ]]
 +
* [[classExample3.py | classExample3.py ]]
 
* [[serialPing.py | serialPing.py ]]
 
 
* [[threadedPing.py | threadedPing.py ]]
 
 +
* [[searchKeywordsRetrieveEtexts.py | searchKeywordsRetrieveEtexts.py ]]
 +
* [[threadedSearchKeywordRetrieveEtexts.py | threadedSearchKeywordRetrieveEtexts.py ]]
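
The threaded keyword search corresponds to Project 1 (retrieve documents matching List1, keep those containing most of the words in List2). The sketch below covers only the filtering half, assuming the List1 search has already produced a list of URLs. The function names are hypothetical, and reading "most of the words" as "more than half" is an assumption, not something the notes specify.

```python
import threading
import urllib.request


def fetch_text(url):
    """Download url and return its body as text (undecodable bytes are replaced)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def contains_most(text, keywords, threshold=0.5):
    """True if more than `threshold` of the keywords occur in text (case-insensitive).

    The 0.5 threshold is an assumed reading of "most of the words".
    """
    if not keywords:
        return True
    lowered = text.lower()
    hits = sum(1 for word in keywords if word.lower() in lowered)
    return hits / len(keywords) > threshold


def filter_docs(urls, list2, fetch=fetch_text):
    """Fetch every URL in its own thread; keep those whose text matches most of list2."""
    kept = []
    lock = threading.Lock()

    def worker(url):
        try:
            text = fetch(url)
        except OSError:
            return  # skip documents that cannot be retrieved
        if contains_most(text, list2):
            with lock:
                kept.append(url)

    threads = [threading.Thread(target=worker, args=(u,)) for u in urls]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(kept)
```

One thread per document is the simplest layout; for a large result list a fixed-size pool of workers would probably be a better fit.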
 +
 +
===Setting up documents and swish-e===
 +
(Note: there are two alternatives to swish-e: Sphinx and Zend Lucene.  Sphinx requires the data in XML form or in a MySQL database.)
 +
 +
* all operations on xgridmac
 +
* downloaded & installed swish-e.  Install dir is ~thiebaut/research/swish-e/
 +
* downloaded & installed www.etext.org in http://xgridmac.dyndns.org/~thiebaut/www_etext_org/
 +
* swish-e index stored in ~thiebaut/Site/swish-e
 +
* added [[www.etext.org_swish-e.php | swishe.php]] in ~thiebaut/Site/swish-e/
 +
* test:
 +
<code><pre>
 +
  cd
 +
  cd Site/swish-e
 +
  php swishe.php search=love
 +
  ...
 +
<br>
 +
<br>rank:  20
 +
<br>score:  809
 +
<br>url:    http://xgridmac.dyndns.org/~thiebaut/www_etext_org/Religious_357/Polyamory/Keys2LovingUnity.html
 +
<br>link:  <a href="http://xgridmac.dyndns.org/~thiebaut/www_etext_org/Religious_357/Polyamory/Keys2LovingUnity.html">link</a>
 +
<br>file:  Keys2LovingUnity.html
 +
<br>offset: 47813
 +
<br>
 +
</pre></code>
  
 +
* test on Web at url http://xgridmac.dyndns.org/~thiebaut/swish-e/swishe.php?search=love
 +
* test with delay: http://xgridmac.dyndns.org/~thiebaut/swish-e/swishe.php?delay=20&search='local%20government'
 +
(Here delay is an upper bound on the wait, in tenths of a second: the actual delay is random, between 0.1 second and delay × 0.1 seconds.)
 +
 
 +
==Project==
 +
 
 +
;Project 1:
 +
:Threading in Python: given two lists of keywords, List1 and List2, retrieve docs from a site (xgridmac.dyndns.org, yahoo, google) that match List1.  Filter the docs received and keep only those that contain most of the words in List2.
 
==Papers==
===The Landscape of Parallel Computing Research: A View from Berkeley===
----
* good intro paper.
* 3 sections
** software
** hardware
** performance/programming models
** conclusion
 
* probably cover the 1st section (with the dwarfs) plus the conclusion, although the whole paper is good
 
----
 
* General introduction to the world of parallel computing in 2006
 
* old wisdom/new wisdom ==> the world of parallel computing is changing
 
* Moore's law
 
:::"Key to this approach is a layer of libraries and programming frameworks centered on the 13 computational bottlenecks ("dwarfs") that we identified in the original Berkeley View report."
 
* Page 7: we have been concentrating on massively parallel architectures, but this may apply to new forms of parallelism
 
* Page 7: good match between scientific parallel approaches and current parallel (non-scientific) domains
 
* Page 7: 7 Dwarfs
 
** ==> question: what's a dwarf? (google berkeley + dwarf)
 
** high level of abstraction
 
* Page 11: 3 new areas for adoption as dwarf?
 
** machine learning
 
** database software
 
*** Page 12: map-reduce
 
** computer graphics
 
* Page 14: introduces 6 new dwarfs
 
* Page 19: End of software Dwarf section ==> Good place to break
 
* Page 20: beginning of hardware
 
** processor
 
** memory
 
** network
 
* Page 32: Programming models
 
** Does not include threading
 
** MPI
 
** Map-Reduce
 
* Page 36: system software
 
** '''Autotuner'''
 
* Page 40: virtual machines to the rescue
 
* Page 44: conclusion
 
  
 +
;Project 2:
 +
:XGrid: process a gzip xml dump of wikipedia and break it up into individual pages (9 million or so of them)!
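
The splitting step of Project 2 could start from something like the sketch below, which streams a gzip-compressed dump and yields one page at a time. It assumes the usual MediaWiki export layout (<code>page</code> elements containing a <code>title</code> and a revision <code>text</code>). The function names are hypothetical, and distributing the pages over XGrid is not shown.

```python
import gzip
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(source):
    """Yield (title, text) for each <page> in a gzip-compressed XML dump.

    `source` is a file name or a binary file object. Each page element is
    cleared after use to limit memory on a large dump.
    """
    with gzip.open(source, "rb") as stream:
        for _event, elem in ET.iterparse(stream, events=("end",)):
            if local_name(elem.tag) != "page":
                continue
            title, text = "", ""
            for child in elem.iter():
                name = local_name(child.tag)
                if name == "title":
                    title = child.text or ""
                elif name == "text":
                    # with several revisions, the last one seen wins
                    text = child.text or ""
            yield title, text
            elem.clear()
```

Because <code>iter_pages</code> is a generator, a caller can write each page to its own file as it arrives instead of holding the whole dump in memory.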
  
 +
;Project 3:
 +
:Map-Reduce: process wikipedia pages and create an index of words and their associated categories
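
The word/category index of Project 3 can be illustrated on a single machine before moving to a real Map-Reduce framework. In this sketch the map step emits (word, category) pairs for one page and the reduce step groups them by word. The function names and the regular expression used to find <code><nowiki>[[Category:...]]</nowiki></code> links are assumptions made for the example.

```python
import re
from collections import defaultdict

# Matches [[Category:Name]] and [[Category:Name|sort key]]; captures Name.
CATEGORY_RE = re.compile(r"\[\[Category:([^\]|]+)(?:\|[^\]]*)?\]\]")
WORD_RE = re.compile(r"[a-z]+")


def map_page(text):
    """Map step: emit (word, category) pairs for one page of wikitext."""
    categories = [c.strip() for c in CATEGORY_RE.findall(text)]
    body = CATEGORY_RE.sub("", text).lower()  # do not index the category links
    for word in set(WORD_RE.findall(body)):
        for category in categories:
            yield word, category


def reduce_pairs(pairs):
    """Reduce step: group the emitted pairs into {word: sorted list of categories}."""
    index = defaultdict(set)
    for word, category in pairs:
        index[word].add(category)
    return {word: sorted(cats) for word, cats in index.items()}


def build_index(pages):
    """Run the map step over every page, then reduce all pairs into one index."""
    pairs = (pair for text in pages for pair in map_page(text))
    return reduce_pairs(pairs)
```

In a distributed run the map calls would execute on different nodes and the framework would route all pairs with the same word to one reducer; here a single dictionary plays that role.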
  
 +
==Papers==
  
 +
[[CSC352 Notes on A View From Berkeley| Notes]] on the ''A View from Berkeley'' paper
  
  
 
</onlydft>
 
</onlydft>
 +
[[Category:CSC352]][[Category:Class Notes]]

Latest revision as of 15:01, 7 December 2016



...