Difference between revisions of "CSC111 Lab 8 2014"


Latest revision as of 14:43, 27 March 2014

--D. Thiebaut (talk) 14:01, 24 March 2014 (EDT)


This lab deals with string and list operations, and with transforming strings into lists and lists into strings.


Splitting Strings


Work in the console, and try these different commands. Observe what the different operations do.

>>> line = "The quick, red fox jumped.  It jumped over the lazy, sleepy, brown dog."
>>> line

>>> line.split()
>>> words = line.split()
>>> words

>>> words[0]

>>> words[1]

>>> words[-1]

>>> words[-2]

>>> chunks = line.split( ',' )      # split on commas
>>> chunks

>>> chunks = line.split( '.' )      # split on periods
>>> chunks

>>> words

>>> separator = "+"
>>> newLine = separator.join( words )    # join the words into a new string and use separator as the glue
>>> newLine

>>> separator = "$$$"
>>> newLine = separator.join( words )    # same but use $$$ as the glue
>>> newLine

>>> words       # verify that you still have individual words in this list

>>> newWords = [ words[0], words[3], words[4], words[7], words[8], words[12] ] # create a new list 
>>> newWords

>>> " ".join( newWords )      # join strings in newWords list with a space
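A detail worth noticing before the challenges: split() with no argument cuts on any run of whitespace, tabs included, while split( ' ' ) cuts at every single space. A short sketch with a made-up string (the variable names are only for illustration):

```python
sample = "red\tgreen  blue"            # hypothetical string: one tab, then two spaces

anyWhitespace = sample.split()         # cuts on the tab and on the run of spaces
singleSpace = sample.split( " " )      # cuts at each space only, the tab survives

print( anyWhitespace )
print( singleSpace )
print( "-".join( anyWhitespace ) )     # glue the clean pieces back together
```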

Mini Assignments


The solution program for the Exercises we saw in class on Monday and Wednesday contains good models of code that can be used to answer most of the challenges in this lab.

Use the format of the program written for the exercises on lists as a model for how to format your own program, with a main() function and individual functions for the challenges.


Challenge 1

  • Use a judicious mix of split() and join() operations to convert the string
"1	China	1,339,190,000	9,596,960.00	139.54	3,705,405.45	361.42"
into a new string:
"China 1339190000"
Note 1: Notice the lack of commas in the number! (Hint: string objects have a replace() method that could prove useful here!)
Note 2: This line is taken from a table at http://www.worldatlas.com/aatlas/populations/ctypopls.htm, where the numbers after the country name indicate a) the population, b) the area in square kilometers, c) the population density per square kilometer, d) the area in square miles, and e) the population density per square mile.
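One possible approach, sketched under the assumption that the fields are separated by whitespace (tabs or spaces): split the line into fields, keep the name and the population, strip the commas with replace(), and join the two back together. The function name nameAndPopulation is made up for this sketch.

```python
def nameAndPopulation( line ):
    """Return "country population" for a line of the form
    "rank country population area ..." (illustrative helper)."""
    fields = line.split()                       # split on any whitespace, tabs included
    country = fields[1]                         # fields[0] is the rank
    population = fields[2].replace( ",", "" )   # "1,339,190,000" becomes "1339190000"
    return " ".join( [country, population] )

line = "1 China 1,339,190,000 9,596,960.00 139.54 3,705,405.45 361.42"
print( nameAndPopulation( line ) )
```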




Challenge 2

  • Store the following list in a multi-line string variable called text, split it into individual lines, and apply your transformation to each line so that your program outputs only the country names and their populations.
 Bangladesh	164,425,000	144,000.00	1,141.84	55,598.69	2,957.35
 Brazil	193,364,000	8,511,965.00	22.72	3,286,486.71	58.84
 China	1,339,190,000	9,596,960.00	139.54	3,705,405.45	361.42
 Egypt	78,848,000	1,001,450.00	78.73	386,661.85	203.92
 Ethiopia	79,221,000	1,127,127.00	70.29	435,185.99	182.04
 Germany	81,757,600	357,021.00	229.00	137,846.52	593.11
 India	1,184,639,000	3,287,590.00	360.34	1,269,345.07	933.27
 Indonesia	234,181,400	1,919,440.00	122.01	741,099.62	315.99
 Iran	75,078,000	1,648,000.00	45.56	636,296.10	117.99
 Japan	127,380,000	377,835.00	337.13	145,882.85	873.17
 Mexico	108,396,211	1,972,550.00	54.95	761,605.50	142.33
 Nigeria	170,123,000	923,768.00	171.32	356,668.67	443.71
 Pakistan	170,260,000	803,940.00	211.78	310,402.84	548.51
 Phillipines	94,013,200	300,000.00	313.38	115,830.60	811.64
 Russia	141,927,297	17,075,200.00	8.31	6,592,768.87	21.53
 United-States	309,975,000	9,629,091.00	32.19	3,717,811.29	83.38
 Vietnam	85,789,573	329,560.00	260.32	127,243.78	674.21


Your first variable should be text, defined as follows:


text = """ Bangladesh	164,425,000	144,000.00	1,141.84	55,598.69	2,957.35
 Brazil	193,364,000	8,511,965.00	22.72	3,286,486.71	58.84
 China	1,339,190,000	9,596,960.00	139.54	3,705,405.45	361.42
 Egypt	78,848,000	1,001,450.00	78.73	386,661.85	203.92
 Ethiopia	79,221,000	1,127,127.00	70.29	435,185.99	182.04
 Germany	81,757,600	357,021.00	229.00	137,846.52	593.11
 India	1,184,639,000	3,287,590.00	360.34	1,269,345.07	933.27
 Indonesia	234,181,400	1,919,440.00	122.01	741,099.62	315.99
 Iran	75,078,000	1,648,000.00	45.56	636,296.10	117.99
 Japan	127,380,000	377,835.00	337.13	145,882.85	873.17
 Mexico	108,396,211	1,972,550.00	54.95	761,605.50	142.33
 Nigeria	170,123,000	923,768.00	171.32	356,668.67	443.71
 Pakistan	170,260,000	803,940.00	211.78	310,402.84	548.51
 Phillipines	94,013,200	300,000.00	313.38	115,830.60	811.64
 Russia	141,927,297	17,075,200.00	8.31	6,592,768.87	21.53
 United-States	309,975,000	9,629,091.00	32.19	3,717,811.29	83.38
 Vietnam	85,789,573	329,560.00	260.32	127,243.78	674.21"""
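A minimal sketch of the line-by-line loop, using a shortened three-country sample rather than the full table; sample and countryLines are names invented here. Unlike the Challenge 1 string, these lines have no leading rank, so the country is the first field.

```python
sample = """ Bangladesh 164,425,000 144,000.00 1,141.84 55,598.69 2,957.35
 Brazil 193,364,000 8,511,965.00 22.72 3,286,486.71 58.84
 China 1,339,190,000 9,596,960.00 139.54 3,705,405.45 361.42"""

def countryLines( text ):
    """Return a list of "country population" strings, one per input line."""
    result = []
    for line in text.split( "\n" ):             # one table row per line
        fields = line.split()                   # split() also drops the leading space
        if len( fields ) < 2:                   # skip blank lines, if any
            continue
        result.append( fields[0] + " " + fields[1].replace( ",", "" ) )
    return result

for entry in countryLines( sample ):
    print( entry )
```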




Challenge 3

  • Take your solution for Challenge 2 and make it output the country with the largest population.
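One way to approach this without sorting is to remember the largest value seen so far while looping. The sketch below assumes the (country, population) pairs have already been extracted and converted to int; pairs and largest are illustrative names, and the three rows are taken from the table above.

```python
def largest( pairs ):
    """Return the (name, value) pair with the largest value."""
    bestName, bestValue = pairs[0]              # start with the first pair
    for name, value in pairs[1:]:
        if value > bestValue:                   # found a new maximum
            bestName, bestValue = name, value
    return bestName, bestValue

pairs = [ ("Egypt", 78848000), ("China", 1339190000), ("India", 1184639000) ]
name, population = largest( pairs )
print( name, "has the largest population of", population )
```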








Challenge 4

  • Same as Challenge 3, but this time make your program output the country with the largest population density.
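The density column needs one extra step: values such as 1,141.84 contain both a comma and a decimal point, so the comma has to be removed before calling float(). A sketch, assuming the per-square-kilometer density is the fourth field of each line as in the table above (densityOf is a made-up helper name):

```python
def densityOf( line ):
    """Return (country, density per square kilometer) for one table line."""
    fields = line.split()
    density = float( fields[3].replace( ",", "" ) )   # "1,141.84" becomes 1141.84
    return fields[0], density

rows = [ " Bangladesh 164,425,000 144,000.00 1,141.84 55,598.69 2,957.35",
         " India 1,184,639,000 3,287,590.00 360.34 1,269,345.07 933.27" ]

best = densityOf( rows[0] )
for row in rows[1:]:
    candidate = densityOf( row )
    if candidate[1] > best[1]:                  # keep the denser of the two
        best = candidate
print( best[0], "has the highest population density of", best[1] )
```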










Sorting Lists, Reversing Lists, Finding the Min or Max of a List


Enter the different commands below in the console, and observe how Python executes each line.

>>> seven = [ "Sleepy", "Sneezy", "Bashful", "Happy", "Grumpy", "Dopey", "Doc" ]
>>> seven.sort()
>>> seven

>>> seven.reverse()
>>> seven

>>> nums = [0, 10, -200, 3, 4, 100]
>>> nums.sort()
>>> nums

>>> nums.reverse()
>>> nums
 
>>> min( nums )

>>> max( nums )


>>> dwarvesHeight = [('Doc', 2), ('Dopey', 6), ('Grumpy', 4.5), ('Happy', 7),('Bashful', 3)]
>>> dwarvesHeight.sort()
>>> dwarvesHeight

>>> heightDwarves = []
>>> for pair in dwarvesHeight:
	      name = pair[0]
	      height = pair[1]
	      heightDwarves.append( (height, name ) )

	
>>> heightDwarves

>>> heightDwarves.sort()
>>> heightDwarves

>>> heightDwarves.reverse()
>>> heightDwarves

>>> min( heightDwarves )

>>> max( heightDwarves )

>>> 
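The swap performed in the loop above matters because Python compares tuples element by element, starting with the first one: a list of (height, name) tuples sorts by height, while a list of (name, height) tuples sorts alphabetically. A small sketch with made-up values:

```python
byName = [ ("Doc", 2), ("Bashful", 3), ("Dopey", 6) ]   # hypothetical (name, height) pairs
byHeight = []
for name, height in byName:
    byHeight.append( (height, name) )           # put the sort key first

byName.sort()      # alphabetical order of the names
byHeight.sort()    # numerical order of the heights

print( byName )
print( byHeight )
print( min( byHeight ), max( byHeight ) )
```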


Challenge 5

  • Make your program use the original text variable and store the pairs (population, country name) in a list.
  • Using sorting, reversing, min, or max, make your program output the country with the smallest population, nicely formatted (i.e. no parentheses or commas printed). Your solution cannot be the same as the function you wrote for Challenge 3 or 4.
  • Similarly, make your program output the country with the largest population.
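A sketch of the min()/max() route, assuming the list of (population, country name) pairs has already been built with the populations converted to int; the three pairs are taken from the table, and popPairs is an illustrative name. Unpacking each tuple before printing avoids the parentheses and commas.

```python
popPairs = [ (164425000, "Bangladesh"), (75078000, "Iran"), (1339190000, "China") ]

smallestPop, smallestName = min( popPairs )     # tuples compare on population first
largestPop, largestName = max( popPairs )

print( smallestName, "has the smallest population of", smallestPop )
print( largestName, "has the largest population of", largestPop )
```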








Challenge 6

  • Make your program output the list of countries and population sorted from largest population to smallest population. The information should show the country first on each line, followed by its population.
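A possible sketch, again assuming a ready-made list of (population, country name) pairs: sort, reverse, then print the name before the number. The column widths (30 and 10) are a guess based on the reference output at the end of this page, and format() is only one of several ways to align the columns.

```python
popPairs = [ (164425000, "Bangladesh"), (1339190000, "China"), (75078000, "Iran") ]

popPairs.sort()       # smallest population first
popPairs.reverse()    # now largest population first

lines = []
for population, name in popPairs:
    # right-justify the name in 30 columns and the number in 10
    lines.append( "{0:>30} {1:>10}".format( name, population ) )

print( "\n".join( lines ) )
```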









Processing DNA Strings


A DNA string is a string composed of sequences of four nucleobases (guanine, adenine, thymine, and cytosine) represented by the letters G, A, T, and C. Assume that we have a DNA string defined as follows:

AGCCTTCTAAGGTTAATTAACTCGAGAGAGGGTTGGCGCAGTTAAAGGCCTTAATCGGTTCTGT

Figure out a way in Python to extract the string that lies between the two AAGG markers. In other words, create a variable called DNA equal to the string above, then use the string methods we've seen so far to isolate the string between the markers and print it.
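One possibility, sketched here with split(): cutting the string at every AAGG marker leaves the wanted sequence as the second piece. This assumes the marker occurs exactly twice; find() combined with slicing would work as well.

```python
DNA = "AGCCTTCTAAGGTTAATTAACTCGAGAGAGGGTTGGCGCAGTTAAAGGCCTTAATCGGTTCTGT"
marker = "AAGG"

pieces = DNA.split( marker )    # [before first marker, between markers, after second marker]
between = pieces[1]
print( "Sequence between", marker, "markers =", between )
```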



Challenge 7

  • Assume that DNA now is a multi-line string defined as follows:


DNA = """AGCCTTCTAGCGTTAATTAACTCGAGAGAGGGTTGGCGCAGTTACCTTAATCGGTTCTGT
     TCCTGAGCGAAAGGGCTCAAGCACCTGTTACCTCTGTGATAACGCCAGAGTAACTCGAGC
     AAAGACAAGGGAAGCTCTAACCATGTCCGAGACAAGTTGTCTAGCAGTCCCAGTTCACACTTG
     ACAATCTACAAATTAGAGCACGGATCATTTACAGGCCAATCTGGCGCGTTAATCGA
     TTTCCGCAAACCGCCATGCTGCATCATTACGGGAACCACACGCCGGAAGCAGGAACAGCA"""

(It might be easier to copy/paste the string when it is formatted as shown below.)

DNA = """AGCCTTCTAGCGTTAATTAACTCGAGAGAGGGTTGGCGCAGTTACCTTAATCGGTTCTGT      
TCCTGAGCGAAAGGGCTCAAGCACCTGTTACCTCTGTGATAACGCCAGAGTAACTCGAGC   
AAAGACAAGGGAAGCTCTAACCATGTCCGAGACAAGTTGTCTAGCAGTCCCAGTTCACACTTG      
ACAATCTACAAATTAGAGCACGGATCATTTACAGGCCAATCTGGCGCGTTAATCGA     
TTTCCGCAAACCGCCATGCTGCATCATTACGGGAACCACACGCCGGAAGCAGGAACAGCA"""


where the markers are on separate lines. Modify your previous solution so that it works on this new string.
  • Make your program display the string between the markers on one line only.
  • Make your program output the length of the string between the markers.
  • Make your program display how many adenine (A) nucleobases the string between markers contains.
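A sketch of one way to handle the multi-line case: remove every newline and stray space first, then proceed as with a single-line string. It relies on split() with no argument cutting on newlines and spaces alike, and it assumes the text between the first two markers is the one wanted.

```python
DNA = """AGCCTTCTAGCGTTAATTAACTCGAGAGAGGGTTGGCGCAGTTACCTTAATCGGTTCTGT
TCCTGAGCGAAAGGGCTCAAGCACCTGTTACCTCTGTGATAACGCCAGAGTAACTCGAGC
AAAGACAAGGGAAGCTCTAACCATGTCCGAGACAAGTTGTCTAGCAGTCCCAGTTCACACTTG
ACAATCTACAAATTAGAGCACGGATCATTTACAGGCCAATCTGGCGCGTTAATCGA
TTTCCGCAAACCGCCATGCTGCATCATTACGGGAACCACACGCCGGAAGCAGGAACAGCA"""

# joining the whitespace-separated pieces with "" yields one unbroken string of bases
oneLine = "".join( DNA.split() )

between = oneLine.split( "AAGG" )[1]            # text between the first two markers
print( "Sequence between AAGG markers =", between )
print( "length =", len( between ) )
print( "There are", between.count( "A" ), "A nucleobases in the string." )
```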







Coldest year in Oxford? (Optional)


The page at URL http://www.metoffice.gov.uk/climate/uk/stationdata/ contains historical temperature data for different cities in the United Kingdom. Clicking on a red dot brings up a page of temperatures for the city associated with that dot.


Once you are looking at the page indicated above, click on Oxford to get the page of temperatures recorded in that city since 1853.

Oxford
Location: 4509E 2072N, 63 metres amsl
Estimated data is marked with a * after the value.
Missing data (more than 2 days missing in month) is marked by  ---.
Sunshine data taken from an automatic Kipp & Zonen sensor marked with a #, otherwise sunshine data taken from a Campbell Stokes recorder.
   yyyy  mm   tmax    tmin      af    rain     sun
              degC    degC    days      mm   hours
   1853   1    8.4     2.7       4    62.8     ---
   1853   2    3.2    -1.8      19    29.3     ---
   1853   3    7.7    -0.6      20    25.9     ---
   1853   4   12.6     4.5       0    60.1     ---
   1853   5   16.8     6.1       0    59.5     ---
   1853   6   20.1    10.7       0    82.0     ---
   1853   7   21.2    12.2       0    86.2     ---
   etc...


Challenge 8

  • Figure out a way in Python to find the lowest temperature ever recorded in Oxford. Make your program output the year this record occurred, and the temperature recorded that year.



Be careful: some temperatures might be missing for certain months, and are replaced by --- in the table. Also, estimated temperatures have a * next to them, which the int() and float() functions will not accept.



  • Modify your program so that it outputs the year the lowest temperature was recorded and the year the highest temperature was recorded, along with the actual recorded temperatures.
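A sketch of how the cleaning step might look on a handful of rows. The first three rows are copied from the excerpt above; the last two are made up to show the --- and * cases. It assumes that tmin is the fourth field of each row and that rows to keep start with a four-digit year.

```python
# Three rows from the excerpt, plus two hypothetical rows (year 1900) that
# show a missing value and estimated values.
rows = """   1853   1    8.4     2.7       4    62.8     ---
   1853   2    3.2    -1.8      19    29.3     ---
   1853   3    7.7    -0.6      20    25.9     ---
   1900   1    6.0*    ---       5    40.0     ---
   1900   2    5.1    -2.5*     12    33.3     ---"""

readings = []                                   # list of (tmin, year) pairs
for row in rows.split( "\n" ):
    fields = row.split()
    if len( fields ) < 4 or not fields[0].isdigit():
        continue                                # skip headers and blank lines
    tmin = fields[3].replace( "*", "" )         # drop the "estimated" marker
    if tmin == "---":                           # missing value: nothing to compare
        continue
    readings.append( ( float( tmin ), int( fields[0] ) ) )

coldest, year = min( readings )                 # tuples compare on temperature first
print( "The coldest temperature of {0:.2f} degrees was recorded in {1}".format( coldest, year ) )
```

Swapping min() for max(), and the tmin field for the tmax field, gives the warmest reading in the same way.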





Submission


Submit the program (which you should name lab8.py) to this URL: http://cs.smith.edu/~thiebaut/111b/submitL8.php


Reference Output for all the Challenges


+-------------+
| Challenge 1 |
+-------------+
China 1339190000



+-------------+
| Challenge 2 |
+-------------+
Bangladesh                      164425000
Brazil                          193364000
China                          1339190000
Egypt                            78848000
Ethiopia                         79221000
Germany                          81757600
India                          1184639000
Indonesia                       234181400
Iran                             75078000
Japan                           127380000
Mexico                          108396211
Nigeria                         170123000
Pakistan                        170260000
Phillipines                      94013200
Russia                          141927297
United-States                   309975000
Vietnam                          85789573



+-------------+
| Challenge 3 |
+-------------+
China has the largest population of 1339190000



+-------------+
| Challenge 4 |
+-------------+
Bangladesh has the highest population density of 1141.84



+-------------+
| Challenge 5 |
+-------------+
Iran has the smallest population of 75078000
China has the largest population of 1339190000



+-------------+
| Challenge 6 |
+-------------+
                         China 1339190000
                         India 1184639000
                 United-States  309975000
                     Indonesia  234181400
                        Brazil  193364000
                      Pakistan  170260000
                       Nigeria  170123000
                    Bangladesh  164425000
                        Russia  141927297
                         Japan  127380000
                        Mexico  108396211
                   Phillipines   94013200
                       Vietnam   85789573
                       Germany   81757600
                      Ethiopia   79221000
                         Egypt   78848000
                          Iran   75078000



+-------------+
| Challenge 7 |
+-------------+
Sequence between AAGG markers = TTAATTAACTCGAGAGAGGGTTGGCGCAGTTA (length=32)
Sequence between AAGG markers = GCTCAAGCACCTGTTACCTCTGTGATAACGCCAGAGTAACTCGAGCAAAGAC (length=52)
There are 16 A nucleobases in the last string.

+-------------+
| Challenge 8 |
+-------------+
The coldest temperature of -5.80 degrees was recorded in 1963
The warmest temperature of 27.10 degrees was recorded in 2006