CSC111 Lab 8 2014
Revision as of 16:31, 24 March 2014
--D. Thiebaut (talk) 14:01, 24 March 2014 (EDT)
This lab deals with string and list operations, and with transforming strings into lists and lists into strings.
Splitting Strings
Work in the console, and try these different commands. Observe what the different operations do.
>>> line = "The quick, red fox jumped. It jumped over the lazy, sleepy, brown dog."
>>> line
>>> line.split()
>>> words = line.split()
>>> words
>>> words[0]
>>> words[1]
>>> words[-1]
>>> words[-2]
>>> chunks = line.split( ',' )   # split on commas
>>> chunks
>>> chunks = line.split( '.' )   # split on periods
>>> chunks
>>> words
>>> separator = "+"
>>> newLine = separator.join( words )   # join the words into a new string and use separator as the glue
>>> newLine
>>> separator = "$$$"
>>> newLine = separator.join( words )   # same but use $$$ as the glue
>>> newLine
>>> words   # verify that you still have individual words in this list
>>> newWords = [ words[0], words[3], words[4], words[7], words[8], words[12] ]   # create a new list
>>> newWords
>>> " ".join( newWords )   # join strings in newWords list with a space
Mini Assignments
The solution program for the Exercises we saw in class on Monday and Wednesday contains good models of code that can be used to answer most of the challenges in this lab.
Use the format of the program written for the exercises on lists as a model for how to format your own program, with a main() function and individual functions for the challenges.
Challenge 1
- Use a judicious mix of split() and join() operations to convert the string
"1 China 1,339,190,000 9,596,960.00 139.54 3,705,405.45 361.42"
- into a new string:
"China 1339190000"
- Note 1: Notice the lack of commas in the number! (Hint: string objects have a replace() method that could prove useful here!)
- Note 2: This line is taken from a table at this URL, where the numbers after the country name indicate a) the population, b) the area in square kilometers, c) the population density per square kilometer, d) the area again, but in square miles, and e) the population density per square mile.
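One possible way to combine these operations is sketched below. The function name name_and_population is made up for this illustration, and the code assumes that a leading rank number (the "1" in front of China) should simply be skipped; treat it as a starting point rather than the expected solution.

```python
def name_and_population(line):
    """Return 'Country population' for one row of the table (a sketch)."""
    fields = line.split()                 # split the row on whitespace
    # Assumption: a row may start with a rank number, as in "1 China ...".
    # If it does, drop it so that the country name comes first.
    if fields[0].isdigit():
        fields = fields[1:]
    country = fields[0]
    population = fields[1].replace(",", "")   # remove the thousands separators
    return " ".join([country, population])    # glue the two pieces with a space


line = "1 China 1,339,190,000 9,596,960.00 139.54 3,705,405.45 361.42"
print(name_and_population(line))
```

The same function also works on rows without a leading rank, which is the shape of the rows used in Challenge 2.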
Challenge 2
- Given the following list, store it in a multi-line string variable called text, split it into individual lines, and apply your transformation to each line so that your program outputs only the country names and their populations.
Bangladesh 164,425,000 144,000.00 1,141.84 55,598.69 2,957.35
Brazil 193,364,000 8,511,965.00 22.72 3,286,486.71 58.84
China 1,339,190,000 9,596,960.00 139.54 3,705,405.45 361.42
Egypt 78,848,000 1,001,450.00 78.73 386,661.85 203.92
Ethiopia 79,221,000 1,127,127.00 70.29 435,185.99 182.04
Germany 81,757,600 357,021.00 229.00 137,846.52 593.11
India 1,184,639,000 3,287,590.00 360.34 1,269,345.07 933.27
Indonesia 234,181,400 1,919,440.00 122.01 741,099.62 315.99
Iran 75,078,000 1,648,000.00 45.56 636,296.10 117.99
Japan 127,380,000 377,835.00 337.13 145,882.85 873.17
Mexico 108,396,211 1,972,550.00 54.95 761,605.50 142.33
Nigeria 170,123,000 923,768.00 171.32 356,668.67 443.71
Pakistan 170,260,000 803,940.00 211.78 310,402.84 548.51
Phillipines 94,013,200 300,000.00 313.38 115,830.60 811.64
Russia 141,927,297 17,075,200.00 8.31 6,592,768.87 21.53
United-States 309,975,000 9,629,091.00 32.19 3,717,811.29 83.38
Vietnam 85,789,573 329,560.00 260.32 127,243.78 674.21
- Your first variable should be text, defined as follows:
text = """
Bangladesh 164,425,000 144,000.00 1,141.84 55,598.69 2,957.35
Brazil 193,364,000 8,511,965.00 22.72 3,286,486.71 58.84
China 1,339,190,000 9,596,960.00 139.54 3,705,405.45 361.42
Egypt 78,848,000 1,001,450.00 78.73 386,661.85 203.92
Ethiopia 79,221,000 1,127,127.00 70.29 435,185.99 182.04
Germany 81,757,600 357,021.00 229.00 137,846.52 593.11
India 1,184,639,000 3,287,590.00 360.34 1,269,345.07 933.27
Indonesia 234,181,400 1,919,440.00 122.01 741,099.62 315.99
Iran 75,078,000 1,648,000.00 45.56 636,296.10 117.99
Japan 127,380,000 377,835.00 337.13 145,882.85 873.17
Mexico 108,396,211 1,972,550.00 54.95 761,605.50 142.33
Nigeria 170,123,000 923,768.00 171.32 356,668.67 443.71
Pakistan 170,260,000 803,940.00 211.78 310,402.84 548.51
Phillipines 94,013,200 300,000.00 313.38 115,830.60 811.64
Russia 141,927,297 17,075,200.00 8.31 6,592,768.87 21.53
United-States 309,975,000 9,629,091.00 32.19 3,717,811.29 83.38
Vietnam 85,789,573 329,560.00 260.32 127,243.78 674.21"""
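A minimal sketch of the line-by-line processing follows. To keep it short, the text variable here holds only three of the seventeen rows, and the function name countries_and_populations is an assumption made for this example. Blank lines (such as the one right after the opening triple quotes) are skipped.

```python
# Shortened to three rows for illustration; the lab's text variable has 17.
text = """
Bangladesh 164,425,000 144,000.00 1,141.84 55,598.69 2,957.35
Brazil 193,364,000 8,511,965.00 22.72 3,286,486.71 58.84
China 1,339,190,000 9,596,960.00 139.54 3,705,405.45 361.42"""


def countries_and_populations(text):
    """Return a list of (country, population) pairs, one per non-blank line."""
    results = []
    for line in text.split("\n"):        # one table row per line
        fields = line.split()
        if len(fields) < 2:              # skip blank or incomplete lines
            continue
        population = int(fields[1].replace(",", ""))
        results.append((fields[0], population))
    return results


for country, population in countries_and_populations(text):
    print(country, population)
```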
Challenge 3
- Take your solution for Challenge 2 and make it output the country with the largest population.
Challenge 4
- Same as Challenge 3, but this time make your program output the country with the largest population density.
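Challenges 3 and 4 differ only in which column is compared, so one way to approach both is a loop that remembers the best value seen so far. The sketch below assumes a hypothetical helper called largest that takes a column index (1 for population, 3 for density per square kilometer), and uses a three-row excerpt of the data.

```python
# Three-row excerpt of the lab's data, for illustration only.
rows = """Bangladesh 164,425,000 144,000.00 1,141.84 55,598.69 2,957.35
China 1,339,190,000 9,596,960.00 139.54 3,705,405.45 361.42
Iran 75,078,000 1,648,000.00 45.56 636,296.10 117.99"""


def largest(text, column):
    """Return (country, value) for the row with the largest value in column."""
    best_country, best_value = None, None
    for line in text.split("\n"):
        fields = line.split()
        if len(fields) <= column:        # skip blank or short lines
            continue
        value = float(fields[column].replace(",", ""))
        if best_value is None or value > best_value:
            best_country, best_value = fields[0], value
    return best_country, best_value


country, population = largest(rows, 1)
print(country, "has the largest population of", int(population))

country, density = largest(rows, 3)
print(country, "has the highest population density of", density)
```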
Sorting Lists, Reversing Lists, Finding the Min or Max of a List
Enter the different commands below in the console, and observe how Python executes each line.
>>> seven = [ "Sleepy", "Sneezy", "Bashful", "Happy", "Grumpy", "Dopey", "Doc" ]
>>> seven.sort()
>>> seven
>>> seven.reverse()
>>> seven
>>> nums = [0, 10, -200, 3, 4, 100]
>>> nums.sort()
>>> nums
>>> nums.reverse()
>>> nums
>>> min( nums )
>>> max( nums )
>>> dwarvesHeight = [('Doc', 2), ('Dopey', 6), ('Grumpy', 4.5), ('Happy', 7),('Bashful', 3)]
>>> dwarvesHeight.sort()
>>> dwarvesHeight
>>> heightDwarves = []
>>> for pair in dwarvesHeight:
        name = pair[0]
        height = pair[1]
        heightDwarves.append( (height, name ) )

>>> heightDwarves
>>> heightDwarves.sort()
>>> heightDwarves
>>> heightDwarves.reverse()
>>> heightDwarves
>>> min( heightDwarves )
>>> max( heightDwarves )
>>>
Challenge 5
- Make your program use the original text variable and store the pairs (population, country name) into a list.
- Using sorting, reversing, min, or max, make your program output the country with the smallest population, nicely formatted (i.e. no parentheses or commas printed). Your solution cannot reuse the function you wrote for Challenge 3 or 4.
- Similarly, make your program output the country with the largest population.
Challenge 6
- Make your program output the list of countries and population sorted from largest population to smallest population. The information should show the country first on each line, followed by its population.
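Because Python compares tuples element by element, a list of (population, country) pairs sorts by population first. The sketch below relies on that; the helper name population_pairs and the three-row excerpt are assumptions made for this illustration.

```python
# Three-row excerpt of the lab's data, for illustration only.
rows = """Bangladesh 164,425,000 144,000.00 1,141.84 55,598.69 2,957.35
China 1,339,190,000 9,596,960.00 139.54 3,705,405.45 361.42
Iran 75,078,000 1,648,000.00 45.56 636,296.10 117.99"""


def population_pairs(text):
    """Return a list of (population, country) tuples."""
    pairs = []
    for line in text.split("\n"):
        fields = line.split()
        if len(fields) < 2:              # skip blank lines
            continue
        pairs.append((int(fields[1].replace(",", "")), fields[0]))
    return pairs


pairs = population_pairs(rows)
pairs.sort()                             # smallest population first

smallest_pop, smallest_country = pairs[0]
largest_pop, largest_country = pairs[-1]
print(smallest_country, "has the smallest population of", smallest_pop)
print(largest_country, "has the largest population of", largest_pop)

pairs.reverse()                          # largest population first
for population, country in pairs:
    print(country, population)           # country first, then its population
```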
Processing DNA Strings
A DNA string is a string composed of a sequence of four nucleobases (guanine, adenine, thymine, and cytosine), represented by the letters G, A, T, and C. Assume that we have a DNA string defined as follows:
AGCCTTCTAAGGTTAATTAACTCGAGAGAGGGTTGGCGCAGTTAAAGGCCTTAATCGGTTCTGT
Figure out a way in Python to extract the string that is between the two markers AAGG. In other words create a variable called DNA equal to the string above, then use all the methods we've seen so far to isolate the string between the markers and print it.
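One way to isolate the sequence, assuming the marker appears exactly twice, is to split the string on the marker and keep the middle piece. This is only a sketch; slicing with find() would work as well.

```python
DNA = "AGCCTTCTAAGGTTAATTAACTCGAGAGAGGGTTGGCGCAGTTAAAGGCCTTAATCGGTTCTGT"
marker = "AAGG"

# Splitting on the marker gives three pieces: the text before the first
# marker, the text between the two markers, and the text after the second.
pieces = DNA.split(marker)
between = pieces[1]

print("Sequence between", marker, "markers =", between)
print("length =", len(between))
```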
Challenge 7
- Assume that DNA now is a multi-line string defined as follows:
DNA = """AGCCTTCTAGCGTTAATTAACTCGAGAGAGGGTTGGCGCAGTTACCTTAATCGGTTCTGT TCCTGAGCGAAAGGGCTCAAGCACCTGTTACCTCTGTGATAACGCCAGAGTAACTCGAGC AAAGACAAGGGAAGCTCTAACCATGTCCGAGACAAGTTGTCTAGCAGTCCCAGTTCACACTTG ACAATCTACAAATTAGAGCACGGATCATTTACAGGCCAATCTGGCGCGTTAATCGA TTTCCGCAAACCGCCATGCTGCATCATTACGGGAACCACACGCCGGAAGCAGGAACAGCA"""
It might be easier to copy and paste the string when it is formatted as below:
DNA = """AGCCTTCTAGCGTTAATTAACTCGAGAGAGGGTTGGCGCAGTTACCTTAATCGGTTCTGT
TCCTGAGCGAAAGGGCTCAAGCACCTGTTACCTCTGTGATAACGCCAGAGTAACTCGAGC
AAAGACAAGGGAAGCTCTAACCATGTCCGAGACAAGTTGTCTAGCAGTCCCAGTTCACACTTG
ACAATCTACAAATTAGAGCACGGATCATTTACAGGCCAATCTGGCGCGTTAATCGA
TTTCCGCAAACCGCCATGCTGCATCATTACGGGAACCACACGCCGGAAGCAGGAACAGCA"""
- where the markers are on separate lines. Modify your previous solution so that it works on this new string.
- Make your program display the string between the markers on one line only.
- Make your program output the length of the string between markers
- Make your program display how many adenine (A) nucleobases the string between markers contains.
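A possible approach is to glue the lines back into one long string before looking for the markers. The function name between_markers is an assumption for this sketch, which again supposes that the marker occurs exactly twice.

```python
DNA = """AGCCTTCTAGCGTTAATTAACTCGAGAGAGGGTTGGCGCAGTTACCTTAATCGGTTCTGT
TCCTGAGCGAAAGGGCTCAAGCACCTGTTACCTCTGTGATAACGCCAGAGTAACTCGAGC
AAAGACAAGGGAAGCTCTAACCATGTCCGAGACAAGTTGTCTAGCAGTCCCAGTTCACACTTG
ACAATCTACAAATTAGAGCACGGATCATTTACAGGCCAATCTGGCGCGTTAATCGA
TTTCCGCAAACCGCCATGCTGCATCATTACGGGAACCACACGCCGGAAGCAGGAACAGCA"""


def between_markers(dna, marker="AAGG"):
    """Return the sequence found between the first two markers."""
    one_line = "".join(dna.split())      # split on whitespace, rejoin without it
    pieces = one_line.split(marker)
    return pieces[1]


sequence = between_markers(DNA)
print("Sequence between AAGG markers =", sequence)
print("length =", len(sequence))
print("There are", sequence.count("A"), "A nucleobases in the string.")
```

Joining the lines first also covers the case where the sequence of interest straddles a line break, as it does here.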
Coldest year in Oxford? (Optional)
Once you are looking at the page indicated above, click on Oxford and get a page of recorded temperatures since 1853 in that city.
Oxford
Location: 4509E 2072N, 63 metres amsl
Estimated data is marked with a * after the value.
Missing data (more than 2 days missing in month) is marked by ---.
Sunshine data taken from an automatic Kipp & Zonen sensor marked with a #, otherwise sunshine data taken from a Campbell Stokes recorder.
   yyyy  mm   tmax    tmin      af    rain     sun
              degC    degC    days      mm   hours
   1853   1    8.4     2.7       4    62.8     ---
   1853   2    3.2    -1.8      19    29.3     ---
   1853   3    7.7    -0.6      20    25.9     ---
   1853   4   12.6     4.5       0    60.1     ---
   1853   5   16.8     6.1       0    59.5     ---
   1853   6   20.1    10.7       0    82.0     ---
   1853   7   21.2    12.2       0    86.2     ---
etc...
Challenge 8
- Figure out a way in Python to find the lowest temperature ever recorded in Oxford. Make your program output the year this record occurred, and the temperature recorded that year.
Be careful: some temperatures might be missing for certain months, and are replaced by --- in the table. Also, estimated temperatures have a * next to them, which the int() and float() functions will not accept.
- Modify your program so that it outputs the year the lowest temperature was recorded and the year the highest temperature was recorded, and what the actual recorded temperatures were.
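The sketch below shows one way to cope with the --- and * markers. It assumes that "lowest" refers to the tmin column and "highest" to the tmax column, and that the table has been pasted into a multi-line string. The helper names to_number and extremes are hypothetical, and the last two rows of the sample are made up (they are not real Oxford data) purely to exercise the special cases.

```python
sample = """yyyy mm tmax tmin af rain sun
1853 1 8.4 2.7 4 62.8 ---
1853 2 3.2 -1.8 19 29.3 ---
1853 3 7.7 -0.6 20 25.9 ---
1853 7 21.2 12.2 0 86.2 ---
9998 1 --- 0.5* 3 10.0 ---
9999 1 9.0* --- 0 0.0 ---"""
# The rows for years 9998 and 9999 are invented for this example only.


def to_number(field):
    """Return the field as a float, or None when the value is missing."""
    cleaned = field.replace("*", "")     # drop the 'estimated' marker
    if cleaned == "---":                 # missing data
        return None
    return float(cleaned)


def extremes(text):
    """Return ((tmin, year), (tmax, year)) for the coldest and warmest rows."""
    coldest, warmest = None, None
    for line in text.split("\n"):
        fields = line.split()
        if len(fields) < 4 or not fields[0].isdigit():
            continue                     # skip headers and blank lines
        year = int(fields[0])
        tmax = to_number(fields[2])
        tmin = to_number(fields[3])
        if tmin is not None and (coldest is None or tmin < coldest[0]):
            coldest = (tmin, year)
        if tmax is not None and (warmest is None or tmax > warmest[0]):
            warmest = (tmax, year)
    return coldest, warmest


coldest, warmest = extremes(sample)
print("The coldest temperature of %.2f degrees was recorded in %d" % coldest)
print("The warmest temperature of %.2f degrees was recorded in %d" % warmest)
```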
Submission
Submit the program (which you should name lab8.py) to this URL: http://cs.smith.edu/~thiebaut/111b/submitL8.php
Reference Output for all the Challenges
+-------------+
| Challenge 1 |
+-------------+
China 1339190000
+-------------+
| Challenge 2 |
+-------------+
Bangladesh 164425000
Brazil 193364000
China 1339190000
Egypt 78848000
Ethiopia 79221000
Germany 81757600
India 1184639000
Indonesia 234181400
Iran 75078000
Japan 127380000
Mexico 108396211
Nigeria 170123000
Pakistan 170260000
Phillipines 94013200
Russia 141927297
United-States 309975000
Vietnam 85789573
+-------------+
| Challenge 3 |
+-------------+
China has the largest population of 1339190000
+-------------+
| Challenge 4 |
+-------------+
Bangladesh has the highest population density of 1141.84
+-------------+
| Challenge 5 |
+-------------+
Iran has the smallest population of 75078000
China has the largest population of 1339190000
+-------------+
| Challenge 6 |
+-------------+
China 1339190000
India 1184639000
United-States 309975000
Indonesia 234181400
Brazil 193364000
Pakistan 170260000
Nigeria 170123000
Bangladesh 164425000
Russia 141927297
Japan 127380000
Mexico 108396211
Phillipines 94013200
Vietnam 85789573
Germany 81757600
Ethiopia 79221000
Egypt 78848000
Iran 75078000
+-------------+
| Challenge 7 |
+-------------+
Sequence between AAGG markers = TTAATTAACTCGAGAGAGGGTTGGCGCAGTTA (length=32)
Sequence between AAGG markers = GCTCAAGCACCTGTTACCTCTGTGATAACGCCAGAGTAACTCGAGCAAAGAC (length=52)
There are 16 A nucleobases in the last string.
+-------------+
| Challenge 8 |
+-------------+
The coldest temperature of -5.80 degrees was recorded in 1963
The warmest temperature of 27.10 degrees was recorded in 2006