Difference between revisions of "CSC212 Homework 8 2014"
(→Problem #2) |
(→CSV File) |
||
Line 160: | Line 160: | ||
===CSV File=== | ===CSV File=== | ||
<br /> | <br /> | ||
− | * | + | * Because your final program will get a csv file from a URL, you may think that it might be hard to test your program with different csv files. Not really! All you need to do is to artificially modify the number of followers of tweet objects you will have created when reading the data from the URL, and see if your program can output several tweets if they come from users with similarly large number of followers. |
* To read the contents of the csv file on the Web (URL=http://cs.smith.edu/~dthiebaut/classes/212/tweets.csv.txt), you need to add this method to your class: | * To read the contents of the csv file on the Web (URL=http://cs.smith.edu/~dthiebaut/classes/212/tweets.csv.txt), you need to add this method to your class: | ||
<br /> | <br /> |
Revision as of 18:59, 20 November 2014
--D. Thiebaut (talk) 20:30, 12 November 2014 (EST)
Contents
This homework assignment is due Nov. 21st, 2014, at 11:55 p.m.
Problem #1
- Write a program called Hw8_1.java based on the quicksort program of Lab 12, and that contains a new method called quickQuicksort(... ), that has the following properties:
- It receives an array A of ints (int[] A), along with two ints defining the low and high indexes it must apply itself.
- It receives an integer N, with N less than or equal to the size of A.
public static void quickQuicksort( int[] A,int low, int high, int N ) {
...
}
- It partially sorts the array A so that the lowest N integers of the array end up sorted in the left-most side of the array (the part of the array that contains all the cells with low indexes). For example, assume that quickQuicksort receives A = [10, 3, 5, 1, 2, 9, 0, 3], and N = 3. Quicksort will stop as soon as the lowest 3 integers of A are located in the cells of A with lowest index, i.e. A = [0, 1, 2, 5, 10, 3, 9]. Note that only the lowest 3 integers are sorted in the left-most part of the array. The remaining part of A is unsorted (but can be partially sorted).
- The purpose of quickQuicksort is to have a very fast algorithm for locating the N lowest (or N highest) items of a large collection, without having to spend the time to sort the whole array. It is an alternative solution to the problem we solved in class with a heap.
Performance
- Your program has to be robust and avoid O(N2) performance on sorted arrays.
- Your program's execution time will be measured and compared to that of the solution program, on different arrays, some random, some sorted in increasing order, and others sorted in decreasing order. Your function will also be tested with values of N ranging from 0 to the length of A.
- The grade will be proportional to the speed of your program.
Submission
- Submit your program to Moodle, in the Homework 8, Problem 1 section.
- The test program will run your program and the solution program several times on large (500,000 items) arrays of sorted and unsorted ints. Each array tested is given to both the solution program and your program, to make the comparison fair. The average execution times for both programs are computed. A grade of 100 is generated if the average execution time of your program is no more than 110% of the execution time of the solution program. The grade is then decreased the longer your program takes compared to the solution program.
- Testing your program this way may take a minute or longer...
Problem #2
Your Assignment
- Write a Java program that reads a collection of tweets stored in a csv file, and outputs the tweet corresponding to the person with the most followers.
- Your program should be called Hw8_2.java.
Requirements
Imports
- Challenge: You are to use only the following libraries in your program:
import java.io.File; import java.io.FileNotFoundException; import java.net.URL; import java.util.PriorityQueue; import java.util.Scanner;
File Name passed on the Command Line
- Your program should get the name of the file of tweets on the command line. In Eclipse, this means providing the name in the Run Configurations/Arguments/Program Arguments. On Beowulf2, this means providing the name on the command line, as in java Hw8_2 tweets.csv.
Same Number of Followers
- If the file contains tweets from people who have the same largest number of followers, then your program should output all of the tweets.
Additional Information
CSV File
- You can use this file as a practice file for your Java program: tweets.csv
- To make your csv file accessible to your program, you have two options:
- In Eclipse: click on the src folder containing your HW8_2.java file, and right/control click to add a New File. Call it tweets.csv.txt. Note the .txt extension; if you do not have this extension, Eclipse might switch to a spreadsheet editor when you open the file. The .txt extension will fool Eclipse in opening a text editor.
Copy/paste the sample tweets from the link above. Once you have saved the tweets in your tweets.csv.txt file, refactor/rename the file and remove the ".txt" extension, to make the name tweets.csv. - To access the file, use this code as inspiration:
- In Eclipse: click on the src folder containing your HW8_2.java file, and right/control click to add a New File. Call it tweets.csv.txt. Note the .txt extension; if you do not have this extension, Eclipse might switch to a spreadsheet editor when you open the file. The .txt extension will fool Eclipse in opening a text editor.
String fileName = "tweets.csv"; URL url = Hw8_2.class.getClassLoader().getResource( fileName ); Scanner s = new Scanner( new File( url.getPath() ) );
- On Beowulf2: put the file in the same directory where your program resides, and your code will find the file without problem.
Expected Output
- Here are some examples of how your program should behave. The program is run on Beowulf2 in this example. Note that the testing software will expect your program to output the same information the solution program outputs for missing file name on the command line, or for files not found.
java Hw8_2 tweets.csv 11/14/14 18:07: https://twitter.com/karessaNV_M/status/533237307751792640 (28568 followers) (follows 23674) At least you don't h... java Hw8_2 Syntax: java Hw8_2 filename.csv where filename.csv is a csv file in the current path java Hw8_2 badname.csv Cannot find specified file badname.csv
- Note that the program truncates the text of the tweets and adds the ellipses (...).
The code looks like this: TweetText.substring(0, 20)+"..."
Submission
- Submit your program on Moodle, in the Homework 8, Problem 2 section.
- Your program will be tested with several different csv files, with different properties. Try to anticipate ways in which these files will be created.
Problem #2b
If your program for Problem #2 is not working because the file is not finding the file, you should try this approach, which uses a method that will read the csv file directly from the Web site where it resides. This way you won't have to copy/paste the csv content into a file in your directory.
Your Assignment
- Write a Java program that reads a collection of tweets stored at a given URL, and outputs the tweet corresponding to the person with the most followers.
- Your program should be called Hw8_2.java.
Requirements
Imports
- Challenge: You are to use only the following libraries in your program:
import java.io.BufferedReader; import java.io.IOException; import java.io.InputStreamReader; import java.net.MalformedURLException; import java.net.URL; import java.net.URLConnection; import java.util.PriorityQueue; import java.util.Scanner;
URL passed on the Command Line
- Your program should get the name of the URL, where the tweets reside, on the command line. In Eclipse, this means providing the name in the Run Configurations/Arguments/Program Arguments. On Beowulf2, this means providing the name on the command line, as in java Hw8_2 http://cs.smith.edu/~dthiebaut/classes/212/tweets.csv.txt.
Same Number of Followers
- If the URL data contain tweets from people who have the same largest number of followers, then your program should output all of the tweets.
Additional Information
CSV File
- Because your final program will get a csv file from a URL, you may think that it might be hard to test your program with different csv files. Not really! All you need to do is to artificially modify the number of followers of tweet objects you will have created when reading the data from the URL, and see if your program can output several tweets if they come from users with similarly large number of followers.
- To read the contents of the csv file on the Web (URL=http://cs.smith.edu/~dthiebaut/classes/212/tweets.csv.txt), you need to add this method to your class:
private static String readURLTweets(String urlAddress) { URL website = null;; try { website = new URL(urlAddress); } catch (MalformedURLException e) { System.out.println( "Invalid URL" ); System.exit( 1 ); } URLConnection connection = null; try { connection = website.openConnection(); } catch (IOException e) { System.out.println( "Could not connect to URL" ); System.exit( 2 ); } BufferedReader in=null; try { in = new BufferedReader(new InputStreamReader( connection.getInputStream())); } catch (IOException e) { System.out.println( "Could not get get data from URL" ); System.exit( 3 ); } StringBuilder response = new StringBuilder(); String inputLine; try { while ((inputLine = in.readLine()) != null) response.append(inputLine + "\n" ); in.close(); } catch (IOException e) { System.out.println( "Could not read data from URL" ); System.exit( 4 ); } return response.toString(); }
Expected Output
- Here are some examples of how your program should behave. The program is run on Beowulf2 in this example. Note that the testing software will expect your program to output the same information the solution program outputs for missing file name on the command line, or for files not found.
java Hw8_2 tweets.csv 11/14/14 18:07: https://twitter.com/karessaNV_M/status/533237307751792640 (28568 followers) (follows 23674) At least you don't h... java Hw8_2 Syntax: java Hw8_2 filename.csv where filename.csv is a csv file in the current path java Hw8_2 badname.csv Cannot find specified file badname.csv
- Note that the program truncates the text of the tweets and adds the ellipses (...).
The code looks like this: TweetText.substring(0, 20)+"..."
Submission
- Submit your program on Moodle, in the Homework 8, Problem 2 section.
- Your program will be tested with several different csv files, with different properties. Try to anticipate ways in which these files will be created.