Difference between revisions of "CSC212 Homework 8 2014"

From dftwiki3
Jump to: navigation, search
(CSV File)
(Submission)
 
(6 intermediate revisions by the same user not shown)
Line 36: Line 36:
 
* Submit your program to Moodle, in the Homework 8, Problem 1 section.
 
* Submit your program to Moodle, in the Homework 8, Problem 1 section.
 
* <font color="magenta">The test program will run your program and the solution program several times on large (500,000 items) arrays of sorted and unsorted ints.  Each array tested is given to both the solution program and your program, to make the comparison fair.  The average execution times for both programs are computed.  A grade of 100 is generated if the average execution time of your program is no more than 110% of the execution time of the solution program.  The grade is then decreased the longer your program takes compared to the solution program.</font>
 
* <font color="magenta">The test program will run your program and the solution program several times on large (500,000 items) arrays of sorted and unsorted ints.  Each array tested is given to both the solution program and your program, to make the comparison fair.  The average execution times for both programs are computed.  A grade of 100 is generated if the average execution time of your program is no more than 110% of the execution time of the solution program.  The grade is then decreased the longer your program takes compared to the solution program.</font>
* Testing your program this way may take a minute or longer...
+
* Testing your program this way may take several long seconds...
 
<br />
 
<br />
  
Line 120: Line 120:
 
-->
 
-->
 
<br />
 
<br />
=Problem #2=
+
 
 +
=Problem #2, Tweet, tweet...=
 
<br />
 
<br />
 
[[Image:Tweets.jpg|300px|right]]
 
[[Image:Tweets.jpg|300px|right]]
Line 129: Line 130:
 
==Your Assignment==
 
==Your Assignment==
 
<br />
 
<br />
* Write a Java program that reads a collection of tweets stored at a given URL, and outputs the tweet corresponding to the person with the most followers.
+
* Write a Java program that reads a collection of tweets stored at a given URL, and outputs the tweet published by the person with the most followers.
 
* Your program should be called '''Hw8_2.java'''.
 
* Your program should be called '''Hw8_2.java'''.
 
<br />
 
<br />
 +
 
==Requirements==
 
==Requirements==
 
<br />
 
<br />
Line 227: Line 229:
 
    
 
    
 
   
 
   
  '''java Hw8b_2 '''
+
  '''java Hw8_2 '''
 
  Syntax: java Hw8_2 url
 
  Syntax: java Hw8_2 url
where url is the url of the csv file containing tweets
+
where url is the url of the csv file containing tweets
 
   
 
   
  '''java Hw8b_2 http://cs.badurl.edu/tweet.csv'''
+
  '''java Hw8_2 http://cs.badurl.edu/tweet.csv.txt'''
 
  Could not get get data from URL
 
  Could not get get data from URL
 
    
 
    
Line 244: Line 246:
 
<br />
 
<br />
 
* Submit your program on Moodle, in the Homework 8, Problem 2 section.
 
* Submit your program on Moodle, in the Homework 8, Problem 2 section.
* Your program will be tested with several different csv files, with different properties.  Try to anticipate ways in which these files will be created.
+
* Your program will be tested with several different urls containing different csv files, with different properties.  Try to anticipate ways in which these files will be created.
 
<br />
 
<br />
 
<br />
 
<br />

Latest revision as of 09:16, 21 November 2014

--D. Thiebaut (talk) 20:30, 12 November 2014 (EST)




This homework assignment is due Nov. 21st, 2014, at 11:55 p.m.




Problem #1


  • Write a program called Hw8_1.java based on the quicksort program of Lab 12, and that contains a new method called quickQuicksort(... ), that has the following properties:
  • It receives an array A of ints (int[] A), along with two ints defining the low and high indexes it must apply itself.
  • It receives an integer N, with N less than or equal to the size of A.


        public static void quickQuicksort( int[] A,int low, int high, int N ) {
            ...
        }


  • It partially sorts the array A so that the lowest N integers of the array end up sorted in the left-most side of the array (the part of the array that contains all the cells with low indexes). For example, assume that quickQuicksort receives A = [10, 3, 5, 1, 2, 9, 0, 3], and N = 3. Quicksort will stop as soon as the lowest 3 integers of A are located in the cells of A with lowest index, i.e. A = [0, 1, 2, 5, 10, 3, 9]. Note that only the lowest 3 integers are sorted in the left-most part of the array. The remaining part of A is unsorted (but can be partially sorted).
  • The purpose of quickQuicksort is to have a very fast algorithm for locating the N lowest (or N highest) items of a large collection, without having to spend the time to sort the whole array. It is an alternative solution to the problem we solved in class with a heap.


Performance


  • Your program has to be robust and avoid O(N2) performance on sorted arrays.
  • Your program's execution time will be measured and compared to that of the solution program, on different arrays, some random, some sorted in increasing order, and others sorted in decreasing order. Your function will also be tested with values of N ranging from 0 to the length of A.
  • The grade will be proportional to the speed of your program.


Submission


  • Submit your program to Moodle, in the Homework 8, Problem 1 section.
  • The test program will run your program and the solution program several times on large (500,000 items) arrays of sorted and unsorted ints. Each array tested is given to both the solution program and your program, to make the comparison fair. The average execution times for both programs are computed. A grade of 100 is generated if the average execution time of your program is no more than 110% of the execution time of the solution program. The grade is then decreased the longer your program takes compared to the solution program.
  • Testing your program this way may take several long seconds...



Problem #2, Tweet, tweet...


Tweets.jpg

--D. Thiebaut (talk) 09:04, 21 November 2014 (EST)
Note: I have modified Problem 2. Too many people couldn't make Eclipse work with text files. The new version will have your Java program get the tweets from a Web page in my Smith Web account. This should work better for everybody. If you have already submitted your program and are happy with the grade, you do not have to submit a new program.


Your Assignment


  • Write a Java program that reads a collection of tweets stored at a given URL, and outputs the tweet published by the person with the most followers.
  • Your program should be called Hw8_2.java.


Requirements


Imports


  • Challenge: You are to use only the following libraries in your program:


import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.util.PriorityQueue;
import java.util.Scanner;


URL passed on the Command Line


  • Your program should get the name of the URL, where the tweets reside, on the command line. In Eclipse, this means providing the name in the Run Configurations/Arguments/Program Arguments. On Beowulf2, this means providing the name on the command line, as in java Hw8_2 http://cs.smith.edu/~dthiebaut/classes/212/tweets.csv.txt.


Same Number of Followers


  • If the URL data contain tweets from people who have the same largest number of followers, then your program should output all of the tweets.


Additional Information


CSV File


  • Because your final program will get a csv file from a URL, you may think that it might be hard to test your program with different csv files. Not really! All you need to do is to artificially modify the number of followers of tweet objects you will have created when reading the data from the URL, and see if your program can output several tweets if they come from users with similarly large number of followers.
  • To read the contents of the csv file on the Web (URL=http://cs.smith.edu/~dthiebaut/classes/212/tweets.csv.txt), you need to add this method to your class:


	private static String readURLTweets(String urlAddress) {
		URL website = null;;
		try {
			website = new URL(urlAddress);
		} catch (MalformedURLException e) {
			System.out.println( "Invalid URL" );
			System.exit( 1 );
		}
		
		URLConnection connection = null;
		try {
			connection = website.openConnection();
		} catch (IOException e) {
			System.out.println( "Could not connect to URL" );
			System.exit( 2 );
		}
		BufferedReader in=null;
		try {
			in = new BufferedReader(new InputStreamReader(
					connection.getInputStream()));
		} catch (IOException e) {
			System.out.println( "Could not get get data from URL" );
			System.exit( 3 );
		}
		
		StringBuilder response = new StringBuilder();
		String inputLine;

		try {
			while ((inputLine = in.readLine()) != null)
				response.append(inputLine + "\n" );
			in.close();
		} catch (IOException e) {
			System.out.println( "Could not read data from URL" );
			System.exit( 4  );
		}

		return response.toString();
	}


  • You can call the method above from your main program as follows:


        String allTweets = readURLTweets( args[0] );


(this assumes that you are sure args[0], the parameter passed on the command line, is defined.)


Expected Output


  • Here are some examples of how your program should behave. The program is run on Beowulf2 in this example. Note that the testing software will expect your program to output the same information the solution program outputs for missing file name on the command line, or for files not found.


java Hw8_2 http://cs.smith.edu/~dthiebaut/classes/212/tweets.csv.txt 
11/14/14 18:07: https://twitter.com/karessaNV_M/status/533237307751792640 (28568 followers) (follows 23674) At least you don't h...
 

java Hw8_2 
Syntax: java Hw8_2 url
where url is the url of the csv file containing tweets

java Hw8_2 http://cs.badurl.edu/tweet.csv.txt
Could not get get data from URL
 
 


  • Note that the program truncates the text of the tweets and adds the ellipses (...).
    The code looks like this: TweetText.substring(0, 20)+"..."



Submission


  • Submit your program on Moodle, in the Homework 8, Problem 2 section.
  • Your program will be tested with several different urls containing different csv files, with different properties. Try to anticipate ways in which these files will be created.