Difference between revisions of "CSC212 Homework 8 2014"

From dftwiki3
Jump to: navigation, search
(Performance)
(Submission)
 
(28 intermediate revisions by the same user not shown)
Line 14: Line 14:
 
* Write a program called Hw8_1.java  based on the '''quicksort''' program of [[CSC212 Lab 12 2014|Lab 12]], and that contains a new method called <tt>quickQuicksort(... )</tt>, that has the following properties:
 
* Write a program called Hw8_1.java  based on the '''quicksort''' program of [[CSC212 Lab 12 2014|Lab 12]], and that contains a new method called <tt>quickQuicksort(... )</tt>, that has the following properties:
 
:* It receives an array A of ints (int[] A), along with two ints defining the ''low'' and ''high'' indexes it must apply itself.
 
:* It receives an array A of ints (int[] A), along with two ints defining the ''low'' and ''high'' indexes it must apply itself.
:* It receives an integer ''N'', with N'''' less than or equal to the size of A.
+
:* It receives an integer ''N'', with ''N'' less than or equal to the size of A.
 
<br />
 
<br />
 
<source lang="java">
 
<source lang="java">
Line 35: Line 35:
 
<br />
 
<br />
 
* Submit your program to Moodle, in the Homework 8, Problem 1 section.
 
* Submit your program to Moodle, in the Homework 8, Problem 1 section.
 +
* <font color="magenta">The test program will run your program and the solution program several times on large (500,000 items) arrays of sorted and unsorted ints.  Each array tested is given to both the solution program and your program, to make the comparison fair.  The average execution times for both programs are computed.  A grade of 100 is generated if the average execution time of your program is no more than 110% of the execution time of the solution program.  The grade is then decreased the longer your program takes compared to the solution program.</font>
 +
* Testing your program this way may take several long seconds...
 
<br />
 
<br />
  
 +
<!--
 
=Problem #2=
 
=Problem #2=
 
[[Image:Tweets.jpg|300px|right]]
 
[[Image:Tweets.jpg|300px|right]]
 
<br />
 
<br />
 +
 
==Your Assignment==
 
==Your Assignment==
 
<br />
 
<br />
Line 66: Line 70:
 
===Same Number of Followers===
 
===Same Number of Followers===
 
<br />
 
<br />
* If there tweets from people who have the same largest number of followers, then your program should output '''all''' of them.
+
* If the file contains tweets from people who have the same largest number of followers, then your program should output '''all''' of the tweets.
 
<br />
 
<br />
  
Line 75: Line 79:
 
* You can use this file as a practice file for  your Java program: [http://cs.smith.edu/~dthiebaut/classes/212/tweets.csv.txt tweets.csv]
 
* You can use this file as a practice file for  your Java program: [http://cs.smith.edu/~dthiebaut/classes/212/tweets.csv.txt tweets.csv]
 
* To make your csv file accessible to your program, you have two options:
 
* To make your csv file accessible to your program, you have two options:
:::'''In Eclipse:''' click on the '''src''' folder containing your HW8_2.java file, and right/control click to add a '''New File'''.  Call it '''tweets.csv'''.<br />To access the file, use this code as inspiration:
+
:::'''In Eclipse:''' click on the '''src''' folder containing your HW8_2.java file, and right/control click to add a '''New File'''.  Call it '''tweets.csv.txt'''.  Note the .txt extension; if you do not have this extension, Eclipse might switch to a spreadsheet editor when you open the file.  The .txt extension will fool Eclipse in opening a text editor.<br />Copy/paste the sample tweets from the link above.  Once you have saved the tweets in your tweets.csv.txt file, '''refactor/rename''' the file and remove the ".txt" extension, to make the name '''tweets.csv'''.
 +
:::To access the file, use this code as inspiration:
 
<br />
 
<br />
 
:::::::<source lang="java">
 
:::::::<source lang="java">
Line 88: Line 93:
 
===Expected Output===
 
===Expected Output===
 
<br />
 
<br />
* Here are some examples of how your program should behave.  The program is run on Beowulf2 in this example.  <font color="magenta">Note that the testing software will expect your program to output the same information for missing file names or files not found.</font>
+
* Here are some examples of how your program should behave.  The program is run on Beowulf2 in this example.  <font color="magenta">Note that the testing software will expect your program to output the same information the solution program outputs for missing file name on the command line, or for files not found.</font>
 +
<br />
 +
 
 +
'''java Hw8_2 tweets.csv '''
 +
11/14/14 18:07: https&#58;//twitter.com/karessaNV_M/status/533237307751792640 (28568 followers) (follows 23674) At least you don't h...
 +
 +
'''java Hw8_2'''
 +
Syntax: java Hw8_2 filename.csv
 +
where filename.csv is a csv file in the current path
 +
 +
'''java Hw8_2 badname.csv'''
 +
Cannot find specified file badname.csv
 +
 +
 
 
<br />
 
<br />
::<source lang="text">
+
* Note that the program truncates the text of the tweets and adds the ellipses (...). <br \>The code looks like this: <tt>TweetText.substring(0, 20)+"..." </tt>
java Hw8_2 tweets.csv
 
11/14/14 18:07: https://twitter.com/karessaNV_M/status/533237307751792640 (28568 followers) (follows 23674) At least you don't h...
 
  
java Hw8_2
+
<br />
Syntax: java Hw8_2 filename.csv
+
<br />
where filename.csv is a csv file in the current path
 
  
java Hw8_2 badname.csv
+
==Submission==
Cannot find specified file badname.csv
+
<br />
 +
* Submit your program on Moodle, in the Homework 8, Problem 2 section.
 +
* Your program will be tested with several different csv files, with different properties.  Try to anticipate ways in which these files will be created.
 +
<br />
 +
-->
 +
<br />
  
 +
=Problem #2, Tweet, tweet...=
 +
<br />
 +
[[Image:Tweets.jpg|300px|right]]
 +
<bluebox>
 +
--[[User:Thiebaut|D. Thiebaut]] ([[User talk:Thiebaut|talk]]) 09:04, 21 November 2014 (EST) <br />'''Note''': I have modified Problem 2.  Too many people couldn't make Eclipse work with text files.  The new version will have your Java program get the tweets from a Web page in my Smith Web account.  This should work better for everybody.  If you have already submitted your program and are happy with the grade, you do not have to submit a new program.
 +
</bluebox>
 +
<br />
 +
==Your Assignment==
 +
<br />
 +
* Write a Java program that reads a collection of tweets stored at a given URL, and outputs the tweet published by the person with the most followers.
 +
* Your program should be called '''Hw8_2.java'''.
 +
<br />
 +
 +
==Requirements==
 +
<br />
 +
===Imports===
 +
<br />
 +
* Challenge: You are to use only the following libraries in your program:
 +
<br />
 +
::<source lang="java">
 +
import java.io.BufferedReader;
 +
import java.io.IOException;
 +
import java.io.InputStreamReader;
 +
import java.net.MalformedURLException;
 +
import java.net.URL;
 +
import java.net.URLConnection;
 +
import java.util.PriorityQueue;
 +
import java.util.Scanner;
 
</source>
 
</source>
 +
<br />
 +
 +
===URL passed on the Command Line===
 +
<br />
 +
* Your program should get the name of the URL, where the tweets reside, on the command line.  In Eclipse, this means providing the name in the '''Run Configurations/Arguments/Program Arguments'''.  On Beowulf2, this means providing the name on the command line, as in ''java Hw8_2 http://cs.smith.edu/~dthiebaut/classes/212/tweets.csv.txt''.
 +
<br />
 +
===Same Number of Followers===
 +
<br />
 +
* If the URL data contain tweets from people who have the same ''largest'' number of followers, then your program should output '''all''' of the tweets.
 +
<br />
 +
 +
==Additional Information==
 +
<br />
 +
===CSV File===
 +
<br />
 +
* Because your final program will get a csv file from a URL, you may think that it might be hard to test your program with different csv files.  Not really!  All you need to do is to artificially modify the number of followers of tweet objects you will have created when reading the data from the URL, and see if your program can output several tweets if they come from users with similarly large number of followers.
 +
* To read the contents of the csv file on the Web (URL=http://cs.smith.edu/~dthiebaut/classes/212/tweets.csv.txt), you need to add this method to your class:
 +
<br />
 +
::<source lang="java">
 +
private static String readURLTweets(String urlAddress) {
 +
URL website = null;;
 +
try {
 +
website = new URL(urlAddress);
 +
} catch (MalformedURLException e) {
 +
System.out.println( "Invalid URL" );
 +
System.exit( 1 );
 +
}
 +
 +
URLConnection connection = null;
 +
try {
 +
connection = website.openConnection();
 +
} catch (IOException e) {
 +
System.out.println( "Could not connect to URL" );
 +
System.exit( 2 );
 +
}
 +
BufferedReader in=null;
 +
try {
 +
in = new BufferedReader(new InputStreamReader(
 +
connection.getInputStream()));
 +
} catch (IOException e) {
 +
System.out.println( "Could not get get data from URL" );
 +
System.exit( 3 );
 +
}
 +
 +
StringBuilder response = new StringBuilder();
 +
String inputLine;
 +
 +
try {
 +
while ((inputLine = in.readLine()) != null)
 +
response.append(inputLine + "\n" );
 +
in.close();
 +
} catch (IOException e) {
 +
System.out.println( "Could not read data from URL" );
 +
System.exit( 4  );
 +
}
 +
 +
return response.toString();
 +
}
 +
 +
</source>
 +
<br />
 +
* You can call the method above from your main program as follows:
 +
<br />
 +
::<source lang="java">
 +
        String allTweets = readURLTweets( args[0] );
 +
</source>
 +
<br />
 +
:(this assumes that you are sure args[0], the parameter passed on the command line, is defined.)
 +
<br />
 +
 +
===Expected Output===
 +
<br />
 +
* Here are some examples of how your program should behave.  The program is run on Beowulf2 in this example.  <font color="magenta">Note that the testing software will expect your program to output the same information the solution program outputs for missing file name on the command line, or for files not found.</font>
 +
<br />
 +
 
 +
'''java Hw8_2 http://cs.smith.edu/~dthiebaut/classes/212/tweets.csv.txt '''
 +
11/14/14 18:07: https://twitter.com/karessaNV_M/status/533237307751792640 (28568 followers) (follows 23674) At least you don't h...
 +
 
 +
 +
'''java Hw8_2 '''
 +
Syntax: java Hw8_2 url
 +
where url is the url of the csv file containing tweets
 +
 +
'''java Hw8_2 http://cs.badurl.edu/tweet.csv.txt'''
 +
Could not get get data from URL
 +
 
 +
 
 
<br />
 
<br />
 
* Note that the program truncates the text of the tweets and adds the ellipses (...).  <br \>The code looks like this: <tt>TweetText.substring(0, 20)+"..." </tt>
 
* Note that the program truncates the text of the tweets and adds the ellipses (...).  <br \>The code looks like this: <tt>TweetText.substring(0, 20)+"..." </tt>
Line 111: Line 246:
 
<br />
 
<br />
 
* Submit your program on Moodle, in the Homework 8, Problem 2 section.
 
* Submit your program on Moodle, in the Homework 8, Problem 2 section.
* Your program will be tested with several different csv files, with different properties.  Try to anticipate ways in which these files will be created.
+
* Your program will be tested with several different urls containing different csv files, with different properties.  Try to anticipate ways in which these files will be created.
<br />
 
<br />
 
 
<br />
 
<br />
 
<br />
 
<br />

Latest revision as of 09:16, 21 November 2014

--D. Thiebaut (talk) 20:30, 12 November 2014 (EST)




This homework assignment is due Nov. 21st, 2014, at 11:55 p.m.




Problem #1


  • Write a program called Hw8_1.java based on the quicksort program of Lab 12, and that contains a new method called quickQuicksort(... ), that has the following properties:
  • It receives an array A of ints (int[] A), along with two ints defining the low and high indexes it must apply itself.
  • It receives an integer N, with N less than or equal to the size of A.


        public static void quickQuicksort( int[] A,int low, int high, int N ) {
            ...
        }


  • It partially sorts the array A so that the lowest N integers of the array end up sorted in the left-most side of the array (the part of the array that contains all the cells with low indexes). For example, assume that quickQuicksort receives A = [10, 3, 5, 1, 2, 9, 0, 3], and N = 3. Quicksort will stop as soon as the lowest 3 integers of A are located in the cells of A with lowest index, i.e. A = [0, 1, 2, 5, 10, 3, 9]. Note that only the lowest 3 integers are sorted in the left-most part of the array. The remaining part of A is unsorted (but can be partially sorted).
  • The purpose of quickQuicksort is to have a very fast algorithm for locating the N lowest (or N highest) items of a large collection, without having to spend the time to sort the whole array. It is an alternative solution to the problem we solved in class with a heap.


Performance


  • Your program has to be robust and avoid O(N2) performance on sorted arrays.
  • Your program's execution time will be measured and compared to that of the solution program, on different arrays, some random, some sorted in increasing order, and others sorted in decreasing order. Your function will also be tested with values of N ranging from 0 to the length of A.
  • The grade will be proportional to the speed of your program.


Submission


  • Submit your program to Moodle, in the Homework 8, Problem 1 section.
  • The test program will run your program and the solution program several times on large (500,000 items) arrays of sorted and unsorted ints. Each array tested is given to both the solution program and your program, to make the comparison fair. The average execution times for both programs are computed. A grade of 100 is generated if the average execution time of your program is no more than 110% of the execution time of the solution program. The grade is then decreased the longer your program takes compared to the solution program.
  • Testing your program this way may take several long seconds...



Problem #2, Tweet, tweet...


Tweets.jpg

--D. Thiebaut (talk) 09:04, 21 November 2014 (EST)
Note: I have modified Problem 2. Too many people couldn't make Eclipse work with text files. The new version will have your Java program get the tweets from a Web page in my Smith Web account. This should work better for everybody. If you have already submitted your program and are happy with the grade, you do not have to submit a new program.


Your Assignment


  • Write a Java program that reads a collection of tweets stored at a given URL, and outputs the tweet published by the person with the most followers.
  • Your program should be called Hw8_2.java.


Requirements


Imports


  • Challenge: You are to use only the following libraries in your program:


import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.util.PriorityQueue;
import java.util.Scanner;


URL passed on the Command Line


  • Your program should get the name of the URL, where the tweets reside, on the command line. In Eclipse, this means providing the name in the Run Configurations/Arguments/Program Arguments. On Beowulf2, this means providing the name on the command line, as in java Hw8_2 http://cs.smith.edu/~dthiebaut/classes/212/tweets.csv.txt.


Same Number of Followers


  • If the URL data contain tweets from people who have the same largest number of followers, then your program should output all of the tweets.


Additional Information


CSV File


  • Because your final program will get a csv file from a URL, you may think that it might be hard to test your program with different csv files. Not really! All you need to do is to artificially modify the number of followers of tweet objects you will have created when reading the data from the URL, and see if your program can output several tweets if they come from users with similarly large number of followers.
  • To read the contents of the csv file on the Web (URL=http://cs.smith.edu/~dthiebaut/classes/212/tweets.csv.txt), you need to add this method to your class:


	private static String readURLTweets(String urlAddress) {
		URL website = null;;
		try {
			website = new URL(urlAddress);
		} catch (MalformedURLException e) {
			System.out.println( "Invalid URL" );
			System.exit( 1 );
		}
		
		URLConnection connection = null;
		try {
			connection = website.openConnection();
		} catch (IOException e) {
			System.out.println( "Could not connect to URL" );
			System.exit( 2 );
		}
		BufferedReader in=null;
		try {
			in = new BufferedReader(new InputStreamReader(
					connection.getInputStream()));
		} catch (IOException e) {
			System.out.println( "Could not get get data from URL" );
			System.exit( 3 );
		}
		
		StringBuilder response = new StringBuilder();
		String inputLine;

		try {
			while ((inputLine = in.readLine()) != null)
				response.append(inputLine + "\n" );
			in.close();
		} catch (IOException e) {
			System.out.println( "Could not read data from URL" );
			System.exit( 4  );
		}

		return response.toString();
	}


  • You can call the method above from your main program as follows:


        String allTweets = readURLTweets( args[0] );


(this assumes that you are sure args[0], the parameter passed on the command line, is defined.)


Expected Output


  • Here are some examples of how your program should behave. The program is run on Beowulf2 in this example. Note that the testing software will expect your program to output the same information the solution program outputs for missing file name on the command line, or for files not found.


java Hw8_2 http://cs.smith.edu/~dthiebaut/classes/212/tweets.csv.txt 
11/14/14 18:07: https://twitter.com/karessaNV_M/status/533237307751792640 (28568 followers) (follows 23674) At least you don't h...
 

java Hw8_2 
Syntax: java Hw8_2 url
where url is the url of the csv file containing tweets

java Hw8_2 http://cs.badurl.edu/tweet.csv.txt
Could not get get data from URL
 
 


  • Note that the program truncates the text of the tweets and adds the ellipses (...).
    The code looks like this: TweetText.substring(0, 20)+"..."



Submission


  • Submit your program on Moodle, in the Homework 8, Problem 2 section.
  • Your program will be tested with several different urls containing different csv files, with different properties. Try to anticipate ways in which these files will be created.