Difference between revisions of "CSC220 Homework 5 Solution 2010"

From dftwiki3
Jump to: navigation, search
(Question 10)
(Option 2)
 
(8 intermediate revisions by the same user not shown)
Line 4: Line 4:
 
=Problem #1=
 
=Problem #1=
 
==Questions 1-9==
 
==Questions 1-9==
 +
* Main answers from Alex, with contributions from Aigerim, Katie and Millie.
 +
 
<code><pre>
 
<code><pre>
 
-- hw5b.sql     
 
-- hw5b.sql     
Line 28: Line 30:
 
SELECT COUNT(*) FROM `Connections2` WHERE `Id1`="470";
 
SELECT COUNT(*) FROM `Connections2` WHERE `Id1`="470";
  
-- Question 7: returns ids of all entries that are connected to entry ASCII
+
-- Question 7: returns ids of all entries that are connected to entry ASCII (this solution from Aigerim)
SELECT `Id2` FROM `Connections2` WHERE `Connections2`.`Id1` in
+
SELECT DISTINCT `Id2` FROM `Connections2` WHERE `Id1` IN (
  (SELECT `Names`.`Id` FROM `Names` WHERE `Names`.`Name`="ASCII");
+
      `Id` from `Names` WHERE `Name`='ASCII' )
  
-- Question 8: returns names of entries associated with ids from #7
+
-- Question 8: returns names of entries associated with ids from #7 (this solution from Katie + Millie)
SELECT `Name` FROM `Names` WHERE `Id` in  
+
SELECT DISTINCT `Name` FROM `Names` WHERE `Id` in  
 
     (SELECT `Id2` FROM `Connections2` WHERE `Id1` in  
 
     (SELECT `Id2` FROM `Connections2` WHERE `Id1` in  
 
           (SELECT `Id` FROM `Names` WHERE `Name`="ASCII"));
 
           (SELECT `Id` FROM `Names` WHERE `Name`="ASCII"));
Line 46: Line 48:
  
 
==Question 10==
 
==Question 10==
 +
=== Option 1===
 +
Here's an ingenious solution by Betsy.  It shows some good research, but it is not full-proof, as by some strange fluke, there could be duplicate entries that are such that the sums might be the same, while the graph is still directed.  But it's a very nice example of the use of variables...  Nice job!
 +
<code><pre>
 +
-- Question 10
 +
-- Generates a query that returns @a = the sum of the column Id1, @b = the sum of the
 +
-- column Id2 and @d = the difference of their sums.@d tells us whether Connections2
 +
-- is direct or not. If Connections2 is direct then Id1 and Id2 should have exactly the
 +
-- same entries. So if we find the sum of the entries of each column, they should be the
 +
-- same number. We can verify this by taking their difference.
 +
-- Thus if @d == 0, Connections2 is direct. Otherwise, it is not.
 +
 +
 +
SET @a := (SELECT SUM(`Id1`) FROM `Connections2`),
 +
@b := (SELECT SUM(`Id2`) FROM `Connections2`);
 +
SELECT @a, @b, @d := @a-@b;
 +
 +
</pre></code>
  
 +
===Option 2===
 
* To answer this question it is easier to work with a smaller graph, figure out the queries, and then apply them to the Connections2 table.  Here I am using a different database, and where I create a new ''play'' table also called '''Connections2'''.  It will simplify applying the queries I develop here to the original Connections2 table.
 
* To answer this question it is easier to work with a smaller graph, figure out the queries, and then apply them to the Connections2 table.  Here I am using a different database, and where I create a new ''play'' table also called '''Connections2'''.  It will simplify applying the queries I develop here to the original Connections2 table.
  
Line 58: Line 78:
 
                 (4) -----> (5)           
 
                 (4) -----> (5)           
  
</pre></code>
 
  
* Notice that the graph is '''directed''', as there is an edge from 2 to 4, and from 4 to 5, but no edges from 4 to 2 or from 5 to 4.
+
-- Notice that the graph is '''directed''', as there is an edge from 2 to 4,  
 +
-- and from 4 to 5, but no edges from 4 to 2 or from 5 to 4.
 +
 
 +
-- Let's see how we can figure out the directed property of this graph. 
 +
-- There are probably some more sophisticated and or simpler ways to perform this, but this works! :-)
  
* Let's see how we can figure out the directed property of this graph.  There are probably some more sophisticated and or simpler ways to perform this, but this works! :-)
 
  
<code><pre>
 
 
CREATE TABLE IF NOT EXISTS `Connections2` (
 
CREATE TABLE IF NOT EXISTS `Connections2` (
 
   `id1` int(1) NOT NULL,
 
   `id1` int(1) NOT NULL,
Line 70: Line 91:
 
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
 
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
  
--
 
-- Dumping data for table `Connections2`
 
--
 
  
 
INSERT INTO `Connections2` (`id1`, `id2`) VALUES
 
INSERT INTO `Connections2` (`id1`, `id2`) VALUES
Line 83: Line 101:
 
(2, 1),
 
(2, 1),
 
(2, 3);
 
(2, 3);
 +
 +
 +
-- Notice that I have created '''duplicate''' edges (the last two),
 +
-- to emulate the situation in the homework, where  the original Connections2 table
 +
-- also contains duplicates.
  
  
Line 163: Line 186:
 
DROP TABLE tempConn;
 
DROP TABLE tempConn;
 
DROP TABLE tempConn2;
 
DROP TABLE tempConn2;
 +
 +
-- Apply this algorithm to the real ''Connections2'' table, and find out that the original graph, too, is directed.
  
 
</pre></code>
 
</pre></code>

Latest revision as of 10:22, 26 October 2010

--D. Thiebaut 21:53, 24 October 2010 (UTC)


Problem #1

Questions 1-9

  • Main answers from Alex, with contributions from Aigerim, Katie and Millie.
-- hw5b.sql    
-- Alex Cheng (220a-ag)
-- 10/20/10
-- Answers to Homework 5, Part A

-- Question 1: returns all names and ids of records where name contains "processor"
SELECT * FROM `Names` WHERE `Name` LIKE "%processor%";

-- Question 2: returns all ids and names of records that contains "compute"
SELECT * FROM `Names` WHERE `Name` LIKE "%compute%" ORDER BY `Name`;

-- Question 3: same as #2 but returns last 10 entries of alphabetically sorted list
SELECT * FROM `Names` WHERE `Name` LIKE "%compute%" ORDER BY `Name` LIMIT 10;

-- Question 4: same as #2 but returns last 10 entries of alphabetically sorted list
SELECT * FROM `Names` WHERE `Name` LIKE "%compute%" ORDER BY `Name` DESC LIMIT 10;

-- Question 5: returns total number of records in table Names
SELECT COUNT(*) FROM `Names`;

-- Question 6: returns total number of records in table Connections2 whose Id1 field is 470 
SELECT COUNT(*) FROM `Connections2` WHERE `Id1`="470";

-- Question 7: returns ids of all entries that are connected to entry ASCII (this solution from Aigerim)
SELECT DISTINCT `Id2` FROM `Connections2` WHERE `Id1` IN (
       `Id` from `Names` WHERE `Name`='ASCII' )

-- Question 8: returns names of entries associated with ids from #7 (this solution from Katie + Millie)
SELECT DISTINCT `Name` FROM `Names` WHERE `Id` in 
    (SELECT `Id2` FROM `Connections2` WHERE `Id1` in 
          (SELECT `Id` FROM `Names` WHERE `Name`="ASCII"));

-- Question 9: same as #8 but returns last 5 entries when list is alpha ordered
SELECT `Name` FROM `Names` WHERE `Id` in (SELECT `Id2` FROM `Connections2` 
    WHERE `Id1` in 
       (SELECT `Id` FROM `Names` WHERE `Name`="ASCII")) 
    ORDER BY `Name` DESC LIMIT 5;

Question 10

Option 1

Here's an ingenious solution by Betsy. It shows some good research, but it is not full-proof, as by some strange fluke, there could be duplicate entries that are such that the sums might be the same, while the graph is still directed. But it's a very nice example of the use of variables... Nice job!

-- Question 10
-- Generates a query that returns @a = the sum of the column Id1, @b = the sum of the
-- column Id2 and @d = the difference of their sums.@d tells us whether Connections2 
-- is direct or not. If Connections2 is direct then Id1 and Id2 should have exactly the
-- same entries. So if we find the sum of the entries of each column, they should be the
-- same number. We can verify this by taking their difference. 
-- Thus if @d == 0, Connections2 is direct. Otherwise, it is not.


SET @a := (SELECT SUM(`Id1`) FROM `Connections2`),
@b := (SELECT SUM(`Id2`) FROM `Connections2`);
SELECT @a, @b, @d := @a-@b;

Option 2

  • To answer this question it is easier to work with a smaller graph, figure out the queries, and then apply them to the Connections2 table. Here I am using a different database, and where I create a new play table also called Connections2. It will simplify applying the queries I develop here to the original Connections2 table.
  • Here's the graph I'm using:

      	(1) <-----> (2) <-----> (3)                                                            
    	             |
      	      	     |
      	      	     V
      	      	    (4) -----> (5)          


-- Notice that the graph is '''directed''', as there is an edge from 2 to 4, 
-- and from 4 to 5, but no edges from 4 to 2 or from 5 to 4.

-- Let's see how we can figure out the directed property of this graph.   
-- There are probably some more sophisticated and or simpler ways to perform this, but this works! :-)


CREATE TABLE IF NOT EXISTS `Connections2` (
  `id1` int(1) NOT NULL,
  `id2` int(1) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;


INSERT INTO `Connections2` (`id1`, `id2`) VALUES
(1, 2),
(2, 1),
(2, 3),
(3, 2),
(2, 4),
(4, 5),
(2, 1),
(2, 3);


-- Notice that I have created '''duplicate''' edges (the last two), 
-- to emulate the situation in the homework, where  the original Connections2 table
-- also contains duplicates.


-- let's figure out how many rows we have:

SELECT count( * ) FROM `Connections2`;

-- output = 8

-- how many of these are unique?

SELECT count( distinct Id1, Id2) FROM `Connections2`;

-- output = 6 , so we have 6 unique edges in our graph
-- if this were an odd number we'd know for sure that the
-- graph is directed, as it would mean we'd have one edge
-- that does not have an inverse.  Here we could have two
-- such edges, in which case we still don't know

-- Let's create a temporary table that is like connections2
-- but without duplicates

CREATE TABLE tempConn SELECT DISTINCT * FROM Connections2;

-- contents is
-- 1 2
-- 2 1
-- 2 3
-- 3 2
-- 2 4
-- 4 5

-- create another version of tempConn, i.e.

CREATE TABLE tempConn2 SELECT * FROM tempConn;

-- contents is
-- 1 2
-- 2 1
-- 2 3
-- 3 2
-- 2 4
-- 4 5

-- Join the two tables on rows where Id1, Id2 of one table is equal 
-- to the Id2, Id1 of the other.  In other words, join together the
-- edges that are going from Node A to Node B, and from Node B to Node A

select tempConn.Id1, tempConn.Id2, tempConn2.Id1 as Id1b, tempConn2.Id2 as Id2b 
from tempConn join tempConn2 
  on tempConn.Id2 = tempConn2.Id1 and tempConn.Id1= tempConn2.Id2;

-- output
-- 2 1 1 2
-- 1 2 2 1
-- 3 2 2 3
-- 2 3 3 2

-- Now do the same thing with an outer join: If there are edges that
-- do not have reciprocating edges, then NULLs will show up in the table
-- Let's try a left join:

select tempConn.Id1, tempConn.Id2, tempConn2.Id1 as Id1b, tempConn2.Id2 as Id2b 
from tempConn left join tempConn2 
  on tempConn.Id2 = tempConn2.Id1 and tempConn.Id1= tempConn2.Id2;

-- output
-- 2 1 1 2
-- 1 2 2 1
-- 3 2 2 3
-- 2 3 3 2
-- 2 4 NULL NULL
-- 4 5 NULL NULL

-- Ah! There are some edges that do not have matching reciprocal edges. 
-- The graph IS therefore a directed graph!

-- We can delete the extra tables

DROP TABLE tempConn;
DROP TABLE tempConn2;

-- Apply this algorithm to the real ''Connections2'' table, and find out that the original graph, too, is directed.