Difference between revisions of "CSC220 Homework 5 Solution 2010"
(→Question 10) |
(→Option 2) |
||
(9 intermediate revisions by the same user not shown) | |||
Line 4: | Line 4: | ||
=Problem #1= | =Problem #1= | ||
==Questions 1-9== | ==Questions 1-9== | ||
+ | * Main answers from Alex, with contributions from Aigerim, Katie and Millie. | ||
+ | |||
<code><pre> | <code><pre> | ||
-- hw5b.sql | -- hw5b.sql | ||
Line 28: | Line 30: | ||
SELECT COUNT(*) FROM `Connections2` WHERE `Id1`="470"; | SELECT COUNT(*) FROM `Connections2` WHERE `Id1`="470"; | ||
− | -- Question 7: returns ids of all entries that are connected to entry ASCII | + | -- Question 7: returns ids of all entries that are connected to entry ASCII (this solution from Aigerim) |
− | SELECT `Id2` FROM `Connections2` WHERE | + | SELECT DISTINCT `Id2` FROM `Connections2` WHERE `Id1` IN ( |
− | + | `Id` from `Names` WHERE `Name`='ASCII' ) | |
− | -- Question 8: returns names of entries associated with ids from #7 | + | -- Question 8: returns names of entries associated with ids from #7 (this solution from Katie + Millie) |
− | SELECT `Name` FROM `Names` WHERE `Id` in | + | SELECT DISTINCT `Name` FROM `Names` WHERE `Id` in |
(SELECT `Id2` FROM `Connections2` WHERE `Id1` in | (SELECT `Id2` FROM `Connections2` WHERE `Id1` in | ||
(SELECT `Id` FROM `Names` WHERE `Name`="ASCII")); | (SELECT `Id` FROM `Names` WHERE `Name`="ASCII")); | ||
Line 46: | Line 48: | ||
==Question 10== | ==Question 10== | ||
+ | === Option 1=== | ||
+ | Here's an ingenious solution by Betsy. It shows some good research, but it is not full-proof, as by some strange fluke, there could be duplicate entries that are such that the sums might be the same, while the graph is still directed. But it's a very nice example of the use of variables... Nice job! | ||
+ | <code><pre> | ||
+ | -- Question 10 | ||
+ | -- Generates a query that returns @a = the sum of the column Id1, @b = the sum of the | ||
+ | -- column Id2 and @d = the difference of their sums.@d tells us whether Connections2 | ||
+ | -- is direct or not. If Connections2 is direct then Id1 and Id2 should have exactly the | ||
+ | -- same entries. So if we find the sum of the entries of each column, they should be the | ||
+ | -- same number. We can verify this by taking their difference. | ||
+ | -- Thus if @d == 0, Connections2 is direct. Otherwise, it is not. | ||
+ | |||
+ | |||
+ | SET @a := (SELECT SUM(`Id1`) FROM `Connections2`), | ||
+ | @b := (SELECT SUM(`Id2`) FROM `Connections2`); | ||
+ | SELECT @a, @b, @d := @a-@b; | ||
+ | |||
+ | </pre></code> | ||
− | * To answer this question it is easier to work with a smaller graph, figure out the queries, and then apply them to the Connections2 table. I am using a different database, and | + | ===Option 2=== |
+ | * To answer this question it is easier to work with a smaller graph, figure out the queries, and then apply them to the Connections2 table. Here I am using a different database, and where I create a new ''play'' table also called '''Connections2'''. It will simplify applying the queries I develop here to the original Connections2 table. | ||
− | * Here's the graph: | + | * Here's the graph I'm using: |
<code><pre> | <code><pre> | ||
Line 58: | Line 78: | ||
(4) -----> (5) | (4) -----> (5) | ||
− | |||
− | + | -- Notice that the graph is '''directed''', as there is an edge from 2 to 4, | |
+ | -- and from 4 to 5, but no edges from 4 to 2 or from 5 to 4. | ||
+ | |||
+ | -- Let's see how we can figure out the directed property of this graph. | ||
+ | -- There are probably some more sophisticated and or simpler ways to perform this, but this works! :-) | ||
− | |||
− | |||
CREATE TABLE IF NOT EXISTS `Connections2` ( | CREATE TABLE IF NOT EXISTS `Connections2` ( | ||
`id1` int(1) NOT NULL, | `id1` int(1) NOT NULL, | ||
Line 70: | Line 91: | ||
) ENGINE=MyISAM DEFAULT CHARSET=utf8; | ) ENGINE=MyISAM DEFAULT CHARSET=utf8; | ||
− | |||
− | |||
− | |||
INSERT INTO `Connections2` (`id1`, `id2`) VALUES | INSERT INTO `Connections2` (`id1`, `id2`) VALUES | ||
Line 83: | Line 101: | ||
(2, 1), | (2, 1), | ||
(2, 3); | (2, 3); | ||
+ | |||
+ | |||
+ | -- Notice that I have created '''duplicate''' edges (the last two), | ||
+ | -- to emulate the situation in the homework, where the original Connections2 table | ||
+ | -- also contains duplicates. | ||
Line 163: | Line 186: | ||
DROP TABLE tempConn; | DROP TABLE tempConn; | ||
DROP TABLE tempConn2; | DROP TABLE tempConn2; | ||
+ | |||
+ | -- Apply this algorithm to the real ''Connections2'' table, and find out that the original graph, too, is directed. | ||
</pre></code> | </pre></code> |
Latest revision as of 10:22, 26 October 2010
--D. Thiebaut 21:53, 24 October 2010 (UTC)
Problem #1
Questions 1-9
- Main answers from Alex, with contributions from Aigerim, Katie and Millie.
-- hw5b.sql
-- Alex Cheng (220a-ag)
-- 10/20/10
-- Answers to Homework 5, Part A
-- Question 1: returns all names and ids of records where name contains "processor"
SELECT * FROM `Names` WHERE `Name` LIKE "%processor%";
-- Question 2: returns all ids and names of records that contains "compute"
SELECT * FROM `Names` WHERE `Name` LIKE "%compute%" ORDER BY `Name`;
-- Question 3: same as #2 but returns last 10 entries of alphabetically sorted list
SELECT * FROM `Names` WHERE `Name` LIKE "%compute%" ORDER BY `Name` LIMIT 10;
-- Question 4: same as #2 but returns last 10 entries of alphabetically sorted list
SELECT * FROM `Names` WHERE `Name` LIKE "%compute%" ORDER BY `Name` DESC LIMIT 10;
-- Question 5: returns total number of records in table Names
SELECT COUNT(*) FROM `Names`;
-- Question 6: returns total number of records in table Connections2 whose Id1 field is 470
SELECT COUNT(*) FROM `Connections2` WHERE `Id1`="470";
-- Question 7: returns ids of all entries that are connected to entry ASCII (this solution from Aigerim)
SELECT DISTINCT `Id2` FROM `Connections2` WHERE `Id1` IN (
`Id` from `Names` WHERE `Name`='ASCII' )
-- Question 8: returns names of entries associated with ids from #7 (this solution from Katie + Millie)
SELECT DISTINCT `Name` FROM `Names` WHERE `Id` in
(SELECT `Id2` FROM `Connections2` WHERE `Id1` in
(SELECT `Id` FROM `Names` WHERE `Name`="ASCII"));
-- Question 9: same as #8 but returns last 5 entries when list is alpha ordered
SELECT `Name` FROM `Names` WHERE `Id` in (SELECT `Id2` FROM `Connections2`
WHERE `Id1` in
(SELECT `Id` FROM `Names` WHERE `Name`="ASCII"))
ORDER BY `Name` DESC LIMIT 5;
Question 10
Option 1
Here's an ingenious solution by Betsy. It shows some good research, but it is not full-proof, as by some strange fluke, there could be duplicate entries that are such that the sums might be the same, while the graph is still directed. But it's a very nice example of the use of variables... Nice job!
-- Question 10
-- Generates a query that returns @a = the sum of the column Id1, @b = the sum of the
-- column Id2 and @d = the difference of their sums.@d tells us whether Connections2
-- is direct or not. If Connections2 is direct then Id1 and Id2 should have exactly the
-- same entries. So if we find the sum of the entries of each column, they should be the
-- same number. We can verify this by taking their difference.
-- Thus if @d == 0, Connections2 is direct. Otherwise, it is not.
SET @a := (SELECT SUM(`Id1`) FROM `Connections2`),
@b := (SELECT SUM(`Id2`) FROM `Connections2`);
SELECT @a, @b, @d := @a-@b;
Option 2
- To answer this question it is easier to work with a smaller graph, figure out the queries, and then apply them to the Connections2 table. Here I am using a different database, and where I create a new play table also called Connections2. It will simplify applying the queries I develop here to the original Connections2 table.
- Here's the graph I'm using:
(1) <-----> (2) <-----> (3)
|
|
V
(4) -----> (5)
-- Notice that the graph is '''directed''', as there is an edge from 2 to 4,
-- and from 4 to 5, but no edges from 4 to 2 or from 5 to 4.
-- Let's see how we can figure out the directed property of this graph.
-- There are probably some more sophisticated and or simpler ways to perform this, but this works! :-)
CREATE TABLE IF NOT EXISTS `Connections2` (
`id1` int(1) NOT NULL,
`id2` int(1) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
INSERT INTO `Connections2` (`id1`, `id2`) VALUES
(1, 2),
(2, 1),
(2, 3),
(3, 2),
(2, 4),
(4, 5),
(2, 1),
(2, 3);
-- Notice that I have created '''duplicate''' edges (the last two),
-- to emulate the situation in the homework, where the original Connections2 table
-- also contains duplicates.
-- let's figure out how many rows we have:
SELECT count( * ) FROM `Connections2`;
-- output = 8
-- how many of these are unique?
SELECT count( distinct Id1, Id2) FROM `Connections2`;
-- output = 6 , so we have 6 unique edges in our graph
-- if this were an odd number we'd know for sure that the
-- graph is directed, as it would mean we'd have one edge
-- that does not have an inverse. Here we could have two
-- such edges, in which case we still don't know
-- Let's create a temporary table that is like connections2
-- but without duplicates
CREATE TABLE tempConn SELECT DISTINCT * FROM Connections2;
-- contents is
-- 1 2
-- 2 1
-- 2 3
-- 3 2
-- 2 4
-- 4 5
-- create another version of tempConn, i.e.
CREATE TABLE tempConn2 SELECT * FROM tempConn;
-- contents is
-- 1 2
-- 2 1
-- 2 3
-- 3 2
-- 2 4
-- 4 5
-- Join the two tables on rows where Id1, Id2 of one table is equal
-- to the Id2, Id1 of the other. In other words, join together the
-- edges that are going from Node A to Node B, and from Node B to Node A
select tempConn.Id1, tempConn.Id2, tempConn2.Id1 as Id1b, tempConn2.Id2 as Id2b
from tempConn join tempConn2
on tempConn.Id2 = tempConn2.Id1 and tempConn.Id1= tempConn2.Id2;
-- output
-- 2 1 1 2
-- 1 2 2 1
-- 3 2 2 3
-- 2 3 3 2
-- Now do the same thing with an outer join: If there are edges that
-- do not have reciprocating edges, then NULLs will show up in the table
-- Let's try a left join:
select tempConn.Id1, tempConn.Id2, tempConn2.Id1 as Id1b, tempConn2.Id2 as Id2b
from tempConn left join tempConn2
on tempConn.Id2 = tempConn2.Id1 and tempConn.Id1= tempConn2.Id2;
-- output
-- 2 1 1 2
-- 1 2 2 1
-- 3 2 2 3
-- 2 3 3 2
-- 2 4 NULL NULL
-- 4 5 NULL NULL
-- Ah! There are some edges that do not have matching reciprocal edges.
-- The graph IS therefore a directed graph!
-- We can delete the extra tables
DROP TABLE tempConn;
DROP TABLE tempConn2;
-- Apply this algorithm to the real ''Connections2'' table, and find out that the original graph, too, is directed.