Difference between revisions of "CSC220 Database Management Systems"

From dftwiki3
Jump to: navigation, search
(MySQL Server)
(Example)
 
(8 intermediate revisions by the same user not shown)
Line 688: Line 688:
  
 
[[Image:CSC220 MySql Server.png | 600px]]
 
[[Image:CSC220 MySql Server.png | 600px]]
 +
* Each client is associated a thread per connection
 +
* Each thread runs on 1 core only
 +
* Cache contains SELECT queries and their result
 +
* Parser generates a parse tree
 +
* Storage engines include MyIsam and InnoDB
 +
 +
==File System for myIsam Engine==
 +
* 1 database = 1 folder
 +
* 1 table = 2 files: data + index
 +
* located on /usr/local/mysql/data on '''xgridmac'''
 +
 +
<tanbox>
 +
* DEMO
 +
** Show where files are located.
 +
** show index files
 +
** show log files
 +
** Show how to export databases
 +
</tanbox>
 +
 +
* largest database table limited by file size: unlimited on Linux.  http://en.wikipedia.org/wiki/Comparison_of_file_systems#Limits
 +
* files are platform independent (can be copied!)
 +
* tables can be compressed (for read-only operations, s.a. CD-ROM)
 +
 +
=Transactions=
 +
 +
* Under MyIsam, a query is a transaction.
 +
 +
=Exercises=
 +
 +
* [[CSC220 Exercises with MySQL | Exercise Page]]
 +
 +
=Views=
 +
 +
* Use the Dept, Class, Enrollment tables (in 220a_example4 db) as example
 +
* Create a view that shows all dept classes and their enrollment
 +
 +
;Step 1
 +
 +
  create view totalEnroll as select * from `Dept` join `Course` on `Dept`.`Id`=`Course`.`deptId`
 +
            join Enrollment on `Course`.`Id`=`Enrollment`.`Id`
 +
 +
:Generates an error on duplicate Id...  ==> replace * with the actual list of fields wanted:
 +
 +
;Step 2: specify fields
 +
 +
  create view totalEnroll as select `Course`.`Id`, `Dept`.`Name`, `Course`.`Name`,
 +
          `Enrollment`.`Count` from `Dept` join `Course` on `Dept`.`Id`=`Course`.`deptId`
 +
          join Enrollment on `Course`.`Id`=`Enrollment`.`Id`
 +
 +
:Generates another error on duplicate field name `Name` (it keeps the name of the field, not the name of the table) ==> use aliases
 +
 +
;Step 3: use aliases
 +
 +
  create view totalEnroll as select `Course`.`Id`, `Dept`.`Name` as `DName`, `Course`.`Name` as `CName`,
 +
          `Enrollment`.`Count` from `Dept` join `Course` on `Dept`.`Id`=`Course`.`deptId`
 +
          join Enrollment on `Course`.`Id`=`Enrollment`.`Id`
 +
 +
;Step 4: create new aggregate fields:
 +
 +
  create view totalEnroll as select `Course`.`Id`, `Dept`.`Name` as `DName`, `Course`.`Name` as `CName`,
 +
          '''concat( `Dept`.`Name`, `Course`.`Name` ) as `DCName`''', `Enrollment`.`Count` from `Dept`
 +
          join `Course` on `Dept`.`Id`=`Course`.`deptId`
 +
          join Enrollment on `Course`.`Id`=`Enrollment`.`Id`
 +
 +
 
 +
==Rules on Views==
 +
 +
* Technically, '''Create View''' statements may not contain "ORDER BY" clause (although MySQL ignores it).  If one selects from a view that contains an '''ORDER BY''' with a query that contains its own '''ORDER BY''', the view's '''ORDER BY''' is ignored.
 +
 +
* The select query used to create a view '''cannot contain subqueries''' in its '''FROM''' clause
 +
 +
* The table used in a view must exist when the view is created
 +
 +
=Index=
 +
 +
* First difference between PRIMARY KEY and INDEX.  A primary key is a logical entity regarding records in a table.  It usually is implemented as a UNIQUE INDEX.  An INDEX can be unique or not and speeds up access to a specific field in the table.
 +
 +
* Indexes are implemented as B-Trees, see http://www.bluerwhite.org/btree/ for review
 +
 +
==Example==
 +
 +
* Use database 220a_example5
 +
* Use this program to create it table user with a million entries: [[CSC220 fillTable.php | fillTable.php]]
 +
* Look at the table `users`  (verify it doesn't have any indexes)
 +
* Record its size
 +
* Perform some queries and check the execution time
 +
 +
SELECT * FROM `users` order by `Name` limit 10        ''(0.69 sec)''
 +
 +
SELECT * FROM `users` order by `Qty` limit 10            ''(twice as fast as previous one, why?)''
 +
 +
SELECT * FROM `users` where 100 < `Qty` and `Qty`<10000        ''(0.3sec)''
 +
 +
SELECT COUNT(*) FROM `users` where "2" < `Name` and `Name`< "3"   
 +
 +
* Now create an Index on `Name` and on `Qty`
 +
** Look at Structure, add index (non unique)
 +
** Why is it spending so long?
 +
* Check size of table
 +
* Redo same queries as above and check e
 +
 +
=References=
 +
 +
*  [http://www.faqs.org/docs/ppbook/x1428.htm Anatomy of a SQL query]
 +
* Joins
 +
** [http://dev.mysql.com/doc/refman/5.0/en/join.html Joining tables]
 +
* Indexes
 +
** [http://dev.mysql.com/doc/refman/5.0/en/create-index.html Creating an '''Index''']
 +
** [http://dev.mysql.com/doc/refman/5.1/en/mysql-indexes.html How MySQL uses '''Indexes''']
 +
** [http://www.bluerwhite.org/btree/ B-Trees] (used for indexes)
 +
* Views
 +
** [http://dev.mysql.com/doc/refman/5.0/en/create-view.html '''Creating''' a ''View'']
 +
** [http://dev.mysql.com/doc/refman/5.0/en/views.html '''Using''' a ''View'']
  
 
<br />
 
<br />

Latest revision as of 08:31, 20 October 2010

--D. Thiebaut 12:08, 6 October 2010 (UTC)


Main Reference

DBMSTheCompleteBook.jpg










Important Concepts

  • DBMS = database management system






Definition
A DBMS is a tool for creating and managing large amounts of data efficiently and allowing it to persist over long periods of time, safely.




  • 1970: seminal paper by Ted Codd, where proposition is that DBMS users should see the data as tables, called relations


CSC220 DBMS1.jpg



  • ==> Relational database systems
  • SQL = Structured Query Language
  • 1990: relational database systems are the norm
  • today: DBMS have evolved to new forms (cloud).
  • example of large databases:
    • Google
    • Satellite data
    • Social Networks (Facebook)
    • YouTube videos
  • Two kinds of users of a DBMS: Admin and Users


CSC220 DBMS2.jpg


    • Admin create/modify organization of the data (table size, data size, constraints on the data values) = Schema of the database.
    • Users query the DB to lookup, search, insert new data, or delete new data. The result (SQL) is presented in the form of temporary tables.



Concepts of Transcation and Query
A transaction is a group of queries (possibly just one) that should operate atomically and in isolation from one another. A complete transaction should be durable. This means that logging and recovery processes are an integral part of a DBMS.



  • DBMS = { data, metadata, logs, statistics, indexes + data-structures }


CSC220 DBMS3.jpg


  • Measure of Goodness: ACID properties of a transaction
A
atomicity = all-or-nothing
C
consistency: some data cannot be negative, for example ==> The DBMS must ensure data consistency after the execution of a transaction.
I
Isolation: the transactions should appear to operate as if no other transactions are/were operating at the same time.
D
durability: Once the transaction has exected, the effect of this transaction on the data must never be lost.



  • MySql is in partial compliance of ACID when using the default myIsam storage engine. InnoDB, BDB, and cluster storage engines, which are also available with MySql provide ACID compliance, but with performance degredation.

Example

  • Users, purchasing items from a store.
  • 3 tables: users, orders, items

Create Tables

CREATE TABLE `users` (
 `userId` INT(1) NOT NULL AUTO_INCREMENT,
 `userName` VARCHAR(100) DEFAULT NULL,
 PRIMARY KEY  (`userId`)
) ENGINE=MYISAM DEFAULT CHARSET=utf8;


CREATE TABLE `orders` (
 `orderId` INT(1) NOT NULL AUTO_INCREMENT,
 `userId` INT(1) DEFAULT NULL,
 `itemId` INT(1) DEFAULT NULL,
 PRIMARY KEY  (`orderId`)
) ENGINE=MYISAM DEFAULT CHARSET=utf8;


CREATE TABLE `items` (
 `itemId` INT(1) NOT NULL AUTO_INCREMENT,
 `itemName` VARCHAR(100) DEFAULT NULL,
 `price` FLOAT DEFAULT NULL,
 PRIMARY KEY  (`itemId`)
) ENGINE=MYISAM DEFAULT CHARSET=utf8;

Insert Data

  • Add 5 users to the database (Marie, John, Manon, Rob, Elaine)
INSERT INTO `users`(`userId`,`userName`) VALUES ( NULL,'Marie');
INSERT INTO `users`(`userId`,`userName`) VALUES ( NULL,'John');
INSERT INTO `users`(`userId`,`userName`) VALUES ( NULL,'Manon');
INSERT INTO `users`(`userId`,`userName`) VALUES ( NULL,'Rob');
INSERT INTO `users`(`userId`,`userName`) VALUES ( NULL,'Elaine');

  • Add 6 items and their price (pencil, eraser, paper, ink, ruler, markers)
INSERT INTO `items`(`itemId`,`itemName`,`price`) VALUES ( NULL,'pencil','1');
INSERT INTO `items`(`itemId`,`itemName`,`price`) VALUES ( NULL,'erase','0.5');
UPDATE `items` SET `itemId`='2',`itemName`='eraser',`price`='0.5' WHERE `itemId`='2';
INSERT INTO `items`(`itemId`,`itemName`,`price`) VALUES ( NULL,'paper','5');
INSERT INTO `items`(`itemId`,`itemName`,`price`) VALUES ( NULL,'ink','30');
INSERT INTO `items`(`itemId`,`itemName`,`price`) VALUES ( NULL,'ruler','1.5');
INSERT INTO `items`(`itemId`,`itemName`,`price`) VALUES ( NULL,'markers','2.5');

  • Create some user orders of items. e.g. Marie buys a pencil, an eraser, paper, and ink.
INSERT INTO `orders`(`orderId`,`userId`,`itemId`) VALUES ( NULL,'1','1');
INSERT INTO `orders`(`orderId`,`userId`,`itemId`) VALUES ( NULL,'1','2');
INSERT INTO `orders`(`orderId`,`userId`,`itemId`) VALUES ( NULL,'1','3');
INSERT INTO `orders`(`orderId`,`userId`,`itemId`) VALUES ( NULL,'1','4');
INSERT INTO `orders`(`orderId`,`userId`,`itemId`) VALUES ( NULL,'2','1');
INSERT INTO `orders`(`orderId`,`userId`,`itemId`) VALUES ( NULL,'3','1');
INSERT INTO `orders`(`orderId`,`userId`,`itemId`) VALUES ( NULL,'3','4');
INSERT INTO `orders`(`orderId`,`userId`,`itemId`) VALUES ( NULL,'5','4');
INSERT INTO `orders`(`orderId`,`userId`,`itemId`) VALUES ( NULL,'5','3');

Table-related operations

  • to delete a table: "DROP TABLE"
  • to erase a table without deleting it: "TRUNCATE TABLE"

Some queries

SELECT * FROM `items`;

SELECT `itemName` FROM `items`;

SELECT `itemName` FROM `items`WHERE `itemId`=2;

SELECT * FROM `users` ORDER BY `userName` ASC;

SELECT * FROM `orders` WHERE `userId`=1;

SELECT `itemName` FROM `items` WHERE `itemId` IN ( 1, 2, 3 );

SELECT `itemName` FROM `items` WHERE `itemId` IN ( SELECT `itemId` FROM `orders` WHERE `userId`=1 );

SELECT `itemName` FROM `items` WHERE `itemId` IN ( 
            SELECT `itemId` FROM `orders` WHERE `userId` IN  ( 
                         SELECT `userId` FROM `users` WHERE `userName`='Marie') );


Anatomy of a Query

Command

Description

CREATE DATABASE

Creates a new database

CREATE INDEX

Creates a new index on a table column

CREATE SEQUENCE

Creates a new sequence in an existing database

CREATE TABLE

Creates a new table in an existing database

CREATE TRIGGER

Creates a new trigger definition

CREATE VIEW

Creates a new view on an existing table

SELECT

Retrieves records from a table

INSERT

Adds one or more new records into a table

UPDATE

Modifies the data in existing table records

DELETE

Removes existing records from a table

DROP DATABASE

Destroys an existing database

DROP INDEX

Removes a column index from an existing table

DROP SEQUENCE

Destroys an existing sequence generator

DROP TABLE

Destroys an existing table

DROP TRIGGER

Destroys an existing trigger definition

DROP VIEW

Destroys an existing table view

CREATE USER

Adds a new PostgreSQL user account to the system

ALTER USER

Modifies an existing PostgreSQL user account

DROP USER

Removes an existing PostgreSQL user account

GRANT

Grant rights on a database object to a user

REVOKE

Deny rights on a database object from a user

CREATE FUNCTION

Creates a new SQL function within a database

CREATE LANGUAGE

Creates a new language definition within a database

CREATE OPERATOR

Creates a new SQL operator within a database

CREATE TYPE

Creates a new SQL data type within a database

MySQL Server

CSC220 MySql Server.png

  • Each client is associated a thread per connection
  • Each thread runs on 1 core only
  • Cache contains SELECT queries and their result
  • Parser generates a parse tree
  • Storage engines include MyIsam and InnoDB

File System for myIsam Engine

  • 1 database = 1 folder
  • 1 table = 2 files: data + index
  • located on /usr/local/mysql/data on xgridmac
  • DEMO
    • Show where files are located.
    • show index files
    • show log files
    • Show how to export databases

Transactions

  • Under MyIsam, a query is a transaction.

Exercises

Views

  • Use the Dept, Class, Enrollment tables (in 220a_example4 db) as example
  • Create a view that shows all dept classes and their enrollment
Step 1
  create view totalEnroll as select * from `Dept` join `Course` on `Dept`.`Id`=`Course`.`deptId` 
           join Enrollment on `Course`.`Id`=`Enrollment`.`Id`
Generates an error on duplicate Id... ==> replace * with the actual list of fields wanted:
Step 2
specify fields
  create view totalEnroll as select `Course`.`Id`, `Dept`.`Name`, `Course`.`Name`, 
         `Enrollment`.`Count` from `Dept` join `Course` on `Dept`.`Id`=`Course`.`deptId` 
         join Enrollment on `Course`.`Id`=`Enrollment`.`Id`
Generates another error on duplicate field name `Name` (it keeps the name of the field, not the name of the table) ==> use aliases
Step 3
use aliases
  create view totalEnroll as select `Course`.`Id`, `Dept`.`Name` as `DName`, `Course`.`Name` as `CName`, 
         `Enrollment`.`Count` from `Dept` join `Course` on `Dept`.`Id`=`Course`.`deptId` 
         join Enrollment on `Course`.`Id`=`Enrollment`.`Id`
Step 4
create new aggregate fields:
  create view totalEnroll as select `Course`.`Id`, `Dept`.`Name` as `DName`, `Course`.`Name` as `CName`, 
         concat( `Dept`.`Name`, `Course`.`Name` ) as `DCName`, `Enrollment`.`Count` from `Dept` 
         join `Course` on `Dept`.`Id`=`Course`.`deptId` 
         join Enrollment on `Course`.`Id`=`Enrollment`.`Id`

 

Rules on Views

  • Technically, Create View statements may not contain "ORDER BY" clause (although MySQL ignores it). If one selects from a view that contains an ORDER BY with a query that contains its own ORDER BY, the view's ORDER BY is ignored.
  • The select query used to create a view cannot contain subqueries in its FROM clause
  • The table used in a view must exist when the view is created

Index

  • First difference between PRIMARY KEY and INDEX. A primary key is a logical entity regarding records in a table. It usually is implemented as a UNIQUE INDEX. An INDEX can be unique or not and speeds up access to a specific field in the table.

Example

  • Use database 220a_example5
  • Use this program to create it table user with a million entries: fillTable.php
  • Look at the table `users` (verify it doesn't have any indexes)
  • Record its size
  • Perform some queries and check the execution time
SELECT * FROM `users` order by `Name` limit 10         (0.69 sec)

SELECT * FROM `users` order by `Qty` limit 10            (twice as fast as previous one, why?)

SELECT * FROM `users` where 100 < `Qty` and `Qty`<10000        (0.3sec)

SELECT COUNT(*) FROM `users` where "2" < `Name` and `Name`< "3"     
  • Now create an Index on `Name` and on `Qty`
    • Look at Structure, add index (non unique)
    • Why is it spending so long?
  • Check size of table
  • Redo same queries as above and check e

References