Difference between revisions of "CSC220 Database Management Systems"
(ql) |
(→Example) |
||
(9 intermediate revisions by the same user not shown) | |||
Line 687: | Line 687: | ||
=MySQL Server= | =MySQL Server= | ||
− | [[Image:CSC220 MySql Server.png]] | + | [[Image:CSC220 MySql Server.png | 600px]] |
+ | * Each client is associated a thread per connection | ||
+ | * Each thread runs on 1 core only | ||
+ | * Cache contains SELECT queries and their result | ||
+ | * Parser generates a parse tree | ||
+ | * Storage engines include MyIsam and InnoDB | ||
+ | |||
+ | ==File System for myIsam Engine== | ||
+ | * 1 database = 1 folder | ||
+ | * 1 table = 2 files: data + index | ||
+ | * located on /usr/local/mysql/data on '''xgridmac''' | ||
+ | |||
+ | <tanbox> | ||
+ | * DEMO | ||
+ | ** Show where files are located. | ||
+ | ** show index files | ||
+ | ** show log files | ||
+ | ** Show how to export databases | ||
+ | </tanbox> | ||
+ | |||
+ | * largest database table limited by file size: unlimited on Linux. http://en.wikipedia.org/wiki/Comparison_of_file_systems#Limits | ||
+ | * files are platform independent (can be copied!) | ||
+ | * tables can be compressed (for read-only operations, s.a. CD-ROM) | ||
+ | |||
+ | =Transactions= | ||
+ | |||
+ | * Under MyIsam, a query is a transaction. | ||
+ | |||
+ | =Exercises= | ||
+ | |||
+ | * [[CSC220 Exercises with MySQL | Exercise Page]] | ||
+ | |||
+ | =Views= | ||
+ | |||
+ | * Use the Dept, Class, Enrollment tables (in 220a_example4 db) as example | ||
+ | * Create a view that shows all dept classes and their enrollment | ||
+ | |||
+ | ;Step 1 | ||
+ | |||
+ | create view totalEnroll as select * from `Dept` join `Course` on `Dept`.`Id`=`Course`.`deptId` | ||
+ | join Enrollment on `Course`.`Id`=`Enrollment`.`Id` | ||
+ | |||
+ | :Generates an error on duplicate Id... ==> replace * with the actual list of fields wanted: | ||
+ | |||
+ | ;Step 2: specify fields | ||
+ | |||
+ | create view totalEnroll as select `Course`.`Id`, `Dept`.`Name`, `Course`.`Name`, | ||
+ | `Enrollment`.`Count` from `Dept` join `Course` on `Dept`.`Id`=`Course`.`deptId` | ||
+ | join Enrollment on `Course`.`Id`=`Enrollment`.`Id` | ||
+ | |||
+ | :Generates another error on duplicate field name `Name` (it keeps the name of the field, not the name of the table) ==> use aliases | ||
+ | |||
+ | ;Step 3: use aliases | ||
+ | |||
+ | create view totalEnroll as select `Course`.`Id`, `Dept`.`Name` as `DName`, `Course`.`Name` as `CName`, | ||
+ | `Enrollment`.`Count` from `Dept` join `Course` on `Dept`.`Id`=`Course`.`deptId` | ||
+ | join Enrollment on `Course`.`Id`=`Enrollment`.`Id` | ||
+ | |||
+ | ;Step 4: create new aggregate fields: | ||
+ | |||
+ | create view totalEnroll as select `Course`.`Id`, `Dept`.`Name` as `DName`, `Course`.`Name` as `CName`, | ||
+ | '''concat( `Dept`.`Name`, `Course`.`Name` ) as `DCName`''', `Enrollment`.`Count` from `Dept` | ||
+ | join `Course` on `Dept`.`Id`=`Course`.`deptId` | ||
+ | join Enrollment on `Course`.`Id`=`Enrollment`.`Id` | ||
+ | |||
+ | |||
+ | ==Rules on Views== | ||
+ | |||
+ | * Technically, '''Create View''' statements may not contain "ORDER BY" clause (although MySQL ignores it). If one selects from a view that contains an '''ORDER BY''' with a query that contains its own '''ORDER BY''', the view's '''ORDER BY''' is ignored. | ||
+ | |||
+ | * The select query used to create a view '''cannot contain subqueries''' in its '''FROM''' clause | ||
+ | |||
+ | * The table used in a view must exist when the view is created | ||
+ | |||
+ | =Index= | ||
+ | |||
+ | * First difference between PRIMARY KEY and INDEX. A primary key is a logical entity regarding records in a table. It usually is implemented as a UNIQUE INDEX. An INDEX can be unique or not and speeds up access to a specific field in the table. | ||
+ | |||
+ | * Indexes are implemented as B-Trees, see http://www.bluerwhite.org/btree/ for review | ||
+ | |||
+ | ==Example== | ||
+ | |||
+ | * Use database 220a_example5 | ||
+ | * Use this program to create it table user with a million entries: [[CSC220 fillTable.php | fillTable.php]] | ||
+ | * Look at the table `users` (verify it doesn't have any indexes) | ||
+ | * Record its size | ||
+ | * Perform some queries and check the execution time | ||
+ | |||
+ | SELECT * FROM `users` order by `Name` limit 10 ''(0.69 sec)'' | ||
+ | |||
+ | SELECT * FROM `users` order by `Qty` limit 10 ''(twice as fast as previous one, why?)'' | ||
+ | |||
+ | SELECT * FROM `users` where 100 < `Qty` and `Qty`<10000 ''(0.3sec)'' | ||
+ | |||
+ | SELECT COUNT(*) FROM `users` where "2" < `Name` and `Name`< "3" | ||
+ | |||
+ | * Now create an Index on `Name` and on `Qty` | ||
+ | ** Look at Structure, add index (non unique) | ||
+ | ** Why is it spending so long? | ||
+ | * Check size of table | ||
+ | * Redo same queries as above and check e | ||
+ | |||
+ | =References= | ||
+ | |||
+ | * [http://www.faqs.org/docs/ppbook/x1428.htm Anatomy of a SQL query] | ||
+ | * Joins | ||
+ | ** [http://dev.mysql.com/doc/refman/5.0/en/join.html Joining tables] | ||
+ | * Indexes | ||
+ | ** [http://dev.mysql.com/doc/refman/5.0/en/create-index.html Creating an '''Index'''] | ||
+ | ** [http://dev.mysql.com/doc/refman/5.1/en/mysql-indexes.html How MySQL uses '''Indexes'''] | ||
+ | ** [http://www.bluerwhite.org/btree/ B-Trees] (used for indexes) | ||
+ | * Views | ||
+ | ** [http://dev.mysql.com/doc/refman/5.0/en/create-view.html '''Creating''' a ''View''] | ||
+ | ** [http://dev.mysql.com/doc/refman/5.0/en/views.html '''Using''' a ''View''] | ||
<br /> | <br /> |
Latest revision as of 08:31, 20 October 2010
--D. Thiebaut 12:08, 6 October 2010 (UTC)
Contents
Main Reference
- The information here is taken from Chapter 1 (required reading) of Garcia-Molina, Ullman and Widom's Database Systems: the Complete Book. Their Chapter 2, which is also online, is great reading.
Important Concepts
- DBMS = database management system
- Definition
- A DBMS is a tool for creating and managing large amounts of data efficiently and allowing it to persist over long periods of time, safely.
- 1970: seminal paper by Ted Codd, where proposition is that DBMS users should see the data as tables, called relations
- ==> Relational database systems
- SQL = Structured Query Language
- 1990: relational database systems are the norm
- today: DBMS have evolved to new forms (cloud).
- example of large databases:
- Satellite data
- Social Networks (Facebook)
- YouTube videos
- Two kinds of users of a DBMS: Admin and Users
- Admin create/modify organization of the data (table size, data size, constraints on the data values) = Schema of the database.
- Users query the DB to lookup, search, insert new data, or delete new data. The result (SQL) is presented in the form of temporary tables.
- Concepts of Transcation and Query
- A transaction is a group of queries (possibly just one) that should operate atomically and in isolation from one another. A complete transaction should be durable. This means that logging and recovery processes are an integral part of a DBMS.
- DBMS = { data, metadata, logs, statistics, indexes + data-structures }
- Measure of Goodness: ACID properties of a transaction
- A
- atomicity = all-or-nothing
- C
- consistency: some data cannot be negative, for example ==> The DBMS must ensure data consistency after the execution of a transaction.
- I
- Isolation: the transactions should appear to operate as if no other transactions are/were operating at the same time.
- D
- durability: Once the transaction has exected, the effect of this transaction on the data must never be lost.
- MySql is in partial compliance of ACID when using the default myIsam storage engine. InnoDB, BDB, and cluster storage engines, which are also available with MySql provide ACID compliance, but with performance degredation.
Example
- Users, purchasing items from a store.
- 3 tables: users, orders, items
Create Tables
CREATE TABLE `users` ( `userId` INT(1) NOT NULL AUTO_INCREMENT, `userName` VARCHAR(100) DEFAULT NULL, PRIMARY KEY (`userId`) ) ENGINE=MYISAM DEFAULT CHARSET=utf8; CREATE TABLE `orders` ( `orderId` INT(1) NOT NULL AUTO_INCREMENT, `userId` INT(1) DEFAULT NULL, `itemId` INT(1) DEFAULT NULL, PRIMARY KEY (`orderId`) ) ENGINE=MYISAM DEFAULT CHARSET=utf8; CREATE TABLE `items` ( `itemId` INT(1) NOT NULL AUTO_INCREMENT, `itemName` VARCHAR(100) DEFAULT NULL, `price` FLOAT DEFAULT NULL, PRIMARY KEY (`itemId`) ) ENGINE=MYISAM DEFAULT CHARSET=utf8;
Insert Data
- Add 5 users to the database (Marie, John, Manon, Rob, Elaine)
INSERT INTO `users`(`userId`,`userName`) VALUES ( NULL,'Marie'); INSERT INTO `users`(`userId`,`userName`) VALUES ( NULL,'John'); INSERT INTO `users`(`userId`,`userName`) VALUES ( NULL,'Manon'); INSERT INTO `users`(`userId`,`userName`) VALUES ( NULL,'Rob'); INSERT INTO `users`(`userId`,`userName`) VALUES ( NULL,'Elaine');
- Add 6 items and their price (pencil, eraser, paper, ink, ruler, markers)
INSERT INTO `items`(`itemId`,`itemName`,`price`) VALUES ( NULL,'pencil','1'); INSERT INTO `items`(`itemId`,`itemName`,`price`) VALUES ( NULL,'erase','0.5'); UPDATE `items` SET `itemId`='2',`itemName`='eraser',`price`='0.5' WHERE `itemId`='2'; INSERT INTO `items`(`itemId`,`itemName`,`price`) VALUES ( NULL,'paper','5'); INSERT INTO `items`(`itemId`,`itemName`,`price`) VALUES ( NULL,'ink','30'); INSERT INTO `items`(`itemId`,`itemName`,`price`) VALUES ( NULL,'ruler','1.5'); INSERT INTO `items`(`itemId`,`itemName`,`price`) VALUES ( NULL,'markers','2.5');
- Create some user orders of items. e.g. Marie buys a pencil, an eraser, paper, and ink.
INSERT INTO `orders`(`orderId`,`userId`,`itemId`) VALUES ( NULL,'1','1'); INSERT INTO `orders`(`orderId`,`userId`,`itemId`) VALUES ( NULL,'1','2'); INSERT INTO `orders`(`orderId`,`userId`,`itemId`) VALUES ( NULL,'1','3'); INSERT INTO `orders`(`orderId`,`userId`,`itemId`) VALUES ( NULL,'1','4'); INSERT INTO `orders`(`orderId`,`userId`,`itemId`) VALUES ( NULL,'2','1'); INSERT INTO `orders`(`orderId`,`userId`,`itemId`) VALUES ( NULL,'3','1'); INSERT INTO `orders`(`orderId`,`userId`,`itemId`) VALUES ( NULL,'3','4'); INSERT INTO `orders`(`orderId`,`userId`,`itemId`) VALUES ( NULL,'5','4'); INSERT INTO `orders`(`orderId`,`userId`,`itemId`) VALUES ( NULL,'5','3');
- to delete a table: "DROP TABLE"
- to erase a table without deleting it: "TRUNCATE TABLE"
Some queries
SELECT * FROM `items`; SELECT `itemName` FROM `items`; SELECT `itemName` FROM `items`WHERE `itemId`=2; SELECT * FROM `users` ORDER BY `userName` ASC; SELECT * FROM `orders` WHERE `userId`=1; SELECT `itemName` FROM `items` WHERE `itemId` IN ( 1, 2, 3 ); SELECT `itemName` FROM `items` WHERE `itemId` IN ( SELECT `itemId` FROM `orders` WHERE `userId`=1 ); SELECT `itemName` FROM `items` WHERE `itemId` IN ( SELECT `itemId` FROM `orders` WHERE `userId` IN ( SELECT `userId` FROM `users` WHERE `userName`='Marie') );
Anatomy of a Query
Command | Description |
---|---|
CREATE DATABASE | Creates a new database |
CREATE INDEX | Creates a new index on a table column |
CREATE SEQUENCE | Creates a new sequence in an existing database |
CREATE TABLE | Creates a new table in an existing database |
CREATE TRIGGER | Creates a new trigger definition |
CREATE VIEW | Creates a new view on an existing table |
SELECT | Retrieves records from a table |
INSERT | Adds one or more new records into a table |
UPDATE | Modifies the data in existing table records |
DELETE | Removes existing records from a table |
DROP DATABASE | Destroys an existing database |
DROP INDEX | Removes a column index from an existing table |
DROP SEQUENCE | Destroys an existing sequence generator |
DROP TABLE | Destroys an existing table |
DROP TRIGGER | Destroys an existing trigger definition |
DROP VIEW | Destroys an existing table view |
CREATE USER | Adds a new PostgreSQL user account to the system |
ALTER USER | Modifies an existing PostgreSQL user account |
DROP USER | Removes an existing PostgreSQL user account |
GRANT | Grant rights on a database object to a user |
REVOKE | Deny rights on a database object from a user |
CREATE FUNCTION | Creates a new SQL function within a database |
CREATE LANGUAGE | Creates a new language definition within a database |
CREATE OPERATOR | Creates a new SQL operator within a database |
CREATE TYPE | Creates a new SQL data type within a database |
MySQL Server
- Each client is associated a thread per connection
- Each thread runs on 1 core only
- Cache contains SELECT queries and their result
- Parser generates a parse tree
- Storage engines include MyIsam and InnoDB
File System for myIsam Engine
- 1 database = 1 folder
- 1 table = 2 files: data + index
- located on /usr/local/mysql/data on xgridmac
- DEMO
- Show where files are located.
- show index files
- show log files
- Show how to export databases
- largest database table limited by file size: unlimited on Linux. http://en.wikipedia.org/wiki/Comparison_of_file_systems#Limits
- files are platform independent (can be copied!)
- tables can be compressed (for read-only operations, s.a. CD-ROM)
Transactions
- Under MyIsam, a query is a transaction.
Exercises
Views
- Use the Dept, Class, Enrollment tables (in 220a_example4 db) as example
- Create a view that shows all dept classes and their enrollment
- Step 1
create view totalEnroll as select * from `Dept` join `Course` on `Dept`.`Id`=`Course`.`deptId` join Enrollment on `Course`.`Id`=`Enrollment`.`Id`
- Generates an error on duplicate Id... ==> replace * with the actual list of fields wanted:
- Step 2
- specify fields
create view totalEnroll as select `Course`.`Id`, `Dept`.`Name`, `Course`.`Name`, `Enrollment`.`Count` from `Dept` join `Course` on `Dept`.`Id`=`Course`.`deptId` join Enrollment on `Course`.`Id`=`Enrollment`.`Id`
- Generates another error on duplicate field name `Name` (it keeps the name of the field, not the name of the table) ==> use aliases
- Step 3
- use aliases
create view totalEnroll as select `Course`.`Id`, `Dept`.`Name` as `DName`, `Course`.`Name` as `CName`, `Enrollment`.`Count` from `Dept` join `Course` on `Dept`.`Id`=`Course`.`deptId` join Enrollment on `Course`.`Id`=`Enrollment`.`Id`
- Step 4
- create new aggregate fields:
create view totalEnroll as select `Course`.`Id`, `Dept`.`Name` as `DName`, `Course`.`Name` as `CName`, concat( `Dept`.`Name`, `Course`.`Name` ) as `DCName`, `Enrollment`.`Count` from `Dept` join `Course` on `Dept`.`Id`=`Course`.`deptId` join Enrollment on `Course`.`Id`=`Enrollment`.`Id`
Rules on Views
- Technically, Create View statements may not contain "ORDER BY" clause (although MySQL ignores it). If one selects from a view that contains an ORDER BY with a query that contains its own ORDER BY, the view's ORDER BY is ignored.
- The select query used to create a view cannot contain subqueries in its FROM clause
- The table used in a view must exist when the view is created
Index
- First difference between PRIMARY KEY and INDEX. A primary key is a logical entity regarding records in a table. It usually is implemented as a UNIQUE INDEX. An INDEX can be unique or not and speeds up access to a specific field in the table.
- Indexes are implemented as B-Trees, see http://www.bluerwhite.org/btree/ for review
Example
- Use database 220a_example5
- Use this program to create it table user with a million entries: fillTable.php
- Look at the table `users` (verify it doesn't have any indexes)
- Record its size
- Perform some queries and check the execution time
SELECT * FROM `users` order by `Name` limit 10 (0.69 sec) SELECT * FROM `users` order by `Qty` limit 10 (twice as fast as previous one, why?) SELECT * FROM `users` where 100 < `Qty` and `Qty`<10000 (0.3sec) SELECT COUNT(*) FROM `users` where "2" < `Name` and `Name`< "3"
- Now create an Index on `Name` and on `Qty`
- Look at Structure, add index (non unique)
- Why is it spending so long?
- Check size of table
- Redo same queries as above and check e
References
- Anatomy of a SQL query
- Joins
- Indexes
- Creating an Index
- How MySQL uses Indexes
- B-Trees (used for indexes)
- Views