Latest revision as of 09:31, 20 October 2010

--D. Thiebaut 12:08, 6 October 2010 (UTC)

Main Reference

The information here is taken from Chapter 1 (required reading) of Garcia-Molina, Ullman and Widom's Database Systems: the Complete Book. Their Chapter 2, which is also online, is great reading.

Important Concepts

DBMS = database management system

Definition: A DBMS is a tool for creating and managing large amounts of data efficiently and allowing it to persist over long periods of time, safely.

1970: seminal paper by Ted Codd, where proposition is that DBMS users should see the data as tables, called relations

==> Relational database systems

SQL = Structured Query Language

1990: relational database systems are the norm

today: DBMS have evolved to new forms (cloud).

example of large databases:
- Google
- Satellite data
- Social Networks (Facebook)
- YouTube videos

Two kinds of users of a DBMS: Admin and Users

- Admin create/modify organization of the data (table size, data size, constraints on the data values) = Schema of the database.
- Users query the DB to lookup, search, insert new data, or delete new data. The result (SQL) is presented in the form of temporary tables.

Concepts of Transcation and Query: A transaction is a group of queries (possibly just one) that should operate atomically and in isolation from one another. A complete transaction should be durable. This means that logging and recovery processes are an integral part of a DBMS.

DBMS = { data, metadata, logs, statistics, indexes + data-structures }

Measure of Goodness: ACID properties of a transaction

A: atomicity = all-or-nothing

C: consistency: some data cannot be negative, for example ==> The DBMS must ensure data consistency after the execution of a transaction.

I: Isolation: the transactions should appear to operate as if no other transactions are/were operating at the same time.

D: durability: Once the transaction has exected, the effect of this transaction on the data must never be lost.

MySql is in partial compliance of ACID when using the default myIsam storage engine. InnoDB, BDB, and cluster storage engines, which are also available with MySql provide ACID compliance, but with performance degredation.

Example

Users, purchasing items from a store.
3 tables: users, orders, items

Create Tables

CREATE TABLE `users` (
 `userId` INT(1) NOT NULL AUTO_INCREMENT,
 `userName` VARCHAR(100) DEFAULT NULL,
 PRIMARY KEY  (`userId`)
) ENGINE=MYISAM DEFAULT CHARSET=utf8;


CREATE TABLE `orders` (
 `orderId` INT(1) NOT NULL AUTO_INCREMENT,
 `userId` INT(1) DEFAULT NULL,
 `itemId` INT(1) DEFAULT NULL,
 PRIMARY KEY  (`orderId`)
) ENGINE=MYISAM DEFAULT CHARSET=utf8;


CREATE TABLE `items` (
 `itemId` INT(1) NOT NULL AUTO_INCREMENT,
 `itemName` VARCHAR(100) DEFAULT NULL,
 `price` FLOAT DEFAULT NULL,
 PRIMARY KEY  (`itemId`)
) ENGINE=MYISAM DEFAULT CHARSET=utf8;

Insert Data

Add 5 users to the database (Marie, John, Manon, Rob, Elaine)

INSERT INTO `users`(`userId`,`userName`) VALUES ( NULL,'Marie');
INSERT INTO `users`(`userId`,`userName`) VALUES ( NULL,'John');
INSERT INTO `users`(`userId`,`userName`) VALUES ( NULL,'Manon');
INSERT INTO `users`(`userId`,`userName`) VALUES ( NULL,'Rob');
INSERT INTO `users`(`userId`,`userName`) VALUES ( NULL,'Elaine');

Add 6 items and their price (pencil, eraser, paper, ink, ruler, markers)

INSERT INTO `items`(`itemId`,`itemName`,`price`) VALUES ( NULL,'pencil','1');
INSERT INTO `items`(`itemId`,`itemName`,`price`) VALUES ( NULL,'erase','0.5');
UPDATE `items` SET `itemId`='2',`itemName`='eraser',`price`='0.5' WHERE `itemId`='2';
INSERT INTO `items`(`itemId`,`itemName`,`price`) VALUES ( NULL,'paper','5');
INSERT INTO `items`(`itemId`,`itemName`,`price`) VALUES ( NULL,'ink','30');
INSERT INTO `items`(`itemId`,`itemName`,`price`) VALUES ( NULL,'ruler','1.5');
INSERT INTO `items`(`itemId`,`itemName`,`price`) VALUES ( NULL,'markers','2.5');

Create some user orders of items. e.g. Marie buys a pencil, an eraser, paper, and ink.

INSERT INTO `orders`(`orderId`,`userId`,`itemId`) VALUES ( NULL,'1','1');
INSERT INTO `orders`(`orderId`,`userId`,`itemId`) VALUES ( NULL,'1','2');
INSERT INTO `orders`(`orderId`,`userId`,`itemId`) VALUES ( NULL,'1','3');
INSERT INTO `orders`(`orderId`,`userId`,`itemId`) VALUES ( NULL,'1','4');
INSERT INTO `orders`(`orderId`,`userId`,`itemId`) VALUES ( NULL,'2','1');
INSERT INTO `orders`(`orderId`,`userId`,`itemId`) VALUES ( NULL,'3','1');
INSERT INTO `orders`(`orderId`,`userId`,`itemId`) VALUES ( NULL,'3','4');
INSERT INTO `orders`(`orderId`,`userId`,`itemId`) VALUES ( NULL,'5','4');
INSERT INTO `orders`(`orderId`,`userId`,`itemId`) VALUES ( NULL,'5','3');

Table-related operations

to delete a table: "DROP TABLE"
to erase a table without deleting it: "TRUNCATE TABLE"

Some queries

SELECT * FROM `items`;

SELECT `itemName` FROM `items`;

SELECT `itemName` FROM `items`WHERE `itemId`=2;

SELECT * FROM `users` ORDER BY `userName` ASC;

SELECT * FROM `orders` WHERE `userId`=1;

SELECT `itemName` FROM `items` WHERE `itemId` IN ( 1, 2, 3 );

SELECT `itemName` FROM `items` WHERE `itemId` IN ( SELECT `itemId` FROM `orders` WHERE `userId`=1 );

SELECT `itemName` FROM `items` WHERE `itemId` IN ( 
            SELECT `itemId` FROM `orders` WHERE `userId` IN  ( 
                         SELECT `userId` FROM `users` WHERE `userName`='Marie') );

Anatomy of a Query

Command	Description
`CREATE DATABASE`	Creates a new database
`CREATE INDEX`	Creates a new index on a table column
`CREATE SEQUENCE`	Creates a new sequence in an existing database
`CREATE TABLE`	Creates a new table in an existing database
`CREATE TRIGGER`	Creates a new trigger definition
`CREATE VIEW`	Creates a new view on an existing table
`SELECT`	Retrieves records from a table
`INSERT`	Adds one or more new records into a table
`UPDATE`	Modifies the data in existing table records
`DELETE`	Removes existing records from a table
`DROP DATABASE`	Destroys an existing database
`DROP INDEX`	Removes a column index from an existing table
`DROP SEQUENCE`	Destroys an existing sequence generator
`DROP TABLE`	Destroys an existing table
`DROP TRIGGER`	Destroys an existing trigger definition
`DROP VIEW`	Destroys an existing table view
`CREATE USER`	Adds a new PostgreSQL user account to the system
`ALTER USER`	Modifies an existing PostgreSQL user account
`DROP USER`	Removes an existing PostgreSQL user account
`GRANT`	Grant rights on a database object to a user
`REVOKE`	Deny rights on a database object from a user
`CREATE FUNCTION`	Creates a new SQL function within a database
`CREATE LANGUAGE`	Creates a new language definition within a database
`CREATE OPERATOR`	Creates a new SQL operator within a database
`CREATE TYPE`	Creates a new SQL data type within a database

MySQL Server

Each client is associated a thread per connection
Each thread runs on 1 core only
Cache contains SELECT queries and their result
Parser generates a parse tree
Storage engines include MyIsam and InnoDB

File System for myIsam Engine

1 database = 1 folder
1 table = 2 files: data + index
located on /usr/local/mysql/data on xgridmac

DEMO
- Show where files are located.
- show index files
- show log files
- Show how to export databases

largest database table limited by file size: unlimited on Linux. http://en.wikipedia.org/wiki/Comparison_of_file_systems#Limits
files are platform independent (can be copied!)
tables can be compressed (for read-only operations, s.a. CD-ROM)

Transactions

Under MyIsam, a query is a transaction.

Exercises

Exercise Page

Views

Use the Dept, Class, Enrollment tables (in 220a_example4 db) as example
Create a view that shows all dept classes and their enrollment

Step 1

  create view totalEnroll as select * from `Dept` join `Course` on `Dept`.`Id`=`Course`.`deptId` 
           join Enrollment on `Course`.`Id`=`Enrollment`.`Id`

Generates an error on duplicate Id... ==> replace * with the actual list of fields wanted:

Step 2: specify fields

  create view totalEnroll as select `Course`.`Id`, `Dept`.`Name`, `Course`.`Name`, 
         `Enrollment`.`Count` from `Dept` join `Course` on `Dept`.`Id`=`Course`.`deptId` 
         join Enrollment on `Course`.`Id`=`Enrollment`.`Id`

Generates another error on duplicate field name `Name` (it keeps the name of the field, not the name of the table) ==> use aliases

Step 3: use aliases

  create view totalEnroll as select `Course`.`Id`, `Dept`.`Name` as `DName`, `Course`.`Name` as `CName`, 
         `Enrollment`.`Count` from `Dept` join `Course` on `Dept`.`Id`=`Course`.`deptId` 
         join Enrollment on `Course`.`Id`=`Enrollment`.`Id`

Step 4: create new aggregate fields:

  create view totalEnroll as select `Course`.`Id`, `Dept`.`Name` as `DName`, `Course`.`Name` as `CName`, 
         concat( `Dept`.`Name`, `Course`.`Name` ) as `DCName`, `Enrollment`.`Count` from `Dept` 
         join `Course` on `Dept`.`Id`=`Course`.`deptId` 
         join Enrollment on `Course`.`Id`=`Enrollment`.`Id`

Rules on Views

Technically, Create View statements may not contain "ORDER BY" clause (although MySQL ignores it). If one selects from a view that contains an ORDER BY with a query that contains its own ORDER BY, the view's ORDER BY is ignored.

The select query used to create a view cannot contain subqueries in its FROM clause

The table used in a view must exist when the view is created

Index

First difference between PRIMARY KEY and INDEX. A primary key is a logical entity regarding records in a table. It usually is implemented as a UNIQUE INDEX. An INDEX can be unique or not and speeds up access to a specific field in the table.

Indexes are implemented as B-Trees, see http://www.bluerwhite.org/btree/ for review

Example

Use database 220a_example5
Use this program to create it table user with a million entries: fillTable.php
Look at the table `users` (verify it doesn't have any indexes)
Record its size
Perform some queries and check the execution time

SELECT * FROM `users` order by `Name` limit 10         (0.69 sec)

SELECT * FROM `users` order by `Qty` limit 10            (twice as fast as previous one, why?)

SELECT * FROM `users` where 100 < `Qty` and `Qty`<10000        (0.3sec)

SELECT COUNT(*) FROM `users` where "2" < `Name` and `Name`< "3"

Now create an Index on `Name` and on `Qty`
- Look at Structure, add index (non unique)
- Why is it spending so long?
Check size of table
Redo same queries as above and check e

References

Anatomy of a SQL query
Joins
- Joining tables
Indexes
- Creating an Index
- How MySQL uses Indexes
- B-Trees (used for indexes)
Views
- Creating a View
- Using a View

@@ Line 688: / Line 688: @@
 [[Image:CSC220 MySql Server.png | 600px]]
+* Each client is associated a thread per connection
+* Each thread runs on 1 core only
+* Cache contains SELECT queries and their result
+* Parser generates a parse tree
+* Storage engines include MyIsam and InnoDB
+==File System for myIsam Engine==
+* 1 database = 1 folder
+* 1 table = 2 files: data + index
+* located on /usr/local/mysql/data on '''xgridmac'''
+<tanbox>
+* DEMO
+** Show where files are located.
+** show index files
+** show log files
+** Show how to export databases
+</tanbox>
+* largest database table limited by file size: unlimited on Linux.  http://en.wikipedia.org/wiki/Comparison_of_file_systems#Limits
+* files are platform independent (can be copied!)
+* tables can be compressed (for read-only operations, s.a. CD-ROM)
+=Transactions=
+* Under MyIsam, a query is a transaction.
+=Exercises=
+* [[CSC220 Exercises with MySQL | Exercise Page]]
+=Views=
+* Use the Dept, Class, Enrollment tables (in 220a_example4 db) as example
+* Create a view that shows all dept classes and their enrollment
+;Step 1
+   create view totalEnroll as select * from `Dept` join `Course` on `Dept`.`Id`=`Course`.`deptId`
+            join Enrollment on `Course`.`Id`=`Enrollment`.`Id`
+:Generates an error on duplicate Id...  ==> replace * with the actual list of fields wanted:
+;Step 2: specify fields
+   create view totalEnroll as select `Course`.`Id`, `Dept`.`Name`, `Course`.`Name`,
+          `Enrollment`.`Count` from `Dept` join `Course` on `Dept`.`Id`=`Course`.`deptId`
+          join Enrollment on `Course`.`Id`=`Enrollment`.`Id`
+:Generates another error on duplicate field name `Name` (it keeps the name of the field, not the name of the table) ==> use aliases
+;Step 3: use aliases
+   create view totalEnroll as select `Course`.`Id`, `Dept`.`Name` as `DName`, `Course`.`Name` as `CName`,
+          `Enrollment`.`Count` from `Dept` join `Course` on `Dept`.`Id`=`Course`.`deptId`
+          join Enrollment on `Course`.`Id`=`Enrollment`.`Id`
+;Step 4: create new aggregate fields:
+   create view totalEnroll as select `Course`.`Id`, `Dept`.`Name` as `DName`, `Course`.`Name` as `CName`,
+          '''concat( `Dept`.`Name`, `Course`.`Name` ) as `DCName`''', `Enrollment`.`Count` from `Dept`
+          join `Course` on `Dept`.`Id`=`Course`.`deptId`
+          join Enrollment on `Course`.`Id`=`Enrollment`.`Id`
+==Rules on Views==
+* Technically, '''Create View''' statements may not contain "ORDER BY" clause (although MySQL ignores it).  If one selects from a view that contains an '''ORDER BY''' with a query that contains its own '''ORDER BY''', the view's '''ORDER BY''' is ignored.
+* The select query used to create a view '''cannot contain subqueries''' in its '''FROM''' clause
+* The table used in a view must exist when the view is created
+=Index=
+* First difference between PRIMARY KEY and INDEX.  A primary key is a logical entity regarding records in a table.  It usually is implemented as a UNIQUE INDEX.   An INDEX can be unique or not and speeds up access to a specific field in the table.
+* Indexes are implemented as B-Trees, see http://www.bluerwhite.org/btree/ for review
+==Example==
+* Use database 220a_example5
+* Use this program to create it table user with a million entries: [[CSC220 fillTable.php | fillTable.php]]
+* Look at the table `users`  (verify it doesn't have any indexes)
+* Record its size
+* Perform some queries and check the execution time
+ SELECT * FROM `users` order by `Name` limit 10         ''(0.69 sec)''
+ SELECT * FROM `users` order by `Qty` limit 10            ''(twice as fast as previous one, why?)''
+ SELECT * FROM `users` where 100 < `Qty` and `Qty`<10000        ''(0.3sec)''
+ SELECT COUNT(*) FROM `users` where "2" < `Name` and `Name`< "3"
+* Now create an Index on `Name` and on `Qty`
+** Look at Structure, add index (non unique)
+** Why is it spending so long?
+* Check size of table
+* Redo same queries as above and check e
+=References=
+*  [http://www.faqs.org/docs/ppbook/x1428.htm Anatomy of a SQL query]
+* Joins
+** [http://dev.mysql.com/doc/refman/5.0/en/join.html Joining tables]
+* Indexes
+** [http://dev.mysql.com/doc/refman/5.0/en/create-index.html Creating an '''Index''']
+** [http://dev.mysql.com/doc/refman/5.1/en/mysql-indexes.html How MySQL uses '''Indexes''']
+** [http://www.bluerwhite.org/btree/ B-Trees] (used for indexes)
+* Views
+** [http://dev.mysql.com/doc/refman/5.0/en/create-view.html '''Creating''' a ''View'']
+** [http://dev.mysql.com/doc/refman/5.0/en/views.html '''Using''' a ''View'']
 <br />

Difference between revisions of "CSC220 Database Management Systems"

Latest revision as of 09:31, 20 October 2010

Contents

Main Reference

Important Concepts

Example

Create Tables

Insert Data

Table-related operations

Some queries

Anatomy of a Query

MySQL Server

File System for myIsam Engine

Transactions

Exercises

Views

Rules on Views

Index

Example

References

Navigation menu

Personal tools

Namespaces

Variants

Views

More

Search

Navigation

Tools