1
Advanced Topics: Indexes & Transactions
Instructor: Mohamed Eltabakh cs3431
2
Indexes
3
Why Indexes
With or without indexes, the query answer is the same.
Indexes are needed for efficiency and fast access to data.
Assume we have 10,000 students:
SELECT * FROM Student WHERE sNumber = <value>;
Without an index, we check all 10,000 students.
With an index, we can reach that student directly.
4
Direct Access vs. Sequential Access
SELECT * FROM Student WHERE sNumber = <value>;
Without an index, we check all 10,000 students (sequential access).
With an index, we can reach that student directly (direct access).
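The difference between the two access paths can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the generated names, and the use of a dictionary as a stand-in for the index are hypothetical and not how a DBMS stores data.

```python
import random

# Hypothetical data file: 10,000 student records stored in arbitrary order.
random.seed(0)
students = [{"sNumber": n, "sName": f"student{n}"} for n in range(1, 10001)]
random.shuffle(students)


def sequential_lookup(records, s_number):
    """Scan the records one by one; return (record, number of records examined)."""
    examined = 0
    for rec in records:
        examined += 1
        if rec["sNumber"] == s_number:
            return rec, examined
    return None, examined


# Stand-in for an index: maps each sNumber to its position in the data file.
index = {rec["sNumber"]: pos for pos, rec in enumerate(students)}


def direct_lookup(records, idx, s_number):
    """Use the index to jump straight to the record's position."""
    pos = idx.get(s_number)
    return records[pos] if pos is not None else None
```

Both functions return the same record for the same sNumber; only the amount of work differs. A search for a missing sNumber forces the sequential version to examine all 10,000 records.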
5
What is an Index
An index is an auxiliary file that makes it more efficient to search for a record in the data file.
The index is usually specified on one field of the file, although it could be specified on several fields.
The index is stored separately from the base table.
Each table may have multiple indexes.
[Figure: Student table with columns sNumber, sName, address, pNum and sample rows for Dave, Greg, and Matt. One index can be created on sNumber and a second index on sName.]
6
Example: Index on sNumber
[Figure: Student table (sNumber, sName, address, pNum) whose rows are stored with sNumber values in the order 1, 2, 100, 10, 4, 3, next to an index on sNumber whose entries are sorted: 1, 2, 3, 4, 10, 100.]
The index file is always sorted.
The index is much smaller than the table.
Any query (equality or range) on sNumber can now be answered efficiently by binary search on the index.
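A minimal Python sketch of this idea, assuming the index is a sorted list of (sNumber, row position) pairs. The sNumber values follow the slide; the helper names `find_equal` and `find_range` are hypothetical.

```python
from bisect import bisect_left, bisect_right

# sNumber of each row, in the order the rows are stored in the table.
table_snumbers = [1, 2, 100, 10, 4, 3]

# The index: (sNumber, row position) pairs kept sorted by sNumber.
index = sorted((s, pos) for pos, s in enumerate(table_snumbers))
keys = [s for s, _ in index]  # sorted keys, used for binary search


def find_equal(v):
    """Equality query: return the row position for sNumber = v, or None."""
    i = bisect_left(keys, v)
    if i < len(keys) and keys[i] == v:
        return index[i][1]
    return None


def find_range(lo, hi):
    """Range query: return row positions for lo <= sNumber <= hi, in key order."""
    i = bisect_left(keys, lo)
    j = bisect_right(keys, hi)
    return [pos for _, pos in index[i:j]]
```

The table itself stays in its original order. Only the index is sorted, and each index entry carries a pointer (here, a list position) back to the row.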
7
Example: Index on sName
[Figure: the same Student table next to an index on sName whose entries are sorted: Dave, Greg, John, Matt.]
Duplicate values have duplicate entries in the index.
Any query (equality or range) on sName can now be answered efficiently by binary search on the index.
8
Creating an Index
Create Index <name> On <tablename>(<colNames>);
The DB system knows how to:
1- create the index
2- decide when and how to use it
Create Index sNumberIndex On Student(sNumber);
Create Index sNameIndex On Student(sName);
[Figure: Student table with columns sNumber, sName, address, pNum.]
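The statements above can be tried with Python's built-in `sqlite3` module. This is a sketch that assumes SQLite as the DBMS and guesses the column types of the Student table. `PRAGMA index_list` is SQLite-specific; other systems expose their index catalog differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumptions; the slides only give the column names.
conn.execute(
    "CREATE TABLE Student (sNumber INTEGER, sName TEXT, address TEXT, pNum INTEGER)"
)
conn.execute("CREATE INDEX sNumberIndex ON Student(sNumber)")
conn.execute("CREATE INDEX sNameIndex ON Student(sName)")


def index_names(connection, table):
    """List the indexes SQLite keeps for a table (column 1 of each row is the name)."""
    rows = connection.execute(f"PRAGMA index_list({table})")
    return sorted(row[1] for row in rows)


names_after_create = index_names(conn, "Student")

# An index can be removed again without touching the table data.
conn.execute("DROP INDEX sNameIndex")
names_after_drop = index_names(conn, "Student")
```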
9
Multiple Predicates
SELECT * FROM Student WHERE address = '320FL' AND sName = 'Dave';
Create Index addressIndex On Student(address);
1- The best the DBMS can do is use addressIndex to find the tuples with address = '320FL'.
2- From those tuples, check sName = 'Dave'.
[Figure: Student table with columns sNumber, sName, address, pNum; addresses shown include 320FL, 50WA, and 200LA.]
10
Multi-Column Indexes
Columns X, Y are frequently queried together (with AND).
Each column has many duplicates.
Then, consider creating a multi-column index on X, Y.
SELECT * FROM Student WHERE address = '320FL' AND sName = 'Dave';
Create Index nameAdd On Student(sName, address);
The multi-column index directly returns the matching record only.
[Figure: Student table with columns sNumber, sName, address, pNum; the row for Dave at 320FL is highlighted.]
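One way to see the effect of a multi-column index is to ask the DBMS for its query plan. The sketch below assumes SQLite, whose `EXPLAIN QUERY PLAN` statement reports which index a query would use. The exact wording of the plan text varies between SQLite versions, and the helper `plan` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumptions; the slides only give the column names.
conn.execute(
    "CREATE TABLE Student (sNumber INTEGER, sName TEXT, address TEXT, pNum INTEGER)"
)

QUERY = "SELECT * FROM Student WHERE address = '320FL' AND sName = 'Dave'"


def plan(connection, sql):
    """Return SQLite's query plan as one string (the last column holds the description)."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


# With only a single-column index, the address lookup uses the index
# and the sName predicate is checked on the rows that come back.
conn.execute("CREATE INDEX addressIndex ON Student(address)")
plan_single = plan(conn, QUERY)

# With the two-column index, both predicates can be answered from one index.
conn.execute("CREATE INDEX nameAdd ON Student(sName, address)")
plan_multi = plan(conn, QUERY)
```

Printing `plan_single` and `plan_multi` shows which index the planner picked in each case. The choice is made by the DBMS, not by the query text.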
11
Using an Index
The DBMS automatically figures out which index to use based on the query.
Create Index sNumberIndex On Student(sNumber);
Create Index sNameIndex On Student(sName);
SELECT * FROM Student WHERE sNumber = <value>;
This query automatically uses sNumberIndex.
[Figure: Student table with columns sNumber, sName, address, pNum.]
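The automatic choice can be observed by comparing the query plan before and after the index exists. This is a sketch assuming SQLite and its `EXPLAIN QUERY PLAN` statement. The column types and the helper `plan` are assumptions, and the plan wording differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumptions; the slides only give the column names.
conn.execute(
    "CREATE TABLE Student (sNumber INTEGER, sName TEXT, address TEXT, pNum INTEGER)"
)

QUERY = "SELECT * FROM Student WHERE sNumber = ?"


def plan(connection, sql, params):
    """Return SQLite's query plan as one string (the last column holds the description)."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


# No index yet: the only option is to scan the whole table.
plan_without_index = plan(conn, QUERY, (1,))

conn.execute("CREATE INDEX sNumberIndex ON Student(sNumber)")
conn.execute("CREATE INDEX sNameIndex ON Student(sName)")

# Same query text, but the planner now picks the index on sNumber by itself.
plan_with_index = plan(conn, QUERY, (1,))
```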
12
How Do Indexes Work?
13
Types of Indexes
Primary vs. Secondary
Single-Level vs. Multi-Level (Tree Structure)
Clustered vs. Non-Clustered
14
Primary vs. Secondary Indexes
An index on the primary key of a relation is called a primary index (only one).
An index on any other column is called a secondary index (can be many).
In a primary index, all values are unique.
In secondary indexes, values may have duplicates.
[Figure: Student table with columns SSN, sNumber, sName, address, pNum and SSN values 11111 through 66666. The index on SSN is a primary index; the indexes on sNumber and sName are secondary indexes.]
15
Single-Level Indexes
The index is a one-level sorted list.
Given a value v to query:
  Perform a binary search in the index to find it (fast).
  Follow the link to reach the actual record.
[Figure: Student table next to an index on sNumber with the sorted entries 1, 2, 3, 4, 10, 100, each linked to its record.]
16
Multi-Level Index
Build an index on top of the index (can go multiple levels).
When searching for value v: find the largest entry ≤ v and follow its pointer.
[Figure: index on sNumber over the Student table. One level holds the sorted entries 1, 2, 3, 4, 10, 100; the level built on top of it holds the entries 1 and 4.]
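The search rule can be sketched with two sorted lists in Python. The entries follow the slide; the block size of 3 and the function name `search` are assumptions made for illustration.

```python
from bisect import bisect_right

# Lower level: one entry per record, sorted.
lower = [1, 2, 3, 4, 10, 100]

# Upper level: the first key of each block of BLOCK lower-level entries.
BLOCK = 3
upper = lower[::BLOCK]  # [1, 4]


def search(v):
    """Return the position of v in the lower level, or None if v is absent."""
    # Step 1: in the upper level, find the largest entry <= v.
    i = bisect_right(upper, v) - 1
    if i < 0:
        return None  # v is smaller than every key in the index
    # Step 2: follow its pointer to a block of the lower level and search there.
    start = i * BLOCK
    block = lower[start:start + BLOCK]
    j = bisect_right(block, v) - 1
    if block[j] == v:
        return start + j
    return None
```

With more levels the same step repeats: at each level, find the largest entry ≤ v and descend, which is the idea behind tree-structured indexes.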
17
Clustered vs. Non-Clustered
Assume there is an index X on column C.
If the records in the table are stored sorted on C, then X is a clustered index.
Otherwise, X is a non-clustered index.
A primary index is a clustered index.
[Figure: Student table (SSN, sNumber, sName, address) stored sorted by SSN, 11111 through 66666. The index on SSN is a clustered index; the index on sNumber (1, 2, 3, 4, 10, 100) is a non-clustered index.]
18
Index Maintenance
Indexes are used in queries, but they need to be maintained when data change (insert, update, delete).
The DBMS automatically handles index maintenance:
  When inserting new records, the indexed field is added to the index.
  When deleting records, their values are deleted from the index.
  When updating an indexed value, the old value is deleted from the index and the new value is inserted.
There is a cost for maintaining an index, but its benefit usually outweighs that cost if the index is used often.
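The three maintenance rules can be sketched on a sorted in-memory index. This is only an illustration: the class `SortedIndex` and its (key, row id) entries are hypothetical, and a real DBMS maintains on-disk structures instead.

```python
from bisect import bisect_left, insort


class SortedIndex:
    """A sorted list of (key, row_id) entries, kept in order on every change."""

    def __init__(self):
        self.entries = []

    def insert(self, key, row_id):
        # Insert: add the indexed value at its sorted position.
        insort(self.entries, (key, row_id))

    def delete(self, key, row_id):
        # Delete: locate the entry by binary search and remove it if present.
        i = bisect_left(self.entries, (key, row_id))
        if i < len(self.entries) and self.entries[i] == (key, row_id):
            del self.entries[i]

    def update(self, old_key, new_key, row_id):
        # Update: delete the old value, then insert the new one.
        self.delete(old_key, row_id)
        self.insert(new_key, row_id)
```

Every insert, delete, or update on the table now triggers extra work on each index defined on it, which is the maintenance cost the slide refers to.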
19
Summary of Indexes
Indexes are auxiliary structures for efficient searching and querying.
The query answer is the same with or without an index.
What to index depends on which columns are frequently queried (in the Where clause).
Main operations:
Create Index <name> On <tablename>(<colNames>);
Drop Index <name>;
20
Transactions
21
What is a Transaction
A set of operations on a database that are treated as one unit: execute all or none.
Transactions have semantics at the application level:
  Want to reserve two seats on a flight
  Transfer money from account A to account B
  …
What if two users are reserving the same flight seat at the same time?
Transactions solve these problems.
22
Transactions
By default, each SQL statement is a transaction.
Can change the default behavior:
SQL> Start transaction;
SQL> Insert …
SQL> Update …
SQL> Delete …
SQL> Select …
SQL> Commit | Rollback;
All of these statements are now one unit (either all succeed or all fail).
Commit ends the transaction successfully; Rollback cancels the transaction.
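The all-or-none behavior can be sketched with Python's `sqlite3` module and the money-transfer example. The Account table, its CHECK constraint, and the function `transfer` are hypothetical, and SQLite spells the opening statement `BEGIN` instead of `Start transaction`.

```python
import sqlite3

# isolation_level=None: the module issues no implicit BEGIN, so we control the transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE Account (id TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO Account VALUES ('A', 100), ('B', 50)")


def balances(connection):
    return dict(connection.execute("SELECT id, balance FROM Account"))


def transfer(connection, src, dst, amount):
    """Move money between two accounts as one unit; return True if committed."""
    connection.execute("BEGIN")
    try:
        connection.execute(
            "UPDATE Account SET balance = balance + ? WHERE id = ?", (amount, dst)
        )
        # This update violates the CHECK constraint if src has too little money.
        connection.execute(
            "UPDATE Account SET balance = balance - ? WHERE id = ?", (amount, src)
        )
        connection.execute("COMMIT")
        return True
    except sqlite3.IntegrityError:
        # Cancel the whole unit, including the credit that already succeeded.
        connection.execute("ROLLBACK")
        return False
```

The credit is deliberately applied before the debit so that a failing debit leaves something to undo: after the rollback, the destination account does not keep the money.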
23
Transaction Properties
Four main properties:
Atomicity: a transaction is one atomic unit
Consistency: a transaction ensures the DB is consistent
Isolation: a transaction behaves as if no other transaction were executing simultaneously
Durability: changes made by a transaction must persist
ACID: Atomicity, Consistency, Isolation, Durability
ACID properties are enforced by the DBMS.
24
Consistency Issue
Many users may update the data at the same time.
How to ensure the result is consistent?
Initial values of x: 2, 3, 4, 10, 100
Two updates run at the same time:
Update T Set x = x * 3;
Update T Set x = x + 2;
What is the right answer?
A result such as x = 12, 15, 14, 32, 302 is wrong: the data is inconsistent, because the first two rows match (x + 2) * 3 while the last three match x * 3 + 2.
25
Serial Order of Transactions
Given N concurrent transactions T1, T2, …, TN
A serial order is any permutation of these transactions (N! possible orders):
  T1, T2, T3, …, TN
  T2, T3, T1, …, TN
  …
The DBMS will ensure that the end result of executing the N transactions concurrently matches one of the serial order executions.
That is called serializability: as if the transactions were executed in serial order.
26
Serializable Execution
Given N concurrent transactions T1, T2, …, TN
The DBMS will execute them concurrently (at the same time), but the final effect matches one of the serial order executions.
Initial values of x: 2, 3, 4, 10, 100
Update T Set x = x * 3;
Update T Set x = x + 2;
Valid final results:
  x = 12, 15, 18, 36, 306 (add 2, then multiply by 3)
  x = 8, 11, 14, 32, 302 (multiply by 3, then add 2)
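The two valid results can be reproduced by enumerating every serial order in Python. This sketch compares final states only; the names `serial_results` and `matches_some_serial_order` are hypothetical, and a real DBMS enforces serializability through concurrency control, not by enumeration.

```python
from itertools import permutations

initial = [2, 3, 4, 10, 100]

# Each transaction is modelled as a function applied to every value of x.
transactions = {
    "x = x * 3": lambda x: x * 3,
    "x = x + 2": lambda x: x + 2,
}


def serial_results(values, txns):
    """Map each serial order (a permutation of the transactions) to its final values."""
    results = {}
    for order in permutations(txns):
        current = list(values)
        for name in order:
            current = [txns[name](x) for x in current]
        results[order] = current
    return results


def matches_some_serial_order(final, values, txns):
    """True if the final state equals the outcome of at least one serial order."""
    return final in serial_results(values, txns).values()
```

With two transactions there are 2! = 2 serial orders, so exactly two final states are acceptable. A mixed result such as 12, 15, 14, 32, 302 matches neither of them.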
27
Isolation Levels
Read Uncommitted
Read Committed
Repeatable Read
Serializable
Going down the list, isolation gets stronger and avoids more problems.
[Slide annotation attached to one of the levels: That is the default in DBMS]
28
1- READ UNCOMMITTED
Session 1:
  BEGIN TRANSACTION
  update cust set color='blue' where id=500;
  COMMIT
Session 2:
  BEGIN TRANSACTION
  select color from cust where id=500;   (returns red)
  select color from cust   (returns blue)
  COMMIT
The two sessions run at the same time; time flows downward.
Problems at this level: dirty read (bad), non-repeatable read (bad).
29
2- READ COMMITTED
Session 1:
  BEGIN TRANSACTION
  update cust set color='blue' where id=500;
  COMMIT
Session 2:
  BEGIN TRANSACTION
  select color from cust where id=500;   (returns red)
  select color from cust   (returns blue)
  COMMIT
The two sessions run at the same time; time flows downward.
Dirty read is solved.
Remaining problem: non-repeatable read (bad).
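The absence of dirty reads can be observed with two connections to the same database file. This sketch assumes SQLite through Python's `sqlite3` module, which does not offer the four isolation levels by name in the way the slides list them. It only shows that a second session does not see an update before the first session commits. The file location and the helper `read_color` are hypothetical.

```python
import os
import sqlite3
import tempfile

# Two sessions need a shared database file; an in-memory database is private to one connection.
path = os.path.join(tempfile.mkdtemp(), "cust.db")
session1 = sqlite3.connect(path, isolation_level=None)
session2 = sqlite3.connect(path, isolation_level=None)

session1.execute("CREATE TABLE cust (id INTEGER PRIMARY KEY, color TEXT)")
session1.execute("INSERT INTO cust VALUES (500, 'red')")


def read_color(connection):
    rows = connection.execute("SELECT color FROM cust WHERE id = 500").fetchall()
    return rows[0][0]


# Session 1 changes the color but has not committed yet.
session1.execute("BEGIN")
session1.execute("UPDATE cust SET color = 'blue' WHERE id = 500")

# Session 2 reads while the update is still uncommitted.
seen_before_commit = read_color(session2)

session1.execute("COMMIT")

# Session 2 reads again after the commit.
seen_after_commit = read_color(session2)
```

Session 2 runs each read as its own statement here, so the change of value between its two reads is the non-repeatable read pattern from the slide, observed across statements.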
30
2- READ COMMITTED
Session 1:
  BEGIN TRANSACTION
  delete cust where id=500;
  COMMIT
Session 2:
  BEGIN TRANSACTION
  select color from cust where id=500;   (returns red)
  select color from cust   (returns no rows)
  COMMIT
The two sessions run at the same time; time flows downward.
Remaining problem: phantom (bad).
31
3- REPEATABLE READ
Session 1:
  BEGIN TRANSACTION
  update cust set color='blue' where id=500;
  COMMIT
Session 2:
  BEGIN TRANSACTION
  select color from cust where id=500;   (returns red)
  select color from cust
  COMMIT
The two sessions run at the same time; time flows downward.
Non-repeatable read is solved: both reads in Session 2 see the same value.
32
3- REPEATABLE READ
Session 1:
  BEGIN TRANSACTION
  delete cust where id=500;
  COMMIT
Session 2:
  BEGIN TRANSACTION
  select color from cust where id=500;   (returns red)
  select color from cust
  COMMIT
The two sessions run at the same time; time flows downward.
Phantom (for delete) is solved.
33
3- REPEATABLE READ
Session 1:
  BEGIN TRANSACTION
  Insert into cust(id, color) values (500, 'blue');
  COMMIT
Session 2:
  BEGIN TRANSACTION
  select id from cust where color='blue';   (returns no rows)
  select id from cust   (returns 500)
  COMMIT
The two sessions run at the same time; time flows downward.
Remaining problem: phantom insert (bad).
34
4- SERIALIZABLE
Session 1:
  BEGIN TRANSACTION
  Insert into cust(id, color) values (500, 'blue');
  COMMIT
Session 2:
  BEGIN TRANSACTION
  select id from cust where color='blue';   (returns no rows)
  select id from cust
  COMMIT
The two sessions run at the same time; time flows downward.
Phantom is solved.
35
Summary of Transactions
Unit of work in the DBMS
Either executed all or none
Ensures consistency among many concurrent transactions
Ensures persistent data once committed (using recovery techniques)
Main ACID properties: Atomicity, Consistency, Isolation, Durability
36
END !!!
37
Final Exam
Dec. 13, 8:15am – 9:30am (75 mins)
Closed book, open sheet
Answer in the same exam sheet
Material Included:
  ERD
  SQL (Select, Insert, Update, Delete)
  Views, Triggers, Assertions
  Cursors, Stored Procedures/Functions
Material Excluded:
  Relational Model & Algebra
  Normalization Theory
  ODBC/JDBC
  Indexes and Transactions
Friday's Lecture (Revision + short Quiz)