CPSC-608 Database Systems

1 CPSC-608 Database Systems
Fall 2018
Instructor: Jianer Chen
Office: HRBB 315C
Phone:
Notes #21
Notes #7

2 What Does DBMS Do?
Prepare a collection C of efficient algorithms for operations in relational algebra; take care of issues in optimization and security.
An input database program P:
SELECT a1, b1, c1 FROM A, B, C WHERE a2=1 AND b2=2 AND c2=3
Processing pipeline:
parser -> parse tree
preprocessing -> parse tree
parse tree-LQP convertor -> logic query plan: π_{a1, b1, c1}( σ_{a2=1, b2=2, c2=3}( A × B × C ) )
apply logic laws (optimization via logic and size) -> logic query plan: π_{a1, b1, c1}( σ_{a2=1}(A) × σ_{b2=2}(B) × σ_{c2=3}(C) )
LQP-PQP convertor (optimization via algorithms and cost) -> physical query plan
physical query plan -> machine executable code
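As a rough illustration of the "apply logic laws" step, the sketch below splits a conjunctive selection over a product and attaches each condition to the relation it tests. The tuple-based plan encoding, the helper name `push_selections`, and the rule that the first letter of an attribute names its relation (a2 belongs to A) are assumptions made for this example only; the notes do not prescribe any of them.

```python
def push_selections(conditions, relations):
    """Rewrite sigma_{c1 AND c2 ...}(R1 x R2 x ...) so that each condition
    sits directly on the relation whose attribute it tests.

    conditions: list of (attribute, value) equality tests, e.g. ("a2", 1)
    relations:  list of relation names, e.g. ["A", "B", "C"]
    Assumption: attribute "a2" belongs to relation "A" (first letter, upper-cased).
    """
    per_relation = {r: [] for r in relations}
    for attr, value in conditions:
        owner = attr[0].upper()
        per_relation[owner].append((attr, value))

    children = []
    for r in relations:
        node = ("rel", r)
        for cond in per_relation[r]:
            # Wrap the relation in one selection node per condition.
            node = ("select", cond, node)
        children.append(node)
    return ("product", children)


plan = push_selections([("a2", 1), ("b2", 2), ("c2", 3)], ["A", "B", "C"])
```

Applying the selections before the product shrinks the inputs of the product, which is the "size" part of the optimization named on the slide.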

5 Construction of Physical Query Plan

6 Construction of Physical Query Plan
Input: an optimized LQP T, and main memory constraint M.
[Figure: the LQP T, a tree whose operator nodes include ×, σ and π, with leaves A, B, C, D, E, F, G]

14 Construction of Physical Query Plan
Input: an optimized LQP T, and main memory constraint M.
1. Replacing each leaf R of T by “scan(R)”;
2. Combining the “scans” with other operations;
3. Replacing each internal node v of T by a proper algorithm;
4. For each edge e in T, decide if e should be “materialized”;
5. Cut all materialized edges;
6. Each subtree is a call to the subroutine at the root of the subtree. The order of the calls follows the bottom-up order in the structure.
This produces executable code for the input DB program.
[Figure: the resulting physical query plan. Leaves: scan(A), scan(B), scan(C), scan(D), scan(E), scan(F), scan(G), with three index-scan labels; other nodes: ×, π, σ and the algorithm labels CJ, J2P, I1P, J1P; the subtrees left after cutting are numbered 1, 2, 3 in call order.]
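The difference between a pipelined edge and a materialized edge can be sketched with Python generators. This is only an illustration under stated assumptions: the operator names (`scan`, `select`, `materialize`, `nested_loop_product`) and the toy relations `A` and `B` are hypothetical and are not the algorithms (CJ, J2P, ...) named in the figure.

```python
def scan(relation):
    """Leaf operator: yield the tuples of a stored relation one at a time."""
    for row in relation:
        yield row


def select(pred, child):
    """Pipelined selection: consumes its child lazily, tuple by tuple."""
    for row in child:
        if pred(row):
            yield row


def materialize(child):
    """A materialized edge: run the whole subtree below it and store the result."""
    return list(child)


def nested_loop_product(outer, inner_rows):
    """Product whose inner input was materialized, so it can be re-read
    once per outer tuple."""
    for o in outer:
        for i in inner_rows:
            yield o + i


A = [(1, "x"), (2, "y")]
B = [(2, "p"), (3, "q")]

# Call 1 (lower subtree, cut at a materialized edge): runs first, result is stored.
inner = materialize(select(lambda r: r[0] == 2, scan(B)))

# Call 2 (upper subtree): scan(A) is pipelined into the product.
result = list(nested_loop_product(scan(A), inner))
```

The two calls mirror the bottom-up order of the slide: the subtree below the cut edge is executed to completion before the subtree above it starts.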

15 Physical Query Plan: Summary
Replacing internal nodes of an LQP by proper algorithms;
Deciding if a subroutine call should be pipelined or materialized;
Many optimization techniques are involved here;
In practice, heuristic optimization techniques are used to construct good physical query plans;
The resulting physical query plan is executable code.

16 DBMS
[Figure: DBMS architecture. Users and languages: database administrator (DDL language, DDL compiler) and database programmer (DML (query) language, DML compiler). Components: query execution engine, transaction manager, logging & recovery, concurrency control, lock table, index/file manager, file manager, buffer manager, main memory buffers, and secondary storage (disks) holding the database in tables (relations). The figure also carries the labels "DBMS" and "graduate".]

18 DBMS
What is still missing?
[Figure: the DBMS architecture diagram of slide 16, annotated with the question "What is still missing?"]

19 Efficient Algorithms for Relational Algebraic Operations
[Figure: the DBMS architecture diagram of slide 16, with "Efficient algorithms for relational algebraic operations" added to it.]

27 The Main Purpose of Index Structures
Speed up the search process: for a selection such as σ_{a=6}(R), the index quickly figures out which disk blocks contain the desired tuples; otherwise we have to scan the entire R.
But the index also needs to handle dynamic changes of R.
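A small sketch of why the index helps for σ_{a=6}(R): it maps each value of a to the blocks that contain it, so only those blocks are read. The block size, the dictionary-based index and all function names here are assumptions for illustration; the sketch also ignores the cost of reading the index itself.

```python
from collections import defaultdict

BLOCK_SIZE = 2  # hypothetical number of tuples per disk block


def build_blocks(tuples):
    """Pack the tuples of R into fixed-size blocks."""
    return [tuples[i:i + BLOCK_SIZE] for i in range(0, len(tuples), BLOCK_SIZE)]


def build_index(blocks, attr):
    """Map each value of attr to the set of block numbers containing it."""
    index = defaultdict(set)
    for block_no, block in enumerate(blocks):
        for t in block:
            index[t[attr]].add(block_no)
    return index


def select_with_scan(blocks, attr, value):
    """Without an index: every block of R must be read."""
    rows = [t for block in blocks for t in block if t[attr] == value]
    return rows, len(blocks)


def select_with_index(blocks, index, attr, value):
    """With an index: read only the blocks the index points to."""
    block_nos = sorted(index.get(value, ()))
    rows = [t for n in block_nos for t in blocks[n] if t[attr] == value]
    return rows, len(block_nos)


R = [{"a": v} for v in (3, 6, 9, 1, 6, 7, 8, 2)]
blocks = build_blocks(R)
index = build_index(blocks, "a")
rows_scan, reads_scan = select_with_scan(blocks, "a", 6)
rows_index, reads_index = select_with_index(blocks, index, "a", 6)
```

Both calls return the same tuples; the second value in each pair is the number of data blocks read. Any insertion into or deletion from R would also require updating `index`, which is the "dynamic changes" concern raised on the slide.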

28 B+Trees
Support fast search
Support range search
Support dynamic changes
Could be either dense or sparse
  dense: pointers to all records
  sparse: one pointer per block

30 B+Trees
A B+tree node of order n:
p1 k1 p2 k2 p3 …… kn pn+1
where ph are pointers (disk addresses) and kh are search-keys (values of the attributes in the index).
How big is n?
Basically we want each B+tree node to fit in a disk block so that a B+tree node can be read/written by a single disk I/O. Typically, n ~
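The "fit in a disk block" requirement can be turned into a small calculation: a node with n keys and n+1 pointers fits when n·(key size) + (n+1)·(pointer size) ≤ block size. The sizes used below (4096-byte block, 4-byte key, 8-byte pointer) are assumed values chosen for the example, not figures from these notes, and the sketch ignores any per-node header.

```python
def btree_order(block_bytes, key_bytes, pointer_bytes):
    """Largest n such that n keys and n+1 pointers fit in one block.

    Solves n*key_bytes + (n+1)*pointer_bytes <= block_bytes for integer n.
    """
    return (block_bytes - pointer_bytes) // (key_bytes + pointer_bytes)


# Assumed sizes, for illustration only.
n = btree_order(block_bytes=4096, key_bytes=4, pointer_bytes=8)
used = n * 4 + (n + 1) * 8  # bytes occupied by a full node of that order
```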

31 B+Tree Example
order n = 3
root: [100]
non-leaf nodes: [30] [120 150 180]
leaves: [3 5 11] [30 35] [100 101 110] [120 130] [150 156 179] [180 200]
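A minimal search sketch over this example tree. The `Node` class and the `search` function are hypothetical names for illustration; the sketch assumes the usual convention that the child to the right of key k holds keys greater than or equal to k, and it omits record pointers and sequence pointers.

```python
from bisect import bisect_right


class Node:
    def __init__(self, keys, children=None):
        self.keys = keys          # sorted search-keys in this node
        self.children = children  # list of child nodes, or None for a leaf


def search(node, key):
    """Walk from the root to a leaf and report whether key is stored there."""
    while node.children is not None:
        # bisect_right picks child i with keys[i-1] <= key < keys[i].
        node = node.children[bisect_right(node.keys, key)]
    return key in node.keys


leaves = [Node(ks) for ks in (
    [3, 5, 11], [30, 35], [100, 101, 110], [120, 130], [150, 156, 179], [180, 200],
)]
left = Node([30], leaves[0:2])
right = Node([120, 150, 180], leaves[2:6])
root = Node([100], [left, right])

found_179 = search(root, 179)
found_12 = search(root, 12)
```

Each search reads one node per level, so with one node per disk block the number of disk I/Os equals the height of the tree (three in this example).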

32 A B+Tree of order n
Each node has: n keys and n+1 pointers. These are fixed.
To keep the nodes not too empty, and for the operations to be applied efficiently:
* Non-leaf: at least ⌈(n+1)/2⌉ pointers (to children)
* Leaf: at least ⌊(n+1)/2⌋ pointers to data (plus a “sequence pointer” to the next leaf)
Basically: use at least one half of the pointers.

33 Sample non-leaf
order n = 3
keys: 57 81 95
pointer 1: to keys k < 57
pointer 2: to keys 57 ≤ k < 81
pointer 3: to keys 81 ≤ k < 95
pointer 4: to keys k ≥ 95

34 Sample leaf node
order n = 3
keys: 57 81 95 (the node is reached by a pointer from a non-leaf node)
pointer 1: to record with key 57
pointer 2: to record with key 81
pointer 3: to record with key 95
pointer 4: to next leaf in sequence

35 Example (B+ tree of order n=3)
            Full node        Min. node
Non-leaf    [120 150 180]    [30]
Leaf        [3 5 11]         [30 35]

37 B+tree rules
Rule 1. All leaves are at same lowest level (balanced tree)
Rule 2. Pointers in leaves point to records except for “sequence pointer”
Rule 3. Number of keys/pointers in nodes:
            Max. # pointers   Max. # keys   Min. # pointers    Min. # keys
Non-leaf    n+1               n             ⌈(n+1)/2⌉          ⌈(n+1)/2⌉ - 1
Leaf        n+1               n             ⌊(n+1)/2⌋ + 1      ⌊(n+1)/2⌋
Root        n+1               n             2 (could be 1)     1
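Rule 3 can be written as a small helper that returns the minimum occupancy for a given order. The rounding is an assumption here: the sketch takes the usual textbook reading, ceiling of (n+1)/2 for non-leaf nodes and floor of (n+1)/2 for leaves, and counts the leaf's sequence pointer as one extra pointer. The function name `min_occupancy` is hypothetical.

```python
def min_occupancy(n):
    """Minimum (pointers, keys) per node kind for a B+tree of order n.

    Assumes ceiling rounding for non-leaf nodes and floor rounding for leaves.
    """
    half_up = (n + 2) // 2    # ceil((n+1)/2)
    half_down = (n + 1) // 2  # floor((n+1)/2)
    return {
        "non-leaf": (half_up, half_up - 1),
        "leaf": (half_down + 1, half_down),  # +1 for the sequence pointer
        "root": (2, 1),                      # the slide notes the 2 could be 1
    }


order3 = min_occupancy(3)
order4 = min_occupancy(4)
```

For n = 3 this gives a minimum of one key in a non-leaf node and two keys in a leaf, matching the minimum nodes [30] and [30 35] in the order-3 example.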

