Chapter 1.3: Data Models and DBMS Architecture Title: Anatomy of a Database System Authors: J. Hellerstein, M. Stonebraker Pages: 43-95.


1 Chapter 1.3: Data Models and DBMS Architecture Title: Anatomy of a Database System Authors: J. Hellerstein, M. Stonebraker Pages: 43-95

2 Anatomy of a Database System Problem –Problem Statement –Why is this problem important? –Why is this problem hard? Approaches –Approach description, key concepts –Contributions (novelty, improved) –Assumptions

3 Problem Statement – DBMS Architecture Given –A data model –Platform, i.e. operating system, computer hardware architecture Find –A DBMS architecture –A set of building-block components –Interactions among building blocks Objectives –Efficiency, Scalability –Extensibility Constraints –Relational Data Model

4 Why is this problem important? Why review Relational DBMS architectural innovations? –Backbone of infrastructure applications Banking, airline reservation, medical records, CRM, SCM, … –Well-understood point of reference for new extensions and future revolutions Architecture allows –Analysis of properties Availability, fault-tolerance, reliability –Mapping of multiple views User requirements to components - validation and acceptance tests Software developers, maintainers, … Software operational support group

5 Why is this problem hard? Complexity –Mid-1970s – Efficient implementation of a Relational DBMS –Declarative Query Language –Logical and physical independence Changes –Platforms evolve Computer Hardware, Languages, Operating Systems Storage: Tapes → Disks (1960s) → RAID (1990s) → SAN … CPUs: Mainframe → Mini → Desktops → Multi-core CPUs (2000s) … –Integrate many views Enterprise – performance level, transaction reliability, … Data Processing Needs – data warehouses, reports, OLTP, Web, …

6 Contributions, Validation Methodology Contributions –A simple yet relatively comprehensive RDBMS architecture –Decomposition into 4 components –Identification of dependencies Validation –Ability to explain academic and commercial RDBMSs –Expert opinion, authors have architected multiple DBMSs

7 Proposed Approach Four Components (Figure 1, p. 44) –A Process Manager –Query Processing Engine –Transactional Storage Subsystem –Shared Utilities, e.g. Disk space management Interactions among components –Not explicit in Figure 1 –Implicit: Left-top to lower-right flow

8 Component 1 – Process Manager Responsibilities - Organization of processes Platform: Uni-processor, High-performance OS threads Two Options –Process per user (connection) Issues - scalability –Server Process (+ I/O Process per disk) Dispatcher thread, log manager thread Pool of worker threads Shared data (e.g. log, I/O buffer) in common heap space Issues – asynchronous I/O, protection across threads, … Client – Server communication –network socket Q? What is new in this paper relative to Parallel Database paper by DeWitt et al.?
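The server-process option on this slide (a dispatcher feeding a pool of worker threads that share heap data) can be sketched in a few lines. This is only an illustrative toy, not the paper's design: `run_server`, the `upper()` stand-in for query execution, and the `None` shutdown sentinel are all assumptions made for the example.

```python
import queue
import threading


def run_server(requests, num_workers=4):
    """Dispatch each request to a fixed pool of worker threads (toy model)."""
    work = queue.Queue()
    results = {}                      # shared data living in the common heap
    results_lock = threading.Lock()   # protection across threads

    def worker():
        while True:
            item = work.get()
            if item is None:          # hypothetical shutdown sentinel
                break
            req_id, payload = item
            answer = payload.upper()  # stand-in for executing a query
            with results_lock:
                results[req_id] = answer

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()

    # Dispatcher role: hand each incoming request to the pool.
    for req_id, payload in enumerate(requests):
        work.put((req_id, payload))
    for _ in threads:
        work.put(None)                # one sentinel per worker
    for t in threads:
        t.join()
    return results
```

Calling `run_server(["select 1", "select 2"], num_workers=2)` returns one result per request no matter which worker handled it. The number of threads is fixed by the pool size rather than by the number of connections, which is the scalability argument against process-per-connection.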

9 Component 1 – Issues Mapping DBMS threads to OS Processes –Absence of OS threads – page 50 – Commercial examples – last para, sec. 2.2.1, page 51 Parallelism (Figures 5-7, pp. 52-54) –Shared memory – previous architectures port easily –Shared nothing Query processing parallelizes w/ horizontal data partitioning Two-phase commit needs communication Partial failure –Shared disk Distributed lock manager, cache coherency protocol, … Admission Control –Avoid thrashing (working set > memory buffers) –Control number of connections, number of queries
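The admission-control bullets can be illustrated with a small gatekeeper that refuses new queries once either a query-count limit or a memory budget would be exceeded. This is a sketch under stated assumptions: the class name, the two limits, and the idea that each query arrives with a memory estimate are all hypothetical, chosen only to make the slide's "working set > memory buffers" condition concrete.

```python
class AdmissionController:
    """Toy admission control: cap concurrent queries and their total memory."""

    def __init__(self, max_queries, memory_budget):
        self.max_queries = max_queries      # hypothetical limit on running queries
        self.memory_budget = memory_budget  # hypothetical buffer memory, arbitrary units
        self.running = 0
        self.memory_in_use = 0

    def try_admit(self, estimated_memory):
        """Admit the query only if both limits still hold afterwards."""
        if self.running >= self.max_queries:
            return False
        if self.memory_in_use + estimated_memory > self.memory_budget:
            return False                    # admitting would risk thrashing
        self.running += 1
        self.memory_in_use += estimated_memory
        return True

    def release(self, estimated_memory):
        """Return a finished query's resources to the pool."""
        self.running -= 1
        self.memory_in_use -= estimated_memory
```

A rejected query would typically be queued and retried after a `release`, so the system degrades by delaying work instead of by thrashing.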

10 Component 2 – Query Processor Responsibility: –SQL query → execution plan (Fig. 8, p. 64) Subcomponents –Parsing and Authorization –Catalogs –Query rewrite – views, constant expressions, semantic optimization, sub-query flattening –Optimizer – plan space, selectivity estimation, search, parallelism, extensibility, auto-tuning, … –Executor – iterator model (Figure 9, p. 68) Q? What is new in optimizer since Selinger?
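The executor's iterator model can be sketched as a tree of operators that all expose the same small interface, with each `get_next()` call pulling one tuple from the operator below. The operator names (`Scan`, `Filter`, `Project`), the dictionary row format, and the `run` helper are assumptions for illustration; only the `init()` / `get_next()` style of interface comes from the slides.

```python
class Scan:
    """Leaf operator: yields rows from an in-memory list, one per call."""

    def __init__(self, rows):
        self.rows = rows

    def init(self):
        self.pos = 0

    def get_next(self):
        if self.pos >= len(self.rows):
            return None               # end of stream
        row = self.rows[self.pos]
        self.pos += 1
        return row


class Filter:
    """Passes through only the child rows that satisfy the predicate."""

    def __init__(self, child, predicate):
        self.child = child
        self.predicate = predicate

    def init(self):
        self.child.init()

    def get_next(self):
        row = self.child.get_next()
        while row is not None and not self.predicate(row):
            row = self.child.get_next()
        return row


class Project:
    """Keeps only the named columns of each child row."""

    def __init__(self, child, columns):
        self.child = child
        self.columns = columns

    def init(self):
        self.child.init()

    def get_next(self):
        row = self.child.get_next()
        if row is None:
            return None
        return {c: row[c] for c in self.columns}


def run(plan):
    """Drive the plan from the root, pulling tuples until exhausted."""
    plan.init()
    out = []
    row = plan.get_next()
    while row is not None:
        out.append(row)
        row = plan.get_next()
    return out
```

Because every operator has the same interface, a plan such as `Project(Filter(Scan(rows), pred), ["name"])` composes freely, and no intermediate result is materialized between operators.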

11 Component 2 – Query Processor Issues Data Modification Statements –Plans are more complex –Ex. Halloween problem (Fig. 10, p. 71) Access Methods –Unordered files, B+-tree, R-tree and bit-map indexes –API methods – init(), get_next(), … –Search by logical conditions (sarg) or record-id –Interacts with concurrency and recovery sub-components
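The Halloween problem is easier to see in a small simulation: an update that scans a salary index in key order while also moving the updated entries forward in that same index can meet the same row more than once. The sketch below is a toy model, not the figure from the paper; the sorted list standing in for an index, the flat raise amount, and both function names are assumptions.

```python
import bisect


def naive_update(salaries, limit, raise_amt):
    """Pipelined update: scan the salary index and update it in place."""
    index = sorted((sal, emp) for emp, sal in salaries.items())
    cursor = 0
    n_updates = 0
    while cursor < len(index) and index[cursor][0] < limit:
        sal, emp = index[cursor]
        del index[cursor]
        # Re-inserting the raised salary moves the entry ahead of the
        # cursor, so the scan will encounter the same employee again.
        bisect.insort(index, (sal + raise_amt, emp))
        n_updates += 1
    return {emp: sal for sal, emp in index}, n_updates


def safe_update(salaries, limit, raise_amt):
    """Collect the qualifying record ids first, then apply the updates."""
    targets = [emp for emp, sal in salaries.items() if sal < limit]
    result = dict(salaries)
    for emp in targets:
        result[emp] += raise_amt
    return result, len(targets)
```

With `{"a": 10, "b": 12, "c": 30}`, a limit of 20, and a raise of 5, the naive version performs four updates and leaves `a` at 20 and `b` at 22, while the safe version performs two and leaves them at 15 and 17. Separating the read phase from the write phase is one common way to avoid the problem, at the cost of materializing the list of targets.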

12 Component 3 – Transactional Storage Manager Responsibilities – ACID properties Subcomponents –Lock Manager Serializability, 2PL, Isolation levels (p. 76) –Log Manager WAL – 3 rules (p. 78), performance tuning –Buffer pool –Access methods Latches in B+trees (p. 80) – conservative, latch-coupling, right-link Predicate locks – next-key locking
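As a rough illustration of the lock manager's role, the sketch below keeps a lock table with shared (S) and exclusive (X) modes and releases all of a transaction's locks together, which is the point at which strict 2PL lets go of them (commit or abort). It is a deliberately simplified, non-blocking model: the class and method names are assumptions, and a real lock manager would queue waiters and detect deadlocks instead of returning `False`.

```python
class LockManager:
    """Toy lock table with shared (S) and exclusive (X) modes."""

    def __init__(self):
        self.table = {}  # object name -> [mode, set of holding transactions]

    def try_lock(self, txn, name, mode):
        """Grant the lock if compatible; return False instead of blocking."""
        entry = self.table.get(name)
        if entry is None:
            self.table[name] = [mode, {txn}]
            return True
        held_mode, holders = entry
        if holders == {txn}:
            if mode == "X":
                entry[0] = "X"        # sole holder may upgrade S to X
            return True
        if mode == "S" and held_mode == "S":
            holders.add(txn)          # S is compatible with S
            return True
        return False                  # any combination involving X conflicts

    def release_all(self, txn):
        """Drop every lock held by txn, as done at commit or abort."""
        for name in list(self.table):
            holders = self.table[name][1]
            holders.discard(txn)
            if not holders:
                del self.table[name]
```

Two readers can share an object, but a writer has to wait until it is the only holder; releasing everything at once at transaction end is what keeps other transactions from seeing uncommitted changes.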

13 Component 3 – Transactional Storage Manager Interdependencies among subcomponents –Lock Manager, Log Manager WAL assumes strict 2PL (p. 82) Q? What would happen without strict 2PL? –Concurrency control, Access Methods Methods are unique to index types

14 Component 4 – Shared Utilities Sub-components –Memory allocator (p. 84) –Disk management subsystem Map tables to devices or files New issues with RAIDs (pp. 86-87) –Replication services Physical, trigger-based, log-based –Batch utilities Optimizer statistics gathering, backup/export, physical reorg and index construction

15 Summary Paper’s focus –DBMS Architectures – components and dependencies Insights - Four Components (Figure 1, p. 44) –A Process Manager –Query Processing Engine –Transactional Storage Subsystem –Shared Utilities, e.g. Disk space management Interactions among components –Not explicit in Figure 1 –Q. List a few discussed in the paper!

16 Assumptions, Rewrite today Assumptions –Focus on Relational DBMS –Centralized DBMS (Recall T2.6 on R*) –Four component architecture reminds one of Ingres! –Lessons translate over to new domains Rewrite today –Cover a post-relational DBMS, e.g. Stream or XML –Illustrate how lessons translate over web-services, e-mail repositories, network monitors, etc.

