Faloutsos/Pavlo C. Faloutsos – A. Pavlo Lecture#25: Column Stores

Presentation transcript:

Carnegie Mellon Univ., Dept. of Computer Science
15-415/615 - DB Applications
C. Faloutsos – A. Pavlo
Lecture#25: Column Stores

Administrivia
HW7 Phase 1: Wed Nov 9th
HW7 Phase 2: Mon Nov 28th
HW8: Mon Dec 5th
Final Exam: Tue Dec 13th @ 5:30pm
The exam will be held in two locations. We will send an email with your assigned room.
Faloutsos/Pavlo CMU SCS 15-415/615

Today’s Class
Storage Models, System Architectures, Vectorization, Compression, Data Modification

Wikipedia Example
    CREATE TABLE useracct (
      userID INT PRIMARY KEY,
      userName VARCHAR UNIQUE,
      ⋮
    );
    CREATE TABLE pages (
      pageID INT PRIMARY KEY,
      title VARCHAR UNIQUE,
      latest INT REFERENCES revisions (revID),
      ⋮
    );
    CREATE TABLE revisions (
      revID INT PRIMARY KEY,
      pageID INT REFERENCES pages (pageID),
      userID INT REFERENCES useracct (userID),
      content TEXT,
      updated DATETIME
    );

OLTP
On-line Transaction Processing: short-lived txns, small footprint, repetitive operations.
    SELECT * FROM useracct
     WHERE userName = ? AND userPass = ?

    SELECT P.*, R.*
      FROM pages AS P INNER JOIN revisions AS R ON P.latest = R.revID
     WHERE P.pageID = ?

    UPDATE useracct SET lastLogin = NOW(), hostname = ?
     WHERE userID = ?

    INSERT INTO revisions VALUES (?,?…,?)

OLAP
On-line Analytical Processing: long-running queries, complex joins, exploratory queries.
    SELECT COUNT(U.lastLogin),
           EXTRACT(month FROM U.lastLogin) AS month
      FROM useracct AS U
     WHERE U.hostname LIKE '%.gov'
     GROUP BY EXTRACT(month FROM U.lastLogin)

Data Storage Models
There are different ways to store tuples. We have been assuming the n-ary storage model this entire semester.

n-ary Storage Model
The DBMS stores all attributes for a single tuple contiguously in a block.
[Figure: NSM disk page; each tuple stores userID, userName, userPass, lastLogin, hostname side by side.]

n-ary Storage Model
    SELECT * FROM useracct
     WHERE userName = ? AND userPass = ?

    INSERT INTO useracct VALUES (?,?,…?)
[Figure: a B+Tree pointing into an NSM disk page whose tuples store userID, userName, userPass, lastLogin, hostname.]

n-ary Storage Model
    SELECT COUNT(U.lastLogin),
           EXTRACT(month FROM U.lastLogin) AS month
      FROM useracct AS U
     WHERE U.hostname LIKE '%.gov'
     GROUP BY EXTRACT(month FROM U.lastLogin)
[Figure: NSM disk page (userID, userName, userPass, lastLogin, hostname) with parts of the page marked X. The query needs only lastLogin and hostname, yet the DBMS must read every attribute of every tuple.]

n-ary Storage Model
Advantages: Fast inserts, updates, and deletes. Good for queries that need the entire tuple.
Disadvantages: Not good for scanning large portions of the table and/or a subset of the attributes.

Decomposition Storage Model
The DBMS stores a single attribute for all tuples contiguously in a block.
[Figure: the useracct table split into DSM disk pages, one per attribute: userID, userName, userPass, lastLogin, hostname.]

Decomposition Storage Model
    SELECT COUNT(U.lastLogin),
           EXTRACT(month FROM U.lastLogin) AS month
      FROM useracct AS U
     WHERE U.hostname LIKE '%.gov'
     GROUP BY EXTRACT(month FROM U.lastLogin)
[Figure: the query reads the DSM disk page for hostname.]

Decomposition Storage Model
Advantages: Reduces the amount of wasted I/O because the DBMS only reads the data that it needs. Better query processing and data compression (more on this later).
Disadvantages: Slow for point queries, inserts, updates, and deletes because of tuple splitting/stitching.
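
The difference between the two layouts can be sketched in a few lines of Python. This is only an illustration: the sample rows are invented, a flat list stands in for an NSM page, and a dict of lists stands in for the per-attribute DSM pages.

```python
from collections import Counter

COLUMNS = ["userID", "userName", "userPass", "lastLogin", "hostname"]

# Hypothetical useracct tuples (values invented for illustration).
rows = [
    (1, "alice", "pw1", "2016-11-01", "a.gov"),
    (2, "bob", "pw2", "2016-11-02", "b.com"),
    (3, "carol", "pw3", "2016-12-05", "c.gov"),
]

def to_nsm(tuples):
    """NSM: all attributes of one tuple are stored next to each other."""
    return [value for t in tuples for value in t]

def to_dsm(tuples):
    """DSM: all values of one attribute are stored next to each other."""
    return {name: [t[i] for t in tuples] for i, name in enumerate(COLUMNS)}

def gov_logins_per_month(pages):
    """The OLAP query from the slides, touching only hostname and lastLogin."""
    months = Counter()
    for host, login in zip(pages["hostname"], pages["lastLogin"]):
        if host.endswith(".gov"):
            months[login[5:7]] += 1  # month part of 'YYYY-MM-DD'
    return dict(months)

nsm_page = to_nsm(rows)
dsm_pages = to_dsm(rows)
per_month = gov_logins_per_month(dsm_pages)
```

With the DSM layout the query never looks at userID, userName, or userPass; with the NSM page it would have to step over them for every tuple.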

History
1970s: Cantor DBMS
1980s: DSM Proposal
1990s: SybaseIQ (in-memory only)
2000s: Vertica, VectorWise, MonetDB
2010s: Cloudera Impala, Amazon Redshift, “The Big Three”

System Architectures
Fractured Mirrors, Partition Attributes Across (PAX), Pure Columnar Storage

Fractured Mirrors
Store a second copy of the database in a DSM layout that is automatically updated.
Examples: Oracle, IBM DB2 BLU
[Figure: an NSM copy mirrored to a DSM copy.]

PAX
Data is still stored in NSM blocks, but each block is organized as mini columns.
[Figure: PAX disk page with one mini column each for userID, userName, userPass, lastLogin, hostname.]
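
A minimal sketch of the PAX idea, assuming a page is just a Python list and `tuples_per_page` is a made-up capacity: tuples are still assigned to pages as in NSM, but inside each page the values are regrouped by attribute.

```python
def to_pax_pages(tuples, tuples_per_page):
    """Split tuples into pages, then store each page as mini columns."""
    pages = []
    for start in range(0, len(tuples), tuples_per_page):
        chunk = tuples[start:start + tuples_per_page]
        # zip(*chunk) transposes the rows of this page into one
        # mini column per attribute.
        pages.append([list(column) for column in zip(*chunk)])
    return pages

# Hypothetical (userID, userName) tuples.
rows = [(1, "a"), (2, "b"), (3, "c")]
pax = to_pax_pages(rows, tuples_per_page=2)
```

All attributes of a tuple stay on the same page, so stitching a tuple back together never crosses a page boundary.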

Column Stores
The entire system is designed for columnar data: query processing, storage, operator algorithms, indexing, etc.
Examples: Vertica, VectorWise, Paraccel, Cloudera Impala, Amazon Redshift

Today’s Class
Storage Models, System Architectures, Vectorization, Compression, Data Modification

Query Processing Strategies
The DBMS needs to process queries differently when using columnar data. We have already discussed the Iterator Model for processing tuples in the DBMS query operators.

Iterator Model
Each operator calls next() on its child operator to process tuples one at a time.
    SELECT cname, amt
      FROM customer, account
     WHERE customer.acctno = account.acctno
       AND account.amt > 1000
[Figure: query plan over CUSTOMER and ACCOUNT with a selection σ (amt>1000), a join ⨝ (acctno=acctno), and a projection π (cname, amt); next calls flow from each operator to its child.]
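
Python generators give a compact way to sketch the Iterator Model. Everything below is illustrative: the operator names, the hash join, and the sample rows are assumptions, not the plan or algorithms from the slides.

```python
def scan(table):
    """Leaf operator: emit one tuple per next() call."""
    for row in table:
        yield row

def select(child, predicate):
    """Pull tuples from the child and pass on those that qualify."""
    for row in child:
        if predicate(row):
            yield row

def hash_join(build_child, probe_child, key):
    """Drain the build side into a hash table, then stream the probe side."""
    index = {}
    for row in build_child:
        index.setdefault(row[key], []).append(row)
    for row in probe_child:
        for match in index.get(row[key], []):
            yield {**match, **row}

def project(child, attrs):
    for row in child:
        yield tuple(row[a] for a in attrs)

# Hypothetical sample data.
customer = [{"acctno": 1, "cname": "Ann"}, {"acctno": 2, "cname": "Bo"}]
account = [{"acctno": 1, "amt": 500}, {"acctno": 2, "amt": 2500}]

plan = project(
    hash_join(scan(customer),
              select(scan(account), lambda r: r["amt"] > 1000),
              "acctno"),
    ["cname", "amt"],
)
first = next(plan)      # the root asks its child for exactly one tuple
remaining = list(plan)  # keep calling next() until the plan is exhausted
```

Each `next()` at the root ripples down the tree, so only one tuple is in flight per operator at a time (apart from the join's hash table).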

Materialization Model
Each operator consumes its entire input and generates the full output all at once.
    SELECT cname, amt
      FROM customer, account
     WHERE customer.acctno = account.acctno
       AND account.amt > 1000
[Figure: the same query plan over CUSTOMER and ACCOUNT with selection σ (amt>1000), join ⨝ (acctno=acctno), and projection π (cname, amt); there are no next calls.]
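
For contrast, here is a rough sketch of the same query under the Materialization Model, where every operator takes a complete list and returns a complete list. The function names and sample rows are again hypothetical.

```python
def select_all(rows, predicate):
    """Consume the whole input, return the whole output."""
    return [r for r in rows if predicate(r)]

def join_all(left_rows, right_rows, key):
    index = {}
    for row in left_rows:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for r in right_rows for l in index.get(r[key], [])]

def project_all(rows, attrs):
    return [tuple(r[a] for a in attrs) for r in rows]

# Hypothetical sample data.
customer = [{"acctno": 1, "cname": "Ann"}, {"acctno": 2, "cname": "Bo"}]
account = [
    {"acctno": 1, "amt": 500},
    {"acctno": 2, "amt": 2500},
    {"acctno": 1, "amt": 4000},
]

# Every intermediate result exists in full before the next operator starts.
selected = select_all(account, lambda r: r["amt"] > 1000)
joined = join_all(customer, selected, "acctno")
result = project_all(joined, ["cname", "amt"])
```

The intermediate lists `selected` and `joined` are what can outgrow memory on a large table.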

Observations
The Iterator Model is bad with a DSM because it requires the DBMS to stitch tuples back together each time.
The Materialization Model is bad because the intermediate results may be larger than the amount of memory in the system.

Vectorized Model
Like the Iterator Model, but each next() invocation returns a vector of tuples instead of a single tuple. This vector does not have to contain the entire tuple, just the attributes that are needed for query processing.

Vectorized Model
Each operator calls next() on its child operator to process vectors.
    SELECT cname, amt
      FROM customer, account
     WHERE customer.acctno = account.acctno
       AND account.amt > 1000
[Figure: query plan over CUSTOMER and ACCOUNT with operators labeled amt>1000, acctno=acctno, "cname, amt", and "acctno, amt"; vectors of the acctno and amt columns are passed upward in response to next calls.]
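
A small sketch of vector-at-a-time processing over a column layout, under the assumption that a "vector" is a dict of equally long Python lists and that `vector_size` is a tunable batch size (both are illustrative choices, as is the extra `branch` column).

```python
def scan_vectors(columns, attrs, vector_size):
    """Each next() returns a slice of only the requested columns."""
    total = len(columns[attrs[0]])
    for start in range(0, total, vector_size):
        yield {a: columns[a][start:start + vector_size] for a in attrs}

def select_vectors(child, attr, predicate):
    """Filter a whole vector per call instead of a single tuple."""
    for vec in child:
        keep = [i for i, v in enumerate(vec[attr]) if predicate(v)]
        if keep:
            yield {a: [vals[i] for i in keep] for a, vals in vec.items()}

# Hypothetical account table stored column by column.
account = {
    "acctno": [1, 2, 3, 4, 5],
    "amt": [500, 2500, 1500, 100, 9000],
    "branch": ["x", "y", "x", "z", "y"],  # never read by this plan
}

plan = select_vectors(
    scan_vectors(account, ["acctno", "amt"], vector_size=2),
    "amt",
    lambda v: v > 1000,
)
batches = list(plan)
```

The per-call overhead of `next()` is paid once per vector rather than once per tuple, and the inner loops run over plain arrays of a single attribute.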

Virtual IDs vs. Offsets
Need a way to stitch tuples back together. Two approaches:
Fixed-length offsets
Virtual ids embedded in columns
[Figure: two copies of the columns userID, userName, userPass, hostname, lastLogin. In "Offsets" the position within each column identifies the tuple; in "Virtual Ids" each column entry carries its tuple id.]
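
The two stitching approaches can be sketched as follows. The column contents are invented, and building a dict on every lookup is done only to keep the example short.

```python
# Approach 1: fixed-length offsets. The i-th entry of every column
# belongs to tuple i, so no ids need to be stored.
offset_columns = {
    "userID": [10, 11, 12],
    "userName": ["ann", "bo", "cy"],
}

def stitch_by_offset(columns, offset):
    return tuple(col[offset] for col in columns.values())

# Approach 2: virtual ids embedded in the columns. Every entry is an
# (id, value) pair, so entries may appear in any order.
id_columns = {
    "userID": [(0, 10), (1, 11), (2, 12)],
    "userName": [(2, "cy"), (0, "ann"), (1, "bo")],
}

def stitch_by_id(columns, tuple_id):
    # A real system would keep an index; dict() is rebuilt here for brevity.
    return tuple(dict(col)[tuple_id] for col in columns.values())

by_offset = stitch_by_offset(offset_columns, 1)
by_id = stitch_by_id(id_columns, 1)
```

Offsets cost no extra space but require every column to keep the same order and fixed-width entries; embedded ids cost space per value but relax that requirement.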

Vectorized Model
Reduced interpretation overhead. Better cache locality. Compiler optimization opportunities.
AFAIK, VectorWise is still the only system that uses this model. Other systems use query compilation instead…

Today’s Class
Storage Models, System Architectures, Vectorization, Compression, Data Modification

Compression Overview
Compress the database to reduce the amount of I/O needed to process queries. DSM databases compress much better than NSM databases. Storing similar data together is ideal for compression algorithms.

Naïve Compression
Use a general-purpose algorithm to compress pages when they are stored on disk.
Example: 10KB page in memory, 4KB compressed page on disk.
Do we have to decompress the page when it is brought into memory? Why or why not?

Fixed-width Compression
Sacrifice some compression in exchange for having uniform-length values per attribute.
[Figure: three versions of the columns userID, userName, userPass, hostname, lastLogin: "Original Data", "Fixed-Length Compression", and "Variable-Length Compression", with the annotation "Tuples are no longer aligned at offsets".]

Run-length Encoding
Compress runs of the same value into a compact triplet: (value, startPosition, runLength)
Sorting all tuples on this column reduces the # of triplets.
[Figure: a sex column shown four ways.
Original Data: M M M F F M M M
Unsorted RLE: (M,0,3) (F,3,2) (M,5,3)
Sorted Data: M M M M M M F F
Sorted RLE: (M,0,6) (F,6,2)]
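
A short sketch of the triplet scheme, using the sex column from the slide. The function names are made up for this example.

```python
def rle_encode(values):
    """Collapse runs into (value, startPosition, runLength) triplets."""
    runs = []
    for pos, v in enumerate(values):
        if runs and runs[-1][0] == v:
            value, start, length = runs[-1]
            runs[-1] = (value, start, length + 1)  # extend the current run
        else:
            runs.append((v, pos, 1))               # start a new run
    return runs

def rle_decode(runs):
    out = []
    for value, _start, length in runs:
        out.extend([value] * length)
    return out

sex = list("MMMFFMMM")
unsorted_rle = rle_encode(sex)
sorted_rle = rle_encode(sorted(sex, reverse=True))  # all M, then all F
```

Sorting turns three triplets into two here; on a real table with long runs the saving is far larger.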

Delta Encoding
Record the difference between successive values in the same column.
[Figure: a (time, temp) table shown three ways.
Original Data: (12:00, 99.5) (12:01, 99.4) (12:02, 99.5) (12:03, 99.6) (12:04, 99.6) (12:05, 99.5) (12:06, 99.4) (12:07, 99.5)
Delta Encoding: each column keeps its first value (12:00 and 99.5) followed by differences; every time difference is +1.
Delta+RLE: the time column becomes 12:00 followed by the single run (+1,7).]
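
The sketch below applies delta encoding to the slide's data. To keep the arithmetic exact it assumes times are stored as minutes since midnight and temperatures as integer tenths of a degree; the `(delta, runLength)` pair format for Delta+RLE follows the `(+1,7)` shown on the slide.

```python
from itertools import groupby

def delta_encode(values):
    """Keep the first value, then store successive differences."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]

def delta_decode(encoded):
    out = []
    for i, d in enumerate(encoded):
        out.append(d if i == 0 else out[-1] + d)
    return out

def rle_deltas(encoded):
    """Delta+RLE: base value plus (delta, runLength) pairs."""
    base, deltas = encoded[0], encoded[1:]
    return base, [(d, len(list(run))) for d, run in groupby(deltas)]

times = [720, 721, 722, 723, 724, 725, 726, 727]  # 12:00 .. 12:07
temps = [995, 994, 995, 996, 996, 995, 994, 995]  # 99.5, 99.4, ...

time_deltas = delta_encode(times)
temp_deltas = delta_encode(temps)
time_delta_rle = rle_deltas(time_deltas)
```

The deltas are small numbers that fit in far fewer bits than the original values, and a regularly spaced column such as `times` collapses to one run.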

Bit-Vector Encoding
Store a separate bit-vector for each unique value for a particular attribute, where an offset in the vector corresponds to a tuple.
[Figure: a sex column and its bit-vectors.
Original Data: M M M F F M M M
Bit-Vector Compression:
M → 1 1 1 0 0 1 1 1
F → 0 0 0 1 1 0 0 0]
A ‘1’ means that the tuple at that offset has the bit-vector’s value.
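
A minimal sketch of bit-vector encoding for the same column, with Python lists of 0/1 standing in for packed bits.

```python
def bitvector_encode(values):
    """One bit-vector per distinct value; offset i describes tuple i."""
    vectors = {}
    for offset, v in enumerate(values):
        vectors.setdefault(v, [0] * len(values))[offset] = 1
    return vectors

sex = list("MMMFFMMM")
vectors = bitvector_encode(sex)

# Predicates become bit operations, e.g. COUNT(*) WHERE sex = 'F':
female_count = sum(vectors["F"])
```

This only pays off when the attribute has few distinct values, since each value needs its own vector as long as the table.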

Dictionary Compression
Replace frequent patterns with smaller integer codes. Need to support fast encoding and decoding. Need to also support range queries.

Dictionary Compression
Construct a separate table of the unique values for an attribute, sorted by value.
    SELECT * FROM users WHERE name LIKE 'Tru%'
can then be evaluated on the codes as
    SELECT * FROM users WHERE name BETWEEN 70 AND 80

Original Data
    userId  name
    101     Truman
    102     Obama
    103     Bush
    104     Reagan
    105     Trump
    106     Nixon
    107     Carter
    108     Ford

Compressed Data
    userId  name        value   code
    101     70          Bush    10
    102     50          Carter  20
    103     10          Ford    30
    104     60          Nixon   40
    105     80          Obama   50
    106     40          Reagan  60
    107     20          Truman  70
    108     30          Trump   80

Dictionary Compression
A dictionary needs to support two operations:
Encode: For a given uncompressed value, convert it into its compressed form.
Decode: For a given compressed value, convert it back into its original form.
We need two data structures to support operations in both directions.
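
An order-preserving dictionary for the name column can be sketched like this. Two Python dicts play the role of the two data structures; the codes 10, 20, … mirror the slide, although any increasing sequence would do, and `prefix_to_code_range` is a hypothetical helper.

```python
names = ["Truman", "Obama", "Bush", "Reagan",
         "Trump", "Nixon", "Carter", "Ford"]

def build_dictionary(values):
    """Assign increasing codes in sorted order so comparisons carry over."""
    ordered = sorted(set(values))
    decode = {10 * (i + 1): v for i, v in enumerate(ordered)}
    encode = {v: code for code, v in decode.items()}
    return encode, decode

def prefix_to_code_range(prefix, decode):
    """Turn LIKE 'prefix%' into an inclusive (low, high) range of codes."""
    matching = [code for code, v in decode.items() if v.startswith(prefix)]
    return (min(matching), max(matching)) if matching else None

encode, decode = build_dictionary(names)
codes = [encode[n] for n in names]              # the compressed column

low, high = prefix_to_code_range("Tru", decode)
hits = [decode[c] for c in codes if low <= c <= high]
```

Because the codes follow the sort order of the values, all names sharing a prefix map to a contiguous code range, so the scan compares integers and never touches the strings.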

Summary
Some operator algorithms can operate directly on compressed data. Saves I/O without having to decompress!
Difficult to implement when the DBMS uses multiple compression schemes.
It’s generally good to wait as long as possible to materialize/decompress data when processing queries…

Today’s Class
Storage Models, System Architectures, Vectorization, Compression, Data Modification

Bifurcated Architecture
All txns are executed on the OLTP database. Periodically migrate changes to the OLAP database.
[Figure: several OLTP databases feed an OLAP data warehouse through Extract, Transform, Load.]

Modifying a Column Store
Updating compressed data is expensive. Updating sorted data is expensive.
The DBMS will store updates in a staging area and then apply them in batches.
Queries have to be executed on both the staging area and the main storage.

Delta Store
Stage updates in a delta store and periodically apply them in batches to the main storage.
Examples: Vertica, SAP HANA
[Figure: a Delta store alongside the Main storage.]
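
A toy version of the main/delta split, assuming a single column of integers and a sorted Python list as the "read-optimized" main storage. The class and method names are invented for this sketch and do not describe how Vertica or SAP HANA implement it.

```python
class ColumnWithDelta:
    def __init__(self, values):
        self.main = sorted(values)  # sorted, read-optimized main storage
        self.delta = []             # unsorted, append-only staging area

    def insert(self, value):
        # Cheap: no re-sorting or re-compressing of the main storage.
        self.delta.append(value)

    def scan(self):
        # Queries must see both the main storage and the staged updates.
        yield from self.main
        yield from self.delta

    def merge(self):
        # Periodic batch step: fold the delta into the main storage.
        self.main = sorted(self.main + self.delta)
        self.delta = []

col = ColumnWithDelta([3, 1])
col.insert(2)
visible_before_merge = sorted(col.scan())
main_before_merge = list(col.main)
col.merge()
```

The expensive re-sort happens once per batch in `merge()` instead of once per insert.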

HTAP
Hybrid Transaction-Analytical Processing: a single database instance that can handle both OLTP workloads and OLAP queries.
Row-store for OLTP
Column-store for OLAP
Examples: SAP HANA, MemSQL, HyPer, SpliceMachine, Peloton, Cloudera Kudu (???)

Conclusion
If you’re running OLAP queries, you need to be using a column store. Don’t let anybody try to tell you otherwise.

Rest of the Semester
http://cmudb.io/f16-systems
Mon Nov 28th – Column Stores
Wed Nov 30th – Data Warehousing + Mining
Mon Dec 5th – SpliceMachine Guest Speaker
Wed Dec 7th – Review + Systems Potpourri