Query Execution in Main Memory DBMS

TPC-H Q1
Scan, select 99% of the rows, group into 4 groups, aggregate 8 numbers.
    select l_returnflag, l_linestatus,
           sum(l_quantity) as sum_qty,
           sum(l_extendedprice) as sum_base_price,
           sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
           sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
           avg(l_quantity) as avg_qty,
           avg(l_extendedprice) as avg_price,
           avg(l_discount) as avg_disc,
           count(*) as count_order
    from lineitem
    where l_shipdate <= date '1998-12-01' - interval '90' day (3)
    group by l_returnflag, l_linestatus
    order by l_returnflag, l_linestatus;

TPC-H Q1 100s 0.25s

TPC-H Q1 100s 96s 0.25s

Chain of Work
- MonetDB (1999)
- MonetDB X100 (2005)
- Hyper (2011)
How much faster?

Properties of RAM
- Volatile
- Expensive (100x HDD)
- Random access: how random? In units of 64 bytes
- Memory pages: physically addressed
- Address translation: complicated and expensive, but cacheable in a designated address cache
- Fast: how fast? 1600 MHz * 4 channels * 8 bytes ~ 50 GBps
- 100x faster than disk

Columnar Layout
- Reads less data
- No tuple header overhead
- Better cache utilization
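To make the "reads less data" point concrete, here is a minimal C++ sketch contrasting a row layout with a columnar layout for a scan over one attribute. The struct names, the choice of four 64-bit attributes, and both functions are assumptions made for illustration; they are not taken from any of the systems discussed.

```cpp
#include <cstdint>
#include <vector>

// Row layout (hypothetical): one struct per tuple. Scanning a single
// attribute still strides over every other attribute of the tuple.
struct LineItemRow {
    std::int64_t quantity;
    std::int64_t extended_price;
    std::int64_t discount;
    std::int64_t tax;
};

// Columnar layout (hypothetical): one contiguous array per attribute.
struct LineItemColumns {
    std::vector<std::int64_t> quantity;
    std::vector<std::int64_t> extended_price;
    std::vector<std::int64_t> discount;
    std::vector<std::int64_t> tax;
};

// Sums one attribute by walking whole tuples.
std::int64_t sum_quantity_rows(const std::vector<LineItemRow>& rows) {
    std::int64_t sum = 0;
    for (const LineItemRow& r : rows) sum += r.quantity;
    return sum;
}

// Sums the same attribute by walking only its own array, so every
// byte brought into the cache is a byte the query needs.
std::int64_t sum_quantity_columns(const LineItemColumns& cols) {
    std::int64_t sum = 0;
    for (std::int64_t q : cols.quantity) sum += q;
    return sum;
}
```

Both functions return the same result; the difference is only in how much memory the scan has to pull through the cache per tuple.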

Properties of CPU

Function calls in Postgres

Function calls are bad 5165ms 1104ms 227ms

TPC-H Q1 Profile

Solution
Elementary columnar operations
WHERE A < 5 AND B = 2
    int v[len]  // Bitmap
    sel_lt(A, 5, v)
    sel_eq(B, 2, v)
Operators are connected by materializing intermediate results as temporary tables.
This significantly reduces the number of function calls.
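The column-at-a-time idea above can be sketched in runnable C++ as follows. The signatures of sel_lt and sel_eq, the byte-per-row bitmap, and the convention that sel_eq ANDs into an existing bitmap are assumptions made for this sketch; the slide only names the primitives.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical primitive: v[i] = (col[i] < val).
// A single call processes the whole column, so the call overhead is
// paid once per column instead of once per tuple.
void sel_lt(const std::vector<int>& col, int val, std::vector<std::uint8_t>& v) {
    for (std::size_t i = 0; i < col.size(); ++i) v[i] = col[i] < val;
}

// Hypothetical primitive: v[i] = v[i] AND (col[i] == val).
void sel_eq(const std::vector<int>& col, int val, std::vector<std::uint8_t>& v) {
    for (std::size_t i = 0; i < col.size(); ++i) v[i] &= (col[i] == val);
}

// WHERE A < 5 AND B = 2, assuming A and B have the same length.
std::vector<std::uint8_t> where_a_lt_5_and_b_eq_2(const std::vector<int>& A,
                                                  const std::vector<int>& B) {
    std::vector<std::uint8_t> v(A.size());  // materialized intermediate result
    sel_lt(A, 5, v);
    sel_eq(B, 2, v);
    return v;
}
```

Note that the bitmap v is as long as the column itself; that full-size intermediate is the materialization cost the later slides come back to.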

In MonetDB
    select l_returnflag, l_linestatus,
           sum(l_quantity) as sum_qty,
           sum(l_extendedprice) as sum_base_price,
           sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
           sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
           avg(l_quantity) as avg_qty,
           avg(l_extendedprice) as avg_price,
           avg(l_discount) as avg_disc,
           count(*) as count_order
    from lineitem
    where l_shipdate <= date '1998-12-01' - interval '90' day (3)
    group by l_returnflag, l_linestatus
    order by l_returnflag, l_linestatus;

Branching is bad
SELECT X FROM table WHERE X > 5

Predication
    if input[i] > 5: output[j++] = input[i]
Transforms a control dependency into a data dependency.
Pro: does not cause pipeline bubbles.
Con: writes additional data.
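As a hedged illustration of the transformation, the sketch below puts the branching loop next to one possible predicated form. The function names are hypothetical, and the output buffer is assumed to be at least as large as the input, because the predicated version writes a candidate value on every iteration.

```cpp
#include <cstddef>
#include <vector>

// Branching version: whether the store happens depends on a
// conditional jump that the CPU has to predict.
std::size_t select_branching(const std::vector<int>& input, std::vector<int>& output) {
    std::size_t j = 0;
    for (std::size_t i = 0; i < input.size(); ++i) {
        if (input[i] > 5) output[j++] = input[i];
    }
    return j;
}

// Predicated version: always store, then advance the cursor by 0 or 1.
// The comparison result is used as data, not as a jump condition.
// Rejected values are overwritten by the next iteration.
std::size_t select_predicated(const std::vector<int>& input, std::vector<int>& output) {
    std::size_t j = 0;
    for (std::size_t i = 0; i < input.size(); ++i) {
        output[j] = input[i];
        j += (input[i] > 5);
    }
    return j;
}
```

Both functions return the number of selected values and leave them in the first positions of output. An optimizing compiler may itself convert one form into the other, so any speed difference has to be measured rather than assumed.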

Is this it? No!

Vectorized Execution
MonetDB falls short because of the cost of materialization.
Instead of operating on a column at a time, operate on a vector at a time.

Example
For op Pos.SELECT
Without vectorized execution: would read the entire sym column and generate the entire position bitmap.
With vectorized execution: would read ~1k entries at a time and run them through the pipeline.
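A minimal sketch of what running ~1k entries at a time through a pipeline could look like, assuming a vector length of 1024 and a two-stage pipeline (select, then sum). The primitive names select_gt and sum_selected and the use of a position list instead of a bitmap are choices made for this illustration, not the X100 API.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kVectorSize = 1024;  // assumed vector length (~1k entries)

// Hypothetical primitive: writes the positions i in [0, n) with
// col[i] > val into pos and returns how many were selected.
std::size_t select_gt(const int* col, std::size_t n, int val, std::uint16_t* pos) {
    std::size_t k = 0;
    for (std::size_t i = 0; i < n; ++i) {
        pos[k] = static_cast<std::uint16_t>(i);
        k += (col[i] > val);
    }
    return k;
}

// Hypothetical primitive: sums col at the k selected positions.
std::int64_t sum_selected(const int* col, const std::uint16_t* pos, std::size_t k) {
    std::int64_t sum = 0;
    for (std::size_t i = 0; i < k; ++i) sum += col[pos[i]];
    return sum;
}

// SELECT sum(x) WHERE x > val, one vector at a time.
std::int64_t sum_where_gt(const std::vector<int>& col, int val) {
    std::uint16_t pos[kVectorSize];  // reused for every vector, never grows with the column
    std::int64_t total = 0;
    for (std::size_t start = 0; start < col.size(); start += kVectorSize) {
        std::size_t n = std::min(kVectorSize, col.size() - start);
        std::size_t k = select_gt(col.data() + start, n, val, pos);
        total += sum_selected(col.data() + start, pos, k);
    }
    return total;
}
```

The intermediate position list is bounded by the vector length rather than the column length, so it can stay in cache between the two stages, while each primitive call still amortizes its overhead over about a thousand values.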

Now

The Gap
SELECT X FROM table WHERE X > 5 AND X < 10;
In C++:
    for (i = 0; i < size; i++)
        if (x[i] > 5 && x[i] < 10)
            output[j++] = x[i]
In X100:
    for (j = 0; j < size; j += 1024)
        sel_col_lt_init(&x[j], b, 10)
        sel_col_gt_and(&x[j], b, 5)
        ret = gather(output, b, ret)
    void sel_col_gt_and(col, bitmap, val)
        for (i = 0; i < 1024; i++)
            bitmap[i] = bitmap[i] && (col[i] > val)
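The X100-style loop above can be fleshed out into compilable C++ roughly as follows. The primitive names come from the slide, but their exact signatures are assumptions: an explicit chunk length n is added so that the last, partial vector is handled, and gather is given the source column as an extra argument.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kVec = 1024;  // vector length used on the slide

// bitmap[i] = (col[i] < val); initializes the bitmap for this vector.
void sel_col_lt_init(const int* col, std::uint8_t* bitmap, std::size_t n, int val) {
    for (std::size_t i = 0; i < n; ++i) bitmap[i] = col[i] < val;
}

// bitmap[i] = bitmap[i] && (col[i] > val); refines an existing bitmap.
void sel_col_gt_and(const int* col, std::uint8_t* bitmap, std::size_t n, int val) {
    for (std::size_t i = 0; i < n; ++i) bitmap[i] = bitmap[i] && (col[i] > val);
}

// Appends the selected values to output starting at out_pos and
// returns the new output length (written in predicated style).
std::size_t gather(int* output, std::size_t out_pos, const int* col,
                   const std::uint8_t* bitmap, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        output[out_pos] = col[i];
        out_pos += bitmap[i];
    }
    return out_pos;
}

// SELECT X FROM table WHERE X > 5 AND X < 10, one vector at a time.
std::vector<int> select_between_5_and_10(const std::vector<int>& x) {
    std::vector<int> output(x.size());
    std::uint8_t b[kVec];  // per-vector bitmap, reused across iterations
    std::size_t ret = 0;
    for (std::size_t j = 0; j < x.size(); j += kVec) {
        std::size_t n = std::min(kVec, x.size() - j);
        sel_col_lt_init(&x[j], b, n, 10);
        sel_col_gt_and(&x[j], b, n, 5);
        ret = gather(output.data(), ret, &x[j], b, n);
    }
    output.resize(ret);
    return output;
}
```

Compared with the single fused C++ loop, this version makes three passes over each vector and pays three function calls per 1024 values; that residual overhead is the gap the slide title refers to.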

Query Compilation

LLVM

Why LLVM

Voila 96s 0.41s 0.25s

Almost Done

Paper Stack
- MonetDB ()
- MonetDB X100 (https://pdfs.semanticscholar.org/2e84/4872e32a4a4e94e229a9a9e70ac47d710252.pdf)
- Hyper (https://wiki.epfl.ch/edicpublic/documents/Candidacy%20exam/hyper-code-generation.pdf)