MonetDB: A column-oriented DBMS Ryan Johnson CSC2531
The memory wall has arrived
CPU performance: +70%/year
Memory performance:
– latency: -50%/decade
– bandwidth: +20%/year (est.)
Why?
– DRAM focus on capacity (+70%/year)
– Physical limitations (pin counts, etc.)
– Assumption that caches “solve” the latency problem
=> DBMS spends 95% of its time waiting for memory
The problem: data layouts
Logical layout: 2-D relation => unrealizable in a linear address space!
N-ary storage model (NSM), aka “slotted pages”
– Easy row updates, strided access to columns
=> Low cache locality for read-intensive workloads
“NSM layouts considered harmful”...
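The layout contrast above can be sketched in a few lines of Python. This is an illustrative toy, not MonetDB code: `nsm` and `dsm` are invented names, and Python lists stand in for pages, but the access pattern is the point: scanning one column in NSM strides over every record's unused fields, while in the decomposed layout it touches one dense array.

```python
# Toy sketch (illustrative names): the same 2-D relation stored
# row-wise (NSM, "slotted page" style) vs column-wise (decomposed).

# NSM: all fields of a record are adjacent: (id, name_len, price, qty).
nsm = [(i, 8, i * 10, 1) for i in range(1000)]

# Decomposed: each column stored separately as a dense array.
dsm = {
    "id":    [i for i in range(1000)],
    "price": [i * 10 for i in range(1000)],
}

# A read-only scan of one column strides over unused fields in NSM...
total_nsm = sum(rec[2] for rec in nsm)
# ...but touches only the relevant array in the column layout.
total_dsm = sum(dsm["price"])

assert total_nsm == total_dsm
```

In the row layout every cache line fetched carries mostly fields the query never reads; in the column layout the whole line is useful data, which is the locality argument the slide makes.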
Coping with The Wall
Innovation: decompose all data vertically
– Columns stored separately, rejoined at runtime
Binary Association Table (BAT) replaces the Relation
– List of (recordID, columnValue) pairs
– Compression and other tricks ==> 1 byte/entry
BAT + clever algorithms => cache locality => winner!
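A minimal sketch of the decomposition and runtime rejoin described above, assuming (as the slide says) that each BAT is a list of (recordID, columnValue) pairs kept sorted by OID. The `rejoin` helper is a hypothetical name for illustration, not MonetDB's API; because both BATs are OID-sorted, a single merge pass reconstructs the tuples.

```python
# Vertical decomposition: one BAT of (oid, value) pairs per attribute.
relation = [("alice", 30), ("bob", 25), ("carol", 41)]

name_bat = [(oid, row[0]) for oid, row in enumerate(relation)]
age_bat  = [(oid, row[1]) for oid, row in enumerate(relation)]

def rejoin(bat_a, bat_b):
    """Reconstruct (value_a, value_b) tuples by matching OIDs.

    Both BATs are sorted by OID, so one merge pass suffices
    (illustrative helper; assumes bat_b covers bat_a's OIDs)."""
    out = []
    it_b = iter(bat_b)
    oid_b, val_b = next(it_b)
    for oid_a, val_a in bat_a:
        while oid_b < oid_a:
            oid_b, val_b = next(it_b)
        if oid_b == oid_a:
            out.append((val_a, val_b))
    return out

assert rejoin(name_bat, age_bat) == relation
```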
Exploring deeper
– Performance study (motivation)
– Physical data layouts
– Cache-optimized algorithms
– Evaluating MonetDB performance
– Implications and lingering questions
NSM: access latency over time
– Read one column (record size varies along the x-axis)
– Latency increases ~10x as accesses per cache line → 1
– (slope changes at the L1/L2 line size)
Efficient physical BAT layout
Idea #1: “virtual OID”
– Optimizes the common case
– Dense, monotonic OIDs
– All BATs sorted by OID
Idea #2: compression
– Exploits small domains
– Boosts cache locality, effective mem BW
How to handle gaps? Out-of-band values?
Can’t we compress NSM also?
Joining two BATs on OID has O(n) cost!
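Both ideas on this slide can be sketched briefly. With a virtual OID (dense, monotonic, starting at 0), the OID column need not be stored at all: the array index *is* the OID, and an OID join of two BATs degenerates to an O(n) positional zip. For small domains, dictionary compression brings entries down toward 1 byte each. The column names and helpers below are illustrative, not MonetDB internals.

```python
# Idea #1 sketch, "virtual OID": store only the tail (value) column;
# the head OID is implicit as the array index.
price = [10, 20, 30, 40]   # OIDs 0..3 implied
qty   = [1, 2, 3, 4]

# Joining two BATs on OID is a single O(n) positional pass --
# no hash table, no sort.
revenue = [p * q for p, q in zip(price, qty)]
assert revenue == [10, 40, 90, 160]

# Idea #2 sketch: dictionary-compress a small-domain column
# to 1 byte per entry.
colors = ["red", "blue", "red", "green", "red"]
dictionary = sorted(set(colors))                     # ['blue', 'green', 'red']
codes = bytes(dictionary.index(c) for c in colors)   # 1 byte/entry
assert [dictionary[b] for b in codes] == colors
```

The slide's open questions follow directly: deletes punch gaps in the dense OID range (breaking the virtual-OID trick), and inserts of new domain values can invalidate the dictionary.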
Cache-friendly hash join
Hash partitioning: one pass, but thrashes L1/L2
– #clusters > #cache lines
Radix partitioning: limit the active #partitions by making more passes
Recall: CPU is cheap compared to memory access
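A simplified sketch of the multi-pass idea, with invented function names. Instead of scattering tuples into all 2^B partitions at once (one output cursor per partition, thrashing cache and TLB), each pass splits on only a few bits, so the number of partitions active at any moment stays small; matching partition pairs are then hash-joined independently.

```python
# Illustrative radix clustering: partition keys on `total_bits` low-order
# bits in passes of `bits_per_pass` bits, so at most 2**bits_per_pass
# partitions are active per pass.

def radix_partition(keys, total_bits=4, bits_per_pass=2):
    parts = [keys]
    shift = 0
    while shift < total_bits:
        mask = (1 << bits_per_pass) - 1
        nxt = []
        for part in parts:           # refine each cluster from the last pass
            buckets = [[] for _ in range(1 << bits_per_pass)]
            for k in part:
                buckets[(k >> shift) & mask].append(k)
            nxt.extend(buckets)
        parts = nxt
        shift += bits_per_pass
    return parts                     # keys with equal low bits end up together

def radix_join(r, s):
    """Join matching partition pairs with small per-partition hash tables."""
    out = []
    for pr, ps in zip(radix_partition(r), radix_partition(s)):
        ht = {}
        for k in pr:
            ht[k] = ht.get(k, 0) + 1
        for k in ps:
            out.extend([k] * ht.get(k, 0))
    return sorted(out)

assert radix_join([1, 2, 3, 4], [2, 4, 6]) == [2, 4]
```

Each per-partition hash table is small enough to stay cache-resident, which is the point of the slide's trade: a few extra cheap CPU passes buy far fewer expensive memory misses.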
Great, but how well does it work?
Three metrics of interest:
– L1/L2 misses (= suffer the latency of a memory access)
– TLB misses (even more expensive than a cache miss)
– Query throughput (higher is better)
Should be able to explain throughput using the other metrics
– The given model makes very good predictions
=> Memory really is (and remains!) the bottleneck
A few graphs
– Big win: stability as cardinalities vary
– Radix clustering behavior as cardinality varies
– Radix-clustered HJ vs. other algorithms
Implications and discussion points
Cache-friendliness really matters (even w/ I/O)
– Traditional DBMSs are memory-bound
Vertically decomposed data: superior density
– Data brought to cache only if actually needed
– Compression gives a further density boost
Questions to consider...
– Queries accessing many columns?
– What about inserts/updates (touch many BATs)?
– What about deletes/inserts (bad for compression)?
– How to build a good query optimizer?
– Performance of transactional workloads? Update-intensive, concurrency control,...