TerarkDB Introduction
Peng Lei
Terark Inc ( http://terark.com ). All rights reserved.

Why TerarkDB
- High compression with high performance
  - Not a space-time tradeoff: space and time are both reduced
  - Latency is stable and very low
- Schema with rich data types
  - Optimize different data types in different ways
  - Multiple indices on one table
  - Column store, column groups

Migrate to TerarkDB
- Easy and simple API
  - C++ (with bindings for Java, Python…)
  - Leverages the full power of TerarkDB
- Storage engine (MongoDB, MySQL, FUSE…)
  - Very low migration effort, zero training
  - High-level DB interface
  - Leverages most of the power of TerarkDB

Searchable index compression
- Indices are relatively small
  - Compressed, in-memory
- String fields
  - Nested succinct tries
- Integer fields
  - Small range compression
- Fixed-length binary fields
  - Configurable compression
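As a rough illustration of what "small range compression" for integer fields can look like, the sketch below stores the minimum value once and bit-packs each value's offset from it. This is a generic frame-of-reference scheme chosen for illustration; the slides do not say which encoding TerarkDB uses, and `pack_small_range` and `unpack_at` are hypothetical names.

```python
def pack_small_range(values):
    """Pack integers as fixed-width offsets from the minimum value."""
    lo = min(values)
    # Bits needed for the largest offset (at least 1 bit per value).
    width = max(1, (max(values) - lo).bit_length())
    packed = 0
    for i, v in enumerate(values):
        packed |= (v - lo) << (i * width)
    return lo, width, packed


def unpack_at(lo, width, packed, i):
    """Random access: read the i-th value without decoding the others."""
    mask = (1 << width) - 1
    return lo + ((packed >> (i * width)) & mask)


values = [100003, 100000, 100007, 100001]
lo, width, packed = pack_small_range(values)
third = unpack_at(lo, width, packed, 2)
```

Because every value occupies the same number of bits, the position of value `i` is computable directly, which is what keeps a compressed field readable by record id.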

Seekable data compression: Concepts
- Traditional database compression
  - Compresses multiple records into one block/page
  - Compressed on disk/SSD, uncompressed in memory: double caching
  - Large blocks gain compression but lose read speed
- Our seekable data compression
  - Exact read by record id, no extra decompression
  - Larger data set, higher compression ratio
  - No need to cache uncompressed records
    - Utilize all free memory for the file system cache
  - Much faster read speed

Seekable data compression: Algorithms
- Small binary and string fields
  - Nested succinct tries
  - Very high compression ratio
  - Relatively slow read (but much faster than block compression)
- Large binary and string fields
  - Global + local dictionary compression (an LZ77 variation)
  - High compression ratio: higher than gzip, sometimes beating bzip2
  - Very fast read (at memcpy speed)
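The difference between block compression and seekable compression can be sketched with Python's standard `zlib` module. Here zlib with a preset dictionary is only a stand-in for the global dictionary idea; it is not TerarkDB's algorithm, and the records, `GLOBAL_DICT`, and the function names are made up for the example.

```python
import zlib

records = [
    b"user=alice;city=Berlin;plan=free",
    b"user=bob;city=Berlin;plan=pro",
    b"user=carol;city=Boston;plan=free",
]

# Traditional block compression: reading one record decompresses the block.
block = zlib.compress(b"\n".join(records))


def read_from_block(i):
    return zlib.decompress(block).split(b"\n")[i]


# Seekable sketch: each record is compressed on its own against a shared
# "global" dictionary, so record i is read without touching its neighbors.
GLOBAL_DICT = b"user=;city=Berlin;city=Boston;plan=free;plan=pro"


def compress_record(rec):
    c = zlib.compressobj(zdict=GLOBAL_DICT)
    return c.compress(rec) + c.flush()


def read_by_id(store, i):
    d = zlib.decompressobj(zdict=GLOBAL_DICT)
    return d.decompress(store[i]) + d.flush()


store = [compress_record(r) for r in records]
```

With the block layout, the cost of one read grows with the block size; with the per-record layout, it depends only on the record being read, which is why no cache of uncompressed records is needed.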

TerarkDB Architecture

Architecture highlights
- Integrates novel technologies together
- Loosely coupled components
  - Flexible
  - Extensible
- Transparent on-disk dir/file organization
- Minimum overhead

Glossary
- Column/Field: a typed atomic object
- Row/Record: an object comprising multiple columns
- Record id: an invariant integer record identifier (object pointer…)
- Table: a collection of many records
- Index: an ordered map from keys (multiple fields) to record ids
- Unique index: an index in which all keys are different
- Column group: a collection of a subset of columns, identified by record id
- Segment: a subset of a table in which all record ids are contiguous

A table is a 2D array

            Delmark   Column[0]  Column[1]  ….  Column[N-1]
Record[0]   0,0
Record[1]   0,0
Record[2]   1,0       Logically deleted but still exists
…           1,1       Physically deleted; must also be logically deleted

The record id is conceptually the array index of the first dimension; deleted records still occupy record ids.
- Logical delmark: the record is invisible
- Physical delmark: the record doesn't exist
- A physically deleted record must also be logically deleted
- The physical deletion mark is used to keep the record id invariant
- Logical-to-physical id mapping uses rank/select; the overhead is very low (typically less than 1%, even 0.1%)
- Logical ids and physical ids are identical if there are no physical deletions, so the id mapping overhead is avoided
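The logical-to-physical id mapping can be sketched as a rank query over the physical deletion marks: the physical position of a record is its logical id minus the number of physically deleted records before it. The sketch below uses a plain prefix-sum list in place of a succinct rank/select bitmap, so it shows the arithmetic only, not the low-overhead data structure; `IdMap` and `to_physical` are hypothetical names.

```python
from itertools import accumulate


class IdMap:
    """Maps logical record ids to physical positions via rank over delmarks."""

    def __init__(self, physically_deleted):
        self.deleted = list(physically_deleted)
        # prefix[i] = number of physically deleted records with id < i (rank1).
        self.prefix = [0] + list(accumulate(int(b) for b in self.deleted))

    def to_physical(self, logical_id):
        if self.deleted[logical_id]:
            return None  # purged: the id stays occupied, but no data exists
        return logical_id - self.prefix[logical_id]


# Records 1 and 3 have been physically deleted; ids 0, 2, 4 remain stored.
idmap = IdMap([False, True, False, True, False])
positions = [idmap.to_physical(i) for i in range(5)]
```

When no record is physically deleted, every rank is zero and the mapping is the identity, which matches the last point above.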

Record id invariant
- Same record id, same data
  - Reading the same record id gets the same data
  - Searching for the same key gets the same record id
  - Segment changes (compressed, merged, purged) don't affect record ids
- Record id invariant lifetime (configurable)
  - Permanent: invariant across table reloads
    - Can be used as a permanent id (such as: http://somehost?id=123)
  - Table life: physically deleted record ids are squeezed out on reload
    - Can minimize the id space
    - Can slightly improve performance

Segment, Index, Column Group

          Index[0]  Index[1]  Column[0]  ….  Column[N-1]
Seg[0]
Seg[1]
Seg[2]
…..

Index operations:
1. Search a key to get the record id; iterate keys in sorted order
2. Read a key by record id (to reduce storage size)
   - Such columns do not need to be stored in column groups
   - Traditional databases need to store all columns

Non-index columns are stored in column groups. Column groups are defined in the table schema; this is a fine-grained column store. Typical column groups are compressed by seekable data compression algorithms. Fixed-length column groups can be configured as inplace-updatable.
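Since each segment covers a contiguous range of record ids, finding the segment that holds a given record id reduces to a binary search over the segments' first ids. A minimal sketch, assuming a hypothetical `seg_base` list with an end sentinel (the actual lookup structure is not described in the slides):

```python
from bisect import bisect_right

# Hypothetical layout: segment k holds record ids [seg_base[k], seg_base[k+1]).
seg_base = [0, 1000, 2500, 2600]  # Seg[0], Seg[1], Seg[2] plus end sentinel


def locate(record_id):
    """Return (segment index, id within the segment) for a table-wide id."""
    if not 0 <= record_id < seg_base[-1]:
        raise IndexError(record_id)
    seg = bisect_right(seg_base, record_id) - 1
    return seg, record_id - seg_base[seg]
```

For example, `locate(1000)` lands on the first record of Seg[1], and `locate(2599)` on the last record of Seg[2].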

Two-stage searches
- Search a key to get the (logical) record id
  - Exact search: fastest
  - Range search: by index iterator
  - Regex search: makes best efforts to avoid a linear scan
- Read record data by record id
  - Read the full record
  - Read specified columns
  - Read specified colgroups (fastest)
  - Update for inplace-updatable…
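The two stages can be sketched with a sorted key list standing in for the index and plain lists standing in for column groups. This is not the TerarkDB API; the schema, the data, and the function names are all hypothetical.

```python
from bisect import bisect_left

# Stage 1 data: an index is an ordered map from key to record id.
index = sorted([("alice", 2), ("bob", 0), ("carol", 1)])
keys = [k for k, _ in index]

# Stage 2 data: column groups addressed by record id (hypothetical schema).
colgroups = {
    "profile": [("bob", "Berlin"), ("carol", "Boston"), ("alice", "Berlin")],
    "stats": [(7,), (3,), (12,)],
}


def search_exact(key):
    """Stage 1, exact search: return the record id for key, or None."""
    i = bisect_left(keys, key)
    return index[i][1] if i < len(keys) and keys[i] == key else None


def search_range(lo, hi):
    """Stage 1, range search: yield record ids for keys in [lo, hi)."""
    for i in range(bisect_left(keys, lo), bisect_left(keys, hi)):
        yield index[i][1]


def read_colgroup(record_id, group):
    """Stage 2: read only the requested column group by record id."""
    return colgroups[group][record_id]


rid = search_exact("alice")
stats = read_colgroup(rid, "stats")
```

Reading a single column group touches only that group's storage, which is consistent with the slide calling it the fastest read path.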

Writable & Readonly segment
- Writable segment
  - Can be writing or frozen
  - Implemented by a traditional database
  - Is not compressed
  - Access is slower than for a readonly segment
- Readonly segment (the core competitiveness)
  - High compression and high performance
  - Stable latency (no slow queries, much better P99 latency)
  - No need for a dedicated cache (double caching is gone)
  - Fixed-length columns are still inplace-updatable

Writing & Frozen segment
- A writing segment is a traditional database
  - A writing segment becomes frozen when its size is large enough
    - Then a new writable segment is created and becomes the new writing segment
  - Insertions/updates/deletions are in real time
    - They will not block user threads
- A frozen segment
  - May be a readonly segment (which is always frozen)
  - May be a writable segment (waiting to be compressed, or being compressed)
  - Deletions (and inplace updates) are in real time
    - May be registered for later sync to the compressed (or merged) readonly segment

The writing segment
- The writing segment is the newest writable segment
  - This is the first place where new records go
  - Records in the writing segment can be updated directly
  - Hot data (esp. frequently updated records) are likely in the writing segment
- Index synchronization on writes (insert/update/delete)
  - Index sync can slow down insertion
  - Index sync can be disabled for some reasons
    - For example, when batch importing data with no concurrent search operations, this can significantly improve performance
  - If not synced, an inserted record cannot be found through any index, but is still accessible by record id

Frozen segments
- A frozen segment is writable or readonly
- Records are readonly (even for writable segments)
  - Except for the inplace-updatable column groups
  - Deletion: set the logical deletion mark
  - Update: set the logical deletion mark and create a new record in the writing segment (even for writable segments)
- Physical deletion
  - Permanently purges the logically deleted records
  - Needs to rebuild the readonly segments
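The write path described on the last few slides can be sketched as a toy table: inserts go to the newest (writing) segment, a full writing segment is frozen and replaced, and an update to a record in a frozen segment sets the logical delmark and creates a new record in the writing segment. The row-count threshold, the class names, and the assumption that the new record receives a new record id are illustrative; the slides do not spell these details out.

```python
class Segment:
    def __init__(self, base_id):
        self.base_id = base_id  # first record id in this segment
        self.rows = []
        self.deleted = []       # logical deletion marks
        self.frozen = False


class Table:
    def __init__(self, freeze_rows=2):
        self.freeze_rows = freeze_rows  # hypothetical freeze threshold
        self.segments = [Segment(0)]

    def insert(self, row):
        seg = self.segments[-1]  # the writing segment is the newest one
        if len(seg.rows) >= self.freeze_rows:
            seg.frozen = True    # a real engine would now queue it for compression
            seg = Segment(seg.base_id + len(seg.rows))
            self.segments.append(seg)
        seg.rows.append(row)
        seg.deleted.append(False)
        return seg.base_id + len(seg.rows) - 1

    def _find(self, rid):
        for seg in self.segments:
            if seg.base_id <= rid < seg.base_id + len(seg.rows):
                return seg, rid - seg.base_id
        raise KeyError(rid)

    def delete(self, rid):
        seg, i = self._find(rid)
        seg.deleted[i] = True  # logical delmark only; the id stays occupied

    def update(self, rid, row):
        seg, i = self._find(rid)
        if not seg.frozen:
            seg.rows[i] = row  # writing segment: update directly
            return rid
        seg.deleted[i] = True  # frozen segment: mark the old record ...
        return self.insert(row)  # ... and write a new one
```

Note that deletion never shrinks a segment here: the slot and its record id remain until a physical deletion rebuilds the segment.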

Readonly segments
- Most data are stored in readonly segments (say 99% of the total)
- Fast searchable index compression
- Fast seekable data compression (up to 7GB/s at 8x compression ratio)
- Built by background threads; user threads are not blocked
- Larger segment, higher compression, higher speed
- Optimized for fine-grained column store

Building readonly segments
- Building a readonly segment is:
  - To compress a writable segment into one readonly segment
  - To purge the logically deleted records
  - To merge multiple readonly segments into one readonly segment
- Once a writing segment is frozen
  - It is put into the compression queue
  - Then it is compressed into a readonly segment by background threads
    - Compression is usually slower than insertion
    - There may be many threads running compression
- Once the compressing (merging/purging) is done
  - The source segments are replaced by the result segment and will be deleted
  - The updates/deletions registered during compressing are synced
  - The record id invariant is kept (same record id gets same data)
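The compression queue and background threads can be sketched with Python's standard `queue` and `threading` modules. This is a generic producer/consumer pattern under assumed names (`compress_queue`, `freeze`, `worker`), with `zlib.compress` standing in for the real readonly-segment build; it does not model merging, purging, or the syncing of registered updates.

```python
import queue
import threading
import zlib

compress_queue = queue.Queue()
readonly = {}             # segment id -> compressed bytes
lock = threading.Lock()


def worker():
    while True:
        item = compress_queue.get()
        if item is None:  # sentinel: stop this worker
            break
        seg_id, rows = item
        blob = zlib.compress(b"\n".join(rows))  # stand-in for the real build
        with lock:
            readonly[seg_id] = blob  # the result replaces the source segment
        compress_queue.task_done()


def freeze(seg_id, rows):
    """Called by a user thread; only enqueues, so it returns immediately."""
    compress_queue.put((seg_id, list(rows)))


workers = [threading.Thread(target=worker, daemon=True) for _ in range(2)]
for w in workers:
    w.start()

freeze(0, [b"a", b"b"])
freeze(1, [b"c"])
compress_queue.join()     # wait until both segments are compressed
for _ in workers:
    compress_queue.put(None)
```

Because user threads only enqueue work, slow compression delays when a segment becomes readonly but not the insert that triggered the freeze.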

Inplace updatable column groups
- Principle: less pain, more gain
- Just for fixed-length colgroups
  - Can be directly accessed by memory address
  - Can be implemented with very low overhead
- For all segments (including readonly segments)
- When a segment is being compressed, merged, or physically deleted (purged)
  - Register the updates
  - Sync the updates when compressing/merging is completed
  - Does not block the user threads
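A fixed-length column group can be updated in place because record `i` always lives at byte offset `i * record_size`. The sketch below shows that addressing with the standard `struct` module, plus the register-then-sync step for updates that arrive while the segment is being rebuilt. The record layout (`<iq`), the class, and its method names are hypothetical, and the "compression" here is just a snapshot copy.

```python
import struct

REC = struct.Struct("<iq")  # hypothetical fixed-length colgroup: int32 + int64


class FixedColgroup:
    def __init__(self, n):
        self.buf = bytearray(REC.size * n)
        self.compressing = False
        self.pending = []  # updates registered while compressing

    @classmethod
    def from_bytes(cls, data):
        cg = cls(0)
        cg.buf = bytearray(data)
        return cg

    def update(self, i, values):
        REC.pack_into(self.buf, i * REC.size, *values)  # direct, by offset
        if self.compressing:
            self.pending.append((i, values))  # register for later sync

    def read(self, i):
        return REC.unpack_from(self.buf, i * REC.size)

    def start_compress(self):
        self.compressing = True
        return FixedColgroup.from_bytes(bytes(self.buf))  # snapshot copy

    def finish_compress(self, result):
        for i, values in self.pending:  # sync the registered updates
            result.update(i, values)
        self.pending.clear()
        self.compressing = False
        return result
```

An update made during the rebuild is applied to the source immediately, so readers see it right away, and is replayed on the result before the result replaces the source.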
