
1 SILT: A Memory-Efficient, High-Performance Key-Value Store
Hyeontaek Lim, Bin Fan, David G. Andersen, Michael Kaminsky†
Carnegie Mellon University, †Intel Labs

2 Key-Value Store
Clients issue requests against a key-value store cluster:
– PUT(key, value)
– value = GET(key)
– DELETE(key)
Example uses: e-commerce (Amazon), web server acceleration (Memcached), data deduplication indexes, photo storage (Facebook). A minimal sketch of this interface follows below.
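
As a concrete sketch of the client-facing interface on this slide (the class and method names are illustrative, not SILT's actual API):

```python
# Minimal sketch of the GET/PUT/DELETE interface on this slide.
# A dict stands in for the cluster's backing storage.
class KeyValueStore:
    def __init__(self):
        self._data = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: bytes):
        return self._data.get(key)   # None if the key is absent

    def delete(self, key: bytes) -> None:
        self._data.pop(key, None)
```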

3 Many projects have examined flash memory-based key-value stores
– Faster than disk, cheaper than DRAM
This talk introduces SILT, which uses drastically less memory than previous systems while retaining high performance.

4 Flash Must Be Used Carefully
– Random reads/sec: 48,000 (fast, but not THAT fast)
– $/GB: 1.83 (space is precious)
Another long-standing problem: random writes are slow and bad for flash life (wearout).

5 DRAM Must Be Used Efficiently
DRAM is used for the index that locates items on flash. Suppose there is 1 TB of data to store on flash and 4 bytes of DRAM per key-value pair (previous state of the art). Index size vs. key-value pair size:
– 32 B (data deduplication) => 125 GB of index!
– 168 B (tweet) => 24 GB
– 1 KB (small image) => 4 GB
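
Each figure above is just (1 TB / pair size) × 4 bytes, in decimal units as on the slide; a quick check of the arithmetic:

```python
# Index size in DRAM for 1 TB of flash data at 4 bytes of DRAM per entry.
FLASH_BYTES = 10**12        # 1 TB (decimal, as on the slide)
DRAM_PER_ENTRY = 4          # bytes of DRAM per key-value pair

for name, pair_size in [("dedup chunk", 32), ("tweet", 168), ("small image", 1000)]:
    entries = FLASH_BYTES // pair_size
    print(f"{name}: {entries * DRAM_PER_ENTRY / 10**9:.0f} GB of index")
# dedup chunk: 125 GB of index
# tweet: 24 GB of index
# small image: 4 GB of index
```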

6 Three Metrics to Minimize
– Memory overhead = index size per entry. Ideally 0 (no memory overhead).
– Read amplification = flash reads per query. Ideally 1 (no wasted flash reads); limits query throughput.
– Write amplification = flash writes per entry. Limits insert throughput and reduces flash life expectancy; must be small enough for flash to last a few years.
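
A minimal sketch of how these three ratios would be computed from raw workload counters (function names are illustrative; write amplification is expressed here as bytes written to flash over bytes inserted, consistent with "flash writes per entry"):

```python
# The three metrics as simple ratios over a workload's raw counters.
def memory_overhead(index_bytes: int, entries: int) -> float:
    return index_bytes / entries          # bytes of DRAM per entry; ideally 0

def read_amplification(flash_reads: int, queries: int) -> float:
    return flash_reads / queries          # flash reads per query; ideally 1

def write_amplification(flash_bytes_written: int, app_bytes_inserted: int) -> float:
    return flash_bytes_written / app_bytes_inserted   # keep small for flash life
```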

7 Landscape: Where We Were
[Chart: read amplification vs. memory overhead (bytes/entry) for FAWN-DS, HashCache, BufferHash, FlashStore, and SkimpyStash; the low-overhead, low-amplification corner is still an open question ("?").]

8 Seesaw Game?
Existing systems (FAWN-DS, HashCache, BufferHash, FlashStore, SkimpyStash) trade memory efficiency against high performance, as if on a seesaw. How can we improve?

9 Solution Preview: (1) Three Stores with (2) New Index Data Structures
In memory: the SILT Sorted Index (memory efficient), the SILT Filter, and the SILT Log Index (write friendly); on flash: the data each index points to.
– Inserts only go to the Log
– Data are moved to the other stores in the background
– Queries look up the stores in sequence (from new to old), as in the sketch below
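
A minimal sketch of that flow, assuming store objects with the append/lookup methods sketched on the next slides (all names are placeholders, not SILT's actual code):

```python
# Queries consult the stores from newest to oldest; inserts go only to
# the write-friendly log. Background merging is covered on later slides.
class SILT:
    def __init__(self, log_store, hash_stores, sorted_store):
        self.log_store = log_store          # newest data, write friendly
        self.hash_stores = hash_stores      # intermediate, filter indexed
        self.sorted_store = sorted_store    # oldest data, memory efficient

    def put(self, key, value):
        self.log_store.append(key, value)   # inserts only go to the Log

    def get(self, key):
        for store in (self.log_store, *self.hash_stores, self.sorted_store):
            value = store.lookup(key)       # newest version wins
            if value is not None:
                return value
        return None
```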

10 LogStore: No Control over Data Layout
Inserted entries are appended to the on-flash log (older → newer), indexed by the in-memory SILT Log Index. We still need pointers into the log, so the index costs ≥ log N bits/entry.
– Memory overhead: 6.5+ bytes/entry (a naive hashtable would need 48+ B/entry)
– Write amplification: 1
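
A runnable sketch of the LogStore idea: values are appended to an on-flash log, and a small in-memory index maps each key to its byte offset. A Python dict stands in for the index here; SILT's real log index is a partial-key cuckoo hash table (slide 15).

```python
import os

class LogStore:
    """Append-only on-flash log plus an in-memory key -> offset index."""
    def __init__(self, path: str):
        self.log = open(path, "ab+")
        self.index = {}                      # each pointer costs >= log2(N) bits

    def append(self, key: bytes, value: bytes) -> None:
        self.log.seek(0, os.SEEK_END)
        offset = self.log.tell()
        self.log.write(len(key).to_bytes(4, "little") + key)
        self.log.write(len(value).to_bytes(4, "little") + value)
        self.index[key] = offset             # newer entries overwrite older ones

    def lookup(self, key: bytes):
        offset = self.index.get(key)
        if offset is None:
            return None
        self.log.seek(offset)
        klen = int.from_bytes(self.log.read(4), "little")
        self.log.read(klen)                  # skip the stored key
        vlen = int.from_bytes(self.log.read(4), "little")
        return self.log.read(vlen)
```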

11 SortedStore: Space-Optimized Layout
Entries are kept in an on-flash sorted array, located via the in-memory SILT Sorted Index.
– Memory overhead: 0.4 bytes/entry
– Write amplification: high; bulk-inserts are needed to amortize the cost
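
A sketch of the SortedStore's read path, using plain binary search in place of SILT's entropy-coded trie index (slide 16); the store is immutable and is only rebuilt by bulk insertion.

```python
import bisect

class SortedStore:
    """Immutable sorted key-value array, built in bulk (e.g., by a merge)."""
    def __init__(self, sorted_pairs):
        self.keys = [k for k, _ in sorted_pairs]     # sorted by key
        self.values = [v for _, v in sorted_pairs]

    def lookup(self, key):
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.values[i]
        return None
```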

12 Combining SortedStore and LogStore
The on-flash log (under the SILT Log Index) is merged in the background into the on-flash sorted array (under the SILT Sorted Index).
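
A sketch of that merge, assuming the (key, value) pair shapes from the sketches above; log entries are newer, so they win on duplicate keys:

```python
import heapq

def merge_stores(old_sorted_pairs, log_entries):
    """Bulk-merge a LogStore's entries into a new sorted array.

    old_sorted_pairs: list of (key, value) sorted by key (the old SortedStore)
    log_entries: dict of key -> value (the LogStore's newer data)
    """
    newer = sorted(log_entries.items())
    merged, last_key = [], None
    # Tag each source so log entries (0) sort ahead of old ones (1) on ties.
    for key, _, value in heapq.merge(
            ((k, 0, v) for k, v in newer),
            ((k, 1, v) for k, v in old_sorted_pairs)):
        if key != last_key:                  # keep only the newest version
            merged.append((key, value))
            last_key = key
    return merged                            # feeds a new SortedStore
```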

13 Achieving Both Low Memory Overhead and Low Write Amplification
– SortedStore alone: low memory overhead, but high write amplification
– LogStore alone: low write amplification, but high memory overhead
– SortedStore + LogStore: now we can achieve both simultaneously: write amplification = 5.4 (small enough for a 3-year flash life) and memory overhead = 1.3 B/entry
With “HashStores”, memory overhead drops to 0.7 B/entry! (see paper)

14 SILT’s Design (Recap)
Inserts are appended to the on-flash log (SILT Log Index); logs are converted to on-flash hashtables (SILT Filter) and merged into the on-flash sorted array (SILT Sorted Index).
– Memory overhead: 0.7 bytes/entry
– Read amplification: 1.01
– Write amplification: 5.4

15 Review of New Index Data Structures in SILT
– Partial-key cuckoo hashing (the SILT Filter & Log Index): for HashStore & LogStore; compact (2.2 & 6.5 B/entry); very fast (> 1.8 M lookups/sec)
– Entropy-coded tries (the SILT Sorted Index): for SortedStore; highly compressed (0.4 B/entry)
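
A simplified sketch of partial-key cuckoo hashing: each slot stores only a small tag plus an offset, and here the tag is simply the entry's alternate bucket index, so a displaced entry can be relocated without rereading its full key from flash. Table size, hash function, and field widths are illustrative, not SILT's.

```python
import hashlib

NUM_BUCKETS = 1 << 16
MAX_KICKS = 128                              # displacement limit before giving up

def _h(key: bytes) -> int:
    return int.from_bytes(hashlib.blake2b(key, digest_size=8).digest(), "little")

class PartialKeyCuckoo:
    def __init__(self):
        self.slots = [None] * NUM_BUCKETS    # each slot: (tag, offset)

    def _buckets(self, key: bytes):
        h = _h(key)
        return h % NUM_BUCKETS, (h >> 32) % NUM_BUCKETS

    def insert(self, key: bytes, offset: int) -> bool:
        b1, b2 = self._buckets(key)
        bucket, entry = b1, (b2, offset)     # tag = the *other* candidate bucket
        for _ in range(MAX_KICKS):
            if self.slots[bucket] is None:
                self.slots[bucket] = entry
                return True
            # Evict the occupant to its alternate bucket (its stored tag);
            # its new tag is the bucket it just left. No flash read needed.
            (victim_alt, victim_off), self.slots[bucket] = self.slots[bucket], entry
            bucket, entry = victim_alt, (bucket, victim_off)
        return False                         # table too full

    def lookup(self, key: bytes):
        b1, b2 = self._buckets(key)
        for here, other in ((b1, b2), (b2, b1)):
            slot = self.slots[here]
            if slot is not None and slot[0] == other:
                return slot[1]               # candidate offset; verify key on flash
        return None
```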

16 Compression in Entropy-Coded Tries
Hashed keys (bits are random) → # red (or blue) leaves ~ Binomial(# all leaves, 0.5) → entropy coding (Huffman coding and more)
(More details of the new indexing schemes are in the paper.)
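
A sketch of the trie representation behind this: over sorted hashed keys (bit strings below), each internal node is recorded as the size of its left subtrie, in preorder. With random bits those sizes follow Binomial(n, 0.5), which is what makes entropy coding effective; the coding step itself (Huffman and more) is omitted here.

```python
def trie_counts(keys):
    """Preorder left-subtrie sizes of the trie over sorted, distinct bit
    strings (e.g. '0110...'). These counts ARE the index: with random
    hashed keys each count ~ Binomial(n, 0.5), so they entropy-code well."""
    if len(keys) <= 1:
        return []
    left = [k[1:] for k in keys if k[0] == "0"]
    right = [k[1:] for k in keys if k[0] == "1"]
    return [len(left)] + trie_counts(left) + trie_counts(right)

def rank(counts, key, n):
    """Recover a key's position in the sorted on-flash array by walking
    the counts with the query key's bits."""
    return _rank(iter(counts), key, n, 0)

def _rank(it, key, n, depth):
    if n <= 1:
        return 0
    left_n = next(it)
    if key[depth] == "0":
        return _rank(it, key, left_n, depth + 1)
    _skip(it, left_n)                        # skip the left subtrie's counts
    return left_n + _rank(it, key, n - left_n, depth + 1)

def _skip(it, n):
    if n <= 1:
        return
    left_n = next(it)
    _skip(it, left_n)
    _skip(it, n - left_n)

# Example: trie_counts(["0001", "0110", "1100"]) == [2, 1],
# and rank([2, 1], "0110", 3) == 1.
```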

17 Landscape: Where We Are
[Chart: same axes as slide 7 (read amplification vs. memory overhead in bytes/entry); SILT now occupies the low-overhead, low-amplification corner left open by FAWN-DS, HashCache, BufferHash, FlashStore, and SkimpyStash.]

18 Evaluation
1. Various combinations of indexing schemes
2. Background operations (merge/conversion)
3. Query latency
Experiment setup:
– CPU: 2.80 GHz (4 cores)
– Flash drive: SATA, 256 GB (48 K random 1024-byte reads/sec)
– Workload: 20-byte keys, 1000-byte values, ≥ 50 M keys
– Query pattern: uniformly distributed (worst case for SILT)

19 LogStore Alone: Too Much Memory
Workload: 90% GET ( M keys) + 10% PUT (50 M keys)

20 LogStore+SortedStore: Still Too Much Memory
Workload: 90% GET ( M keys) + 10% PUT (50 M keys)

21 Full SILT: Very Memory Efficient
Workload: 90% GET ( M keys) + 10% PUT (50 M keys)

22 Small Impact from Background Operations
Workload: 90% GET (100~ M keys) + 10% PUT
[Chart: query throughput over time, staying near 33 K queries/sec with brief dips. Oops! Bursty TRIM by the ext4 FS.]

23 Low Query Latency
Workload: 100% GET (100 M keys)
[Chart: query latency vs. # of I/O threads. Best with 16 threads: median = 330 μs, 99.9th percentile = 1510 μs.]

24 Conclusion
SILT provides a memory-efficient, high-performance key-value store, via:
– a multi-store approach
– entropy-coded tries
– partial-key cuckoo hashing
Full source code is available: https://github.com/silt/silt

25 Thanks!

