Paper by: Yu Li, Jianliang Xu, Byron Choi, and Haibo Hu, Department of Computer Science, Hong Kong Baptist University. Slides and presentation by: Justin Weaver.
Flash devices offer much faster random reads than HDDs and lower power consumption, but random writes can be very slow. The proposed solution, called StableBuffer, takes advantage of efficient write patterns. It is implemented as a DBMS buffer manager add-on, with no need for firmware or OS driver updates.
Why are random writes to NAND flash so slow? Only empty cells can be written to. Writes operate at the page level, but erases operate on blocks. An overwrite operation therefore involves four steps: read, erase, modify, write.
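The read, erase, modify, write sequence can be illustrated with a deliberately naive model. This is only a sketch: the 64-pages-per-block geometry and the function name `overwrite_page` are assumptions for illustration, and real devices add a flash translation layer that this model ignores.

```python
PAGES_PER_BLOCK = 64  # assumed geometry, not taken from the slides


def overwrite_page(block, page_index, new_data):
    """Overwrite one page of a block in place and count the operations needed.

    `block` is a list of page contents; None marks an empty page.
    Returns (new_block, ops), where ops counts page reads, block erases,
    and page writes.
    """
    ops = {"page_reads": 0, "block_erases": 0, "page_writes": 0}

    # 1. Read: copy every occupied page out of the block.
    saved = list(block)
    ops["page_reads"] = sum(1 for page in block if page is not None)

    # 2. Erase: the whole block is cleared, not just the target page.
    new_block = [None] * len(block)
    ops["block_erases"] = 1

    # 3. Modify: change the one page in the saved copy.
    saved[page_index] = new_data

    # 4. Write: program every saved page back into the now-empty block.
    for i, page in enumerate(saved):
        if page is not None:
            new_block[i] = page
            ops["page_writes"] += 1

    return new_block, ops


full_block = ["old"] * PAGES_PER_BLOCK
updated, ops = overwrite_page(full_block, 3, "new")
print(ops)
```

In this model, changing a single page of a full block costs as many page reads and page writes as the block has pages, plus one block erase.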
Execute all writes to empty slots in a pre-allocated buffer on the flash device. Take note of the writes' actual destinations and look for efficient write patterns. Flush the buffer by writing the discovered patterns to their actual locations.
Sequential writes: consecutive addresses. Focused writes: addresses within a specific range. Partitioned writes: a mix of sequential writes to several areas.
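The three patterns can be expressed as simple checks on a sequence of page addresses in arrival order. This is a sketch under assumptions: the function names are hypothetical, addresses are treated as integer page numbers, and the `span` parameter that defines "a specific range" is not given in the slides.

```python
def is_sequential(addrs):
    """True if each address is exactly one more than the previous one."""
    return all(b == a + 1 for a, b in zip(addrs, addrs[1:]))


def is_focused(addrs, span):
    """True if all addresses fall inside a range narrower than `span`.

    Expects a non-empty list of addresses.
    """
    return max(addrs) - min(addrs) < span


def count_streams(addrs):
    """Count the interleaved sequential streams in arrival order.

    An address that continues an existing stream (previous address + 1)
    extends it; any other address starts a new stream.
    """
    tails = set()  # last address written by each open stream
    for addr in addrs:
        tails.discard(addr - 1)  # extend a stream if addr continues one
        tails.add(addr)
    return len(tails)


print(is_sequential([7, 8, 9]))
print(is_focused([10, 14, 11, 13], span=8))
print(count_streams([100, 500, 101, 501, 102, 502]))
```

A partitioned write then shows up as a small stream count (more than one) over a batch that is not sequential as a whole.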
How should the space inside of the StableBuffer be managed? How should efficient write patterns be recognized without too much overhead? Which write patterns should be flushed to their destinations, and when should this happen?
StableBuffer: a pre-allocated area on the flash device, broken into slots, each the size of a page. Example: a 4MB StableBuffer with 4KB pages has 1,024 slots.
Translation Table: an in-memory table that maps actual destinations to slot numbers. Implemented as a hash table whose key is the destination address. Example entry:
Bitmap for Free Slots: 1 = empty, 0 = occupied. A free slot pointer points to the next open slot.
Metadata is added to pages for fault tolerance.
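The in-memory bookkeeping described on this slide could be sketched as follows. The class and method names are hypothetical, and only the index is modelled (no flash I/O and no per-page metadata). How a repeated destination is handled is not stated in the slides; this sketch assumes it is remapped to a fresh slot and the old slot is marked empty.

```python
class StableBufferIndex:
    """Translation table, free-slot bitmap, and free slot pointer."""

    def __init__(self, buffer_bytes=4 * 1024 * 1024, page_bytes=4 * 1024):
        self.num_slots = buffer_bytes // page_bytes  # 4MB / 4KB = 1,024
        self.free = [1] * self.num_slots  # bitmap: 1 = empty, 0 = occupied
        self.table = {}                   # destination address -> slot number
        self.next_free = 0                # free slot pointer

    def _find_free(self, start):
        """Return the first empty slot at or after `start`, wrapping around."""
        for step in range(self.num_slots):
            cand = (start + step) % self.num_slots
            if self.free[cand]:
                return cand
        return None  # buffer is full

    def write(self, dest):
        """Record a write to `dest` and return the slot it was placed in."""
        if self.next_free is None:
            raise RuntimeError("StableBuffer full: a flush is needed")
        slot = self.next_free
        old = self.table.get(dest)
        self.free[slot] = 0
        self.table[dest] = slot
        if old is not None:
            self.free[old] = 1  # assumption: the stale copy's slot is released
        self.next_free = self._find_free(slot + 1)
        return slot

    def flush(self, dests):
        """Forget the given destinations after they reach their real location."""
        for dest in dests:
            slot = self.table.pop(dest)
            self.free[slot] = 1
        if self.next_free is None:
            self.next_free = self._find_free(0)


index = StableBufferIndex()
index.write(9000)
index.write(42)
print(index.num_slots, index.table)
```

A Python dict stands in for the hash table here, keyed on the destination address as the slide describes.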
On-Demand: finds efficient write patterns upon request. A scan over the sorted destination addresses finds sequential and partitioned patterns. A sliding window over the sorted addresses finds focused patterns.
Incremental: maintains pattern information, updated after every write. Maintain a set of $S_i = (addr_{min}, addr_{max})$ to find sequential writes. Maintain a set of $P_l$, where each entry points to all $S_i$ with size $l$; each $P_l$ is a candidate partitioned write pattern. Maintain a set of $F_i = (addr_{min}, addr_{max}, set_{addr})$, where $set_{addr}$ is a set of addresses between $addr_{min}$ and $addr_{max}$; each $F_i$ is a candidate focused write pattern.
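The on-demand approach (a scan and a sliding window over the sorted destination addresses) might look roughly like this. The function names and the `span` parameter are assumptions, and the paper's actual algorithms may differ in detail. Runs are reported as (addr_min, addr_max) pairs, and grouping runs by length mirrors the role of the P_l sets.

```python
from collections import defaultdict


def sequential_runs(dests):
    """Scan sorted addresses and return maximal runs as (addr_min, addr_max)."""
    runs = []
    for addr in sorted(set(dests)):
        if runs and addr == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], addr)  # extend the current run
        else:
            runs.append((addr, addr))       # start a new run
    return runs


def partition_candidates(runs):
    """Group runs by length l; each group is a candidate partitioned pattern."""
    by_length = defaultdict(list)
    for lo, hi in runs:
        by_length[hi - lo + 1].append((lo, hi))
    return dict(by_length)


def densest_window(dests, span):
    """Slide a window narrower than `span` over the sorted addresses and
    return the largest group of addresses that fits inside it."""
    addrs = sorted(set(dests))
    best = []
    left = 0
    for right in range(len(addrs)):
        while addrs[right] - addrs[left] >= span:
            left += 1
        if right - left + 1 > len(best):
            best = addrs[left:right + 1]
    return best


buffered = [12, 3, 10, 4, 11, 5, 40, 41, 42, 90]
runs = sequential_runs(buffered)
print(runs)
print(partition_candidates(runs))
print(densest_window(buffered, span=10))
```

The sort dominates the cost of each request, which is presumably the overhead the incremental approach trades against per-write bookkeeping.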
Passive: flushes pages when there are no open slots. Triggers on-demand pattern recognition or chooses incrementally generated patterns. Chooses to flush the longest instance that is expected to be written fastest.
Proactive: flushes pages during any write operation when qualified to do so. Requires incremental pattern recognition. Runs in the background, detecting good, efficient write patterns. Checks if maintained patterns are qualified for flushing. Qualified patterns have a threshold value higher than $\theta_x$, where $x$ is one of the three write patterns:
$\theta_{seq} = l_{min} / l$
$\theta_{par} = \theta_{seq} \cdot T_{par} / T_{seq}$
$\theta_{focus} = \theta_{seq} \cdot T_{focus} / T_{seq}$
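The three threshold formulas are plain arithmetic and can be evaluated directly. The slide does not define l_min, l, or the T values, so this sketch treats them as numeric inputs without interpreting them; the function name and the sample numbers are made up for illustration.

```python
def flush_thresholds(l_min, l, t_seq, t_par, t_focus):
    """Evaluate theta_seq, theta_par, and theta_focus as given on the slide."""
    theta_seq = l_min / l
    return {
        "seq": theta_seq,
        "par": theta_seq * t_par / t_seq,
        "focus": theta_seq * t_focus / t_seq,
    }


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
thetas = flush_thresholds(l_min=8, l=32, t_seq=1.0, t_par=2.0, t_focus=4.0)
print(thetas)
```

Since the partitioned and focused thresholds are the sequential threshold scaled by T_par / T_seq and T_focus / T_seq, a pattern type with a larger T value gets a proportionally higher threshold.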
Three flash devices: a 16GB SSD, an 8GB USB flash drive, and an 8GB SDHC card. Three StableBuffer configurations: Ondemand+Passive, Incremental+Passive, and Incremental+Proactive. Workload: TPC-C benchmark on PostgreSQL 8.4.
All measurements outperform the direct method. The SD card has poor performance with incremental+proactive. Incremental+passive and ondemand+passive perform the best.
The SD card performs very poorly in the parallel write test. The USB flash drive and SD card are clearly I/O bound, not CPU bound.
Best performance achieved when StableBuffer size was set to 4MB for this particular drive
Using the StableBuffer DBMS add-on, random write performance on flash memory was shown to increase dramatically. No firmware or OS driver updates were needed.
Comprehension questions?... Presentation questions... Does optimal performance rely too heavily on the need to benchmark each specific device model first? Could developments in firmware and/or OS drivers make an add-on like StableBuffer no longer necessary? USB flash drives and SD cards would probably never be used for DBMS storage; why not just test several different SSD models instead? Could OS support for the TRIM command help with random write performance within a DBMS?