Sougata Bhattacharjee Caching for Flash-Based Databases Summer Semester 2013.


1 Sougata Bhattacharjee Caching for Flash-Based Databases Summer Semester 2013

2 OUTLINE
Motivation
Flash Memory
Flash Characteristics
Flash SSD Architecture
Flash Translation Layer
Page Replacement Algorithm
Adaptive Replacement Policy
Flash-Aware Algorithms
- Clean-First LRU Algorithm
- Clean-First Dirty-Clustered (CFDC) Algorithm
- AD-LRU Algorithm
- CASA Algorithm
Conclusion
References

3 MOTIVATION Data explosion: the worldwide data volume is growing at an astonishing speed. In 2007, we had 281 EB of data; in 2011, we had 1,800 EB. Data storage technology today means HDDs and DRAM: HDDs suffer from high latency, while DRAM comes at a much higher price. Energy consumption is also a concern: in 2005, the total power used by servers in the USA amounted to 0.6% of its total annual electricity consumption. We need a memory technology that can overcome these limitations.

4 BACKGROUND In 1980, Dr. Fujio Masuoka invented flash memory. In 1988, Intel Corporation introduced flash chips. In 1995, M-Systems introduced flash-based solid-state drives. What is flash? Flash memory is an electronic, non-volatile semiconductor storage device that can be electrically erased and reprogrammed. It supports three operations: program (write), erase, and read. It comes in two major forms, NAND flash and NOR flash; NAND is newer and much more popular.

5 FLASH AND MEMORY HIERARCHY In the memory hierarchy (registers, cache, RAM, flash, HDD), speed and cost increase toward the top and capacity increases toward the bottom. Flash is faster, has lower latency, and is more reliable than hard disks, but it is more expensive.

6 WHY IS FLASH POPULAR? Benefits over magnetic hard drives: lower access latencies; semiconductor technology with no mechanical parts; high data transfer rate; higher reliability (no moving parts); lower power consumption; small size and light weight; longer life span. Benefits over RAM: lower power consumption and lower price.

7 USE OF FLASH Flash SSD is widening its range of applications: embedded devices, desktop PCs and laptops, servers and supercomputers. [Source: _S308_Cooke.pdf, Page 2]

8 FLASH OPERATIONS Flash supports three operations: read, write, and erase. Reads and writes are done at the granularity of a page (2 KB or 4 KB). A flash block is much larger than a disk block: it contains p fixed-size flash pages of 512 B - 2 KB each. Erasures are done at the granularity of a block, and each block endures only a limited number of erasures (10,000 - 100,000). Block erase is the slowest operation, requiring about 2 ms. In-place update of flash pages is not possible; overwriting a page means rewriting an entire block, which must be erased first.

9 FLASH OPERATIONS Because in-place update of flash pages is not possible, modified DB pages are handled out of place: the update is written to a free page (possibly in a new block), the old page is marked invalid, and the full block holding invalidated pages is erased later.
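The out-of-place update scheme described above can be sketched as a toy model (class and variable names here are illustrative, and a block size of 4 pages is assumed): a write invalidates the old physical copy and redirects the logical-to-physical mapping to a fresh page.

```python
PAGES_PER_BLOCK = 4

class FlashBlock:
    def __init__(self):
        self.pages = [None] * PAGES_PER_BLOCK      # stored data
        self.state = ['free'] * PAGES_PER_BLOCK    # free | valid | invalid
        self.erase_count = 0

    def erase(self):
        # erasing turns every page of the block free again
        self.pages = [None] * PAGES_PER_BLOCK
        self.state = ['free'] * PAGES_PER_BLOCK
        self.erase_count += 1

class Flash:
    def __init__(self, n_blocks):
        self.blocks = [FlashBlock() for _ in range(n_blocks)]
        self.mapping = {}  # logical page number -> (block, page)

    def _find_free(self):
        for b, blk in enumerate(self.blocks):
            for p, st in enumerate(blk.state):
                if st == 'free':
                    return b, p
        raise RuntimeError('no free page: garbage collection needed')

    def write(self, lpn, data):
        if lpn in self.mapping:                 # invalidate the old copy
            ob, op = self.mapping[lpn]
            self.blocks[ob].state[op] = 'invalid'
        b, p = self._find_free()                # update goes to a free page
        self.blocks[b].pages[p] = data
        self.blocks[b].state[p] = 'valid'
        self.mapping[lpn] = (b, p)

    def read(self, lpn):
        b, p = self.mapping[lpn]
        return self.blocks[b].pages[p]
```

Rewriting logical page 0 twice leaves one invalidated physical page behind, which is exactly what garbage collection (slide 16) must later reclaim.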

10 FLASH CONSTRAINTS Flash imposes three constraints: write/erase granularity asymmetry (Cons1), the erase-before-write rule (Cons2), and limited cell lifetime (Cons3). Cons1 and Cons2 together lead to out-of-place updates with page invalidation, logical-to-physical mapping, and garbage collection; adding Cons3 additionally requires wear leveling.

11 FLASH MEMORY STRUCTURE Various operations need to be carried out to ensure correct operation of a flash device: garbage collection, wear leveling, and mapping. The Flash Translation Layer (FTL), sitting between the file system and the flash device, controls this flash management. It hides the complexities of device management (garbage collection, wear leveling) from the application and enables mobility: flash becomes plug and play.

12 MAPPING TECHNIQUES (1/2) There are three types of basic mappings: page-level mapping, block-level mapping, and hybrid mapping. Page-level mapping maps each logical page number (LPN) independently to a physical page number (PPN). It has the highest performance potential, but also the highest resource use, because the mapping table is large.
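A minimal sketch of page-level mapping (hypothetical class name; a simple free list stands in for the FTL's allocator): every logical page gets its own table entry, so any logical page can be redirected to any free physical page, at the cost of one entry per page.

```python
class PageLevelFTL:
    def __init__(self, total_pages):
        self.lpn_to_ppn = {}                  # the (large) per-page mapping table
        self.free = list(range(total_pages))  # free physical page numbers

    def write(self, lpn):
        ppn = self.free.pop(0)                # any free physical page will do
        self.lpn_to_ppn[lpn] = ppn            # remap this page independently
        return ppn

    def lookup(self, lpn):
        return self.lpn_to_ppn[lpn]
```

Note that the table grows with the number of logical pages, which is exactly the "highest resource use" drawback the slide mentions.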

13 MAPPING TECHNIQUES (2/2) Block-level mapping keeps only block numbers in the mapping table (LBN to PBN); page offsets within a block remain unchanged. For example, with 4 pages per block, logical page 7 belongs to logical block 7 / 4 = 1 at offset 7 mod 4 = 3. The mapping table is small, but performance for write updates is bad.
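The slide's arithmetic for logical page 7 can be written out directly (the mapping of logical block 1 to physical block 6 is an invented example value):

```python
PAGES_PER_BLOCK = 4
block_map = {1: 6}   # hypothetical: logical block 1 -> physical block 6

def translate(lpn):
    lbn = lpn // PAGES_PER_BLOCK     # 7 // 4 = 1: the logical block
    offset = lpn % PAGES_PER_BLOCK   # 7 %  4 = 3: offset carried over unchanged
    pbn = block_map[lbn]
    return pbn * PAGES_PER_BLOCK + offset

print(translate(7))  # physical page 27 under this example mapping
```

Only one table entry per block is needed, which is why the table stays small; but updating a single page forces the FTL to relocate or merge the whole block, which is why write updates are expensive.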

14 FTL BLOCK-LEVEL MAPPING (BEST CASE) The device maintains k flash blocks B, g log blocks L, and free blocks F. In the best case, updating block B1 needs only a switch: L1 becomes B1 and the old B1 is erased. This costs a single erase operation.

15 FTL BLOCK-LEVEL MAPPING (GENERAL CASE) In the general case, B1 and L1 must be merged into a free block Fi, after which both B1 and L1 are erased: two erase operations. More generally, a merge of n flash blocks and one log block into Fi costs n+1 erasures.

16 GARBAGE COLLECTION Garbage collection moves valid pages out of blocks containing invalid data and then erases those blocks; it removes invalid pages and increases the number of free pages. Wear leveling decides where new data is written: it picks the most frequently erased blocks and the least worn-out blocks and swaps their content to equalize overall usage, which enhances the lifespan of the flash device.
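The collection step above can be sketched with a simple greedy victim choice (a common heuristic, assumed here rather than taken from the slide): pick the block with the most invalid pages, migrate its valid pages to a free block, then erase it.

```python
def pick_victim(blocks):
    # blocks: dict block_id -> list of page states ('valid'/'invalid'/'free')
    return max(blocks, key=lambda b: blocks[b].count('invalid'))

def collect(blocks, free_block_id):
    victim = pick_victim(blocks)
    moved = [s for s in blocks[victim] if s == 'valid']
    # valid pages migrate into the free block...
    blocks[free_block_id][:len(moved)] = moved
    # ...and the victim block is erased, turning every page free
    blocks[victim] = ['free'] * len(blocks[victim])
    return victim

blocks = {
    0: ['valid', 'invalid', 'invalid', 'invalid'],
    1: ['valid', 'valid', 'invalid', 'valid'],
    2: ['free', 'free', 'free', 'free'],
}
victim = collect(blocks, 2)   # block 0 holds the most invalid pages
```

Erasing block 0 reclaims three invalid pages at the cost of copying only one valid page, which is why greedy selection favors mostly-invalid blocks.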

17 BASICS OF PAGE REPLACEMENT
1. Find the location of the desired page on disk.
2. Find a free frame: if one exists, use it; otherwise, use a page replacement algorithm to select a victim page.
3. Load the desired page into the frame.
4. Update the page allocation table (the page mapping in the buffer).
5. On the next page fault, repeat the whole process.
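The steps above can be sketched as a minimal buffer-fault routine (all names are illustrative; `evict_policy` stands for any replacement algorithm returning a victim page id):

```python
def fetch_page(page_id, frames, page_table, free_frames, evict_policy, disk):
    if page_id in page_table:              # already buffered: a hit
        return page_table[page_id]
    if free_frames:                        # step 2a: use a free frame
        frame = free_frames.pop()
    else:                                  # step 2b: ask the policy for a victim
        victim = evict_policy(page_table)
        frame = page_table.pop(victim)
    frames[frame] = disk[page_id]          # step 3: load the page
    page_table[page_id] = frame            # step 4: update the mapping
    return frame
```

Every replacement policy discussed on the following slides (LRU, ARC, CFLRU, ...) only changes how `evict_policy` picks its victim; the surrounding fault-handling loop stays the same.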

18 THE REPLACEMENT CACHE PROBLEM Cache is fast but expensive; HDDs are slow but cheap. How should the cache be managed? Which page should be replaced? How can the hit ratio be maximized?

19 PAGE REPLACEMENT ALGORITHMS (1/2) Least Recently Used (LRU) removes the least recently used items first. It has constant time and space complexity and is simple to implement, but it is expensive to maintain statistically significant usage statistics, it does not exploit frequency, and it is not scan-resistant. How LRU works: the slide traces the reference string C A B D E F D G E through a 4-frame buffer initially holding A, B, C, D; the faults on E, F, and G evict C, then A, then B, each being the least recently used page at that moment.
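LRU's constant-time bookkeeping is easy to see in code; a minimal sketch over `OrderedDict` (hits move the page to the MRU end, and a miss on a full cache evicts the LRU end):

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()   # insertion order tracks recency

    def request(self, page):
        hit = page in self.pages
        if hit:
            self.pages.move_to_end(page)        # becomes most recently used
        else:
            if len(self.pages) == self.capacity:
                self.pages.popitem(last=False)  # evict the LRU victim
            self.pages[page] = None
        return hit
```

With capacity 2 and requests A, B, A, C, the miss on C evicts B, the least recently used page, illustrating both the policy and its weakness: one scan of cold pages flushes the whole buffer.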

20 PAGE REPLACEMENT ALGORITHMS (2/2) Least Frequently Used (LFU) removes the least frequently used items first. It is scan-resistant, but it has logarithmic time complexity per request, and stale pages can remain in the buffer for a long time. Combining the two gives LRFU (Least Recently/Frequently Used): it exploits both recency and frequency and performs better than LRU and LFU, but it still has logarithmic time complexity and both space and time overhead. Adaptive Replacement Cache (ARC) is a solution.

21 ARC (ADAPTIVE REPLACEMENT CACHE) CONCEPT Consider first a general double-cache structure of size 2C: the cache is partitioned into two LRU lists, L1 and L2. L1 contains recently seen pages (the recency list); L2 contains pages seen at least twice recently (the frequency list). If L1 contains exactly C pages, the LRU page of L1 is replaced; otherwise, the LRU page of L2 is replaced.

22 ARC CONCEPT ARC itself caches only C pages while tracking 2C page numbers. L1 is divided into T1 (its MRU end) and B1 (its LRU end), and L2 into T2 (MRU end) and B2 (LRU end); T1 and T2 hold cached pages, while B1 and B2 only remember the numbers of recently evicted pages. The combined size of T1 and T2 is C; the combined size of T1 and B1 is C, and the same holds for T2 and B2. Upon a page request: if the page is found in T1 or T2, it moves to the MRU position of T2. On a cache miss, the new page is added at the MRU position of T1; if T1 is full, the LRU page of T1 moves to the MRU position of B1.

23 ARC PAGE EVICTION RULE ARC adapts a parameter P to the observed workload; P determines the target size of T1. If a requested page is found in B1, P is increased and the page moves to the MRU position of T2. If a requested page is found in B2, P is decreased and the page moves to the MRU position of T2.
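The structure and eviction rule of the last three slides can be sketched as a compact ARC implementation (a sketch following the published algorithm; the simplified cases on a complete miss are condensed, and only page ids are cached):

```python
from collections import OrderedDict

class ARC:
    def __init__(self, c):
        self.c = c                  # cache size
        self.p = 0                  # adaptive target size of T1
        self.t1, self.t2 = OrderedDict(), OrderedDict()  # cached pages
        self.b1, self.b2 = OrderedDict(), OrderedDict()  # ghost lists

    def _replace(self, key):
        # evict from T1 if it exceeds its target p, else from T2
        if self.t1 and (len(self.t1) > self.p or
                        (key in self.b2 and len(self.t1) == self.p)):
            old, _ = self.t1.popitem(last=False)   # LRU page of T1
            self.b1[old] = None                    # remember it in B1
        else:
            old, _ = self.t2.popitem(last=False)   # LRU page of T2
            self.b2[old] = None

    def request(self, key):
        if key in self.t1:                  # hit: promote to MRU of T2
            del self.t1[key]
            self.t2[key] = None
            return True
        if key in self.t2:                  # hit: refresh recency in T2
            self.t2.move_to_end(key)
            return True
        if key in self.b1:                  # ghost hit in B1: grow T1's target
            d = 1 if len(self.b1) >= len(self.b2) else len(self.b2) // len(self.b1)
            self.p = min(self.c, self.p + d)
            self._replace(key)
            del self.b1[key]
            self.t2[key] = None
            return False
        if key in self.b2:                  # ghost hit in B2: shrink T1's target
            d = 1 if len(self.b2) >= len(self.b1) else len(self.b1) // len(self.b2)
            self.p = max(0, self.p - d)
            self._replace(key)
            del self.b2[key]
            self.t2[key] = None
            return False
        # complete miss: keep |T1|+|B1| <= C and the total directory <= 2C
        if len(self.t1) + len(self.b1) == self.c:
            if len(self.t1) < self.c:
                self.b1.popitem(last=False)
                self._replace(key)
            else:
                self.t1.popitem(last=False)
        elif len(self.t1) + len(self.b1) + len(self.t2) + len(self.b2) >= self.c:
            if len(self.t1) + len(self.b1) + len(self.t2) + len(self.b2) >= 2 * self.c:
                self.b2.popitem(last=False)
            self._replace(key)
        self.t1[key] = None                 # new pages enter at MRU of T1
        return False
```

A page evicted from T1 lands in the ghost list B1; requesting it again is the signal that T1 was too small, so P grows and the page re-enters directly in T2.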

24 HOW ARC WORKS? (1/2) The slide traces the reference string A B C A D E E F G D through an ARC cache: A, B, and C enter T1 (recency); the second reference to A promotes it to T2 (frequency); D and E enter T1, and the repeated reference to E promotes it to T2; F and G enter T1; finally, the second reference to D promotes it to T2 as well.

25 HOW ARC WORKS? (2/2) Continuing the trace with references H, I, J, G, K, H, L, D: ghost hits in B1 increase the target size of T1 and shrink B1 (page B eventually drops off the list), while ghost hits in B2 increase T2 and shrink B2. The trace illustrates that ARC is both scan-resistant and self-tuning.

26 ARC ADVANTAGES ARC is scan-resistant, self-tuning, and empirically universal. Stale pages do not remain in memory, which makes it better than LFU. ARC consumes about 10% - 15% more time than LRU, but its hit ratio is almost twice that of LRU. The B lists add only a low space overhead.

27 FLASH-AWARE BUFFER TECHNIQUES The goal is to minimize the number of physical write operations, because the cost of a page write is much higher than that of a page read; the buffer manager decides how and when to write. Page-based policies include CFLRU (Clean-First LRU), LRUWSR (LRU Write Sequence Reordering), CCFLRU (Cold-Clean-First LRU), and AD-LRU (Adaptive Double LRU). Policies that read/write entire flash blocks (addressing the FRW problem) include FAB (Flash Aware Buffer) and REF (Recently-Evicted-First).

28 CLEAN-FIRST LRU (CFLRU) ALGORITHM (1/3) CFLRU, one of the earliest flash-aware buffer techniques, is based on the LRU replacement policy. The LRU list is divided into two regions: a working region holding recently accessed pages, and a clean-first region (a window of W pages at the LRU end) from which pages are evicted. In the slide's example, pages P1..P8 are ordered MRU to LRU and W = 4, so P5-P8 form the clean-first region. CFLRU always evicts clean pages from the clean-first region first to save flash write costs; only if there is no clean page in this region is a dirty page at the end of the LRU list evicted.

29 CLEAN-FIRST LRU (CFLRU) ALGORITHM (2/3) Continuing the example (clean-first region P5-P8, with P7 the only clean page there): CFLRU first evicts the clean page P7; once no clean page remains in the region, the dirty pages P5, P8, and P6 are evicted in turn. Evicted pages: P7, P5, P8, P6.
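The victim selection just described fits in a few lines (a sketch; pages are modeled as (id, dirty) pairs in MRU-to-LRU order, and `w` is the clean-first window size):

```python
def cflru_victim(pages, w):
    window = pages[-w:]                       # clean-first region at the LRU side
    for page_id, dirty in reversed(window):   # scan from the LRU end
        if not dirty:
            return page_id                    # first clean page is the victim
    return pages[-1][0]                       # all dirty: fall back to LRU page

buffer = [('P1', False), ('P2', True), ('P3', False), ('P4', True),
          ('P5', True), ('P6', True), ('P7', False), ('P8', True)]
print(cflru_victim(buffer, 4))   # P7: the only clean page in the window
```

The linear scan of the window on every buffer fault is precisely the search-cost drawback the next slide points out.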

30 CLEAN-FIRST LRU (CFLRU) ALGORITHM (3/3) Disadvantages: CFLRU has to search a long list on every buffer fault; keeping dirty pages in the clean-first region can waste memory resources; and the window size W of the clean-first region must somehow be determined. CFDC (Clean-First, Dirty-Clustered) addresses these issues.

31 CFDC (CLEAN-FIRST, DIRTY-CLUSTERED) ALGORITHM CFDC implements a two-region scheme: the buffer is divided into a working region, which keeps hot pages, and a priority region, which assigns priorities to pages. The priority region (CFLRU's clean-first region) is further divided into two queues, a clean queue and a dirty queue, separating clean from dirty pages. Dirty pages are grouped into clusters according to spatial locality, and the clusters are ordered by priority. Clean pages are always chosen first as victims; otherwise, a dirty page is evicted from the LRU end of the cluster with the lowest priority.

32 CFDC ALGORITHM - PRIORITY FUNCTION For a cluster c with n pages, its priority P(c) is computed as

P(c) = n / ((globaltime - timestamp(c)) * IPD(c)), with IPD(c) = sum over i = 1..n-1 of |p_i - p_{i-1}|,

where p_0, ..., p_{n-1} are the page numbers ordered by their time of entering the cluster, timestamp(c) is the cluster's timestamp, and IPD is the inter-page distance. In the slide's example with globaltime 10, the clusters have priorities 2/9, 1/8, 1/14, and 1/18; the victim page is taken from the cluster with the lowest priority, 1/18.
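The priority formula translates directly into code (a sketch; the guard for a zero inter-page distance, e.g. a single-page cluster, is my assumption):

```python
def cluster_priority(page_numbers, timestamp, globaltime):
    # P(c) = n / ((globaltime - timestamp) * IPD(c))
    n = len(page_numbers)
    # IPD: sum of distances between consecutively entered page numbers
    ipd = sum(abs(a - b) for a, b in zip(page_numbers[1:], page_numbers))
    if ipd == 0:
        ipd = 1   # assumed guard for single-page or fully dense clusters
    return n / ((globaltime - timestamp) * ipd)
```

Large, tightly clustered, recently formed groups of dirty pages get high priority and are kept; old, scattered clusters get low priority and supply the victim, so their pages are flushed together as a cheap sequential-like write.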

33 CFDC ALGORITHM - EXPERIMENTS Cost of page flushes: clustered writes are efficient; CFDC improves on CFLRU by 41%, while CFLRU improves on LRU by only 6%. Number of page flushes: CFDC's write count is close to CFLRU's. Influence of increasing update ratios: CFDC stays on par with LRU for update-intensive workloads.

34 CFDC ALGORITHM - CONCLUSION CFDC reduces the number of physical writes, improves the efficiency of page flushing, and keeps a high hit ratio. However, the size of the priority window remains a concern for CFDC. CASA addresses this by dynamically adjusting the buffer list sizes.

35 AD-LRU (ADAPTIVE DOUBLE LRU) ALGORITHM AD-LRU integrates recency, frequency, and cleanness into the buffer replacement policy. The buffer is split into two LRU queues: a cold queue keeping pages referenced once, and a hot queue keeping pages referenced at least twice (frequency). In each queue, an FC (first-clean) pointer indicates the victim page, and a parameter min_lc bounds the size of the cold queue from below. On a page miss, the size of the cold queue is increased. When the buffer is full, cold clean pages are evicted from the cold queue; if no cold clean page is found, a cold dirty page is evicted using a second-chance algorithm.

36 AD-LRU ALGORITHM - EVICTION POLICY Example with a buffer of 9 pages: the hot queue holds pages 3 (dirty), 7 (dirty), 2 (clean), and 1 (dirty), and the cold queue holds pages 4 (dirty), 6 (clean), 5 (clean), 9 (dirty), and 8 (dirty), each queue ordered MRU to LRU. When the new page 10 (dirty, cold) arrives, AD-LRU evicts the clean cold page 6, the victim indicated by the FC pointer. If no clean cold page were found, a dirty cold page would be chosen as victim using a second-chance algorithm.
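The cold-queue victim selection, including the second-chance fallback for dirty pages, can be sketched as follows (a sketch with illustrative structures: pages carry a dirty flag and a reference bit):

```python
def adlru_victim(cold_queue):
    # cold_queue: list of dicts ordered MRU -> LRU,
    # each {'id': ..., 'dirty': bool, 'ref': bool}
    for page in reversed(cold_queue):   # look for a clean page, LRU end first
        if not page['dirty']:
            return page['id']           # first-clean victim: eviction is free
    while True:                         # all dirty: second-chance scan
        page = cold_queue[-1]
        if page['ref']:
            page['ref'] = False                   # spare it once...
            cold_queue.insert(0, cold_queue.pop())  # ...and rotate to the MRU end
        else:
            return page['id']           # unreferenced dirty page is the victim
```

Preferring clean victims avoids a physical write entirely; the second chance keeps a recently referenced dirty page from being flushed prematurely.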

37 AD-LRU ALGORITHM - EXPERIMENTS The slide plots write count vs. buffer size for various workload patterns (random, read-most, write-most, and Zipf): AD-LRU has the lowest write count in every case.

38 AD-LRU ALGORITHM - CONCLUSION AD-LRU considers reference frequency, an important property of reference patterns that is more or less ignored by CFLRU. It frees the buffer from cold pages as soon as appropriate. AD-LRU is self-tuning and scan-resistant.

39 CASA (COST-AWARE SELF-ADAPTIVE) ALGORITHM CASA makes a trade-off between physical reads and physical writes and adapts automatically to varying workloads. The buffer pool is divided into two dynamic lists: a clean list Lc and a dirty list Ld, both ordered by reference recency, with b = |Lc| + |Ld|. CASA continuously adjusts a parameter τ with 0 ≤ τ ≤ b: τ is the dynamic target size of Lc, so the target size of Ld is b - τ. On a buffer fault, τ decides from which list the victim page is chosen.

40 CASA ALGORITHM CASA considers both read and write costs, as well as the status (read/write) of a requested page. Case 1: on a logical read request that hits Lc, τ is increased. Case 2: on a logical write request that hits Ld, τ is decreased.
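The two adaptation cases can be sketched as follows (a sketch only: the unit-step adjustment of τ is an assumption for illustration; the actual policy weights its steps by the device's read/write cost ratio):

```python
class CASA:
    def __init__(self, b):
        self.b = b          # total buffer size, b = |Lc| + |Ld|
        self.tau = b / 2    # dynamic target size of the clean list Lc

    def adjust(self, hit_in_clean, is_read):
        if hit_in_clean and is_read:            # Case 1: reads reward Lc
            self.tau = min(self.b, self.tau + 1)
        elif not hit_in_clean and not is_read:  # Case 2: writes reward Ld
            self.tau = max(0, self.tau - 1)

    def evict_from_clean(self, clean_len):
        # on a buffer fault, evict from Lc iff it exceeds its target tau
        return clean_len > self.tau
```

Starting from the slide's example (b = 13, τ = 6), a read hit in Lc raises τ to 7, and a subsequent write hit in Ld lowers it back to 6, shifting buffer space toward whichever kind of page is currently cheaper to keep.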

41 CASA ALGORITHM - EXAMPLE (1/2) Total buffer size b = 13 and τ = 6: the target size of Lc is 6 and of Ld is 7. An incoming read request for page 14 hits Lc (Case 1), so τ is increased: with τ = 7, the target size of Lc becomes 7 and of Ld becomes 6.

42 CASA ALGORITHM - EXAMPLE (2/2) Continuing with b = 13 and τ = 7 (target size of Lc = 7, Ld = 6): an incoming write request for page 15 hits Ld (Case 2), so τ is decreased: with τ = 6, the target size of Lc becomes 6 and of Ld becomes 7.

43 CASA ALGORITHM - CONCLUSION CASA is designed for two-tier storage systems based on homogeneous storage devices with asymmetric read/write costs. It detects the cost ratio dynamically and is self-tuning: it adapts itself to varying cost ratios and workloads.

44 CONCLUSION Flash memory is a widely used, reliable, and flexible non-volatile memory for storing software code and data, from microcontrollers upward. However, the performance behavior of flash devices remains unpredictable due to the complexity of FTL implementations and their proprietary nature; to analyze performance more precisely, a flash device simulator would need to be implemented. We addressed buffer management for two-tier storage systems (caching for a flash DB); ARC and CASA are two of the better approaches. Phase-change memory (PCM) is a promising next-generation memory technology that could also serve database storage systems.

45 REFERENCES
1. Yi Ou: Caching for flash-based databases and flash-based caching for databases. Ph.D. Thesis, University of Kaiserslautern, Verlag Dr. Hut.
2. Nimrod Megiddo, Dharmendra S. Modha: ARC: A Self-Tuning, Low Overhead Replacement Cache. FAST 2003.
3. Nimrod Megiddo, Dharmendra S. Modha: Outperforming LRU with an Adaptive Replacement Cache Algorithm. IEEE Computer 37(4) (2004).
4. Yi Ou, Theo Härder: Clean first or dirty first? A cost-aware self-adaptive buffer replacement policy. IDEAS 2010.
5. Seon-Yeong Park, Dawoon Jung, Jeong-Uk Kang, Jinsoo Kim, Joonwon Lee: CFLRU: a replacement algorithm for flash memory. CASES 2006.
6. Yi Ou, Theo Härder, Peiquan Jin: CFDC: a flash-aware replacement policy for database buffer management. DaMoN 2009.
7. Peiquan Jin, Yi Ou, Theo Härder, Zhi Li: AD-LRU: An efficient buffer replacement algorithm for flash-based databases. Data Knowl. Eng. 72 (2012).
8. Suman Nath, Aman Kansal: FlashDB: dynamic self-tuning database for NAND flash. IPSN 2007.
9. Kyoungmoon Sun, Seungjae Baek, Jongmoo Choi, Donghee Lee, Sam H. Noh, Sang Lyul Min: LTFTL: lightweight time-shift flash translation layer for flash memory based embedded storage. EMSOFT 2008.
10. Nimrod Megiddo, Dharmendra S. Modha: System and method for implementing an adaptive replacement cache policy. US patent, B2.
11. Wikipedia: Flash memory.
12. Wikipedia: Page replacement algorithm.
13. N. Megiddo, D. S. Modha: Adaptive Replacement Cache. IBM Almaden Research Center.
14. Yang Hu, Hong Jiang, Dan Feng, Lei Tian, Shu Ping Zhang, Jingning Liu, Wei Tong, Yi Qin, Liuzheng Wang: Achieving page-mapping FTL performance at block-mapping FTL cost by hiding address translation. MSST 2010: 1-12.

46 THANK YOU

