
1 Caching for Flash-Based Databases
Summer Semester 2013
Sougata Bhattacharjee

2 OUTLINE
Motivation
Flash Memory: Flash Characteristics, Flash SSD Architecture, Flash Translation Layer
Page Replacement Algorithm: Adaptive Replacement Policy
Flash-Aware Algorithms: Clean-First LRU (CFLRU), Clean-First Dirty-Clustered (CFDC), AD-LRU, CASA
Conclusion
References

3 MOTIVATION
Data explosion: The worldwide data volume is growing at an astonishing speed. In 2007 we had 281 EB of data; in 2011, 1,800 EB.
Data storage technology, HDDs and DRAM: HDDs suffer from high latency, while DRAM comes at a much higher price.
Energy consumption: In 2005, the total power used by servers in the USA was already 0.6% of its total annual electricity consumption.
We need a memory technology that overcomes these limitations.

4 BACKGROUND
In 1980, Dr. Fujio Masuoka invented flash memory. In 1988, Intel Corporation introduced flash chips. In 1995, M-Systems introduced flash-based solid-state drives (SSDs).
What is flash? Flash memory is an electronic, non-volatile semiconductor storage device that can be electrically erased and reprogrammed. It supports three operations: program (write), erase, and read. In contrast to hard disks, SSDs retain data in non-volatile memory chips and contain no moving parts.
Flash comes in two major forms, NAND flash and NOR flash; NAND is newer and much more popular.

5 FLASH AND THE MEMORY HIERARCHY
The hierarchy, ordered from higher speed and cost toward larger size: registers, cache, RAM, flash, HDD.
Flash is faster, has lower latency, and is more reliable than hard disks, but it is more expensive.
NAND flash latencies: read about 50 μs, write about 200 μs, erase very slow (about 2 ms).

6 WHY IS FLASH POPULAR?
Benefits over magnetic hard drives:
Semiconductor technology, no mechanical parts.
Lower access latencies and high data transfer rates.
Higher reliability (no moving parts).
Lower power consumption.
Small in size and light in weight.
Longer life span.
Benefits over RAM:
Lower price.
Lower power consumption.

7 USE OF FLASH
Flash SSDs are widening their range of applications: embedded devices, desktop PCs and laptops, servers and supercomputers.

8 FLASH OPERATIONS
Flash is organized into blocks (Block 1, Block 2, ..., Block n), each holding fixed-size pages.
Three operations: read, write, erase.
Reads and writes are done at the granularity of a page (2 KB or 4 KB).
A flash block is much larger than a disk block: it contains p fixed-size flash pages of 512 B to 2 KB each.
Erasures are done at the granularity of a whole block, and each block survives only a limited number of erase cycles (10,000 to 100,000).
Block erase is the slowest operation, requiring about 2 ms.
An in-place update of a flash page is not possible; overwriting requires erasing the entire block first.
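To make these granularity rules concrete, here is a minimal Python sketch (a hypothetical model, not from the talk): pages are programmed individually, but only an erase of the whole block makes them writable again.

```python
# Hypothetical model of one flash block. Page-granularity programming,
# block-granularity erase, and an erase counter for the limited lifetime.
PAGES_PER_BLOCK = 64          # assumed value; real chips vary

class FlashBlock:
    def __init__(self):
        self.pages = [None] * PAGES_PER_BLOCK   # None = erased, writable
        self.erase_count = 0                    # wear indicator

    def program(self, page_no, data):
        if self.pages[page_no] is not None:
            # Erase-before-write rule: a programmed page cannot be
            # overwritten in place.
            raise ValueError("page must be erased before reprogramming")
        self.pages[page_no] = data              # ~200 us on NAND

    def erase(self):
        self.pages = [None] * PAGES_PER_BLOCK   # slowest operation, ~2 ms
        self.erase_count += 1                   # only 10,000-100,000 cycles
```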

9 FLASH OPERATIONS: OUT-OF-PLACE UPDATES
Because a programmed page cannot be overwritten in place, an update writes the new page version to a free page, possibly in another block, and marks the old version invalid. When a block fills up with invalid pages, the full block is erased.
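A sketch of such an out-of-place update (the mapping table and helper names are made up for illustration):

```python
# Out-of-place update: the new page version goes to a free physical page;
# the old version is merely marked invalid and reclaimed later by GC.
def update_page(mapping, free_ppns, invalid_ppns, flash, lpn, data):
    new_ppn = free_ppns.pop()         # any erased page will do
    flash[new_ppn] = data             # page-granularity program
    old_ppn = mapping.get(lpn)
    if old_ppn is not None:
        invalid_ppns.add(old_ppn)     # old version becomes garbage
    mapping[lpn] = new_ppn            # redirect the logical page
```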

10 FLASH CONSTRAINTS
Write/erase granularity asymmetry (Cons1)
Erase-before-write rule (Cons2)
Limited cell lifetime (Cons3)
Cons1 and Cons2 lead to out-of-place updates with invalidation, which in turn require logical-to-physical mapping and garbage collection. Cons3 requires wear leveling.

11 FTL: FLASH MEMORY STRUCTURE
Various operations need to be carried out to ensure correct operation of a flash device. The Flash Translation Layer (FTL) sits between the file system and the flash device and controls flash management: mapping, garbage collection, wear leveling, and other tasks.
The FTL hides the complexities of device management from the application and enables mobility: flash becomes plug and play.

12 MAPPING TECHNIQUES (1/2)
There are three types of basic mappings: page-level mapping, block-level mapping, and hybrid mapping.
Page-level mapping: each logical page number (LPN) is mapped independently to a physical page number (PPN). It has the highest performance potential, but also the highest resource use, due to the large size of the mapping table.
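A page-level mapping table is essentially a dictionary from LPN to PPN; the sketch below (table entries made up) shows why it is flexible but large:

```python
# Page-level mapping: one entry per logical page, so every page can be
# relocated independently. The table grows with the device: e.g., a
# 64 GB device with 2 KB pages needs 32 million entries.
page_table = {7: 10, 8: 1, 4: 2}     # LPN -> PPN, example entries

def translate(lpn):
    return page_table[lpn]           # per-page indirection

def relocate(lpn, new_ppn):
    page_table[lpn] = new_ppn        # out-of-place update just remaps
```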

13 MAPPING TECHNIQUES (2/2)
Block-level mapping: only block numbers are kept in the mapping table; page offsets within a block remain unchanged. Example with 4 pages per block: logical page 7 maps to logical block 7 div 4 = 1 at page offset 7 mod 4 = 3.
The mapping table is small, but performance for write updates is bad, since updating a single page affects its whole block.
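The slide's arithmetic as a sketch (the PBN entry is made up):

```python
# Block-level mapping: translate via the block number only; the page
# offset inside the block is preserved. For LPN 7 with 4 pages per
# block: LBN = 7 // 4 = 1, offset = 7 % 4 = 3.
PAGES_PER_BLOCK = 4
block_table = {1: 5}                 # LBN -> PBN, example entry

def translate(lpn):
    lbn, offset = divmod(lpn, PAGES_PER_BLOCK)
    return block_table[lbn] * PAGES_PER_BLOCK + offset

print(translate(7))                  # physical page 5*4 + 3 = 23
```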

14 FTL BLOCK-LEVEL MAPPING (BEST CASE)
Setup: k flash data blocks B1..Bk, free blocks F, and g log blocks L1..Lg that absorb updates.
Switch: when log block L1 contains the complete, up-to-date content of data block B1, L1 simply becomes B1 and the old B1 is erased. Cost: 1 erase operation.

15 FTL BLOCK-LEVEL MAPPING (MERGE CASE)
Merge: the valid pages of data block B1 and log block L1 are merged into a free block Fi; afterwards both B1 and L1 are erased. Cost: 2 erase operations.
In general, a merge of n data blocks and one log block into free blocks costs n + 1 erasures.

16 GARBAGE COLLECTION AND WEAR LEVELING
Garbage collection moves the valid pages out of blocks containing invalid data and then erases those blocks; it removes invalid pages and increases the number of free pages.
Wear leveling decides where to write new data: it picks the most frequently erased blocks and the least worn-out blocks and swaps their content to equalize overall usage, which enhances the lifespan of the flash device.
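A sketch of one garbage-collection pass; the greedy victim choice (fewest valid pages) and the helper names are assumptions, since the talk does not fix a policy:

```python
# Greedy garbage collection: reclaim the block with the least valid data,
# relocating its valid pages before the whole-block erase.
def collect(blocks, free_block):
    victim = min(blocks, key=lambda b: len(b.valid_pages))
    for page in victim.valid_pages:       # move still-valid data out
        free_block.program_next(page)     # assumed helper: next free page
    victim.erase()                        # block-granularity erase
    return victim                         # now usable as a free block
```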

17 BASICS OF PAGE REPLACEMENT
1. Find the location of the desired page on disk.
2. Find a free frame: if a free frame exists, use it; otherwise, use a page replacement algorithm to select a victim page.
3. Load the desired page into the frame and update the page allocation table (the page mapping in the buffer).
4. Upon the next page fault, repeat the whole process.
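The steps above, transcribed as a sketch (the buffer size, policy, and disk objects are assumed):

```python
BUFFER_SIZE = 4                                  # assumed frame count

def fetch(page_id, buffer, policy, disk):
    if page_id in buffer:                        # already resident: a hit
        return buffer[page_id]
    if len(buffer) >= BUFFER_SIZE:               # no free frame available
        victim = policy.select_victim(buffer)    # replacement algorithm
        if buffer[victim].dirty:
            disk.write(victim, buffer[victim])   # flush a dirty victim first
        del buffer[victim]                       # the frame becomes free
    buffer[page_id] = disk.read(page_id)         # load page, update mapping
    return buffer[page_id]
```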

18 THE REPLACEMENT CACHE PROBLEM
Cache is fast but expensive; HDDs are slow but cheap.
How do we manage the cache? Which page should be replaced? How do we maximize the hit ratio?

19 PAGE REPLACEMENT ALGORITHMS (1/2)
Least Recently Used (LRU):
Removes the least recently used items first.
Constant time and space complexity, simple to implement.
Expensive to maintain statistically significant usage statistics.
Does not exploit frequency, and is not scan-resistant.
How LRU works (4 frames, MRU on the left, starting from A B C D), reference string C A B D E F D G E:
C A B D → A C B D → B A C D → D B A C → E D B A (page fault, C evicted) → F E D B (page fault, A evicted) → D F E B (hit) → G D F E (page fault, B evicted) → E G D F (hit).
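A minimal LRU sketch using an ordered dictionary, with the MRU page at the right end and the victim taken from the left end (the interface is assumed):

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()               # left = LRU, right = MRU

    def access(self, page_id):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)      # refresh recency
            return "hit"
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)       # evict the LRU page
        self.pages[page_id] = True
        return "fault"
```

Replaying the slide's trace with capacity 4, after preloading A, B, C, D, the reference string C A B D E F D G E produces exactly three faults, evicting C, A, and B in that order.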

20 PAGE REPLACEMENT ALGORITHMS (2/2)
Least Frequently Used (LFU):
Removes the least frequently used items first and is scan-resistant.
Logarithmic time complexity per request: LFU is typically implemented with a heap, so adding, removing, or repositioning a page costs logarithmic time.
Stale pages can remain in the buffer for a long time, while hot pages can be mistakenly evicted.
LRU + LFU = LRFU (Least Recently/Frequently Used):
Exploits both recency and frequency, with better performance than LRU and LFU alone.
Still logarithmic time complexity, with enormous space overhead (information about the time of every reference to each block) and time overhead (computing the CRF value of every block at each step).
Adaptive Replacement Cache (ARC) is a solution to these problems.
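A heap-based LFU sketch illustrating the logarithmic cost per request mentioned above (lazy deletion of stale heap entries is an implementation choice, not from the talk):

```python
import heapq, itertools

class LFUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.tick = itertools.count()    # age tie-breaker
        self.freq = {}                   # page -> reference count
        self.heap = []                   # (count, tick, page); may hold stale entries

    def access(self, page_id):
        self.freq[page_id] = self.freq.get(page_id, 0) + 1
        heapq.heappush(self.heap, (self.freq[page_id], next(self.tick), page_id))
        while len(self.freq) > self.capacity:        # over capacity: evict
            count, _, victim = heapq.heappop(self.heap)
            if self.freq.get(victim) == count:       # skip outdated entries
                del self.freq[victim]                # least frequently used
```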

21 ARC (ADAPTIVE REPLACEMENT CACHE) CONCEPT
General double-cache structure (directory of size 2C for a cache of size C).
The directory is partitioned into two LRU lists, L1 and L2.
L1 contains pages seen recently exactly once: the recency list.
L2 contains pages seen at least twice recently: the frequency list.
If L1 contains exactly C pages, replace the LRU page of L1; otherwise, replace the LRU page of L2.

22 ARC CONCEPT
ARC structure (cache size is C, directory size 2C):
Divide L1 into T1 (its MRU end) and B1 (its LRU end); divide L2 into T2 (MRU end) and B2 (LRU end).
T1 and T2 together hold the at most C pages actually cached; B1 and B2 are directory-only "ghost" lists remembering recently evicted pages.
T1 and B1 together hold at most C entries, and the same holds for T2 and B2.
Upon a page request: if the page is found in T1 or T2, move it to the MRU position of T2.
On a cache miss, the new page is added at the MRU position of T1; if T1 is full, the LRU page of T1 is moved to the MRU position of B1.

23 ARC PAGE EVICTION RULE
ARC adapts a parameter P to the observed workload; P determines the target size of T1.
If a requested page is found in ghost list B1, P is increased and the page is moved to the MRU position of T2.
If a requested page is found in ghost list B2, P is decreased and the page is moved to the MRU position of T2.
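A sketch of this adaptation rule (simplified with integer division from the rule in Megiddo and Modha's ARC paper):

```python
# A ghost hit in B1 means recency was undervalued: grow P, the target
# size of T1. A ghost hit in B2 means frequency was undervalued: shrink P.
def adapt(p, c, b1_len, b2_len, hit_in_b1):
    if hit_in_b1:
        delta = max(1, b2_len // max(b1_len, 1))
        return min(p + delta, c)          # favor the recency side
    else:
        delta = max(1, b1_len // max(b2_len, 1))
        return max(p - delta, 0)          # favor the frequency side
```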

24 HOW ARC WORKS? (1/2)
[Figure: step-by-step trace, times 1 to 9, of the ARC lists B1 | T1 | T2 | B2 for the reference string A B C D E F G. Pages enter T1 (the recency side) on their first reference; re-referenced pages move to T2 (the frequency side).]

25 HOW ARC WORKS? (2/2)
[Figure: trace continued, times 9 to 17, for the reference string A B C D E F G H I J K L. At time 12, page B drops out of the directory entirely. A hit in B1 increases the target of T1 and shrinks B1; a hit in B2 increases the target of T2 and shrinks B2: ARC is self-tuning. A long one-time scan flows only through T1 and cannot flush the frequency list: ARC is scan-resistant.]

26 ARC ADVANTAGES
ARC is scan-resistant.
ARC is self-tuning and empirically universal.
Stale pages do not remain in memory, which is better than LFU.
ARC consumes about 10% to 15% more time than LRU, but its hit ratio is almost twice that of LRU.
The ghost lists B1 and B2 add only low space overhead.

27 FLASH-AWARE BUFFER TECHNIQUES
On flash, the cost of a page write is much higher than that of a page read, so the buffer manager decides how and when to write and tries to minimize the number of physical write operations.
Page-level techniques: CFLRU (Clean-First LRU), LRUWSR (LRU Write Sequence Reordering), CCFLRU (Cold-Clean-First LRU), AD-LRU (Adaptive Double LRU).
Techniques that read/write entire flash blocks (addressing the FRW problem): FAB (Flash Aware Buffer), REF (Recently-Evicted-First).

28 CLEAN-FIRST LRU ALGORITHM (1/3)
CFLRU is one of the earliest flash-aware buffer techniques and is based on the LRU replacement policy.
The LRU list is divided into two regions:
Working region: recently accessed pages, at the MRU end.
Clean-first region: candidate pages for eviction, a window of W pages at the LRU end (e.g., W = 4 covering pages P5 to P8 in a list P1 to P8).
CFLRU always evicts a clean page from the clean-first region first, to save flash write costs. If there is no clean page in this region, the dirty page at the LRU end of the list is evicted.

29 CLEAN-FIRST LRU ALGORITHM (2/3)
Example: list P1 to P8 (MRU to LRU), window W = 4, where P5 and P7 are clean and P6 and P8 are dirty.
Clean pages are evicted first from the clean-first region, then dirty ones: the eviction order is P7, P5, P8, P6.
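Victim selection as a sketch (the list is indexed MRU first, so the LRU end is at the back):

```python
# CFLRU victim search: scan the clean-first window from the LRU end for
# a clean page; fall back to the LRU dirty page if none is found.
def select_victim(lru_list, window):
    region = lru_list[-window:]          # the clean-first region
    for page in reversed(region):        # from the LRU end upwards
        if not page.dirty:
            return page                  # clean page: no flash write needed
    return lru_list[-1]                  # all dirty: evict the LRU page
```

Applied to the example above (P5 and P7 clean, the remaining pages dirty, W = 4), repeated eviction yields exactly the order P7, P5, P8, P6.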

30 CLEAN-FIRST LRU ALGORITHM (3/3)
Disadvantages:
First, on a buffer fault, CFLRU has to search a long list for a clean page starting from the LRU end.
Second, keeping dirty pages in the clean-first region can strain memory resources, because clean pages are more frequently accessed than dirty pages.
Third, the window size W of the clean-first region must be chosen properly, whether statically or dynamically: a large window increases the cache miss rate, while a small window increases the number of evicted dirty pages, that is, the number of flash writes and, in turn, the number of costly erase operations.
CFDC (Clean-First, Dirty-Clustered) addresses these problems.

31 CFDC (CLEAN-FIRST, DIRTY-CLUSTERED) ALGORITHM
CFDC implements a two-region scheme: a working region that keeps hot pages and a priority region that assigns priorities to pages.
The clean-first region of CFLRU is split into two queues, a clean queue and a dirty queue, separating clean from dirty pages.
Dirty pages are grouped into clusters according to spatial locality; clusters are ordered by priority. To administer the clusters, CFDC maintains a hash table with cluster numbers as keys: when a dirty page enters the priority region, its cluster number is derived by dividing its page number by a constant MAX_CLUSTER_SIZE, followed by a hash lookup. If the cluster exists, the page is added at the cluster tail and the cluster's position in the priority queue is adjusted; otherwise a new cluster containing the page is created, inserted into the priority queue, and registered in the hash table.
On a page hit in the clusters, the page simply moves back to the working region. On a buffer fault, clean pages are always chosen first as victims; if the clean queue is empty, the first page of the lowest-priority cluster is evicted from the LRU end of the dirty queue. After refilling the priority region with a victim from the working region, the requested page can be loaded.
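A sketch of the cluster admission path described above (the priority-queue interface is assumed):

```python
MAX_CLUSTER_SIZE = 16            # assumed constant

def add_dirty_page(clusters, priority_queue, page):
    key = page.number // MAX_CLUSTER_SIZE     # spatial-locality key
    cluster = clusters.get(key)               # hash lookup by cluster number
    if cluster is None:
        cluster = clusters[key] = []          # new cluster, register it
        priority_queue.insert(key, cluster)
    cluster.append(page)                      # add at the cluster tail
    priority_queue.reprioritize(key)          # cluster priority has changed
```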

32 CFDC ALGORITHM – PRIORITY FUNCTION
For a cluster c with n pages, its priority P(c) is computed according to Formula 1:
P(c) = n / ((globaltime - timestamp(c)) * IPD(c)),
where the inter-page distance is IPD(c) = sum for i = 1 to n-1 of |p_i - p_{i-1}|, and p_0, ..., p_{n-1} are the page numbers ordered by their time of entering the cluster.
Example: a priority queue with four clusters at globaltime 10, with timestamp(c) kept at the top right corner of each cluster (4, 2, 3, 6) and the clustered pages marked with their page numbers. From left to right, Formula 1 yields the priorities 2/9, 1/8, 1/14, 1/18; the victim page is taken from the cluster with the lowest priority, 1/18.
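Formula 1 as a sketch; taking IPD = 1 for a single-page cluster is an assumption here:

```python
# Priority of a cluster: grows with the page count n, falls with the
# cluster's age and with the inter-page distance (IPD). The victim is
# taken from the cluster with the lowest priority.
def priority(pages, timestamp, globaltime):
    n = len(pages)
    ipd = sum(abs(pages[i] - pages[i - 1]) for i in range(1, n)) or 1
    return n / ((globaltime - timestamp) * ipd)
```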

33 CFDC ALGORITHM – EXPERIMENTS
Execution time: CFDC improves over CFLRU by 41%, while CFLRU improves over LRU by 6%.
Cost of page flushes: clustered writes are efficient; CFDC's write count is close to CFLRU's.
Influence of increasing update ratios: for update-intensive workloads, CFDC performs on a par with LRU.

34 CFDC ALGORITHM – CONCLUSION
CFDC reduces the number of physical writes, improves the efficiency of page flushing, and keeps a high hit ratio.
The size of the priority window remains a concern for CFDC.
CASA addresses this by dynamically adjusting the buffer split.

35 AD-LRU (ADAPTIVE DOUBLE LRU) ALGORITHM
AD-LRU integrates recency, frequency, and cleanness into the buffer replacement policy. It maintains two LRU queues:
Cold LRU queue: pages referenced once.
Hot LRU queue: pages referenced at least twice (frequency).
If the requested page is found in the hot queue, it simply moves to the hot MRU position. If it is found in the cold queue, the hot queue is enlarged (automatically shrinking the cold queue) and the page moves to the hot MRU position. On a page miss with free buffer space, the cold queue grows and the fetched page enters the cold queue.
If the buffer is full, a victim must be selected: if the cold queue holds more than min_lc pages, the victim is taken from the cold queue; otherwise it is taken from the hot queue, which is then shrunk in favor of the cold queue.
Victim selection first looks for FC, the first-clean (least recently used clean) page of the queue. If no clean page exists, a dirty page is chosen by the second-chance policy: whenever the inspected page's referenced bit is 1, the page moves to the MRU position and the bit is reset to 0; the first page found with referenced bit 0 becomes the victim.
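Victim selection as a sketch (the cold queue is a non-empty list with its LRU end at index 0; names assumed):

```python
# AD-LRU victim selection: first-clean (FC) page if one exists, otherwise
# a second-chance scan over the dirty cold pages.
def select_victim(cold_queue):
    for page in cold_queue:                  # least recently used clean page
        if not page.dirty:
            cold_queue.remove(page)
            return page
    while True:                              # all cold pages are dirty
        page = cold_queue.pop(0)
        if page.referenced:
            page.referenced = False          # second chance: back to MRU
            cold_queue.append(page)
        else:
            return page                      # unreferenced dirty page: victim
```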

36 AD-LRU ALGORITHM – EVICTION POLICY
Example with a buffer of 9 pages (MRU to LRU in each queue):
Hot queue: 3 (dirty), 7 (dirty), 2 (clean), 1 (dirty).
Cold queue: 4 (dirty), 6 (clean), 5 (clean), 9 (dirty), 8 (dirty).
When new page 10 arrives, the AD-LRU victim is the least recently used clean cold page: scanning from the LRU end, 8 and 9 are dirty, so clean page 5 is evicted and 10 enters at the cold MRU position.
If no clean cold page had been found, a dirty cold page would have been chosen as victim using the second-chance algorithm.

37 AD-LRU ALGORITHM – EXPERIMENTS
Write count vs. buffer size was measured for various workload patterns (random, read-most, Zipf, write-most).
AD-LRU has the lowest write count of the compared algorithms.

38 AD-LRU ALGORITHM – CONCLUSION
AD-LRU considers reference frequency, an important property of reference patterns that is more or less ignored by CFLRU.
AD-LRU frees the buffer from cold pages as soon as appropriate.
AD-LRU is self-tuning and scan-resistant.

39 CASA (COST-AWARE SELF-ADAPTIVE) ALGORITHM
CASA makes a trade-off between physical reads and physical writes and adapts automatically to varying workloads.
The buffer pool of size b is divided into two dynamic lists: a clean list Lc and a dirty list Ld, with b = |Lc| + |Ld|. Both lists are ordered by reference recency.
CASA continuously adjusts a parameter τ, 0 ≤ τ ≤ b: τ is the dynamic target size of Lc, so the target size of Ld is b − τ.
On a buffer fault, τ decides from which list the victim page is chosen.

40 CASA ALGORITHM
CASA considers both read and write costs, as well as the status (read or write) of each page request.
Case 1: a logical read request for a page in Lc increases τ.
Case 2: a logical write request for a page in Ld decreases τ.
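A sketch of the adjustment; the step sizes here are purely an assumption (CASA derives its steps from the measured read/write cost ratio of the device):

```python
# tau is the target size of the clean list Lc, bounded by 0 and b.
def adjust_tau(tau, b, case, cost_read, cost_write):
    if case == 1:                   # logical read request hit in Lc
        return min(b, tau + cost_read / cost_write)    # grow clean target
    else:                           # logical write request hit in Ld
        return max(0.0, tau - cost_write / cost_read)  # grow dirty target
```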

41 CASA ALGORITHM – EXAMPLE (1/2)
Before: b = 13, τ = 6, so the target sizes are |Lc| = 6 (pages 24, 13, 19, 16, 21, 33) and |Ld| = 7 (pages 11, 22, 34, 4, 5, 7, 8).
Incoming request: page 14 is read into Lc. Case 1 applies, so τ increases to 7: the victim (page 8) is evicted from the LRU end of Ld, and page 14 enters at the MRU end of Lc.
After: τ = 7, Lc = 14, 24, 13, 19, 16, 21, 33 and Ld = 11, 22, 34, 4, 5, 7.

42 CASA ALGORITHM – EXAMPLE (2/2)
Before: b = 13, τ = 7, so the target sizes are |Lc| = 7 (pages 24, 13, 19, 16, 21, 33, 14) and |Ld| = 6 (pages 11, 22, 34, 4, 5, 7).
Incoming request: page 15 is written, entering Ld. Case 2 applies, so τ decreases to 6: the victim (page 24) is evicted from the LRU end of Lc, and page 15 enters at the MRU end of Ld.
After: τ = 6, Lc = 13, 19, 16, 21, 33, 14 and Ld = 15, 11, 22, 34, 4, 5, 7.

43 CASA ALGORITHM – CONCLUSION
CASA is implemented for two-tier storage systems based on homogeneous storage devices with asymmetric read/write costs.
CASA can detect the cost ratio dynamically.
CASA is self-tuning: it adapts itself to varying cost ratios and workloads.

44 CONCLUSION
Flash memory is a widely used, reliable, and flexible non-volatile memory for storing software code and data, e.g., in microcontrollers.
However, the performance behavior of flash devices remains unpredictable, due to the complexity of FTL implementations and their proprietary nature. To study performance more precisely, a flash device simulator would need to be implemented.
We addressed buffer management for two-tier storage systems (caching for a flash-based database); ARC and CASA are two of the better approaches.
Phase-change memory (PCM) is a promising next-generation memory technology that could also be used for database storage systems.

45 REFERENCES
Yi Ou: Caching for flash-based databases and flash-based caching for databases. Ph.D. Thesis, University of Kaiserslautern, Verlag Dr. Hut, August 2012.
Nimrod Megiddo, Dharmendra S. Modha: ARC: A Self-Tuning, Low Overhead Replacement Cache. FAST 2003.
Nimrod Megiddo, Dharmendra S. Modha: Outperforming LRU with an Adaptive Replacement Cache Algorithm. IEEE Computer 37(4), 2004.
Yi Ou, Theo Härder: Clean first or dirty first? A cost-aware self-adaptive buffer replacement policy. IDEAS 2010: 7-14.
Seon-Yeong Park, Dawoon Jung, Jeong-Uk Kang, Jinsoo Kim, Joonwon Lee: CFLRU: a replacement algorithm for flash memory. CASES 2006.
Yi Ou, Theo Härder, Peiquan Jin: CFDC: a flash-aware replacement policy for database buffer management. DaMoN 2009: 15-20.
Peiquan Jin, Yi Ou, Theo Härder, Zhi Li: AD-LRU: An efficient buffer replacement algorithm for flash-based databases. Data Knowl. Eng. 72, 2012.
Suman Nath, Aman Kansal: FlashDB: dynamic self-tuning database for NAND flash. IPSN 2007.
Kyoungmoon Sun, Seungjae Baek, Jongmoo Choi, Donghee Lee, Sam H. Noh, Sang Lyul Min: LTFTL: lightweight time-shift flash translation layer for flash memory based embedded storage. EMSOFT 2008: 51-58.
Nimrod Megiddo, Dharmendra S. Modha: System and method for implementing an adaptive replacement cache policy. US patent, 2006.
Wikipedia: Flash memory.
Wikipedia: Page replacement algorithm.
N. Megiddo, D. S. Modha: Adaptive Replacement Cache. IBM Almaden Research Center, April 2003.
Yang Hu, Hong Jiang, Dan Feng, Lei Tian, Shu Ping Zhang, Jingning Liu, Wei Tong, Yi Qin, Liuzheng Wang: Achieving page-mapping FTL performance at block-mapping FTL cost by hiding address translation. MSST 2010: 1-12.

46 THANK YOU

