An Evaluation of Using Deduplication in Swappers Weiyan Wang, Chen Zeng.


1 An Evaluation of Using Deduplication in Swappers Weiyan Wang, Chen Zeng

2 Motivation
- Deduplication detects duplicate pages in storage
  - NetApp, Data Domain: a billion-dollar business
- We explore another direction: using deduplication in swappers
- Our experimental results indicate that using deduplication in swappers is beneficial

3 What is a swapper?
- A mechanism to expand the usable address space
  - Swap out: move a page from memory to the swap area
  - Swap in: move a page from the swap area back to memory
- The swap area is on disk
[Figure labels: pte', Free, Used, P1]

4 Why is deduplication useful?
- Writes to disk are slow
  - Disk accesses are much slower than memory accesses
- When duplicate pages exist:
  - Do we really need to swap out all of them?
  - If a duplicate of a page already appears in the swap area, we can save one I/O
[Figure labels: P1, P3, P2; P1]

5 Architecture
Swap out a page -> compute checksum -> look up in the dedup cache
- YES (found): skip pageout
- NO (not found): pageout, then add to the dedup cache
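
The flow above can be sketched in user space as follows. This is only an illustration under stated assumptions: a fixed-size open-addressing table stands in for the dedup cache, a 32-bit FNV-1a hash stands in for the truncated SHA-1 checksum, and the names `toy_checksum`, `dedup_table`, `swap_out`, and `pageouts` are hypothetical and do not come from the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096
#define TABLE_SLOTS 1024   /* toy capacity, chosen only for illustration */

/* Stand-in checksum: 32-bit FNV-1a (the slides use truncated SHA-1). */
static uint32_t toy_checksum(const unsigned char *page)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < PAGE_SIZE; i++) {
        h ^= page[i];
        h *= 16777619u;
    }
    return h;
}

struct dedup_slot {
    int used;
    uint32_t checksum;
    long swap_index;
};

static struct dedup_slot dedup_table[TABLE_SLOTS]; /* toy dedup cache */
static long next_swap_index;  /* next free slot in the simulated swap area */
static long pageouts;         /* number of simulated disk writes */

/* Returns the swap index that holds the contents of `page`. */
static long swap_out(const unsigned char *page)
{
    assert(page != NULL);
    uint32_t sum = toy_checksum(page);
    size_t start = sum % TABLE_SLOTS;

    for (size_t probe = 0; probe < TABLE_SLOTS; probe++) {
        struct dedup_slot *s = &dedup_table[(start + probe) % TABLE_SLOTS];
        if (s->used && s->checksum == sum)
            return s->swap_index;       /* YES: skip pageout */
        if (!s->used) {
            pageouts++;                 /* NO: pageout ... */
            s->used = 1;                /* ... then add to the dedup cache */
            s->checksum = sum;
            s->swap_index = next_swap_index++;
            return s->swap_index;
        }
    }
    /* Table full: page out without caching the checksum. */
    pageouts++;
    return next_swap_index++;
}
```

As on the slides, two pages are treated as identical whenever their checksums match; the page contents are never compared.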

6 Computing Checksum
- SHA-1 checksum (160 bits)
  - Collision probability of one in 2^80
  - We only use the first 32 bits (one in 2^16)
- Related to the implementation of the dedup cache
  - Only the checksum is stored
- We assume two pages are identical if their checksums are equal
  - Trade consistency for performance
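
The two figures on this slide are consistent with the birthday bound. The working below is a sketch that assumes checksums behave like uniformly random b-bit values; the symbols n (number of swapped-out pages) and b (checksum width in bits) are introduced here and are not on the slides.

```latex
P_{\text{coll}}(n, b) \;\approx\; 1 - \exp\!\left(-\frac{n(n-1)}{2^{\,b+1}}\right)
\;\approx\; \frac{n^{2}}{2^{\,b+1}} \quad \text{for } n \ll 2^{b/2}

b = 160:\quad \text{collisions become likely only near } n \approx 2^{80}

b = 32:\quad n = 2^{16} \;\Rightarrow\;
P_{\text{coll}} \approx 1 - e^{-1/2} \approx 0.39
```

With 4 KB pages, 2^16 pages correspond to 256 MB of swapped-out data, so under this assumption a 32-bit key could plausibly produce a false match at realistic swap sizes. That is the consistency the slide says is traded for performance.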

7 Dedup Cache
- Dedup cache: a radix tree
  - Checksum -> dedup_entry_t
  - A trie with O(|key|) lookup and update overhead
  - Well implemented in the kernel
- The key in the radix tree is 32 bits
  - We only keep the first 32 bits of a checksum as the key

8 Entries in Dedup Cache
- The index of a page in the swap area
- The number of duplicate pages for a given checksum
- A lock for consistency

    typedef struct {
        swp_entry_t base;
        atomic_t count;
        spinlock_t lock;
    } dedup_entry_t;
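
To make the checksum -> entry mapping concrete, here is a minimal user-space sketch of a trie keyed by a 32-bit checksum. It is not the kernel's radix tree: it assumes a fixed 4-level, 256-way layout, replaces `swp_entry_t` and `atomic_t` with plain integers, and omits the spinlock because it is single-threaded. The names `toy_dedup_entry`, `trie_node`, `trie_lookup`, and `trie_add` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* User-space stand-in for dedup_entry_t (no lock, plain integer types). */
typedef struct {
    unsigned long base;  /* index of the page in the swap area */
    int count;           /* number of duplicate pages with this checksum */
} toy_dedup_entry;

#define FANOUT 256  /* 8 key bits consumed per level */
#define LEVELS 4    /* 4 * 8 = 32-bit key */

typedef struct trie_node {
    void *slot[FANOUT];  /* child nodes, or entries at the last level */
} trie_node;

/* Extracts the 8 key bits used at `level`, most significant byte first. */
static unsigned level_index(uint32_t key, int level)
{
    return (key >> (8 * (LEVELS - 1 - level))) & 0xFFu;
}

/* O(|key|) lookup: exactly LEVELS steps regardless of how many keys exist. */
static toy_dedup_entry *trie_lookup(const trie_node *root, uint32_t key)
{
    assert(root != NULL);
    const trie_node *node = root;
    for (int level = 0; level < LEVELS - 1; level++) {
        node = node->slot[level_index(key, level)];
        if (node == NULL)
            return NULL;
    }
    return node->slot[level_index(key, LEVELS - 1)];
}

/* Bumps count if the key exists, otherwise inserts a new entry with
 * count 1. Returns NULL only if allocation fails. */
static toy_dedup_entry *trie_add(trie_node *root, uint32_t key,
                                 unsigned long swap_index)
{
    assert(root != NULL);
    trie_node *node = root;
    for (int level = 0; level < LEVELS - 1; level++) {
        unsigned i = level_index(key, level);
        if (node->slot[i] == NULL) {
            node->slot[i] = calloc(1, sizeof(trie_node));
            if (node->slot[i] == NULL)
                return NULL;
        }
        node = node->slot[i];
    }
    unsigned last = level_index(key, LEVELS - 1);
    toy_dedup_entry *e = node->slot[last];
    if (e == NULL) {
        e = malloc(sizeof *e);
        if (e == NULL)
            return NULL;
        e->base = swap_index;
        e->count = 1;
        node->slot[last] = e;
    } else {
        e->count++;  /* another duplicate of an already swapped-out page */
    }
    return e;
}
```

A second `trie_add` with the same key leaves `base` unchanged and only increments `count`, mirroring the idea that every duplicate shares one page in the swap area.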

9 Changes to Linux Kernel
- Swap cache
  - swap_entry_t -> page
  - Avoids repeatedly swapping in the same page
  - This happens when a swapped-out page is shared by multiple processes
- Example
  - Processes A and B share the page P
  - P is swapped out; the PTEs in A and B are updated
  - A wants to access P
  - B wants to access P

10 Will the dedup cache grow infinitely?
- Swap counter for each swap_entry_t
  - Number of references in memory
  - counter++ when:
    - one more PTE contains the swap_entry_t
    - it is in the swap cache
    - it is in the dedup cache
  - counter-- when a page is swapped in
  - Remove the swap_entry_t from the dedup cache and the swap cache when counter = 2, that is, when only the two caches still reference it
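
One possible reading of the counter rules above, as a user-space sketch. It assumes the swap cache and the dedup cache each hold exactly one reference and that every other reference is a PTE; the type `toy_swap_slot` and the functions `slot_init`, `slot_get`, and `slot_put` are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>

/* Counter value at which only the swap cache and the dedup cache
 * still reference the entry (the slide's "counter = 2"). */
#define CACHE_ONLY_REFS 2

typedef struct {
    int counter;     /* PTE references + swap cache + dedup cache */
    bool in_caches;  /* true while the entry sits in both caches */
} toy_swap_slot;

/* Entry has just been added to the swap cache and the dedup cache. */
static void slot_init(toy_swap_slot *s)
{
    s->counter = CACHE_ONLY_REFS;
    s->in_caches = true;
}

/* One more PTE now contains this swap entry. */
static void slot_get(toy_swap_slot *s)
{
    assert(s->in_caches);
    s->counter++;
}

/* A page is swapped in, so one PTE reference goes away. Returns true if
 * the entry was removed from both caches as a result. */
static bool slot_put(toy_swap_slot *s)
{
    assert(s->counter > CACHE_ONLY_REFS);
    s->counter--;
    if (s->counter == CACHE_ONLY_REFS) {
        s->in_caches = false;  /* drop the two cache references as well */
        s->counter = 0;
        return true;
    }
    return false;
}
```

Under this reading the dedup cache cannot grow without bound, because an entry leaves both caches as soon as its last PTE reference is swapped back in.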

11 Reference Counters
[Figure: processes A and B, the swap cache, and the dedup cache referencing the swap area, with counter values (4) and (2)]

12 Changes to Swap Cache
- The swap cache maintains the mapping between a swap_entry and a page
- We change that mapping to one between a swap_entry and a list of pages with the same contents
- Why do we need a list?

13 Possible Inconsistency
- Swap out page P1 to swap_entry E1
- Swap out page P2, a duplicate of P1
  - The mapping E1 -> P2 cannot be added to the swap cache
- Swap in P1: the mapping is deleted
- Swap in P2: Oops!
[Figure: swap cache holding E1 -> P1]

14 Our Solution
- Swap out page P1 to swap_entry E1
- Swap out page P2, a duplicate of P1
  - The mapping E1 -> P2 is added to the list
- Swap in P1: only P1 is deleted
- Swap in P2: delete E1 -> P2
[Figure: swap cache states E1 -> P1; E1 -> P1, P2; E1 -> P2]
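
The sequence above can be traced with a small user-space sketch. It assumes a bounded array in place of the list the slides mention and uses integer page ids in place of page structures; `toy_swap_cache_slot`, `slot_add_page`, and `slot_remove_page` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PAGES 8  /* toy bound; a real list would grow as needed */

typedef struct {
    unsigned long entry;   /* stands in for the swap entry (E1) */
    int pages[MAX_PAGES];  /* ids of pages that share the same contents */
    size_t npages;
} toy_swap_cache_slot;

/* Swap out: record one more page mapped by this swap entry. */
static bool slot_add_page(toy_swap_cache_slot *s, int page_id)
{
    assert(s != NULL);
    if (s->npages == MAX_PAGES)
        return false;
    s->pages[s->npages] = page_id;
    s->npages++;
    return true;
}

/* Swap in: delete only this page's mapping and keep the others.
 * Returns true if the page was found. */
static bool slot_remove_page(toy_swap_cache_slot *s, int page_id)
{
    assert(s != NULL);
    for (size_t i = 0; i < s->npages; i++) {
        if (s->pages[i] == page_id) {
            s->npages--;
            s->pages[i] = s->pages[s->npages];  /* fill the hole */
            return true;
        }
    }
    return false;
}
```

After P1 and P2 are added, removing P1 leaves E1 -> P2 in place, so the later swap-in of P2 still finds its mapping.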

15 Experimental Evaluation
- We run our experiments on VMware with Linux 2.6.26
- Our test program sequentially accesses an array
  - Each element is 4 KB in size
  - We vary the percentage of duplicate pages in that array
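
The slides do not show the test program itself. The sketch below is one way such a workload could be built, under the assumption that duplicates are zero-filled elements and unique elements are stamped with their index; `fill_array` and `touch_all` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define ELEM_SIZE 4096  /* each array element is 4 KB, as on the slide */

/* Fills `buf` (n elements of ELEM_SIZE bytes). The first
 * n * dup_percent / 100 elements are identical (all zeros); each remaining
 * element is made unique by a nonzero little-endian tag in its first bytes.
 * Returns the number of duplicate elements. */
static size_t fill_array(unsigned char *buf, size_t n, unsigned dup_percent)
{
    assert(buf != NULL && dup_percent <= 100);
    size_t ndup = n * dup_percent / 100;
    for (size_t i = 0; i < n; i++) {
        unsigned char *elem = buf + i * ELEM_SIZE;
        memset(elem, 0, ELEM_SIZE);
        if (i >= ndup) {
            size_t tag = i + 1;  /* nonzero, different for every element */
            for (size_t k = 0; k < sizeof tag; k++)
                elem[k] = (unsigned char)((tag >> (8 * k)) & 0xFF);
        }
    }
    return ndup;
}

/* Sequentially touches every element by reading its first byte. */
static unsigned long touch_all(const unsigned char *buf, size_t n)
{
    unsigned long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += buf[i * ELEM_SIZE];
    return sum;
}
```

To exercise the swapper, the array would need to be larger than the memory available to the process; timing `touch_all` for different values of `dup_percent` would then give access-time comparisons of the kind the following slides describe.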

16 All of the pages are duplicates
- Deduplication significantly reduces the access time

17 No Duplicate Pages
- However, deduplication also incurs a significant overhead

18 Overheads in Deduplication
- Major overheads:
  - Calculating checksums: 35 us
    - We always calculate the checksum when a page is swapped in or swapped out
  - Maintaining the reference counter
    - The explicitly required locks impose significant overhead: an average of 65 us in our experiments

19 Conclusion
- Deduplication is a double-edged sword in swappers
  - When many duplicate pages are present, deduplication reduces the access time by orders of magnitude
  - When few duplicate pages are present, the overhead is non-negligible

