Davide Frey, Anne-Marie Kermarrec, Konstantinos Kloudas INRIA Rennes, France Plug.


Motivation The volume of stored data increases exponentially. The services provided are highly dependent on this data. 2

Motivation Traditional solutions combine data in tarballs and store them on tape. – Pros: cost efficient. – Cons: low throughput. 3

                    Tape        Disk
  Acquisition cost  $407,000    $1,620,000
  Operational cost  $205,000    $573,000
  Total cost        $612,000    $2,193,000

* Source:

Deduplication Store data only once and replace duplicates with references. 4

Deduplication Store data only once and replace duplicates with references. 5 file1

Deduplication file1 file2 6 Store data only once and replace duplicates with references.

Deduplication 7 file1 file2 Store data only once and replace duplicates with references.

Deduplication 8 file1 file2 Store data only once and replace duplicates with references.

Challenges Single-node deduplication systems. – Compact indexing structures. – Efficient duplicate detection. 9

Challenges Single-node deduplication systems. – Compact indexing structures. – Efficient duplicate detection. Cluster-based solutions. – Single-machine tradeoffs. – Deduplication vs Load balancing. 10 We focus on Cluster-based Deduplication Systems. Plug

Storage Nodes Coordinator Clients Example: Deduplication Vs Load Balancing A B C D A client wants to store a file. 11

Storage Nodes Coordinator Clients Example: Deduplication Vs Load Balancing A B C D The client sends the file to the Coordinator. 12

Storage Nodes Coordinator Clients Example: Deduplication Vs Load Balancing A10% B30% C60% D0% The Coordinator computes the overlap between the contents of the file and those of each Storage Node. 13

Storage Nodes Coordinator Clients Example: Deduplication Vs Load Balancing A10% B30% C60% D0% To maximize DEDUPLICATION, the new file should go to node C. 14

Storage Nodes Coordinator Clients Example: Deduplication Vs Load Balancing A10% B30% C60% D0% To achieve LOAD BALANCING, the new file should go to node D. 15

Goal: Scalable Cluster Deduplication. 16 Load Balancing. Minimize: Ideally, equal to 1. Good Data Deduplication. Maximize: Ideally, the deduplication of a single-node system.

Goal: Scalable Cluster Deduplication. 17 Load Balancing. Minimize: Ideally, equal to 1. Good Data Deduplication. Maximize: Ideally, the deduplication of a single-node system. Scalability. Minimize memory usage at the Coordinator.

Goal: Scalable Cluster Deduplication. 18 Load Balancing. Minimize: Ideally, equal to 1. Good Data Deduplication. Maximize: Ideally, the deduplication of a single-node system. Good Throughput. Minimize CPU/Memory usage at the Coordinator. Scalability. Minimize memory usage at the Coordinator.

State-of-the-art Divided into stateless and stateful approaches. Stateless: – Assign data to nodes regardless of previous assignment decisions. Stateful: – Keep state for each storage node and assign data to nodes based on their current state.

State-of-the-art: comparison of stateless and stateful approaches along three axes: Memory, CPU, and Deduplication (the cell values are shown graphically on the slide).

State-of-the-art: comparison of stateless and stateful approaches along three axes: Memory, CPU, and Deduplication. Goal: Make stateful approaches viable.

PRODUCK architecture Client: split the file into chunks of data; store and retrieve data. Storage Nodes: store the chunks. Coordinator: provide directory services; assign chunks to nodes; keep the system load balanced. 22

Client: chunking Chunks: – created with content-based chunking techniques. – the basic deduplication unit. Super-chunks: – groups of consecutive chunks. – the basic routing and storage unit. 23
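The chunking step above can be sketched as follows. This is an illustrative content-defined chunker, not PRODUCK's actual one: the toy rolling hash, the boundary mask, the size thresholds, and the super-chunk group size are all made-up example values.

```python
# Illustrative content-defined chunking (not PRODUCK's actual chunker):
# a toy rolling hash declares a chunk boundary whenever its low bits
# are all zero, and consecutive chunks are then grouped into
# super-chunks. All parameter values are made-up examples.

def chunk(data: bytes, mask: int = 0x3F, min_size: int = 32,
          max_size: int = 4096) -> list:
    """Split data into variable-size chunks at content-defined boundaries."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + byte) & 0xFFFFFFFF  # toy rolling hash, not Rabin
        size = i - start + 1
        if (size >= min_size and (h & mask) == 0) or size >= max_size:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks

def super_chunks(chunks: list, group: int = 8) -> list:
    """Group consecutive chunks into super-chunks, the routing/storage unit."""
    return [chunks[i:i + group] for i in range(0, len(chunks), group)]
```

Because boundaries depend on content rather than file offsets, an edit near the start of a file shifts only nearby chunk boundaries, which is what makes duplicate chunks detectable across files.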

Client: chunking 24 Split the file into chunks

Client: chunking 25 Organize the chunks into super-chunks

Client: chunking 26

PRODUCK architecture Client: split the file into chunks of data; store and retrieve data. Storage Nodes: store the chunks. Coordinator: provide directory services; assign chunks to nodes; keep the system load balanced. 27

Coordinator: goals Estimate the overlap between a super-chunk and the chunks of a given node. – Maximize deduplication. Equally distribute storage load among nodes. – Guarantee a load balanced system. 28

Coordinator: our contributions Novel chunk overlap estimation. – Based on probabilistic counting—PCSA [Flajolet et al. 1985, Michel et al. 2006]. – Never used before in storage systems. Novel load balancing mechanism. – Operating at chunk-level granularity. – Improving co-localization of duplicate chunks. 29

Coordinator: Overlap Estimation Main observation: – We do not need the exact matches. – We only need an estimate of the size of the overlap. PCSA provides: – Compact set descriptors. – Accurate intersection estimation. – Computational efficiency. 30

Coordinator: Overlap Estimation Chunk 5 Chunk 1 Chunk 2 Chunk 3 Chunk 4 Original Set of Chunks 31

Coordinator: Overlap Estimation Chunk 5 Chunk 1 Chunk 2 Chunk 3 Chunk 4 hash() Original Set of Chunks 32

Coordinator: Overlap Estimation Chunk 5 Chunk 1 Chunk 2 Chunk 3 Chunk 4 Original Set of Chunks p(y) = min(bit(y, k)) BITMAP hash() 33

Coordinator: Overlap Estimation Chunk 5 Chunk 1 Chunk 2 Chunk 3 Chunk 4 Original Set of Chunks hash() BITMAP p(y) = min(bit(y, k)) INTUITION P(bitmap[0] = 1) = 1/2 P(bitmap[1] = 1) = 1/4 P(bitmap[2] = 1) = 1/8 …

Coordinator: Overlap Estimation Chunk 5 Chunk 1 Chunk 2 Chunk 3 Chunk 4 l = 2 Original Set of Chunks hash() BITMAP p(y) = min(bit(y, k))

Coordinator: Overlap Estimation Chunk 5 Chunk 1 Chunk 2 Chunk 3 Chunk 4 l = 2 sizeOf(A) = 2^2 / 0.77 = 5.19 Original Set of Chunks p(y) = min(bit(y, k)) BITMAP hash()
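The slides above can be condensed into a minimal single-bitmap Flajolet-Martin/PCSA sketch: p(y) is the position of the least-significant 1-bit of hash(y), and the cardinality estimate is 2^R / 0.77351, where R is the position of the first 0-bit in the bitmap (the slide's 2^2 / 0.77 = 5.19). PRODUCK maintains many bitmaps per super-chunk and averages them; this one-bitmap version with hypothetical helper names is for illustration only.

```python
# Minimal single-bitmap Flajolet-Martin / PCSA sketch mirroring the
# slide's example. PRODUCK averages over many such bitmaps; this
# one-bitmap sketch is illustrative, and SHA-1 stands in for
# whatever hash the real system uses.
import hashlib

PHI = 0.77351  # PCSA correction constant

def lsb_position(x: int) -> int:
    """Index of the least-significant set bit: p(y) on the slide."""
    return (x & -x).bit_length() - 1

def hash32(item: bytes) -> int:
    h = int.from_bytes(hashlib.sha1(item).digest()[:4], "big")
    return h | (1 << 32)  # guard bit: lsb_position is defined even if h == 0

def make_bitmap(items) -> int:
    """OR together 1 << p(hash(y)) for every item; duplicates are absorbed."""
    bitmap = 0
    for item in items:
        bitmap |= 1 << lsb_position(hash32(item))
    return bitmap

def estimate(bitmap: int) -> float:
    """Cardinality estimate 2^R / PHI, R = position of the first 0-bit."""
    r = 0
    while bitmap & (1 << r):
        r += 1
    return (2 ** r) / PHI
```

Note that the bitmap is insensitive to duplicates: hashing the same chunk twice sets the same bit, which is exactly why the sketch describes a *set* of chunks.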

Coordinator : Overlap Estimation Intersection Cardinality Estimation ?

Coordinator: Overlap Estimation Intersection Cardinality Estimation ?

Coordinator: Overlap Estimation Intersection Cardinality Estimation? Union Cardinality Estimation? BITMAP(A) BitwiseOR BITMAP(B) = BITMAP(A ∨ B); then |A ∩ B| = |A| + |B| − |A ∪ B|. 39
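The union/intersection trick can be sketched directly on such bitmaps: OR-ing two bitmaps sketches the union, and inclusion-exclusion recovers the overlap. The `estimate` helper below is the standard single-bitmap PCSA estimate (2^R / 0.77351); names are illustrative, not PRODUCK's code.

```python
# Sketch of the slide's trick on FM/PCSA bitmaps: bitwise OR sketches
# the union, inclusion-exclusion recovers the intersection.

PHI = 0.77351  # PCSA correction constant

def estimate(bitmap: int) -> float:
    """Cardinality estimate from a single FM/PCSA bitmap."""
    r = 0
    while bitmap & (1 << r):  # R = position of the first 0-bit
        r += 1
    return (2 ** r) / PHI

def estimate_overlap(bitmap_a: int, bitmap_b: int) -> float:
    """|A ∩ B| ≈ |A| + |B| − |A ∪ B|, with BITMAP(A ∨ B) = A OR B."""
    union = bitmap_a | bitmap_b  # bitwise OR sketches the union
    return estimate(bitmap_a) + estimate(bitmap_b) - estimate(union)
```

This is what lets the Coordinator compare a super-chunk against every Storage Node using only tiny bitmaps rather than full chunk indexes.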

Coordinator: Overlap Estimation 40 PCSA set cardinality estimation. Set intersection estimation. Selection of best storage node.

In Practice 41 The client creates the bitmap of each super-chunk (8192 vectors, total size 64KB). – Trade-off between efficiency and error. The Coordinator stores only a bitmap for each Storage Node.

Coordinator: our contributions Novel chunk overlap estimation. – Based on probabilistic counting—PCSA [Flajolet et al. 1985, Michel et al. 2006]. – Never used before in storage systems. Novel load balancing mechanism. – Operating at chunk-level granularity. – Improving co-localization of duplicate chunks. 42

Load Balancing 43 Existing solution: choose Storage Nodes that do not exceed the average load by a percentage threshold.

Load Balancing Problems: too aggressive, especially when little data is stored in the system. 44 Existing solution: choose Storage Nodes that do not exceed the average load by a percentage threshold.

Bucket-based storage quota management. – Measure storage space in fixed-size buckets. – Coordinator grants buckets to nodes one by one. – No node can exceed the least loaded by more than a maximum allowed bucket difference. 45 Load Balancing: our solution
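The bucket rule above can be sketched as follows. All names are hypothetical and the maximum allowed bucket difference of 2 is an example value, not PRODUCK's setting; the sketch also simplifies by charging one bucket per assignment.

```python
# Hedged sketch of the bucket-based quota rule: the Coordinator grants
# fixed-size buckets one by one, and no node may exceed the least
# loaded node by more than a maximum allowed bucket difference.

MAX_DIFF = 2  # maximum allowed bucket difference (example value)

def can_grant(buckets: dict, node: str, max_diff: int = MAX_DIFF) -> bool:
    """True if granting `node` one more bucket keeps the cluster balanced."""
    least = min(buckets.values())
    return buckets[node] + 1 - least <= max_diff

def assign(buckets: dict, overlaps: dict) -> str:
    """Send the super-chunk to the eligible node with the largest overlap.

    If the best-overlap node would become too loaded, fall back to the
    node with the second-biggest overlap, and so on.
    """
    for node in sorted(overlaps, key=overlaps.get, reverse=True):
        if can_grant(buckets, node):
            buckets[node] += 1  # simplification: one bucket per assignment
            return node
    raise RuntimeError("unreachable: the least-loaded node is always eligible")
```

Unlike a percentage-of-average threshold, the fixed bucket difference stays meaningful when the system holds very little data, which is exactly the failure mode the slide calls out.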

Bucket-based storage quota management. Bucket 46 Load Balancing: our solution

Bucket-based storage quota management. Bucket Can I get a new Bucket? 47 Load Balancing: our solution

Bucket-based storage quota management. Bucket Yes, you can. 48 Load Balancing: our solution

Bucket-based storage quota management. Bucket Yes, you can. 49 Load Balancing: our solution

Bucket-based storage quota management. Bucket 50 Load Balancing: our solution

Bucket-based storage quota management. Bucket 51 Load Balancing: our solution

Bucket-based storage quota management. Bucket Can I get a new Bucket? 52 Load Balancing: our solution

Bucket-based storage quota management. Bucket NO you cannot! 53 Load Balancing: our solution

Bucket-based storage quota management. Bucket Searching for the second biggest overlap. 54 Load Balancing: our solution

Bucket-based storage quota management. Bucket 55 Load Balancing: our solution

Contribution Summary Novel chunk overlap estimation. – Based on probabilistic counting—PCSA [Flajolet et al. 1985, Michel et al. 2006]. – Never used before in storage systems. Novel load balancing mechanism. – Operating at chunk-level granularity. – Improving co-localization of duplicate chunks. 56

Evaluation: Datasets 2 real-world workloads: Wikipedia and Images. 2 competitors [Dong et al. 2011]: – MinHash – BloomFilter 57

Evaluation: Competitors MinHash: stateless – Use the minimum hash from a super-chunk as its fingerprint. – Assign super-chunks to bins using the mod(# bins) operator. – Initially assign bins to nodes randomly and re-assign bins to nodes when unbalanced. 58
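The stateless MinHash routing described above can be sketched like this; helper names are illustrative, and SHA-1 stands in for whatever hash the real system uses.

```python
# Sketch of stateless MinHash routing: the minimum chunk hash of a
# super-chunk is its fingerprint, mod(# bins) maps the fingerprint to
# a bin, and bins are assigned to storage nodes. No per-node state
# about stored chunks is needed.
import hashlib

def chunk_hash(chunk: bytes) -> int:
    return int.from_bytes(hashlib.sha1(chunk).digest()[:8], "big")

def fingerprint(super_chunk) -> int:
    """Minimum hash over the chunks of a super-chunk."""
    return min(chunk_hash(c) for c in super_chunk)

def route(super_chunk, num_bins: int, bin_to_node: dict) -> str:
    """Stateless routing: identical content always reaches the same node."""
    return bin_to_node[fingerprint(super_chunk) % num_bins]
```

Because routing depends only on content, duplicate super-chunks land on the same node without the Coordinator tracking what each node stores; the cost, as the talk notes, is poorer deduplication than stateful placement.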

Evaluation: Competitors BloomFilter: stateful – The Coordinator keeps a Bloom filter for each one of the Storage Nodes. – If a node deviates by more than 5% from the average load, it is considered overloaded. 59

Evaluation: Metrics Deduplication: Load balancing: Overall: ED and TD are normalized to the performance of a single-node system to ease comparison. Throughput: 60

Evaluation: Effective Deduplication Wikipedia / Images 32 nodes: Wikipedia 7%, Images 16%. 64 nodes: Wikipedia 16%, Images 21%.

Evaluation: Throughput 62 Wikipedia / Images 32 nodes: Wikipedia 11X, Images 13X. 64 nodes: Wikipedia 16X, Images 21X.

Evaluation: Throughput 63 Wikipedia / Images Memory: 64KB for Produck vs 9.6 bits/chunk, or 168GB for 140TB/node. 32 nodes: Wikipedia 11X, Images 13X. 64 nodes: Wikipedia 16X, Images 21X.

Evaluation: Load Balancing 64 Wikipedia / Images

To Take Away Lessons learned from cluster-based deduplication: – Stateful: good deduplication but impractical. – Stateless: practical but poorer deduplication. Useful concepts for SocioPlug: – PCSA: data placement. – Load balancing: bucket-based. 65
