Data Currency in Replicated DHTs Reza Akbarinia, Esther Pacitti and Patrick Valduriez University of Nantes, France, INRIA ACM SIGMOD 2007 Presenter Jerry Wu

Data Currency in Replicated DHTs
Reza Akbarinia, Esther Pacitti and Patrick Valduriez
University of Nantes, France, INRIA
ACM SIGMOD 2007
Presenter: Jerry Wu

Motivation
P2P data sharing systems
– Enable a large number of users to share a massive number of files
– Query → Reply → Send request → Download
Message forwarding in these systems
– Flooding: KaZaA, Gnutella
– DHT: CAN, Chord, Pastry, etc.

Distributed Hash Table (DHT)
Use hash functions to locate files
– h(meta data) = k (for identification)
– g(k) = k1 (for routing)
[Figure: peers A–F and user U; the metadata of FreeLoop.mp3 is hashed and routed to peer A, with g(k) = k1]
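The two-step hashing on this slide can be sketched as follows. This is a minimal illustration, not the scheme of any particular DHT: the use of SHA-1, the 16-bit identifier space (`ID_BITS`), the `salt` parameter and the successor-style `responsible_peer` rule are all assumptions made for the example.

```python
import hashlib

ID_BITS = 16  # assumed size of the routing identifier space


def h(metadata: str) -> int:
    """Map file metadata to an identification key k (SHA-1 is an assumption)."""
    digest = hashlib.sha1(metadata.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def g(k: int, salt: str = "g") -> int:
    """Map key k to a routing key in [0, 2**ID_BITS)."""
    data = salt.encode("utf-8") + k.to_bytes(20, "big")
    digest = hashlib.sha1(data).digest()
    return int.from_bytes(digest, "big") % (2 ** ID_BITS)


def responsible_peer(routing_key: int, peer_ids: list) -> int:
    """Assumed Chord-like rule: first peer ID >= routing key, wrapping around."""
    ring = sorted(peer_ids)
    for pid in ring:
        if pid >= routing_key:
            return pid
    return ring[0]


k = h("FreeLoop.mp3")
k1 = g(k)
owner = responsible_peer(k1, [1000, 20000, 45000, 60000])
```

Changing the `salt` gives a different routing function, which is one simple way to obtain the extra hash functions needed for replication.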

Data Replication
What if node A fails?
Store several copies of the file at different peers
– g(h(FreeLoop.mp3)) = k1 (A)
– g2(h(FreeLoop.mp3)) = k2 (D)
– g3(h(FreeLoop.mp3)) = k3 (E)
[Figure: peers A–F and user U; the metadata of FreeLoop.mp3 is stored under k1, k2 and k3 at peers A, D and E]

Basic Operations
put_Hr(meta key k, File D)
– Insert a file into the DHT
get_Hr(meta key k)
– Retrieve the file from the DHT
H_r: the set of hash functions g used to place the replicas
|H_r|: the replication level of the system
Each file is stored at |H_r| peers
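A toy, in-memory version of the replicated put and get operations might look like this. It is a sketch under stated assumptions: the class name `ToyDHT`, the salted SHA-1 replica hash functions and the modulo peer assignment are illustrative choices, and there is no networking or failure handling.

```python
import hashlib


class ToyDHT:
    """In-memory stand-in for a DHT with a fixed replication level."""

    def __init__(self, num_peers: int, replication_level: int):
        self.num_peers = num_peers
        self.replication_level = replication_level
        # One local key-value store per peer.
        self.stores = [dict() for _ in range(num_peers)]

    def peer_for(self, k: str, i: int) -> int:
        """Peer chosen by the i-th replication hash function (assumed: salted SHA-1)."""
        digest = hashlib.sha1(f"{i}:{k}".encode("utf-8")).digest()
        return int.from_bytes(digest, "big") % self.num_peers

    def put(self, k: str, data) -> None:
        """Store one copy of the data per replication hash function."""
        for i in range(self.replication_level):
            # Keyed by (i, k) so two hash functions that hit the same peer
            # do not overwrite each other's entry.
            self.stores[self.peer_for(k, i)][(i, k)] = data

    def get(self, k: str, i: int):
        """Fetch the copy placed by the i-th hash function, or None."""
        return self.stores[self.peer_for(k, i)].get((i, k))


dht = ToyDHT(num_peers=8, replication_level=3)
dht.put("FreeLoop.mp3", b"audio bytes")
copies = [dht.get("FreeLoop.mp3", i) for i in range(3)]
```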

Additional Problems
What if the owner can modify the data?
The nature of P2P systems
– Peers can join and leave dynamically
What happens to an update issued while some peers are away, when they rejoin later?
What about concurrent updates?

Solution
What if every insert/update transaction carries a timestamp?
– The currency of a file is judged by its timestamp
– FileX = File + timestamp
– Put (k, FileX) instead of (k, File) into the DHT
Then we know the freshness of each copy of the file
Only the latest update can succeed

How Can We Get A Timestamp?
KTS (Key-based Timestamp Service)
– Issues a timestamp for each transaction
– gen_ts(key k): generate a new timestamp w.r.t. key k
– last_ts(key k): return the last timestamp issued for key k
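The two KTS operations can be illustrated with per-key counters. This is a single-process sketch: the class name `KeyTimestampService`, the integer counters starting at 1, and the value 0 returned for a key with no timestamp yet are assumptions for the example, and the distribution of counters over peers is left out.

```python
class KeyTimestampService:
    """Per-key monotonic counters standing in for the KTS."""

    def __init__(self):
        self._counters = {}

    def gen_ts(self, k) -> int:
        """Issue a new timestamp for key k, larger than any issued before for k."""
        self._counters[k] = self._counters.get(k, 0) + 1
        return self._counters[k]

    def last_ts(self, k) -> int:
        """Return the last timestamp issued for key k (0 if none, by assumption)."""
        return self._counters.get(k, 0)


kts = KeyTimestampService()
t1 = kts.gen_ts("k")
t2 = kts.gen_ts("k")
```

Timestamps are only comparable for the same key, which is why each key gets its own counter.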

The New DHT Functions
Based on the KTS service
Insert(key k, FileX D, hash function set H_r)
– Insert or update the file with identity key k in the DHT
Retrieve(k, H_r)
– Retrieve the latest copy of the file with identity key k

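Insert A File
User U inserts P.avi
– Compute the key: h(P.avi) = k
– Ask the KTS timestamp service for a timestamp: gen_ts(k) = t_A
– Store the stamped file at each replica: put_g(k, (t_A, P.avi)) at peer A, where g(k) = k1, and put_g2(k, (t_A, P.avi)) at peer C, where g2(k) = k2
[Figure: peers A–H, user U and the KTS timestamp service]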
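The insert flow on this slide can be sketched as one function. The names `insert`, `kts_counters` and `replica_stores` are hypothetical: a plain dict stands in for the KTS, and a list of dicts stands in for the peers reached through g, g2, and so on.

```python
def insert(k, data, kts_counters: dict, replica_stores: list) -> int:
    """Stamp the data with a fresh timestamp and write it to every replica."""
    # Stands in for gen_ts(k): next value of the per-key counter.
    ts = kts_counters.get(k, 0) + 1
    kts_counters[k] = ts
    # One store per replication hash function (g, g2, ...).
    for store in replica_stores:
        current = store.get(k)
        # A replica accepts the write only if it is newer than what it holds.
        if current is None or ts > current[0]:
            store[k] = (ts, data)
    return ts


counters = {}
peer_a, peer_c = {}, {}
t_first = insert("k", "P.avi v1", counters, [peer_a, peer_c])
t_second = insert("k", "P.avi v2", counters, [peer_a])  # peer C is unreachable
```

After the second call peer A holds the current copy and peer C a stale one, which is the situation the retrieve operation has to detect.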

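Retrieve A File
User U gets P.avi
– Compute the key: h(P.avi) = k
– Ask the KTS timestamp service for the latest timestamp: last_ts(k) = t_A
– Query the replicas: get_g(k) at peer A, where g(k) = k1, and get_g2(k) at peer C, where g2(k) = k2
– The replies are (t_0, P.avi) and (t_A, P.avi); the copy stamped t_A matches last_ts(k), so it is the current one
[Figure: peers A–H, user U and the KTS timestamp service]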
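The retrieval logic can be sketched as a loop that stops at the first replica whose timestamp equals last_ts(k). The function name `retrieve_current` and the fallback to the most recent stale copy when no current replica is reachable are assumptions made for this example; `None` models a peer that is unavailable or holds no copy.

```python
from typing import Optional, Sequence, Tuple

Replica = Tuple[int, str]  # (timestamp, file contents)


def retrieve_current(replicas: Sequence[Optional[Replica]], last_ts: int):
    """Return (best replica found, number of replicas probed)."""
    best = None
    probed = 0
    for replica in replicas:
        probed += 1
        if replica is None:
            continue  # peer unavailable or holds no copy
        if replica[0] == last_ts:
            return replica, probed  # current copy found, stop early
        if best is None or replica[0] > best[0]:
            best = replica  # remember the freshest stale copy
    return best, probed


found, tries = retrieve_current([(1, "old"), (3, "new"), (2, "mid")], last_ts=3)
```

Stopping early is what keeps the cost below fetching every replica.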

Update A File
put_g(k, (ts_x, File D))
If (ts_x > ts_0) then
– Update File D
Key   TS     File
k     ts_0   File D (P.avi)
k1    ts_1   File D1 (X.mp3)
k2    ts_2   File D2 (Y.m4v)
k3    ts_3   File D3 (Z.tar)
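The timestamp check a peer applies to an incoming put can be sketched as below. The function name `apply_put` and the dict-based local table are illustrative assumptions; the rule itself is the slide's condition ts_x > ts_0.

```python
def apply_put(table: dict, k, ts_x: int, data) -> bool:
    """Apply put(k, (ts_x, data)) only if ts_x is newer than the stored timestamp."""
    current = table.get(k)
    if current is None or ts_x > current[0]:
        table[k] = (ts_x, data)
        return True
    return False  # stale or duplicate write is ignored


table = {"k": (5, "P.avi")}
accepted_newer = apply_put(table, "k", 7, "P.avi (edited)")
accepted_older = apply_put(table, "k", 6, "P.avi (late concurrent edit)")
```

With this rule, concurrent updates can arrive in any order and the copy with the largest timestamp still wins.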

Retrieval Cost Analysis
C = C_kts + N * C_ret
– C_kts = C_ret = O(log n), n = number of peers
– N: number of tries needed to get the latest copy
– |H_r| = number of replicas in the system
Let X be the random variable for N and p_t the probability of finding a fresh copy
Prob(X = i) = p_t * (1 - p_t)^(i-1)
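The cost formula can be evaluated numerically. The sketch below assumes that retrieval gives up after probing all |H_r| replicas, so the number of tries is min(X, |H_r|) with X geometric as on the slide; that cap, the function names and the sample message costs are assumptions for illustration, not figures from the talk.

```python
def expected_tries(p_t: float, num_replicas: int) -> float:
    """E[min(X, R)] where Prob(X = i) = p_t * (1 - p_t)**(i - 1)."""
    total = 0.0
    for i in range(1, num_replicas):
        total += i * p_t * (1 - p_t) ** (i - 1)
    # Remaining probability mass: the first R - 1 tries all missed,
    # so exactly R replicas are probed.
    total += num_replicas * (1 - p_t) ** (num_replicas - 1)
    return total


def expected_cost(c_kts: float, c_ret: float, p_t: float, num_replicas: int) -> float:
    """C = C_kts + E[N] * C_ret, with costs in any consistent unit."""
    return c_kts + expected_tries(p_t, num_replicas) * c_ret


tries = expected_tries(0.3, 13)
cost = expected_cost(c_kts=10.0, c_ret=10.0, p_t=0.3, num_replicas=13)
```

The sum has the closed form (1 - (1 - p_t)^R) / p_t, which is always below both 1/p_t and R.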

Retrieval Cost Analysis
Then, how can we get a timestamp?
– Key-based Timestamp Service (KTS)

The KTS Service
Use the same DHT, but with a different hash function h_ts
[Figure: steps 1–4 of a timestamp request for key k; the request Req(k, h_ts) is routed through the hash table to the responsible peer p]

The KTS Service
How can node p generate timestamps w.r.t. key k?
Direct initialization
– Receive the counters from a leaving peer
– The DHT distributes the load of the leaving peer to its neighbors
Indirect initialization
– Send a file request w.r.t. key k to obtain the latest timestamp
– Takes place if the leaving peer fails

The KTS Service
Indirect initialization
– p_f: the probability that indirect initialization fails
– p_f = (1 - p_t)^|H_r|
– If p_t = 30% and |H_r| = 13, then p_f < 1%
After initialization, the timestamp is increased on every timestamp request
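The numbers on this slide can be checked directly. The function names below are hypothetical; the formula is the slide's p_f = (1 - p_t)^|H_r|, and the loop simply searches for the smallest replication level that pushes p_f under a target.

```python
def init_failure_probability(p_t: float, num_replicas: int) -> float:
    """p_f = (1 - p_t) ** |H_r|: no replica holds a current copy."""
    return (1 - p_t) ** num_replicas


def min_replicas(p_t: float, target: float) -> int:
    """Smallest replication level with failure probability strictly below target."""
    replicas = 1
    while init_failure_probability(p_t, replicas) >= target:
        replicas += 1
    return replicas


p_f = init_failure_probability(0.3, 13)
needed = min_replicas(0.3, 0.01)
```

For p_t = 30%, twelve replicas still leave p_f slightly above 1%, so 13 is the smallest level that meets the slide's bound.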

Experiments And Simulations
Environments
– 64-node cluster
– 10,000 nodes on the SimJava platform
Metrics
– Response time: time to return a current replica in response to a query
– Communication cost: number of messages sent to answer a query

The Competitor - BRICKS
Uses a function to map key k to multiple keys (k1, k2, k3, k4, …)
Each replica has a version number
– Concurrent update problems
– Must retrieve all replicas to find the newest one

Response Time VS DHT Size

Communication Cost VS DHT Size

Response Time VS Number of Replicas

Failure Rate VS Response Time

Conclusion
Pros
– Using the DHT itself to provide the timestamp service is a smart design
– Addresses the concurrent update problem
– Easy to apply to existing DHTs
Cons
– The KTS service adds communication overhead

Thank You