P2p, Spring 05 1 Topics in Database Systems: Data Management in Peer-to-Peer Systems March 29, 2005.



Outline
- More on search strategies in unstructured P2P
- Replication
  - General
  - Review of structured
  - Techniques for unstructured

Notes
- No class on April 5
- Next assignment (posted tomorrow on the web page)
  - Present one paper (3 papers, 1 per group)
  - MAX 35’ each
  - Cover: Topology, Join/Search, Evaluation, Other Issues
  - The presentation should also include a short discussion (3-5 slides) of what replication strategies you think could be applied in the system you will be presenting

Topics in Database Systems: Data Management in Peer-to-Peer Systems
D. Tsoumakos and N. Roussopoulos, “A Comparison of Peer-to-Peer Search Methods”, WebDB03

Overview
- Centralized: constantly-updated directory hosted at central locations (do not scale well, updates, single points of failure)
- Decentralized but structured: the overlay topology is highly controlled, and files (or metadata/index) are not placed at random nodes but at specified locations; “loosely” vs “highly” structured (DHT)
- Decentralized and unstructured: peers connect in an ad-hoc fashion; the location of documents/metadata is not controlled by the system
  - No guarantee for the success of a search
  - No bounds on search time

Flooding on Overlays
[Figure, four slides: a node asks for xyz.mp3 and the query is flooded across the overlay; another node holds xyz.mp3.]
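The flooding shown in the figure can be sketched as a breadth-first search bounded by a TTL. This is a minimal illustration, not the Gnutella wire protocol: the function name `flood`, the adjacency-list `graph`, and the `holders` map are all assumptions made for the example, and a node here also sends a copy back to the neighbor it received the query from (real clients usually do not).

```python
from collections import deque

def flood(graph, source, wanted, holders, ttl):
    """TTL-bounded flood (BFS). Returns (nodes holding the file, messages sent)."""
    seen = {source}                      # nodes that already processed the query
    frontier = deque([(source, ttl)])
    hits, messages = set(), 0
    while frontier:
        node, remaining = frontier.popleft()
        if wanted in holders.get(node, ()):
            hits.add(node)
        if remaining == 0:               # TTL exhausted: do not forward further
            continue
        for nb in graph[node]:
            messages += 1                # every forwarded copy costs one message
            if nb not in seen:           # duplicate copies are dropped by the receiver
                seen.add(nb)
                frontier.append((nb, remaining - 1))
    return hits, messages

# Hypothetical five-node overlay; only E stores xyz.mp3.
graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"],
         "D": ["B", "C", "E"], "E": ["D"]}
holders = {"E": {"xyz.mp3"}}
print(flood(graph, "A", "xyz.mp3", holders, ttl=2))
print(flood(graph, "A", "xyz.mp3", holders, ttl=3))
```

In this toy overlay a TTL of 2 stops one hop short of E, while a TTL of 3 finds it at the cost of more messages, many of them duplicates.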

Search in Unstructured P2P
BFS vs DFS
- BFS: better response time, but contacts a larger number of nodes (message overhead per node and overall)
- Note: search in BFS continues (if the TTL has not been reached) even if the object has been located on a different path
Recursive vs Iterative
- During search, does the node issuing the query contact the other nodes directly (iterative), or is the query forwarded from node to node (recursive)?
- Does the result follow the same path?

Search in Unstructured P2P
Two general types of search in unstructured P2P:
- Blind: try to propagate the query to a sufficient number of nodes (example: Gnutella)
- Informed: utilize information about document locations (example: Routing Indexes)
Informed search increases the cost of join in exchange for an improved search cost

Blind Search Methods
Gnutella: use flooding (BFS) to contact all accessible nodes within the TTL value
- Huge overhead to a large number of peers, plus overall network traffic
- Hard to find unpopular items
- Up to 60% bandwidth consumption of the total Internet traffic
Modified-BFS: choose only a ratio of the neighbors (some random subset)
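A minimal sketch of Modified-BFS, assuming each node forwards the query to a random fraction `ratio` of its neighbors (rounded up, at least one). The function name, the `ratio` parameter, and the ring-with-chords test overlay are assumptions for illustration only.

```python
import math
import random
from collections import deque

def modified_bfs(graph, source, ttl, ratio, rng):
    """Flood where each node forwards to a random subset of its neighbors.
    Returns (set of reached nodes, messages sent)."""
    seen = {source}
    frontier = deque([(source, ttl)])
    messages = 0
    while frontier:
        node, remaining = frontier.popleft()
        if remaining == 0:
            continue
        neighbors = graph[node]
        # Number of neighbors to contact: a fixed ratio, rounded up.
        k = min(len(neighbors), max(1, math.ceil(ratio * len(neighbors))))
        for nb in rng.sample(neighbors, k):
            messages += 1
            if nb not in seen:
                seen.add(nb)
                frontier.append((nb, remaining - 1))
    return seen, messages

# Hypothetical overlay: a 12-node ring where each node also links to the opposite node.
n = 12
graph = {i: [(i - 1) % n, (i + 1) % n, (i + n // 2) % n] for i in range(n)}
full_reach, full_msgs = modified_bfs(graph, 0, 3, 1.0, random.Random(1))
half_reach, half_msgs = modified_bfs(graph, 0, 3, 0.5, random.Random(1))
print(len(full_reach), full_msgs, len(half_reach), half_msgs)
```

With `ratio=1.0` the function degenerates to plain flooding; a smaller ratio can only reach a subset of those nodes and never sends more messages, which is the trade-off the slide points at.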

Blind Search Methods
Iterative Deepening: start BFS with a small TTL and repeat the BFS at increasing depths if the first BFS fails
- Works well when there is some stop condition and a “small” flood will satisfy the query
- Otherwise, even bigger loads than standard flooding (more later ...)
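The trade-off can be made concrete with a small sketch. It assumes the simplest variant, in which every round restarts the flood from the source with a larger TTL; the names `flood`, `iterative_deepening`, `depths` and `wanted_hits` are illustrative, and `holders` is simply the set of nodes that store the object.

```python
from collections import deque

def flood(graph, source, holders, ttl):
    """TTL-bounded BFS flood. Returns (hits, messages)."""
    seen, frontier = {source}, deque([(source, ttl)])
    hits, messages = set(), 0
    while frontier:
        node, remaining = frontier.popleft()
        if node in holders:
            hits.add(node)
        if remaining == 0:
            continue
        for nb in graph[node]:
            messages += 1
            if nb not in seen:
                seen.add(nb)
                frontier.append((nb, remaining - 1))
    return hits, messages

def iterative_deepening(graph, source, holders, depths, wanted_hits=1):
    """Repeat the flood with growing TTLs until the stop condition is met.
    Returns (hits, TTL that succeeded or None, total messages over all rounds)."""
    total = 0
    for ttl in depths:
        hits, messages = flood(graph, source, holders, ttl)
        total += messages
        if len(hits) >= wanted_hits:     # stop condition: enough results found
            return hits, ttl, total
    return set(), None, total

# Hypothetical chain overlay A - B - C - D - E.
chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"]}
print(iterative_deepening(chain, "A", {"B"}, depths=(1, 2, 3)))  # object is nearby
print(iterative_deepening(chain, "A", {"D"}, depths=(1, 2, 3)))  # object is far away
print(flood(chain, "A", {"D"}, ttl=3))                           # single flood, for comparison
```

When the object sits one hop away, the first small flood is enough; when it sits three hops away, the repeated rounds cost more messages in total than a single flood with TTL 3.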

Blind Search Methods
Random Walks: the node that poses the query sends out k query messages to an equal number of randomly chosen neighbors
- Each message follows its own path, at each step randomly choosing one neighbor to forward it to
- Each path is a walker
- Two methods to terminate each walker:
  - TTL-based, or
  - checking method (the walkers periodically check with the query source whether the stop condition has been met)
- Reduces the number of messages to k x TTL in the worst case
- Some kind of local load-balancing
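A minimal sketch of k random walkers with TTL-based termination. The checking method is left out, and the function name, the `holders` set, and the test overlay are assumptions made for the example. A walker also stops as soon as it reaches a node that holds the object.

```python
import random

def random_walk_search(graph, source, holders, k, ttl, rng):
    """Send up to k walkers from source; each takes at most ttl hops.
    Returns (hits, messages), with messages <= k * ttl."""
    hits, messages = set(), 0
    # The source picks k distinct neighbors (fewer if it has fewer neighbors).
    starts = rng.sample(graph[source], min(k, len(graph[source])))
    for node in starts:
        messages += 1                    # first hop: source -> chosen neighbor
        steps = 1
        while True:
            if node in holders:          # walker found the object
                hits.add(node)
                break
            if steps == ttl:             # TTL-based termination
                break
            node = rng.choice(graph[node])   # forward to one random neighbor
            messages += 1
            steps += 1
    return hits, messages

# Hypothetical overlay: a 12-node ring where each node also links to the opposite node.
n = 12
graph = {i: [(i - 1) % n, (i + 1) % n, (i + n // 2) % n] for i in range(n)}
hits, messages = random_walk_search(graph, 0, {4, 9}, k=2, ttl=5, rng=random.Random(7))
print(hits, messages)
```

The message count never exceeds k x TTL, in contrast to flooding, where it grows with the number of edges inside the TTL horizon.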

Blind Search Methods
Random Walks: in addition, the protocol biases its walks towards high-degree nodes
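One way to bias a walker towards high-degree nodes is to pick the next hop with probability proportional to each neighbor's degree. The slide does not say how the bias is computed, so the proportional weighting below is an assumption, as are the function name and the toy overlay.

```python
import random

def biased_next_hop(graph, node, rng):
    """Pick a neighbor with probability proportional to its degree (assumed rule)."""
    neighbors = graph[node]
    weights = [len(graph[nb]) for nb in neighbors]
    return rng.choices(neighbors, weights=weights, k=1)[0]

# Hypothetical overlay: S has one well-connected neighbor and one poorly connected one.
graph = {"S": ["hub", "leaf"], "hub": ["S", "a", "b", "c", "d"], "leaf": ["S"],
         "a": ["hub"], "b": ["hub"], "c": ["hub"], "d": ["hub"]}
rng = random.Random(42)
picks = [biased_next_hop(graph, "S", rng) for _ in range(2000)]
print(picks.count("hub"), picks.count("leaf"))
```

Here the hub has degree 5 and the leaf degree 1, so the hub should be chosen far more often than the leaf.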

Blind Search Methods
Using Super-nodes:
- Super (or ultra) peers are connected to each other
- Each super-peer is also connected with a number of leaf nodes
- Routing among the super-peers
- The super-peers then contact their leaf nodes

Blind Search Methods
Using Super-nodes: Gnutella2
- When a super-peer (or hub) receives a query from a leaf, it forwards it to its relevant leaves and to neighboring super-peers
- The hubs process the query locally and forward it to their relevant leaves
- Neighboring super-peers regularly exchange local repository tables to filter out traffic between them
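The hub behavior described above can be sketched roughly as follows. This is not the Gnutella2 protocol: the `Hub` class, its methods, and the idea of modeling the exchanged repository table as a plain set of file names are all simplifying assumptions.

```python
class Hub:
    """Toy super-peer: knows its leaves' files and its neighboring hubs."""

    def __init__(self, name):
        self.name = name
        self.leaf_files = {}     # leaf name -> set of file names stored there
        self.neighbors = []      # neighboring hubs

    def table(self):
        """Repository table a hub would share with its neighbors (assumed: all file names)."""
        return set().union(*self.leaf_files.values())

    def relevant_leaves(self, filename):
        return sorted(leaf for leaf, files in self.leaf_files.items() if filename in files)

    def query_from_leaf(self, leaf, filename):
        """Return (hub, leaf) pairs that can serve the file, plus hubs contacted."""
        results = [(self.name, l) for l in self.relevant_leaves(filename) if l != leaf]
        contacted = []
        for hub in self.neighbors:
            if filename not in hub.table():   # filtered out by the exchanged table
                continue
            contacted.append(hub.name)
            results += [(hub.name, l) for l in hub.relevant_leaves(filename)]
        return results, contacted

h1, h2, h3 = Hub("h1"), Hub("h2"), Hub("h3")
h1.leaf_files = {"a": {"x"}, "b": {"y"}}
h2.leaf_files = {"c": {"x"}}
h3.leaf_files = {"d": {"z"}}
h1.neighbors = [h2, h3]
print(h1.query_from_leaf("b", "x"))
```

In this example the query for `x` is sent to `h2` but not to `h3`, because `h3`'s table shows that none of its leaves holds the file.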

Blind Search Methods
- Ultrapeers can be installed (KaZaA) or self-promoted (Gnutella)
- Interconnection between the superpeers

Informed Search Methods
Intelligent BFS
- Nodes store simple statistics on their neighbors: (query, NeighborID) tuples for recently answered requests from or through their neighbors, so they can rank them
- For each query, a node finds similar ones and selects a direction
- How?

Informed Search Methods
Intelligent or Directed BFS
Heuristics for Selecting Direction
- >RES: returned most results for previous queries
- <TIME: shortest satisfaction time
- <HOPS: min hops for results
- >MSG: forwarded the largest number of messages (all types), which suggests that the neighbor is stable
- <QLEN: shortest queue
- <LAT: shortest latency
- >DEG: highest degree
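As an illustration of the >RES heuristic, the sketch below ranks neighbors by the number of results they returned for similar past queries. The history format `(query, neighbor, results)`, the keyword-overlap notion of "similar", and the function name are assumptions; the slides do not fix any of them.

```python
from collections import defaultdict

def rank_neighbors(history, query, top=2):
    """Rank neighbors by results returned for past queries that share a keyword (>RES)."""
    words = set(query.lower().split())
    score = defaultdict(int)
    for past_query, neighbor, results in history:
        if words & set(past_query.lower().split()):   # assumed similarity test
            score[neighbor] += results
    # Highest score first; ties broken by neighbor name for determinism.
    return sorted(score, key=lambda nb: (-score[nb], nb))[:top]

# Hypothetical statistics a node has collected about its neighbors.
history = [
    ("beatles yesterday", "n1", 3),
    ("beatles help", "n2", 5),
    ("mozart requiem", "n3", 9),
    ("beatles", "n1", 4),
]
print(rank_neighbors(history, "beatles abbey road"))
print(rank_neighbors(history, "mozart", top=1))
```

The other heuristics (<TIME, <HOPS, >MSG, <QLEN, <LAT, >DEG) would follow the same pattern with a different per-neighbor statistic and sort order.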

Informed Search Methods
Intelligent or Directed BFS
- No negative feedback
- Depends on the assumption that nodes specialize in certain documents

Informed Search Methods
APS
- Again, each node keeps a local index with one entry per neighbor for each object it has requested; the entry reflects the relative probability of that neighbor being chosen to forward the query
- k independent walkers and probabilistic forwarding
- Each node forwards the query to one of its neighbors based on the local index
- If a walker succeeds, the probability is increased; otherwise it is decreased
- How? The update takes place after a walker miss (optimistic update) or after a hit (pessimistic update)
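The index update can be sketched as below. The reading assumed here is that an optimistic node raises the chosen neighbor's value when forwarding and corrects it downward only after a miss, while a pessimistic node lowers it when forwarding and corrects it upward only after a hit. The class name, the initial value of 30, the step of 10, the correction of twice the step, and the floor of 1 are illustrative assumptions, not values taken from the slides.

```python
import random

class ApsNode:
    """Toy APS node: one index value per (object, neighbor)."""

    def __init__(self, neighbors, initial=30, step=10):
        self.neighbors = list(neighbors)
        self.initial, self.step = initial, step
        self.index = {}                      # object -> {neighbor: value}

    def _row(self, obj):
        return self.index.setdefault(obj, {nb: self.initial for nb in self.neighbors})

    def probability(self, obj, neighbor):
        row = self._row(obj)
        return row[neighbor] / sum(row.values())

    def forward(self, obj, rng, optimistic=True):
        """Choose a neighbor with probability proportional to its index value."""
        row = self._row(obj)
        nb = rng.choices(self.neighbors, weights=[row[n] for n in self.neighbors])[0]
        row[nb] += self.step if optimistic else -self.step
        row[nb] = max(1, row[nb])            # keep weights positive (assumption)
        return nb

    def correct(self, obj, neighbor, hit, optimistic=True):
        """Apply the deferred correction once the walker's outcome is known."""
        row = self._row(obj)
        if optimistic and not hit:
            row[neighbor] = max(1, row[neighbor] - 2 * self.step)
        elif not optimistic and hit:
            row[neighbor] += 2 * self.step

node = ApsNode(["n1", "n2"])
chosen = node.forward("song", random.Random(0))         # optimistic: +10 up front
node.correct("song", chosen, hit=False)                 # miss: net effect is -10
print(chosen, node.index["song"], node.probability("song", chosen))
```

After a miss under the optimistic policy, the chosen neighbor ends up below its initial value, so later walkers for the same object are less likely to go that way.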

Informed Search Methods
Local Index
- Each node indexes all files stored at all nodes within a certain radius r and can answer queries on behalf of them
- Search proceeds in steps of r
- Flood inside each r with TTL = r
- Increased cost for join/leave
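A minimal sketch of how a node could build its radius-r index, assuming it can learn the file lists of all nodes within r hops (in practice this information is exchanged when nodes join and leave, which is the extra cost the slide mentions). The function name and the `files` map are illustrative.

```python
from collections import deque

def build_local_index(graph, files, node, r):
    """Index every file stored at nodes within r hops of `node` (including itself).
    Returns {file name: set of nodes that store it}."""
    index = {}
    seen = {node}
    frontier = deque([(node, 0)])
    while frontier:
        current, depth = frontier.popleft()
        for name in files.get(current, ()):
            index.setdefault(name, set()).add(current)
        if depth < r:                        # do not look beyond the radius
            for nb in graph[current]:
                if nb not in seen:
                    seen.add(nb)
                    frontier.append((nb, depth + 1))
    return index

# Hypothetical chain overlay A - B - C - D - E with three stored files.
chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"]}
files = {"A": {"a"}, "C": {"c"}, "E": {"e"}}
print(build_local_index(chain, files, "C", r=1))
print(build_local_index(chain, files, "C", r=2))
```

With r = 2, node C can answer queries for files held at A and E without those nodes ever receiving the query.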