Count / Top-k Continuous Queries on P2P Networks 01/11/2006.


1 Count / Top-k Continuous Queries on P2P Networks 01/11/2006

2 Outline
- Problem Definition
- P2P Architecture
- Count
- Top-K
- Experiment Setup
- Future Work

3 Streaming Data in P2P
- P2P: dynamically changing topology, large scale, …
- Streaming data: continuous, unbounded, rapid, time-varying, noisy
- P2P + streaming data: dynamic in both data and topology

4 Objective and Goal
- Objective: issue a continuous query to estimate count and top-K
- Goal:
  - Lower the communication cost
  - Lightweight maintenance
  - Approximate answers
  - An adaptive and progressive approach

5 Naïve approach
Flood the overlay continuously
- Pros: closer to the exact answer
- Cons: network congestion; still not real-time
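To make the congestion cost concrete, here is a minimal sketch of a single flooding round on a small overlay. The overlay, the peer names and the function `flood_message_count` are hypothetical, and the sketch assumes each peer forwards the query once to every neighbor except the one it first received it from. A continuous query would repeat this cost in every round.

```python
from collections import deque

def flood_message_count(overlay, issuer):
    """Count the messages sent when `issuer` floods one query over `overlay`.

    Assumption (not from the slides): every peer forwards the query exactly
    once, to all of its neighbors except the one it first heard it from.
    """
    seen = {issuer}
    queue = deque([(issuer, None)])  # (peer, neighbor the query came from)
    messages = 0
    while queue:
        peer, sender = queue.popleft()
        for neighbor in overlay[peer]:
            if neighbor == sender:
                continue
            # The message is sent even if the neighbor already saw the query.
            messages += 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, peer))
    return messages

# Hypothetical 4-peer overlay, given as undirected adjacency lists.
overlay = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "D"],
    "D": ["B", "C"],
}
messages_per_round = flood_message_count(overlay, "A")
```

On this overlay with 5 links, one round already costs 7 messages, because redundant copies reach peers that have seen the query.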

6 The State of the Art
- Count: existing work focuses on one-time answers in P2P, or deals with streaming data only
- Top-K: existing work covers P2P environments without streaming data, or distributed environments that are not P2P

7 P2P architecture
Assumptions
- Hierarchical P2P (the focus)
  - Super-peer hierarchical structure
  - The query issuer is a super-peer
  - Super-peers connect with other super-peers
  - Each peer belongs to only one super-peer
- Pure unstructured P2P

8 Big picture
- Group
- Accumulate information within a group based on the constraint and statistics
- Set constraint
- Report changes
- Approximate answer

9 Group in hierarchical P2P (figure labels: Issuer, Coordinator, Peer)

10 Group in hierarchical P2P (figure with numbered nodes)

11 (figure with numbered nodes, continued)

12 (figure with numbered nodes, continued)

13 After partition (figure labels: Group1, Group2, Group3)
Assume we have N objects and K groups after partitioning.

14 User-specified Epsilon (figure labels: Group1, Group2, Group3)
User-specified ε (precision)

15 Consider a group
Figure labels: coordinator; nodes P1, P2, P3, P4; objects O1, O2, O3

16 Each node maintains the distribution information of the objects it owns
Figure labels: nodes P1, P2, P3, P4; per-node table with columns object, # and rate (R1, R2, R3, R4)

17 Initially: polling (figure labels: coordinator; nodes P1, P2, P3, P4)

18 Initially: polling (figure, continued)

19 Information at coordinator after polling
object #: 22 26 33 (figure labels: P1, P2, P3, P4)

20 Statistics information

object #   P1      P2      P3      P4      Δ
O1         1/1     6/6     10/10   5/5     22
O2         11/11   13/13   5/5     9/7     36
O3         15/15   6/6     3/3     9/9     33
R          0.3     0.2     -0.05   0.6
T          15      15      17      13

Legend: each cell is "latest real value / estimated value"; Δ is the change value for each object; R is the maximum changing rate (+/-) of objects in each peer; T is the updated time stamp.
object #: 22 26 33
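In the numbers on slide 20, the last column equals the sum of the estimated values over the four peers. A minimal sketch of that aggregation at the coordinator follows; the dictionary layout and the name `estimated_counts` are assumptions made for illustration, not the authors' data structure.

```python
# (latest real value, estimated value) per object and peer, copied from slide 20.
table = {
    "O1": {"P1": (1, 1), "P2": (6, 6), "P3": (10, 10), "P4": (5, 5)},
    "O2": {"P1": (11, 11), "P2": (13, 13), "P3": (5, 5), "P4": (9, 7)},
    "O3": {"P1": (15, 15), "P2": (6, 6), "P3": (3, 3), "P4": (9, 9)},
}

def estimated_counts(table):
    """Sum the estimated value of each object over all peers in the group."""
    return {
        obj: sum(estimate for _real, estimate in cells.values())
        for obj, cells in table.items()
    }

counts = estimated_counts(table)
```

With the slide's values this gives 22, 36 and 33 for O1, O2 and O3, matching the last column of the table.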

21 Update to Coordinator
Figure labels: (Δ11, Δ21, Δ31), (Δ12, Δ22, Δ32), (Δ13, Δ23, Δ33); T2

22 Calculate Count

23 Redistribute Epsilon
w_i = Max(Δ_i) / C_{x,0}, where x is the i-index of Max(Δ_i)
δ_i = w_i · ε · C_{x,0} / ∑ w_i
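The two formulas on slide 23 can be sketched as follows. The slide does not define every symbol, so this reading is an assumption: i indexes peers, Δ_i holds the changes peer i reported per object, x is the object with the largest change at peer i, and C_{x,0} is the initial count of that object. Using the magnitude of the change and falling back to equal weights when nothing changed are also assumptions, as are all names in the code.

```python
def redistribute_epsilon(deltas, initial_counts, epsilon):
    """Return one tolerance delta_i per peer.

    deltas[i][x]      -- change of object x reported by peer i (assumed layout)
    initial_counts[x] -- C_{x,0}, the initial count of object x (assumed meaning)
    epsilon           -- user-specified precision
    """
    weights, bases = [], []
    for peer_deltas in deltas:
        # x: the object with the largest change at this peer (by magnitude).
        x = max(peer_deltas, key=lambda obj: abs(peer_deltas[obj]))
        weights.append(abs(peer_deltas[x]) / initial_counts[x])  # w_i
        bases.append(initial_counts[x])                          # C_{x,0}
    total = sum(weights)
    if total == 0:
        # Assumption, not in the slides: with no reported change, weigh peers equally.
        weights = [1.0] * len(deltas)
        total = float(len(deltas))
    # delta_i = w_i * epsilon * C_{x,0} / sum(w)
    return [w * epsilon * c / total for w, c in zip(weights, bases)]

# Hypothetical input: two peers, two objects.
initial_counts = {"O1": 20, "O2": 40}
deltas = [{"O1": 2, "O2": 1}, {"O1": 1, "O2": 6}]
tolerances = redistribute_epsilon(deltas, initial_counts, epsilon=0.1)
```

Under this reading, the peer with the larger relative change receives the larger share of the tolerance (about 0.8 versus 2.4 in the hypothetical input).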

24 Visiting sequence (figure labels: P4, P3, P2, P1)
Pick the peers that would violate δ.
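The slides do not state how a violation is detected. One possible sketch, assuming the coordinator projects each peer's drift as |R| times the time elapsed since its last update T and visits the peers whose projected drift exceeds their tolerance δ, largest drift first. The drift model, the current time and the δ values below are all hypothetical; R and T are taken from slide 20.

```python
def peers_to_visit(rates, timestamps, tolerances, now):
    """Return the peers whose projected drift exceeds their tolerance.

    Assumed drift model (not from the slides): |R_i| * (now - T_i).
    Peers are ordered by projected drift, largest first.
    """
    drift = {
        peer: abs(rates[peer]) * (now - timestamps[peer])
        for peer in rates
    }
    violating = [peer for peer in drift if drift[peer] > tolerances[peer]]
    return sorted(violating, key=lambda peer: drift[peer], reverse=True)

# R and T rows from slide 20; the tolerances and `now` are made up for illustration.
rates = {"P1": 0.3, "P2": 0.2, "P3": -0.05, "P4": 0.6}
timestamps = {"P1": 15, "P2": 15, "P3": 17, "P4": 13}
tolerances = {"P1": 1.2, "P2": 1.2, "P3": 1.2, "P4": 1.2}
visit_order = peers_to_visit(rates, timestamps, tolerances, now=20)
```

Peers that stay within their tolerance are not contacted, which is where the communication savings over flooding would come from.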

25 Update information

Group   P1      P2      P3      P4      Δ
O1      1/1     6/6     10/10   8/8     -
O2      11/11   11/11   5/5     6/6     -
O3      15/15   5/5     3/3     11/11   -
R       0.3     0.4     -0.05   0.2
T       15      30      17      33

26 For the nodes that were not visited

Group   P1      P2      P3      P4      Δ
O1      1/2     6/6     10/9    8/8     25
O2      11/13   11/11   5/4     6/6     34
O3      15/18   5/5     3/2     11/11   36
R       0.3     0.4     -0.05   0.2
T       15      30      17      33

27 Un-notified Leave (figure labels: P1, P2, P3, P4)
Ping: P1 is dead. Remove P1's information.
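A minimal sketch of this cleanup step, assuming the coordinator keeps per-object estimates keyed by peer and has some liveness check available. The `is_alive` callback stands in for the ping and is hypothetical, as is the function name; the estimates are the estimated values from slide 26.

```python
def drop_dead_peers(estimates, is_alive):
    """Remove every peer that fails the liveness check; return the removed peers."""
    peers = {peer for cells in estimates.values() for peer in cells}
    dead = sorted(peer for peer in peers if not is_alive(peer))
    for cells in estimates.values():
        for peer in dead:
            cells.pop(peer, None)
    return dead

# Estimated values per object and peer, copied from slide 26.
estimates = {
    "O1": {"P1": 2, "P2": 6, "P3": 9, "P4": 8},
    "O2": {"P1": 13, "P2": 11, "P3": 4, "P4": 6},
    "O3": {"P1": 18, "P2": 5, "P3": 2, "P4": 11},
}

# Stand-in for the ping: here P1 is simply declared unreachable.
removed = drop_dead_peers(estimates, is_alive=lambda peer: peer != "P1")
counts_after_leave = {obj: sum(cells.values()) for obj, cells in estimates.items()}
```

After P1's information is removed, the counts drop from 25, 34 and 36 to 23, 21 and 18.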

28 Experiment Setup
Generate a synthetic data set from statistical distributions for
- Streaming data
- Lifetime of peers
Metrics
- Message size
- Communication cost
- Response latency
- Result accuracy

29 Top-K
Use regression to predict the reasonable trend of changes
- Once an updated result is required, the super-peer only needs to ask the doubtful peers about the doubtful objects
- It then updates its counting list and returns the top-k objects
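The slide does not say which regression is used. As an illustration only, the sketch below fits an ordinary least-squares line to each object's past counts, extrapolates to the current time, and returns the k objects with the largest predicted counts. The history values and all names are hypothetical, and the step of re-asking doubtful peers is left out.

```python
import heapq

def fit_line(times, values):
    """Ordinary least-squares fit of values = slope * time + intercept."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    var_t = sum((t - mean_t) ** 2 for t in times)
    if var_t == 0:
        return 0.0, mean_v  # a single time point: assume no trend
    slope = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values)) / var_t
    return slope, mean_v - slope * mean_t

def predict_top_k(history, now, k):
    """Extrapolate each object's count to `now` and return the k largest."""
    predictions = {}
    for obj, samples in history.items():
        times, values = zip(*samples)
        slope, intercept = fit_line(times, values)
        predictions[obj] = slope * now + intercept
    return heapq.nlargest(k, predictions, key=predictions.get)

# Hypothetical (time, count) observations per object.
history = {
    "O1": [(0, 10), (1, 12), (2, 14)],
    "O2": [(0, 20), (1, 19), (2, 18)],
    "O3": [(0, 5), (1, 5), (2, 5)],
}
top_two = predict_top_k(history, now=4, k=2)
```

In this example O1 overtakes O2 by time 4 even though O2 had the larger last observed count, which is the kind of change a trend-based prediction is meant to catch.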

30 Future Work
- Connect and recommend latent good friends for each user
  - Good friends: the ones with the same interests (behaviors)
- Exploit the currently connected peers to discover good friends bit by bit
- Design a system that forms clusters reflecting the current interests of individual peers and connects them based on their similarity, using the user's social network

31 Advantages
- Reduce search time and diminish query traffic by using a friends list
- By exploiting the differing strengths of their arcs/edges/ties (the strength of friendship), social networks outperform random-walk networks in quickly finding target objects

32 Example (figure labels: Level 1, Level 2)

33 Example (figure labels: "has larger weight than", Score(N_i), Similarity)

