Thomas Zahn, CST. Seminar: Information Management in the Web. Query Processing Over Peer-to-Peer Data Sharing Systems (UC Santa Barbara)


1 Seminar: Information Management in the Web. Query Processing Over Peer-to-Peer Data Sharing Systems (UC Santa Barbara)

2 Motivation
- Example: find all objects whose attribute values (NOT hash IDs!) are between 100 and 200.
- DHTs support range queries poorly.
- Because of hashing, semantically consecutive objects can be stored at "opposite" ends of the overlay.
- A separate lookup has to be issued for each value in the range.

3 Overlay Object Placement
[Figure: overlay ring over the ID space 0 to 2^128 - 1. The consecutive values 15, 16, 17 hash to scattered IDs: h(15) = d1a08e, h(16) = 3102ab, h(17) = d46a1c.]

4 Problem
- A separate lookup would have to be issued for each value in the range.
- This is theoretically possible for discrete sets (e.g. [10,11,12,…,50]),
- but completely impossible for continuous sets (e.g. [10.0, 50.0]).
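To make the cost concrete, here is a minimal sketch of how a plain DHT would have to serve a discrete range. The slides do not name a hash function; MD5 is used here only as an assumption because its 128-bit digest matches the 0 to 2^128 - 1 ID space, and `dht_key` and `lookups_for_range` are hypothetical helper names.

```python
import hashlib

def dht_key(value: int) -> int:
    """Map an attribute value to a 128-bit overlay ID (MD5 is an assumption)."""
    return int.from_bytes(hashlib.md5(str(value).encode()).digest(), "big")

def lookups_for_range(lo: int, hi: int) -> list[int]:
    """Return one lookup key per discrete value in [lo, hi]."""
    return [dht_key(v) for v in range(lo, hi + 1)]

# Neighboring values get unrelated IDs, so each one needs its own lookup.
for value in (15, 16, 17):
    print(value, format(dht_key(value), "032x"))

print(len(lookups_for_range(10, 50)), "lookups for the range [10, 50]")
```

For a continuous range such as [10.0, 50.0] there is no finite list of values to enumerate, so this approach has nothing to iterate over.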

5 General Concept (1)
- Uses a 2-dimensional CAN virtual space.
- The virtual space is partitioned into rectangular zones.
- Each zone is owned by an active node.
- Each node maintains a routing table (RT) with its neighbors.
[Figure: virtual space with corners (20,20), (80,20), (20,80), (80,80), partitioned into zones 1 to 7.]

6 General Concept (2)
- A node stores the results of queries whose ranges are hashed to its zone.
- A range query [a,b] is hashed to the target point (a,b), which determines the target zone and target node.
- The result of the range query is stored at the target node/zone.
[Figure: the same zone layout as on slide 5, showing an example range query.]

7 General Concept (3)
- Given two range queries r1: [a1,b1] and r2: [a2,b2]
- with target points t1 (for r1) and t2 (for r2):
  1. If a1 < a2, t1 lies to the left of t2.
  2. If b1 < b2, t1 lies below t2.
  3. t1 lies to the upper-left of t2 iff range r1 contains range r2.
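The equivalence in point 3 can be checked with a small sketch. It assumes closed ranges and treats boundaries as inclusive; the function names are illustrative and do not come from the slides.

```python
def target_point(lo, hi):
    """Hash a range query [lo, hi] to its target point (lo, hi)."""
    return (lo, hi)

def contains(r1, r2):
    """True if range r1 = (a1, b1) contains range r2 = (a2, b2)."""
    return r1[0] <= r2[0] and r2[1] <= r1[1]

def upper_left_of(t1, t2):
    """True if point t1 lies to the upper-left of (or on) point t2."""
    return t1[0] <= t2[0] and t1[1] >= t2[1]

r1, r2 = (10, 50), (20, 40)
print(contains(r1, r2))                                     # range view
print(upper_left_of(target_point(*r1), target_point(*r2)))  # geometric view
```

Both calls evaluate the same pair of inequalities, which is why a containing result can be searched for geometrically.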

8 General Concept (4)
- A range query [x,y] is hashed into zone A.
- If a prior range query result containing [x,y] exists, it must have been hashed to a point in the shaded region.
- Any zone intersecting that region can potentially contain such a result.
[Figure: zones A, B, C, D; target point (x,y) in zone A, with the shaded region to its upper left.]

9 General Concept (5)
- Two target points t1 (for r1) and t2 (for r2): t1 lies to the upper-left of t2 iff range r1 contains range r2.
- Diagonal zone: given zone z with corners (x1,y1),(x2,y2) and zone z' with corners (a1,b1),(a2,b2), z' is a diagonal zone of z if a2 ≤ x1 and b1 ≥ y2.
- Intuitively: z' lies diagonally above the upper-left corner of z.
- Only non-empty zones exist.
- A diagonal zone of z can answer ALL range queries that hash into z.
[Figure: zones z, z', B, C.]
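The diagonal-zone condition translates directly into a predicate. In this sketch a zone is described by its lower-left and upper-right corners; the `Zone` type and its field names are assumptions made for illustration.

```python
from typing import NamedTuple

class Zone(NamedTuple):
    left: float    # x1 (or a1)
    bottom: float  # y1 (or b1)
    right: float   # x2 (or a2)
    top: float     # y2 (or b2)

def is_diagonal_zone(z_prime: Zone, z: Zone) -> bool:
    """True if z_prime is a diagonal zone of z: a2 <= x1 and b1 >= y2."""
    return z_prime.right <= z.left and z_prime.bottom >= z.top

z = Zone(50, 50, 60, 60)
z_prime = Zone(20, 70, 40, 90)
print(is_diagonal_zone(z_prime, z))
```

Every point (p, q) in such a z' satisfies p ≤ x1 and q ≥ y2, so the range [p, q] contains every range whose target point falls inside z.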

10 Zone Maintenance
- Initially the entire hash space is a single zone assigned to one active node.
- Each active node has an RT containing its neighboring active nodes along with their zone coordinates.
- A zone splits when its load (storage and/or processing) becomes too high.
- The decision is made by the zone owner:
  - the owner contacts a passive node,
  - assigns it a portion of its zone,
  - and transfers the corresponding results and neighbor list.
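A zone split could look roughly like the sketch below. The slides do not say how the portion is chosen, so halving along the longer side is an assumption, as are the `ZoneNode` class and the half-open ownership rule; the neighbor-list transfer is omitted.

```python
from dataclasses import dataclass, field

@dataclass
class ZoneNode:
    left: float
    bottom: float
    right: float
    top: float
    # Cached results, keyed by the target point (a, b) of the range query.
    cache: dict = field(default_factory=dict)

    def owns(self, point) -> bool:
        """Half-open ownership test (an assumption, so zones never overlap)."""
        x, y = point
        return self.left <= x < self.right and self.bottom <= y < self.top

    def split(self) -> "ZoneNode":
        """Hand half of this zone to a newly activated node and return it."""
        if self.right - self.left >= self.top - self.bottom:
            mid = (self.left + self.right) / 2
            new = ZoneNode(mid, self.bottom, self.right, self.top)
            self.right = mid
        else:
            mid = (self.bottom + self.top) / 2
            new = ZoneNode(self.left, mid, self.right, self.top)
            self.top = mid
        # Transfer the results whose target points now belong to the new zone.
        for point in [p for p in self.cache if new.owns(p)]:
            new.cache[point] = self.cache.pop(point)
        return new

owner = ZoneNode(0, 0, 100, 100, {(10, 90): "result A", (60, 70): "result B"})
newcomer = owner.split()
print(owner.cache, newcomer.cache)
```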

11 Query Routing (1)
- The result is likely to be cached at the target zone.
- A range query is routed through the virtual space toward its target zone.
- Starting at the requesting zone, each zone passes the query on to a neighboring zone.
- A zone chooses the neighbor zone whose coordinates are closest to the target point.
- The process continues until the target zone is reached.

12 Query Routing (2)
- Simple way: compute the Euclidean distance between the target point and the center of a zone.
- This might not converge.
[Figure: zones 1 to 4 and target point t.]

13 Query Routing (3)
- The distance of a target t from a zone Z should be measured as the closest distance of t from the entire zone.
[Figure: zone Z surrounded by regions R1 to R8, arranged as R5 R1 R6 / R3 Z R4 / R7 R2 R8.]
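The point-to-zone distance can be sketched as a clamped Euclidean distance, with a greedy next-hop choice on top of it. Zones are given here as `(left, bottom, right, top)` tuples; that representation and the function names are assumptions for illustration.

```python
import math

def distance_to_zone(t, zone) -> float:
    """Closest distance from point t to the rectangle zone (0 if t is inside)."""
    left, bottom, right, top = zone
    dx = max(left - t[0], 0.0, t[0] - right)
    dy = max(bottom - t[1], 0.0, t[1] - top)
    return math.hypot(dx, dy)

def next_hop(t, neighbors: dict) -> str:
    """Pick the neighbor whose zone is closest to the target point t."""
    return min(neighbors, key=lambda name: distance_to_zone(t, neighbors[name]))

neighbors = {"A": (0, 0, 100, 10), "B": (0, 10, 10, 20)}
print(next_hop((2, 9), neighbors))
```

In this example the target (2, 9) lies inside the wide zone A, yet A's center (50, 5) is much farther from the target than B's center (5, 15). A center-based metric would therefore pick B, while the zone-based metric correctly picks A.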

14 Forwarding (1)
- The query reaches the target zone, which checks its local cache.
- If no result containing the query range is found, the query is forwarded.
- Only zones to the upper left of the target point can have a result containing the given range.
- Forwarding is similar to flooding.
- Forward Limit: 0.0 – 1.0
[Figure: zone layout with numbered zones.]
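Which zones qualify as forwarding candidates can be sketched as a rectangle test against the region to the upper left of the target point. Zones are `(left, bottom, right, top)` tuples and the function name is hypothetical. The Forward Limit is left out because the slides do not define how it is applied.

```python
def may_hold_containing_result(zone, target) -> bool:
    """True if the zone intersects the region {(x, y): x <= a and y >= b}."""
    left, bottom, right, top = zone
    a, b = target
    return left <= a and top >= b

zones = {
    "lower-left": (0, 0, 50, 50),
    "lower-right": (50, 0, 100, 50),
    "upper-left": (0, 50, 50, 100),
    "upper-right": (50, 50, 100, 100),
}
target = (30, 20)  # target point of the range query [30, 20 ... ] in the lower-left zone
candidates = [name for name, z in zones.items() if may_hold_containing_result(z, target)]
print(candidates)
```

With the target in the lower-left zone, only that zone and the one directly above it intersect the upper-left region, so the two zones on the right never need to see the query.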

15 Forwarding (2)
- Again: diagonal zones are especially interesting.
- A diagonal zone is guaranteed to have a result containing the given range, because every point in the diagonal zone lies to the upper-left of the target point and therefore contains the range.
- BUT: a zone may not have a diagonal zone.
[Figure: numbered zones 1 to 7.]

16 Updates
- A tuple t with range attribute A = k is updated.
- An update message is sent to the target zone containing (k,k).
- Tuple t is included in all ranges [a,b] such that a ≤ k ≤ b.
- The update is forwarded to all zones that lie to the upper left of the target zone.
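The set of cached results touched by an update can be sketched as a filter over cached ranges. The list-of-tuples cache and the name `affected_ranges` are assumptions for illustration; in the scheme itself these ranges are spread over the zones to the upper left of (k,k).

```python
def affected_ranges(cached_ranges, k):
    """Return the cached ranges (a, b) with a <= k <= b, i.e. those containing k."""
    return [(a, b) for (a, b) in cached_ranges if a <= k <= b]

cached = [(10, 50), (60, 80), (40, 45), (45, 45)]
print(affected_ranges(cached, 45))
```

Each affected range (a, b) satisfies a ≤ k and b ≥ k, so its target point lies to the upper left of (k,k), which is why the update has to travel in that direction.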

17 Conclusion
- Does not assume a naturally uniform distribution of attribute values.
- Efficient average path length (O( ) ).
- BUT: hot-spot nodes in the upper-left section lead to many splits, heavy partitioning, and longer path lengths.
- Cached results may not reflect the current result.
- Updates / deletions are expensive.

18 Questions?

