
Data Centric Storage: GHT. Brad Karp, UCL Computer Science, CS 4C38 / Z25, 17th January, 2006.


1 Data Centric Storage: GHT
Brad Karp
UCL Computer Science
CS 4C38 / Z25
17th January, 2006

2 One View of Sensor Networks: Querying Zebra Sightings
User remote; connected via base station
How do users pose queries?
–by event name (e.g., “Zebras?”)
–Geographic Hash Table (GHT)
In-network storage of data
Data placement, query routing built on geographic routing
[Figure: user issues Query(“Zebra”); reply is {(“Zebra”, i, [u, v]); (“Zebra”, j, [x, y])}, from node i at (u, v) and node j at (x, y)]

3 Problem: Data Dissemination in Sensornets
Sensors numerous and widely dispersed
Sensed data must reach remote user
Data dissemination problem:
–How best can we supply measured data to users?
Design drivers for system:
–Energy scarce
–Wireless media prone to contention

4 Context: Directed Diffusion [Estrin et al., 2000]
Data-centric routing: flood queries (interests) by name
Return any responses along reverse paths
[Figure: query “Zebra?” flooded through the network; responses (“Zebra”, i, [u, v]) from node i at (u, v) and (“Zebra”, j, [x, y]) from node j at (x, y)]

5 Assumptions, Metrics, Terminology
Large-scale networks with known geographic boundaries
Users on WAN, a few APs with WAN uplinks
Nodes know own geographic locations; often needed to annotate sensed data
Energy metrics
–Total usage: total number of packet transmissions
–Hotspot usage: maximum number of transmissions by any one node
Event: discrete, named object recognized by sensor (e.g., “Zebra”)
Query: request from user for data under same naming scheme

6 Outline
Motivation and Context
Canonical Data Dissemination Approaches
Geographic Hash Table (GHT) Service
Evaluation in Simulation
Summary

7 Canonical Approach: Local Storage
For n nodes, Q event names queried for, and D_q events detected with those names, cost (in pkts):
–Total:
–Hotspot: (at access point)
[Figure: query “Zebra?” flooded; responses (“Zebra”, i, [u, v]) from node i at (u, v) and (“Zebra”, j, [x, y]) from node j at (x, y)]

8 Canonical Approach: External Storage
For n nodes, D_t total events detected, cost (in pkts):
–Total:
–Hotspot: (at access point)
[Figure: events (“Zebra”, i, [u, v]) from (u, v), (“Zebra”, j, [x, y]) from (x, y), and (“Cat”, k, [s, t]) from (s, t) all sent to the access point]

9 Canonical Approach: Data-Centric Storage (DCS)
For n nodes, Q names queried, D_q of those events detected, cost (in pkts):
–Total (full enumeration):
–Total (summarization):
–Hotspot (full enumeration): (at access point)
–Hotspot (summarization): (at access point)
[Figure: user sends “Zebra?”; events detected by node i at (u, v) and node j at (x, y)]

10 Cost Comparison of Canonical Approaches
Local storage incurs greatest total message count as n grows
External storage always sends fewer total messages than DCS
When many more event types detected than queried for, DCS incurs least hotspot message count
DCS permits summarization of events (return multiple events in one packet)
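
The comparison above can be checked with a small cost model. The cost formulas on the preceding slides were not captured in this transcript, so the expressions below are an assumption: they use the common asymptotic model in which a flood costs about n packets and a message crossing the network costs about sqrt(n) packets. The function name and the sample workload are hypothetical.

```python
import math

def dissemination_costs(n, Q, D_total, D_q):
    """Approximate (total, hotspot) packet counts for each approach.

    Assumed model, not taken from the slides: a flood costs ~n packets,
    a point-to-point message across the network costs ~sqrt(n) packets.
    """
    hop = math.sqrt(n)
    return {
        # Queries are flooded; only queried-for events travel back.
        "local": (Q * n + D_q * hop, Q + D_q),
        # Every detected event is sent to the access point.
        "external": (D_total * hop, D_total),
        # Events go to a home node; queries go there; each event is returned.
        "dcs_full": (Q * hop + D_total * hop + D_q * hop, Q + D_q),
        # Same, but the home node returns one summary packet per query.
        "dcs_summary": (Q * hop + D_total * hop + Q * hop, 2 * Q),
    }

# Hypothetical workload: many events detected, few event names queried.
costs = dissemination_costs(n=1_000_000, Q=50, D_total=5_000, D_q=500)
for name, (total, hotspot) in costs.items():
    print(f"{name:12s} total={total:>12.0f} hotspot={hotspot}")
```

Under these assumptions local storage has the largest total, external storage has a lower total than DCS, and DCS with summarization has the smallest hotspot count, which matches the claims on the slide.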

11 Outline
Motivation and Context
Canonical Data Dissemination Approaches
Geographic Hash Table (GHT) Service
Evaluation in Simulation
Summary

12 Geographic Hash Table: A Sketch
Two operations:
–Put(k, v) stores event v under key k
–Get(k) retrieves event associated with key k
Hash key k into geo coordinates; store and retrieve events for that key at that location
–Spreads key space storage load evenly across network!
[Figure: H(“Zebra”) = (a, b); events from node i at (u, v) and node j at (x, y) are stored at (a, b); the user's query “Zebra?” is routed to (a, b)]
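
A minimal sketch of the two operations, assuming a SHA-256 hash and a 160 m x 160 m deployment area (both are illustrative choices, not taken from the slides). The names `geo_hash` and `GeoHashTable` are hypothetical, and a dictionary keyed by location stands in for routing a packet to the node nearest that point.

```python
import hashlib

WIDTH, HEIGHT = 160.0, 160.0  # assumed network bounds in metres

def geo_hash(key, width=WIDTH, height=HEIGHT):
    """Map a key to a point (x, y) inside the known network boundary."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Two separate 8-byte slices of the digest become fractions in [0, 1).
    fx = int.from_bytes(digest[:8], "big") / 2**64
    fy = int.from_bytes(digest[8:16], "big") / 2**64
    return (fx * width, fy * height)

class GeoHashTable:
    """Toy stand-in: storage is indexed by hashed location, not by node."""

    def __init__(self):
        self._store = {}

    def put(self, key, value):
        # Every producer of `key` computes the same location independently.
        self._store.setdefault(geo_hash(key), []).append((key, value))

    def get(self, key):
        # A query hashes the same key, so it reaches the same location.
        return [v for k, v in self._store.get(geo_hash(key), []) if k == key]

ght = GeoHashTable()
ght.put("Zebra", ("i", (12.0, 30.0)))
ght.put("Zebra", ("j", (80.0, 95.0)))
print(geo_hash("Zebra"), ght.get("Zebra"))
```

Because producers and queriers apply the same hash to the same name, they agree on a rendezvous location without any coordination.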

13 Design Criteria for Scalable, Robust DCS
Storage system must offer persistence despite node and link failures
–If node holding k changes, queries and data must make consistent choice of new node
Storage shouldn’t concentrate at any one node
Storage capacity should increase with node count
As ever, avoid traffic concentration, minimize message count

14 GHT: Home Nodes and Perimeters
Likely no node exactly at H(k); hash function ignorant of topology
Home node: closest node to point output by H(k)
Home perimeter: perimeter enclosing point output by H(k)
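
The home-node definition can be illustrated with a short sketch. It assumes global knowledge of node positions, which a real deployment does not have: in GHT the home node is found by the geographic routing layer walking the perimeter around H(k). The function name `home_node` and the node coordinates are hypothetical.

```python
import math

def home_node(nodes, target):
    """Return the id of the node closest to `target`.

    nodes: dict mapping node id -> (x, y) position.
    """
    return min(nodes, key=lambda nid: math.dist(nodes[nid], target))

nodes = {"a": (0.0, 0.0), "b": (10.0, 0.0), "c": (5.0, 5.0)}
hashed_point = (6.0, 4.0)  # stands in for H(k)

print(home_node(nodes, hashed_point))  # closest node is "c"

# If the home node fails, data and queries both fall back to the same
# next-closest node, because the choice depends only on geometry.
del nodes["c"]
print(home_node(nodes, hashed_point))  # now "b"
```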

15 Consistency: Perimeter Refresh Protocol (PRP)
(k,v) pairs replicated at all nodes on home perimeter
Non-home nodes on home perimeter: replica nodes
Home node sends refresh packets every T_h seconds, containing all (k,v), to H(k)
Receiver of refresh who is closer to H(k) than originator consumes it, initiates its own
Replica node becomes home node if its own refresh returns
Upon forwarding a refresh, node resets takeover timer for T_t seconds; upon expiration, node generates a refresh for k
Death timer: all nodes expire (k,v) pairs they cache after T_d seconds; reset every time refresh for k received.
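
The timer rules above can be sketched as per-key state kept at each node on the home perimeter. This is a simplified illustration, not the protocol implementation: it tracks a single timestamp for both timers, although the slide resets the takeover timer on forwarding and the death timer on receipt. T_h matches the 10 s refresh interval listed in the simulation parameters; the values chosen for T_t and T_d, and the class name `KeyState`, are assumptions.

```python
from dataclasses import dataclass

T_H = 10.0      # refresh interval in seconds
T_T = 2 * T_H   # takeover timeout: assumed value, must exceed T_H
T_D = 3 * T_H   # death timeout: assumed value, must exceed T_H

@dataclass
class KeyState:
    """Timer state one node keeps for one key k."""
    last_refresh: float  # time a refresh for k was last seen

    def on_refresh(self, now):
        # Seeing a refresh for k restarts both timers.
        self.last_refresh = now

    def should_take_over(self, now):
        # Takeover timer expired: this node generates its own refresh for k.
        return now - self.last_refresh >= T_T

    def is_expired(self, now):
        # Death timer expired: cached (k, v) pairs are dropped.
        return now - self.last_refresh >= T_D

state = KeyState(last_refresh=0.0)
state.on_refresh(5.0)
for t in (20.0, 25.0, 30.0, 35.0):
    print(t, state.should_take_over(t), state.is_expired(t))
```

With these values a replica that stops hearing refreshes first tries to take over (after 20 s of silence) and only later discards its copy (after 30 s), so data survives the loss of the home node.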

16 Outline
Motivation and Context
Canonical Data Dissemination Approaches
Geographic Hash Table (GHT) Service
Evaluation in Simulation
Summary

17 Simulation Parameters
Radio Type: 802.11 MAC and PHY, 40 m range
Node Density: 1 node / 256 m²
Mobility Rate: 0.0, 0.1, 1.0 m/s
Number of Nodes: 50, 100, 150, 200
Query Generation Rate: 2 qps
Query Start Time: 42 s
Refresh Interval: 10 s
Event Types: 20
Simulation Time: 300 s
Events Detected: 10 / type
Up/Down Duty Cycle: [0, 120] s up; [0, 60] s down
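
A short calculation shows what the density and radio range imply for the simulated topologies. It assumes a square deployment area and uniform node placement, neither of which is stated on the slide; the function names are hypothetical.

```python
import math

AREA_PER_NODE_M2 = 256.0  # from "1 node / 256 m²"
RADIO_RANGE_M = 40.0      # from the radio type row

def side_length(n):
    """Side of a square region holding n nodes at the stated density."""
    return math.sqrt(n * AREA_PER_NODE_M2)

def expected_nodes_in_range():
    """Mean number of nodes inside one radio disc, ignoring edge effects."""
    return math.pi * RADIO_RANGE_M ** 2 / AREA_PER_NODE_M2

for n in (50, 100, 150, 200):
    print(f"{n:3d} nodes -> {side_length(n):6.1f} m per side")
print(f"about {expected_nodes_in_range():.1f} nodes per radio disc")
```

For 100 nodes this gives a 160 m square, and roughly 20 nodes fall within one 40 m radio disc, so the simulated networks are dense.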

18 Query Success Rate w/Node Failures (100 Nodes)

19 Storage per Node w/Node Failures (100 Nodes)

20 Further Scaling and Robustness Results
Mean and maximum storage load per node decrease as node population increases
Query success rate above 96% for mobility rates of 0.1 m/s and 1 m/s
Query success rate degrades gracefully as alternation between up/down states accelerates
Validation of relative message costs of three canonical approaches in simulations of up to 100,000 nodes

21 Follow-On Work in DCS
Mapping geographic boundaries of a network; support hashing to inside a network with changing boundaries
DCS without geographic routing: GEM [NeSo03]
Range queries for GHT using K-D trees: DIM [LiGo03]
Assigning coordinates for geographic routing using only topological knowledge (not, e.g., GPS) [RaRa03]
Dealing with non-uniform node distributions; multiple hash functions [GaEs03]

22 DCS: Summary
Three canonical approaches will be useful in data dissemination for sensor networks: local storage, external storage, and data-centric storage
Summarization is a key advantage of the DCS approach in reducing hotspot usage and total usage; home node is a useful aggregation point
DCS offers most attractive performance vs. other canonical approaches in sensor applications with many nodes and many event types, not all of which are queried
GHT spreads storage load evenly on sensor networks
GHT offers robust persistence under node failures and mobility, because it binds data to fixed locations, rather than to “volatile” nodes

