
1 Growth Codes: Maximizing Sensor Network Data Persistence
Abhinav Kamra, Vishal Misra, Dan Rubenstein (Department of Computer Science, Columbia University)
Jon Feldman (Google Labs)
DNA Research Group

2 Outline (ACM Sigcomm 2006)
• Problem Description
• Solution Approach: Growth Codes
• Experiments and Simulations
• Conclusions and Ongoing Work

3 Background: A generic sensor network
• Data follows a multi-hop path to the sink(s)
• A few node failures can break the data flow
• Generic aim: collect data from all nodes at the sink(s)
[Figure: sensor nodes holding sensed data x1 to x13 and forwarding it over multi-hop paths to the sink(s)]

4 Specific Context: Disaster Scenarios
• e.g., monitoring earthquakes, fires, floods, war zones
• Problems in this setting
  • Congestion near sink(s)
    • All nodes simultaneously forward data
    • Overwhelm sink(s) capacity
[Figure: congestion near the sink, drawn as a virtual queue]

5 Specific Context: Disaster Scenarios (2)
• Problems in this setting
  • Network collapsing: nodes failing rapidly
    • Pre-computed routes may fail
    • Data from failed nodes can be lost
• Data recovery from a subset of nodes is acceptable

6 Challenges
• Networking challenges:
  • Disaster scenarios: feedback often infeasible
  • Frequent disruptions to the routing tree, if one is set up
  • Difficult to predict node failures: sink locations unknown, surviving routes unknown
  • Difficult to synchronize nodes’ clocks
• Coding challenges:
  • Data source is distributed (among all sensor nodes)
  • Prior approaches (Turbo codes, LDPC codes) aim at fast complete recovery
  • Sensor nodes have very limited memory, CPU, and bandwidth

7 Maximize Data Persistence
• Data persistence: fraction of data that eventually reaches the sink(s)
• Objectives:
  • Preserve data from failed sensor nodes
  • Deliver data to sink(s) as fast as possible
• Example: 6 of 10 symbols reach the sink, so persistence = 60%
[Figure: ten source symbols, six of which have reached the sink]
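
The persistence figure on this slide is a simple ratio. As a minimal sketch, the function below computes it for the slide's example; the function name and the particular six symbol indices are assumptions made for illustration, since the slide only states that 6 of 10 symbols arrive.

```python
def persistence(recovered_ids, all_ids):
    """Fraction of distinct source symbols that have reached the sink."""
    all_ids = set(all_ids)
    return len(set(recovered_ids) & all_ids) / len(all_ids)


all_symbols = range(1, 11)       # ten source symbols, x1 .. x10
at_sink = [1, 2, 3, 5, 8, 9]     # hypothetical choice of the six that arrived
print(persistence(at_sink, all_symbols))  # prints 0.6
```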

8 Limitations of Previous Work
• Channel coding based (e.g., Turbo Codes [Anderson-ISIT94], LT Codes [Luby02])
  • Aim for complete recovery in minimum time
  • Difficult to implement with distributed sources
• Routing based (e.g., Directed Diffusion [Govindan00], Cougar [Yao-SIGMOD02])
  • Conjecture: too fragile (easily disrupted) for disaster scenarios

9 Our Approach
• Two main ideas
  • Randomized routing and replication
    • Avoid actively maintaining routes
    • Replicate data to increase data survival
  • Distributed channel codes (Growth Codes)
    • Expedite data delivery and survivability
    • First (to our knowledge) distributed channel codes

10 Outline
• Problem Description
• Our Solution: Growth Codes
• Experiments and Simulations
• Conclusions and Ongoing Work

11 Network Assumptions
• N-node sensor network
• Limited storage: each node stores a small number of data units
• Large storage at sink(s): sink receives codewords from random node(s)
• All sensed data assumed independent (no source coding)
[Figure: example network with nodes 1 to 7 and two sinks S]

12 High Level View of the Protocol
• Nodes send data at random times (current implementation: exponentially distributed timers)
[Figure: nodes 1 to 4 exchanging data]
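
One way to picture the exponentially distributed timers is to draw independent exponential gaps between consecutive transmissions of a node. The sketch below does this with Python's standard library; the function name and the `rate` and `horizon` parameters are illustrative assumptions, not values taken from the slides.

```python
import random


def send_times(rate, horizon, rng):
    """Transmission instants of one node up to `horizon`.

    Gaps between sends are i.i.d. exponential with mean 1/rate, so no
    coordination or shared clock between nodes is needed.
    """
    t, times = 0.0, []
    while True:
        t += rng.expovariate(rate)   # next exponentially distributed gap
        if t > horizon:
            return times
        times.append(t)


rng = random.Random(42)              # fixed seed only to make the demo repeatable
print(send_times(rate=1.0, horizon=5.0, rng=rng))
```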

13 High Level View of the Protocol (2)
• After time K_1, nodes start sending degree 2 codewords
• Degree 2 codeword: the sender picks a random symbol and XORs it with its own symbol
• Even if node 3 fails, node 3’s data survives
[Figure: nodes 1 to 4 holding symbols (degree 1 codewords) and a degree 2 codeword; timeline marked 0, K_1, K_2, K_3]
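
The encoding step on this slide can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes each node keeps its own symbol plus a small dictionary of symbols received from neighbours, and that symbols are integers combined with bitwise XOR. The names `make_codeword`, `own`, and `stored` are hypothetical.

```python
import random


def make_codeword(own, stored, degree, rng):
    """Build a codeword of (at most) the requested degree.

    own:    (symbol_id, value) of the sending node
    stored: dict symbol_id -> value of other symbols held in local memory
    Returns (set of symbol ids, XOR of their values).
    """
    own_id, own_val = own
    others = [i for i in stored if i != own_id]
    # Pick degree-1 random other symbols (fewer if memory holds fewer).
    picked = rng.sample(others, min(degree - 1, len(others)))
    payload = own_val
    for i in picked:
        payload ^= stored[i]
    return frozenset([own_id, *picked]), payload


# Node 4 holds its own symbol and a copy of node 3's symbol.
ids, payload = make_codeword((4, 0b0110), {3: 0b1010}, 2, random.Random(1))
print(sorted(ids), bin(payload))
# A sink that already knows x4 recovers x3 as payload ^ x4, even if node 3 has failed.
```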

14 High Level View of the Protocol (3)
• After time K_1, nodes start sending degree 2 codewords
• After time K_2, nodes start sending degree 3 codewords
• After time K_i, nodes start sending degree i+1 codewords
• Times K_i can be out of sync at different nodes, so there is no need to tightly synchronize clocks
• What are good values for {K_i}? Please refer to our paper
[Figure: timeline marked 0, K_1, K_2, K_3]
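
The degree schedule above amounts to a lookup of the local clock against the switch times. A minimal sketch, assuming the switch times are given as an ascending list; the numeric values below are placeholders chosen for illustration, since the slides defer the actual choice of {K_i} to the paper.

```python
from bisect import bisect_right


def degree_at(t, switch_times):
    """Codeword degree a node uses at local time t.

    switch_times = [K_1, K_2, ...] in ascending order: degree 1 before K_1,
    degree i+1 from K_i onward. Each node consults only its own clock.
    """
    return bisect_right(switch_times, t) + 1


K = [10.0, 25.0, 40.0]               # hypothetical K_1, K_2, K_3
for t in (0.0, 12.0, 30.0, 100.0):
    print(t, degree_at(t, K))
```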

15 The Intuition behind Growth Codes
• When very few symbols have been decoded, low degree codewords are easy to decode
[Figure: codewords arriving over time and the set of symbols decoded at the sink]
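
To make "easy to decode" concrete, the sketch below shows an iterative (peeling) decoder of the kind commonly used for XOR-based codes. The slides do not spell out the sink's decoder, so this is an assumption about how it could work: a codeword is useful as soon as all but one of its symbols are already known.

```python
def peel(codewords):
    """Decode XOR codewords given as (symbol_ids, payload) pairs.

    Returns a dict symbol_id -> value for every symbol that can be recovered.
    """
    decoded = {}
    pending = [(set(ids), payload) for ids, payload in codewords]
    progress = True
    while progress:
        progress = False
        remaining = []
        for ids, payload in pending:
            # Remove symbols the sink already knows from this codeword.
            for i in list(ids):
                if i in decoded:
                    ids.discard(i)
                    payload ^= decoded[i]
            if len(ids) == 1:            # exactly one unknown left: recover it
                decoded[ids.pop()] = payload
                progress = True
            elif ids:                    # still two or more unknowns: keep for later
                remaining.append((ids, payload))
            # an empty set means the codeword was redundant and is dropped
        pending = remaining
    return decoded


x = {1: 5, 2: 9, 3: 12}
received = [({1, 2, 3}, 5 ^ 9 ^ 12), ({1, 2}, 5 ^ 9), ({1}, 5)]
print(peel(received))
```

With nothing decoded yet, only the degree 1 codeword helps immediately; the degree 2 and degree 3 codewords become useful once earlier symbols are known.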

16 The Intuition behind Growth Codes (2)
• When a significant number of symbols have been decoded:
  • Low degree codewords are often redundant
  • Higher degree codewords are more likely to be useful
[Figure: codewords and the set of symbols decoded at the sink]
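
The intuition can be checked with a short calculation. Assuming (as an idealization not stated on the slide) that a degree d codeword covers d symbols chosen uniformly at random out of N, it yields a new symbol immediately exactly when d-1 of its symbols are among the r already decoded and one is not. The function name is hypothetical.

```python
from math import comb


def p_decodable(n, r, d):
    """P(a random degree-d codeword reveals exactly one new symbol)
    when the sink already knows r of the n symbols."""
    if d > n:
        return 0.0
    # Choose d-1 known symbols and 1 unknown symbol, out of all d-subsets.
    return comb(r, d - 1) * (n - r) / comb(n, d)


n = 100
for r in (10, 50, 90):
    row = ", ".join(f"d={d}: {p_decodable(n, r, d):.3f}" for d in (1, 2, 5))
    print(f"r={r:2d}  {row}")
```

For small r the degree 1 column dominates, while for r close to N the higher degrees overtake it, which is the reason for growing the degree over time.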

17 Outline
• Problem Description
• Growth Codes
• Simulations and Experiments
• Conclusions and Ongoing Work

18 Simulations/Experiments: Compare data persistence of various approaches
1. Simulations:
  • Centralized setting: compare GC with other channel coding schemes
  • Distributed simulation: assess large-scale performance of coding vs. no coding
2. Experiments on motes:
  • Compare time of complete recovery for GC vs. routing
  • Measure resilience to node failures

19 Centralized Simulation (to compare with other channel coding schemes for which only centralized versions exist)
• Single source, single sink
• Source generates random codewords according to the coding scheme (GC, Soliton)
• Zero failure rate
• Comparison with various coding schemes (N = 1500):
  • No coding is fast in the beginning; the later slowdown is explained by the Coupon Collector’s problem
  • Soliton/R-Soliton: poor partial recovery (reason: high degree codewords sent too early)
  • Growth Codes are closest to the theoretical upper bound (reason: right degree at the right time)
[Figure: one source sending codewords to the sink]

20 Growth Codes vs No Coding (Varying N)
Distributed Simulation (to assess the performance gain of coding)
• N sources, single sink
• Random graph topology (avg degree 10)
• Sink receives 1 codeword per time unit
• Complete recovery takes:
  • O(N log N) time without coding (Coupon Collector’s effect)
  • Linear time with Growth Codes
• Soliton/R-Soliton: cannot be compared in a distributed setup
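
The O(N log N) figure for the uncoded case follows from the classical coupon collector result: if every reception is a uniformly random one of the N symbols, the expected number of receptions to see all of them is N times the N-th harmonic number. A small sketch of that calculation, under this uniform-arrival assumption; the function name is illustrative.

```python
from math import log


def expected_receptions_no_coding(n):
    """Expected receptions to collect all n symbols when each reception
    is a uniformly random uncoded symbol (coupon collector): n * H_n."""
    return n * sum(1.0 / k for k in range(1, n + 1))


for n in (100, 500, 1500):
    e = expected_receptions_no_coding(n)
    # The ratio to n*ln(n) stays close to a constant, i.e. growth is N log N.
    print(n, round(e), round(e / (n * log(n)), 2))
```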

21 Experiments with (micaz) motes (to measure data persistence with time)
• GC vs TinyOS’s “MultiHop” routing protocol
• No routing state at time 0 (scenario where sensor nodes are deployed rapidly)
• “MultiHop” for persistence: takes a long time to complete route setup
• Comparison with the GC simulator validates simulator performance
[Figure: experimental topology with sink S]

22 Motes experiments: Resilience to node failures
• Nodes generate data every 300 seconds
• 3 nodes fail just after the 3rd data generation
[Figure: timeline marked 0, 300, 600, 900 with the events "nodes generate data", "“MultiHop” sets up routing", "nodes send data to sink", "3 random nodes fail", and "“MultiHop” repairs routes"; experimental topology with sink S]

23 Motes experiments: Resilience to node failures
• 1st generation: GC is faster; MH (“MultiHop”) takes time to set up routes
• 2nd generation: routing is already set up, so MH is very fast
• 3rd generation: MH needs to repair routes
[Figure: timeline marked 0, 300, 600, 900 with the events "nodes generate data", "“MultiHop” sets up routing", "nodes send data to sink", "3 random nodes fail", and "“MultiHop” repairs routes"]

24 Other Results: Please refer to our paper
• Good values for K_1, K_2, …
• More simulations/experiments
  • Various topologies
  • Other failure scenarios
• Implementation details:
  • Memory usage at sensor nodes: how it affects performance
  • How to handle periodic data generation
  • How to reduce overhead of coefficients

25 Conclusions
• Data persistence in sensor networks:
  • First distributed channel codes (GC)
  • Protocol requires minimal configuration
  • Is robust to node failures
• Simulations and experiments on micaz motes show (compared to prior coding and routing methods):
  • GC achieves complete recovery faster
  • GC recovers more partial data at any time

26 Ongoing Work
• Adapt Growth Codes to scenarios where sensor data is correlated
• Take advantage of any available routing information (e.g., from before a disaster)
• Estimate network size on the fly for use in Growth Codes

27 Thanks for your patience!
For more information: DNA Research Lab, Columbia University, http://dna-wsl.cs.columbia.edu/
