1
Growth Codes: Maximizing Sensor Network Data Persistence. Abhinav Kamra, Vishal Misra, Dan Rubenstein (Department of Computer Science, Columbia University); Jon Feldman (Google Labs). ACM SIGCOMM 2006.
2
Outline Problem Description Solution Approach: Growth Codes Experiments and Simulations Conclusions
3
Background: A generic sensor network. Sensed data follows multi-hop paths from the sensor nodes to the sink(s). A few node failures can break the data flow. Generic aim: collect data from all nodes at the sink(s). [Figure: sensor nodes x1 through x13 forwarding data toward a sink]
4
Data Persistence. We define the data persistence of a sensor network as the fraction of data generated within the network that eventually reaches the sink(s). Focus of this work: maximizing data persistence.
5
Specific Context: Disaster Scenarios. E.g., monitoring earthquakes, fires, floods, war zones. Problem in this setting: congestion near the sink(s), since all nodes simultaneously forward data and overwhelm the sink's capacity. [Figure: a virtual queue builds up near the sink]
6
Specific Context: Disaster Scenarios (2). Further problems in this setting: the network collapses as nodes fail rapidly; pre-computed routes may fail; data from failed nodes can be lost. Recovering data from a subset of the nodes is acceptable.
7
Challenges. Networking challenges: in disaster scenarios, feedback is often infeasible; frequent disruptions to the routing tree, if one is set up; node failures are difficult to predict (sink locations unknown, surviving routes unknown); difficult to synchronize nodes' clocks. Coding challenges: the data source is distributed (among all sensor nodes); prior approaches (Turbo codes, LDPC codes) aim at fast complete recovery; sensor nodes have very limited memory, CPU, and bandwidth.
8
Maximize Data Persistence. Objectives: preserve data from failed sensor nodes; deliver data to the sink(s) as fast as possible. Data persistence is the fraction of data that eventually reaches the sink(s). Example: 6 of 10 symbols reach the sink, so persistence = 60%. [Figure: surviving symbols flowing through the network to the sink]
9
Limitations of Previous Work. Channel-coding based (e.g., Turbo Codes [Anderson-ISIT94], LT Codes [Luby02]): aim for complete recovery in minimum time; difficult to implement with distributed sources. Routing-based (e.g., Directed Diffusion [Govindan00], Cougar [Yao-SIGMOD02]): conjectured to be too fragile (easily disrupted) for disaster scenarios.
10
Our Approach. Two main ideas. (1) Randomized routing and replication: avoid actively maintaining routes; replicate data to increase its survival. (2) Distributed channel codes (Growth Codes): expedite data delivery and survivability; the first (to our knowledge) distributed channel codes.
11
Outline Problem Description Our Solution: Growth Codes Experiments and Simulations Conclusions
12
Network Assumptions. N-node sensor network. Limited storage: each node stores a small number of data units. Large storage at the sink(s): the sink receives codewords from random node(s). All sensed data is assumed independent (no source coding). [Figure: numbered sensor nodes connected to a sink S]
13
Terminology. Codeword: a linear combination of a randomly selected grouping of data units, i.e., an original data unit or the XOR of several original data units, e.g., C = (A ⊕ B) ⊕ (A ⊕ B ⊕ C). Degree of a codeword: the number of symbols XOR'd together to form the codeword.
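The XOR algebra behind codewords can be shown in a short sketch. Symbols are modeled here as small integers, and `make_codeword` is a hypothetical helper for illustration, not part of the paper:

```python
from functools import reduce

def make_codeword(symbols):
    """XOR a group of data symbols into one codeword; its degree is len(symbols)."""
    return reduce(lambda a, b: a ^ b, symbols)

A, B, C = 0b0011, 0b0101, 0b1001
cw1 = make_codeword([A, B])      # degree-2 codeword A xor B
cw2 = make_codeword([A, B, C])   # degree-3 codeword A xor B xor C
# XORing the two codewords cancels the shared symbols, recovering C,
# mirroring the slide's identity C = (A xor B) xor (A xor B xor C):
assert cw1 ^ cw2 == C
```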
14
Growth Codes. The degree of a codeword "grows" with time. At each point in time, a codeword of one specific degree has the most utility for a decoder (on average), and this "most useful" degree grows monotonically with time. R: the number of decoded symbols the sink has. [Figure: as R grows over successive intervals R_1, R_2, R_3, R_4, the most useful degree increases from d=1 to d=4]
15
Ideas of Proposed Method. Growth Codes were designed for sensor networks in catastrophic or emergency scenarios. The goals: make each newly received encoded packet useful (it can be decoded immediately) and avoid useless packets (ones that cannot be decoded). http://www.powercam.cc/slide/284
16
Ideas of Proposed Method. A received encoded packet is immediately useful if d-1 of the d data units used to form it are already decoded/known. Example: with x1, x2, x3, x5 already decoded, a received degree-3 packet y4 = x3 ⊕ x5 ⊕ x6 has d-1 = 2 components already known, so x6 is recovered immediately.
17
Ideas of Proposed Method. A received encoded packet is useless if all d data units used to form it are already known. Example: with x1, x2, x3, x5 already decoded, a received degree-2 packet y1 = x1 ⊕ x3 provides no new information.
18
Ideas of Proposed Method. Consider the degree of an encoded packet when the decoder has already decoded r original data units. The probability that a newly received packet is immediately decodable depends on r: when r is low, low-degree packets are the most likely to be immediately decodable; as r grows, higher-degree packets become more valuable.
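Assuming a codeword's d components are chosen uniformly at random (without repetition) from the N symbols, the chance that it is immediately decodable when r symbols are known is a hypergeometric count: exactly d-1 of its components already decoded and the remaining one new. A hedged sketch with hypothetical helper names:

```python
from math import comb

def p_immediately_decodable(N, r, d):
    """P(exactly d-1 of the d random components are among the r decoded
    symbols, and the last one is new): C(r, d-1) * (N - r) / C(N, d)."""
    if d > N:
        return 0.0
    return comb(r, d - 1) * (N - r) / comb(N, d)

def is_useful(components, decoded):
    """A received codeword is immediately useful iff exactly one of its
    components is still unknown (useless if all are already decoded)."""
    unknown = [s for s in components if s not in decoded]
    return len(unknown) == 1

# Early on (r = 0) a degree-1 codeword is always decodable; once many
# symbols are known, higher degrees win on average.
```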
19
Figure 1: Localized view of the network. In the beginning, the nodes exchange degree-1 codewords, gradually increasing the degree over time. Example: nodes 1 and 3 exchange the degree-1 codewords x1 and x3; later, node 1 is destroyed, but symbol x1 survives in the network as nodes exchange degree-2 codewords such as x1 ⊕ x4. Even when a node fails, its data survives in another node's storage.
20
Figure 2: Growth Codes in action. The sink receives low-degree codewords in the beginning and progressively higher-degree codewords later on.
21
Growth Codes: Encoding. R_i is what the sink has decoded so far. What about encoding? To decode R_i symbols, the sink needs to receive some K_i codewords, sampled uniformly. Sensor nodes estimate K_i and transition their codeword degree accordingly. The optimal transition points are a function of N, the size of the network. The exact value of K_1 is computed; upper bounds are computed for K_i, i > 1.
22
Implementation of Growth Codes. Time is divided into rounds. Each node exchanges degree-1 codewords with a random neighbor until round K_1. Between rounds K_{i-1} and K_i, nodes exchange degree-i codewords. The sink receives codewords as they are exchanged in the network. [Figure: Growth Code degree distribution at time k]
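A minimal sketch of this round-based schedule, assuming the transition points K_i are given as inputs (the paper derives them from N); the function names are hypothetical:

```python
import random

def degree_for_round(k, K):
    """Given transition points K = [K1, K2, ...], return the codeword degree
    used at round k: degree 1 up to K1, degree i+1 between K_i and K_{i+1}."""
    for i, Ki in enumerate(K, start=1):
        if k <= Ki:
            return i
    return len(K) + 1

def make_round_codeword(own_symbol, stored_symbols, k, K):
    """Per-round encoding sketch: XOR the node's own symbol with
    degree-1 randomly chosen stored symbols."""
    d = degree_for_round(k, K)
    picks = random.sample(stored_symbols, min(d - 1, len(stored_symbols)))
    cw = own_symbol
    for s in picks:
        cw ^= s
    return cw, d
```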
23
High Level View of the Protocol 1 4 2 3 Nodes send data at random times (Current implementation: exponentially distributed timers)
24
High Level View of the Protocol (2). After time K_1, nodes start sending degree-2 codewords: the sender picks a random stored symbol and XORs it with its own symbol. Even if node 3 fails, node 3's data survives. [Figure: timeline 0, K_1, K_2, K_3; degree-1 codewords before K_1, degree-2 codewords after]
25
High Level View of the Protocol (3). After time K_1, nodes start sending degree-2 codewords; after time K_2, degree-3 codewords; in general, after time K_i, nodes start sending degree-(i+1) codewords. The times K_i can be out of sync at different nodes. Note: there is no need to tightly synchronize clocks. [Figure: timeline 0, K_1, K_2, K_3]
26
The Intuition behind Growth Codes. When very few symbols have been decoded, low-degree codewords are easy to decode. [Figure: set of symbols decoded at the sink vs. received codewords, over time]
27
The Intuition behind Growth Codes (2). When a significant number of symbols have been decoded, low-degree codewords are often redundant, and higher-degree codewords are more likely to be useful. [Figure: set of symbols decoded at the sink vs. received codewords]
28
Outline Problem Description Growth Codes Simulations and Experiments Conclusions
29
Simulations/Experiments: compare the data persistence of various approaches. 1. Simulations: centralized setting (compare GC with other channel coding schemes); distributed simulation (assess large-scale performance of coding vs. no coding). 2. Experiments on motes: compare the time to complete recovery for GC vs. routing; measure resilience to node failures.
30
Centralized Simulation (to compare with other channel coding schemes, for which only centralized versions exist). Setup: single source, single sink; the source generates random codewords according to the coding scheme (GC, Soliton); zero failure rate. Comparison with various coding schemes (N = 1500): no coding is fast in the beginning, and its slowdown is explained by the Coupon Collector's problem; Soliton/Robust Soliton show poor partial recovery (high-degree codewords are sent too early); Growth Codes come closest to the theoretical upper bound (the right degree at the right time).
31
Growth Codes vs. No Coding (varying N). Distributed simulation (to assess the performance gain of coding). Setup: N sources, single sink; random graph topology (average degree 10); the sink receives 1 codeword per time unit. Complete recovery takes O(N log N) time without coding (the Coupon Collector effect) but linear time with Growth Codes. Soliton/Robust Soliton cannot be compared in a distributed setup.
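The O(N log N) figure is the classic Coupon Collector bound: collecting N distinct symbols by uniform random draws takes about N ln N draws on average, versus roughly N codewords under an ideal rateless code. A small simulation sketch (the helper is hypothetical and seeded for reproducibility):

```python
import random

def coupon_collector_draws(N, seed=0):
    """Count uniform random draws until all N distinct symbols are seen."""
    rng = random.Random(seed)
    seen, draws = set(), 0
    while len(seen) < N:
        seen.add(rng.randrange(N))
        draws += 1
    return draws

# For N = 1500 the expected count is N * H_N, roughly N ln N,
# i.e., on the order of 11,800 draws rather than 1500.
draws = coupon_collector_draws(1500)
```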
32
Recovery Rate Without coding, a lot of data is lost during the disaster even when using randomized replication
33
Effect of Topology. 500 nodes placed at random in a 1x1 square; nodes are connected if within a distance of 0.3. R: the radius of the network.
34
Resilience to Random Failures. 500-node random-topology network. Each node fails each second with probability 0.0005 (about 1 failure every 4 seconds in the beginning).
35
Experiments with Motes. Crossbow MICAz: 2.4 GHz IEEE 802.15.4 radio, 250 kbps high data rate.
36
Experiments with (MICAz) motes (to measure data persistence over time). GC vs. TinyOS's "MultiHop" routing protocol. No routing state at time 0 (a scenario where sensor nodes are deployed rapidly). "MultiHop" for persistence: takes a long time to complete route setup. Comparison with the GC simulator validates the simulator's performance. [Figure: experimental topology with sink S]
37
Mote experiments: resilience to node failures. Nodes generate data every 300 seconds; 3 nodes fail just after the 3rd data generation. [Timeline 0 to 900 s: nodes generate data; "MultiHop" sets up routing; nodes send data to the sink; 3 random nodes fail; "MultiHop" repairs routes. Figure: experimental topology with sink S]
38
Mote experiments: resilience to node failures. 1st generation: GC is faster; MH takes time to set up routes. 2nd generation: routing is already set up, so MH is very fast. 3rd generation: MH needs to repair routes. [Timeline 0 to 900 s: nodes generate data; "MultiHop" sets up and later repairs routing; 3 random nodes fail]
39
Conclusions. Data persistence in sensor networks: the first distributed channel codes (GC). The protocol requires minimal configuration and is robust to node failures. Simulations and experiments on MICAz motes show that GC achieves complete recovery faster and recovers more partial data at any point in time.
40
Iterative Decoding (example). 5 original symbols x1 ... x5; 4 codewords received; each codeword is the XOR of its component original symbols. [Figure: received codewords on the left; recovered symbols and unused codewords on the right]
41
Online Decoding at the Sink (example). The sink has recovered x1, x3, x6 and holds the undecoded codeword x2 ⊕ x5. A new codeword x2 ⊕ x6 arrives; XORing it with the known x6 yields x2 = x6 ⊕ (x2 ⊕ x6), and then x5 = x2 ⊕ (x2 ⊕ x5) is recovered as well.
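The peeling process sketched in these decoding slides can be written as a short routine. This is an illustrative sketch, not the paper's implementation, and the helper names are hypothetical:

```python
def peel(recovered, pending):
    """Repeatedly resolve pending codewords with zero or one unknown component.
    recovered: dict symbol-id -> value; pending: list of [value, set of unknown ids]."""
    progress = True
    while progress:
        progress = False
        for i, (val, unk) in enumerate(pending):
            if len(unk) <= 1:
                del pending[i]
                if len(unk) == 1:
                    sid = unk.pop()
                    recovered[sid] = val
                    # Substitute the newly recovered symbol into the rest.
                    for other in pending:
                        if sid in other[1]:
                            other[0] ^= val
                            other[1].discard(sid)
                progress = True
                break

def receive(value, components, recovered, pending):
    """Reduce a new codeword by already-known symbols, store it, then peel."""
    unknown = set(components)
    for sid in components:
        if sid in recovered:
            value ^= recovered[sid]
            unknown.discard(sid)
    pending.append([value, unknown])
    peel(recovered, pending)
```

Replaying the slide's example (recovered x1, x3, x6; pending x2 ⊕ x5; new codeword x2 ⊕ x6) recovers x2 and then, by substitution, x5.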