
1 Data Persistence in Sensor Networks: Towards Optimal Encoding for Data Recovery in Partial Network Failures. Abhinav Kamra, Jon Feldman, Vishal Misra and Dan Rubenstein, DNA Research Group, Columbia University

2 Motivation and Model
Typical scenario for sensor networks:
- A large number of nodes is deployed to "sense" the environment
- Collected data is periodically pulled/pushed through a sink/gateway node
- Nodes are prone to failure (disaster, battery depletion, targeted attack)
We want the data to survive individual node failures: "data persistence".

3 Overview
- Erasure codes
- LT-Codes
- Soliton distribution
- Coding for failure-prone sensor networks
- Major results
- A brief sketch of proofs
- A case study of failure-prone sensor networks

4 Erasure Codes
[Diagram] A message of n blocks is expanded by the encoding algorithm into cn encoded blocks. After transmission over an erasure channel, the decoding algorithm reconstructs the original message from (roughly) any n of the encoded blocks that arrive.

5 Luby Transform Codes
- Simple linear codes
- An improvement over "Tornado codes"
- Rateless codes

6 Erasure Codes: LT-Codes
[Diagram] F = (b1, b2, b3, b4, b5): n = 5 input blocks.

7 LT-Codes: Encoding
To produce an encoded symbol c1 from F = (b1, ..., b5):
1. Pick a degree d1 from a pre-specified distribution (here d1 = 2).
2. Select d1 input blocks uniformly at random (here b1 and b4).
3. Compute their sum (XOR): c1 = b1 ⊕ b4.
4. Output the sum together with the IDs of the component blocks.
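The four encoding steps above can be sketched in Python. The degree distribution is passed in as a parameter; the example distribution at the bottom is purely illustrative (it is not the Soliton distribution the talk uses):

```python
import random

def lt_encode_symbol(blocks, degree_dist, rng=random):
    """Generate one LT encoded symbol from a list of integer input blocks.

    degree_dist: list of (degree, probability) pairs (an assumption about
    how the distribution is represented). Returns (ids, value), where
    value is the XOR of the chosen blocks.
    """
    degrees, probs = zip(*degree_dist)
    d = rng.choices(degrees, weights=probs, k=1)[0]   # step 1: pick a degree
    ids = rng.sample(range(len(blocks)), d)           # step 2: pick d distinct blocks
    value = 0
    for i in ids:                                     # step 3: XOR them together
        value ^= blocks[i]
    return ids, value                                 # step 4: output sum + block IDs

blocks = [0b0001, 0b0010, 0b0100, 0b1000, 0b1111]     # n = 5 input blocks
ids, value = lt_encode_symbol(blocks, [(1, 0.2), (2, 0.5), (3, 0.3)])
```

Because each symbol is generated independently, the encoder can emit as many symbols as needed, which is what makes the code rateless.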

8 LT-Codes: Encoding
[Diagram] Repeating the procedure yields the encoded symbols E(F) = (c1, ..., c7) from the input blocks F = (b1, ..., b5).

9 LT-Codes: Decoding
[Diagram] Decoding proceeds iteratively over E(F) = (c1, ..., c7): a degree-1 symbol (here c4 = b5) immediately yields an input block; that block is then XORed out of every other symbol containing it, reducing their degrees and exposing new degree-1 symbols (here b2 is recovered next), and the process repeats until no degree-1 symbol remains.

10 Degree Distribution for LT-Codes
- Ideal Soliton distribution: ρ(1) = 1/N, ρ(d) = 1/(d(d−1)) for d = 2, ..., N
- Average degree H(N) ≈ ln(N)
- In expectation, exactly one degree-1 symbol is released in each round of decoding
- The distribution is very fragile in practice
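The ideal Soliton distribution and its average degree can be checked numerically. This sketch assumes the standard formulation ρ(1) = 1/N, ρ(d) = 1/(d(d−1)); the probabilities telescope to 1 and the mean degree works out to the harmonic number H(N):

```python
def ideal_soliton(N):
    """Ideal Soliton distribution: rho(1) = 1/N, rho(d) = 1/(d(d-1)) for d >= 2."""
    rho = {1: 1.0 / N}
    for d in range(2, N + 1):
        rho[d] = 1.0 / (d * (d - 1))
    return rho

N = 128
rho = ideal_soliton(N)
# probabilities telescope: 1/N + sum 1/(d(d-1)) = 1/N + (1 - 1/N) = 1
total = sum(rho.values())
# mean degree: 1/N + sum_{d=2}^{N} 1/(d-1) = H(N) ~ ln(N)
avg_degree = sum(d * p for d, p in rho.items())
```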

11 Failure-prone Sensor Networks
- All earlier work asks: how many encoded symbols are needed to recover all original symbols? (all-or-nothing decoding)
- In failure-prone networks the question becomes: how many original symbols can be recovered from the encoded symbols that survive?

12 Iterative Decoder
[Diagram] 5 original symbols x1 ... x5; 4 encoded symbols received, each the XOR of its component original symbols. Recovered symbols shown: x3, x1, x4.
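The iterative (peeling) decoder on this slide can be sketched as follows. The concrete symbols at the bottom are made up for illustration and do not reproduce the diagram exactly; note the decoder may stop with only a partial recovery, which is exactly the quantity the talk wants to maximize:

```python
def peel_decode(received):
    """Iterative (peeling) decoder for LT-style symbols.

    received: list of (ids, value) pairs, where value is the XOR of the
    original blocks named in ids. Returns {block_id: block_value} for
    every block that could be recovered (possibly not all of them).
    """
    symbols = [[set(ids), value] for ids, value in received]
    recovered = {}
    progress = True
    while progress:
        progress = False
        for sym in symbols:
            # substitute already-recovered blocks into this symbol
            for i in list(sym[0] & recovered.keys()):
                sym[0].discard(i)
                sym[1] ^= recovered[i]
            if len(sym[0]) == 1:              # degree-1 symbol releases a block
                i = next(iter(sym[0]))
                if i not in recovered:
                    recovered[i] = sym[1]
                    progress = True
    return recovered

x = [5, 9, 12, 7, 3]                          # original blocks x1..x5 (illustrative values)
received = [
    ({2}, x[2]),                              # degree 1: recovers x3 directly
    ({0, 2}, x[0] ^ x[2]),                    # becomes degree 1 after peeling x3
    ({0, 1, 2}, x[0] ^ x[1] ^ x[2]),          # recovers x2 after two peels
]
out = peel_decode(received)                   # x4 and x5 remain unrecovered
```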

13 Sensor Network Model
- k encoded symbols remain after failures
- Goal: maximize r, the number of original data symbols recovered
- k is not known a priori

14 Coding is Bad for Small k
- N original symbols; k encoded symbols received
- If k ≤ 0.75N, no coding is required (simulation with N = 128)

15 Proof Sketch
Theorem: To recover the first N/2 symbols, it is best not to encode at all.
Proof:
1. Let C(i, j) = expected number of symbols recovered from i symbols of degree 1 and j symbols of degree 2 or more.
2. C(i, j) ≤ C(i+1, j−1) whenever C(i, j) ≤ N/2:
   a. Sort the given symbols in decoding order.
   b. All degree-1 symbols are decoded before the other symbols.
   c. By (b), the last symbol in decoding order has degree > 1.
   d. Replace this symbol with a random degree-1 symbol.
   e. The new degree-1 symbol is more likely to be useful.
3. Hence more degree-1 symbols give a better expected output.
4. So no coding is best for recovering any first N/2 symbols.
5. With all symbols of degree 1, a coupon-collector argument shows ≈ 3N/4 symbols are needed to recover N/2 distinct symbols.
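Step 5's coupon-collector estimate can be checked with a quick simulation (parameters are illustrative). The exact expectation is N·(H(N) − H(N/2)) ≈ N·ln 2 ≈ 0.69N, in line with the slide's ≈ 3N/4 figure:

```python
import random

def draws_to_collect(N, target, rng):
    """Draw uniform degree-1 symbols until `target` distinct blocks are seen."""
    seen = set()
    draws = 0
    while len(seen) < target:
        seen.add(rng.randrange(N))
        draws += 1
    return draws

rng = random.Random(0)
N = 128
trials = [draws_to_collect(N, N // 2, rng) for _ in range(2000)]
avg = sum(trials) / len(trials)
# avg should land near N * ln 2 ~ 0.69N, consistent with the ~3N/4 estimate
```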

16 Ideal Degree Distribution
Theorem: To recover r data units with r < jN/(j+1), the optimal degree distribution uses only symbols of degree j or less.

17 Lower Degrees Are Better for Small k
- If k ≤ k_j, use symbols of degree up to j
- So a close-to-optimal distribution uses k_j − k_{j−1} symbols of degree j (simulation with N = 128)
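Assuming the thresholds k_j follow the theorem on slide 16, i.e. k_j = jN/(j+1) (an interpretation on my part, not something stated explicitly on this slide), they can be tabulated as:

```python
def degree_thresholds(N, max_degree):
    """k_j = j*N/(j+1): with k <= k_j surviving symbols, degrees up to j suffice.

    The formula is inferred from the theorem on slide 16; treat it as an
    illustrative assumption rather than the authors' exact definition.
    """
    return {j: j * N / (j + 1) for j in range(1, max_degree + 1)}

ks = degree_thresholds(128, 4)   # k_1 = 64.0, k_2 ~ 85.3, k_3 = 96.0, k_4 = 102.4
```

The thresholds crowd together as j grows, which matches the intuition that high-degree symbols only start paying off once most low-degree symbols have already been collected.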

18 Case Study: Single-sink Sensor Network
[Diagram] Sensor nodes 1-4 and a sink: the nodes exchange symbols with one another for storage, while nodes 2 and 3 transfer new symbols to the sink.

19 Case Study: Single-sink Sensor Network
- The network is prone to failure
- Nodes store unencoded symbols at first and symbols of higher degree as time passes
- The sink therefore receives low-degree symbols first and higher-degree symbols as time goes on

20 Distributed Simulation: Clique Topology
- N = 128 nodes in a clique topology
- The sink receives one symbol per unit time

21 Distributed Simulation: Chain Topology
- N = 128 nodes in a chain topology (1 - 2 - 3 - ... - N)

22 Related Work
- Bulk data distribution (coding is useful):
  - Tornado codes: M. Luby et al., "Efficient Erasure Correcting Codes," IEEE Transactions on Information Theory, vol. 47, no. 2, 2001
  - LT-Codes: M. Luby, "LT Codes," FOCS 2002
- Reliable storage in sensor networks:
  - Decentralized erasure codes: A. Dimakis et al., "Ubiquitous Access to Distributed Data in Large-Scale Sensor Networks through Decentralized Erasure Codes," IPSN 2005
  - Random linear coding: M. Medard et al., "How Good is Random Linear Coding Based Distributed Networked Storage?," NetCod 2005

