1 Fountain Codes Based Distributed Storage Algorithms for Large-scale Wireless Sensor Networks Salah A. Aly, Zhenning Kong, Emina Soljanin IEEE IPSN 2008.


1 Fountain Codes Based Distributed Storage Algorithms for Large-scale Wireless Sensor Networks Salah A. Aly, Zhenning Kong, Emina Soljanin IEEE IPSN 2008

2 Outline  Introduction  Fountain Codes  LT-Codes Based Distributed Storage (LTCDS) Algorithms - LTCDS-I: with limited global information - LTCDS-II: without any global information  Performance Evaluation  Conclusion

3 Introduction  Nodes in wireless sensor networks have limited resources, e.g., CPU power, bandwidth, memory, and lifetime.  They can monitor objects and detect fires, floods, and other disaster phenomena.  We consider a network with n randomly distributed nodes; among them are k sensor nodes, with k/n = 10%.  Our goal is to find techniques to redundantly distribute data from the k source nodes to the n storage nodes, so that by querying any (1+ε)k nodes, one can retrieve the original information acquired by the k sources with low computational complexity.

4 Introduction  There are 25 sensors monitoring an area.  There are 225 additional storage nodes.  Information acquired by the sensors should 1) be available in any neighborhood, 2) be easy to compute from storage, and 3) be extractable from any 25+ nodes. Fig. 1. A sensor network with 25 sensors (big dots) monitoring an area and 225 storage nodes (small dots). A good distributed storage algorithm should enable us to recover the original 25 source packets from any 25+ nodes (e.g., the set of nodes within any one of the three illustrated circular regions).

5 Introduction  We know how to solve the centralized version of this problem by coding (e.g., Fountain codes, MDS codes, linear codes).  Our contribution: solve the problem in a distributed, decentralized, randomized way on a network.  Problem: find an efficient strategy to add some redundancy, distribute information randomly through the network, and decode easily from any (1+ε)k nodes.

6 Network Model  Suppose a network with n storage nodes randomly distributed. k source nodes have information to be disseminated randomly throughout the network for storage. Every source node s_i generates an independent packet.  We will use Fountain codes and random walks to disseminate information from the k sources to the n nodes. The idea is to build a system of n equations in k variables, for example,
y_1 = x_1 ⊕ x_2 ⊕ x_3
y_2 = x_2 ⊕ x_3 ⊕ x_5 ⊕ x_i
y_3 = x_1 ⊕ x_3 ⊕ … ⊕ x_k
...
y_n = x_4 ⊕ x_6 ⊕ x_i ⊕ … ⊕ x_k
which can be decoded easily from any (1+ε)k equations, for ε > 0.
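As a concrete illustration of such XOR equations (the packet values below are made up, not from the paper), each storage packet is just the bitwise XOR of a subset of source packets, and XOR is its own inverse, so a known source packet can be "peeled" out of any equation containing it:

```python
# Hypothetical 4-bit source packets x1..x3 (values chosen for illustration).
x1, x2, x3 = 0b1010, 0b0110, 0b1100

# Storage packets are XOR combinations, as in the system above:
y1 = x1 ^ x2 ^ x3    # y1 = x1 XOR x2 XOR x3
y2 = x2 ^ x3         # y2 = x2 XOR x3

# XOR is its own inverse, so a known source packet peels out of an equation:
assert y1 ^ x1 == y2  # removing x1 from y1 leaves x2 XOR x3
```

This peeling step is exactly what makes decoding from (1+ε)k such equations cheap when the degrees are chosen well.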

7 Fountain codes  Assume k source blocks S = {x_1, x_2, …, x_k}.  Each output block y_i is obtained by XORing some source blocks from S.  d(y_i) is the number of source blocks in the ith equation, 1 ≤ d(y_i) ≤ k. The Fountain code idea: choose d(y_i) randomly according to a probability distribution such that it is easy to decode from any (1+ε)k output blocks.
Easy to decode: x_1, x_1 ⊕ x_2, x_2 ⊕ x_3
Hard to decode: x_1 ⊕ x_2, x_1 ⊕ x_2 ⊕ x_3, x_1 ⊕ x_4 ⊕ x_5
 LT and Raptor codes are two classes of Fountain codes.

8 Fountain codes  For k source blocks {x_1, x_2, …, x_k} and a probability distribution Ω(d) with 1 ≤ d ≤ k, a Fountain code with parameters (k, Ω) is a potentially limitless stream of output blocks {y_1, y_2, …}.  Each output block y_i is obtained by XORing d randomly and independently chosen source blocks. Figure 1. The encoding operations of Fountain codes: each output is obtained by XORing d source blocks chosen uniformly and independently at random from the k source inputs, where d is drawn according to the probability distribution Ω(d).

9 LT Codes  Definition 2. (Code Degree) For Fountain codes, the number of source blocks used to generate an encoded output y is called the code degree of y, denoted d_c(y). By construction, the code degree distribution Ω(d) is the probability distribution of d_c(y).  LT (Luby Transform) codes are a special class of Fountain codes which use the Ideal Soliton or Robust Soliton distributions. The Ideal Soliton distribution Ω_is(d) for k source blocks is given by
Ω_is(1) = 1/k,  Ω_is(d) = 1/(d(d−1)) for d = 2, …, k.
 The Robust Soliton distribution modifies the Ideal Soliton distribution by adding extra probability mass at low degrees and at one spike degree, then normalizing.
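The Ideal Soliton pmf above can be transcribed directly; the sampling helper below is an illustrative sketch, not the paper's code. Note the pmf telescopes to exactly 1, since 1/(d(d−1)) = 1/(d−1) − 1/d:

```python
import random

def ideal_soliton(k):
    """Ideal Soliton pmf over degrees 1..k: p[0] = 1/k, p[d-1] = 1/(d(d-1))."""
    return [1.0 / k] + [1.0 / (d * (d - 1)) for d in range(2, k + 1)]

def sample_degree(k, rng=random):
    """Draw a code degree d_c according to the Ideal Soliton distribution."""
    return rng.choices(range(1, k + 1), weights=ideal_soliton(k))[0]

# Telescoping check: 1/k + sum_{d=2}^{k} (1/(d-1) - 1/d) = 1/k + (1 - 1/k) = 1.
assert abs(sum(ideal_soliton(50)) - 1.0) < 1e-12
```

About half of the mass sits on degree 2 (Ω_is(2) = 1/2), which is what keeps the peeling decoder supplied with new degree-1 equations.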

10 LT Codes  Lemma 3 (Luby [12]). For LT codes with the Robust Soliton distribution, the k original source blocks can be recovered from any k + O(√k ln²(k/δ)) encoded output blocks with probability 1 − δ.  Both encoding and decoding complexity is O(k ln(k/δ)).

11 LT-Codes Based Distributed Storage (LTCDS) Algorithms  We propose two LT-Codes Based Distributed Storage (LTCDS) algorithms. In both, the source packets are disseminated throughout the network by simple random walks and encoded with the Robust Soliton distribution. i) In Algorithm 1, called LTCDS-I, we assume that each node in the network knows the global information k and n. ii) Algorithm 2, called LTCDS-II, is a fully distributed algorithm in which the values of n and k are not known. The price we pay is extra transmissions of the source packets to obtain estimates of n and k.

12 Previous Work  Previous work focused on techniques based on pre-assumptions about the network, such as geographical locations or routing tables. Lin et al. [INFOCOM'07] studied the question "how to retrieve historical data that the sensors have gathered even if some nodes fail or disappear?" They proposed two decentralized algorithms using Fountain codes to guarantee the persistence and reliability of cached data on unreliable sensors. But they assume that the maximum degree of a node is known and the source sends b packets (high complexity). Dimakis et al. [INFOCOM'06] used a decentralized implementation of Fountain codes that relies on geographic routing, where every node has to know its location. They applied their work to grid networks. Kamra et al. [SIGCOMM'06] proposed a novel technique called growth codes to increase data persistence, i.e., increasing the amount of information that can be recovered at the sink.

13 Algorithm 1: Knowing global information k and n  We use a simple random walk for each source to disseminate its information.  Each node u that has packets to transmit chooses one node v among its neighbors uniformly and independently at random.  We let each node accept a source packet with equal probability.  Each source packet should visit each node in the network at least once.

14  The algorithm consists of three phases. (i) Initialization Phase:  (1) Each node u draws a random number d_c(u) according to the distribution Ω_is(d). Each source node s_i, i = 1, …, k, generates a header for its source packet x_si and puts in it its ID and a counter c(x_si) = 0.  (2) Each source node s_i sends out its own source packet x_si to one neighbor u, chosen uniformly at random among all its neighbors N(s_i).  (3) The node u accepts this source packet x_si with probability d_c(u)/k and, if it accepts, updates its storage as y_u = y_u ⊕ x_si (the storage packet y_u is the XOR of all packets accepted so far).

15 Algorithm 1: Knowing global information k and n (ii) Encoding Phase:  (1) In each round, each node u that has received at least one source packet before the current round forwards the head-of-line (HOL) packet x in its forward queue to one of its neighbors v, chosen at random.  (2) The node v makes its decision: If this is the first time that x visits v, then v accepts the packet with probability d_c(v)/k and updates its storage as y_v = y_v ⊕ x. Else, if c(x) < C_1 n log n, where C_1 is a system parameter, then v puts x into its forward queue and increases the counter of x by one: c(x) = c(x) + 1. If c(x) ≥ C_1 n log n, then v discards the packet x forever. (iii) Storage Phase:  When a node u has made its decisions for all the source packets x_s1, x_s2, …, x_sk, i.e., all these packets have visited node u at least once, node u finishes its encoding process, and y_u is the storage packet of u.

16 Algorithm 1: Knowing global information k and n
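The three phases above can be sketched in a single-process simulation. This is an illustrative sketch, not the authors' implementation: the graph, payloads, and the simplification of running each packet's walk to completion one at a time (rather than round by round) are all assumptions.

```python
import math
import random

def ltcds_i(adj, source_packets, C1=3, rng=None):
    """Illustrative sketch of LTCDS-I (not the authors' reference code).

    adj: dict node -> list of neighbors (the graph itself is an assumption);
    source_packets: dict source_node -> integer payload.
    Returns (storage, accepted): the XOR storage y_u of each node and the
    set of source IDs each node absorbed into y_u.
    """
    rng = rng or random.Random(0)
    n, k = len(adj), len(source_packets)
    # Initialization: each node draws a code degree from the Ideal Soliton pmf.
    weights = [1.0 / k] + [1.0 / (d * (d - 1)) for d in range(2, k + 1)]
    dc = {u: rng.choices(range(1, k + 1), weights=weights)[0] for u in adj}
    storage = {u: 0 for u in adj}
    accepted = {u: set() for u in adj}
    visited = {u: set() for u in adj}
    hop_limit = C1 * n * math.log(n)

    # Encoding: each source packet performs a simple random walk, carrying a
    # hop counter c(x); it is discarded after C1 * n * log(n) hops.
    for src, payload in source_packets.items():
        pos, hops = src, 0
        while hops < hop_limit:
            pos = rng.choice(adj[pos])          # forward to a uniform random neighbor
            hops += 1
            if src not in visited[pos]:         # first visit of x to this node
                visited[pos].add(src)
                if rng.random() < dc[pos] / k:  # accept with probability d_c(u)/k
                    storage[pos] ^= payload     # y_u = y_u XOR x_src
                    accepted[pos].add(src)
    # Storage phase: y_u is final once every source packet has visited u.
    return storage, accepted

# Toy run: complete graph on 15 nodes, 4 sources (sizes are illustrative).
adj = {u: [v for v in range(15) if v != u] for u in range(15)}
rng = random.Random(7)
packets = {s: rng.getrandbits(8) for s in range(4)}
storage, accepted = ltcds_i(adj, packets)
```

On a dense graph like this, the C_1 n log n hop budget comfortably exceeds the walk's cover time, so every packet visits every node at least once, as the storage phase requires.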

17 Algorithm 1: Knowing global information k and n  Theorem 7. Suppose a sensor network has n nodes and k sources, and the LTCDS-I algorithm uses the Robust Soliton distribution Ω_rs. Then, when n and k are sufficiently large, the k original source packets can be recovered from any k + O(√k ln²(k/δ)) storage nodes with probability 1 − δ. The decoding complexity is O(k ln(k/δ)).  Theorem 8. Denote by T the total number of transmissions of the LTCDS-I algorithm; then T = Θ(kn log n), where k is the total number of sources and n is the total number of nodes in the network (each of the k source packets takes C_1 n log n random-walk steps).

18 Algorithm 2: Without knowing global information n and k  In the previous algorithm, the values of n and k are known.  Here we do not assume anything about the network topology.  Nodes need not maintain a routing table or know the maximum degree of the graph.  We design the LTCDS-II algorithm for large values of n and k.

19 Algorithm 2: Without knowing global information n and k  We design a fully distributed storage algorithm which does not require any global information, i.e., the values of k and n are not known.  The idea is to use simple random walks to perform inference and obtain, at each node, individual estimates of n and k.  We use the inter-visit times of random walks.  Definition 9. (Inter-Visit Time or Return Time) For a random walk on a graph, the inter-visit time of node u, T_visit(u), is the amount of time between two consecutive visits of the random walk to node u. This inter-visit time is also called the return time.  Our goal is to compute estimates n̂(u) of n and k̂(u) of k at each node u.

20 Algorithm 2: Without knowing global information n and k  Lemma 10. For a node u with node degree d_n(u) in a random geometric graph, the mean inter-visit time is given by E[T_visit(u)] = μn / d_n(u), where μ is the mean degree of the graph.  The total number of nodes n can then be estimated by n̂(u) = T̄_visit(u) d_n(u) / μ, where T̄_visit(u) is the average of the measured inter-visit times at u.  However, the mean degree μ is global information and may be hard to obtain. Thus, with the further approximation μ ≈ d_n(u), we have n̂(u) = T̄_visit(u).
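Lemma 10 can be sanity-checked empirically. The sketch below (a made-up 4-node path graph, not from the paper) measures mean return times of a simple random walk and compares them to μn/d_n(u); on this graph μn/d_n(u) equals 2|E|/d_n(u), the classical return-time formula:

```python
import random

def mean_return_time(adj, u, steps=120000, rng=None):
    """Empirical mean inter-visit (return) time of a simple random walk to node u."""
    rng = rng or random.Random(1)
    pos, last_visit, gaps = u, 0, []
    for t in range(1, steps + 1):
        pos = rng.choice(adj[pos])
        if pos == u:
            gaps.append(t - last_visit)
            last_visit = t
    return sum(gaps) / len(gaps)

# Hypothetical tiny graph: the path 0-1-2-3 (degrees 1,2,2,1; mean degree mu = 1.5).
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
n, mu = 4, 1.5
for u in adj:
    theory = mu * n / len(adj[u])   # Lemma 10: E[T_visit(u)] = mu * n / d_n(u)
    assert abs(mean_return_time(adj, u) - theory) / theory < 0.08
```

In LTCDS-II no extra walks are needed for this: each node measures inter-visit times passively while the source packets' walks pass through it.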

21 Algorithm 2: Without knowing global information n and k  Definition 11. (Inter-Packet Time) For k random walks on a graph, the inter-packet time of node u, T_packet(u), is the amount of time between two consecutive visits to node u by any of the k random walks.  Lemma 12. For a node u with node degree d_n(u) in a random geometric graph with k simple random walks, the mean inter-packet time is given by E[T_packet(u)] = E[T_visit(u)] / k = μn / (k d_n(u)).  An estimate of k can then be obtained by k̂(u) = T̄_visit(u) / T̄_packet(u).  After obtaining estimates of both n and k, we can employ the same techniques used in LTCDS-I to do the LT coding and storage.
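The ratio estimator k̂(u) = T̄_visit(u)/T̄_packet(u) can likewise be checked in simulation. The sketch below is illustrative (the complete graph, k = 5, step count, and common starting node are all made-up choices; LTCDS-II obtains these times passively from the source packets' walks rather than by simulating them):

```python
import random

def estimate_k(adj, u, k, steps=150000, rng=None):
    """Estimate k at node u as (mean inter-visit time of one walk) /
    (mean inter-packet time over all k walks), per Lemma 12."""
    rng = rng or random.Random(2)
    walkers = [u] * k                       # common start is an arbitrary assumption
    single_gaps, any_gaps = [], []
    last_single = last_any = 0
    for t in range(1, steps + 1):
        walkers = [rng.choice(adj[w]) for w in walkers]
        if walkers[0] == u:                 # visit by one designated walk -> T_visit
            single_gaps.append(t - last_single)
            last_single = t
        if u in walkers:                    # visit by any of the k walks -> T_packet
            any_gaps.append(t - last_any)
            last_any = t
    t_visit = sum(single_gaps) / len(single_gaps)
    t_packet = sum(any_gaps) / len(any_gaps)
    return t_visit / t_packet               # k_hat = T_visit / T_packet

# Toy check on a complete graph of 50 nodes with k = 5 walks (made-up sizes).
adj = {u: [v for v in range(50) if v != u] for u in range(50)}
k_hat = estimate_k(adj, 0, k=5)
```

The estimator is slightly biased low on small graphs (several walks can hit u in the same step), which is consistent with Lemma 12 being an asymptotic statement for large n.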

22 Algorithm 2: Without knowing global information n and k  The algorithm consists of four phases. (i) Initialization Phase (ii) Inference Phase (iii) Encoding Phase (iv) Storage Phase

23 Performance Evaluation  Definition 15. (Decoding Ratio) The decoding ratio η is the ratio between the number of queried nodes h and the number of sources k: η = h/k.  Definition 16. (Successful Decoding Probability) The successful decoding probability P_s is the probability that all k source packets are recovered from the h queried nodes. Empirically, P_s = M_s / M, where M_s is the number of successful decodings over M trials.
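These definitions can be exercised with a toy LT encoder and peeling decoder. This is an illustrative sketch, not the paper's evaluation code: it draws degrees from the Ideal Soliton distribution for simplicity (the paper's experiments use the Robust Soliton distribution, which decodes more reliably), and all sizes are made up.

```python
import random

def peel_decode(equations, k):
    """Iterative peeling (belief-propagation) decoder over XOR equations.
    equations: list of (source-index iterable, xor value) pairs."""
    eqs = [[set(idxs), val] for idxs, val in equations]
    known = {}
    changed = True
    while changed and len(known) < k:
        changed = False
        for eq in eqs:
            idxs, val = eq
            for i in [j for j in idxs if j in known]:
                idxs.discard(i)             # substitute an already-recovered source
                val ^= known[i]
            eq[1] = val
            if len(idxs) == 1:              # a degree-1 equation reveals a source
                (i,) = idxs
                if i not in known:
                    known[i] = val
                    changed = True
    return known

def success_prob(k, eta, trials=200, rng=None):
    """Monte Carlo estimate of P_s = M_s / M at decoding ratio eta = h / k."""
    rng = rng or random.Random(3)
    h = int(eta * k)
    weights = [1.0 / k] + [1.0 / (d * (d - 1)) for d in range(2, k + 1)]
    wins = 0
    for _ in range(trials):
        src = [rng.getrandbits(8) for _ in range(k)]
        eqs = []
        for _ in range(h):
            d = rng.choices(range(1, k + 1), weights=weights)[0]
            idxs = rng.sample(range(k), d)
            val = 0
            for i in idxs:
                val ^= src[i]
            eqs.append((idxs, val))
        rec = peel_decode(eqs, k)
        wins += (len(rec) == k and all(rec[i] == src[i] for i in range(k)))
    return wins / trials

# The "easy to decode" chain from slide 7 (0-indexed): x1, x1^x2, x2^x3.
src = [5, 9, 12]
easy = [([0], src[0]), ([0, 1], src[0] ^ src[1]), ([1, 2], src[1] ^ src[2])]
assert peel_decode(easy, 3) == {0: 5, 1: 9, 2: 12}
```

Running `success_prob` for increasing η shows the same qualitative trend as the figures that follow: P_s climbs sharply with the decoding ratio.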

24 Performance Evaluation Figure 3. Decoding performance of the LTCDS-I algorithm with a small number of nodes and sources. When the decoding ratio is above 2, the successful decoding probability is about 99%. When the total number of nodes increases but the ratio between k and n and the decoding ratio η are kept constant, the successful decoding probability P_s increases when η ≥ 1.5 and decreases when η < 1.5.

25 Performance Evaluation Figure 4. Decoding performance of the LTCDS-I algorithm with a medium number of nodes and sources.

26 Performance Evaluation Figure 5. Decoding performance of the LTCDS-I algorithm with different numbers of nodes. Fixing the ratio between k and n at 10%, i.e., k/n = 0.1: as n grows, the successful decoding probability increases until it reaches a plateau, which is the successful decoding probability of real LT codes.

27 Performance Evaluation Figure 6. Decoding performance of the LTCDS-I algorithm with different values of the system parameter C_1. Studying values of the constant C_1: for C_1 ≥ 3, P_s is almost constant and close to 1. This means that after 3n log n steps, almost all source packets have visited each node at least once.

28 Performance Evaluation Figure 7. Decoding performance of the LTCDS-II algorithm with a small number of nodes and sources. The decoding performance of the LTCDS-II algorithm is slightly worse than that of LTCDS-I when the decoding ratio η is small, and almost the same when η is large.

29 Performance Evaluation Figure 8. Decoding performance of the LTCDS-II algorithm with a medium number of nodes and sources.

30 Performance Evaluation Figure 9. Estimation results in the LTCDS-II algorithm with n = 200 nodes and k = 20 sources: (a) estimates of n; (b) estimates of k. The estimates of k are more accurate and more concentrated than the estimates of n.

31 Performance Evaluation Figure 10. Estimation results in LTCDS-II algorithm with n = 1000 nodes and k = 100 sources: (a) estimations of n; (b) estimations of k.

32 Performance Evaluation Figure 11. Decoding performance of the LTCDS-II algorithm with different values of the system parameter C_2. When C_2 is small, the performance of the LTCDS-II algorithm is very poor, due to the inaccurate estimates of k and n at each node. When C_2 is large, for example C_2 ≥ 30, the performance is almost the same.

33 Conclusion  We proposed two new decentralized algorithms that utilize Fountain codes and random walks to distribute information sensed by k sensing source nodes to n storage nodes.

34 References
 [1] D. Aldous and J. Fill. Reversible Markov Chains and Random Walks on Graphs. Preprint.
 [6] A. G. Dimakis, V. Prabhakaran, and K. Ramchandran. Distributed fountain codes for networked storage. In Proc. IEEE ICASSP 2006, May 2006.
 [9] A. Kamra, V. Misra, J. Feldman, and D. Rubenstein. Growth codes: Maximizing sensor network data persistence. In Proc. ACM SIGCOMM 2006, pages 255-266, Pisa, Italy, 2006.
 [10] Y. Lin, B. Li, and B. Liang. Differentiated data persistence with priority random linear codes. In Proc. 27th International Conference on Distributed Computing Systems (ICDCS '07), Toronto, Canada, June 2007.
 [11] Y. Lin, B. Liang, and B. Li. Data persistence in large-scale sensor networks with decentralized fountain codes. In Proc. 26th IEEE INFOCOM, Anchorage, Alaska, May 6-12, 2007.
 [12] M. Luby. LT codes. In Proc. 43rd Symposium on Foundations of Computer Science (FOCS 2002), Vancouver, BC, Canada, November 2002.
 [13] D. S. Lun, N. Ratnakar, R. Koetter, M. Medard, E. Ahmed, and H. Lee. Achieving minimum-cost multicast: A decentralized approach based on network coding. In Proc. 24th IEEE INFOCOM, volume 3, pages 1607-1617, March 2005.
 [14] R. Motwani and P. Raghavan. Randomized Algorithms. Cambridge University Press, 1995.
 [17] S. Ross. Stochastic Processes. Wiley, New York, second edition, 1995.