Erasure Channel: ◦ Files are transmitted in multiple small packets. ◦ Each packet is either received without error or loss. ◦ Such as the Internet. How to deal with packet loss? ◦ Some simple retransmission protocols: ACKs: for missing packets. ACKs: for received packets. ◦ Erasure Correcting Codes. 3
Why Erasure Correcting Codes? Retransmission are wasteful when erasure is serious: ACKs: for missing packets. ACKs would be enormous. ACKs: for received packets. Would lead to multiple copies. Broadcast Channel with erasure: 4 server A D C F E
Erasure Correcting Codes: Block Code, such as (N, K) Reed-Solomon Code: Any K of the N transmitted symbols are received, then the original K source symbols can be recovered. High Complexity: O( K(N-K) log 2 N) Estimate the erasure probability f first, then choose the code rate R = K/N before transmission. Ex. If loss-rate = 50%, then set code rate R = 1/(1-50%) = 1/2 = K/N. ( N = 2K ) 5 loss-rate = 50%
Erasure Correcting Codes: Block Code, such as (N, K) Reed-Solomon Code: If f is larger than expected (decoder receives fewer than K symbols): Ex. We thought loss-rate is 50%, and set the code rate R = 1/2 ( N = 2K ); however, the actual loss-rate = 66.7%, the proper code rate R should be lower: R = 1/3 ( N = 3K ) We would like a simple way to extend the code on the fly to create a lower-rate (N’, K) code. 6 loss-rate = 66.7% No way!
Fountain Codes are rateless: The number of encoded packets generated can be determined on the fly. 7
Fountain Codes are rateless: The number of encoded packets generated can be determined on the fly. Fountain Codes can also have fantastically small encoding and decoding complexities. Depends on the careful choice of Degree Distribution.
Balls–and–Bins Problem: ◦ Imagine that we throw N balls independently at random into K bins, what probability of one bin have no balls in it? 9 … K bins … N throws
Balls–and–Bins Problem: ◦ After N balls have been thrown, what probability of one bin have no ball in it? The probability that one particular bin is empty after N balls have been thrown: 10 … K bins
Balls–and–Bins Problem: ◦ After N balls have been thrown, what probability of one bin have no ball in it? The probability that one particular bin is empty after N balls have been thrown: The expected number of empty bins: δ =,which roughly implies: the probability of all bins have a ball is (1- δ) only if: 11 … K bins
Luby Transform (LT) Codes: ◦ Encoding process: For the i th encoded packet, select degree d i by carefully chosen Degree Distribution (Robust Soliton Distribution). Choose d i source data. Perform XOR on chosen data. ◦ Decoding process: Decode degree-one encoded packets. Remove degree-one edges iteratively. 12 Degree 123…k probability μ1μ1 μ2μ2 μ3μ3 …μkμk … x1x1 x2x2 x3x3 x4x4 x5x5 y1y1 y2y2 y3y3 y4y4 y5y5 x1x3x1x3 x2x2 x2x5x2x5 x4x4 x6x6 x3x5x6x3x5x6
Designing the Degree Distribution: ◦ A few encoded packets must have high degree. To ensure that every source data are connected to encoded packets. ◦ Many encoded packets must have low degree. So that decoding process can get started, and keep going. Also the total number of XOR operations involved in the encoding and decoding is kept small. 13 x1x1 x2x2 x3x3 x4x4 x5x5 y1y1 y2y2 y3y3 y4y4 y5y5
Some properties of Degree Distribution: ◦ The complexity (both encoding and decoding) are scaled linearly with the number of edges in the graph. ◦ Key factor: The average degree of the encoded packets. How small (number of edges) can this be? ◦ Recall: Balls–and–Bins Problem. Balls: linked edges. Bins: source data. 14 x1x1 x2x2 x3x3 x4x4 x5x5 y1y1 y2y2 y3y3 y4y4 y5y5 How small number of edges can assure that every source data must have at least one edge on it? (all bins have a ball)
Some properties of encoder: ◦ Encoder throws edges into source data at random. The number of edges must be at least of order : K lnK. Balls–and–Bins Problem: The expected number of empty bins: δ =,which roughly implies: the probability of all bins have a ball is (1- δ) only if: 15
For decoder: ◦ If decoder received optimal K encoded packets, the average degree of each encoded packet must be at least: lnK The number of edges must be at least of order: K lnK. The complexity (both encoding and decoding) of an LT code will definitely be at least: K lnK Luby showed that this bound on complexity can indeed be achieved by a careful choice of Degree Distribution. 16
Ideally, to avoid redundancy: ◦ We would like just one check node has degree one at each iteration. Ideal Soliton Distribution: The expected average degree under this distribution is roughly: lnK 17
In practice, this distribution works poorly: ◦ Fluctuations around the expected behavior: Sometimes in the decoding process there will be no degree-one check node. ◦ A few source data will receive no connections at all. Some small modification fixes these problems. ◦ Robust Soliton Distribution: More degree-one check node. A bit more high-degree check node. 18
Robust Soliton Distribution: Two extra parameters: c and δ The expected number of degree-one check node (through out decoding process) is about: δ: a bound on the decoding failure probability. Decoding fails to run to completion after K’ of encoded packets have been received. c: a free parameters smaller than 1. Luby’s key result. 19
Luby defines a positive function:, then adds the Ideal Soliton Distribution ρ to τ and normalize to obtain the Robust Soliton Distribution μ: 20, where ※ Receiver once receives K' = KZ encoded packets ensures that the decoding can run to completion with probability at least 1 - δ.
21 τ( k/S ) ※ High-degree ensures every source data is likely to be connected to a check. ※ Small-degree of τ ensures the decoding process gets started.
22 ※ Histograms of the number of encoded packets N required in order to recover source data K = 10,000
23 ※ Practical performance of LT codes - Three experimental decodings are shown. ※ All codes created with c = 0.03, δ = 0.5 (S= 30, K/S = 337, Z = 1.03), and K = 10,000 an overhead of 10%
Complexity cost: ◦ LT Codes: O(K lnK), where K: the number of original data. Average degree of the encoded packets: lnK Encoding and decoding complexity: lnK per encoded packet ◦ Raptor Codes: Linear time encoding and decoding. Concatenating a weakened LT Code with an outer code. Average degree of weakened LT code ≒ 3 24
Weakened LT Code: ◦ Average degree of encoded packets ≒ 3 ◦ A fraction of source data will receive no connections at all. What fraction? ◦ Balls–and–Bins Problem: 25 Also the fraction of empty bins ≒ 5%
Shokrollahi’s trick: ※ Encoder: ◦ Find a outer code can correct erasures if the erasure rate is:, then pre-code K source data into: ◦ Transmit this slightly enlarged data using a weaken LT Code. ※ Decoder: ◦ Once slightly more than K encoded packets been received, can recover of the pre-coded packets (roughly K packets). ◦ Then use the outer code to recover all the original data. 26
27 K = 16 K’ = 20 N = 18 ※ Schematic diagram of a Raptor Code Pre-Coding Weaken LT covered = 17
28 ※ The idea of a weakened LT Code. ※ LT codes created with c = 0.03, δ = 0.5 and truncated at degree 8 (thus average degree = 3)
30 Ideal Soliton Distribution: The average degree is roughly: lnK
31 Robust Soliton Distribution μ: The average degree is roughly: lnK, where