# Error correcting codes


Error correcting codes
A practical problem of theoretical importance

Claude Shannon (1916–2001). 1937: MSc, MIT.
“As a 21-year-old master's student at MIT, he wrote a thesis demonstrating that electrical applications of Boolean algebra could construct and resolve any logical, numerical relationship. It has been claimed that this was the most important master's thesis of all time.” 1948: the paper inventing the field of information theory.

The binary symmetric channel
Alice wants to send Bob a binary string. Each transmitted bit is flipped with probability p < 1/2. How many bits are needed to reliably transfer k bits of information? How much “information” is transferred in each bit?

Pictorially
x ∈ {0,1}^k → y = E(x) ∈ {0,1}^n → (noisy channel) → y' = y'1..y'n → x' = D(y') ∈ {0,1}^k. The scheme is useful if Pr_x[x' ≠ x] ≤ ε. Shannon: There exists a useful scheme (with exponentially small error) transmitting k = (1-H(p))n bits using n communication bits.

Richard Hamming (1915–1998) “Hamming worked at Bell Labs in the 1940s on the Bell Model V computer, an electromechanical relay-based machine with cycle times in seconds. Input was fed in on punch cards, which would invariably have read errors. During weekdays, special code would find errors and flash lights so the operators could correct the problem. During after-hours periods and on weekends, when there were no operators, the machine simply moved on to the next job. Hamming worked on weekends, and grew increasingly frustrated with having to restart his programs from scratch due to the unreliability of the card reader. Over the next few years he worked on the problem of error-correction, developing an increasingly powerful array of algorithms. In 1950 he published what is now known as Hamming Code, which remains in use today in applications such as ECC memory.” THE PURPOSE OF COMPUTING IS INSIGHT, NOT NUMBERS.

x ∈ {0,1}^k → y = E(x) ∈ {0,1}^n → y' (differing from y in at most δn bits) → x' = D(y') ∈ {0,1}^k. The scheme is useful if for every x and every adversary, x' = x.

Rate and distance
Definition: E is an (n,k,d) code if E : Σ^k → Σ^n and for all x ≠ y, d(E(x),E(y)) ≥ d. Here n is the length, k the number of information bits, and d the distance; relative rate = k/n, relative distance = d/n. Lemma: If E is an (n,k,d) code, then the encoding can detect d-1 errors and correct ⌊(d-1)/2⌋ errors.
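The lemma can be checked exhaustively on a tiny example. A minimal sketch in Python, with the 5-fold repetition code standing in as an example (5,1,5) code:

```python
from itertools import combinations

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

# Example (n,k,d) = (5,1,5) code: 5-fold repetition of one bit.
code = {(0,) * 5, (1,) * 5}
d = min(hamming(a, b) for a in code for b in code if a != b)  # d = 5

def flip(word, positions):
    return tuple(b ^ 1 if i in positions else b for i, b in enumerate(word))

# Detection: up to d-1 errors never turn one codeword into another.
for c in code:
    for t in range(1, d):
        for pos in combinations(range(5), t):
            assert flip(c, pos) not in code

# Correction: with up to (d-1)//2 errors the sent codeword remains the
# unique nearest codeword, so nearest-neighbour decoding recovers it.
for c in code:
    for t in range(1, (d - 1) // 2 + 1):
        for pos in combinations(range(5), t):
            w = flip(c, pos)
            assert min(code, key=lambda x: hamming(w, x)) == c
```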

Linear codes
Definition: Let F be a field. E is an [n,k,d]_F code if E is a linear operator from F^k → F^n and has distance d. Equivalently: C ⊆ F^n is an [n,k,d]_F code if it is a k-dimensional vector space over F and has distance d. Fact: dist(C) = min weight(c) over nonzero c ∈ C.
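The fact is easy to verify on a small linear code. A minimal sketch, using the length-3 one-parity-bit code as the example:

```python
from itertools import product

# C: the [3,2,2]_2 one-parity-bit code: all even-weight words of length 3.
C = [(a, b, a ^ b) for a, b in product((0, 1), repeat=2)]

weight = lambda c: sum(c)
dist = lambda a, b: sum(x != y for x, y in zip(a, b))

min_dist = min(dist(a, b) for a in C for b in C if a != b)
min_wt = min(weight(c) for c in C if any(c))

# For a linear code d(a,b) = weight(a-b), and a-b is again a codeword,
# so minimum distance equals minimum nonzero weight.
assert min_dist == min_wt == 2
```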

Examples

Repetition code
Linear code. Length: t. Dimension: 1. Relative rate: 1/t. Distance: t. Relative distance: 1.
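A minimal sketch of the repetition code with majority-vote decoding (t = 5 is an arbitrary choice for the demo):

```python
from collections import Counter

t = 5  # repetition factor: rate 1/t, distance t

def encode(bit, t=t):
    return [bit] * t

def decode(word):
    # majority vote corrects up to (t-1)//2 flipped bits
    return Counter(word).most_common(1)[0][0]

word = encode(1)
word[0] ^= 1
word[3] ^= 1           # two errors; (t-1)//2 = 2, still decodable
assert decode(word) == 1
```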

One parity bit
x1 x2 … xn → x1 x2 … xn (⊕i xi). Linear code. Length: n+1. Dimension: n. Relative rate: 1-1/(n+1). Distance: 2. Relative distance: 2/(n+1).

Hamming code [7,4,3]2 Encoding: x → Gx
Decoding: compute the syndrome s = Hy. If s = (000) – no error. Otherwise – s equals the column of H indexing the corrupted bit; flip that bit.
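The encoding and decoding can be sketched concretely. This sketch uses the convention that the j-th column of H is the binary representation of j, so the syndrome directly names the corrupted position; the systematic layout (data at positions 3, 5, 6, 7) is one standard choice, not necessarily the G of the slide:

```python
# [7,4,3] Hamming code where column j of the parity-check matrix is the
# binary representation of j: the syndrome *is* the error position.
DATA_POS = (3, 5, 6, 7)        # data bits; parity bits sit at 1, 2, 4

def encode(data):              # data: 4 bits
    c = [0] * 8                # 1-indexed; c[0] unused
    for p, b in zip(DATA_POS, data):
        c[p] = b
    for p in (1, 2, 4):        # parity over all positions j with bit p set
        c[p] = sum(c[j] for j in range(1, 8) if j & p) % 2
    return c[1:]

def decode(word):              # corrects up to one flipped bit
    c = [0] + list(word)
    syndrome = sum(p for p in (1, 2, 4)
                   if sum(c[j] for j in range(1, 8) if j & p) % 2)
    if syndrome:               # syndrome = 1-indexed error position
        c[syndrome] ^= 1
    return [c[p] for p in DATA_POS]

msg = [1, 0, 1, 1]
y = encode(msg)
y[4] ^= 1                      # corrupt one bit
assert decode(y) == msg
```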

Reed-Solomon code
Linear code. Length: n = |F|. Dimension: k. Relative rate: r = k/n. Distance: n-k+1. Relative distance: δ = 1-r+1/n.
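A minimal sketch of RS encoding by polynomial evaluation, with the distance n-k+1 verified exhaustively for the toy parameters p = 5, k = 2:

```python
from itertools import product

# Reed-Solomon over F_5 with k = 2: encode a message (the coefficients of
# a degree-<k polynomial) by evaluating it at every field element.
p, k = 5, 2
n = p

def rs_encode(msg):
    return tuple(sum(c * pow(x, i, p) for i, c in enumerate(msg)) % p
                 for x in range(p))

codewords = [rs_encode(m) for m in product(range(p), repeat=k)]

def dist(a, b):
    return sum(x != y for x, y in zip(a, b))

# Two distinct degree-<k polynomials agree on at most k-1 points,
# so the distance is n - (k-1) = n - k + 1.
d = min(dist(a, b) for a in codewords for b in codewords if a != b)
assert d == n - k + 1 == 4
```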

The Singleton bound Theorem: δ ≤ 1-r+1/n. Equivalently: d ≤ n-k+1.
Proof: Project all codewords onto the first k-1 coordinates. By pigeonhole (|Σ|^k codewords, only |Σ|^(k-1) possible projections), two codewords must have the same projection, so they differ only in the remaining coordinates. Hence d ≤ n-(k-1).

Reed-Muller codes
C = { (f(x1),...,f(xN)) : f : F^m → F, deg(f) < d }, where F^m = {x1,...,xN}. Here we assume q = |F| > d > m. Linear code. Length: q^m. Dimension: (m+d choose m). Relative rate: ≈ (d/mq)^m ≈ 0. Relative distance: 1-d/q.

Hadamard code
C = { (f∙x1,...,f∙xN) : f ∈ F2^m }, where F2^m = {x1,...,xN} and f∙x is the inner product over F2. Linear code. Length: 2^m. Dimension: m. Relative rate: m/2^m ≈ 0. Relative distance: 1/2.
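A minimal sketch, verifying that every nonzero Hadamard codeword is balanced, i.e., the relative distance is exactly 1/2 (m = 4 is a demo choice):

```python
from itertools import product

m = 4
points = list(product((0, 1), repeat=m))     # F_2^m = {x_1,...,x_N}

def had_encode(f):
    # codeword = inner product of f with every point of F_2^m
    return tuple(sum(a * b for a, b in zip(f, x)) % 2 for x in points)

# Every nonzero linear form is balanced: exactly half its values are 1,
# so every nonzero codeword has weight 2^(m-1) and relative distance 1/2.
for f in points:
    if any(f):
        assert sum(had_encode(f)) == 2 ** (m - 1)
```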

How good can Binary error correcting codes be?

The Gilbert–Varshamov bound
Claim: There exists a (non-linear) code C with length n, distance d, and at least |Σ|^n / |B(0; d-1)| codewords. Proof: On board. Asymptotic behavior for |Σ| = 2: r ≥ 1 - H(δ). The same asymptotics hold for random linear codes.
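The finite form of the bound is easy to evaluate numerically and compare against the asymptotic r ≥ 1 - H(δ). A sketch (n = 1000 and δ = 0.1 are arbitrary demo parameters):

```python
from math import comb, log2

def gv_rate(n, delta):
    # GV over the binary alphabet: at least 2^n / |B(0; d-1)| codewords,
    # where |B(0; r)| = sum_{i<=r} C(n, i).
    d = int(delta * n)
    ball = sum(comb(n, i) for i in range(d))      # |B(0; d-1)|
    return (n - log2(ball)) / n                   # achievable rate

def H(x):                                         # binary entropy
    return -x * log2(x) - (1 - x) * log2(1 - x)

# The finite bound approaches r >= 1 - H(delta) as n grows.
n, delta = 1000, 0.1
assert abs(gv_rate(n, delta) - (1 - H(delta))) < 0.05
```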

The quest for asymptotically good binary ECC
Concatenation, RS + HAD (no). RS+RS+RS+OPT (hmm...). Justesen – sophisticated concatenation (yes). AG + HAD (yes). Expander codes. Finding an explicit ECC (explicit encoding + decoding) approaching the GV bound is still a major open problem.

Can we do better than random?
There is a gap between the best upper bound (obtained by a linear-programming argument) and the GV bound. For q = p^2 a prime power with q ≥ 49, there exists a better construction (even an explicit one). For a fixed δ and growing q: GV: r = 1 - δ - O(1/log q); AG: r ≥ 1 - δ - 1/(√q - 1).

An example
F = ℤ13. Evaluation polynomials: Span{1, x, x², x³, y, xy}. Evaluation points: {(x,y) : y² - 2(x-1)x(x+1) = 0}. S = {(0,0), (1,0), (2,±5), (3,±3), (4,±4), (6,±2), (7,±3), (9,±6), (10,±2), (11,±1)}. A linear [19,6,13]13 code. Compare with RS: [19,6,14]19.

Find the parity-check matrix. Compute the syndrome (should be 0).

Plenty of algorithms for specific codes. A beautiful algorithm for decoding RS. Elwyn Berlekamp (1940–). Madhu Sudan (1966–).

Decoding Reed-Solomon codes
Input: (x1,y1),...,(xn,yn). Promise: there exists a degree-k polynomial p(x) such that p(xi) = yi for at least 2n/3 values of i. Goal: find p. Algorithm: find a non-zero low-degree Q : F² → F such that ∀i, Q(xi, yi) = 0. Factor Q. Check all factors of the form y - f(x).
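In the unique-decoding regime the interpolation idea can be made fully concrete by restricting Q to the shape Q(x,y) = N(x) - y·E(x); this is the Berlekamp–Welch decoder, and the factor y - N(x)/E(x) is exactly y - p(x). A sketch over a small prime field (field size, degree, and error positions are all demo choices):

```python
def nullvec(A, p):
    """A nonzero v with A v = 0 over F_p (assumes one exists)."""
    A = [row[:] for row in A]
    rows, cols = len(A), len(A[0])
    pivots, r = {}, 0
    for c in range(cols):
        piv = next((i for i in range(r, rows) if A[i][c] % p), None)
        if piv is None:
            continue
        A[r], A[piv] = A[piv], A[r]
        inv = pow(A[r][c], p - 2, p)
        A[r] = [x * inv % p for x in A[r]]
        for i in range(rows):
            if i != r and A[i][c] % p:
                f = A[i][c]
                A[i] = [(A[i][j] - f * A[r][j]) % p for j in range(cols)]
        pivots[c] = r
        r += 1
    free = next(c for c in range(cols) if c not in pivots)
    v = [0] * cols
    v[free] = 1
    for c, ri in pivots.items():
        v[c] = -A[ri][free] % p
    return v

def poly_div(num, den, p):
    """Quotient num/den over F_p (coefficients low to high, exact division)."""
    num, out = num[:], [0] * (len(num) - len(den) + 1)
    inv = pow(den[-1], p - 2, p)
    for i in reversed(range(len(out))):
        out[i] = num[i + len(den) - 1] * inv % p
        for j, dj in enumerate(den):
            num[i + j] = (num[i + j] - out[i] * dj) % p
    return out

def bw_decode(points, k, p):
    """Recover the degree-<k polynomial agreeing with all but e points."""
    n = len(points)
    e = (n - k) // 2
    # One homogeneous equation N(x_i) - y_i E(x_i) = 0 per point; the
    # unknowns are the e+k coefficients of N, then the e+1 of E.
    A = [[pow(x, j, p) for j in range(e + k)] +
         [-y * pow(x, j, p) % p for j in range(e + 1)]
         for x, y in points]
    v = nullvec(A, p)
    N, E = v[:e + k], v[e + k:]
    while len(E) > 1 and E[-1] == 0:
        E.pop()
    # Any nonzero solution satisfies N = p(x)*E, so the quotient is p.
    f = poly_div(N, E, p)
    return (f + [0] * k)[:k]

p, k = 13, 2
msg = [3, 5]                                 # p(x) = 3 + 5x over F_13
pts = [(x, (3 + 5 * x) % p) for x in range(9)]
pts[1], pts[4], pts[7] = (1, 0), (4, 1), (7, 5)   # 3 errors, e = 3
assert bw_decode(pts, k, p) == msg
```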

Pictorially
Input: 13 points in the real Euclidean plane. Algorithm: find a low-degree Q and factor it.

Is it used in practice?

NASA spaceships Deep-space telecommunications
NASA has used many different error correcting codes. For missions between 1969 and 1977 the Mariner spacecraft used a Reed-Muller code. The noise these spacecraft were subject to was well approximated by a "bell curve" (normal distribution), so the Reed-Muller codes were well suited to the situation. The Voyager 1 and Voyager 2 spacecraft transmitted color pictures of Jupiter and Saturn in 1979 and 1980. Color image transmission required 3 times the amount of data, so the Golay (24,12,8) code was used. This Golay code is only 3-error correcting, but it could be transmitted at a much higher data rate. Voyager 2 went on to Uranus and Neptune and the code was switched to a concatenated Reed-Solomon / convolutional code for its substantially more powerful error correcting capabilities. Current DSN error correction is done with dedicated hardware. For some NASA deep space craft such as those in the Voyager program, Cassini-Huygens (Saturn), New Horizons (Pluto) and Deep Space 1, the use of hardware ECC may not be feasible for the full duration of the mission. The different kinds of deep space and orbital missions that are conducted suggest that trying to find a "one size fits all" error correction system will be an ongoing problem for some time to come.

Satellite communication
“Satellite broadcasting (DVB) The demand for satellite transponder bandwidth continues to grow, fueled by the desire to deliver television (including new channels and High Definition TV) and IP data. Transponder availability and bandwidth constraints have limited this growth, because transponder capacity is determined by the selected modulation scheme and Forward error correction (FEC) rate. Overview QPSK coupled with traditional Reed Solomon and Viterbi codes have been used for nearly 20 years for the delivery of digital satellite TV. Higher order modulation schemes such as 8PSK, 16QAM and 32QAM have enabled the satellite industry to increase transponder efficiency by several orders of magnitude. This increase in the information rate in a transponder comes at the expense of an increase in the carrier power to meet the threshold requirement for existing antennas. Tests conducted using the latest chipsets demonstrate that the performance achieved by using Turbo Codes may be even lower than the 0.8 dB figure assumed in early designs.”

Data storage (erasure codes, systematic codes)
“RAID 1 mirrors the contents of the disks, making a form of 1:1 ratio realtime backup. The contents of each disk in the array are identical to that of every other disk in the array. A RAID 1 array requires a minimum of two drives. RAID 1 mirrors, though during the writing process copy the data identically to both drives, would not be suitable as a permanent backup solution, as RAID technology by design allows for certain failures to take place.

RAID 3 or 4 (striped disks with dedicated parity) combines three or more disks in a way that protects data against loss of any one disk. Fault tolerance is achieved by adding an extra disk to the array and dedicating it to storing parity information. The storage capacity of the array is reduced by one disk. A RAID 3 or 4 array requires a minimum of three drives: two to hold striped data, and a third drive to hold parity data.

RAID 5 (striped disks with distributed parity) combines three or more disks in a way that protects data against the loss of any one disk. It is similar to RAID 3 but the parity is not stored on one dedicated drive; instead parity information is interspersed across the drive array. The storage capacity of the array is a function of the number of drives minus the space needed to store parity. The maximum number of drives that can fail in any RAID 5 configuration without losing data is only one. Losing two drives in a RAID 5 array is referred to as a "double fault" and results in data loss.

RAID 6 (striped disks with dual parity) combines four or more disks in a way that protects data against loss of any two disks.

RAID 1+0 (or 10) is a mirrored data set (RAID 1) which is then striped (RAID 0), hence the "1+0" name. A RAID 1+0 array requires a minimum of four drives: two mirrored drives to hold half of the striped data, plus another two mirrored for the other half of the data.

In Linux, MD RAID 10 is a non-nested RAID type like RAID 1 that only requires a minimum of two drives, and may give read performance on the level of RAID 0.”

Barcodes

And everywhere else “Reed–Solomon codes are used in a wide variety of commercial applications, most prominently in CDs, DVDs and Blu-ray Discs, in data transmission technologies such as DSL & WiMAX, in broadcast systems such as DVB and ATSC, and in computer applications such as RAID 6 systems.”

Are we done? Not at all. Q1: Can we handle errors when the number of errors is close to the distance? Q2: Can we decode a single bit more efficiently than the whole string? Q3: Are ECC useful for tasks other than error correction (e.g., for propagating entropy)?

Johnson's bound
Observation: No unique decoding is possible when the number of errors is above half the distance. Johnson's bound: Let C be a code with relative distance δ. Then for any α > 2√(1-δ) and any w ∈ {0,1}^n, |{c ∈ C : Ag(w,c) ≥ αn}| ≤ O(1/α).

List decoding vs. stochastic noise
Def: A code C list-decodes p noise if ∀ w ∈ {0,1}^n, |{c ∈ C : Ag(w,c) ≥ (1-p)n}| ≤ poly(n). Claim: A code C that can list decode p noise has rate r ≤ 1-H(p). Claim: For any ε > 0, there exists a code C that can list decode p noise with rate r ≥ 1-H(p)-ε.

List decode RS? The RS decoding algorithm we saw list-decodes RS close to Johnson's bound. An improvement of Guruswami–Sudan matches Johnson's bound: with rate r = k/n it needs √(k/n) agreement, i.e., it handles noise p with r = (1-p)². It is not known whether one can list decode RS better.

PV – Bundled RS Fq a field. E irreducible, deg(E) = n.
F = Fq[X] mod E(X), an extension field. Given f ∈ F, compute f_i = f^(h^i) ∈ F for i = 1..m. For every x ∈ Fq output (f1(x),...,fm(x)). Farzad Parvaresh was born in Isfahan, Iran, in 1978. He received the B.S. degree in electrical engineering from the Sharif University of Technology, Tehran, Iran, in 2001, and the M.S. and Ph.D. degrees in electrical and computer engineering from the University of California, San Diego, in 2003 and 2007, respectively. He is currently a postdoctoral scholar at the Center for the Mathematics of Information, California Institute of Technology, Pasadena, CA. His research interests include error-correcting codes, algebraic decoding algorithms, information theory, networks, and fun math problems. Dr. Parvaresh was a recipient of the best paper award at the 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS'05).

GR – Folded RS Fq a field. g generates Fq*.
Given f ∈ F, output a word in (Fq^c)^((q-1)/c) by chunking the sequence (f(g), f(g²),...,f(g^(q-1))) into length-c blocks. Rate vs. noise: for any ε > 0, r = 1-p-ε with field size q = function(ε).

Local decoding Up to now, decoding meant taking the noisy word and recovering the whole message from it: x = D(y'). Suppose we are only interested in one bit of x. Can we recover xi with fewer queries to y'? Note: we are still in the adversarial noise model.

Setting: y = Had(x), y' = y ⊕ noise, with up to 1/5 noise. Goal: recover xi from y', i ∈ [n]. Algorithm: choose z ∈ {0,1}^n at random. Output: y'(z) ⊕ y'(z ⊕ ei).
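A minimal sketch of this two-query local decoder, amplified by majority vote over independent choices of z (message length, noise positions, and trial count are demo choices):

```python
import random
random.seed(0)                               # fixed seed for reproducibility

m, n = 4, 16
x = 0b1011                                   # the message, as an m-bit int

def inner(a, b):                             # inner product over F_2
    return bin(a & b).count("1") % 2

y = [inner(x, z) for z in range(n)]          # Hadamard encoding of x

noisy = y[:]
for pos in random.sample(range(n), 2):       # corrupt 2/16 < 1/5 of positions
    noisy[pos] ^= 1

def local_decode_bit(w, i, trials=201):
    """Recover x_i with 2 queries per trial: w(z) xor w(z xor e_i).
    A trial errs only if exactly one of its two queries hits a
    corruption, so the majority over random z is correct w.h.p."""
    e_i = 1 << i
    votes = sum(w[z] ^ w[z ^ e_i]
                for z in (random.randrange(n) for _ in range(trials)))
    return int(2 * votes > trials)

assert all(local_decode_bit(noisy, i) == (x >> i) & 1 for i in range(m))
```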

Efremenko code
Lemma: All 2-locally decodable codes have exponential length. Yekhanin gave the first sub-exponential 3-locally decodable code. Efremenko gave 3-locally decodable codes of length ≈ 2^(2^√(log n)). Even the non-explicit bound is still wide open.

What's next PCP heavily uses local testing.
Derandomization heavily uses local list decoding. Randomness extractors are codes with good list-recovery decoding. Many modern extractors are codes in disguise. There is an intimate connection between randomness extractors and pseudorandom generators. And much more...

Summary A rich (and deep) theory with practical applications.
A very basic theoretical notion, intimately related to: randomness extractors, pseudorandomness, derandomization, PCP, and more.

Many open problems Is the GV bound tight for binary codes?
Find efficient codes meeting the GV bound. Do asymptotically good locally testable codes exist? Is efficient local decoding with a constant number of queries possible?

A project
Read the paper "Nearly-linear size holographic proofs" by Polishchuk and Spielman, on low-degree testing of bivariate polynomials, and explain it to me.