Information and Coding Theory

1 Information and Coding Theory
Linear Codes. Groups, fields and vector spaces - a brief survey. Some bounds on code parameters. Hamming and Golay codes. Syndrome decoding. Juris Viksna, 2015

2 Topics
- Codes – how to define them?
- Groups (some basic definitions and results)
- Fields (just a reminder of the definition; we will study them more thoroughly later)
- Vector spaces
- Linear codes - definitions and some basic results
- Some simpler codes: Hamming codes, Golay codes
- Some bounds on code parameters
- Syndrome decoding

3 Codes – how to define them?
In most cases it would be natural to use binary block codes that map input vectors of length k to output vectors of length n. Thus we can define a code as an injective mapping from a vector space V of dimension k to a vector space W of dimension n. Essentially this definition is used in Shannon's original theorem. [Figure: an injective mapping from V to W]

4 Codes – how to define them?
We can define a code as an injective mapping from a “vector space” V of dimension k to a “vector space” W of dimension n. Arbitrary mappings between vector spaces are hard both to define explicitly and to use (encode or decode) in practice: there are almost 2^(n·2^k) of them, which is already 2^112 ≈ 5·10^33 for k=4 and n=7. Simpler to define and use are linear codes, which can be defined by multiplication with a matrix of size k×n (called a generator matrix). Shannon's results also hold for linear codes.
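To make the count on this slide concrete, the following minimal sketch computes the number of arbitrary and of injective mappings from binary vectors of length k to binary vectors of length n. The function names are illustrative and not part of the lecture.

```python
from math import perm

def count_mappings(k: int, n: int) -> int:
    """Number of arbitrary maps {0,1}^k -> {0,1}^n, i.e. (2^n)^(2^k)."""
    return (2 ** n) ** (2 ** k)

def count_injective_mappings(k: int, n: int) -> int:
    """Number of injective maps: ordered choices of 2^k distinct images out of 2^n."""
    return perm(2 ** n, 2 ** k)

total = count_mappings(4, 7)
print(total == 2 ** 112)                      # the exponent n * 2^k is 7 * 16 = 112
print(count_injective_mappings(4, 7) < total)  # slightly fewer injective maps
```

By contrast, a linear code is fixed by a k×n binary matrix, so there are at most 2^(k·n) = 2^28 candidates for k=4 and n=7.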

5 Codes – how to define them?
Simpler to define and use are linear codes, which can be defined by multiplication with a matrix of size k×n (called a generator matrix). What should the elements of the vector spaces V and W be? In most cases it will be sufficient to have just 0-s and 1-s; however, to define a vector space we need a field: an algebraic system with operations “+” and “·” that have properties similar to those of ordinary arithmetic (think of real numbers). A field with just “0” and “1” may look very simple, but it turns out that to make real progress we will need more complicated fields, whose elements will themselves be regarded as (most often) binary vectors.

6 What are good codes? Linear codes can be defined by their generator matrix of size k×n. Shannon's theorem tells us that for a transmission channel with bit error probability p, and for an arbitrarily small bit error probability pb that we wish to achieve, there exist codes with rate R = k/n that achieve pb as long as R < C(p). In general, however, the error rate could be different for different codewords, pb being an “average” value. We will, however, consider codes that are guaranteed to correct up to t errors for any codeword.

7 What are good codes? We will, however, consider codes that are guaranteed to correct up to t errors for any codeword; this is equivalent to the minimum distance between codewords being d, with t = ⌊(d−1)/2⌋. Such codes are characterized by 3 parameters and are referred to as (n,k,d) codes. For a given k we are thus interested in:
- minimizing n
- maximizing d
In most cases, for fixed values of n and k, larger values of d give a lower bit error probability pb, although the computation of pb is not that straightforward and depends on the particular code. Note that one can completely “spoil” the d value of a good code with low pb by including in it a vector of weight 1.

8 Groups - definition
Consider a set G and a binary operator +.
Definition
The pair (G,+) is a group if there is e∈G such that for all a,b,c∈G:
1) a+b∈G
2) (a+b)+c = a+(b+c)
3) a+e = a and e+a = a
4) there exists inv(a)∈G such that a+inv(a) = e and inv(a)+a = e
If additionally a+b = b+a, the group is commutative (Abelian).
If the group operation is denoted by “+”, then e is usually denoted by 0 and inv(a) by −a. If the group operation is denoted by “·”, then e is usually denoted by 1 and inv(a) by a^(−1) (and a·b is usually written as ab). It is easy to show that e and inv(a) are unique.

9 Groups - definition
Definition
The pair (G,+) is a group if there is e∈G such that for all a,b,c∈G:
1) a+b∈G
2) (a+b)+c = a+(b+c)
3) a+e = a and e+a = a
4) there exists inv(a)∈G such that a+inv(a) = e and inv(a)+a = e
Examples:
- (Z,+), (Q,+), (R,+)
- (Q∖{0},·), (R∖{0},·) (but not (Z∖{0},·))
- (Z2,+), (Z3,+), (Z4,+)
- (Z2∖{0},·), (Z3∖{0},·) (but not (Z4∖{0},·))
A simple non-commutative group: ⟨x,y⟩, where x: abc→cab (rotation) and y: abc→acb (inversion).

10 Groups - definition (H,+) is a subgroup of (G,+) if H⊆G and (H,+) is a group. H<G: notation for H being a subgroup of G. ⟨x1,x2,…,xk⟩: the subgroup of G generated by x1,x2,…,xk∈G. o(G): the number of elements of group G. For the first few lectures we just need to remember the definition of a group. Other facts will become important when discussing finite fields. We will consider only commutative and finite groups!

11 Lagrange theorem Theorem If H < G then o(H) | o(G). Proof
For all a∈G consider the sets aH = {ah | h∈H} (these are called cosets). All elements of a coset aH are distinct, so |aH| = o(H). Each element g∈G belongs to some coset aH (for example g∈gH), and two cosets either coincide or are disjoint. Thus G is a union of disjoint cosets, each having o(H) elements, hence o(H) | o(G).
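As a small illustration of the proof, the sketch below lists the cosets of a subgroup of (Z12,+). The choice of Z12 and of the subgroup {0,4,8} is an assumption made only for this example.

```python
def cosets(n, subgroup):
    """Distinct cosets a+H of a subgroup H of (Z_n, +)."""
    H = frozenset(subgroup)
    return {frozenset((a + h) % n for h in H) for a in range(n)}

parts = cosets(12, {0, 4, 8})
for part in sorted(sorted(p) for p in parts):
    print(part)
# The cosets partition Z12 into blocks of size o(H), so o(H) divides o(G).
print(len(parts) * 3 == 12)
```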

12 Fields - definition
Consider a set F and binary operators “+” and “·”.
Definition
The triple (F,+,·) is a field if there are 0,1∈F such that for all a,b,c,d∈F with d≠0:
1) a+b∈F and a·b∈F
2) a+b = b+a and a·b = b·a
3) (a+b)+c = a+(b+c) and (a·b)·c = a·(b·c)
4) a+0 = a and a·1 = a
5) there exist −a, d^(−1)∈F such that a+(−a) = 0 and d·d^(−1) = 1
6) a·(b+c) = a·b+a·c
We can say that F is a field if both (F,+) and (F∖{0},·) are commutative groups and the operators “+” and “·” are linked by property (6). Examples: (Q,+,·), (R,+,·), (Z2,+,·), (Z3,+,·) (but not (Z4,+,·)).

13 Finite fields - some examples
[Adapted from V.Pless]

14 Finite fields - some examples
Note that we could use an alternative notation and consider the field elements as 2-dimensional binary vectors (the usual vector addition then corresponds to the operation “+” in the field!): 0=(0 0), 1=(0 1), ω=(1 0), ω²=ω+1=(1 1). [Adapted from V.Pless]
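The vector view of the 4-element field can be sketched as follows. This is an illustration under assumptions: the elements are encoded as the integers 0..3 whose two bits are the vector components, ω is taken to be (1 0), and multiplication is polynomial multiplication modulo x²+x+1; the tables on the original slide are not reproduced here.

```python
def gf4_add(a: int, b: int) -> int:
    """Addition in GF(4): component-wise addition of 2-bit vectors, i.e. XOR."""
    return a ^ b

def gf4_mul(a: int, b: int) -> int:
    """Multiplication in GF(4): polynomial product reduced modulo x^2 + x + 1."""
    p = 0
    if b & 1:
        p ^= a
    if b & 2:
        p ^= a << 1
    if p & 4:          # a degree-2 term appeared: subtract x^2 + x + 1
        p ^= 0b111
    return p

names = {0: "0", 1: "1", 2: "w", 3: "w+1"}
for a in range(4):
    print([names[gf4_mul(a, b)] for b in range(4)])
```

Every non-zero element has a multiplicative inverse here, which is what distinguishes this field from (Z4,+,·).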

15 Vector spaces - definition
What do we usually understand by vectors? In principle we can say that vectors are n-tuples of the form (x1,x2,…,xn), and that the operations of vector addition and multiplication by a scalar are defined and have the following properties:
(x1,x2,…,xn)+(y1,y2,…,yn) = (x1+y1,x2+y2,…,xn+yn)
a(x1,x2,…,xn) = (ax1,ax2,…,axn)
The requirements actually are a bit stronger: the elements a and xi should come from some field F. We might be able to live with such a definition, but then we would link a vector space to a unique and fixed basis, and often this will be technically very inconvenient.

16 Vector spaces - definition
Let (V,+) be a commutative group with identity element 0, let F be a field with multiplicative identity 1 and let “·” be an operator mapping F×V to V. Definition The 4-tuple (V,F,+,·) is a vector space if (V,+) is a commutative group with identity element 0 and for all u,v∈V and all a,b∈F:
1) a(u+v) = au+av
2) (a+b)v = av+bv
3) a(bv) = (ab)v
4) 1v = v
The requirement that (V,+) is a group is used here for brevity of the definition; the structure of (V,+) will largely depend on F. Note that the operators “+” and “·” are “overloaded”: the same symbols are used for field and vector space operators.

17 Vector spaces - definition
The 4-tuple (V,F,+,·) is a vector space if (V,+) is a commutative group with identity element 0 and for all u,v∈V and all a,b∈F:
1) a(u+v) = au+av
2) (a+b)v = av+bv
3) a(bv) = (ab)v
4) 1v = v
Usually we will represent vectors as n-tuples of the form (x1,x2,…,xn); however, such representations are not unique and depend on the particular basis of the vector space that we choose to use (but 0 will always be represented as the n-tuple of zeroes (0,0,…,0)).

18 Vector spaces - some terminology
A subset S of vectors from a vector space V is a subspace of V if S is itself a vector space. We denote this by S⊑V. A linear combination of v1,…,vk∈V (V being defined over a field F) is a vector of the form a1v1+…+akvk where ai∈F. For given v1,…,vk∈V the set of all linear combinations of these vectors is denoted by ⟨v1,…,vk⟩. It is easy to show that ⟨v1,…,vk⟩⊑V. We also say that the vectors v1,…,vk span the subspace ⟨v1,…,vk⟩. A set of vectors v1,…,vk is (linearly) independent if a1v1+…+akvk ≠ 0 for all a1,…,ak∈F such that ai ≠ 0 for at least one index i.

19 Independent vectors - some properties
A set of vectors v1,…,vk is (linearly) independent if a1v1+…+akvk ≠ 0 for all a1,…,ak∈F such that ai ≠ 0 for at least one index i. Assume that the vectors v1,v2,v3,…,vk are independent. Then the following sets are also independent:
- any permutation of the vectors v1,v2,v3,…,vk (this doesn't change the set :)
- av1,v2,v3,…,vk, if a ≠ 0
- v1+v2,v2,v3,…,vk
Moreover, these operations do not change the subspace spanned by the initial vectors v1,…,vk.

20 Vector spaces - some results
Theorem 1 If v1,…,vk∈V span the entire vector space V and w1,…,wr∈V is an independent set of vectors, then r ≤ k. Proof Suppose this is not the case and r ≥ k+1. Since ⟨v1,…,vk⟩ = V we have:
w1 = a11v1+…+a1kvk
w2 = a21v1+…+a2kvk
…
wk = ak1v1+…+akkvk
wk+1 = a(k+1)1v1+…+a(k+1)kvk
We can assume that a11 ≠ 0 (at least some scalar must be non-zero due to the independence of the wi's).

21 Vector spaces - some results
Thus we have b1w1 = v1+a'12v2+…+a'1kvk for some non-zero b1 (namely b1 = a11^(−1)). Subtracting ai1·b1w1 from each wi with i > 1 gives us a new system of equations with an independent set of k+1 vectors on the left side, in which v1 appears only in the first equation. By repeating this process k+1 times, starting with the i-th equation in the i-th iteration, we end up with a set of k+1 independent vectors, where the i-th vector is expressed as a sum of no more than k+1−i of the vi's. However, this means that the (k+1)-st vector is 0, which contradicts the independence of the set of k+1 vectors.

22 Vector spaces - basis Theorem 2
If two finite sets of independent vectors span a space V, then there is the same number of vectors in each set. Proof Let k be the number of vectors in the first set and r the number of vectors in the second set. By Theorem 1 we have k ≤ r and r ≤ k. We will be interested in vector spaces that are spanned by a finite number of vectors, so we assume this from now on. We say that a set of vectors v1,…,vk∈V is a basis of V if they are independent and span V.

23 Vector spaces - basis Theorem 3
Let V be a vector space over a field F. Then the following hold:
1) V has a basis.
2) Any two bases of V contain the same number of vectors.
3) If B is a basis of V then every vector in V has a unique representation as a linear combination of vectors in B.
Proof
1) Let v1∈V and v1 ≠ 0. If V = ⟨v1⟩ we are finished. If not, there is a vector v2∈V∖⟨v1⟩ with v2 ≠ 0. If V = ⟨v1,v2⟩ we are finished. If not, continue until we obtain a basis for V. The process must terminate by our assumption that V is spanned by a finite number k of vectors and by the result of Theorem 1 that a basis cannot have more than k vectors.

24 Vector spaces - basis Theorem 3
Let V be a vector space over a field F. Then the following hold:
1) V has a basis.
2) Any two bases of V contain the same number of vectors.
3) If B is a basis of V then every vector in V has a unique representation as a linear combination of vectors in B.
Proof
2) This is a direct consequence of Theorem 2.
3) Since the vectors in B span V, any v∈V can be written as v = a1b1+...+akbk. If we also have v = c1b1+...+ckbk, then a1b1+...+akbk = c1b1+...+ckbk and, since the bi's are independent, ai = ci for all i.

25 Vector spaces - dimension
The dimension of a vector space V, denoted by dim V, is the number of vectors in any basis of V. By Theorem 3 the dimension of V is well defined. Independent vectors: a few more properties. Assume that the set of vectors v1,…,vk∈V is independent and that in some fixed basis b1,…,bk∈V we have the representations vi = ai1b1+...+aikbk. Then the sets of vectors obtained from v1,…,vk by the following operations are also independent:
- for some indices j and l, swapping all aij-s with ail-s
- for some j and non-zero c∈F, replacing all aij-s with c·aij-s (*)
- for some indices j and l, replacing all aij-s with aij+ail-s (**)
Note, however, that these operations can change the subspace spanned by the initial vectors v1,…,vk.

26 Vector spaces and ranks of matrices
If M is a matrix whose elements are contained in a field F, then the row rank of M, denoted by rr(M), is defined to be the dimension of the subspace spanned by the rows of M. The column rank of M, denoted by rc(M), is defined to be the dimension of the subspace spanned by the columns of M. An n×n matrix A is called nonsingular if rr(A) = n. This means that the rows of A are linearly independent.

27 Vector spaces and ranks of matrices
If M is a matrix, the following operations on its rows are called elementary row operations:
1) Permuting rows.
2) Multiplying a row by a nonzero scalar.
3) Adding a scalar multiple of one row to another row.
Similarly, we can define the following to be elementary column operations:
4) Permuting columns.
5) Multiplying a column by a nonzero scalar.
6) Adding a scalar multiple of one column to another column.
As we saw, none of these operations affects rr(M) or rc(M), although operations 1,2,3 and 4,5,6 could change the column and row spaces of M, respectively.

28 Vector spaces - matrix row-echelon form
A matrix M is said to be in row-echelon form if, after possibly permuting its columns, M = [ I A ; 0 0 ] in block form (the first k rows are (I A) and all remaining rows are zero), where I is the k×k identity matrix. Lemma Any matrix M can be transformed into a matrix M′ in row-echelon form by applying elementary operations 1,2,3,4.

29 Vector spaces - matrix row-echelon form
Lemma Any matrix M can be transformed into a matrix M′ in row-echelon form by applying elementary operations 1,2,3,4. Proof (sketch) Proceed iteratively in steps 1...k as follows. Step i:
- Perform a row permutation (if one exists) that places in the i-th position a row with a non-zero element in the i-th column. Let it be row v. If there is no appropriate permutation, first perform a column permutation that places a non-zero column in the i-th position.
- Multiply row v by a scalar so that its element in the i-th column becomes 1.
- Add to each of the other rows the vector av, where a is chosen such that after the addition the row has value 0 in the i-th position.

30 Vector spaces - matrix row-echelon form
Theorem The row rank rr(M) of a matrix M equals its column rank rc(M). Proof Transform M into M′ in row-echelon form by applying elementary operations 1,2,3,4. This doesn't change either rr(M) or rc(M), and it is obvious that for the matrix M′ we have rr(M′) = rc(M′) = k, where k is the size of the identity matrix I.
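The theorem can be checked numerically for binary matrices. The following sketch computes the rank over the field Z2 by Gaussian elimination and compares a matrix with its transpose; the matrix `M` is an arbitrary example chosen here, and the function names are illustrative.

```python
def rank_gf2(rows):
    """Rank over GF(2) of a matrix given as a list of lists of 0/1."""
    m = [r[:] for r in rows]
    rank = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        # find a row at or below position `rank` with a 1 in this column
        pivot = next((i for i in range(rank, len(m)) if m[i][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # clear this column in every other row (adding rows is XOR over GF(2))
        for i in range(len(m)):
            if i != rank and m[i][col]:
                m[i] = [x ^ y for x, y in zip(m[i], m[rank])]
        rank += 1
    return rank

def transpose(rows):
    return [list(col) for col in zip(*rows)]

M = [[1, 0, 1, 1],
     [0, 1, 1, 0],
     [1, 1, 0, 1]]   # third row is the sum of the first two
print(rank_gf2(M), rank_gf2(transpose(M)))
```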

31 Linear codes
[Figure: Message source → Encoder → Channel → Decoder → Receiver]
x = x1,...,xk: message
c = c1,...,cn: codeword
e = e1,...,en: error from noise
y = c + e: received vector
x': estimate of the message
Generally we will define linear codes as vector spaces, by taking C to be a k-dimensional subspace of some n-dimensional space V.

32 Codes and linear codes Let V be an n-dimensional vector space over a finite field F. Definition A code is any subset C⊆V. A linear (n,k) code is any k-dimensional subspace C⊑V. Whilst this definition is largely “standard”, it doesn't distinguish any particular basis of V. However, the properties and parameters of a code will vary greatly with the selection of a particular basis. So in fact we assume that V is given already with some fixed basis b1,…,bn∈V, and we will assume that all elements of V and C are represented in this particular basis. At the same time we might be quite flexible in choosing a specific basis for C.

33 Codes and linear codes Let V be an n-dimensional vector space over a finite field F. Definition A linear (n,k) code is any k-dimensional subspace C⊑V. Example (choices of bases for V and code C):
- Basis of V (fixed): 001, 010, 100
- Set of V elements: {000,001,010,011,100,101,110,111}
- Set of C elements: {000,001,010,011}
- 2 alternative bases for code C: {001, 010} and {001, 011}
Essentially, we will be ready to consider alternative bases for C, but will stick to the “main one” for the representation of V elements.

34 Hamming code [7,4] What to do, if there are errors?
- We assume that the number of errors is as small as possible, i.e. we look for the codeword c (and the corresponding x) that is closest to the received vector y (using Hamming distance).
- Consider three fixed vectors a, b and c.
- If y is received, compute y·a, y·b and y·c (inner products); in the slide's example we obtain y·a = 1, y·b = 0 and y·c = 0.
- This represents a binary number (100, i.e. 4, in the example above), and we conclude that the error is in the 4th digit.
Easy, but why does this method work?
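The procedure above can be sketched in a few lines. The concrete vectors are an assumption, since their values are not preserved in this transcript: the sketch uses the common choice in which position i of (a, b, c), read top to bottom, spells the number i in binary, and it treats as codewords exactly the vectors orthogonal to a, b and c.

```python
# Assumed check vectors (not taken from the slide): column i reads i in binary.
A = [0, 0, 0, 1, 1, 1, 1]
B = [0, 1, 1, 0, 0, 1, 1]
C = [1, 0, 1, 0, 1, 0, 1]

def dot(u, v):
    """Inner product over the binary field."""
    return sum(x * y for x, y in zip(u, v)) % 2

def correct_single_error(y):
    """Return (corrected vector, 1-based error position or 0 if none detected)."""
    pos = 4 * dot(y, A) + 2 * dot(y, B) + dot(y, C)
    fixed = list(y)
    if pos:
        fixed[pos - 1] ^= 1
    return fixed, pos

codeword = [1, 1, 1, 0, 0, 0, 0]   # orthogonal to A, B and C
received = codeword[:]
received[3] ^= 1                    # corrupt the 4th digit
print(correct_single_error(received))
```

The method works because a single error in position i changes each inner product by the i-th component of the corresponding check vector, and these three components are the binary digits of i.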

35 Vector spaces – dot (scalar) product
Let V be a k-dimensional vector space over a field F. Let b1,…,bk∈V be some basis of V. For a pair of vectors u,v∈V, such that u=a1b1+...+akbk and v=c1b1+...+ckbk, their dot (scalar) product is defined by: u·v = a1·c1+...+ak·ck. Thus the operator “·” maps V×V to F. Lemma For all u,v,w∈V and all a,b∈F the following properties hold:
1) u·v = v·u.
2) (au+bv)·w = a(u·w)+b(v·w).
3) If u·v = 0 for all v in V, then u = 0.
Note. The Lemma above can also be used as a more abstract definition of an inner product. The question whether, for an inner product defined in this abstract way, there is a basis that allows it to be computed as a scalar product by the formula above is somewhat tricky and may depend on the particular vector space V.

36 Vector spaces – dot (scalar) product
Let V be a k-dimensional vector space over a field F. Let b1,…,bk∈V be some basis of V. For a pair of vectors u,v∈V, such that u=a1b1+...+akbk and v=c1b1+...+ckbk, their dot (scalar) product is defined by: u·v = a1·c1+...+ak·ck. Two vectors u and v are said to be orthogonal if u·v = 0. If C is a subspace of V then it is easy to see that the set of all vectors in V that are orthogonal to each vector in C is a subspace, which is called the space orthogonal to C and denoted by C⊥. An important and not too obvious result for us! Theorem If C is a subspace of V, then dim C + dim C⊥ = dim V.

38 Vector spaces and linear transformations
Definition Let V be a vector space over a field F. A function f : V→V is called a linear transformation if for all u,v∈V and all a∈F the following hold:
1) af(u) = f(au).
2) f(u)+f(v) = f(u+v).
The kernel of f is defined as ker f = {v∈V | f(v) = 0}. The range of f is defined as range f = {f(v) | v∈V}. It is a fairly obvious consequence of the linearity of f that vector sums and scalar multiples do not leave ker f or range f. Thus ker f ⊑V and range f ⊑V.

39 Vector spaces and linear transformations
Theorem (rank-nullity theorem) dim (ker f) + dim (range f) = dim V. [Figure: f maps ker f to 0 and V onto range f] Proof? Choose some basis u1,…,uk of ker f and some basis w1,…,wn of range f. Then try to show that u1,…,uk,v1,…,vn is a basis for V, where wi = f(vi) (the vectors vi may not be uniquely defined, but it is sufficient to choose arbitrary pre-images of the wi-s).

40 Rank-nullity theorem Theorem dim (ker f) + dim (range f) = dim V.
[Adapted from R.Milson]

41 Rank-nullity theorem [Adapted from R.Milson]

42 Dimensions of orthogonal vector spaces
Theorem If C is a subspace of V, then dim C + dim C⊥ = dim V. [Figure: subspaces C and C⊥] Proof? We could try to reduce this to the “similar looking” equality dim V = dim (ker f) + dim (range f). However, how can we define a linear transformation from the dot product?

43 Dimensions of orthogonal vector spaces
Theorem If C is a subspace of V, then dim C + dim C⊥ = dim V. Proof However, how can we define a linear transformation from the dot product? Let u1,…,uk be some basis of C. We define the transformation f as follows: for all v∈V: f(v) = (v·u1)u1+…+(v·uk)uk. Note that v·ui∈F, thus f(v)∈C. Therefore we have:
- ker f = C⊥ (this follows directly from the definition of C⊥ and the independence of u1,…,uk)
- range f = C (this follows from the definition of f)
Thus from the rank-nullity theorem: dim C + dim C⊥ = dim V.
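For small binary spaces the theorem can be verified by enumeration. The sketch below spans a code from a basis, builds its orthogonal space by brute force and compares dimensions; the example code {000,001,010,011} is the one used earlier in the lecture, while the function names are illustrative.

```python
from itertools import product

def span_gf2(basis, n):
    """All linear combinations over GF(2) of the given length-n basis vectors."""
    vectors = set()
    for coeffs in product([0, 1], repeat=len(basis)):
        vectors.add(tuple(sum(c * b[i] for c, b in zip(coeffs, basis)) % 2
                          for i in range(n)))
    return vectors

def orthogonal_space(code, n):
    """All length-n binary vectors orthogonal to every vector of the code."""
    return {v for v in product([0, 1], repeat=n)
            if all(sum(x * y for x, y in zip(v, w)) % 2 == 0 for w in code)}

def dim(space):
    """Dimension of a binary subspace with 2^dim elements."""
    return len(space).bit_length() - 1

C = span_gf2([(0, 0, 1), (0, 1, 0)], 3)
C_perp = orthogonal_space(C, 3)
print(sorted(C_perp), dim(C) + dim(C_perp))
```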

47 Codes and linear codes Let V be an n-dimensional vector space over a finite field F together with some fixed basis b1,…,bn∈V. Definition A linear (n,k) code is any k-dimensional subspace C⊑V. The weight wt(v) of a vector v∈V is the number of nonzero components of v in its representation as a linear combination v = a1b1+...+anbn. The distance d(v,w) between vectors v,w∈V is the number of components in which their representations in the given basis differ. The minimum weight of a code C⊑V is defined as min{wt(v) | v∈C, v≠0}.

48 Codes and linear codes Let V be an n-dimensional vector space over a finite field F together with some fixed basis b1,…,bn∈V. Definition A linear (n,k) code is any k-dimensional subspace C⊑V. The minimum weight of a code C⊑V is defined as min{wt(v) | v∈C, v≠0}. A linear (n,k) code with minimum weight d is often referred to as an (n,k,d) code.

49 Codes and linear codes Theorem
A linear (n,k,d) code can correct any number of errors not exceeding t = ⌊(d−1)/2⌋. Proof The distance between any two codewords is at least d. So, if the number of errors is smaller than d/2, then the closest codeword to the received vector will be the transmitted one. However, there is a far less obvious problem: how do we find which codeword is the closest to the received vector?

50 Coding theory - the main problem
A good (n,k,d) code has small n, large k and large d. The main coding theory problem is to optimize one of the parameters n, k, d for given values of the other two.

51 Generator matrices Definition
Consider an (n,k) code C⊑V over a field F. G is a generator matrix of code C if C = {vG | v∈F^k} and all rows of G are independent. It is easy to see that a generator matrix exists for any code: take any matrix G whose rows are vectors v1,…,vk (represented as n-tuples in the initially agreed basis of V) that form a basis of C. By definition G will be a matrix of size k×n. Obviously there can be many different generator matrices for a given code. For example, these are two alternative generator matrices for the same (4,3) code: [Figure: two generator matrices]

52 Equivalence of codes Definition
Codes C1,C2⊑V are equivalent if a generator matrix G2 of C2 can be obtained from a generator matrix G1 of C1 by a sequence of the following operations:
1) permutation of rows
2) multiplication of a row by a non-zero scalar
3) addition of one row to another
4) permutation of columns
5) multiplication of a column by a non-zero scalar
Note that operations 1-3 actually do not change the code C1. By applying operations 4 and 5, C1 could be changed to a different subspace of V; however, the weight distribution of the code vectors remains the same. In particular, if C1 is an (n,k,d) code, so is C2. In the binary case the vectors of C1 and C2 would differ only by a permutation of positions.

53 Generator matrices Definition
A generator matrix G of an (n,k) code C⊑V is said to be in standard form if G = (I,A), where I is the k×k identity matrix. Theorem For a code C⊑V there is an equivalent code C′ that has a generator matrix in standard form. Proof We have already shown that each matrix can be transformed into row-echelon form by applying the same operations that define equivalent codes. Since for an (n,k) code the generator matrix must have rank k, we obtain that in this case I is the k×k identity matrix (i.e. we can't have rows with all zeroes).
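To connect generator matrices with the (n,k,d) parameters, here is a minimal sketch that encodes messages with a generator matrix in standard form and finds the minimum weight by enumeration. The matrix `G` is a hypothetical example (one common generator matrix of a binary (7,4) code), not the matrix from the slides.

```python
from itertools import product

# Hypothetical generator matrix in standard form G = (I, A); an assumption for illustration.
G = [
    [1, 0, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 1],
]

def encode(x, G):
    """Codeword xG over the binary field."""
    return tuple(sum(xi * row[j] for xi, row in zip(x, G)) % 2
                 for j in range(len(G[0])))

def minimum_weight(G):
    """Smallest weight of a non-zero codeword, by enumerating all messages."""
    k = len(G)
    return min(sum(encode(x, G)) for x in product([0, 1], repeat=k) if any(x))

d = minimum_weight(G)
t = (d - 1) // 2
print(encode((1, 1, 0, 0), G), d, t)
```

Because G is in standard form, the first k symbols of each codeword are the message itself, so recovering x from an error-free codeword is trivial.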

54 Dual codes Definition Consider a code C⊑V. The dual or orthogonal code of C is defined as C⊥ = {v∈V | ∀w∈C: v·w = 0}. It is easy to check that C⊥⊑V, i.e. C⊥ is a code. Note that this is actually just a restatement of the definition of orthogonal vector spaces that we have already seen. Remember the theorem we proved a short while ago: if C is a subspace of V, then dim C + dim C⊥ = dim V. Thus, if C is an (n,k) code then C⊥ is an (n,n−k) code and vice versa. There are codes that are self-dual, i.e. C = C⊥.

55 Dual codes - some examples
For the (n,1) repetition code C with the generator matrix G = (1 1 … 1), the dual code C⊥ is an (n,n−1) code with the generator matrix G⊥ described by:

56 Dual codes - some examples
[Adapted from V.Pless]

57 Dual codes – parity checking matrices
Definition Let C⊑V be a code and let C⊥ be its dual code. A generator matrix H of C⊥ is called a parity check matrix of C. Theorem If a k×n generator matrix of a code C⊑V is in standard form G = (I,A), then the (n−k)×n matrix H = (−A^T,I) is a parity check matrix of C. Proof It is easy to check that any row of G is orthogonal to any row of H (each dot product is a sum of only two non-zero scalars with opposite signs). Since dim C + dim C⊥ = dim V, i.e. k + dim C⊥ = n, and the n−k rows of H are independent, we conclude that H is a generator matrix of C⊥. Note that in binary vector spaces H = (−A^T,I) = (A^T,I).
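The construction H = (−A^T, I) is easy to sketch for codes over a prime field Zq. The function name and the two example matrices below are assumptions made for illustration; the check at the end confirms that every row of G is orthogonal to every row of H.

```python
def parity_check_from_standard(G, q=2):
    """Given G = (I, A) over Z_q (q prime), return H = (-A^T, I)."""
    k, n = len(G), len(G[0])
    r = n - k
    A = [row[k:] for row in G]
    return [[(-A[i][j]) % q for i in range(k)] + [1 if c == j else 0 for c in range(r)]
            for j in range(r)]

def all_orthogonal(G, H, q=2):
    return all(sum(g * h for g, h in zip(grow, hrow)) % q == 0
               for grow in G for hrow in H)

G2 = [[1, 0, 0, 0, 0, 1, 1],      # hypothetical binary example
      [0, 1, 0, 0, 1, 0, 1],
      [0, 0, 1, 0, 1, 1, 0],
      [0, 0, 0, 1, 1, 1, 1]]
G3 = [[1, 0, 1, 2],               # hypothetical ternary example
      [0, 1, 1, 1]]

print(parity_check_from_standard(G2))
print(parity_check_from_standard(G3, q=3), all_orthogonal(G3, parity_check_from_standard(G3, 3), 3))
```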

58 Dual codes – parity checking matrices
Theorem If a k×n generator matrix of a code C⊑V is in standard form G = (I,A), then the (n−k)×n matrix H = (−A^T,I) is a parity check matrix of C. So, up to the equivalence of codes, we have an easy way to obtain a parity check matrix H from a generator matrix G in standard form and vice versa. Example of generator and parity check matrices in standard form:

59 Dual codes and vector syndromes
Definition Let C⊑V be an (n,k) code with a parity check matrix H. For each v∈V the syndrome of v is the (n−k)-dimensional vector syn(v) = vH^T. By the definition of H we have syn(c) = 0 for all codewords c∈C. If some errors have occurred, then instead of c we have received the vector y = c + e, where c is a codeword and e is an error vector. In this case syn(y) = syn(c) + syn(e) = 0 + syn(e). That is, the syndrome is determined solely by the error vector, and in principle knowing the syndrome should allow us to infer which bits have been transmitted incorrectly.
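One straightforward way to use syndromes, sketched below, is to precompute for every syndrome a lowest-weight error vector that produces it (a coset leader) and subtract that vector from the received word. This table-based approach is a standard technique offered here as an illustration, and the parity check matrix `H` is a hypothetical binary example rather than the one on the slides.

```python
from itertools import product

H = [[0, 1, 1, 1, 1, 0, 0],   # hypothetical parity check matrix of a (7,4) code
     [1, 0, 1, 1, 0, 1, 0],
     [1, 1, 0, 1, 0, 0, 1]]

def syndrome(v, H):
    """syn(v) = v H^T over the binary field."""
    return tuple(sum(x * y for x, y in zip(v, row)) % 2 for row in H)

def coset_leaders(H):
    """Map each syndrome to a minimum-weight error vector producing it."""
    n = len(H[0])
    table = {}
    for e in sorted(product([0, 1], repeat=n), key=sum):
        table.setdefault(syndrome(e, H), e)
    return table

def decode(y, H, table):
    e = table[syndrome(y, H)]
    return tuple(a ^ b for a, b in zip(y, e))

table = coset_leaders(H)
c = (1, 0, 0, 0, 0, 1, 1)          # a codeword: its syndrome is zero
y = (1, 0, 1, 0, 0, 1, 1)          # c with the third bit flipped
print(syndrome(c, H), syndrome(y, H), decode(y, H, table) == c)
```

The table has q^(n−k) entries, so this is practical only when n−k is small.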

60 Hamming code (7,4) We have already seen a Hamming (7,4) code
with the generator matrix (in standard form) shown above, and have proved that it can correct any single error (or even more: that it is a perfect (7,4,3) code). Can we generalize this to codes of different lengths/dimensions? How simple might the decoding procedure for such codes be? [Figure: parity bits of H(7,4)]

61 Hamming codes For simplicity we will consider codes over binary fields, although the definition (and design idea) easily extends to codes over arbitrary finite fields. Definition For a given positive integer r a Hamming code Ham(r) is a code whose parity check matrix contains as its columns all possible non-zero r-dimensional binary vectors. There are 2^r − 1 such vectors, thus the parity check matrix has size r×(2^r − 1) and, respectively, Ham(r) is an (n = 2^r − 1, n − r) code.

62 Hamming codes Definition
For a given positive integer r a Hamming code Ham(r) is a code whose parity check matrix contains as its columns all possible non-zero r-dimensional binary vectors. Example of Hamming code Ham(4): [Figure: parity check matrix of Ham(4)] Although not required by the definition, note that in this particular case the columns can be regarded as the consecutive integers 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15 written in binary form.

63 Hamming codes Hamming code Ham(4):
Why have we defined this code in such a particular way? Note that if there have been no errors we have syn(y) = 0 for the received vector y. If a single error has occurred we will have syn(y) = [i], where [i] is the binary representation of the position (column) i in which this error has occurred.

64 Hamming codes Note that if there have been no errors we have syn(y) = 0 for the received vector y. If a single error has occurred we will have syn(y) = [i], where [i] is the binary representation of the position (column) i in which this error has occurred. Thus we are able to correct any single error. Also note that any larger number of errors is not distinguishable from the 0 or 1 error case. Thus all (binary) Hamming codes have parameters (n = 2^r − 1, n − r, 3).
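The definition of Ham(r) translates directly into code. The sketch below builds a parity check matrix whose i-th column is the integer i written in binary (one valid column ordering, assumed here for convenience) and reads the error position off the syndrome; the function names are illustrative.

```python
def hamming_parity_check(r):
    """r x (2^r - 1) binary matrix whose column i (1-based) is i in binary, MSB in row 0."""
    n = 2 ** r - 1
    return [[(col >> (r - 1 - j)) & 1 for col in range(1, n + 1)] for j in range(r)]

def error_position(y, H):
    """Interpret the syndrome y H^T as a binary number: 0 means no error detected."""
    bits = [sum(a * b for a, b in zip(y, row)) % 2 for row in H]
    return int("".join(map(str, bits)), 2)

H = hamming_parity_check(4)
n = len(H[0])
y = [0] * n            # the all-zero vector is always a codeword
y[10] ^= 1             # single error in position 11
print(len(H), n, error_position(y, H))
```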

65 A ternary (13,10,3) Hamming code
[Adapted from V.Pless]

66 Golay codes - some history
The brief history of the Golay codes begins in 1949, when M. J. E. Golay published his “Notes on Digital Coding” in the Proceedings of the Institute of Electrical and Electronic Engineers, ½ page in length. It described the (23,12,7)_2 code (although he evidently did not name it after himself). This inspired a search for more perfect codes. After all, if there was some series of perfect codes, or better yet an algorithm that produces them, much of the rest of coding theory would possibly become obsolete. For any given rate and blocklength, no code with a higher minimum distance or average minimum distance can be constructed, so if it had been determined that perfect codes existed with many rates and many blocklengths, it may have been worthwhile to only search for perfect codes. It soon appeared that such prayers fell on deaf ears, as the existence of perfect codes was disproved in more and more general scenarios. Finally, in 1973, Aimo Tietäväinen disproved the existence of perfect codes over finite fields in his “Nonexistence of Perfect Codes over Finite Fields” in the SIAM Journal of Applied Mathematics, January 1973. [Adapted from

67 Golay codes In mathematical terms, the extended binary Golay code consists of a 12-dimensional subspace W of the space V = F_2^24 of 24-bit words such that any two distinct elements of W differ in at least eight coordinates. Equivalently, any non-zero element of W has at least eight non-zero coordinates. The possible sets of non-zero coordinates as w ranges over W are called code words. In the extended binary Golay code, all code words have Hamming weight 0, 8, 12, 16, or 24. Up to relabeling of coordinates W is unique. [Adapted from

68 Golay codes Golay codes G24 and G23 were used by Voyager I and Voyager II to transmit color pictures of Jupiter and Saturn. [Figure: generator matrix of G24] G24 is a (24,12,8) code and the weights of all codewords are multiples of 4. G23 is obtained from G24 by deleting the last symbol of each codeword of G24. G23 is a (23,12,7) code.

69 Golay codes The matrix G for the Golay code G24 actually has a simple and regular construction. The first 12 columns are formed by the identity matrix I12, and the next column has all 1's. The rows of the last 11 columns are cyclic permutations of the first row, which has 1 at those positions that are squares modulo 11, that is 0, 1, 3, 4, 5, 9.

70 Ternary Golay code The ternary Golay code consists of 3^6 = 729 codewords. [Figure: parity check matrix of the ternary Golay code] Any two different codewords differ in at least 5 positions. Every ternary word of length 11 has a Hamming distance of at most 2 from exactly one codeword. The code can also be constructed as the quadratic residue code of length 11 over the finite field F_3. This is an (11,6,5) code. [Adapted from

71 Punctured codes There are certain modifications that can be applied to codes and that sometimes yield codes with better parameters. The usual codes obtained in this way are punctured, extended and shortened codes. Definition A punctured code of a binary (n,k,d) code C with generator matrix G is a code C* with generator matrix G* obtained from G by deleting any one of the columns of G. By deleting different columns we may obtain different punctured codes. Punctured codes may have parameters $(n-1, k, d)$, $(n-1, k, d-1)$, $(n-1, k-1, d)$ or $(n-1, k-1, d-1)$.
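A small Python sketch of puncturing, with parameters found by enumerating all codewords. The helper names and both example matrices are illustrative choices, not taken from the slides; the (7,4,3) matrix is one standard systematic generator of a Hamming code.

```python
from itertools import product

def parameters(G):
    """Return (n, k, d) of the binary code spanned by the rows of G."""
    n = len(G[0])
    words = {
        tuple(sum(m * row[j] for m, row in zip(message, G)) % 2 for j in range(n))
        for message in product([0, 1], repeat=len(G))
    }
    k = len(words).bit_length() - 1                 # |C| = 2^k
    d = min(sum(w) for w in words if any(w))        # minimum non-zero weight
    return n, k, d

def puncture(G, column):
    """Delete one column (0-based index) from the generator matrix."""
    return [row[:column] + row[column + 1:] for row in G]

HAMMING_7_4 = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
SMALL = [
    [1, 1, 1, 0],
    [0, 0, 1, 1],
]

print(parameters(HAMMING_7_4), parameters(puncture(HAMMING_7_4, 0)))
print(parameters(SMALL), parameters(puncture(SMALL, 0)), parameters(puncture(SMALL, 3)))
```

The second example shows that the outcome depends on the deleted column: one choice keeps the minimum distance, another lowers it by one.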

72 Punctured codes - example
[Adapted from V.Pless]

73 Punctured codes - example
[Adapted from V.Pless]

74 Extended codes Definition
An extended code of a binary (n,k,d) code C with generator matrix G is the code $\hat{C}$ with generator matrix $\hat{G}$ obtained from G by adding an (n+1)-st column $(a_1, \ldots, a_k)^T$, where each $a_i$ is the sum (modulo 2) of all elements of the i-th row of G. Thus for any code C there is just one extended code $\hat{C}$. If C contains vectors of odd weight, $\hat{C}$ has parameters (n+1,k,d) or (n+1,k,d+1). If all vectors of C have even weight, $\hat{C}$ has parameters (n+1,k,d).
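As an illustration, the Python sketch below appends the parity column to a generator matrix and recomputes the parameters by enumeration. The helper names and example matrices are assumptions made for this sketch.

```python
from itertools import product

def parameters(G):
    """Return (n, k, d) of the binary code spanned by the rows of G."""
    n = len(G[0])
    words = {
        tuple(sum(m * row[j] for m, row in zip(message, G)) % 2 for j in range(n))
        for message in product([0, 1], repeat=len(G))
    }
    k = len(words).bit_length() - 1
    d = min(sum(w) for w in words if any(w))
    return n, k, d

def extend(G):
    """Append to each row the sum of its entries modulo 2 (overall parity bit)."""
    return [row + [sum(row) % 2] for row in G]

HAMMING_7_4 = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
EVEN_WEIGHT = [
    [1, 1, 0],
    [0, 1, 1],
]

print(parameters(extend(HAMMING_7_4)))   # odd minimum weight: d grows by one
print(parameters(extend(EVEN_WEIGHT)))   # all weights even: d is unchanged
```

For the Hamming code the minimum weight 3 is odd, so the extension gains one unit of distance; for the even-weight code the added column is all zeros.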

75 Extended codes - example
[Adapted from V.Pless]

76 Shortened codes Definition
A shortened code of a binary (n,k,d) code C is a code C′ obtained from C by selecting all vectors with 0 in some fixed position i and removing the i-th element from these vectors. By choosing different positions i we may obtain different shortened codes. Shortened codes have parameters $(n-1, k-1, d')$, where $d' \ge d$.
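Shortening acts on the set of codewords rather than on the generator matrix, as the Python sketch below illustrates. The helper names and the choice of a (7,4,3) Hamming code as the example are assumptions.

```python
from itertools import product

def codewords(G):
    """All codewords of the binary code spanned by the rows of G."""
    n = len(G[0])
    return {
        tuple(sum(m * row[j] for m, row in zip(message, G)) % 2 for j in range(n))
        for message in product([0, 1], repeat=len(G))
    }

def shorten(words, i):
    """Keep the codewords with 0 in position i (0-based), then drop that position."""
    return {w[:i] + w[i + 1:] for w in words if w[i] == 0}

def parameters(words):
    """Return (n, k, d) for a linear binary code given as a set of codewords."""
    n = len(next(iter(words)))
    k = len(words).bit_length() - 1
    d = min(sum(w) for w in words if any(w))
    return n, k, d

HAMMING_7_4 = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

C = codewords(HAMMING_7_4)
print(parameters(C), parameters(shorten(C, 0)))
```

Here both the length and the dimension drop by one while the minimum distance stays at 3.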

77 Shortened codes - example
[Adapted from V.Pless]

78 Shortened codes - example
[Adapted from V.Pless]

79 Hamming inequality and perfect codes
We will formulate these results in the binary case, although they easily extend to codes over any finite field F. Theorem (Hamming inequality) For any (n,k,d) code C in a vector space V over the binary field F the following inequality holds:
$$2^k \sum_{i=0}^{t} \binom{n}{i} \le 2^n,$$
where $t = \lfloor (d-1)/2 \rfloor$.

80 Hamming inequality and perfect codes
Definition An (n,k,d) code $C \subseteq V$ is perfect if the balls of radius $t = \lfloor (d-1)/2 \rfloor$ around the vectors of C cover the whole vector space V. Proposition A binary (n,k,d) code $C \subseteq V$ is perfect if and only if the following equality holds:
$$2^k \sum_{i=0}^{t} \binom{n}{i} = 2^n.$$
It is easy to see that all “full-space” (n,n,1) codes and all binary repetition (2m+1, 1, 2m+1) codes are perfect. These are the trivial perfect codes.
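The sphere-packing count is easy to evaluate numerically. The Python sketch below compares the number of words covered by the balls of radius t with the size of the whole space. The function names are illustrative, and the factor (q-1)^i for alphabets of size q is the standard generalization, added here as an assumption beyond the binary statement.

```python
from math import comb

def ball_volume(n, t, q=2):
    """Number of words within Hamming distance t of a fixed word of length n."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(t + 1))

def satisfies_hamming_bound(n, k, d, q=2):
    """Check q^k * |ball of radius t| <= q^n with t = floor((d-1)/2)."""
    t = (d - 1) // 2
    return q ** k * ball_volume(n, t, q) <= q ** n

def is_perfect(n, k, d, q=2):
    """Equality in the Hamming bound: the balls tile the whole space."""
    t = (d - 1) // 2
    return q ** k * ball_volume(n, t, q) == q ** n

for params in [(7, 4, 3), (23, 12, 7), (24, 12, 8), (5, 1, 5)]:
    print(params, satisfies_hamming_bound(*params), is_perfect(*params))
print((11, 6, 5), is_perfect(11, 6, 5, q=3))
```

For the binary (23,12,7) parameters the ball has 1 + 23 + 253 + 1771 = 2048 = 2^11 words, so 2^12 balls fill the space exactly, while the extended (24,12,8) parameters satisfy the bound without equality.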

81 Hamming inequality and perfect codes
[Adapted from J.MacCay]

82 Hamming inequality and perfect codes
Do good perfect codes exist? Theorem The only non-trivial perfect codes are the following: all Hamming codes; the binary (23,12,7) Golay code; the ternary (11,6,5) Golay code. We will not prove this result here. It is very simple to show that these particular codes are perfect, but the proof that these are the only ones is difficult.

83 Singleton bound and MDS codes
Theorem (Singleton inequality) For any (n,k,d) code C the inequality $n - k \ge d - 1$ holds. Proof (version 1) All codewords in C are distinct. If we delete the first $d-1$ components, they are still distinct, because any two codewords differ in at least d positions. The new code has length $n-d+1$ and still has size $q^k$, where q is the size of the field, thus $n - d + 1 \ge k$. Named after Richard Collom Singleton.

84 Singleton bound and MDS codes
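Theorem (Singleton inequality) For any (n,k,d) code C the inequality $n - k \ge d - 1$ holds. Proof (version 2) Observe that rank $H = n-k$. Any dependence of s columns in the parity check matrix yields a codeword of weight s (any set of non-zero components in a codeword $\Leftrightarrow$ a dependence relation in H). Since rank $H = n-k$, any $n-k+1$ columns of H are dependent, so there is a non-zero codeword of weight at most $n-k+1$, that is, $d \le n-k+1$. Thus $n - k \ge d - 1$.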

85 Singleton bound and MDS codes
Definition An (n,k,d) code is called maximum distance separable (or MDS code) if $n - k = d - 1$. There exist only trivial binary MDS codes. However for codes over other finite fields this is not necessarily so. MDS conjecture: [Adapted from G.Seroussi]

86 Some positive results For simplicity we will again state just the binary case. Theorem (Gilbert-Varshamov bound) In the n-dimensional vector space V over the binary field F there exists a code C of length n with minimum distance at least d and dimension $k \ge n - m$, if the following inequality holds:
$$\sum_{i=0}^{d-2} \binom{n-1}{i} < 2^m.$$

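87 Some positive results Theorem (Gilbert-Varshamov bound)
In the n-dimensional vector space V over the binary field F there exists a code C of length n with minimum distance at least d and dimension $k \ge n - m$, if the following inequality holds:
$$\sum_{i=0}^{d-2} \binom{n-1}{i} < 2^m.$$
Proof We start with d and m and attempt to construct an $m \times n$ parity check matrix such that no $d-1$ of its columns are dependent (implying that the minimum weight will be at least d). The number of columns will be n. We start with an arbitrary non-zero column and continue to add columns as long as the number of linear combinations of at most $d-2$ of the columns chosen so far is smaller than $2^m$, the number of all m-tuples; some m-tuple outside these combinations can then be added as a new column. With $n-1$ columns already chosen there are at most $\sum_{i=0}^{d-2} \binom{n-1}{i}$ such combinations, which gives the inequality.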
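The proof is constructive, and a greedy version of it can be sketched in Python. Columns are stored as integers whose bits are the m entries, so addition over the binary field is XOR. The function names, the scan order of the candidates and the restriction to d ≥ 3 are assumptions of this sketch.

```python
from itertools import combinations
from math import comb

def greedy_parity_check_columns(m, d):
    """Pick m-bit columns so that no d-1 of them are linearly dependent (d >= 3)."""
    # sums[j] holds every XOR of exactly j distinct chosen columns, j <= d-2.
    sums = [{0}] + [set() for _ in range(d - 2)]
    columns = []
    for candidate in range(1, 2 ** m):
        if any(candidate in s for s in sums):
            continue                      # would create a dependency among <= d-1 columns
        for j in range(d - 2, 0, -1):     # descending, so sums[j-1] is still the old set
            sums[j] |= {s ^ candidate for s in sums[j - 1]}
        columns.append(candidate)
    return columns

def gv_length(m, d):
    """Largest n whose Gilbert-Varshamov condition sum_{i<=d-2} C(n-1, i) < 2^m holds."""
    n = 1
    while sum(comb(n, i) for i in range(d - 1)) < 2 ** m:
        n += 1
    return n

def no_small_dependency(columns, d):
    """Brute-force check that no set of at most d-1 columns XORs to zero."""
    for r in range(1, d):
        for subset in combinations(columns, r):
            total = 0
            for c in subset:
                total ^= c
            if total == 0:
                return False
    return True

cols = greedy_parity_check_columns(5, 5)
print(len(cols), gv_length(5, 5), no_small_dependency(cols, 5))
```

A single pass leaves every m-tuple either chosen or excluded, so the number of columns found is never smaller than the length guaranteed by the bound. For m = 3 and d = 3 the sketch picks all seven non-zero columns, which is the parity check matrix of the (7,4,3) Hamming code.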

88 Sizes of some known codes
[Adapted from V.Pless]

89 Cosets Definition Let $C \subseteq V$ be an (n,k) code. For each $a \in V$ the set $a + C = \{a + c \mid c \in C\}$ is called a coset. Note that this is the same definition we used in the proof of Lagrange's theorem, if we consider C as an additive group. So, by the same simple arguments we can show that: all cosets have the same number of elements (equal to |C|); two cosets are either completely disjoint or equal; every vector belongs to some coset. Also note that if the error vector was e, then the received message $y = c + e$ is in the coset $e + C$.

90 Cosets Definition Let $C \subseteq V$ be an (n,k) code. For each $a \in V$ the set $a + C = \{a + c \mid c \in C\}$ is called a coset. Note that if the error vector was e, then the received message $y = c + e$ is in the coset $e + C$. Thus for all $y \in e + C$ we have syn(y) = syn(e). How do we decode y? It seems reasonable to assume that the number of errors was minimal, so we could choose the error e as a vector from $e + C$ with minimal weight (such a vector can be non-unique). A vector $x \in a + C$ with minimal weight is called a coset leader.

91 Syndrome decoding Syndrome decoding:
precompute an array of syndromes and their coset leaders; when a vector y is received, compute syn(y); locate its coset and the coset leader e; decode y as $x = y - e$.
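These steps can be sketched in Python for a small code. The parity check matrix below (a (7,4,3) Hamming code) and all function names are illustrative assumptions; the syndrome of y is taken to be the product of H with y over the binary field.

```python
from itertools import product

# Assumed example: parity check matrix of a (7,4,3) Hamming code.
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]

def syndrome(H, y):
    """syn(y) = H y^T over GF(2)."""
    return tuple(sum(h * v for h, v in zip(row, y)) % 2 for row in H)

def coset_leaders(H):
    """Precompute the table syndrome -> one minimum-weight vector of its coset."""
    n = len(H[0])
    table = {}
    # Vectors are visited by increasing weight, so the first one stored
    # for each syndrome is a coset leader.
    for e in sorted(product([0, 1], repeat=n), key=sum):
        table.setdefault(syndrome(H, e), e)
    return table

def decode(H, table, y):
    """Subtract the coset leader of y's coset (over GF(2) this is addition)."""
    e = table[syndrome(H, y)]
    return tuple((a - b) % 2 for a, b in zip(y, e))

table = coset_leaders(H)
sent = (1, 0, 0, 0, 1, 1, 0)
received = (1, 0, 1, 0, 1, 1, 0)        # one bit flipped in position 3
print(decode(H, table, received) == sent)
```

The table has one entry per syndrome, $2^{n-k}$ in total, and for this code every coset leader has weight at most 1, so any single error is corrected.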

92 Syndrome decoding - example
[Adapted from V.Pless]

93 Syndrome decoding Complexity??
“Brute force” method (precompute all codewords, $2^k$ of them, and for the received vector find the closest codeword):
Time: ~ $2^k n$
Memory: ~ $2^k (k+n)$
Syndrome decoding:
Time (precomputing): ~ $2^n (k+n)$
Memory: ~ $2^{n-k} n$
Time (decoding): ~ $(k+n) n$
Depends... but for a range of parameters syndrome decoding has advantages.

