Information Redundancy
Fault Tolerant Computing, 12/29/2018

Outline: Introduction; Fundamental notions; Parity codes; Linear codes; Cyclic codes; Unordered codes; Arithmetic codes
Introduction
Encoding is a powerful technique for ensuring that information has not been changed during storage or transmission. Attaching special check bits to blocks of digital information enables special-purpose hardware to detect, and in some cases correct, a number of communication and storage faults, such as changes in single bits or changes to several adjacent bits. Coding theory originated in the late 1940s with two seminal works by Hamming and Shannon. Hamming defined a notion of distance between two words and observed that it is a metric, which leads to interesting properties; this distance is now called the Hamming distance. His first code placed three check bits after four data bits, allowing not only the detection but also the correction of any single-bit error.
Fundamental notions
Information rate
For an (n, k) code, the ratio k/n is called the information rate of the code; it determines the redundancy of the code. For example, a repetition code obtained by repeating the data three times has an information rate of 1/3.
Hamming distance
The Hamming distance between two binary n-tuples x and y, denoted d(x, y), is the number of bit positions in which the n-tuples differ.
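The Hamming distance can be computed by simply counting differing positions; a minimal Python sketch (the function name is our own illustration):

```python
def hamming_distance(x: str, y: str) -> int:
    # Number of bit positions in which two equal-length n-tuples differ.
    assert len(x) == len(y), "n-tuples must have the same length"
    return sum(a != b for a, b in zip(x, y))

print(hamming_distance("10110", "11100"))  # differs in positions 1 and 3 -> 2
```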
Code distance
The code distance of a code C is the minimum Hamming distance between any two distinct codewords of C. To correct all e-bit errors, a code must have a code distance of at least 2e + 1; to detect all e-bit errors, a code distance of at least e + 1 suffices.
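The definition above translates directly into a brute-force check over all pairs of codewords; a small Python sketch (function name is ours) that also confirms why the triple-repetition code corrects single-bit errors:

```python
from itertools import combinations

def code_distance(codewords):
    # Minimum Hamming distance over all distinct pairs of codewords.
    return min(sum(a != b for a, b in zip(x, y))
               for x, y in combinations(codewords, 2))

# Triple-repetition code: distance 3 >= 2*1 + 1, so it corrects e = 1 bit error.
print(code_distance(["000", "111"]))            # -> 3
# Even parity code of length 3: distance 2 >= 1 + 1, detects single-bit errors.
print(code_distance(["000", "011", "101", "110"]))  # -> 2
```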
Code efficiency
1. The number of bit errors a code can detect/correct, reflecting the fault-tolerance capabilities of the code.
2. The information rate k/n, reflecting the amount of information redundancy added.
3. The complexity of the encoding and decoding schemes, reflecting the amount of hardware, software, and time redundancy added.
Parity codes
The even (odd) parity code of length n is composed of all binary n-tuples that contain an even (odd) number of 1s. Any single-bit error can be detected, since the parity of the affected n-tuple becomes odd (even) rather than even (odd). However, it is not possible to locate the position of the erroneous bit, so the error cannot be corrected. The most common application of parity is error detection in the memories of computer systems.
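An even parity code in miniature: the check bit makes the total number of 1s even, so any single flipped bit is detected but cannot be located (helper names are our own sketch):

```python
def parity_encode(data: str) -> str:
    # Append a check bit that makes the total number of 1s even.
    return data + str(data.count("1") % 2)

def parity_check(codeword: str) -> bool:
    # True if the word has even parity, i.e. no single-bit error detected.
    return codeword.count("1") % 2 == 0

word = parity_encode("1011")     # "10111": an even number of 1s
print(parity_check(word))        # -> True
print(parity_check("00111"))     # bit 0 flipped: error detected -> False
```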
All operations related to error detection (encoding, decoding, comparison) are performed by memory control logic on the motherboard, in the chipset, or, in some systems, in the CPU. The memory itself only stores the parity bits, just as it stores the data bits; therefore, parity checking does not slow down the operation of the memory. A parity code can only detect errors affecting an odd number of bits, including single-bit errors. A modification of the parity code is the horizontal and vertical parity code, which arranges the data in a two-dimensional array and adds one parity bit to each row and one parity bit to each column.
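With horizontal and vertical parity, a single flipped bit violates exactly one row parity and one column parity, and their intersection locates (and thus corrects) the error. A minimal Python sketch, with function names of our own choosing:

```python
def encode_2d(rows):
    # Append an even-parity bit to each row, then a parity row over the columns.
    with_row_parity = [r + [sum(r) % 2] for r in rows]
    col_parity = [sum(col) % 2 for col in zip(*with_row_parity)]
    return with_row_parity + [col_parity]

def locate_single_error(block):
    # A single bit flip breaks exactly one row check and one column check;
    # their intersection is the position of the erroneous bit.
    bad_rows = [i for i, r in enumerate(block) if sum(r) % 2 != 0]
    bad_cols = [j for j, c in enumerate(zip(*block)) if sum(c) % 2 != 0]
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        return bad_rows[0], bad_cols[0]
    return None  # no single-bit error found

block = encode_2d([[1, 0], [0, 1]])
block[0][1] ^= 1                      # inject a single-bit error
print(locate_single_error(block))     # -> (0, 1)
```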
Linear codes
Linear codes provide a general framework for generating many codes, including Hamming codes. An (n, k) linear code over the field Z2 is a k-dimensional subspace of the vector space Vn of binary n-tuples.
Example: a (4,2) linear code spanned by the basis vectors v0 = [1000] and v1 = [0110]. A data word d = [d0 d1] is encoded as c = d0·v0 + d1·v1; for d = [11], c = 1·[1000] + 1·[0110] = [1110].
Generator matrix
The codeword c is the product of the data word d and the generator matrix G: c = dG. A generator matrix of the form [Ik | A], where Ik is the k × k identity matrix, yields a systematic code in which the first k bits of each codeword are the data bits.
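Encoding c = dG over Z2 is a matrix-vector product with arithmetic modulo 2. A short Python sketch using the (4,2) example above, whose generator matrix has v0 and v1 as its rows:

```python
# Rows of G are the basis vectors v0 = [1000] and v1 = [0110] of the example.
G = [[1, 0, 0, 0],
     [0, 1, 1, 0]]

def encode(d, G):
    # c_j = sum_i d_i * G[i][j] over Z2
    n = len(G[0])
    return [sum(d[i] * G[i][j] for i in range(len(d))) % 2 for j in range(n)]

print(encode([1, 1], G))  # -> [1, 1, 1, 0], matching the example
```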
Parity check matrix
To decode an (n, k) linear code, we use an (n−k) × n matrix H, called the parity check matrix of the code. The parity check matrix represents the parity constraints that the codewords satisfy.
Syndrome
Multiplying a received word c by the parity check matrix yields the syndrome s = Hc^T, an (n−k)-bit vector. If the syndrome is zero, no error has been detected. If s matches one of the columns of H, a single-bit error has occurred, and the bit position of the error corresponds to the position of the matching column in H.
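Syndrome decoding can be sketched with the (3,1) triple-repetition code, whose parity check matrix has distinct, nonzero columns, so a single-bit error is located by matching the syndrome against a column of H (the matrix and helper names here are our own illustration, not taken from the slides):

```python
# Parity check matrix of the (3,1) repetition code {000, 111}.
# Columns: [1,1], [1,0], [0,1] -- all distinct and nonzero.
H = [[1, 1, 0],
     [1, 0, 1]]

def syndrome(c, H):
    # s = H c^T over Z2
    return [sum(h * b for h, b in zip(row, c)) % 2 for row in H]

print(syndrome([1, 1, 1], H))   # valid codeword -> [0, 0]

r = [1, 0, 1]                   # bit 1 flipped
s = syndrome(r, H)
cols = [[row[j] for row in H] for j in range(3)]
print(cols.index(s))            # syndrome matches column 1: error located
```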
Constructing linear codes
We must ensure that every column of the parity check matrix is linearly independent; since a single column is linearly independent exactly when it is nonzero, this amounts to requiring that H has no zero column. The code distance of the resulting (4,2) code is two, so this code can detect single-bit errors but not correct them.
Hamming codes
Hamming proposed the first single-error correcting code and its extended version, a single-error correcting, double-error detecting code. Example: the (7,4) Hamming code.
The information rate of the (7,4) Hamming code is k/n = 4/7. In general, an (n, k) Hamming code has n = 2^m − 1 and k = 2^m − m − 1 for some integer m, giving a rate of (2^m − m − 1)/(2^m − 1). Hamming codes are widely used for DRAM error correction. Encoding is usually performed on complete words, rather than individual bytes. As in the parity-code case, when a word is written into memory, the check bits are computed by a check-bit generator.
Extended Hamming codes
An extended Hamming code adds an overall parity bit, allowing it to correct single-bit errors and detect double-bit errors.
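A compact sketch of a (7,4) Hamming encoder and decoder, using the standard construction in which the columns of H read as the binary values of the positions 1–7, check bits sit at positions 1, 2, 4, and the syndrome, read as a binary number, directly gives the 1-based error position (function names are our own):

```python
# Columns of H are the binary representations of positions 1..7.
H = [[0, 0, 0, 1, 1, 1, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [1, 0, 1, 0, 1, 0, 1]]

def hamming_encode(d):
    c = [0] * 7
    c[2], c[4], c[5], c[6] = d        # data bits at positions 3, 5, 6, 7
    c[0] = (c[2] + c[4] + c[6]) % 2   # p1 covers positions 1, 3, 5, 7
    c[1] = (c[2] + c[5] + c[6]) % 2   # p2 covers positions 2, 3, 6, 7
    c[3] = (c[4] + c[5] + c[6]) % 2   # p4 covers positions 4, 5, 6, 7
    return c

def hamming_decode(r):
    # Syndrome s = H r^T, read as a binary number: the 1-based error position.
    s = [sum(row[j] * r[j] for j in range(7)) % 2 for row in H]
    pos = s[0] * 4 + s[1] * 2 + s[2]
    if pos:
        r[pos - 1] ^= 1               # correct the single-bit error
    return [r[2], r[4], r[5], r[6]]   # extract the data bits

c = hamming_encode([1, 0, 1, 1])
c[4] ^= 1                             # inject a single-bit error at position 5
print(hamming_decode(c))              # -> [1, 0, 1, 1], error corrected
```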