Noise, Information Theory, and Entropy (cont.) CS414 – Spring 2007 By Karrie Karahalios, Roger Cheng, Brian Bailey
Coding Intro - revisited: Assume alphabet K of {A, B, C, D, E, F, G, H}. In general, to distinguish n different symbols we need $\lceil \log_2 n \rceil$ bits per symbol, i.e. 3 bits for n = 8. Can code alphabet K as: A 000, B 001, C 010, D 011, E 100, F 101, G 110, H 111
Coding Intro - revisited: “BACADAEAFABBAAAGAH” is encoded as a string of 54 bits (18 symbols x 3 bits each, fixed-length code)
Coding Intro: With this variable-length coding: A 0, B 100, C 1010, D 1011, E 1100, F 1101, G 1110, H 1111, the same message takes 42 bits (9 x 1 + 3 x 3 + 6 x 4), which saves more than 20% in space
Huffman Tree: A (8), B (3), C (1), D (1), E (1), F (1), G (1), H (1)
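As a minimal sketch of how a tree like this can be built, the Python below computes Huffman code lengths from the weights on the slide using a priority queue. The function name `huffman_code_lengths` and the dictionary layout are illustrative assumptions, not something taken from the slides.

```python
import heapq
from itertools import count


def huffman_code_lengths(weights):
    """Return {symbol: code length in bits} for a Huffman code over `weights`."""
    tiebreak = count()  # unique counter so the heap never compares the dicts
    heap = [(w, next(tiebreak), {sym: 0}) for sym, w in weights.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; every symbol in them gets one bit deeper.
        w1, _, depths1 = heapq.heappop(heap)
        w2, _, depths2 = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**depths1, **depths2}.items()}
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return heap[0][2]


weights = {"A": 8, "B": 3, "C": 1, "D": 1, "E": 1, "F": 1, "G": 1, "H": 1}
lengths = huffman_code_lengths(weights)
message = "BACADAEAFABBAAAGAH"
total_bits = sum(lengths[s] for s in message)
print(lengths, total_bits)
```

With these weights the code lengths are 1 bit for A, 3 bits for B and 4 bits for every other symbol, so the 18-symbol message costs 42 bits.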
Limitations: Diverges from the lower limit (the entropy) when the probability of a particular symbol becomes high, because it always uses an integral number of bits per symbol. Must send the code book with the data, which lowers overall efficiency. Must determine the frequency distribution, which must remain stable over the data set.
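To make the first limitation concrete, here is a small sketch that compares the entropy of a heavily skewed two-symbol source with the 1 bit per symbol that any Huffman code must spend on it. The probabilities 0.99 and 0.01 are a hypothetical example chosen for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


probs = [0.99, 0.01]   # hypothetical, heavily skewed source
h = entropy(probs)     # lower limit in bits per symbol
huffman_avg = 1.0      # two symbols: each codeword is exactly 1 bit long
print(f"entropy = {h:.4f} bits/symbol, Huffman = {huffman_avg} bits/symbol")
```

Here the Huffman code spends more than ten times the entropy, which is the gap arithmetic coding is meant to close.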
Arithmetic Coding: Replace the stream of input symbols with a single floating-point number, bypassing the replacement of symbols with codes. Use the probability distribution of symbols (as they appear) to successively narrow the original range. The longer the sequence, the greater the precision the floating-point number needs; in principle this requires infinite precision (but this is achievable in practice).
Encoding Example: Encode “BILL” with p(B) = 1/4, p(I) = 1/4, p(L) = 2/4. Assign symbols to sub-ranges of [0.0, 1.0] based on p, then successively reallocate the low-high range based on the sequence of input symbols.
Symbol  Low   High
B       0.00  0.25
I       0.25  0.50
L       0.50  1.00
Encoding Example: When B appears, compute symbol portion [0.0, 0.25] of current range [0.0, 1.0].
Symbol  Low  High
B       0.0  0.25
Encoding Example: When I appears, compute symbol portion [0.25, 0.50] of current range [0.0, 0.25].
Symbol  Low     High
B       0.0     0.25
I       0.0625  0.125
Encoding Example: When L appears, compute symbol portion [0.50, 1.0] of current range [0.0625, 0.125].
Symbol  Low      High
B       0.0      0.25
I       0.0625   0.125
L       0.09375  0.125
Encoding Example: When L appears again, compute symbol portion [0.50, 1.0] of current range [0.09375, 0.125].
Symbol  Low       High
B       0.0       0.25
I       0.0625    0.125
L       0.09375   0.125
L       0.109375  0.125
Encoding Example: The final low value, 0.109375, encodes the entire sequence. Actually, ANY value within the final range [0.109375, 0.125) will encode the entire sequence.
Encoding Algorithm:
Set low to 0.0
Set high to 1.0
WHILE input symbols remain
    range = high - low
    Get symbol
    high = low + high_range(symbol) * range
    low = low + low_range(symbol) * range
END WHILE
Output any value in [low, high)
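The pseudocode above can be sketched in Python as follows. Exact fractions stand in for infinite-precision reals, and the `RANGES` table and the name `arithmetic_encode` are assumptions made for this illustration, using the “BILL” symbol ranges from the example.

```python
from fractions import Fraction

# Symbol -> (low, high) sub-range of [0, 1), taken from the "BILL" example.
RANGES = {
    "B": (Fraction(0), Fraction(1, 4)),
    "I": (Fraction(1, 4), Fraction(1, 2)),
    "L": (Fraction(1, 2), Fraction(1)),
}


def arithmetic_encode(symbols, ranges=RANGES):
    """Return the final (low, high) interval for the symbol sequence."""
    low, high = Fraction(0), Fraction(1)
    for symbol in symbols:
        width = high - low
        sym_low, sym_high = ranges[symbol]
        # high must be updated first, because both updates use the old low.
        high = low + sym_high * width
        low = low + sym_low * width
    return low, high


low, high = arithmetic_encode("BILL")
print(float(low), float(high))  # 0.109375 0.125
```

Any value in the returned interval, closed at `low` and open at `high`, identifies the sequence.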
Decoding Example: Decode the value E = 0.109375.
E = 0.109375 lies between 0.0 and 0.25; output ‘B’
E = (0.109375 - 0.0) / 0.25 = 0.4375 lies between 0.25 and 0.5; output ‘I’
E = (0.4375 - 0.25) / 0.25 = 0.75 lies between 0.5 and 1.0; output ‘L’
E = (0.75 - 0.5) / 0.5 = 0.5 lies between 0.5 and 1.0; output ‘L’
E = (0.5 - 0.5) / 0.5 = 0.0 -> STOP
Symbol  Low   High
B       0.00  0.25
I       0.25  0.50
L       0.50  1.00
Decoding Algorithm:
encoded = Get(encoded number)
DO
    Find symbol whose range contains encoded
    Output the symbol
    range = high(symbol) - low(symbol)
    encoded = (encoded - low(symbol)) / range
UNTIL (EOF)
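A minimal Python sketch of this decoding loop is shown below. The slides do not say how EOF is signalled, so this version assumes the decoder is told the message length; `RANGES`, `arithmetic_decode`, and the `length` parameter are all illustrative assumptions.

```python
from fractions import Fraction

# Symbol -> (low, high) sub-range of [0, 1), taken from the "BILL" example.
RANGES = {
    "B": (Fraction(0), Fraction(1, 4)),
    "I": (Fraction(1, 4), Fraction(1, 2)),
    "L": (Fraction(1, 2), Fraction(1)),
}


def arithmetic_decode(encoded, length, ranges=RANGES):
    """Decode `length` symbols from an encoded value in [0, 1)."""
    output = []
    for _ in range(length):  # assumption: the length is known instead of EOF
        for symbol, (sym_low, sym_high) in ranges.items():
            if sym_low <= encoded < sym_high:
                output.append(symbol)
                # Rescale so the chosen sub-range maps back onto [0, 1).
                encoded = (encoded - sym_low) / (sym_high - sym_low)
                break
        else:
            raise ValueError("encoded value is outside every symbol range")
    return "".join(output)


print(arithmetic_decode(Fraction(7, 64), 4))  # BILL
```

Passing the length explicitly sidesteps the ambiguity that a leftover value of 0.0 also falls inside the range of ‘B’.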
Code Transmission: Transmit any number within the final range; choose the number that requires the fewest bits. Recall that the minimum number of bits required to represent an ensemble $x_1 \dots x_N$ is $-\log_2 \prod_{i=1}^{N} p(x_i) = -\sum_{i=1}^{N} \log_2 p(x_i)$. Note that we are not comparing directly to H because no code book is generated.
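Compute Size of Interval: Interval: [L, L + S]. Size of interval (S): $S = \prod_{i=1}^{N} p(x_i)$, the product of the probabilities of the encoded symbols. For ensemble “BILL”: S = .25 * .25 * .5 * .5 = 0.015625. Check algorithm result: 0.125 - 0.109375 = 0.015625.
Symbol  Low   High
B       0.00  0.25
I       0.25  0.50
L       0.50  1.00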
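Number of Bits to Represent S: Requires $\lceil -\log_2 S \rceil$ bits (min) to specify S, where S is the product of the probabilities of the encoded symbols. This is the same as the minimum number of bits required to represent the ensemble.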
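Determine Representation: Compute the midpoint L + S/2 and truncate its binary representation after $\lceil -\log_2 S \rceil + 1$ bits. The truncated number lies within [L, L+S], as truncation lowers the value by less than $2^{-(\lceil -\log_2 S \rceil + 1)} \le S/2$.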
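The following sketch works through this step for the “BILL” interval. It assumes truncation after $\lceil -\log_2 S \rceil + 1$ bits, the smallest count k with $2^{-k} \le S/2$, which is enough to keep the truncated midpoint inside the interval; the function name `truncated_midpoint` is hypothetical.

```python
import math
from fractions import Fraction


def truncated_midpoint(low, size):
    """Return (k, bit string, value) for the midpoint of [low, low + size]."""
    # Smallest k with 2**-k <= size / 2, found exactly with fractions.
    k = 0
    while Fraction(1, 2 ** k) > size / 2:
        k += 1
    midpoint = low + size / 2
    scaled = math.floor(midpoint * 2 ** k)  # keep only the first k fractional bits
    bits = format(scaled, "b").zfill(k)
    return k, bits, Fraction(scaled, 2 ** k)


low, size = Fraction(7, 64), Fraction(1, 64)  # final "BILL" interval
k, bits, value = truncated_midpoint(low, size)
print(k, bits, float(value))  # 7 0001111 0.1171875
```

For this interval the midpoint already fits in 7 bits, so truncation changes nothing; in general the truncated value drops by less than S/2 and so stays at or above L.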
Practical Notes: Achieve effectively infinite precision using fixed-width integers as shift registers: represent only the fractional part of each range; as the precision of each range increases, the most significant bits of low and high will match; shift out the matching MSB and continue the algorithm. Caveat: underflow can occur if the two ends of the range approach the same number without their MSBs becoming equal.
Exercise: Huffman vs Arithmetic. Given message AAAAB where p(A) = .9 and p(B) = .1:
Huffman code
(a) compute entropy (H)
(b) build Huffman tree (simple)
(c) compute average codeword length
(d) compute number of bits needed to encode message
Arithmetic coding
(a) compute theoretical min. number of bits to transmit message
(b) compute the final value that represents the message
(c) independent of (b), what is the min number of bits needed to represent the final interval? How does this value compare to (a)? How does this value compare to Huffman part (d)?
Error detection and correction: Error detection is the ability to detect errors introduced by noise or other impairments during transmission from the transmitter to the receiver. Error correction additionally locates the errors and corrects them. Error detection always precedes error correction.
Error Detection: Data transmission can contain errors: single-bit errors, and burst errors of length n, where n is the span from the first to the last erroneous bit in the data block. How to detect errors: if only data is transmitted, errors cannot be detected, so send more information with the data that satisfies a special relationship, i.e. add redundancy.
Error Detection Methods: Vertical Redundancy Check (VRC) / Parity Check; Longitudinal Redundancy Check (LRC); Checksum; Cyclic Redundancy Check (CRC)
Vertical Redundancy Check (VRC), aka Parity Check: Append a single bit at the end of the data block such that the total number of ones is even (even parity); odd parity is similar, making the total number of ones odd. Performance: detects all errors that flip an odd number of bits in a data block; errors that flip an even number of bits go undetected.
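A minimal sketch of the parity check, with bits held in Python lists. The function names and the sample data block are hypothetical and chosen only for illustration.

```python
def parity_bit(bits, even=True):
    """Return the bit to append so the total number of ones is even (or odd)."""
    bit = sum(bits) % 2
    return bit if even else 1 - bit


def check_parity(block, even=True):
    """Return True if the received block (data plus parity bit) looks valid."""
    return sum(block) % 2 == (0 if even else 1)


data = [1, 0, 1, 1, 0, 0, 1]           # hypothetical 7-bit data block
block = data + [parity_bit(data)]      # even parity

one_error = block.copy()
one_error[2] ^= 1                      # flip a single bit

two_errors = one_error.copy()
two_errors[5] ^= 1                     # flip a second bit

print(check_parity(block), check_parity(one_error), check_parity(two_errors))
```

The single-bit error is caught, while the two-bit error passes the check, which matches the performance statement above.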
Longitudinal Redundancy Check (LRC): Organize data into a table and create a parity bit for each column. [Figure: original data rows and the resulting LRC row]
Performance: Detects all burst errors up to length n (the number of columns). Misses burst errors of length n+1 if there are n-1 uninverted bits between the first and last bit.
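The column-parity idea can be sketched in a few lines of Python. The four 8-bit rows are hypothetical sample data, and even parity per column is assumed.

```python
def lrc(rows):
    """Return the even-parity bit of each column of equal-length bit rows."""
    return [sum(column) % 2 for column in zip(*rows)]


rows = [                       # hypothetical data block, one byte per row
    [1, 1, 1, 0, 0, 1, 1, 1],
    [1, 1, 0, 1, 1, 1, 0, 1],
    [0, 0, 1, 1, 1, 0, 0, 1],
    [1, 0, 1, 0, 1, 0, 0, 1],
]
check_row = lrc(rows)
print(check_row)               # [1, 0, 1, 0, 1, 0, 1, 0]

# Receiver side: recomputing over data plus LRC row gives all zeros if intact.
print(lrc(rows + [check_row]))
```

The sender appends `check_row` to the block; a non-zero entry in the receiver's recomputed row marks a column whose parity was violated.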
Parallel Parity: Keep a parity bit for every row and every column. A single-bit error produces two parity errors (one row, one column), so the flipped bit can be located.
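As a sketch of how the two failing parities pinpoint the error, the code below compares stored row and column parities against the received data. Even parity is assumed, and the function name and sample rows are hypothetical.

```python
def locate_single_error(rows, row_parity, col_parity):
    """Return (row, column) of a single flipped bit, or None if not exactly one."""
    bad_rows = [r for r, row in enumerate(rows) if sum(row) % 2 != row_parity[r]]
    bad_cols = [c for c, col in enumerate(zip(*rows)) if sum(col) % 2 != col_parity[c]]
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        return bad_rows[0], bad_cols[0]
    return None


sent = [                        # hypothetical data block
    [1, 0, 1, 1],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
]
row_parity = [sum(row) % 2 for row in sent]
col_parity = [sum(col) % 2 for col in zip(*sent)]

received = [row.copy() for row in sent]
received[1][2] ^= 1             # a single bit flipped in transit

position = locate_single_error(received, row_parity, col_parity)
print(position)                 # (1, 2)
```

Once the position is known, flipping that bit back corrects the error.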
Checksum: Used by upper layer protocols. Similar to LRC, but uses one’s complement arithmetic. [Worked example: one’s complement sum of hexadecimal bytes]
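A minimal sketch of a one’s complement checksum follows. The 8-bit word size, the function names, and the sample bytes are assumptions made for illustration; real protocols commonly use 16-bit words.

```python
def ones_complement_sum(words, width=8):
    """Add words with end-around carry (one's complement addition)."""
    mask = (1 << width) - 1
    total = 0
    for word in words:
        total += word
        total = (total & mask) + (total >> width)  # fold the carry back in
    return total


def checksum(words, width=8):
    """Return the complement of the one's complement sum."""
    return ~ones_complement_sum(words, width) & ((1 << width) - 1)


def verify(words_with_checksum, width=8):
    """Data plus checksum must sum to all ones."""
    return ones_complement_sum(words_with_checksum, width) == (1 << width) - 1


data = [0x01, 0x02]                        # hypothetical payload
c = checksum(data)
print(hex(c), verify(data + [c]))          # 0xfc True
print(verify([0x01, 0x03, c]))             # False
```

The receiver repeats the same addition over data and checksum together; any result other than all ones signals an error.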
Cyclic Redundancy Check: Powerful error detection scheme. Rather than addition, binary (modulo-2) division is used; the underlying theory is finite algebra (Galois fields). Can be easily implemented with a small amount of hardware: shift registers and XOR gates (XOR serves as both addition and subtraction).
CRC: Let us assume k message bits and n bits of redundancy. Associate bits with coefficients of a polynomial, e.g. bits 1011011: $1x^6 + 0x^5 + 1x^4 + 1x^3 + 0x^2 + 1x + 1 = x^6 + x^4 + x^3 + x + 1$
CRC: Let M(x) be the message polynomial. Let P(x) be the generator polynomial: P(x) is fixed for a given CRC scheme and is known by both sender and receiver. Create a block polynomial F(x) based on M(x) and P(x) such that F(x) is divisible by P(x).
CRC:
Sending
1. Multiply M(x) by $x^n$
2. Divide $x^n M(x)$ by P(x)
3. Ignore the quotient and keep the remainder C(x)
4. Form and send $F(x) = x^n M(x) + C(x)$
Receiving
1. Receive F’(x)
2. Divide F’(x) by P(x)
3. Accept if the remainder is 0, reject otherwise
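The sending and receiving steps can be sketched with Python integers standing in for bit strings, where XOR plays the role of modulo-2 subtraction. The generator $x^3 + x + 1$ (bits 1011) and the 7-bit message are hypothetical choices for illustration, not a standard CRC.

```python
def mod2_remainder(dividend, generator):
    """Remainder of modulo-2 polynomial division; ints are coefficient bits."""
    n = generator.bit_length() - 1               # degree of P(x)
    while dividend.bit_length() > n:
        shift = dividend.bit_length() - generator.bit_length()
        dividend ^= generator << shift           # cancel the leading term
    return dividend


def crc_encode(message, generator):
    """Form F(x) = x^n M(x) + C(x)."""
    n = generator.bit_length() - 1
    shifted = message << n                       # multiply M(x) by x^n
    return shifted | mod2_remainder(shifted, generator)


def crc_check(frame, generator):
    """Accept the frame only if P(x) divides it exactly."""
    return mod2_remainder(frame, generator) == 0


P = 0b1011        # hypothetical generator x^3 + x + 1, so n = 3
M = 0b1011011     # hypothetical message x^6 + x^4 + x^3 + x + 1
F = crc_encode(M, P)
print(bin(F), crc_check(F, P), crc_check(F ^ 0b100, P))
```

The last call flips one bit of the frame, so the remainder is no longer zero and the frame is rejected.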
Properties of CRC: Sent F(x), but received F’(x) = F(x) + E(x). When will E(x)/P(x) have no remainder, i.e., when does CRC fail to catch an error?
1. Single Bit Error: $E(x) = x^i$. If P(x) has two or more terms, P(x) will not divide E(x).
2. Two Isolated Single Bit Errors (double errors): $E(x) = x^i + x^j$ with $i > j$, so $E(x) = x^j (x^{i-j} + 1)$. Provided that P(x) is not divisible by x, a sufficient condition to detect all double errors is that P(x) does not divide $(x^t + 1)$ for any t up to the largest possible i-j (i.e., the block length).
Properties of CRC:
3. Odd Number of Bit Errors: If x+1 is a factor of P(x), all errors with an odd number of flipped bits are detected.
Proof (by contradiction): Assume an error E(x) with an odd number of terms is divisible by P(x), and therefore has x+1 as a factor. Then E(x) = (x+1)T(x). Evaluate both sides at x = 1 in modulo-2 arithmetic: E(1) = 1, since E(x) has an odd number of terms, but (1+1)T(1) = 0. Hence E(x) ≠ (x+1)T(x), and the error cannot go undetected.
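This property can be checked by brute force on short blocks. The sketch below assumes two hypothetical generators: $(x+1)(x^3+x+1) = x^4+x^3+x^2+1$, which contains the factor x+1, and $x^3+x+1$, which does not.

```python
def mod2_remainder(dividend, generator):
    """Remainder of modulo-2 polynomial division; ints are coefficient bits."""
    n = generator.bit_length() - 1
    while dividend.bit_length() > n:
        shift = dividend.bit_length() - generator.bit_length()
        dividend ^= generator << shift
    return dividend


def detects_all_odd_errors(generator, length):
    """Try every error pattern of `length` bits with an odd number of ones."""
    for error in range(1, 1 << length):
        odd_weight = bin(error).count("1") % 2 == 1
        if odd_weight and mod2_remainder(error, generator) == 0:
            return False                 # an odd-weight error slipped through
    return True


WITH_FACTOR = 0b11101      # hypothetical: x^4 + x^3 + x^2 + 1 = (x+1)(x^3+x+1)
WITHOUT_FACTOR = 0b1011    # hypothetical: x^3 + x + 1, no factor x+1

with_result = detects_all_odd_errors(WITH_FACTOR, 10)
without_result = detects_all_odd_errors(WITHOUT_FACTOR, 10)
print(with_result, without_result)       # True False
```

The second generator fails because the three-bit error pattern 1011 equals the generator itself and therefore divides evenly.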
Properties of CRC:
4. Short Burst Errors (length t ≤ n, the number of redundant bits): $E(x) = x^j (x^{t-1} + \dots + 1)$, a burst of length t starting at bit position j. If P(x) has an $x^0$ term and t ≤ n, P(x) will not divide E(x), so all burst errors up to length n are detected.
5. Long Burst Errors (length t = n+1): Undetectable only if the burst error is the same as P(x). $P(x) = x^n + \dots + 1$ has n-1 bits between $x^n$ and $x^0$, and the n-1 inner bits of $E(x) = 1 + \dots + 1$ must match them. The probability of not detecting the error is $2^{-(n-1)}$.
6. Longer Burst Errors (length t > n+1): The probability of not detecting the error is $2^{-n}$.
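Properties 4 and 5 can be illustrated by enumerating burst errors for a small generator. The generator $x^3 + x + 1$ (n = 3) and the 10-bit frame length are hypothetical values chosen to keep the search tiny.

```python
def mod2_remainder(dividend, generator):
    """Remainder of modulo-2 polynomial division; ints are coefficient bits."""
    n = generator.bit_length() - 1
    while dividend.bit_length() > n:
        shift = dividend.bit_length() - generator.bit_length()
        dividend ^= generator << shift
    return dividend


def burst_patterns(t):
    """All bursts of length t: first and last bit set, inner bits arbitrary."""
    if t == 1:
        return [1]
    return [(1 << (t - 1)) | (inner << 1) | 1 for inner in range(1 << (t - 2))]


def short_bursts_detected(generator, frame_bits):
    """Check every burst of length t <= n at every position j in the frame."""
    n = generator.bit_length() - 1
    for t in range(1, n + 1):
        for pattern in burst_patterns(t):
            for j in range(frame_bits - t + 1):
                if mod2_remainder(pattern << j, generator) == 0:
                    return False
    return True


P = 0b1011                                      # hypothetical generator, n = 3
all_short_detected = short_bursts_detected(P, 10)
missed_long_burst = mod2_remainder(P << 2, P)   # burst of length n+1 equal to P
print(all_short_detected, missed_long_burst)    # True 0
```

Every burst of length up to n leaves a non-zero remainder, while the length n+1 burst that matches P(x) leaves remainder 0 and is missed.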
Error Correction: Hamming Codes (more next week)