
1 Spring 2015 Mathematics in Management Science Binary Linear Codes Two Examples

2 Example Consider the code C = { 0000000, 0010111, 0101101, 0111010, 1001011, 1011100, 1100110, 1110001 }. Use the nearest-neighbor method to decode w = 1010100.

Code word   0000000   0010111   0101101   0111010
w           1010100   1010100   1010100   1010100
Bit sum     1010100   1000011   1111001   1101110
Distance    3         3         5         5

Code word   1001011   1011100   1100110   1110001
w           1010100   1010100   1010100   1010100
Bit sum     0011111   0001000   0110010   0100101
Distance    5         1         3         3

The unique closest code word is 1011100 (distance 1), so w decodes to 1011100.

3 Example Given the code C = { 0000000, 0010111, 0101101, 0111010, 1001011, 1011100, 1100110, 1110001 }, can w = 1000001 be decoded?

Code word   0000000   0010111   0101101   0111010
w           1000001   1000001   1000001   1000001
Bit sum     1000001   1010110   1101100   1111011
Distance    2         4         4         6

Code word   1001011   1011100   1100110   1110001
w           1000001   1000001   1000001   1000001
Bit sum     0001010   0011101   0100111   0110000
Distance    2         4         4         2

Three code words (0000000, 1001011, 1110001) are at distance 2, so there is no one closest code word: w cannot be decoded.
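The nearest-neighbor computation in these two examples is mechanical, so it is easy to automate. Here is a minimal Python sketch (the names hamming_distance and nn_decode are ours, not from the slides): it computes the distance from w to every code word and returns the unique closest one, or None when there is a tie.

```python
# Nearest-neighbor decoding for the binary code C used above.
CODE = ["0000000", "0010111", "0101101", "0111010",
        "1001011", "1011100", "1100110", "1110001"]

def hamming_distance(u, v):
    """Number of positions where u and v differ (the 1s in their bit sum)."""
    return sum(a != b for a, b in zip(u, v))

def nn_decode(w, code=CODE):
    """Return the unique nearest code word to w, or None if there is a tie."""
    dists = sorted((hamming_distance(w, c), c) for c in code)
    if dists[0][0] == dists[1][0]:
        return None                     # no single closest code word
    return dists[0][1]

print(nn_decode("1010100"))  # 1011100 (distance 1)
print(nn_decode("1000001"))  # None (three code words at distance 2)
```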

4 Spring 2015 Mathematics in Management Science. Data Compression: Huffman Codes, Code Trees, Encoding & Decoding, Constructing Code Trees.

5 Types of Codes
Error detection/correction codes: for accuracy of data.
Data compression codes: for efficiency.
Cryptography: for security.

6 Error Correcting Codes In error correction we expand the data (through redundancy, check digits, etc.) to be able to detect and recover from errors.

7 Data Compression Here we want to use less space to express (approximately) the same info. Reduced data size (fewer data bits) means reduced costs for both storage and transmission.

8 Data Compression Data compression is the process of encoding data so that the most frequently occurring data are represented by the fewest symbols.

9 Sources of Compressibility
Redundancy: recognize repeating patterns and exploit them using a dictionary or variable-length encoding.
Human perception: we are less sensitive to some information, so less important data can be discarded.

10 Compression Algorithms Can be lossless, meaning that the original data can be reconstructed exactly, or lossy, meaning we only get an approximate reconstruction of the data. Examples: ZIP and GIF are lossless; JPEG and MPEG are lossy.

11 Types of Compression
Lossless: preserves all information; exploits redundancy in the data; applied to general data.
Lossy: may lose some information; exploits redundancy & human perception; applied to audio, images, video.

12 Lossy Data Compression Frequently used for images where an exact reproduction may not be required. Many different shades of gray could be converted to just a few. Many different colors could be changed to just a few.

13 Run-Length Encoding (RLE) A simple form of data compression (introduced very early, but still in use). Only useful where there are long runs of the same data (e.g., black-and-white images). Repeated symbols (runs) are replaced by a single symbol and a count.

14 Run-Length Encoding (RLE) Repeated symbols (runs) are replaced by a single symbol and a count. For example, we could replace aaaaaaaaaAAaaaaaAAAAAAAA with just 9a2A5a8A. Simplistic, but still commonly used. Only effective when there are long runs of symbols (e.g., images with small color palettes).

15 Run-Length Encoding (RLE) Repeated symbols (runs) are replaced by a single symbol and a count. Example: Suppose that in a B/W image ‘.’ represents a white pixel and ‘*’ represents a black pixel, and that one row of pixels has the form
...............*......***..................**.....

16 Run-Length Encoding (RLE) Repeated symbols (runs) are replaced by a single symbol and a count. One row of pixels has the form
...............*......***..................**.....
These 50 characters could be replaced by the 16 characters 15.1*6.3*18.2*5.
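The scheme fits in a few lines of Python. This is a sketch in the count-then-symbol style used on these slides (the names rle_encode and rle_decode are illustrative, and the symbols are assumed not to be digits):

```python
from itertools import groupby

def rle_encode(s):
    """Replace each run of a repeated symbol with its count and the symbol."""
    return "".join(f"{len(list(run))}{ch}" for ch, run in groupby(s))

def rle_decode(encoded):
    """Invert rle_encode: accumulate a count, then repeat the next symbol."""
    out, count = [], ""
    for ch in encoded:
        if ch.isdigit():
            count += ch
        else:
            out.append(ch * int(count))
            count = ""
    return "".join(out)

row = "." * 15 + "*" + "." * 6 + "***" + "." * 18 + "**" + "." * 5
assert rle_encode(row) == "15.1*6.3*18.2*5."      # 50 chars -> 16 chars
assert rle_decode(rle_encode(row)) == row
print(rle_encode("aaaaaaaaaAAaaaaaAAAAAAAA"))     # 9a2A5a8A
```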


18 ABACABADABACABAD In ABACABADABACABAD (16 chars), A appears 8 times, B appears 4 times, and C & D each appear twice. Standard ASCII codes this using 8 × 16 = 128 bits.

19 ABACABADABACABAD In ABACABADABACABAD (16 chars), A appears 8 times, B 4 times, C & D each twice. We can label the 4 chars with 2 bits each: A → 00, B → 01, C → 10, D → 11. Now the 16 chars take 2 × 16 = 32 bits to represent the sequence. Here we used fixed-length labels.

20 ABACABADABACABAD In ABACABADABACABAD (16 chars), A appears 8 times, B 4 times, C & D each twice. Use variable-length labels instead, based on the idea that the most frequent chars get the shortest labels. So, use the codes A → 0, B → 10, C → 110, D → 111 and get the message 0100110010011101001100100111, which is just 28 bits.

21 Uncompressing Data Data must be recoverable! Compression algorithms must be reversible (up to some accepted loss in lossy algorithms). In the previous example, how do we decode 0100110010011101001100100111? Where do the labels end?

22 Uncompressing Data How do we decode 0100110010011101001100100111? Where do the labels end? The labels are given by A → 0, B → 10, C → 110, D → 111. Every 0 terminates a label. Only D does not end with 0; it is the only label with three 1’s.

23 Uncompressing Data How do we decode 0100110010011101001100100111? The labels are given by A → 0, B → 10, C → 110, D → 111. So the rule is: read from the left and break when you see a 0 or three consecutive 1’s.

24 Uncompressing Data Decode 111110000011001100010. The labels are given by A → 0, B → 10, C → 110, D → 111. Rule: read from the left and break when you see a 0 or three consecutive 1’s. Get 111,110,0,0,0,0,110,0,110,0,0,10, which translates to DCAAAACACAAB.
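The reason this works is that the labels form a prefix code: no label is the beginning of another, so a greedy left-to-right reader can never match a label too early. Here is a small Python sketch of encoding and decoding with these labels (the function names encode and decode are ours):

```python
# Variable-length prefix code from the ABACABAD example.
LABELS = {"A": "0", "B": "10", "C": "110", "D": "111"}

def encode(text):
    """Replace each character with its label."""
    return "".join(LABELS[ch] for ch in text)

def decode(bits):
    """Read from the left, emitting a character whenever the buffer
    holds a complete label; safe because the code is prefix-free."""
    inverse = {label: ch for ch, label in LABELS.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inverse:
            out.append(inverse[buf])
            buf = ""
    return "".join(out)

msg = encode("ABACABADABACABAD")
print(msg, len(msg))                      # 0100110010011101001100100111 28
print(decode("111110000011001100010"))    # DCAAAACACAAB
```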


26 Morse Code A ternary code (using short marks, long marks, and spaces) invented in the early 1840s. Widely used until the mid-20th century.


28 Frequency Table for Letters

29 Huffman Encoding A general lossless compression method. Developed by David Huffman in 1951 (while a grad student at MIT). Encodes single chars using variable-length labels. Huffman codes are the most efficient among compression methods that assign strings of bits to individual source symbols.

30 Huffman Encoding
Approach: variable-length encoding of chars, exploiting the statistical frequency of chars; efficient when char frequencies vary widely.
Principle: use fewer bits for frequent symbols and more bits for infrequent symbols.

31 Huffman Encoding The code is created using a so-called code tree, built by arranging chars from top to bottom according to increasing probabilities. The code tree is used both to encode and to decode. Must know: how to create the code tree, and how to use the code tree to encode/decode.

32 Using Huffman Tree: Assigning Labels The label assigned to a letter is the sequence of binary digits along the path connecting the top of the tree to that letter.

33 Using Huffman Tree: Assigning Labels For the pictured tree: A gets label 111, B gets label 1101. C gets label ? D gets label ? E gets label ? F gets label ?

34 Using Huffman Tree: Assigning Labels For the pictured tree: A gets label 111, B gets label 1101, C gets label 01, D gets label 00, E gets label 1100, F gets label 10.

35 Using Huffman Tree: Assigning Labels A gets label 111, B 1101, C 01, D 00, E 1100, F 10. Strings are coded by replacing each char with its label. Code FACED as F A C E D → 10 111 01 1100 00, giving 1011101110000.

36 Using Huffman Tree: Decoding The digits “walk” you down the tree to the correct letter. Read the digits from the left side of the code string. When you see a 0, take the left branch; when you see a 1, take the right branch. Continue until you terminate at a letter. Replace the digits read so far with that letter and repeat from the top of the tree with the next digit.

37 Decode 0011000111110, using the labels D → 00, E → 1100, C → 01, A → 111, F → 10.
0011000111110 → D, leaving 11000111110
11000111110 → DE, leaving 0111110
0111110 → DEC, leaving 11110
11110 → DECA, leaving 10
10 → DECAF
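This tree walk is easy to sketch in Python. The nested-pair representation below is our own (left child first, right child second), but it matches the labels on the slides: D = 00, C = 01, F = 10, E = 1100, B = 1101, A = 111.

```python
# Tree-walk decoding: 0 takes the left branch, 1 the right branch,
# and reaching a letter restarts the walk at the top of the tree.
TREE = (("D", "C"), ("F", (("E", "B"), "A")))

def decode_with_tree(bits, tree=TREE):
    out, node = [], tree
    for b in bits:
        node = node[int(b)]            # 0: left, 1: right
        if isinstance(node, str):      # a leaf: emit the letter
            out.append(node)
            node = tree                # restart at the top
    return "".join(out)

print(decode_with_tree("0011000111110"))  # DECAF
```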

38 Creating a Huffman Code Tree Constructed from a frequency table. The freq table shows the number of times (as a fraction of the total) that each char occurs in the document. The freq table is specific to the document being compressed, so every doc has its own code tree!

39 Frequency Table Example “PETER PIPER PICKED A PECK OF PICKLED PEPPERS”

40 Creating a Huffman Code Tree Start with the freq table:
A 0.190, B 0.085, C 0.220, D 0.205, E 0.070, F 0.230
Sort the freq table (smallest first):
E 0.070, B 0.085, A 0.190, D 0.205, C 0.220, F 0.230

41 Creating a Huffman Code Tree Merge the top (smallest) two values, put the char with the smaller value on the left, and sort again:
E 0.070, B 0.085, A 0.190, D 0.205, C 0.220, F 0.230
becomes
EB 0.155, A 0.190, D 0.205, C 0.220, F 0.230
Repeat the process.

42 Creating a Huffman Code Tree Merge the top (smallest) two values, put the char with the smaller value on the left, and sort again:
EB 0.155, A 0.190, D 0.205, C 0.220, F 0.230
becomes
D 0.205, C 0.220, F 0.230, EBA 0.345
Repeat the process.

43 Creating a Huffman Code Tree Merge the top (smallest) two values, put the char with the smaller value on the left, and sort again:
D 0.205, C 0.220, F 0.230, EBA 0.345
becomes
F 0.230, EBA 0.345, DC 0.425
Repeat the process.

44 Creating a Huffman Code Tree Merge the top (smallest) two values, put the char with the smaller value on the left, and sort again:
F 0.230, EBA 0.345, DC 0.425
becomes
DC 0.425, FEBA 0.575
and a final merge gives
DCFEBA 1.000
Now we have to build the code tree.

45 Creating a Huffman Code Tree Construct the tree from the top down by reversing all the merges. Insert a branch where symbols (re)split; the left branch is labeled 0, the right branch 1. The first split (the last merge) is DCFEBA into DC + FEBA.

46-48 (Diagrams: the code tree built step by step by reversing the merges; the finished tree gives the labels A → 111, B → 1101, C → 01, D → 00, E → 1100, F → 10.)
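The whole construction can also be sketched in Python with a priority queue (the names huffman_labels and FREQS are ours; ties here are broken by insertion order, which may differ from the slides' smaller-value-on-the-left convention). With the frequency table above, it reproduces exactly the labels from the earlier slides: A 111, B 1101, C 01, D 00, E 1100, F 10.

```python
import heapq
from itertools import count

FREQS = {"A": 0.190, "B": 0.085, "C": 0.220,
         "D": 0.205, "E": 0.070, "F": 0.230}

def huffman_labels(freqs):
    """Merge the two smallest frequencies until one tree remains,
    then read each label off the path from the root to its letter."""
    tick = count()          # tie-breaker so the heap never compares trees
    heap = [(f, next(tick), ch) for ch, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)     # smallest
        f2, _, right = heapq.heappop(heap)    # next smallest
        heapq.heappush(heap, (f1 + f2, next(tick), (left, right)))
    tree = heap[0][2]

    labels = {}
    def walk(node, prefix):
        if isinstance(node, str):
            labels[node] = prefix
        else:
            walk(node[0], prefix + "0")       # left branch: 0
            walk(node[1], prefix + "1")       # right branch: 1
    walk(tree, "")
    return labels

print(huffman_labels(FREQS))
# {'D': '00', 'C': '01', 'F': '10', 'E': '1100', 'B': '1101', 'A': '111'}
```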

49 Example Find the Huffman code tree for the following frequency table: A 0.30, B 0.15, C 0.10, D 0.45.

