CSCI 3280 Tutorial 6

Outline
- Theory part of LZW
- Tree representation of LZW
- Table representation of LZW

Introduction
- What is LZW?
  - A lossless compression method
  - Named after Lempel, Ziv, and Welch
  - Based on LZ78

Basic idea
- Assume that phrases usually repeat in the input
- Use one code to represent one phrase
- Build a dictionary of the phrases met so far
- If a phrase is found in the dictionary, use its code
- If it is not found, add it to the dictionary and give it a new code
- Very high compression ratio if there is a lot of repetition

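Algorithm
- But how do we handle a byte stream?
- The algorithm is similar to the one in the lecture notes.
- However, each node has at most 256 child nodes, one per byte value (0 to 255).
[Figure: the lecture tree, whose root R branches on 0 and 1, next to the assignment tree, whose root R branches on 0 to 255.]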
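The 256-child node can be sketched as follows. This is only an illustration under assumptions: the names `Node` and `make_root` are hypothetical and not taken from the assignment, and a fixed 256-slot list is just one possible layout (a dictionary per node would also work).

```python
class Node:
    """One tree node. `code` is the LZW code of the string spelled by the
    path from the root down to this node (hypothetical layout)."""
    __slots__ = ("code", "children")

    def __init__(self, code):
        self.code = code
        self.children = [None] * 256  # one slot per possible byte value


def make_root():
    """Build the initial tree: root R with one child for every byte 0..255."""
    root = Node(None)  # the root itself has no code
    for byte in range(256):
        root.children[byte] = Node(byte)  # a single byte's code is its value
    return root


root = make_root()
print(root.children[ord("a")].code)  # the child for 'a' carries code 97
```

New nodes (codes 256 and up) would then be attached to the `children` slot of the node where the search failed.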

Tree Structure Example: ENCODER SIDE
Input: a b a b c d
Initial tree: root R with children 0 to 255, including a = 97, b = 98, c = 99, d = 100.

Step  Output so far          Node added
1     97                     256: b under a
2     97, 98                 257: a under b
3     97, 98, 256            258: c under node 256
4     97, 98, 256, 99        259: d under c
5     97, 98, 256, 99, 100   (none)

Encode complete!
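The encoder walk in this example can be sketched in a few lines. This is a minimal sketch, not the assignment's reference solution: the name `lzw_encode_tree` is hypothetical, the tree is stored as a dictionary of edges `(parent code, byte) -> child code` instead of explicit node objects, and no limit on the number of codes is applied.

```python
def lzw_encode_tree(data):
    """Encode a bytes object into a list of LZW codes using a tree of edges."""
    edges = {}        # (parent_code, byte) -> child_code, for codes >= 256
    next_code = 256   # codes 0..255 are the root's children
    codes = []
    if not data:
        return codes
    node = data[0]    # current node, identified by its code
    for byte in data[1:]:
        child = edges.get((node, byte))
        if child is not None:
            node = child                     # keep walking down the tree
        else:
            codes.append(node)               # emit the code of the longest match
            edges[(node, byte)] = next_code  # grow the tree by one node
            next_code += 1
            node = byte                      # restart from the root's child
    codes.append(node)                       # flush the last match
    return codes


print(lzw_encode_tree(b"ababcd"))
```

For the input `a b a b c d` this reproduces the code sequence of the example, 97, 98, 256, 99, 100.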

Tree Structure Example: DECODE SIDE
Input: 97, 98, 256, 99, 100
Initial tree: root R with children 0 to 255, including a = 97, b = 98, c = 99, d = 100.

Step  Code read  Output so far  Last string  Node added
1     97         a              (none)       (none)
2     98         a b            a            256: b under a
3     256        a b a b        b            257: a under b
4     99         a b a b c      a b          258: c under node 256
5     100        a b a b c d    c            259: d under c
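The decoder side can be sketched with parent pointers: each new node remembers the code of the last string and the byte that extends it, and a string is recovered by walking up to the root. The names `lzw_decode_tree` and `spell` are hypothetical. The sketch also covers a code that is not yet in the tree, a case this particular example does not trigger.

```python
def lzw_decode_tree(codes):
    """Decode a list of LZW codes back into bytes using parent pointers."""
    parent = {}       # code -> (parent_code, byte), for codes >= 256
    next_code = 256

    def spell(code):
        # Walk from the node up to a root child, collecting bytes in reverse.
        out = bytearray()
        while code >= 256:
            code, byte = parent[code]
            out.append(byte)
        out.append(code)
        out.reverse()
        return bytes(out)

    result = bytearray()
    last = None       # code of the "last string"
    for code in codes:
        if last is not None:
            if code < next_code:
                first = spell(code)[0]   # first byte of the current string
            elif code == next_code:
                first = spell(last)[0]   # code not in the tree yet
            else:
                raise ValueError("invalid code %d" % code)
            parent[next_code] = (last, first)  # new node under the last string
            next_code += 1
        result += spell(code)
        last = code
    return bytes(result)


print(lzw_decode_tree([97, 98, 256, 99, 100]))
```

Decoding 97, 98, 256, 99, 100 gives back `ababcd`, adding nodes 256 to 259 in the same order as the encoder.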

Algorithm
- Now let’s see a table structure example

Table Structure Compression
IN: a b c c c d c c d
Initial table (CODE: Entry): 0: NUL, ..., 97: a, 98: b, 99: c, 100: d, ..., 255: ASCII-255

Prefix  Char.  Search  Code  Saved       OUT
NULL    'a'    "a"     97
"a"     'b'    "ab"          256 "ab"    97
NULL    'b'    "b"     98
"b"     'c'    "bc"          257 "bc"    98
NULL    'c'    "c"     99
"c"     'c'    "cc"          258 "cc"    99
NULL    'c'    "c"     99
"c"     'c'    "cc"    258
"cc"    'd'    "ccd"         259 "ccd"   258
NULL    'd'    "d"     100
"d"     'c'    "dc"          260 "dc"    100
NULL    'c'    "c"     99
"c"     'c'    "cc"    258
"cc"    'd'    "ccd"   259
(end of input)                           259

OUT: 97, 98, 99, 258, 100, 259
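The compression table can be regenerated with a short trace function. This is a sketch under assumptions: the name `lzw_compress_trace` and the row layout `(prefix, char, search, code, saved, out)` are hypothetical, the input is assumed to contain only characters with code points 0 to 255, and the table is allowed to grow without limit.

```python
def lzw_compress_trace(text):
    """Return (rows, output) for table-based LZW compression of `text`.

    Each row is (prefix, char, search, code, saved, out); fields that do not
    apply to a row are None. An empty prefix stands for NULL.
    """
    table = {chr(i): i for i in range(256)}  # initial single-character entries
    next_code = 256
    rows, output = [], []
    prefix = ""
    for ch in text:
        search = prefix + ch
        if search not in table:
            # Miss: save the new string and output the code of the prefix.
            table[search] = next_code
            output.append(table[prefix])
            rows.append((prefix, ch, search, None, next_code, table[prefix]))
            next_code += 1
            prefix, search = "", ch          # search again from a NULL prefix
        rows.append((prefix, ch, search, table[search], None, None))
        prefix = search
    if prefix:
        output.append(table[prefix])         # flush the last prefix
    return rows, output


rows, out = lzw_compress_trace("abcccdccd")
for row in rows:
    print(row)
print(out)
```

The second row of the trace, `('a', 'b', 'ab', None, 256, 97)`, corresponds to saving "ab" as 256 and outputting 97.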

Table Structure Decompression
IN: 97, 98, 99, 258, 100, 259
Initial table (CODE: Entry): 0: NUL, ..., 97: a, 98: b, 99: c, 100: d, ..., 255: ASCII-255

cW    pW    C     dict(pW)+C  Saved       OUT
97                                        a
98    97    'b'   "ab"        256 "ab"    b
99    98    'c'   "bc"        257 "bc"    c
258   99    'c'   "cc"        258 "cc"    c c    (exception)
100   258   'd'   "ccd"       259 "ccd"   d
259   100   'c'   "dc"        260 "dc"    c c d

OUT: a b c c c d c c d

Handling of the exception case:
- Usually C is the first char of the current word (cW).
- In the exception case, cW is not yet in the table, so C is the first char of the previous word (pW), and the string for cW is dict(pW)+C.
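The decompression table, including the exception rule, can be traced with the sketch below. The name `lzw_decompress_trace` and the row layout `(cW, pW, C, entry, saved_code, is_exception)` are hypothetical. The first code is assumed to be a single-character code, and an unknown code is accepted only when it equals the next code to be saved.

```python
def lzw_decompress_trace(codes):
    """Return (rows, text) for table-based LZW decompression of `codes`."""
    table = {i: chr(i) for i in range(256)}  # initial single-character entries
    next_code = 256
    rows, out = [], []
    pW = None
    for cW in codes:
        if pW is None:
            rows.append((cW, None, None, None, None, False))
        else:
            is_exception = cW not in table
            if is_exception:
                if cW != next_code:
                    raise ValueError("invalid code %d" % cW)
                C = table[pW][0]   # exception: first char of the previous word
            else:
                C = table[cW][0]   # usual case: first char of the current word
            entry = table[pW] + C
            table[next_code] = entry
            rows.append((cW, pW, C, entry, next_code, is_exception))
            next_code += 1
        out.append(table[cW])      # cW is in the table by now in both cases
        pW = cW
    return rows, "".join(out)


rows, text = lzw_decompress_trace([97, 98, 99, 258, 100, 259])
for row in rows:
    print(row)
print(text)
```

Only the row for cW = 258 is flagged as an exception: at that point the table ends at 257, so "cc" is built from pW = 99 and saved as 258 before it is output.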

Exercise
- Encode and decode with the tree structure. (This helps to show why, in the exception case, C must be the first char of pW.)
- My understanding of the exercise question: the exception case occurs only when cW is the node that the encoder has just added along the pW branch, so the string for cW starts with the string for pW. Otherwise we do not encounter the exception case. Therefore the first char of cW and the first char of pW are the same char.
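As a small aid for the exercise, the positions where the exception case occurs can be found from the code stream alone. This is a sketch, assuming the decoder saves exactly one new entry for every code after the first and never resets its table; the name `exception_positions` is hypothetical.

```python
def exception_positions(codes):
    """Return the indices in `codes` where the decoder meets a code that is
    not yet in its table.

    When the decoder reads codes[i] (i >= 1) it has saved i - 1 entries, so
    the next code to be saved is 256 + (i - 1) = 255 + i. The exception case
    is exactly the case where codes[i] equals that value.
    """
    return [i for i, code in enumerate(codes) if i > 0 and code == 255 + i]


print(exception_positions([97, 98, 99, 258, 100, 259]))  # table example
print(exception_positions([97, 98, 256, 99, 100]))       # tree example
```

For the table example only index 3 (code 258) is reported, while the tree example contains no exception at all, which is why decoding it never needs the special rule.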
