Lempel-Ziv-Welch Compression


1 Lempel-Ziv-Welch Compression
Assignment 5

2 What is LZW compression?
LZW is a form of lossless compression. LZW compression has its roots in the work of Jacob Ziv and Abraham Lempel. In 1977, they published a paper on "sliding-window" compression, and followed it with another paper in 1978 on "dictionary" based compression. These algorithms were named LZ77 and LZ78, respectively. Then in 1984, Terry Welch made a modification to LZ78 which became very popular and was dubbed LZW (guess why). The LZW algorithm is what we are going to talk about here.

3 The Concept of LZW
Many files, especially text files, contain strings that repeat very often, for example " the ". With the spaces, the string takes 5 bytes, or 40 bits, to encode. But what if we added the whole string to the list of characters, right after the last one, at index 256? Then every time we came across " the ", we could send the code 256 instead of 32,116,104,101,32. This would take 9 bits instead of 40 (9 rather than 8 because 256 does not fit into 8 bits).
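The arithmetic on this slide can be checked with a short sketch. Python is used here purely for illustration, and the variable names are made up for this example.

```python
phrase = " the "

# One 8-bit code per character when sent uncompressed.
byte_values = list(phrase.encode("ascii"))
uncompressed_bits = 8 * len(byte_values)

# First free index after the 256 single-character codes (0-255).
new_code = 256
code_bits = new_code.bit_length()  # smallest width that can hold 256

print(byte_values)        # [32, 116, 104, 101, 32]
print(uncompressed_bits)  # 40
print(code_bits)          # 9
```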

4 The Concept of LZW (cont.)
This is exactly the approach that LZW compression takes. It starts with a "dictionary" of all the single characters, with indexes 0-255. It then expands the dictionary as information gets sent through. Pretty soon, redundant strings will each be coded as a single code, and compression has occurred.

5 Compression
    set w = NIL
    loop
        read a character k
        if wk exists in the dictionary
            w = wk
        else
            output the code for w
            add wk to the dictionary
            w = k
    endloop
    output the code for w
So what happens here? The program reads one character at a time. If the work string plus the new character (wk) is in the dictionary, it appends the character to the work string and waits for the next one. This always happens on the first character, since every single character is already in the dictionary. If wk is not in the dictionary (as when the second character arrives), it adds wk to the dictionary and sends over the wire the code for the work string without the new character. It then sets the work string to the new character. When the input runs out, the code for whatever remains in the work string is sent as well.
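The compression loop can be sketched in Python as follows. This is a minimal illustration, not the assignment's lzw.c: the function name lzw_compress is hypothetical, the dictionary is assumed to grow without limit, and the codes are returned as plain integers rather than packed into bits.

```python
def lzw_compress(data: bytes) -> list[int]:
    """Return the LZW codes for data (not yet packed into a bit stream)."""
    # Initial dictionary: every single byte maps to its own value, 0-255.
    dictionary = {bytes([i]): i for i in range(256)}
    next_code = 256
    w = b""  # current work string (NIL)
    codes = []
    for byte in data:
        wk = w + bytes([byte])
        if wk in dictionary:
            w = wk  # keep extending the current match
        else:
            codes.append(dictionary[w])  # emit the longest known string
            dictionary[wk] = next_code   # learn the new string
            next_code += 1
            w = bytes([byte])
    if w:
        codes.append(dictionary[w])  # flush the final work string
    return codes


print(lzw_compress(b"sign, sign."))
# [115, 105, 103, 110, 44, 32, 256, 258, 46]
```

The final flush after the loop matters: without it, the last matched string would never be sent.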

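6 Decompression
    read a code k
    output k
    w = k
    loop
        read a code k
        if k is in the dictionary
            entry = dictionary entry for k
        else
            entry = w + first char of w
        output entry
        add w + first char of entry to the dictionary
        w = entry
    endloop
The nice thing is that the decompressor builds its own dictionary on its side, matching the compressor's exactly, so that only the codes need to be sent. One subtlety: the decompressor's dictionary is always one entry behind, so a code can arrive just before the decompressor has added it. In that case the entry must be w followed by the first character of w.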
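A matching decompression sketch in Python is shown below. The name lzw_decompress is hypothetical, and the input is assumed to be a plain list of integer codes. It includes the case where a code arrives one step before the decompressor has defined it, which occurs for inputs such as "aaaa".

```python
def lzw_decompress(codes: list[int]) -> bytes:
    """Rebuild the original bytes from a list of LZW codes."""
    if not codes:
        return b""
    # Same initial dictionary as the compressor, but mapping code -> string.
    dictionary = {i: bytes([i]) for i in range(256)}
    next_code = 256
    w = dictionary[codes[0]]
    out = [w]
    for k in codes[1:]:
        if k in dictionary:
            entry = dictionary[k]
        elif k == next_code:
            # The compressor used the entry it had only just created.
            entry = w + w[:1]
        else:
            raise ValueError(f"invalid LZW code: {k}")
        out.append(entry)
        dictionary[next_code] = w + entry[:1]  # mirror the compressor's addition
        next_code += 1
        w = entry
    return b"".join(out)


print(lzw_decompress([115, 105, 103, 110, 44, 32, 256, 258, 46]))
# b'sign, sign.'
```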

7 Compression Example
Phrase: sign, sign. everywhere a sign. Dict. Additions: (none) Compressed: 115,

8 Compression Example
Phrase: sign, sign. everywhere a sign. Dict. Additions: 256: si Compressed: 115,105,

9 Compression Example
Phrase: sign, sign. everywhere a sign. Dict. Additions: 256: si 257: ig Compressed: 115,105,103,

10 Compression Example
Phrase: sign, sign. everywhere a sign. Dict. Additions: 256: si 257: ig 258: gn Compressed: 115,105,103,110,

11 Compression Example
Phrase: sign, sign. everywhere a sign. Dict. Additions: 256: si 257: ig 258: gn 259: n, Compressed: 115,105,103,110,44,

12 Compression Example
Phrase: sign,█sign. everywhere a sign. Dict. Additions: 256: si 257: ig 258: gn 259: n, 260: ,█ Compressed: 115,105,103,110,44,32,

13 Compression Example
Phrase: sign, sign. everywhere a sign. Dict. Additions: 256: si 257: ig 258: gn 259: n, 260: ,█ 261: █s Compressed: 115,105,103,110,44,32,256,

14 Compression Example
Phrase: sign, sign. everywhere a sign. Dict. Additions: 256: si 257: ig 258: gn 259: n, 260: ,█ 261: █s 262: sig Compressed: 115,105,103,110,44,32,256, 258,

15 Compression Example
Phrase: sign, sign. everywhere a sign. Dict. Additions: 256: si 257: ig 258: gn 259: n, 260: ,█ 261: █s 262: sig 263: gn. Compressed: 115,105,103,110,44,32,256, 258,46…
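The dictionary additions in this walkthrough can be reproduced with a small tracing sketch. The generator name lzw_trace is hypothetical. It reports each code at the moment it is emitted, so its pairing of codes and additions may differ by one step from the slide animation.

```python
def lzw_trace(text: str):
    """Yield (emitted_code, new_index, new_string) for each dictionary addition."""
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    w = ""
    for k in text:
        if w + k in dictionary:
            w += k
        else:
            # Emit the code for w and learn the string w + k.
            yield dictionary[w], next_code, w + k
            dictionary[w + k] = next_code
            next_code += 1
            w = k
    # The final work string would be flushed here; it adds no dictionary entry.


steps = list(lzw_trace("sign, sign."))
for emitted, index, string in steps:
    print(f"output {emitted:3d}   add {index}: {string!r}")
```

Running it on the first eleven characters of the phrase yields eight additions, 256 through 263, ending with 'gn.'.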

16 Decompression Example
Phrase: s Dict. Additions: (none) Compressed: 115,105,103,110,44,32,256, 258,46…

17 Decompression Example
Phrase: si Dict. Additions: (none) Compressed: 115,105,103,110,44,32,256, 258,46…

18 Decompression Example
Phrase: sig Dict. Additions: 256: si Compressed: 115,105,103,110,44,32,256, 258,46…

19 Decompression Example
Phrase: sign Dict. Additions: 256: si 257: ig Compressed: 115,105,103,110,44,32,256, 258,46…

20 Decompression Example
Phrase: sign, Dict. Additions: 256: si 257: ig 258: gn Compressed: 115,105,103,110,44,32,256, 258,46…

21 Decompression Example
Phrase: sign,█ Dict. Additions: 256: si 257: ig 258: gn 259: n, Compressed: 115,105,103,110,44,32,256, 258,46…

22 Decompression Example
Phrase: sign, si Dict. Additions: 256: si 257: ig 258: gn 259: n, 260: ,█ Compressed: 115,105,103,110,44,32,256, 258,46…

23 Decompression Example
Phrase: sign, sign Dict. Additions: 256: si 257: ig 258: gn 259: n, 260: ,█ 261: █s Compressed: 115,105,103,110,44,32,256, 258,46…

24 Decompression Example
Phrase: sign, sign. Dict. Additions: 256: si 257: ig 258: gn 259: n, 260: ,█ 261: █s 262: sig Compressed: 115,105,103,110,44,32,256, 258,46…

25 Shortcomings of LZW
Traditional LZW compression uses a fixed number of bits for every codeword. Also, there is a chance that some codewords in the dictionary we build will never be used. How can we make the compression more efficient?
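To get a feel for the cost of fixed-size codewords, the sketch below counts bits for the nine codes from the "sign, sign." example. The 12-bit fixed width, the 9-bit starting width and the rule for growing the width are all assumptions chosen for illustration, not something the slides specify.

```python
codes = [115, 105, 103, 110, 44, 32, 256, 258, 46]  # "sign, sign." example

FIXED_WIDTH = 12  # assumed fixed codeword size
fixed_bits = FIXED_WIDTH * len(codes)


def variable_bits(codes, start_width=9):
    """Total bits if the width grows only when the dictionary needs it."""
    width = start_width
    next_code = 256  # next dictionary index the compressor will assign
    total = 0
    for _ in codes:
        total += width
        next_code += 1  # roughly one new entry per emitted code
        if next_code >= (1 << width):
            width += 1  # next free index no longer fits in the current width
    return total


print(fixed_bits)            # 108
print(variable_bits(codes))  # 81
print(8 * len("sign, sign."))  # 88 bits uncompressed
```

On an input this short, 12-bit codewords actually expand the data (108 bits against 88), while starting at 9 bits already comes out slightly ahead.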

26 Modifying LZW
For your assignment, you will be modifying basic C code so that variable-length codewords can be used.

27 Where to Start
Begin by familiarizing yourself with the lzw.c code and how it works. If you are unfamiliar with working on UNIX, please read:
Begin modifying the code to allow for variable-length codewords, considering:
When will the length of the codewords need to be changed?
How can you ensure synchronization between the compression and decompression algorithms?
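One possible way to think about the two questions is sketched below in Python rather than C. It is an illustration under stated assumptions, not the expected solution. The helpers code_width, pack_codes and unpack_codes are hypothetical. The idea is that the width of each codeword depends only on how many codes have been sent so far, so both sides can compute it independently and stay in step. The number of codes is passed to the reader explicitly here; a real format would need an end marker or a length header instead.

```python
def code_width(index: int, min_width: int = 9) -> int:
    """Width of the index-th code (0-based).

    Before that code is sent, the compressor has assigned entries up to
    255 + index, so no larger value can appear at this position.
    """
    return max(min_width, (255 + index).bit_length())


def pack_codes(codes) -> bytes:
    """Pack codes most-significant-bit first, using variable widths."""
    bit_buffer, bit_count = 0, 0
    out = bytearray()
    for index, code in enumerate(codes):
        width = code_width(index)
        if code >= (1 << width):
            raise ValueError(f"code {code} does not fit in {width} bits")
        bit_buffer = (bit_buffer << width) | code
        bit_count += width
        while bit_count >= 8:
            bit_count -= 8
            out.append((bit_buffer >> bit_count) & 0xFF)
        bit_buffer &= (1 << bit_count) - 1  # keep only the leftover bits
    if bit_count:
        out.append((bit_buffer << (8 - bit_count)) & 0xFF)  # zero padding
    return bytes(out)


def unpack_codes(data: bytes, count: int) -> list[int]:
    """Read back count codes, recomputing the same widths as the writer."""
    bit_buffer, bit_count, pos = 0, 0, 0
    codes = []
    for index in range(count):
        width = code_width(index)
        while bit_count < width:
            bit_buffer = (bit_buffer << 8) | data[pos]
            pos += 1
            bit_count += 8
        bit_count -= width
        codes.append((bit_buffer >> bit_count) & ((1 << width) - 1))
        bit_buffer &= (1 << bit_count) - 1
    return codes


codes = [115, 105, 103, 110, 44, 32, 256, 258, 46]
packed = pack_codes(codes)
print(len(packed))                      # 11 bytes for 81 bits
print(unpack_codes(packed, len(codes)) == codes)
```

Because the reader derives each width from the code's position alone, no extra signalling is needed when the width changes from 9 to 10 bits and beyond.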

28 Any Questions?
Next week we will address synchronization issues and any other issues that may arise. Please send me an email if you have any immediate questions or if there is an area of the project you'd like me to focus on next Friday.

