1 Analysis of Algorithms Chapter - 08 Data Compression.


1 Analysis of Algorithms Chapter - 08 Data Compression

2 This chapter contains the following topics:
1. Why Data Compression?
2. Lossless and Lossy Compression
3. Fixed-Length Coding
4. Variable-Length Coding
5. Huffman Coding

3 Why Data Compression?

What is data compression?
- The transformation of data into a more compact form.

Why compress data?
- It saves storage space.
- It saves transmission time over a network: at a given transfer rate, compressed data takes less time to send than the uncompressed data.

Example:
- Suppose the ASCII code of a character is 1 byte.
- Suppose we have a text file containing one hundred instances of 'a', so the file size is about 100 bytes.
- We can store this as "100a" in a new file to convey the same information.
- The new file size is 4 bytes: 4/100 = 4% of the original size, a 96% saving.
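The "100a" trick above is a form of run-length encoding; a minimal sketch of the idea (the function name and output format are my own, not from the slides):

```python
def rle_compress(text):
    """Run-length encode: a run of identical characters becomes
    '<count><char>', so 100 copies of 'a' become '100a'."""
    out = []
    i = 0
    while i < len(text):
        j = i
        while j < len(text) and text[j] == text[i]:
            j += 1                      # extend the current run
        out.append(f"{j - i}{text[i]}")
        i = j
    return "".join(out)

original = "a" * 100
compressed = rle_compress(original)
print(compressed)                        # '100a'
print(len(compressed) / len(original))   # 0.04, i.e. a 96% saving
```

Note this sketch is only lossless if the alphabet contains no digits; real run-length encoders use a framing scheme to avoid that ambiguity.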

4 Lossless and Lossy Data Compression

- The last example shows "lossless" compression: the original data can be retrieved exactly by decompression.
- Lossless compression is used when data integrity is important.
- Example software: WinZip, gzip, compress, etc.
- "Lossy" means the original is not retrievable: size is reduced by permanently eliminating certain information.
- When the data is decompressed, only part of the original information is there (but the user may not notice it).
- When can we use lossy compression? For audio, images, and video.
- JPEG, MPEG, etc. are examples.

5 Fixed-Length Coding

- Coding is a way to represent information. There are two approaches: fixed-length and variable-length coding.
- The code for a character is a "codeword". We consider binary codes, in which each character is represented by a unique binary codeword.
- In fixed-length coding, the codewords of all characters have the same length, e.g., ASCII, Unicode.
- Suppose there are n characters. The minimum number of bits needed for fixed-length coding is ⌈log2 n⌉.
- Example: {a, b, c, d, e} has 5 characters, so ⌈log2 5⌉ = ⌈2.32…⌉ = 3 bits per character.
- We can have the codewords: a=000, b=001, c=010, d=011, e=100.
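The ⌈log2 n⌉ bound and the slide's 5-character code can be checked with a short sketch (the helper names are my own):

```python
import math

def fixed_length_bits(alphabet):
    """Minimum codeword length for a fixed-length binary code:
    ceil(log2 n) bits for n characters."""
    return math.ceil(math.log2(len(alphabet)))

def fixed_length_code(alphabet):
    """Assign each character its index, written in binary,
    zero-padded to the minimum width."""
    bits = fixed_length_bits(alphabet)
    return {ch: format(i, f"0{bits}b") for i, ch in enumerate(alphabet)}

code = fixed_length_code(["a", "b", "c", "d", "e"])
print(code)  # {'a': '000', 'b': '001', 'c': '010', 'd': '011', 'e': '100'}
```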

6 Variable-Length Coding

- The length of codewords may differ from character to character: frequent characters get short codewords, infrequent ones get long codewords.
- Example: the characters a, b, c, d, e, f, each with a frequency and a codeword.
- Make sure that no codeword occurs as the prefix of another codeword: what we need is a "prefix-free code".
- The example above is a prefix-free code, and prefix-free codes give unique decoding.
- E.g., the slide's sample bit string is decoded as "aabe" using the table from the example above.
- The Huffman coding algorithm shows how to obtain prefix-free codes.
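The slide's table of frequencies and codewords did not survive transcription, so the sketch below uses a hypothetical prefix-free code (modeled on the classic textbook example, not the slide's own values) to show why decoding is unambiguous:

```python
# Hypothetical prefix-free code for illustration only; the slide's
# actual table is not reproduced here.
CODE = {"a": "0", "b": "101", "c": "100",
        "d": "111", "e": "1101", "f": "1100"}

def decode(bits, code):
    """Greedy left-to-right decoding. This works because the code is
    prefix-free: the first codeword that matches is the only one that can."""
    inverse = {cw: ch for ch, cw in code.items()}
    out, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in inverse:          # a complete codeword has been read
            out.append(inverse[buf])
            buf = ""
    return "".join(out)

# 0 | 0 | 101 | 1101  ->  a a b e
print(decode("001011101", CODE))    # 'aabe'
```

With a code that is not prefix-free (say a="0" and b="01"), the decoder could not tell where one codeword ends and the next begins.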

7 Huffman Coding Algorithm

- Huffman invented a greedy method to construct an optimal prefix-free variable-length code, based on each character's frequency of occurrence.
- The optimal code is given by a full binary tree: every internal node has 2 children.
- If |C| is the size of the alphabet, there are |C| leaves and |C| - 1 internal nodes.
- We build the tree bottom-up: begin with |C| leaves and perform |C| - 1 "merging" operations.
- Let f[c] denote the frequency of character c.
- We use a priority queue Q in which high priority means low frequency: GetMin(Q) removes the element with the lowest frequency and returns it.

8 An Algorithm

Input: alphabet C and frequencies f[ ]
Result: optimal coding tree for C

Algorithm Huffman(C, f)
{
    n := |C|;
    Q := C;
    for i := 1 to n-1 do
    {
        z := NewNode();
        x := z.left := GetMin(Q);
        y := z.right := GetMin(Q);
        f[z] := f[x] + f[y];
        Insert(Q, z);
    }
    return GetMin(Q);
}

Running time is O(n lg n).
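The pseudocode above can be sketched as runnable Python, using heapq as the priority queue (a minimal sketch; the names and tie-breaking counter are my own):

```python
import heapq
from itertools import count

def huffman(freq):
    """Build a Huffman code from a {char: frequency} map and
    return {char: codeword}. A counter breaks frequency ties so
    the heap never compares tree payloads."""
    tiebreak = count()
    heap = [(f, next(tiebreak), ch) for ch, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # two lowest-frequency nodes
        f2, _, right = heapq.heappop(heap)
        # merge them under a new internal node, as in the pseudocode
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    _, _, root = heap[0]

    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):          # internal node: recurse
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                                # leaf: record the codeword
            codes[node] = prefix or "0"
    walk(root, "")
    return codes

# Frequencies here are the classic textbook example, assumed for illustration.
codes = huffman({"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5})
print(codes)  # 'a' (most frequent) gets the shortest codeword
```

The n-1 heap operations each cost O(lg n), matching the stated O(n lg n) running time.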

9 Example

Obtain the optimal coding for the following using the Huffman algorithm:

Character   a   b   c   d   e   f
Frequency

10 Example (Contd.)

11 End of Chapter - 08