1 Analysis of Algorithms Chapter - 08 Data Compression.


1 Analysis of Algorithms Chapter - 08 Data Compression

2 This chapter contains the following topics:
1. Why Data Compression?
2. Lossless and Lossy Compression
3. Fixed-Length Coding
4. Variable-Length Coding
5. Huffman Coding

3 Why Data Compression?

What is data compression?
- The transformation of data into a more compact form.

Why compress data?
- It saves storage space.
- It saves transmission time over a network: at a given transfer rate, compressed data takes less time to send than the uncompressed data.

Example:
- Suppose the ASCII code of a character is 1 byte.
- Suppose we have a text file containing one hundred instances of 'a', so the file size is about 100 bytes.
- We can store this as "100a" in a new file to convey the same information.
- The new file size is 4 bytes: 4/100 = 4% of the original size, a 96% saving.
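The "100a" trick above is a form of run-length encoding; a minimal sketch of the idea (the function name and output format are my own, not from the slides):

```python
def rle_compress(text):
    """Run-length encode: a run of identical characters becomes
    '<count><char>', so 100 copies of 'a' become '100a'."""
    out = []
    i = 0
    while i < len(text):
        j = i
        while j < len(text) and text[j] == text[i]:
            j += 1                      # extend the current run
        out.append(f"{j - i}{text[i]}")
        i = j
    return "".join(out)

original = "a" * 100
compressed = rle_compress(original)
print(compressed)                        # '100a'
print(len(compressed) / len(original))   # 0.04, i.e. a 96% saving
```

Note this sketch is only lossless if the alphabet contains no digits; real run-length encoders use a framing scheme to avoid that ambiguity.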

4 Lossless and Lossy Data Compression

- The last example shows "lossless" compression: the original data can be retrieved exactly by decompression.
- Lossless compression is used when data integrity is important.
- Example software: WinZip, gzip, compress, etc.
- "Lossy" means the original is not retrievable: size is reduced by permanently eliminating certain information.
- When the data is decompressed, only part of the original information is there (but the user may not notice it).
- When can we use lossy compression? For audio, images, and video.
- JPEG, MPEG, etc. are examples.

5 Fixed-Length Coding

- Coding is a way to represent information. There are two approaches: fixed-length and variable-length coding.
- The code for a character is a "codeword". We consider binary codes, in which each character is represented by a unique binary codeword.
- In fixed-length coding, the codewords of all characters have the same length, e.g., ASCII, Unicode.
- Suppose there are n characters. The minimum number of bits needed for fixed-length coding is ⌈log2 n⌉.
- Example: {a, b, c, d, e} has 5 characters, so ⌈log2 5⌉ = ⌈2.32…⌉ = 3 bits per character.
- We can have the codewords: a=000, b=001, c=010, d=011, e=100.
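The ⌈log2 n⌉ bound and the slide's 5-character code can be checked with a short sketch (the helper names are my own):

```python
import math

def fixed_length_bits(alphabet):
    """Minimum codeword length for a fixed-length binary code:
    ceil(log2 n) bits for n characters."""
    return math.ceil(math.log2(len(alphabet)))

def fixed_length_code(alphabet):
    """Assign each character its index, written in binary,
    zero-padded to the minimum width."""
    bits = fixed_length_bits(alphabet)
    return {ch: format(i, f"0{bits}b") for i, ch in enumerate(alphabet)}

code = fixed_length_code(["a", "b", "c", "d", "e"])
print(code)  # {'a': '000', 'b': '001', 'c': '010', 'd': '011', 'e': '100'}
```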

6 Variable-Length Coding

- The length of codewords may differ from character to character: frequent characters get short codewords, infrequent ones get long codewords.
- Example: the characters a, b, c, d, e, f, each with a frequency and a codeword.
- Make sure that no codeword occurs as the prefix of another codeword: what we need is a "prefix-free code".
- The example above is a prefix-free code, and prefix-free codes give unique decoding.
- E.g., the slide's sample bit string is decoded as "aabe" using the table from the example above.
- The Huffman coding algorithm shows how to obtain prefix-free codes.
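The slide's table of frequencies and codewords did not survive transcription, so the sketch below uses a hypothetical prefix-free code (modeled on the classic textbook example, not the slide's own values) to show why decoding is unambiguous:

```python
# Hypothetical prefix-free code for illustration only; the slide's
# actual table is not reproduced here.
CODE = {"a": "0", "b": "101", "c": "100",
        "d": "111", "e": "1101", "f": "1100"}

def decode(bits, code):
    """Greedy left-to-right decoding. This works because the code is
    prefix-free: the first codeword that matches is the only one that can."""
    inverse = {cw: ch for ch, cw in code.items()}
    out, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in inverse:          # a complete codeword has been read
            out.append(inverse[buf])
            buf = ""
    return "".join(out)

# 0 | 0 | 101 | 1101  ->  a a b e
print(decode("001011101", CODE))    # 'aabe'
```

With a code that is not prefix-free (say a="0" and b="01"), the decoder could not tell where one codeword ends and the next begins.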

7 Huffman Coding Algorithm

- Huffman invented a greedy method to construct an optimal prefix-free variable-length code, based on each character's frequency of occurrence.
- The optimal code is given by a full binary tree: every internal node has 2 children.
- If |C| is the size of the alphabet, there are |C| leaves and |C| - 1 internal nodes.
- We build the tree bottom-up: begin with |C| leaves and perform |C| - 1 "merging" operations.
- Let f[c] denote the frequency of character c.
- We use a priority queue Q in which high priority means low frequency: GetMin(Q) removes the element with the lowest frequency and returns it.

8 An Algorithm

Input: alphabet C and frequencies f[ ]
Result: optimal coding tree for C

Algorithm Huffman(C, f)
{
    n := |C|;
    Q := C;
    for i := 1 to n-1 do
    {
        z := NewNode();
        x := z.left := GetMin(Q);
        y := z.right := GetMin(Q);
        f[z] := f[x] + f[y];
        Insert(Q, z);
    }
    return GetMin(Q);
}

Running time is O(n lg n).
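The pseudocode above can be sketched as runnable Python, using heapq as the priority queue (a minimal sketch; the names and tie-breaking counter are my own):

```python
import heapq
from itertools import count

def huffman(freq):
    """Build a Huffman code from a {char: frequency} map and
    return {char: codeword}. A counter breaks frequency ties so
    the heap never compares tree payloads."""
    tiebreak = count()
    heap = [(f, next(tiebreak), ch) for ch, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # two lowest-frequency nodes
        f2, _, right = heapq.heappop(heap)
        # merge them under a new internal node, as in the pseudocode
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    _, _, root = heap[0]

    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):          # internal node: recurse
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                                # leaf: record the codeword
            codes[node] = prefix or "0"
    walk(root, "")
    return codes

# Frequencies here are the classic textbook example, assumed for illustration.
codes = huffman({"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5})
print(codes)  # 'a' (most frequent) gets the shortest codeword
```

The n-1 heap operations each cost O(lg n), matching the stated O(n lg n) running time.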

9 Example

Obtain the optimal coding for the following using the Huffman algorithm:

Character   a   b   c   d   e   f
Frequency

10 Example (Contd.)

11 End of Chapter - 08