Podcast Ch23d Title: Huffman Compression


Podcast Ch23d Title: Huffman Compression Description: Huffman compression; building a Huffman tree Participants: Barry Kurtz (instructor); John Helfert and Tobie Williams (students) Textbook: Data Structures for Java; William H. Ford and William R. Topp

Huffman Compression Huffman compression relies on counting the number of occurrences of each 8-bit byte in the data and generating an optimal set of variable-length binary codes called prefix codes. The Huffman algorithm is an example of a greedy algorithm: a greedy algorithm makes a locally optimal choice at each step in the hope of producing an optimal solution to the entire problem.

Huffman Compression (continued) The algorithm generates a table that contains the frequency of occurrence of each byte in the file. Using these frequencies, the algorithm assigns each byte a string of bits known as its bit code and writes the bit code to the compressed image in place of the original byte. Compression occurs when each 8-bit character in the file is replaced by a shorter bit sequence.
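The frequency table can be sketched in a few lines; here a bytes literal stands in for the file's contents (the data shown is the short example used later in these slides):

```python
from collections import Counter

# Frequency-of-occurrence table over the bytes of the data.
# The literal below is a stand-in for reading the file's bytes.
data = b'ababbcabaac'
freq = Counter(data)

print(freq[ord('a')], freq[ord('b')], freq[ord('c')])   # 5 4 2
```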

Huffman Compression (continued)
Character                  a    b    c    d    e    f
Frequency (in thousands)   16   4    8    6    20   3
Fixed-length code word     000  001  010  011  100  101
The file contains 57,000 characters; at 8 bits each it occupies 456,000 bits, while the 3-bit fixed-length code needs 171,000 bits. Compression Ratio = 456000/171000 = 2.67
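The arithmetic behind the ratio, as a quick sketch using the table's frequencies:

```python
# Frequencies from the slide's table, in thousands of characters.
freq = {'a': 16, 'b': 4, 'c': 8, 'd': 6, 'e': 20, 'f': 3}

total_chars = sum(freq.values()) * 1000   # 57,000 characters
ascii_bits = total_chars * 8              # 8 bits per character in the original
fixed_bits = total_chars * 3              # 3-bit fixed-length code

print(ascii_bits, fixed_bits, round(ascii_bits / fixed_bits, 2))
# 456000 171000 2.67
```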

Huffman Compression (continued) Use a binary tree to represent bit codes. A left edge is a 0 and a right edge is a 1. Each interior node specifies a frequency count, and each leaf node holds a character and its frequency.

Huffman Compression (continued)

Huffman Compression (continued) Each data byte occurs only in a leaf node, so no bit code is a prefix of another. Such codes are called prefix codes. A full binary tree is one in which each interior node has two children. By converting the tree to a full tree, we can generate better bit codes for our example.
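The bit codes fall out of a tree traversal: left edges contribute 0, right edges contribute 1. A minimal sketch, representing the tree as nested (left, right) tuples rather than the textbook's node classes, and using one possible full tree over the example's six characters (not necessarily the exact shape of the slide's figure):

```python
# A full binary tree as nested (left, right) tuples; a string is a leaf.
# Left edge = '0', right edge = '1'. The tuple shape is an illustration only.
def bit_codes(tree, prefix=""):
    if isinstance(tree, str):                 # leaf: emit its accumulated code
        return {tree: prefix or "0"}
    left, right = tree
    table = bit_codes(left, prefix + "0")
    table.update(bit_codes(right, prefix + "1"))
    return table

# One possible full tree over a..f (a hypothetical shape for illustration):
t = (('a', ('d', ('f', 'b'))), ('c', 'e'))
table = bit_codes(t)
print(table)   # e.g. a->00, d->010, f->0110, b->0111, c->10, e->11
```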

Huffman Compression (continued) Compression ratio = 456000/148000 = 3.08

Huffman Compression (continued) To compress a file, replace each character by its prefix code. To uncompress, follow the bits of the compressed file from the root of the tree down to a leaf; write the leaf's character to the uncompressed file and restart at the root. Good compression involves choosing an optimal tree. It can be shown that the optimal bit codes for a file are always represented by a full tree.
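The decode walk can be sketched in a few lines; again the nested-tuple tree shape is an illustration, not the textbook's classes (the bit string and codes here come from the worked exercise later in these slides):

```python
# Decode a bit string by walking the tree: 0 = left, 1 = right.
# A string is a leaf; on reaching one, emit it and restart at the root.
def decode(bits, tree):
    out, node = [], tree
    for bit in bits:
        node = node[0] if bit == "0" else node[1]
        if isinstance(node, str):
            out.append(node)
            node = tree
    return "".join(out)

tree = ("a", ("c", "b"))                      # codes: a->0, c->10, b->11
print(decode("01101111100110010", tree))      # ababbcabaac
```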

Huffman Compression (continued) For each byte b in the original file, let f(b) be the frequency of the byte and d(b) be the depth of the leaf node containing b. The depth of the node is also the number of bits in the bit code for b. The cost of the tree, cost(T) = sum over all bytes b of f(b) x d(b), is the number of bits necessary to compress the file. A Huffman tree minimizes this cost: it generates the minimum number of bits in the compressed image, and so generates optimal prefix codes.
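The cost is a direct sum over the frequency table and the leaf depths. A sketch, using the example's frequencies and the leaf depths of its Huffman tree:

```python
# cost(T) = sum of f(b) * d(b) over all bytes b:
# the total number of bits in the compressed file.
def tree_cost(freq, depth):
    return sum(freq[b] * depth[b] for b in freq)

# Example frequencies (in thousands) and Huffman-tree leaf depths:
freq  = {'a': 16, 'b': 4, 'c': 8, 'd': 6, 'e': 20, 'f': 3}
depth = {'a': 2,  'b': 4, 'c': 2, 'd': 3, 'e': 2,  'f': 4}

print(tree_cost(freq, depth) * 1000)   # 134000 bits
```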

Circle all statements that are true for a Huffman tree. (a) A Huffman tree is complete. (b) Every interior node has exactly two children. (c) Each byte is in a leaf node. (d) The total number of bits generated by the tree is minimum. (e) Each interior node contains the product of its children's weights. (f) Each interior node contains the sum of its children's weights. (g) Nodes with lower frequency are near the top of the tree. (h) Nodes with lower frequency are near the bottom of the tree.

Building a Huffman Tree For each of the n bytes in a file, assign the byte and its frequency to a tree node, and insert the node into a minimum priority queue ordered by frequency.

Building a Huffman Tree (continued) Remove two elements, x and y, from the priority queue, and attach them as children of a node whose frequency is the sum of the frequencies of its children. Insert the resulting node into the priority queue. In a loop, perform this action n-1 times. Each loop iteration creates one of the n-1 interior nodes of the full tree.
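The build loop above can be sketched with Python's heapq as the minimum priority queue. The (frequency, tiebreak, tree) entry shape and the nested-tuple trees are illustrations, not the textbook's node classes:

```python
import heapq
from itertools import count

# Build a Huffman tree. Each heap entry is (frequency, tiebreak, tree),
# where a tree is either a character (leaf) or a (left, right) pair.
# The tiebreak counter keeps heapq from comparing trees on equal frequencies.
def build_huffman(freq):
    tiebreak = count()
    heap = [(f, next(tiebreak), ch) for ch, f in freq.items()]
    heapq.heapify(heap)
    for _ in range(len(freq) - 1):      # n-1 merges create the n-1 interior nodes
        fx, _, x = heapq.heappop(heap)
        fy, _, y = heapq.heappop(heap)
        heapq.heappush(heap, (fx + fy, next(tiebreak), (x, y)))
    return heap[0][2]

tree = build_huffman({'a': 16, 'b': 4, 'c': 8, 'd': 6, 'e': 20, 'f': 3})
```

The exact tree shape depends on how ties are broken, but every character ends in a leaf and the weighted path length is the same for any Huffman tree over these frequencies.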

Building a Huffman Tree (continued) With a minimum priority queue, the least frequently occurring characters have longer bit codes, and the more frequently occurring characters have shorter bit codes.

Building a Huffman Tree (continued)

Building a Huffman Tree (continued)

Building a Huffman Tree (continued)

Building a Huffman Tree (continued) For the Huffman tree, the compressed file contains (16(2) + 4(4) + 8(2) + 6(3) + 20(2) + 3(4)) x 1000 = 134,000 bits, which corresponds to a compression ratio of 456000/134000 = 3.4.
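The 134,000-bit total can also be computed during the merge loop itself, without building the tree: every merge sinks each merged character one level deeper, so adding the merged weight once per merge accumulates exactly sum of f(b) x d(b). A sketch:

```python
import heapq

# Weighted external path length of a Huffman tree, accumulated as we merge.
def huffman_cost(freqs):
    heap = list(freqs)
    heapq.heapify(heap)
    cost = 0
    while len(heap) > 1:
        x = heapq.heappop(heap)
        y = heapq.heappop(heap)
        cost += x + y                 # every leaf below this merge sinks one level
        heapq.heappush(heap, x + y)
    return cost

print(huffman_cost([16, 4, 8, 6, 20, 3]) * 1000)   # 134000 bits
```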

The file "data.txt" contains the following ASCII characters: ababbcabaac (a) Construct a Huffman tree for the file. (b) What are the Huffman codes for the characters? Huffman codes: a->0 b->11 c->10 (c) Write out the bits in the compressed file. Bits in the compressed file: 01101111100110010
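Part (c) can be checked by concatenating the codes (a quick sketch):

```python
codes = {'a': '0', 'b': '11', 'c': '10'}   # Huffman codes from the answer above
bits = ''.join(codes[ch] for ch in 'ababbcabaac')
print(bits)                                 # 01101111100110010
```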

Consider the following file characters: beabdcbacaacbdecdeaaeb (a) Construct a Huffman tree for the file. (b) What are the Huffman codes? a -> 00 b -> 10 c -> 010 d -> 011 e -> 11 (c) Write out the bits in the compressed file.
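Part (c) can be generated the same way; the sketch below builds the bit string from the codes given in part (b) without spelling out the answer:

```python
codes = {'a': '00', 'b': '10', 'c': '010', 'd': '011', 'e': '11'}
bits = ''.join(codes[ch] for ch in 'beabdcbacaacbdecdeaaeb')
print(len(bits))   # 51 bits: 6(2) + 5(2) + 4(3) + 3(3) + 4(2)
```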