Presentation transcript:

6 x 100 + 1 x 10 + 3 x 1 = 613. Base 10 digits: {0...9}.

1 x 8 + 1 x 4 + 0 x 2 + 1 x 1 = 13. Base 2 digits: {0, 1}.
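
This place-value expansion can be checked directly in Python; the snippet below is an illustrative sketch, not part of the original slides.

```python
# Sketch: expand 1101 (base 2) by place values, mirroring 6 x 100 + 1 x 10 + 3 x 1 = 613 in base 10.
digits = [1, 1, 0, 1]                                    # binary digits, most significant first
value = sum(d * 2**i for i, d in enumerate(reversed(digits)))
print(value)                                             # 13, the same as int("1101", 2)
```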

The binary equivalents of some decimal numbers.

A binary tree consists of a set of nodes, each of which can have at most two children: a left child and a right child of the parent node.

The top node of the tree is called the root. A leaf is a node with no descendants.

The binary digits (bits) in the computer's memory are initially set to zero. To represent a number, the appropriate bits are set to 1.

4 (dec) = 100 (bin), 8 (dec) = 1000 (bin), 16 (dec) = 10000 (bin): multiplying by 2 in machine language is accomplished by shifting left one bit. 5 (dec) = 101 (bin), 9 (dec) = 1001 (bin), 17 (dec) = 10001 (bin): we obtain the next integer by adding a 1 to the binary number.
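
As a quick illustration (a sketch, not from the slides), the same shift-and-add behaviour can be seen with Python's bit operators:

```python
# Sketch: multiplying by 2 is a left shift; adding 1 sets the lowest bit.
n = 4
print(bin(n), bin(n << 1), bin(n << 2))    # 0b100 0b1000 0b10000  (4, 8, 16)
print(bin(n + 1), bin((n << 1) + 1))       # 0b101 0b1001          (5, 9)
```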

Construct a tree using the following rule: if the parent's node number is n, the left child's is 2*n and the right child's is 2*n + 1.

We assign 1 to the root's node number.

Then, the left child's node number is 2.

And the right child's node number is 3.

A graphical way of getting the binary equivalents of decimal numbers: place a 0 on each left edge and a 1 on each right edge of the binary tree.

To convert 5 to binary, start by writing the lower-most 1 on the path from node 5 to the root.

To the left of that 1, write the digit for the next edge on the upward path to the root, namely, 0.

Finally, to the left of the 0, place a 1. This represents the node number of the root, 1, which is the same in binary and decimal. Thus 5 (dec) = 101 (bin).

The node numbers at the leaves converted to binary numbers.

Placing a 0 on each left edge is equivalent to shifting left, i.e., multiplying by 2. Placing a 1 on a right edge means you are adding 1 to the left child's value.
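
The equivalence can be written as a few lines of code; this is a sketch with an illustrative function name (node_number), not something defined on the slides.

```python
# Sketch: rebuild a node number from its root-to-node path of edge labels.
# A '0' (left edge) shifts left, i.e. multiplies by 2; a '1' (right edge) shifts left and adds 1.
def node_number(path):
    n = 1                          # the root's node number
    for edge in path:
        n = (n << 1) | int(edge)
    return n

print(node_number("01"), bin(node_number("01")))    # 5 0b101, matching the example above
```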

Express 67 (dec) in binary. Starting at node 67 and moving up toward the root, write each edge label to the left of the previous one: 1, then 11, then 011, then 0011, then 00011, then 000011. Finally, place a 1 at the left, since the root node contains 1: 67 (dec) = 1000011 (bin).

The ASCII (American Standard Code for Information Interchange) code in decimal and binary for some characters. It requires 7 bits to represent each character.

Symbol a: 0; Symbol b: 00; Symbol c: 1. The code 001 can be decoded as aac or as bc. Thus the code is ambiguous.

Symbol a: 0; Symbol b: 01. The code 01 is decoded as b. Before you reach the end of the string 01, however, you would think that the 0 corresponds to a. The code requires you to scan ahead. This is called a non-instantaneous code and is inefficient as a coding scheme.

Symbol a: 00; Symbol b: 01; Symbol c: 1. If the characters are only in the leaves, the code is unique and instantaneous. Such a code exhibits the prefix property.
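
The prefix property is easy to test mechanically; the helper below is a hypothetical sketch (is_prefix_free is not defined on the slides).

```python
# Sketch: a code has the prefix property if no codeword is a prefix of another codeword.
def is_prefix_free(codes):
    words = list(codes.values())
    return not any(a != b and b.startswith(a) for a in words for b in words)

print(is_prefix_free({"a": "0", "b": "00", "c": "1"}))   # False: the ambiguous code above
print(is_prefix_free({"a": "00", "b": "01", "c": "1"}))  # True: this prefix code
```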

Symbol a: 00; Symbol b: 01; Symbol c: 1. Let's decode 10001. The leading 1 matches c; the next two bits, 00, match a; the final two bits, 01, match b. The decoded string is cab.
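
A minimal decoder for this code (a sketch, assuming the code table {a: 00, b: 01, c: 1} above) scans the bits left to right and emits a symbol as soon as the buffered bits form a codeword; the prefix property is what makes this unambiguous.

```python
# Sketch: decode a prefix code greedily, left to right.
codes = {"a": "00", "b": "01", "c": "1"}
decode_table = {bits: sym for sym, bits in codes.items()}

def decode(bitstring):
    out, buffer = [], ""
    for bit in bitstring:
        buffer += bit
        if buffer in decode_table:           # a complete codeword; safe because of the prefix property
            out.append(decode_table[buffer])
            buffer = ""
    return "".join(out)

print(decode("10001"))                       # cab
```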

Letters occurring in a paragraph and their frequency of occurrence. How can we encode these letters so that the resultant code is minimal?

(b, 2), (h, 4), (g, 9), (a, 11): a list of nodes containing the letters and their frequencies. The list is sorted by frequency.

Remove the first two nodes, (b, 2) and (h, 4), add their frequencies, and create a parent node (*, 6) with that frequency. Letters will appear in only the leaves of the final tree.

Insert the parent node, with its children, in its sorted position in the list: (*, 6), (g, 9), (a, 11). This type of list and its operations is called a priority queue.

Remove the first two nodes again, (*, 6) and (g, 9), add their frequencies, and create a parent node (*, 15) with that frequency.

Insert the parent node, with its children, in its sorted position in the list: (a, 11), (*, 15).

By continuing the process, merging (a, 11) and (*, 15) into the root (*, 26), we get the final tree. The leaves are the only nodes containing letters.
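
This repeated remove-two-smallest-and-merge loop is exactly what a priority queue (min-heap) provides; the following is an illustrative sketch using Python's heapq module, not code from the slides.

```python
import heapq

# Sketch: build the Huffman tree for the slide's frequencies (b:2, h:4, g:9, a:11).
# Each heap entry is (frequency, tie_breaker, node); a node is either a letter or a (left, right) pair.
freqs = {"b": 2, "h": 4, "g": 9, "a": 11}
heap = [(f, i, ch) for i, (ch, f) in enumerate(sorted(freqs.items(), key=lambda kv: kv[1]))]
heapq.heapify(heap)

counter = len(heap)
while len(heap) > 1:
    f1, _, left = heapq.heappop(heap)        # smallest frequency
    f2, _, right = heapq.heappop(heap)       # next smallest
    heapq.heappush(heap, (f1 + f2, counter, (left, right)))
    counter += 1

total, _, tree = heap[0]
print(total)   # 26, the root's frequency
print(tree)    # ('a', (('b', 'h'), 'g')): a is the root's left child, as on the slides
```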

This tree is called a Huffman tree. Here it is shown with only the leaves (b, h, g, a) labeled.

Label the edges with 0's and 1's as we did for the binary numbers.

The letters with their Huffman codes: a: 0, g: 11, b: 100, h: 101. The letters with the higher frequencies have shorter Huffman codes.
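
Continuing the sketch above (and assuming the nested-tuple tree ('a', (('b', 'h'), 'g')) it produced), labeling left edges 0 and right edges 1 during a traversal reproduces these codes:

```python
# Sketch: derive the Huffman codes by walking the tree, appending 0 for left edges and 1 for right edges.
def assign_codes(node, prefix=""):
    if isinstance(node, str):                # a leaf holds a letter
        return {node: prefix}
    left, right = node
    codes = assign_codes(left, prefix + "0")
    codes.update(assign_codes(right, prefix + "1"))
    return codes

print(assign_codes(("a", (("b", "h"), "g"))))
# {'a': '0', 'b': '100', 'h': '101', 'g': '11'}: higher frequency, shorter code
```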

Let's decode 100011. Starting at the root, the bits 1, 0, 0 lead to a leaf: print the letter b and return to the root. The next bit, 0, leads directly to a leaf: print a. The final bits, 1 and 1, lead to another leaf: print g. Resultant string: bag.

bag: 100011 in Huffman code (6 bits). bag: 1100010 1100001 1100111 in ASCII code (21 bits).
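
A quick check of the savings for this small example (the 7-bit ASCII values are standard; the Huffman codes are the ones derived above):

```python
# Sketch: compare the Huffman encoding of "bag" with its 7-bit ASCII encoding.
huffman = {"a": "0", "b": "100", "g": "11"}
word = "bag"

huff_bits = "".join(huffman[ch] for ch in word)
ascii_bits = " ".join(format(ord(ch), "07b") for ch in word)

print(huff_bits, "->", len(huff_bits), "bits")     # 100011 -> 6 bits
print(ascii_bits, "->", 7 * len(word), "bits")     # 1100010 1100001 1100111 -> 21 bits
```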

Leaf node numbers: 12 = b, 13 = h, 7 = g, 2 = a. If you number the nodes as we did when we converted decimal to binary, you can get the Huffman code from the node numbers.

The Huffman code is obtained from the binary node number by removing the leading 1: 12 = 1100 gives b: 100; 13 = 1101 gives h: 101; 7 = 111 gives g: 11; 2 = 10 gives a: 0.
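
In code, the same trick is a one-line string operation per leaf (a sketch; the numbers are the ones on the slide):

```python
# Sketch: a leaf's Huffman code is its node number written in binary with the leading 1 removed.
for number, letter in [(2, "a"), (7, "g"), (12, "b"), (13, "h")]:
    print(letter, bin(number)[2:], "->", bin(number)[3:])
# a 10 -> 0   g 111 -> 11   b 1100 -> 100   h 1101 -> 101
```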

"Baseball's Sad Lexicon" These are the saddest of possible words: "Tinker to Evers to Chance." Trio of bear cubs, and fleeter than birds, Tinker and Evers and Chance. Ruthlessly pricking our gonfalon bubble, Making a Giant hit into a double-- Words that are heavy with nothing but trouble: "Tinker to Evers to Chance." Franklin Pierce Adams