Compression Damian Gordon

Presentation transcript:

Compression Damian Gordon

Data Compression Rather than having to store every character in a file (e.g. an MP3 file), it would be great if we could find a way of reducing the length of the file so that it can be stored in a smaller space.

Data Compression Also, rather than having to send every character in a message, it would be great if we could find a way of reducing the length of the message so that it can be transmitted more quickly.

Data Compression Let’s look at an example: The rain in Spain lies mainly in the plain

Data Compression That’s a total of 42 characters (including 8 spaces): The rain in Spain lies mainly in the plain

Data Compression Let’s replace the word “the” with the number 1: The rain in Spain lies mainly in the plain

Data Compression Let’s replace the word “the” with the number 1: 1 rain in Spain lies mainly in 1 plain (the = 1)

Data Compression Let’s replace the word “the” with the number 1. We’ve reduced the number of characters to 38: 1 rain in Spain lies mainly in 1 plain (the = 1)

Data Compression Let’s replace the letters “ain” with the number 2: 1 rain in Spain lies mainly in 1 plain (the = 1)

Data Compression Let’s replace the letters “ain” with the number 2. We’ve reduced the number of characters to 30: 1 r2 in Sp2 lies m2ly in 1 pl2 (the = 1, ain = 2)

Data Compression Let’s replace the letters “in” with the number 3: 1 r2 in Sp2 lies m2ly in 1 pl2 (the = 1, ain = 2)

Data Compression Let’s replace the letters “in” with the number 3. We’ve reduced the number of characters to 28: 1 r2 3 Sp2 lies m2ly 3 1 pl2 (the = 1, ain = 2, in = 3)

Data Compression Now let’s say 1 means “the ”, that is, “the” followed by a space: 1 r2 3 Sp2 lies m2ly 3 1 pl2 (the = 1, ain = 2, in = 3)

Data Compression Now let’s say 1 means “the ”, that is, “the” followed by a space. We’ve reduced the number of characters to 26: 1r2 3 Sp2 lies m2ly 3 1pl2 (“the ” = 1, ain = 2, in = 3)

Data Compression Now let’s say 3 means “in ”, that is, “in” followed by a space: 1r2 3 Sp2 lies m2ly 3 1pl2 (“the ” = 1, ain = 2, in = 3)

Data Compression Now let’s say 3 means “in ”, that is, “in” followed by a space. We’ve reduced the number of characters to 24: 1r2 3Sp2 lies m2ly 31pl2 (“the ” = 1, ain = 2, “in ” = 3)

Data Compression So that’s 24 characters for a 42-character message, not bad. Original: The rain in Spain lies mainly in the plain. Compressed: 1r2 3Sp2 lies m2ly 31pl2 (“the ” = 1, ain = 2, “in ” = 3)
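The substitution steps above can be sketched in Python. This is only an illustration under stated assumptions: the names `TABLE`, `compress` and `decompress` are hypothetical (they are not from the slides), and the capitalised “The ” is treated as “the ”, as the slides implicitly do, so the capital letter is lost when the message is expanded again.

```python
# Substitution table from the slides. Order matters: "ain" must be
# replaced before "in ", otherwise "rain in" would be split wrongly.
TABLE = [("the ", "1"), ("ain", "2"), ("in ", "3")]


def compress(text):
    """Replace each phrase in TABLE with its one-character code."""
    # Assumption: like the slides, treat the capitalised "The " as "the ".
    text = text.replace("The ", "the ")
    for phrase, code in TABLE:
        text = text.replace(phrase, code)
    return text


def decompress(coded):
    """Expand each code back into its phrase.

    Only safe when the original message contains no digits.
    """
    for phrase, code in TABLE:
        coded = coded.replace(code, phrase)
    return coded


message = "The rain in Spain lies mainly in the plain"
coded = compress(message)
print(coded, len(coded))   # compressed text and its length
print(decompress(coded))   # expanded text (with a lower-case "the")
```

Note that the receiver also needs the table itself, so in practice the saving only pays off when the table is small compared with the message, or is agreed in advance.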

Data Compression Let’s try a different example. Let’s say we are sending a list of jobs, where each item on the list is 10 characters long: Bookkeeper Teacher--- Porter---- Nurse----- Doctor----

Data Compression Rather than sending all the padding (shown here as dashes), we could just say how long it is. Before: Bookkeeper Teacher--- Porter---- Nurse----- Doctor---- After: Bookkeeper Teacher3- Porter4- Nurse5- Doctor4-

Data Compression We’ve gone from 50 to 42 characters: Bookkeeper Teacher--- Porter---- Nurse----- Doctor---- Bookkeeper Teacher3- Porter4- Nurse5- Doctor4-

PROGRAM CompressExample:
    Get Current Character;
    WHILE (NOT End_of_Line)
    DO  Get Next Character;
        IF (Current Character != Next Character)
        THEN Write out Current Character;
        ELSE Keep looping while the characters match;
                 Keep counting;
                 Get next char;
             When finished write out Counter;
             Write out Current Character;
             Reset Counter;
        ENDIF;
        Set current to next;
    ENDWHILE;
END.

PROGRAM CompressExample:
    // Assumes Get_char() returns an end-of-line marker once the input
    // is used up, and End_of_Line is true when Current_Char is that marker.
    char Current_Char, Next_Char;
    integer Counter <- 1;
    Current_Char <- Get_char();
    WHILE (NOT End_of_Line)
    DO  Next_Char <- Get_char();
        IF (Current_Char != Next_Char)
        THEN Write out Current_Char;
        ELSE WHILE (Current_Char = Next_Char)
             DO  Counter <- Counter + 1;
                 Next_Char <- Get_char();
             ENDWHILE;
             Write out Counter, Current_Char;
             Counter <- 1;
        ENDIF;
        Current_Char <- Next_Char;
    ENDWHILE;
END.
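As a rough Python counterpart to the pseudocode, the sketch below compresses the job list. The function name `rle_text` and the `min_run` threshold are assumptions made for illustration: the slides leave “Bookkeeper” untouched even though it contains double letters, so this sketch only shortens runs of three or more identical characters.

```python
from itertools import groupby


def rle_text(text, min_run=3):
    """Replace each run of min_run or more identical characters
    with '<count><char>'; shorter runs are copied unchanged."""
    pieces = []
    for char, group in groupby(text):
        run = len(list(group))
        if run >= min_run:
            pieces.append(f"{run}{char}")
        else:
            pieces.append(char * run)
    return "".join(pieces)


# Job list from the slides; dashes stand in for the padding.
jobs = ["Bookkeeper", "Teacher---", "Porter----", "Nurse-----", "Doctor----"]
encoded = [rle_text(job) for job in jobs]

print(encoded)
print(sum(len(job) for job in jobs), "->", sum(len(e) for e in encoded))
```

This scheme is ambiguous if the text itself contains digits, so a real format would need an escape character or a separate count field.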

Data Compression Or let’s imagine we are sending a list of house prices


Data Compression Now let’s use the # to indicate number of zeros: #4 6#5 55#4 21#5 3#6

Data Compression We’ve gone from 32 characters to 18 characters: #4 6#5 55#4 21#5 3#6
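The zero-counting idea can be sketched as follows. The helper names `compress_zeros` and `expand_zeros` are hypothetical, and the sample prices are simply the values that the four fully readable codes on the slide (6#5, 55#4, 21#5, 3#6) would expand to; the first code on the slide is incomplete, so it is left out here. Only trailing zeros are shortened, which is an assumption that keeps decoding unambiguous.

```python
import re


def compress_zeros(digits):
    """Replace a trailing run of two or more zeros with '#<count>'."""
    return re.sub(r"0{2,}$", lambda m: f"#{len(m.group())}", digits)


def expand_zeros(code):
    """Turn a trailing '#<count>' back into that many zeros."""
    return re.sub(r"#(\d+)$", lambda m: "0" * int(m.group(1)), code)


# Assumed sample prices, reconstructed from the readable codes on the slide.
prices = ["600000", "550000", "2100000", "3000000"]
codes = [compress_zeros(p) for p in prices]

print(codes)
print([expand_zeros(c) for c in codes] == prices)
```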

Image Compression

Data Compression Let’s think about images. Let’s say we are trying to display the letter ‘A’


Data Compression We could encode this as an 8 × 8 grid of white (W) and black (B) pixels: WWWBBWWW WWBWWBWW WBWWWWBW WBWWWWBW WBBBBBBW WBWWWWBW WBWWWWBW WWWWWWWW

Data Compression We could compress this: WWWBBWWW WWBWWBWW WBWWWWBW WBWWWWBW WBBBBBBW WBWWWWBW WBWWWWBW WWWWWWWW becomes 3W2B3W 2WB2WB2W WB4WBW WB4WBW W6BW WB4WBW WB4WBW 8W

Data Compression From 64 characters to 44 characters: WWWBBWWW WWBWWBWW WBWWWWBW WBWWWWBW WBBBBBBW WBWWWWBW WBWWWWBW WWWWWWWW becomes 3W2B3W 2WB2WB2W WB4WBW WB4WBW W6BW WB4WBW WB4WBW 8W

Data Compression We call this “run-length encoding” or RLE.
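A minimal run-length encoding sketch for the letter ‘A’ image is shown below. The function names are hypothetical, and the grid is assumed to be the 8 × 8 image implied by the slide’s 64-character count (rows three and six each appear twice). A count of 1 is left out, matching the codes on the slides.

```python
import re
from itertools import groupby

# Assumed 8 x 8 image of the letter 'A' (W = white, B = black).
IMAGE = [
    "WWWBBWWW",
    "WWBWWBWW",
    "WBWWWWBW",
    "WBWWWWBW",
    "WBBBBBBW",
    "WBWWWWBW",
    "WBWWWWBW",
    "WWWWWWWW",
]


def encode_row(row):
    """Run-length encode one row, omitting the count for single pixels."""
    pieces = []
    for pixel, group in groupby(row):
        run = len(list(group))
        pieces.append(f"{run}{pixel}" if run > 1 else pixel)
    return "".join(pieces)


def decode_row(code):
    """Expand '<count><pixel>' pairs back into a row of pixels."""
    pairs = re.findall(r"(\d*)([A-Z])", code)
    return "".join(pixel * int(count or "1") for count, pixel in pairs)


encoded = [encode_row(row) for row in IMAGE]
print(encoded)
print(sum(len(r) for r in IMAGE), "->", sum(len(e) for e in encoded))
```

Because `decode_row` restores every row exactly, this kind of compression is lossless.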


Data Compression Now let’s add one more rule. Let’s say that sending the number ‘0’ means “repeat the previous line”.

Data Compression So far we had: WWWBBWWW WWBWWBWW WBWWWWBW WBWWWWBW WBBBBBBW WBWWWWBW WBWWWWBW WWWWWWWW becomes 3W2B3W 2WB2WB2W WB4WBW WB4WBW W6BW WB4WBW WB4WBW 8W

Data Compression And now we get: WWWBBWWW WWBWWBWW WBWWWWBW WBWWWWBW WBBBBBBW WBWWWWBW WBWWWWBW WWWWWWWW becomes 3W2B3W 2WB2WB2W WB4WBW WB4WBW W6BW WB4WBW WB4WBW 8W, which becomes 3W2B3W 2WB2WB2W WB4WBW 0 W6BW WB4WBW 0 8W

Data Compression Going from 64 to 44 to 34 characters: WWWBBWWW WWBWWBWW WBWWWWBW WBWWWWBW WBBBBBBW WBWWWWBW WBWWWWBW WWWWWWWW becomes 3W2B3W 2WB2WB2W WB4WBW WB4WBW W6BW WB4WBW WB4WBW 8W, which becomes 3W2B3W 2WB2WB2W WB4WBW 0 W6BW WB4WBW 0 8W
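The “0 means repeat the previous line” rule can be layered on top of row-wise RLE, as in this sketch. As before, the function names are hypothetical and the 8 × 8 grid is the one implied by the 64-character count on the slides.

```python
from itertools import groupby

# Assumed 8 x 8 image of the letter 'A' (W = white, B = black).
IMAGE = [
    "WWWBBWWW",
    "WWBWWBWW",
    "WBWWWWBW",
    "WBWWWWBW",
    "WBBBBBBW",
    "WBWWWWBW",
    "WBWWWWBW",
    "WWWWWWWW",
]


def encode_row(row):
    """Run-length encode one row, omitting the count for single pixels."""
    return "".join(
        f"{run}{pixel}" if run > 1 else pixel
        for pixel, run in ((p, len(list(g))) for p, g in groupby(row))
    )


def encode_image(rows):
    """Encode each row with RLE; send '0' when a row equals the one before."""
    encoded = []
    previous = None
    for row in rows:
        encoded.append("0" if row == previous else encode_row(row))
        previous = row
    return encoded


encoded = encode_image(IMAGE)
print(" ".join(encoded))
print(sum(len(e) for e in encoded), "characters")
```

A decoder would keep the last expanded row and reuse it whenever it reads a ‘0’.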

Data Compression For many simple images (such as icons and line drawings), pixels of the same colour come in long runs and neighbouring lines are often identical, so you can get large savings from RLE.

Data Compression

etc.