Cmprssd Vw f Infrmtn Thry: A Compressed View of Information
John Woodward
1. Is a picture really worth 1000 words?
2. Does the Complete Works of Shakespeare contain more information in its original language or in a translation?
3. Why is tossing a coin the best way to make a decision?
4. What is your best defence when interrogated?
5. Why is the original scientific paper outlining information theory still relevant?

Information Age – probability / coding theory. We transmit, share, copy, digest, delete, evaluate, interpret, value and ignore information.
1. Shannon entropy is concerned with the transmission of a message (Shannon, 1948).
2. Algorithmic information theory is concerned with the information content of the message itself.

The diving bell and the butterfly

ABCDEFGHIJKLMNOPQRSTUVWXYZ. Move your finger left to right: A takes 1 time unit, B takes 2, C takes 3, …, Z takes 26. e.g. spell "BUT", "SECONDS".
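As a small illustration of this cost model (a sketch, not from the slides), the cost of a word is the sum of the alphabet positions of its letters:

```python
# Sketch: cost of spelling a word by scanning A..Z left to right,
# where each letter costs its position in the alphabet (A=1, ..., Z=26).

def scan_cost(word: str, alphabet: str = "ABCDEFGHIJKLMNOPQRSTUVWXYZ") -> int:
    """Total time units to spell `word` by scanning `alphabet` from the left."""
    return sum(alphabet.index(ch) + 1 for ch in word.upper())

if __name__ == "__main__":
    for w in ["BUT", "SECONDS"]:
        print(w, scan_cost(w))   # "BUT" costs 2 + 21 + 20 = 43 time units
```

Reordering the alphabet so that frequent letters come first would lower the average cost – the same idea that Morse code and Huffman coding exploit later in the talk.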

The diving bell and the butterfly ABCDEFGHIJKLMNOPQSTUVWXYZ

Frequency of a Symbol Typewriter QWERTY Computer QWERTY???

Frequency of a Symbol. Typewriter QWERTY. Computer QWERTY??? Megabee. HAWKING movie: https://www.youtube.com/watch?v=BtMeI3xGtcM

Morse Code. How many symbols are in the Morse code? 1, 2, 3, 4, 5? https://www.youtube.com/watch?v=Z5uyK5MrsTs

Morse Code contains 4 symbols. Morse did basic frequency (probabilistic) analysis and came within 15% of optimum. https://www.youtube.com/watch?v=Z5uyK5MrsTs

Morse code tree (figure). 3 gaps. Most frequent letters – THESE SHOULD BE THE KEYS ON YOUR COMPUTER.
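In a code tree like Morse's, the most frequent letters sit nearest the root and so get the shortest codewords. A minimal Huffman-coding sketch of the same idea (not from the slides; the letter frequencies are rough illustrative values):

```python
# Sketch: build a Huffman code and note that the most frequent symbols
# receive the shortest codewords, as in the Morse code tree.
import heapq
from itertools import count

def huffman(freqs):
    """Return {symbol: codeword} for a dict of symbol -> frequency."""
    tick = count()  # tie-breaker so the heap never has to compare dicts
    heap = [(f, next(tick), {s: ""}) for s, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tick), merged))
    return heap[0][2]

if __name__ == "__main__":
    # Rough, illustrative letter frequencies (per cent) -- an assumption.
    freqs = {"E": 12.7, "T": 9.1, "A": 8.2, "O": 7.5, "N": 6.7, "Q": 0.1, "Z": 0.07}
    for sym, code in sorted(huffman(freqs).items(), key=lambda kv: len(kv[1])):
        print(sym, code)   # E and T get the shortest codewords
```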

Morse code tree. 1) ... --- ...  2) --- -- --.

Morse code tree. 1) ... --- ... = SOS. 2) OMG.

Coin Tossing. A fair coin; a double-headed / double-tailed coin. Gambler's fallacy – each toss is independent. – Symmetric – monotonic.

Making a Decision. If you cannot make a rational decision… toss a fair coin. MAXIMUM ENTROPY – this has maximum "surprise", i.e. least predictability. With a friend – chocolate cake or broccoli? The Dice Man (https://en.wikipedia.org/wiki/The_Dice_Man) – the protagonist makes decisions by rolling a die.
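A small sketch (not from the slides) of why the fair coin is the maximum-entropy choice: the Shannon entropy H(p) of a coin that lands heads with probability p peaks at p = 0.5 (exactly 1 bit) and falls to 0 for a double-headed or double-tailed coin.

```python
# Sketch: Shannon entropy of a biased coin, maximised by the fair coin.
import math

def coin_entropy(p: float) -> float:
    """H(p) = -p*log2(p) - (1-p)*log2(1-p), in bits."""
    if p in (0.0, 1.0):
        return 0.0          # a double-headed/tailed coin carries no surprise
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

if __name__ == "__main__":
    for p in (0.0, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0):
        print(f"p = {p:.1f}  H = {coin_entropy(p):.3f} bits")
    # The maximum, 1 bit, is at p = 0.5 -- the fair coin.
```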

Police Interview
1. Police may ask you to repeat your statement. Why?
2. This is a tactic.
3. Or they may ask you for the details in a different order.
4. The "no comment interview".
5. MINIMUM ENTROPY.
6. https://www.youtube.com/watch?v=q4f_vi7yKuU
7. What is the information content?
8. Neither confirm nor deny.

Linguistics. 1. zs, td, pb, fv ??? 2. `` – how much info? 3. Shorthand. 4. Lip reading.

Genetic Code 1 4 BASES A-U C-G 20 AMINO ACIDS

Genetic Code 2 (figure).

Genetic Code 3
1. No gaps.
2. Uses codons of 3 bases (A, T, C, G), not 2 or 4, for 21 code words (20 amino acids + stop): 4^2 = 16 is too few, 4^3 = 64 is enough.
3. Instantaneous (prefix) code – needed!
4. Even if a mistake is made in the last base – often okay – similar codons are grouped (locality/redundancy).
5. Even if the wrong amino acid results – it still has similar chemical properties.
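A quick check of the arithmetic in point 2 (a sketch, not from the slides): find the smallest codon length n over a 4-letter alphabet that gives at least 21 code words.

```python
# Sketch: smallest codon length n with 4**n >= 21 (20 amino acids + stop).
needed = 21
n = 1
while 4 ** n < needed:
    n += 1
print(n, 4 ** n)   # -> 3 64: two bases (16) are too few, three (64) are enough
```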

Half Time Transmitting information – Shannon entropy. Algorithmic complexity – the information in the message.

Lossless/Lossy Compression

Compress A File. Not all strings are compressible. Suppose we want to compress every bit string of length 3 to a shorter string – proof that we cannot: the pigeonhole principle. There are 8 bit strings of length 3 but only 7 bit strings of length <= 2 (counting the empty string ""), so two inputs must share an output. In fact most strings are not compressible – compression is really just a "RENAMING" of strings.
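A brute-force check of the pigeonhole argument (illustrative sketch, not from the slides):

```python
# Sketch: 8 bit strings of length 3 cannot all map injectively onto
# the 7 bit strings of length <= 2 (including the empty string).
from itertools import product

inputs = ["".join(bits) for bits in product("01", repeat=3)]
outputs = [""] + ["".join(bits) for n in (1, 2) for bits in product("01", repeat=n)]

print(len(inputs), "bit strings of length 3")       # 8
print(len(outputs), "bit strings of length <= 2")   # 7
# Since 8 > 7, any lossless "compressor" to strictly shorter strings must collide.
```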

Kolmogorov Complexity. Three bit strings: "0000…0" (sixty 0s), "001001…001" ("001" repeated twenty times), and "…" (a string with no obvious pattern).

Kolmogorov Complexity
"0000…0" (sixty 0s) – repeat 60 times "0".
"001001…001" – repeat 20 times "001".
"…" (no obvious pattern) – print "…" (the string itself).

Kolmogorov Complexity
"0000…0" – repeat 60 times "0"; "001001…001" – repeat 20 times "001"; "…" – print "…".
1. According to probability theory they are all equally likely (each specific 60-bit string has probability 2^-60).
2. If there is a pattern, we can write a rule and implement it on a computer.
3. The Kolmogorov complexity of a bit string is the length of the shortest computer program that prints the string and halts.
4. It can be thought of as a measure of compressibility.
5. Amazing fact – up to an additive constant, it is independent of the computer you run it on.
6. Which strings above have high/low Kolmogorov complexity?
7. Kolmogorov complexity is a generalization of Shannon entropy.
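A rough, illustrative proxy (not from the slides): feed the strings to a general-purpose compressor and compare output sizes. The strings are scaled up from the slide's 60 bits so the effect dominates the compressor's fixed overhead, and the "random" string is freshly generated here as a stand-in for the slide's patternless example.

```python
# Sketch: compressed size as a crude stand-in for Kolmogorov complexity.
import random
import zlib

strings = {
    "all zeros":      "0" * 60000,
    "'001' repeated": "001" * 20000,
    "random bits":    "".join(random.choice("01") for _ in range(60000)),  # stand-in
}

for name, s in strings.items():
    size = len(zlib.compress(s.encode(), 9))
    # The patterned strings shrink to a tiny description; the random one
    # still needs roughly one bit per character.
    print(f"{name:15s} {len(s):6d} chars -> {size:5d} compressed bytes")
```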

Information and Translating. Which contains more "information": 5, 6, 4, … or five, six, four, …? Now consider Shakespeare in English and German. If we translate word for word, the number of pages increases (10%). But if we supply a dictionary, that is a "one-off cost" – in principle an increase of fixed amount, however long the text!
digit – word: 1 – one, 2 – two, 3 – three, 4 – four, …

2nd Law of Thermodynamics
1. Entropy (disorder) increases (statistically) in a closed system.
2. Things naturally become untidy (definition of untidy?).
3. The only irreversible law of physics.
4. S = k log W
5. W = the number of microstates corresponding to that macrostate (ratio).

2nd law, e.g. vibrate 2 dice on a tray. Microstate: the values on each die, e.g. (3, 5). Macrostate: the sum, e.g. 8 = 3 + 5. Probabilities come from counting microstates per macrostate.
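A small sketch (not from the slides) counting microstates per macrostate for two dice: the middle sums have the most microstates, so S = k log W is largest there.

```python
# Sketch: microstates (ordered die values) per macrostate (their sum) for two dice.
import math
from collections import Counter
from itertools import product

counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

for total in sorted(counts):
    W = counts[total]   # number of microstates giving this macrostate
    print(f"sum {total:2d}: W = {W}, probability = {W}/36, log W = {math.log(W):.2f}")
```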

Maxwell’s Demon 1860s The demon can separate the atoms. ENERGY FOR FREE

Maxwell's Demon 2. We can do work (energy) on the gas by compressing either piston, proportional to log(V1/V2). Divide the cylinder into 8 compartments. We can half-push the pistons in some order, e.g. 010 (left, right, left). We have reduced the entropy of the gas by k log(#microstates) – 3 bits of information.

Experimental verification of Landauer's principle: an irreversible transformation – deleting a bit – dissipates at least k T ln 2. Nature 483, 187–189 (8 March 2012), doi:10.1038/nature10872.
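To put a number on it (a sketch, not from the slides; room temperature T = 300 K is an assumption):

```python
# Sketch: the Landauer limit -- minimum heat dissipated to erase one bit.
import math

k_B = 1.380649e-23   # Boltzmann constant, J/K
T = 300.0            # assumed room temperature, K

per_bit = k_B * T * math.log(2)
print(f"kT ln 2 at {T:.0f} K = {per_bit:.3e} J per erased bit")
print(f"erasing the demon's 3 bits costs at least {3 * per_bit:.3e} J")
```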

Bald Man. What does he say to the barber each time? How much "information" is contained in his hair?

Which book?

Toss a coin!!!!

- …

- … The end

Next Public Lecture: 2nd April (2 weeks). Ant hills, traffic jams and social segregation: modelling the world from the bottom up – Dr Savi Maharaj.