Information-Theoretic Listening


Presentation on theme: "Information-Theoretic Listening"— Presentation transcript:

1 Information-Theoretic Listening
07/16/96 Paris Smaragdis, Machine Listening Group, MIT Media Lab 11/11/2018

2 Outline
Defining a global goal for computational audition
Example 1: Developing a representation
Example 2: Developing grouping functions
Conclusions

3 Auditory Goals
Goals of computational audition are all over the place; should they be?
Lack of formal rigor in most theories
Computational listening is fitting psychoacoustic experiment data

4 Auditory Development
What really made audition?
How did our hearing evolve?
How did our environment shape our hearing?
Can we evolve, rather than instruct, a machine to listen?

5 Goals of our Sensory System
Distinguish independent events
Object formation
Gestalt grouping
Minimize thinking and effort
Perceive as few objects as possible
Think as little as possible

6 Entropy Minimization as a Sensory Goal
Long history between entropy and perception
Barlow, Attneave, Atick, Redlich, etc ...
Entropy can measure statistical dependencies
Entropy can measure economy in both ‘thought’ (algorithmic entropy) and ‘information’ (Shannon entropy)

7 What is Entropy?
Shannon Entropy:
A measure of: order, predictability, information, correlations, simplicity, stability, redundancy, ...
High entropy = Little order
Low entropy = Lots of order
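For reference, the standard definition of Shannon entropy for a discrete distribution is sketched below. The notation (outcome probabilities p_i, N possible outcomes) is an assumption and is not taken from the slide.

```latex
H(X) = -\sum_{i=1}^{N} p_i \log_2 p_i, \qquad 0 \le H(X) \le \log_2 N
```

H(X) is 0 when a single outcome has probability 1 (lots of order) and reaches log_2 N when all N outcomes are equally likely (little order).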

8 Representation in Audition
Frequency decompositions
Cochlear hint
Easier to look at data!
Sinusoidal bases
Signal processing framework

9 Evolving a Representation
Develop a basis decomposition
Bases should be statistically independent
Satisfaction of minimal entropy idea
Decomposition should be data driven
Account for different domains

10 Method
Use bits of natural sounds to derive bases
Analyze these bits with ICA
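One common way to connect ICA to the minimal-entropy idea is sketched below. The symbols are assumptions for illustration: x is a short frame of sound, A an unknown mixing matrix, s the independent sources, and the rows of W the analysis bases. The slides do not say which ICA algorithm or contrast function was used.

```latex
\begin{aligned}
x &= A s, \qquad y = W x \\
I(y) &= \sum_{i} H(y_i) - H(y) \\
H(y) &= H(x) + \log\lvert\det W\rvert
\end{aligned}
```

Here H is differential entropy and W is assumed square and invertible. Because H(x) is fixed by the data, minimizing the mutual information I(y) over W amounts to minimizing the sum of marginal entropies of the outputs, less the log-determinant term.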

11 Results
We obtain sinusoidal bases!
Transform is driven by the environment
Uniform procedure for different domains

12 Auditory Grouping
Heuristics
Hard to implement on computers
Require even more heuristics to resolve ambiguity
Weak definitions
Bootstrapped to individual domains
Vision Gestalt → Auditory Gestalt → …
Common AM
Common FM
Good Continuation

13 Method
Goal: Find grouping that minimizes scene entropy
Parameterized Auditory Scene s(t,n)
Density Estimation Ps(i)
Shannon Entropy Calculation
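The three stages on this slide (parameterized scene, density estimation, entropy calculation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the toy scene (two amplitude envelopes whose modulation rates coincide only at n = 0.5), the joint-histogram density estimate, and all function names and defaults are hypothetical.

```python
import math
from collections import Counter


def scene(n, num_samples=2000, base_rate=3.0, spread=4.0):
    """Hypothetical scene s(t, n): amplitude envelopes of two sinusoids.

    The first envelope is modulated at base_rate Hz, the second at
    base_rate + (n - 0.5) * spread Hz, so the modulation is common
    only when n = 0.5.
    """
    rate2 = base_rate + (n - 0.5) * spread
    samples = []
    for k in range(num_samples):
        t = k / num_samples  # one second of envelope
        a1 = 0.5 * (1.0 + math.sin(2.0 * math.pi * base_rate * t))
        a2 = 0.5 * (1.0 + math.sin(2.0 * math.pi * rate2 * t))
        samples.append((a1, a2))
    return samples


def density(samples, bins=16):
    """Histogram estimate P_s(i) over joint (a1, a2) amplitude bins."""
    counts = Counter(
        (min(int(a1 * bins), bins - 1), min(int(a2 * bins), bins - 1))
        for a1, a2 in samples
    )
    total = len(samples)
    return [c / total for c in counts.values()]


def shannon_entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def scene_entropy(n):
    return shannon_entropy(density(scene(n)))


# Sweep the scene parameter and keep the value with the lowest entropy.
candidates = [0.0, 0.25, 0.5, 0.75, 1.0]
entropies = {n: scene_entropy(n) for n in candidates}
best_n = min(entropies, key=entropies.get)
```

In this toy setup the joint histogram collapses onto its diagonal when the two envelopes are identical, so the entropy is lowest at the value of n where the modulation is common.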

14 Common Modulation - Frequency
[Figures: Scene Description and Entropy Measurement. Recoverable labels: Time, Frequency, n = 0.5]

15 Common Modulation - Amplitude
[Figures: Scene Description and Entropy Measurement. Recoverable labels: Sine 1 Amplitude, Sine 2 Amplitude, Time, n = 0.5]

16 Common Modulation - Onset/Offset
[Figures: Scene Description and Entropy Measurement. Recoverable labels: Sine 1 Amplitude, Sine 2 Amplitude, Time, n = 0.5]

17 Similarity/Proximity - Harmonicity I
[Figures: Scene Description and Entropy Measurement. Recoverable labels: Time, Frequency]

18 Similarity/Proximity - Harmonicity II
[Figures: Scene Description and Entropy Measurement. Recoverable labels: Time, Frequency]

19 Simple Scene Analysis Example
5 Sinusoids
2 Groups
Simulated Annealing Algorithm
Input: Raw sinusoids
Goal: Entropy minimization
Output: Expected grouping
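A grouping search of this kind can be sketched with a small simulated annealer. Everything below is an assumption made for illustration: the scene (sinusoids 0-2 share one amplitude envelope, 3-4 share another), the cost (sum of per-group joint entropies, with both groups required to be non-empty), and the cooling schedule. The slides do not specify the cost function or the annealing parameters actually used.

```python
import math
import random
from collections import Counter

NUM_SAMPLES = 1000
BINS = 8


def envelope(rate_hz):
    """Amplitude envelope in [0, 1], quantized to BINS levels."""
    levels = []
    for k in range(NUM_SAMPLES):
        a = 0.5 * (1.0 + math.sin(2.0 * math.pi * rate_hz * k / NUM_SAMPLES))
        levels.append(min(int(a * BINS), BINS - 1))
    return levels


# Hypothetical scene: sinusoids 0-2 share one envelope, 3-4 share another.
ENVELOPES = [envelope(3.0)] * 3 + [envelope(5.0)] * 2


def joint_entropy(members):
    """Entropy (bits) of the joint histogram of the members' envelopes."""
    counts = Counter(zip(*(ENVELOPES[i] for i in members)))
    return -sum(
        (c / NUM_SAMPLES) * math.log2(c / NUM_SAMPLES) for c in counts.values()
    )


def cost(labels):
    """Sum of per-group joint entropies; an empty group is disallowed."""
    groups = [[i for i, g in enumerate(labels) if g == k] for k in (0, 1)]
    if not groups[0] or not groups[1]:
        return math.inf
    return sum(joint_entropy(g) for g in groups)


def anneal(steps=3000, t_start=1.0, cooling=0.995, seed=0):
    rng = random.Random(seed)
    labels = [0, 1, 0, 1, 0]  # arbitrary starting assignment
    current = cost(labels)
    best, best_cost = labels[:], current
    temperature = t_start
    for _ in range(steps):
        proposal = labels[:]
        proposal[rng.randrange(len(proposal))] ^= 1  # move one sinusoid
        new_cost = cost(proposal)
        delta = new_cost - current
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            labels, current = proposal, new_cost
            if current < best_cost:
                best, best_cost = labels[:], current
        temperature *= cooling
    return best, best_cost


best_labels, best_cost = anneal()
```

With only 5 sinusoids an exhaustive search over the 30 valid assignments would also work; annealing is shown because it is the algorithm the slide names and because it scales to scenes where enumeration is impractical.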

20 Important Notes
No definition of time
Developed a concept of frequency
No parameter estimation requirement
Operations on data not parameters
No parameter setting!

21 Conclusions
Elegant and consistent formulation
No constraint over data representation
Uniform over different domains (Cross-modal!)
No parameter estimation
No parameter tuning!
Biological plausibility
Barlow et al ...
Insight to perception development

22 Future Work
Good Cost Function?
Joint entropy vs entropy of sums
Shannon entropy vs Kolmogorov complexity
Joint-statistics (cumulants, moments)
Incorporate time
Sounds have time dependencies I’m ignoring
Generalize to include perceptual functions

23 Teasers
Dissonance and Entropy
Pitch Detection
Instrument Recognition

