Research by Elaine Chew and Ching-Hua Chuan, University of Southern California. Presentation by Sean Sweeney, DigiPen Institute of Technology.

Presentation transcript:

Polyphonic Audio Key Finding Using the Spiral Array CEG Algorithm
Research by Elaine Chew and Ching-Hua Chuan, University of Southern California
Presentation by Sean Sweeney, DigiPen Institute of Technology
CS 582 / April 17, 2011
Dr. Dimitri Volper

Presentation Flow
- Musical Pitch and Key
- Human Perception of Pitch
- The Spiral Array Model
  - Pitches
  - Chords
  - Keys
- The CEG Algorithm
  - Algorithm
  - Visualization

Musical Pitch and Key
- Pitch
  - The perceived value of a tone, from “low” to “high”
  - Psychoacoustic (subjective) perception of frequency
- Frequency (Hz) is a physical measurement: the number of cycles per second, i.e. the reciprocal of the period
- Key (Western music)
  - Labels the “center” tone in a section of music
  - Standard smallest interval: the semitone, or “half-step”
  - Standard pattern of semitones around the “center”, ascending (major scale): 2, 2, 1, 2, 2, 2, 1
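
The ascending semitone pattern can be walked mechanically to list the notes of a key. The sketch below is only an illustration: the sharp-only note names and the function names are assumptions, not part of the presentation.

```python
# Pitch-class names using sharps only (an illustrative spelling choice).
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Ascending semitone pattern from the slide.
MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]


def major_scale(tonic):
    """Return the seven pitch classes of the major scale on `tonic` (0 = C)."""
    scale = [tonic % 12]
    # The final step only returns to the tonic one octave up, so skip it.
    for step in MAJOR_STEPS[:-1]:
        scale.append((scale[-1] + step) % 12)
    return scale


def scale_names(tonic):
    """Same scale, as note names."""
    return [NOTE_NAMES[pc] for pc in major_scale(tonic)]
```

For example, `scale_names(7)` walks the pattern upward from G.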

Human Perception of Pitch
- Limited range of perception
  - Typically 20 Hz – 20,000 Hz
  - Range tends to decrease with age
- Noticeable difference is coarser at low frequencies
  - Less distance (Hz) between lower sounds
  - Around 1,400 perceivable intervals
- Certain frequency distances sound relatively close
  - Thirds, fifths, octaves
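
To make "less distance (Hz) between lower sounds" concrete, the sketch below computes the width of one semitone in Hz at different registers. It assumes 12-tone equal temperament with A4 = 440 Hz, which the slide does not state; the function names are illustrative.

```python
A4_HZ = 440.0  # assumed reference tuning


def semitone_hz(n):
    """Frequency of the pitch n semitones above (negative: below) A4."""
    return A4_HZ * 2.0 ** (n / 12.0)


def semitone_width_hz(n):
    """Distance in Hz from that pitch to the next semitone up."""
    return semitone_hz(n + 1) - semitone_hz(n)
```

Under these assumptions, a semitone two octaves below A4 spans far fewer Hz than one two octaves above it, even though both are heard as the same musical interval.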

The Spiral Array Model

- Helical structure
- Toroidal across octaves
- Distance in the 3D model approximates perceived closeness between pitches
- Pitch, chord and key can all map to the same space
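
A minimal sketch of the helix, assuming the usual Spiral Array parameterization in which pitches are indexed along the line of fifths and each step is a quarter turn. The radius and the rise per step below are assumed values, not taken from the slides.

```python
import math

R = 1.0                    # assumed helix radius
H = math.sqrt(2.0 / 15.0)  # assumed rise per step along the line of fifths


def pitch_position(k):
    """3D position of the pitch k fifths above C (0 = C, 1 = G, 4 = E, ...)."""
    angle = k * math.pi / 2.0  # one quarter turn per fifth
    return (R * math.sin(angle), R * math.cos(angle), k * H)


def distance(a, b):
    """Euclidean distance between two points in the model."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

With these assumed parameters, C is closer to G (a fifth) and E (a major third) than to D (a whole tone) or F# (a tritone), which matches the idea that distance in the model tracks perceived closeness.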

Chords in the Spiral Array
- Standard chords are based on three supporting tones
- These create triangles in 3D relative to the model
- Triangles are effectively continuous, as pitch is
- Major and minor chords’ centers thus form helixes
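
One way to picture a chord's center is as a weighted point inside the triangle of its three tones. The sketch below assumes the quarter-turn-per-fifth helix and uses made-up weights (root, fifth, third); the weights, radius and rise are all illustrative assumptions.

```python
import math

R, H = 1.0, math.sqrt(2.0 / 15.0)  # assumed helix radius and rise per fifth
CHORD_WEIGHTS = (0.5, 0.3, 0.2)    # assumed weights: root, fifth, third


def pitch_position(k):
    """Position of the pitch k steps along the line of fifths (0 = C)."""
    angle = k * math.pi / 2.0
    return (R * math.sin(angle), R * math.cos(angle), k * H)


def weighted_point(points, weights):
    """Weighted combination of 3D points (weights are expected to sum to 1)."""
    return tuple(sum(w * p[i] for p, w in zip(points, weights)) for i in range(3))


def major_chord(k):
    """Center of the major triad on root k: root, fifth (k+1), major third (k+4)."""
    tones = [pitch_position(k), pitch_position(k + 1), pitch_position(k + 4)]
    return weighted_point(tones, CHORD_WEIGHTS)


def minor_chord(k):
    """Center of the minor triad on root k: root, fifth (k+1), minor third (k-3)."""
    tones = [pitch_position(k), pitch_position(k + 1), pitch_position(k - 3)]
    return weighted_point(tones, CHORD_WEIGHTS)
```

Because every chord center is the same combination shifted along the pitch helix, the centers trace a smaller helix inside it.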

Key in the Spiral Array
- Simple keys are based on three supporting chords
- These create triangles in 3D, based on the supporting chords’ triangular centers
- Triangles are effectively continuous, as chords are
- Major and minor keys’ centers thus form helixes

Center of Effect
- Center of Effect (CE)
  - Relative location of a chord based on its supporting tones
- Notes of different strength change the CE location
  - Complex chords’ CEs will not line up exactly on the model
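
Read as a weighted centroid, the CE can be sketched in a few lines. The function name and the choice to normalize by the total strength are assumptions made for illustration.

```python
def center_of_effect(points, strengths):
    """Weighted centroid of 3D points; stronger notes pull the CE toward them."""
    total = sum(strengths)
    if total <= 0:
        raise ValueError("at least one strength must be positive")
    return tuple(
        sum(w * p[i] for p, w in zip(points, strengths)) / total
        for i in range(3)
    )
```

For two notes at `(0, 0, 0)` and `(2, 0, 0)`, equal strengths put the CE at the midpoint, while strengths of 3 and 1 move it to `(0.5, 0.0, 0.0)`.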

Center of Effect Generator (CEG) Key-Finding
- Center of Effect relates the positions of multiple pitches in the model
- The key spatially closest to the CE is the most likely key
  - Correlates input music to standard key structure

Helping Visualize the CEG Algorithm
- Keys exist as a triangle in 3-space
- Keys’ centers of effect make up two helixes in the 3D model
- In standard intonation, keys are discrete (12 minor, 12 major)

Helping Visualize the CEG Algorithm
- From a complex audio signal, weighted values are calculated for bins on each discrete tone
- The weighted values approximate the current key’s location on the model
- The spatially closest key is the most likely match

CEG Key-Finding Algorithm
- Pitch detection
  - Extract pitch class and strength from the signal
- Key finding
  - Nearest-neighbor search in the Spiral Array

Fast Fourier Transform
- Efficient algorithm to compute the Discrete Fourier Transform
  - O(n log n) vs O(n^2)
- Transforms a function into its frequency-domain representation
- Widely used across many fields
  - Solving partial differential equations
  - Data compression
  - Polynomial multiplication
  - Spectral analysis
  - Frequency bands
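
The complexity gap can be seen by putting a direct DFT next to a recursive radix-2 FFT. This is a teaching sketch using only the standard library, restricted to power-of-two lengths; it is not the implementation used in the research.

```python
import cmath


def dft(x):
    """Direct Discrete Fourier Transform: O(n^2) multiply-adds."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def fft(x):
    """Recursive radix-2 FFT: O(n log n). Length must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])  # transform of even-indexed samples
    odd = fft(x[1::2])   # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out
```

Both functions return the same spectrum; the FFT gets there by reusing the two half-length transforms at every level of recursion.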

Algorithm for Pitch Class/Strength from FFT
For each frequency spectrum in a 0.37-second period:
1. For each frequency band, find the peak value
2. For each pitch class k, its strength at time j, F_jk, is the sum of all peak values for that frequency band (and others related by octaves)
3. Normalize
   1. Divide all pitch-strength values by the largest
   2. Divide all pitch-strength values by their sum (k = 0, 1, …, 11)
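
The three steps above can be sketched for a single spectrum as follows. Several details are assumptions, not taken from the slide: each "frequency band" is taken to be the semitone nearest to a bin (equal temperament, A4 = 440 Hz, MIDI-style numbering), the input is a list of bin frequencies and magnitudes, and the two normalizations are offered as alternatives through a hypothetical `by` parameter.

```python
import math

A4_HZ = 440.0  # assumed reference tuning
A4_NOTE = 69   # MIDI-style number for A4, so that note % 12 == 0 means C


def nearest_semitone(freq_hz):
    """Semitone band (MIDI-style note number) closest to a frequency."""
    return round(A4_NOTE + 12.0 * math.log2(freq_hz / A4_HZ))


def pitch_class_strengths(freqs_hz, magnitudes, by="sum"):
    """Return 12 normalized pitch-class strengths for one spectrum."""
    # Step 1: peak magnitude inside each semitone-wide band.
    band_peak = {}
    for f, m in zip(freqs_hz, magnitudes):
        if f <= 0:
            continue  # skip the DC bin
        band = nearest_semitone(f)
        band_peak[band] = max(band_peak.get(band, 0.0), m)
    # Step 2: sum the peaks of bands related by octaves.
    strengths = [0.0] * 12
    for band, peak in band_peak.items():
        strengths[band % 12] += peak
    # Step 3: normalize by the largest value or by the sum.
    scale = sum(strengths) if by == "sum" else max(strengths)
    if scale == 0:
        return strengths
    return [s / scale for s in strengths]
```

With `by="sum"` the twelve strengths add up to 1, which makes them usable directly as weights for a center of effect.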

CEG Key-Finding Algorithm
- Pitch detection
  - Extract pitch class and strength from the signal
- Key finding
  - Nearest-neighbor search in the Spiral Array

CEG Algorithm
For the pitch classes and strengths from each 0.37 seconds:
1. Assign pitch names to pitch classes:
   1. Generate the CE for the previous 5 seconds; and
   2. Assign pitch names to the current pitch classes by nearest-neighbor search in Spiral Array space
2. Determine the key based on pitch names:
   1. Generate the cumulative CE from the beginning to the current time
   2. Perform a nearest-neighbor search to find the closest key
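
The key-determination step (step 2) can be sketched as a nearest-neighbor search over key centers. This is a simplified illustration under stated assumptions: pitch names are already assigned and given as line-of-fifths indices, only major keys are searched, one assumed weight triple is reused for chords and keys, and the helix radius and rise are assumed values.

```python
import math

R, H = 1.0, math.sqrt(2.0 / 15.0)  # assumed helix radius and rise per fifth
W = (0.5, 0.3, 0.2)                # assumed weights, reused for chords and keys
# Major-key names for line-of-fifths indices -6 .. 6 (0 = C).
KEY_NAMES = ["Gb", "Db", "Ab", "Eb", "Bb", "F", "C", "G", "D", "A", "E", "B", "F#"]


def pitch(k):
    """Position of the pitch k steps along the line of fifths."""
    a = k * math.pi / 2.0
    return (R * math.sin(a), R * math.cos(a), k * H)


def combine(points, weights):
    """Weighted centroid of 3D points."""
    total = sum(weights)
    return tuple(
        sum(w * p[i] for p, w in zip(points, weights)) / total for i in range(3)
    )


def major_chord(k):
    """Chord center from root, fifth and major third."""
    return combine([pitch(k), pitch(k + 1), pitch(k + 4)], W)


def major_key(k):
    """Key center from tonic, dominant and subdominant chords."""
    return combine([major_chord(k), major_chord(k + 1), major_chord(k - 1)], W)


def nearest_major_key(strengths):
    """strengths: dict mapping line-of-fifths index -> cumulative strength."""
    ks = list(strengths)
    ce = combine([pitch(k) for k in ks], [strengths[k] for k in ks])

    def dist2(k):
        return sum((a - b) ** 2 for a, b in zip(ce, major_key(k)))

    best = min(range(-6, 7), key=dist2)  # brute-force nearest neighbor
    return KEY_NAMES[best + 6]
```

As a usage example, `nearest_major_key({k: 1.0 for k in range(-1, 6)})` feeds in the seven notes F, C, G, D, A, E, B with equal strength. A fuller version would also search minor keys and update the CE every 0.37 seconds.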

BIBLIOGRAPHY
- Chuan, C. and Chew, E. Polyphonic Audio Key Finding Using the Spiral Array CEG Algorithm. IEEE International Conference on Multimedia & Expo, 2005.
- Chew, E. Towards a Mathematical Model of Tonality. Doctoral dissertation, MIT, 2000.

Questions?