Harmonically Informed Multi-pitch Tracking. Zhiyao Duan, Jinyu Han and Bryan Pardo. EECS Dept., Northwestern Univ., Interactive Audio Lab


Harmonically Informed Multi-pitch Tracking
Zhiyao Duan, Jinyu Han and Bryan Pardo
EECS Dept., Northwestern Univ., Interactive Audio Lab
Presented at ISMIR 2009, Kobe, Japan.

The Multi-pitch Tracking Task
Given polyphonic music played by several monophonic harmonic instruments, estimate a pitch trajectory for each instrument.

Potential Applications
Automatic music transcription
Harmonic source separation
Other applications
–Melody-based music search
–Chord recognition
–Music education
–…

The 2-stage Standard Approach
Stage 1: Multi-pitch Estimation (MPE) in each single frame
Stage 2: Connect pitch estimates across frames into pitch trajectories

State of the Art
How far has existing work gone?
–MPE is not very robust
–Short pitch trajectories (within a single note) are formed according to local time-frequency proximity of pitch estimates
Our contribution
–A new MPE algorithm
–A constrained clustering approach to estimate pitch trajectories across multiple notes

System Overview

Multi-pitch Estimation in a Single Frame
A maximum likelihood estimation method
–Observation: the power spectrum, split into peaks & the non-peak region
–F0 hypothesis: a set of F0s
–Best F0 estimate: the set of F0s that maximizes the likelihood of the observed spectrum

Likelihood Definition
Two terms:
–Likelihood of observing these peaks
–Likelihood of not having any harmonics in the non-peak (NP) region
When the F0 hypothesis matches the true F0, both terms are large; for a wrong hypothesis, they are small.
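The slides do not give the likelihood in closed form; a toy scoring function in the same spirit (observed peaks explained by hypothesized harmonics raise the score, harmonics predicted in the non-peak region lower it) can be sketched as follows. The function name, the weights, and the 3% matching tolerance are all illustrative assumptions, not the paper's actual model:

```python
import numpy as np

def f0_set_score(f0s, peak_freqs, max_harm=10, tol=0.03):
    """Toy score for a set of F0 hypotheses against observed spectral peaks.

    A peak explained by some harmonic of some F0 raises the score (the
    "peak likelihood" term); a predicted harmonic with no nearby peak
    lowers it (the "non-peak region" term). Constants are invented.
    """
    peak_freqs = np.asarray(peak_freqs, dtype=float)
    score = 0.0
    explained = np.zeros(len(peak_freqs), dtype=bool)
    for f0 in f0s:
        for h in range(1, max_harm + 1):
            fh = h * f0
            rel = np.abs(peak_freqs - fh) / fh   # relative distance to each peak
            j = int(np.argmin(rel))
            if rel[j] < tol:
                if not explained[j]:
                    score += 1.0                 # peak explained by a harmonic
                    explained[j] = True
            else:
                score -= 0.5                     # harmonic falls in the non-peak region
    score -= 1.0 * np.sum(~explained)            # unexplained peaks are unlikely
    return score
```

Under this sketch, the correct F0 for a harmonic peak series outscores an unrelated candidate, which is the property the maximum likelihood search relies on.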


Pitch Trajectory Formation
How to form pitch trajectories?
–View it as a constrained clustering problem!
We use two clustering cues
–Global timbre consistency
–Local time-frequency locality

Global Timbre Consistency
Objective function
–Minimize intra-cluster distance
Harmonic structure feature
–Normalized relative amplitudes of harmonics
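A minimal sketch of the two ingredients above, assuming a simple nearest-component lookup for harmonic amplitudes; `harmonic_feature` and `intra_cluster_distance` are hypothetical names, and the paper's feature may use relative (log) amplitudes rather than plain L2 normalization:

```python
import numpy as np

def harmonic_feature(f0, freqs, amps, n_harm=8):
    """Amplitudes at the first n_harm harmonics of f0, L2-normalized."""
    freqs = np.asarray(freqs, dtype=float)
    amps = np.asarray(amps, dtype=float)
    feat = np.empty(n_harm)
    for h in range(1, n_harm + 1):
        j = int(np.argmin(np.abs(freqs - h * f0)))  # nearest spectral component
        feat[h - 1] = amps[j]
    norm = np.linalg.norm(feat)
    return feat / norm if norm > 0 else feat

def intra_cluster_distance(features, labels):
    """Sum of squared distances of each feature to its cluster mean."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    total = 0.0
    for c in np.unique(labels):
        X = features[labels == c]
        total += float(np.sum((X - X.mean(axis=0)) ** 2))
    return total
```

Pitches produced by the same instrument should yield similar harmonic-structure features, so grouping them minimizes this objective.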

Local Time-frequency Locality
Constraints
–Must-link: similar pitches in adjacent frames
–Cannot-link: simultaneous pitches
Finding a feasible clustering is NP-hard!

Our Constrained Clustering Process
1) Find an initial clustering
–Label pitches according to pitch order in each frame: first, second, third, fourth
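Step 1 can be sketched as a per-frame sort (`initial_labels` is a hypothetical name; the input is one list of pitch values per frame):

```python
def initial_labels(frames):
    """Label the pitches in each frame by ascending pitch order.

    frames: list of per-frame pitch lists (Hz).
    Returns one label list per frame: 0 = lowest pitch in that frame,
    1 = second lowest, and so on, in the frame's original order.
    """
    labels = []
    for pitches in frames:
        order = sorted(range(len(pitches)), key=lambda i: pitches[i])
        lab = [0] * len(pitches)
        for rank, i in enumerate(order):
            lab[i] = rank
        labels.append(lab)
    return labels
```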

Our Constrained Clustering Process
2) Define constraints
–Must-link: similar pitches in adjacent frames and in the same initial cluster: a notelet
–Cannot-link: simultaneous notelets

Our Constrained Clustering Process
3) Update clusters to minimize the objective function
–Swap set: a set of notelets in two clusters connected by cannot-links
–Swap the notelets in a swap set between the two clusters if it reduces the objective function
–Iteratively traverse all swap sets
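Step 3 can be illustrated with a greedy simplification that swaps one cannot-linked notelet pair at a time (the actual algorithm traverses whole swap sets); features are one timbre vector per notelet, and all names are assumptions:

```python
import numpy as np

def refine_by_swaps(features, labels, cannot_links, n_iter=10):
    """Greedy label refinement: for each cannot-linked notelet pair in
    different clusters, swap their labels if that lowers the total
    intra-cluster distance. A pairwise simplification of the swap-set idea.
    """
    features = np.asarray(features, dtype=float)
    labels = list(labels)

    def objective(lab):
        lab = np.asarray(lab)
        total = 0.0
        for c in set(lab.tolist()):
            X = features[lab == c]
            total += float(np.sum((X - X.mean(axis=0)) ** 2))
        return total

    best = objective(labels)
    for _ in range(n_iter):
        improved = False
        for a, b in cannot_links:
            if labels[a] == labels[b]:
                continue
            labels[a], labels[b] = labels[b], labels[a]   # tentative swap
            cand = objective(labels)
            if cand < best - 1e-12:
                best = cand
                improved = True
            else:                                          # revert if no gain
                labels[a], labels[b] = labels[b], labels[a]
        if not improved:
            break
    return labels, best
```

Because swaps keep exactly one notelet from the pair in each cluster, the cannot-link constraints remain satisfied throughout.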

Data Set
–10 J.S. Bach chorales (quartets, played by violin, clarinet, saxophone and bassoon)
–Each instrument is recorded individually, then mixed
Ground-truth pitch trajectories
–Obtained by running YIN on the monophonic tracks before mixing

Experimental Results (mean ± std, precision % and recall %)
–How many pitches are correctly estimated? Klapuri (ISMIR) vs. ours
–How many pitches are correctly estimated and put into the correct trajectory? Chance (approx. 0.0) vs. ours
–How many notes are correctly estimated? Chance (approx. 0.0) vs. ours
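Frame-level precision and recall of the kind reported above can be sketched as follows; this is a simplified illustration with an invented 0.5-semitone matching tolerance, not necessarily the paper's exact criterion:

```python
import math

def pitch_precision_recall(est_frames, ref_frames, semitone_tol=0.5):
    """Frame-level precision/recall for multi-pitch estimates.

    An estimated pitch matches at most one unmatched reference pitch
    within semitone_tol semitones in the same frame.
    """
    tp = n_est = n_ref = 0
    for est, ref in zip(est_frames, ref_frames):
        n_est += len(est)
        n_ref += len(ref)
        used = [False] * len(ref)
        for p in est:
            for j, q in enumerate(ref):
                if not used[j] and abs(12 * math.log2(p / q)) < semitone_tol:
                    used[j] = True
                    tp += 1
                    break
    precision = tp / n_est if n_est else 0.0
    recall = tp / n_ref if n_ref else 0.0
    return precision, recall
```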

Ground-truth Pitch Trajectories
J.S. Bach, “Ach lieben Christen, seid getrost”

Our System’s Output
J.S. Bach, “Ach lieben Christen, seid getrost”

Conclusion
Our multi-pitch tracking system
–Multi-pitch estimation in a single frame
 Estimate F0s by modeling the peaks and the non-peak region
 Estimate the polyphony, refine the F0 estimates
–Pitch trajectory formation
 Constrained clustering
 –Objective: timbre (harmonic structure) consistency
 –Constraints: local time-frequency locality of pitches
 A clustering algorithm that swaps labels between clusters
Results on music recordings are promising

Thank you! Q & A

Possible Questions
How much does our constrained clustering algorithm improve over the initial pitch trajectories (pitches labeled by pitch order)?