1 Robust Temporal and Spectral Modeling for Query By Melody Shai Shalev, Hebrew University Yoram Singer, Hebrew University Nir Friedman, Hebrew University.

Presentation transcript:

1 Robust Temporal and Spectral Modeling for Query By Melody Shai Shalev, Hebrew University Yoram Singer, Hebrew University Nir Friedman, Hebrew University Shlomo Dubnov, Ben-Gurion University

2 Prelude

3 Problem Setting. Database: real recordings. Query: a melody. Find: performances of the queried melody.

4 Challenge. Find performances of the queried melody independent of: tempo, performing instrument, dynamics, expression, accompaniment.

5 Related Work. A. Ghias, et al., “Query by humming”; A. S. Durey and M. A. Clements, “Melody spotting using hidden Markov models”; C. Raphael, “Automatic segmentation of acoustic musical signals using HMMs”; B. Doval and X. Rodet, “Fundamental frequency estimation using a new harmonic matching method”.

6 Overview of Solution. Employ a statistical framework. Align a melody to a performance using explicit tempo modeling. Employ a maximum likelihood model for the spectrum of a note given the note’s pitch value. Find the best alignment of a melody to a performance using dynamic programming.

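7 Statistical Framework [Diagram: a query engine takes a database of real recordings and a melody query as input; for each recording it evaluates a score (formula not preserved) and outputs a ranked list ordered according to that score]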

8 Melody Modeling [Figure: graphical model over the variables Melody, Tempo, Aligned Melody and Sound; the legend distinguishes hidden variables from observed variables]

9 Tempo Modeling. Sequence of scaling factors (one per note). Model tempo as a first-order Markov model. Use a log-normal distribution to model the conditional probability of the tempo.
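A minimal sketch of the tempo prior described on this slide, assuming that the log of each scaling factor is normally distributed around the log of the previous one. The function names and the spread parameter `sigma` are illustrative assumptions, not values taken from the presentation.

```python
import math

def log_tempo_transition(prev_scale, scale, sigma=0.2):
    """Log-density of `scale` given `prev_scale` under a log-normal model.

    Assumption: log(scale) ~ Normal(log(prev_scale), sigma**2).
    """
    if prev_scale <= 0 or scale <= 0 or sigma <= 0:
        raise ValueError("scales and sigma must be positive")
    z = (math.log(scale) - math.log(prev_scale)) / sigma
    # Log-normal density: normal density of log(scale), divided by scale.
    return -0.5 * z * z - math.log(scale * sigma * math.sqrt(2.0 * math.pi))

def log_tempo_sequence(scales, sigma=0.2):
    """Log-probability of a tempo sequence as a first-order Markov chain.

    The first scaling factor is treated as given, so no prior term is added for it.
    """
    return sum(log_tempo_transition(a, b, sigma)
               for a, b in zip(scales, scales[1:]))
```

Under this assumption a performance that holds a steady tempo scores higher than one whose scaling factor jumps from note to note, which is the behaviour a tempo prior is meant to encode.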

10 Spectral Modeling

11 Spectral Modeling

12 Spectral Modeling (cont.)

13 Spectral Modeling (cont.) Estimate the amplitude of each harmonic and the global variance of the noise using the maximum likelihood principle. Resulting signal-to-noise likelihood function: (formula not preserved in the transcript).
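The maximum likelihood step named on this slide can be illustrated with a generic harmonic-plus-Gaussian-noise fit. This is a sketch under stated assumptions, not the presentation's HSN or HIN model: it assumes white Gaussian noise with a single variance, a known candidate pitch `f0`, and a fixed number of harmonics. The function name and all parameters are hypothetical.

```python
import numpy as np

def harmonic_log_likelihood(frame, f0, sample_rate, n_harmonics=5):
    """Fit harmonics of f0 to a frame and return (log-likelihood, amplitudes, noise variance).

    Assumption: frame = sum of sinusoids at h * f0 plus white Gaussian noise.
    Under that assumption, least squares gives the ML amplitudes and the mean
    squared residual gives the ML noise variance.
    """
    frame = np.asarray(frame, dtype=float)
    n = np.arange(frame.size)
    columns = []
    for h in range(1, n_harmonics + 1):
        phase = 2.0 * np.pi * h * f0 * n / sample_rate
        columns.append(np.cos(phase))
        columns.append(np.sin(phase))
    basis = np.column_stack(columns)

    coef, *_ = np.linalg.lstsq(basis, frame, rcond=None)
    residual = frame - basis @ coef
    noise_var = max(float(np.mean(residual ** 2)), 1e-12)  # floor avoids log(0)

    # Gaussian log-likelihood evaluated at the ML estimates.
    log_lik = -0.5 * frame.size * (np.log(2.0 * np.pi * noise_var) + 1.0)
    amplitudes = np.hypot(coef[0::2], coef[1::2])  # one amplitude per harmonic
    return float(log_lik), amplitudes, noise_var
```

At the ML estimates the log-likelihood depends on the data only through the residual noise variance, so a candidate pitch whose harmonics explain more of the frame's energy receives a higher score.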

14 Finding the best melody-performance alignment. Recurse over the tempo and end-time of the previous note ⇒ dynamic programming procedure. Complexity depends on: #notes, length of signal, #possible tempo values.
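The recursion over the previous note's end-time and tempo can be sketched as follows. This is an illustrative simplification, not the presentation's procedure: it assumes the melody starts at frame 0, that per-frame log-scores for each note are precomputed in `note_scores[i][t]`, and that note lengths are nominal durations (in frames) multiplied by a discrete set of tempo scaling factors. All names are hypothetical.

```python
def align_melody(note_scores, durations, tempos, log_trans):
    """Return (best score, end frame of each note) for a melody-to-signal alignment.

    note_scores[i][t]: log-score of frame t under note i (assumed precomputed).
    durations[i]: nominal length of note i in frames.
    tempos: candidate scaling factors; log_trans(prev, cur): tempo transition log-prob.
    """
    n_frames = len(note_scores[0])
    # Prefix sums make the score of any segment an O(1) lookup.
    prefix = [[0.0] for _ in note_scores]
    for acc, row in zip(prefix, note_scores):
        for value in row:
            acc.append(acc[-1] + value)

    # best[i][(end, m)] = (score, previous state) for note i ending at `end` with tempo index m.
    best = []
    for i, nominal in enumerate(durations):
        starts = {(0, None): (0.0, None)} if i == 0 else best[i - 1]
        layer = {}
        for m, scale in enumerate(tempos):
            length = max(1, round(nominal * scale))
            for (start, prev_m), (prev_score, _) in starts.items():
                end = start + length
                if end > n_frames:
                    continue
                score = prev_score + prefix[i][end] - prefix[i][start]
                if prev_m is not None:
                    score += log_trans(tempos[prev_m], scale)
                if (end, m) not in layer or score > layer[(end, m)][0]:
                    layer[(end, m)] = (score, (start, prev_m))
        best.append(layer)

    if not best[-1]:
        raise ValueError("melody does not fit in the signal")
    # Backtrack from the best final state to recover the note boundaries.
    state = max(best[-1], key=lambda k: best[-1][k][0])
    total = best[-1][state][0]
    ends = []
    for i in range(len(durations) - 1, -1, -1):
        ends.append(state[0])
        state = best[i][state][1]
    return total, ends[::-1]
```

In this sketch each layer holds at most (signal length × number of tempo values) states, and each state is extended by every tempo value, so its running time grows with the number of notes, the signal length, and the square of the number of tempo values. That bound describes the sketch only.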

15 Experimental Results. Queries: 50 melodies from opera arias (from MIDI files). Database: over 800 performances of opera arias performed by over 50 tenors with full orchestral accompaniment. Compared our variable-tempo (VT) model vs. fixed-tempo (FT) and locally-fixed-tempo (LFT) models. Compared our Harmonic with Scaled Noise (HSN) spectral model vs. Harmonic with Independent Noise (HIN) model.

16 Evaluation Measures [Figure: likelihood value plotted against the index of each performance in the ranked list (1 to 5); annotations: Oerr = 0, Cov = (value not preserved)]

17 Summary of Results. One Error of VT+HSN: 8%. Average Precision of VT+HSN: 95%. Coverage of VT+HSN: 0.21.
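For readers unfamiliar with these ranking measures, the sketch below computes one-error and average precision for a single query, using their common textbook definitions. The transcript does not preserve the presentation's own definitions, so treating them as identical is an assumption. Coverage is left out because its normalization is not given here. The input is a hypothetical list of booleans marking which ranked performances actually match the query.

```python
def one_error(ranked_relevance):
    """Return 1 if the top-ranked item is not a correct match, else 0."""
    return 0 if ranked_relevance[0] else 1

def average_precision(ranked_relevance):
    """Mean of the precision values measured at the rank of each correct match."""
    hits = 0
    total = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            total += hits / rank  # precision among the top `rank` items
    if hits == 0:
        raise ValueError("no relevant item in the ranked list")
    return total / hits
```

Averaging each function over all queries gives the kind of summary percentages reported on this slide.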

18 Results [Table: Oerr, Cov and AvgP for the FT, LFT and VT tempo models at query lengths of 5, 15 and 25 seconds, under the HIN and HSN spectral distribution models; values not preserved]

19 Precision-Recall

20 Illustration of Segmentation

21 Future Work. More data. Other genres of music. Alternative spectral distribution models using supervised learning methods. Use alignment results for separating a soloist from the accompaniment.