Melody Recognition with Learned Edit Distances Amaury Habrard Laboratoire d’Informatique Fondamentale CNRS Université Aix-Marseille José Manuel Iñesta,

Presentation transcript:

Melody Recognition with Learned Edit Distances
Amaury Habrard, Laboratoire d’Informatique Fondamentale, CNRS, Université Aix-Marseille
José Manuel Iñesta, David Rizo, Dept. Software and Computing Systems, Universidad de Alicante
Marc Sebban, Laboratoire Hubert Curien, UMR CNRS 5516

Outline
- Introduction
- Symbolic music encoding
- Edit distance and learning
- Experiments
- Conclusions

Introduction: task
Given an interpretation of a monophonic melody, retrieve the name of the song from a set of known prototypes using a similarity measure.
[Figure: a query q ("Which song?") is compared against the prototypes p_1, p_2, ..., p_n, with example distances d(q, p_1) = 9, d(q, p_2) = 5, d(q, p_n) = 7]
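The retrieval task can be sketched as a nearest-prototype search. This is only an illustration: the function names, the song names, and the toy distance `mismatch_distance` are assumptions made for the example, not part of the presentation, and any melody distance could be plugged in instead.

```python
from typing import Callable, Dict, Sequence, Tuple


def retrieve_song(
    query: Sequence[int],
    prototypes: Dict[str, Sequence[int]],
    distance: Callable[[Sequence[int], Sequence[int]], float],
) -> Tuple[str, float]:
    """Return the name of the prototype closest to the query, with its distance."""
    scored = [(distance(query, melody), name) for name, melody in prototypes.items()]
    best_distance, best_name = min(scored)
    return best_name, best_distance


def mismatch_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Toy placeholder distance (hypothetical): position-wise mismatches plus length gap."""
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches + abs(len(a) - len(b))


# Hypothetical prototypes, encoded as sequences of pitch symbols.
prototypes = {
    "song_a": [7, 7, 9, 7, 0, 11],
    "song_b": [0, 0, 7, 7, 9, 9, 7],
    "song_c": [4, 2, 0, 2, 4, 4, 4],
}
query = [7, 7, 9, 7, 0, 9]
print(retrieve_song(query, prototypes, mismatch_distance))
```

The quality of the answer depends entirely on the distance passed in, which is why the rest of the talk focuses on choosing and learning it.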

Approaches to solve the task
- Geometric methods [Aloupis et al. 2006, Lemström et al. 2003]
- Language models [Doraisamy et al. 2003]
- Edit distance-based methods [Mongeau and Sankoff 1990, Rizo and Iñesta 2003, Grachten et al. 2005]: these methods seem to perform the best

Edit distance between strings
- String coding of a melody: different encodings are possible for pitch and rhythm
- E.g. intervals from the tonic, mod 12, in C major: 7, 7, 9, 7, 0, 11, 7, 7
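As a small illustration of the "intervals from tonic mod 12" coding, the sketch below assumes the melody is given as MIDI note numbers (an assumption; the slides do not say which input format was used). The function name is hypothetical.

```python
from typing import List, Sequence


def encode_intervals_from_tonic(midi_pitches: Sequence[int], tonic_midi: int) -> List[int]:
    """Encode each note as its interval in semitones above the tonic, modulo 12."""
    return [(pitch - tonic_midi) % 12 for pitch in midi_pitches]


# G G A G C B G G as MIDI numbers, with C (60) as the tonic of C major.
melody = [67, 67, 69, 67, 72, 71, 67, 67]
print(encode_intervals_from_tonic(melody, tonic_midi=60))
# → [7, 7, 9, 7, 0, 11, 7, 7]
```

Because of the modulo, the coding is invariant to octave, and because it is relative to the tonic, it is invariant to transposition.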

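Similarity between musical sequences
Definition: an edit script $e = e_1 \dots e_n$ is a sequence of edit operations $e_i = (b_i \mid a_i)$ that transforms an input $X$ into an output $Y$, with $(a_i, b_i) \in (\Sigma \cup \{\lambda\}) \times (\Sigma \cup \{\lambda\})$.
Cost of an edit script: $\pi(e) = \sum_{i=1}^{n} c(e_i)$, where $c(e_i)$ is the cost of operation $e_i$.
Distance between $X$ and $Y$: $d(X, Y) = \min_{e \in S(X,Y)} \pi(e)$, where $S(X, Y)$ is the set of edit scripts transforming $X$ into $Y$.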
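The minimum over edit scripts is usually computed by dynamic programming. The sketch below is the standard textbook recurrence, not code from the presentation. The `cost(a, b)` callback and the use of `None` for the empty symbol λ are assumptions of this example. The recurrence edits each symbol at most once, which matches the minimum over all scripts when the costs behave like a metric.

```python
from typing import Callable, Optional, Sequence

LAMBDA = None  # stands for the empty symbol λ (assumption of this sketch)
Symbol = Optional[int]


def edit_distance(
    x: Sequence, y: Sequence, cost: Callable[[Symbol, Symbol], float]
) -> float:
    """Cheapest way to turn x into y; cost(a, b) is the price of rewriting a as b."""
    n, m = len(x), len(y)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):  # delete every symbol of x
        d[i][0] = d[i - 1][0] + cost(x[i - 1], LAMBDA)
    for j in range(1, m + 1):  # insert every symbol of y
        d[0][j] = d[0][j - 1] + cost(LAMBDA, y[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j] + cost(x[i - 1], LAMBDA),        # deletion
                d[i][j - 1] + cost(LAMBDA, y[j - 1]),        # insertion
                d[i - 1][j - 1] + cost(x[i - 1], y[j - 1]),  # substitution or match
            )
    return d[n][m]


def unit_cost(a: Symbol, b: Symbol) -> float:
    """Unit costs: every insertion, deletion, or substitution costs 1."""
    return 0.0 if a == b else 1.0


print(edit_distance([7, 7, 9, 7, 0, 11, 7, 7], [7, 7, 9, 7, 0, 9, 7], unit_cost))
```

Replacing `unit_cost` with a lookup in a learned cost matrix gives the weighted distance without changing the recurrence.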

Tree representation
- Based on the logarithmic nature of music notation
- Each tree level is a subdivision of the upper level: whole (4 beats), half (2 + 2), quarter (4 × 1), eighth (8 × ½)
- Leaf labels can be any pitch magnitude
- Rests are coded the same way as notes
- Duration is implicitly coded in the tree structure
- Finally, labels are propagated bottom-up, selecting the most important note

Similarity of melodies
- Use tree-edit distance
- Currently using the Selkow distance
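For readers unfamiliar with it, the Selkow distance restricts tree editing to relabeling nodes and inserting or deleting whole subtrees among siblings. The following is a minimal sketch with unit costs. The `Node` class and function names are assumptions of this example, and the presentation's actual implementation and costs are not shown in the slides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    label: int
    children: List["Node"] = field(default_factory=list)


def size(tree: Node) -> int:
    """Number of nodes in the subtree; with unit costs, the price of inserting or deleting it."""
    return 1 + sum(size(child) for child in tree.children)


def selkow(t1: Node, t2: Node) -> int:
    """Selkow tree edit distance with unit costs (illustrative sketch)."""
    a, b = t1.children, t2.children
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        d[i][0] = d[i - 1][0] + size(a[i - 1])       # delete whole subtrees
    for j in range(1, len(b) + 1):
        d[0][j] = d[0][j - 1] + size(b[j - 1])       # insert whole subtrees
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d[i][j] = min(
                d[i - 1][j] + size(a[i - 1]),
                d[i][j - 1] + size(b[j - 1]),
                d[i - 1][j - 1] + selkow(a[i - 1], b[j - 1]),  # edit one subtree into the other
            )
    relabel = 0 if t1.label == t2.label else 1
    return relabel + d[len(a)][len(b)]


t1 = Node(0, [Node(7, [Node(7), Node(9)]), Node(0)])
t2 = Node(0, [Node(7, [Node(7), Node(7)]), Node(0)])
print(selkow(t1, t2))
```

The children of two nodes are aligned with the same recurrence as the string edit distance, with the recursive tree distance playing the role of the substitution cost.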

Musical meaning of edit costs
- Editing costs depend on the note and its context in terms of music knowledge
- E.g. deleting a non-diatonic note has a lower cost than deleting a note of the scale
- Goal: find the best editing cost matrix
- Current approaches in music applications:
  - Unit costs: do not reflect musical concepts
  - Costs fixed manually, based on musical expertise
  - Brute force: long learning time
  - Genetic systems: the resulting costs are hard to interpret
- Our proposal: a probabilistic framework that learns a matrix of edit probabilities and keeps understandability

Stochastic Edit Similarity (1/3)
- Learning the parameters of an edit distance requires the use of an inductive principle
- In the context of probabilistic machines, the maximization of the likelihood is often used
- Solution: learn the edit parameters in a probabilistic framework

Stochastic Edit Similarity (2/3)
Definition: given an edit script $e = e_1 \dots e_n$, a sequence of edit operations $e_i = (b_i \mid a_i)$ that transforms an input $X$ into an output $Y$, a probabilistic edit script has a probability $\pi_s(e)$ such that: $\pi_s(e) = \prod_{i=1}^{n} p(e_i)$

Stochastic Edit Similarity (3/3)
Definition: $p(Y \mid X)$ is the sum of the probabilities of all edit scripts transforming $X$ into $Y$: $p(Y \mid X) = \sum_{e \in S(Y \mid X)} \pi_s(e)$, where $S(Y \mid X)$ is the set of all scripts that enable the emission of $Y$ given $X$.
How to learn $p(e_i)$? With a stochastic conditional transducer:
- Strings: Oncina and Sebban (2006)
- Trees: Bernard, Habrard and Sebban (2006)
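Summing over all edit scripts does not require enumerating them: a forward recursion over prefixes of X and Y accumulates the same total. The sketch below assumes a memoryless model in which every operation has a fixed probability stored in a dictionary keyed by (input symbol, output symbol), with `None` standing for λ. The optional `p_end` termination factor is also an assumption of this example. Learning the probabilities themselves (the cited transducer-based methods) is not shown.

```python
from typing import Dict, Optional, Sequence, Tuple

LAMBDA = None  # stands for the empty symbol λ (assumption of this sketch)
EditProbs = Dict[Tuple[Optional[int], Optional[int]], float]


def conditional_probability(
    x: Sequence[int], y: Sequence[int], p: EditProbs, p_end: float = 1.0
) -> float:
    """Sum of the probabilities of all edit scripts turning x into y (forward recursion)."""
    n, m = len(x), len(y)
    alpha = [[0.0] * (m + 1) for _ in range(n + 1)]
    alpha[0][0] = 1.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0:  # last operation deletes x[i-1]
                alpha[i][j] += alpha[i - 1][j] * p.get((x[i - 1], LAMBDA), 0.0)
            if j > 0:  # last operation inserts y[j-1]
                alpha[i][j] += alpha[i][j - 1] * p.get((LAMBDA, y[j - 1]), 0.0)
            if i > 0 and j > 0:  # last operation rewrites x[i-1] as y[j-1]
                alpha[i][j] += alpha[i - 1][j - 1] * p.get((x[i - 1], y[j - 1]), 0.0)
    return alpha[n][m] * p_end


# Hypothetical probabilities for a two-symbol example.
probs = {(7, 9): 0.2, (7, LAMBDA): 0.1, (LAMBDA, 9): 0.05}
print(conditional_probability([7], [9], probs))
```

In the example, three scripts turn [7] into [9]: one substitution (0.2), delete then insert (0.1 × 0.05), and insert then delete (0.05 × 0.1), for a total of 0.21.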

Experiments: corpus
- 8-12 bar fragments of 20 worldwide well-known tunes
- For each song, 20 different interpretations by 5 different players
- Total: 420 prototypes, 20 classes

Experiments
Two experiments:
- Compare the classification performance of the probabilistic approach to:
  - Fixed-cost systems
  - Edit costs learned with genetic algorithms
- Compare the understandability of the learned matrix to:
  - Edit costs learned with genetic algorithms

Genetic system
- Conventional genetic system
- Objective of the algorithm: find the best editing costs to be used in the classical edit distance
- Costs represented by a 13×13 matrix: 13 = 12 (pitches, |Σ|) + 1 (λ)
- No (λ, λ) operation: 13 × 13 − 1 = 168 editing costs
- Individuals are represented by chromosomes with 168 genes
- Genes ∈ [0, 2] ⊂ ℝ
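The chromosome layout can be illustrated by decoding a flat list of 168 genes into a cost table. The row-major gene ordering, the use of `None` for λ, and the function name are assumptions of this sketch. The slides only state the matrix size, the excluded (λ, λ) cell, and the gene range.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# 12 pitch symbols plus the empty symbol λ, represented here as None (assumption).
SYMBOLS: List[Optional[int]] = list(range(12)) + [None]


def decode_chromosome(
    genes: Sequence[float],
) -> Dict[Tuple[Optional[int], Optional[int]], float]:
    """Map a 168-gene chromosome to edit costs indexed by (source, target) symbol."""
    pairs = [
        (a, b)
        for a in SYMBOLS
        for b in SYMBOLS
        if not (a is None and b is None)  # there is no (λ, λ) operation
    ]
    if len(genes) != len(pairs):
        raise ValueError(f"expected {len(pairs)} genes, got {len(genes)}")
    if any(not 0.0 <= gene <= 2.0 for gene in genes):
        raise ValueError("every gene must lie in [0, 2]")
    return dict(zip(pairs, genes))


costs = decode_chromosome([1.0] * 168)
print(len(costs), costs[(7, None)])
```

A table decoded this way can be wrapped in a cost function and passed to an edit distance, which is how each individual would be scored by the fitness function.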

Genetic system: fitness function
- Fitness: average precision-at-n over all prototypes of the learning set, using leave-one-out
- Precision-at-n: number of relevant hits among the n closest melodies, n being the number of prototypes in the learning set with the same class as the query
- Classification using 1-NN
- JGAP library used, with default setup
- Stop criterion: fitness function stabilization (around 100 generations)
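A possible reading of this fitness function is sketched below. Two details are assumptions, since the slide does not spell them out: the hit count is divided by n to obtain a value in [0, 1], and under leave-one-out the query itself is excluded when counting n. The function name and toy data are hypothetical.

```python
from typing import Any, Callable, List, Tuple


def average_precision_at_n(
    items: List[Tuple[str, Any]], distance: Callable[[Any, Any], float]
) -> float:
    """Leave-one-out average of precision-at-n over a labeled set of (class, melody) pairs."""
    total, counted = 0.0, 0
    for query_index, (query_label, query_melody) in enumerate(items):
        # Leave the query out and rank every other prototype by distance.
        others = [
            (distance(query_melody, melody), label)
            for index, (label, melody) in enumerate(items)
            if index != query_index
        ]
        n = sum(1 for _, label in others if label == query_label)
        if n == 0:
            continue  # no other prototype of this class: precision is undefined
        others.sort(key=lambda pair: pair[0])
        hits = sum(1 for _, label in others[:n] if label == query_label)
        total += hits / n
        counted += 1
    return total / counted if counted else 0.0


# Toy data: "melodies" are plain numbers and the distance is their absolute difference.
well_separated = [("a", 0), ("a", 1), ("b", 10), ("b", 11)]
print(average_precision_at_n(well_separated, lambda u, v: abs(u - v)))
```

In a genetic search, this value would be computed with the edit distance induced by each chromosome's costs, so better cost matrices rank same-class melodies closer to the query.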

Experiment 1: Melody classification accuracy
- Goal: identify a melody given an interpretation query
- Strings and trees
- 3-fold cross-validation: train with 2 folds, test with the remaining fold using leave-one-out
- Classified using 1-NN

Experiment 1: Melody classification accuracy
Both learned schemes improve success rates over fixed-cost systems; the improvement is more evident on trees.

Experiment 2: Analysis of the Learned Edit Matrices
[Figure: learned edit cost matrices for the genetic and stochastic systems, on strings and trees]
By changing the contrast of the image, the pattern of costs resembles the piano keyboard.

Tree costs learned

Conclusions
- Algorithms for edit distance learning were applied; they improve results over fixed-cost approaches
- Probabilistic and genetic approaches were compared
- The probabilistic approach is better suited to explaining musical variations and/or noise by means of cost matrices
- With a bigger corpus, the genetic approach might generate more understandable cost matrices, but the probabilistic approach would also improve

Future work
- Consider more pitch representations, such as contour, high-definition contour, pitch classes, and intervals, to evaluate their adequacy for the melody matching problem
- Work on polyphonic music by means of trees [Rizo et al. 2008]