Presentation is loading. Please wait.

Presentation is loading. Please wait.

Melody Recognition with Learned Edit Distances Amaury Habrard Laboratoire d’Informatique Fondamentale CNRS Université Aix-Marseille José Manuel Iñesta,

Similar presentations


Presentation on theme: "Melody Recognition with Learned Edit Distances Amaury Habrard Laboratoire d’Informatique Fondamentale CNRS Université Aix-Marseille José Manuel Iñesta,"— Presentation transcript:

1 Melody Recognition with Learned Edit Distances Amaury Habrard Laboratoire d’Informatique Fondamentale CNRS Université Aix-Marseille José Manuel Iñesta, David Rizo Dept. Software and Computing Systems, Universidad de Alicante Marc Sebban Laboratoire Hubert Curien UMR CNRS 5516

2 Outline Introduction Symbolic music encoding Edit distance and learning Experiments Conclusions

3 Introduction: task Given an interpretation of a monophonic melody retrieve the name of the song from a set of known prototypes using a similarity measure p1p1 p2p2 pnpn Query q Which song? d(q, p 1 )=9 d(q, p 2 )=5 d(q, p 2 )=7

4 Approaches to solve the task Geometric methods [Alloupis et al. 2006, Lëmstrom et al. 2003] Language models [Doraisamy et al. 2003] Edit distance-based methods [Mongeau and Sankoff 1990, Rizo and Iñesta 2003, Gratchen et al. 2005] These methods seem to perform the best

5 Edit distance between strings String coding of a melody: Different possible encodings for pitch and rhythm E.g. intervals from tonic mod 12 7, 7, 9, 7, 0, 11, 7, 7 C Major

6 Similarity between musical sequences Definition Given an edit script e = e 1... e n as a sequence of edit operations e i = (b i | a i ) to transform a input data X into an output Y e i = (a i, b i ) ∈ (Σ ∪ {λ}) x (Σ ∪ {λ}) Cost of edit script Distance between X and Y d(X, Y ) = min e ∈ S(X,Y ) π(e)

7  Based on the logarithmic nature of music notation  Each tree level is a subdivision of the upper level whole4 beats half2+2 quarter 4×1 8×½8×½eighth  Leaf labels can be any pitch magnitude  Rests are coded the same way as notes  Duration is implicitly coded in the tree structure  Finally labels are propagated bottom-up selecting most important note......... Tree representation

8 Similarity of melodies Use tree-edit distance Currently using Selkow distance

9 Musical meaning of edit costs Editing costs depend on the note and its context in terms of music knowledge E.g. deleting a non diatonic note has lower cost than deleting a note of the scale Find the best editing cost matrix Current approaches in music applications: Unit costs: it does not reflect musical concept Costs fixed manually based on musical expertise Brute force: learning time Genetic systems: not understandable costs Our proposal: Probabilistic framework: learns matrix of edit probabilities, keeps understandability

10 Stochastic Edit Similarity (1/3) Learning the parameters of an edit distance requires the use of an inductive principle. In the context of probabilistic machines, the maximization of the likelihood is often used. Solution: to learn the edit parameters in a probabilistic framework.

11 Stochastic Edit Similarity (2/3) Definition Given an edit script e = e 1... e n as a sequence of edit operations e i = (b i | a i ) to transform a input data X into an output Y a probabilistic edit script has a probability π s (e) such that:

12 Stochastic Edit Similarity (3/3) Definition p(Y |X ) is the sum of the probabilities of all edit scripts transforming X in Y. Let S (Y | X ) be the set of all scripts that enable the emission of Y given X. How to learn p(e i ) ? Stochastic conditional transducer Strings: Oncina and Sebban (2006) Trees: Bernard, Habrard and Sebban (2006)

13 Experiments Corpus 8-12 bar fragments of 20 worldwide well known tunes For each song 20 different interpretations by 5 different players Total: 420 prototypes, 20 classes

14 Experiments Two experiments Compare classification performance of the probabilistic approach to: Fixed-costs systems Edit costs learned with genetic algorithms Compare understandability of learned matrix to: Edit costs learned with genetic algorithms

15 Genetic system Conventional genetic system Objective of the algorithm: find the best editing costs to be used in the classical edit distance Costs represented by a 13x13 matrix 13 = 12 (pitches = | Σ | ) + 1 ( λ ) No e (λ, λ) operation 168 editing costs Individuals are represented by chromosomes with 168 genes Genes \in [0,2] C |R

16 Genetic system Fitness function Average precision-at-n for all prototypes of learning set, using leave-one-out precision-at-n: number of relevant hits among the n closest melodies, n being the number of prototypes in the learning set with the same class as the query Classification using 1-NN JGAP library used, with default setup Stop criterion: fitness function stabilization (around 100 generations)

17 Experiment 1: Melody classification accuracy Goal: to identify a melody given an interpretation query Strings and trees 3-fold cross validation Train with 2 folds Test with the remaining fold using leave-one-out Classified using 1-NN

18 Experiment 1: Melody classification accuracy Both learned schemes improves success rates over fixed costs systems, more evident on trees

19 Experiment 2: Analysis of the Learned Edit Matrices Genetic StringsTrees Stochastic By changing the contrast of the image the costs represent the piano keyboard

20 Tree costs learned

21 Conclusions Algorithms for edit distance learning applied Improve results over fixed-cost approaches Probabilistic and genetic approaches compared Probabilistic more adequate to explain by means of cost matrices the musical variations and/or noise Maybe, with a bigger corpus the genetic approach can generate more understandable cost matrices But also the probabilistic approach will improve

22 Future works Consider more pitch representations such as contour, high definition contour, pitch classes, and intervals to evaluate their adequacy to the melody matching problem Work on polyphonic music By means of trees [Rizo et al. 2008]


Download ppt "Melody Recognition with Learned Edit Distances Amaury Habrard Laboratoire d’Informatique Fondamentale CNRS Université Aix-Marseille José Manuel Iñesta,"

Similar presentations


Ads by Google