Performing expressive music using Case-Based Reasoning Ramon López de Mántaras IIIA - CSIC

Outline
Recap of CBR and introduction to the main SaxEx components
Case representation
– The musical knowledge
Retrieval using perspectives
Reuse
– Fuzzy combination
SaxEx results
TempoExpress
Conclusions and future work

Case-based reasoning (CBR)
CBR solves problems by means of examples of similar, already solved problems (reasoning from precedents).
The task of our system is to infer, via CBR and musical knowledge, a set of expressive transformations to be applied to the notes of inexpressive musical phrases given as input.
The precedents are examples of expressive human interpretations.
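A minimal sketch of the retrieve-and-reuse idea described above, assuming a toy case base. The `Case` class, the feature tuples, the solution labels and the `manhattan` distance are all hypothetical and are not taken from SaxEx; the sketch only shows the general shape of reasoning from precedents.

```python
from dataclasses import dataclass


@dataclass
class Case:
    problem: tuple   # hypothetical feature vector describing a note in its context
    solution: str    # hypothetical expressive label attached to that precedent


def manhattan(a, b):
    """Assumed distance between two feature tuples of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b))


def retrieve(case_base, problem, distance=manhattan):
    """Return the stored case whose problem part is closest to the new problem."""
    return min(case_base, key=lambda case: distance(case.problem, problem))


def solve(case_base, problem, distance=manhattan):
    """Reuse the solution of the most similar precedent without adapting it."""
    return retrieve(case_base, problem, distance).solution
```

A real system would also adapt the retrieved solution and store the newly solved problem as a case; those steps are left out here.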

Saxex Components

SMS Snapshot

Saxex-CBR

Case representation
Score
Musical knowledge
– Implication-realization, metrical structure, time-span reduction and prolongational reduction
Performance representation (solution description)
Sound transformation operations (the solution):
– e.g. high dynamics, medium rubato, very legato, etc.

Transformations
Transformations (for each note):
– Dynamics (5 possible values)
– Rubato (5 possible values)
– Vibrato (5 possible values)
– Articulation (5 possible values)
– Attack (2 possible values)
In total: 5 × 5 × 5 × 5 × 2 = 1250 possibilities
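The figure of 1250 follows from multiplying the number of values of each transformation. A short check, in which the dictionary name `LEVELS` and the use of integer indices in place of the actual linguistic values are illustrative assumptions:

```python
from itertools import product
from math import prod

# Number of possible values per transformation, as listed on the slide.
LEVELS = {"dynamics": 5, "rubato": 5, "vibrato": 5, "articulation": 5, "attack": 2}

# Size of the per-note solution space.
total = prod(LEVELS.values())

# Enumerating every combination explicitly gives the same count.
combinations = list(product(*(range(n) for n in LEVELS.values())))

print(total)  # 1250
```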

Score

Musical knowledge
Implication/Realization model (Narmour)
– Basic structures
– Melodic direction, durational cumulation
GTTM theory (Lerdahl & Jackendoff)
– Metrical structure (metrical strength of notes)
– Time-span reduction (relative importance of notes within phrases or sub-phrases)
– Prolongational reduction (tensions, relaxations)
Jazz theory
– Harmonic progressions (duration, harmonic stability)

Implication/Realization Model

GTTM Theory

Performance

A Retrieval Perspective

Retrieval Example (figure labels: Case Memory, Problem; Identify, Search, Select)

Saxex-Reuse
Transformations
– Dynamics
– Rubato
– Vibrato
– Articulation
– Attack
Criteria
– Most similar
– Majority
– Minority
– Continuity
– Random
– Fuzzy combination (default)

Saxex-Reuse Example (figure: a problem note and a single retrieved case, each labelled Din., Rub. and Art.)

Saxex-Reuse (Fuzzy Combination)
The notes in the human-performed musical phrases are qualified by means of five ordered linguistic values. Those for rubato are: Very Low, Low, Medium, High, Very High (figure: membership functions with degrees from 0 to 1 over a tempo axis running from 20 to 320).
Assume that SaxEx has retrieved and selected two notes whose rubato values are 72 and 190 respectively. The fuzzy combination followed by a defuzzification (center of area, COA) gives the rubato value to be applied to the input note (figure: the combined Low and Medium sets and their COA).
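One plausible reading of this step, sketched under strong assumptions. The five labels are modelled as evenly spaced triangular membership functions on the 20 to 320 axis, because the slide does not give the actual shapes. Each retrieved value selects its best-matching label, that label's membership function is truncated at the value's membership degree, the truncated functions are combined with a pointwise maximum, and the center of area (COA) of the result is returned. All function names are hypothetical.

```python
LABELS = ["very low", "low", "medium", "high", "very high"]
LO, HI = 20.0, 320.0
STEP = (HI - LO) / (len(LABELS) - 1)  # assumed spacing between label centers


def triangle(center, half_width):
    """Triangular membership function (an assumed shape)."""
    def mu(x):
        return max(0.0, 1.0 - abs(x - center) / half_width)
    return mu


SETS = {name: triangle(LO + i * STEP, STEP) for i, name in enumerate(LABELS)}


def best_label(value):
    """Label with the highest membership degree for value, plus that degree."""
    name = max(SETS, key=lambda n: SETS[n](value))
    return name, SETS[name](value)


def fuzzy_combine(values, samples=3001):
    """Truncate each selected label at its degree, take the union, return the COA."""
    clipped = [best_label(v) for v in values]

    def combined(x):
        return max(min(SETS[name](x), degree) for name, degree in clipped)

    xs = [LO + (HI - LO) * i / (samples - 1) for i in range(samples)]
    weights = [combined(x) for x in xs]
    # Discrete approximation of the center of area.
    return sum(x * w for x, w in zip(xs, weights)) / sum(weights)


rubato = fuzzy_combine([72.0, 190.0])
```

With these assumed shapes, 72 falls mainly under Low and 190 mainly under Medium, so the combined value lies between the two retrieved values. The exact number depends entirely on the membership functions chosen.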

Saxex Results: Autumn Leaves (audio example: an inexpressive input phrase is transformed by SaxEx into an expressive output phrase)

Affective Labels
Three orthogonal dimensions
– Tender-Aggressive
– Sad-Joyful
– Calm-Restless
Relating to notions such as
– Activity
– Tension vs. relaxation
– Brightness
– ...

SaxEx Results: All of Me (audio example: an inexpressive input phrase is transformed by SaxEx using the affective values Joyful and Sad)

TempoExpress
Goal:
– Changing the original performance tempo of a melody while preserving expressiveness, in the context of jazz standards
Application:
– Audio editing software
– Video/audio post-production (video constrains audio)
Why not apply uniform time stretching to the audio?
– The timing of notes with respect to the beat may have to change
– Other expressive phenomena (e.g. ornamentations, consolidations, fragmentations) may have to change as a function of the tempo
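To make the objection to uniform time stretching concrete, here is a minimal sketch of what such stretching does to note timings. The representation of notes as `(onset, duration)` pairs in seconds and both function names are assumptions made for illustration.

```python
def uniform_time_stretch(notes, original_bpm, target_bpm):
    """Scale every onset and duration (in seconds) by the same tempo ratio."""
    factor = original_bpm / target_bpm
    return [(onset * factor, duration * factor) for onset, duration in notes]


def seconds_to_beats(seconds, bpm):
    """Express a time in seconds as a position in beats at the given tempo."""
    return seconds * bpm / 60.0


performance = [(0.0, 0.25), (0.5, 0.5)]                  # hypothetical notes at 180 bpm
stretched = uniform_time_stretch(performance, 180, 90)   # slowed down to 90 bpm
```

Because every time is multiplied by the same factor, each note keeps exactly the same position relative to the beat, and no note is added, merged or split. That is the behaviour the slide argues is insufficient when the expressive conception changes with tempo.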

TempoExpress
Musical explanation: expressivity is a result of the conception of the music by the performer, and this conception changes with tempo [Desain & Honing, 1994].
Melody: “Up Jumped Spring” (audio examples: recording at the original tempo (180); uniform time stretching and TempoExpress at the transformed tempo (90))

Expressive Transformations
Some basic music performance concepts and their relations

Onset deviations at different tempos (Body and Soul A1)

Approaches to expressive music generation
“Hand crafted”
– Let a music expert formulate rules for music performance (Friberg, CMJ 1991; Friberg et al., CMJ 2000)
Machine learned
– Derive expressivity rules automatically from examples (Widmer, ICMC 2000, JNMR 2002). Eager approach: builds a model based on many training examples and uses the learned model to solve new problems
– Imitate expressivity using examples of concrete human performances by means of CBR (Arcos & Lopez de Mantaras, JNMR 1998; Lopez de Mantaras & Arcos, AI Mag 2002). Lazy approach: takes the solution of the training example that most resembles the new problem and adapts it to solve that problem
“That an expressive effect is applied only once does not mean it is insignificant” (Sundberg, MP 2001)

TempoExpress Architecture Desired Tempo

Performance Annotation
Expressivity in jazz is more than timing and dynamics deviations. It also includes spontaneous note ornamentations, fragmentations, etc.
To model this, we define a set of Performance Events and use them as edit operations to obtain an edit-distance-based alignment between the score and the performance.

Comparing score vs. recordings
Goal of the annotation process:
– Automatic case base acquisition

Examples: Body and Soul, Once I Loved (figure; event labels F, C, C, I, I)

Edit (Levenshtein) distance
Goal: assessing the distance between two sequences
– Calculated as the minimal cost of transforming S1 into S2
– Requires: edit operations and cost functions
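A minimal sketch of the edit distance with pluggable cost functions, using the standard dynamic-programming formulation. The default unit costs give the classic Levenshtein distance; the cost functions actually used for performance events are not given in the slides, so the parameter names here are assumptions.

```python
def edit_distance(s1, s2,
                  insert_cost=lambda y: 1.0,
                  delete_cost=lambda x: 1.0,
                  replace_cost=lambda x, y: 0.0 if x == y else 1.0):
    """Minimal total cost of transforming sequence s1 into sequence s2."""
    rows, cols = len(s1) + 1, len(s2) + 1
    d = [[0.0] * cols for _ in range(rows)]

    # Transforming a prefix of s1 into the empty sequence: deletions only.
    for i in range(1, rows):
        d[i][0] = d[i - 1][0] + delete_cost(s1[i - 1])
    # Transforming the empty sequence into a prefix of s2: insertions only.
    for j in range(1, cols):
        d[0][j] = d[0][j - 1] + insert_cost(s2[j - 1])

    for i in range(1, rows):
        for j in range(1, cols):
            d[i][j] = min(
                d[i - 1][j] + delete_cost(s1[i - 1]),
                d[i][j - 1] + insert_cost(s2[j - 1]),
                d[i - 1][j - 1] + replace_cost(s1[i - 1], s2[j - 1]),
            )
    return d[-1][-1]


print(edit_distance("kitten", "sitting"))  # 3.0
```

The sequences can be any Python sequences, so the same function works on strings, lists of note descriptors or lists of melodic-structure labels once suitable cost functions are supplied.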

Case Annotation examples (I) (figure; event labels T, T, F)

Case Annotation examples (II) (figure; event labels T, T, T, C, T, C, T)

Case Annotation examples (III) (figure; event labels T, T, T, T, I)

Representing melodic context
Rationale: the expressivity of a performed note is not determined by the note alone. Ergo: some representation of the melodic context of the note is needed.
We use the Implication/Realization model of melodic structure (Narmour, 1990):
– It captures the pattern of fulfillment/violation of expectations created by the melodic surface
– It groups notes based on gestalt principles

Case Representation (figure label: repeated for each tempo)

Retrieval
1. Filter cases by tempo: keep cases containing performances at relevant tempos (one of the performed tempos is similar to the original tempo of the target melody, and another is similar to the desired tempo to which the target melody has to be transformed)
2. Rank the cases that passed the previous filter by I/R similarity to the score of the target melody (using edit distance)
3. Partition the phrases of the most similar cases into segments using the I/R parser or any other melodic segmentation algorithm (for instance Temperley, 2001)
4. Form a “new” case base containing the obtained segments (space of partial solutions) as cases
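Steps 1 and 2 above can be sketched as follows, under several assumptions: cases are plain dictionaries with hypothetical keys `"ir"` (a list of I/R structure labels) and `"tempos"` (the tempos at which the phrase was performed), "similar tempo" means within a relative tolerance, and I/R similarity is a unit-cost edit distance. Steps 3 and 4 (segmentation and the new case base) are not covered.

```python
def levenshtein(a, b):
    """Unit-cost edit distance between two sequences."""
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        current = [i]
        for j, y in enumerate(b, 1):
            current.append(min(previous[j] + 1,               # delete x
                               current[j - 1] + 1,            # insert y
                               previous[j - 1] + (x != y)))   # replace or match
        previous = current
    return previous[-1]


def has_similar_tempo(tempos, tempo, tolerance):
    """True if some performed tempo lies within the relative tolerance of tempo."""
    return any(abs(t - tempo) <= tolerance * tempo for t in tempos)


def retrieve(cases, target_ir, source_tempo, desired_tempo, tolerance=0.1, k=2):
    """Filter cases by tempo (step 1), then rank them by I/R similarity (step 2)."""
    relevant = [
        case for case in cases
        if has_similar_tempo(case["tempos"], source_tempo, tolerance)
        and has_similar_tempo(case["tempos"], desired_tempo, tolerance)
    ]
    ranked = sorted(relevant, key=lambda case: levenshtein(case["ir"], target_ir))
    return ranked[:k]
```

The tolerance of 10% and the cut-off `k` are placeholders; the slides do not state how tempo similarity is measured or how many cases are kept.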

Reuse
Solutions for the target melody are generated segment-wise via a best-first search through the space of partial solutions (segments).
Procedure:
1. Retrieve the best matching segment (using edit distance)
2. Align the target melody and the retrieved segment
3. Transfer performance events: for aligned notes T and R, let T_i(R) → T_o(R) represent the tempo transformation of note R; use the annotation differences between T_i(R) and T_o(R) to generate the solution T_o(T) from T_i(T)
4. For non-aligned target notes, use uniform time stretching (UTS) to transform T_i(T) into T_o(T)
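Steps 3 and 4 can be illustrated with a deliberately simplified sketch. It assumes that each note's annotation is reduced to a single number, its onset deviation in beats, whereas the performance events in the slides also include symbolic events such as insertions and consolidations, which are not handled here. The function name, the argument names and the alignment format (target index mapped to retrieved index) are hypothetical.

```python
def transfer_deviations(target_in, retrieved_in, retrieved_out, alignment):
    """Build output-tempo deviations for the target from a retrieved segment.

    target_in     : onset deviations (beats) of the target notes at the input tempo
    retrieved_in  : deviations of the retrieved notes at the input tempo
    retrieved_out : deviations of the same retrieved notes at the output tempo
    alignment     : dict mapping a target note index to a retrieved note index
    """
    solution = []
    for i, deviation in enumerate(target_in):
        j = alignment.get(i)
        if j is None:
            # Non-aligned note: uniform time stretching leaves the
            # deviation, expressed in beats, unchanged.
            solution.append(deviation)
        else:
            # Aligned note: apply the change observed in the retrieved note.
            solution.append(deviation + (retrieved_out[j] - retrieved_in[j]))
    return solution


target_out = transfer_deviations(
    target_in=[0.0, 0.25, -0.125],
    retrieved_in=[0.0, 0.5],
    retrieved_out=[0.25, 0.25],
    alignment={0: 0, 1: 1},
)
```

Here the first two target notes inherit the change seen in the retrieved performance, while the third, which has no aligned counterpart, falls back to the uniform time stretching behaviour.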

TempoExpress overall view

Example of TempoExpress Result (audio examples: uniform time stretching, CBR, human performance; tempos in bpm)

Experimental comparison to UTS
– Recordings of four jazz standards by a professional musician (12 tempos for each: 48 recordings)
– 14 different phrases containing a total of 64 different melodic segments
– More than 8000 tempo-transformation problems in the case base

TempoExpress vs. UTS as a function of the ratio of original tempo to transformed tempo. The lower plot shows the probability of incorrectly rejecting the hypothesis (that there is no difference between TempoExpress and UTS) for the Wilcoxon signed-rank test.

Conclusions & Future
CBR is a powerful technique to imitate human solutions (performances): human-like output
SaxEx successfully retrieves relevant cases
Fuzzy combination increases output variation
SaxEx as a pedagogical tool:
– Users can experiment with the system
– Helps in understanding how to use the different expressive resources
TempoExpress: an application to audio post-production that clearly outperforms UTS
Further TempoExpress experimentation with fast tempos (more example cases at fast tempos are needed)
Add within-note descriptions:
– Energy envelope features: attack, sustain, decay, tremolo
– Pitch envelope features: vibrato, glissando
Add between-note descriptions:
– Articulation (legato, staccato)