Educational Software using Audio to Score Alignment. Antoine Gomas, supervised by Dr. Tim Collins & Pr. Corinne Mailhes. 7th of September, 2007.

1 Agenda
- Introduction
- Objectives
- Review & Innovation
- Work
  - Dynamic Time Warping
  - Hidden Markov Models
  - Interface
- Conclusion

2 Audio to score alignment?
- Associate
  - Notes in a score
  - Timing points in a recording
- Example

3 Project objectives
- Implement a monophonic audio to score alignment algorithm
- Evaluate characteristics of the performance
- Design a learning interface to help music students improve their performance

4 Review (1) Previous work
- Algorithms already exist
- Similar to Spoken Language Processing
- Application: musicology
- Professional recordings

5 Review (2) Previous work (continued)
- Dynamic Time Warping
  - Few parameters
  - Computationally heavy
  - Low flexibility
- Hidden Markov Models
  - Very flexible
  - Large number of parameters (training)

6 Review (3) Innovation
- Apply to educational software
- Requires modifications & new functionalities
  - Cope with errors
  - Detect errors

7 Work
- Dynamic Time Warping
- Hidden Markov Models
- ITS & Interface design

8 DTW (1) Overview
- Get a first version to work
- Attack, Sustain, Silence
- Uses Dynamic Time Warping

9 DTW (2) Structure
- Feature extraction
- Distance matrix
- Find optimal path
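The three steps on this slide can be sketched in a few lines. This is a minimal illustration, not the project's implementation: it assumes the features are already extracted as one number per score event and per recording frame (here, hypothetical MIDI pitch values), uses an absolute difference as the local distance, and allows only the three basic step directions. The name `dtw_align` is made up for this sketch.

```python
def dtw_align(score_feats, audio_feats, dist=lambda a, b: abs(a - b)):
    """Align two 1-D feature sequences; return (total_cost, path).

    path is a list of (score_index, frame_index) pairs from start to end.
    """
    n, m = len(score_feats), len(audio_feats)
    inf = float("inf")

    # Distance matrix: local cost between every score event and every frame.
    d = [[dist(s, a) for a in audio_feats] for s in score_feats]

    # Accumulated cost: cheapest way to reach each cell from (0, 0).
    acc = [[inf] * m for _ in range(n)]
    acc[0][0] = d[0][0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            best = min(
                acc[i - 1][j] if i > 0 else inf,
                acc[i][j - 1] if j > 0 else inf,
                acc[i - 1][j - 1] if i > 0 and j > 0 else inf,
            )
            acc[i][j] = d[i][j] + best

    # Optimal path: walk back from the last cell along the cheapest predecessors.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((acc[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((acc[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((acc[i][j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return acc[n - 1][m - 1], path


# Three score notes against six recording frames (illustrative values only).
cost, path = dtw_align([60, 62, 64], [60, 60, 62, 62, 62, 64])
print(cost, path)
```

The full matrix costs time and memory proportional to the number of score events times the number of frames.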

10 DTW (3) Instrument model
- Silence: Energy
- Attack: Energy
- Sustain
[Figure: example panels labelled "Guitar" and "Vibes"]
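The symbols on this slide did not survive extraction, so the exact rules of the instrument model are unknown. As a rough sketch of what an energy-based silence / attack / sustain labelling could look like, the code below assumes: a frame is silent when its energy is below a threshold, an attack when its energy jumps by a given ratio over the previous frame, and sustain otherwise. The function names, `silence_thresh` and `attack_ratio` are all assumptions made for illustration.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of consecutive non-overlapping frames."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def label_frames(energies, silence_thresh=0.01, attack_ratio=2.0):
    """Label each frame 'silence', 'attack' or 'sustain' from its energy.

    Both thresholds are illustrative assumptions, not values from the talk.
    """
    labels = []
    prev = 0.0
    for e in energies:
        if e < silence_thresh:
            labels.append("silence")
        elif e >= attack_ratio * max(prev, silence_thresh):
            # Sharp rise relative to the previous frame: treat as a note onset.
            labels.append("attack")
        else:
            labels.append("sustain")
        prev = e
    return labels


print(label_frames([0.0, 0.5, 0.4, 0.3, 0.005, 0.6]))
```

Different instruments (the slide names guitar and vibes) would presumably need different thresholds, which is one reason to keep them as parameters.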

11 DTW (4) Results
- ~95% notes aligned on “good” performances
- Rhythm errors
  - Very high tolerance
  - Provided pitches are correct
- Pitch errors
  - Tuning errors: no problem
  - Note errors: OK
- Good results, but limitations

12 DTW (5) Limitations
- Impossible to recover from severe student mistakes
- Self-correction not perfect

13 HMM (1) Why?
- Expected
  - Lower computing requirements
  - Flexibility to recover from student’s errors
- And also
  - Use state-of-the-art techniques
  - Find connections with SLP

14 HMM (2) Application to ASA

HMM                 ASA
Observed symbols    Recording frames
State trellis       Score representation
Emission matrix     Instrument model
Decoded sequence    Performance image
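To make the mapping concrete, here is a small Viterbi decoder. It is a sketch under assumptions, not the system described in the talk: recording frames are reduced to discrete symbols ("C", "D"), the score is a two-note left-to-right chain, and all probabilities are invented for the example. In the slide's terms, `frames` are the observed symbols, `trans` encodes the score, `emit` plays the role of the instrument model, and the returned state sequence is the decoded performance.

```python
import math

NEG_INF = float("-inf")


def log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF


def viterbi(frames, states, start, trans, emit):
    """Return the most likely state sequence for the observed frames."""
    # best[t][s]: log-probability of the best path ending in state s at frame t.
    best = [{s: log(start.get(s, 0)) + log(emit[s].get(frames[0], 0))
             for s in states}]
    back = [{}]
    for t in range(1, len(frames)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((r, best[t - 1][r] + log(trans[r].get(s, 0))) for r in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log(emit[s].get(frames[t], 0))
            back[t][s] = prev

    # Backtrack from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(frames) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Hypothetical two-note score: note1 (pitch C) followed by note2 (pitch D).
states = ["note1", "note2"]
start = {"note1": 1.0}
trans = {"note1": {"note1": 0.5, "note2": 0.5}, "note2": {"note2": 1.0}}
emit = {"note1": {"C": 0.9, "D": 0.1}, "note2": {"C": 0.1, "D": 0.9}}

print(viterbi(["C", "C", "D", "D"], states, start, trans, emit))
```

The frame at which the decoded state changes from one note to the next gives the timing point for that note.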

15 HMM (3) Flexibility
[Figure: state diagrams of the score model. States Note 1 to Note 5 are labelled (D 1, P 1) to (D 5, P 5) and linked by transitions such as p 12 and p 23, with complements 1-p 12 and 1-p 23. Additional states offer alternative paths: Note 6 (D 6, P 6 in one diagram; D 2, P’ 2 in another, with transitions p 63 and 1-p 63), Note 7 (D’ 3, P 3) and Note 8 (D’ 4, P 4).]
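The diagram itself is lost, but the labels suggest a chain of note states with extra states branching off for alternative renditions. A possible way to build such a topology is sketched below. Everything here is an assumption: the helper names are hypothetical, the chain has no self-loops for brevity, and the branch (state 6 leaving after note 1 with probability 1 - p and rejoining at note 3) is only one reading of the labels p 12, 1-p 12 and p 63.

```python
def linear_chain(n_notes):
    """Transition table where each note leads to the next with probability 1."""
    trans = {i: {i + 1: 1.0} for i in range(1, n_notes)}
    trans[n_notes] = {}  # final note has no successor
    return trans


def add_alternative(trans, prev, alt, rejoin, p_expected):
    """Add state `alt` as an alternative path leaving `prev`.

    Existing transitions out of `prev` are scaled by p_expected; the
    remaining 1 - p_expected goes to `alt`, which then leads to `rejoin`.
    """
    trans[prev] = {nxt: p * p_expected for nxt, p in trans[prev].items()}
    trans[prev][alt] = 1.0 - p_expected
    trans[alt] = {rejoin: 1.0}
    return trans


# Five-note score with one anticipated error: after note 1 the student may
# play an alternative state 6 instead of note 2, then continue at note 3.
trans = add_alternative(linear_chain(5), prev=1, alt=6, rejoin=3, p_expected=0.9)
for state in sorted(trans):
    print(state, trans[state])
```

Each anticipated error adds states and transitions, so the model can only follow mistakes that were built into it in advance.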

16 HMM (4) Results
- 100% on rhythmic recordings
- Good on melodic recordings
- Rhythm errors
  - Good tolerance, though inferior to DTW
- Pitch errors
  - No data
- Severe mistakes
  - Fine when anticipated
- Self-correction
  - More robust than DTW
  - Tempo estimation not critical

17 HMM (5) Extensions
- Pitch
- Other note topologies
- Improve speed
  - Local algorithm
  - Language
- Waiting state

18 ITS & Interface (1) Intelligent Tutoring Systems
- Knowledge models
  - Domain model
  - Learner model
- Open Learner Model
[Figure: two diagrams, labelled "Overlay" and "Perturbation", each relating the domain model (DM), the learner model (LM) and teaching strategies]

19 ITS & Interface (2)

20 Conclusion
- DTW not suitable for education
- Promising HMM results
  - Works without pitch
  - Additional paths for anticipated errors
- Still room for improvement
  - Pitch
  - Computational efficiency
- Coherent ground together with IF design

21 Thank you for listening
Any questions?