Simple But Effective Distance Measures for Query by Singing/Humming
J.-S. Roger Jang, N.-J. Lee, and C.-L. Hsu
Department of Computer Science, Tsing Hua University, Taiwan

QBSH Corpus
The QBSH corpus provided by Roger Jang [1] consists of recordings of children's songs from students taking the course "Audio Signal Processing and Recognition" over the past four years at the CS Department of Tsing Hua University, Taiwan. The corpus consists of two parts:
- MIDI files: 48 monophonic ground-truth MIDI files.
- WAV files: 2797 singing/humming clips from 118 subjects, recorded at a sampling rate of 8 kHz with 8-bit resolution.
For each WAV file we provide two additional formats, distinguished by the file extensions ".pv" and ".mid". The PV files are pitch vectors labeled manually by the student who recorded the clip; most of them are correct, but absolute correctness is not guaranteed. The MID files were generated from the PV files by a simple note segmentation algorithm. Participants may choose any of these formats as the input to their systems. The recording count of each MIDI file is shown in the accompanying chart.

Overview of QBSH Task
The goal of the QBSH (Query by Singing/Humming) task at MIREX 2006 is to evaluate MIR systems that take sung or hummed queries from real-world users. The task consists of two subtasks:
- Subtask 1: Known-Item Retrieval. Input: 2797 sung/hummed queries of 8 seconds. Test database: the 48 ground-truth MIDIs plus Essen Collection MIDI noise files. Evaluation: mean reciprocal rank (MRR) of the ground truth computed over the top-20 returns.
- Subtask 2: Queries as Variations. Input: the 2797 sung/hummed queries plus the 48 ground-truth files of 8 seconds. Test database: the 48 ground-truth MIDIs, the Essen MIDI noise files, and the sung/hummed queries. Evaluation: precision based on the number of songs within the same ground-truth class as the query, computed from the top-20 returns for each of the 2845 queries.

Our Approaches
Since all the query data are available beforehand, we chose simple but effective distance measures that do not run into the potential problem of overfitting/overtraining. Under this guideline, our primary candidates were:
- DTW: Dynamic Time Warping [2, 3]
- LS: Linear Scaling [4]
- LS+DTW: LS followed by DTW [5]
We then had to decide which file format to use as the input to our system. Intuitively, .wav and .pv should be the better choices, since .mid is derived from .pv. To check this systematically, we ran an evaluation (similar to subtask 1, with 2000 MIDIs collected from the Internet replacing the Essen Collection) on .pv and .wav input. When .wav files were used, we employed a robust pitch tracking algorithm based on dynamic programming to extract the pitch vectors. The result, shown in the accompanying chart, demonstrates that .pv is a better choice than .wav plus pitch tracking. Since computation time is not really an issue in this task, we used only DTW and LS in our evaluation. Based on the evaluation criteria for both subtasks, we found that LS is the best method for subtask 1 and DTW is the best method for subtask 2.
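To make the linear scaling (LS) idea concrete, the following is a minimal Python sketch: the query pitch vector is resampled by a range of scaling factors, roughly transposed to the candidate's key by mean subtraction, and compared frame by frame against the beginning of the candidate melody, keeping the smallest distance. The factor range, the mean-based transposition, and the frame-wise absolute distance are illustrative choices of ours, not the exact formulation of [4].

```python
import numpy as np

def linear_scaling_distance(query, candidate, factors=np.linspace(0.5, 1.5, 11)):
    """Distance between a query pitch vector and a candidate melody.

    Both inputs are 1-D arrays of pitch values (e.g. semitones per frame).
    The query is resampled by each scaling factor, mean-subtracted as a
    simple stand-in for key transposition, and compared frame by frame
    against the start of the candidate; the smallest average distance over
    all factors is returned.
    """
    query = np.asarray(query, dtype=float)
    candidate = np.asarray(candidate, dtype=float)
    best = np.inf
    for f in factors:
        n = int(round(len(query) * f))
        if n < 2 or n > len(candidate):
            continue  # skip factors that do not fit the candidate
        # Resample the query to n frames by linear interpolation.
        scaled = np.interp(np.linspace(0, len(query) - 1, n),
                           np.arange(len(query)), query)
        ref = candidate[:n]
        # Remove the mean difference to compensate for the singer's key.
        d = np.abs((scaled - scaled.mean()) - (ref - ref.mean())).mean()
        best = min(best, d)
    return best
```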
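Dynamic time warping handles local tempo deviations that a single global scaling factor cannot. A textbook DTW over two pitch vectors is sketched below; the local path constraints, key compensation, and normalization used in [2, 3] may differ, so this should be read as an illustration rather than the authors' exact method.

```python
import numpy as np

def dtw_distance(query, candidate):
    """Classic dynamic time warping between two 1-D pitch vectors.

    Uses the standard recurrence
        D[i, j] = |q[i] - c[j]| + min(D[i-1, j], D[i, j-1], D[i-1, j-1]),
    with mean subtraction as a crude key-transposition compensation.
    """
    q = np.asarray(query, dtype=float)
    c = np.asarray(candidate, dtype=float)
    q = q - q.mean()
    c = c - c.mean()
    n, m = len(q), len(c)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(q[i - 1] - c[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m] / (n + m)  # normalize by an upper bound on path length
```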
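For reference, the two evaluation criteria defined in the task overview (MRR over the top-20 returns for subtask 1, and per-query precision within the top-20 for subtask 2) can be computed as sketched below. The function names and the exact handling of queries whose ground truth falls outside the top 20 are our own assumptions, not the official MIREX scoring scripts.

```python
def mean_reciprocal_rank(ranked_lists, ground_truths, cutoff=20):
    """Subtask 1: MRR of the ground truth within the top-`cutoff` returns.

    ranked_lists[i] is the ordered list of song IDs returned for query i;
    ground_truths[i] is the correct song ID.  A query whose ground truth
    falls outside the cutoff contributes zero.
    """
    total = 0.0
    for returned, truth in zip(ranked_lists, ground_truths):
        top = list(returned[:cutoff])
        if truth in top:
            total += 1.0 / (top.index(truth) + 1)
    return total / len(ranked_lists)

def mean_top20_precision(ranked_lists, class_of, query_classes, cutoff=20):
    """Subtask 2: average fraction of the top-`cutoff` returns that belong
    to the same ground-truth class as the query (one plausible reading of
    the task description)."""
    precisions = []
    for returned, q_class in zip(ranked_lists, query_classes):
        top = returned[:cutoff]
        hits = sum(1 for item in top if class_of[item] == q_class)
        precisions.append(hits / float(cutoff))
    return sum(precisions) / len(precisions)
```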
Comments on QBSH Task
For a more comprehensive evaluation of QBSH in the coming year, we have several comments and suggestions:
- Preparation of a test set: Ideally, the test set should not be accessible to any participant beforehand. One way to achieve this is to require every participant to submit a set of recordings to the IMIRSEL team to be used as the test set, and to release the test set after the evaluation results are published. By following this convention, we should be able to grow the QBSH corpus year by year, and new effective methods can be identified accordingly.
- More participation: This year there were only 13 participants for subtask 1 and 10 for subtask 2. More participation should be encouraged so that more effective methods can be identified.
- Variations of QBSH:
  - Use WAV exclusively as the query input: this is closer to the real-world situation, where a QBSH system has to handle wave-to-pitch-vector conversion with a pitch tracking algorithm.
  - Use MP3 as the test database: this is far more practical than using monophonic MIDIs, but also far more challenging, since audio melody extraction is known to be a tough task in MIREX.

Results
Our simple distance measures prove to be effective in both subtasks: our entry ranked 3rd among 13 participants in subtask 1 (by mean reciprocal rank) and 1st among 10 participants in subtask 2 (by mean precision). The accompanying plots show the evaluation results for both subtasks.

References
[1] J.-S. Roger Jang, "QBSH: A Corpus for Designing QBSH (Query by Singing/Humming) Systems", available via the "QBSH corpus for query by singing/humming" link on the organizer's homepage.
[2] J.-S. Roger Jang and Ming-Yang Gao, "A Query-by-Singing System Based on Dynamic Programming", International Workshop on Intelligent Systems Resolutions (the 8th Bellman Continuum), Hsinchu, Taiwan, Dec.
[3] J.-S. Roger Jang and Hong-Ru Lee, "Hierarchical Filtering Method for Content-based Music Retrieval via Acoustic Input", The 9th ACM Multimedia Conference, Ottawa, Ontario, Canada, Sept.
[4] J.-S. Roger Jang, Hong-Ru Lee, and Ming-Yang Kao, "Content-based Music Retrieval Using Linear Scaling and Branch-and-bound Tree Search", IEEE International Conference on Multimedia and Expo, Waseda University, Tokyo, Japan, Aug.
[5] J.-S. Roger Jang, Hong-Ru Lee, Jiang-Chuen Chen, and Cheng-Yuan Lin, "Research and Development of an MIR Engine with Multi-modal Interface for Real-world Applications", Journal of the American Society for Information Science and Technology, 2004.