Content-based Music Retrieval from Acoustic Input (CBMR)

Outline
- What is CBMR?
- Methods
  - Signal processing
  - Similarity comparison
- Experiment results
- Demo
- Future work

What is CBMR?
- CBMR: Content-based Music Retrieval
- Traditional database query: text-based or SQL-based
- Our goal: music retrieval by singing/humming

Related Work
- Query by humming, by Ghias, Logan, and Chamberlin (1995)
  - Autocorrelation pitch detection
  - 183 songs in database
- MELDEX system, by the New Zealand Digital Library Project (1996)
  - Gold/Rabiner algorithm (800 songs)
  - Sing 'la' or 'ta' when transposition
- Karaoke song recognizer, by J.F. Wang (1997)
  - Novel pitch detection
  - 50 songs in database

Flowchart
[Figure: system flowchart. Labeled stages: Microphone Signal Input, Sampling (11 kHz), Filtering, Pitch Tracking, Post Signal Processing, Mid-level Representation, Similarity Comparison, Query Results (Ranked Song List); Songs Database, MIDI Message Extraction. Parts are marked "On-line processing" and "Off-line processing".]

Original Wave Input 小雨中的回憶 Hz 8 Bits Mono

Single Frame
- 512 points/frame
- 340 points overlap
[Figure: zoom in on a single frame; overlapping frames]
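The frame parameters above imply a hop of 512 - 340 = 172 samples between consecutive frame starts. A minimal Python sketch of that framing follows; the function name `split_frames` and the handling of the incomplete last frame are assumptions, not taken from the slides.

```python
def split_frames(samples, frame_size=512, overlap=340):
    """Split a sample sequence into overlapping frames.

    Consecutive frames start `frame_size - overlap` samples apart.
    Trailing samples that do not fill a whole frame are dropped
    (an assumption; the slides do not say how the tail is handled).
    """
    hop = frame_size - overlap
    if hop <= 0:
        raise ValueError("overlap must be smaller than frame_size")
    last_start = len(samples) - frame_size
    return [samples[s:s + frame_size] for s in range(0, last_start + 1, hop)]


if __name__ == "__main__":
    sample_rate = 11025                     # Hz, from the experiment setup
    one_second = list(range(sample_rate))   # dummy signal: 1 s of sample indices
    frames = split_frames(one_second)
    print(len(frames), "frames, hop =", 512 - 340, "samples")
```

At 11025 Hz a 512-point frame covers roughly 46 ms and the 172-sample hop roughly 16 ms, so the pitch contour is sampled about 64 times per second.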

Pitch Tracking
- Range
  - E2 - C6
  - 82 Hz Hz ( - )
- Method
  - Auto-correlation
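The auto-correlation formula on this slide did not survive extraction. As a rough illustration only, the Python sketch below picks the lag with the largest unnormalized auto-correlation inside the lag range that corresponds to the pitch range. The function name `autocorr_pitch` and the upper limit of 1047 Hz (approximately C6 in A440 tuning) are assumptions.

```python
import math


def autocorr_pitch(frame, sample_rate=11025, f_min=82.0, f_max=1047.0):
    """Estimate the pitch of one frame (Hz) from its auto-correlation peak.

    Returns None when no lag has positive correlation (e.g. a silent frame).
    f_max is an assumed value close to C6; the slide's upper limit is lost.
    """
    n = len(frame)
    min_lag = max(1, int(sample_rate / f_max))
    max_lag = min(n - 1, int(sample_rate / f_min))
    best_lag, best_value = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Auto-correlation at this lag: sum of x[i] * x[i + lag].
        value = sum(frame[i] * frame[i + lag] for i in range(n - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return None if best_lag is None else sample_rate / best_lag


if __name__ == "__main__":
    rate = 11025
    tone = [math.sin(2 * math.pi * 220.0 * i / rate) for i in range(512)]
    print(autocorr_pitch(tone, rate))  # close to 220 Hz, limited by integer lags
```

Because the lag is an integer number of samples, the estimate is quantized; near 220 Hz at 11025 Hz adjacent lags differ by about 4 Hz.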

Auto-correlation without Clipping

Center Clipping
[Figure: panels (a), (b), (c)]
Clipping limits are set to  % of the absolute maximum of the auto-correlation data
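A minimal Python sketch of center clipping, to make the operation concrete. The clipping percentage is not legible in the source, so the `ratio=0.3` default below is a placeholder assumption, and `center_clip` is an illustrative name.

```python
def center_clip(values, ratio=0.3):
    """Center-clip a sequence: zero everything inside +/- limit, shift the rest.

    limit = ratio * max(|values|). The default ratio is an assumed
    placeholder; the percentage used on the slide is not legible.
    """
    if not values:
        return []
    limit = ratio * max(abs(v) for v in values)
    clipped = []
    for v in values:
        if v > limit:
            clipped.append(v - limit)    # keep only the part above the limit
        elif v < -limit:
            clipped.append(v + limit)    # keep only the part below -limit
        else:
            clipped.append(0.0)          # small values are removed entirely
    return clipped
```

Small values, which tend to produce spurious peaks, are zeroed, while the dominant peaks keep their positions.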

Auto-correlation with Clipping

Pitch Contour

Signal Processing
- Remove violent (abruptly changing) points and short notes
- Down sampling and smoothing
- Frequency to semitone
  - Semitone: a musical scale based on A440
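The frequency-to-semitone formula on this slide was lost in extraction. The Python sketch below uses the common MIDI convention (A4 = 440 Hz = note 69), which is an assumption about what the slide showed, together with a simple median filter as one possible smoothing step. The names `hz_to_semitone` and `median_smooth` and the `width` parameter are illustrative.

```python
import math


def hz_to_semitone(freq_hz, ref_hz=440.0, ref_semitone=69):
    """Map a frequency to a semitone number on an A440-based scale.

    Uses the MIDI convention (A4 = 440 Hz = 69), which is an assumption:
    the formula on the slide is not legible.
    """
    return ref_semitone + 12.0 * math.log2(freq_hz / ref_hz)


def median_smooth(contour, width=5):
    """Median-filter a pitch contour; one possible way to suppress outliers.

    The window is truncated at both ends. `width` is a hypothetical parameter.
    """
    half = width // 2
    smoothed = []
    for i in range(len(contour)):
        window = sorted(contour[max(0, i - half):i + half + 1])
        smoothed.append(window[len(window) // 2])
    return smoothed
```

Under this convention the lower end of the stated pitch range, E2 at about 82.4 Hz, maps to semitone 40, and C6 maps to 84.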

Pitch Contour (After Smoothing)

Mid-level Representation

Mid-level Representation without Rest

Similarity Comparison
- Goal
  - Find the most similar MIDI file
- Challenge
  - Tempo variance
    - Dynamic time warping (DTW)
  - Tune variance
    - Key transposition

Compare by DTW
[Figure: a wave file and a MIDI file compared by DTW]

Dynamic Time Warping (DTW)
[Figure: DTW grid with axes i and j, elements t(i-1), t(i), r(j-1), r(j), and the search window]

DTW (cont.)
Local distance between t(i) and r(j):
    dist(i,j) = abs(t(i) - r(j));
    if (t(i) == Rest && r(j) == Rest)
        dist(i,j) = 0;
    elseif (t(i) == Rest || r(j) == Rest)
        dist(i,j) = restWeight;
    end
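The local distance above can be plugged into a standard DTW recursion. The Python sketch below is one way to do that, not the authors' implementation: the step pattern (i-1,j), (i,j-1), (i-1,j-1), the band-shaped window, the use of `None` to mark a rest, and the default `rest_weight` are all assumptions, since the slides do not spell them out.

```python
REST = None  # marker for a rest; this representation is an assumption


def local_dist(t_i, r_j, rest_weight=1.0):
    """Local cost from the slide: 0 for rest/rest, rest_weight for rest/note."""
    if t_i is REST and r_j is REST:
        return 0.0
    if t_i is REST or r_j is REST:
        return rest_weight
    return abs(t_i - r_j)


def dtw_distance(t, r, window=None, rest_weight=1.0):
    """DTW distance between sequences t and r (semitone values or REST).

    Assumes the textbook step pattern and an optional band |i - j| <= window.
    """
    n, m = len(t), len(r)
    if n == 0 or m == 0:
        return float("inf")
    if window is None:
        window = max(n, m)
    window = max(window, abs(n - m))  # the band must reach the end cell
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - window), min(m, i + window) + 1):
            d = local_dist(t[i - 1], r[j - 1], rest_weight)
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

For example, `dtw_distance([60, 60, 62, 64], [60, 62, 64, 64])` is zero: the two sequences contain the same notes held for different durations, which is the tempo variance DTW is meant to absorb.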

Example of DTW

Key Transposition
- Mean shift
- Binary search in the searching area
  - O(N) --> O(log N)
[Figure: the mean and the searching area]
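One way to read the two bullets above: first shift the query so that its mean pitch matches the reference, then refine the shift with a step-halving search instead of trying every candidate key. The Python sketch below follows that reading. The function names, `search_range`, and `steps` are hypothetical, and the sequences are assumed to contain no rests. Any distance function can be passed in; a DTW distance would be the natural choice here.

```python
def mean(values):
    return sum(values) / len(values)


def best_transposition(query, reference, distance, search_range=4.0, steps=5):
    """Find a key shift for `query` that minimizes distance to `reference`.

    Starts from the mean shift (align the two means), then repeatedly tries
    shift - step and shift + step, keeps the best, and halves the step.
    search_range and steps are hypothetical tuning parameters.
    """
    shift = mean(reference) - mean(query)
    best = distance([q + shift for q in query], reference)
    step = search_range / 2.0
    for _ in range(steps):
        # Both candidates are relative to the shift at the start of this round.
        for candidate in (shift - step, shift + step):
            d = distance([q + candidate for q in query], reference)
            if d < best:
                shift, best = candidate, d
        step /= 2.0
    return shift, best


if __name__ == "__main__":
    # Stand-in distance for the demo: element-wise absolute difference.
    def zip_dist(a, b):
        return sum(abs(x - y) for x, y in zip(a, b))

    print(best_transposition([60, 62, 64], [63, 65, 67, 75], zip_dist))
```

Each round costs two distance evaluations, so the number of evaluations grows with the logarithm of the search resolution rather than linearly with the number of candidate shifts.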

Example of Key Transposition

Score Function
- m: length of match string
- n: length of input string
- e: DTW distance
- A = 0.8
- B = 0.6

Experiment Environment
- 290 wave files
  - Wave length: sec
  - Wave format: PCM, 11025 Hz, 8 bits, mono
- Environment
  - Celeron 450 with 128 MB RAM under Matlab 5.3
- Database
  - 493 MIDI files

Experiment Result (Histogram)

Experiment Result (Pie)
Total time: 4589 sec (15.8 sec per wave)

Experiment Result (Pie) - With Rest
Total time: 7893 sec (27.2 sec per wave)

How to Accelerate?
- Branch and bound
  - O(N) -> O(ln N)
  - Triangle inequality
    - d(a,b) + d(b,c) ≥ d(a,c)
- Hierarchical
  - 2 phase
    - 3/32 sec
    - 2/32 sec
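To illustrate how the triangle inequality can prune a search: for a true metric d, |d(q,p) - d(p,c)| is a lower bound on d(q,c), so a candidate c whose bound already exceeds the best distance found so far can be skipped without computing d(q,c). The Python sketch below shows the idea with a single pivot p. The function name and the single-pivot design are assumptions, and DTW distances are not guaranteed to satisfy the triangle inequality, so with DTW this pruning would be a heuristic.

```python
def nearest_with_pruning(query, candidates, pivot, distance):
    """Nearest-neighbour search that skips candidates via the triangle inequality.

    For a metric d, |d(q, p) - d(p, c)| <= d(q, c). The pivot-to-candidate
    distances would be precomputed off-line in practice; here they are
    computed up front for clarity.
    Returns (best index, best distance, number of full distance evaluations).
    """
    pivot_to_cand = [distance(pivot, c) for c in candidates]  # off-line part
    d_qp = distance(query, pivot)
    best_index, best_dist, evaluated = None, float("inf"), 0
    for idx, cand in enumerate(candidates):
        lower_bound = abs(d_qp - pivot_to_cand[idx])
        if lower_bound >= best_dist:
            continue  # pruned: this candidate cannot beat the current best
        evaluated += 1
        d = distance(query, cand)
        if d < best_dist:
            best_index, best_dist = idx, d
    return best_index, best_dist, evaluated


if __name__ == "__main__":
    def abs_dist(a, b):
        return abs(a - b)

    print(nearest_with_pruning(1, [0, 10, 20, 30], 0, abs_dist))
```

In the demo only one of the four candidates needs a full distance evaluation; the savings on real data depend on how well the pivot separates the candidates.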

Experiment Result (Pie) - 3/32 sec
Total time: 2358 sec (8.9 sec per wave)

Experiment Result (Pie) - 2 Phase
Total time: 3006 sec (11.2 sec per wave)

Error Analysis
- MIDI error
- Singing error
- Low pitch
- Broken vocalism
- Noise

Future Work
- Time consuming
  - Better similarity comparison
  - Different comparison unit
  - Hardware acceleration
  - Better searching algorithm
- Steadier pitch tracking algorithm
- Noise handling