LAM: Musical Audio Similarity Michael Casey Centre for Cognition, Computation and Culture Department of Computing Goldsmiths College, University of London.



Overview
Machine Music Understanding
– Features / Classes / Clusters
Real-Time Audio Matching
– Feature Extraction
– Feature Similarity (Indexing / Retrieval)
– PD/MSP Tools
Music Similarity Applications
– Sound object matching
– Texture matching

Sound Understanding: Signal Processing → Sound Understanding

Feature Extraction

Statistical Learning for Decision Making
Bayes' rule: P(class | x) = p(x | class) * P(class) / p(x), with classes Music and Speech
Decision boundary: partitioning of feature space
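The decision rule on this slide can be sketched in a few lines. This is a minimal illustration, not the method used in the talk: it assumes a single scalar feature x and Gaussian class-conditional densities, and the priors, means, and standard deviations in CLASSES are made-up values.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian, used here as the class-conditional p(x | class)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def posterior(x, classes):
    """Return P(class | x) for every class via Bayes' rule.

    classes maps a label to (prior, mean, std); all values are illustrative.
    """
    joint = {label: prior * gaussian_pdf(x, mean, std)
             for label, (prior, mean, std) in classes.items()}
    evidence = sum(joint.values())  # p(x), the normalising term
    return {label: value / evidence for label, value in joint.items()}

def classify(x, classes):
    """Pick the class with the largest posterior probability."""
    post = posterior(x, classes)
    return max(post, key=post.get)

# Hypothetical parameters (assumption): label -> (prior, mean, std).
CLASSES = {
    "music": (0.5, 2.0, 1.0),
    "speech": (0.5, -2.0, 1.0),
}

if __name__ == "__main__":
    for x in (-1.0, 0.5, 3.0):
        print(x, classify(x, CLASSES), posterior(x, CLASSES))
```

With equal priors and equal variances, the decision boundary falls midway between the two class means; changing the priors shifts it toward the less likely class.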

MPEG-7 Audio Tools Audio

MPEG-7 Audio Tools Log Frequency Spectrogram Audio AudioSpectrumEnvelopeD

MPEG-7 Audio Tools Log Frequency Spectrogram Audio Log Amplitude Decorrelating Transform / Dimension Reduction AudioSpectrumEnvelopeD AudioSpectrumProjectionD

SoundModelStatePathD State Path Use estimated state sequence as a feature
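One standard way to estimate a state sequence from a hidden Markov model is Viterbi decoding. The sketch below assumes that is what "estimated state sequence" means here; the slides do not name the algorithm. The two-state model and its probabilities are invented for illustration.

```python
import math

def viterbi(log_start, log_trans, log_emit):
    """Most likely state sequence for one observation sequence.

    log_start[s]    : log P(first state = s)
    log_trans[r][s] : log P(next state = s | current state = r)
    log_emit[t][s]  : log p(observation at frame t | state s)
    """
    n_states = len(log_start)
    score = [log_start[s] + log_emit[0][s] for s in range(n_states)]
    back = []  # back[t-1][s] = best predecessor of state s at frame t
    for t in range(1, len(log_emit)):
        prev = score
        pointers, score = [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log_trans[r][s])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][s] + log_emit[t][s])
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

def logs(rows):
    """Element-wise log of a table of probabilities."""
    return [[math.log(p) for p in row] for row in rows]

# Toy two-state model (assumption, not from the slides).
LOG_START = [math.log(0.5), math.log(0.5)]
LOG_TRANS = logs([[0.9, 0.1], [0.1, 0.9]])
LOG_EMIT = logs([[0.8, 0.2], [0.7, 0.3], [0.4, 0.6], [0.1, 0.9], [0.2, 0.8]])
STATE_PATH = viterbi(LOG_START, LOG_TRANS, LOG_EMIT)

if __name__ == "__main__":
    print(STATE_PATH)
```

The returned list holds one state index per frame, which is the kind of sequence the state-path descriptor stores.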

MPEG-7 Audio Tools: Audio → Log Frequency Spectrogram (AudioSpectrumEnvelopeD) → Log Amplitude → Decorrelating Transform / Dimension Reduction (AudioSpectrumProjectionD) → Hidden Markov Model (SoundModelDS)

MPEG-7 Audio Strings (Acoustic Lexicons): Audio → Log Frequency Spectrogram (AudioSpectrumEnvelopeD) → Log Amplitude → Decorrelating Transform / Dimension Reduction (AudioSpectrumProjectionD) → Hidden Markov Model (SoundModelDS) → State Path (SoundModelStatePathD) → Symbol String, e.g. "?71V"

State Symbol Sequence (40 State Model) ?71V
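A state path can be turned into a string by assigning one printable character to each state. The slides do not say which encoding was used; the offset of 48 below (state 0 maps to "0") is purely an assumption, chosen because it keeps all 40 states within printable ASCII.

```python
# Assumption: state index i is written as the character with code 48 + i,
# so a 40-state model uses the characters "0" through "W".
SYMBOL_OFFSET = 48

def path_to_string(state_path, offset=SYMBOL_OFFSET):
    """Encode a list of state indices as a symbol string."""
    return "".join(chr(offset + state) for state in state_path)

def string_to_path(symbols, offset=SYMBOL_OFFSET):
    """Decode a symbol string back into state indices."""
    return [ord(char) - offset for char in symbols]

if __name__ == "__main__":
    print(path_to_string([15, 7, 1, 38]))
    print(string_to_path("?71V"))
```

Under this hypothetical mapping, the example string "?71V" would correspond to states 15, 7, 1, and 38.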


SoundModelStateHistogramD [figure: state index against time in seconds, 0.01 s frames]
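A state histogram counts how often each state occurs in a stretch of the state path and normalises the counts. The sketch below is an illustration under assumed names; the window and hop sizes are hypothetical parameters, not values given in the slides.

```python
def state_histogram(state_path, n_states):
    """Relative frequency of each state index in state_path."""
    if not state_path:
        return [0.0] * n_states
    counts = [0] * n_states
    for state in state_path:
        counts[state] += 1
    return [count / len(state_path) for count in counts]

def windowed_histograms(state_path, n_states, window, hop):
    """One histogram per window of `window` frames, advancing by `hop` frames."""
    return [state_histogram(state_path[start:start + window], n_states)
            for start in range(0, len(state_path) - window + 1, hop)]

if __name__ == "__main__":
    path = [0, 0, 1, 3, 3, 3, 2, 2]
    print(state_histogram(path, 4))
    print(windowed_histograms(path, 4, window=4, hop=2))
```

With 0.01 s frames, a window of 100 frames would summarise one second of audio.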

Self-Similarity Matrix


S-Matrix
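A self-similarity matrix compares every frame of a recording with every other frame. The sketch below uses cosine similarity between feature vectors, which is an assumption; the slides do not state which measure fills the matrix.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def self_similarity(frames):
    """S[i][j] = similarity between frame i and frame j of the same sequence."""
    return [[cosine(u, v) for v in frames] for u in frames]

if __name__ == "__main__":
    # Toy feature frames (assumption): the first and last frames are identical.
    frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
    for row in self_similarity(frames):
        print(row)
```

Repeated material shows up as high values away from the main diagonal, which is what makes the matrix useful for finding structure.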

Efficient Storage / Retrieval Real-Time Access Large Databases Distributed Databases

PostgreSQL Database Representation of State Path “Strings” and Histograms

Similarity
Compute distance between feature pairs
– Features == SoundModelStateHistogramD
Similarity Metric
– dist(a,b) >= 0
– dist(a,b) == 0 iff a == b
– dist(a,b) + dist(b,c) >= dist(a,c)
Vector Dot Product
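The three conditions above can be checked numerically on a handful of histograms. The sketch below uses Euclidean distance as an example of a function that satisfies them; that choice is an assumption, since the slide itself names only the vector dot product. The helper names and the sample histograms are hypothetical.

```python
import math
from itertools import product

def dot(a, b):
    """Vector dot product, a similarity score (larger means more alike)."""
    return sum(x * y for x, y in zip(a, b))

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def check_metric_axioms(items, dist, tol=1e-12):
    """Test the three listed conditions on every triple drawn from items."""
    for a, b, c in product(items, repeat=3):
        if dist(a, b) < 0:
            return False  # non-negativity fails
        if (dist(a, b) <= tol) != (a == b):
            return False  # dist == 0 iff a == b fails
        if dist(a, b) + dist(b, c) < dist(a, c) - tol:
            return False  # triangle inequality fails
    return True

# Illustrative state histograms (assumption).
HISTS = [[0.5, 0.5, 0.0], [0.25, 0.25, 0.5], [0.0, 0.0, 1.0]]

if __name__ == "__main__":
    print(check_metric_axioms(HISTS, euclidean))
    print(check_metric_axioms(HISTS, lambda a, b: 1.0 - dot(a, b)))
```

A raw dot product is a similarity rather than a distance. Turning it into one by subtracting from 1 does not give zero for identical histograms unless they have unit length, so it fails the second condition on this data.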

Similarity of Feature Trajectories

Dynamic Time Warping
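Dynamic time warping aligns two sequences that may differ in speed by letting one frame match several frames of the other. The sketch below is the textbook dynamic-programming form with an absolute-difference local cost on scalar features; the cost function and the example data are assumptions.

```python
def dtw_distance(seq_a, seq_b, cost=lambda x, y: abs(x - y)):
    """Minimum accumulated cost of aligning seq_a with seq_b.

    acc[i][j] holds the best cost of aligning the first i items of seq_a
    with the first j items of seq_b.
    """
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = cost(seq_a[i - 1], seq_b[j - 1])
            # Allowed moves: advance in seq_a only, in seq_b only, or in both.
            acc[i][j] = step + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]

if __name__ == "__main__":
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # same shape, one frame held longer
    print(dtw_distance([1, 2, 3], [1, 2, 4]))
```

For vector-valued feature trajectories, the same routine works if `cost` is replaced with a distance between vectors.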

Acousticon Strings
Distance Metric
– String Edit Distance (Levenshtein)
Scalable to Large Databases
– PostgreSQL Implementation
– Can use built-in Index Structures
Scalable to Real-Time Implementation
– matching and audio streaming (< 20 ms)
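String edit distance counts the fewest single-character insertions, deletions, and substitutions needed to turn one string into another. The sketch below is a plain in-memory version for illustration; it says nothing about how the PostgreSQL implementation mentioned above works, and the example strings and `best_match` helper are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between sequences a and b, keeping two rows in memory."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, item_a in enumerate(a, 1):
        curr = [i]
        for j, item_b in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                       # deletion
                            curr[j - 1] + 1,                   # insertion
                            prev[j - 1] + (item_a != item_b)))  # substitution
        prev = curr
    return prev[-1]

def best_match(query, candidates):
    """Return the candidate string closest to query, with its distance."""
    winner = min(candidates, key=lambda c: edit_distance(query, c))
    return winner, edit_distance(query, winner)

if __name__ == "__main__":
    print(edit_distance("?71V", "?7V"))
    print(best_match("?71V", ["0000", "?71W", "V17?"]))
```

Because the function works on any sequences, it can compare state paths directly as lists of integers as well as their string encodings.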

Information Retrieval for Creativity Utilize an extant sound database for new material. Take the structure of a music clip but replace the content. New interfaces for music creativity.

Audio Information Retrieval MPEG-7 Database A pre-indexed Collection of Sounds

Audio Information Retrieval: Audio Query → Extract → Segment → Match (MPEG-7 Database) → Result List

Audio Query: a sound, a scene, or a list of sounds.

Extract: feature extraction from audio.

Segment: partitioning of audio into chunks.

Match: find similar chunks of audio.
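The segment and match stages can be sketched on top of state paths. This is an illustration only: it assumes fixed-length, non-overlapping chunks, state histograms as the chunk feature, and Euclidean distance for matching, none of which the slides specify. The database entries "drone" and "noise" are invented.

```python
import math

def segment(state_path, chunk_len):
    """Split a state path into consecutive chunks; a partial last chunk is dropped."""
    return [state_path[i:i + chunk_len]
            for i in range(0, len(state_path) - chunk_len + 1, chunk_len)]

def histogram(chunk, n_states):
    """Relative frequency of each state within one chunk."""
    counts = [0.0] * n_states
    for state in chunk:
        counts[state] += 1.0 / len(chunk)
    return counts

def distance(h1, h2):
    """Euclidean distance between two histograms."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(h1, h2)))

def match(query_path, database, n_states, chunk_len):
    """For each query chunk, return (name, distance) of the closest database entry."""
    results = []
    for chunk in segment(query_path, chunk_len):
        query_hist = histogram(chunk, n_states)
        name = min(database, key=lambda key: distance(query_hist, database[key]))
        results.append((name, distance(query_hist, database[name])))
    return results

# Hypothetical pre-indexed collection: name -> state histogram.
DATABASE = {"drone": [1.0, 0.0, 0.0], "noise": [0.0, 0.5, 0.5]}

if __name__ == "__main__":
    print(match([0, 0, 0, 0, 1, 2, 2, 1], DATABASE, n_states=3, chunk_len=4))
```

The result list holds one best match per query chunk, so the query's structure is kept while its content is replaced by database sounds.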

Real-Time Matching

Musaics Real-Time Matching
