Combining Audio Content and Social Context for Semantic Music Discovery José Carlos Delgado Ramos Universidad Católica San Pablo.

Presentation transcript:

I. Introduction II. Sources of Music Information III. Combining multiple sources of music information IV. Experiments

Introduction Most music IR systems focus on either content-based analysis of audio signals…

Introduction Or content-based analysis of webpages…

Introduction …user preference information…

Introduction … and social tagging data.

Tags Short text-based tokens. Helpful when describing songs.

Tags Not always accurate: the strength of the semantic association between each song and each tag may vary.

Sources of semantic information Surveys, social tagging websites, annotation games.

Relevance of tags to songs May be determined by using content-based audio analysis or by text-mining associated web documents.

Main sources for information retrieval Audio content, social tags and web documents. The audio signal is analyzed using two acoustic feature representations, one related to timbre and one to harmony.

Sources of Music Information For each source, a relevance score function r(s;t) is derived that evaluates the relevance of a song s to a tag t. Song-tag representations are dense if based on audio content and sparse if based on social context.

Representing Audio Content: Supervised Multiclass Labeling (SML) Audio track s is represented as a bag of feature vectors X = {x_1, x_2, …, x_T}. 1: Expectation maximization algorithm. 2: Identify the set of example songs with a given tag. 3: Mixture-hierarchies expectation maximization algorithm.

Representing Audio Content: Supervised Multiclass Labeling (SML) Given a song s, X is extracted and its likelihood is evaluated using each of the tag GMMs. Result: a vector of probabilities. Relevance of song s to a tag t may be written as:
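One plausible way to write this relevance, assuming it is the posterior probability of the tag given the song's features, is Bayes' rule applied to the tag GMM likelihoods. The prior P(t), the independence of the feature vectors given the tag, and the name r_audio are assumptions made for this sketch and do not come from the slides.

```latex
r_{\mathrm{audio}}(s;t) = P(t \mid X)
  = \frac{p(X \mid t)\, P(t)}{\sum_{t'} p(X \mid t')\, P(t')},
\qquad
p(X \mid t) = \prod_{i=1}^{T} p(x_i \mid t)
```

Here p(x_i | t) is the density of the GMM learned for tag t, and the denominator normalizes the scores over the tag vocabulary so that they form a probability vector.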

Representing Audio Content: Audio feature representations Mel Frequency Cepstral Coefficients (MFCC): associated with the musical notion of timbre. Chroma: represents the harmonic content (keys, chords) by computing spectral energy at the frequencies corresponding to the chromatic scale.

Representing Social Context: Summarize each song with an annotation vector over a vocabulary of tags. Methods for collecting tags: social tags and web-mined tags. A missing song-tag pair means either that the tag is not relevant or that it is relevant but was not annotated.

Representing Social Context: Social Tags Last.FM: music discovery website. Each month, 20 million users annotate 3.8 million items over 50 million times, using a vocabulary of 1.2 million tags. Last.FM database: 150 million songs and 16 million artists.
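The chroma idea can be illustrated with a minimal sketch that folds spectral peaks onto the 12 pitch classes of the chromatic scale. The equal-temperament mapping with A4 = 440 Hz, the function names and the input format (a list of frequency/energy pairs) are assumptions for illustration; a real system would work on short-time spectra.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def pitch_class(freq_hz, a4_hz=440.0):
    """Map a frequency to a pitch-class index (0 = C, ..., 9 = A)."""
    # MIDI note number under equal temperament; A4 is note 69.
    midi = 69 + 12 * math.log2(freq_hz / a4_hz)
    return int(round(midi)) % 12

def chroma(peaks):
    """Sum energy per pitch class and normalize to unit total.

    peaks: iterable of (frequency in Hz, spectral energy) pairs.
    """
    bins = [0.0] * 12
    for freq_hz, energy in peaks:
        bins[pitch_class(freq_hz)] += energy
    total = sum(bins)
    return [b / total for b in bins] if total > 0 else bins

# A C major chord (C4, E4, G4, C5) with equal energy on every note.
c_major = [(261.63, 1.0), (329.63, 1.0), (392.00, 1.0), (523.25, 1.0)]
vector = chroma(c_major)
for name, value in zip(PITCH_CLASSES, vector):
    if value > 0:
        print(name, value)
```

Because octaves are folded together, C4 and C5 land in the same bin, which is what makes chroma a description of harmony rather than of register.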

Representing Social Context: Social Tags

Two lists of social Last.FM tags are collected for each song: one relating the song to tags, and one relating the artist to tags. Relevance r_social(s;t) = artist-list tag scores + song-list tag scores + tag scores for synonyms or wildcard matches of t on either list.
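The sum above can be sketched in a few lines. The dictionary format for the two tag lists, the synonym table and the treatment of a wildcard match as a case-insensitive substring match are all assumptions made for this example; the slides do not specify them.

```python
def social_relevance(tag, song_tags, artist_tags, synonyms=None):
    """Sum the scores of every listed tag that matches `tag` or a synonym.

    song_tags, artist_tags: dicts mapping a tag string to its score.
    synonyms: optional dict mapping a tag to a list of equivalent strings.
    """
    synonyms = synonyms or {}
    variants = {tag.lower()} | {s.lower() for s in synonyms.get(tag, [])}
    score = 0.0
    for tag_list in (song_tags, artist_tags):
        for listed_tag, listed_score in tag_list.items():
            name = listed_tag.lower()
            # Substring test covers exact matches and wildcard-style matches.
            if any(variant in name for variant in variants):
                score += listed_score
    return score

# Hypothetical tag lists for one song and its artist.
song_tags = {"rock": 80, "classic rock": 40}
artist_tags = {"rock": 100, "pop": 20}
print(social_relevance("rock", song_tags, artist_tags))
```

A song with no matching entry on either list gets a score of 0.0 here, which cannot be told apart from a tag that is genuinely irrelevant; that is the missing-pair ambiguity noted earlier.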

Representing Social Context: Web-Mined Tags Relevance Scoring (RS) algorithm. The relevance function depends on tag frequency, document frequency, the total number of words in the documents, etc. Site-specific queries are run against high-quality websites. Steps: collect a document corpus, then tag songs.

Combining multiple sources of music information Given a query tag t, the goal is to find a simple rank ordering of songs based on their relevance to t. Tag scores, web-relevance scores and convex optimization are used. Three algorithms are presented; all are supervised, that is, they use labeled training data for learning.

Calibrated Score Averaging (CSA) Using training data, we can learn a function g() that calibrates scores such that To learn g(), we start with a rank-ordered training set of N songs where If the data is perfectly ordered, then g is isotonic. Otherwise:

Calibrated Score Averaging (CSA) E.g. 7 songs with relevance scores (1,2,4,5,6,7,9) and ground truth labels (0,1,0,1,1,0,1). Then g(r) = 0 for r < 2, g(r) = 1/2 for 2 <= r < 5, g(r) = 2/3 for 5 <= r < 9 and g(r) = 1 for 9 <= r. A missing song-tag score suggests the tag isn’t relevant. Instead:
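The seven-song example can be checked with a short sketch. It assumes the non-isotonic case is handled by the pool-adjacent-violators (PAV) algorithm, a standard way to fit a non-decreasing function; the slides do not name the method, and the function names here are illustrative. Exact fractions are used so the step values are easy to read.

```python
from fractions import Fraction

def pav(labels):
    """Least-squares non-decreasing fit to labels sorted by ascending score."""
    blocks = []  # each block is [sum of labels, number of songs]
    for y in labels:
        blocks.append([Fraction(y), 1])
        # Pool with the previous block while the sequence of means decreases.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted

def make_g(scores, labels):
    """Return the step function g(r) defined by the PAV fit."""
    fitted = pav(labels)
    def g(r):
        value = Fraction(0)  # below the smallest training score
        for score, fit in zip(scores, fitted):
            if r >= score:
                value = fit
        return value
    return g

scores = [1, 2, 4, 5, 6, 7, 9]
labels = [0, 1, 0, 1, 1, 0, 1]
g = make_g(scores, labels)
print([str(v) for v in pav(labels)])
```

The pooled blocks are {1}, {2, 4} and {5, 6, 7}, followed by {9}, so each step value is simply the fraction of relevant songs inside its block.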

RankBoost algorithm For a given song, each weak ranking function is an indicator function that outputs 1 if the score for the associated representation is greater than the threshold, or if the score is missing and the default value is set to 1. Otherwise it outputs 0.
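A weak ranker of this kind might look like the sketch below. The representation names, thresholds, defaults and weights are hypothetical values picked by hand; in RankBoost they would be selected by the boosting procedure from training data, which is not shown here.

```python
def make_weak_ranker(representation, threshold, default):
    """Build an indicator h(song) in {0, 1} for one representation."""
    def h(song_scores):
        score = song_scores.get(representation)  # None: score is missing
        if score is None:
            return default
        return 1 if score > threshold else 0
    return h

# Hypothetical (weight, weak ranker) pairs, for illustration only.
rankers = [
    (0.6, make_weak_ranker("audio", threshold=0.5, default=0)),
    (0.4, make_weak_ranker("social", threshold=10, default=1)),
]

def combined_score(song_scores):
    """Weighted vote of the weak rankers; higher means ranked earlier."""
    return sum(weight * h(song_scores) for weight, h in rankers)

song_a = {"audio": 0.7, "social": 25}
song_b = {"audio": 0.7}                # social score missing, default applies
song_c = {"audio": 0.2, "social": 3}
for name, song in [("a", song_a), ("b", song_b), ("c", song_c)]:
    print(name, combined_score(song))
```

The default value lets the ranker decide, per representation, whether a missing score should count for or against a song, rather than always treating it as zero.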

Kernel Combination SVM (KC-SVM) Linear combination of M different kernels that each encode different data features: Since each kernel matrix K_m is positive semi-definite, their positive-weighted sum K is also a valid positive semi-definite kernel.
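The combination and the reason it stays positive semi-definite can be written out as follows. The weight symbol \mu_m is an assumption; the slides only state that the weights are positive.

```latex
K = \sum_{m=1}^{M} \mu_m K_m, \qquad \mu_m > 0

v^{\top} K v = \sum_{m=1}^{M} \mu_m \, v^{\top} K_m v \;\ge\; 0
\quad \text{for every } v \in \mathbb{R}^{n}
```

Each term v^T K_m v is non-negative because K_m is positive semi-definite, and multiplying by a positive weight keeps it non-negative, so the sum is non-negative as well.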

Kernel Combination SVM (KC-SVM) For the audio features, K_m represents similarities between all songs in the data set, based on the vectors X = {x_1, x_2, …, x_T} obtained from MFCC and Chroma. Its entries are those of a probability product kernel (PPK).
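For reference, the general definition of a probability product kernel between two songs is sketched below. It assumes that a distribution p(x; \theta_i) has been fitted to the feature vectors of song i (for example a GMM) and that \rho > 0 is a free parameter; neither the model nor the value of \rho is given in the slides.

```latex
K(i,j) = \int p(x;\theta_i)^{\rho} \, p(x;\theta_j)^{\rho} \, dx
```

With \rho = 1/2 this reduces to the Bhattacharyya affinity between the two distributions, which equals 1 when they are identical.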

Kernel Combination SVM (KC-SVM) For each of the social context features, a radial basis function (RBF) kernel is computed, with entries: where K(i,j) represents the similarity between x_i and x_j, the annotation vectors for songs i and j.
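An RBF kernel over annotation vectors can be sketched as below, assuming the common Gaussian form K(i,j) = exp(-||x_i - x_j||^2 / (2 sigma^2)). The bandwidth sigma, its parameterization and the toy vectors are assumptions; the slides do not state them.

```python
import math

def rbf_kernel_matrix(vectors, sigma=1.0):
    """Return the n-by-n matrix with entries exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    n = len(vectors)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            squared_distance = sum(
                (a - b) ** 2 for a, b in zip(vectors[i], vectors[j])
            )
            kernel[i][j] = math.exp(-squared_distance / (2 * sigma ** 2))
    return kernel

# Hypothetical annotation vectors for three songs over a two-tag vocabulary.
annotations = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]]
K = rbf_kernel_matrix(annotations, sigma=1.0)
for row in K:
    print(row)
```

Every diagonal entry is 1 and the value decays towards 0 as two annotation vectors move apart, so songs with similar tag profiles get a similarity close to 1.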

Kernel Combination SVM (KC-SVM) For each tag t and corresponding class-label vector y, the primal problem for a single-kernel SVM is to find the decision boundary with maximum margin separating the two classes. The optimal K can be learned by minimizing the function that optimizes the dual (thereby maximizing the margin) with respect to the kernel weights.

Kernel Combination SVM (KC-SVM) Where and e is an n-vector of ones such that constrains the weights to sum to one. C is a hyperparameter that limits violations of the margin.

Kernel Combination SVM (KC-SVM) The solution returns a linear decision function that defines the distance of a new song s_z from the hyperplane boundary between the positive and negative classes (i.e. the relevance of s_z to tag t). b: offset of the decision boundary from the origin.
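In the standard kernel SVM formulation, such a decision function takes the form sketched below. The dual variables \alpha_i, the labels y_i in {-1, +1} and the indexing of the n training songs are assumptions based on the textbook form, not on the slides.

```latex
f(s_z) = \sum_{i=1}^{n} \alpha_i \, y_i \, K(s_i, s_z) + b
```

Here K is the learned combined kernel, the sum runs over the training songs (only support vectors have \alpha_i > 0), and songs can be ranked for tag t by sorting them on f(s_z).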

Semantic Music Retrieval Experiments 500 songs by 500 unique artists, each annotated by a minimum of 3 individuals using a 174-tag vocabulary. A song is annotated with a tag if 80% of the annotators agree that the tag is relevant. Experiment: 72 tags associated with at least 20 songs each.

Thanks!