Online Learning for Collaborative Filtering

Similar presentations
Pattern Recognition and Machine Learning

Google News Personalization: Scalable Online Collaborative Filtering
Active Appearance Models
Item Based Collaborative Filtering Recommendation Algorithms
Collaborative QoS Prediction in Cloud Computing Department of Computer Science & Engineering The Chinese University of Hong Kong Hong Kong, China Rocky.
ICONIP 2005 Improve Naïve Bayesian Classifier by Discriminative Training Kaizhu Huang, Zhangbing Zhou, Irwin King, Michael R. Lyu Oct
COLLABORATIVE FILTERING Mustafa Cavdar Neslihan Bulut.
Active Learning and Collaborative Filtering
Learning to Recommend Hao Ma Supervisors: Prof. Irwin King and Prof. Michael R. Lyu Dept. of Computer Science & Engineering The Chinese University of Hong.
Chen Cheng1, Haiqin Yang1, Irwin King1,2 and Michael R. Lyu1
1 An Empirical Study on Large-Scale Content-Based Image Retrieval Group Meeting Presented by Wyman
Collaborative Ordinal Regression Shipeng Yu Joint work with Kai Yu, Volker Tresp and Hans-Peter Kriegel University of Munich, Germany Siemens Corporate.
Clustering with Bregman Divergences Arindam Banerjee, Srujana Merugu, Inderjit S. Dhillon, Joydeep Ghosh Presented by Rohit Gupta CSci 8980: Machine Learning.
Collaborative Filtering Matrix Factorization Approach
Item-based Collaborative Filtering Recommendation Algorithms
Cao et al. ICML 2010 Presented by Danushka Bollegala.
Performance of Recommender Algorithms on Top-N Recommendation Tasks RecSys 2010 Intelligent Database Systems Lab. School of Computer Science & Engineering.
WEMAREC: Accurate and Scalable Recommendation through Weighted and Ensemble Matrix Approximation Chao Chen, Dongsheng Li
Yan Yan, Mingkui Tan, Ivor W. Tsang, Yi Yang,
Natural Gradient Works Efficiently in Learning S. Amari Computational Modeling of Intelligence Summarized by Joon Shik Kim.
1 Applying Collaborative Filtering Techniques to Movie Search for Better Ranking and Browsing Seung-Taek Park and David M. Pennock (ACM SIGKDD 2007)
Netflix and Your Next Movie Night: Nonlinear Programming Ron Andrews EMIS 8381.
GAPfm: Optimal Top-N Recommendations for Graded Relevance Domains Yue Shi, Alexandros Karatzoglou, Linas Baltrunas, Martha Larson, Alan Hanjalic.
GAUSSIAN PROCESS FACTORIZATION MACHINES FOR CONTEXT-AWARE RECOMMENDATIONS Trung V. Nguyen, Alexandros Karatzoglou, Linas Baltrunas SIGIR 2014 Presentation:
Chengjie Sun, Lei Lin, Yuan Chen, Bingquan Liu Harbin Institute of Technology School of Computer Science and Technology
User Interests Imbalance Exploration in Social Recommendation: A Fitness Adaptation Authors : Tianchun Wang, Xiaoming Jin, Xuetao Ding, and Xiaojun Ye.
CS 782 – Machine Learning Lecture 4 Linear Models for Classification  Probabilistic generative models  Probabilistic discriminative models.
RecBench: Benchmarks for Evaluating Performance of Recommender System Architectures Justin Levandoski Michael D. Ekstrand Michael J. Ludwig Ahmed Eldawy.
EigenRank: A Ranking-Oriented Approach to Collaborative Filtering IDS Lab. Seminar Spring 2009 Kang Min-seok May 21st, 2009 Nathan.
1 A fast algorithm for learning large scale preference relations Vikas C. Raykar and Ramani Duraiswami University of Maryland College Park Balaji Krishnapuram.
Sofiane Abbar, Habibur Rahman, Saravanan Thirumuruganathan, Carlos Castillo, Gautam Das Qatar Computing Research Institute University of Texas at Arlington.
A Content-Based Approach to Collaborative Filtering Brandon Douthit-Wood CS 470 – Final Presentation.
EigenRank: A ranking oriented approach to collaborative filtering By Nathan N. Liu and Qiang Yang Presented by Zachary 1.
Recommender Systems. Recommender Systems (RSs) n RSs are software tools providing suggestions for items to be of use to users, such as what items to buy,
Pairwise Preference Regression for Cold-start Recommendation Speaker: Yuanshuai Sun
ICDCS 2014 Madrid, Spain 30 June-3 July 2014
Recommender Systems with Social Regularization Hao Ma, Dengyong Zhou, Chao Liu Microsoft Research Michael R. Lyu The Chinese University of Hong Kong Irwin.
Logistic Regression William Cohen.
Large-Scale Matrix Factorization with Missing Data under Additional Constraints Kaushik Mitra University of Maryland, College Park, MD Sameer Sheoreyy.
Finding the Right Facts in the Crowd: Factoid Question Answering over Social Media J. Bian, Y. Liu, E. Agichtein, and H. Zha ACM WWW, 2008.
Collaborative Filtering via Euclidean Embedding M. Khoshneshin and W. Street Proc. of ACM RecSys, 2010.
1 Chapter 8: Model Inference and Averaging Presented by Hui Fang.
ICONIP 2010, Sydney, Australia 1 An Enhanced Semi-supervised Recommendation Model Based on Green’s Function Dingyan Wang and Irwin King Dept. of Computer.
Online Evolutionary Collaborative Filtering RECSYS 2010 Intelligent Database Systems Lab. School of Computer Science & Engineering Seoul National University.
Poster Spotlights Conference on Uncertainty in Artificial Intelligence Catalina Island, United States August 15-17, 2012 Session: Wed. 15 August 2012,
Learning to Rank: From Pairwise Approach to Listwise Approach Authors: Zhe Cao, Tao Qin, Tie-Yan Liu, Ming-Feng Tsai, and Hang Li Presenter: Davidson Date:
Collaborative Deep Learning for Recommender Systems
Hao Ma, Dengyong Zhou, Chao Liu Microsoft Research Michael R. Lyu
StressSense: Detecting Stress in Unconstrained Acoustic Environments using Smartphones Hong Lu, Mashfiqui Rabbi, Gokul T. Chittaranjan, Denise Frauendorfer,
A Collaborative Quality Ranking Framework for Cloud Components
Matrix Factorization and Collaborative Filtering
Collaborative Filtering for Streaming data
Efficient Multi-User Indexing for Secure Keyword Search
Learning Recommender Systems with Adaptive Regularization
WSRec: A Collaborative Filtering Based Web Service Recommender System
Accelerated Sampling for the Indian Buffet Process
Asymmetric Correlation Regularized Matrix Factorization for Web Service Recommendation Qi Xie1, Shenglin Zhao2, Zibin Zheng3, Jieming Zhu2 and Michael.
Tingdan Luo 05/02/2016 Interactively Optimizing Information Retrieval Systems as a Dueling Bandits Problem Tingdan Luo
Probabilistic Models for Linear Regression
Statistical Learning Dong Liu Dept. EEIS, USTC.
Advanced Artificial Intelligence
Collaborative Filtering Matrix Factorization Approach
Where did we stop? The Bayes decision rule guarantees an optimal classification… … But it requires the knowledge of P(ci|x) (or p(x|ci) and P(ci)) We.
Movie Recommendation System
Learning Theory Reza Shadmehr
Probabilistic Latent Preference Analysis
Mathematical Foundations of BME
Learning to Rank with Ties
Recommender Systems Problem formulation Machine Learning.
Stochastic Methods.
Presentation transcript:

Online Learning for Collaborative Filtering
Guang Ling, Haiqin Yang, Irwin King, Michael Lyu
Presented by Guang Ling

Outline
- Introduction
- PMF and RMF
- Online PMF and Online RMF
- Experiments and Results
- Conclusion and Future Work

Introduction
We face an unprecedentedly large amount of choice! Search vs. Recommend.

Introduction
Recommender systems emerged:
- Content-based filtering: analyze item content
- Collaborative filtering: based on ratings

Introduction
Collaborative filtering:
- Allow users to rate items
- Infer users' tastes and items' features from the ratings
- Match user preferences with item features

Introduction
Various methods have been developed:
- Memory based: user based, item based
- Model based: PMF, RMF, PLSA, PLPA
So, what is the problem?
[Figure: example user-item rating matrix over items I1-I4 and users U1-U3; U1 has rated I1, I2, I3 (1, 5, 4), and "?" marks an unknown rating such as R(U1, I4) to predict]

Introduction
Unrealistic assumptions:
- All ratings are available
- There will be no new ratings
- The data set is small enough to be handled in main memory
Reality:
- Ratings are collected over time
- New ratings are received constantly
- Huge data sets cannot be easily handled in memory

Introduction
We propose online CF algorithms that:
- Obviate the need to hold all ratings
- Make incremental changes based solely on each new rating
- Scale linearly with the number of ratings
Extra feature:
- Explicit control over the regularization effect

PMF and RMF
Matrix factorization models factor the rating matrix R (rows: users, columns: items) into a user factor matrix U and an item factor matrix V, minimizing:
- Squared loss: PMF
- Cross entropy: RMF

PMF
Conditional distribution over observed ratings:
  p(R | U, V, σ²) = ∏_i ∏_j [N(R_ij | U_iᵀV_j, σ²)]^I_ij
where I_ij = 1 if R_ij is observed and 0 otherwise. Spherical Gaussian priors on user and movie feature vectors:
  p(U | σ_U²) = ∏_i N(U_i | 0, σ_U² I),   p(V | σ_V²) = ∏_j N(V_j | 0, σ_V² I)
Maximize the posterior p(U, V | R, σ², σ_U², σ_V²).

PMF
Maximizing the posterior is equivalent to minimizing the following loss (squared loss plus regularization):
  E = ½ Σ_ij I_ij (R_ij − U_iᵀV_j)² + (λ_U/2) Σ_i ‖U_i‖² + (λ_V/2) Σ_j ‖V_j‖²
Use gradient descent to minimize this loss.
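The batch PMF objective can be minimized in a few lines of NumPy. This is a minimal illustration on a toy matrix, not the experimental code; the rank k, λ, and learning rate values are made up for the example:

```python
import numpy as np

def pmf_batch(R, mask, k=2, lam=0.1, eta=0.01, iters=500, seed=0):
    """Gradient descent on 0.5 * sum_ij I_ij (R_ij - U_i.V_j)^2
    + lam/2 * (||U||_F^2 + ||V||_F^2)."""
    rng = np.random.default_rng(seed)
    m, n = R.shape
    U = 0.1 * rng.standard_normal((m, k))
    V = 0.1 * rng.standard_normal((n, k))
    for _ in range(iters):
        E = mask * (R - U @ V.T)                 # residuals on observed entries only
        U, V = (U + eta * (E @ V - lam * U),     # simultaneous gradient step
                V + eta * (E.T @ U - lam * V))
    return U, V

# Toy 3x4 rating matrix; 0 marks a missing entry.
R = np.array([[5., 3., 0., 1.],
              [4., 0., 0., 1.],
              [1., 1., 0., 5.]])
mask = (R > 0).astype(float)
U, V = pmf_batch(R, mask)
```

The mask restricts the loss to observed entries, which is what the indicator I_ij does in the objective above.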

RMF
Top-one probability: the probability that item i is ranked on top for user u, a softmax over the predicted scores:
  p(i) = exp(U_uᵀV_i) / Σ_j exp(U_uᵀV_j)
Minimize the cross entropy between the top-one distribution induced by the observed ratings and the one induced by the model:
- Cross entropy measures the divergence between two distributions
- It is an un-normalized KL-divergence
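The top-one probability is simply a softmax over scores. A small sketch with made-up numbers (the rating and score values are illustrative only):

```python
import numpy as np

def top_one_prob(scores):
    """Softmax: probability of each item being ranked on top, given its score."""
    e = np.exp(scores - scores.max())   # shift by the max for numerical stability
    return e / e.sum()

def cross_entropy(p, q):
    """Divergence of model distribution q from target distribution p."""
    return -np.sum(p * np.log(q))

ratings = np.array([5.0, 3.0, 1.0])     # one user's observed ratings
predicted = np.array([4.5, 3.2, 0.8])   # model scores U_u . V_i (illustrative)
p = top_one_prob(ratings)               # target top-one distribution
q = top_one_prob(predicted)             # model top-one distribution
loss = cross_entropy(p, q)
```

By Gibbs' inequality the loss is minimized exactly when q matches p, which is why cross entropy is a sensible ranking objective here.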

RMF
The model loss is the cross entropy plus regularization:
  L = − Σ_u Σ_i [exp(R_ui) / Σ_j exp(R_uj)] log [exp(U_uᵀV_i) / Σ_j exp(U_uᵀV_j)] + (λ/2)(‖U‖_F² + ‖V‖_F²)
Use gradient descent to minimize it.

Online PMF
We propose two online algorithms for PMF:
- Stochastic gradient descent: adjust the model stochastically for each observation
- Regularized dual averaging: maintain an approximated average gradient, and solve an easy optimization problem at each iteration

Stochastic Gradient Descent PMF
Recall the loss function for PMF. The squared loss can be dissected and associated with each observation triplet (i, j, R_ij):
  ℓ_ij = ½(R_ij − U_iᵀV_j)² + (λ_U/2)‖U_i‖² + (λ_V/2)‖V_j‖²
Update the model using the gradient of this per-observation loss.
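A single SGD step for one rating might look like the following sketch (η and λ values are illustrative, and the demo feeds one rating repeatedly just to show the update converging):

```python
import numpy as np

def sgd_pmf_update(Ui, Vj, r, eta=0.05, lam=0.1):
    """One stochastic step for a single observed rating r = R_ij."""
    err = r - Ui @ Vj                          # prediction residual
    Ui_new = Ui + eta * (err * Vj - lam * Ui)  # gradient step on the user factor
    Vj_new = Vj + eta * (err * Ui - lam * Vj)  # gradient step on the item factor
    return Ui_new, Vj_new

# Repeating one observation drives the prediction toward the rating,
# up to the shrinkage caused by the regularizer.
rng = np.random.default_rng(0)
Ui = 0.1 * rng.standard_normal(2)
Vj = 0.1 * rng.standard_normal(2)
for _ in range(500):
    Ui, Vj = sgd_pmf_update(Ui, Vj, 4.0)
```

Only the factors of the user and item involved in the new rating are touched, which is what makes the per-rating cost constant.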

Regularized Dual Averaging PMF
Maintain the approximated average gradient: combine the previous average gradient with the gradient due to the new observation, weighted by the number of items rated by user u.

Regularized Dual Averaging PMF
At each step, solve a small optimization problem over the average gradient plus the regularizer to obtain the new user feature vector and the new item feature vector.
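For an ℓ2 regularizer the dual-averaging subproblem has a closed-form solution. The sketch below assumes the standard ℓ2-RDA setup with an auxiliary γ/√t proximal term; the slides' exact subproblem may differ, so treat λ, γ, and the update as illustrative:

```python
import numpy as np

def rda_step(g_bar, g_new, t, lam=0.1, gamma=1.0):
    """One regularized dual averaging step with an l2 regularizer.

    g_bar: running average of past gradients; g_new: gradient from the
    newest observation; t: step count. The subproblem
        min_w  <g_bar, w> + lam/2 ||w||^2 + gamma/(2 sqrt(t)) ||w||^2
    has the closed-form minimizer used below.
    """
    g_bar = ((t - 1) * g_bar + g_new) / t     # update the average gradient
    w = -g_bar / (lam + gamma / np.sqrt(t))   # closed-form subproblem solution
    return g_bar, w

g_bar, w = rda_step(np.zeros(3), np.array([1.0, -2.0, 0.5]), t=1)
```

Setting the subproblem's gradient to zero, g_bar + (λ + γ/√t)·w = 0, gives the closed form directly.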

Online RMF
As with online PMF, we propose two online algorithms for RMF:
- Stochastic Gradient Descent
- Regularized Dual Averaging
The challenge: the loss function cannot be easily dissected into per-observation terms, because the softmax couples all items rated by a user.

Online RMF
Recall the loss function for RMF. When a new observation is revealed, the loss changes in two ways: a new term due to the new item, and a decay of the terms for previously rated items.

Online RMF
We approximate the gradient by decaying the previous gradient and adding the gradient with respect to the new item.

Online RMF
This yields the two algorithms: Stochastic Gradient Descent RMF and Regularized Dual Averaging RMF.

Experiments and Results
- Online vs. batch algorithms
- Performance under different settings
- Sensitivity analysis of parameters
- Scalability to a large dataset

Evaluation Metrics
- Root Mean Square Error (RMSE): the lower the better
- Normalized Discounted Cumulative Gain (NDCG): the higher the better
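Both metrics are easy to compute directly. A minimal sketch; the exponential-gain NDCG variant is assumed here, so the exact formula may differ from the one used in the experiments:

```python
import numpy as np

def rmse(pred, truth):
    """Root Mean Square Error: the lower the better."""
    return np.sqrt(np.mean((pred - truth) ** 2))

def ndcg_at_k(relevance, scores, k=10):
    """NDCG@k with exponential gains: the higher the better."""
    order = np.argsort(scores)[::-1][:k]            # rank items by predicted score
    gains = 2.0 ** relevance[order] - 1
    discounts = 1.0 / np.log2(np.arange(2, len(order) + 2))
    dcg = np.sum(gains * discounts)
    ideal = np.sort(relevance)[::-1][:len(order)]   # best possible ordering
    idcg = np.sum((2.0 ** ideal - 1) * discounts)
    return dcg / idcg if idcg > 0 else 0.0
```

A perfect ranking scores 1.0; any misordering of relevant items pushes NDCG below 1.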

Online vs. Batch algorithms
We conduct experiments on a real-life dataset, MovieLens (movie ratings):
- 6,040 users
- 3,900 movies
- 1,000,209 ratings
- 4.25% of the user-item rating matrix is known
We simulate three settings:
- T1: 10% training, 90% testing
- T5: 50% training, 50% testing
- T9: 90% training, 10% testing

Online vs. Batch algorithms
Shown below are the PMF results. [Figures: T1, T5, and T9 settings]

Online vs. Batch algorithms
Shown below are the RMF results. [Figures: T1, T5, and T9 settings]

Impact of λ in PMF
We use λ to denote the regularization parameter. Observations:
- The less training data, the more regularization is needed
- Results are quite sensitive to the regularization strength
[Figures: sensitivity of SGD-PMF and RDA-PMF to λ]

Impact of λ in RMF
We use λ to denote the regularization parameter. Observation:
- The less training data, the more regularization is needed
[Figures: sensitivity of SGD-RMF and RDA-RMF to λ]

Impact of the learning rate
We use η to denote the learning rate. It is used only in the stochastic gradient descent algorithms.
[Figures: sensitivity of SGD-PMF and SGD-RMF to η]

Scalability to a large dataset
Yahoo! Music dataset, the largest publicly available CF dataset:
- 252,800,275 ratings
- 1,000,990 users
- 624,961 items
- Rating values range over [0, 100]

Scalability to a large dataset
Experiment environment: Linux workstation (Xeon dual-core 2.4 GHz, 32 GB RAM)
- Batch PMF: 8 hours for 120 iterations
- Online PMF: 10 minutes
[Figures: results under the T1 and T5 settings]

Conclusion and Future Work
We proposed online CF algorithms that:
- Perform comparably to, or even better than, the corresponding batch algorithms
- Scale linearly with the number of ratings
- Adjust the model incrementally given each new observation
Future work:
- Theoretical bounds on the convergence rate
- Better approximations of the average gradient for RMF

Thanks! Questions?