Item-Based Collaborative Filtering Recommendation Algorithms Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl GroupLens Research Group/ Army.

Presentation transcript:

Item-Based Collaborative Filtering Recommendation Algorithms
Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl
GroupLens Research Group / Army HPC Research Center, Department of Computer Science and Engineering, University of Minnesota, Minneapolis
Nov. 05
Presented by Eun-gyeong Kim, IDS Lab.

Copyright © 2008 by CEBT
Contents
• Introduction
• Collaborative Filtering Based Recommender Systems
  – Overview of the Collaborative Filtering Process
  – Challenges of User-based Collaborative Filtering Algorithms
• Item-based Collaborative Filtering Algorithm
  – Item Similarity Computation
  – Prediction Computation
  – Performance Implications
• Experimental Evaluation
• Contributions
• Discussion & Conclusion
IDS Lab. Seminar, Center for E-Business Technology

Introduction (What is collaborative filtering?)
• Now it is time to create the technologies that can help us sift through all the available information to find that which is most valuable to us. One of the most promising such technologies is collaborative filtering.
• Collaborative filtering (by Wikipedia)
  – The process of filtering for information or patterns using techniques involving collaboration among multiple agents, viewpoints, data sources, etc.
  – The underlying assumption of the CF approach is that those who agreed in the past tend to agree again in the future
• CF systems usually take two steps
  1. Look for users who share the same rating patterns with the active user
  2. Use the ratings from the like-minded users found in step 1 to calculate a prediction for the active user

Two Main Categories of CF Algorithms
• Memory-based CF algorithms
  – Utilize the entire user-item database to generate a prediction
  – Employ statistical techniques to find the neighbors
• Model-based CF algorithms
  – First develop a model of user ratings
  – Then compute the expected value of a user prediction, given his/her ratings on other items
  – Approaches used to build the model
    – Bayesian network (probabilistic)
    – Clustering (classification)
    – Rule-based approaches (association rules between co-purchased items)

Recommendation Algorithms
• User-based collaborative filtering
  – Traditional collaborative filtering
  – Cluster models
• Item-based collaborative filtering
  – Search-based methods
  – Item-to-item collaborative filtering
Source: Amazon.com Recommendations: Item-to-Item Collaborative Filtering

CF Based Recommender Systems
• Provide item recommendations or predictions based on the opinions of other like-minded users

Traditional Collaborative Filtering (1)
• Represents a customer as an N-dimensional vector of items, where N is the number of distinct catalog items
  – For almost all customers, this vector is extremely sparse
• Generates recommendations based on a few customers (neighbors) who are most similar to the user
• Measures the similarity of two customers, A and B
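The similarity formula for customers A and B did not survive on this slide. A common choice for sparse purchase vectors, and the one assumed in this sketch, is the cosine of the angle between the two vectors. The function name `customer_cosine` and the sample customers are hypothetical.

```python
import math

def customer_cosine(a, b):
    """Cosine similarity between two sparse customer vectors.

    Each vector is a dict mapping item id -> value (1 for a purchase here);
    items a customer never bought are simply absent from the dict.
    """
    # Only items present in both vectors contribute to the dot product.
    dot = sum(value * b[item] for item, value in a.items() if item in b)
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)

# Hypothetical customers, for illustration only.
alice = {"book_1": 1, "book_2": 1, "dvd_9": 1}
bob = {"book_2": 1, "dvd_9": 1, "cd_4": 1}
similarity = customer_cosine(alice, bob)
```

Storing only the purchased items keeps the cost proportional to the number of purchases rather than to the catalog size N, which is why sparsity helps in practice.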

Traditional Collaborative Filtering (2)
• Generate recommendations
  – A common technique is to rank each item according to how many similar customers purchased it
  – O(MN) in the worst case, where M is the number of customers and N is the number of catalog items
  – Performance tends to be closer to O(M+N) because the average customer vector is extremely sparse
• Scaling issues
  – Reduce the data size
    – Reduce M by randomly sampling the customers or discarding customers with few purchases
    – Reduce N by discarding very popular or unpopular items
  – These reductions also lower recommendation quality
• We need better algorithms that scale to large data sets and at the same time produce high-quality recommendations

Challenges of User-based CF Algorithms
• Sparsity
  – A person may have purchased well under 1% of the items (1% of 2 million books is 20,000 books)
  – The accuracy of recommendations may be poor
• Scalability
  – Computation grows with both the number of users and the number of items
  – Traditional CF does little or no offline computation, and its online computation scales with the number of customers and catalog items
=> The key to item-to-item CF's scalability and performance is that it creates the expensive similar-items table offline

Item-based CF Algorithm
• Similarity computation between two items i and j
  – First isolate the users who have rated both of these items
  – Then apply a similarity computation technique to determine the similarity
• Prediction generation
  – Take a weighted average of the target user's ratings on these similar items

Item Similarity Computation
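The formulas on this slide were lost. The Sarwar et al. paper compares three item similarity measures: cosine, correlation, and adjusted cosine. The sketch below illustrates all three over the users who rated both items. The ratings are made up (they are not the values from the slides), and taking the item means over the co-rating users only is an assumption of this sketch.

```python
import math

# Hypothetical ratings, user -> {item: rating out of 5}.
RATINGS = {
    "u1": {"i1": 5, "i2": 3, "i3": 4},
    "u2": {"i1": 3, "i2": 1, "i3": 2, "i4": 3},
    "u3": {"i1": 4, "i2": 2, "i3": 4, "i4": 2},
    "u4": {"i2": 3, "i3": 5, "i4": 4},
}

def co_raters(ratings, i, j):
    """Users who rated both item i and item j."""
    return [u for u, r in ratings.items() if i in r and j in r]

def _normalized_dot(xs, ys):
    """Dot product divided by the product of the vector lengths."""
    num = sum(x * y for x, y in zip(xs, ys))
    den = math.sqrt(sum(x * x for x in xs)) * math.sqrt(sum(y * y for y in ys))
    return num / den if den else 0.0

def cosine_sim(ratings, i, j):
    users = co_raters(ratings, i, j)
    return _normalized_dot([ratings[u][i] for u in users],
                           [ratings[u][j] for u in users])

def correlation_sim(ratings, i, j):
    """Pearson correlation: subtract each item's mean rating first."""
    users = co_raters(ratings, i, j)
    if not users:
        return 0.0
    mean_i = sum(ratings[u][i] for u in users) / len(users)
    mean_j = sum(ratings[u][j] for u in users) / len(users)
    return _normalized_dot([ratings[u][i] - mean_i for u in users],
                           [ratings[u][j] - mean_j for u in users])

def adjusted_cosine_sim(ratings, i, j):
    """Adjusted cosine: subtract each user's own mean rating first."""
    users = co_raters(ratings, i, j)

    def user_mean(u):
        return sum(ratings[u].values()) / len(ratings[u])

    return _normalized_dot([ratings[u][i] - user_mean(u) for u in users],
                           [ratings[u][j] - user_mean(u) for u in users])

sim_cos = cosine_sim(RATINGS, "i1", "i2")
sim_corr = correlation_sim(RATINGS, "i1", "i2")
sim_adj = adjusted_cosine_sim(RATINGS, "i1", "i2")
```

All three values lie in [-1, 1]. Adjusted cosine differs from plain cosine in that it removes each user's rating scale, so a user who rates everything high does not inflate the similarity.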

Item Similarity Computation (example)
[Ratings table, values out of 5; most cells were lost in extraction]
          i1  i2  i3  i4  Ave
  u1      (values missing)
  u2      1, 2 (columns unclear), Ave 1.5
  u3      4   2   4   2   3
  u4      (values missing)
  average (values missing)
[Similarity values for the item pairs (1,2), (1,3), (1,4), (2,3), (2,4), (3,4) are missing]

Prediction Computation

Prediction Computation
• Weighted sum
  – Compute the sum of the ratings given by the user on the items similar to i
  – Each rating is weighted by the corresponding similarity
• Regression
  – Similarities computed using cosine or correlation measures may be misleading
  – Approximated values based on a linear regression model are used (instead of the similar item N's "raw" rating values)
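The weighted sum can be sketched as follows: the prediction for user u on item i is the sum of s(i, N) * R(u, N) over the similar items N that the user has rated, divided by the sum of |s(i, N)| so that the result stays within the rating range. The function name and the sample numbers are hypothetical.

```python
def predict_weighted_sum(user_ratings, similarities):
    """Predict a rating for the target item.

    user_ratings: item -> rating given by the target user.
    similarities: item -> similarity of that item to the target item.
    Returns None when the user rated none of the similar items.
    """
    numerator = 0.0
    denominator = 0.0
    for item, sim in similarities.items():
        if item in user_ratings:  # only items the user actually rated
            numerator += sim * user_ratings[item]
            denominator += abs(sim)
    if denominator == 0:
        return None
    return numerator / denominator

# Hypothetical data: u4's ratings and the similarity of each item to i1.
u4_ratings = {"i2": 3, "i3": 5, "i4": 4}
sims_to_i1 = {"i2": 0.9, "i3": 0.6, "i4": 0.3}
prediction = predict_weighted_sum(u4_ratings, sims_to_i1)
```

Here the prediction is (0.9*3 + 0.6*5 + 0.3*4) / (0.9 + 0.6 + 0.3), which is pulled toward the rating of the most similar item, i2.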

Weighted Sum Example
• Let's predict the value of item i1 for u4
[Ratings table, values out of 5; most cells were lost in extraction]
          i1  i2  i3  i4  Ave
  u1      (values missing)
  u2      1, 2 (columns unclear), Ave 1.5
  u3      4   2   4   2   3
  u4      (values missing)
  average (values missing)
[Predicted cells: Pu4,i1, Pu2,i2, Pu1,i4, Pu2,i…; similarity values for the item pairs (1,2), (1,3), (1,4), (2,3), (2,4), (3,4) are missing]

Item-to-item CF in Amazon.com
• We could build a product-to-product matrix by iterating through all item pairs and computing a similarity metric for each pair. However, many product pairs have no common customers, so this approach is inefficient in terms of processing time and memory usage
• A better approach is to calculate the similarity between a single product and all related products
  – [complexity expression missing] in the worst case
  – [complexity expression missing] in practice
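A minimal sketch of the "single product against all related products" idea: for each item, walk only through the customers who bought it and count the other items they bought, so pairs with no common customer are never touched. Cosine over binary purchase vectors is assumed as the similarity metric, and the names `build_similar_items` and `PURCHASES` are hypothetical.

```python
import math
from collections import defaultdict

def build_similar_items(purchases):
    """Build the similar-items table offline.

    purchases: customer -> set of purchased items.
    Returns item -> {related_item: cosine similarity}.
    """
    # Invert the data: item -> set of customers who bought it.
    buyers = defaultdict(set)
    for customer, items in purchases.items():
        for item in items:
            buyers[item].add(customer)

    table = {}
    for item1, customers in buyers.items():
        # Count co-purchases only for items that share a customer with item1.
        co_counts = defaultdict(int)
        for customer in customers:
            for item2 in purchases[customer]:
                if item2 != item1:
                    co_counts[item2] += 1
        # Cosine for binary vectors: co-purchases / sqrt(|buyers1| * |buyers2|).
        table[item1] = {
            item2: count / math.sqrt(len(customers) * len(buyers[item2]))
            for item2, count in co_counts.items()
        }
    return table

# Hypothetical purchase histories.
PURCHASES = {
    "c1": {"a", "b"},
    "c2": {"a", "b", "c"},
    "c3": {"c"},
    "c4": {"d"},
}
TABLE = build_similar_items(PURCHASES)
```

Item "d" shares no customer with any other item, so its entry is empty and no similarity is ever computed for it.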

Performance Implications
• Precompute item-item similarity scores
  – In a typical e-commerce scenario, the set of items is fairly static compared with the set of users, which changes much more often
  – Compute the all-to-all similarities offline, then perform a quick table look-up to retrieve the required similarity values
• Generating a prediction for a user u on item i
  – Retrieve the precomputed k most similar items for the target item i
  – Intersect those k items with the items purchased by the user u
  – Compute the prediction using the basic item-based CF algorithm
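The two phases can be sketched as an offline step that keeps only the k most similar items per item, and an online step that intersects those neighbors with the user's rated items and applies the weighted sum. The similarity table is taken as given here; `build_model`, `predict`, and the sample values are hypothetical.

```python
def build_model(similarity, k):
    """Offline: keep the k most similar items for every item.

    similarity: item -> {other_item: similarity score}.
    """
    return {
        item: dict(sorted(others.items(), key=lambda kv: kv[1], reverse=True)[:k])
        for item, others in similarity.items()
    }

def predict(model, user_ratings, target_item):
    """Online: table look-up, intersection, then weighted sum."""
    neighbors = model.get(target_item, {})
    # Intersect the k neighbors with the items this user has rated.
    rated = {j: s for j, s in neighbors.items() if j in user_ratings}
    denominator = sum(abs(s) for s in rated.values())
    if denominator == 0:
        return None  # no usable neighbor for this user
    return sum(s * user_ratings[j] for j, s in rated.items()) / denominator

# Hypothetical precomputed similarities.
SIMILARITY = {
    "i1": {"i2": 0.9, "i3": 0.6, "i4": 0.3},
    "i2": {"i1": 0.9, "i3": 0.4, "i4": 0.2},
}
MODEL = build_model(SIMILARITY, k=2)
rating = predict(MODEL, {"i3": 5, "i4": 4}, "i1")
```

With k=2 the model for i1 keeps i2 and i3 only. The user has rated i3 and i4, so the intersection is i3 alone and the prediction equals the user's rating of i3. The online cost depends on k, not on the number of users or items.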

Experimental Evaluation: Data Set
• Movie data from MovieLens
  – 943 users (among 43,000 users)
  – 1682 movies (among over 3,500 different movies)
  – 100,000 ratings (only users who had rated 20 or more movies were considered)
• The database was divided into a training set and a test set
  – x = 0.8 (80% of the data is used as the training set)
• Sparsity level: [value missing]
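The sparsity value is missing from the slide. Assuming the usual definition, 1 minus the fraction of filled cells in the user-item matrix, it can be recomputed from the counts above. The split helper is a hypothetical illustration of the x = 0.8 training/test ratio, not the authors' exact procedure.

```python
import random

n_users, n_movies, n_ratings = 943, 1682, 100_000

# Sparsity level = 1 - (nonzero entries / total entries).
sparsity = 1 - n_ratings / (n_users * n_movies)

def split_ratings(ratings, x=0.8, seed=0):
    """Shuffle the ratings and put a fraction x in the training set."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    shuffled = list(ratings)
    rng.shuffle(shuffled)
    cut = int(round(x * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

# Ten placeholder rating records stand in for the real data.
train, test = split_ratings(range(10), x=0.8)
```

Under this definition the sparsity comes out at roughly 0.937, meaning about 6.3% of the user-movie cells hold a rating.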

Experimental Evaluation: Evaluation Metrics
• Statistical accuracy metrics
  – Mean Absolute Error (MAE) is a measure of the deviation of recommendations from their true user-specified values
  – The lower the MAE, the more accurately the recommendation engine predicts user ratings
• Decision support accuracy metrics
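MAE is the average of |p - q| over all N prediction/rating pairs in the test set. A minimal sketch, with a hypothetical function name and made-up numbers:

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and true ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - q) for p, q in zip(predicted, actual)) / len(predicted)

# Hypothetical predictions and the ratings the users actually gave.
mae = mean_absolute_error([3.5, 4.0, 2.0], [4, 4, 1])
```

The three errors are 0.5, 0 and 1, so the MAE is 0.5 rating points.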

Experimental Results (1)
• Effect of Similarity Algorithms

Experimental Results (2)
• Sensitivity of Training/Test Ratio
• Experiments with neighborhood size

Experimental Results (3)
• Quality Experiments

Sensitivity of the Model Size
• High accuracy can be achieved using only a fraction of the items
• It is therefore useful to precompute the item similarities using only a fraction of the items; good prediction quality is still obtained
[Figure labels: 100%, 98.3%, 96%]

Impact of the model size on run-time and throughput

Contributions
• Analysis of the item-based prediction algorithms and identification of different ways to implement their subtasks
• Formulation of a precomputed model of item similarity to increase the online scalability of item-based recommendations
• An experimental comparison of the quality of several different item-based algorithms to the classic user-based (nearest neighbor) algorithms

Discussion & Conclusion
• Discussion
  – The item-item scheme provides better quality of predictions than the user-user scheme
  – The item neighborhood is fairly static, so it can be precomputed, which results in very high online performance
  – It is possible to retain only a small subset of items and still produce reasonably good prediction quality
• Conclusion
  – Item-based techniques allow CF-based algorithms to scale to large data sets and at the same time produce high-quality recommendations

My comments
• Lack of explanations about the recommendation process
• Does the calculated similarity really represent the similarity of items?
  – Lack of explanations about the range of similarity values
• Can't we precompute the similarity of users?

References
• Amazon.com Recommendations: Item-to-Item Collaborative Filtering
• Item-based Collaborative Filtering Recommendation Algorithms