Collaborative Filtering: Nearest Neighbor Approach. Jeff Howbert, Introduction to Machine Learning, Winter 2012.

Similar presentations
Recommender Systems & Collaborative Filtering

Prediction Modeling for Personalization & Recommender Systems. Bamshad Mobasher, DePaul University.
Differentially Private Recommendation Systems Jeremiah Blocki Fall A: Foundations of Security and Privacy.
Data Mining Classification: Alternative Techniques
RegionKNN: A Scalable Hybrid Collaborative Filtering Algorithm for Personalized Web Service Recommendation. Xi Chen, Xudong Liu, Zicheng Huang, and Hailong...
Oct 14, 2014 Lirong Xia Recommender systems acknowledgment: Li Zhang, UCSC.
COLLABORATIVE FILTERING Mustafa Cavdar Neslihan Bulut.
Lazy vs. Eager Learning Lazy vs. eager learning
Active Learning and Collaborative Filtering
Navneet Goyal. Instance Based Learning: Rote Classifier, K-nearest neighbors (K-NN), Case Based Reasoning (CBR)
Filtering and Recommender Systems Content-based and Collaborative Some of the slides based On Mooney’s Slides.
Collaborative Filtering in iCAMP Max Welling Professor of Computer Science & Statistics.
Rubi’s Motivation for CF: find a PhD problem, find a “real life” PhD problem, find an interesting PhD problem, make money!
Memory-Based Recommender Systems : A Comparative Study Aaron John Mani Srinivasan Ramani CSCI 572 PROJECT RECOMPARATOR.
Recommendations via Collaborative Filtering. Recommendations are relevant for movies, restaurants, hotels... Recommendation systems are a very hot topic in...
Agent Technology for e-Commerce
Data Mining Classification: Alternative Techniques
Recommender systems Ram Akella February 23, 2011 Lecture 6b, i290 & 280I University of California at Berkeley Silicon Valley Center/SC.
CES 514 – Data Mining Lec 9 April 14 Mid-term k nearest neighbor.
Introduction to Recommendation System. Presented by HongBo Deng, Nov 14, 2006. Refer to the PPT from Stanford: Anand Rajaraman, Jeffrey D. Ullman.
Recommender systems Ram Akella November 26 th 2008.
Pattern Recognition. Introduction. Definitions.. Recognition process. Recognition process relates input signal to the stored concepts about the object.
Jeff Howbert Introduction to Machine Learning Winter Machine Learning Feature Creation and Selection.
Collaborative Filtering Matrix Factorization Approach
Chapter 12 (Section 12.4) : Recommender Systems Second edition of the book, coming soon.
Item-based Collaborative Filtering Recommendation Algorithms
Performance of Recommender Algorithms on Top-N Recommendation Tasks
K Nearest Neighborhood (KNNs)
Recommendation system MOPSI project KAROL WAGA
Marketing and CS Philip Chan. Enticing you to buy a product 1. What is the content of the ad? 2. Where to advertise? TV, radio, newspaper, magazine, internet,
Google News Personalization: Scalable Online Collaborative Filtering
Collaborative Filtering: Introduction, Search or Content Based Method, User-Based Collaborative Filtering, Item-to-Item Collaborative Filtering, Using...
A Content-Based Approach to Collaborative Filtering Brandon Douthit-Wood CS 470 – Final Presentation.
Recommender Systems Debapriyo Majumdar Information Retrieval – Spring 2015 Indian Statistical Institute Kolkata Credits to Bing Liu (UIC) and Angshul Majumdar.
Cosine Similarity Item Based Predictions 77B Recommender Systems.
Pearson Correlation Coefficient 77B Recommender Systems.
Pairwise Preference Regression for Cold-start Recommendation Speaker: Yuanshuai Sun
kNN CF: A Temporal Social Network. Neal Lathia, Stephen Hailes, Licia Capra, University College London, RecSys '08. Advisor:
CS378 Final Project The Netflix Data Set Class Project Ideas and Guidelines.
Collaborative Filtering via Euclidean Embedding. M. Khoshneshin and W. Street. Proc. of ACM RecSys, 2010.
Personalization Services in CADAL. Zhang Yin, Zhuang Yuting, Wu Jiangqin. College of Computer Science, Zhejiang University. November 19, 2006.
User Modeling and Recommender Systems: recommendation algorithms
CS Machine Learning Instance Based Learning (Adapted from various sources)
Experimental Study on Item-based P-Tree Collaborative Filtering for Netflix Prize.
Eick: kNN. kNN: A Non-parametric Classification and Prediction Technique. Goals of this set of transparencies: 1. Introduce kNN, a popular non-parametric...
MovieMiner: A collaborative filtering system for predicting Netflix users' movie ratings [ECS289G Data Mining]. Team Spelunker: Justin Becker,
Item-Based Collaborative Filtering Recommendation Algorithms Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl GroupLens Research Group/ Army.
Recommendation Systems By: Bryan Powell, Neil Kumar, Manjap Singh.
Presented by: Madiha Saleem, Sunniya Rizvi. Collaborative filtering is a technique used by recommender systems to combine different users' opinions and...
Text Categorization: assigning documents to a fixed set of categories. Applications: Web pages, recommending pages, Yahoo-like classification hierarchies.
Recommendation Systems ARGEDOR. Introduction Sample Data Tools Cases.
Chapter 14 – Association Rules and Collaborative Filtering © Galit Shmueli and Peter Bruce 2016 Data Mining for Business Analytics (3rd ed.) Shmueli, Bruce.
Collaborative Filtering With Decoupled Models for Preferences and Ratings Rong Jin 1, Luo Si 1, ChengXiang Zhai 2 and Jamie Callan 1 Language Technology.
Recommender Systems 11/04/2017
Recommender Systems & Collaborative Filtering
Distance and Similarity Measures
CS728 The Collaboration Graph
WSRec: A Collaborative Filtering Based Web Service Recommender System
Classification Nearest Neighbor
Machine Learning With Python Sreejith.S Jaganadh.G.
Recommender Systems. Adopted from Bin, UIC.
Instance Based Learning (Adapted from various sources)
Collaborative Filtering Nearest Neighbor Approach
M.Sc. Project Doron Harlev Supervisor: Dr. Dana Ron
Classification Nearest Neighbor
Nearest-Neighbor Classifiers
Q4 : How does Netflix recommend movies?
Movie Recommendation System
Nearest Neighbors CSC 576: Data Mining.
Multivariate Methods Berlin Chen
Presentation transcript:

Collaborative Filtering: Nearest Neighbor Approach. Jeff Howbert, Introduction to Machine Learning, Winter 2012.

Bad news: the Netflix Prize data is no longer available to the public.
Just after the contest ended in July 2009:
– plans for a Netflix Prize 2 contest were announced
– the contest data was made available for further public research at the UC Irvine repository
But a few months later:
– Netflix was being sued for alleged privacy breaches connected with the contest data
– the FTC was investigating privacy concerns
By March 2010, Netflix had:
– settled the lawsuit privately
– withdrawn the contest data from public use
– cancelled Netflix Prize 2

Good news: an older movie rating dataset from GroupLens is still available, and it is perfectly suitable for the CSS 490 / 590 project.
– It consists of data collected through the MovieLens movie rating website.
– It comes in 3 sizes: MovieLens 100k, MovieLens 1M, and MovieLens 10M.

MovieLens 100k dataset properties
– 943 users
– 1,682 movies
– 100,000 ratings
– rating scale: 1 to 5
– the rating matrix is 6.3% occupied
– ratings per user: min = 20, mean = 106, max = 737
– ratings per movie: min = 1, mean = 59, max = 583
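As a sanity check, the numbers above can be recomputed directly from the raw file. A minimal sketch, assuming the standard MovieLens 100k layout in which u.data holds tab-separated (user id, movie id, rating, timestamp) rows; the path is a placeholder:

```python
import pandas as pd

# u.data holds one rating per line, tab-separated:
# user id, movie id, rating, timestamp
ratings = pd.read_csv("ml-100k/u.data", sep="\t",
                      names=["user", "movie", "rating", "timestamp"])

n_users = ratings["user"].nunique()     # expect 943
n_movies = ratings["movie"].nunique()   # expect 1682
n_ratings = len(ratings)                # expect 100,000

# fraction of the user x movie matrix that actually holds a rating
occupancy = n_ratings / (n_users * n_movies)
print(f"occupancy: {occupancy:.1%}")    # ~6.3%

per_user = ratings.groupby("user").size()
print("ratings per user: min =", per_user.min(),
      "mean =", round(per_user.mean()), "max =", per_user.max())
```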

Recommender system definition
DOMAIN: some field of activity where users buy, view, consume, or otherwise experience items.
PROCESS:
1. Users provide ratings on items they have experienced.
2. Take all the data and build a predictive model.
3. For a user who hasn't experienced a particular item, use the model to predict how well they will like it (i.e., predict their rating).
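To make the three steps concrete, here is a minimal sketch; the dict-of-dicts rating store and the trivial user-mean model are illustrative placeholders, not the nearest neighbor method these slides build toward:

```python
from collections import defaultdict

# Step 1: users provide ratings on items they have experienced
ratings = defaultdict(dict)               # ratings[user][item] = rating
ratings["alice"]["movie1"] = 4.0
ratings["alice"]["movie2"] = 2.0
ratings["bob"]["movie1"] = 5.0

# Step 2: take all the data and build a predictive model
# (here the simplest possible model: each user's mean rating)
user_mean = {u: sum(r.values()) / len(r) for u, r in ratings.items()}

# Step 3: for an item the user hasn't experienced, predict their rating
def predict(user, item):
    return user_mean[user]                # baseline: ignores the item entirely

print(predict("alice", "movie3"))         # 3.0
```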

Types of recommender systems
Predictions can be based on either:
– a content-based approach: explicit characteristics of users and items
– a collaborative filtering approach: implicit characteristics based on the similarity of a user's preferences to those of other users

Collaborative filtering algorithms
Common types:
– global effects
– nearest neighbor
– matrix factorization
– restricted Boltzmann machine
– clustering
– etc.

Nearest neighbor in action
[Figure: rating matrix comparing the active user's row to other users' rows; identical preferences get a strong weight, similar preferences a moderate weight.]

Nearest neighbor classifiers
Requires three inputs:
1. the set of stored samples
2. a distance metric to compute the distance between samples
3. the value of k, the number of nearest neighbors to retrieve

Nearest neighbor classifiers
To classify an unknown record:
1. compute its distance to the training records
2. identify the k nearest neighbors
3. use the class labels of the nearest neighbors to determine the class label of the unknown record (e.g., by taking a majority vote)
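A direct translation of these three steps, as a sketch with made-up data; Euclidean distance is used here, anticipating the next slide:

```python
import numpy as np
from collections import Counter

def knn_classify(x, X_train, y_train, k=3):
    # 1. compute the distance from x to every stored training record
    dists = np.linalg.norm(X_train - x, axis=1)      # Euclidean distance
    # 2. identify the k nearest neighbors
    nearest = np.argsort(dists)[:k]
    # 3. take a majority vote of the neighbors' class labels
    return Counter(y_train[nearest]).most_common(1)[0][0]

X_train = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.5, 4.5]])
y_train = np.array(["red", "red", "blue", "blue"])
print(knn_classify(np.array([1.1, 0.9]), X_train, y_train))  # "red"
```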

Nearest neighbor classification
– Compute the distance between two points; for example, Euclidean distance:
  $d(p, q) = \sqrt{\sum_i (p_i - q_i)^2}$
– Options for determining the class from the nearest neighbor list:
  – take a majority vote of the class labels among the k nearest neighbors
  – weight the votes according to distance, for example with weight factor $w = 1/d^2$
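The distance-weighted option replaces the plain vote with weights w = 1/d². A sketch extending the function above; the small epsilon guarding against division by zero on an exact match is an implementation detail assumed here, not from the slide:

```python
import numpy as np
from collections import defaultdict

def knn_classify_weighted(x, X_train, y_train, k=3, eps=1e-12):
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = defaultdict(float)
    for i in nearest:
        # weight each neighbor's vote by w = 1 / d^2
        votes[y_train[i]] += 1.0 / (dists[i] ** 2 + eps)
    return max(votes, key=votes.get)
```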

Nearest neighbor in collaborative filtering
For our implementation in Project 2:
– It is actually a regression, not a classification: the prediction is a weighted combination of neighbors' ratings (a real number).
– We consider all neighbors, not just the k nearest subset: since we are not ranking neighbors by distance, distance is no longer relevant.
– Instead of distance, we calculate similarities that determine the weighting of each neighbor's rating.

Nearest neighbor in action
[Figure: the same rating matrix as before; identical preferences get a strong weight, similar preferences a moderate weight.]
For this example, to predict user 2's rating of movie 10:
– find every user that has rated movie 10
– compute the similarity between user 2 and each of those users
– weight those users' ratings according to their similarities
– the predicted rating for user 2 is the sum of the other users' weighted ratings on movie 10
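In code, those steps look roughly as follows. The sketch assumes a dict-of-dicts rating store and a similarity function sim(a_ratings, b_ratings); the PCC version is sketched two slides below. Normalizing by the total absolute similarity is a common convention the slide leaves implicit:

```python
def predict_rating(ratings, sim, active_user, movie):
    # 1. find every user that has rated the movie
    raters = [u for u in ratings if u != active_user and movie in ratings[u]]
    # 2. compute the similarity between the active user and each rater
    weights = {u: sim(ratings[active_user], ratings[u]) for u in raters}
    # 3.-4. weight each rater's rating by similarity and sum, normalizing
    # by the total absolute similarity so the result stays on rating scale
    num = sum(w * ratings[u][movie] for u, w in weights.items())
    den = sum(abs(w) for w in weights.values())
    return num / den if den > 0 else None
```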

Measuring similarity of users
– For Project 2 we will use Pearson's correlation coefficient (PCC) as the measure of similarity between users.
– Pearson's correlation coefficient is the covariance normalized by the standard deviations of the two variables:
  $\mathrm{PCC}(x, y) = \dfrac{\mathrm{cov}(x, y)}{\sigma_x \, \sigma_y}$
– It always lies in the range $-1$ to $1$.

Measuring similarity of users
PCC similarity for two users a and b:

$\mathrm{sim}(a, b) = \dfrac{\sum_{j=1}^{n} (r_{a,j} - \bar{r}_a)(r_{b,j} - \bar{r}_b)}{\sqrt{\sum_{j=1}^{n} (r_{a,j} - \bar{r}_a)^2} \, \sqrt{\sum_{j=1}^{n} (r_{b,j} - \bar{r}_b)^2}}$

– both sums are taken over only those movies rated by both a and b (indexed by j)
– $r_{a,j}$ = rating by user a on movie j
– $\bar{r}_a$ = average rating over all movies rated by user a
– $n$ = number of movies rated by both a and b
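A direct implementation of this formula, as a sketch; it takes each user's ratings as a dict keyed by movie. Returning 0 when the users share no movies, or when a denominator term vanishes, is a convention assumed here, not stated on the slide:

```python
import math

def pcc(ra, rb):
    """PCC similarity between two users' rating dicts {movie: rating}."""
    common = set(ra) & set(rb)            # movies rated by both a and b
    if not common:
        return 0.0                        # assumed convention: no overlap
    mean_a = sum(ra.values()) / len(ra)   # average over ALL movies a rated
    mean_b = sum(rb.values()) / len(rb)
    num = sum((ra[j] - mean_a) * (rb[j] - mean_b) for j in common)
    den = (math.sqrt(sum((ra[j] - mean_a) ** 2 for j in common)) *
           math.sqrt(sum((rb[j] - mean_b) ** 2 for j in common)))
    return num / den if den != 0 else 0.0
```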

Measuring similarity of users
[Figure: the same rating matrix as before; identical preferences get a strong weight, similar preferences a moderate weight.]
Calculating PCC on a sparse matrix:
– Calculate each user's average rating using only those cells where a rating exists.
– Subtract the user's average rating only from those cells where a rating exists.
– Calculate and sum the user-user cross-products and the deviations from average only for those movies where a rating exists for both users.
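Putting the pieces together on a tiny sparse matrix, reusing the hypothetical pcc and predict_rating sketches from the previous slides: each user's mean uses only the cells they actually rated, and the PCC sums use only the co-rated cells, exactly as the bullets above specify.

```python
ratings = {
    "user1": {"m1": 5, "m2": 3, "m10": 4},
    "user2": {"m1": 4, "m2": 2},          # user2 has not rated m10
    "user3": {"m2": 5, "m10": 2},
}

# averages use all of each user's rated cells; cross-products and
# deviations are summed only over movies the pair co-rated
print(pcc(ratings["user2"], ratings["user1"]))       # 1.0 (over m1, m2)
print(predict_rating(ratings, pcc, "user2", "m10"))  # 1.0
```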