Rubi’s Motivation for CF
• Find a PhD problem
• Find a “real life” PhD problem
• Find an interesting PhD problem
• Make money!


Recommender Systems
Basic implementations:
• Most popular / cheap / etc.
• New items
• Can they go shopping together?

Live Demonstrations
• Amazon
• Netflix XBOX360 usage:

Netflix Example

Netflix Prize

Recommender Systems
• Personalized recommendations!!!
• Predict user ratings
• Provide recommendations
• Attempt to profile user preferences
• Model the interaction between users and products

Recommender Systems
Requirements:
• Provide good recommendations (duh)
• Justify the recommendation
• Feasible at run time

Strategies
• Content-Based
• Collaborative Filtering (CF)

Content-Based
• Actors: Will Smith, Martin…
• Genre: Action / Comedy
• Director: Michael Bay

Content-Based - VSM
• Domain of features
• Describing vector
Features shown: Will Smith, Michael Bay, Action, Comedy, Pamela Anderson

Comparing Two Vectors
• Calculate the angle between the vectors
• Easier to calculate the cosine of that angle
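The comparison can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's code: the function name, the 0/1 encoding, the feature order, and the three example movies are all assumptions made for the example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here: an all-zero vector is similar to nothing
    return dot / (norm_a * norm_b)

# Hypothetical feature order:
# [Will Smith, Michael Bay, Action, Comedy, Pamela Anderson]
movie_a = [1, 1, 1, 1, 0]
movie_b = [1, 0, 1, 0, 0]
movie_c = [0, 0, 0, 1, 1]

print(cosine_similarity(movie_a, movie_b))  # shares two features with movie_a
print(cosine_similarity(movie_b, movie_c))  # no shared features
```

A cosine close to 1 means the vectors point in nearly the same direction (a "near" movie); a cosine of 0 means the two movies share no features.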

VSM – “near” vectors
Features shown: Michael Bay, Action, Will Smith, Comedy

Content-Based - Disadvantages
• Static
• Can’t find “special” correlations
• Requires gathering external information

Collaborative Filtering
• Relies only on user behavior
• No profiles are required
• Analyzes the relationships between users and items

CF - Levels
• Neighborhood Based (local area)
• Factorization Based (regional area)

CF – Neighborhood Based

CF Algorithms

A little more formally
• Missing value estimation
• User-item matrix of scores
• Predict the unknown scores within the matrix

Scores?? According to:
• Purchases
• Ratings
• Browsing history
• ……

Formally..
• $M$ ($|M| = m$): users
• $N$ ($|N| = n$): items
• $R_{m \times n}$: matrix
• $r_{u,i}$: the rating of item $i$ by user $u$

More Problems
• Massive amount of data
• 99% of the matrix R is unknown (sparse matrix)
• Data is NOT uniform across users & items

Netflix Real-Life Data
• 17,700 movies
• 480,000 users
• (ratings on a scale of 1-5)
• Over 100,000,000 ratings!!
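A quick arithmetic check ties these counts to the sparsity figure quoted earlier. The sketch below assumes exactly 100,000,000 ratings (the slide says "over"), so the density it prints is a slight underestimate; the variable names are illustrative.

```python
movies = 17_700
users = 480_000
ratings = 100_000_000  # assumption: the slide only says "over 100,000,000"

cells = movies * users     # size of the full user-item matrix
density = ratings / cells  # fraction of cells that hold a known rating

print(f"{cells:,} cells")
print(f"{density:.2%} known, {1 - density:.2%} unknown")
```

Roughly one cell in a hundred is filled, which agrees with the "99% unknown" estimate.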

Netflix – How to Win??
• Quality is measured by RMSE (more emphasis on large errors)
• Predict 1,400,000 unknown ratings and compare them to the real ratings
• Improve on Netflix’s system (Cinematch) by 10%

Netflix – How to Win??
• RMSE
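RMSE (root mean squared error) is the square root of the average squared difference between predicted and real ratings. The sketch below is a generic implementation, not the contest's scoring code; the function name and sample numbers are made up for illustration.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

actual = [3, 3]
one_big_miss = [5, 3]    # errors of 2 and 0
two_small_misses = [4, 4]  # errors of 1 and 1

print(rmse(one_big_miss, actual))
print(rmse(two_small_misses, actual))
```

Both predictions are off by a total of 2 rating points, but the single large miss scores worse, which is the "more emphasis on large errors" property.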

Netflix – Leaderboard

Netflix – Statistics
• 51,051 contestants, 41,305 teams
• 186 countries
• 44,014 valid submissions from 5,169 different teams

OK, so what's the plan?
• Find a “good” neighborhood (p.s. what about YouTube's related videos?)
• Take a weighted average of the neighbors' ratings

More Specifically
User-Based:
• N(u;i) – set of users who rate similarly to u and actually rated i
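The slide's formula did not survive in the transcript. A common form of the user-based estimate is the similarity-weighted average of the neighbors' ratings, sketched below under stated assumptions: the function name and the dictionary layout are hypothetical, and all similarities are taken to be positive.

```python
def predict_user_based(ratings, similarity, u, i, neighbors):
    """Similarity-weighted average of the neighbors' ratings for item i.

    ratings:    dict user -> dict item -> rating (hypothetical layout)
    similarity: dict (u, v) -> similarity score, assumed positive here
    neighbors:  the set N(u;i); every member must have rated item i
    """
    numerator = 0.0
    denominator = 0.0
    for v in neighbors:
        s = similarity[(u, v)]
        numerator += s * ratings[v][i]
        denominator += s
    if denominator == 0:
        return None  # no usable neighbors, so no estimate
    return numerator / denominator

ratings = {"v1": {"i": 4}, "v2": {"i": 2}}
similarity = {("u", "v1"): 0.75, ("u", "v2"): 0.25}
print(predict_user_based(ratings, similarity, "u", "i", ["v1", "v2"]))
```

The more similar neighbor (v1) pulls the estimate toward its own rating.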

$S_{u,v}$
Key role! Used for:
• Selecting N(u;i)
• Weighting
Most popular implementations:
• Pearson correlation coefficient
• Cosine similarity

Pearson correlation coefficient
• I(u,v) – set of all items rated by both u and v
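The formula itself is missing from the transcript. The sketch below computes a Pearson correlation restricted to I(u,v). It assumes each user's mean is taken over the co-rated items only (other variants use the user's overall mean), and the function name and rating dictionaries are illustrative.

```python
import math

def pearson(ratings_u, ratings_v):
    """Pearson correlation of two users over their co-rated items I(u,v)."""
    common = set(ratings_u) & set(ratings_v)  # I(u,v)
    if len(common) < 2:
        return 0.0  # too little overlap to measure a correlation
    mean_u = sum(ratings_u[i] for i in common) / len(common)
    mean_v = sum(ratings_v[i] for i in common) / len(common)
    cov = sum((ratings_u[i] - mean_u) * (ratings_v[i] - mean_v) for i in common)
    var_u = sum((ratings_u[i] - mean_u) ** 2 for i in common)
    var_v = sum((ratings_v[i] - mean_v) ** 2 for i in common)
    if var_u == 0 or var_v == 0:
        return 0.0  # a user who gave identical ratings has no measurable trend
    return cov / math.sqrt(var_u * var_v)

u = {"a": 1, "b": 2, "c": 3}
v = {"a": 2, "b": 4, "c": 6, "d": 1}  # "d" is ignored: u never rated it
w = {"a": 3, "b": 2, "c": 1}

print(pearson(u, v))  # same taste, different scale
print(pearson(u, w))  # opposite taste
```

Because the means are subtracted first, a user who rates everything on a higher scale can still correlate perfectly with one who rates lower.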

N(u;i)
Most popular / easiest ways:
• Correlation threshold
• Best-n neighbors
• What about external data?
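The two selection rules can be sketched as follows. Both functions are hypothetical helpers that assume the similarities to the target user have already been computed for every candidate who rated the item.

```python
def neighbors_by_threshold(similarities, threshold):
    """Keep every candidate whose similarity reaches the threshold."""
    return [v for v, s in similarities.items() if s >= threshold]

def best_n_neighbors(similarities, n):
    """Keep the n candidates with the highest similarity."""
    ranked = sorted(similarities, key=similarities.get, reverse=True)
    return ranked[:n]

# Hypothetical similarities between a target user and three candidates.
similarities = {"a": 0.9, "b": 0.2, "c": 0.6}

print(neighbors_by_threshold(similarities, 0.5))
print(best_n_neighbors(similarities, 2))
```

A threshold can return an empty neighborhood for an unusual user, while best-n always returns up to n neighbors even if they are poor matches.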

Social Networks!

Social Networks, Hot Topics
• Facebook
• MySpace
• Delicious
• Flickr

Quick Summary
Two main parameters:
• How to choose the neighbors
• How to choose the weights

What about performance?
Netflix data:
• n = 17,700 (items)
• m = 480,000 (users)
• Calculating N(u;i) is expensive
• m >> n

Item-Based
• Instead of “user” neighbors, use “item” neighbors
• Estimate using the known ratings the user gave to similar items

More Specifically
Item-Based:
• N(i;u) – set of items that other users rate similarly to i; as before, every item in the set must also have been rated by u
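A minimal sketch of the item-based estimate, assuming the same weighted-average form as the user-based case. The function name, the dictionary layout, the best-n selection, and the choice to ignore non-positive similarities are all assumptions of this example.

```python
def predict_item_based(user_ratings, item_similarity, i, n=2):
    """Estimate the user's rating of item i from the user's own ratings.

    user_ratings:    dict item -> rating given by this user
    item_similarity: dict (i, j) -> similarity (hypothetical, precomputed)
    n:               how many of the most similar rated items form N(i;u)
    """
    candidates = [
        (item_similarity[(i, j)], rating)
        for j, rating in user_ratings.items()
        if j != i and item_similarity.get((i, j), 0) > 0
    ]
    # N(i;u): the n rated items most similar to i
    neighborhood = sorted(candidates, key=lambda pair: pair[0], reverse=True)[:n]
    total_weight = sum(s for s, _ in neighborhood)
    if total_weight == 0:
        return None  # the user rated nothing similar to i
    return sum(s * rating for s, rating in neighborhood) / total_weight

user_ratings = {"j1": 5, "j2": 3, "j3": 1}
item_similarity = {("i", "j1"): 0.8, ("i", "j2"): 0.2, ("i", "j3"): 0.1}
print(predict_item_based(user_ratings, item_similarity, "i"))
```

Only the user's own ratings enter the average, which is also what makes the result easy to justify ("because you liked j1").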

Reminder..
User-Based:
• N(u;i) – set of users who rate similarly to u and actually rated i

Why is it better?
• Similarities are between items (not users)
• Pre-compute all $S_{i,j}$
• Provides better recommendations?
• Easier justification
• Most industry systems use it (Amazon)
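To see why pre-computing item similarities is attractive, count the pairs involved using the Netflix sizes quoted earlier in the deck. This is only a back-of-the-envelope comparison; it says nothing about the cost of computing each individual similarity.

```python
import math

items = 17_700
users = 480_000

item_pairs = math.comb(items, 2)  # number of distinct item-item similarities
user_pairs = math.comb(users, 2)  # number of distinct user-user similarities

print(f"{item_pairs:,} item pairs")
print(f"{user_pairs:,} user pairs")
print(f"about {user_pairs // item_pairs}x more user pairs")
```

The item-item table is several hundred times smaller than the user-user table, so it is the more realistic one to compute offline and keep around.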

Checkpoint
• We know the basics
• Can we “tweak” the basic algorithm?

“Tweaks” - Normalized Data
• Some users give 3 and others give 5 to movies they liked
• Old solution: normalize the dataset
• New solution: predict the change from the average rating instead of the rating itself
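The "new solution" can be sketched as a mean-centered prediction: start from the target user's own average and add the weighted average of how far each neighbor's rating sits from that neighbor's average. The names and data layout below are hypothetical, and each user's average is assumed to be taken over all of that user's ratings.

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)

def predict_mean_centered(ratings, similarity, u, i, neighbors):
    """User u's average plus the neighbors' weighted deviation on item i."""
    mean_u = mean(ratings[u].values())
    numerator = 0.0
    denominator = 0.0
    for v in neighbors:
        s = similarity[(u, v)]
        deviation = ratings[v][i] - mean(ratings[v].values())
        numerator += s * deviation
        denominator += abs(s)
    if denominator == 0:
        return mean_u  # no neighbors: fall back to the user's own average
    return mean_u + numerator / denominator

ratings = {
    "u": {"a": 2, "b": 4},                  # strict rater, average 3
    "v": {"a": 4, "b": 4, "c": 4, "i": 5},  # generous rater, average 4.25
}
similarity = {("u", "v"): 1.0}
print(predict_mean_centered(ratings, similarity, "u", "i", ["v"]))
```

The neighbor's 5 is read as "0.75 above that user's usual rating", so the strict user's estimate becomes 3.75 rather than 5.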

“Tweaks” - Remove Global Effects
• A user rates 5 all the time
• A user rated 10,000 movies
• Remove old ratings?
• Using the time variable is not a “tweak”..

TAU’s Current Research
• Distributed CF!!!
• “Server” level

Distributed CF


Shared Users

Shared Items

How To Do It????
Copy all data to one server?
• CF algorithms do not scale linearly
• Privacy
• Bandwidth

TAU’s Solution
• Join TAU’s DB group for more info