Tom Rampley Recommendation Engines: an Introduction.

Similar presentations
Recommender System A Brief Survey.

Recommender Systems & Collaborative Filtering
Google News Personalization: Scalable Online Collaborative Filtering
Item Based Collaborative Filtering Recommendation Algorithms
Naïve Bayes. Bayesian Reasoning Bayesian reasoning provides a probabilistic approach to inference. It is based on the assumption that the quantities of.
By Srishti Gahlot (sg2856) 1. 2 What do you mean by online behavior? Why do we need to analyze online behavior and personalize it? How do we analyze this.
Dimensionality Reduction PCA -- SVD
Oct 14, 2014 Lirong Xia Recommender systems acknowledgment: Li Zhang, UCSC.
COLLABORATIVE FILTERING Mustafa Cavdar Neslihan Bulut.
Distance and Similarity Measures
Recommendation Engines & Accumulo - Sqrrl Data Science Group May 21, 2013.
Sean Blong Presents: 1. What are they…?  “[…] specific type of information filtering (IF) technique that attempts to present information items (movies,
What is Statistical Modeling
Planning under Uncertainty
2. Introduction Multiple Multiplicative Factor Model For Collaborative Filtering Benjamin Marlin University of Toronto. Department of Computer Science.
Recommender Systems Aalap Kohojkar Yang Liu Zhan Shi March 31, 2008.
1 Chapter 12 Probabilistic Reasoning and Bayesian Belief Networks.
1 Understanding more about Consumers. 2 Recall the law of demand was a statement that the price of a product and the quantity demanded of the product.
Recommender systems Ram Akella February 23, 2011 Lecture 6b, i290 & 280I University of California at Berkeley Silicon Valley Center/SC.
Recommender systems Ram Akella November 26 th 2008.
Introduction to Data Mining Data mining is a rapidly growing field of business analytics focused on better understanding of characteristics and.
Chapter 12 (Section 12.4) : Recommender Systems Second edition of the book, coming soon.
Item-based Collaborative Filtering Recommendation Algorithms
Distributed Networks & Systems Lab. Introduction Collaborative filtering Characteristics and challenges Memory-based CF Model-based CF Hybrid CF Recent.
1 Information Filtering & Recommender Systems (Lecture for CS410 Text Info Systems) ChengXiang Zhai Department of Computer Science University of Illinois,
Privacy risks of collaborative filtering Yuval Madar, June 2012 Based on a paper by J.A. Calandrino, A. Kilzer, A. Narayanan, E. W. Felten & V. Shmatikov.
Bayesian networks Classification, segmentation, time series prediction and more. Website: Twitter:
EMIS 8381 – Spring Netflix and Your Next Movie Night Nonlinear Programming Ron Andrews EMIS 8381.
Filtering and Recommendation INST 734 Module 9 Doug Oard.
Topic Modelling: Beyond Bag of Words By Hanna M. Wallach ICML 2006 Presented by Eric Wang, April 25 th 2008.
Google News Personalization: Scalable Online Collaborative Filtering
Toward the Next generation of Recommender systems
ECE 8443 – Pattern Recognition LECTURE 07: MAXIMUM LIKELIHOOD AND BAYESIAN ESTIMATION Objectives: Class-Conditional Density The Multivariate Case General.
1 Social Networks and Collaborative Filtering Qiang Yang HKUST Thanks: Sonny Chee.
EigenRank: A Ranking-Oriented Approach to Collaborative Filtering IDS Lab. Seminar Spring 2009 Kang Min Seok May 21st, 2009 Nathan.
Collaborative Filtering  Introduction  Search or Content based Method  User-Based Collaborative Filtering  Item-to-Item Collaborative Filtering  Using.
Badrul M. Sarwar, George Karypis, Joseph A. Konstan, and John T. Riedl
Computational Intelligence: Methods and Applications Lecture 23 Logistic discrimination and support vectors Włodzisław Duch Dept. of Informatics, UMK Google:
1 Chapter 12 Probabilistic Reasoning and Bayesian Belief Networks.
EigenRank: A ranking oriented approach to collaborative filtering By Nathan N. Liu and Qiang Yang Presented by Zachary 1.
Recommender Systems Debapriyo Majumdar Information Retrieval – Spring 2015 Indian Statistical Institute Kolkata Credits to Bing Liu (UIC) and Angshul Majumdar.
Recommender Systems. Recommender Systems (RSs) n RSs are software tools providing suggestions for items to be of use to users, such as what items to buy,
Collaborative Filtering Zaffar Ahmed
Pairwise Preference Regression for Cold-start Recommendation Speaker: Yuanshuai Sun
Google News Personalization Big Data reading group November 12, 2007 Presented by Babu Pillai.
KNN & Naïve Bayes Hongning Wang Today’s lecture Instance-based classifiers – k nearest neighbors – Non-parametric learning algorithm Model-based.
Collaborative Filtering via Euclidean Embedding M. Khoshneshin and W. Street Proc. of ACM RecSys, pp , 2010.
Personalization Services in CADAL Zhang yin Zhuang Yuting Wu Jiangqin College of Computer Science, Zhejiang University November 19,2006.
Matrix Factorization & Singular Value Decomposition Bamshad Mobasher DePaul University Bamshad Mobasher DePaul University.
Item-Based Collaborative Filtering Recommendation Algorithms Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl GroupLens Research Group/ Army.
Recommendation Systems By: Bryan Powell, Neil Kumar, Manjap Singh.
Naïve Bayes Classifier April 25 th, Classification Methods (1) Manual classification Used by Yahoo!, Looksmart, about.com, ODP Very accurate when.
SUPERVISED AND UNSUPERVISED LEARNING Presentation by Ege Saygıner CENG 784.
KNN & Naïve Bayes Hongning Wang
Collaborative Filtering - Pooja Hegde. The Problem : OVERLOAD Too much stuff!!!! Too many books! Too many journals! Too many movies! Too much content!
Item-Based Collaborative Filtering Recommendation Algorithms
Lecture-6 Bscshelp.com. Todays Lecture  Which Kinds of Applications Are Targeted?  Business intelligence  Search engines.
Announcements Paper presentation Project meet with me ASAP
Matrix Factorization and Collaborative Filtering
Data Mining: Concepts and Techniques
Recommender Systems & Collaborative Filtering
Item-to-Item Recommender Network Optimization
Recommender Systems 01 , Content Based , Collaborative Filtering
Machine Learning Basics
Applications of IScore (using R)
Adopted from Bin UIC Recommender Systems Adopted from Bin UIC.
Recommender Systems.
Collaborative Filtering Non-negative Matrix Factorization
Recommender Systems: Collaborative & Content-based Filtering Features
Recommendation Systems
Presentation transcript:

Tom Rampley Recommendation Engines: an Introduction

A Brief History of Recommendation Engines

1992: Recommenders are older than you might think. GroupLens becomes the first widely used recommendation engine.

2000: Amazon joins the party. The introduction and vast success of the Amazon recommendation engine in the early 2000s led to wide acceptance of the technology as a way of increasing sales.

Today: Recommenders become core products. In addition to Amazon, companies like Pandora, Stitch Fix, and Google (because what is a search engine other than a document recommender?) make recommendations a core value add of their services.

What Does a Recommender Do?

Recommendation engines use algorithms of varying complexity to suggest items based upon historical information, such as:

- Item ratings or content
- Past user behavior/purchase history

Recommenders typically use some form of collaborative filtering.

Collaborative Filtering

The name: 'collaborative' because the algorithm takes the choices of many users into account to make a recommendation, relying on similarity of user tastes; 'filtering' because you use the preferences of other users to filter out the items most likely to be of interest to the current user.

Collaborative filtering algorithms include:

- K nearest neighbors
- Cosine similarity
- Pearson correlation
- Bayesian belief nets
- Markov decision processes
- Latent semantic indexing methods
- Association rule learning

Cosine Similarity Example

Let's walk through an example of a simple collaborative filtering algorithm, namely cosine similarity. Cosine similarity can be used to find similar items or similar individuals; in this case, we'll be trying to identify individuals with similar taste.

Imagine individual ratings on a set of items as a [user, item] matrix. You can then treat the ratings of each individual as an N-dimensional vector of ratings on items: $\{r_1, r_2, \dots, r_N\}$. The similarity of two such vectors (individuals' ratings) can be computed as the cosine of the angle between them:

$$\text{sim}(\mathbf{a}, \mathbf{b}) = \frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{a}\|\,\|\mathbf{b}\|}$$

The closer the cosine is to 1, the more alike the two individuals' ratings are.
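To make this concrete, here is a minimal sketch in Python of the formula above (my own illustration, not from the slides):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two rating vectors: a.b / (|a| |b|)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return a.dot(b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Two users' ratings on the same five items (illustrative numbers)
print(cosine_similarity([5, 5, 3, 1, 0], [4, 5, 3, 2, 1]))  # near 1 => similar taste
```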

Cosine Similarity Example Continued

Let's say we have a matrix of users and their ratings of TV shows, in which Bob, Mary, Jim, George, Jennifer, Natalie, and Robin have each rated True Blood, CSI, JAG, Star Trek, Castle, The Wire, and Twin Peaks (the numeric ratings were shown in a table on the slide). And we encounter a new user, James, who has only seen and rated 5 of these 7 shows:

             True Blood   CSI   JAG   Star Trek   Castle
    James        5         5     3        1          0

Of the two remaining shows, which one should we recommend to James?

Cosine Similarity Example Continued

To find out, we'll see who James is most similar to among the folks who have rated all the shows, by calculating the cosine similarity between James and each individual over the 5 shows they have in common:

    Cosine similarity with James:
    Bob        0.73
    Mary       0.89
    Jim        0.47
    George     0.69
    Jennifer   0.78
    Natalie    0.50
    Robin      0.79

It seems that Mary is the closest to James in terms of show ratings among the group. Of the two remaining shows, The Wire and Twin Peaks, Mary slightly preferred Twin Peaks, so that is what we recommend to James.
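End to end, the procedure looks like the sketch below. The seven users' full ratings did not survive in this transcript, so the numbers here are invented stand-ins (chosen so that Mary comes out most similar, matching the slide's result):

```python
import numpy as np

shows = ["True Blood", "CSI", "JAG", "Star Trek", "Castle", "The Wire", "Twin Peaks"]

# Invented stand-in ratings -- NOT the slide's actual numbers
ratings = {
    "Bob":      [4, 5, 1, 1, 1, 4, 2],
    "Mary":     [5, 4, 3, 1, 1, 3, 4],
    "Jim":      [1, 2, 5, 5, 4, 2, 1],
    "George":   [4, 4, 3, 2, 2, 3, 3],
    "Jennifer": [5, 5, 2, 2, 1, 2, 3],
    "Natalie":  [2, 1, 4, 5, 5, 4, 1],
    "Robin":    [5, 4, 4, 2, 1, 3, 3],
}

james = np.array([5, 5, 3, 1, 0], dtype=float)  # ratings on the first five shows

def cosine(a, b):
    return a.dot(b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Similarity with James over the five shows he has rated (columns 0-4)
sims = {u: cosine(np.array(r[:5], dtype=float), james) for u, r in ratings.items()}
most_similar = max(sims, key=sims.get)

# Recommend whichever of the two unseen shows (columns 5-6) that user liked more
unseen = ratings[most_similar][5:]
print(most_similar, "->", shows[5 + int(np.argmax(unseen))])  # Mary -> Twin Peaks
```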

Collaborative Filtering Continued

This simple cosine similarity example can be extended to extremely large datasets with hundreds or thousands of dimensions. You can also compute item-to-item similarity by treating the items as the vectors for which you're computing similarity and the users as the dimensions. This allows for recommending similar items to a user after they've made a purchase; Amazon uses a variant of this algorithm. This is an example of item-to-item collaborative filtering.
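In matrix terms, item-to-item similarity is the same cosine computation applied to the transposed ratings matrix. A minimal sketch (the matrix values are illustrative):

```python
import numpy as np

# Rows = users, columns = items (illustrative values)
R = np.array([[5, 3, 2, 1],
              [4, 1, 1, 1],
              [1, 1, 4, 5],
              [1, 2, 3, 4]], dtype=float)

# Transpose so each row is an item's vector over users, then normalize rows
item_vecs = R.T
unit = item_vecs / np.linalg.norm(item_vecs, axis=1, keepdims=True)

# Pairwise cosine similarities between all items
item_sims = unit @ unit.T
print(np.round(item_sims, 2))
```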

Adding ROI to the Equation: an Example with Naïve Bayes

When recommending products, some products may generate more margin for the firm than others. Some algorithms can take this into account when making recommendations: Naïve Bayes is a commonly used classifier that allows the marginal value of a product sale to be included in the recommendation decision.

Naïve Bayes

Bayes' theorem tells us the probability of our beliefs being true given prior beliefs and evidence. Naïve Bayes is a classifier that utilizes Bayes' theorem (with simplifying assumptions) to generate the probability of an instance belonging to a class. Class likelihood can then be combined with expected payoff to find the recommendation with the optimal payoff.
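For reference, Bayes' theorem in its standard form (not spelled out on the slide); the denominator $P(\alpha)$ is the "scaling constant" that gets dropped later:

$$P(\theta \mid \alpha) = \frac{P(\alpha \mid \theta)\,P(\theta)}{P(\alpha)}$$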

Naïve Bayes Continued

How does the NB algorithm generate class probabilities, and how can we use the algorithmic output to maximize expected payoff? Let's say we want to figure out which of two products to recommend to a customer:

- Each product generates a different amount of profit for our firm per unit sold
- We know the target customer's past purchasing behavior, and we know the past purchasing behavior of twelve other customers who have each bought one of the two potential recommendation products

Let's represent our knowledge as a series of matrices and vectors.

Naïve Bayes Continued (this slide showed the purchase-history matrices and vectors as an image; the recoverable figures appear in the probability table below)

NB uses (independent) probabilities of events to generate class probabilities. Using Bayes' theorem (and ignoring the scaling constant), the probability of a customer with past purchase history $\alpha$ (a vector of past purchases) buying item $\theta_j$ is:

$$P(\alpha_1, \dots, \alpha_i \mid \theta_j)\,P(\theta_j)$$

where $P(\theta_j)$ is the frequency with which the item appears in the training data, and $P(\alpha_1, \dots, \alpha_i \mid \theta_j) = \prod_i P(\alpha_i \mid \theta_j)$ over all $i$ items in the training data. That $P(\alpha_1, \dots, \alpha_i \mid \theta_j)\,P(\theta_j) = \prod_i P(\alpha_i \mid \theta_j)\,P(\theta_j)$ depends upon the assumption of conditional independence between past purchases.
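A minimal sketch of this scoring rule (the priors and the Squirt Gun/Life conditionals are the figures recoverable from the table on the next slide; the truncated columns are omitted):

```python
# Class priors P(theta): six of the twelve training customers bought each boat
priors = {"sailboat": 6 / 12, "speedboat": 6 / 12}

# Conditionals P(alpha_i | theta) recoverable from the slide's table
conditionals = {
    "sailboat":  {"Squirt Gun": 3 / 12, "Life": 2 / 12},
    "speedboat": {"Squirt Gun": 1 / 12, "Life": 2 / 12},
}

def nb_score(history, theta):
    """Unnormalized P(alpha_1, ..., alpha_i | theta) * P(theta),
    assuming conditional independence of past purchases."""
    score = priors[theta]
    for item in history:
        score *= conditionals[theta][item]
    return score

history = ["Squirt Gun", "Life"]
for theta in priors:
    print(theta, nb_score(history, theta))
```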

Naïve Bayes Continued

In our example, we can calculate the following probabilities (several cells were cut off in the transcript and are marked "…"):

                 P(θ)   Toys         Games   Candy      Books          Boat
    Eric          —     Squirt Gun   Life    Snickers   Harry Potter   ?
    Sailboat     6/12   3/12         2/12    3/…        …              —
    Speedboat    6/12   1/12         2/12    1/…        …              —

Naïve Bayes Continued

Now that we can calculate $P(\alpha_1, \dots, \alpha_i \mid \theta_j)\,P(\theta_j)$ for all instances, let's figure out the most likely boat purchase for Eric: the products work out to roughly 0.01566 for the sailboat and 0.00087 for the speedboat. These probabilities may seem very low, but recall that we left out the scaling constant in Bayes' theorem, since we're only interested in the relative probabilities of the two outcomes.

Naïve Bayes Continued

So it seems like the sailboat is a slam dunk to recommend: it's much more likely (18 times!) for Eric to buy than the speedboat. But let's consider a scenario: say our hypothetical firm generates $20 of profit whenever a customer buys a speedboat, but only $1 when they buy a sailboat (outboard motors are apparently very high margin). In that case, it would make more sense to recommend the speedboat, because our expected payoff from the speedboat recommendation ($20 × 0.00087) would be about 11% greater than our expected payoff from the sailboat recommendation ($1 × 0.01566). This logic can be applied to any number of products, by multiplying the set of purchase probabilities by the set of purchase payoffs and taking the maximum value as the recommended item.
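As a sketch of that last step (the probabilities and margins are the slide's figures; the code itself is mine):

```python
# Unnormalized purchase probabilities from the Naive Bayes step
probs = {"sailboat": 0.01566, "speedboat": 0.00087}

# Profit per unit sold (the slide's hypothetical margins)
payoffs = {"sailboat": 1.0, "speedboat": 20.0}

# Expected payoff = probability * profit; recommend the argmax
expected = {item: probs[item] * payoffs[item] for item in probs}
print(expected)                         # speedboat 0.0174 vs sailboat 0.01566
print(max(expected, key=expected.get))  # 'speedboat'
```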

Challenges

While recommendation algorithms are in many cases relatively simple as machine learning goes, there are a few difficult problems that all recommenders must deal with:

- Cold start problem: how do you make recommendations to someone for whom you have very little or no data?
- Data sparsity: with millions of items for sale, most customers have bought very few individual items
- Grey and black sheep problem: some people have very idiosyncratic taste, and making recommendations to them is extremely difficult because they don't behave like other customers

Dealing With Cold Start

Cold start is typically only a problem in the very early stages of a user-system interaction. Requiring new users to create a profile can mitigate the problem to a certain extent, by making early recommendations contingent upon supplied personal data. A recommender system can also start out using item-item recommendations based upon the first items a user buys, and gradually switch over to a person-person system as it learns the user's taste.

Dealing With Data Sparsity

Data sparsity can be dealt with primarily by two methods:

- Data imputation
- Latent factor methods

Data imputation typically uses an algorithm like cosine similarity to impute the rating of an item based upon the ratings of similar users, as in the sketch below. Latent factor methods typically use some sort of matrix decomposition to reduce the rank of the large, sparse matrix while simultaneously adding ratings for unrated items based upon latent factors.
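A minimal sketch of similarity-weighted imputation (the data, the function name, and the convention that 0 means "unrated" are all my assumptions):

```python
import numpy as np

def impute_rating(R, user, item):
    """Estimate R[user, item] as a similarity-weighted average of other
    users' ratings of that item; 0 denotes an unrated cell."""
    target = R[user]
    num = den = 0.0
    for u in range(R.shape[0]):
        if u == user or R[u, item] == 0:
            continue
        mask = (target > 0) & (R[u] > 0)  # items both users have rated
        if not mask.any():
            continue
        a, b = target[mask], R[u][mask]
        sim = a.dot(b) / (np.linalg.norm(a) * np.linalg.norm(b))
        num += sim * R[u, item]
        den += sim
    return num / den if den else 0.0

R = np.array([[5, 4, 0, 1],
              [4, 5, 1, 1],
              [1, 1, 5, 4]], dtype=float)
print(impute_rating(R, user=0, item=2))  # estimate for the unrated cell
```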

Dealing With Data Sparsity

Techniques like principal components analysis and singular value decomposition allow for the creation of low-rank approximations to sparse matrices with relatively little loss of information.
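A sketch of that idea with a truncated SVD (the matrix and the choice of rank k = 2 are illustrative):

```python
import numpy as np

# Ratings matrix: rows = users, columns = items, 0 = unrated (illustrative)
R = np.array([[5, 4, 0, 1, 0],
              [4, 5, 1, 0, 1],
              [1, 0, 5, 4, 5],
              [0, 1, 4, 5, 4]], dtype=float)

# Keep only the top-k singular values for a rank-k approximation
U, s, Vt = np.linalg.svd(R, full_matrices=False)
k = 2
R_hat = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# R_hat now holds estimates even where R had 0s (unrated items)
print(np.round(R_hat, 2))
```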

Dealing With Sheep of Varying Darkness

To a large extent, these cases are unavoidable. Post-purchase feedback on recommended items, as well as the purchase rate of recommended items, can be used to learn even very idiosyncratic preferences, but this takes longer than it does for a normal user. Grey and black sheep are doubly troublesome because their odd tendencies can also weaken your engine's ability to make recommendations to the broad population of white sheep.
