A Bayesian approach to recommender systems

Arnoldo Frigessi

3 5 1 2 4

PREFERENCE LEARNING
[Figure: assessors ranking items (2 4 1 3 ?).] Consensus?

[Figure: assessors ranking items (2 4 1 3).] Consensus!

[Figure: ranking 2 4 1 3.] Consensus with uncertainty!

Predicting preference with uncertainty
[Figure: new customer, ranks 1 2.]

Different sub-cultures
[Figure: new customer, ranks 1 2.]

The data-generating mechanism
Stochastic models that can be thought of as having produced the data.

Bayesian Mallows model
Highly incomplete data:
- Ratings
- Pair comparisons
- Inconsistencies
- Time varying items
- Click-through data
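
For reference, the Mallows density behind the model can be written as below. This is a sketch in the standard form of the model: the slide's own formula did not survive in the transcript, so the scaling of α by the number of items n and the symbols d (distance) and Z_n (normalizing constant) are assumptions.

```latex
P(R \mid \alpha, \rho)
  = \frac{1}{Z_n(\alpha)} \exp\!\left\{ -\frac{\alpha}{n}\, d(R, \rho) \right\},
\qquad
Z_n(\alpha) = \sum_{r \in \mathcal{P}_n} \exp\!\left\{ -\frac{\alpha}{n}\, d(r, \rho) \right\}
```

Here R is an observed ranking, ρ the consensus ranking, α a scale parameter, and P_n the set of all permutations of n items. For a right-invariant distance, Z_n(α) does not depend on ρ.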

The Kendall distance measures the minimum number of adjacent pairwise swaps that convert R into ρ. Computing the normalizing constant of the Mallows model with distance measures other than Kendall's is NP-complete.
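
As a small illustration of the Kendall distance, the sketch below counts discordant item pairs, which equals the minimum number of adjacent swaps. It assumes rankings are stored as rank vectors (entry i is the rank given to item i); the function name is hypothetical.

```python
from itertools import combinations


def kendall_distance(r, rho):
    """Count item pairs that the rank vectors r and rho order differently."""
    if len(r) != len(rho):
        raise ValueError("rank vectors must have the same length")
    return sum(
        1
        for i, j in combinations(range(len(r)), 2)
        # A pair is discordant when the two rankings disagree on its order.
        if (r[i] - r[j]) * (rho[i] - rho[j]) < 0
    )
```

For example, `kendall_distance([1, 2, 3], [2, 1, 3])` is 1, because a single adjacent swap turns one ranking into the other.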

Distance Measures and Normalizing Constants
Definition: A right invariant distance is unaffected by a relabelling of the items.
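
The definition can be checked numerically. The sketch below uses the footrule distance (sum of absolute rank differences) as one example of a right-invariant distance and verifies that relabelling the items in both rankings leaves it unchanged. The rank-vector representation and the helper names are assumptions made for illustration.

```python
import random


def footrule(r, rho):
    """Footrule distance: sum of absolute rank differences per item."""
    return sum(abs(a - b) for a, b in zip(r, rho))


def relabel(r, perm):
    """Rank vector after relabelling: new item k is old item perm[k]."""
    return [r[p] for p in perm]


rng = random.Random(0)
n = 6
r = rng.sample(range(1, n + 1), n)      # a random ranking of n items
rho = rng.sample(range(1, n + 1), n)    # another random ranking
perm = rng.sample(range(n), n)          # a random relabelling of the items

# Right invariance: the distance is the same before and after relabelling.
invariant = footrule(r, rho) == footrule(relabel(r, perm), relabel(rho, perm))
```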

NP-hard; more complicated for α

Posterior distribution
Sampling from the posterior by Markov chain Monte Carlo:
- Iterative algorithm
- Given α, update ρ
- Update α: propose and accept/reject
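
The alternating scheme can be sketched as a small Metropolis-Hastings sampler. This is not the authors' implementation. It assumes the Kendall distance (so that the normalizing constant has a closed form), a uniform prior on ρ, an exponential prior on α with hypothetical rate `lam`, a simple swap proposal for ρ, and a log-normal random walk for α.

```python
import math
import random
from itertools import combinations


def kendall(r, rho):
    return sum(1 for i, j in combinations(range(len(r)), 2)
               if (r[i] - r[j]) * (rho[i] - rho[j]) < 0)


def log_z(alpha, n):
    """Closed-form log normalizing constant for the Kendall distance."""
    theta = alpha / n
    return sum(math.log(math.expm1(-j * theta) / math.expm1(-theta))
               for j in range(1, n + 1))


def sample_posterior(data, n_iter=2000, sigma=0.3, lam=0.1, seed=0):
    """Return a list of (rho, alpha) draws; data is a list of rank vectors."""
    rng = random.Random(seed)
    n, N = len(data[0]), len(data)
    rho = list(range(1, n + 1))
    rng.shuffle(rho)                      # random starting consensus
    alpha = 1.0
    d = sum(kendall(r, rho) for r in data)
    samples = []
    for _ in range(n_iter):
        # Update rho given alpha: swap the ranks of two items (symmetric move).
        i, j = rng.sample(range(n), 2)
        prop = rho[:]
        prop[i], prop[j] = prop[j], prop[i]
        d_prop = sum(kendall(r, prop) for r in data)
        if rng.random() < math.exp(min(0.0, -alpha / n * (d_prop - d))):
            rho, d = prop, d_prop
        # Update alpha given rho: log-normal random walk, accept/reject.
        alpha_prop = alpha * math.exp(sigma * rng.gauss(0, 1))
        log_ratio = (-(alpha_prop - alpha) / n * d
                     - N * (log_z(alpha_prop, n) - log_z(alpha, n))
                     - lam * (alpha_prop - alpha)      # exponential prior
                     + math.log(alpha_prop / alpha))   # proposal correction
        if rng.random() < math.exp(min(0.0, log_ratio)):
            alpha = alpha_prop
        samples.append((tuple(rho), alpha))
    return samples
```

The normalizing constant cancels in the ρ step because α is held fixed there; it only enters the α step.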

Grid of α values: the partition function can be approximated off-line.
- Small n: exact formulas
- Medium n: importance sampling
- Large n: asymptotic approximation
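
To illustrate the off-line computation on a grid, the sketch below tabulates the log partition function for the Kendall distance, for which a closed form is known, and compares it with brute-force enumeration for a small n. The grid values and function names are assumptions; other distances need the approximations listed above.

```python
import math
from itertools import combinations, permutations


def kendall(r, rho):
    return sum(1 for i, j in combinations(range(len(r)), 2)
               if (r[i] - r[j]) * (rho[i] - rho[j]) < 0)


def log_z_kendall(alpha, n):
    """Closed form: Z_n(alpha) = prod_j (1 - q^j) / (1 - q), q = exp(-alpha/n)."""
    theta = alpha / n
    return sum(math.log(math.expm1(-j * theta) / math.expm1(-theta))
               for j in range(1, n + 1))


def log_z_brute_force(alpha, n, distance):
    """Sum exp(-alpha/n * d) over all n! rankings; feasible only for small n."""
    identity = tuple(range(1, n + 1))
    total = sum(math.exp(-alpha / n * distance(r, identity))
                for r in permutations(identity))
    return math.log(total)


# Precompute the table once, off-line, on a hypothetical grid of alpha values.
alpha_grid = [0.5 * k for k in range(1, 21)]
log_z_table = {a: log_z_kendall(a, n=10) for a in alpha_grid}
```

During sampling, the table can then be looked up or interpolated instead of recomputing the sum at every iteration.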

Only a subset of the items is ranked.
Ranks can be missing at random, or the assessors may have ranked only, say, their top-5 items. This can be handled in the Bayesian framework by applying data augmentation techniques: the missing ranks are estimated consistently with the partial observations.
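
A minimal sketch of the augmentation step: the observed ranks are kept fixed and the unused ranks are assigned uniformly at random to the unranked items. This shows only how a compatible full ranking can be proposed; in a sampler the proposal would still go through an accept/reject step. The dictionary representation and function name are assumptions.

```python
import random


def augment_ranking(observed, n, rng=random):
    """Complete a partial ranking.

    observed maps item index -> observed rank; items without an entry
    receive one of the unused ranks, drawn uniformly at random.
    """
    used = set(observed.values())
    free_ranks = [rank for rank in range(1, n + 1) if rank not in used]
    rng.shuffle(free_ranks)
    unranked_items = [item for item in range(n) if item not in observed]
    full = [0] * n
    for item, rank in observed.items():
        full[item] = rank                  # keep what the assessor told us
    for item, rank in zip(unranked_items, free_ranks):
        full[item] = rank                  # fill in the missing ranks
    return full
```

For a top-k assessor the free ranks are exactly k+1, ..., n, so every proposal agrees with the observed top-k list.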

be the augmented rankings

Clustering users
The assessors are not one homogeneous group, but C groups.
We use a mixture of Mallows models to cluster the N users according to their preferences. We estimate a latent ranking of the items for each cluster of assessors. Latent assignment variables assign each assessor to one of the C clusters.
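
Under the mixture, the likelihood of the rankings given the cluster assignments can be written as below. This is a sketch in standard notation; the symbols z_j (cluster label of assessor j), ρ_c and α_c (consensus ranking and scale of cluster c), d and Z_n are assumptions, since the slide's formula is not in the transcript.

```latex
P\bigl(R_1, \dots, R_N \mid \{\rho_c, \alpha_c\}_{c=1}^{C}, z_1, \dots, z_N\bigr)
  = \prod_{j=1}^{N} \frac{1}{Z_n(\alpha_{z_j})}
    \exp\!\left\{ -\frac{\alpha_{z_j}}{n}\, d\bigl(R_j, \rho_{z_j}\bigr) \right\}
```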

N = 5000 people (assessors) were interviewed, each giving his/her complete ranking of n = 10 sushi variants (items): ebi (shrimp), anago (sea eel), maguro (tuna), ika (squid), uni (sea urchin), sake (salmon roe), tamago (egg), toro (fatty tuna), tekka-maki (tuna roll), kappa-maki (cucumber roll).

[Figure: within-cluster distance of each ranking to the cluster centroid.] Elbow rule

MCMC: we need to propose augmented ranks that obey the partial ordering constraints given by the assessor.
Assume coherent pair comparisons.
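
The ordering constraints can be sketched as follows: the stated pair comparisons are extended by transitivity, checked for coherence (no cycles), and a proposed ranking is accepted as a candidate only if it respects every implied preference. The pair representation `(a, b)`, meaning "a is preferred to b", and the function names are assumptions.

```python
def transitive_closure(prefs):
    """Add every preference implied by transitivity: a>b and b>c give a>c."""
    closure = set(prefs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def is_coherent(prefs):
    """Coherent means no cycle, i.e. no item ends up preferred to itself."""
    return all(a != b for a, b in transitive_closure(prefs))


def respects(ranks, prefs):
    """True if the rank vector gives a lower (better) rank to each preferred item."""
    return all(ranks[a] < ranks[b] for a, b in transitive_closure(prefs))
```

With `prefs = {(0, 1), (1, 2)}`, the ranking `[1, 2, 3]` respects the constraints, while `[3, 1, 2]` does not because it violates the implied preference of item 0 over item 2.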

25 random pairs of images: "Which of the two beaches would you prefer to go to in your next vacation?" N = 60 users.

Which of these two movies do you prefer?

- n = 200 most rated movies, N = 6004 users who rated at least 3 movies
- Each user compared an average of 30.2 movies
- Converted the ratings from a 1-5 scale to pairwise preferences
- Asymptotic approximation of the partition function as in Mukherjee (2013)
- Run for C = 1, …, 20
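
The conversion from ratings to pairwise preferences can be sketched as below. The handling of ties is an assumption (equal ratings produce no preference), as the transcript does not say how they were treated; the function name is hypothetical.

```python
from itertools import combinations


def ratings_to_preferences(ratings):
    """Turn one user's ratings into a set of (preferred, less_preferred) pairs.

    ratings maps movie -> rating on the 1-5 scale. Tied movies yield no pair
    (an assumption made for this sketch).
    """
    prefs = set()
    for a, b in combinations(ratings, 2):
        if ratings[a] > ratings[b]:
            prefs.add((a, b))
        elif ratings[b] > ratings[a]:
            prefs.add((b, a))
    return prefs
```

For a user who rated movie A with 5 and movies B and C with 3, this gives the two preferences A over B and A over C.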

Prediction: leave-one-out test data
For each assessor, we discarded one of the rated movies at random. Then we randomly selected one of the movies ranked by the assessor to create a pairwise comparison involving the discarded movie. This comparison was not used for inference.

Compute, for all assessors, the posterior predictive probabilities
Estimate the posterior probabilities for correctly predicting the discarded comparison, for all assessors.
median = 88% of the probabilities were > 0.5
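
One way to estimate such a predictive probability from MCMC output is sketched below: it is the fraction of posterior draws of an assessor's augmented ranking in which one item is ranked above the other. The input format (a list of rank vectors, one per posterior draw) and the function name are assumptions.

```python
def prob_prefers(augmented_samples, a, b):
    """Monte Carlo estimate of P(item a is ranked above item b) for one assessor."""
    if not augmented_samples:
        raise ValueError("need at least one posterior sample")
    # A lower rank number means the item is more preferred.
    hits = sum(1 for ranks in augmented_samples if ranks[a] < ranks[b])
    return hits / len(augmented_samples)
```

The held-out comparison counts as correctly predicted when the estimated probability for the assessor's actual choice exceeds 0.5.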

LOOKING for master students! frigessi@medisin.uio.no
Valeria Vitelli, Øystein Sørensen, Marta Crispino, Elja Arjas
arXiv:
Sylvia Qinghua Liu
