Presentation on theme: "Online Recommendations"— Presentation transcript:
1 Online Recommendations: The UV Decomposition Algorithm
2 Motivation: It is now common to get “personal recommendations” when we visit a website: news articles, product recommendations, advertisements. Why? Unlike paper newspapers or brick-and-mortar stores, a website has no limit (in terms of space or inventory) on what can be shown or sold. Long-tail effect: a large part of the income comes from the tail [example in search revenue].
3 The Netflix challenge: Netflix is an (online) US company from which people can rent movies. Netflix would like to recommend movies to users. Netflix challenge (2006): a one-million-dollar prize for whoever could beat their movie recommendation algorithm by 10%. After three years the prize was awarded. We will discuss one of the “main ideas” behind the winning entry. [We will follow the discussion from “Mining of Massive Data Sets”.]
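4 The Data: Data can be arranged in the form of a ratings matrix with one row per user (U1, U2, …) and one column per movie (M1 M2 M3 M4 M5 M6 M7). [Ratings table: only the fragment “U1 3 2” is recoverable from the slide.] Users have given ratings (between 1 and 5) to movies from a database of 7 movies. What should be recommended to U2?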
5 First ideas: Do missing entries mean a rating of “0”? How about a simple dot product? Other ideas? Clustering? Association rules?
6 Another basic idea: Let user “u” rate movie “m” as follows: take the average of two numbers: (1) the average of all of user “u”'s ratings, and (2) the average of all ratings given to movie “m” by the users who rated “m”. This was only 3% worse than the Netflix algorithm (called CineMatch).
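The averaging rule on this slide can be sketched in a few lines. This is only an illustration under assumptions: the function name `baseline_prediction`, the use of `np.nan` for missing ratings, and the toy matrix are all hypothetical and not taken from the slides.

```python
import numpy as np

def baseline_prediction(ratings, u, m):
    """Average of user u's mean rating and movie m's mean rating.

    `ratings` is a users-by-movies array; missing entries are np.nan
    (an assumption of this sketch) and are ignored by np.nanmean.
    """
    user_mean = np.nanmean(ratings[u, :])   # average of user u's ratings
    movie_mean = np.nanmean(ratings[:, m])  # average rating given to movie m
    return (user_mean + movie_mean) / 2.0

# Hypothetical toy data, not the matrix from the slides.
R = np.array([
    [5.0, np.nan, 3.0],
    [4.0, 2.0, np.nan],
    [np.nan, 1.0, 4.0],
])

# User 0 has not rated movie 1; predict that rating.
pred = baseline_prediction(R, 0, 1)
print(pred)
```

Here user 0's average is 4.0 and movie 1's average is 1.5, so the predicted rating is their mean.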
7 Latent Modeling: A big breakthrough is the idea of “latent variable modeling”. The data we observe is the result of another variable or set of variables which are “latent” (not observable). The latent variables generate the observed data. In our case, the latent variables control the generation of the ratings. So the challenge is to “infer” the latent variables from the observed data. Clustering; Bayes' Theorem.
8 Cognitive Science/AI: Scientists working in AI/Cognitive Science have drawn the following analogy: Mind ↔ Computer; Mental Representation ↔ Programs/Theories; Thinking ↔ Computational Process/Algorithms. Practical outcome: infer latent structures.
9 The Key Idea: Decompose the User x Movie ratings matrix into: (User x Movie) = (User x Genre) x (Genre x Movie). The number of genres is typically small. Or R ≈ UV. Find U and V such that ||R – UV|| is minimized. Almost like k-means clustering… why?
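One way to look for such U and V is plain gradient descent on the squared error over the observed ratings. The sketch below is not necessarily the procedure used in the lecture or in the winning Netflix entry; the function name `uv_decompose`, the hyperparameters (`k`, `steps`, `lr`), and the toy matrix are assumptions made for illustration.

```python
import numpy as np

def observed_sq_error(R, U, V):
    """Sum of squared errors of U @ V over the observed (non-NaN) entries of R."""
    mask = ~np.isnan(R)
    return float(np.sum((R - U @ V)[mask] ** 2))

def uv_decompose(R, k=2, steps=2000, lr=0.01, seed=0):
    """Approximate R (users x movies, np.nan = missing) by U @ V.

    U is users x k and V is k x movies, with k playing the role of the
    small number of latent "genres". Only observed entries contribute.
    """
    rng = np.random.default_rng(seed)
    n_users, n_movies = R.shape
    mask = ~np.isnan(R)
    R0 = np.where(mask, R, 0.0)              # placeholder 0 is masked out below
    U = rng.uniform(0.5, 1.5, size=(n_users, k))
    V = rng.uniform(0.5, 1.5, size=(k, n_movies))
    for _ in range(steps):
        E = (R0 - U @ V) * mask              # residual on observed entries only
        U_step = E @ V.T                     # negative gradient w.r.t. U
        V_step = U.T @ E                     # negative gradient w.r.t. V
        U += lr * U_step
        V += lr * V_step
    return U, V

# Hypothetical toy ratings, not the matrix from the slides.
R = np.array([
    [5.0, 4.0, np.nan, 1.0],
    [4.0, np.nan, 1.0, 1.0],
    [1.0, 1.0, np.nan, 5.0],
    [np.nan, 1.0, 5.0, 4.0],
])

U0, V0 = uv_decompose(R, steps=0)            # random starting point
U, V = uv_decompose(R)                       # after gradient descent
print(observed_sq_error(R, U0, V0), observed_sq_error(R, U, V))
print(np.round(U @ V, 2))                    # filled-in matrix, including missing cells
```

The product `U @ V` has a value in every cell, so the entries that were missing in `R` can be read off as predicted ratings.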
10 Example of UV Decomposition: The criterion used to select U and V is the Root Mean Square Error (RMSE).
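As a rough illustration of the RMSE criterion, the sketch below computes it over the observed entries only, which is how it is usually applied to a ratings matrix with missing values. The function name `rmse`, the `np.nan` convention for missing ratings, and the small matrices are hypothetical.

```python
import numpy as np

def rmse(R, U, V):
    """Root mean square error between R and U @ V on observed entries.

    Missing ratings in R are np.nan (an assumption of this sketch)
    and are excluded from the mean.
    """
    mask = ~np.isnan(R)
    diff = (R - U @ V)[mask]                 # errors on observed cells only
    return float(np.sqrt(np.mean(diff ** 2)))

# Hypothetical 2 x 2 example with one missing rating and one latent factor.
R = np.array([
    [5.0, np.nan],
    [3.0, 1.0],
])
U = np.array([[1.0], [1.0]])                 # users x 1
V = np.array([[4.0, 2.0]])                   # 1 x movies

# U @ V = [[4, 2], [4, 2]]; observed errors are 1, -1, -1.
score = rmse(R, U, V)
print(score)
```

Each of the three observed errors has squared value 1, so the mean squared error is 1 and the RMSE is 1.0.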