
1 Ranking: Compare, Don’t Score. Ammar Ammar, Devavrat Shah (LIDS – MIT). Poster (no preprint), WIDS 2011

2 Introduction

- The need to rank items based on user input arises in elections, betting, and recommendation systems. E.g. Netflix, the movie-streaming company (accounting for about 30% of U.S. web traffic): the problem of recommending movies to users based on partial historical information about their preferences.
- Two main approaches:
  - Scores: ask users to provide a score/rating for each product, and use the scores to rank the products. (A popular approach.)
  - Comparisons: ask users to compare two, or more, products at a time, and use the comparisons to rank the products. (A natural alternative.)

3 Introduction

- Scores
  - Advantage: easy aggregation.
  - Disadvantage: scores are arbitrary/relative (e.g. each user's rating scale is different).
- Comparisons
  - Advantage: absolute information (a comparison carries the same meaning across users).
  - Disadvantage: hard aggregation.

4 Mathematical Model

- n products, N = {1, ..., n}.
- Each customer is associated with a permutation σ of the elements of N.
- σ(i) < σ(j) means the customer prefers product i to product j.
- E.g. N = {1, 2, 3, 4, 5}: a customer with the permutation σ = (2, 4, 1, 3, 5) has the preference ranking 3 > 1 > 4 > 2 > 5 (item 3 is liked most, item 5 least).
- Their model of customer choice is a distribution μ : S_n → [0, 1] over the set of possible permutations, S_n.
- Observed data is limited to the pairwise-comparison marginals of μ: w_ij = P[σ(i) < σ(j)] (the fraction of users who prefer item i to item j); a toy computation is sketched below.
- Goal: find an estimate μ̂ that is consistent with the data.
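
A minimal sketch (not from the poster) of how the marginals w_ij could be estimated from a sample of customer permutations; positions are 0-indexed inside the code, and the two-customer sample is made up for illustration:

```python
from itertools import combinations

def pairwise_marginals(sample, n):
    """w[i][j] = fraction of customers with sigma(i) < sigma(j),
    i.e. the fraction who prefer item i to item j."""
    w = [[0.0] * n for _ in range(n)]
    for sigma in sample:
        for i, j in combinations(range(n), 2):
            if sigma[i] < sigma[j]:
                w[i][j] += 1.0
            else:
                w[j][i] += 1.0
    for i in range(n):
        for j in range(n):
            w[i][j] /= len(sample)
    return w

# The slide's example, 0-indexed: sigma = (2,4,1,3,5) becomes (1,3,0,2,4),
# i.e. the preference ranking 3 > 1 > 4 > 2 > 5. The second customer
# simply ranks the items in order.
sample = [(1, 3, 0, 2, 4), (0, 1, 2, 3, 4)]
w = pairwise_marginals(sample, 5)
print(w[2][0])  # 0.5: one of the two customers prefers item 3 to item 1
```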

5 Maximum Entropy

- Multiple distributions are consistent with the data constraints.
- The principle of maximum entropy guides the choice: subject to the known constraints (called "testable information"), the probability distribution which best represents the current state of knowledge is the one with the largest entropy.
- Here: maximize −∑_σ μ(σ) log μ(σ) subject to ∑_σ μ(σ) 1{σ(i) < σ(j)} = w_ij for all pairs i ≠ j.
- The solution has the parametric form μ_λ(σ) ∝ exp( ∑_{i≠j} λ_ij 1{σ(i) < σ(j)} ); a sketch of this form in code follows.
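
A minimal sketch of this parametric family, assuming the standard exponential-family form written above (the poster itself gives no code); brute-force normalization over S_n is only feasible for tiny n:

```python
import math
from itertools import permutations

def weight(sigma, lam):
    """Unnormalized max-entropy weight:
    exp( sum over i != j of lam[i][j] * 1{sigma(i) < sigma(j)} )."""
    n = len(sigma)
    s = sum(lam[i][j] for i in range(n) for j in range(n)
            if i != j and sigma[i] < sigma[j])
    return math.exp(s)

def mu(lam, n):
    """Normalized distribution over S_n by enumeration (tiny n only)."""
    perms = list(permutations(range(n)))
    ws = [weight(p, lam) for p in perms]
    Z = sum(ws)  # partition function
    return {p: v / Z for p, v in zip(perms, ws)}
```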

6 Contributions

- Developed a consistent algorithm for estimating the parameters of the maximum-entropy distribution.
- The algorithm is distributed and iterative.
- Provided a randomized 2-approximation scheme for the mode of the distribution.
- Developed two ranking schemes that use the maximum-entropy distribution to obtain a ranking that places emphasis on the top elements:
  - Top-k ranking: uses the likelihood of an item appearing in the top k.
  - θ-ranking: uses a tilted average of an item's possible positions.

7 Algorithm Sketch

- The maximum-entropy distribution is fully characterized by the parameters λ_ij.
- To estimate these parameters from the data w_ij:
  - Initialize the parameters to λ_ij = 1.
  - For t = 1, 2, ...: set λ_ij^(t+1) = λ_ij^(t) + (1/t) · ( w_ij − E_{λ^(t)}[ 1{σ(i) < σ(j)} ] ).
- Exact computation of E_{λ^(t)}[ 1{σ(i) < σ(j)} ] is hard.
- Use MCMC or BP to obtain an approximation (a brute-force toy version is sketched below).
- The parameters can be estimated "separately" in a distributed manner.
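
A toy version of the update, reusing weight() and mu() from the previous sketch; the expectation is computed exactly by enumerating S_n, which stands in for the MCMC/BP approximation the poster actually proposes (step size 1/t as on the slide; the iteration count T is an arbitrary choice):

```python
def expected_marginals(lam, n):
    """e[i][j] = E_lambda[ 1{sigma(i) < sigma(j)} ], computed exactly."""
    dist = mu(lam, n)  # from the previous sketch
    e = [[0.0] * n for _ in range(n)]
    for sigma, p in dist.items():
        for i in range(n):
            for j in range(n):
                if i != j and sigma[i] < sigma[j]:
                    e[i][j] += p
    return e

def fit(w, n, T=200):
    """Iterative estimate of lambda from the observed marginals w."""
    lam = [[1.0] * n for _ in range(n)]  # initialize lambda_ij = 1
    for t in range(1, T + 1):
        e = expected_marginals(lam, n)
        for i in range(n):
            for j in range(n):
                if i != j:
                    # lambda_ij^(t+1) = lambda_ij^(t) + (1/t)(w_ij - E[...])
                    lam[i][j] += (1.0 / t) * (w[i][j] - e[i][j])
    return lam
```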

8 Mode

- Mode: σ* = argmax_σ μ̂(σ).
- Exact computation of the mode is hard.
- A randomized 2-approximation (sketched in code below):
  1. Generate k permutations, σ_1, ..., σ_k, uniformly at random.
  2. Select the permutation σ_i with the largest weight μ̂(σ_i).
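
A direct transcription of the two steps into code, reusing weight() from the earlier sketch (the unnormalized weight suffices, since the normalization constant is shared by all permutations); the value of k required for the approximation guarantee is not stated on the slide, so k is left as a free parameter:

```python
import random

def approx_mode(lam, n, k=1000):
    """Randomized mode estimate: best of k permutations drawn u.a.r."""
    best, best_w = None, -1.0
    items = list(range(n))
    for _ in range(k):
        sigma = items[:]
        random.shuffle(sigma)  # a uniformly random permutation
        w_sigma = weight(tuple(sigma), lam)
        if w_sigma > best_w:
            best, best_w = tuple(sigma), w_sigma
    return best
```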

9 Top-k Ranking

- A robust ranking of the top k items using one of the following two schemes (both sketched in code below):
  1. Top-k ranking:
     - Compute S_k(i) = P_λ[ σ(i) ≤ k ].
     - Rank products using S_k and choose the top k.
  2. θ-ranking:
     - Compute S_θ(i) = ∑_j e^(−θj) · P_λ[ σ(i) = j ].
     - Rank products using S_θ and choose the top k.
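
A brute-force sketch of both scores, reusing mu() from the earlier sketch; positions are reported 1-indexed to match the slides, and the exact position marginals are again only feasible for tiny n:

```python
import math

def position_marginals(lam, n):
    """q[i][r] = P_lambda[ sigma(i) = r ], positions r = 1..n."""
    dist = mu(lam, n)
    q = [[0.0] * (n + 1) for _ in range(n)]
    for sigma, p in dist.items():
        for i in range(n):
            q[i][sigma[i] + 1] += p  # sigma is 0-indexed internally
    return q

def top_k_scores(lam, n, k):
    """S_k(i) = P_lambda[ sigma(i) <= k ]."""
    q = position_marginals(lam, n)
    return [sum(q[i][1:k + 1]) for i in range(n)]

def theta_scores(lam, n, theta):
    """S_theta(i) = sum_j exp(-theta * j) * P_lambda[ sigma(i) = j ]."""
    q = position_marginals(lam, n)
    return [sum(math.exp(-theta * r) * q[i][r] for r in range(1, n + 1))
            for i in range(n)]
```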

