Q4 : How does Netflix recommend movies?

Presentation on theme: "Q4 : How does Netflix recommend movies?" - Presentation transcript:

1 Q4 : How does Netflix recommend movies?
Networked Life: 20 Questions and Answers (M. Chiang, Princeton University) Q4 : How does Netflix recommend movies? Prof. Hongseok Kim

2 Netflix
Business 1: DVD. Rental business started in 1997. Customers just wait for DVDs to arrive by mail and cannot receive a new DVD without returning the old one (sliding window). Business 2: Online streaming. Streaming movies and TV programs. Up to 23 million subscribers by April 2011.

3 Examples
Amazon: content-based filtering. YouTube: co-visitation counts. Pandora: experts + thumbs up or down. Netflix: collaborative filtering.

4 Input
User ID: $u$. Movie ID: $i$. Rating: $r_{ui} \in \{1, 2, 3, 4, 5\}$. Timing: date of rating, $t_{ui}$.

5 Output
Predicted rating. Example: a predicted rating of 4.2 can be read as "the user will rate 4 stars with 80% probability and 5 stars with 20% probability" (0.8 x 4 + 0.2 x 5 = 4.2).

6 Metric
Candidate metrics: customer satisfaction, prediction effectiveness, prediction error. Prediction error: RMSE, Hamming distance. Hard to gather data. C: the set of (u,i) pairs. The smaller the RMSE, the better the recommendation system.
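As a minimal sketch of the RMSE metric, assuming the standard definition $\sqrt{\frac{1}{|C|}\sum_{(u,i) \in C} (r_{ui} - \hat{r}_{ui})^2}$ over the known (u,i) pairs in C. The function name `rmse` and the toy ratings are hypothetical, chosen only for illustration.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error over the known (u, i) pairs in C."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Hypothetical ratings for four (u, i) pairs and their predictions.
ratings = [5, 3, 4, 1]
predictions = [4, 3, 5, 2]
print(rmse(ratings, predictions))
```

Because the errors are squared before averaging, one large miss costs more than several small ones.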

7 The Netflix Prize
Objective: could recommendation accuracy be improved by 10% in RMSE over what Netflix was using? October 2006: open, online, international competition. 10% over Cinematch? $1M and 100 million data points. Data: 1999 to 2006 (7 years), 480,000 users, 17,770 movies. Skewed, sparse data.

8

9

10 Data Sets
Similar statistical properties. Can be used by each competing team as often as they want. At most once a day. Final decision is based on comparison of RMSE on the test set.

11 Timeline 5,000 teams 44,000 submissions

12

13

14 The problem Unknown ratings to be predicted (Only Netflix knows)

15 Challenges and solutions
Challenge: large and sparse data. Two main types of techniques for recommendation:
Content-based filter (Amazon): only looks at each row in isolation and attaches labels to the columns. If you like a comedy with X, you will probably like another comedy with X.
Collaborative filter (Netflix): exploits all the data in the entire table. Neighborhood method: compute a similarity score to find similar movies and users. Latent factor method: look for hidden, low-dimensional structures.

16 A few detours
Least squares: linear regressions. Convex optimization: generalizes linear programs. Implicit feedback: which movies she browsed, which ones she watched, and which ones she bothered to rate at all are all helpful hints. Temporal dynamics: time-dependent parameters allow the model to capture changes in a person's taste and in trends of the movie market, as well as the mood of the day.

17

18

19 Parameterized models

20 Baseline predictor
Average predictor, baseline predictor, RMSE. Average predictor: $\bar{r} = \frac{\sum_{(u,i)} r_{ui}}{|C|}$, where $C$ is the set of known $(u,i)$ pairs.

21 RMSE minimization Condition: one user (user 1) and two movies (A, B)

22 Least squares Differentiate with respect to B

23 Solution
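As a sketch of the standard least-squares argument behind slides 21 to 23, assuming the baseline model $\hat{r}_{ui} = \bar{r} + b_u + b_i$ with user bias $b_u$ and movie bias $b_i$. This notation, and the matrix $A$ and vector $c$, are assumptions for illustration and are not recovered from the slides.

```latex
% One user (user 1) and two movies (A, B), with \tilde{r}_{ui} = r_{ui} - \bar{r}:
\min_{b_1, b_A, b_B} \; (b_1 + b_A - \tilde{r}_{1A})^2 + (b_1 + b_B - \tilde{r}_{1B})^2

% Differentiating with respect to b_B and setting the derivative to zero:
2\,(b_1 + b_B - \tilde{r}_{1B}) = 0

% General form: stack one row per known rating into A and the targets into c.
\min_{b} \; \|Ab - c\|_2^2
\quad\Longrightarrow\quad
2A^{T}(Ab - c) = 0
\quad\Longrightarrow\quad
A^{T}A\,b = A^{T}c
```

Each row of $A$ has a 1 in the column of the rating's user and a 1 in the column of its movie, so $Ab$ collects the sums $b_u + b_i$.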

24 Regularization
Overfitting: least squares solutions often suffer from the overfitting problem. The model fits the known data in the training set so well that it loses the flexibility to adjust to a new data set. Regularization: a standard technique to avoid overfitting. Minimize the weight of the parameters. Objective: original least squares term + trade-off parameter x penalty.
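A minimal sketch of regularized least squares for a bias-only baseline, assuming the objective $\|Ab - c\|^2 + \lambda \|b\|^2$ with trade-off parameter $\lambda$ and a squared-norm penalty. The function `fit_baseline`, the choice of penalty, and the toy data are hypothetical and not taken from the slides.

```python
import numpy as np

def fit_baseline(pairs, ratings, n_users, n_movies, lam=1.0):
    """Fit user and movie biases by regularized least squares.

    pairs: list of (u, i) index pairs with known ratings.
    lam: trade-off parameter weighting the penalty ||b||^2.
    """
    ratings = np.asarray(ratings, dtype=float)
    mean = ratings.mean()
    # One row per known rating, one column per bias (users first, then movies).
    A = np.zeros((len(pairs), n_users + n_movies))
    for row, (u, i) in enumerate(pairs):
        A[row, u] = 1.0
        A[row, n_users + i] = 1.0
    c = ratings - mean
    # Minimizing ||A b - c||^2 + lam * ||b||^2 gives (A^T A + lam I) b = A^T c.
    b = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ c)
    return mean, b[:n_users], b[n_users:]

# Hypothetical toy data: 2 users, 2 movies, 3 known ratings.
pairs = [(0, 0), (0, 1), (1, 0)]
ratings = [5, 3, 4]
mean, b_user, b_movie = fit_baseline(pairs, ratings, n_users=2, n_movies=2)
# Predict the one missing rating (user 1, movie 1).
pred = mean + b_user[1] + b_movie[1]
```

A larger `lam` shrinks the biases toward zero, trading training fit for robustness on new data. With `lam > 0` the linear system always has a unique solution, even when the ratings table is sparse.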

25 After baseline predictor
Error matrix = actual rating matrix - prediction matrix

26 Convex optimization
Minimize a convex objective function subject to a convex constraint set. Least squares is a special case of convex optimization. Easy in theory and in practice.

27 Convex set Which is a convex set? [Figure: candidate sets, including panels (c), (d), (e)]

28 Convex set Definition. Most important property: two convex sets that do not intersect can be separated by a line.
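For reference, the standard textbook definition of a convex set can be written as follows. The symbols $S$ and $\theta$ are assumed here and may differ from the slide's own notation.

```latex
% A set S is convex if the line segment between any two of its points stays in S:
x, y \in S, \;\; \theta \in [0, 1]
\quad\Longrightarrow\quad
\theta x + (1 - \theta)\, y \in S
```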

29 Convex function Which is a convex function?

30 Convex function
Second derivative test: for a function $f(x_1, x_2, \ldots, x_n)$, the Hessian matrix has entries $(\nabla^2 f)_{ij} = \frac{\partial^2 f}{\partial x_i \, \partial x_j}$. $f$ is convex when all eigenvalues of the Hessian matrix are non-negative, i.e. the Hessian is positive semidefinite (PSD).
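A small sketch of the second derivative test, assuming the Hessian is constant and given as a symmetric matrix. The helper `is_psd` and both example functions are hypothetical illustrations.

```python
import numpy as np

def is_psd(hessian, tol=1e-10):
    """Return True if a symmetric matrix has no eigenvalue below -tol."""
    hessian = np.asarray(hessian, dtype=float)
    eigenvalues = np.linalg.eigvalsh(hessian)  # eigenvalues of a symmetric matrix
    return bool(np.all(eigenvalues >= -tol))

# f(x1, x2) = x1^2 + x1*x2 + x2^2 has the constant Hessian [[2, 1], [1, 2]].
convex_hessian = [[2.0, 1.0], [1.0, 2.0]]
# g(x1, x2) = x1^2 - x2^2 has the constant Hessian [[2, 0], [0, -2]].
saddle_hessian = [[2.0, 0.0], [0.0, -2.0]]

print(is_psd(convex_hessian), is_psd(saddle_hessian))
```

The same check applies to least squares: the Hessian of $\|Ab - c\|^2$ is $2A^T A$, which is always positive semidefinite.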

31 Neighborhood method
From local to global structure. Pairwise statistical correlation. User-user: two similar people. Movie-movie: two similar movies.

32

33

34 Similarity metric Cosine coefficient
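A minimal sketch of the cosine coefficient between two movies, assuming it is computed on a users x movies array of rating errors (rating minus baseline prediction) and only over users who have entries for both movies. The function name, the NaN convention for unknown entries, and the toy matrix are assumptions for illustration.

```python
import numpy as np

def cosine_coefficient(errors, i, j):
    """Cosine coefficient between the columns of movies i and j.

    errors: users x movies array, with np.nan where the rating is unknown.
    Only users with entries for both movies contribute.
    """
    col_i, col_j = errors[:, i], errors[:, j]
    both = ~np.isnan(col_i) & ~np.isnan(col_j)
    x, y = col_i[both], col_j[both]
    denom = np.sqrt(np.sum(x ** 2) * np.sum(y ** 2))
    return float(x @ y / denom) if denom > 0 else 0.0

# Hypothetical error matrix: 3 users, 3 movies.
errors = np.array([
    [1.0, 2.0, np.nan],
    [-1.0, -2.0, 0.5],
    [np.nan, 1.0, -0.5],
])
d01 = cosine_coefficient(errors, 0, 1)
d12 = cosine_coefficient(errors, 1, 2)
```

The coefficient lies in [-1, 1]. A value near 1 means users who liked one movie more than predicted tended to like the other more than predicted too.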

35 Neighborhood

36 Neighborhood predictor
Baseline predictor + weighted sum of ratings from neighbor movies. Equation labels: weight, similar movie, baseline predictor, normalize.
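The predictor described above can be sketched as follows, assuming the weights are movie-movie similarities, the neighbors are the k rated movies with the largest absolute similarity, and the weighted sum is normalized by the sum of absolute weights. The function `neighborhood_predict`, the parameter `k`, and the toy numbers are hypothetical.

```python
import numpy as np

def neighborhood_predict(baseline_ui, errors_u, sims_i, k=2):
    """Baseline prediction plus a normalized, similarity-weighted sum of
    user u's rating errors on the k movies most similar to movie i.

    baseline_ui: baseline prediction for the pair (u, i).
    errors_u: user u's errors (rating minus baseline) per movie, np.nan if unrated.
    sims_i: similarity of every movie to movie i, np.nan for movie i itself.
    """
    errors_u = np.asarray(errors_u, dtype=float)
    sims_i = np.asarray(sims_i, dtype=float)
    rated = np.flatnonzero(~np.isnan(errors_u) & ~np.isnan(sims_i))
    # Keep the k rated movies with the largest |similarity|.
    neighbors = rated[np.argsort(-np.abs(sims_i[rated]))[:k]]
    norm = np.sum(np.abs(sims_i[neighbors]))
    if norm == 0:
        return float(baseline_ui)  # no usable neighbors: fall back to the baseline
    return float(baseline_ui + sims_i[neighbors] @ errors_u[neighbors] / norm)

# Hypothetical inputs: predicting movie 1 for a user who rated movies 0, 2 and 3.
errors_u = [0.5, np.nan, -1.0, 0.2]
sims_i = [0.8, np.nan, -0.4, 0.1]
pred = neighborhood_predict(3.5, errors_u, sims_i, k=2)
```

Normalizing by the sum of absolute similarities keeps the correction on the same scale as the errors, however many neighbors are used.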

37 Summary

38

39

40 Example test data training data

41 Baseline predictor Minimize over b: 30 training data, 15 variables, 30x15 matrix

42 Prediction User Movie

43 Rating matrix(Estimated by the baseline predictor)

44 Prediction

45 Similarity Use the cosine coefficient to measure the similarity between movies, as represented in the error matrix. Result: the entire similarity matrix.

46 Neighborhood predictor

47 Prediction

48 Summary
The Netflix Prize is a special case of a recommendation system. Collaborative filtering leverages similarities among users or among movies to make predictions. Minimizing RMSE may lead to least squares, a special case of convex optimization.

49

