1
Lecture: Dudu Yanay
2
Input: Each instance is associated with a rank or a rating, i.e. an integer from '1' to 'k'.
Goal: To find a rank-prediction rule which assigns to each instance a rank that is as close as possible to the instance's true rank.
Similar problems:
◦ Classification.
◦ Regression.
3
Information Retrieval.
Collaborative filtering: predict a user's rating on new items (books, movies, etc.) given the user's past ratings of similar items.
4
To cast the rating problem as a regression problem.
To reduce the total order into a set of preferences over pairs.
◦ Time consuming, since it might require increasing the sample size from $T$ instances to order $T^2$ preference pairs (see the sketch below).
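A minimal sketch (not from the slides; the sizes, ranks, and feature dimension are made up) of why the pairwise reduction is costly: $T$ ranked instances induce on the order of $T^2$ preference pairs.

    import itertools
    import numpy as np

    T, n = 1000, 10                       # illustrative sample size and feature dimension
    X = np.random.randn(T, n)             # instances
    y = np.random.randint(1, 6, size=T)   # ranks in {1, ..., 5}

    # Every pair of instances with different ranks becomes one preference example.
    pairs = [(i, j) for i, j in itertools.combinations(range(T), 2) if y[i] != y[j]]
    print(f"{T} ranked instances -> {len(pairs)} preference pairs (order T^2)")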
5
Online Algorithm (Littlestone 1988):
◦ Each prediction can be computed in polynomial time.
◦ If the problem is separable, then after a polynomial number of failures (mistakes) the learner no longer makes a mistake.
Meaning: an interaction between a learner and a teacher. Animation from Nader Bshouty's Course.
6
Animation from Nader Bshouty’s Course.
7
A slide from Nader Bshouty’s Course.
9
Input: A sequence of instance-rank pairs $(x_1, y_1), \ldots, (x_T, y_T)$, where $x_t \in \mathbb{R}^n$ and $y_t \in \{1, \ldots, k\}$.
Output: A ranking rule $H(x) = \min_{r \in \{1, \ldots, k\}} \{ r : w \cdot x - b_r < 0 \}$, where:
◦ $w \in \mathbb{R}^n$ and $b_1 \le b_2 \le \ldots \le b_{k-1} \le b_k = \infty$.
Ranking loss after T rounds is $\sum_{t=1}^{T} |\hat{y}_t - y_t|$, where $y_t$ is the TRUE rank of the instance in round 't' and $\hat{y}_t$ is the predicted rank.
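A short sketch of the ranking rule above (the function name is mine): the predicted rank is the smallest $r$ with $w \cdot x - b_r < 0$, where b holds the k-1 finite thresholds and $b_k = \infty$ is implicit.

    import numpy as np

    def predict_rank(w, b, x):
        """Return the smallest rank r (1-based) with w.x - b[r] < 0.

        b contains the k-1 finite thresholds in non-decreasing order; the
        k-th threshold is +infinity, so rank k is returned when the score
        clears every finite threshold.
        """
        score = np.dot(w, x)
        for r, b_r in enumerate(b, start=1):
            if score - b_r < 0:
                return r
        return len(b) + 1

For example, with w = (1, 0), b = (0, 1, 2) and x = (1.5, 0), the score 1.5 first drops below a threshold at $b_3 = 2$, so the predicted rank is 3.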
10
Given an input instance-rank pair $(x_t, y_t)$, the prediction is correct if:
◦ $w \cdot x_t - b_r > 0$ for every $r < y_t$, and $w \cdot x_t - b_r < 0$ for every $r \ge y_t$.
Let's represent the above inequalities by $y_t^r \in \{-1, +1\}$, where $y_t^r = +1$ if $y_t > r$ and $y_t^r = -1$ if $y_t \le r$.
The TRUE rank vector: $(y_t^1, \ldots, y_t^{k-1})$.
11
Given an input instance-rank pair $(x_t, y_t)$, if $y_t^r (w_t \cdot x_t - b_r^t) \le 0$ for some $r$, then threshold $r$ is predicted incorrectly. So, let's "move" the values of $w_t \cdot x_t$ and $b_r^t$ towards each other:
◦ $b_r^{t+1} = b_r^t - y_t^r$ for every wrongly predicted threshold $r$.
◦ $w_{t+1} = w_t + \left( \sum_r y_t^r \right) x_t$, where the sum is only over the indices 'r' for which there was a prediction error, i.e., $y_t^r (w_t \cdot x_t - b_r^t) \le 0$.
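A sketch of this update rule (variable names are mine, following the slide's notation): build the true rank vector, flag the thresholds on the wrong side of $w_t \cdot x_t$, and move the weights and those thresholds toward each other.

    import numpy as np

    def prank_update(w, b, x, y):
        """One PRank-style update for instance x with true rank y (1-based).

        w: weight vector; b: array of the k-1 finite thresholds (sorted).
        Returns the updated (w, b).
        """
        score = np.dot(w, x)
        # TRUE rank vector: +1 where y > r, -1 where y <= r, for r = 1..k-1.
        y_vec = np.array([1 if y > r else -1 for r in range(1, len(b) + 1)])
        # tau_r = y_r on wrongly predicted thresholds (y_r * (score - b_r) <= 0), else 0.
        tau = np.where(y_vec * (score - b) <= 0, y_vec, 0)
        w = w + tau.sum() * x    # move the score toward the correct interval
        b = b - tau              # move the wrong thresholds toward the score
        return w, b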
12
[Figure: threshold axis with ranks 1-5, marking the predicted rank and the correct interval.]
13
◦ Building the TRUE rank vector.
◦ Checking which threshold predictions are wrong.
◦ Updating the hypothesis.
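Putting the three steps together, a minimal online loop in the spirit of this slide, reusing the predict_rank and prank_update sketches above (an illustration, not the authors' reference code):

    import numpy as np

    def prank_online(stream, n, k):
        """Run the sketched PRank loop over an iterable of (x, y) pairs.

        n: feature dimension; k: number of ranks. Thresholds start at 0
        (and hence sorted), matching the initialization assumed by Lemma 1.
        Returns the final hypothesis and the cumulative ranking loss.
        """
        w, b = np.zeros(n), np.zeros(k - 1)   # b_k = +infinity is implicit
        loss = 0
        for x, y in stream:
            y_hat = predict_rank(w, b, x)     # predict a rank
            loss += abs(y_hat - y)            # ranking loss |y_hat - y|
            w, b = prank_update(w, b, x, y)   # build true rank vector, fix wrong thresholds
        return w, b, loss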
14
First, we need to show that the output hypothesis of PRank is acceptable. Meaning, if $b_1^{T+1}, \ldots, b_{k-1}^{T+1}$ are the thresholds of the final ranking rule, then $b_1^{T+1} \le b_2^{T+1} \le \ldots \le b_{k-1}^{T+1}$.
Proof - by induction: Since the initialization of the thresholds is such that $b_1^1 \le \ldots \le b_{k-1}^1$, it suffices to show that the claim holds inductively, i.e., that each update preserves the order.
Lemma 1 (Order Preservation): Let $w_t$ and $b_1^t \le \ldots \le b_{k-1}^t$ be the current ranking rule, and let $(x_t, y_t)$ be an instance-rank pair fed to PRank on round 't'. Denote by $w_{t+1}$ and $b_1^{t+1}, \ldots, b_{k-1}^{t+1}$ the ranking rule resulting after the update of PRank. Then $b_1^{t+1} \le \ldots \le b_{k-1}^{t+1}$.
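The lemma can also be sanity-checked empirically. A small sketch (random data, reusing the prank_update sketch above) that asserts the thresholds stay sorted after every update:

    import numpy as np

    rng = np.random.default_rng(0)
    n, k = 5, 4
    w, b = np.zeros(n), np.zeros(k - 1)      # sorted initialization, as in the proof
    for _ in range(10_000):
        x = rng.normal(size=n)
        y = int(rng.integers(1, k + 1))      # true rank in {1, ..., k}
        w, b = prank_update(w, b, x, y)      # update sketched earlier
        assert np.all(np.diff(b) >= 0), "order preservation violated"
    print("thresholds stayed sorted:", b)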
15
[Figure: two possible cases of the update (Option 1 and Option 2), each showing the predicted rank and the correct interval on the threshold axis.]
16
Theorem 2: Let $(x_1, y_1), \ldots, (x_T, y_T)$ be an input sequence for PRank, where $x_t \in \mathbb{R}^n$ and $y_t \in \{1, \ldots, k\}$. Denote by $R^2 = \max_t \|x_t\|^2$. Assume that there is a ranking rule $v^* = (w^*, b_1^*, \ldots, b_{k-1}^*)$ of unit norm that classifies the entire sequence correctly with margin $\gamma = \min_{t,r} \{ y_t^r (w^* \cdot x_t - b_r^*) \} > 0$. Then, the rank loss of the algorithm, $\sum_{t=1}^{T} |\hat{y}_t - y_t|$, is at most $(k-1)(R^2 + 1)/\gamma^2$.
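A worked instance of the bound with illustrative numbers (not taken from the slides): for k = 5 ranks, instances of norm at most R = 1, and margin $\gamma = 0.1$, the cumulative rank loss is bounded by $(k-1)(R^2+1)/\gamma^2 = 800$, regardless of the sequence length T.

    k, R, gamma = 5, 1.0, 0.1                  # illustrative values only
    bound = (k - 1) * (R**2 + 1) / gamma**2
    print(bound)                               # 800.0 -- cap on the total rank loss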
17
Comparison between:
◦ PRank.
◦ Multiclass Perceptron (MCP).
◦ Widrow-Hoff, online regression (WH).
Datasets:
◦ Synthetic.
◦ EachMovie.
18
Points were generated uniformly at random. Each point was assigned a rank according to a fixed ranking rule corrupted by additive noise. 100 sequences of instance-rank pairs were generated, each of length 7000.
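The exact data-generating rule is not reproduced on the slide, so the sketch below uses a hypothetical linear rule with fixed thresholds and Gaussian noise, just to show the shape of one such synthetic sequence:

    import numpy as np

    rng = np.random.default_rng(1)
    T, n, k = 7000, 2, 5
    true_w = rng.normal(size=n)                      # hypothetical underlying rule
    true_b = np.sort(rng.normal(size=k - 1))         # hypothetical sorted thresholds

    X = rng.uniform(-1.0, 1.0, size=(T, n))          # points drawn uniformly at random
    scores = X @ true_w + 0.1 * rng.normal(size=T)   # linear score plus noise
    y = 1 + (scores[:, None] > true_b[None, :]).sum(axis=1)  # rank in {1, ..., k}

Feeding the pairs (X[t], y[t]) to the online loop sketched earlier mimics one of the 100 synthetic sequences.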
19
Collaborative filtering dataset. Contains ratings of movies provided by 61,265 people. 6 possible ratings: 0, 0.2, 0.4, 0.6, 0.8, 1. Only people with at least 100 ratings were considered. One person was chosen at random to provide the TRUE ranks, and the other people's ratings were used as features (mapped to -0.5, -0.3, -0.1, 0.1, 0.3, 0.5).
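A small sketch of the preprocessing just described, with a hypothetical ratings matrix: the six possible ratings map to ranks 1-6 for the chosen person and to the centered feature values for everyone else.

    ratings = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
    to_feature = dict(zip(ratings, [-0.5, -0.3, -0.1, 0.1, 0.3, 0.5]))
    to_rank = {v: i + 1 for i, v in enumerate(ratings)}    # ratings -> ranks 1..6

    # Hypothetical ratings matrix: rows = movies, columns = viewers (each with >= 100 ratings).
    raw = [[0.8, 0.2, 1.0],
           [0.0, 0.4, 0.6]]
    target = 0                                             # the person chosen at random
    y = [to_rank[row[target]] for row in raw]              # TRUE ranks of that person
    X = [[to_feature[v] for j, v in enumerate(row) if j != target] for row in raw]
    print(X, y)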
20
Batch setting: PRank was run over the training data as an online algorithm, and its last hypothesis was used to rank the unseen data.
22
Perceptron Theorem: Proof
23
Perceptron Theorem: Proof
24
Perceptron Theorem: Proof