Memory vs. Model-based Approaches: SVD & MF
Based on the Rajaraman and Ullman book and the RS Handbook. See the Adomavicius and Tuzhilin, TKDE 2005 paper for a quick & great overview of RS methodologies.
Basics
So far we discussed user-based and item-based CF. In both, we predict an unknown rating by taking some kind of aggregate of either the ratings on the distinguished item by the distinguished user’s most similar neighbors (user-based), or the ratings of the distinguished user on the distinguished item’s most similar neighbors (item-based). Both are based on a literal memory of past ratings: thus, memory-based. Both look at the closest neighbors: thus, neighborhood-based.
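The user-based variant described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the toy `ratings` data, the helper names `cosine_sim` and `predict_user_based`, the choice of cosine similarity over co-rated items, and the similarity-weighted average are all assumptions made for the example, not something the slides prescribe.

```python
from math import sqrt

# Toy ratings: user -> {item: rating}. All names and values are illustrative.
ratings = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob":   {"A": 4, "B": 2, "C": 5, "D": 4},
    "carol": {"A": 1, "B": 5, "D": 2},
    "dave":  {"A": 5, "B": 3, "D": 5},
}

def cosine_sim(u, v):
    """Cosine similarity computed over the items both users have rated."""
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0
    dot = sum(ratings[u][i] * ratings[v][i] for i in common)
    norm_u = sqrt(sum(ratings[u][i] ** 2 for i in common))
    norm_v = sqrt(sum(ratings[v][i] ** 2 for i in common))
    return dot / (norm_u * norm_v)

def predict_user_based(user, item, k=2):
    """Similarity-weighted average of the ratings that the k most
    similar users (among those who rated `item`) gave to `item`."""
    candidates = [
        (cosine_sim(user, v), ratings[v][item])
        for v in ratings
        if v != user and item in ratings[v]
    ]
    neighbors = sorted(candidates, reverse=True)[:k]
    total_sim = sum(sim for sim, _ in neighbors)
    if total_sim == 0:
        return None  # no usable neighbors: nothing to aggregate
    return sum(sim * r for sim, r in neighbors) / total_sim

# Predict the rating alice (distinguished user) would give item D.
print(predict_user_based("alice", "D"))
```

The item-based variant is symmetric: compute similarities between items instead of users, and aggregate the distinguished user's own ratings on the items most similar to the distinguished item.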
Model-based approaches
Build a model of each user’s behavior: what s/he looks for in an item. Build a model of each item: what it has to offer. Problem: these “features” are not explicitly present, so we turn to latent features. Matrix factorization: approximate the ratings matrix as a product of two low-rank matrices (dimensionality reduction). The components of each user/item vector are the latent factors/features.
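As a rough illustration of the matrix factorization idea above, the sketch below learns a latent vector per user (rows of `P`) and per item (rows of `Q`) so that their dot product approximates each observed rating. Everything specific here is an assumption for the example: the toy `observed` triples, the function names, stochastic gradient descent as the fitting method, and the values of `k`, `lr`, `reg`, and `epochs`.

```python
import random

# Observed (user index, item index, rating) triples; illustrative data only.
observed = [
    (0, 0, 5), (0, 1, 3), (0, 2, 4),
    (1, 0, 4), (1, 2, 5), (1, 3, 4),
    (2, 0, 1), (2, 1, 5), (2, 3, 2),
    (3, 0, 5), (3, 3, 5),
]

def predict(P, Q, u, i):
    """Predicted rating: dot product of user u's and item i's latent vectors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))

def factorize(ratings, n_users, n_items, k=2, lr=0.01, reg=0.02,
              epochs=1000, seed=0):
    """Fit P (n_users x k) and Q (n_items x k) by SGD on the regularized
    squared error over the observed ratings only."""
    rng = random.Random(seed)
    P = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on err^2 + reg * (pu^2 + qi^2)
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def rmse(ratings, P, Q):
    """Root mean squared error on the given rating triples."""
    se = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return (se / len(ratings)) ** 0.5

P, Q = factorize(observed, n_users=4, n_items=4)
print("training RMSE:", rmse(observed, P, Q))
print("predicted rating for user 0 on unseen item 3:", predict(P, Q, 0, 3))
```

Here `k` is the number of latent factors: each user and each item is described by only `k` numbers, which is the dimensionality reduction the slide refers to.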
[Figure: movies and two users (Gus, Dave) placed in a two-dimensional latent factor space. Axes: geared towards females vs. geared towards males; serious vs. escapist. Movies shown: The Princess Diaries, The Lion King, Braveheart, Lethal Weapon, Independence Day, Amadeus, The Color Purple, Dumb and Dumber, Ocean’s 11, Sense and Sensibility.]
Stolen from: Bell, Koren, and Volinsky’s Netflix Prize talk.
Extensions
– Location-aware: recommend items depending on where you are.
– Time-aware: e.g., recommend fast-food restaurants for lunch on weekdays and formal/classy ones for dinner; recommend shows that are currently in town; recommend games during hockey season; …
– Context-aware: e.g., recommend a movie depending on your current company.
What if feedback was implicit?
More common than explicit feedback:
– Not every customer will bother to rate/review.
– Purchase history of products.
– Browsing history or search queries.
– Playcount in last.fm is a kind of implicit feedback.
– Simple thumbs up/down for TV shows.
Based on: Y.F. Hu, Y. Koren, and C. Volinsky, Collaborative Filtering for Implicit Feedback Datasets, Proc. IEEE Intl Conf. Data Mining (ICDM 08), IEEE CS Press, 2008, pp. 263-272.
More Differences
Even the so-called positive cases are noisy. E.g.,
– Tune to a certain TV channel and talk to a friend the whole time.
– Perhaps my experience after buying a smartphone was negative.
– Perhaps I bought that watch as a gift.
– No way of knowing!
Explicit feedback (rating) indicates preference; implicit feedback indicates confidence. E.g.,
– How often do I watch a series and for how long?
– Playcount of songs on last.fm.
Evaluation metric: unclear; needs to account for availability and competition.
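One way to make the preference/confidence distinction concrete is the formulation in the Hu, Koren, and Volinsky paper cited above: a raw observation r (e.g., a playcount) is turned into a binary preference (1 if r > 0, else 0) and a confidence 1 + alpha * r that grows with the amount of evidence. The sketch below is only an illustration of that idea; the function names and the value of `ALPHA` are assumptions chosen for the example and would need tuning on real data.

```python
ALPHA = 40.0  # illustrative scaling constant (an assumption; tune per dataset)

def preference(r):
    """Binary preference: 1 if any interaction was observed, else 0."""
    return 1.0 if r > 0 else 0.0

def confidence(r, alpha=ALPHA):
    """Confidence in the preference; grows linearly with the observation r.
    Unobserved pairs (r = 0) still get a minimal confidence of 1."""
    return 1.0 + alpha * r

def weighted_error(r, score, alpha=ALPHA):
    """Confidence-weighted squared error between the binary preference
    and a model's predicted score for one user-item pair."""
    return confidence(r, alpha) * (preference(r) - score) ** 2

# A song played 2 times vs. a song never played, both scored 0.5 by a model.
print(weighted_error(2, 0.5))  # heavily weighted: strong evidence of interest
print(weighted_error(0, 0.5))  # lightly weighted: absence is weak evidence
```

Under this weighting, a model is penalized much more for missing an item the user interacted with many times than for mis-scoring an item with no observed interaction, which reflects the point that a missing interaction is weak evidence of dislike.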