Chapter 8 Collaborative Filtering (as of 20.12.00)


1 Chapter 8 Collaborative Filtering (as of 20.12.00)

2 Recommended References
(c) 2000 Dr. Ralph Bergmann and Prof. Dr. Michael M. Richter, Universität Kaiserslautern
Shardanand, U., & Maes, P. (1995). Social Information Filtering: Algorithms for Automating ‘Word of Mouth’. In: Proceedings of CHI95, 210-217. http://www.acm.org/sigchi/chi95/Electronic/documnts/papers/us_bdy.htm
Billsus, D., & Pazzani, M.J. (1998). Learning Collaborative Information Filters. In: The 15th International Conference on Machine Learning, ICML-98. http://www.ics.uci.edu/~dbillsus/papers/icml98.pdf
Smyth, B., & Cotter, P. (1999). Surfing the Digital Wave: Generating Personalised TV Listings using Collaborative, Case-Based Recommendation. In: Proceedings of the Third International Conference on Case-Based Reasoning, ICCBR99. Springer.
Berkeley School of Information Systems. Link Collection on Collaborative Filtering. http://www.sims.berkeley.edu/resources/collab/

3 Content-based vs. Collaborative Filtering/Selection
Filtering and selection mean basically the same thing:
  –Filtering: removing certain objects from a universe
  –Selection: picking certain objects from a universe
The previously discussed approaches for selecting products are content-based: they require a representation of products and a notion of similarity between demands and products (see chapters 4-7).
Alternative approach discussed in this chapter: collaborative selection

4 Collaborative Filtering Approach (1)
Basic idea: select items based on aggregated user ratings of those items.
You buy an item only because many of your friends (who share your interests) bought it and like it, although you don’t really know anything about the product.
Consider ratings of similar users (customers) only.
Requires stored user profiles of the kind:
  –Customer C1 likes (buys) products p1, p4, p8
  –Customer C2 likes (buys) products p1, p2, p8
  –...

5 Collaborative Filtering Approach (2)
[Figure: Users 1, 2 and 3 and products A,...,F]
Users 1, 2 and 3 are similar since they all bought products A, B and C.
D and E can be recommended to User 1 based on this shared interest.
Recommendation is based on observations:
  –no detailed representation of D or E
  –users must be identified, i.e., a user profile must be available
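
The idea on this slide can be sketched in a few lines of Python. This is only an illustration: the figure is not preserved in the transcript, so which user bought D or E is assumed, and `User 4`, `recommend` and `min_shared` are hypothetical names introduced here.

```python
from collections import Counter

# Hypothetical purchase profiles. Who bought D, E or F is an assumption,
# because the slide's figure is not preserved in the transcript.
purchases = {
    "User 1": {"A", "B", "C"},
    "User 2": {"A", "B", "C", "D"},
    "User 3": {"A", "B", "C", "E"},
    "User 4": {"F"},
}

def recommend(target, profiles, min_shared=2):
    """Suggest products bought by users who share at least `min_shared`
    purchases with `target`, ranked by how many such users bought them."""
    own = profiles[target]
    votes = Counter()
    for user, items in profiles.items():
        if user == target or len(own & items) < min_shared:
            continue  # skip the target and users without shared interest
        votes.update(items - own)  # count only products the target lacks
    # Sort by vote count (descending), then by name for a stable order.
    return sorted(votes, key=lambda p: (-votes[p], p))

print(recommend("User 1", purchases))
```

Note that nothing about products D or E is represented beyond their names; the recommendation rests purely on the observed purchases.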

6 First Realization (1)
Customer U gives ratings U_x for certain products x ∈ P_U.
A rating U_x is a value from an ordered set, e.g., an integer 1..7 (1: don’t like at all ... 4: neutral ... 7: great stuff).
Note: not every customer rates every product.
Determine the similarity of customers U and V from the similarity of their ratings of those products both have rated, i.e., P_{U∩V}.

7 First Realization (2)
Distance/Similarity Measures for Customers
Given: two customers U and V
Mean Squared Difference (distance measure)
The Pearson correlation coefficient r_UV may be better:
  –r_UV > 0: positively related
  –r_UV = 0: not related
  –r_UV < 0: negatively related
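
The formulas for the two measures are not preserved in the transcript. The sketch below uses the textbook definitions, restricted to the products both customers have rated; computing the means over the co-rated products only is an assumption, and the function names and sample ratings are made up for illustration.

```python
from math import sqrt

def co_rated(u, v):
    """Products rated by both customers (the set P_{U∩V})."""
    return sorted(set(u) & set(v))

def mean_squared_difference(u, v):
    """Average squared rating difference over co-rated products (a distance)."""
    common = co_rated(u, v)
    if not common:
        return None  # no overlap: the distance is undefined
    return sum((u[x] - v[x]) ** 2 for x in common) / len(common)

def pearson(u, v):
    """Pearson correlation over co-rated products (a similarity in [-1, 1])."""
    common = co_rated(u, v)
    n = len(common)
    if n < 2:
        return None  # too little overlap to correlate
    mean_u = sum(u[x] for x in common) / n
    mean_v = sum(v[x] for x in common) / n
    cov = sum((u[x] - mean_u) * (v[x] - mean_v) for x in common)
    var_u = sum((u[x] - mean_u) ** 2 for x in common)
    var_v = sum((v[x] - mean_v) ** 2 for x in common)
    if var_u == 0 or var_v == 0:
        return None  # constant ratings: the correlation is undefined
    return cov / sqrt(var_u * var_v)

# Made-up ratings on the 1..7 scale; only p1, p2 and p3 are co-rated.
U = {"p1": 7, "p2": 4, "p3": 1, "p4": 6}
V = {"p1": 6, "p2": 4, "p3": 2, "p5": 3}

print(mean_squared_difference(U, V))  # small distance
print(pearson(U, V))                  # strong positive correlation
```

In this example V rates less extremely than U but in the same direction, which the correlation rewards while the squared difference still penalises it. That is one reason the correlation may be the better measure.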

8 First Realization (3)
Determining Recommendations
The profile of a new customer W is compared with the profiles of all known users U, and the similarity/distance r_WU is determined.
Users whose profile similarity exceeds a certain threshold are selected.
The rating for an item is a weighted average of the similar users' ratings for that item.
Products with the highest predicted rating W_x are recommended to W.
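
The four steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes Pearson correlation as the similarity, uses the similarity itself as the weight, and the threshold value of 0.5 and all sample ratings are invented.

```python
from math import sqrt

def pearson(u, v):
    """Pearson correlation over the products both customers have rated."""
    common = sorted(set(u) & set(v))
    n = len(common)
    if n < 2:
        return 0.0  # assumption: treat too little overlap as "not related"
    mean_u = sum(u[x] for x in common) / n
    mean_v = sum(v[x] for x in common) / n
    cov = sum((u[x] - mean_u) * (v[x] - mean_v) for x in common)
    var_u = sum((u[x] - mean_u) ** 2 for x in common)
    var_v = sum((v[x] - mean_v) ** 2 for x in common)
    if var_u == 0 or var_v == 0:
        return 0.0  # assumption: constant ratings count as "not related"
    return cov / sqrt(var_u * var_v)

def recommend(w, known_users, threshold=0.5):
    """Predict W's ratings for unrated products as a similarity-weighted
    average over users whose similarity to W exceeds `threshold`."""
    neighbours = {}
    for name, profile in known_users.items():
        r = pearson(w, profile)
        if r > threshold:
            neighbours[name] = r
    weighted_sum, weight_total = {}, {}
    for name, r in neighbours.items():
        for product, rating in known_users[name].items():
            if product in w:
                continue  # W has already rated this product
            weighted_sum[product] = weighted_sum.get(product, 0.0) + r * rating
            weight_total[product] = weight_total.get(product, 0.0) + r
    predicted = {p: weighted_sum[p] / weight_total[p] for p in weighted_sum}
    # Highest predicted rating first; ties broken by product name.
    return sorted(predicted.items(), key=lambda item: (-item[1], item[0]))

W = {"p1": 7, "p2": 4, "p3": 1}
known = {
    "U1": {"p1": 6, "p2": 4, "p3": 2, "p4": 7, "p5": 2},
    "U2": {"p1": 1, "p2": 4, "p3": 7, "p4": 1, "p5": 6},  # opposite taste
    "U3": {"p1": 7, "p2": 5, "p3": 3, "p4": 5},
}

for product, rating in recommend(W, known):
    print(product, rating)
```

U2 is negatively correlated with W and is therefore dropped by the threshold, so its low rating for p4 does not pull the prediction down.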

9 Shortcomings of the First Realization
Correlation is based only on items that two customers have in common
  –When thousands of items are available, there is only little overlap!
  –Then: recommendations are based on only a few observations
The correlation coefficient is not transitive; customer similarity, however, is transitive at least to some degree
  –If A and B are correlated and B and C are correlated, then A and C should also be correlated
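
That the correlation coefficient is not transitive can be checked with a small made-up example. The three rating vectors below are invented for this purpose: A and B correlate, B and C correlate, yet A and C do not.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equally long rating lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Ratings of three hypothetical customers for the same four products.
A = [5, 3, 4, 4]
B = [5, 3, 5, 3]
C = [4, 4, 5, 3]

r_ab = pearson(A, B)
r_bc = pearson(B, C)
r_ac = pearson(A, C)
print(round(r_ab, 3), round(r_bc, 3), round(r_ac, 3))
```

A agrees with B on the first two products and C agrees with B on the last two, so both pairs correlate at about 0.707, while A and C have a correlation of exactly zero.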

10 Second Realization (1)
We view collaborative filtering as a classification task.
For each customer U_i, determine a classifier f_i that classifies a product into classes, e.g.
  –{ like, dislike } or
  –ratings from 1...7
A product is represented by the rating vector of the other customers.
The classifier is a function f_i : OthersRatingVector → MyRating, i.e., the predicted rating for product x is f_i((U_1)_x, ..., (U_n)_x).
This classifier can be learned from examples using machine learning approaches (see also chapter 13).
Training examples for f_i are the ratings of those products that are also rated by U_i.

11 Second Realization (2)
Construction of Training Examples

Current ratings

       P1  P2  P3  P4  P5
U_1    +       +
U_2        -       -
U_3    +   +   -       +
U_4    +   -   -

U_i: customers, Pi: products, +: like, -: dislike, blank: no information

Training examples for U_4

        E1  E2  E3
U_1+     1   0   1
U_1-     0   0   0
U_2+     0   0   0
U_2-     0   1   0
U_3+     1   1   0
U_3-     0   0   1
Class    1   0   0
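
The construction of the training examples can be sketched as below. Each other customer contributes two Boolean features (one for "like", one for "dislike"), and the class is the target customer's own rating. The function names are hypothetical, and the placement of the entries in columns P4 and P5 is an assumption, since the flattened transcript does not fix it; it does not affect the three training examples.

```python
# Ratings from the slide: "+" = like, "-" = dislike, missing = no information.
# Assumption: U2's second "-" is on P4 and U3's last "+" is on P5.
ratings = {
    "U1": {"P1": "+", "P3": "+"},
    "U2": {"P2": "-", "P4": "-"},
    "U3": {"P1": "+", "P2": "+", "P3": "-", "P5": "+"},
    "U4": {"P1": "+", "P2": "-", "P3": "-"},
}

def feature_vector(ratings, target, product):
    """Encode the other customers' ratings of `product` as 0/1 features,
    ordered U+ then U- for each customer other than `target`."""
    features = []
    for user in sorted(u for u in ratings if u != target):
        rating = ratings[user].get(product)
        features.append(1 if rating == "+" else 0)
        features.append(1 if rating == "-" else 0)
    return features

def training_examples(ratings, target):
    """One (features, class) pair per product the target has rated."""
    examples = []
    for product in sorted(ratings[target]):
        label = 1 if ratings[target][product] == "+" else 0
        examples.append((feature_vector(ratings, target, product), label))
    return examples

for features, label in training_examples(ratings, "U4"):
    print(features, "->", label)
```

The three printed rows correspond to the columns E1, E2 and E3 of the slide's table. A learned classifier f_4 would then be applied to `feature_vector(ratings, "U4", "P5")` to predict U_4's rating for a product it has not rated.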

12 Second Realization (3)
Various machine learning approaches can be applied.
Feed-forward nets with one hidden layer of two units, trained with backpropagation, show good results.
Problems:
  –High dimensionality of the training data
  –Sparse data (i.e., only a few ‘1’ entries, many ‘0’s)
Methods for reducing the dimensions (compression) must be applied in a pre-processing step:
  –Choose not all users, but characteristic (reference) users only
  –LSI approach (see Billsus & Pazzani, 1998)
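
To make the sparsity problem concrete, the sketch below measures the share of ‘1’ entries in the training examples for U_4 from the previous slide and removes features that never vary. Dropping constant features is a deliberately trivial stand-in chosen for illustration; it is neither the reference-user selection nor the LSI approach named above, and the function names are hypothetical.

```python
feature_names = ["U1+", "U1-", "U2+", "U2-", "U3+", "U3-"]
examples = {
    "E1": [1, 0, 0, 0, 1, 0],
    "E2": [0, 0, 0, 1, 1, 0],
    "E3": [1, 0, 0, 0, 0, 1],
}

def density(examples):
    """Share of '1' entries among all feature values."""
    values = [v for row in examples.values() for v in row]
    return sum(values) / len(values)

def drop_constant_features(names, examples):
    """Remove features that take the same value in every example;
    they cannot help a classifier separate the classes."""
    columns = list(zip(*examples.values()))
    keep = [i for i, column in enumerate(columns) if len(set(column)) > 1]
    kept_names = [names[i] for i in keep]
    reduced = {e: [row[i] for i in keep] for e, row in examples.items()}
    return kept_names, reduced

kept_names, reduced = drop_constant_features(feature_names, examples)
print(round(density(examples), 3))
print(kept_names)
print(reduced)
```

Even in this tiny example only a third of the entries are ‘1’ and two of the six features carry no information. With thousands of customers the effect is far stronger, which is why a real compression step is needed.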

13 Drawbacks of Collaborative Filtering
No anonymity: user profiles are required and must be stored.
The pump priming problem:
  (1) When a new store is launched, no ratings are available → poor recommendations
  (2) When a new product emerges, no ratings for this product are available → the new product is never recommended
Large training effort involved.

14 Application 1: Amazon Book Store
Customers who bought this book also bought:
  –Reinforcement Learning: An Introduction; R. S. Sutton, A. G. Barto
  –Advances in Knowledge Discovery and Data Mining; U. M. Fayyad
  –Probabilistic Reasoning in Intelligent Systems; J. Pearl
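
A list of this kind can be produced by counting co-purchases. The sketch below is only an illustration of that idea and makes no claim about how Amazon actually computes its lists; the order data, the shortened titles and the `also_bought` function are invented.

```python
from collections import Counter

# Hypothetical orders; "this book" stands for the book being viewed.
orders = [
    {"this book", "Reinforcement Learning", "Probabilistic Reasoning"},
    {"this book", "Reinforcement Learning"},
    {"this book", "Advances in KDD", "Reinforcement Learning"},
    {"Advances in KDD"},
    {"this book", "Advances in KDD"},
]

def also_bought(item, orders, top_n=3):
    """Titles most often bought together with `item`."""
    counts = Counter()
    for order in orders:
        if item in order:
            counts.update(title for title in order if title != item)
    # Most frequent first; ties broken alphabetically for a stable order.
    ranked = sorted(counts, key=lambda title: (-counts[title], title))
    return ranked[:top_n]

print(also_bought("this book", orders))
```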

15 Application 2: Personalized TV Program
www.ptv.ie
Generates personalized TV guides.
Uses collaborative & case-based recommendations:
  –case-based: based on descriptions of programs
  –collaborative: based on likes of users with similar tastes

16 PTV: Recommendations & Feedback

17 Summary: Collaborative vs. Content Based
Content Based (CBR)
  –can be anonymous
  –requires representation
Collaborative Filtering
  –requires identification
  –“representationless”
  –pump priming problem
  –scalability
  –sparse matrix
Current trend: combination of both approaches

