Content-Based Recommendation Systems. Michael J. Pazzani and Daniel Billsus, Rutgers University and FX Palo Alto Laboratory. Presented by Vishal Paliwal.

Similar presentations
Pat Langley Computational Learning Laboratory Center for the Study of Language and Information Stanford University, Stanford, California

Content-based Recommendation Systems
Pseudo-Relevance Feedback For Multimedia Retrieval By Rong Yan, Alexander G. and Rong Jin Mwangi S. Kariuki
Text Categorization.
Prediction Modeling for Personalization & Recommender Systems Bamshad Mobasher DePaul University Bamshad Mobasher DePaul University.
DECISION TREES. Decision trees  One possible representation for hypotheses.
Data Mining Classification: Alternative Techniques
Collaborative Filtering Sue Yeon Syn September 21, 2005.
Distance and Similarity Measures
Learning for Text Categorization
K nearest neighbor and Rocchio algorithm
Web Mining Research: A Survey Raymond Kosala and Hendrik Blockeel ACM SIGKDD, July 2000 Presented by Shan Huang, 4/24/2007.
Announcements  Project proposal is due on 03/11  Three seminars this Friday (EB 3105) Dealing with Indefinite Representations in Pattern Recognition.
Relevance Feedback based on Parameter Estimation of Target Distribution K. C. Sia and Irwin King Department of Computer Science & Engineering The Chinese.
SVM Active Learning with Application to Image Retrieval
Chapter 8 Collaborative Filtering Stand
University of Athens, Greece Pervasive Computing Research Group Predicting the Location of Mobile Users: A Machine Learning Approach 1 University of Athens,
Data Mining with Decision Trees Lutz Hamel Dept. of Computer Science and Statistics University of Rhode Island.
Recommender systems Ram Akella February 23, 2011 Lecture 6b, i290 & 280I University of California at Berkeley Silicon Valley Center/SC.
Presented by Zeehasham Rasheed
Recommender systems Ram Akella November 26 th 2008.
Review Rong Jin. Comparison of Different Classification Models  The goal of all classifiers Predicating class label y for an input x Estimate p(y|x)
12 -1 Lecture 12 User Modeling Topics –Basics –Example User Model –Construction of User Models –Updating of User Models –Applications.
Online Learning Algorithms
1 Text Categorization  Assigning documents to a fixed set of categories  Applications:  Web pages  Recommending pages  Yahoo-like classification hierarchies.
A k-Nearest Neighbor Based Algorithm for Multi-Label Classification Min-Ling Zhang
Modeling (Chap. 2) Modern Information Retrieval Spring 2000.
Methods in Medical Image Analysis Statistics of Pattern Recognition: Classification and Clustering Some content provided by Milos Hauskrecht, University.
MediaEval Workshop 2011 Pisa, Italy 1-2 September 2011.
Copyright R. Weber Machine Learning, Data Mining ISYS370 Dr. R. Weber.
A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categorization Thorsten Joachims Carnegie Mellon University Presented by Ning Kang.
APPLICATIONS OF DATA MINING IN INFORMATION RETRIEVAL.
Processing of large document collections Part 2 (Text categorization) Helena Ahonen-Myka Spring 2006.
Distributed Networks & Systems Lab. Introduction Collaborative filtering Characteristics and challenges Memory-based CF Model-based CF Hybrid CF Recent.
Data mining and machine learning A brief introduction.
Bayesian Networks. Male brain wiring Female brain wiring.
Adaptive News Access Daniel Billsus Presented by Chirayu Wongchokprasitti.
Processing of large document collections Part 2 (Text categorization, term selection) Helena Ahonen-Myka Spring 2005.
Representation of Electronic Mail Filtering Profiles: A User Study Michael J. Pazzani Information and Computer Science University of California, Irvine.
Xiaoying Gao Computer Science Victoria University of Wellington Intelligent Agents COMP 423.
One-class Training for Masquerade Detection Ke Wang, Sal Stolfo Columbia University Computer Science IDS Lab.
Data Management and Database Technologies 1 DATA MINING Extracting Knowledge From Data Petr Olmer CERN
© 2003, G. Tecuci, Learning Agents Laboratory 1 Learning Agents Laboratory Computer Science Department George Mason University Prof. Gheorghe Tecuci 9 Instance-Based.
Classification Techniques: Bayesian Classification
Greedy is not Enough: An Efficient Batch Mode Active Learning Algorithm Chen, Yi-wen( 陳憶文 ) Graduate Institute of Computer Science & Information Engineering.
PSEUDO-RELEVANCE FEEDBACK FOR MULTIMEDIA RETRIEVAL Seo Seok Jun.
Copyright R. Weber Machine Learning, Data Mining INFO 629 Dr. R. Weber.
USE RECIPE INGREDIENTS TO PREDICT THE CATEGORY OF CUISINE Group 7 – MEI, Yan & HUANG, Chenyu.
Classification (slides adapted from Rob Schapire) Eran Segal Weizmann Institute.
Pairwise Preference Regression for Cold-start Recommendation Speaker: Yuanshuai Sun
KNN & Naïve Bayes Hongning Wang Today’s lecture Instance-based classifiers – k nearest neighbors – Non-parametric learning algorithm Model-based.
Iowa State University Department of Computer Science Center for Computational Intelligence, Learning, and Discovery Harris T. Lin, Sanghack Lee, Ngot Bui.
Text Categorization With Support Vector Machines: Learning With Many Relevant Features By Thorsten Joachims Presented By Meghneel Gore.
Improved Video Categorization from Text Metadata and User Comments ACM SIGIR 2011:Research and development in Information Retrieval - Katja Filippova -
Fast Query-Optimized Kernel Machine Classification Via Incremental Approximate Nearest Support Vectors by Dennis DeCoste and Dominic Mazzoni International.
A Supervised Machine Learning Algorithm for Research Articles Leonidas Akritidis, Panayiotis Bozanis Dept. of Computer & Communication Engineering, University.
Identifying “Best Bet” Web Search Results by Mining Past User Behavior Author: Eugene Agichtein, Zijian Zheng (Microsoft Research) Source: KDD2006 Reporter:
User Modeling and Recommender Systems: recommendation algorithms
1 Learning Bias & Clustering Louis Oliphant CS based on slides by Burr H. Settles.
Kansas State University Department of Computing and Information Sciences CIS 890: Special Topics in Intelligent Systems Wednesday, November 15, 2000 Cecil.
Instance-Based Learning Evgueni Smirnov. Overview Instance-Based Learning Comparison of Eager and Instance-Based Learning Instance Distances for Instance-Based.
BAYESIAN LEARNING. 2 Bayesian Classifiers Bayesian classifiers are statistical classifiers, and are based on Bayes theorem They can calculate the probability.
KNN & Naïve Bayes Hongning Wang
Unsupervised Learning Part 2. Topics How to determine the K in K-means? Hierarchical clustering Soft clustering with Gaussian mixture models Expectation-Maximization.
Classification Techniques: Bayesian Classification
Text Categorization Assigning documents to a fixed set of categories
Instance Based Learning
Authors: Wai Lam and Kon Fan Low Announcer: Kyu-Baek Hwang
SVMs for Document Ranking
Presentation transcript:

Content-Based Recommendation Systems. Michael J. Pazzani and Daniel Billsus, Rutgers University and FX Palo Alto Laboratory. Presented by Vishal Paliwal.

Additional Papers 1. Yoda: An Accurate and Scalable Web-based Recommendation System. Cyrus Shahabi, Farnoush Banaei-Kashani, Yi-Shin Chen, and Dennis McLeod, Department of Computer Science, Integrated Media Systems Center, University of Southern California, Los Angeles. In the Proceedings of the Sixth International Conference on Cooperative Information Systems, Trento, Italy, September 2001. A recommendation system called Yoda is introduced. It combines collaborative filtering (CF) with content-based querying and is designed to support large-scale Web-based applications that require highly accurate recommendations in real time. Yoda's accuracy surpasses that of the basic nearest-neighbor method by a wide margin (in most cases more than 100%).

2. A Maximum Entropy Web Recommendation System: Combining Collaborative and Content Features. Xin Jin, Yanzan Zhou, Bamshad Mobasher, School of Computer Science, Telecommunication, and Information Systems, DePaul University. ACM, 2005. A new Web recommendation system is introduced in which collaborative features, such as navigation or rating data, and the content features accessed by the users are seamlessly integrated under the maximum entropy principle. The system achieves better recommendation accuracy with a small number of recommendations, and the proposed framework provides reasonable prediction accuracy in the face of high data sparsity.

Outline Introduction Item Representation User Profiles Learning a user model Conclusion

Introduction A content-based recommendation (CBR) system recommends items to a user by maintaining a profile of the user's interests. CBR systems are used in a variety of domains: web pages, news articles, restaurants, TV programs. The main utility of CBR systems lies in selecting a subset of items to be displayed and determining the order in which to display them.

Item Representation Items are stored in a database table, and each item has a unique identifier key. Data can be structured or unstructured. Structured – a list of restaurants with fixed attributes. Unstructured – news articles. Unstructured data is more complex to handle, and data can also be semi-structured. Unrestricted text can be converted to a structured representation, e.g., a vector of term weights.
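As a sketch of that last point, here is a minimal illustration of mapping unrestricted text to structured term-weight vectors with scikit-learn's TF-IDF vectorizer; the three example documents are hypothetical:

# Minimal sketch: turning unrestricted text into structured vectors.
# The documents below are hypothetical; any corpus would do.
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "Inexpensive Italian restaurant with great pasta",
    "New Thai place downtown, spicy noodle dishes",
    "Stock markets rallied after the earnings report",
]

vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(docs)          # sparse items-by-terms matrix

# Each row of X is now a structured representation of one item.
print(X.shape)                              # (3, number_of_distinct_terms)
print(vectorizer.get_feature_names_out()[:5])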

User Profiles Most recommendation systems use a profile of the user's interests. Generally there are two types of information: A model of the user's preferences, i.e., a description of the types of items that interest the user. A history of the user's interactions with the recommendation system, which can also include queries typed by the user. In CBR, the history also serves as training data for a machine learning algorithm that creates a user model.

A user model can also be built through user customization: the system provides an interface, often check boxes and forms, through which the user constructs a representation of his or her own interests. User customization has many limitations: it cannot capture changing preferences, it does not provide a way to order the items, and it does not provide detailed, personalized recommendations.

Learning a user model Creating a model of the user's preferences from the user history. The training data is divided into two binary categories: 'items the user likes' and 'items the user doesn't like'. Labeling is accomplished either through explicit feedback or by observing the user's interactions with items. The importance of classification algorithms here is in providing an estimate of the probability that a user will like an unseen item.
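A minimal sketch of this setup, assuming a tiny hypothetical user history; any probabilistic classifier could stand in for the logistic regression used here:

# Sketch: user history as binary training data (1 = liked, 0 = disliked).
# Titles and labels are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

history = ["cheap italian pasta", "thai noodles spicy",
           "celebrity gossip column", "royal wedding coverage"]
liked = [1, 1, 0, 0]

vec = TfidfVectorizer()
X = vec.fit_transform(history)
model = LogisticRegression().fit(X, liked)

# Estimated probability that the user will like an unseen item.
unseen = vec.transform(["new spicy noodle restaurant"])
print(model.predict_proba(unseen)[0, 1])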

Classification algorithms in the paper Decision Trees and Rule Induction Nearest Neighbor Methods Relevance Feedback & Rocchio’s algorithm Linear Classifiers Probabilistic Methods & Naïve Bayes

Decision Trees & Rule Induction Examples are ID3 (Quinlan, 1986) and RIPPER (Cohen, 1995). Decision tree learners like ID3 build a tree by recursively partitioning the training data. They are excellent with structured data, but they are not ideal for unstructured text classification tasks (Pazzani and Billsus, 1997).

Rule induction algorithms such as RIPPER are closely related to decision trees. RIPPER performs competitively with other state-of-the-art text classification algorithms, and it supports multi-valued attributes, which makes it a good fit for text classification tasks.
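A minimal sketch of a decision tree learner on structured data (scikit-learn's CART implementation rather than ID3 itself; the restaurant attributes and labels are hypothetical):

# Sketch: decision tree on structured restaurant data.
# Features: [price_level, has_outdoor_seating, avg_rating] -- hypothetical.
from sklearn.tree import DecisionTreeClassifier

X = [[1, 1, 4.5],
     [3, 0, 4.0],
     [2, 1, 3.0],
     [3, 0, 2.5]]
y = [1, 1, 0, 0]        # 1 = user liked the restaurant, 0 = disliked

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(tree.predict([[2, 1, 4.2]]))   # predicted like/dislike for a new item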

Nearest Neighbor Methods The nearest neighbor algorithm stores all of its training data in memory. A new, unlabeled item is compared to all stored items using a similarity function to determine its nearest neighbor (or its k nearest neighbors), from which a numeric score or class label for the unseen item is derived. The similarity function depends on the type of data: for structured data, a Euclidean distance metric is used; for the vector space model, cosine similarity is used.
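A minimal sketch of k-nearest-neighbor classification with cosine similarity over TF-IDF vectors (scikit-learn; the documents and labels are hypothetical):

# Sketch: k-nearest neighbors over TF-IDF vectors with cosine similarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier

docs = ["great pasta and wine", "spicy thai noodles",
        "boring earnings report", "dull market summary"]
liked = [1, 1, 0, 0]                      # hypothetical like/dislike labels

vec = TfidfVectorizer()
X = vec.fit_transform(docs)

# metric="cosine" makes distance = 1 - cosine similarity.
knn = KNeighborsClassifier(n_neighbors=3, metric="cosine").fit(X, liked)
print(knn.predict_proba(vec.transform(["thai pasta place"]))[0])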

Relevance Feedback & Rocchio’s Algorithm Principle is to allow users to rate documents returned by the retrieval system. There are explicit and implicit means of collecting relevance feedback data. Rocchio’s algorithm is widely used algorithm that operates in the vector space model. Based on the modification of initial query through differently weighted prototypes of relevant and non-relevant documents.

Linear Classifiers These algorithms learn linear decision boundaries, i.e., hyperplanes separating instances in a multi-dimensional space. They are well suited to text classification tasks (Lewis et al., 1996). The outcome of the learning process is an n-dimensional weight vector; a threshold is used to convert the continuous predictions to discrete class labels.

With the Widrow-Hoff rule (also known as the delta rule or gradient descent rule), the weight vector can be derived incrementally; a learning rate controls the degree to which each additional instance affects the previous weight vector. An alternative is the exponentiated gradient (EG) algorithm. An advantage of these linear algorithms is that they can be run on-line.
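A minimal sketch of the Widrow-Hoff (delta rule) update in NumPy; the learning rate and the toy instance stream are hypothetical:

# Sketch: incremental Widrow-Hoff / delta-rule update of a weight vector.
#   w <- w + eta * (y - w.x) * x   for each (x, y) as it arrives
import numpy as np

def widrow_hoff_step(w, x, y, eta=0.1):
    error = y - w.dot(x)          # prediction error on this instance
    return w + eta * error * x    # move weights along the gradient

w = np.zeros(3)
stream = [(np.array([1.0, 0.0, 1.0]), 1.0),   # hypothetical (x, y) pairs
          (np.array([0.0, 1.0, 0.5]), 0.0)]
for x, y in stream:
    w = widrow_hoff_step(w, x, y)
print(w)

# A threshold on w.dot(x_new) then converts the continuous
# prediction into a discrete class label, as described above.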

Probabilistic Methods and Naïve Bayes Naïve Bayes is an exceptionally well performing text classification algorithm. There are two frequently used formulations of naïve Bayes: the multivariate Bernoulli model and the multinomial model. Both models assume that text documents are generated by a parameterized mixture model.

The multivariate Bernoulli formulation was derived with structured data in mind, whereas the multinomial formulation captures word frequency information. The multinomial naïve Bayes formulation outperforms the multivariate Bernoulli model, particularly noticeably for large vocabularies (McCallum and Nigam, 1998).
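A minimal sketch of the multinomial formulation over raw word counts (scikit-learn; documents and labels are hypothetical):

# Sketch: multinomial naive Bayes over word-count vectors.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

docs = ["great pasta great wine", "spicy noodles great broth",
        "market report earnings", "earnings call summary"]
liked = [1, 1, 0, 0]                    # hypothetical like/dislike labels

vec = CountVectorizer()                 # raw counts, as the model expects
X = vec.fit_transform(docs)

nb = MultinomialNB().fit(X, liked)
print(nb.predict_proba(vec.transform(["great noodles"]))[0])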

Limitations & Extensions of CBR Systems CBR cannot give good recommendations if the content does not contain enough information, mainly to distinguish the items the user likes from the items the user doesn't like. A better approach is to use collaborative as well as content-based features. The paper presented a variety of learning algorithms for adapting user profiles; however, the choice of algorithm depends upon the representation of the content. Thank You!