Presentation on theme: "A Graph-based Recommender System Zan Huang, Wingyan Chung, Thian-Huat Ong, Hsinchun Chen Artificial Intelligence Lab The University of Arizona 07/15/2002."— Presentation transcript:
A Graph-based Recommender System Zan Huang, Wingyan Chung, Thian-Huat Ong, Hsinchun Chen Artificial Intelligence Lab The University of Arizona 07/15/2002 Acknowledgement: NSF DLI-2 (IIS-9817473)
Agenda Introduction Literature Review A Graph-based Recommender System Research Questions Research Testbed and Experiments Conclusion and Discussion Questions and Comments
Introduction Recommender System: From Business Application to Digital Libraries
Product information User information Interaction information between users and products Challenges for both buyers and sellers Information Overload
Recommender Systems Automatic recommendation generation Substantial research interest –PHOAKS (1997), Syskill & Webert (1997), Fab (1997), GroupLens (1998) Commercial applications –Amazon.com, CDNOW, Drugstore, MovieFinder –Business success (Schafer et al. 2001) Browser to buyer Cross-selling Customer loyalty
Digital Libraries Information overload –Library content information –User information –Library usage information Recommender system for Digital Libraries –Efficient knowledge dissemination –User satisfaction
Literature Review Recommender System: System Inputs and Recommendation Approaches
Recommender System A system that recommends items to users by predicting a user’s interest in an item from item information, user information, and interactions between users and items. Items – documents, web pages, books, movies, restaurants, etc.
System Inputs User factual data –Demographic information Item factual data –Structural attribute information –Textual description/content information Transactional data –Explicit feedback – rating, comments –Implicit feedback – purchase, browsing
Recommendation Approaches Content-based approach –Based on item factual data –Item neighborhood formation –Machine learning methods Collaborative filtering approach –Based on user factual data and transactional data –User neighborhood formation –Similarity functions, correlation, clustering –Collaborative filtering association rules (Fu et al. 2000)
Recommendation Approaches (cont.) Hybrid approach –Combining content-based approach and collaborative filtering approach Combining recommendation results –(Claypool et al. 1999) Collaborative filtering augmented by content analysis –(GroupLens, Sarwar et al. 1998, Fab, Balabanovic and Shoham 1997) Comprehensive models –(Basu et al. 1998)
A Graph-based Recommender System Model and Recommendation Methods
A Two-layer Graph Model Goal –Comprehensive representation –Support flexible recommendation approaches A two-layered graph model –User layer – users as nodes, user similarity as links –Item layer – items as nodes, item similarity as links –Inter-layer links – interaction between user and items
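One way the two-layer model could be held in memory is sketched below. This is a minimal illustration, not the authors' implementation: the class name `TwoLayerGraph`, its method names, and the tagging of nodes as `("user", id)` or `("item", id)` are all assumptions made for the example.

```python
from collections import defaultdict


class TwoLayerGraph:
    """Hypothetical container for the two-layer model.

    Nodes are tagged tuples, ("user", id) or ("item", id). Every link is
    undirected and carries a weight plus a kind: "user" (user-layer
    similarity), "item" (item-layer similarity), or "inter" (a user-item
    interaction such as a purchase).
    """

    def __init__(self):
        self.adj = defaultdict(list)  # node -> [(neighbor, weight, kind)]

    def _add(self, a, b, weight, kind):
        self.adj[a].append((b, weight, kind))
        self.adj[b].append((a, weight, kind))

    def add_user_similarity(self, u1, u2, weight):
        self._add(("user", u1), ("user", u2), weight, "user")

    def add_item_similarity(self, i1, i2, weight):
        self._add(("item", i1), ("item", i2), weight, "item")

    def add_interaction(self, user, item, weight=1.0):
        self._add(("user", user), ("item", item), weight, "inter")

    def neighbors(self, node, kinds=("user", "item", "inter")):
        """Return (neighbor, weight) pairs reachable over the given link kinds."""
        return [(n, w) for n, w, k in self.adj.get(node, []) if k in kinds]
```

Restricting `kinds` when calling `neighbors` is what would let a single structure serve several recommendation approaches.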
Model Characteristics Comprehensiveness –All three types of system inputs –Transformation of feature data into similarity data Flexibility –Flexible similarity calculation –Multiple types of transactional data Recommendation as a graph search task –Finding item nodes highly associated with the user nodes –Support different recommendation approaches –Different association calculation methods
Recommendation Approaches Content-based approach –Starting from item nodes associated with the target user, exploring the item-layer links Collaborative filtering approach –Starting from the target user node, exploring the user-layer links and inter-layer links Hybrid approach –Starting with the target user node, exploring all three types of links
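The three approaches differ only in which link types the search follows. The sketch below illustrates this with two-hop paths only, for brevity. The link list, its weights, and the function names are hypothetical, and scoring a path by the product of its link weights is an assumption made for the example.

```python
# Hypothetical links: (node_a, node_b, weight, kind).
# kind is "user" (user similarity), "item" (item similarity) or "inter" (purchase).
LINKS = [
    ("C1", "C2", 0.7, "user"),
    ("C1", "B2", 0.5, "inter"),
    ("C2", "B3", 0.5, "inter"),
    ("B2", "B1", 0.6, "item"),
]


def build_adjacency(links):
    """Turn the undirected link list into node -> [(neighbor, weight, kind)]."""
    adj = {}
    for a, b, w, kind in links:
        adj.setdefault(a, []).append((b, w, kind))
        adj.setdefault(b, []).append((a, w, kind))
    return adj


def recommend(adj, user, approach):
    """Score unseen items over two-hop paths; approach selects the link types."""
    owned = {n for n, _, k in adj.get(user, []) if k == "inter"}
    scores = {}

    def follow(first_kind, second_kind):
        for mid, w1, k1 in adj.get(user, []):
            if k1 != first_kind:
                continue
            for item, w2, k2 in adj.get(mid, []):
                if k2 == second_kind and item not in owned and item != user:
                    scores[item] = scores.get(item, 0.0) + w1 * w2

    if approach in ("content", "hybrid"):
        follow("inter", "item")   # user -> purchased item -> similar item
    if approach in ("collaborative", "hybrid"):
        follow("user", "inter")   # user -> similar user -> that user's item
    # Highest score first; ties broken by item name for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

With these made-up links, the content-based call for `"C1"` can only reach `B1`, the collaborative call can only reach `B3`, and the hybrid call reaches both.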
Recommendation Methods Low-degree association –Exploring direct associations High-degree association –Exploring transitive associations A simple example –1-degree association = 0 –2-degree association = 0.5*0.6=0.3 (C1-B2-B1) –3-degree association = 0.3+0.21+0.12+0.28=0.91 (C1-B2-B1, C1-C2-B2-B1, C1-B2-B3-B1, C1-C2-B3-B1)
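The arithmetic in the example can be reproduced by summing, over every path of at most k links between the two nodes, the product of the link weights on that path. The transcript does not preserve the individual link weights from the slide's figure, so the weights below are hypothetical. They were chosen only so that the four path products come out to 0.3, 0.21, 0.12 and 0.28. The function names are likewise assumptions, as is reading C as customer nodes and B as book nodes.

```python
# Hypothetical undirected link weights (NOT taken from the slide's figure).
WEIGHTS = {
    ("C1", "B2"): 0.5, ("B2", "B1"): 0.6,
    ("C1", "C2"): 0.7, ("C2", "B2"): 0.5,
    ("B2", "B3"): 0.3, ("B3", "B1"): 0.8,
    ("C2", "B3"): 0.5,
}


def build_adjacency(weights):
    """Expand the undirected weight table into node -> {neighbor: weight}."""
    adj = {}
    for (a, b), w in weights.items():
        adj.setdefault(a, {})[b] = w
        adj.setdefault(b, {})[a] = w
    return adj


def association(adj, source, target, max_degree):
    """Sum of weight products over simple paths with at most max_degree links."""
    total = 0.0
    stack = [(source, 1.0, (source,))]
    while stack:
        node, product, path = stack.pop()
        if node == target:
            total += product          # a complete path: add its product
            continue
        if len(path) - 1 == max_degree:
            continue                  # path already uses the allowed links
        for nxt, w in adj[node].items():
            if nxt not in path:       # simple paths only, no revisits
                stack.append((nxt, product * w, path + (nxt,)))
    return total


adj = build_adjacency(WEIGHTS)
for k in (1, 2, 3):
    print(k, round(association(adj, "C1", "B1", k), 2))  # 0.0, 0.3, 0.91
```

Enumerating paths this way grows quickly with k and with graph size, so it is practical only for small illustrations.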
Recommendation Methods (cont.) High-degree association recommendation algorithm –High-degree association retrieval in the associative retrieval literature –Hopfield Net Spreading Activation (Chen and Ng 1995, Houston et al. 2000) Item and user nodes as neurons and links as synapses in the Hopfield Net Parallel relaxation search Stop when activation values in the network converge Item nodes with the highest activation values as recommendations
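The relaxation loop described above might look roughly like the following sketch. It is a simplified variant written for illustration, not the algorithm as published. The sigmoid transfer function, the threshold values `theta1` and `theta2`, the convergence tolerance `epsilon`, the choice to keep the target user's activation clamped at 1, and all function names are assumptions.

```python
import math


def spreading_activation(weights, start, theta1=0.5, theta2=0.1,
                         epsilon=1e-4, max_iters=100):
    """Synchronous (parallel) relaxation over a weighted graph.

    weights: node -> {neighbor: link weight}, symmetric, non-negative.
    start:   the target user node, activated with value 1.
    Returns node -> activation once the total change falls below epsilon
    (or after max_iters rounds, since convergence is not guaranteed).
    """
    nodes = list(weights)
    activation = {n: (1.0 if n == start else 0.0) for n in nodes}
    for _ in range(max_iters):
        new = {}
        for j in nodes:
            if j == start:
                new[j] = 1.0  # assumption: target user stays clamped
                continue
            # Weighted sum of the neighbors' current activations.
            net = sum(w * activation[i] for i, w in weights[j].items())
            # Sigmoid transfer function with assumed thresholds.
            new[j] = 1.0 / (1.0 + math.exp(-(net - theta1) / theta2))
        change = sum(abs(new[n] - activation[n]) for n in nodes)
        activation = new
        if change < epsilon:
            break
    return activation


def rank_items(activation, weights, user, items):
    """Items not already linked to the user, highest activation first."""
    seen = set(weights.get(user, {}))
    candidates = [i for i in items if i not in seen]
    return sorted(candidates, key=lambda i: (-activation[i], i))
```

Because activation flows across several links before the network settles, an item can be ranked highly even when it has no direct link to the target user.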
Recommender System Problems Content-based recommendation –Over-specification Collaborative filtering recommendation –Early rater problem –Sparsity problem Possible solutions –Hybrid recommendation approach –High-degree association recommendation
Research Questions Does a hybrid recommendation approach achieve higher recommendation quality than content-based or collaborative filtering approaches? Does high-degree association recommendation improve recommendation quality?
Research Testbed An online bookstore –Books.com.tw –One of the biggest online bookstores in Taiwan –Data set: 2,000 customers, 9,695 books, 18,771 transaction records Similarity with a typical digital library environment –Books with description and attributes – Electronic documents in DL –Customer demographic information – DL user demographic information –Customers with purchase history – DL users with browsing or borrowing histories
Implementation Details Book representation –Book attributes (price, publisher, layout, etc.) –Book content (title, keyword, introduction, etc.) Chinese key phrase extraction –Mutual Information algorithm (Ong and Chen 1999) Similarity calculation –Attribute-based similarity –Book content similarity An asymmetric algorithm based on key phrase vector model (Houston et al. 2000)
Book Sales Transactions [Figure: book sales transactions over time; axis labels 2000.1, 2000.2, 2000.3, 2000.4]
Experiment Procedure Holdout testing –Use half of the purchases (past purchases) to make recommendations. See if they match the other half (future purchases). (Sarwar et al. 1998) –Used 100 randomly selected customers as sample data. Measurement of recommendation quality
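The holdout procedure and the two quality measures reported later (precision and recall) can be sketched as follows. The function names are hypothetical, and the sketch assumes each customer's purchases are already sorted chronologically so that the first half counts as "past" and the second half as "future".

```python
def split_holdout(purchases):
    """Split a chronologically ordered purchase list into (past, future) halves."""
    mid = len(purchases) // 2
    return purchases[:mid], purchases[mid:]


def precision_recall(recommended, future):
    """Precision: share of recommendations that were later purchased.
    Recall: share of later purchases that were recommended."""
    rec, fut = set(recommended), set(future)
    hits = len(rec & fut)
    precision = hits / len(rec) if rec else 0.0
    recall = hits / len(fut) if fut else 0.0
    return precision, recall


past, future = split_holdout(["B1", "B2", "B3", "B4"])
# Recommendations would be generated from `past` only; here they are made up.
print(precision_recall(["B3", "B5", "B6", "B7"], future))  # (0.25, 0.5)
```

Averaging these two numbers over the sampled customers gives one precision and one recall figure per recommendation method.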
Hypotheses The hybrid recommendation approach achieves better performance than the content-based recommendation approach. The hybrid recommendation approach achieves better performance than the collaborative filtering recommendation approach. Exploring high-degree associations achieves better performance than exploring only low-degree associations.
Experiment Results Statistical results –Hybrid approach achieved significantly higher precision and recall than content-based (t-test p-value: precision: 0.0058, recall: 0.0000) and collaborative approaches (t-test p-value: precision: 0.0016, recall: 0.0002) –No significant difference between high-degree association and low-degree association methods
Conclusion and Discussion A generic graph model for recommender systems –Comprehensive data representation –Flexible recommendation approaches –Applicable in Digital Libraries A hybrid approach improved recommendation quality No significant improvement was observed for high-degree association methods
Conclusion and Discussion (cont.) About low precision and recall –The gap between interest and purchase behavior –Online bookstore data might not fully represent users’ interests High-degree association method –Poor performance might be related to the density of the graph
Recent Development The relationship between high-degree association recommendation performance and graph density Implementation of association rule mining under the graph model for different recommendation approaches Implementation of other associative retrieval algorithms for high-degree association recommendation –Associative Linear Retrieval Model –Leaky Capacity Model Spreading Activation –Branch-and-Bound Spreading Activation
For Project Information http://ai.bpa.arizona.edu email@example.com Acknowledgement NSF DLI-2 #9817473