Large-scale Recommendations in a Dynamic Marketplace
Jay Katukuri, Rajyashree Mukherjee, Tolga Konik, Chu-Cheng Hsieh
LSRS 2013

1 Large-scale Recommendations in a Dynamic Marketplace
Jay Katukuri, Rajyashree Mukherjee, Tolga Konik, Chu-Cheng Hsieh

2 Meet John Doe
John is interested in an item, “iPhone 5 64gb white”. Should we recommend
– “iPhone 5 case”, or
– “iPhone 5s gold”?

3 Recommendation on e-marketplace
Recommendation “before” purchase: Similar Item Recommendation (SIR)
– iPhone 5S gold
Recommendation “after” purchase: Related Item Recommendation (RIR)
– iPhone 5 case

4 SIR Example 1

5 SIR Example 2

6 Related Item Recommendation
Recommendations for Xbox 360 4GB on the checkout page

7 Main Idea
Similar Item Clustering (SIC)
– Titles
– Attributes (price, etc.)
– Images
Recommendation
– SIR: same cluster
– RIR: neighbor clusters
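
The split between "same cluster" and "neighbor clusters" can be sketched as two lookups over precomputed models. This is a minimal illustration, not the deployed system: the toy clusters, the relation table, and the function names `similar_to` and `related_to` (chosen to echo the `?similarTo(item)` and `?relatedTo(item)` queries in the architecture slide) are all assumptions.

```python
# Hypothetical toy models: cluster id -> items, and cluster id -> neighbor clusters.
CLUSTER_ITEMS = {
    "iphone 5": ["iphone 5 64gb white", "iphone 5 32gb black"],
    "iphone 5 case": ["iphone 5 leather case", "iphone 5 bumper case"],
}
RELATED_CLUSTERS = {"iphone 5": ["iphone 5 case"]}

# Reverse index: item -> the cluster it belongs to.
ITEM_CLUSTER = {
    item: cluster for cluster, items in CLUSTER_ITEMS.items() for item in items
}


def similar_to(item):
    """SIR: other items from the seed item's own cluster."""
    cluster = ITEM_CLUSTER[item]
    return [other for other in CLUSTER_ITEMS[cluster] if other != item]


def related_to(item):
    """RIR: items from clusters related to the seed item's cluster."""
    cluster = ITEM_CLUSTER[item]
    return [
        other
        for neighbor in RELATED_CLUSTERS.get(cluster, [])
        for other in CLUSTER_ITEMS[neighbor]
    ]
```

With this toy data, `similar_to("iphone 5 64gb white")` returns the other iPhone 5 listing, while `related_to` on the same item returns the two cases.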

8 Models
Item clusters: each cluster is represented by meaningful keywords
– “clarks women shoe pumps classics”
– “authentic handmade amish quilt”
Cluster-Cluster Relations
– “samsung galaxy s4” -> “samsung galaxy s4 screen protector”
– “wolfgang puck electric pressure cooker” -> “kitchenaid food processor”

9 System Architecture - Overview
[Diagram] Three stages:
– Offline Model Generation: Clusters Model Generation and Related Clusters Model Generation, with Inventory, Transactions, Clickstream and the Conceptual Knowledgebase as inputs
– The Data Store: Clusters and Cluster-Cluster Relations
– Real-time Performance System: Similar Items Recommender (SIR), Lost Item -> ?similarTo(item) -> Similar Items; Related Items Recommender (RIR), Bought Item -> ?relatedTo(item) -> Related Items

10 Cluster Generation (offline)

11 Data on eBay
Item-item co-occurrences on transaction logs
Large Data
– Much bigger data set in both users and inventory than other e-commerce sites
Scale
– More than 300M listings
– More than 10M new items every day

12 Challenges
– Global clustering not feasible
– Size bias on different categories
– Performance

13 Model Generation - Clusters
1. Select a few keywords to represent “big notions”, e.g. iPhone, Handbags, etc.
– How to select them?
2. Cluster by K-means
– How to set K?

14 Model Generation - Clusters
[Diagram] Query-Recall Generation and Cluster Generation, with labels: Inventory, Clickstream, Conceptual Knowledgebase, user queries, concepts, categories, items, query-to-items, new clusters, Data Store (Clusters)
Problem: Global clustering is not feasible
Solution: Partition input data by user queries
– Parallel distributed K-Means in Hadoop MapReduce
– Dedupe and merge overlapping clusters (100X reduction in size over inventory with over 90% coverage)
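
Partitioning by user queries can be pictured as a group-by over query-to-item pairs, which is the shape of work a MapReduce job performs. The sketch below runs in a single process and only illustrates the idea; the function names, the pair format, and the coverage measure are assumptions rather than details taken from the slides.

```python
from collections import defaultdict


def partition_by_query(query_item_pairs):
    """Group item ids under the query that recalled them.

    In MapReduce terms, each (query, item) pair is a mapper output keyed by
    query, and the grouping below plays the role of the shuffle and reduce.
    """
    partitions = defaultdict(set)
    for query, item in query_item_pairs:
        partitions[query].add(item)
    return dict(partitions)


def coverage(partitions, inventory):
    """Fraction of a non-empty inventory that appears in at least one partition."""
    covered = set()
    for items in partitions.values():
        covered |= items
    return len(covered & inventory) / len(inventory)
```

Each partition is small enough to cluster independently, which is what makes the local K-Means step parallelizable; `coverage` is one way to check how much of the inventory the query partitions reach.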

15 Base Cluster Generation
Base Cluster ≡ Query
Find merge candidates based on query term overlap
– e.g. “nike airmax tennis shoes” -> “nike airmax”
Score candidates using cosine similarity
– Term weight: TF-IDF in the query space (document = query)
– TF: Query Demand
– IDF: Number of Queries
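
One possible reading of the scoring step, sketched under stated assumptions: each query is treated as a document, a term's TF is the summed demand of the queries containing it, and its IDF is a smoothed log of the total number of queries over the number of queries containing the term. The slide gives only the labels "Query Demand" and "Number of Queries", so the exact weighting formula here is a guess, and `term_weights` and `cosine` are hypothetical names.

```python
import math
from collections import Counter


def term_weights(query_demand):
    """Map each term to a TF-IDF style weight in the query space.

    query_demand: dict of query string -> demand (how often it was issued).
    """
    n_queries = len(query_demand)
    tf = Counter()  # demand-weighted term frequency
    df = Counter()  # number of distinct queries containing the term
    for query, demand in query_demand.items():
        for term in set(query.split()):
            tf[term] += demand
            df[term] += 1
    # Smoothed IDF so that a term present in every query keeps a positive weight.
    return {t: tf[t] * math.log(1 + n_queries / df[t]) for t in tf}


def cosine(query_a, query_b, weights):
    """Cosine similarity between two queries viewed as weighted term vectors."""
    terms_a, terms_b = set(query_a.split()), set(query_b.split())
    dot = sum(weights.get(t, 0.0) ** 2 for t in terms_a & terms_b)
    norm_a = math.sqrt(sum(weights.get(t, 0.0) ** 2 for t in terms_a))
    norm_b = math.sqrt(sum(weights.get(t, 0.0) ** 2 for t in terms_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Under this weighting, “nike airmax tennis shoes” and “nike airmax” score strictly between 0 and 1, and a merge threshold (not specified in the slides) would decide whether the longer query folds into the shorter one.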

16 Step 1: Base Cluster Candidates
Method for choosing the “base clusters” (initial states):
– Minimum frequency
– Supply threshold (enough inventory)
– Min and max token constraint (length of queries)
– Heuristic constraints, e.g. queries that contain only numbers are not allowed: “10 5” …
– Merge similar clusters into one
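
The candidate filters listed above can be sketched as a single predicate. All threshold values below are placeholders picked for illustration; the slides do not give the actual numbers, and `is_base_cluster_candidate` is a hypothetical name.

```python
def is_base_cluster_candidate(
    query,
    frequency,
    supply,
    min_frequency=10,  # assumed threshold, not from the slides
    min_supply=20,     # assumed threshold, not from the slides
    min_tokens=1,      # assumed bound, not from the slides
    max_tokens=6,      # assumed bound, not from the slides
):
    """Return True if a query passes the base-cluster filters."""
    tokens = query.split()
    if frequency < min_frequency:  # minimum frequency
        return False
    if supply < min_supply:  # enough inventory behind the query
        return False
    if not (min_tokens <= len(tokens) <= max_tokens):  # token length bounds
        return False
    if all(token.isdigit() for token in tokens):  # numbers-only queries like "10 5"
        return False
    return True
```

Queries that survive these filters would then go through the merge step before clustering.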

17 Candidates Merge
4.34M base clusters merged into 1.95M
Example
– phrase(hand,made) phrase(king,s) queen quilt
– phrase(hand,made) phrase(pink,s) quilt
– phrase(hand,made) phrase(prae,owned) queen quilt
– phrase(hand,made) queen quilt
– phrase(hand,made) phrase(prae,owned) quilt
– phrase(hand,made) quilt size twin
– phrase(hand,made) quilt silk
– phrase(hand,made) quilt twin
– phrase(hand,made) phrase(patch,work) quilt
– phrase(hand,made) quilt white
– phrase(hand,made) phrase(king,size) quilt
– phrase(hand,made) phrase(yo,yo,s) quilt
– phrase(hand,made) quilt sale
– phrase(hand,made) quilt red
– phrase(hand,made) quilt

18 Step 2: K-Means Clustering
[Diagram] Labels: Transaction Logs, Inventory Logs, Query to Items Data, Base Cluster Generation, Generate Item Features, Scoring Models, K-Means Clustering of Base Clusters, Split Clusters
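
For readers unfamiliar with the algorithm named on this slide, here is a plain K-Means sketch applied to the items of one base cluster. It is a textbook version under simplifying assumptions: items are already encoded as numeric feature tuples (the slides do not describe the encoding), centroids start from the first `k` points, and there are at least `k` points. It is not the distributed Hadoop implementation the talk refers to.

```python
import math


def kmeans(points, k, iterations=20):
    """Split points into k groups; returns (assignment, centroids)."""
    # Deterministic initialization: the first k points become the centroids.
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, point in enumerate(points):
            assignment[i] = min(
                range(k), key=lambda c: math.dist(point, centroids[c])
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment, centroids
```

Because every base cluster is clustered on its own, K can differ per base cluster, which is one way to approach the "How to set K?" question raised earlier.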

19 Clusters on Item Signature
– Cluster: apple ipod touch 4g clear film protector screen
– Cluster: clarks women shoe pumps classics

20 Recommendation (online)

21 Performance System
[Diagram] Two pipelines over the Data Store (Clusters, Inventory, Conceptual Knowledgebase, Cluster-Cluster Relations):
– SIR: Lost Item, ?similarTo(item), Cluster Assignment, SIR Query Formation, Item Search (query), Item Selection, SIR Ranking, Similar Items recommendations
– RIR: Bought Item, ?relatedTo(item), Cluster Assignment, related clusters, RIR Query Formation, Item Search (queries), Item Selection, RIR Ranking, Related Items recommendations
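
The Cluster Assignment and Query Formation boxes can be illustrated with a small sketch. It assumes a seed item is assigned to the cluster whose keywords have the highest Jaccard overlap with its title tokens, and that the search query is simply that cluster's keyword string. Both choices are assumptions made for illustration; the slides do not say how either step works, and the function names are hypothetical.

```python
def assign_cluster(title, cluster_keywords):
    """Return the id of the best-overlapping cluster, or None if nothing overlaps."""
    title_tokens = set(title.lower().split())
    best_id, best_score = None, 0.0
    for cluster_id, keywords in cluster_keywords.items():
        keyword_tokens = set(keywords.split())
        union = title_tokens | keyword_tokens
        if not union:
            continue
        score = len(title_tokens & keyword_tokens) / len(union)  # Jaccard overlap
        if score > best_score:
            best_id, best_score = cluster_id, score
    return best_id


def form_sir_query(title, cluster_keywords):
    """Use the assigned cluster's keywords as the search query.

    Falls back to the raw title when no cluster matches.
    """
    cluster_id = assign_cluster(title, cluster_keywords)
    return cluster_keywords[cluster_id] if cluster_id is not None else title
```

Searching with cluster keywords rather than the full seed title drops listing-specific tokens such as size or color, which tends to widen recall to the whole cluster.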

22 Items in the same cluster

23 Similar Item Recommendations

24 Experimental Results
A/B tests comparing against legacy systems
– SIR legacy system: completely online; naïve approach of using the seed item title as a search query
– RIR legacy system: collaborative filtering on stable representations of items (Chen, Y. and J.F. Canny, Recommending ephemeral items at web scale, ACM SIGIR 2011)
Significant improvements at the 90% confidence level
– SIR resulted in 38.18% higher user engagement (CTR)
– RIR resulted in 10.5% higher CTR
– Statistically significant improvement in site-wide business metrics from both SIR and RIR
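
To make "higher CTR" and "90% confidence" concrete, the sketch below computes a relative CTR lift and a pooled two-proportion z statistic. The slides do not say which significance test was used, so the z-test is an assumption, and any click and view counts passed in are the caller's own; none are taken from the talk.

```python
import math

# Critical value for a two-sided test at the 90% confidence level.
Z_CRITICAL_90 = 1.645


def relative_lift(ctr_control, ctr_treatment):
    """Relative CTR improvement of treatment over control (0.10 means 10% higher)."""
    return (ctr_treatment - ctr_control) / ctr_control


def two_proportion_z(clicks_a, views_a, clicks_b, views_b):
    """z statistic for the difference between two click-through rates."""
    p_a = clicks_a / views_a
    p_b = clicks_b / views_b
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    return (p_b - p_a) / std_err
```

A difference counts as significant at this level when the absolute z value exceeds `Z_CRITICAL_90`.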

25 Conclusion
Balance between similarity and quality is crucial in driving user engagement and conversion
Clusters of similar items in the inventory
– Local clustering in the coverage set of user queries
Offline models built using Map-Reduce
– Huge input datasets including inventory, clickstream and transactional data
Efficient real-time performance system
Currently deployed on

26 Acknowledgments
Current and past team members
– Kranthi Chalasani
– Santanu Kolay
– Riyaaz Shaik
– Venkat Sundaranatha

27 WE’RE HIRING
Chu-Cheng Hsieh
