Recommender Systems and Product Semantics Rayid Ghani & Andy Fano Accenture Technology Labs Workshop on Recommendation & Personalization in E-Commerce May 28, 2002
Who are we? Accenture Technology Labs: the R&D group for Accenture. About 40 researchers in Chicago, Palo Alto (California) and Sophia Antipolis (France). Research in Data Mining, Machine Learning, Ubiquitous Computing, Wearable Computing, Language Technologies, Virtual & Augmented Reality, Collaborative Workspaces…
What Does a Transaction Mean? Terabytes of transaction data. But what does any one transaction mean? What does it tell us about the customer?
Example: Apparel. Transactional information captured by retailers: Date of Purchase, SKU, Price, Size, Brand. But what does this tell me about the customer who bought it?
Product Semantics: What does a product mean? What does this shirt say about her? Is it conservative or flashy? Trendy or classic? Formal or casual? Where would we get this information?
Where do people get this information? Marketing. Product companies and retailers spend fortunes telling customers what their products mean. Our idea: build a system that analyzes marketing texts to infer these attributes.
Example from the Macy’s web site: DKNY Jeans Ruched Side-Tie Tee. Get back to basics with a fresh new look this season. The Ruched Side-Tie Tee has a drawstring tie at left hip with shirred detail down the side. Stretch provides a flattering, shapely fit. V-neck.
Training the System: Product Descriptions + Domain Experts -> Product descriptions marked up with attribute values -> Supervised Learning Algorithm -> Learned Statistical Models
Inferring Attributes via Text Classification. Build one classifier per attribute type. Simple statistical classifier: Naïve Bayes, multinomial model (McCallum & Nigam 1998). For all words in the descriptions and all attribute values: calculate P(word | attribute value) using the manually rated items. Given a new item description: calculate P(attribute value | item description) for all attribute values and choose the attribute value with the highest probability.
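The steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the add-one (Laplace) smoothing, and the four toy descriptions are assumptions made for the example. The toy words are borrowed from the learned-model slides later in the talk.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(labeled_docs, alpha=1.0):
    """Estimate P(word | attribute value) from (words, value) pairs.

    alpha is an assumed add-one smoothing constant, not taken from the talk.
    """
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, value in labeled_docs:
        class_counts[value] += 1
        word_counts[value].update(words)
        vocab.update(words)
    total_docs = sum(class_counts.values())
    model = {"log_prior": {}, "log_likelihood": {}}
    for value, n_docs in class_counts.items():
        model["log_prior"][value] = math.log(n_docs / total_docs)
        denom = sum(word_counts[value].values()) + alpha * len(vocab)
        model["log_likelihood"][value] = {
            w: math.log((word_counts[value][w] + alpha) / denom) for w in vocab
        }
    return model

def posterior(model, words):
    """Return P(attribute value | description) for every attribute value."""
    scores = {}
    for value, log_prior in model["log_prior"].items():
        ll = model["log_likelihood"][value]
        # Words never seen in training are skipped.
        scores[value] = log_prior + sum(ll[w] for w in words if w in ll)
    top = max(scores.values())
    exp = {v: math.exp(s - top) for v, s in scores.items()}
    z = sum(exp.values())
    return {v: e / z for v, e in exp.items()}

def classify(model, words):
    """Pick the attribute value with the highest posterior probability."""
    probs = posterior(model, words)
    return max(probs, key=probs.get)

# Toy, hand-made training data (hypothetical labels for illustration only).
TOY_DOCS = [
    (["classic", "blazer", "trouser"], "conservative"),
    (["seasonless", "blazer"], "conservative"),
    (["flirty", "leopard", "chemise"], "flashy"),
    (["silk", "platform", "flirty"], "flashy"),
]
toy_model = train_naive_bayes(TOY_DOCS)
toy_label = classify(toy_model, ["classic", "blazer"])
```

One such model would be trained per attribute type (conservative vs. flashy, formal vs. informal, and so on), each on its own set of rated descriptions.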
Semi-supervised Learning. Lots of product descriptions are available at minimal cost, but labeling them is expensive. Apply magical algorithms that combine labeled and unlabeled data for classification: EM (Nigam et al. 1999), Co-Training (Blum & Mitchell 1998), Co-EM (Nigam & Ghani), ECo-Train (Ghani, 2002)
The EM Algorithm: Learn Naïve Bayes from labeled data -> Estimate labels -> Probabilistically add to labeled data -> relearn Naïve Bayes (then repeat)
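The loop on this slide can be sketched as follows. This is a simplified, assumption-laden illustration of EM with a Naïve Bayes model, not the exact procedure from Nigam et al.: the function names, the smoothing constant, the fixed iteration count, and the toy data are all made up for the example.

```python
import math
from collections import defaultdict

def m_step(docs, soft_labels, classes, alpha=1.0):
    """Re-estimate Naive Bayes parameters from (possibly fractional) labels."""
    vocab = {w for words in docs for w in words}
    class_mass = {c: 0.0 for c in classes}
    word_mass = {c: defaultdict(float) for c in classes}
    for words, weights in zip(docs, soft_labels):
        for c in classes:
            class_mass[c] += weights[c]
            for w in words:
                word_mass[c][w] += weights[c]
    n = len(docs)
    log_prior = {
        c: math.log((class_mass[c] + alpha) / (n + alpha * len(classes)))
        for c in classes
    }
    log_lik = {}
    for c in classes:
        denom = sum(word_mass[c].values()) + alpha * len(vocab)
        log_lik[c] = {w: math.log((word_mass[c][w] + alpha) / denom) for w in vocab}
    return log_prior, log_lik

def e_step(words, log_prior, log_lik):
    """Return P(class | words) under the current model (unknown words ignored)."""
    scores = {
        c: log_prior[c] + sum(log_lik[c].get(w, 0.0) for w in words)
        for c in log_prior
    }
    top = max(scores.values())
    exp = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(exp.values())
    return {c: e / z for c, e in exp.items()}

def em_naive_bayes(labeled, unlabeled, classes, iterations=10):
    """Learn from labeled data, then repeatedly estimate and re-use soft labels."""
    labeled_docs = [words for words, _ in labeled]
    hard = [{c: 1.0 if c == y else 0.0 for c in classes} for _, y in labeled]
    # Initial model: labeled data only.
    log_prior, log_lik = m_step(labeled_docs, hard, classes)
    for _ in range(iterations):
        # Estimate labels for the unlabeled descriptions.
        soft = [e_step(words, log_prior, log_lik) for words in unlabeled]
        # Probabilistically add them to the labeled data and relearn.
        log_prior, log_lik = m_step(labeled_docs + unlabeled, hard + soft, classes)
    return log_prior, log_lik

# Toy data (hypothetical): two labeled and four unlabeled descriptions.
CLASSES = ["conservative", "flashy"]
LABELED = [(["classic", "blazer"], "conservative"), (["flirty", "leopard"], "flashy")]
UNLABELED = [
    ["classic", "trouser"],
    ["blazer", "trouser", "seasonless"],
    ["flirty", "chemise"],
    ["leopard", "chemise", "silk"],
]
em_prior, em_lik = em_naive_bayes(LABELED, UNLABELED, CLASSES)
trouser_probs = e_step(["trouser"], em_prior, em_lik)
```

In this toy run the word "trouser" never appears in a labeled description, yet the model learns to associate it with "conservative" because it co-occurs with "classic" and "blazer" in the unlabeled text.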
A Peek at the Learned Models. Extremely Conservative: lauren, ralph, breasted, seasonless, trouser, jones, sport, classic, blazer. Not Conservative (Flashy): rose, special, leopard, chemise, straps, flirty, spray, silk, platform. Example descriptions: Bias Slip Dress: The perfect black dress gets flirty and feminine in the bias-cut slip dress with sheer ruffled cap sleeves. A low, scoop neck and back is ultra-flattering while a draped, romantic fit reveals total elegance. Lauren Single-Breasted Blazer: Sporty elegance and classic Gatsby-esque styling are captured in this impeccably designed single-breasted, three-button blazer from Lauren by Ralph Lauren. With traditional notch collar, signature button hardware, front flap pockets, and signature crest on left breast pocket.
A Peek at the Learned Models. Informal: jean, tommy, denim, sweater, pocket, neck, tee, hilfiger. Formal: jacket, fully, button, skirt, lines, seam, crepe, leather. Example descriptions: Polo Jeans Co. Muscle Logo Tee: Strut your stuff in the Muscle Logo Tee. Flattering on the arms with a close-to-the-body fit, classic crewneck and shimmery logo print with stars. A sporty new basic for your tee collection. BLACK TRIACETATE JACKET: A fresh alternative to classic suiting. Wear open for cardigan effect, buttoned for a clean look. Hidden placket with four tonal buttons and a hook-and-eye closure at the collar. Falls to hip. Lined.
A Peek at the Learned Models. Loungewear: chemise, silk, kimono, calvin, klein, august, lounge, hilfiger, robe, gown. Partywear: rock, dress, sateen, length:, skirt, shirtdress, open, platform, plaid, flower. Example description: ABS by Allen Schwartz Asymmetrical Dress: Just for the party girl with a big feminine streak. A ruffled one-shoulder cuts diagonally across the front and back. Accented with a rhinestone detail on the shoulder.
A Peek at the Learned Models. Extremely Sporty: sneaker, camp, base, rubber, sole, white, miraclesuit, athletic, nylon, mesh. Juniors: jrs, dkny, jeans, tee, collegiate, logo, tommy, polo, short, sneaker. Example description: DKNY Jeans Jrs. Mesh Jersey Sweater: An innovative take on the football jersey, the see-through mesh sweater is a fashion favorite among the sporty set. Denim appliqué
Populating the Knowledge Base: New Product Descriptions + Learned Statistical Models -> Product descriptions automatically marked up with attribute values -> Product Semantics Knowledge Base
Recommender System: Retailer’s Web Site -> Extracted Descriptions of Products Browsed -> Learned Statistical Models -> Evolving User Profile -> Query the Product Semantics Knowledge Base for Matching Products -> Recommend Matching Products to User
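One way to read this pipeline is sketched below. The talk does not say how the profile is represented or how matching is done, so everything here is an assumption: the profile is a running average of attribute scores, matching is squared Euclidean distance, and the attribute scores in the toy knowledge base are invented (only the product names come from the earlier slides).

```python
def update_profile(profile, product_attrs, n_seen):
    """Fold one browsed product into a running-average user profile."""
    return {
        attr: (profile.get(attr, 0.0) * n_seen + score) / (n_seen + 1)
        for attr, score in product_attrs.items()
    }

def recommend(profile, knowledge_base, exclude, k=2):
    """Return the k products whose attribute scores are closest to the profile."""
    def distance(attrs):
        return sum((attrs[a] - profile[a]) ** 2 for a in profile)
    candidates = [
        (distance(attrs), name)
        for name, attrs in knowledge_base.items()
        if name not in exclude
    ]
    return [name for _, name in sorted(candidates)[:k]]

# Hypothetical knowledge base: scores in [0, 1] as the classifiers might emit them.
KNOWLEDGE_BASE = {
    "Lauren Single-Breasted Blazer": {"conservative": 0.9, "formal": 0.8},
    "Black Triacetate Jacket": {"conservative": 0.7, "formal": 0.9},
    "Bias Slip Dress": {"conservative": 0.2, "formal": 0.6},
    "Muscle Logo Tee": {"conservative": 0.3, "formal": 0.1},
}

# The user browses one product; the profile evolves, then we query for matches.
browsed = ["Lauren Single-Breasted Blazer"]
user_profile = {}
for n_seen, name in enumerate(browsed):
    user_profile = update_profile(user_profile, KNOWLEDGE_BASE[name], n_seen)
suggestions = recommend(user_profile, KNOWLEDGE_BASE, exclude=browsed, k=1)
```

Because matching happens in attribute space rather than on purchase co-occurrence, a product with no browsing or sales history can still be recommended as soon as its description has been classified.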
Advantages over Traditional Recommendation Systems. This approach gives us some of the underlying attributes that characterize a customer’s preferences. We can therefore begin to explain a preference rather than simply rely on the co-occurrence of purchases (e.g. people who bought x also bought y). This helps with: handling new or rapidly changing products, low-frequency products, and cross-category recommendations.
Cross-Category Recommendations. Difficult for collaborative filtering and content-based systems. Build a model of the user: personality, stylistic attributes. Taste in clothing might also be suggestive of taste in other products, say furniture and home decoration. Create models for different product classes and create mappings among these models.
Summary. “Understand” a product and hence the customer. Use Text Learning (supervised and semi-supervised) to abstract from product (description) to subjective, domain-specific features. Effective for new (and low-frequency) products and for cross-category recommendations.