
Consumer Behavior Prediction using Parametric and Nonparametric Methods Elena Eneva Carnegie Mellon University 25 November 2002




1 Consumer Behavior Prediction using Parametric and Nonparametric Methods Elena Eneva Carnegie Mellon University 25 November 2002 eneva@cs.cmu.edu

2 Recent Research Projects: Dimensionality Reduction Methods and Fractal Dimension (with Christos Faloutsos); Learning to Change Taxonomies (with Valery Petrushin, Accenture Technology Labs); Text Re-Classification Using Existing Schemas (with Yiming Yang); Learning Within-Sentence Semantic Coherence (with Roni Rosenfeld); Automatic Document Summarization (with John Lafferty); Consumer Behavior Prediction (with Alan Montgomery [Business school] and Rich Caruana [SCS])

3 Outline: Introduction & Motivation; Dataset; Baseline Models; New Hybrid Models; Results; Summary & Work in Progress

4 How to increase profits? Without raising the overall price level? Without more advertising? Without attracting new customers?

5 A: Better Pricing Strategies. Encourage the demand for products which are most profitable for the store. Recent trend to consolidate independent stores into chains. Pricing doesn’t take into account the variability of demand due to neighborhood differences.

6 A: Micro-Marketing. Pricing strategies should adapt to the neighborhood demand. The basis: the difference in interbrand competition in different stores. Stores can increase operating profit margins by 33% to 83% [Montgomery 1997].

7 Understanding Demand Need to understand the relationship between the prices of products in a category and the demand for these products Price Elasticity of Demand

8 Price Elasticity: the consumer’s response to a price change, ranging from inelastic to elastic. Q is quantity purchased; P is price of product.
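For reference, the standard textbook definition of price elasticity of demand can be written with the slide's Q and P. The symbol E and the inelastic/elastic thresholds are the usual convention, not something preserved in the transcript:

```latex
E \;=\; \frac{\partial Q / Q}{\partial P / P} \;=\; \frac{\partial \ln Q}{\partial \ln P},
\qquad |E| < 1 \ \text{(inelastic)}, \qquad |E| > 1 \ \text{(elastic)}
```

The second form is why a log-log plot is convenient: the slope of ln Q against ln P is the elasticity itself.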

9 Prices and Quantities: Q demanded of a specific product is a function of the prices of all the products in that category. This function is different for every store, for every category.

10 The Function. [Diagram: for a Category, the inputs Price of Product 1, Price of Product 2, Price of Product 3, ..., Price of Product N feed the “I know your customers” Predictor, which outputs Quantity bought of Product 1, Quantity bought of Product 2, Quantity bought of Product 3, ..., Quantity bought of Product N.] Need to multiply this across many stores, many categories.

11 How to find this function? Traditionally – using parametric models (linear regression)

12 Data Example

13 Data Example – Log Space

14 The Function. [Diagram: for a Category, the inputs Price of Product 1, Price of Product 2, Price of Product 3, ..., Price of Product N feed the “I know your customers” Predictor, which outputs Quantity bought of Product 1, Quantity bought of Product 2, Quantity bought of Product 3, ..., Quantity bought of Product N; the diagram adds steps labeled “convert to ln space” and “convert to original space”.] Need to multiply this across many stores, many categories.

15 How to find this function? Traditionally – using parametric models (linear regression) Recently – using non-parametric models (neural networks)

16 Our Goal. Advantage of LR: known functional form (linear in log space), extrapolation ability. Advantage of NN: flexibility, accuracy. [Figure: robustness vs. accuracy, showing NN, LR, and the new models.] Take Advantage: use the known functional form to bias the NN. Build hybrid models from the baseline models.

17 Evaluation Measure: Root Mean Squared Error (RMS), the average deviation between the true quantity and the predicted quantity.
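A minimal sketch of the RMS measure as it is usually defined (square root of the mean squared difference). The function name `rms_error` is illustrative and not taken from the presentation:

```python
import math


def rms_error(true_q, predicted_q):
    """Root mean squared error between true and predicted quantities."""
    if len(true_q) != len(predicted_q) or not true_q:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(true_q, predicted_q)]
    return math.sqrt(sum(squared) / len(squared))
```

Because the differences are squared before averaging, large misses on a few weeks weigh more heavily than many small ones.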

18 Error Measure – Unbiased Model: [equation not preserved] but [equation not preserved] is a biased estimator for q, and we correct the bias by using [equation not preserved], which is an unbiased estimator for q, by computing the integral over the distribution.
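The slide's own equations did not survive extraction. As a hedged illustration only, the standard log-normal result shows why exponentiating a log-space prediction is biased and how integrating over the error distribution corrects it. This assumes Gaussian residuals in log space; the symbols μ, ε and σ are introduced here and may differ from the author's notation:

```latex
\ln q = \mu + \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, \sigma^2)

\mathbb{E}[q]
  = \int_{-\infty}^{\infty} e^{\mu + \varepsilon}\,
    \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\varepsilon^{2} / (2\sigma^{2})}\, d\varepsilon
  = e^{\mu + \sigma^{2}/2}

\text{so } e^{\mu} \text{ underestimates } \mathbb{E}[q] \text{ by the factor } e^{\sigma^{2}/2}.
```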

19 Dataset: store-level cash register data at the product level for 100 stores; store prices updated every week; two years of transactions; Chilled Orange Juice category (12 products).

20 Models. Hybrids: Smart Prior, MultiTask Learning, Jumping Connections, Frozen Jumping Connections. Baselines: Linear Regression, Neural Networks.

21 Baselines Linear Regression Neural Networks

22 Linear Regression: $\ln q = a + \sum_{i=1}^{K} b_i \ln p_i$, where q is the quantity demanded, $p_i$ is the price for the i-th product, and there are K products overall. The coefficients a and $b_i$ are determined by the condition that the sum of the squared residuals is as small as possible.
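A minimal sketch of a least-squares fit of this kind, assuming the log-log form ln q = a + Σ b_i ln p_i suggested by the deck's "linear in log space" remark. The function names are illustrative, and the solver is a plain normal-equations approach chosen to keep the example dependency-free, not necessarily what the author used:

```python
import math


def solve_linear_system(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_log_linear(prices, quantities):
    """Least-squares fit of ln q = a + sum_i b_i ln p_i.

    prices: list of rows, one price per product; quantities: one q per row.
    Returns (a, [b_1, ..., b_K]).
    """
    X = [[1.0] + [math.log(p) for p in row] for row in prices]
    t = [math.log(q) for q in quantities]
    k = len(X[0])
    # Normal equations: (X^T X) w = X^T t
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xtt = [sum(r[i] * ti for r, ti in zip(X, t)) for i in range(k)]
    w = solve_linear_system(XtX, Xtt)
    return w[0], w[1:]
```

In this form each fitted b_i can be read as an elasticity: the own-price coefficient for the product being predicted, and cross-price coefficients for the others.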

23

24 Results – RMS Error [Figure: RMS error]

25 Neural Networks: generic nonlinear function approximators; a collection of basic units (neurons), each computing a (non)linear function of its input; random initialization; backpropagation; early stopping to prevent overfitting.

26 Neural Networks 1 hidden layer, 100 units, sigmoid activation function
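A minimal sketch of the forward pass for a network of this shape: one hidden layer of 100 sigmoid units with random initialization. The linear output layer, the initialization range, and all names are assumptions made for illustration; training (backpropagation, early stopping) is omitted:

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def init_network(n_inputs, n_hidden=100, n_outputs=1, seed=0):
    """Randomly initialize weights (range -0.1..0.1 is an assumed choice)."""
    rng = random.Random(seed)

    def small():
        return rng.uniform(-0.1, 0.1)

    return {
        "W1": [[small() for _ in range(n_inputs)] for _ in range(n_hidden)],
        "b1": [small() for _ in range(n_hidden)],
        "W2": [[small() for _ in range(n_hidden)] for _ in range(n_outputs)],
        "b2": [small() for _ in range(n_outputs)],
    }


def forward(net, x):
    """Hidden sigmoid layer followed by a linear output layer."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(net["W1"], net["b1"])]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(net["W2"], net["b2"])]
```

For the orange juice category, `n_inputs` would be 12, one (log) price per product.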

27 Results [Figure: RMS error]

28 Hybrid Models: Smart Prior, MultiTask Learning, Jumping Connections, Frozen Jumping Connections

29 Smart Prior. Idea: initialize the NN with a “good” set of weights; help it start from a “smart” prior. Start the search in a state which already gives a linear approximation. NN training in 2 stages: first, on synthetic data (generated by the LR model); second, on the real data.
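The two-stage schedule can be sketched as follows. How the synthetic prices were sampled is not stated in the transcript, so the uniform sampling over `price_ranges`, the sample count, and the caller-supplied `train` routine are all assumptions for illustration:

```python
import math
import random


def make_synthetic_data(a, b, price_ranges, n_samples, seed=0):
    """Sample price vectors and label them with the LR model's prediction.

    a, b: intercept and coefficients of ln q = a + sum_i b_i ln p_i.
    price_ranges: one (low, high) pair per product (assumed sampling ranges).
    Returns a list of (log_prices, log_quantity) pairs.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_samples):
        log_p = [math.log(rng.uniform(lo, hi)) for lo, hi in price_ranges]
        log_q = a + sum(bi * lp for bi, lp in zip(b, log_p))
        data.append((log_p, log_q))
    return data


def smart_prior_fit(train, net, lr_params, price_ranges, real_data,
                    n_synthetic=1000):
    """Two-stage training: synthetic LR data first, then the real data.

    train(net, data) is a hypothetical caller-supplied routine that
    returns the updated network.
    """
    a, b = lr_params
    synthetic = make_synthetic_data(a, b, price_ranges, n_synthetic)
    net = train(net, synthetic)  # stage 1: learn the linear approximation
    net = train(net, real_data)  # stage 2: refine on the observed data
    return net
```

Stage 1 costs nothing in real data, since the LR model can label as many synthetic price vectors as needed.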

30 Smart Prior LR

31 Results [Figure: RMS error]

32 Multitask Learning. Idea: learn an additional related task in parallel, using a shared representation. Add the output of the LR model (built over the same inputs) as an extra output to the NN. Make the NN share its hidden nodes between both tasks [Caruana 1997].
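A minimal sketch of how the extra task might be set up: each training example gets a second target, the LR model's output for the same inputs, and the loss sums both errors. The `aux_weight` parameter and the function names are assumptions, since the transcript does not say how the two tasks were weighted:

```python
def lr_predict(a, b, log_prices):
    """LR output in log space for one price vector."""
    return a + sum(bi * lp for bi, lp in zip(b, log_prices))


def make_multitask_targets(data, a, b):
    """Pair each input with two targets: observed value and LR output.

    data: list of (log_prices, log_quantity) pairs.
    """
    return [(x, [y, lr_predict(a, b, x)]) for x, y in data]


def multitask_loss(outputs, targets, aux_weight=1.0):
    """Squared error on the main task plus weighted error on the LR task."""
    main = (outputs[0] - targets[0]) ** 2
    aux = (outputs[1] - targets[1]) ** 2
    return main + aux_weight * aux
```

Both outputs read from the same hidden layer, so gradients from the LR task shape the shared hidden units; only the first output would be used at prediction time.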

33 MultiTask Learning Custom halting function Custom RMS function

34 Results [Figure: RMS error]

35 Jumping Connections. Idea: fuse LR and NN. Modify the architecture of the NN: add connections which “jump” over the hidden layer. Gives the effect of simulating an LR and an NN together.
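A minimal sketch of a forward pass with direct input-to-output ("jumping") connections, for a single output. The parameter names and the linear output unit are assumptions for illustration:

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def forward_with_jump(x, W1, b1, W2, b2, W_jump):
    """Output = hidden-layer path + direct linear path from the inputs.

    W1, b1: input-to-hidden weights and biases.
    W2, b2: hidden-to-output weights and output bias.
    W_jump: input-to-output weights that skip the hidden layer.
    """
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    nonlinear = sum(w * h for w, h in zip(W2, hidden))
    linear = sum(w * xi for w, xi in zip(W_jump, x))
    return nonlinear + linear + b2
```

If the hidden-to-output weights are all zero, the network reduces to a linear model in its inputs, which is the sense in which it simulates an LR and an NN together.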

36 Jumping Connections

37 Results [Figure: RMS error]

38 Frozen Jumping Connections. Idea: show the model what the “jump” is for. Same architecture as Jumping Connections, but two training stages. Freeze the weights of the jumping layer, so the network can’t “forget” about the linearity.
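Freezing can be sketched as a gradient step that skips named parameter groups. The transcript does not say how the jumping weights are obtained before they are frozen, so this only illustrates the freezing mechanism; the group names and the plain SGD update are assumptions:

```python
def sgd_step(params, grads, learning_rate, frozen=()):
    """Update every parameter group except those named in `frozen`.

    params, grads: dicts mapping a group name to a flat list of floats.
    """
    updated = {}
    for name, values in params.items():
        if name in frozen:
            updated[name] = list(values)  # left unchanged
        else:
            updated[name] = [v - learning_rate * g
                             for v, g in zip(values, grads[name])]
    return updated
```

In a second training stage one might call `sgd_step(params, grads, 0.01, frozen={"W_jump"})`, so only the hidden-layer path continues to learn.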

39 Frozen Jumping Connections

40

41

42 Results [Figure: RMS error]

43 Models. Hybrids: Smart Prior, MultiTask Learning, Jumping Connections, Frozen Jumping Connections. Baselines: Linear Regression, Neural Networks. Combinations: Voting, Weighted Average.

44 Combining Models. Idea: Ensemble Learning. Use all models and then combine their predictions: Committee Voting, Weighted Average. Combines 2 baseline and 3 hybrid models (Smart Prior, MultiTask Learning, Frozen Jumping Connections).

45 Committee Voting Average the predictions of the models
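A minimal sketch of committee voting as an unweighted mean of the models' predictions; the function name and input layout are illustrative:

```python
def committee_vote(predictions_by_model):
    """Average the predictions of several models, example by example.

    predictions_by_model: one equal-length list of predictions per model.
    """
    n_models = len(predictions_by_model)
    return [sum(column) / n_models for column in zip(*predictions_by_model)]
```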

46 Results [Figure: RMS error]

47 Weighted Average – Model Regression Optimal weights determined by a linear regression model over the predictions
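A minimal sketch of fitting combination weights by least squares over the models' predictions. Whether the author's regression included an intercept is not stated; this version assumes none, and the names are illustrative:

```python
def fit_combination_weights(model_predictions, true_values):
    """Least-squares weights w minimizing sum (y - sum_j w_j * pred_j)^2.

    model_predictions: one row per example, one prediction per model.
    true_values: the observed quantity for each example.
    """
    P, y = model_predictions, true_values
    k = len(P[0])
    # Normal equations (P^T P) w = P^T y, stored as an augmented matrix.
    M = [[sum(r[i] * r[j] for r in P) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(P, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        pivot_value = M[col][col]
        M[col] = [v / pivot_value for v in M[col]]
        for r in range(k):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][k] for i in range(k)]


def weighted_average(weights, predictions):
    """Combine one example's per-model predictions with the fitted weights."""
    return sum(w * p for w, p in zip(weights, predictions))
```

Unlike committee voting, the fitted weights need not be equal or sum to one, so a consistently better model can dominate the combination.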

48 Results [Figure: RMS error]

49 Normalized RMS Error. Compare model performance across stores with different sizes, ages, and locations. Need to normalize. Compare to baselines: take the error of the LR benchmark as unit error.
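A minimal sketch of the normalization: within one store, each model's RMS error is divided by the LR model's RMS error, so LR scores exactly 1 and values below 1 indicate an improvement. The dictionary layout and the `"LR"` key are assumptions for illustration:

```python
def normalize_rms(rms_by_model, baseline="LR"):
    """Express each model's RMS error in units of the baseline's error."""
    base = rms_by_model[baseline]
    if base == 0:
        raise ValueError("baseline error is zero; cannot normalize")
    return {name: value / base for name, value in rms_by_model.items()}
```

Normalized errors are unitless, which is what makes averaging across stores with very different sales volumes meaningful.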

50 Normalized RMS Error

51 Summary. Built new models for better pricing strategies for individual stores and categories. Hybrid models clearly superior to baselines for customer choice prediction. Incorporated domain knowledge (linearity) in Neural Networks. New models allow stores to price the products more strategically and optimize profits, maintain better inventories, and understand product interaction. www.cs.cmu.edu/~eneva [Diagram: for a Category, P of Prod1, P of Prod2, P of Prod3, ..., P of ProdN feed the “I know your customers” Predictor, which outputs Q bought of Prod1, Q bought of Prod2, Q bought of Prod3, ..., Q bought of ProdN.]

52 References: Montgomery, A. (1997). Creating Micro-Marketing Pricing Strategies Using Supermarket Scanner Data. West, P., Brockett, P. and Golden, L. (1997). A Comparative Analysis of Neural Networks and Statistical Methods for Predicting Consumer Choice. Guadagni, P. and Little, J. (1983). A Logit Model of Brand Choice Calibrated on Scanner Data. Rossi, P. and Allenby, G. (1993). A Bayesian Approach to Estimating Household Parameters.

53 Work In Progress: analyze the Weighted Average model; compare extrapolation ability of new models; other MTL tasks: a shrinkage model (a “super” store model with data pooled across all stores), store zones.

54 On one hand… In log space, Price-Quantity relationship is fairly linear

55 On the other hand… the derivation of consumers' demand responses to price changes without the need to write down and rely upon particular mathematical models for demand

56 “The” Model. [Diagram: for a Category, the inputs Price of Product 1, Price of Product 2, Price of Product 3, ..., Price of Product N feed the “I know your customers” Predictor, which outputs Quantity bought of Product 1, Quantity bought of Product 2, Quantity bought of Product 3, ..., Quantity bought of Product N; the diagram adds steps labeled “convert to ln space” and “convert to original space”.] Need to multiply this across many stores, many categories.

57 Problem Definition. For a set of products: given the price distribution, predict the consumption distribution. A change in the price of one product affects the consumption of all other products.

58 Assumptions. Independence: substitutes (fresh fruit, other juices), other stores. Stationarity: change over time, holidays.

59 The Most Important Slide for this presentation and the paper: www.cs.cmu.edu/~eneva/ eneva@cs.cmu.edu

60 Converting Predictions to Original Space

