

Experimental Design and Choice modelling

Motivating example
Suppose we have three products, each of which can be set at three price points: $1, $2 and $3 (note that these are equally spaced).
The prices can be recoded as -1, 0 and 1 respectively (subtract $2, i.e. mean-centre them).
What we have is a 3x3x3 design. We can measure:
– the main effects of price for each product, called P1, P2 and P3 (also P1^2, P2^2 and P3^2 for quadratic effects)
– the 2nd-order interaction terms P1*P2, P1*P3 and P2*P3
– the 3rd-order interaction term P1*P2*P3
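As a minimal sketch of the recoding described above (in Python rather than SAS, purely for illustration), the full 3x3x3 factorial and its effect columns can be enumerated as follows. The names `PRICES`, `CODE` and `effects` are made up for this example and do not come from the slides.

```python
from itertools import product

PRICES = [1, 2, 3]                     # dollars, equally spaced
CODE = {p: p - 2 for p in PRICES}      # mean-centre: $1 -> -1, $2 -> 0, $3 -> +1

# Full 3x3x3 factorial: every combination of the three coded prices.
full_factorial = list(product(CODE.values(), repeat=3))


def effects(p1, p2, p3):
    """Return the effect columns named on the slide for one scenario."""
    return {
        "P1": p1, "P2": p2, "P3": p3,                     # linear main effects
        "P1^2": p1 ** 2, "P2^2": p2 ** 2, "P3^2": p3 ** 2,  # quadratic effects
        "P1*P2": p1 * p2, "P1*P3": p1 * p3, "P2*P3": p2 * p3,  # 2nd order
        "P1*P2*P3": p1 * p2 * p3,                          # 3rd order
    }


rows = [effects(*scenario) for scenario in full_factorial]
print(len(rows))   # number of scenarios in the full factorial
print(rows[0])     # effect columns for the scenario (-1, -1, -1)
```

The point of a designed experiment is to estimate the effects of interest from fewer scenarios than this full enumeration.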

Motivating example
What we wish to do is measure particular quantities of interest with the smallest number of scenarios (a.k.a. sets or runs).
We want to have:
– balance (equal sample sizes per combination)
– orthogonality (correlations between effects are zero)
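Both properties can be checked numerically. The sketch below does so for the full 3x3x3 factorial, where they hold by construction. The helpers `correlation` and `is_balanced` are illustrative names, and "balance" is checked here at the level of single factors and pairs of factors, which is an assumption about how the slide's definition would be applied.

```python
from collections import Counter
from itertools import combinations, product
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation between two equal-length columns."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def is_balanced(values):
    """True when every distinct value occurs equally often."""
    return len(set(Counter(values).values())) == 1


design = list(product([-1, 0, 1], repeat=3))   # full factorial, 27 runs
columns = list(zip(*design))                   # one tuple per factor

# Balance: each level, and each pair of levels, appears equally often.
levels_balanced = all(is_balanced(col) for col in columns)
pairs_balanced = all(
    is_balanced(list(zip(a, b))) for a, b in combinations(columns, 2)
)

# Orthogonality: main-effect columns are mutually uncorrelated.
max_abs_corr = max(
    abs(correlation(a, b)) for a, b in combinations(columns, 2)
)

print(levels_balanced, pairs_balanced, max_abs_corr)
```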

How many scenarios do we need?
If we have a straightforward linear main-effects model, the following tells us how many runs we may need (in SAS):
%mktruns(3 3 3);
[Output: table "Some Reasonable Design Sizes" (Saturated=7) with columns "Violations" and "Cannot Be Divided By"; the values are not preserved in this transcript]
So we may decide to go with n=18 scenarios.
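The idea behind this kind of search can be imitated in a few lines. The sketch below is not the SAS macro. It assumes that a "violation" is counted whenever the number of runs cannot be divided by a factor's number of levels or by the product of two factors' numbers of levels, and that the saturated size is one parameter for the intercept plus (levels - 1) per factor. Both function names are hypothetical.

```python
from itertools import combinations


def saturated_size(levels):
    """Intercept plus (levels - 1) main-effect parameters per factor."""
    return 1 + sum(k - 1 for k in levels)


def violations(n, levels):
    """Count the level counts and pairwise products that do not divide n.

    Assumed counting rule, for illustration only.
    """
    divisors = list(levels) + [a * b for a, b in combinations(levels, 2)]
    return sum(1 for d in divisors if n % d != 0)


levels = [3, 3, 3]
start = saturated_size(levels)

# Rank candidate sizes: fewest violations first, then smallest size.
candidates = sorted(
    range(start, 28), key=lambda n: (violations(n, levels), n)
)
for n in candidates[:5]:
    print(n, violations(n, levels))
```

Under these assumptions, n=18 is among the sizes with no violations, which is consistent with the choice made on the slide.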

Let’s fit a main effects only model
%mktdes(factors=x1-x3=3,n=18)
proc print; run;
[Output: efficiency table with columns Design Number, D-Efficiency, A-Efficiency, G-Efficiency and Prediction Standard Error, followed by the printed design (Obs, x1, x2, ...); the values are not preserved in this transcript]

How does this work out?
We have 100% efficiency for the effects we wish to measure (main effects).
But if we look at the correlation matrix of effects we have the following:

Is this good enough?
We see that the main effects are all orthogonal, but we have some correlation between these and the higher-order interaction terms (e.g. P3^2 and P1*P3).
Is this a problem? Well, yes and no.
– No, if these effects are not of interest (e.g. P1*P3), i.e. we suspect they don’t exist in real life.
– Yes, if we suspect they might exist, or we are not sure whether they do.
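The same pattern can be reproduced on a small scale. The sketch below uses a hypothetical 9-run fraction (a Latin-square construction, not the 18-run design from the slides, which is not preserved here) in which the main effects are orthogonal to each other but one main effect is correlated with an interaction.

```python
from itertools import combinations, product
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation between two equal-length columns."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical 9-run fraction: the level of the third factor is fixed by
# the first two, (a + b) mod 3. Levels 0, 1, 2 are coded as -1, 0, +1.
runs = [(a - 1, b - 1, (a + b) % 3 - 1) for a, b in product(range(3), repeat=2)]
p1, p2, p3 = zip(*runs)
p1p2 = [x * y for x, y in zip(p1, p2)]

# Main effects are mutually uncorrelated in this fraction...
main_corr = max(abs(correlation(a, b)) for a, b in combinations([p1, p2, p3], 2))

# ...but P3 is correlated with the P1*P2 interaction.
alias_corr = correlation(p3, p1p2)

print(main_corr, round(alias_corr, 3))
```

If the P1*P2 interaction really exists, an estimate of the P3 main effect from this fraction would be contaminated by it, which is the "yes" case above.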

Is this good enough?…
A well-known fact in almost all cases involving real data (Louviere, Hensher, Swait, 2000):
– Main effects explain the largest amount of variance in respondent data, often 80% or more (70-90%).
– Two-way interactions account for the next largest proportion of variance, although this rarely exceeds 3%~6%.
– Three-way interactions account for even smaller proportions of variance, rarely more than 2%~3% (usually 0.5%~1%).
– Higher-order interactions account for minuscule proportions of variance.

Let’s fit a model with main effects with 2nd order interactions
%mktdes(factors=x1-x3=3, interact = x1*x2 x1*x3 x2*x3 x1*x2*x3,n=18)
proc print; run;
[Output: efficiency table with columns Design Number, D-Efficiency, A-Efficiency, G-Efficiency and Prediction Standard Error; the values are not preserved in this transcript]

So how do we do now?
This is an unmitigated disaster when we only have 18 scenarios.
So let’s change the number of scenarios we investigate.
We can increase this to 27, as this is divisible by 3x3 = 9, i.e. by the number of possible combinations of two 3-level factors.
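One way to see why 18 scenarios fall short is to count the parameters the requested model needs. The sketch below assumes the usual degrees-of-freedom rule (levels - 1 per factor, multiplied together for an interaction) and counts every term in the interact= list used on the slides, including x1*x2*x3.

```python
from itertools import combinations
from math import prod

levels = {"x1": 3, "x2": 3, "x3": 3}
df = {name: k - 1 for name, k in levels.items()}   # 2 df per 3-level factor


def interaction_df(factors):
    """Degrees of freedom of an interaction: product of the factors' df."""
    return prod(df[f] for f in factors)


main = sum(df.values())
two_way = sum(interaction_df(pair) for pair in combinations(levels, 2))
three_way = interaction_df(tuple(levels))

n_params = 1 + main + two_way + three_way          # intercept included
print(main, two_way, three_way, n_params)
print("18 runs enough?", 18 >= n_params)
print("27 runs enough?", 27 >= n_params)
```

Under this count the model has as many parameters as the full factorial has scenarios, so 27 runs is the smallest design that can estimate all of them.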

Let’s fit a model with main effects with 2nd order interactions (27 scenarios)
%mktdes(factors=x1-x3=3, interact = x1*x2 x1*x3 x2*x3 x1*x2*x3,n=27)
proc print; run;
[Output: efficiency table with columns Design Number, D-Efficiency, A-Efficiency, G-Efficiency and Prediction Standard Error; the values are not preserved in this transcript]

Conclusions
– Try to keep the number of scenarios (runs, sets) below 40 at most; otherwise you get respondent fatigue.
– Only measure effects up to 2nd order (3rd order and above are difficult to explain and don’t account for much of the variance).
– If you have prior knowledge of which effects are more likely than others, use it to decide which effects you want to measure.