CLASH OF CAUSAL INFERENCE TECHNIQUES TERRENCE WILLETT AND NATHAN PELLEGRIN RP GROUP CONFERENCE KELLOGG WEST APRIL 2014
OUTCOMES
Describe the purpose of regression and propensity score matching (PSM)
Explain the data requirements and basic procedure of regression and PSM
Compare and contrast regression and PSM
Identify additional resources for further exploration
WHY CAUSAL INFERENCE
"If you need to use statistics, then you should design a better experiment." (attributed to Rutherford)
Most education research is observational/correlational, not experimental.
COMMON SCENARIO
Did participation in an activity, class, or support result in better outcomes for students than they would have had without participating?
Students self-selected to participate and/or were recruited to participate.
When participants are compared to non-participants, differences in outcomes may be attributable to differences in background variables or motivation rather than to participation itself.
Can we determine whether participation caused a change in outcomes? No, but…
REGRESSION
Classic correlational technique
Covariates are included in the model to attempt to control for differences in background variables or motivation
Background variables can include measures of, or proxies for, skill level, social capital, or socio-economic status
Measures of self-motivation are often unavailable
Models are imperfect and generally must be combined with other evidence to describe more completely the possible influence of an intervention, program, or strategy
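As a minimal sketch of how covariate adjustment works, the example below fits an ordinary least squares model to a tiny synthetic data set. The variable names (`participated`, `prior_gpa`, `outcome`) and all values are hypothetical, and the outcome is constructed so that the true participation effect is 3. It is an illustration under those assumptions, not the presenters' analysis.

```python
# Regression adjustment on a tiny synthetic data set (all values hypothetical).
# Outcome is built as y = 2 + 3*participated + 1.5*prior_gpa, so the true
# participation effect is 3 by construction.

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

participated = [1, 1, 1, 1, 0, 0, 0, 0]
prior_gpa = [3.6, 3.2, 3.8, 3.0, 2.4, 2.8, 2.0, 3.2]
outcome = [2 + 3 * t + 1.5 * g for t, g in zip(participated, prior_gpa)]

# Naive comparison: difference in raw group means.
treated = [y for y, t in zip(outcome, participated) if t == 1]
control = [y for y, t in zip(outcome, participated) if t == 0]
naive_difference = sum(treated) / len(treated) - sum(control) / len(control)

# Adjusted comparison: coefficient on participation, holding prior GPA fixed.
design = [[1.0, float(t), g] for t, g in zip(participated, prior_gpa)]
intercept, adjusted_effect, gpa_slope = ols(design, outcome)

print(f"naive difference:  {naive_difference:.2f}")
print(f"adjusted estimate: {adjusted_effect:.2f}")
```

Because participants in this toy data have higher prior GPAs, the naive difference in means overstates the effect, while the adjusted coefficient recovers the value built into the data. With real data the model is only as good as the covariates available, as the slide notes.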
PROPENSITY SCORE MATCHING
One of several ways to create a matched comparison group of non-participants that is similar enough to the participant group to support a valid comparison
Likelihood-based models (commonly logistic regression) or other techniques are used to create a score indicating the probability that a particular non-participant would have been a participant, based on similarity to one or more participants
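One common way to produce such a score is a logistic regression of participation on background variables. The sketch below assumes a single hypothetical covariate (`prior_gpa`) and fits the model with plain gradient ascent so that it needs no external libraries; in practice a statistics package would do this step, and the data here are invented for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(x, t, learning_rate=0.1, steps=5000):
    """Fit P(T=1 | x) = sigmoid(b0 + b1*x) by gradient ascent on the log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(x)
    for _ in range(steps):
        g0 = g1 = 0.0
        for xi, ti in zip(x, t):
            err = ti - sigmoid(b0 + b1 * xi)  # observed minus predicted
            g0 += err
            g1 += err * xi
        b0 += learning_rate * g0 / n
        b1 += learning_rate * g1 / n
    return b0, b1

# Hypothetical data: prior GPA and whether each student participated.
prior_gpa = [2.0, 2.4, 2.8, 3.2, 3.0, 3.2, 3.6, 3.8, 2.6, 3.4]
participated = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]

# Center the covariate so gradient ascent converges quickly.
mean_gpa = sum(prior_gpa) / len(prior_gpa)
centered = [g - mean_gpa for g in prior_gpa]

b0, b1 = fit_logistic(centered, participated)
propensity_scores = [sigmoid(b0 + b1 * c) for c in centered]

for gpa, t, score in zip(prior_gpa, participated, propensity_scores):
    print(f"gpa={gpa:.1f} participated={t} propensity={score:.2f}")
```

Each student, participant or not, receives a score between 0 and 1. Non-participants with scores close to those of participants are the candidates for the matched comparison group.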
THE COUNTERFACTUAL (POTENTIAL OUTCOMES) FRAMEWORK
THE POTENTIAL OUTCOMES MATRIX
Actual Treatment Status (T)    Potential Outcome Yt    Potential Outcome Yc
1                              observable              not observable
0                              not observable          observable
AVERAGE TREATMENT EFFECT (ATE)
Participants and non-participants differ systematically (with respect to demographics, trajectories, risk profiles, self-selection, etc.)
Different people respond differently to treatment (differential response)
These facts must be taken into account when modeling or computing treatment effects.
This means all four cells of the matrix must be estimated in order to obtain an average treatment effect! How?
SYMBOLIC DERIVATION OF AVERAGE EFFECTS ATE ATT ATU
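For reference, the standard definitions of these three quantities in the potential outcomes framework are sketched below, using Yt, Yc, and T as in the potential outcomes matrix. The symbol \pi for the share of participants is an assumption introduced here, and the notation may differ from the derivation shown in the presentation.

```latex
\begin{align*}
\text{ATE} &= E[Y^t - Y^c] \\
\text{ATT} &= E[Y^t - Y^c \mid T = 1] = E[Y^t \mid T = 1] - E[Y^c \mid T = 1] \\
\text{ATU} &= E[Y^t - Y^c \mid T = 0] = E[Y^t \mid T = 0] - E[Y^c \mid T = 0] \\
\text{ATE} &= \pi \, \text{ATT} + (1 - \pi) \, \text{ATU}, \qquad \pi = P(T = 1)
\end{align*}
```

ATT is the average effect for those who participated, ATU the average effect for those who did not, and ATE their weighted combination. Each of ATT and ATU contains one term that is never observed: E[Y^c | T = 1] and E[Y^t | T = 0], respectively.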
CONDITIONAL INDEPENDENCE (PERFECT STRATIFICATION) (SELECTION ON OBSERVABLES)
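In the usual formulation, conditional independence says that, given a set of observed covariates X (a symbol assumed here), treatment status carries no further information about the potential outcomes. A sketch in the same Yt, Yc, T notation:

```latex
\[
(Y^t, Y^c) \perp\!\!\!\perp T \mid X
\]
\[
E[Y^c \mid T = 1, X] = E[Y^c \mid T = 0, X], \qquad
E[Y^t \mid T = 0, X] = E[Y^t \mid T = 1, X]
\]
```

The second line is what makes estimation possible: within groups that share the same X, the unobservable cell for participants can be filled in from observed non-participants, and vice versa.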
ASSUMING CI…
[Potential outcomes matrix repeated: columns Yt and Yc, rows Actual Treatment Status (T) = 1 and 0, cells marked observable or not observable, with labels Y(t) and Y(c)]
AVERAGE TREATMENT EFFECT (ATE)
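Under perfect stratification, average effects can be estimated by comparing participants and non-participants within each stratum and then weighting the stratum differences. The sketch below assumes one hypothetical binary stratifying variable and invented outcome values; it also assumes every stratum contains both participants and non-participants.

```python
from collections import defaultdict

# Hypothetical records: (stratum, treated, outcome).
records = [
    ("low_income", 1, 70), ("low_income", 1, 74), ("low_income", 0, 60),
    ("low_income", 0, 62), ("low_income", 0, 64), ("low_income", 0, 62),
    ("not_low_income", 1, 85), ("not_low_income", 1, 87),
    ("not_low_income", 1, 83), ("not_low_income", 1, 85),
    ("not_low_income", 0, 80), ("not_low_income", 0, 82),
]

def mean(values):
    return sum(values) / len(values)

# Group outcomes by stratum and treatment status.
by_stratum = defaultdict(lambda: {1: [], 0: []})
for stratum, treated, outcome in records:
    by_stratum[stratum][treated].append(outcome)

n_total = len(records)
n_treated = sum(t for _, t, _ in records)
n_untreated = n_total - n_treated

ate = att = atu = 0.0
for groups in by_stratum.values():
    effect = mean(groups[1]) - mean(groups[0])  # within-stratum difference
    n_t, n_c = len(groups[1]), len(groups[0])
    ate += effect * (n_t + n_c) / n_total  # weight by share of everyone
    att += effect * n_t / n_treated        # weight by share of participants
    atu += effect * n_c / n_untreated      # weight by share of non-participants

naive = (mean([y for _, t, y in records if t == 1])
         - mean([y for _, t, y in records if t == 0]))

print(f"naive={naive:.2f} ATE={ate:.2f} ATT={att:.2f} ATU={atu:.2f}")
```

In this toy data the effect is larger in the low-income stratum, where fewer students participated, so ATT, ATE, and ATU all differ from one another and from the naive difference in means.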
PROS AND CONS
Regressions can be easier to run but harder to explain to a general audience
PSM can be more time consuming to conduct but easier to explain to a general audience
Regressions tend to perform better with large data sets, while PSM tends to perform better with few observations, provided the non-participant group has sufficient numbers of individuals with the key confounding variables
Regressions have been used for many years and are well described mathematically, with broad consensus on proper error terms
PSM is newer, and there is no consensus on optimal matching procedures or proper error terms
Regression will use all cases with non-missing data, while PSM may use only a subset of cases from the pool of non-participants
All analytic methods suffer if key variables are not available
Conclusions can often be the same with either method
HOW TO RUN PSM
Create data file (95% of effort)
Match participants and non-participants on a set of control variables to create a comparison group with similar proportions on all characteristics (i.e., the comparison group would have a similar percentage of female, Hispanic, low-income, etc. students as the participant group)
This step is referred to as balancing and generally must be repeated several times to obtain balance on all variables of interest, either by adjusting matching criteria or by removing variables
Run comparative analyses, which can include simple t-tests, post-PSM regressions, or other techniques
Major packages that conduct PSM include Stata, R, and SAS
In Stata, the user-written psmatch2 command serves version 12 and older; version 13 adds the built-in teffects psmatch
Note that SPSS/PASW does not do PSM directly, but there is an R plugin for SPSS
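The matching, balancing, and comparison steps above can be sketched as follows. This is a simplified illustration, not the algorithm of any particular package: it assumes propensity scores have already been estimated, uses greedy 1:1 nearest-neighbor matching without replacement, and applies a hypothetical caliper of 0.1. All student records are invented.

```python
import math

# Hypothetical students: (id, participated, propensity_score, low_income, outcome).
raw = [
    ("P1", 1, 0.80, 1, 75), ("P2", 1, 0.60, 0, 82),
    ("P3", 1, 0.55, 1, 70), ("P4", 1, 0.30, 0, 88),
    ("C1", 0, 0.78, 1, 70), ("C2", 0, 0.62, 0, 80),
    ("C3", 0, 0.50, 1, 66), ("C4", 0, 0.33, 0, 85),
    ("C5", 0, 0.20, 0, 84), ("C6", 0, 0.15, 0, 86),
    ("C7", 0, 0.10, 0, 90), ("C8", 0, 0.25, 1, 64),
]
keys = ("id", "participated", "score", "low_income", "outcome")
students = [dict(zip(keys, row)) for row in raw]
participants = [s for s in students if s["participated"] == 1]
pool = [s for s in students if s["participated"] == 0]

def nearest_neighbor_match(participants, pool, caliper=0.1):
    """Greedy 1:1 matching without replacement on the propensity score."""
    available = list(pool)
    pairs = []
    for p in sorted(participants, key=lambda s: s["score"], reverse=True):
        if not available:
            break
        best = min(available, key=lambda c: abs(c["score"] - p["score"]))
        if abs(best["score"] - p["score"]) <= caliper:
            pairs.append((p, best))
            available.remove(best)
    return pairs

def standardized_mean_difference(a, b):
    """Difference in means divided by the pooled standard deviation."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    var_a = sum((v - mean_a) ** 2 for v in a) / (len(a) - 1)
    var_b = sum((v - mean_b) ** 2 for v in b) / (len(b) - 1)
    pooled_sd = math.sqrt((var_a + var_b) / 2)
    return (mean_a - mean_b) / pooled_sd if pooled_sd > 0 else 0.0

pairs = nearest_neighbor_match(participants, pool)
matched_p = [p for p, _ in pairs]
matched_c = [c for _, c in pairs]

# Balance check on one covariate, before and after matching.
smd_before = standardized_mean_difference(
    [s["low_income"] for s in participants], [s["low_income"] for s in pool])
smd_after = standardized_mean_difference(
    [s["low_income"] for s in matched_p], [s["low_income"] for s in matched_c])

# Simple comparative analysis: mean outcome difference in the matched sample.
matched_difference = (sum(s["outcome"] for s in matched_p) / len(matched_p)
                      - sum(s["outcome"] for s in matched_c) / len(matched_c))

print(f"pairs={len(pairs)} SMD before={smd_before:.2f} after={smd_after:.2f}")
print(f"matched outcome difference={matched_difference:.2f}")
```

If balance were still poor after matching, the usual response is the iteration the slide describes: tighten or loosen the caliper, change the matching rule, or revise the variables in the propensity model, then check balance again before looking at outcomes.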
AN ALTERNATIVE STRATEGY TO DECIDING THIS BATTLE (WARNING: EMPIRICISTS APPROACHING)
ALTERNATIVE PERSPECTIVES
Estimating program effects based on observational data can also be understood as…
  Delimiting the error built into inductive reasoning about causes (the problem of induction)
  An inverse problem in the study of social dynamics
  A missing data problem
  An optimization problem
From these alternate perspectives there is a large menu of methods and extensions, including Euclidean, Mahalanobis, and Gower's distance, nearest neighbor, cosine similarity, kernel functions, genetic algorithms, imputation, dimensionality reduction, and other supervised/unsupervised learning algorithms
A criterion that can be applied to regression, PSM, and other methods is this: how do they perform at predicting new (future) observations? (false positives, false negatives, correlated errors)
In empirical investigations we do not answer this question just once; we answer it repeatedly, using new replications or sampling under varying conditions. It is applicable in any situation where predictions are made concerning new (possibly future) observations.
We update our models based on new evidence. In this respect, regression and PSM methods can both be used as tools of discovery: as ways to extend our understanding of the processes producing patterns we find in sets of observations (and in streams of information, generally).
SO: CHOOSE THE METHOD/MODEL WHICH YIELDS THE SMALLEST PREDICTION ERROR.
That may decide a battle in a particular setting (or occasion), but the war between methods will go on…
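The prediction-error criterion can be sketched with a simple holdout comparison: fit each candidate on one set of observations and score it on observations it has not seen. The two candidates below (a mean-only baseline and a one-variable linear fit) are stand-ins chosen for brevity, not regression versus PSM themselves, and the data are hypothetical.

```python
def fit_mean(train):
    """Baseline: predict the training mean for everyone."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y

def fit_line(train):
    """Simple linear regression of y on x by least squares."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x

def mean_squared_error(model, data):
    return sum((y - model(x)) ** 2 for x, y in data) / len(data)

# Hypothetical (prior_gpa, outcome) pairs; the last four are held out.
observations = [
    (2.0, 61), (2.4, 66), (2.8, 69), (3.2, 75), (3.6, 78),
    (2.2, 64), (3.0, 72), (3.4, 76), (2.6, 68), (3.8, 81),
]
train, holdout = observations[:6], observations[6:]

# Fit every candidate on the training data, score it on the holdout data.
candidates = {"mean_only": fit_mean, "linear": fit_line}
errors = {name: mean_squared_error(fit(train), holdout)
          for name, fit in candidates.items()}
best_method = min(errors, key=errors.get)

for name, err in errors.items():
    print(f"{name}: holdout MSE = {err:.2f}")
print(f"smallest prediction error: {best_method}")
```

The same loop extends to any method that can produce a prediction for a new case, and repeating it as new cohorts arrive is one way to keep answering the question rather than answering it once.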
"The inability to predict outliers implies the inability to predict the course of history." Nassim Nicholas Taleb, The Black Swan: The Impact of the Highly Improbable
"If you insist on strict proof (or strict disproof) in the empirical sciences, you will never benefit from experience, and never learn from it how wrong you are." Karl Raimund Popper, The Logic of Scientific Discovery: Logik Der Forschung (2002), 28.
FURTHER READING
Angrist, J. D., & Pischke, J. (2008). Mostly Harmless Econometrics: An Empiricist's Companion
Morgan, S., & Harding, D. (2006). Matching Estimators of Causal Effects: From Stratification and Weighting to Practical Data Analysis Routines
Caliendo and Kopeinig. Practical Guide for PSM
Rosenbaum, P. R., & Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70,
Padgett, R.; Salisbury, M.; An, B.; & Pascarella, E. (2010). Required, Practical, or Unnecessary? An Examination and Demonstration of Propensity Score Matching Using Longitudinal Secondary Data. New Directions for Institutional Research – Assessment Supplement (pp ). San Francisco, CA: Jossey-Bass.
Soledad Cepeda, M.; Boston, R.; Farrar, J.; & Strom, B. (2003). Comparison of Logistic Regression versus Propensity Score When the Number of Events Is Low and There Are Multiple Confounders. American Journal of Epidemiology, 158,
THANK YOU
Terrence Willett, Director of Planning, Research, and Knowledge Systems, Cabrillo College
Nathan Pellegrin, Data Processing Specialist, Peralta District