Using the Bayesian Information Criterion to Judge Models and Statistical Significance Paul Millar University of Calgary.

Problems
- Choosing the best model
  - Aside from OLS, few recognized standards
  - Few ways to judge whether adding an explanatory variable is justified by the additional explained variance
- Conventional p-values are problematic
  - Large or small N
  - Potential unrecognized relationships between explanatory variables
  - Random associations are not always detected

Judging Models
- Explanatory framework
  - Need to find the best, or most likely, model given the data
  - Two aspects: which variables should comprise the model, and which form should the model take?
- Predictive framework
  - Of the candidate variables and model forms, which best predicts the outcome?

Bayesian Approach
- Origins (Bayes 1763)
- Bayes factors (Jeffreys 1935)
- BIC (Schwarz 1978)
- Variable significance (Raftery 1995)
- Judging variables and models
- Stata commands

Bayes Law
- Example: A = low education, B = high income
- Joint distribution: (A, B) or (A ∩ B)
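The diagram on this slide can be written out as an equation. With the slide's example (A = low education, B = high income), Bayes' law relates the conditional probability to the joint distribution:

```latex
P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(B \mid A)\,P(A)}{P(B)}
```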

Bayes Law and Model Probability
- Assume: two models
- Assume: equal priors
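The slide's two assumptions can be written out. For two models $M_1$ and $M_2$ and data $D$, Bayes' law gives the posterior odds as the Bayes factor times the prior odds; under equal priors the prior odds drop out:

```latex
\frac{P(M_1 \mid D)}{P(M_2 \mid D)}
  = \frac{P(D \mid M_1)}{P(D \mid M_2)} \cdot \frac{P(M_1)}{P(M_2)}
  = \frac{P(D \mid M_1)}{P(D \mid M_2)} \quad \text{(equal priors)}
```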

Bayes Law and Model Probability
- Jeffreys (1935): allows comparison of any two models
  - Nesting not required
  - Explanatory framework
- Problem: complexity
  - Challenging to solve
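The complexity the slide refers to is the marginal likelihood of each model: the Bayes factor $B_{12}$ requires integrating the likelihood over each model's parameter prior, and these integrals rarely have closed forms:

```latex
B_{12} = \frac{P(D \mid M_1)}{P(D \mid M_2)}
       = \frac{\int P(D \mid \theta_1, M_1)\,\pi(\theta_1 \mid M_1)\,d\theta_1}
              {\int P(D \mid \theta_2, M_2)\,\pi(\theta_2 \mid M_2)\,d\theta_2}
```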

An Approximation: BIC
- Bayesian Information Criterion (BIC)
  - A function of N, df, and the deviance (or the χ² from the likelihood-ratio test)
  - Readily obtainable from most model output
  - Allows approximation of the Bayes factor
- Two versions: relative to the saturated model (BIC) or to the null model (BIC′)
- Assumptions
  - Large N
  - Nested models
  - Prior expectation of model parameters is multivariate normal
- Attributed to Schwarz (1978)
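In one common convention (a sketch; the slide itself does not give the formula), the BIC for a model with maximized likelihood $\hat{L}$, df parameters, and sample size $N$ is defined as follows, and the BIC difference between two nested models approximates twice the log Bayes factor:

```latex
\mathrm{BIC} = -2 \ln \hat{L} + \mathit{df} \cdot \ln N,
\qquad
2 \ln B_{12} \approx \mathrm{BIC}_2 - \mathrm{BIC}_1
```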

An Alternative to the t-test
The conventional t-test:
- Produces over-confident results for large datasets
- Random relationships sometimes pass the test
- Widely varying results are possible when combined with stepwise regression
- The only other significance-testing method (re-sampling) provides no guidance on the form or content of the model
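A minimal sketch of why BIC disagrees with the t-test at large N: for a single added parameter the BIC difference is roughly t² − ln N, so the variable lowers BIC only when t² exceeds ln N (a threshold form in the spirit of Raftery 1995; the function name is mine):

```python
import math

def bic_prefers_variable(t_stat: float, n: int) -> bool:
    """BIC favours adding one parameter only if t^2 > ln(n)."""
    return t_stat ** 2 > math.log(n)

# At n = 100,000 a t-statistic of 2.25 (p ~ 0.02, "significant" by the
# conventional test) still fails the BIC criterion, since sqrt(ln 100000) ~ 3.4.
print(bic_prefers_variable(2.25, 100_000))  # False
print(bic_prefers_variable(4.00, 100_000))  # True
```

This is the mechanism behind "over-confident results for large datasets": the conventional cutoff stays fixed while the BIC threshold grows with N.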

BIC-based Significance
- Raftery (1995): examine all possible models with the given variables (2^k models)
- For each model, calculate a BIC-based probability
- Computationally intensive
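A sketch of the exhaustive procedure in Python rather than Stata (synthetic OLS data; all names are illustrative): each of the 2^k variable subsets gets a BIC, and BIC differences are converted into approximate posterior model probabilities via exp(−ΔBIC/2):

```python
import itertools
import math
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 3
X = rng.normal(size=(n, k))
y = 2.0 * X[:, 0] + rng.normal(size=n)  # only the first IV has a real effect

def ols_bic(cols):
    """BIC of an OLS fit of y on an intercept plus the chosen columns."""
    Z = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    rss = float(np.sum((y - Z @ beta) ** 2))
    return n * math.log(rss / n) + Z.shape[1] * math.log(n)

# All 2^k subsets of the explanatory variables.
models = [c for r in range(k + 1) for c in itertools.combinations(range(k), r)]
bics = {m: ols_bic(m) for m in models}
best = min(bics.values())
weights = {m: math.exp(-(b - best) / 2) for m, b in bics.items()}
total = sum(weights.values())
probs = {m: w / total for m, w in weights.items()}
for m, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(m, round(p, 3))
```

With k = 10 this loop would already fit 1,024 models, which is the "computationally intensive" point the slide makes.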

A Further Approximation
- Compare the model with all variables to the model without a specific variable
- Only requires one model per IV (k models)
- Experiment: k = 10, n = 100,000

Variable   Coef.     P>t        bicdrop1 P   bic P
Riv1        0.0025   0.436   *  0.996        0.960
Riv2        0.0011   0.731   *  0.997        0.968
Riv3       -0.0044   0.167   *  0.992        0.924
Riv4        0.0017   0.597   *  0.996        0.965
Riv5        0.0021   0.507   *  0.996        0.962
Riv6        0.0070   0.026   *  0.963        0.651
Riv7       -0.0025   0.428   *  0.996        0.959
Riv8       -0.0006   0.843   *  0.997        0.970
Riv9       -0.0013   0.684   *  0.997        0.968
Riv10       0.0071   0.024   *  0.961        0.631
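A sketch of the drop-one approximation in Python rather than Stata (synthetic data mirroring the slide's setup of random IVs at n = 100,000, plus one real effect; all names are illustrative). The BIC change from dropping each variable is converted to an approximate probability of "no effect" through the Bayes factor exp(−ΔBIC/2):

```python
import math
import numpy as np

rng = np.random.default_rng(1)
n, k = 100_000, 4
X = rng.normal(size=(n, k))
y = 0.5 * X[:, 0] + rng.normal(size=n)  # only the first IV has a real effect

def ols_bic(cols):
    """BIC of an OLS fit of y on an intercept plus the chosen columns."""
    Z = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    rss = float(np.sum((y - Z @ beta) ** 2))
    return n * math.log(rss / n) + Z.shape[1] * math.log(n)

full = ols_bic(range(k))
p_no_effect = []
for i in range(k):
    dropped = ols_bic([j for j in range(k) if j != i])
    bf = math.exp(-(dropped - full) / 2)  # Bayes factor favouring "no effect"
    p_no_effect.append(bf / (1 + bf))
    print(f"IV {i}: P(no effect) ~ {p_no_effect[-1]:.3f}")
```

As in the slide's experiment, the random IVs come out with P(no effect) near 1, while the real effect is retained; only k model fits are needed instead of 2^k.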

-pre-
- Prediction only
- The reduction in errors for categorical variables
  - logistic, probit, mlogit, cloglog
  - Allows calculation of the best cutoff
- The reduction in squared errors for continuous variables
  - regress, etc.
- Allows comparison of prediction capability across model forms
  - e.g., mlogit vs. ologit vs. nbreg vs. poisson
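For the categorical case, the "reduction in errors" is the classic proportional-reduction-in-error statistic. A minimal sketch (not the -pre- command itself; names and data are illustrative), comparing a model's classification errors to a baseline that always predicts the modal category:

```python
def pre(actual, predicted, baseline):
    """Proportional reduction in error: 1 - errors(model) / errors(baseline)."""
    e_model = sum(a != p for a, p in zip(actual, predicted))
    e_base = sum(a != b for a, b in zip(actual, baseline))
    return 1 - e_model / e_base

actual    = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]
predicted = [1, 0, 1, 0, 0, 1, 1, 1, 1, 1]  # model: 2 errors
baseline  = [1] * 10                         # modal category: 3 errors
print(round(pre(actual, predicted, baseline), 3))  # 0.333
```

Because it is defined purely on prediction errors, the same statistic can be compared across otherwise incomparable model forms, which is the point the slide makes.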

bicdrop1
- Used when -bic- takes too long, or when comparisons to the AIC are desired

-bic-
- Reports a probability for each variable using Raftery's procedure
- Also reports pseudo-R², pre, and bicdrop1 results
- Reports the most likely models, given the theory and data (hence a form of stepwise)

Further Development
- "-pre-"-wise regression
  - Find the combination of IVs and model specification that best predicts the outcome variable
  - Variable significance ignored
- Bayesian cross-model comparisons
  - Safer than stepwise
- Bayes factors
  - Requires development of reasonable empirical solutions to the integrals
