Biostatistics Case Studies 2005 Peter D. Christenson Biostatistician Session 2: Using the Bootstrap: Throw Out Those Messy.


1 Biostatistics Case Studies 2005 Peter D. Christenson Biostatistician http://gcrc.humc.edu/Biostat Session 2: Using the Bootstrap: Throw Out Those Messy Statistics Formulas?

2 Outline: Case study: Example of bootstrap; Basics: Precision of estimates; Practical difficulties; Bootstrap concept; Particular use in our case study; Other uses. Bootstrap References: 1. Tutorial PDF*: www.insightful.com/hesterberg/bootstrap 2. Haukoos and Lewis. Acad Emerg Med 2005; 12(4):360-365. *Source for most figures here.

3 Case Study Outcome: Labor progression was estimated by the duration of labor for each cm of cervical dilation using serial vaginal exams. Predictor: Classification as overweight, obese or normal weight. Many possible confounding factors, e.g. fetal size.

4 Paper: Major Results The p-value is based on the precision of the estimated durations of 6.20 and 7.52 hours. p<0.01 ↔ the 99% CI for the difference is centered at 7.52 − 6.20 = 1.32 hours and lies entirely above 0 (so it is unlikely that there is no difference).
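The link between p<0.01 and a 99% CI that excludes 0 can be checked numerically. The sketch below assumes a normal approximation and uses hypothetical standard errors (0.3, 0.5, 0.7 hours); the paper's actual SE is not given on the slide, and `ci_and_p` is an illustrative helper name.

```python
from statistics import NormalDist

def ci_and_p(diff, se, level=0.99):
    """Two-sided CI and p-value for a difference, assuming a normal approximation."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - (1 - level) / 2)      # about 2.576 for a 99% CI
    lower, upper = diff - z_crit * se, diff + z_crit * se
    p_value = 2 * (1 - z.cdf(abs(diff) / se))    # test of "no difference"
    return lower, upper, p_value

diff = 7.52 - 6.20  # difference in estimated durations from the slide (hours)

# Hypothetical standard errors, for illustration only.
for se in (0.3, 0.5, 0.7):
    lower, upper, p = ci_and_p(diff, se)
    print(f"SE={se}: 99% CI=({lower:.2f}, {upper:.2f}), p={p:.4f}, "
          f"CI above 0: {lower > 0}, p<0.01: {p < 0.01}")
```

For each assumed SE, the CI lies above 0 exactly when p<0.01, which is the equivalence the slide states.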

5 Case Study: From statistical methods Survival analysis methods were needed to estimate the median duration from, say, 4 to 5 cm dilation, since the exact times were not known. Nevertheless, the authors did not just use the p-values from the software output, but used “bootstrapping”. Why?

6 Basics: Precision vs. Normal Range Suppose that a random sample of 100 women shows a 4-10 cm dilation duration mean±SD of 7.0±1.25 hours. The normal (95%) range is ~ 7±2(1.25) = 4.5 to 9.5 hours. With no other info about a patient, we predict that her duration will be between 4.5 and 9.5 hours, based on SD = variation among individuals. But how precisely did the study estimate the mean duration of all women? We only have one mean, but want 7.0±2(SD of means).

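7 Estimating Precision from Theory To get the SD of a mean, conceptually take many samples from “all” pregnant women, then find the mean and the SD of the resulting sample means (X-bars). Of course, we don’t have the luxury of more than 1 sample. From math theory, the SD of a mean of N is SD/√N = SEM. [Figure: repeated samples drawn from “all” pregnant women, with the mean of the X-bars and the SD of the X-bars.]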
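As a worked example, the earlier figures (mean±SD of 7.0±1.25 hours, N=100) give the normal range and the precision of the mean below. The simulation at the end is only a sketch: it assumes a normally distributed population with those parameters, which the slides do not state, to mimic the "many samples" thought experiment.

```python
import math
import random
import statistics

mean, sd, n = 7.0, 1.25, 100  # values from the earlier slide

# Normal (95%) range: variation among individual women.
normal_range = (mean - 2 * sd, mean + 2 * sd)    # (4.5, 9.5)

# Precision of the mean: SD of means from theory, SEM = SD / sqrt(N).
sem = sd / math.sqrt(n)                          # 0.125
ci_for_mean = (mean - 2 * sem, mean + 2 * sem)   # (6.75, 7.25)

# Thought experiment: draw many samples of size N and take the SD of their means.
# Assumption: the population is normal with the mean and SD above.
rng = random.Random(0)
sample_means = [
    statistics.fmean([rng.gauss(mean, sd) for _ in range(n)])
    for _ in range(2000)
]
sd_of_means = statistics.stdev(sample_means)

print("normal range:", normal_range)
print("SEM from theory:", sem)
print("CI for the mean:", ci_for_mean)
print("SD of simulated means:", round(sd_of_means, 3))
```

The SD of the simulated means should land close to the theoretical SEM of 0.125.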

8 Extensions of the Theory The theory has been extended beyond means (SEM) to SEs for more complex measures, such as predictions from regression: Blue bands are “normal ranges” and red bands are CIs, showing precision. But, the relation is not just a factor of √N, as with means.

9 Difficulties For most situations, standard errors (SE) have been developed based on theory. In some circumstances they have not been developed, or are inaccurate: The sample may not be the simple random sample that standard SE formulas require. There may be non-sampling sources of variation, e.g., using estimated results from one analysis in further analyses. The theory-based formulas may require approximations or assumptions that are known not to hold.

10 The Bootstrap Standard Error Obtain a single sample of size N. Then: 1. Create thousands of samples of size N, drawn with replacement from the original sample, called “bootstrap samples” or “resamples”. 2. For the quantity of interest, M, calculate the bootstrap estimate, M*, for each sample. 3. Find the bootstrap distribution of these quantities, and in particular their SD, which is the bootstrap SE. [Figure: bootstrap distribution; the M* values are the bootstrap estimates of M.]
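The three steps can be sketched in a few lines of Python. This is a minimal illustration, not the software used in the paper; `bootstrap_se` is a hypothetical helper name and `durations` is simulated stand-in data.

```python
import random
import statistics

def bootstrap_se(sample, estimator, n_resamples=2000, seed=0):
    """Bootstrap SE: the SD of the estimator over resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(sample)
    estimates = []
    for _ in range(n_resamples):
        resample = rng.choices(sample, k=n)    # step 1: resample of size N, with replacement
        estimates.append(estimator(resample))  # step 2: bootstrap estimate M*
    return statistics.stdev(estimates)         # step 3: SD of the M* values

# Hypothetical data standing in for 100 observed durations (hours).
data_rng = random.Random(1)
durations = [data_rng.gauss(7.0, 1.25) for _ in range(100)]

se_mean = bootstrap_se(durations, statistics.fmean)
se_median = bootstrap_se(durations, statistics.median)
print("bootstrap SE of mean:  ", round(se_mean, 3))
print("formula SEM:           ", round(statistics.stdev(durations) / len(durations) ** 0.5, 3))
print("bootstrap SE of median:", round(se_median, 3))
```

The same function handles the mean and the median; only the `estimator` argument changes, which is the generality the bootstrap offers when no SE formula is at hand.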

11 The Bootstrap SE: Concept Consider a sample of N=6 with 3 bootstrap samples. Mean±SD of original sample = 4.46±7.54. SEM = 7.54/√6 = 3.08. Bootstrap SEM is SD(4.13, 4.64, 1.74) = 1.55. Here, the bootstrap SE is awful since only 3 samples were drawn. Typically, thousands are used.
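The arithmetic on this slide can be reproduced directly. The six original data values are not in the transcript, so the check below uses only the quoted summary numbers.

```python
import math
import statistics

# Formula-based SEM from the quoted SD and sample size.
sem_formula = 7.54 / math.sqrt(6)

# Bootstrap SEM: the SD of the three bootstrap-sample means quoted on the slide.
bootstrap_means = [4.13, 4.64, 1.74]
sem_bootstrap = statistics.stdev(bootstrap_means)

print(round(sem_formula, 2), round(sem_bootstrap, 2))  # prints 3.08 1.55
```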

12 The Bootstrap SE: Software A programmer could set up any software that has a sampling component to obtain bootstrap SEs for any estimates it can produce. It just needs a do-loop and the standard SD formula. But statistics software is starting to make this less tedious: SAS: Version 9 has bootstrap estimates as an option for many modules. There was a macro for earlier versions, described in the Haukoos and Lewis paper. Stata and R or S-Plus: the user can request bootstraps for all procedures. To obtain the SE for the median of ‘time’ using 1000 bootstrap samples in Stata, the following prefix to the median command could be used [centile(50) and r(c_1) obscurely refer to the median]: bootstrap r(c_1), reps(1000): centile time, centile(50)

13 Back to Labor Progression Case Study Why was bootstrapping used in the paper? The design used a stratified random sample, not a simple random sample: There are SE formulas for some quantities from studies that over-sample some groups, as was done here, but perhaps not for these adjusted medians.
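One common way to respect a stratified design is to resample within each stratum separately, keeping each stratum's size fixed. The sketch below illustrates that idea only: it is not the paper's procedure, the data are simulated, and the statistic (an unadjusted difference in medians) is a simple stand-in for the adjusted medians.

```python
import random
import statistics

def stratified_bootstrap_se(strata, statistic, n_resamples=1000, seed=0):
    """Bootstrap SE where each stratum is resampled separately, with replacement."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        # Resample within each stratum so every resample keeps the design's stratum sizes.
        resampled = {
            name: rng.choices(values, k=len(values))
            for name, values in strata.items()
        }
        estimates.append(statistic(resampled))
    return statistics.stdev(estimates)

def median_difference(strata):
    """Difference in median duration between two strata (hypothetical statistic)."""
    return statistics.median(strata["obese"]) - statistics.median(strata["normal"])

# Hypothetical durations (hours); the stratum sizes are made up for illustration.
data_rng = random.Random(2)
strata = {
    "normal": [data_rng.gauss(6.2, 1.5) for _ in range(60)],
    "obese": [data_rng.gauss(7.5, 1.5) for _ in range(40)],
}

se_diff = stratified_bootstrap_se(strata, median_difference)
print("median difference:", round(median_difference(strata), 2))
print("stratified bootstrap SE:", round(se_diff, 2))
```

A simple bootstrap that pooled all women before resampling would let the stratum sizes vary from resample to resample, which does not match how the sample was actually drawn.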

14 Other Bootstrap Advantages Typically fewer assumptions. Very general and reliable: can use the same software code for many estimation problems that have different formulas. Some studies show that greater accuracy is obtained than with classical methods. Can model the entire estimation process, not just sampling error. See next 2 slides.

15 Labor Progression Paper: Covariate Adjustment Method that was Used Recall that durations were adjusted for several covariates, including oxytocin use and fetal size. The oversampling of heavier women was accounted for with bootstrapping so that these adjustments were unbiased. A single set of covariates was used for all bootstrap samples.

16 Labor Progression Paper: Alternative Covariate Adjustment Recall that durations were adjusted for several covariates, including oxytocin use and fetal size. The entire process of selecting covariates could have been performed separately with each bootstrap sample. → Possibly different covariates in each bootstrap sample. This could better incorporate the uncertainty of choosing which factors need to be used for adjustment.
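To illustrate how covariate selection can differ between bootstrap samples, the sketch below reruns a selection step inside each resample and tallies how often each candidate is chosen. Everything here is hypothetical: the data are simulated, the candidate names `weak_factor` and `unrelated` are invented, and the selection rule (absolute correlation with duration above 0.2) is a stand-in, not the method used in the paper.

```python
import random
import statistics
from collections import Counter

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def select_covariates(rows, candidates, threshold=0.2):
    """Hypothetical selection rule: keep covariates with |correlation| > threshold."""
    outcome = [row["duration"] for row in rows]
    return [
        name for name in candidates
        if abs(pearson([row[name] for row in rows], outcome)) > threshold
    ]

# Simulated data: one strong, one weak, and one unrelated candidate covariate.
rng = random.Random(0)
candidates = ["fetal_size", "weak_factor", "unrelated"]
rows = []
for _ in range(200):
    row = {name: rng.gauss(0, 1) for name in candidates}
    row["duration"] = (7.0 + 1.0 * row["fetal_size"]
                       + 0.25 * row["weak_factor"] + rng.gauss(0, 1))
    rows.append(row)

# Repeat the whole selection step within each bootstrap sample.
n_resamples = 500
selection_counts = Counter()
for _ in range(n_resamples):
    resample = rng.choices(rows, k=len(rows))
    selection_counts.update(select_covariates(resample, candidates))

for name in candidates:
    print(name, "selected in", selection_counts[name] / n_resamples, "of resamples")
```

A borderline covariate tends to be selected in some resamples and dropped in others; carrying that variability through to the final estimate is how the bootstrap could incorporate the uncertainty of choosing adjustment factors.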

17 Conclusions Bootstrapping can avoid the requirement of unnecessary assumptions. It is not needed in most applications. It is needed for studies without simple random sampling, unless software for other sampling designs is used. For our paper here, it probably had a small impact, but could have been used to gain further advantages, e.g. in covariate adjustment. In general, it is of relatively minor importance for most studies. Just excluding a confounder would typically have a greater impact.

