Published by Edith Walters. Modified over 6 years ago.
1
Application of the Bootstrap Estimating a Population Mean
Movie Average Shot Lengths
Sources:
Barry Sands’ Average Shot Length Movie Database
L. Chihara and T. Hesterberg (2011). Mathematical Statistics with Resampling and R. Wiley, Hoboken, NJ.
2
Data Description
Average Shot Length (seconds) for a population of films (Barry Sands’ movie database)
Very highly right-skewed population.
Min= LQ= Median= UQ= Max=1000
μ = 7.739 σ = 12.765
Coefficient of Variation: CV = 100(σ/μ) = 100(12.765/7.739) = 164.94%
Goal: Small-sample estimation of μ when the shape of the small-sample sampling distribution of the sample mean is unknown
5
Introduction to the Bootstrap
Makes use of a sample from a population to estimate the sampling distribution of a statistic/estimator.
Treats the sample as an “estimate” of the population of measurements (the sample empirical cumulative distribution function serves as an estimate of the population cdf).
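A minimal R sketch of the idea that the sample's empirical cdf stands in for the population cdf. The population here is simulated right-skewed data, not the film ASL data, and the names `population`, `asl_sample`, `F_pop` and `F_samp` are assumptions made for illustration.

```r
# Stand-in population: simulated right-skewed values, NOT the film ASL data.
set.seed(1)
population <- rlnorm(10000, meanlog = 1.5, sdlog = 1)

# One random sample of size n = 25 from the population.
n <- 25
asl_sample <- sample(population, size = n)

# ecdf() returns a step function: F_hat(x) = (number of observations <= x) / n.
F_pop  <- ecdf(population)
F_samp <- ecdf(asl_sample)

# Compare the two cdfs at the population quartiles.
x_grid <- quantile(population, probs = c(0.25, 0.50, 0.75))
round(cbind(x = x_grid, population = F_pop(x_grid), sample = F_samp(x_grid)), 3)
```

With only 25 observations the sample cdf is a coarse step function, so it tracks the population cdf only roughly; that roughness is what the bootstrap inherits.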
6
Population and Sample Empirical CDF’s (sample size: n=25)
7
Applying the Bootstrap
Obtain a random sample of size n from the population
Determine the estimator(s) of interest
Compute the estimate(s) based on the sample
Determine B, the number of bootstrap samples to be taken
Obtain B random samples of size n from the original sample with replacement
Compute the estimate for each bootstrap sample
The bootstrap distribution is the collection of estimates
The bootstrap standard error is the standard deviation of the estimates
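The steps above can be sketched in a few lines of R. This is an illustrative sketch only: the sample `x` is simulated as a stand-in for the film data, and the names `theta_hat`, `boot_means` and `boot_se` are assumptions, not the deck's own objects.

```r
set.seed(2)

# Stand-in for the original sample of size n (simulated; NOT the film data).
x <- rlnorm(25, meanlog = 1.5, sdlog = 1)
n <- length(x)

# Estimator of interest: the sample mean, computed on the original sample.
theta_hat <- mean(x)

# Number of bootstrap samples.
B <- 10000

# Draw B resamples of size n from x with replacement and
# compute the estimate on each one.
boot_means <- replicate(B, mean(sample(x, size = n, replace = TRUE)))

# The bootstrap distribution is the collection boot_means;
# its standard deviation is the bootstrap standard error.
boot_se <- sd(boot_means)
boot_se
```

Resampling with `replace = TRUE` is essential: without it every resample of size n would be a permutation of `x` and every bootstrap mean would equal `theta_hat`.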
8
Properties of the Bootstrap Sampling Distribution
Center: The center of the bootstrap sampling distribution is the estimate based on the full sample, not the population parameter it is estimating.
Spread: The spread is representative of the spread of the estimator’s sampling distribution.
Bias: The bootstrap bias estimate is the difference between the center of the bootstrap sampling distribution and the estimate from the original sample. It approximates the true bias of the estimator.
Skewness: Skewness in the bootstrap sampling distribution is representative of the skewness of the estimator’s sampling distribution.
9
Example – Movie Average Shot Lengths (ASL)
Interested in approximating the sampling distribution of the sample mean.
Population value: μ = 7.739
(Pseudo) Random sample of n=25 films’ ASLs:
10
Bootstrap Samples
Taking B=10000 bootstrap samples from the original sample.
Summaries for original sample, mean, sd, CV:
> summary(ASL.sample1)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
> summary(ASL.mean)
> summary(ASL.sd)
> summary(ASL.CV)
12
Bootstrap Standard Error and Sampling Distribution
In terms of the sampling distribution of the sample mean:
Mean of bootstrap sample means: (close to the original sample mean (8.1876), not so close to the population mean (7.7394))
Bootstrap estimate of bias: 0.0023
Bootstrap standard error (standard deviation of the bootstrap sample means): 0.7620
Bias/BSE = 0.0023/0.7620 = 0.0030 (0.30%)
Bootstrap 95% percentile interval, the (0.025, 0.975) quantiles of the bootstrap sampling distribution of the mean: (6.7444, 9.7113), which does include the population mean (7.739)
Note: The interval is not symmetric about the original sample mean, reflecting an asymmetric bootstrap sampling distribution.
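As a worked check on the percentile interval, the arithmetic below uses only values quoted on this slide. The symbols $\bar{x}$ (original sample mean) and $SE_{boot}$ (bootstrap standard error) are notation assumed here.

```latex
\begin{align*}
\bar{x} - 6.7444 &= 8.1876 - 6.7444 = 1.4432 \approx 1.89\, SE_{boot} \\
9.7113 - \bar{x} &= 9.7113 - 8.1876 = 1.5237 \approx 2.00\, SE_{boot} \\
(6.7444,\ 9.7113) &= (\bar{x} - 1.4432,\ \bar{x} + 1.5237),
\qquad SE_{boot} = 0.7620
\end{align*}
```

The upper arm is longer than the lower arm, which is the direction expected for a right-skewed bootstrap distribution.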
13
Bootstrap t Confidence Interval for μ
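One standard construction of the bootstrap t interval, as presented in Chihara and Hesterberg (2011), is sketched below. The notation ($T^*_b$, $Q_{0.025}$, $Q_{0.975}$) is assumed here and may differ from the slide's.

```latex
% For each bootstrap sample b = 1, ..., B, with mean \bar{x}^*_b and
% standard deviation s^*_b, form the studentized statistic
\[
T^*_b = \frac{\bar{x}^*_b - \bar{x}}{s^*_b / \sqrt{n}}, \qquad b = 1, \dots, B.
\]
% Let Q_{0.025} and Q_{0.975} be the 0.025 and 0.975 quantiles of the T^*_b.
% The 95% bootstrap t confidence interval for \mu is
\[
\left( \bar{x} - Q_{0.975}\,\frac{s}{\sqrt{n}},\;
       \bar{x} - Q_{0.025}\,\frac{s}{\sqrt{n}} \right).
\]
```

The quantiles swap places when the statistic is inverted, so the upper quantile sets the lower limit. For right-skewed data this typically pushes the interval further to the right than the percentile interval.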
14
ASL Example
15
Comparison of 3 Methods – 95% CI for μ
Repeat methods described previously, based on each of M=1000 random samples from the original population.
Obtain empirical coverage rates for each method based on the M=1000 random samples, with B=1000 bootstrap samples per random sample of n=25.
Method 1: (t-interval based on normality assumption): Coverage Probability: Average width: 5.05 seconds
Method 2: Bootstrap Percentile Interval: Coverage Probability: Average width: 4.40 seconds
Method 3: Bootstrap t Confidence Interval: Coverage Probability: Average width: seconds
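A comparison of this kind can be sketched in R as follows. This is not the deck's own code: the population is simulated (a stand-in for the film ASL data), M and B are reduced for speed, and the helper `one_rep()` and all object names are assumptions. Its output will not reproduce the widths quoted above.

```r
set.seed(3)

# Stand-in right-skewed population (simulated; NOT the film ASL data).
population <- rlnorm(10000, meanlog = 1.5, sdlog = 1)
mu <- mean(population)

# Smaller than the slide's M = 1000 and B = 1000, to keep the run short.
n <- 25; M <- 200; B <- 500

one_rep <- function() {
  x <- sample(population, n)
  xbar <- mean(x)
  se <- sd(x) / sqrt(n)

  # Method 1: classical t interval (normality assumption).
  t_ci <- xbar + qt(c(0.025, 0.975), df = n - 1) * se

  # Bootstrap resamples: keep the mean (row 1) and t statistic (row 2) of each.
  boot <- replicate(B, {
    xs <- sample(x, n, replace = TRUE)
    c(mean(xs), (mean(xs) - xbar) / (sd(xs) / sqrt(n)))
  })

  # Method 2: bootstrap percentile interval.
  pct_ci <- quantile(boot[1, ], c(0.025, 0.975), names = FALSE)

  # Method 3: bootstrap t interval (quantiles swap when inverting).
  q <- quantile(boot[2, ], c(0.025, 0.975), names = FALSE)
  bt_ci <- c(xbar - q[2] * se, xbar - q[1] * se)

  ci <- rbind(t = t_ci, percentile = pct_ci, boot_t = bt_ci)
  cbind(covers = ci[, 1] <= mu & mu <= ci[, 2], width = ci[, 2] - ci[, 1])
}

res <- replicate(M, one_rep())               # 3 x 2 x M array
coverage  <- rowMeans(res[, "covers", ])     # empirical coverage rate per method
avg_width <- rowMeans(res[, "width", ])      # average interval width per method
round(cbind(coverage, avg_width), 3)
```

With M = 200 the coverage estimates carry a Monte Carlo standard error of roughly 0.015 to 0.02, so small differences between methods should not be over-read at this setting.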