What Can We Do When Conditions Aren’t Met? Robin H. Lock, Burry Professor of Statistics St. Lawrence University BAPS at 2011 JSM Miami Beach, August 2011.

Presentation transcript:


Example #1: CI for a Mean. To use t*, the sample should come from a normal distribution. But what if the sample is clearly skewed, has outliers, …?

Example #2: CI for a Standard Deviation. Example #3: CI for a Correlation. What is the distribution?

Alternate Approach: Bootstrapping “Let your data be your guide.” Brad Efron – Stanford University

What is a bootstrap? and How does it give an interval?

Example #1: Atlanta Commutes. Data: The American Housing Survey (AHS) collected data from Atlanta. What’s the mean commute time for workers in metropolitan Atlanta?

Sample of n=500 Atlanta Commutes. Where might the “true” μ be?

“Bootstrap” Samples. Key idea: Sample with replacement from the original sample, using the same n. This assumes the “population” is many, many copies of the original sample.

Atlanta Commutes – Original Sample

Atlanta Commutes: Simulated Population

Creating a Bootstrap Distribution
1. Compute a statistic of interest (original sample).
2. Create a new sample with replacement (same n).
3. Compute the same statistic for the new sample.
4. Repeat steps 2 and 3 many times, storing the results.
Important point: The basic process is the same for ANY parameter/statistic.
Diagram labels: bootstrap sample, bootstrap statistic, bootstrap distribution.
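The four steps can be sketched in a few lines of Python. This is only an illustration, not the software used in the talk: the function name `bootstrap_distribution` and the small `commutes` list are hypothetical stand-ins (the real Atlanta sample has n=500 and is not reproduced here).

```python
import random
import statistics


def bootstrap_distribution(sample, statistic, reps=1000, seed=0):
    """Return `reps` bootstrap statistics computed from resamples of `sample`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    results = []
    for _ in range(reps):
        # Step 2: draw a new sample of the same size, with replacement.
        resample = rng.choices(sample, k=n)
        # Step 3: compute the same statistic for the new sample.
        results.append(statistic(resample))
    # Step 4: the stored results form the bootstrap distribution.
    return results


# Hypothetical commute times in minutes (made-up values, not the AHS data).
commutes = [5, 10, 12, 15, 20, 20, 25, 30, 30, 35, 40, 45, 60, 75, 120]

# Step 1: statistic of interest for the original sample.
print("original mean:", statistics.mean(commutes))

# The same process works for any statistic: only the function changes.
boot_means = bootstrap_distribution(commutes, statistics.mean)
boot_sds = bootstrap_distribution(commutes, statistics.stdev)
print("bootstrap means computed:", len(boot_means))
```

Passing the statistic as a function mirrors the slide's point that the process is identical for any parameter.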

Bootstrap Distribution of 1000 Atlanta Commute Means

Using the Bootstrap Distribution to Get a Confidence Interval, Version #1. The standard deviation of the bootstrap statistics estimates the standard error of the sample statistic. Quick interval estimate: For the mean Atlanta commute time:

Example #2: Find a confidence interval for the standard deviation, σ, of prices (in $1,000s) for Mustangs (cars) for sale on an internet site. Original sample: n=25, s=11.11. Bootstrap distribution of sample standard deviations: SE=1.61.
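The formula on this slide did not survive in the transcript. A minimal sketch, assuming the quick interval is the common rule of thumb "original statistic ± 2·SE", might look like the following. The function name `se_interval` and the `prices` values are hypothetical.

```python
import random
import statistics


def se_interval(sample, statistic, reps=1000, seed=0):
    """Rough 95% interval: original statistic +/- 2 * bootstrap SE (assumed rule)."""
    rng = random.Random(seed)
    n = len(sample)
    boot_stats = [statistic(rng.choices(sample, k=n)) for _ in range(reps)]
    # The SD of the bootstrap statistics estimates the standard error.
    se = statistics.stdev(boot_stats)
    center = statistic(sample)
    return center - 2 * se, center + 2 * se, se


# Hypothetical prices in $1,000s (made-up values, not the Mustang data).
prices = [7.0, 9.5, 11.9, 12.0, 13.0, 15.0, 16.0, 21.0, 23.0, 33.0, 37.9, 45.0]

low, high, se = se_interval(prices, statistics.stdev)
print(f"SE = {se:.2f}, interval = ({low:.2f}, {high:.2f})")
```

The interval is symmetric around the original statistic by construction, which is reasonable when the bootstrap distribution is roughly bell-shaped.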

Using the Bootstrap Distribution to Get a Confidence Interval, Method #2. For a 95% CI, find the 2.5%-tile and 97.5%-tile in the bootstrap distribution: keep 95% in the middle and chop 2.5% in each tail. 95% CI=(27.34,31.96)

90% CI for Mean Atlanta Commute. For a 90% CI, find the 5%-tile and 95%-tile in the bootstrap distribution: keep 90% in the middle and chop 5% in each tail. 90% CI=(27.52,30.66)

99% CI for Mean Atlanta Commute. For a 99% CI, find the 0.5%-tile and 99.5%-tile in the bootstrap distribution: keep 99% in the middle and chop 0.5% in each tail. 99% CI=(26.74,31.48)
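The percentile idea for the 90%, 95%, and 99% intervals can be sketched with one helper that takes the confidence level as a parameter. The name `percentile_interval` and the simple index-rounding rule for locating percentiles are assumptions of this sketch; statistical packages use a variety of percentile conventions. The data are made up.

```python
import random
import statistics


def percentile_interval(boot_stats, level=0.95):
    """Keep the middle `level` of the bootstrap statistics; chop equal tails."""
    ordered = sorted(boot_stats)
    tail = (1 - level) / 2  # e.g. 0.025 in each tail for a 95% interval
    last = len(ordered) - 1
    low = ordered[int(round(tail * last))]
    high = ordered[int(round((1 - tail) * last))]
    return low, high


# Hypothetical sample (made-up values) and its bootstrap means.
rng = random.Random(0)
sample = [5, 10, 12, 15, 20, 20, 25, 30, 30, 35, 40, 45, 60, 75, 120]
boot_means = [statistics.mean(rng.choices(sample, k=len(sample)))
              for _ in range(1000)]

for level in (0.90, 0.95, 0.99):
    low, high = percentile_interval(boot_means, level)
    print(f"{level:.0%} interval: ({low:.2f}, {high:.2f})")
```

Because all three intervals are read from the same sorted list, a higher confidence level can only widen the interval.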

What About Technology? Possible options: Fathom, R, Minitab (macro), JMP, web apps, others?
R examples (boot() is from the boot package; do() is from the mosaic package):
xbar=function(x,i) mean(x[i])
x=boot(Margin,xbar,1000)
x=do(1000)*sd(sample(Price,25,replace=TRUE))

(coming soon)

Example #3: Find a 95% confidence interval for the correlation between size of bill and tips at a restaurant. Data: n=157 bills at First Crush Bistro (Potsdam, NY). r=0.915

Bootstrap correlations: the 95% (percentile) interval for the correlation is (0.860, 0.956). BUT, this is not symmetric…

Method #3: Reverse Percentiles. Golden rule of bootstraps: Bootstrap statistics are to the original statistic as the original statistic is to the population parameter.
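The slide gives no formula, so the following is a sketch under one assumption: that "reverse percentiles" means reflecting the percentile endpoints around the original statistic, i.e. (2·stat − upper, 2·stat − lower). That follows from the golden rule if bootstrap-minus-original behaves like original-minus-parameter. Both function names are hypothetical.

```python
def reverse_from_percentiles(original_stat, pct_low, pct_high):
    """Reflect percentile endpoints around the original statistic."""
    return 2 * original_stat - pct_high, 2 * original_stat - pct_low


def reverse_percentile_interval(original_stat, boot_stats, level=0.95):
    """Same idea, starting from the raw bootstrap statistics."""
    ordered = sorted(boot_stats)
    tail = (1 - level) / 2
    last = len(ordered) - 1
    pct_low = ordered[int(round(tail * last))]
    pct_high = ordered[int(round((1 - tail) * last))]
    return reverse_from_percentiles(original_stat, pct_low, pct_high)


# Using the numbers quoted on the slides: r = 0.915, percentile interval
# (0.860, 0.956). The long lower tail flips to become a long upper tail.
low, high = reverse_from_percentiles(0.915, 0.860, 0.956)
print(f"reverse percentile interval: ({low:.3f}, {high:.3f})")  # (0.874, 0.970)
```

Note that for a correlation this close to 1, the reflected upper endpoint moves toward the boundary, which is one reason to inspect the bootstrap distribution before choosing a method.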

What About Hypothesis Tests?

“Randomization” Samples. Key idea: Generate samples that are (a) based on the original sample AND (b) consistent with some null hypothesis.

Example: Mean Body Temperature. Data: A sample of n=50 body temperatures. Is the average body temperature really 98.6 °F? H0: μ=98.6 vs. Ha: μ≠98.6. Data from Allen Shoemaker, 1996 JSE data set article.

Randomization Samples. How to simulate samples of body temperatures to be consistent with H0: μ=98.6? (Fathom Demo)

Randomization Distribution. Looks pretty unusual… p-value ≈ (1/1000) × 2 = 0.002
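The slides defer to a Fathom demo for how the samples are generated. One way to do it, assumed here rather than taken from the talk, is to shift every value so the sample mean equals the null value and then resample with replacement. The p-value doubles the count in one tail, matching the "× 2" on the slide. The function name and the temperatures are hypothetical.

```python
import random
import statistics


def randomization_test_mean(sample, null_mean, reps=1000, seed=0):
    """Two-tailed p-value for H0: mu = null_mean via shift-and-resample."""
    rng = random.Random(seed)
    n = len(sample)
    observed = statistics.mean(sample)
    # Shift the data so it is consistent with H0 but keeps its shape.
    shifted = [x + (null_mean - observed) for x in sample]
    tail_count = 0
    for _ in range(reps):
        m = statistics.mean(rng.choices(shifted, k=n))
        # Count randomization means at least as extreme as the observed
        # mean, in the direction the observed mean falls.
        if observed < null_mean:
            tail_count += m <= observed
        else:
            tail_count += m >= observed
    # Double the one-tail proportion for the two-sided alternative.
    return min(1.0, 2 * tail_count / reps)


# Hypothetical body temperatures in degrees F (made-up, not Shoemaker's data).
temps = [98.0, 98.2, 98.1, 98.3, 97.9, 98.2, 98.0, 98.1, 98.4, 98.2]
p = randomization_test_mean(temps, 98.6)
print("p-value:", p)
```

With 1000 randomization samples the smallest nonzero two-tailed p-value this can report is 0.002, which is the value shown on the slide.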

Choosing a Randomization Method. Example: Finger tap rates (Handbook of Small Datasets). A=Caffeine, mean=248.3; B=No Caffeine, mean=244.7. H0: μA=μB vs. Ha: μA>μB.
Method #1: Randomly scramble the A and B labels and assign to the 20 tap rates.
Method #2: Add 1.8 to each B rate and subtract 1.8 from each A rate (to make both means equal to 246.5). Sample 10 values (with replacement) within each group.
Method #3: Pool the 20 values and select two samples of size 10 (with replacement).

Connecting CI’s and Tests. Randomization body temp means when μ=98.6. Bootstrap body temp means from the original sample. (Fathom Demo)
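The three methods differ only in how one randomization sample is generated, so they can share a single p-value routine. The sketch below is illustrative: all function names are hypothetical, the tap rates are made-up values (not the Handbook data), and Method #2 shifts each group to the midpoint of the two group means, as in the slide's 246.5 example.

```python
import random
import statistics


def scramble_labels(a, b, rng):
    """Method #1: shuffle the pooled values and re-deal the A and B labels."""
    pooled = a + b
    rng.shuffle(pooled)
    return pooled[:len(a)], pooled[len(a):]


def shift_and_resample(a, b, rng):
    """Method #2: shift both groups to a common mean, resample within groups."""
    common = (statistics.mean(a) + statistics.mean(b)) / 2
    a0 = [x - statistics.mean(a) + common for x in a]
    b0 = [x - statistics.mean(b) + common for x in b]
    return rng.choices(a0, k=len(a)), rng.choices(b0, k=len(b))


def pool_and_resample(a, b, rng):
    """Method #3: draw both samples with replacement from the pooled values."""
    pooled = a + b
    return rng.choices(pooled, k=len(a)), rng.choices(pooled, k=len(b))


def p_value(a, b, method, reps=1000, seed=0):
    """One-tailed p-value for Ha: mu_A > mu_B under the chosen method."""
    rng = random.Random(seed)
    observed = statistics.mean(a) - statistics.mean(b)
    count = 0
    for _ in range(reps):
        ra, rb = method(a, b, rng)
        if statistics.mean(ra) - statistics.mean(rb) >= observed:
            count += 1
    return count / reps


# Hypothetical tap rates (made-up values for illustration only).
caffeine = [246, 249, 251, 247, 250, 248, 252, 245, 249, 250]
no_caffeine = [243, 246, 244, 247, 245, 242, 246, 244, 248, 243]

for method in (scramble_labels, shift_and_resample, pool_and_resample):
    print(method.__name__, p_value(caffeine, no_caffeine, method))
```

Method #1 mimics random assignment in an experiment, while Methods #2 and #3 mimic random sampling from populations, so the choice should reflect how the data were collected.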

Fathom Demo: Test & CI

Materials for Teaching Bootstrap/Randomization Methods?