Connecting Intuitive Simulation-Based Inference to Traditional Methods

Presentation transcript:

Connecting Intuitive Simulation-Based Inference to Traditional Methods. Robin Lock, St. Lawrence University; Patti Frazer Lock, St. Lawrence University; Kari Lock Morgan, Pennsylvania State University. ICOTS 10 – Kyoto, Japan, July 9, 2018

Assumptions/Conditions We start with simulation-based inference (SBI): bootstrap intervals, randomization tests. We cover lots of parameter situations (mean, proportion, differences, correlation, slope, …). We want students (eventually) to see traditional methods. We need good software to make SBI methods accessible to students.

Software? Freely available web apps: StatKey http://lock5stat.com/statkey http://www.rossmanchance.com/applets/ http://www.rossmanchance.com/ISIapplets.html Statistics packages: R, JMP, Minitab Express, …

Example #1: Online Dating Apps. What proportion of 18-24 year olds (young adults) in the U.S. have used an online dating app? Data: Pew Research survey, 53 yes in a sample of 194, $\hat{p} = 53/194 = 0.273$. Task: Find a 95% CI for the proportion. Method: Create a bootstrap distribution of sample proportions by sampling with replacement from the original sample. Source: http://www.pewinternet.org/2016/02/11/15-percent-of-american-adults-have-used-online-dating-sites-or-mobile-dating-apps/

lock5stat.com/statkey

95% Confidence Interval from a Bootstrap Distribution. Percentile method: Find the endpoints of the middle 95% of the bootstrap statistics. Standard Error method: $\text{Original Statistic} \pm 2 \cdot SE$, where SE is the standard deviation of the bootstrap statistics.

0.273±2∙0.032 =(0.209, 0.337)
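The two bootstrap interval methods above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the StatKey implementation: the function names, the number of resamples (5000), and the seed are choices made for this example. Only the counts 53 and 194 come from the slides.

```python
import random
import statistics


def bootstrap_proportions(successes, n, reps=5000, seed=1):
    """Resample n values with replacement from the original sample, reps times."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the run is repeatable
    sample = [1] * successes + [0] * (n - successes)
    return [sum(rng.choices(sample, k=n)) / n for _ in range(reps)]


def percentile_interval(stats, level=0.95):
    """Endpoints of the middle `level` share of the bootstrap statistics."""
    ordered = sorted(stats)
    tail = (1 - level) / 2
    lower = ordered[int(tail * len(ordered))]
    upper = ordered[int((1 - tail) * len(ordered)) - 1]
    return lower, upper


p_hat = 53 / 194                      # original statistic from the slides
boot = bootstrap_proportions(53, 194)
se = statistics.stdev(boot)           # SE = std. dev. of the bootstrap statistics

pct_lo, pct_hi = percentile_interval(boot)      # percentile method
se_lo, se_hi = p_hat - 2 * se, p_hat + 2 * se   # standard error method

print(f"bootstrap SE: {se:.3f}")
print(f"percentile interval: ({pct_lo:.3f}, {pct_hi:.3f})")
print(f"SE interval: ({se_lo:.3f}, {se_hi:.3f})")
```

Because resampling is random, the endpoints vary slightly from run to run and from the values on the slide, but they should land close to (0.209, 0.337).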

Example #2: Does Mind-Set Matter? Female hotel maids were randomly divided into two groups. Group #1 was informed that their duties count as exercise. Group #2 was not given this information. Weight loss was measured.

Group                    n    mean   std. dev.
Group #1 (Informed)      41   1.79   2.88
Group #2 (Uninformed)    34   0.20   2.32

$H_0: \mu_1 = \mu_2$, $H_a: \mu_1 > \mu_2$, $\bar{x}_1 - \bar{x}_2 = 1.59$. Task: Does this provide enough evidence to conclude that the mean weight loss is higher when informed? Method: Create a randomization distribution of differences in means when being informed has no effect (H0 is true). Crum, A. and Langer, E. (2007) “Mind-Set Matters: Exercise and the Placebo Effect” Psychological Science, 18:165-171

lock5stat.com/statkey

[Figure: distribution of the statistic if there is no difference (H0 true), with the observed statistic and the p-value marked]
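A randomization test of this kind can be sketched as follows. The slides report only summary statistics, so the two lists below are SYNTHETIC stand-in values generated for illustration; they are not the Crum and Langer measurements, and the p-value they produce is not a result of the study. The function name, the 5000 reshuffles, and the seeds are likewise assumptions of this sketch.

```python
import random
import statistics


def randomization_pvalue(group1, group2, reps=5000, seed=1):
    """One-sided p-value for H_a: mu1 > mu2, found by reshuffling group labels."""
    rng = random.Random(seed)
    observed = statistics.fmean(group1) - statistics.fmean(group2)
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    diffs = []
    for _ in range(reps):
        rng.shuffle(pooled)  # under H0 the group labels are interchangeable
        diffs.append(statistics.fmean(pooled[:n1]) - statistics.fmean(pooled[n1:]))
    # share of reshuffled differences at least as large as the observed one
    p_value = sum(d >= observed for d in diffs) / reps
    return observed, p_value, statistics.stdev(diffs)


# SYNTHETIC placeholder data (assumption): normal draws with roughly the
# reported group sizes, means, and standard deviations.
data_rng = random.Random(0)
informed = [data_rng.gauss(1.79, 2.88) for _ in range(41)]
uninformed = [data_rng.gauss(0.20, 2.32) for _ in range(34)]

observed, p_value, se = randomization_pvalue(informed, uninformed)
print(f"observed difference: {observed:.2f}")
print(f"randomization SE: {se:.3f}, p-value: {p_value:.4f}")
```

The standard deviation of the reshuffled differences is the randomization SE that the later steps reuse.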

Transition to Traditional Step #1: Smooth Curve: Simulation distribution to general curve Step #2: Standardized Statistic: Original statistic to standardized value Step #3: Standard Error Formula: Simulation SE to formula SE

Step #1: Mind-Set Matters Compare the original statistic to this Normal distribution to find the p-value.

p-value from N(null, SE): Same idea as randomization test, just using a smooth curve! [Figure: Normal curve with the observed statistic and the p-value marked]

Seeing the Connection! Randomization Distribution Normal Distribution

Step #1: Online Dating. N(0.273, 0.032), $\hat{p} = 0.273$

CI from N(statistic, SE) Same idea as the bootstrap, just using a smooth curve!
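Step #1 for both examples can be sketched with Python's standard-library `statistics.NormalDist`. The value 0.632 is the randomization SE that the talk quotes for the Mind-Set example and 0.032 is the bootstrap SE for Online Dating; the variable names are assumptions of this sketch.

```python
from statistics import NormalDist

# Mind-Set Matters: p-value from N(null, SE) for the one-sided alternative
null_curve = NormalDist(mu=0.0, sigma=0.632)
p_value = 1 - null_curve.cdf(1.59)   # area to the right of the observed statistic

# Online Dating: 95% CI as the middle 95% of N(statistic, SE)
boot_curve = NormalDist(mu=0.273, sigma=0.032)
ci = (boot_curve.inv_cdf(0.025), boot_curve.inv_cdf(0.975))

print(f"p-value: {p_value:.4f}")
print(f"95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
```

The logic mirrors the simulation versions: a tail area for the test and the middle 95% for the interval, with the smooth curve standing in for the dotplot.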

Transition to Traditional Step #1: Smooth Curve: Simulation distribution to general curve Step #2: Standardized Statistic: Original statistic to standardized value Step #3: Standard Error Formula: Simulation SE to formula SE

Step #2: Standardize Statistic. Convert to “number of SE’s” and use N(0,1). For tests (standardize): $z = \frac{\text{Statistic} - \text{Null}}{SE}$. For intervals (unstandardize): $\text{Statistic} \pm z^* \cdot SE$. (For now) SE comes from the randomization or bootstrap distribution.

Step #2: Mind-Set Matters. $z = \frac{\text{Statistic} - \text{Null}}{SE}$. $H_0: \mu_1 = \mu_2 \Rightarrow \mu_1 - \mu_2 = 0$. Data: $\bar{x}_1 - \bar{x}_2 = 1.59$. $z = \frac{1.59 - 0}{0.632} = 2.52$. Find the p-value from N(0,1).

Step #2: Online Dating. $\hat{p} \pm z^* \cdot SE$, with $z^*$ from N(0,1). $\hat{p} = 53/194 = 0.273$. $0.273 \pm 1.96 \cdot 0.032 = 0.273 \pm 0.063 = (0.210, 0.336)$
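The standardize and unstandardize steps can be sketched as two small helpers. The function names are hypothetical; the SE values (0.632 and 0.032) are the simulation-based ones quoted in the talk.

```python
from statistics import NormalDist

std_normal = NormalDist()  # N(0, 1)


def z_statistic(statistic, null, se):
    """Standardize: how many SEs the statistic lies from the null value."""
    return (statistic - null) / se


def z_interval(statistic, se, level=0.95):
    """Unstandardize: statistic +/- z* . SE, with z* taken from N(0, 1)."""
    z_star = std_normal.inv_cdf(1 - (1 - level) / 2)
    return statistic - z_star * se, statistic + z_star * se


# Mind-Set Matters (test): SE from the randomization distribution
z = z_statistic(1.59, 0, 0.632)
p_value = 1 - std_normal.cdf(z)       # upper tail, since H_a: mu1 > mu2

# Online Dating (interval): SE from the bootstrap distribution
lo, hi = z_interval(0.273, 0.032)

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print(f"95% CI: ({lo:.3f}, {hi:.3f})")
```

Only the reference curve changes from Step #1; the SE still comes from simulation at this stage.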

Step #2: Standardize Statistic. Convert to “number of SE’s” and use N(0,1). For tests (standardize): $z = \frac{\text{Statistic} - \text{Null}}{SE}$. For intervals (unstandardize): $\text{Statistic} \pm z^* \cdot SE$. Wouldn’t it be nice to find the SE’s without needing any simulations?

Transition to Traditional Step #1: Smooth Curve: Simulation distribution to general curve Step #2: Standardized Statistic: Original statistic to standardized value Step #3: Standard Error Formula: Simulation SE to formula SE

Standard Error Formulas

Parameter                 Standard Error
Proportion                $\sqrt{\hat{p}(1-\hat{p})/n}$
Mean (use t)              $s/\sqrt{n}$
Diff. in Proportions      $\sqrt{\hat{p}_1(1-\hat{p}_1)/n_1 + \hat{p}_2(1-\hat{p}_2)/n_2}$
Diff. in Means (use t)    $\sqrt{s_1^2/n_1 + s_2^2/n_2}$

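Step #3: Mind-Set Matters. $z = \frac{\text{Statistic} - \text{Null}}{SE}$. $H_0: \mu_1 = \mu_2 \Rightarrow \mu_1 - \mu_2 = 0$. Data: $\bar{x}_1 - \bar{x}_2 = 1.59$. $SE = \sqrt{\frac{2.88^2}{41} + \frac{2.32^2}{34}} = 0.601$. $z = \frac{1.59 - 0}{0.601} = 2.65$. Find the p-value from $t_{33}$.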

Step #3: Online Dating. $\hat{p} \pm z^* \cdot SE$, with $z^*$ from N(0,1). $\hat{p} = 53/194 = 0.273$. $SE = \sqrt{\frac{0.273(1-0.273)}{194}} = 0.032$. $0.273 \pm 1.96 \cdot 0.032 = 0.273 \pm 0.063 = (0.210, 0.336)$
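The four standard error formulas translate directly into code. The sketch below uses hypothetical function names and checks them against the two worked examples. It stops at the standardized statistic, because looking up a p-value from a t-distribution with 33 degrees of freedom needs a t table or a statistics library that the standard library does not provide.

```python
from math import sqrt


def se_proportion(p_hat, n):
    return sqrt(p_hat * (1 - p_hat) / n)


def se_mean(s, n):
    return s / sqrt(n)


def se_diff_proportions(p1, n1, p2, n2):
    return sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


def se_diff_means(s1, n1, s2, n2):
    return sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)


# Mind-Set Matters: formula SE replaces the randomization SE
se_means = se_diff_means(2.88, 41, 2.32, 34)
standardized = (1.59 - 0) / se_means

# Online Dating: formula SE replaces the bootstrap SE
se_prop = se_proportion(53 / 194, 194)

print(f"SE (diff. in means): {se_means:.3f}, standardized statistic: {standardized:.2f}")
print(f"SE (proportion): {se_prop:.3f}")
```

The formula SE for the proportion agrees with the bootstrap SE to three decimals (0.032), while the formula SE for the difference in means (0.601) is close to, but not identical with, the randomization SE (0.632).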

Transition to Traditional Step #1: Smooth Curve: Simulation distribution to general curve Step #2: Standardized Statistic: Original statistic to standardized value Step #3: Standard Error Formula: Simulation SE to formula SE. Note: These steps are designed for making the transition, not for routinely calculating p-values or intervals.

Simulation to Traditional. A: Bootstrap → Normal($\hat{p}$, SE) → $\text{Statistic} \pm z^* \cdot SE$ → B: $\hat{p} \pm z^* \sqrt{\hat{p}(1-\hat{p})/n}$. Even if you only want your students to be able to go from A to B, it helps understanding to build connections along the way!

Observation. Important point: The fundamental concepts of inference have already been established (via simulation). Once the transition has been made, traditional methods can go VERY quickly! Two questions: What’s a formula for SE? What are conditions for a theoretical distribution to apply?

Observation. Why do we use $SE = \sqrt{\hat{p}(1-\hat{p})/n}$ for intervals, but $SE = \sqrt{p_0(1-p_0)/n}$ for tests? The bootstrap distribution is centered at $\hat{p}$; the randomization distribution is centered at the (null) $p_0$.

Thank you! QUESTIONS? Robin Lock: rlock@stlawu.edu Patti Frazer Lock: plock@stlawu.edu Kari Lock Morgan: klm47@psu.edu Slides posted at www.lock5stat.com