Bootstrap spatobotp ttaoospbr (Hesterberg & Moore, chapter 16)

Presentation transcript:

Bootstrap spatobotp ttaoospbr (Hesterberg & Moore, chapter 16)

Bootstrap
Why is it important?
– The bootstrap can be used to test almost any statistic that you think is interesting
– Minimal underlying assumptions
– Intuitive

Bootstrap
How does it build on what we did before?
– Each analysis we saw had conditions of application, and we were often at a loss about what to do when those conditions were not met.
– We saw one test statistic that we did not know how to test: the Sobel mediation test (c − c').

Bootstrap
How is it conceptually different from other approaches?
– It is non-parametric, so it does not rely on a hypothetical underlying distribution of the test statistic or of the data.

Today's subject
– Principles of the bootstrap: how to do it?
– How to estimate the bias, standard error, and confidence interval of a parameter with the bootstrap?

The process of frequentist statistics
– Theoretical hypothesis → operational hypothesis → null hypothesis → test statistic → underlying distribution → p-value or CI
– If one of these is unusual or unknown, the bootstrap or a permutation test is useful.

Plan for the day
Present the basic concepts of the bootstrap:
– bias
– standard deviation over bootstrap samples

Historically
– 1979: seminal paper by Efron, although there were some predecessors.
– Popularized in the 1980s thanks to the availability of computers.
– Strong mathematical foundations (even before 1979).

In practice
Almost always based on simulation. It is often used to:
– estimate standard errors,
– estimate bias,
– construct confidence intervals.

Advantages
– Minimal assumption: the sample is a good representation of the unknown population.
– Works for almost any test statistic you can think of.

Bootstrap algorithm
1. Draw a sample x* with replacement from your sample x; both samples have the same size n.
2. Compute TS* for this bootstrap sample.
3. Repeat steps 1 and 2 B times. We obtain TS* = (TS*_1, TS*_2, …, TS*_B).
TS* is a sample from the unknown distribution of TS.
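A minimal sketch of this loop in Python with NumPy (the data, the choice of the mean as TS, and B = 2000 are illustrative assumptions, not taken from the slides):

```python
import numpy as np

def bootstrap_statistic(x, stat, B=2000, rng=None):
    """Return B bootstrap replicates TS*_1, ..., TS*_B of stat(x)."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(x)
    # Steps 1 and 2, repeated B times: resample with replacement, recompute the statistic
    return np.array([stat(rng.choice(x, size=n, replace=True)) for _ in range(B)])

# Hypothetical example: bootstrap the sample mean of a small data set
rng = np.random.default_rng(0)
x = rng.normal(loc=10.0, scale=3.0, size=50)
ts_star = bootstrap_statistic(x, np.mean, B=2000, rng=rng)
print(ts_star[:5])  # replicates drawn from the estimated sampling distribution of TS
```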

Bootstrap standard errors
The standard deviation of TS* over the bootstrap samples is an estimate of the standard error of TS computed from a single sample.
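As a self-contained illustration (sample, statistic, and B are again hypothetical), the bootstrap standard error is simply the standard deviation of the replicates, which for the mean can be checked against the textbook formula s/√n:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(loc=10.0, scale=3.0, size=50)                  # hypothetical sample
ts_star = np.array([rng.choice(x, size=x.size, replace=True).mean()
                    for _ in range(2000)])                    # bootstrap replicates of the mean
se_boot = ts_star.std(ddof=1)                                 # bootstrap standard error
se_formula = x.std(ddof=1) / np.sqrt(x.size)                  # textbook SE of the mean, for comparison
print(se_boot, se_formula)
```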

Bootstrap estimate of bias
Bias(TS) = mean(TS*) − TS, where the mean is taken over the B bootstrap replicates.
Because it is an estimate, it will not be exactly zero. Thus, its distance from zero should be judged relative to the bootstrap standard deviation.
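A short, hedged illustration using the plug-in variance (which divides by n and is known to be biased downward) as a hypothetical TS:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(loc=0.0, scale=2.0, size=20)                      # hypothetical sample
ts_hat = np.var(x)                                               # plug-in variance (divides by n, biased)
ts_star = np.array([np.var(rng.choice(x, size=x.size, replace=True))
                    for _ in range(4000)])
bias = ts_star.mean() - ts_hat                                   # bootstrap estimate of the bias
print(bias, bias / ts_star.std(ddof=1))                          # bias, and bias relative to the bootstrap SD
```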

Types of bootstrap
– Parametric: we assume that TS follows a specific distribution; the bootstrap is used to estimate the parameters of that distribution.
– Non-parametric: we do not know the distribution of TS and instead obtain its empirical distribution.

Bootstrap estimate of covariance
Let TS_1 be one test statistic and TS_2 another (e.g., TS_1 is the mean and TS_2 is the variance). Because you estimate both statistics in each replication, you can then obtain cov(TS_1, TS_2) from the bootstrap replicates.
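A sketch under a hypothetical setup: compute both statistics on every replicate, then take their sample covariance across replicates.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.gamma(shape=2.0, scale=1.5, size=60)                     # hypothetical skewed sample
reps = [rng.choice(x, size=x.size, replace=True) for _ in range(3000)]
ts1 = np.array([r.mean() for r in reps])                         # TS_1: mean of each replicate
ts2 = np.array([r.var(ddof=1) for r in reps])                    # TS_2: variance of each replicate
cov_12 = np.cov(ts1, ts2)[0, 1]                                  # bootstrap estimate of cov(TS_1, TS_2)
print(cov_12)
```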

Confidence interval: parametric
– Estimate the mean and standard error of the assumed distribution using the bootstrap.
– Use these estimates to compute a parametric confidence interval (parametric because you assume a distribution).
– Example: assume TS is normal, bootstrap TS to obtain its mean and standard deviation over the bootstrap samples (remember: this standard deviation plays the role of the standard error from a single sample).
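A minimal sketch of this normal-approximation interval (data, B, and the 95% level are illustrative; 1.96 is the normal quantile for 95%):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(loc=100.0, scale=15.0, size=25)                   # hypothetical sample
ts_hat = x.mean()                                                # observed TS
ts_star = np.array([rng.choice(x, size=x.size, replace=True).mean()
                    for _ in range(2000)])
se_boot = ts_star.std(ddof=1)                                    # bootstrap SE of TS
ci_normal = (ts_hat - 1.96 * se_boot, ts_hat + 1.96 * se_boot)   # assumes TS is approximately normal
print(ci_normal)
```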

Confidence interval: percentile
– Obtain a few thousand replications and compute TS for each.
– Order the TS values from smallest to largest.
– The lower (resp. upper) bound of the 95% percentile confidence interval is the value of TS such that 2.5% of the replications fall below (resp. above) it.
– This interval is not necessarily symmetric.
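A sketch with illustrative data; np.percentile handles the ordering and the 2.5%/97.5% cut-offs in one call:

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.exponential(scale=2.0, size=40)                          # hypothetical skewed sample
ts_star = np.array([rng.choice(x, size=x.size, replace=True).mean()
                    for _ in range(4000)])
lo, hi = np.percentile(ts_star, [2.5, 97.5])                     # 95% percentile interval
print(lo, hi)                                                    # typically not symmetric around the sample mean
```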

Bias-corrected CI (BCa)
The TS may be biased. The percentile confidence interval is then centered on the central tendency of the bootstrap values rather than on the actual sample estimate; BCa adjusts the percentiles to correct for this.
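A hedged sketch of the bias-corrected (BC) percentile interval, i.e., BCa without the acceleration term: z0 measures how far the observed TS sits from the center of the bootstrap distribution, and the percentile cut-offs are shifted accordingly (data and B are illustrative; the full BCa also estimates an acceleration constant, usually by jackknife):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(6)
x = rng.exponential(scale=2.0, size=40)                          # hypothetical skewed sample
ts_hat = x.mean()
ts_star = np.array([rng.choice(x, size=x.size, replace=True).mean()
                    for _ in range(4000)])

# Bias-correction factor: how far the observed TS is from the bootstrap median
z0 = norm.ppf((ts_star < ts_hat).mean())
z_lo, z_hi = norm.ppf(0.025), norm.ppf(0.975)

# Shifted percentile levels (the acceleration term of full BCa is omitted here)
a_lo = norm.cdf(2 * z0 + z_lo)
a_hi = norm.cdf(2 * z0 + z_hi)
ci_bc = np.percentile(ts_star, [100 * a_lo, 100 * a_hi])
print(ci_bc)
```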

Which CI to choose?
– Percentile CIs are easy to apply and intuitive.
– To estimate extreme percentiles consistently, we need a very large number of replications.
– BCa equals the percentile CI when TS is not biased.

Take-away message
Three main reasons to use the bootstrap (each research question generates one or several TS):
– The TS can be new and thus not implemented in any software.
– H0 can be complex (not simply mean1 = mean2).
– The TS can follow an unknown distribution.
Procedure of the bootstrap:
– bias, SE, confidence intervals.