Bootstrapping 2014/4/13 R basic 3 Ryusuke Murakami.


1 Bootstrapping 2014/4/13 R basic 3 Ryusuke Murakami

2 Bootstrapping in R
Bootstrapping uses resampling to assign measures of accuracy (defined in terms of bias, variance, confidence intervals, prediction error or some other such measure) to sample estimates.
The boot package includes functions from the book Bootstrap Methods and Their Application by A. C. Davison and D. V. Hinkley (1997, CUP).

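3 Bootstrap method
In statistics, the bootstrap method is a technique for statistical inference that serves a variety of purposes and belongs to the class of resampling methods. It is a kind of Monte Carlo method.
The bootstrap estimates properties of an estimator of a population quantity (such as its variance) by computing those properties when sampling from an approximating distribution. The standard choice of approximating distribution is the empirical distribution of the observed data. The method can also be used for hypothesis testing. It serves as an alternative to inference based on parametric assumptions when the assumed distribution is in doubt, or when parametric assumptions are impossible or would require very complicated calculations.
The advantage of the bootstrap is that it is very simple compared with analytical approaches. To obtain standard errors and confidence intervals for complicated estimators of complex population parameters (percentile points, proportions, odds ratios, correlation coefficients, and so on), it is enough to apply the estimator to bootstrap samples.
A drawback, on the other hand, is that although the bootstrap is asymptotically consistent under some conditions, it carries no general finite-sample guarantee and tends to be optimistic.
http://ja.wikipedia.org/wiki/%E3%83%96%E3%83%BC%E3%83%88%E3%82%B9%E3%83%88%E3%83%A9%E3%83%83%E3%83%97%E6%B3%95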
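As a rough illustration of the point about complicated statistics, the sketch below estimates a standard error and a percentile interval for a correlation coefficient using base R only. The choice of mtcars, the variables mpg and wt, the seed, and the number of resamples B are assumptions made for illustration; they are not taken from the slides.

```r
# Bootstrap standard error of a correlation coefficient (base R only).
set.seed(1)          # arbitrary seed, only for reproducibility
n <- nrow(mtcars)
B <- 2000            # number of bootstrap resamples (illustrative choice)

boot_cor <- replicate(B, {
  idx <- sample.int(n, size = n, replace = TRUE)  # resample rows with replacement
  cor(mtcars$mpg[idx], mtcars$wt[idx])            # statistic on the resample
})

cor_hat <- cor(mtcars$mpg, mtcars$wt)             # estimate on the original data
se_boot <- sd(boot_cor)                           # bootstrap standard error
ci_perc <- quantile(boot_cor, c(0.025, 0.975))    # simple percentile interval
```

No formula for the sampling variance of the correlation is needed here; the same loop would work for an odds ratio or a percentile by swapping the statistic.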

4 Bootstrapping (statistics)
In statistics, bootstrapping is a method for assigning measures of accuracy (defined in terms of bias, variance, confidence intervals, prediction error or some other such measure) to sample estimates. This technique allows estimation of the sampling distribution of almost any statistic using only very simple methods. Generally, it falls in the broader class of resampling methods.
Bootstrapping is the practice of estimating properties of an estimator (such as its variance) by measuring those properties when sampling from an approximating distribution. One standard choice for an approximating distribution is the empirical distribution of the observed data. In the case where a set of observations can be assumed to be from an independent and identically distributed population, this can be implemented by constructing a number of resamples of the observed dataset (and of equal size to the observed dataset), each of which is obtained by random sampling with replacement from the original dataset.
It may also be used for constructing hypothesis tests. It is often used as an alternative to inference based on parametric assumptions when those assumptions are in doubt, or where parametric inference is impossible or requires very complicated formulas for the calculation of standard errors.
http://en.wikipedia.org/wiki/Bootstrapping_%28statistics%29
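The resampling step described above can be sketched by hand, without any package. This is a minimal example under assumed choices (the mtcars mpg column as the observed data, the sample mean as the statistic, 1000 resamples); the mean is used because its analytical variance is known, which gives something to compare against.

```r
# Manual bootstrap: resamples of equal size, drawn with replacement.
set.seed(42)                 # arbitrary seed, only for reproducibility
x <- mtcars$mpg              # observed data, treated as i.i.d. (an assumption)
n <- length(x)
B <- 1000                    # number of resamples (illustrative choice)

theta_hat  <- mean(x)        # statistic on the original sample
theta_star <- numeric(B)     # statistic on each resample
for (b in seq_len(B)) {
  resample <- sample(x, size = n, replace = TRUE)
  theta_star[b] <- mean(resample)
}

bias_est <- mean(theta_star) - theta_hat  # bootstrap estimate of bias
var_est  <- var(theta_star)               # bootstrap estimate of variance
var_analytic <- var(x) / n                # textbook variance of the sample mean
```

For the mean, var_est and var_analytic should land close to each other; for statistics without such a formula, only the bootstrap value is available.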

5 R Programming/Bootstrap

6 bootobject <- boot(data=, statistic=, R=, ...) where
parameter    description
data         A vector, matrix, or data frame
statistic    A function that produces the k statistics to be bootstrapped (k=1 if bootstrapping a single statistic). The function should include an indices parameter that the boot() function can use to select cases for each replication (see examples below).
R            Number of bootstrap replicates
...          Additional parameters to be passed to the function that produces the statistic of interest
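A small sketch of how these four arguments fit together, assuming a trimmed mean as the statistic. The function name trim_stat, the trim value, the seed, and R = 500 are illustrative assumptions; the extra trim argument is forwarded to the statistic through "...".

```r
library(boot)

# Statistic function: data first, then the indices chosen by boot(),
# then any extra arguments that boot() forwards through "...".
trim_stat <- function(data, indices, trim) {
  mean(data[indices], trim = trim)
}

set.seed(123)  # arbitrary seed, only for reproducibility
trim_boot <- boot(data = mtcars$mpg, statistic = trim_stat, R = 500, trim = 0.1)

trim_boot$t0      # statistic computed on the original data
dim(trim_boot$t)  # matrix of replicates: R rows, k columns
```

Because boot() calls the statistic once with the indices 1..n, t0 matches the trimmed mean of the original vector.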

7 boot.ci(bootobject, conf=, type=) where
parameter    description
bootobject   The object returned by the boot function
conf         The desired confidence level (default: conf=0.95)
type         The type of confidence interval returned. Possible values are "norm", "basic", "stud", "perc", "bca" and "all" (default: type="all")
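The conf and type arguments can be tried out as follows. This is a sketch with assumed choices (the sample mean of mtcars$mpg, a 90% level, 2000 replicates); the "stud" type is left out because it needs the statistic to return a variance estimate as well.

```r
library(boot)

# Statistic: sample mean of the cases selected by boot().
mean_stat <- function(data, indices) mean(data[indices])

set.seed(2014)  # arbitrary seed, only for reproducibility
b <- boot(data = mtcars$mpg, statistic = mean_stat, R = 2000)

# Request two interval types at the 90% level.
ci <- boot.ci(b, conf = 0.90, type = c("norm", "perc"))

ci$normal   # columns: level, lower, upper
ci$percent  # columns: level, two order-statistic positions, lower, upper
```

Each requested type is stored as its own component of the returned object, so individual limits can be picked out by position.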

8 Bootstrapping a Single Statistic (k=1)
# Bootstrap 95% CI for R-Squared
library(boot)

# function to obtain R-Squared from the data
rsq <- function(formula, data, indices) {
  d <- data[indices,]  # allows boot to select sample
  fit <- lm(formula, data=d)
  return(summary(fit)$r.squared)
}

# bootstrapping with 1000 replications
results <- boot(data=mtcars, statistic=rsq, R=1000, formula=mpg~wt+disp)

# view results
results
plot(results)

# get 95% confidence interval
boot.ci(results, type="bca")
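Printing the boot object reports the original estimate together with a bias and a standard error. The sketch below recomputes those two numbers by hand from the stored replicates, which may help when reading the printed summary. The seed and the variable names boot_bias and boot_se are assumptions for illustration.

```r
library(boot)

# R-squared of the regression fitted to the resampled rows.
rsq <- function(formula, data, indices) {
  d <- data[indices, ]
  summary(lm(formula, data = d))$r.squared
}

set.seed(1234)  # arbitrary seed, only for reproducibility
results <- boot(data = mtcars, statistic = rsq, R = 1000,
                formula = mpg ~ wt + disp)

t_star <- results$t[, 1]                   # the 1000 bootstrap R-squared values
boot_bias <- mean(t_star) - results$t0     # replicate mean minus original estimate
boot_se   <- sd(t_star)                    # spread of the replicates
```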

9 Bootstrapping Several Statistics (k>1)
# Bootstrap 95% CI for regression coefficients
library(boot)

# function to obtain regression weights
bs <- function(formula, data, indices) {
  d <- data[indices,]  # allows boot to select sample
  fit <- lm(formula, data=d)
  return(coef(fit))
}

# bootstrapping with 1000 replications
results <- boot(data=mtcars, statistic=bs, R=1000, formula=mpg~wt+disp)

# view results
results
plot(results, index=1)  # intercept
plot(results, index=2)  # wt
plot(results, index=3)  # disp

# get 95% confidence intervals
boot.ci(results, type="bca", index=1)  # intercept
boot.ci(results, type="bca", index=2)  # wt
boot.ci(results, type="bca", index=3)  # disp
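When k grows, calling boot.ci once per index gets repetitive. One possible way to collect all intervals into a single table is sketched here; it uses percentile intervals rather than BCa to keep the extraction simple, and the names coef_stat, res and ci_table, as well as the seed, are assumptions for illustration.

```r
library(boot)

# Regression coefficients of the model fitted to the resampled rows.
coef_stat <- function(formula, data, indices) {
  coef(lm(formula, data = data[indices, ]))
}

set.seed(99)  # arbitrary seed, only for reproducibility
res <- boot(data = mtcars, statistic = coef_stat, R = 1000,
            formula = mpg ~ wt + disp)

# One percentile interval per coefficient; columns 4 and 5 of $percent
# hold the lower and upper limits.
ci_table <- t(sapply(seq_along(res$t0), function(i) {
  boot.ci(res, type = "perc", index = i)$percent[4:5]
}))
dimnames(ci_table) <- list(names(res$t0), c("lower", "upper"))
ci_table
```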

10 help(boot) Exercise Parameters Parallel operation
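For the parallel-operation item, a possible starting point is the parallel and ncpus arguments documented in help(boot). The sketch below is an assumption-laden example, not a worked answer from the slides: it picks the backend from the operating system, uses two worker processes, and reuses the R-squared statistic from the earlier slide under the name rsq_stat.

```r
library(boot)

# R-squared of the regression fitted to the resampled rows.
rsq_stat <- function(formula, data, indices) {
  summary(lm(formula, data = data[indices, ]))$r.squared
}

# "multicore" relies on forking and is not available on Windows,
# so fall back to "snow" there.
backend <- if (.Platform$OS.type == "windows") "snow" else "multicore"

# Note: with parallel workers, set.seed() alone does not guarantee
# reproducible replicates; see the parallel section of help(boot).
res_par <- boot(data = mtcars, statistic = rsq_stat, R = 1000,
                formula = mpg ~ wt + disp,
                parallel = backend, ncpus = 2)

res_par$t0          # R-squared on the original data
length(res_par$t)   # number of replicates gathered from the workers
```

For a statistic this cheap the overhead of starting workers may outweigh the gain; parallel operation tends to pay off when each replicate is expensive.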

