Bootstrapping in regular graphs


1 Bootstrapping in regular graphs
Gesine Reinert, Oxford, with Susan Holmes, Stanford

2 What is the bootstrap?
Efron (1979); Bickel and Freedman (1981); Singh (1981). A resampling procedure, used to construct confidence intervals and calculate standard errors for statistics.

3 The bootstrap procedure
Have a random sample of size n, say. Draw M observations out of the n, with replacement. Calculate the statistic of interest for this sample of size M. Repeat many times. Use the standard deviation across these samples to estimate the standard deviation in the population.

4 Example: median
Suppose we would like to estimate the median of a population from a sample of size n. Sample M = n observations with replacement from the observed data, and take the median of this simulated data set. Repeat these steps B times, giving B simulated medians. These medians are approximately draws from the sampling distribution of the median of n observations: calculate their standard deviation to estimate the standard error of the median.
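The median example above can be sketched in a few lines of Python. The sample data, B = 1000, and the seed are illustrative choices, not from the slides:

```python
# Bootstrap standard error of the median -- a minimal sketch.
import random
import statistics

def bootstrap_median_se(data, B=1000, seed=0):
    """Estimate the standard error of the median of `data`
    by resampling n observations with replacement, B times."""
    rng = random.Random(seed)
    n = len(data)
    medians = [
        statistics.median(rng.choices(data, k=n))  # M = n draws with replacement
        for _ in range(B)
    ]
    # The standard deviation of the simulated medians estimates
    # the standard error of the median.
    return statistics.stdev(medians)

sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9, 3.1, 2.5]
se = bootstrap_median_se(sample)
```

Replacing `statistics.median` with any other statistic gives the generic procedure of the previous slide.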

5 When does the bootstrap work?
The underlying idea is that of Russian dolls – the bootstrap samples should relate to the original sample just as the original sample relates to the unknown population (Count the freckles on the faces of Russian dolls)

6 Empirical measures Each observation can be represented by a point mass in space The average of these point masses is called the empirical measure: a random quantity taking values in the set of measures

7 Limits of empirical measures
If the conditions are right, this empirical measure converges to a limit, just as in the law of large numbers. And just as for real-valued random quantities, an approximation by a Gaussian measure holds for independent, identically distributed observations. We say that the bootstrap works when the bootstrap empirical measure can be approximated by a Gaussian measure centred around the true measure.

8 Conditions for validity?
The theoretical arguments proving that the bootstrap works rely on large independent samples. With dependent observations, the standard deviation would be estimated wrongly. For time series there is the blockwise bootstrap (Kuensch (1989), Carlstein et al. (1998)): sample whole blocks of observations from the time series, and use the blocks to approximate the standard deviation.
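The blockwise idea can be sketched as follows. The block length L, the number of resamples B, and the AR(1) series used for the demonstration are all illustrative choices, not from the slides:

```python
# Blockwise bootstrap for a time series -- a minimal sketch.
import random
import statistics

def block_bootstrap_mean_se(series, L=5, B=500, seed=0):
    """Resample whole blocks of length L with replacement, so local
    dependence is preserved, then estimate the SE of the mean."""
    rng = random.Random(seed)
    n = len(series)
    starts = range(n - L + 1)     # all overlapping block start positions
    n_blocks = n // L             # blocks per bootstrap series
    means = []
    for _ in range(B):
        resampled = []
        for _ in range(n_blocks):
            s = rng.choice(starts)            # pick a block at random
            resampled.extend(series[s:s + L]) # keep the block intact
        means.append(statistics.mean(resampled))
    return statistics.stdev(means)

# Illustrative dependent data: an AR(1) series.
gen = random.Random(1)
x, series = 0.0, []
for _ in range(200):
    x = 0.6 * x + gen.gauss(0.0, 1.0)
    series.append(x)

se = block_bootstrap_mean_se(series)
```

Resampling single observations instead of blocks would ignore the serial dependence and misestimate the standard error.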

9 Dependency graphs
For random variables we can construct a graph with the random variables as the vertices. Two vertices are linked by an edge if and only if the corresponding random variables are dependent. The set of all neighbours of a vertex is then the set of all random variables which are dependent on the vertex random variable.
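As a concrete illustration (my example, not from the slides): for a 1-dependent sequence such as the moving average Y_i = X_i + X_{i+1} with i.i.d. X_i, the variables Y_i and Y_j are dependent exactly when |i - j| <= 1, and the dependency graph can be written down directly:

```python
# Dependency graph of an m-dependent sequence -- a minimal sketch.
def dependency_graph(n, m=1):
    """Adjacency list for n variables Y_0, ..., Y_{n-1} where
    Y_i and Y_j are dependent iff 0 < |i - j| <= m."""
    return {
        i: [j for j in range(n) if j != i and abs(i - j) <= m]
        for i in range(n)
    }

g = dependency_graph(5)
# Interior vertices have degree 2m; the endpoints have degree m,
# so this graph is only approximately regular.
```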

10 Bootstrapping in such graphs
To capture the dependence structure, we bootstrap not isolated vertices but whole dependency neighbourhoods together with the vertex. The observations then have to be weighted and re-scaled.
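The sampling step can be sketched as below; the toy graph and values are illustrative, and the subsequent weighting and re-scaling of the estimator is not shown:

```python
# Bootstrapping whole dependency neighbourhoods -- a minimal sketch.
import random

def neighbourhood_sample(values, graph, M, seed=0):
    """Draw M closed neighbourhoods with replacement: each draw picks
    a vertex and returns its observed value together with the values
    of all its neighbours."""
    rng = random.Random(seed)
    vertices = list(graph)
    samples = []
    for _ in range(M):
        v = rng.choice(vertices)
        block = [values[v]] + [values[u] for u in graph[v]]
        samples.append(block)
    return samples

# Toy path graph 0 - 1 - 2 with one observation per vertex.
graph = {0: [1], 1: [0, 2], 2: [1]}
values = [10.0, 20.0, 30.0]
samples = neighbourhood_sample(values, graph, M=4)
```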

11 Regular graph
If all dependency neighbourhoods have the same size, i.e. every vertex has the same degree, then we have a regular graph. If the dependency neighbourhoods are small, then the bootstrap works (we have a numerical bound).

12 Re-weighting
If the graph is not only regular but all pairwise intersections of dependency neighbourhoods also have the same size, say g, then adjust the variance estimate by multiplying by M and dividing by (n - g), where M is the size of the bootstrap sample and n is the original sample size. The same weights apply if the intersections are all empty (g = 0).
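The correction factor from this slide is just M / (n - g); a one-line helper makes the arithmetic explicit (the numbers in the usage line are illustrative):

```python
# Variance re-weighting for a regular dependency graph -- a sketch
# of the M / (n - g) correction from slide 12.
def corrected_variance(naive_var, M, n, g):
    """Multiply the naive variance estimate by M and divide by (n - g),
    where M is the bootstrap sample size, n the original sample size,
    and g the common size of the pairwise neighbourhood intersections."""
    if n <= g:
        raise ValueError("need n > g")
    return naive_var * M / (n - g)

# e.g. naive variance 2.0, M = 10, n = 12, g = 2: factor 10/10 = 1.
v = corrected_variance(2.0, 10, 12, 2)
```

With empty intersections, g = 0 and the factor reduces to M / n.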

13 Weights in k-nearest-neighbour graphs
Place the vertices on a circle and connect each vertex to its k nearest neighbours to the left and to the right, so each vertex has degree 2k. The variance estimator has to be multiplied by M and divided by n - 2k. The covariance part also has to be weighted differently, depending on the size of the dependency neighbourhood overlaps.
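The circle construction is easy to write down; the choice n = 8, k = 2 is illustrative:

```python
# k-nearest-neighbour graph on a circle -- a minimal sketch.
# Each vertex is joined to its k neighbours on either side,
# so the graph is 2k-regular.
def knn_circle_graph(n, k):
    """Adjacency list of the circulant graph on n vertices where
    i and j are joined iff they are at most k apart on the circle."""
    return {
        i: sorted({(i + d) % n for d in range(-k, k + 1) if d != 0})
        for i in range(n)
    }

g = knn_circle_graph(8, 2)
degrees = {len(nbrs) for nbrs in g.values()}  # all equal to 2k = 4
```

For this graph the variance correction factor of the previous slide becomes M / (n - 2k).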

14 Example: Bucky ball

15 Weighted network
For each edge: simulate i.i.d. standard normals and fix a random orientation of the edge. For each vertex: add the normals for edges going into the vertex, and subtract the normals for edges going out of the vertex. Sampling distribution for the variance?
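The construction on this slide can be sketched on a small toy graph (the slides use the bucky-ball graph; the edge list here is an illustrative stand-in):

```python
# Edge-normal construction from slide 15 -- a minimal sketch.
import random

def vertex_sums(edges, n, seed=0):
    """For each edge simulate a standard normal and fix a random
    orientation; at each vertex, add the normals of incoming edges
    and subtract the normals of outgoing edges."""
    rng = random.Random(seed)
    totals = [0.0] * n
    for (u, v) in edges:
        z = rng.gauss(0.0, 1.0)
        if rng.random() < 0.5:   # random orientation of the edge
            u, v = v, u
        totals[v] += z           # edge goes into v
        totals[u] -= z           # edge goes out of u
    return totals

edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
totals = vertex_sums(edges, n=4)
# Each edge contributes +z at one endpoint and -z at the other,
# so the vertex totals always sum to zero.
```

The sampling distribution of the variance of these vertex totals is exactly the question the slide poses.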

16 Realisation

17 Dependency bucky graph

18 Variances

19 Numerical values

20 Summary
Bootstrapping from dependency graphs, where edges indicate dependence, works when the graph is (reasonably) regular, provided that the variance estimates are multiplied by the correction factor. Independent bootstrapping may lead to wrong standard error estimates.

21 Reference S. Holmes and G. Reinert: Stein’s method for the bootstrap. In: Stein’s Method: Expository Lectures and Applications. P. Diaconis and S. Holmes, eds, IMS, Hayward, 2004.

