Sampling distribution

Do not ‘read’ this. It is meant to be watched only.

[Diagram: a population (any and usually undefinable N; parameters μ, σ) and one sample drawn from it (sample size = N).]
Start with just a single random sample from the population.

[Diagram: the same population, now with five samples drawn from it, each of sample size = N.]
All these hypothetical samples have the same N, but different parameter estimates (in this case, mean and standard deviation) for each sample.

Note the sample means.

Concerning the means, we can think about a distribution of them, and how it would take shape.

If we were to create a sampling distribution, the distribution of means would have its own mean equal to μ and a standard deviation (the standard error) of σ/sqrt(N), and, with a large N, it would be approximately normal. This is the Central Limit Theorem.
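To watch this take shape, here is a minimal simulation sketch in R. Everything in it is an assumption made for illustration: the population is taken to be exponential (deliberately skewed), and the values of `mu`, `n` and `reps` are made up.

```r
# Simulate a sampling distribution of the mean (illustrative values only).
set.seed(1)                  # for reproducibility
mu    <- 10                  # hypothetical population mean
sigma <- 10                  # for an exponential population, sd equals the mean
n     <- 30                  # size of each sample
reps  <- 10000               # number of hypothetical samples

# Draw `reps` samples of size n from a skewed population; keep each mean.
sample_means <- replicate(reps, mean(rexp(n, rate = 1 / mu)))

mean(sample_means)           # should be close to mu
sd(sample_means)             # should be close to sigma / sqrt(n)
sigma / sqrt(n)              # theoretical standard error
```

Running `hist(sample_means)` afterwards should show a roughly bell-shaped histogram, even though the population itself is far from normal.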

We don’t actually do this though (we only have our one sample and its mean), so such a distribution is theoretical. Its mean, i.e. the population mean μ, is posited by the null hypothesis. Its standard deviation would usually be estimated from our sample standard deviation s, as s/sqrt(N).

The question now is, given this distribution, what is the probability of obtaining my sample estimate? It is the conditional probability, p(D|H0): given the null hypothesis, what is the probability of obtaining this estimate or one more extreme? Knowing the properties of the sampling distribution (e.g. that it is normal) allows us to obtain these probabilities.
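As a rough sketch of that calculation in R, the data and the null value `mu0` below are made up. It uses the t distribution rather than the normal, which is the usual choice when σ is estimated by s.

```r
# p(D | H0) for a sample mean, computed by hand (made-up data).
set.seed(2)
mu0 <- 100                               # hypothetical null value for mu
x   <- rnorm(25, mean = 104, sd = 15)    # our one (simulated) sample
n   <- length(x)

se     <- sd(x) / sqrt(n)                # estimated standard error, s / sqrt(N)
t_stat <- (mean(x) - mu0) / se           # distance from mu0 in standard errors

# Probability of an estimate this extreme or more, in either direction.
p_two_sided <- 2 * pt(-abs(t_stat), df = n - 1)
p_two_sided

# The built-in one-sample t test should give the same p-value.
t.test(x, mu = mu0)$p.value
```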

[Diagram: a sampling distribution with a Lower Limit and an Upper Limit marked.]
Alternatively, we could assume a sampling distribution centered on our estimate, obtain lower and upper limits with some confidence, and see if this confidence interval contains the population value proposed by the null hypothesis.
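A minimal sketch of this interval approach in R, again with made-up data and a hypothetical null value `mu0`, and again using the t distribution because σ is estimated by s:

```r
# 95% confidence interval centered on the sample mean (made-up data).
set.seed(3)
mu0 <- 100                               # hypothetical null value for mu
x   <- rnorm(25, mean = 104, sd = 15)    # our one (simulated) sample
n   <- length(x)

se   <- sd(x) / sqrt(n)                  # estimated standard error
crit <- qt(0.975, df = n - 1)            # critical value for 95% confidence

ci <- c(lower = mean(x) - crit * se,
        upper = mean(x) + crit * se)
ci

# Does the interval contain the value proposed by the null hypothesis?
contains_null <- unname(ci["lower"] <= mu0 && mu0 <= ci["upper"])
contains_null
```

If the interval excludes `mu0`, the two-sided test at the .05 level rejects the null, and vice versa; the two approaches are two views of the same sampling distribution.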

Now what is bootstrapping again?

[Diagram: the original sample of size N, with five bootstrap samples drawn from it, each of boot sample size = N.]
We get sample means from sampling with replacement from our original sample.
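Resampling by hand might look something like the sketch below. The original sample `x` is simulated here and the number of bootstrap samples `n_boot` is arbitrary. This uses only base R and is not the lab's own code.

```r
# Bootstrap the mean by hand (made-up original sample).
set.seed(4)
x      <- rexp(40, rate = 1 / 10)    # stands in for our one observed sample
n      <- length(x)
n_boot <- 2000                       # number of bootstrap samples

# One bootstrap sample: same size as the original, drawn WITH replacement,
# so some observations appear more than once and others not at all.
one_boot <- sample(x, size = n, replace = TRUE)
anyDuplicated(one_boot) > 0          # almost certainly TRUE

# Repeat many times, keeping the mean of each bootstrap sample.
boot_means <- replicate(n_boot, mean(sample(x, size = n, replace = TRUE)))

mean(boot_means)                     # close to mean(x), the original estimate
sd(boot_means)                       # bootstrap estimate of the standard error
```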

Example in R. Most of the time that bootstrapping is used, this is done for you, but it’s good to demonstrate it for yourself. After obtaining the bootstrap distribution (e.g. as we have done in lab previously):

    quantile(data, c(.025, .975))  # From the lab example, data would be bootmean$thetastar

[Diagram: the bootstrap distribution with a Lower Limit and an Upper Limit marked.]
Now we have an empirical sampling distribution, not assumed, but based on our sample data. By finding the appropriate quantiles, we can obtain a confidence interval as we did before (e.g. at .025 and .975 for a 95% CI), but without having to assume a theoretically normal distribution.
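To see the two kinds of interval side by side, here is a small self-contained sketch. The skewed sample `x` is simulated, and the bootstrap distribution is rebuilt with base R rather than taken from the lab's `bootmean$thetastar`.

```r
# Percentile bootstrap CI versus the normal-theory (t) CI (made-up data).
set.seed(5)
x <- rexp(40, rate = 1 / 10)         # a skewed, simulated sample

# Empirical sampling distribution of the mean, built from the sample itself.
boot_means <- replicate(5000, mean(sample(x, replace = TRUE)))

perc_ci <- quantile(boot_means, c(.025, .975))   # 95% percentile interval
t_ci    <- t.test(x)$conf.int                    # 95% interval assuming normality

perc_ci
t_ci
```

With skewed data the percentile limits need not sit symmetrically around the sample mean, whereas the t interval is symmetric by construction.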

Summary
Hopefully this will help keep some of this straight, but don’t worry if it takes a while to get sampling distributions etc. down. It’s a different way to think about things and will take some getting used to.
