# Basic Statistics: Probability and Sampling Distributions

Basic Statistics Probability and Sampling Distributions

STRUCTURE OF STATISTICS
STATISTICS has two branches:

* DESCRIPTIVE: tabular, graphical, numerical
* INFERENTIAL: estimation, tests of hypothesis

Inferential Statistics
Inferential statistics describes a population of data using the information contained in a sample. A sample is a portion, or part, of the population of interest. [Diagram: sampling takes us from the population (described by parameters) to a sample (described by statistics); estimating and predicting (inferring) take us from the sample statistics back to the population parameters.]

Everyone makes inferences, why Statistical Inference?
I predict rain today!!! The difference between a fortune teller and a statistician is a “Statement of Goodness”

A statement of goodness is a statement indicating the chance or probability that an inference is wrong. Probability is a statement of one’s belief that an event will happen.

* What are the chances of a head appearing when you toss a coin?
* What are the chances of selecting an Ace of Diamonds from a deck of cards?

Probability Has a Basis in Mathematics
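Consider a coin-toss experiment in which $k$ is the number of heads and $N$ is the number of tosses. For an honest coin, the relative frequency $k/N$ approaches $1/2$ as the number of tosses grows: $P(\text{head}) = \lim_{N \to \infty} k/N = 1/2$.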
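The way the ratio of heads to tosses settles near 1/2 can be illustrated with a small simulation. This is only a sketch and is not part of the original slides; the function name `relative_frequency_of_heads`, the seed, and the toss counts are arbitrary choices made for illustration.

```python
import random

def relative_frequency_of_heads(n_tosses, seed=0):
    """Simulate n_tosses flips of a fair coin and return k / N."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # k = number of heads; each toss is a head with probability 0.5
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

if __name__ == "__main__":
    for n in (10, 100, 10_000, 100_000):
        print(n, relative_frequency_of_heads(n))
```

With few tosses the ratio can sit well away from 1/2; with many tosses it should land close to it.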

Relative Frequency Probability
P(A) = Number of Events of Interest (A) divided by the Total Number of Events. Probability of a Head on the flip of an honest coin = 1/2 (as we saw on the last slide). Probability of drawing the Ace of Diamonds = 1/52 (since there are 52 cards in the deck and only 1 ace of diamonds). There are other elements of probability discussed in the text but we will not be using those portions for this class.

We can put probability in the context of a relative frequency distribution. Recall the example of 10 students who took a 5-point quiz. It turns out that the area under the curve also represents the probability of an event. For example, what is the probability that a student picked at random scored 4 on the quiz? The probability would be .2, the same as the area under the curve!

[Figure: relative frequency histogram of the quiz scores X, with the vertical axis running from .0 to .4 and each of the 10 students contributing an area of .1.]

This idea transfers directly to a normal distribution
What is the probability that a randomly selected person scores at least 650 on the Verbal portion of the GRE? First, we must compute the person’s z-score, which here is $z = 1.5$. From Table A, Column C, we find that .0668 of the area is in the tail beyond $z = 1.5$. Thus, the probability is about .07.
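The tail area for z = 1.5 can also be checked without a printed table. The sketch below uses the complementary error function from Python's standard library; the helper name `upper_tail_probability` is an illustrative choice, not something from the slides.

```python
import math

def upper_tail_probability(z):
    """Return P(Z >= z) for a standard normal variable Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

p = upper_tail_probability(1.5)
print(round(p, 4))  # 0.0668, which rounds to the .07 quoted in the text
```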

Some Notation
(Necessary to Distinguish Between Sample and Population)

| Characteristic | Sample | Population |
|---|---|---|
| Mean | $\bar{X}$ | $\mu$ |
| Standard Deviation | $s$ | $\sigma$ |
| Sample Size | $n$ | $N$ |

In descriptive statistics, the differentiation is not important as the sample and population numerical measures are the same.

Sampling
A key to statistical inference is the assumption that the sample is representative of the population from which it was drawn. Random sampling ensures that each possible sample has an equal chance of being selected and that all members of the population have an equal chance of being selected into the sample. We will assume that all samples are selected in a random fashion. Please note that a bigger sample is not a better one unless it is representative.
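As a minimal sketch of simple random sampling, the snippet below draws 10 members from a hypothetical population of 100 numbered members. The population, the sample size, and the seed are all assumptions made for illustration only.

```python
import random

population = list(range(1, 101))  # hypothetical population: member IDs 1..100
rng = random.Random(42)           # fixed seed so the illustration is repeatable

# Simple random sample without replacement: every member has the same
# chance of being chosen, and every possible sample of 10 is equally likely.
sample = rng.sample(population, k=10)
print(sample)
```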

Introduction to Inferential Statistics
How do we get from a sample to a prediction about a population? In statistics, we use a sampling distribution to infer the characteristics of the population.

Some Definitions

* Sampling Distribution: A distribution of statistics obtained by selecting all possible samples of a specific size from a population.
* Sampling Error: The discrepancy between the statistic obtained from the sample and the parameter for the population.
* Standard Error: An estimate of how much error, on average, should exist between the statistic and the parameter. It is a measure of chance and is the standard deviation of the sampling distribution.

What is the shape of the sampling distribution and can we describe it in terms of mean and standard deviation (standard error)? YES! The answer is the Central Limit Theorem (CLT).

Central Limit Theorem Applied to Means
For any population with mean $\mu$ and standard deviation $\sigma$, the distribution of sample means for sample size $n$ ($n > 30$) will have a mean of $\mu$ and a standard deviation (standard error) of $\sigma/\sqrt{n}$, and will approach a normal distribution as $n$ approaches infinity.

The Central Limit Theorem Recap
Regardless of the mean of the population, the mean of the distribution of sample means (sampling distribution) will equal the population mean. Regardless of the SD of the population, the SD of the sampling distribution will equal the population SD divided by the square root of the sample size. Regardless of the shape of the population, the shape of the sampling distribution will be approximately normal when the sample size is large.
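These three claims can be checked informally by simulation. The sketch below assumes a deliberately skewed population (exponential with rate 1, so its mean and SD are both 1) and a sample size of 36; the population, the sample size, the number of samples, and the function name are illustrative assumptions, not part of the original slides.

```python
import random
import statistics

def simulate_sample_means(n, n_samples, seed=1):
    """Draw n_samples samples of size n from a skewed population and
    return the list of their means."""
    rng = random.Random(seed)
    # Assumed population: exponential with rate 1, so mu = 1 and sigma = 1.
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(n_samples)
    ]

n = 36
means = simulate_sample_means(n, n_samples=5000)
print("mean of sample means:", statistics.fmean(means))   # should be near mu = 1
print("SD of sample means:  ", statistics.pstdev(means))  # should be near 1 / sqrt(36)
```

A histogram of `means` would look roughly bell-shaped even though the population itself is strongly skewed.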

An Example of a Sampling Distribution of the Means
First, assume that we have a population consisting of only four numbers, N = 4: (2, 4, 6, 8). Next, we will take all possible samples of size n = 2 from this population, sampling with replacement so that the same number can be picked twice. We will calculate the mean of each sample that we obtain. Finally, we will plot the means in a frequency histogram.

Our Population Parameters
$\mu = \frac{\sum X}{N} = 20 / 4 = 5$ and $\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}} = \sqrt{20 / 4} = 2.236$

All Possible Samples (n = 2) from our Population
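| First Pick | Second Pick | Mean |
|---|---|---|
| 2 | 2 | 2 |
| 2 | 4 | 3 |
| 2 | 6 | 4 |
| 2 | 8 | 5 |
| 4 | 2 | 3 |
| 4 | 4 | 4 |
| 4 | 6 | 5 |
| 4 | 8 | 6 |
| 6 | 2 | 4 |
| 6 | 4 | 5 |
| 6 | 6 | 6 |
| 6 | 8 | 7 |
| 8 | 2 | 5 |
| 8 | 4 | 6 |
| 8 | 6 | 7 |
| 8 | 8 | 8 |

These are all possible samples (16) of size n = 2 and the means of those samples that can be taken from our population of N = 4 objects.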
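The 16 samples and their means can be enumerated programmatically. The sketch below assumes ordered picks with replacement, which is what produces 4 × 4 = 16 samples; the variable names are illustrative.

```python
from collections import Counter
from itertools import product

population = [2, 4, 6, 8]

# All ordered samples of size n = 2, drawn with replacement: 4 * 4 = 16.
samples = list(product(population, repeat=2))
means = [(first + second) / 2 for first, second in samples]

for (first, second), mean in zip(samples, means):
    print(first, second, mean)

# Frequency of each sample mean, i.e. the heights of the histogram bars.
frequencies = Counter(means)
for value in sorted(frequencies):
    print(value, "#" * frequencies[value])
```

The second loop prints a text version of the frequency histogram: the mean 5 occurs four times, while 2 and 8 occur once each.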

Frequency Histogram of Means
All 16 means plotted

Calculate the Mean of the Means and Standard Deviation of the Means
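The calculation can be reconstructed from the 16 sample means (2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 7, 7, 8). The working below is a reconstruction rather than the original slide content, and the symbols $\mu_{\bar{X}}$ and $\sigma_{\bar{X}}$ for the mean and standard deviation of the sample means are assumed notation.

```latex
\mu_{\bar{X}} = \frac{\sum \bar{X}}{16} = \frac{80}{16} = 5
\qquad
\sigma_{\bar{X}} = \sqrt{\frac{\sum (\bar{X} - \mu_{\bar{X}})^2}{16}}
                 = \sqrt{\frac{40}{16}} = \sqrt{2.5} \approx 1.58
```

The sum of squared deviations is 40 because the means 2 and 8 each contribute 9, the two 3s and two 7s contribute 4 each, the three 4s and three 6s contribute 1 each, and the four 5s contribute nothing.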

The Central Limit Theorem
Recall that the Central Limit Theorem states that the mean of the sampling distribution of the means would be equal to $\mu$. From the previous slide we calculated the mean of the sampling distribution of our 16 means to be 5, which is the population mean we calculated earlier. Also, recall that the Central Limit Theorem states that the standard deviation (standard error) would be equal to $\sigma/\sqrt{n}$. If we divide the standard deviation we calculated for our population ($\sigma = 2.236$) by the square root of our sample size ($n = 2$), we obtain $2.236/\sqrt{2} = 1.58$, which is exactly what we calculated the standard deviation of our 16 sample means to be.
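This agreement can be verified numerically for the four-number population. The sketch below assumes the samples are ordered pairs drawn with replacement and uses the population form of the standard deviation (dividing by the number of values) throughout; the variable names are illustrative.

```python
import math
import statistics
from itertools import product

population = [2, 4, 6, 8]
n = 2

mu = statistics.fmean(population)      # population mean
sigma = statistics.pstdev(population)  # population SD (divides by N)

# Means of all 16 ordered samples of size n drawn with replacement.
sample_means = [statistics.fmean(s) for s in product(population, repeat=n)]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.pstdev(sample_means)

print(mu, mean_of_means)                                      # 5.0 5.0
print(round(sigma / math.sqrt(n), 2), round(sd_of_means, 2))  # 1.58 1.58
```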

Summary of Central Limit Theorem
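$\mu_{\bar{X}} = \mu = 20 / 4 = 5$, $\sigma = 2.236$, and $\sigma_{\bar{X}} = \sigma / \sqrt{n} = 2.236 / \sqrt{2} = 1.58$. Note the symbols used to denote the mean of the means, $\mu_{\bar{X}}$, and the standard error of the mean, $\sigma_{\bar{X}}$.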

Inferential Statistics
We will use our knowledge of the sampling distribution of the means, with mean $\mu_{\bar{X}} = \mu$ and standard error $\sigma_{\bar{X}} = \sigma/\sqrt{n}$, given by the Central Limit Theorem as the basis for inferential statistics. We also will use our ability to locate a single score in a distribution using z scores in hypothesis testing.
