 # Introduction to Hypothesis Testing

What is a Hypothesis Test?
A hypothesis test is a statistical method that uses sample data to evaluate a hypothesis about a population.

Falsifiability
A good hypothesis is one that is falsifiable. You cannot prove something that cannot be disproved. Better yet, you cannot support a hypothesis if you cannot disconfirm it. What are some examples of hypotheses that cannot be falsified? What are examples of ones that can?

What Are The Steps For Hypothesis Testing?
First we state the null hypothesis H0. What is the null hypothesis? This states that in the general population there is no change, no difference, or no relationship. Basically it says the opposite of what we are hoping to show.

Hypothesis Testing Continued
Then we state the alternative hypothesis. What is the alternative hypothesis? This states that there is a change, a difference, or a relationship for the general population. This is where we state what we believe (hypothesize) to be true.

Why Do We Do This?
There is no way to PROVE a hypothesis. You can only support a hypothesis or reject it. If you support it 100,000 times and then reject it on the 100,001st time, the hypothesis is not true. So we seek to reject the null and, by doing so, support the alternative.

The Next Step (Hypothesis Testing)
Set the evaluation criteria. By this, we mean deciding on an acceptable level of error due to chance. What do we think is an acceptable probability of concluding that the data we are looking at are “different” when the difference is only due to chance?

Alpha Levels
Usually we use α = .05. The alpha level, or the level of significance, is a probability value that is used to define the very unlikely sample outcomes if the null hypothesis is true. With α = .05, we would expect to obtain such an “outlier” sample in only 5% of samples simply by chance when the null hypothesis is true. We reject the null when the p-value for our sample is no larger than .05, that is, when a difference at least this large would occur by chance no more than 5% of the time if the null hypothesis were true.

Critical Region
The critical region is composed of extreme sample values that are very unlikely to be obtained if the null hypothesis is true. The boundaries for the critical region are determined by the alpha level. If sample data fall in the critical region, the null hypothesis is rejected.
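As a rough illustration of how alpha sets the boundaries, the sketch below finds the two-tailed critical z value from the standard normal distribution. The function names are hypothetical, and a two-tailed test is assumed.

```python
from statistics import NormalDist


def two_tailed_critical_z(alpha: float) -> float:
    """Positive z boundary that leaves alpha / 2 in each tail (two-tailed test assumed)."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def in_critical_region(z: float, alpha: float = 0.05) -> bool:
    """True when z is at least as extreme as the critical boundary."""
    return abs(z) >= two_tailed_critical_z(alpha)


print(round(two_tailed_critical_z(0.05), 2))  # 1.96
```

With α = .05 the boundaries are z = ±1.96, so any sample whose z statistic is beyond those values falls in the critical region.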

The Next Step (Hypothesis Testing)
Collect data. Compute sample statistics. Compute the test statistic.

What Are the Relevant Sample Statistics?
The mean (M) and the sample size (n).

What Are the Relevant Population Parameters?
μ and σ. Where do we get these parameters? μ comes from the null hypothesis. σ is the known standard deviation of the population before treatment.

Now We Calculate the Test Statistic
The formula for a z-statistic is z = (M − μ) / σM, where σM is the standard error of the mean. First we calculate σM = σ / √n. Then we use the values to get z. Finally we make a decision based on the z statistic and the alpha level we have chosen.
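The two-step calculation can be sketched in a few lines of Python. The function names and the example numbers (M = 53, μ = 50, σ = 10, n = 25) are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from math import sqrt


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)


def z_statistic(M: float, mu: float, sigma: float, n: int) -> float:
    """z = (M - mu) / standard error."""
    return (M - mu) / standard_error(sigma, n)


# Hypothetical values: standard error = 10 / 5 = 2, so z = 3 / 2
print(z_statistic(M=53, mu=50, sigma=10, n=25))  # 1.5
```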

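Decision?
Given the calculations we have performed and the alpha level chosen, are we going to reject the null hypothesis? If the z statistic falls in the critical region, we reject the null. Otherwise we fail to reject the null; we do not say that we accept it.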

Error
α and β

Type I and Type II
Type I: Occurs when a researcher rejects a null hypothesis that is actually true. Occurs at the rate of the alpha level we set.

Type II: Occurs when a researcher fails to reject a null hypothesis that is really false. No easy calculation. How do we know if we have made this type of error? It is NOT the converse of Type I. We must estimate beta.
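One way to see that Type I errors occur at the rate of alpha is to simulate many samples from a population where the null hypothesis really is true and count how often a two-tailed z-test rejects it. This is only a sketch: the population values (μ = 100, σ = 15), the sample size, the seed, and the function name are all hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def type_i_error_rate(mu=100.0, sigma=15.0, n=16, alpha=0.05,
                      trials=10_000, seed=0):
    """Estimate how often a two-tailed z-test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sigma / sqrt(n)
    rejections = 0
    for _ in range(trials):
        # The sample is drawn from the null population, so any rejection is a Type I error.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        z = (fmean(sample) - mu) / se
        if abs(z) >= z_crit:
            rejections += 1
    return rejections / trials


rate = type_i_error_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```

The estimate should land close to the chosen alpha of .05, with some run-to-run variation if the seed is changed.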

Practice (page 241)
μ = 18, σ = 4, n = 16, M = 15, α = .05
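A possible worked solution for these practice values is sketched below. The slide does not say whether the test is one-tailed or two-tailed, so a two-tailed test is assumed here; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values from the practice problem
mu, sigma, n, M, alpha = 18, 4, 16, 15, 0.05

standard_error = sigma / sqrt(n)                  # 4 / 4 = 1.0
z = (M - mu) / standard_error                     # (15 - 18) / 1.0 = -3.0
z_crit = NormalDist().inv_cdf(1 - alpha / 2)      # about 1.96 (two-tailed, assumed)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))      # two-tailed p-value
reject_null = abs(z) >= z_crit

print(z, reject_null)  # -3.0 True
```

Because z = −3.0 is beyond the boundary of −1.96, the sample falls in the critical region and, under the two-tailed assumption, the null hypothesis is rejected.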

What Is Meant by Significance?
A result is said to be significant or statistically significant if it is very unlikely to occur when the null hypothesis is true. That is, the result is sufficient to reject the null hypothesis. What factors influence significance? The size of the difference. The variability of the scores. The number of scores in the sample. Is there a difference between significance and meaningfulness?

Assumptions For Hypothesis Tests With z-Scores
All statistical tests are based on a certain set of assumptions that, when violated, may bias the statistic and give us misleading results. The assumptions for hypothesis tests with z-scores are: random sampling; independent observations; the value of sigma is unchanged by the treatment; and a normal sampling distribution.

Random Sampling
It is assumed that the subjects used to obtain the sample data were selected randomly.

Independent Observations
The values in the sample must consist of independent observations. Two events are independent if the occurrence of the first event has no effect on the probability of the second event.

Sigma Unchanged
Because sigma for the treated population is unknown, we must make an assumption. We assume that the standard deviation for the unknown population (after treatment) is the same as it was for the population before the treatment. In other words, the treatment affects the mean, not the standard deviation.

Normal Sampling Distribution
The distribution of sample means must be normal since we have been using the unit normal table to identify probabilities

Directional Hypothesis Tests
In a directional hypothesis test, or a one-tailed test, the statistical hypotheses (H0 and H1) specify either an increase or a decrease in the population mean score. That is, they make a statement about the direction of the effect. The critical region then lies entirely in one tail: the whole of alpha is placed in that tail instead of being split between two, so for α = .05 the boundary moves from z = ±1.96 to z = 1.645 (or −1.645, depending on the direction).
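A one-tailed test places all of alpha in a single tail, so its critical boundary sits closer to the mean than the two-tailed boundary does. The sketch below compares the two for α = .05; the function name and the `tails` parameter are hypothetical.

```python
from statistics import NormalDist


def critical_z(alpha: float = 0.05, tails: int = 2) -> float:
    """Positive critical z value for a one-tailed (tails=1) or two-tailed (tails=2) test."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # Each tail of the critical region holds alpha / tails of the distribution.
    return NormalDist().inv_cdf(1 - alpha / tails)


print(round(critical_z(0.05, tails=2), 3))  # 1.96
print(round(critical_z(0.05, tails=1), 3))  # 1.645
```

For a test that predicts a decrease, the same one-tailed boundary is used with a negative sign.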

Effect Size
Demonstrating a significant treatment effect does not necessarily indicate a substantial treatment effect. This is because the test looks at the difference between the sample mean and the population mean relative to the standard error. What if n is very large, or sigma is very small? Then the standard error is small, and even a small difference in the means may be significant.
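To make the point concrete, the sketch below holds a small mean difference fixed and lets only the sample size grow. The numbers (M = 100.5, μ = 100, σ = 10) are hypothetical.

```python
from math import sqrt


def z_statistic(M: float, mu: float, sigma: float, n: int) -> float:
    """z = (M - mu) / (sigma / sqrt(n))."""
    return (M - mu) / (sigma / sqrt(n))


# Same half-point difference each time; only n changes.
for n in (25, 100, 10_000):
    z = z_statistic(M=100.5, mu=100, sigma=10, n=n)
    print(n, round(z, 2))
```

With n = 25 or n = 100 the z statistic is well inside ±1.96, but with n = 10,000 the same half-point difference gives z = 5, which is far into the critical region.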

Cohen’s d
One of the simplest and most direct methods for measuring effect size is Cohen’s d.
Cohen’s d = (mean difference) / (standard deviation)
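For a z-test, the mean difference can be taken as M − μ and the standard deviation as σ. A minimal sketch, using the practice values given earlier (μ = 18, σ = 4, M = 15) and a hypothetical function name:

```python
def cohens_d(M: float, mu: float, sigma: float) -> float:
    """Cohen's d = mean difference / standard deviation."""
    return (M - mu) / sigma


d = cohens_d(M=15, mu=18, sigma=4)
print(d)  # -0.75
```

Unlike the z statistic, d does not depend on the sample size, which is why it is useful as a measure of how large the effect is.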

Power
The power of a statistical test is the probability that the test will correctly reject a false null hypothesis. That is, power is the probability that the test will identify a treatment effect if one really exists. What is the relation between power and error? Power = 1 − β, where β is the probability of a Type II error.

What Affects Power?
Effect size, sample size, alpha level, and the number of tails in the test.
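The influence of each factor can be explored with a small power calculation for a z-test. This is a sketch under stated assumptions: the effect size is expressed as Cohen's d, the population is normal with known sigma, and a one-tailed test is assumed to point in the direction of the true effect. The function name and parameters are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(d: float, n: int, alpha: float = 0.05, tails: int = 2) -> float:
    """Probability that a z-test rejects the null when the true effect size is d."""
    std_normal = NormalDist()
    # Under the alternative, the z statistic is centered at |d| * sqrt(n).
    shift = abs(d) * sqrt(n)
    if tails == 2:
        z_crit = std_normal.inv_cdf(1 - alpha / 2)
        upper = 1 - std_normal.cdf(z_crit - shift)   # rejections in the tail of the effect
        lower = std_normal.cdf(-z_crit - shift)      # rare rejections in the opposite tail
        return upper + lower
    if tails == 1:
        z_crit = std_normal.inv_cdf(1 - alpha)
        return 1 - std_normal.cdf(z_crit - shift)
    raise ValueError("tails must be 1 or 2")


print(round(z_test_power(d=0.5, n=16), 3))
print(round(z_test_power(d=0.5, n=64), 3))
```

Calling the function with a larger d, a larger n, a larger alpha, or `tails=1` gives higher power in each case, and 1 minus the returned value is the corresponding estimate of β.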