Hypothesis Testing and Confidence Intervals (Part 1): Using the Standard Normal Lecture 8 Justin Kern October 10 and 12, 2017.


1 Hypothesis Testing and Confidence Intervals (Part 1): Using the Standard Normal
Lecture 8 Justin Kern October 10 and 12, 2017

2 Inferential Statistics
Hypothesis Testing: one sample mean / proportion; two sample means / proportions; ANOVA (more than two sample means); Chi-Square (Goodness of Fit, Independence), if we have time. Confidence Intervals. Correlation and Regression: relationship between two variables.

3 Inferential Statistics
In real life, we usually do not know the true characteristics of the population of interest. What is the mean weight of teenagers in the US? How many hours do UIUC students spend studying per week on average? What proportion of people in Europe suffer from depression? In order to find out something about the population, we conduct a research study by collecting data from a sample from the population. We collect data from a sample because it is almost always impossible or highly impractical to collect data from the entire population. This is actually the whole point of statistics: we want to infer something about the population by analyzing data from a sample from that population. Samples should be representative of the population of interest. Proper representation is generally achieved by taking a random sample.

4 Hypothesis testing
Suppose you are a researcher making some claim (i.e., a hypothesis). To evaluate this claim, it is necessary to collect data and then test the claim (using the data) against some sort of benchmark. To make this rigorous, it is necessary to define the hypothesis quantitatively. Example: Suppose a researcher claims that a new early intervention technique increases the learning capabilities of autistic children. This can be evaluated using the mean of test scores for autistic children. The test score mean for kids getting the new intervention technique is $\mu_1$. The test score mean for kids getting the older intervention technique is $\mu_2$. If $\mu_1$ is greater than $\mu_2$, then the claim can be supported. Unfortunately, we do not know $\mu_1$ or $\mu_2$, so they must be estimated. As we now know, estimation involves uncertainty in the estimate, so that must be taken into account.

5 Hypothesis testing
The standard way to evaluate claims is by using the hypothesis testing (or significance testing) framework. Definition: Hypothesis testing is a method for testing a claim or hypothesis about a parameter in a population, using data measured in a sample. In this method, we test some hypothesis by determining the likelihood that a sample statistic could have been selected, if the hypothesis regarding the population parameter were true. To make this precise, we need to have a standard framework for making decisions about whether the data support a hypothesis or not.

6 Mean-Centered Variable
Suppose we have $n$ observations on a variable, $x_1, \ldots, x_n$. Take one observation, say the $i$th value, $x_i$. How can we compare it to the rest of the dataset? One way to compare data is to compare the distance of observations from their mean. Thus, if we mean-center our variable, we now have a new variable: $y_i = x_i - \bar{x}$. The mean of this variable is 0. The variance of this variable is $s_x^2$. Note that all observations are now expressed as distances from the mean, which allows for improved comparison of observations. A positive $y_i$ means that $x_i$ is above the mean. A negative $y_i$ means that $x_i$ is below the mean. The magnitude of the distance is hard to interpret, though, as it depends on the variability in the dataset.

7 Mean-Centered Variable
Example: Take values of $x$ as 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10. Here $\bar{x} = 5$, with low variability ($s_x = 3.317$). Mean-centered values: -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5. Now take values of $x$ as -20, -15, -10, -5, 0, 5, 10, 15, 20, 25, 30, with high variability ($s_x = 16.583$). Mean-centered values: -25, -20, -15, -10, -5, 0, 5, 10, 15, 20, 25. In the first set, being 5 below the mean means you are very far from the mean. In the second set, being 5 below the mean means you are not far from the mean. Thus, to compare values and have a sense of placement within a dataset, variability must be handled properly.
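
The two datasets in this example can be checked with a few lines of Python. This is only a sketch using the standard library; the helper name mean_center is an assumption made for illustration and is not part of the lecture.

```python
from statistics import mean, stdev


def mean_center(values):
    """Return each value minus the sample mean (hypothetical helper)."""
    m = mean(values)
    return [v - m for v in values]


low_var = list(range(0, 11))         # 0, 1, ..., 10
high_var = list(range(-20, 31, 5))   # -20, -15, ..., 30

for data in (low_var, high_var):
    centered = mean_center(data)
    # stdev uses the n - 1 (sample) denominator, matching s_x on the slide
    print(centered, round(stdev(data), 3))
```

Centering shifts every value by the same amount, so the standard deviation of the centered values is the same as that of the original values.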

8 Standardized Variable
To handle this issue, we can simply divide the mean-centered observations by the standard deviation of the data: $z_i = \frac{x_i - \bar{x}}{s_x}$. Note: The mean of $z$ is 0. The sd of $z$ is 1. We call $z$ here a standardized variable, or a z-score. Each value of $z$ describes the number of standard deviations above or below the mean that a given observation is.

9 Standardized Variable (Example)
The mean and standard deviation of an IQ test are 100 and 15, respectively. What is the z-score associated with an IQ score of 140? $z = \frac{140 - 100}{15} = \frac{40}{15} \approx 2.67$. An IQ of 140 is 2.67 standard deviations above the mean! What is the z-score associated with an IQ score of 90? $z = \frac{90 - 100}{15} = \frac{-10}{15} \approx -0.67$. An IQ of 90 is 0.67 standard deviations below the mean!
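
The IQ calculation can be reproduced with a small function. This is a minimal sketch; the function name z_score and its argument names are assumptions made for illustration.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


print(round(z_score(140, 100, 15), 2))  # 2.67
print(round(z_score(90, 100, 15), 2))   # -0.67
```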

10 The Central Limit Theorem (CLT)
Suppose that we draw a simple random sample of size $n$ from any population distribution (with finite mean and variance). When $n$ is "large enough," the Central Limit Theorem states that the sampling distribution of the sample mean $\bar{X}$ is approximately normal. That is, $\bar{X}_n \sim N\left(\mu_x, \frac{\sigma_x^2}{n}\right)$, as $n \to \infty$. If we standardize this result, then we find that as $n \to \infty$, $\frac{\bar{X}_n - \mu_x}{\sigma_x / \sqrt{n}} = Z_n \sim N(0, 1)$. "Large enough" is, in general, $n \geq 30$. We will use this result in hypothesis testing. Show CLT with applet.
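
In place of the applet, a small simulation can illustrate the CLT. The sketch below is not from the lecture: it assumes an Exponential(1) population (clearly non-normal, with mean 1 and sd 1), a sample size of 50, and 2000 replicates, all chosen only for illustration.

```python
import random
from math import sqrt
from statistics import mean, stdev


def standardized_sample_means(n, replicates, seed=0):
    """Draw `replicates` samples of size n from Exponential(1) and
    return the standardized sample means (xbar - mu) / (sigma / sqrt(n))."""
    rng = random.Random(seed)
    mu, sigma = 1.0, 1.0  # mean and sd of an Exponential(rate=1) population
    z_values = []
    for _ in range(replicates):
        xbar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        z_values.append((xbar - mu) / (sigma / sqrt(n)))
    return z_values


z = standardized_sample_means(n=50, replicates=2000)

# If the CLT approximation is good, these should be close to 0 and 1.
print(round(mean(z), 2), round(stdev(z), 2))

# Under N(0, 1), about 95% of values fall within +/- 1.96.
share_within = sum(abs(v) <= 1.96 for v in z) / len(z)
print(round(share_within, 3))
```

Rerunning with a smaller n (say 5) should show a visibly worse match, which is the practical meaning of "large enough."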

11 Hypothesis Testing
Hypothesis testing is a process that involves testing tentative guesses (hypotheses) about relationships in a population. It can be viewed as a process of gathering evidence for (or against) a specific claim, typically regarding a research question being studied by a researcher. The researcher is concerned with testing whether or not the hypothesis can be supported empirically. The null hypothesis, denoted by $H_0$, is the hypothesis in question. The researcher tests whether the data support or fail to support the null hypothesis. The opposing hypothesis, called the alternative hypothesis and denoted by $H_1$ or $H_A$, is the hypothesis that is accepted if the data fail to support the null.

12 Hypothesis Testing Procedure
(1) Form a null hypothesis ($H_0$) and an alternative hypothesis ($H_1$). (2) Determine the rules for making a decision (i.e., rules for accepting or rejecting the null hypothesis). (3) Gather data! This is your evidence to support or reject $H_0$. (4) Use laws of probability/statistical sampling to test $H_0$ vs. $H_1$, using the appropriate test statistic for testing $H_0$. (5) Accept or reject $H_0$ based on the decision rules.

13 Hypothesis Testing: Hypotheses
The null hypothesis is usually specified so that the null parameter is equal to a specific value, or so that two parameters are equal. Example: $H_0: \mu = \mu_0 = 5$, or $H_0: \mu_1 = \mu_2$. We will often call the null hypothesis the hypothesis of no effect. The alternative hypothesis may be either of the following. One-tailed (or one-sided): The alternative specifies a region either to the left (<) or to the right (>) of the null parameter. E.g., $H_1: \mu > \mu_0$. Two-tailed (or two-sided, or non-directional): The alternative specifies a region to both the left and the right of the null parameter. E.g., $H_1: \mu \neq \mu_0$.

14 Decision Rules and Errors
Decision rules are a set of guidelines that we use to determine (based on our gathered sample data) whether we should accept or reject $H_0$.
Type I Error: $P(\text{reject } H_0 \mid H_0 \text{ true}) = \alpha$ (also called a false positive)
Type II Error: $P(\text{accept } H_0 \mid H_1 \text{ true}) = \beta$ (also called a false negative)
Power: $P(\text{reject } H_0 \mid H_1 \text{ true}) = 1 - \beta$ (also called a true positive)
Correct negative: $P(\text{accept } H_0 \mid H_0 \text{ true}) = 1 - \alpha$
For hypothesis testing, only Type I error can be set manually. By convention, it is common to set it as $\alpha = 0.05$ or $0.01$. One should really try to find an appropriate balance between Type I and Type II errors, but this is complicated. Type II error is a function of effect size, sample size, and Type I error.
             | $H_0$ true                | $H_1$ true
Reject $H_0$ | Type I error ($\alpha$)   | Power ($1 - \beta$)
Accept $H_0$ | Correct ($1 - \alpha$)    | Type II error ($\beta$)
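
To make concrete the statement that Type II error depends on effect size, sample size, and $\alpha$, here is a hedged sketch for one specific case: an upper-tailed z-test with known $\sigma$. In that case, power is $1 - \Phi(z_\alpha - \delta\sqrt{n}/\sigma)$, where $\delta$ is the true mean minus $\mu_0$. The function name and the numbers passed to it (effect 2, sd 10) are illustrative assumptions, not values taken from this slide.

```python
from math import sqrt
from statistics import NormalDist


def power_upper_tailed(effect, sigma, n, alpha=0.05):
    """Power of an upper-tailed z-test with known sigma.

    effect is the true mean minus the null mean (hypothetical input).
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)   # critical value
    shift = effect * sqrt(n) / sigma          # mean of Z when H1 is true
    return 1 - std_normal.cdf(z_alpha - shift)


# Power rises (and beta = 1 - power falls) as the sample size grows.
for n in (10, 26, 100):
    power = power_upper_tailed(effect=2, sigma=10, n=n)
    print(n, round(power, 3), round(1 - power, 3))
```

With an effect of 0, the function returns $\alpha$: when $H_0$ is true, the rejection probability is the Type I error rate.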

15 Critical Values
A critical value is the value $z_\alpha$ such that if our observed z statistic, $z_{obs}$, is "more extreme" than $z_\alpha$, we reject $H_0$. The region beyond the critical value (above it for a right-tailed alternative, below it for a left-tailed alternative) is called the critical region. This region is determined by the value of $\alpha$, as well as by the hypotheses being tested. As mentioned, the convention is to set $\alpha = 0.05$ or $0.01$. We find the critical value $z_\alpha$ that corresponds to this $\alpha$-level.

16 Critical Values
Suppose the hypotheses are $H_0: \mu = \mu_0$ vs. $H_1: \mu \neq \mu_0$. Here $\mu$ is the true mean and $\mu_0$ is the hypothesized value for the true mean. Since this is two-sided, there are two critical values: $\pm z_{\alpha/2}$. Suppose the hypotheses are $H_0: \mu = \mu_0$ vs. $H_1: \mu < \mu_0$. There is only one critical value: $-z_\alpha$. Suppose the hypotheses are $H_0: \mu = \mu_0$ vs. $H_1: \mu > \mu_0$. There is only one critical value: $z_\alpha$.
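
The three cases above can be looked up with the standard normal quantile function. This is a minimal sketch using Python's statistics.NormalDist; the function name and the labels "two-sided", "less", and "greater" are assumptions made for illustration.

```python
from statistics import NormalDist


def critical_values(alpha, alternative):
    """Standard normal critical value(s) for a z-test at level alpha."""
    quantile = NormalDist().inv_cdf
    if alternative == "two-sided":      # H1: mu != mu_0
        c = quantile(1 - alpha / 2)
        return (-c, c)
    if alternative == "less":           # H1: mu < mu_0
        return (quantile(alpha),)       # equals -z_alpha
    if alternative == "greater":        # H1: mu > mu_0
        return (quantile(1 - alpha),)
    raise ValueError("alternative must be 'two-sided', 'less', or 'greater'")


for alt in ("two-sided", "less", "greater"):
    print(alt, tuple(round(c, 3) for c in critical_values(0.05, alt)))
# two-sided (-1.96, 1.96)
# less (-1.645,)
# greater (1.645,)
```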

17 p-values
A p-value is the probability that a test statistic takes on a value as extreme as or more extreme than (in the direction of the alternative hypothesis) the test statistic's observed value, assuming $H_0$ is true. Say the value of the test statistic is $z$. (The random variable of the test statistic is $Z$.) If the hypotheses are $H_0: \mu \geq \mu_0$ vs. $H_1: \mu < \mu_0$, then $p = P(Z \leq z)$. If the hypotheses are $H_0: \mu \leq \mu_0$ vs. $H_1: \mu > \mu_0$, then $p = P(Z \geq z)$. If the hypotheses are $H_0: \mu = \mu_0$ vs. $H_1: \mu \neq \mu_0$, then $p = P(|Z| \geq |z|) = P(Z \leq -|z|) + P(Z \geq |z|)$. If it is further supposed that the distribution of $Z$ is symmetric, then $p = 2P(Z \geq |z|)$. A small p-value means that the probability of obtaining a test statistic at least as extreme as the one observed (assuming $H_0$ is true) is small. Thus, the smaller the p-value, the more evidence we have against $H_0$. Show examples of the three hypotheses on the board. Put the distribution, show $\alpha$ and $z_\alpha$ (or $z_{\alpha/2}$).
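
The three p-value formulas can be written as one small function. This is a sketch assuming a standard normal test statistic; the function name and the alternative labels are illustrative choices, not notation from the lecture.

```python
from statistics import NormalDist


def p_value(z, alternative):
    """p-value for an observed z statistic under a standard normal null."""
    cdf = NormalDist().cdf
    if alternative == "less":        # H1: mu < mu_0, p = P(Z <= z)
        return cdf(z)
    if alternative == "greater":     # H1: mu > mu_0, p = P(Z >= z)
        return 1 - cdf(z)
    if alternative == "two-sided":   # H1: mu != mu_0, p = 2 P(Z >= |z|)
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")


# The same observed statistic gives different p-values
# depending on the alternative hypothesis.
for alt in ("less", "greater", "two-sided"):
    print(alt, round(p_value(1.5, alt), 4))
```

Using abs(z) in the two-sided branch keeps the result correct when the observed statistic is negative.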

18 Conducting Hypothesis Tests
For conducting hypothesis tests for the population mean, we will use the sample mean statistic. Either normality will be assumed, or we can make use of the CLT (assuming the sample size is large enough). For now, we will also assume the population variance is known. When the population variance is not known, then it must be estimated, which will change the distribution of the test statistic. This will be further explored later.

19 Example
A developmental psychologist claims that a training program he developed according to a theory should improve problem-solving ability. For a population of 7-year-olds, the mean score $\mu$ on a standard problem-solving test is known to be 80 with a standard deviation of 10. To test the training program, 26 7-year-olds are selected at random, and their mean score is found to be 82. Let's assume the population of scores is normally distributed. Can we conclude, at an $\alpha = .05$ level of significance, that the program works? Assume the sd of scores after the training program is also 10.
We hypothesize that the training program improves problem-solving. What are the null and alternative hypotheses? $H_0: \mu \leq 80$ vs. $H_1: \mu > 80$. The data are normally distributed, so $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$. A z-statistic can then be formed: $z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{82 - 80}{10 / \sqrt{26}} \approx 1.0198$.
p-value method: $p = P(Z \geq 1.0198) \approx .15 > .05$. Since $p > \alpha$, we cannot reject $H_0$.
Critical value method: $\alpha = .05 \rightarrow z_\alpha = 1.645$. Since $z = 1.0198 < 1.645 = z_\alpha$, we cannot reject $H_0$.
Substantively, this means that a sample mean of 82 is reasonably likely to occur by chance when the true mean is 80, so we cannot say that the training program improved problem-solving skills in 7-year-olds. Show on the board the duality of using p-values and using the critical value method.
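
The numbers in this example can be checked numerically. The sketch below uses the values stated on the slide (null mean 80, sd 10, n = 26, sample mean 82, $\alpha$ = .05); the variable names are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

mu_0, sigma, n, xbar, alpha = 80, 10, 26, 82, 0.05
std_normal = NormalDist()

# z statistic for the sample mean, with known population sd
z = (xbar - mu_0) / (sigma / sqrt(n))

# p-value method (upper-tailed alternative): p = P(Z >= z)
p = 1 - std_normal.cdf(z)
reject_by_p = p < alpha

# Critical value method: reject when z exceeds z_alpha
z_alpha = std_normal.inv_cdf(1 - alpha)
reject_by_critical_value = z > z_alpha

print(round(z, 4), round(p, 2), round(z_alpha, 3))
print(reject_by_p, reject_by_critical_value)
```

Both methods lead to the same decision, because $p < \alpha$ holds exactly when $z > z_\alpha$ for this upper-tailed test.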

