Introduction to Hypothesis Testing AP Statistics Chap 11-1.

1 Introduction to Hypothesis Testing AP Statistics Chap 11-1

2 Statistical Dilemma AT&T believes the average telephone bill in Columbus, Georgia is $42.05 per month. They take a sample of 100 bills and find that the average value of the sample is $ What does it mean?

3 Hypothesis Testing Compare the sample results to currently accepted facts. Population: it is currently accepted that the mean age is 50. Sample: now select a random sample, and suppose the sample mean is 20. Conclusion: the mean age is lower than thought. How strong is the evidence?

4 What is a Hypothesis? A hypothesis is a theory proposed to explain an observation. Here it is a claim about a population parameter: – population mean. Example: The mean monthly cell phone bill of this city is μ = $42. – population proportion. Example: The proportion of adults in this city with cell phones is p = 0.68.

5 The Null Hypothesis, H0 States the currently accepted fact. Example: The average number of TV sets in U.S. homes is at least three (H0: μ ≥ 3). Is always about a population parameter, not about a sample statistic.

6 The Null Hypothesis, H0 Assume that the null hypothesis is true until there is sufficient evidence to reject it. – Similar to the notion of innocent until proven guilty. Always contains the “=”, “≤” or “≥” sign. May or may not be rejected. – Never proven true or false.

7 The Alternative Hypothesis, HA Is generally the hypothesis that is believed by the researcher based on the sample. Challenges H0. Is the opposite of the null hypothesis – e.g.: The average number of TV sets in U.S. homes is less than 3 (HA: μ < 3). Never contains the “=”, “≤” or “≥” sign. Stated as “≠”, “>” or “<”.

8 Reason for Rejecting H0 Sampling distribution of the statistic x̄, centered at μ = 50 if H0 is true. If it is unlikely that we would get a sample mean of this value if in fact μ = 50 were the population mean, then we reject the null hypothesis that μ = 50.

9 Level of Significance, α Defines unlikely values of the sample statistic if the null hypothesis is true. – Defines the rejection region of the sampling distribution. Is designated by α (level of significance). – Typical values are 0.01, 0.05, or 0.10. Is selected by the researcher at the beginning.

10 Level of Significance and the Rejection Region
Level of significance = α. The rejection region is shaded.
– Lower tail test: H0: μ = 50, HA: μ < 50; area α in the lower tail
– Upper tail test: H0: μ = 50, HA: μ > 50; area α in the upper tail
– Two tailed test: H0: μ = 50, HA: μ ≠ 50; area α/2 in each tail
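The three rejection regions can be turned into cutoff values once a sampling distribution is chosen. The sketch below assumes the test statistic is a standard normal z-score, which the slide does not state; the function name `critical_values` and the tail labels are made up for this illustration.

```python
from statistics import NormalDist

def critical_values(alpha, tail):
    """Return the z cutoff(s) bounding the rejection region.

    ASSUMPTION: the test statistic follows a standard normal distribution.
    tail is "lower", "upper", or "two" (labels chosen for this sketch).
    """
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "lower":
        return (z.inv_cdf(alpha),)        # reject H0 if z < cutoff
    if tail == "upper":
        return (z.inv_cdf(1 - alpha),)    # reject H0 if z > cutoff
    if tail == "two":
        hi = z.inv_cdf(1 - alpha / 2)     # alpha/2 in each tail
        return (-hi, hi)                  # reject H0 if z < -hi or z > hi
    raise ValueError("tail must be 'lower', 'upper', or 'two'")

for tail in ("lower", "upper", "two"):
    print(tail, [round(c, 3) for c in critical_values(0.05, tail)])
```

With α = 0.05 the two tailed cutoffs come out near ±1.96, while a one tailed test puts all of α in a single tail and so uses a cutoff near 1.645 in magnitude.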

11 p-Value Approach to Testing p-value: Probability of obtaining a test statistic at least as extreme as (≤ or ≥) the observed sample value, given H0 is true. – Also called the observed level of significance.

12 p-Value Approach to Testing Obtain the p-value from a computer randomization model. Compare the p-value with α. – If p-value < α, reject H0. – If p-value ≥ α, do not reject H0.

13 Interpreting the p-value, from the smallest p-values to the largest:
– Overwhelming Evidence (Highly Significant)
– Strong Evidence (Significant)
– Weak Evidence (Not Significant)
– No Evidence (Not Significant)

14 Dogs and Owners Pictures were taken of 25 owners and their purebred dogs, selected at random from dog parks. Study participants were shown a picture of an owner together with pictures of two dogs (the owner’s dog and another random dog from the study) and asked to choose which dog most resembled the owner. Of the 25 owners, 16 were paired with the correct dog. Is this convincing evidence that dogs tend to resemble their owners, or just the result of random chance? How extreme is a p̂ of 0.64 if the results are due to random chance?

15 Distribution of sample proportions: p-value = 0.238 for the two tailed test
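A p-value like this one can be checked in two ways. The sketch below computes an exact binomial p-value and a simulated (randomization) p-value for 16 correct matches out of 25, assuming each participant guesses between the two dogs with probability 0.5 under H0. The seed and the number of repetitions are arbitrary choices for this illustration, and a simulated value will differ slightly from run to run and from the slide's figure.

```python
import random
from math import comb

n, correct = 25, 16   # from the slide: 16 of 25 owners matched
p_null = 0.5          # H0: choosing between two dogs is a coin flip

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Exact: with p = 0.5 the distribution is symmetric, so the two tailed
# p-value is twice the upper tail probability P(X >= 16).
upper_tail = sum(binom_pmf(k, n, p_null) for k in range(correct, n + 1))
p_exact = 2 * upper_tail

# Randomization: simulate many samples of 25 guesses under H0 and count
# how often the result is at least as far from 12.5 as the observed 16.
rng = random.Random(0)               # fixed seed so the run is repeatable
reps = 10_000
distance = abs(correct - n * p_null)
extreme = 0
for _ in range(reps):
    x = sum(rng.random() < p_null for _ in range(n))
    if abs(x - n * p_null) >= distance:
        extreme += 1
p_sim = extreme / reps

print("exact:", round(p_exact, 3), "simulated:", round(p_sim, 3))
```

Both values land well above the usual α levels, so a p̂ of 0.64 from only 25 owners is not unusual under random guessing.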

16 Attitude Toward Divorce Do men and women have different views on divorce? A May 2010 Gallup poll of U.S. citizens over the age of 18 asked participants if they view divorce as “morally acceptable”. Of the 1029 adults surveyed, 71% of men and 67% of women responded “yes”. What does the survey indicate? Men and women may differ in opinion. What is the no-change hypothesis? Men and women do not differ in opinion.

17 Is there sufficient evidence that men and women differ?
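One way to approach this question is a two-proportion z-test of H0: p_men = p_women against HA: p_men ≠ p_women. The sketch below is only illustrative: the poll reports the total sample size (1029) but not how many respondents were men or women, so the group sizes `N_MEN` and `N_WOMEN` are an assumed near-even split, and the conclusion could change with the true split.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(p1, n1, p2, n2):
    """Two tailed z-test for H0: p1 = p2 using the pooled proportion."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# ASSUMPTION: the source gives only the total (1029); this split is hypothetical.
N_MEN, N_WOMEN = 515, 514

z, p_value = two_proportion_z(0.71, N_MEN, 0.67, N_WOMEN)
print("z =", round(z, 2), "p-value =", round(p_value, 3))
```

Under this assumed split the p-value is larger than 0.05, so a four-point gap in a sample of this size would not by itself be strong evidence that men and women differ.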

18 Caffeine and Finger Tapping Researchers trained a sample of male college students to tap their fingers at a rapid rate. The sample was then divided at random into two groups of ten students each. Each student drank the equivalent of about two cups of coffee, which included about 200 mg of caffeine for the students in one group but was decaffeinated coffee for the second group. After a two-hour period, each student was tested to measure finger tapping rate (taps per minute). The goal of the experiment was to determine whether caffeine produces an increase in the average tap rate. What are the null and alternative hypotheses?

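19 Hypotheses H0: μ_caffeine = μ_decaf, HA: μ_caffeine > μ_decaf. Or: H0: μ_caffeine − μ_decaf = 0, HA: μ_caffeine − μ_decaf > 0.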

20 Caffeine and Finger Tapping
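A randomization test fits this design because the students were assigned to groups at random: if caffeine has no effect, the group labels are arbitrary and can be reshuffled. The sketch below shows the mechanics only. The tap rates are HYPOTHETICAL numbers invented for this illustration, not the study's data, and the function name, seed, and repetition count are likewise assumptions.

```python
import random
from statistics import mean

# HYPOTHETICAL tap rates (taps per minute), invented for illustration only.
caffeine = [248, 251, 246, 250, 249, 252, 247, 250, 245, 251]
decaf = [244, 247, 243, 248, 246, 245, 242, 247, 244, 246]

def randomization_p_value(group_a, group_b, reps=10_000, seed=0):
    """One tailed p-value for HA: mean of group_a > mean of group_b."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    count = 0
    for _ in range(reps):
        rng.shuffle(pooled)                      # reassign labels at random
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if diff >= observed:                     # at least as extreme as observed
            count += 1
    return observed, count / reps

observed, p_value = randomization_p_value(caffeine, decaf)
print("observed difference:", observed, "p-value:", p_value)
```

The p-value is the fraction of reshuffles whose difference in means is at least as large as the observed one, which matches the upper tail test implied by the goal of detecting an increase.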

21 Smiles and Punishment Researchers conducted a study examining the effect of a smile on the leniency of disciplinary action. For each suspect, along with a description of the offense, a picture was provided with either a smiling or a neutral facial expression. A leniency score was calculated based on the disciplinary action. The experimenters are testing whether the average leniency score is higher for smiling students than it is for students with a neutral facial expression. What are the null and alternative hypotheses?

22 Smiles and Punishment If α = 0.05, are the results statistically significant?

23 NFL Uniforms vs Penalties In a study of the relationship between the type of uniforms worn by professional sports teams and the aggressiveness of the team, researchers considered teams from the National Football League (NFL). Participants with no knowledge of the teams rated the jerseys on characteristics such as timid/aggressive, nice/mean and good/bad. The averages of these responses produced a “malevolence” index, with higher scores signifying impressions of more malevolent uniforms. To measure aggressiveness, the authors used penalties, converted to z-scores and averaged for each team over the seasons studied. The sample correlation is r = 0.43. Is there a correlation between uniforms and penalties in the NFL? What are H0 and HA?

24 Hypotheses

25 NFL Uniforms vs Penalties

26 Lithium vs Placebo An experiment to investigate the effectiveness of the two drugs desipramine and lithium in the treatment of cocaine addiction was conducted. Subjects (cocaine addicts seeking treatment) were randomly assigned to take one of the treatment drugs or a placebo so that there were 24 patients in each group. The results of the study are summarized in the table below. The question of interest is whether lithium is more effective at preventing relapse than taking an inert pill. State the null and alternative hypotheses. How would you test these hypotheses?


28 Type I and Type II Errors
Possible Hypothesis Test Outcomes
                      State of Nature
Decision              H0 True          H0 False
Do Not Reject H0      No Error         Type II Error
Reject H0             Type I Error     No Error
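A short simulation can make the Type I Error cell concrete: when H0 is true, a test run at level α should reject in roughly a fraction α of samples. The sketch below assumes a two tailed z-test with a known population standard deviation, which the slides do not specify. The value μ = 50 echoes the earlier slides, while `sigma`, `n`, `reps`, and the seed are hypothetical settings chosen for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist

def type_i_error_rate(alpha=0.05, mu0=50, sigma=10, n=30, reps=5_000, seed=0):
    """Estimate how often a true H0: mu = mu0 is rejected.

    ASSUMPTIONS: normal population, known sigma, two tailed z-test.
    sigma, n, reps, and seed are illustrative values, not from the slides.
    """
    rng = random.Random(seed)
    z_cut = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(reps):
        # Draw a sample from a population where H0 really is true.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (sum(sample) / n - mu0) / (sigma / sqrt(n))
        if abs(z) > z_cut:       # rejecting here is a Type I error
            rejections += 1
    return rejections / reps

rate = type_i_error_rate()
print("estimated Type I error rate:", rate)
```

Lowering α makes Type I errors rarer, but for a fixed sample size it also makes it harder to reject a false H0, which raises the chance of a Type II error.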

29 Practical vs Statistical Significance A local college offers an SAT preparation course and provides a statistical analysis on its website showing that 95% of students improve their SAT score after taking the $1000 course. How much would it have to improve your score to make the cost of the course worthwhile? 50 points? 100 points? 300 points? A statistically significant result says nothing about the size of the difference.

