6.1 - One Sample: Mean μ, Variance $\sigma^2$, Proportion π. 6.2 - Two Samples: Means, Variances, Proportions ($\mu_1$ vs. $\mu_2$).



2 CHAPTER 6: Statistical Inference & Hypothesis Testing. 6.1 - One Sample: Mean μ, Variance $\sigma^2$, Proportion π. 6.2 - Two Samples: Means, Variances, Proportions ($\mu_1$ vs. $\mu_2$; $\sigma_1^2$ vs. $\sigma_2^2$; $\pi_1$ vs. $\pi_2$). 6.3 - Multiple Samples: Means, Variances, Proportions ($\mu_1, \ldots, \mu_k$; $\sigma_1^2, \ldots, \sigma_k^2$; $\pi_1, \ldots, \pi_k$).

5 Example: One Mean. Objective 1: “Parameter Estimation”. POPULATION: women in the U.S. who have given birth. “Random Variable” X = Age (years). Present: Assume that X follows a “normal distribution” in the population, with std dev σ = 1.5 yrs, but unknown mean μ = ? That is, X ~ N(μ, 1.5). Estimate the parameter value μ. From a random sample of size n = 400, {$x_1, x_2, x_3, x_4, \ldots, x_{400}$}, compute the sample mean $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$. This is referred to as a “point estimate” of μ from the sample. Improve this point estimate of μ to an “interval estimate” of μ, via the “Sampling Distribution of $\bar{X}$.”

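6 X = Age of women in U.S. who have given birth. Population Distribution of X: mean μ, standard deviation σ = 1.5 yrs. Sampling Distribution of $\bar{X}$: if X ~ N(μ, σ), then $\bar{X} \sim N(\mu, \sigma/\sqrt{n})$ for any sample size n. The standard deviation of $\bar{X}$, namely $\sigma/\sqrt{n}$, is called the “standard error”; here it is $1.5/\sqrt{400} = 0.075$ yrs.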
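
The claim on this slide can be checked numerically. The following is a minimal simulation sketch, not part of the original slides: it draws many random samples of size 400 and compares the spread of their means with $\sigma/\sqrt{n}$. The value `MU_ASSUMED` is a hypothetical "true" mean chosen only so the simulation can run; the slides treat μ as unknown.

```python
import random
import statistics

SIGMA = 1.5         # population standard deviation from the slides (years)
N = 400             # sample size from the slides
MU_ASSUMED = 25.4   # hypothetical "true" mean, needed only to run the simulation

def simulate_sample_means(num_samples, seed=0):
    """Draw num_samples random samples of size N and return their sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(MU_ASSUMED, SIGMA) for _ in range(N))
        for _ in range(num_samples)
    ]

theoretical_se = SIGMA / N ** 0.5                  # sigma / sqrt(n) = 0.075
sample_means = simulate_sample_means(1000)
empirical_se = statistics.stdev(sample_means)      # spread of the simulated means

print(f"theoretical standard error:        {theoretical_se:.4f}")
print(f"empirical std dev of sample means: {empirical_se:.4f}")
```

The two printed values should be close; the empirical one varies slightly with the seed and the number of simulated samples.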

7 Sampling Distribution of $\bar{X}$: mean μ, “standard error” $\sigma/\sqrt{n}$ = .075 yrs. To achieve Objective 1 (obtain an “interval estimate” of μ), we first ask the following general question: Suppose $\bar{X}$ is any random sample mean. Find a “margin of error” (d) so that there is a 95% probability that the interval $(\bar{X} - d,\ \bar{X} + d)$ contains μ.

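8 Standardize the Sampling Distribution of $\bar{X}$: $Z = (\bar{X} - \mu)/(\sigma/\sqrt{n})$ follows the standard normal distribution N(0, 1). The central area 0.95 lies between the critical values $-z_{.025}$ and $+z_{.025}$, with area 0.025 in each tail. Hence d = ($z_{.025}$)(s.e.) = (1.96)(.075 yrs) = 0.147 yrs.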

9 IMPORTANT DEF’NS and FACTS. The “confidence level” is 95%. The “significance level” is 5%. d is called the “95% margin of error” and is equal to the product of the “.025 critical value” (i.e., $z_{.025}$ = 1.96) times the “standard error” (i.e., $\sigma/\sqrt{n}$ = .075 yrs): d = ($z_{.025}$)(s.e.) = (1.96)(.075 yrs) = 0.147 yrs. For any random sample mean $\bar{X}$, the “95% confidence interval” is $(\bar{X} - d,\ \bar{X} + d)$. It contains μ with probability 95%. In this example, the 95% CI is $(\bar{X} - 0.147,\ \bar{X} + 0.147)$. For instance, if a particular sample yields $\bar{x} = 25.6$, the 95% CI is (25.6 – 0.147, 25.6 + 0.147) = (25.453, 25.747) yrs. It contains μ with 95% “confidence.”
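
The arithmetic on this slide can be reproduced in a few lines. This is a sketch using Python's standard-library `statistics.NormalDist`; the function name `z_confidence_interval` is our own, not from the slides.

```python
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (lower, upper, margin_of_error) for a mean when sigma is known."""
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # critical value z_{alpha/2}
    standard_error = sigma / n ** 0.5              # sigma / sqrt(n)
    margin = z_crit * standard_error               # d = (critical value)(s.e.)
    return sample_mean - margin, sample_mean + margin, margin

low, high, margin = z_confidence_interval(sample_mean=25.6, sigma=1.5, n=400)
print(f"margin of error = {margin:.3f} yrs")        # 0.147
print(f"95% CI = ({low:.3f}, {high:.3f}) yrs")      # (25.453, 25.747)
```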

10 IMPORTANT DEF’NS and FACTS, in general: replace 95% by 100(1 – α)%. The “confidence level” is 1 – α. The “significance level” is α. The standard normal distribution N(0, 1) has central area 1 – α between the critical values $-z_{\alpha/2}$ and $+z_{\alpha/2}$, with area α/2 in each tail. d is called the “100(1 – α)% margin of error” and is equal to the product of the “α/2 critical value” (i.e., $z_{\alpha/2}$) times the “standard error”: d = ($z_{\alpha/2}$)(s.e.). For any random sample mean $\bar{X}$, the “100(1 – α)% confidence interval” is $(\bar{X} - d,\ \bar{X} + d)$. It contains μ with probability 1 – α. With α = .05, d = ($z_{.025}$)(s.e.) = (1.96)(.075 yrs) = 0.147 yrs, and a sample mean of 25.6 gives the 95% CI (25.6 – 0.147, 25.6 + 0.147) = (25.453, 25.747) yrs.

11 What happens if we change α? Example: α = .10, 1 – α = .90: critical values ±1.645. Example: α = .05, 1 – α = .95: critical values ±1.96. Example: α = .01, 1 – α = .99: critical values ±2.576. The higher the confidence level 1 – α, the larger the critical value $z_{\alpha/2}$, and hence the wider the margin of error d = ($z_{\alpha/2}$)(s.e.). Why not ask for α = 0, i.e., 1 – α = 1? Because then the critical values → ±∞.
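
To see how the critical value grows as α shrinks, one can tabulate $z_{\alpha/2}$ for several significance levels. This is a small sketch with the standard library; the helper name `critical_value` and the extra α values beyond the three on the slide are our own choices.

```python
from statistics import NormalDist

def critical_value(alpha):
    """Two-sided critical value z_{alpha/2} of the standard normal distribution."""
    if not 0 < alpha < 1:
        # alpha = 0 would require an infinite critical value
        raise ValueError("alpha must be strictly between 0 and 1")
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.10, 0.05, 0.01, 0.001, 0.000001):
    print(f"alpha = {alpha:<8g} confidence = {1 - alpha:<9g} z = {critical_value(alpha):.3f}")
```

The first three rows give 1.645, 1.960 and 2.576; the later rows keep growing without bound as α approaches 0.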

12 IMPORTANT DEF’NS and FACTS. 95% margin of error = ($z_{.025}$)(s.e.) = (1.96)(.075 yrs) = 0.147 yrs. In this example, the 95% CI is $(\bar{X} - 0.147,\ \bar{X} + 0.147)$, and different random samples yield different intervals (… etc…). In principle, over the long run, the proportion of such random intervals that contain μ will approach 95%. BUT…. For instance, if a particular sample yields $\bar{x} = 25.6$, the 95% CI is (25.6 – 0.147, 25.6 + 0.147) = (25.453, 25.747) yrs. Does this one interval contain μ? It contains μ with 95% “confidence.”

13 In principle, over the long run, the proportion of random intervals that contain μ will approach 95%. BUT…. In practice, only a single, fixed interval is generated from a single random sample, so technically, “probability” does not apply. That is why we say that the interval (25.453, 25.747) yrs contains μ with 95% “confidence.” NOW, let us introduce and test a specific hypothesis…
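
The long-run statement can be illustrated by simulation. In the sketch below, `MU_ASSUMED` is a hypothetical "true" mean (the slides treat μ as unknown), and each sample mean is drawn directly from its sampling distribution N(μ, s.e.) rather than from 400 individual ages.

```python
import random
from statistics import NormalDist

SIGMA, N = 1.5, 400
SE = SIGMA / N ** 0.5                          # standard error, 0.075
MARGIN = NormalDist().inv_cdf(0.975) * SE      # 95% margin of error, about 0.147
MU_ASSUMED = 25.4                              # hypothetical true mean for the simulation

def coverage(num_intervals, seed=1):
    """Fraction of simulated 95% confidence intervals that contain MU_ASSUMED."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_intervals):
        xbar = rng.gauss(MU_ASSUMED, SE)       # one random sample mean
        if xbar - MARGIN <= MU_ASSUMED <= xbar + MARGIN:
            hits += 1
    return hits / num_intervals

print(f"coverage over 10,000 intervals: {coverage(10_000):.3f}")
```

The printed fraction should be near 0.95. Any single interval either contains μ or it does not, which is the point of the "BUT" above.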

14 Statistical Inference and Hypothesis Testing. POPULATION: women in the U.S. who have given birth. “Random Variable” X = Age at first birth. Study Question: Has “age at first birth” of women in the U.S. changed over time (public education, awareness programs, socioeconomic conditions, etc.)? Year 2010: Suppose we know that X follows a “normal distribution” (a.k.a. “bell curve”) in the population, with mean μ = 25.4 and standard deviation σ = 1.5. That is, X ~ N(25.4, 1.5). Present: Is μ = 25.4 still true? This is the “Null Hypothesis” $H_0$: μ = 25.4. Or, is the “alternative hypothesis” $H_A$: μ ≠ 25.4 true, i.e., either μ < 25.4 or μ > 25.4 (2-sided)? From a random sample {$x_1, x_2, x_3, x_4, \ldots, x_{400}$}, compute the sample mean $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$. Does the sample statistic $\bar{x}$ tend to support $H_0$, or refute $H_0$ in favor of $H_A$?

15 Objective 2: Hypothesis Testing… via Confidence Interval. We have now seen: the sample mean $\bar{x} = 25.6$ is the “point estimate” for μ, and the 95% CONFIDENCE INTERVAL FOR μ is (25.453, 25.747). BASED ON OUR SAMPLE DATA, the true value of μ today is between 25.453 and 25.747, with 95% “confidence.” FORMAL CONCLUSIONS: The 95% confidence interval corresponding to our sample mean does not contain the “null value” of the population mean, μ = 25.4. Based on our sample data, we may reject the null hypothesis $H_0$: μ = 25.4 in favor of the two-sided alternative hypothesis $H_A$: μ ≠ 25.4, at the α = .05 significance level. INTERPRETATION: According to the results of this study, there exists a statistically significant difference between the mean ages at first birth in 2010 (25.4 yrs) and today, at the 5% significance level. Moreover, the evidence from the sample data suggests that the population mean age today is older than in 2010, rather than younger. NOTE THAT THE CONFIDENCE INTERVAL ONLY DEPENDS ON THE SAMPLE, NOT A SPECIFIC NULL HYPOTHESIS!!!

16 Objective 2: Hypothesis Testing… via Confidence Interval. What if…? Suppose instead that the sample mean (the “point estimate” for μ) had been $\bar{x} = 25.2$. Then the 95% CONFIDENCE INTERVAL FOR μ is (25.2 – 0.147, 25.2 + 0.147) = (25.053, 25.347). BASED ON OUR SAMPLE DATA, the true value of μ today is between 25.053 and 25.347, with 95% “confidence.” FORMAL CONCLUSIONS: This 95% confidence interval does not contain the “null value” of the population mean, μ = 25.4, either. Based on our sample data, we may reject the null hypothesis $H_0$: μ = 25.4 in favor of the two-sided alternative hypothesis $H_A$: μ ≠ 25.4, at the α = .05 significance level. INTERPRETATION: According to the results of this study, there exists a statistically significant difference between the mean ages at first birth in 2010 (25.4 yrs) and today, at the 5% significance level. Moreover, the evidence from the sample data suggests that the population mean age today is younger than in 2010, rather than older. NOTE THAT THE CONFIDENCE INTERVAL ONLY DEPENDS ON THE SAMPLE, NOT A SPECIFIC NULL HYPOTHESIS!!!
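
The decision rule "reject $H_0$ if the confidence interval misses the null value" can be written as a short function. This is a sketch under the slides' assumptions (σ = 1.5, n = 400, α = .05); the name `ci_decision` and the returned fields are our own.

```python
from statistics import NormalDist

def ci_decision(sample_mean, null_value, sigma=1.5, n=400, alpha=0.05):
    """Two-sided z-test for one mean, decided via the confidence interval."""
    margin = NormalDist().inv_cdf(1 - alpha / 2) * sigma / n ** 0.5
    low, high = sample_mean - margin, sample_mean + margin
    reject = not (low <= null_value <= high)   # reject H0 if the CI misses the null value
    direction = None
    if reject:
        direction = "older" if sample_mean > null_value else "younger"
    return {"ci": (low, high), "reject": reject, "direction": direction}

for xbar in (25.6, 25.2):
    result = ci_decision(xbar, null_value=25.4)
    low, high = result["ci"]
    print(f"x-bar = {xbar}: CI = ({low:.3f}, {high:.3f}), "
          f"reject H0: {result['reject']}, suggests: {result['direction']}")
```

Both sample means lead to rejection, but in opposite directions. The interval itself is computed without reference to the null value, which only enters at the comparison step.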

17 Objective 2: Hypothesis Testing… via Acceptance Region. Under $H_0$: μ = 25.4, the “Null” Distribution of $\bar{X}$ is the Sampling Distribution of $\bar{X}$ centered at 25.4, with “standard error” $\sigma/\sqrt{n}$ = .075 yrs. 95% margin of error = ($z_{.025}$)(s.e.) = (1.96)(.075 yrs) = 0.147 yrs, so the 95% ACCEPTANCE REGION FOR $H_0$ is (25.4 – 0.147, 25.4 + 0.147) = (25.253, 25.547). If $H_0$ is true, we would expect a random sample mean to lie in here, with 95% probability (area 0.95), and out here, in either tail (area 0.025 each), with 5% probability.

18 Objective 2: Hypothesis Testing… via Acceptance Region. We have now seen: IF $H_0$ is true, then we would expect a random sample mean to lie between 25.253 and 25.547, with 95% probability. This interval is the 95% ACCEPTANCE REGION FOR $H_0$. Our data value lies in the 5% REJECTION REGION. FORMAL CONCLUSIONS: The 95% acceptance region for the null hypothesis does not contain the sample mean of $\bar{x} = 25.6$. Based on our sample data, we may reject the null hypothesis $H_0$: μ = 25.4 in favor of the two-sided alternative hypothesis $H_A$: μ ≠ 25.4, at the α = .05 significance level. INTERPRETATION: According to the results of this study, there exists a statistically significant difference between the mean ages at first birth in 2010 (25.4 yrs) and today, at the 5% significance level. Moreover, the evidence from the sample data suggests that the population mean age today is older than in 2010, rather than younger. NOTE THAT THE ACCEPTANCE REGION ONLY DEPENDS ON THE NULL HYPOTHESIS, NOT ON THE SAMPLE!!!
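
The acceptance-region approach centers the same margin of error on the null value instead of on the sample mean. This is a minimal sketch; the function name `acceptance_region` is our own.

```python
from statistics import NormalDist

def acceptance_region(null_value, sigma=1.5, n=400, alpha=0.05):
    """100(1 - alpha)% acceptance region for H0: mu = null_value (two-sided)."""
    margin = NormalDist().inv_cdf(1 - alpha / 2) * sigma / n ** 0.5
    return null_value - margin, null_value + margin

low, high = acceptance_region(25.4)        # uses only H0, never the sample
sample_mean = 25.6
in_region = low <= sample_mean <= high

print(f"95% acceptance region = ({low:.3f}, {high:.3f})")   # (25.253, 25.547)
print("retain H0" if in_region else "reject H0")            # reject H0
```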

19 Objective 2: Hypothesis Testing… via “p-value”. If $H_0$: μ = 25.4 is true, what is the probability of obtaining a random sample mean that is as, or more, extreme than the one actually obtained, i.e., 0.2 yrs OR MORE away from μ = 25.4, ON EITHER SIDE (since the alternative hypothesis is 2-sided)? The sample mean 25.6 lies outside the 95% ACCEPTANCE REGION FOR $H_0$, (25.253, 25.547): its z-score, (25.6 – 25.4)/.075 = 2.667, is > 1.96. The area beyond it in each tail is 0.00383, which is less than 0.025, so the p-value = 2(0.00383) = 0.00766. The p-value measures the strength of the rejection.

20 Objective 2: Hypothesis Testing… via “p-value”. In general, the 100(1 – α)% ACCEPTANCE REGION FOR $H_0$ corresponds to the central area 1 – α between $-z_{\alpha/2}$ and $+z_{\alpha/2}$ on the standard normal scale, with area α/2 in each tail. In this example (α = .05), the sample mean 25.6 has z-score 2.667 > 1.96 = $z_{.025}$ and tail area 0.00383 < 0.025 = α/2; equivalently, the p-value 0.00766 is less than α = .05. The p-value measures the strength of the rejection.
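
The p-value for this example can be computed directly from the standard normal distribution. This is a sketch with the standard library, using the slides' numbers (μ₀ = 25.4, σ = 1.5, n = 400, sample mean 25.6); the variable names are our own.

```python
from statistics import NormalDist

null_value, sigma, n = 25.4, 1.5, 400
sample_mean = 25.6

standard_error = sigma / n ** 0.5                    # 0.075
z_score = (sample_mean - null_value) / standard_error
tail_area = 1 - NormalDist().cdf(abs(z_score))       # area beyond |z| in one tail
p_value = 2 * tail_area                              # two-sided: both tails

print(f"z-score   = {z_score:.3f}")     # 2.667
print(f"tail area = {tail_area:.5f}")   # 0.00383
print(f"p-value   = {p_value:.4f}")     # 0.0077
print("reject H0" if p_value < 0.05 else "retain H0")
```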

21 α = .05. If p-value < α, then reject $H_0$; statistical significance!... But interpret it correctly!

22 ~ Summary of Hypothesis Testing for One Mean ~ CONFIDENCE INTERVAL. Assume the population random variable is normally distributed, i.e., X ~ N(μ, σ). NULL HYPOTHESIS $H_0$: μ = $\mu_0$ (“null value”). ALTERNATIVE HYPOTHESIS $H_A$: μ ≠ $\mu_0$, i.e., either μ < $\mu_0$ or μ > $\mu_0$ (“two-sided”). Test null hypothesis at significance level α. Compute the sample mean $\bar{x}$. Compute the 100(1 – α)% “margin of error” = (critical value)(standard error) = $z_{\alpha/2}\,\sigma/\sqrt{n}$. Then the 100(1 – α)% CI = ($\bar{x}$ – margin of error, $\bar{x}$ + margin of error). Formal Conclusion: Reject null hypothesis at level α if CI does not contain $\mu_0$. Statistical significance! Otherwise, retain it.

23 ~ Summary of Hypothesis Testing for One Mean ~ ACCEPTANCE REGION. Assume the population random variable is normally distributed, i.e., X ~ N(μ, σ). NULL HYPOTHESIS $H_0$: μ = $\mu_0$ (“null value”). ALTERNATIVE HYPOTHESIS $H_A$: μ ≠ $\mu_0$, i.e., either μ < $\mu_0$ or μ > $\mu_0$ (“two-sided”). Test null hypothesis at significance level α. Compute the sample mean $\bar{x}$. Compute the 100(1 – α)% “margin of error” = (critical value)(standard error) = $z_{\alpha/2}\,\sigma/\sqrt{n}$. Then the 100(1 – α)% AR = ($\mu_0$ – margin of error, $\mu_0$ + margin of error). Formal Conclusion: Reject null hypothesis at level α if AR does not contain $\bar{x}$. Statistical significance! Otherwise, retain it.

24 ~ Summary of Hypothesis Testing for One Mean ~ p-value. Assume the population random variable is normally distributed, i.e., X ~ N(μ, σ). NULL HYPOTHESIS $H_0$: μ = $\mu_0$ (“null value”). ALTERNATIVE HYPOTHESIS $H_A$: μ ≠ $\mu_0$, i.e., either μ < $\mu_0$ or μ > $\mu_0$ (“two-sided”). Test null hypothesis at significance level α. Compute the sample mean $\bar{x}$. Compute the z-score = $(\bar{x} - \mu_0)/(\sigma/\sqrt{n})$, where Z ~ N(0, 1). If +, then the p-value = 2 P(Z ≥ z-score). If –, then the p-value = 2 P(Z ≤ z-score). Formal Conclusion: Reject null hypothesis if p < α. Statistical significance! Otherwise, retain it. Remember: “The smaller the p-value, the stronger the rejection, and the more statistically significant the result.”

25 Objective 2: Hypothesis Testing… 1-sided tests. The alternative hypothesis usually reflects the investigator’s belief! 2-sided test: $H_0$: μ = 25.4, $H_A$: μ ≠ 25.4. In this case, α = .05 is split evenly between the two tails, left and right (0.025 each); the 95% ACCEPTANCE REGION FOR $H_0$ is (25.253, 25.547), with area 0.95, and the p-value is the sum of the two tail areas of .00383 each. 1-sided tests, “Right-tailed”: $H_0$: μ ≤ 25.4, $H_A$: μ > 25.4. Here, all of α = .05 is in the right tail.

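26 Objective 2: Hypothesis Testing… 1-sided tests. 2-sided test: $H_0$: μ = 25.4, $H_A$: μ ≠ 25.4. In this case, α = .05 is split evenly between the two tails, left and right (0.025 each); the 95% ACCEPTANCE REGION FOR $H_0$ is (25.253, 25.547), and the p-value is the sum of the two tail areas of .00383 each. 1-sided tests, “Right-tailed”: $H_0$: μ ≤ 25.4, $H_A$: μ > 25.4. Here, all of α = .05 is in the right tail, so the 95% ACCEPTANCE REGION FOR $H_0$ has area 0.95 to the left of a single cutoff and area 0.05 to its right. Where is that cutoff? The p-value is now the single right-tail area, .00383.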

27 Objective 2: Hypothesis Testing… 1-sided tests. The alternative hypothesis usually reflects the investigator’s belief! 2-sided test: $H_0$: μ = 25.4, $H_A$: μ ≠ 25.4. In this case, α = .05 is split evenly between the two tails, left and right. 1-sided tests, “Right-tailed”: $H_0$: μ ≤ 25.4, $H_A$: μ > 25.4. Here, all of α = .05 is in the right tail; the 95% ACCEPTANCE REGION FOR $H_0$ has area 0.95 to the left of a single cutoff (?) and area 0.05 to its right. “Left-tailed”: $H_0$: μ ≥ 25.4, $H_A$: μ < 25.4. Here, all of α = .05 is in the left tail (the figure marks the value 25.2 on the axis).
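
For a one-sided test, all of α sits in one tail, so the critical value is $z_{\alpha}$ rather than $z_{\alpha/2}$. The sketch below computes the single cutoff and the one-tailed p-value under the slides' assumptions (σ = 1.5, n = 400); the function name `one_sided_test` and the `tail` argument are our own.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def one_sided_test(sample_mean, null_value, tail, sigma=1.5, n=400, alpha=0.05):
    """Return (cutoff, p_value, reject) for a right- or left-tailed z-test."""
    standard_error = sigma / n ** 0.5
    z_score = (sample_mean - null_value) / standard_error
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)          # all of alpha in one tail
    if tail == "right":                             # HA: mu > null_value
        cutoff = null_value + z_crit * standard_error
        p_value = 1 - STD_NORMAL.cdf(z_score)
    elif tail == "left":                            # HA: mu < null_value
        cutoff = null_value - z_crit * standard_error
        p_value = STD_NORMAL.cdf(z_score)
    else:
        raise ValueError("tail must be 'right' or 'left'")
    return cutoff, p_value, p_value < alpha

cutoff, p_value, reject = one_sided_test(25.6, 25.4, tail="right")
print(f"right-tailed: cutoff = {cutoff:.3f}, p-value = {p_value:.5f}, reject = {reject}")

cutoff, p_value, reject = one_sided_test(25.2, 25.4, tail="left")
print(f"left-tailed:  cutoff = {cutoff:.3f}, p-value = {p_value:.5f}, reject = {reject}")
```

With these inputs the right-tailed cutoff is about 25.523 and the left-tailed cutoff about 25.277; each p-value is the single tail area 0.00383.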

28 STATBOT 301. Subject: basic calculation of p-values for z-test. CALCULATE… from $H_0$ the Test Statistic “z-score” = $(\bar{x} - \mu_0)/(\sigma/\sqrt{n})$, and look up its table entry, i.e., the cumulative probability P(Z ≤ z-score). If $H_A$: μ < $\mu_0$, then p-value = table entry. If $H_A$: μ > $\mu_0$, then p-value = 1 – table entry. If $H_A$: μ ≠ $\mu_0$, check the sign of z-score: if –, then p-value = 2 × table entry; if +, then p-value = 2 × (1 – table entry).
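
The flowchart translates almost line for line into a function. In this sketch, `NormalDist().cdf` stands in for the printed z-table (an assumption about which table the slide means, consistent with the rules it gives), and the labels "less", "greater" and "two-sided" are our own.

```python
from statistics import NormalDist

def z_test_p_value(z_score, alternative):
    """p-value for a z-test, following the flowchart's three branches."""
    table_entry = NormalDist().cdf(z_score)    # P(Z <= z-score)
    if alternative == "less":                  # HA: mu < mu_0
        return table_entry
    if alternative == "greater":               # HA: mu > mu_0
        return 1 - table_entry
    if alternative == "two-sided":             # HA: mu != mu_0, branch on the sign
        if z_score < 0:
            return 2 * table_entry
        return 2 * (1 - table_entry)
    raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")

z = (25.6 - 25.4) / (1.5 / 400 ** 0.5)         # z-score of the running example
for alternative in ("less", "greater", "two-sided"):
    print(f"{alternative:>9}: p-value = {z_test_p_value(z, alternative):.5f}")
```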

29 Example: One Mean. Objective 1: “Parameter Estimation”. POPULATION: women in the U.S. who have given birth. “Random Variable” X = Age (years). Present: Assume that X follows a “normal distribution” in the population, with std dev σ = 1.5 yrs, but unknown mean μ = ? That is, X ~ N(μ, 1.5). Estimate the parameter value μ. From a random sample of size n = 400, {$x_1, x_2, x_3, x_4, \ldots, x_{400}$}, compute the sample mean $\bar{x}$. This is referred to as a “point estimate” of μ from the sample. Improve this point estimate of μ to an “interval estimate” of μ, via the “Sampling Distribution of $\bar{X}$.” But how do we know that the variance is the same as in 2010?

30 $\sigma^2 = (1.5)^2 = 2.25$ in our example … which leads us to…

31 All have positively-skewed tails. HOWEVER……

32 In practice, $\sigma^2$ is almost never known, so the sample variance $s^2$ is used as a substitute in all calculations!
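
As a rough illustration of that substitution, the sketch below computes $s^2$ from a sample and plugs $s$ into the standard error in place of σ. The ages are simulated, not real data: the mean 25.4 and standard deviation 1.5 used to generate them are assumptions made only so that there is a sample to work with.

```python
import random
import statistics

rng = random.Random(42)
# Simulated ages (NOT real data); the generating parameters are assumptions.
sample = [rng.gauss(25.4, 1.5) for _ in range(400)]

sample_mean = statistics.fmean(sample)
s_squared = statistics.variance(sample)          # sample variance s^2 (divides by n - 1)
s = s_squared ** 0.5                             # sample standard deviation
estimated_se = s / len(sample) ** 0.5            # s / sqrt(n) replaces sigma / sqrt(n)

print(f"sample mean           = {sample_mean:.3f}")
print(f"sample variance s^2   = {s_squared:.3f}")
print(f"estimated std. error  = {estimated_se:.4f}")
```

With a sample this large, $s$ lands close to the generating value 1.5, so the estimated standard error stays near .075. How the critical value should change once $s$ replaces σ is not covered on this slide.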

