Introduction to Inference: Confidence Intervals and Hypothesis Testing Presentation 4 First Part.

1 Introduction to Inference: Confidence Intervals and Hypothesis Testing Presentation 4 First Part

2 What is inference?
Inference is the use of a sample to draw conclusions about a population.
1. Draw a representative SAMPLE from the POPULATION.
   Var 1   Var 2   Var 3
   459     Brown   28
   657     Red     43
   321     Green   46
   213     Blue    47
   536     Blue    53
2. Describe the SAMPLE.
3. Use the rules of probability and statistics to draw conclusions about the POPULATION from the SAMPLE.

3 Population Parameters
p = population proportion
µ = population mean
σ = population standard deviation
$\beta_1$ = population slope (we will see later)
Sample Statistics
$\hat{p}$ = sample proportion
$\bar{x}$ = sample mean
s = sample standard deviation
$b_1$ = sample slope (we will see later)
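
As a small illustration of sample statistics, the sketch below computes $\hat{p}$, $\bar{x}$ and s from the five-row sample shown on slide 2. It assumes the flattened table reads as three columns (a number, a color, a number) and, purely for illustration, treats "Blue" as the outcome of interest. Neither choice comes from the slides.

```python
import statistics

# Sample rows from slide 2, read as (Var 1, Var 2, Var 3).
# The column split is an assumption about the flattened table.
sample = [
    (459, "Brown", 28),
    (657, "Red", 43),
    (321, "Green", 46),
    (213, "Blue", 47),
    (536, "Blue", 53),
]

colors = [row[1] for row in sample]
values = [row[2] for row in sample]

# Sample proportion: share of rows showing the outcome of interest.
# "Blue" is a hypothetical choice of outcome.
p_hat = colors.count("Blue") / len(colors)

# Sample mean and sample standard deviation (n - 1 in the denominator).
x_bar = statistics.mean(values)
s = statistics.stdev(values)

print(f"p_hat = {p_hat:.2f}, x_bar = {x_bar:.1f}, s = {s:.2f}")
```

These three numbers describe the sample only. The population values p, µ and σ stay unknown and are what inference tries to estimate.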

4 Two Types of Inference
1. Confidence Intervals:
– Confidence intervals give us a range in which the population parameter is likely to fall.
– We use confidence intervals whenever the research question calls for an estimate of a population parameter.
Example: Estimate the proportion of US adult women who would vote for Hillary Clinton as president.
Example: What is the mean age of trees in the forest?

5 Two Types of Inference, Cont.
2. Hypothesis Testing:
– Hypothesis tests are tests of population parameters.
Example: Is the proportion of US adult women who would vote for Hillary Clinton greater than 50%?
– A hypothesis test can only give evidence that a population parameter differs from the null value. It cannot show that a population parameter is equal to some value.
Example:
Valid hypothesis: Is the mean age of trees in the forest greater than 50 years?
Invalid hypothesis: Is the mean age of trees in the forest equal to 50 years?

6 Types of CI’s and Hypothesis Tests
For Hypothesis Tests and C.I.’s:
1-proportion (1-categorical variable)
1-mean (1-quantitative variable)
Difference in 2 proportions (2-categorical variables, both with 2 possible outcomes)
Difference in 2 means (1-quantitative and 1-categorical variable, or 2-quantitative variables, independent samples)
Regression, Slope (2-quantitative variables)
For Hypothesis Tests only:
Chi-Square Test (2-categorical variables, at least one with 3 or more levels!)

7 Some Examples
Polina wants to estimate the mean high-school GPA of incoming freshmen at FIT.
Solution: CI for one population mean.
Pampos wants to know if the proportion of PSU students who engage in underage drinking is greater than 25%.
Solution: Hypothesis test of one proportion.
Null Hypothesis: H0: p ≤ .25
Alternative Hypothesis: Ha: p > .25
Isaac wants to estimate the difference in the proportion of men and women who smoke.
Solution: CI for difference in 2 proportions.

8 Interpreting Confidence Intervals
Given the confidence level (90%, 95%, 99%, etc.), conclude the following (let L = confidence level):
“With L% confidence the population parameter is within the confidence interval.”
Example: Suppose the 90% CI for the mean age of trees in the forest is (32, 45) years. We are 90% confident that the true mean age of trees in the forest is between 32 and 45 years.

9 Interpreting Hypothesis Tests
There are two hypotheses, the null and the alternative. The research aim is to find statistically significant evidence for the alternative hypothesis.
Use the p-value to determine whether we can reject the null hypothesis (H0).
At this point we don’t need to know the exact definition of the p-value, or how to calculate it. But generally, the p-value is a measure of how consistent the data are with the null hypothesis. A small p-value (< .05) indicates that the data we obtained were UNLIKELY under the null hypothesis.
Decision Rule:
If the p-value is < .05, we REJECT the null hypothesis and accept the alternative. We have a statistically significant result!
If the p-value is > .05, we say that we do NOT have enough evidence in the data to reject the null hypothesis.
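
The decision rule can be written as a very small function. This is only a sketch: the name `decide` and the `alpha` parameter are illustrative, and treating a p-value exactly equal to the cutoff as "do not reject" is an assumption, since the slide does not cover that case.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the decision rule from the slide at cutoff alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < alpha:
        # The observed data were unlikely under H0.
        return "reject H0"
    # Not enough evidence against H0 (assumption: includes p_value == alpha).
    return "fail to reject H0"


# Hypothetical p-values, for illustration only.
print(decide(0.012))
print(decide(0.300))
```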

10 Second Part Confidence Intervals for 1-Proportion

11 Sample Proportion
Mean for $\hat{p}$ = E($\hat{p}$) = p
StdDev for $\hat{p}$ = s.d.($\hat{p}$) = $\sqrt{p(1-p)/n}$
Standard Error of $\hat{p}$ = s.e.($\hat{p}$) = $\sqrt{\hat{p}(1-\hat{p})/n}$
If np and n(1-p) are greater than or equal to 10, the sampling distribution of $\hat{p}$ is approximately normal with mean p and standard deviation $\sqrt{p(1-p)/n}$, i.e. $\hat{p} \sim N\left(p, \sqrt{p(1-p)/n}\right)$ approximately.
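
A minimal sketch of these quantities, assuming the usual formulas $\sqrt{p(1-p)/n}$ for the standard deviation and $\sqrt{\hat{p}(1-\hat{p})/n}$ for the standard error. The function names and the example values of p and n are hypothetical.

```python
import math


def sd_p_hat(p: float, n: int) -> float:
    """Standard deviation of the sample proportion, using the true p."""
    return math.sqrt(p * (1 - p) / n)


def se_p_hat(p_hat: float, n: int) -> float:
    """Standard error: the same formula with p_hat in place of the unknown p."""
    return math.sqrt(p_hat * (1 - p_hat) / n)


def normal_approx_ok(p: float, n: int, threshold: float = 10) -> bool:
    """Check the condition that np and n(1-p) are both at least 10."""
    return n * p >= threshold and n * (1 - p) >= threshold


# Hypothetical values, for illustration only.
print(sd_p_hat(0.5, 100))           # standard deviation when p = 0.5, n = 100
print(normal_approx_ok(0.5, 100))   # np = 50 and n(1-p) = 50
print(normal_approx_ok(0.02, 100))  # np = 2, so the condition fails
```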

12 From Sampling Distributions to Confidence Intervals…
The sample proportion will fall close to the true (unknown) proportion.
Thus, the true proportion is likely to be close to the observed sample proportion. How close?
95% of the sample proportions $\hat{p}$ would be expected to fall within ± 2 standard deviations of the true proportion p.
So if we were to construct intervals around the sample proportion extending ± 2 standard deviations on either side, these intervals would contain the TRUE population proportion 95% of the time!
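
The "95% of the time" claim can be checked with a small simulation. The sketch below is not part of the original slides. The true proportion, sample size, number of repetitions and seed are all hypothetical choices, and the interval uses the true standard deviation $\sqrt{p(1-p)/n}$, as the slide describes.

```python
import math
import random


def coverage(p: float, n: int, trials: int, seed: int = 0) -> float:
    """Share of simulated samples whose p_hat ± 2 sd interval contains p."""
    rng = random.Random(seed)
    sd = math.sqrt(p * (1 - p) / n)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and compute its sample proportion.
        successes = sum(rng.random() < p for _ in range(n))
        p_hat = successes / n
        # The interval p_hat ± 2 sd contains p exactly when |p_hat - p| <= 2 sd.
        if abs(p_hat - p) <= 2 * sd:
            hits += 1
    return hits / trials


# Hypothetical settings, for illustration only.
rate = coverage(p=0.4, n=500, trials=2000)
print(f"share of intervals containing p: {rate:.3f}")
```

The printed share should land near 0.95, with some variation from run to run if the seed is changed.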

13 Margin of Error & C.I.
$\hat{p}$ is an estimator of p but it is not exactly equal to p.
But how far is $\hat{p}$ from p? Or, how far is p from $\hat{p}$?
Margin of Error is a measure of accuracy providing a likely upper limit for the difference between $\hat{p}$ and p.
In other words, this difference is almost always less than the Margin of Error, i.e. $|\hat{p} - p| <$ Margin of Error.
The “almost always” is translated as “with large probability”.
Usually we are talking about 90%, 95% or 99% probability.

14 Margin of Error & C.I., Cont.
This probability is the confidence level.
For example, if the confidence level is 95%, it means that 95% of the time the difference between $\hat{p}$ and p is less than the Margin of Error (e.g. we expect 38 out of 40 samples to give a $\hat{p}$ whose difference from p is less than the Margin of Error).
Example: Based on a sample of 1000 voters, the proportion of voters who favor candidate A is 34% with a 3% Margin of Error based on a 95% confidence level. What does this tell us?
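
One way to read the poll example is sketched below. It turns the reported figures into an interval, then checks that a 3% margin is roughly what 2 standard errors give for this sample size. The standard error formula $\sqrt{\hat{p}(1-\hat{p})/n}$ and the multiplier of 2 are assumptions carried over from the neighboring slides, not something the poll statement itself specifies.

```python
import math

n = 1000        # voters sampled
p_hat = 0.34    # reported proportion favoring candidate A
margin = 0.03   # reported Margin of Error at 95% confidence

# Interval implied by the report: p_hat minus and plus the Margin of Error.
lower, upper = p_hat - margin, p_hat + margin

# Cross-check: about 2 standard errors of p_hat (assumed formula and multiplier).
approx_margin = 2 * math.sqrt(p_hat * (1 - p_hat) / n)

print(f"interval: ({lower:.2f}, {upper:.2f})")
print(f"2 * SE  : {approx_margin:.4f}")
```

Read this way, the report says we are 95% confident that the true share of voters favoring candidate A lies between 31% and 37%.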

15 Confidence Interval for 1-proportion
Conditions: We need to have $n\hat{p} \ge 10$ and $n(1-\hat{p}) \ge 10$.
Note that we are using $\hat{p}$ instead of p here!
CI for p: $\hat{p} \pm M \times SE(\hat{p})$
– M = multiplier, depends on the level of confidence desired. For a 95% CI the multiplier is ~ 2.
– SE($\hat{p}$) is the standard error of the sample proportion.
– Margin of Error = the multiplier times the SE, i.e. $M \times SE(\hat{p})$.
Interpretation: If M = 2, we are 95% confident that the true population proportion is contained within the confidence interval.
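
The recipe on this slide can be sketched as a single function. The name `proportion_ci`, its signature and the example counts are hypothetical. The conditions are checked on the counts of successes and failures, which is the same as checking that n times $\hat{p}$ and n times (1 minus $\hat{p}$) are both at least 10.

```python
import math


def proportion_ci(successes: int, n: int, multiplier: float = 2.0):
    """Return (lower, upper) for p_hat ± multiplier * SE(p_hat)."""
    # Conditions: at least 10 successes and at least 10 failures in the sample.
    if successes < 10 or n - successes < 10:
        raise ValueError("need at least 10 successes and 10 failures")
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = multiplier * se  # Margin of Error
    return p_hat - margin, p_hat + margin


# Hypothetical sample: 40 successes out of 100, with the "about 2" multiplier.
low, high = proportion_ci(40, 100)
print(f"approx. 95% CI: ({low:.3f}, {high:.3f})")
```

Raising an error when the conditions fail is a design choice for this sketch. It prevents the normal-based interval from being reported for samples where the approximation is doubtful.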

16 Example 1:
A sample of 1200 people is polled to determine the percentage that are in favor of candidate A. Suppose 580 say they are in favor. Construct a 95% CI for the true population proportion.
Conclusion: We are 95% confident that the true population proportion of those who support candidate A is between 45.5% and 51.2%.
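
The intermediate arithmetic is not shown in the transcript, so the sketch below reconstructs it under the assumption that the interval is $\hat{p}$ plus or minus M times the standard error $\sqrt{\hat{p}(1-\hat{p})/n}$. It tries both the rounded multiplier 2 and the more precise 1.96. Which one the original slide used is not stated.

```python
import math

successes, n = 580, 1200  # figures from Example 1
p_hat = successes / n
se = math.sqrt(p_hat * (1 - p_hat) / n)


def interval(multiplier: float):
    """Return p_hat ± multiplier * SE as (lower, upper)."""
    margin = multiplier * se
    return p_hat - margin, p_hat + margin


for m in (2.0, 1.96):
    lower, upper = interval(m)
    print(f"M = {m}: ({lower:.1%}, {upper:.1%})")
```

With M = 1.96 the endpoints round to the 45.5% and 51.2% quoted in the conclusion. With M = 2 the lower endpoint rounds to 45.4% instead, which suggests (but does not prove) that the slide used 1.96.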

17 Example 2:
300 high-risk patients received an experimental AIDS vaccine. The patients were followed for a period of 5 years and ultimately 53 came down with the virus. Assuming all patients were exposed to the virus, construct a 99% CI for the proportion of individuals protected.
99% CI = $\hat{p}$ ± M*SE($\hat{p}$)
$\hat{p}$ = 247/300 = .823
SE($\hat{p}$) = $\sqrt{\hat{p}(1-\hat{p})/n}$ = sqrt(.823*(1-.823)/300) = .0220
M = 2.58
Can you see why M = 2.58 using the Normal table?
So 99% CI = .823 +/- 2.58*.0220 = (.767, .880)
We are 99% confident that the true proportion of those protected by the vaccine is between 76.7% and 88.0%.
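
As a hint for the Normal-table question: a 99% interval leaves 0.5% in each tail, so M is the standard normal value with 99.5% of the area below it. The sketch below looks that value up with Python's `statistics.NormalDist` in place of a printed table and then repeats the calculation. The helper name `multiplier` is hypothetical.

```python
import math
from statistics import NormalDist


def multiplier(confidence: float) -> float:
    """z value leaving (1 - confidence) / 2 in each tail of the standard normal."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


m99 = multiplier(0.99)  # close to the 2.58 used on the slide

# Example 2: 300 patients, 53 infected, so 247 protected.
n = 300
protected = n - 53
p_hat = protected / n
se = math.sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - m99 * se, p_hat + m99 * se

print(f"M = {m99:.3f}")
print(f"99% CI: ({lower:.3f}, {upper:.3f})")
```

The same helper gives about 1.96 for 95% confidence, which is the "about 2" multiplier from slide 15.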

18 Width of a Confidence Interval is affected by:
n: as the sample size increases, the standard error of $\hat{p}$ decreases and the confidence interval gets narrower. So a larger sample size gives us a more precise estimate of p.
M: as the confidence level increases, the multiplier M increases, leading to a wider confidence interval.
So, if we want to control the length of the C.I. we can adjust the confidence level or the sample size...
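
Both effects can be seen numerically. In the sketch below the width is taken to be twice the Margin of Error, i.e. 2 times M times $\sqrt{\hat{p}(1-\hat{p})/n}$. The sample proportion 0.5, the sample sizes and the function name `ci_width` are hypothetical choices for illustration.

```python
import math


def ci_width(p_hat: float, n: int, multiplier: float) -> float:
    """Full width of p_hat ± multiplier * SE, i.e. twice the Margin of Error."""
    return 2 * multiplier * math.sqrt(p_hat * (1 - p_hat) / n)


p_hat = 0.5  # hypothetical sample proportion

# Effect of n: quadrupling the sample size halves the width.
for n in (100, 400, 1600):
    print(f"n = {n:4d}, M = 2    : width = {ci_width(p_hat, n, 2):.3f}")

# Effect of M: higher confidence means a larger multiplier and a wider interval.
for level, m in (("90%", 1.645), ("95%", 1.96), ("99%", 2.576)):
    print(f"n =  100, {level} CI : width = {ci_width(p_hat, 100, m):.3f}")
```

Because the width shrinks with the square root of n, halving the width takes four times as many observations.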

