St. Edward’s University

Slides by John Loucks, St. Edward’s University

Chapter 11 Comparisons Involving Proportions and a Test of Independence
Inferences About the Difference Between Two Population Proportions
Hypothesis Test for Proportions of a Multinomial Population
Test of Independence

Inferences About the Difference Between Two Population Proportions
Interval Estimation of p1 - p2
Hypothesis Tests About p1 - p2

Inferences About the Difference Between Two Population Proportions
Let:
p1 denote the proportion for population 1
p2 denote the proportion for population 2
To make an inference about p1 - p2, we will select two independent random samples consisting of n1 units from population 1 and n2 units from population 2.
Let:
\(\bar{p}_1\) denote the sample proportion for population 1
\(\bar{p}_2\) denote the sample proportion for population 2

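Sampling Distribution of \(\bar{p}_1 - \bar{p}_2\)
Expected Value: \( E(\bar{p}_1 - \bar{p}_2) = p_1 - p_2 \)
Standard Deviation (Standard Error): \( \sigma_{\bar{p}_1 - \bar{p}_2} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}} \)
where:
n1 = size of sample taken from population 1
n2 = size of sample taken from population 2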

If the sample sizes are large, the sampling distribution of \(\bar{p}_1 - \bar{p}_2\) can be approximated by a normal probability distribution.
The sample sizes are sufficiently large if all of these conditions are met:
n1p1 > 5
n1(1 - p1) > 5
n2p2 > 5
n2(1 - p2) > 5


Interval Estimation of p1 - p2
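Interval Estimate: \( \bar{p}_1 - \bar{p}_2 \pm z_{\alpha/2} \sqrt{\frac{\bar{p}_1(1 - \bar{p}_1)}{n_1} + \frac{\bar{p}_2(1 - \bar{p}_2)}{n_2}} \)
where:
Point Estimate is \( \bar{p}_1 - \bar{p}_2 \)
Margin of Error is \( z_{\alpha/2} \sqrt{\frac{\bar{p}_1(1 - \bar{p}_1)}{n_1} + \frac{\bar{p}_2(1 - \bar{p}_2)}{n_2}} \)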

Example: Market Research Associates
Market Research Associates is conducting research to evaluate the effectiveness of a client’s new advertising campaign. Before the new campaign began, a telephone survey of 150 households in the test market area showed 60 households “aware” of the client’s product. The new campaign has been initiated with TV and newspaper advertisements running for three weeks.

A survey conducted immediately after the new campaign showed 120 of 250 households “aware” of the client’s product. Do the data support the position that the advertising campaign has increased awareness of the client’s product?

Point Estimator of the Difference Between Two Population Proportions
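p1 = proportion of the population of households “aware” of the product after the new campaign
p2 = proportion of the population of households “aware” of the product before the new campaign
\(\bar{p}_1\) = sample proportion of households “aware” of the product after the new campaign
\(\bar{p}_2\) = sample proportion of households “aware” of the product before the new campaign
\( \bar{p}_1 - \bar{p}_2 = \frac{120}{250} - \frac{60}{150} = .48 - .40 = .08 \)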

For = .05, z.025 = 1.96: (.0510) Hence, the 95% confidence interval for the difference in before and after awareness of the product is -.02 to +.18.

Excel Formula Worksheet

Row  A     B     C                 D                                   E
1    Sur2  Sur1                    Survey 2 (from Popul.1)             Survey 1 (from Popul.2)
2    No    Yes   Sample Size       250                                 150
3                No. of "Yes"      =COUNTIF(A2:A251,"Yes")             =COUNTIF(B2:B151,"Yes")
4                Samp. Propor.     =D3/D2                              =E3/E2
5
6                Confid. Coeff.    0.95
7                Lev. Of Signif.   =1-D6
8                z Value           =NORM.S.INV(1-D7/2)
9
10               Std. Error        =SQRT(D4*(1-D4)/D2+E4*(1-E4)/E2)
11               Marg. of Error    =D8*D10
12
13               Pt. Est. of Diff. =D4-E4
14               Lower Limit       =D13-D11
15               Upper Limit       =D13+D11

Note: Not all rows are shown.

Excel Value Worksheet

Row  A     B     C                 D                         E
1    Sur2  Sur1                    Survey 2 (from Popul.1)   Survey 1 (from Popul.2)
2    No    Yes   Sample Size       250                       150
3                No. of "Yes"      120                       60
4                Samp. Propor.     0.48                      0.40
5
6                Confid. Coeff.    0.95
7                Lev. Of Signif.   0.05
8                z Value           1.960
9
10               Std. Error        0.0510
11               Marg. of Error    0.0999
12
13               Pt. Est. of Diff. 0.080
14               Lower Limit       -0.020
15               Upper Limit       0.180

Note: Not all rows are shown.
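The interval estimate from the worksheet can also be checked outside Excel. The following is a minimal Python sketch, not part of the original slides; the function name `diff_proportion_interval` and its arguments are illustrative assumptions, and only the standard library is used.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportion_interval(x1, n1, x2, n2, confidence=0.95):
    """Interval estimate of p1 - p2 from two independent samples.

    x1, x2 are the counts of the response of interest; n1, n2 are sample sizes.
    (Hypothetical helper written for this example.)
    """
    p1_bar = x1 / n1                      # sample proportion, population 1
    p2_bar = x2 / n2                      # sample proportion, population 2
    point_estimate = p1_bar - p2_bar
    # Unpooled standard error, as used for interval estimation
    std_error = sqrt(p1_bar * (1 - p1_bar) / n1 + p2_bar * (1 - p2_bar) / n2)
    # z value with alpha/2 in the upper tail of the standard normal
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * std_error
    return point_estimate, std_error, margin, (point_estimate - margin, point_estimate + margin)


# Market Research Associates: 120 of 250 aware after, 60 of 150 aware before
point, std_error, margin, (lower, upper) = diff_proportion_interval(120, 250, 60, 150)
print(f"point estimate = {point:.3f}")
print(f"standard error = {std_error:.4f}")
print(f"margin of error = {margin:.4f}")
print(f"95% interval = ({lower:.3f}, {upper:.3f})")
```

Because the interval contains 0, it is consistent with the later hypothesis test, which does not reject H0 at the .05 level.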

Hypothesis Tests about p1 - p2
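Hypotheses
We focus on tests involving no difference between the two population proportions (i.e., p1 = p2).
Left-tailed: \( H_0: p_1 - p_2 \ge 0 \), \( H_a: p_1 - p_2 < 0 \)
Right-tailed: \( H_0: p_1 - p_2 \le 0 \), \( H_a: p_1 - p_2 > 0 \)
Two-tailed: \( H_0: p_1 - p_2 = 0 \), \( H_a: p_1 - p_2 \ne 0 \)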

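Pooled Estimate of Standard Error of \(\bar{p}_1 - \bar{p}_2\)
\( \sigma_{\bar{p}_1 - \bar{p}_2} = \sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} \)
where: \( \bar{p} = \frac{n_1 \bar{p}_1 + n_2 \bar{p}_2}{n_1 + n_2} \)

Test Statistic
\( z = \frac{\bar{p}_1 - \bar{p}_2}{\sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \)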

Example: Market Research Associates Can we conclude, using a .05 level of significance, that the proportion of households aware of the client’s product increased after the new advertising campaign?

p-Value and Critical Value Approaches
1. Develop the hypotheses.
\( H_0: p_1 - p_2 \le 0 \)
\( H_a: p_1 - p_2 > 0 \)
p1 = proportion of the population of households “aware” of the product after the new campaign
p2 = proportion of the population of households “aware” of the product before the new campaign

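p-Value and Critical Value Approaches
2. Specify the level of significance. \(\alpha = .05\)
3. Compute the value of the test statistic.
\( \bar{p} = \frac{250(.48) + 150(.40)}{250 + 150} = \frac{180}{400} = .45 \)
\( \sigma_{\bar{p}_1 - \bar{p}_2} = \sqrt{.45(.55)\left(\frac{1}{250} + \frac{1}{150}\right)} = .0514 \)
\( z = \frac{(.48 - .40) - 0}{.0514} = \frac{.08}{.0514} = 1.56 \)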

p-Value Approach
4. Compute the p-value. For z = 1.56, the p-value = .0594.
5. Determine whether to reject H0. Because p-value > \(\alpha\) = .05, we cannot reject H0. We cannot conclude that the proportion of households aware of the client’s product increased after the new campaign.

Critical Value Approach
4. Determine the critical value and rejection rule. For \(\alpha\) = .05, \(z_{.05}\) = 1.645. Reject H0 if z > 1.645.
5. Determine whether to reject H0. Because 1.56 < 1.645, we cannot reject H0. We cannot conclude that the proportion of households aware of the client’s product increased after the new campaign.

Excel Formula Worksheet

Row  A     B     C                     D                                E
1    Sur2  Sur1                        Survey 2 (from Popul.1)          Survey 1 (from Popul.2)
2    No    Yes   Sample Size           =COUNTA(A2:A251)                 =COUNTA(B2:B151)
3                Resp. of Interest
4                Count for Resp.       =COUNTIF(A2:A251,D3)             =COUNTIF(B2:B151,E3)
5                Sample Propor.        =D4/D2                           =E4/E2
6
7                Hypoth. Value
8                Point Est. of Diff.   =D5-E5
9
10               Pooled Est. of p      =(D2*D5+E2*E5)/(D2+E2)
11               Standard Error        =SQRT(D10*(1-D10)*(1/D2+1/E2))
12               Test Statistic        =(D8-D7)/D11
13
14               p-Value (lower tail)  =NORM.S.DIST(D12,TRUE)
15               p-Value (upper tail)  =1-NORM.S.DIST(D12,TRUE)
16               p-Value (two tail)    =2*MIN(D14,D15)

Note: Not all rows are shown.

Excel Value Worksheet

Row  A     B     C                     D                         E
1    Sur2  Sur1                        Survey 2 (from Popul.1)   Survey 1 (from Popul.2)
2    No    Yes   Sample Size           250                       150
3                Resp. of Interest
4                Count for Resp.       120                       60
5                Sample Propor.        0.48                      0.40
6
7                Hypoth. Value
8                Point Est. of Diff.   0.08
9
10               Pooled Est. of p      0.450
11               Standard Error        0.0514
12               Test Statistic        1.557
13
14               p-Value (lower tail)  0.940
15               p-Value (upper tail)  0.060
16               p-Value (two tail)    0.120

Note: Not all rows are shown.
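The same right-tailed test can be sketched in Python as a cross-check on the worksheet. This is an illustrative sketch rather than the slides' own method of computation; the helper name `pooled_z_test` is an assumption, and the hypothesized difference is taken to be 0.

```python
from math import sqrt
from statistics import NormalDist


def pooled_z_test(x1, n1, x2, n2):
    """z test of H0: p1 - p2 <= 0 using the pooled estimate of p.

    (Hypothetical helper written for this example.)
    """
    p1_bar = x1 / n1
    p2_bar = x2 / n2
    # Pooled estimate: (n1*p1_bar + n2*p2_bar) / (n1 + n2), which equals (x1 + x2) / (n1 + n2)
    pooled = (x1 + x2) / (n1 + n2)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_bar - p2_bar) / std_error
    # Upper-tail p-value, matching Ha: p1 - p2 > 0
    p_upper = 1 - NormalDist().cdf(z)
    return pooled, std_error, z, p_upper


# Market Research Associates: after-campaign sample first, before-campaign sample second
pooled, std_error, z, p_upper = pooled_z_test(120, 250, 60, 150)
alpha = 0.05
print(f"pooled estimate of p = {pooled:.3f}")
print(f"standard error = {std_error:.4f}")
print(f"z = {z:.3f}")
print(f"upper-tail p-value = {p_upper:.4f}")
print("reject H0" if p_upper < alpha else "cannot reject H0")
```

The p-value computed from the unrounded z is slightly larger than the .0594 quoted for z = 1.56, which is why the worksheet shows 0.060; the conclusion is the same either way.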

Hypothesis Test for Proportions of a Multinomial Population
In this case, each element of a population is assigned to one and only one of several classes or categories. Such a population is a multinomial population. The multinomial distribution can be thought of as an extension of the binomial distribution.
On each trial of a multinomial experiment:
One and only one of the outcomes occurs
Each trial is assumed to be independent
The probabilities of the outcomes remain the same for each trial

Hypothesis (Goodness of Fit) Test for Proportions of a Multinomial Population
1. State the null and alternative hypotheses.
H0: The population follows a multinomial distribution with specified probabilities for each of the k categories
Ha: The population does not follow a multinomial distribution with specified probabilities for each of the k categories

Hypothesis (Goodness of Fit) Test for Proportions of a Multinomial Population
2. Select a random sample and record the observed frequency, \(f_i\), for each of the k categories.
3. Assuming H0 is true, compute the expected frequency, \(e_i\), in each category by multiplying the category probability by the sample size.

4. Compute the value of the test statistic.
\( \chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i} \)
where:
\(f_i\) = observed frequency for category i
\(e_i\) = expected frequency for category i
k = number of categories
Note: The test statistic has a chi-square distribution with k - 1 df provided that the expected frequencies are 5 or more for all categories.

5. Rejection rule:
p-value approach: Reject H0 if p-value < \(\alpha\)
Critical value approach: Reject H0 if \( \chi^2 > \chi^2_{\alpha} \)
where \(\alpha\) is the significance level and there are k - 1 degrees of freedom

Multinomial Distribution Goodness of Fit Test
Example: Finger Lakes Homes (A)
Finger Lakes Homes manufactures four models of prefabricated homes: a two-story colonial, a log cabin, a split-level, and an A-frame. To help in production planning, management would like to determine if previous customer purchases indicate that there is a preference in the style selected.

The number of homes sold of each model for 100 sales over the past two years is shown below.
Model     Colonial   Log   Split-Level   A-Frame
# Sold    30         20    35            15

Hypotheses
H0: pC = pL = pS = pA = .25
Ha: The population proportions are not pC = .25, pL = .25, pS = .25, and pA = .25
where:
pC = population proportion that purchase a colonial
pL = population proportion that purchase a log cabin
pS = population proportion that purchase a split-level
pA = population proportion that purchase an A-frame

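Rejection Rule
With \(\alpha\) = .05 and k - 1 = 4 - 1 = 3 degrees of freedom:
Reject H0 if p-value < .05 or \(\chi^2\) > 7.815
[Figure: chi-square distribution with the "Do Not Reject H0" region to the left of 7.815 and the "Reject H0" region to the right.]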

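Expected Frequencies
e1 = .25(100) = 25
e2 = .25(100) = 25
e3 = .25(100) = 25
e4 = .25(100) = 25
Test Statistic
\( \chi^2 = \frac{(30 - 25)^2}{25} + \frac{(20 - 25)^2}{25} + \frac{(35 - 25)^2}{25} + \frac{(15 - 25)^2}{25} = 1 + 1 + 4 + 4 = 10 \)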

Conclusion Using the p-Value Approach
Area in Upper Tail          .10     .05     .025    .01      .005
\(\chi^2\) Value (df = 3)   6.251   7.815   9.348   11.345   12.838
Because \(\chi^2\) = 10 is between 9.348 and 11.345, the area in the upper tail of the distribution is between .025 and .01.
The p-value < \(\alpha\). We can reject the null hypothesis.

Conclusion Using the Critical Value Approach
\(\chi^2\) = 10 > 7.815
We reject, at the .05 level of significance, the assumption that there is no home style preference.


Excel Formula Worksheet

Row  C        D      E                         F          G        H       I
1             Hyp.   Observed                  Expect.             Sq'd.   Sq.Diff./
2    Categ.   Prop.  Frequency                 Freq.      Diff.    Diff.   Exp.Freq.
3    Col.     0.25   =COUNTIF(B2:B101,"Col")   =D3*$E$7   =E3-F3   =G3^2   =H3/F3
4    Log      0.25   =COUNTIF(B2:B101,"Log")   =D4*$E$7   =E4-F4   =G4^2   =H4/F4
5    Split-L  0.25   =COUNTIF(B2:B101,"Spl")   =D5*$E$7   =E5-F5   =G5^2   =H5/F5
6    A-Fr.    0.25   =COUNTIF(B2:B101,"Afr")   =D6*$E$7   =E6-F6   =G6^2   =H6/F6
7    Total           =SUM(E3:E6)                                           =SUM(I3:I6)
8
9    Categories      4
10   Test Statistic  =I7
11   Degr. of Free.  =E9-1
12   p-Value         =CHISQ.DIST.RT(E10,E11)

Note: Columns A-B and some rows are not shown. The values for rows 9-12 are in column E.

Excel Value Worksheet

Row  C        D      E          F        G      H      I
1             Hyp.   Observed   Expect.         Sq'd.  Sq.Diff./
2    Categ.   Prop.  Frequency  Freq.    Diff.  Diff.  Exp.Freq.
3    Col.     0.25   30         25       5      25     1
4    Log      0.25   20         25       -5     25     1
5    Split-L  0.25   35         25       10     100    4
6    A-Fr.    0.25   15         25       -10    100    4
7    Total           100                               10
8
9    Categories      4
10   Test Statistic  10
11   Degr. of Free.  3
12   p-Value         0.0186

Note: Columns A-B and some rows are not shown. The values for rows 9-12 are in column E.
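The goodness of fit calculation can be reproduced in a few lines of Python. The sketch below is an illustration only, not the slides' method: the helper `chi2_upper_tail` is a hypothetical stand-in for Excel's CHISQ.DIST.RT, built from the standard closed-form recurrence for integer degrees of freedom so that no third-party library is needed.

```python
from math import erfc, exp, gamma, sqrt


def chi2_upper_tail(x, df):
    """Upper-tail area of a chi-square distribution with integer df >= 1.

    Starts from the closed form for df = 1 (odd) or df = 2 (even) and adds
    one term per two degrees of freedom. (Hypothetical helper.)
    """
    if df % 2 == 0:
        tail, k = exp(-x / 2), 2
    else:
        tail, k = erfc(sqrt(x / 2)), 1
    while k < df:
        tail += (x / 2) ** (k / 2) * exp(-x / 2) / gamma(k / 2 + 1)
        k += 2
    return tail


# Finger Lakes Homes (A): observed sales and hypothesized proportions under H0
observed = {"Colonial": 30, "Log": 20, "Split-Level": 35, "A-Frame": 15}
hypothesized = {model: 0.25 for model in observed}

n = sum(observed.values())
expected = {model: p * n for model, p in hypothesized.items()}

# Test statistic: sum of (f_i - e_i)^2 / e_i over the k categories
chi_square = sum((observed[m] - expected[m]) ** 2 / expected[m] for m in observed)
df = len(observed) - 1
p_value = chi2_upper_tail(chi_square, df)

print(f"chi-square = {chi_square:.3f}, df = {df}, p-value = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "cannot reject H0")
```

If SciPy is available, a library routine for the chi-square upper tail could replace the hand-written helper; the standard-library version is used here only to keep the example self-contained.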

Test of Independence
Another important application of the chi-square distribution involves using sample data to test for the independence of two variables. To test whether two variables are independent, one sample is selected and crosstabulation is used to summarize the data for the two variables simultaneously.

Test of Independence
1. Set up the null and alternative hypotheses.
H0: The column variable is independent of the row variable
Ha: The column variable is not independent of the row variable
2. Select a random sample and record the observed frequency, \(f_{ij}\), for each cell of the contingency table.
3. Compute the expected frequency, \(e_{ij}\), for each cell.
\( e_{ij} = \frac{(\text{Row } i \text{ Total})(\text{Column } j \text{ Total})}{\text{Sample Size}} \)

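Test of Independence
4. Compute the test statistic.
\( \chi^2 = \sum_{i} \sum_{j} \frac{(f_{ij} - e_{ij})^2}{e_{ij}} \)
5. Determine the rejection rule.
Reject H0 if p-value < \(\alpha\) or \( \chi^2 > \chi^2_{\alpha} \)
where \(\alpha\) is the significance level and, with n rows and m columns, there are (n - 1)(m - 1) degrees of freedom.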

Test of Independence Example: Finger Lakes Homes (B)
Each home sold by Finger Lakes Homes can be classified according to price and to style. Finger Lakes’ manager would like to determine if the price of the home and the style of the home are independent variables.

Test of Independence Example: Finger Lakes Homes (B)
The number of homes sold for each model and price for the past two years is shown below. For convenience, the price of the home is listed as either $200,000 or less or more than $200,000.
Price         Colonial   Log   Split-Level   A-Frame
≤ $200,000    18         6     19            12
> $200,000    12         14    16            3

Test of Independence Hypotheses
H0: Price of the home is independent of the style of the home that is purchased
Ha: Price of the home is not independent of the style of the home that is purchased

Test of Independence Expected Frequencies
Price      Colonial   Log     Split-Level   A-Frame   Total
≤ $200K    16.50      11.00   19.25         8.25      55
> $200K    13.50      9.00    15.75         6.75      45
Total      30         20      35            15        100

Test of Independence Rejection Rule
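With \(\alpha\) = .05 and (2 - 1)(4 - 1) = 3 d.f.,
Reject H0 if p-value < .05 or \(\chi^2\) > 7.815
Test Statistic
\( \chi^2 = \frac{(18 - 16.5)^2}{16.5} + \frac{(6 - 11)^2}{11} + \cdots + \frac{(3 - 6.75)^2}{6.75} = .1364 + 2.2727 + \cdots + 2.0833 = 9.15 \)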

Test of Independence Conclusion Using the p-Value Approach
Area in Upper Tail          .10     .05     .025    .01      .005
\(\chi^2\) Value (df = 3)   6.251   7.815   9.348   11.345   12.838
Because \(\chi^2\) = 9.15 is between 7.815 and 9.348, the area in the upper tail of the distribution is between .05 and .025.
The p-value < \(\alpha\). We can reject the null hypothesis.

Test of Independence Conclusion Using the Critical Value Approach
\(\chi^2\) = 9.15 > 7.815
We reject, at the .05 level of significance, the assumption that the price of the home is independent of the style of home that is purchased.

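Test of Independence Excel Worksheet (showing data)
[Worksheet: one row per home sold, with columns Home, Price ($) (either >200K or <=200K), and Style (Colonial, Log, Split-Level, or A-Frame). For example, row 2 is >200K, Colonial and row 3 is <=200K, Log. Not all rows are shown.]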

Test of Independence Excel Worksheet (showing Pivot Table)
Row  E               F           G     H            I         J
1    Count of Home   Preference
2    Price ($)       Colonial    Log   Split-Lev.   A-Frame   Grand Tot.
3    <=200K          18          6     19           12        55
4    >200K           12          14    16           3         45
5    Grand Total     30          20    35           15        100

Note: Columns A-D are not shown.

Test of Independence Excel Formula Worksheet
Row  E               F            G            H            I            J
1    Count of Home   Preference
2    Price ($)       Colonial     Log          Split-Lev.   A-Frame      Grand Tot.
3    <=200K          18           6            19           12           55
4    >200K           12           14           16           3            45
5    Grand Total     30           20           35           15           100
6
7    Expected Frequencies
8
9                    =F5*J3/J5    =G5*J3/J5    =H5*J3/J5    =I5*J3/J5
10                   =F5*J4/J5    =G5*J4/J5    =H5*J4/J5    =I5*J4/J5
11
     p-Value         =CHISQ.TEST(F3:I4,F9:I10)

Note: Columns A-D are not shown.

Test of Independence Excel Value Worksheet
Row  E               F           G       H            I         J
1    Count of Home   Preference
2    Price ($)       Colonial    Log     Split-Lev.   A-Frame   Grand Tot.
3    <=200K          18          6       19           12        55
4    >200K           12          14      16           3         45
5    Grand Total     30          20      35           15        100
6
7    Expected Frequencies
8
9                    16.50       11.00   19.25        8.25
10                   13.50       9.00    15.75        6.75
11
     p-Value         0.0274

Note: Columns A-D are not shown.
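The test of independence can be sketched in Python in the same way. This is an illustrative sketch, not the slides' method. Two assumptions are worth flagging: the >200K counts for Colonial (12) and A-Frame (3) do not survive in the transcript and are inferred here from the row and column totals, and `chi2_upper_tail` is a hypothetical standard-library helper standing in for Excel's chi-square functions.

```python
from math import erfc, exp, gamma, sqrt


def chi2_upper_tail(x, df):
    """Upper-tail area of a chi-square distribution with integer df >= 1.
    (Hypothetical helper; closed-form recurrence, standard library only.)"""
    if df % 2 == 0:
        tail, k = exp(-x / 2), 2
    else:
        tail, k = erfc(sqrt(x / 2)), 1
    while k < df:
        tail += (x / 2) ** (k / 2) * exp(-x / 2) / gamma(k / 2 + 1)
        k += 2
    return tail


# Finger Lakes Homes (B). Columns: Colonial, Log, Split-Level, A-Frame.
# The 12 and 3 in the second row are inferred from the totals (assumption).
observed = [
    [18, 6, 19, 12],   # <= $200K
    [12, 14, 16, 3],   # >  $200K
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency for each cell: (row total)(column total) / sample size
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi_square = sum(
    (f - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for f, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)
p_value = chi2_upper_tail(chi_square, df)

print("expected frequencies:", expected)
print(f"chi-square = {chi_square:.4f}, df = {df}, p-value = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "cannot reject H0")
```

With the inferred counts, the expected frequencies and p-value agree with the worksheet values, which supports the reconstruction of the two missing cells.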

End of Chapter 11
