Compare two populations by doing inference about the difference between two sample proportions.
We want to compare two populations, or the responses to two treatments, based on two independent samples. Ex: Do diet pills really work? 100 adults took a daily diet pill, while 100 different adults were given a placebo. After a month, the proportion of people who lost weight using the diet pill was compared with the proportion of adults who lost weight while taking the placebo. (Remember, these are 2 independent groups of adults.)
Example 1: To study the long-term effects of preschool programs for poor children, the Ed. Research Foundation has followed two groups of Michigan children since early childhood. One group of 62 attended preschool as 3- and 4-year-olds. This is a sample from population 2, poor children who attended preschool. A control group of 61 children from the same area and similar backgrounds represents population 1, poor children with no preschool. Thus, the sample sizes are n₁ = 61 and n₂ = 62. The response variable is the need for social services as adults. In the past ten years, 38 of the preschool sample and 49 of the control sample have needed social services. Therefore: p̂₁ = 49/61 and p̂₂ = 38/62. (You don't need to convert to decimals.)
Conditions:
1- Both are random samples
2- Both populations are at least 10X the sample size
3- The samples are independent
4- n₁p₁ ≥ 10, n₁(1-p₁) ≥ 10, n₂p₂ ≥ 10, n₂(1-p₂) ≥ 10
Conditions checked for Example 1:
1- Both are random samples
2- Both populations are at least 10X the sample size
3- The samples are independent
4- Control: 49 ≥ 10 and 12 ≥ 10. Preschool: 38 ≥ 10 and 24 ≥ 10.
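The sampling distribution of $\hat{p}_1 - \hat{p}_2$

The mean of $\hat{p}_1 - \hat{p}_2$ is $p_1 - p_2$.

The variance of the difference is the sum of the variances of $\hat{p}_1$ and $\hat{p}_2$:

$$\sigma^2_{\hat{p}_1 - \hat{p}_2} = \frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}$$

To get the standard deviation, take the square root:

$$\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}$$

***** WE ADD THE variances, NOT THE standard deviations.

When the samples are large, the distribution of $\hat{p}_1 - \hat{p}_2$ is approximately normal.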
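To make the "add the variances" rule concrete, here is a minimal Python sketch. The function name `sd_of_difference` and the population proportions 0.80 and 0.60 are hypothetical values chosen only for illustration; they are not from the examples in these notes.

```python
from math import sqrt

def sd_of_difference(p1, n1, p2, n2):
    """Standard deviation of p1_hat - p2_hat for two independent samples."""
    var1 = p1 * (1 - p1) / n1   # variance of the first sample proportion
    var2 = p2 * (1 - p2) / n2   # variance of the second sample proportion
    return sqrt(var1 + var2)    # add the variances, then take the square root

# Hypothetical population proportions and sample sizes (illustration only).
p1, n1 = 0.80, 61
p2, n2 = 0.60, 62

correct = sd_of_difference(p1, n1, p2, n2)
# Common mistake: adding the two standard deviations instead of the variances.
wrong = sqrt(p1 * (1 - p1) / n1) + sqrt(p2 * (1 - p2) / n2)

print(f"sqrt of summed variances: {correct:.4f}")
print(f"sum of standard deviations (incorrect): {wrong:.4f}")
```

Adding the standard deviations always gives a value that is too large, which is why the variances are combined first.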
Draw an SRS of size n₁ from a population having proportion p₁ of successes and draw an independent SRS of size n₂ from another population having proportion p₂ of successes. When n₁ and n₂ are large, an approximate level C confidence interval for p₁ - p₂ is:

$$(\hat{p}_1 - \hat{p}_2) \pm z^* \cdot SE$$

The standard error SE is:

$$SE = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$

z* is the upper (1-C)/2 standard normal critical value.
Name of interval: 2 proportion z-interval

Assumptions: SAME AS ABOVE!!
-they're independent
-both are random samples
-both populations are at least 10x the sample size
-49 ≥ 10, 12 ≥ 10
-38 ≥ 10, 24 ≥ 10

Population   Pop. Description   Sample Size   Number who need s.s.
1            Control            61            49
2            Preschool          62            38
Calculate by hand: the 95% confidence interval.
p̂₁ - p̂₂ = 49/61 - 38/62 = 0.1904 and SE = 0.08, so
0.1904 +/- 1.96(0.08) = 0.1904 +/- 0.157 = (0.033, 0.347)
We are 95% confident that the difference (control minus preschool) in the true proportions of adults who need social services is between 0.03 and 0.35.
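The hand calculation can be checked with a short Python sketch. The function name `two_prop_z_interval` is a hypothetical helper written for these notes, not a library routine; the critical value z* is taken from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_interval(x1, n1, x2, n2, conf=0.95):
    """Approximate level-conf interval for p1 - p2 (unpooled standard error)."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    diff = p1_hat - p2_hat
    # Standard error: add the two estimated variances, then take the square root.
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    # z* is the upper (1 - conf)/2 critical value of the standard normal.
    z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    margin = z_star * se
    return diff - margin, diff + margin

# Example 1: 49 of 61 controls and 38 of 62 preschool children needed social services.
low, high = two_prop_z_interval(49, 61, 38, 62, conf=0.95)
print(f"({low:.3f}, {high:.3f})")  # (0.033, 0.347)
```

The same helper should work for any confidence level by changing `conf`, for example `conf=0.90` for a 90% interval.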
Example 2: While her husband spent 2½ hours picking out new speakers, a statistician decided to determine whether the percent of men who enjoy shopping for electronic equipment is higher than the percent of women who enjoy shopping for electronic equipment. The population was Saturday afternoon shoppers. Out of 67 men, 24 said they enjoyed the activity. 8 of the 24 women surveyed claimed to enjoy the activity. Are the results of the survey significant?
p₁ = true proportion of men who enjoy shopping for electronic equipment
p₂ = true proportion of women who enjoy shopping for electronic equipment

H₀: p₁ = p₂
Ha: p₁ > p₂

Assumptions:
-they're independent
-both are random samples
-both populations are at least 10x the sample size
-Men: 24 ≥ 5 and 43 ≥ 5. Women: 8 ≥ 5 and 16 ≥ 5.
(Remember, if the counts are not all ≥ 10, you can use 5.)
2 proportion z-test
α = 0.05
P(z > 0.219) = 0.4133
Since p > α (0.4133 > 0.05), the result is not statistically significant. Therefore we do not reject H₀. There is not enough evidence to say that the proportion of men who like shopping for electronics is greater than the proportion of women who like shopping for electronics.
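The test statistic and P-value can be reproduced with a short Python sketch. It assumes the usual pooled standard error under H₀: p₁ = p₂ (pooled p̂ = 32/91 here); the notes do not show this step, but the pooled version does reproduce z = 0.219. The function name `two_prop_z_test_greater` is a hypothetical helper, not a library routine.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_test_greater(x1, n1, x2, n2):
    """One-sided two-proportion z-test of H0: p1 = p2 against Ha: p1 > p2."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Under H0 both samples share one proportion, so pool the successes.
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    # Upper-tail area, because Ha says p1 is greater than p2.
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value

# Example 2: 24 of 67 men and 8 of 24 women said they enjoyed the activity.
z, p_value = two_prop_z_test_greater(24, 67, 8, 24)
print(f"z = {z:.3f}, P-value = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "do not reject H0")
```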
Example 3: We are interested in determining whether or not Spark notes is a reasonable substitute for actually reading a novel for English class. Many educators believe that students who rely on Spark notes do not do as well as those who read the novel and do not rely on Spark notes. Give a 90% confidence interval for the difference in proportions of those who do well with and without relying on Spark notes.

                              A- or higher   B+ or lower
Reliant on Spark notes        19             15
Not reliant on Spark notes    29             10
Two proportion z-interval

Assumptions:
-they're independent
-both are random samples
-both populations are at least 10x the sample size
-19 ≥ 10, 15 ≥ 10, 29 ≥ 10, 10 ≥ 10

p̂₁ - p̂₂ = 19/34 - 29/39
-0.18475 +/- 0.18125 = (-0.366, -0.0035)
We are 90% confident that the difference (reliant minus not reliant) in the true proportions of students who do well (A- or higher) is between -0.366 and -0.0035.