Two-Sample Inference Procedures with Means

Two-Sample Procedures with means
The goal of these inference procedures is to compare the responses to two treatments or to compare the characteristics of two populations. We have INDEPENDENT samples from each treatment or population.

Assumptions:
Have two SRSs from the populations, or two randomly assigned treatment groups
Samples are independent
Both distributions are approximately normal: have large sample sizes, or graph BOTH sets of data
σ's known/unknown

Note: confidence interval statements
Matched pairs: refer to the "mean difference"
Two-sample: refer to the "difference of means"

Hypothesis Statements:
H0: μ1 = μ2, or equivalently H0: μ1 - μ2 = 0
Ha: μ1 < μ2, or equivalently Ha: μ1 - μ2 < 0
Ha: μ1 > μ2, or equivalently Ha: μ1 - μ2 > 0
Ha: μ1 ≠ μ2, or equivalently Ha: μ1 - μ2 ≠ 0
Be sure to define BOTH μ1 and μ2!

Hypothesis Test: Since we usually assume H0 is true, the hypothesized difference μ1 - μ2 equals 0, so that term can usually be left out of the test statistic.

Hypothesis statements:
H0: p1 = p2, or equivalently H0: p1 - p2 = 0
Ha: p1 > p2, or equivalently Ha: p1 - p2 > 0
Ha: p1 < p2, or equivalently Ha: p1 - p2 < 0
Ha: p1 ≠ p2, or equivalently Ha: p1 - p2 ≠ 0
Be sure to define both p1 & p2!

Formula for Hypothesis test:
Usually p1 – p2 =0

Remember: We will be interested in the difference of means, so we will use this to find standard error.

Suppose we have a population of adult men with a mean height of 71 inches and standard deviation of 2.6 inches. We also have a population of adult women with a mean height of 65 inches and standard deviation of 2.3 inches. Assume heights are normally distributed. Describe the distribution of the difference in heights between males and females (male - female).
Normal distribution with μ_(x-y) = 71 - 65 = 6 inches & σ_(x-y) = √(2.6² + 2.3²) = 3.471 inches

[Figure: normal curves for male heights (mean 71) and female heights (mean 65), and for the difference male - female (mean 6, σ = 3.471)]

a) What is the probability that the height of a randomly selected man is at most 5 inches taller than the height of a randomly selected woman?
P((xM - xF) < 5) = normalcdf(-∞, 5, 6, 3.471) = .3866
b) What is the 70th percentile for the difference (male - female) in heights of a randomly selected man & woman?
(xM - xF) = invNorm(.7, 6, 3.471) = 7.82
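The calculator steps above can be reproduced outside a calculator. The following is a minimal sketch using Python's standard library; the variable names are illustrative, and `NormalDist.cdf` and `NormalDist.inv_cdf` stand in for `normalcdf` and `invNorm`.

```python
from math import sqrt
from statistics import NormalDist

# Population parameters from the example (inches).
mean_male, sd_male = 71, 2.6
mean_female, sd_female = 65, 2.3

# Difference (male - female) of two independent normal variables:
# the means subtract and the variances add.
mean_diff = mean_male - mean_female
sd_diff = sqrt(sd_male**2 + sd_female**2)
diff = NormalDist(mu=mean_diff, sigma=sd_diff)

# a) P(difference < 5), the analogue of normalcdf(-inf, 5, 6, 3.471)
prob_at_most_5 = diff.cdf(5)

# b) 70th percentile, the analogue of invNorm(.7, 6, 3.471)
percentile_70 = diff.inv_cdf(0.70)

print(round(sd_diff, 3), round(prob_at_most_5, 4), round(percentile_70, 2))
```

The variances add even though the means subtract, which is why the difference is more spread out than either population.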

Formulas: Since in real life we will NOT know both σ's, we will do t-procedures.
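The slide's own formula is not preserved in this transcript. As a sketch, the standard two-sample t statistic is written below; the notation (sample means, sample standard deviations s1 and s2, sample sizes n1 and n2) is assumed here rather than taken from the slide.

```latex
t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}
         {\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
```

When H0 states that the two means are equal, the hypothesized difference in the numerator is 0 and drops out.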

Degrees of Freedom
Option 1: use the smaller of the two values n1 - 1 and n2 - 1. This will produce conservative results: higher p-values & lower confidence.
Option 2: approximation used by technology. Calculator does this automatically!
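The two options can be compared side by side. The sketch below assumes that the "approximation used by technology" is the Welch-Satterthwaite formula, which is the usual choice in calculators and software; the summary statistics are hypothetical and chosen only for illustration.

```python
def conservative_df(n1, n2):
    """Option 1: the smaller of n1 - 1 and n2 - 1."""
    return min(n1 - 1, n2 - 1)


def welch_df(s1, n1, s2, n2):
    """Option 2 (assumed): Welch-Satterthwaite approximation."""
    v1 = s1**2 / n1  # estimated variance of the first sample mean
    v2 = s2**2 / n2  # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))


# Hypothetical sample standard deviations and sizes (not from the slides).
s1, n1 = 4.0, 10
s2, n2 = 6.0, 15

df_option1 = conservative_df(n1, n2)
df_option2 = welch_df(s1, n1, s2, n2)
print(df_option1, round(df_option2, 2))
```

Option 2 always lands between the Option 1 value and n1 + n2 - 2, so Option 1 gives the wider interval and the larger p-value.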

Confidence intervals:
The square-root term in the interval formula is called the standard error.
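The interval formula itself is not preserved in this transcript. As a sketch in assumed notation (sample means, sample standard deviations s1 and s2, sample sizes n1 and n2, critical value t*), the standard two-sample t interval for the difference of means is:

```latex
(\bar{x}_1 - \bar{x}_2) \pm t^{*} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
```

The square root is the standard error of the difference in sample means, and t* times that standard error is the margin of error.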

Pooled procedures: Used for two populations with the same variance
When you pool, you average the two sample variances to estimate the common population variance. DO NOT use on AP Exam! We do NOT know the variances of the populations, so ALWAYS tell the calculator NO for pooling!

Two competing headache remedies claim to give fast-acting relief. An experiment was performed to compare the mean lengths of time required for bodily absorption of brand A and brand B. Assume the absorption time is normally distributed. Twelve people were randomly selected and given an oral dosage of brand A. Another 12 were randomly selected and given an equal dosage of brand B. The length of time in minutes for the drugs to reach a specified level in the blood was recorded. The results follow:
[Table: mean, SD, and n for Brand A and Brand B; values not preserved]
Describe the shape & standard error for the sampling distribution of the differences in the mean speed of absorption.

Describe the sampling distribution of the differences in the mean speed of absorption.
Normal distribution with S.E. = 3.316
Find a 95% confidence interval for the difference in mean lengths of time required for bodily absorption of each brand.

Assumptions:
Have 2 independent randomly assigned treatments
Given the absorption rate is normally distributed
σ's unknown
From calculator df = 21.53; use t* for df = 21 & 95% confidence level
We are 95% confident that the true difference in mean lengths of time required for bodily absorption of each brand is between -5.685 minutes and [value missing] minutes.

The length of time in minutes for the drugs to reach a specified level in the blood was recorded. The results follow:
[Table: mean, SD, and n for Brand A and Brand B; values not preserved]
Is there sufficient evidence that these drugs differ in the speed at which they enter the blood stream?

State assumptions!
Have 2 independent randomly assigned treatments
Given the absorption rate is normally distributed
σ's unknown
Hypotheses & define variables!
H0: μA = μB
Ha: μA ≠ μB
Where μA is the true mean absorption time for Brand A & μB is the true mean absorption time for Brand B
Formula & calculations
Conclusion in context
Since p-value > α, I fail to reject H0. There is not sufficient evidence to suggest that these drugs differ in the speed at which they enter the blood stream.

Suppose that the sample mean of Brand B is 16.5. Is Brand B then faster? No, I would still fail to reject the null hypothesis.

Robustness:
Two-sample procedures are more robust than one-sample procedures.
BEST to have equal sample sizes! (but not necessary)

A modification has been made to the process for producing a certain type of time-zero film (film that begins to develop as soon as the picture is taken). Because the modification involves extra cost, it will be incorporated only if sample data indicate that the modification decreases true average development time by more than 1 second. Should the company incorporate the modification?
[Data table: development times for Original and Modified film; values not preserved]

Assume we have 2 independent SRSs of film
Both distributions are approximately normal due to approximately symmetrical boxplots
σ's unknown
H0: μO - μM = 1
Ha: μO - μM > 1
Where μO is the true mean developing time for original film & μM is the true mean developing time for modified film
Since p-value > α, I fail to reject H0. There is not sufficient evidence to suggest that the company should incorporate the modification.

Two-Sample Proportions Inference

Assumptions:
Two independent SRSs from populations (or randomly assigned treatments)
Populations at least 10n
Normal approximation for both

Sampling Distributions for the difference in proportions
When tossing pennies, the probability of the coin landing on heads is 0.5. However, when spinning the coin, the probability of the coin landing on heads is 0.4. Assume 25 trials were completed with each method.

Looking at the sampling distribution of the difference in sample proportions:
What is the mean of the difference in sample proportions (flip - spin)?
Can the sampling distribution of the difference in sample proportions (flip - spin) be approximated by a normal distribution?
Yes, since n1p1 = 12.5, n1(1-p1) = 12.5, n2p2 = 10, n2(1-p2) = 15, so all are at least 5.
What is the probability that the difference in proportions (flipped - spun) is at least .25?
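The answers to the first and last questions are not preserved in this transcript. A minimal sketch of how they could be computed under the normal approximation follows; the variable names are illustrative and the standard deviation formula assumes the two sets of trials are independent.

```python
from math import sqrt
from statistics import NormalDist

p_flip, p_spin = 0.5, 0.4  # probabilities of heads from the example
n_flip = n_spin = 25       # trials completed with each method

# Mean and standard deviation of (p-hat flip - p-hat spin).
mean_diff = p_flip - p_spin
sd_diff = sqrt(p_flip * (1 - p_flip) / n_flip + p_spin * (1 - p_spin) / n_spin)

# Normal approximation check: all expected counts at least 5.
counts = [n_flip * p_flip, n_flip * (1 - p_flip),
          n_spin * p_spin, n_spin * (1 - p_spin)]
normal_ok = all(c >= 5 for c in counts)

# P(difference >= 0.25) under the normal approximation.
prob_at_least_25 = 1 - NormalDist(mean_diff, sd_diff).cdf(0.25)

print(round(mean_diff, 2), round(sd_diff, 2), normal_ok, round(prob_at_least_25, 3))
```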

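Formula for confidence interval:
The term added to and subtracted from the difference in sample proportions is the margin of error. The square-root term inside it is the standard error.
Note: use p-hat when p is not known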
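The formula these labels were attached to is not preserved in this transcript. As a sketch in assumed notation (sample proportions, sample sizes n1 and n2, critical value z*), the standard interval for a difference in proportions is:

```latex
(\hat{p}_1 - \hat{p}_2) \pm z^{*}
\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}
```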

Example 1: At Community Hospital, the burn center is experimenting with a new plasma compress treatment. A random sample of 316 patients with minor burns received the plasma compress treatment. Of these patients, it was found that 259 had no visible scars after treatment. Another random sample of 419 patients with minor burns received no plasma compress treatment. For this group, it was found that 94 had no visible scars after treatment. What is the shape & standard error of the sampling distribution of the difference in the proportions of people with visible scars between the two groups?
Since n1p1 = 259, n1(1-p1) = 57, n2p2 = 94, n2(1-p2) = 325 and all > 5, the distribution of the difference in proportions is approximately normal.

Example 1 (continued): What is a 95% confidence interval of the difference in proportion of people who had no visible scars between the plasma compress treatment & control group?

Assumptions:
Have 2 independent randomly assigned treatment groups
Both distributions are approximately normal since n1p1 = 259, n1(1-p1) = 57, n2p2 = 94, n2(1-p2) = 325 and all > 5
Population of burn patients is at least 7350. Since these are all burn patients, we can add the sample sizes: 316 + 419 = 735. If the populations are not the same, you MUST list them separately.
We are 95% confident that the true difference in the proportion of people who had no visible scars between the plasma compress treatment & control group is between 53.7% and 65.4%.
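The interval reported above can be checked with a short calculation. This is a minimal sketch of the usual unpooled two-proportion z interval; the function name `two_prop_ci` is illustrative, not something from the slides.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_ci(x1, n1, x2, n2, confidence=0.95):
    """Unpooled z interval for p1 - p2 from success counts and sample sizes."""
    p1, p2 = x1 / n1, x2 / n2
    standard_error = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * standard_error
    diff = p1 - p2
    return diff - margin, diff + margin


# Counts of patients with no visible scars: treatment group, then control group.
low, high = two_prop_ci(259, 316, 94, 419)
print(round(low, 3), round(high, 3))
```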

Example 2: Suppose that researchers want to estimate the difference in proportions of people who are against the death penalty in Texas & in California. If the two sample sizes are the same, what size sample is needed to be within 2% of the true difference at 90% confidence?
Since both n's are the same size, you have common denominators, so add!
n = 3383
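The calculation behind n = 3383 is not shown in the transcript. The sketch below assumes the conservative guess p = 0.5 for both states and the table value z* = 1.645 for 90% confidence; under those assumptions it reproduces the slide's answer. The function name is illustrative.

```python
from math import ceil


def n_per_group(margin, z_star, p1=0.5, p2=0.5):
    """Equal sample size per group so that z* x SE is at most the margin.

    Solves margin = z* x sqrt((p1(1-p1) + p2(1-p2)) / n) for n and rounds up.
    """
    return ceil(z_star**2 * (p1 * (1 - p1) + p2 * (1 - p2)) / margin**2)


# Table value z* = 1.645 for 90% confidence, p = 0.5 assumed for both states.
n = n_per_group(margin=0.02, z_star=1.645)
print(n)
```

Using p = 0.5 maximizes p(1 - p), so the resulting sample size is large enough whatever the true proportions are.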

Example 3: Researchers comparing the effectiveness of two pain medications randomly selected a group of patients who had been complaining of a certain kind of joint pain. They randomly divided these people into two groups, and then administered the painkillers. Of the 112 people in the group who received medication A, 84 said this pain reliever was effective. Of the 108 people in the other group, 66 reported that pain reliever B was effective. (BVD, p. 435)
a) Construct separate 95% confidence intervals for the proportion of people who reported that the pain reliever was effective. Based on these intervals, how do the proportions of people who reported pain relief with medication A or medication B compare?
CIA = (.67, .83) CIB = (.52, .70)
Since the intervals overlap, it appears that there is no difference in the proportion of people who reported pain relief between the two medicines.
b) Construct a 95% confidence interval for the difference in the proportions of people who may find these medications effective.
CI = (0.017, 0.261)
Since zero is not in the interval, there is a difference in the proportion of people who reported pain relief between the two medicines.
SO, which is correct?
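All three intervals in this example can be reproduced from the counts. The following is a minimal sketch using the usual z intervals; the helper names `one_prop_ci` and `diff_prop_ci` are illustrative.

```python
from math import sqrt
from statistics import NormalDist

Z_STAR = NormalDist().inv_cdf(0.975)  # critical value for 95% confidence


def one_prop_ci(x, n):
    """z interval for a single proportion."""
    p = x / n
    margin = Z_STAR * sqrt(p * (1 - p) / n)
    return p - margin, p + margin


def diff_prop_ci(x1, n1, x2, n2):
    """Unpooled z interval for the difference p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    margin = Z_STAR * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) - margin, (p1 - p2) + margin


ci_a = one_prop_ci(84, 112)               # medication A
ci_b = one_prop_ci(66, 108)               # medication B
ci_diff = diff_prop_ci(84, 112, 66, 108)  # A minus B

for name, (low, high) in [("A", ci_a), ("B", ci_b), ("A - B", ci_diff)]:
    print(name, round(low, 3), round(high, 3))
```

The standard error of the difference is smaller than the sum of the two separate standard errors, which is why the separate intervals can overlap while the interval for the difference still excludes zero.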

Since we assume that the population proportions are equal in the null hypothesis, the variances are equal. Therefore, we pool the variances! Do not do on AP exam!!!

Formula for Hypothesis test:
Usually p1 – p2 =0

Example 4: A forest in Oregon has an infestation of spruce moths. In an effort to control the moth, one area has been regularly sprayed from airplanes. In this area, a random sample of 495 spruce trees showed that 81 had been killed by moths. A second nearby area receives no treatment. In this area, a random sample of 518 spruce trees showed that 92 had been killed by the moth. Do these data indicate that the proportion of spruce trees killed by the moth is different for these areas?

Assumptions:
Have 2 independent SRSs of spruce trees
Both distributions are approximately normal since n1p1 = 81, n1(1-p1) = 414, n2p2 = 92, n2(1-p2) = 426 and all > 5
Population of spruce trees is at least 10,130.
H0: p1 = p2
Ha: p1 ≠ p2
where p1 is the true proportion of trees killed by moths in the treated area & p2 is the true proportion of trees killed by moths in the untreated area
P-value = [value missing]; α = 0.05
Since p-value > α, I fail to reject H0. There is not sufficient evidence to suggest that the proportion of spruce trees killed by the moth is different for these areas.
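The test statistic and p-value are not preserved in this transcript. A minimal sketch of the two-sided two-proportion z test follows; it assumes the pooled standard error that the slide on equal population proportions under H0 describes, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z_test(x1, n1, x2, n2):
    """Two-sided z test of H0: p1 = p2 using the pooled sample proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Trees killed by moths: treated area first, then untreated area.
z, p_value = two_prop_z_test(81, 495, 92, 518)
alpha = 0.05
reject_h0 = p_value < alpha
print(round(z, 2), round(p_value, 2), reject_h0)
```

The p-value computed this way is well above 0.05, in line with the slide's decision to fail to reject H0.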
