AP Statistics Section 13.1 A
Which of two popular drugs, Lipitor or Pravachol, helps lower bad cholesterol more? 4000 people with heart disease were randomly assigned to two treatment groups: Lipitor or Pravachol.
At the end of the study, researchers compared the mean “bad cholesterol levels” for each group. This is a question about comparing two means.
The researchers also compared the proportion of subjects who died, had a heart attack or suffered other serious consequences in the first two years. This is a question about comparing two proportions.
Two-sample problems can arise from a randomized comparative experiment that randomly divides the subjects into two groups and exposes each group to a different treatment. Unlike the matched pairs design studied earlier, there is no matching of the units in the two samples, and the samples can even be of different sizes. Two-sample problems also arise when comparing two different samples randomly selected from two populations.
Conditions for Comparing Two Means
SRS: We have two SRSs from two distinct populations. This allows us to generalize our findings. We measure the same variable for both groups.
Normality: Both populations are Normally distributed. In practice, it is enough that the distributions have similar shapes and that the data have no strong outliers. More on this at the end of the notes.
Independence: The samples are independent. That is, one sample has no influence on the other. Paired observations violate independence, for example. When sampling without replacement from two distinct populations, each population must be at least 10 times as large as the corresponding sample size.
We want to compare the two population means, either by giving a confidence interval for their difference $\mu_1 - \mu_2$ or by testing the hypothesis of no difference, $H_0: \mu_1 = \mu_2$. To do inference about the difference between the means of the two populations, we start with the difference between the means of the two samples, $\bar{x}_1 - \bar{x}_2$.
The Two-Sample z Statistic
Here are the facts about the sampling distribution of the difference $\bar{x}_1 - \bar{x}_2$ between the two sample means of independent SRSs.
1. The mean of $\bar{x}_1 - \bar{x}_2$ equals $\mu_1 - \mu_2$ (i.e., the difference of sample means is an unbiased estimator of the difference of population means).
2. The variance of the difference is the sum of the variances of $\bar{x}_1$ and $\bar{x}_2$, which is $\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$. Note: the variances add because the samples are independent. The standard deviations do not.
3. If the two population distributions are both Normal, then the distribution of $\bar{x}_1 - \bar{x}_2$ is also Normal.
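Fact 2 is easy to check numerically. The sketch below uses hypothetical values for the population standard deviations and sample sizes (they are not from the notes) and contrasts the correct standard deviation of the difference with the incorrect result of adding standard deviations.

```python
import math

def sd_of_difference(sigma1, n1, sigma2, n2):
    """Standard deviation of (xbar1 - xbar2) for two independent samples."""
    variance = sigma1**2 / n1 + sigma2**2 / n2  # variances add
    return math.sqrt(variance)

# Hypothetical values, chosen only to make the arithmetic easy to follow.
sigma1, n1 = 3.0, 9    # SD of xbar1 is 3 / sqrt(9) = 1
sigma2, n2 = 4.0, 16   # SD of xbar2 is 4 / sqrt(16) = 1

sd_diff = sd_of_difference(sigma1, n1, sigma2, n2)
sd_added_wrongly = sigma1 / math.sqrt(n1) + sigma2 / math.sqrt(n2)

print(sd_diff)           # sqrt(1 + 1), about 1.414
print(sd_added_wrongly)  # 2.0, which overstates the spread
```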
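Two-sample z statistic (for use when $\sigma_1$ and $\sigma_2$ are known)
Suppose that $\bar{x}_1$ is the mean of an SRS of size $n_1$ drawn from a Normally distributed population with mean $\mu_1$ and standard deviation $\sigma_1$, and that $\bar{x}_2$ is the mean of an SRS of size $n_2$ drawn from a Normally distributed population with mean $\mu_2$ and standard deviation $\sigma_2$. Then the two-sample z statistic
$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
has the standard Normal distribution.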
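As a minimal sketch of the two-sample z statistic, the function below standardizes the observed difference of sample means using known population standard deviations. The function name, the `diff0` parameter (the hypothesized value of the difference of population means), and all numbers are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def two_sample_z(xbar1, xbar2, sigma1, sigma2, n1, n2, diff0=0.0):
    """z statistic for the difference of means when both sigmas are known."""
    sd_diff = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return ((xbar1 - xbar2) - diff0) / sd_diff

# Hypothetical sample means, known sigmas, and sample sizes.
z = two_sample_z(xbar1=52.0, xbar2=50.0, sigma1=3.0, sigma2=4.0, n1=9, n2=16)

# One-sided P-value for the alternative "mean 1 is greater than mean 2".
p_upper = 1 - NormalDist().cdf(z)

print(z)        # 2 / sqrt(2), about 1.414
print(p_upper)
```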
It is very unlikely that both population standard deviations are known, so let’s consider the more useful t procedures.
The Two-Sample t Procedures
Because we don’t know the population standard deviations, we estimate them by the standard deviations from our two samples. Recall that the resulting estimated standard deviation is called the standard error: $SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$.
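We standardize our estimate using the two-sample t statistic:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$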
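The t statistic has the same form as the z statistic, with the sample standard deviations standing in for the unknown population values. A minimal sketch, with a hypothetical function name and hypothetical summary statistics:

```python
import math

def two_sample_t(xbar1, s1, n1, xbar2, s2, n2, diff0=0.0):
    """Two-sample t statistic; diff0 is the hypothesized difference of means."""
    standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return ((xbar1 - xbar2) - diff0) / standard_error

# Hypothetical summary statistics, not taken from any example in these notes.
t = two_sample_t(xbar1=5.0, s1=6.0, n1=9, xbar2=2.0, s2=8.0, n2=16)
print(t)  # 3 / sqrt(8), about 1.06
```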
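The level C confidence interval for $\mu_1 - \mu_2$ is given by the formula:
$$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
where $t^*$ is the upper $(1 - C)/2$ critical value of the t distribution.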
The degrees of freedom will equal the smaller of $n_1 - 1$ and $n_2 - 1$.
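The interval can be sketched in a few lines. This version assumes the conservative choice of degrees of freedom (the smaller of n1 - 1 and n2 - 1) and expects the critical value t* to be looked up in a t table and passed in. The function name and the summary statistics are hypothetical.

```python
import math

def two_sample_t_interval(xbar1, s1, n1, xbar2, s2, n2, t_star):
    """Confidence interval for mu1 - mu2, given a critical value t_star."""
    standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
    margin = t_star * standard_error
    diff = xbar1 - xbar2
    return diff - margin, diff + margin

# Hypothetical summary statistics.
n1, n2 = 9, 16
df = min(n1 - 1, n2 - 1)   # conservative degrees of freedom: 8
t_star = 1.860             # t table, df = 8, 90% confidence

lower, upper = two_sample_t_interval(5.0, 6.0, n1, 2.0, 8.0, n2, t_star)
print(df, lower, upper)    # interval is roughly (-2.26, 8.26)
```

Because this interval contains 0, these hypothetical data would not give evidence of a difference between the population means at the corresponding significance level.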
Note: The two-sample t statistic has approximately a t distribution. It does not have exactly a t distribution even if the populations are both exactly Normal.
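Statistical software usually handles this approximation by computing degrees of freedom from the data with the Welch-Satterthwaite formula. That formula is not stated in these notes, so treat the sketch below as an illustration of how technology typically chooses the degrees of freedom. The function name and the inputs are hypothetical.

```python
def welch_df(s1, n1, s2, n2):
    """Approximate degrees of freedom for the two-sample t statistic."""
    v1 = s1**2 / n1   # estimated variance of xbar1
    v2 = s2**2 / n2   # estimated variance of xbar2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# Hypothetical sample standard deviations and sizes.
df_welch = welch_df(s1=6.0, n1=9, s2=8.0, n2=16)
df_conservative = min(9 - 1, 16 - 1)

print(df_welch)         # about 20.9, usually not a whole number
print(df_conservative)  # 8
```

The software value always falls between the smaller of n1 - 1 and n2 - 1 and the total n1 + n2 - 2, which is why using the smaller of the two is a conservative choice.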
Example 13.2-3: Does increasing the amount of calcium in our diet reduce blood pressure? Examination of a large sample of people revealed a relationship between calcium intake and blood pressure. The relationship was strongest for black men. Such observational studies do not establish causation. Researchers therefore designed a randomized comparative experiment. The subjects in part of the experiment were 21 healthy black men. A randomly chosen group of 10 of the men received a calcium supplement for 12 weeks. The control group of 11 men received a placebo pill that looked identical. The experiment was double-blind. The response variable is the decrease in systolic blood pressure for a subject after 12 weeks, in mm of Hg. An increase appears as a negative response.
Example: Construct and interpret a 90% confidence interval for the previous example.
We know that sample size does influence the P-value of a test. A result that fails to be significant at a specified level in a small sample may be significant in a larger sample. Subsequent analysis of data from an experiment with more subjects resulted in a P-value of 0.008.
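To see why, hold the observed difference and the sample standard deviations fixed and increase only the sample sizes. In this sketch (hypothetical numbers throughout), quadrupling both sample sizes halves the standard error and doubles the t statistic, which moves the result further into the tail and lowers the P-value.

```python
import math

def t_statistic(diff, s1, s2, n1, n2):
    """Two-sample t statistic for testing no difference between the means."""
    return diff / math.sqrt(s1**2 / n1 + s2**2 / n2)

# Same hypothetical difference and standard deviations, two study sizes.
t_small = t_statistic(diff=3.0, s1=6.0, s2=8.0, n1=9, n2=16)
t_large = t_statistic(diff=3.0, s1=6.0, s2=8.0, n1=36, n2=64)

print(t_small)  # about 1.06
print(t_large)  # about 2.12, twice as large
```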
Robustness Again
The two-sample t procedures are more robust than the one-sample t methods, particularly when the distributions are not symmetric. When the sizes of the two samples are equal and the two populations being compared have distributions with similar shapes, probability values from the t table are quite accurate for a broad range of distributions, even when the sample sizes are as small as $n_1 = n_2 = 5$.
As a guide, the sum of the sample sizes, $n_1 + n_2$, should be greater than or equal to ___ with both __ and __.
In planning a two-sample study, choose equal sample sizes if you can. The two-sample t procedures are most robust against non-Normality in this case and the conservative P-values are most accurate.