Hypothesis testing – mean differences between populations

1 Hypothesis testing – mean differences between populations
Part 2

2 Hypothesis Tests for Differences Between Two Population Means
Most research is not designed so that the mean of one sample is compared to a known population mean. In practice, most experimenters use control groups, which often serve as substitutes for population values. For example, we might compare the mean weight of men who exercise regularly with the mean weight of men who never exercise. Instead of comparing the mean of the research sample to the mean of the population, we compare the mean of the research sample to the mean of the control sample. This is normally done by computing the difference between the two sample means and then comparing this difference to the mean of the sampling distribution of differences between means.

3 Difference between two Population Means
INDEPENDENT SAMPLES: Two samples are independent if the selection of the sample from one population does not affect the selection of the sample from the second population. Suppose we want to estimate the difference between the mean incomes of male and female lecturers. We draw two samples, one from the population of male lecturers and another from the population of female lecturers.
DEPENDENT SAMPLES: Suppose we want to estimate the difference between the mean weight of all participants before and after a weight loss programme. To compare weight before and after the programme, both measurements must come from the same respondents, so the two samples are dependent.
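For dependent (before-after) samples, the usual approach is to work with the per-respondent differences. The sketch below is only an illustration: the function name `paired_t_statistic` and the weights are hypothetical and do not come from the slides.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(before, after):
    """t statistic for the mean of paired differences (before - after)."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    # Standard error of the mean difference uses the sample standard deviation.
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1  # statistic and degrees of freedom

# Hypothetical weights (kg) of four participants before and after a programme.
t_stat, df = paired_t_statistic([80.0, 75.0, 90.0, 85.0], [78.0, 74.0, 86.0, 82.0])
print(round(t_stat, 3), df)
```

Because each respondent contributes one difference, the degrees of freedom are the number of pairs minus one.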

4 Researchers want to see if men have a higher blood pressure than women do. A study is planned in which the blood pressures of 50 men and 50 women will be measured.
H0: μm ≤ μf, H1: μm > μf
Alternatively, we can present these as H0: μm - μf ≤ 0, H1: μm - μf > 0.
An airport official wants to assess if the flights from one airline (Airline 1) are less delayed than flights from another airline (Airline 2).
H0: μ1 ≥ μ2 (μ1 - μ2 ≥ 0), H1: μ1 < μ2 (μ1 - μ2 < 0)

5 Hypothesis Tests for Mean Differences Between Two Populations When Variances Are Known
Apply the same principles as for hypothesis testing for a single population mean. To test H0: μX = μY (equivalent to H0: μX - μY = 0), we use the fact that, for independent samples, the difference between the sample means is normally distributed with mean μX - μY and variance σX²/nX + σY²/nY, and thus the test statistic is z = (x̄ - ȳ) / √(σX²/nX + σY²/nY).
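The z statistic for two independent samples with known population variances can be sketched as follows. The function name `two_sample_z` and the input numbers are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from math import sqrt

def two_sample_z(mean_x, mean_y, var_x, var_y, n_x, n_y, hypothesized_diff=0.0):
    """z statistic for H0: mu_x - mu_y = hypothesized_diff, variances known."""
    # Standard error of the difference between two independent sample means.
    standard_error = sqrt(var_x / n_x + var_y / n_y)
    return (mean_x - mean_y - hypothesized_diff) / standard_error

# Hypothetical inputs: sample means 52 and 50, known variances 16 and 9,
# sample sizes 64 and 36.
z = two_sample_z(52.0, 50.0, 16.0, 9.0, 64, 36)
print(round(z, 3))
```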

6 Hypothesis Tests for Two Population Means: A Summary
Two Population Means, Independent Samples
Lower-tail one-sided test: H0: μx ≥ μy, H1: μx < μy, i.e., H0: μx – μy ≥ 0, H1: μx – μy < 0
Upper-tail one-sided test: H0: μx ≤ μy, H1: μx > μy, i.e., H0: μx – μy ≤ 0, H1: μx – μy > 0
Two-tail (two-sided) test: H0: μx = μy, H1: μx ≠ μy, i.e., H0: μx – μy = 0, H1: μx – μy ≠ 0

7 Decision Rules
Two Population Means, Independent Samples, Variances Known
Lower-tail test: H0: μx – μy ≥ 0, H1: μx – μy < 0. Reject H0 if z < -zα.
Upper-tail test: H0: μx – μy ≤ 0, H1: μx – μy > 0. Reject H0 if z > zα.
Two-tail test: H0: μx – μy = 0, H1: μx – μy ≠ 0. Reject H0 if z < -zα/2 or z > zα/2.
[Figure: standard normal curves with rejection regions of area α in one tail (beyond -zα or zα) and α/2 in each tail (beyond -zα/2 and zα/2)]
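These z decision rules can be sketched in code. The helper name `reject_h0` and the tail labels are hypothetical; the critical values come from Python's standard-library `statistics.NormalDist`.

```python
from statistics import NormalDist

def reject_h0(z, alpha, tail):
    """Apply the z decision rule for a lower-, upper-, or two-tail test."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "lower":
        return z < -std_normal.inv_cdf(1 - alpha)
    if tail == "upper":
        return z > std_normal.inv_cdf(1 - alpha)
    if tail == "two":
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("tail must be 'lower', 'upper', or 'two'")

# The same statistic can be significant in a one-tail test but not a two-tail test.
print(reject_h0(-1.7, 0.05, "lower"))  # critical value is about -1.645
print(reject_h0(-1.7, 0.05, "two"))    # critical values are about -1.96 and 1.96
```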

8 Test statistic when σ1² and σ2² are unknown
If σ1² and σ2² are unknown, replace them with the sample variances s1² and s2². If the samples are large (n1 ≥ 100 and n2 ≥ 100), the Z test statistic can be used as well as the t test statistic. If the sample sizes are smaller (but at least 30 for each sample) and we are sampling from populations with normal distributions, we use the t-test with df = (n1 - 1) + (n2 - 1).

9 Decision Rules
Two Population Means, Independent Samples, Variances Unknown
Lower-tail test: H0: μx – μy ≥ 0, H1: μx – μy < 0. Reject H0 if t < -t(nx+ny–2), α.
Upper-tail test: H0: μx – μy ≤ 0, H1: μx – μy > 0. Reject H0 if t > t(nx+ny–2), α.
Two-tail test: H0: μx – μy = 0, H1: μx – μy ≠ 0. Reject H0 if t < -t(nx+ny–2), α/2 or t > t(nx+ny–2), α/2.
[Figure: t distribution curves with rejection regions of area α in one tail (beyond -tα or tα) and α/2 in each tail (beyond -tα/2 and tα/2)]

10 Decision making and conclusion for mean difference
H0: the mean nose lengths of men and women are the same. H1: men have a longer mean nose length than women. The p-value is determined from the test statistic value; H1 indicates a one-tailed (right-tail) test. If the p-value is greater than the significance level, set here at 0.05, we do not reject the null hypothesis. The conclusion is that there is not enough evidence to say that the populations of men and women have a statistically significant mean difference in nose lengths. The observed mean difference in the sample is likely due to chance (sampling error from collecting data on one particular sample). If the p-value is assumed to be 0.01, what is your decision?

11 Decision making using p-value
If p-value < significance level, reject the null hypothesis.
If p-value > significance level, do not reject the null hypothesis.

12 Note: the population variances are unknown, so use the t test statistic.
A test was given to two classes of 40 and 50 students respectively. In the first class, the mean mark was 74 with a standard deviation of 8. In the second class, the mean mark was 78 with a standard deviation of 7. Is there a significant difference between the performance of the two classes at the 5% level of significance?
H0: μ1 = μ2
H1: μ1 ≠ μ2 (there is a significant difference between the population means)
t = (74 - 78) / √[(64/40) + (49/50)] = -4/√2.58 = -4/1.606 = -2.49
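The arithmetic for this example can be checked with a short script. It uses only the figures stated in the problem; the variable names are illustrative.

```python
from math import sqrt

# Summary statistics from the two classes.
n1, mean1, sd1 = 40, 74.0, 8.0
n2, mean2, sd2 = 50, 78.0, 7.0

# Standard error of the difference, using sample variances in place of
# the unknown population variances.
standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
t_stat = (mean1 - mean2) / standard_error
df = (n1 - 1) + (n2 - 1)

print(round(standard_error, 3), round(t_stat, 2), df)  # 1.606 -2.49 88
```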

13 Using the critical value approach
[Figure: t distribution with a "reject H0" region in each tail]
There are two rejection regions for a two-tailed test, so the significance level α has to be divided by 2. If α = 0.05, the rejection area on each side becomes 0.025 (0.05/2). At 0.025, the critical values from the t distribution table are -1.96 and 1.96. The calculated t test statistic (-2.49) lies to the left of -1.96, so it is in the rejection region. Hence we reject the null hypothesis H0 and conclude that there is a significant difference in the performance of the two classes.

14 P-value approach
The alternative hypothesis H1 indicates that this is a two-tailed test. Given the t test statistic of -2.49, look for P(t > 2.49) and P(t < -2.49). Summing the two probabilities gives the p-value. P(t > 2.49) at 88 d.f. lies between 0.005 and 0.01. Assume it is 0.006.
P(t > 2.49) + P(t < -2.49) = 0.006 + 0.006 = 0.012
Significance level α = 0.05, p-value = 0.012. Thus p-value < α, and we reject the null hypothesis.
Conclusion: the results are statistically significant, so there is a difference between the mean performance of the two classes in the populations.
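A rough check of the two-tailed p-value is sketched below. It assumes a standard normal approximation to the t distribution with 88 d.f. (the standard library has no t distribution), so the result is close to, but not the same as, a value read from a t table.

```python
from statistics import NormalDist

t_stat = -2.49   # test statistic for the two-class example
alpha = 0.05     # significance level

# Two-tailed p-value: probability in both tails beyond |t|, using the
# normal approximation (an assumption, reasonable for large d.f.).
p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))

# Decision rule: reject H0 when the p-value is below the significance level.
reject = p_value < alpha
print(round(p_value, 3), reject)
```

With the exact t distribution the tail areas are slightly larger, but the decision at the 5% level is the same.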

15 For the variable “Time spent watching TV in Typical Day,” here are results of a two-sample t-procedure that compares a random sample of men and women at a college. Which of the following is the correct conclusion about these results using a 5% significance level?
The mean TV watching times of men and women at the college are equal.
There is a statistically significant difference between the mean TV watching times of men and women at the college.
There is not a statistically significant difference between the mean TV watching times of men and women at the college.
There is not enough information to judge statistical significance here.

16 Assumption of variance equality
So far we have not said anything about the variances of the two populations. They could be equal or unequal. The degrees of freedom of the t test statistic and the standard error of the difference between sample means are both affected by the assumption of variance equality. The d.f. calculation for unequal variances is more complicated, so we refer to the SPSS output. Process:
(i) Test the equality of variances using the F-test. In SPSS, it is reported under Levene's test.
(ii) Once it is known whether the variances are equal, choose the appropriate t-test for the difference in means.
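To show why the unequal-variance d.f. is "a bit complicated", here is a sketch of the usual Welch-Satterthwaite approximation, which is assumed here to be the one behind the "equal variances not assumed" row of software output. The function name `welch_df` is hypothetical; the inputs reuse the two-class example.

```python
def welch_df(sd1, n1, sd2, n2):
    """Approximate degrees of freedom when variances are not assumed equal."""
    v1 = sd1**2 / n1  # variance of the first sample mean
    v2 = sd2**2 / n2  # variance of the second sample mean
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# Two-class example: n = 40 and 50, standard deviations 8 and 7.
equal_variance_df = (40 - 1) + (50 - 1)
unequal_variance_df = welch_df(8.0, 40, 7.0, 50)
print(equal_variance_df, round(unequal_variance_df, 1))
```

The approximate d.f. is never larger than the equal-variance d.f., which makes the unequal-variance test somewhat more conservative.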

17 A large car insurance company is conducting a study to see if male and female drivers have the same number of accidents, on average, or if male drivers (who tend to be thought of as more aggressive drivers) have more. Data on the number of accidents in the past 5 years is collected for randomly selected drivers who are insured by this company. An analysis of the results produced the following output. Can we reject σ²m = σ²f?

18 State the null and alternative hypotheses.
Choose the appropriate t-test to use.
State the p-value for the t-test statistic.
Are the results statistically significant if the significance level α = 0.05? Why?
What is your conclusion for the results?

19 Two groups of 10 students each took an examination to see whether they had understood the course materials that were taught. Do the two groups differ in their understanding of the course materials at α = 0.10, based on the SPSS output below? State the null and alternative hypotheses in your solution, even though the question does not ask for them.

20 Using SPSS for hypothesis testing
A. Testing mean value
One population mean: Analyse → Compare Means → One-Sample T Test → drag a quantitative variable into the test variable box and specify the test value, e.g. 35 → OK
Two population means difference:
Independent samples: Analyse → Compare Means → Independent-Samples T Test → drag one numeric variable into the test variable box and drag a qualitative variable with only two response categories into the grouping variable box
Dependent samples (before-after): Analyse → Compare Means → Paired-Samples T Test → drag a numeric variable into the variable 1 box and repeat for the second variable

21

22 SPSS provides an analysis for the equal variance assumption
Note that the population variance is usually unknown to the researcher. It is good practice to test the validity of the equal variance assumption, as it affects the degrees of freedom and the standard error of the difference in means.

23

24 One way ANOVA – mean differences for more than 2 groups
Do graduates of undergraduate business programs with different majors tend to earn different average starting salaries? Consider the data given in the table below.
Accounting Marketing Finance Management
$37,220 $28,620 $29,870 $28,600 $30,950 $27,750 $31,700 $27,450 $32,630 $27,650 $31,740 $26,410 $31,350 $27,640 $32,750 $27,340 $29,410 $28,340 $30,550 $27,300 $37,330 $29,250 $35,700 $28,890 $30,150
Can you reject, at the 10% significance level, the hypothesis that the mean starting salary is the same for each of the given business majors?

25 H1: At least two population means are unequal
Accounting Marketing Finance Management
Sample sizes 7 5 8
Sample means
Sample standard deviations
H0: µa = µm = µf = µman
H1: At least two population means are unequal
Given that the p-value (0.0001) is less than the significance level (0.1 or 10%), we have evidence to reject the null hypothesis that the mean starting salaries are equal for all majors.
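The F statistic behind a one-way ANOVA can be sketched as follows. The function name `one_way_anova_f` and the three small groups are hypothetical toy data, not the salary figures; the p-value would then be read from an F table or from software output.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((value - mean(g)) ** 2 for g in groups for value in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical toy data with three groups of three observations each.
f_stat, df_b, df_w = one_way_anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
print(round(f_stat, 2), df_b, df_w)
```

A large F means the group means differ by more than the within-group variation would suggest, which is evidence against H0.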

26 State the null hypothesis and alternative hypothesis.
What is your conclusion for this test?

27

28 Command to get ANOVA in SPSS
Analyse → Compare Means → One-Way ANOVA. A dialog opens. Drag a quantitative variable into the dependent variable box and, under factor, drag a categorical variable with at least 3 response categories → OK.

