MBA 7025 Statistical Business Analysis Hypothesis Testing Jan 27, 2015


1 MBA 7025 Statistical Business Analysis Hypothesis Testing Jan 27, 2015

2 Agenda
Hypothesis Testing
One-sample Hypothesis Test for the Mean
Chi-Squared Tests

3 Introduction
Attempt to prove (or disprove) some assumption.
Setup:
Alternate hypothesis: What you wish to prove. Example: Person is guilty of crime.
Null hypothesis: Assume the opposite of what is to be proven. The null is always stated as an equality. Example: Person is innocent.

4 Hypothesis Testing
Take a sample, compute the statistic of interest. (The evidence gathered against the defendant.)
How likely is it that, if the null were true, you would get a statistic at least this extreme? This is the p-value. (How likely is it that an innocent person would be found at the scene of the crime, with gun in hand, etc.?)
If very unlikely, then the null is rejected, and the alternate is taken as proven beyond reasonable doubt.
If quite likely, then the null may be true, so there is not enough evidence to discard it in favor of the alternate.

5 Types of Errors
Decision: reject null, assume alternate is proven
  If null is really True: Type I Error (convict the innocent)
  If null is really False: Good Decision
Decision: do not reject null, evidence for alternate not strong enough
  If null is really True: Good Decision
  If null is really False: Type II Error (let guilty go free)

6 Hypothesis Testing Roadmap
Data types: Attribute; Continuous (Normal, Interval Scaled; or Non-Normal, Ordinal Scaled)
Branches shown: Variance, Medians, Means
Tests shown: χ² Contingency Tables, Correlation, Levene's, Sign Test, Wilcoxon, Kruskal-Wallis, Mood's, Friedman's, χ², F-test, Bartlett's, Z-tests, t-tests, ANOVA, Regression
Note on chart: Same tests as Non-Normal Medians

7 Parametric Tests
Use parametric tests when:
The data are normally distributed
The variances of populations (if more than one is sampled from) are equal
The data are at least interval scaled

8 1) One sample z-test
Used when testing to see if a sample comes from a known population. A sample of 25 measurements shows a mean of 17. Test whether this is significantly different from the hypothesized mean of 15, assuming the population standard deviation is known to be 4.
One-Sample Z
Test of mu = 15 vs not = 15
The assumed standard deviation = 4
Output columns (values missing in transcript): N, Mean, SE Mean, CI, Z, P
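The z-test on this slide can be reproduced by hand. The following is a minimal sketch using only the Python standard library; the function name `one_sample_z` is made up for illustration, and the inputs (mean 17, hypothesized mean 15, sigma 4, n = 25) come from the slide.

```python
import math

def one_sample_z(sample_mean, mu0, sigma, n):
    """Return (z, two-sided p-value) for a one-sample z-test with known sigma."""
    se = sigma / math.sqrt(n)              # standard error of the mean
    z = (sample_mean - mu0) / se           # distance from mu0 in standard errors
    # Two-sided normal tail area: P(|Z| >= |z|) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

z, p = one_sample_z(sample_mean=17, mu0=15, sigma=4, n=25)
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

With these inputs the sample mean sits 2.5 standard errors above 15, and the two-sided p-value falls below 0.05.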

9 2) z-test for proportions
70% of 200 customers surveyed say they prefer the taste of Brand X over competitors. Test the hypothesis that more than 66% of people in the population prefer Brand X.
Test and CI for One Proportion
Test of p = 0.66 vs p > 0.66
Output columns (values missing in transcript): Sample, X, N, Sample p, 95% Lower Bound, Z-Value, P-Value

10 3) One sample t-test
The data show reductions in Blood Pressure in a sample of 17 people after a certain treatment. We wish to test whether the average reduction in BP was at least 13%, a benchmark set by some other treatment that we wish to match or better.
BP Reduction%: 10 12 9 8 7 12 14 13 15 16 18 12 18 19 20 17 15

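11 3) One sample t-test
The p-value of 0.20 indicates that the reduction in BP could not be proven to be greater than 13%. If the true mean reduction were exactly 13%, a sample mean at least as large as the one observed would occur about 20% of the time, which is too often to reject the null.
One-Sample T: BP Reduction
Test of mu = 13 vs > 13
Output columns (values missing in transcript): Variable, N, Mean, StDev, SE Mean, 95% Lower Bound, T, P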
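The one-sample t-test can be sketched with the standard library alone. Two assumptions are worth flagging: the 17 readings below are taken from the male/female listing on the two-sample slide, and the helper `t_upper_tail` is an illustrative numerical integration (Simpson's rule), not the routine the original software uses.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=20000):
    """P(T >= t), integrating the density on [0, |t|] with Simpson's rule."""
    a = abs(t)
    h = a / steps                          # steps must be even
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    tail = 0.5 - total * h / 3             # area beyond |t| on one side
    return tail if t >= 0 else 1 - tail

# Assumed data: the 17 BP reductions (%) listed by gender on the next slide
bp = [10, 12, 9, 8, 7, 12, 14, 13, 15, 16, 18, 12, 18, 19, 20, 17, 15]
n = len(bp)
se = statistics.stdev(bp) / math.sqrt(n)   # standard error of the mean
t_stat = (statistics.mean(bp) - 13) / se   # test of mu = 13 vs mu > 13
p_value = t_upper_tail(t_stat, n - 1)
print(f"n = {n}, t = {t_stat:.3f}, one-sided p = {p_value:.2f}")
```

The result agrees with the p-value of 0.20 quoted on the slide, which supports reading the data this way.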

12 4) Two sample t-test
You realize that though the overall reduction is not proven to be more than 13%, there seems to be a difference between how men and women react to the treatment. You separate the 17 observations by gender, and wish to test whether there is in fact a significant difference between genders.
M    F
10   15
12   16
9    18
8    12
7    18
12   19
14   20
13   17
     15

13 4) Two sample t-test
The test for equal variances shows no significant difference between the variances of the 2 samples. Thus a 2-sample t test with a pooled standard deviation may be conducted. The results are shown below. The p-value indicates there is a significant difference between the genders in their reaction to the treatment.
Two-sample T for BP Reduction M vs BP Reduction F
Output columns (values missing in transcript): N, Mean, StDev, SE Mean, for rows BP Red M and BP Red F
Difference = mu (BP Red M) - mu (BP Red F)
Estimate for difference and 95% CI for difference: values missing in transcript
T-Test of difference = 0 (vs not =): P-Value = 0.000, DF = 15 (T-Value missing in transcript)
Both use Pooled StDev (value missing in transcript)
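A pooled two-sample t-test on the gender split might look like the sketch below. It assumes that the final unpaired value (15) in the listing belongs to the F column, giving 8 men and 9 women; that split is consistent with DF = 15 but is an assumption. The tail-area helper is an illustrative Simpson's-rule integration.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_two_sided_p(t, df, steps=20000):
    """P(|T| >= |t|), integrating the density on [0, |t|] with Simpson's rule."""
    a = abs(t)
    h = a / steps                          # steps must be even
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * (0.5 - total * h / 3)

# Assumed split of the 17 observations (the last 15 is taken to be in F)
men = [10, 12, 9, 8, 7, 12, 14, 13]
women = [15, 16, 18, 12, 18, 19, 20, 17, 15]

n1, n2 = len(men), len(women)
df = n1 + n2 - 2
# Pooled variance: weighted average of the two sample variances
pooled_var = ((n1 - 1) * statistics.variance(men)
              + (n2 - 1) * statistics.variance(women)) / df
sp = math.sqrt(pooled_var)
diff = statistics.mean(men) - statistics.mean(women)
t_stat = diff / (sp * math.sqrt(1 / n1 + 1 / n2))
p_value = t_two_sided_p(t_stat, df)
print(f"diff = {diff:.2f}, pooled SD = {sp:.3f}, t = {t_stat:.2f}, df = {df}, p = {p_value:.4f}")
```

Under that assumption the p-value rounds to 0.000 at three decimals, matching the slide.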

14 Basics of ANOVA
Analysis of Variance, or ANOVA, is a technique used to test the hypothesis that there is a difference between the means of two or more populations. It is used in Regression, as well as to analyze a factorial experiment design, and in Gauge R&R studies.
The basic premise of ANOVA is that differences in the means of 2 or more groups can be seen by partitioning the Sum of Squares. Sum of Squares (SS) is simply the sum of the squared deviations of the observations from their means.
Consider the following example with two groups. The measurements show the thumb lengths in centimeters of two types of primates.
Obs.   Type A   Type B
1      2        6
2      3        7
3      4        8
Mean   3        7
SS     2        2
Overall Mean = 5, SS = 28

15 Basics of ANOVA
Total variation (SS) is 28, of which only 4 (2+2) is within the two groups. Thus 24 of the 28 is due to the differences between the groups. This partitioning of SS into ‘between’ and ‘within’ is used to test the hypothesis that the groups are in fact different from each other.
See www.statsoft.com for more details.
Overall Mean = 5, SS = 28
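The partition of the Sum of Squares described above can be checked in a few lines. The group values below are an assumption: three observations per group (2, 3, 4 and 6, 7, 8) are the values consistent with the stated overall mean of 5, total SS of 28, and within-group SS of 2 + 2.

```python
# Assumed data, chosen to match the totals quoted on the slide
type_a = [2, 3, 4]
type_b = [6, 7, 8]

def mean(values):
    return sum(values) / len(values)

def sum_sq(values, center):
    """Sum of squared deviations of the values from a given center."""
    return sum((v - center) ** 2 for v in values)

everything = type_a + type_b
grand_mean = mean(everything)

ss_total = sum_sq(everything, grand_mean)
ss_within = sum_sq(type_a, mean(type_a)) + sum_sq(type_b, mean(type_b))
# Between-group SS: each group mean's squared distance from the grand mean,
# weighted by the group size
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in (type_a, type_b))

print(f"total = {ss_total}, within = {ss_within}, between = {ss_between}")
```

The between and within pieces add up to the total, which is the identity ANOVA relies on.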

16 5) One-Way ANOVA
The results of running an ANOVA on the sample data from the previous slide are shown here. The hypothesis test computes the F-value as the ratio of MS ‘Between’ to MS ‘Within’. The greater the value of F, the stronger the evidence that there is in fact a difference between the groups. Looking it up in an F-distribution table shows a p-value of 0.008: if the population means were equal, an F-value this large would arise in only 0.8% of samples, so the difference is taken to be real (it exists in the Population, not just in the sample).
One-way ANOVA: Type A, Type B
Output columns (values missing in transcript): Source (Factor, Error, Total), DF, SS, MS, F, P
S = 1   R-Sq = 85.71%   R-Sq(adj) = 82.14%
Minitab: Stat/ANOVA/One-Way (unstacked)
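The F-ratio and fit statistics can be rebuilt from the sums of squares on the previous slide (24 between, 4 within). The sketch assumes 2 groups of 3 observations, which is what S = 1 and R-Sq(adj) = 82.14% imply. Because the factor has 1 degree of freedom, F equals the square of a t statistic, so the p-value is obtained here from a two-sided t tail; the integration helper is illustrative only.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_two_sided_p(t, df, steps=20000):
    """P(|T| >= |t|), integrating the density on [0, |t|] with Simpson's rule."""
    a = abs(t)
    h = a / steps                          # steps must be even
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * (0.5 - total * h / 3)

ss_between, ss_within = 24.0, 4.0          # from the partitioning slide
groups, n_total = 2, 6                     # assumed: 2 groups of 3 observations

df_between = groups - 1
df_within = n_total - groups
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_value = ms_between / ms_within

ss_total = ss_between + ss_within
r_sq = ss_between / ss_total
r_sq_adj = 1 - ms_within / (ss_total / (n_total - 1))
# With 1 numerator df, P(F >= f) equals the two-sided t tail at sqrt(f)
p_value = t_two_sided_p(math.sqrt(f_value), df_within)
print(f"F = {f_value:.1f}, p = {p_value:.3f}, R-Sq = {r_sq:.2%}, R-Sq(adj) = {r_sq_adj:.2%}")
```

The printed R-Sq values and p-value line up with the figures that survive in the slide's output.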

17 6) Two-Way ANOVA
Is the strength of steel produced different for different temperatures to which it is heated and the speed with which it is cooled? Here 2 factors (speed and temp) are varied at 2 levels each, and the strengths of 3 parts produced at each combination are measured as the response variable. The results show significant main effects as well as an interaction effect.
Strength   Temp   Speed
20.0       Low    Slow
22.0       Low    Slow
21.5       Low    Slow
23.0       Low    Fast
24.0       Low    Fast
22.0       Low    Fast
25.0       High   Slow
24.0       High   Slow
24.5       High   Slow
17.0       High   Fast
18.0       High   Fast
17.5       High   Fast
Two-way ANOVA: Strength versus Temp, Speed
Output columns (values missing in transcript): Source (Temp, Speed, Interaction, Error, Total), DF, SS, MS, F, P
S (value missing in transcript)   R-Sq = 94.08%   R-Sq(adj) = 91.86%

18 6) Two-Way ANOVA
The box plots give an indication of the interaction effect. The effect of speed on the response is different for different levels of temperature. Thus, there is an interaction effect between temperature and speed.
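The two-way decomposition for the steel data can be sketched as follows. It uses the balanced-design formulas (equal cell sizes), computes F-ratios but not p-values, and all variable names are illustrative. The per-temperature speed effects at the end show the interaction that the box plots point to.

```python
# Strength readings from the slide, keyed by (temperature, cooling speed)
data = {
    ("Low", "Slow"): [20.0, 22.0, 21.5],
    ("Low", "Fast"): [23.0, 24.0, 22.0],
    ("High", "Slow"): [25.0, 24.0, 24.5],
    ("High", "Fast"): [17.0, 18.0, 17.5],
}

def mean(values):
    return sum(values) / len(values)

def level_obs(position, level):
    """All observations at one level of a factor (0 = temp, 1 = speed)."""
    return [x for key, cell in data.items() if key[position] == level for x in cell]

all_obs = [x for cell in data.values() for x in cell]
grand = mean(all_obs)

ss_total = sum((x - grand) ** 2 for x in all_obs)
ss_error = sum((x - mean(cell)) ** 2 for cell in data.values() for x in cell)
ss_temp = sum(len(level_obs(0, t)) * (mean(level_obs(0, t)) - grand) ** 2
              for t in ("Low", "High"))
ss_speed = sum(len(level_obs(1, s)) * (mean(level_obs(1, s)) - grand) ** 2
               for s in ("Slow", "Fast"))
ss_inter = ss_total - ss_error - ss_temp - ss_speed   # valid for balanced data

df_error = len(all_obs) - len(data)        # 12 observations - 4 cells = 8
ms_error = ss_error / df_error
# Each effect has 1 degree of freedom, so its MS equals its SS
f_temp, f_speed, f_inter = (ss / ms_error for ss in (ss_temp, ss_speed, ss_inter))

r_sq = 1 - ss_error / ss_total
r_sq_adj = 1 - ms_error / (ss_total / (len(all_obs) - 1))

# Effect of speed (Fast minus Slow) at each temperature
speed_effect = {t: mean(data[(t, "Fast")]) - mean(data[(t, "Slow")])
                for t in ("Low", "High")}
print(f"F temp = {f_temp:.2f}, F speed = {f_speed:.2f}, F interaction = {f_inter:.2f}")
print(f"R-Sq = {r_sq:.2%}, R-Sq(adj) = {r_sq_adj:.2%}, speed effect = {speed_effect}")
```

Fast cooling raises strength at the low temperature but lowers it sharply at the high temperature, which is why the interaction term dominates.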

19 Agenda
Hypothesis Testing
One-sample Hypothesis Test for the Mean
Chi-Squared Tests

20 Hypothesis Testing Example: Gas Price
You believe that the current price of unleaded regular gasoline is less than $4.00 on average nationwide, and wish to prove it. Set up the hypothesis and test it.

21 i) Null and Alternate Hypotheses
What we wish to prove is called the Alternate Hypothesis. The opposite of that is the Null, which must be assumed and shown to be unlikely, based on sample data.
H0: μ = 4.00
Ha: μ < 4.00
What constitutes proof? Any conclusion based on a sample may be wrong. What probability (at most) of being wrong is acceptable to you? This is called α (alpha), or the acceptable Type I Error. Let α = 0.05 (or 5%).

22 ii) The Sample Data
A sample of 49 gas stations nationwide shows an average price of unleaded of $3.87 and a standard deviation of $0.15. Could this sample have come from a population where the Mean was in fact $4.00 (or greater)? Assume the null is true, and this sample did in fact come from such a population.

23 iii) Sampling Distribution if H0 True
What would the distribution of sample means from such a population look like? From the Central Limit Theorem, we have the following:
Mean of the sample means = μ = $4.00
Standard Error = s/√n = 0.15/√49 ≈ $0.0214

24 iv) The Test Statistic
How far from the assumed mean of 4.00 is the observed sample mean of 3.87? Measured in Standard Errors, this is the t-statistic.
One-sample t-test
t = (Sample Mean - Population Mean) / Standard Error
t = (3.87 - 4.00) / (0.15/√49) = -6.06 as reported (the rounded inputs shown here give about -6.07)

25 v) p-value
The probability that a value would be as extreme as (or more extreme than) 6.06 SEs below the Mean is extremely small! [In Excel, =TDIST(6.06,48,1)]
This is called the p-value of the Hypothesis test.
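Steps ii) to v) can be strung together in one short sketch. It assumes the sample standard deviation is $0.15, as used in the Standard Error calculation, and it replaces Excel's TDIST with an illustrative Simpson's-rule integration of the t density. With these rounded inputs the statistic comes out near -6.07 rather than the -6.06 on the slide.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_one_tail(t, df, steps=20000):
    """P(T >= |t|), integrating the density on [0, |t|] with Simpson's rule."""
    a = abs(t)
    h = a / steps                          # steps must be even
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

n, sample_mean, sample_sd = 49, 3.87, 0.15   # sample summary from the slides
mu0, alpha = 4.00, 0.05                      # null value and tolerance

se = sample_sd / math.sqrt(n)                # standard error of the mean
t_stat = (sample_mean - mu0) / se
# Lower-tailed test (Ha: mu < 4.00): by symmetry P(T <= t) = P(T >= |t|) for t < 0
p_value = t_one_tail(t_stat, n - 1)
reject_null = p_value < alpha
print(f"SE = {se:.4f}, t = {t_stat:.2f}, p = {p_value:.2e}, reject H0: {reject_null}")
```

The p-value is many orders of magnitude below α = 0.05, so the null of a $4.00 average is rejected.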

26 vi) Conclusion
To determine if a result is statistically significant, a researcher calculates a p-value, which is the probability of observing an effect at least as extreme as the one in the sample, given that the null hypothesis is true. The null hypothesis is rejected if the p-value is less than 0.05 (5%).
If the null were true (the average price were in fact 4.00), there is only a very small probability that you would pick a sample with a mean of 3.87 or smaller from such a population. Therefore, either the null must be false (and therefore you proved your case) or you picked an extremely rare sample.
You can conclude that the sample could not plausibly have come from a population with Mean = 4.00 as assumed, and instead came from one with Mean < 4.00. The chance that you are wrong is less than 5%, your tolerance level. In other words, p < α, hence you proved the case beyond reasonable doubt.

27 Agenda
Hypothesis Testing
One-sample Hypothesis Test for the Mean
Chi-Squared Tests

28 Goodness-of-fit Test
A managed forest has the following distribution of trees:
Douglas Fir 54%
Ponderosa Pine 40%
Grand Fir 5%
Western Larch 1%
Mannan & Meslow (1984) made 156 observations of foraging by red-breasted nuthatches and found the following:
Douglas Fir 70
Ponderosa Pine 79
Grand Fir 3
Western Larch 4
Mannan, R.W., and E.C. Meslow. “Bird populations and vegetation characteristics in managed and old-growth forests, northeastern Oregon.” J. Wildl. Manage. 48:

29 Hypotheses
Do the birds forage randomly, without regard to what species of tree they are in? If so, the observed and expected distributions should be alike.
Null: The distributions are alike (good fit, meaning birds forage randomly)
Alternate: The distributions are different (lack of fit, or birds prefer certain vegetation)

30 Expected Values
Based on the percentage distribution of trees, the expected counts for each type (out of 156) are:
Douglas Fir 84.24 (54% of 156 = 84.24)
Ponderosa Pine 62.40
Grand Fir 7.80
Western Larch 1.56

31 Chi-Square Statistics
Tree             Expected   Observed   o-e      (o-e)Sq   (o-e)Sq/e
Douglas Fir      84.24      70         -14.24   202.78    2.41
Ponderosa Pine   62.40      79         16.60    275.56    4.42
Grand Fir        7.80       3          -4.80    23.04     2.95
Western Larch    1.56       4          2.44     5.95      3.82
Total            156.00     156
Chi-square = 13.593
For the p-value in Excel, type =CHIDIST(13.593,3), for 3 degrees of freedom (n groups - 1)
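The table above can be regenerated from the tree shares and the observed counts. This is a sketch using the standard library only; the helper `chi2_upper_tail_df3` is a closed-form tail area that holds specifically for 3 degrees of freedom and stands in for Excel's CHIDIST.

```python
import math

# Share of each tree species in the forest and observed foraging counts
shares = {"Douglas Fir": 0.54, "Ponderosa Pine": 0.40,
          "Grand Fir": 0.05, "Western Larch": 0.01}
observed = {"Douglas Fir": 70, "Ponderosa Pine": 79,
            "Grand Fir": 3, "Western Larch": 4}

n = sum(observed.values())
# Expected counts if the birds foraged in proportion to tree abundance
expected = {tree: share * n for tree, share in shares.items()}

chi_sq = sum((observed[t] - expected[t]) ** 2 / expected[t] for t in shares)
df = len(shares) - 1

def chi2_upper_tail_df3(x):
    """P(X >= x) for a chi-square variable with exactly 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

p_value = chi2_upper_tail_df3(chi_sq)
print(f"chi-square = {chi_sq:.3f}, df = {df}, p = {p_value:.4f}")
```

The p-value comes out well under 0.05, in line with the decision to reject the null of random foraging.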

32 Conclusion
Hypotheses:
Null: The distributions are alike (good fit, meaning birds forage randomly)
Alternate: The distributions are different (lack of fit, or birds prefer certain vegetation)
To determine if a result is statistically significant, a researcher calculates a p-value, which is the probability of observing an effect at least as extreme as the one in the sample, given that the null hypothesis is true. The null hypothesis is rejected if the p-value is less than 0.05 (5%).
Given the small p-value, we reject the null. These birds are not foraging randomly; they prefer certain types of trees.

33 Test of Independence
Demographic data on 111 students is available. We wish to study gender differences, in this case pertaining to dog ownership.
Data Set: Student
Variables: Gender, Dog (Yes/No)
         No Dog   Have Dog
Female   29       23
Male     35       24
Are Gender and Dog Ownership independent of each other?

34 Hypotheses
Null: The two variables are independent of each other (the occurrence of one does not influence the probability of the occurrence of the other).
Alternate: They are not independent (one influences the other).

35 Chi-Square Statistics
Tabulated statistics: Gender, Dog
Rows: Gender   Columns: Dog
Table layout (cell values missing in transcript): rows Female, Male, All; columns No, Yes, All
Cell Contents: Count, Expected count
Pearson Chi-Square = 0.143, DF = 1, P-Value = 0.705
Likelihood Ratio Chi-Square = 0.143, DF = 1, P-Value = 0.705
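The Pearson statistic can be reproduced from the 2x2 counts on the Test of Independence slide. In this sketch the expected count of each cell is (row total x column total) / grand total, and the tail area uses a closed form that is valid only for 1 degree of freedom; the variable names are illustrative.

```python
import math

# Observed counts: rows are Gender, columns are Dog ownership
table = {
    "Female": {"No": 29, "Yes": 23},
    "Male": {"No": 35, "Yes": 24},
}

row_totals = {r: sum(cols.values()) for r, cols in table.items()}
col_totals = {c: sum(table[r][c] for r in table) for c in ("No", "Yes")}
total = sum(row_totals.values())

chi_sq = 0.0
for r in table:
    for c in col_totals:
        expected = row_totals[r] * col_totals[c] / total   # count under independence
        chi_sq += (table[r][c] - expected) ** 2 / expected

df = (len(row_totals) - 1) * (len(col_totals) - 1)
# For 1 degree of freedom, P(X >= x) = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(chi_sq / 2))
print(f"chi-square = {chi_sq:.3f}, df = {df}, p = {p_value:.3f}")
```

Both numbers agree with the Pearson line in the output above.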

36 Conclusion
Hypotheses:
Null: The two variables are independent of each other (the occurrence of one does not influence the probability of the occurrence of the other).
Alternate: They are not independent (one influences the other).
To determine if a result is statistically significant, a researcher calculates a p-value, which is the probability of observing an effect at least as extreme as the one in the sample, given that the null hypothesis is true. The null hypothesis is rejected if the p-value is less than 0.05 (5%).
Given that the p-value > 0.05, we fail to reject the null hypothesis. The data provide no evidence that Gender and Dog Ownership are related; they are consistent with the two being independent of each other.

