2 Chapter Seventeen: Hypothesis Testing: Basic Concepts and Tests of Association
3 Hypothesis Testing: Basic Concepts
- A hypothesis is an assumption made about a population parameter (not a sample statistic)
- Purpose of hypothesis testing
  - To make a judgment about the difference between two sample statistics, or between a sample statistic and a hypothesized population parameter
- Evidence has to be evaluated statistically before arriving at a conclusion regarding the hypothesis
  - How it is evaluated depends on whether the sample contains few or many observations
4 Hypothesis Testing
- The null hypothesis (Ho) is tested against the alternative hypothesis (Ha)
- At least the null hypothesis is stated
- Decide upon the criteria to be used in making the decision whether to "reject" or "not reject" the null hypothesis
5 Hypothesis Testing Process
1. Problem definition
2. Clearly state the null and alternative hypotheses
3. Choose the relevant test and the appropriate probability distribution
4. Choose the critical value
   - Determine the significance level
   - Determine the degrees of freedom
   - Decide if one- or two-tailed test
5. Compute relevant test statistic
6. Compare test statistic & critical value
7. Does the test statistic fall in the critical region?
   - Yes: reject null
   - No: do not reject null
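The process above can be sketched for one simple case. This is a minimal illustration, assuming a one-sample z test with a known population standard deviation (so no degrees of freedom are needed); the function name `one_sample_z_test` and all input numbers are hypothetical, not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05, tails=2):
    """Return (z, critical_value, reject_null) for a one-sample z test."""
    # Compute the relevant test statistic.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Choose the critical value from the significance level and tail count.
    critical = NormalDist().inv_cdf(1 - alpha / tails)
    # Compare: does the test statistic fall in the critical region?
    if tails == 2:
        reject = abs(z) > critical
    else:
        reject = z > critical  # assumes an upper-tailed alternative
    return z, critical, reject


# Hypothetical data: Ho says the population mean is 50.
z, critical, reject = one_sample_z_test(sample_mean=52.0, mu0=50.0, sigma=10.0, n=100)
print(round(z, 2), round(critical, 2), reject)
```

With these made-up inputs the statistic exceeds the two-tailed critical value, so the null would be rejected at the 5% level.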
6 Basic Concepts of Hypothesis Testing
Three criteria used to decide the critical value (whether to accept or reject the null hypothesis):
- Significance level
- Degrees of freedom
- One- or two-tailed test
7 Significance Level
- Indicates the percentage of sample means that is outside the cut-off limits (critical value)
- The higher the significance level (α) used for testing a hypothesis, the higher the probability of rejecting a null hypothesis when it is true (Type I error)
- Accepting a null hypothesis when it is false is called a Type II error, and its probability is (β)
- When choosing a level of significance, there is an inherent tradeoff between these two types of errors
- A good test of hypothesis should reject a null hypothesis when it is false
(Look at book page 473: explain Type I/II error)
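The tradeoff between the two error types can be made concrete with a small calculation. The sketch below assumes an upper-tailed z test with a known standard deviation and a specific "true" mean under the alternative; the function name `type_ii_error` and every number are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(alpha, mu0, mu1, sigma, n):
    """Probability of not rejecting Ho (mean = mu0) when the true mean is mu1 > mu0."""
    se = sigma / sqrt(n)
    # Cut-off for the sample mean under Ho at significance level alpha.
    cutoff = NormalDist(mu0, se).inv_cdf(1 - alpha)
    # Beta: chance that the sample mean stays below the cut-off when mu1 is true.
    return NormalDist(mu1, se).cdf(cutoff)


for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error(alpha, mu0=50.0, mu1=52.0, sigma=10.0, n=100)
    print(f"alpha={alpha:.2f}  beta={beta:.3f}  power={1 - beta:.3f}")
```

Lowering the significance level moves the cut-off outward, so the Type II error probability rises and the power (1 - beta) falls.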
9 Relationship between Type I & Type II Errors (Contd.)
10 Relationship between Type I & Type II Errors (Contd.)
11 Choosing The Critical Value
- Power of hypothesis test (1 - β) should be as high as possible
- Degrees of freedom
  - The number of bits of "free" or unconstrained data used in calculating a sample statistic or test statistic
  - A sample mean (X̄) has n degrees of freedom
  - A sample variance (s²) has (n - 1) degrees of freedom
13 One or Two-tail Test
- One-tailed hypothesis test
  - Determines whether a particular population parameter is larger or smaller than some predefined value
  - Uses one critical value of test statistic
- Two-tailed hypothesis test
  - Determines the likelihood that a population parameter is within certain upper and lower bounds
  - May use one or two critical values
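To see how the tail choice changes the critical value, the following sketch computes cut-offs on the standard normal distribution. It assumes a z test; the helper name `z_critical_values` is hypothetical.

```python
from statistics import NormalDist


def z_critical_values(alpha, tails):
    """Return the critical value(s) of a z test as a tuple."""
    std_normal = NormalDist()
    if tails == 1:
        # All of alpha sits in one tail (upper tail shown here).
        return (std_normal.inv_cdf(1 - alpha),)
    # Two-tailed: alpha is split evenly between the lower and upper tails.
    half = alpha / 2
    return (std_normal.inv_cdf(half), std_normal.inv_cdf(1 - half))


one_tailed = z_critical_values(0.05, tails=1)
two_tailed = z_critical_values(0.05, tails=2)
print([round(v, 3) for v in one_tailed])
print([round(v, 3) for v in two_tailed])
```

At the same significance level, the two-tailed cut-offs lie further from zero than the one-tailed cut-off, because each tail holds only half of alpha.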
14 Basic Concepts of Hypothesis Testing (Contd.)
Select the appropriate probability distribution based on two criteria:
- Size of the sample
- Whether the population standard deviation is known or not
15 Hypothesis Testing
Data Analysis Outcome      Accept Null Hypothesis   Reject Null Hypothesis
Null Hypothesis is True    Correct Decision         Type I Error
Null Hypothesis is False   Type II Error            Correct Decision
16 Cross-tabulation and Chi Square
In marketing applications, the chi-square statistic is used as:
- Test of independence
  - Are there associations between two or more variables in a study?
- Test of goodness of fit
  - Is there a significant difference between an observed frequency distribution and a theoretical frequency distribution?
Statistical independence
- Two variables are statistically independent if a knowledge of one would offer no information as to the identity of the other
17 The Concept of Statistical Independence If n is equal to 200 and Ei is the number of outcomes expected in cell i,
19 Chi-Square As a Test of Independence (Contd.)
- Null hypothesis Ho
  - Two (nominally scaled) variables are statistically independent
- Alternative hypothesis Ha
  - The two variables are not independent
- Use chi-square distribution to test
20 Chi-square Distribution
- A probability distribution
- Total area under the curve is 1.0
- A different chi-square distribution is associated with different degrees of freedom
- Cutoff points of the chi-square distribution function
21 Chi-square Distribution (Contd.)
Degrees of freedom
- Number of degrees of freedom, v = (r - 1) * (c - 1)
  - r = number of rows in contingency table
  - c = number of columns
- Mean of chi-squared distribution = degrees of freedom (v)
- Variance = 2v
22 Chi-square Statistic (χ²)
- Measures the difference between the actual number observed in cell i (O_i) and the number expected (E_i) under the assumption of statistical independence, if the null hypothesis were true
- $\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}$, summed over all cells, with (r - 1) * (c - 1) degrees of freedom
  - O_i = observed number in cell i
  - E_i = number in cell i expected under independence
  - r = number of rows
  - c = number of columns
- Expected frequency in each cell: $E_i = p_c \cdot p_r \cdot n$
  - p_c and p_r are the column and row proportions of the two variables
  - n is the total number of observations
23 Chi-square Step-by-Step
1. Formulate hypothesis
2. Calculate row & column totals
3. Calculate row & column proportions
4. Calculate expected frequencies (Ei)
5. Calculate χ² statistic
6. Calculate degrees of freedom
7. Obtain critical value from table
8. Make decision regarding null hypothesis
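The steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions: the 2 x 2 table of counts is made up, the function name `chi_square_independence` is hypothetical, and the small lookup table holds the standard chi-square critical values at the 0.05 significance level in place of a printed table.

```python
# Chi-square critical values at alpha = 0.05, keyed by degrees of freedom.
CHI2_CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def chi_square_independence(observed):
    """Return (chi2, df, expected) for a contingency table given as a list of rows."""
    # Row and column totals, and the total number of observations.
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Row and column proportions.
    row_props = [t / n for t in row_totals]
    col_props = [t / n for t in col_totals]
    # Expected frequency per cell under independence: Ei = pc * pr * n.
    expected = [[pr * pc * n for pc in col_props] for pr in row_props]
    # Chi-square statistic: sum of (Oi - Ei)^2 / Ei over all cells.
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    # Degrees of freedom: (r - 1) * (c - 1).
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df, expected


# Hypothetical cross-tabulation of two nominal variables, n = 200.
observed = [[60, 40], [40, 60]]
chi2, df, expected = chi_square_independence(observed)
reject_null = chi2 > CHI2_CRITICAL_05[df]
print(round(chi2, 3), df, reject_null)
```

For these invented counts every expected frequency is 50, so the statistic is well above the critical value for one degree of freedom and independence would be rejected.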
24 Strength of Association
- Measured by contingency coefficient
- 0 = no association (i.e., variables are statistically independent)
- Maximum value depends on the size of table
  - Compare only tables of same size
25 Limitations of Chi-square as an Association Measure
- It is basically proportional to sample size
  - Difficult to interpret in absolute sense and compare cross-tabs of unequal size
- It has no upper bound
  - Difficult to obtain a feel for its value
- Does not indicate how two variables are related
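The first limitation is easy to demonstrate: doubling every cell count leaves the pattern of association unchanged but doubles the statistic. A minimal sketch with made-up counts and a hypothetical helper `chi_square`:

```python
def chi_square(observed):
    """Chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected count under independence for cell (i, j).
            e = row_totals[i] * col_totals[j] / n
            chi2 += (o - e) ** 2 / e
    return chi2


small = [[30, 20], [20, 30]]                             # hypothetical table, n = 100
large = [[2 * count for count in row] for row in small]  # same proportions, n = 200
chi2_small = chi_square(small)
chi2_large = chi_square(large)
print(chi2_small, chi2_large)
```

Both tables have identical row and column proportions, yet the larger sample yields a statistic twice as big, which is why raw chi-square values are hard to compare across studies.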
26 Measures of Association for Nominal Variables
Measures based on chi-square:
- Phi-squared
- Cramér's V
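The slides name these measures without giving formulas. The sketch below uses the standard textbook definitions, which are an assumption here rather than something stated in the deck: phi-squared = χ²/n, Cramér's V = sqrt(phi-squared / min(r - 1, c - 1)), and the contingency coefficient C = sqrt(χ² / (χ² + n)). The function name and input values are hypothetical.

```python
from math import sqrt


def association_measures(chi2, n, n_rows, n_cols):
    """Return (phi_squared, cramers_v, contingency_coefficient)."""
    # Phi-squared removes the dependence of chi-square on sample size.
    phi_squared = chi2 / n
    # Cramer's V rescales phi-squared by table size so it lies between 0 and 1.
    cramers_v = sqrt(phi_squared / min(n_rows - 1, n_cols - 1))
    # Contingency coefficient: 0 means no association; its maximum depends on table size.
    contingency_c = sqrt(chi2 / (chi2 + n))
    return phi_squared, cramers_v, contingency_c


# Hypothetical inputs: chi-square of 8.0 from a 2 x 2 table with 200 observations.
phi2, v, c = association_measures(chi2=8.0, n=200, n_rows=2, n_cols=2)
print(round(phi2, 3), round(v, 3), round(c, 3))
```

Because all three measures divide by the sample size in some form, they can be compared across samples of different sizes, although the contingency coefficient should still only be compared between tables of the same dimensions.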
27 Chi-square Goodness of Fit
- Used to investigate how well the observed pattern fits the expected pattern
- Researcher may determine whether population distribution corresponds to either a normal, Poisson or binomial distribution
- To determine degrees of freedom:
  - Employ (k - 1) rule
  - Subtract an additional degree of freedom for each population parameter that has to be estimated from the sample data
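A goodness-of-fit calculation with the degrees-of-freedom rule can be sketched as follows. The observed counts, the uniform expected pattern, and the names `goodness_of_fit` and `estimated_params` are all hypothetical; the lookup table holds standard chi-square critical values at the 0.05 level.

```python
# Chi-square critical values at alpha = 0.05, keyed by degrees of freedom.
CHI2_CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def goodness_of_fit(observed, expected_props, estimated_params=0):
    """Return (chi2, df) comparing observed counts with expected proportions."""
    n = sum(observed)
    # Expected count in each of the k categories.
    expected = [p * n for p in expected_props]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # (k - 1) rule, minus one more for each parameter estimated from the sample.
    df = len(observed) - 1 - estimated_params
    return chi2, df


# Hypothetical counts over five categories, tested against a uniform pattern.
observed = [18, 22, 20, 25, 15]
chi2, df = goodness_of_fit(observed, expected_props=[0.2] * 5)
reject_null = chi2 > CHI2_CRITICAL_05[df]
print(round(chi2, 2), df, reject_null)
```

Here no parameter is estimated from the data, so df = k - 1 = 4. Fitting, say, a Poisson distribution whose mean is estimated from the sample would call for `estimated_params=1`.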