MKTG 368 All Statistics PowerPoints


MKTG 368 All Statistics PowerPoints. Topics covered:
Setting Up Null and Alternative Hypotheses
One-Tailed vs. Two-Tailed Hypotheses
Single Sample T-Test
Paired Samples T-Test
Independent Samples T-Test
ANOVA
Correlation and Regression
One-Way and Two-Way Chi-Square

Translating a Problem Statement Into the Null and Alternative Hypotheses

Initial Problem Statement Example: Let’s say we are interested in whether a flyer increases contributions to National Public Radio. We know that last year the average contribution was $52. This year, we sent out a flyer to 30 people explaining the benefits of NPR and asked for donations. This year’s average contribution with the flyer ended up being $55, with a standard deviation of $12. How do we translate this into the null and alternative hypotheses (in terms of both a sentence and a formula)?

Gleaning Information from the Statement Example: Let’s say we are interested in whether a flyer increases contributions to National Public Radio. We know that last year the average contribution was $52. This year, we sent out a flyer to 30 people explaining the benefits of NPR and asked for donations. This year’s average contribution with the flyer ended up being $55, with a standard deviation of $12. Direction of Alternative Hypothesis Population Information Sample Information

Translating Information into Null and Alternative Hypotheses. Set up the alternative hypothesis first; the null is the exact opposite of the alternative. Together, the null and alternative must include all possibilities; hence we say "less than or equal to" rather than just "less than." Ho (Null Hypothesis): A flyer does not increase contributions to NPR: Flyer$ ≤ Last Year$. H1 (Alternative Hypothesis): A flyer increases contributions to NPR: Flyer$ > Last Year$. The subscript = the dependent variable (what you are comparing them on); the names (Flyer, Last Year) = the groups, conditions, or levels of the independent variable.

On One-Tailed (Directional) vs. Two-Tailed (Non-Directional) Hypotheses

Basics on the Normal Distribution: the distribution is symmetric, with negative values on the left and positive values on the right; roughly 68% of scores fall within 1 SD of the mean, 95% within 2 SD, and 99% within 3 SD.

One-Tailed Hypothesis (H1: Condition 1 > Condition 2) Ho (Null Hypothesis): A flyer does not increase contributions to NPR: Flyer$ ≤ Last Year$ H1 (Alternative Hypothesis): A flyer increases contributions to NPR: Flyer$ > Last Year$ In H1, b/c Flyer > Last Year Alpha region is on right side “Alpha Region” α = .05, 1-tailed (positive) t-critical

One-Tailed Hypothesis (H1: Condition 1 < Condition 2) Ho (Null Hypothesis): A poster does not decrease lbs of litter in park: Posterlbs ≥ Last Yearlbs H1 (Alternative Hypothesis): A poster decreases lbs of litter in park: Posterlbs < Last Yearlbs In H1, b/c Poster < Last Year Alpha region is on left side “Alpha Region” α = .01, 1-tailed (negative) t-critical

Two-Tailed Hypothesis (H1: Condition 1 ≠ Condition 2) Ho (Null Hypothesis): People are willing to pay the same for Nike vs. Adidas: NikeWTP = AdidasWTP H1 (Alternative Hypothesis): People not willing to pay same for Nike vs. Adidas: NikeWTP ≠ AdidasWTP In H1, b/c NikeWTP ≠ AdidasWTP Alpha region is on both sides; Half of .05 goes on each side “Alpha Region” α = .025 “Alpha Region” α = .025 (positive) t-critical (negative) t-critical

T-Tests Single Sample Paired Samples (Correlated Groups) Independent Samples

Single Sample T-Test (Example 1) Comparing a sample mean to an existing population mean

Gleaning Information from the Statement Example: Let’s say we are interested in whether a flyer increases contributions to National Public Radio. We know that last year the average contribution was $52. This year, we sent out a flyer to 30 people explaining the benefits of NPR and asked for donations. This year’s average contribution with the flyer ended up being $55, with a standard deviation of $12. Direction of Alternative Hypothesis Single Sample T-test: df = N-1 = 30-1 = 29 Population Information Sample Information Use alpha = .05 How do we get a t-critical value? 

Critical T-Table For single sample t-test, df = N-1

One-Tailed Hypothesis (H1: Condition 1 > Condition 2) Ho (Null Hypothesis): A flyer does not increase contributions to NPR: Flyer$ ≤ Last Year$ H1 (Alternative Hypothesis): A flyer increases contributions to NPR: Flyer$ > Last Year$ In H1, b/c Flyer > Last Year Alpha region is on right side “Alpha Region” α = .05, 1-tailed t-critical = 1.699 If t-obtained > t-critical, reject Ho (i.e., if t-obtained falls in the critical region, reject Ho).

Computation of Single Sample T-test: t = (X̄ – µ) / (s/√N) = (55 – 52) / (12/√30) = 1.37. “Alpha Region” α = .05, 1-tailed, t-critical = 1.699; t-obtained = 1.37. Decision? Because t-obtained (1.37) < t-critical (1.699), retain Ho. Conclusion? The flyer did not increase contributions to NPR.
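The arithmetic on this slide can be checked with a short pure-Python sketch; the helper name `one_sample_t` is mine, not from the slides, but the numbers ($55 sample mean, $52 population mean, s = $12, N = 30) come straight from the problem statement.

```python
from math import sqrt

def one_sample_t(sample_mean, pop_mean, sd, n):
    """Single-sample t: t = (sample mean - population mean) / (s / sqrt(N))."""
    return (sample_mean - pop_mean) / (sd / sqrt(n))

# NPR flyer example: M = $55, mu = $52, s = $12, N = 30
t_obt = one_sample_t(55, 52, 12, 30)
print(round(t_obt, 2))  # 1.37, which is < 1.699, so retain Ho
```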

Single Sample T-Test (Example 2) Comparing a sample mean to an existing population mean

Gleaning Information from the Statement Example: Let’s say we are interested in whether a poster decreases the amount of litter in city parks. We know that last year the average amount of litter in city parks was 115 lbs. This year, we placed posters in 25 parks that said “Did you know that 95% of people don’t litter? Join the crowd.” Later, when we weighed the litter, the average amount of litter was 100 lbs, with a standard deviation of 10 lbs. Direction of Alternative Hypothesis Single Sample T-test: df = N-1 = 25-1 = 24 Population Information Sample Information Use alpha = .01 How do we get a t-critical value? 

Critical T-Table For single sample t-test, df = N-1

One-Tailed Hypothesis (H1: Condition 1 < Condition 2) Ho (Null Hypothesis): A poster does not decrease lbs of litter in park: Posterlbs ≥ Last Yearlbs H1 (Alternative Hypothesis): A poster decreases lbs of litter in park: Posterlbs < Last Yearlbs In H1, b/c Poster < Last Year Alpha region is on left side “Alpha Region” α = .01, 1-tailed t-critical = -2.492

Computation of Single Sample T-test: t = (X̄ – µ) / (s/√N) = (100 – 115) / (10/√25) = -7.50. “Alpha Region” α = .01, 1-tailed, t-critical = -2.492; t-obtained = -7.50. Decision? Because t-obtained (-7.50) < t-critical (-2.492), reject Ho. Conclusion? The posters did decrease lbs of litter in the park.

Paired Samples T-Test: Comparing two scores from the same individual (or unit of analysis)

Gleaning Information from the Statement Example: Let’s say we are interested in whether a brand name (Nike vs. Adidas) affects willingness to pay for a sweatshirt. To explore this question, we take 9 people and have them indicate their WTP for a Nike sweatshirt and for an Adidas sweatshirt. The only difference between the sweatshirts is the brand name. Non-Directional (Two-Tailed) Alternative Hypothesis; doesn’t say “is higher” or “is lower”; just says “affects” Paired Samples T-test: df = N-1 = 9-1 = 8 Paired Scores From Same Person Use alpha = .05 How do we get a t-critical value? 

Two-Tailed Hypothesis (H1: Condition 1 ≠ Condition 2) Ho (Null Hypothesis): People are willing to pay the same for Nike vs. Adidas: NikeWTP = AdidasWTP H1 (Alternative Hypothesis): People not willing to pay same for Nike vs. Adidas: NikeWTP ≠ AdidasWTP In H1, b/c NikeWTP ≠ AdidasWTP Alpha region is on both sides; Half of .05 goes on each side “Alpha Region” α = .025 “Alpha Region” α = .025 (negative) t-critical (positive) t-critical

Critical T-Table For paired samples t-test, df = N-1

Two-Tailed Hypothesis (H1: Condition 1 ≠ Condition 2) Ho (Null Hypothesis): People are willing to pay the same for Nike vs. Adidas: NikeWTP = AdidasWTP H1 (Alternative Hypothesis): People not willing to pay same for Nike vs. Adidas: NikeWTP ≠ AdidasWTP In H1, b/c NikeWTP ≠ AdidasWTP Alpha region is on both sides; Half of .05 goes on each side “Alpha Region” α = .025 “Alpha Region” α = .025 t-critical = -2.306 t-critical = + 2.306

Defining Symbols in Paired T-test: D̄ = average difference score. D = difference score (e.g., time 1 vs. time 2; midterm vs. final; husband vs. wife). N = # of paired scores (not the # of numbers in front of you). µD = average difference score in the Null Hypothesis Population (most often = 0). SSD = Sum of Squared Deviations for the Difference Scores = ΣD² – [(ΣD)²/N]. tobt = the t statistic, which is compared to tcrit with N-1 df.

Paired Samples T-test: Nike vs. Adidas Sweatshirt Example. For each of the 9 people, D = Nike WTP – Adidas WTP and D² is that difference squared (e.g., the first person: Nike 25, Adidas 27, D = -2, D² = 4). Summary values: ΣD = 37, ΣD² = 281, Mean D = 4.11. First, compute SSD. Then, compute t.

Decision and Conclusion? “Alpha Region” α = .025 on each side; t-critical = ±2.306; t-obtained = 3.07. Decision? Because t-obtained (3.07) > t-critical (2.306), reject Ho. Conclusion? People are willing to pay more for Nike than for Adidas. We know this because the average difference score (Nike – Adidas) was positive.
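The paired t can be reproduced from just the summary values on the previous slide (ΣD = 37, ΣD² = 281, N = 9). This is a minimal sketch; the helper name `paired_t` is mine, not from the slides.

```python
from math import sqrt

def paired_t(sum_d, sum_d2, n):
    """Paired-samples t from raw-score sums of the difference scores."""
    mean_d = sum_d / n                 # D-bar = 37/9 = 4.11
    ss_d = sum_d2 - sum_d**2 / n       # SSD = sum(D^2) - (sum D)^2 / N
    s_d = sqrt(ss_d / (n - 1))         # SD of the difference scores
    return mean_d / (s_d / sqrt(n))    # t = D-bar / (s_D / sqrt(N))

# Nike vs. Adidas: sum D = 37, sum D^2 = 281, N = 9 pairs
print(round(paired_t(37, 281, 9), 2))  # 3.07, which exceeds 2.306, so reject Ho
```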

Independent Samples T-Test Comparing means of two conditions or groups

Gleaning Information from the Statement Example: Let’s say we are interested in how consumers respond to service failures, so we decide to run an experiment. We ask people to read about a hypothetical service failure scenario (e.g., delayed service at a restaurant). Then we randomly assign half of the subjects to the “apology” condition (we’ll call this Group 1), and the other half to a “control” condition (we’ll call this Group 2). Those in the apology condition read that the restaurant owner offered a sincere apology for having to wait so long. After this, we assess subjects’ self-reported anger (1 = not at all angry, 11 = fuming mad). We hypothesize that subjects will report less anger in the apology condition. Independent Samples t-test: df = N-2 = 20-2 = 18 Directional (One-Tailed) Alternative Hypothesis Scores come from two Independent groups Use alpha = .05 How do we get a t-critical value? 

One-Tailed Hypothesis (H1: Condition 1 < Condition 2) Ho (Null Hypothesis): An apology does not decrease anger: ApologyAnger ≥ ControlAnger H1 (Alternative Hypothesis): Anger will be lower in the Apology Condition: ApologyAnger < ControlAnger In H1, b/c Apology < Control Alpha region is on left side “Alpha Region” α = .05, 1-tailed (negative) t-critical

Critical T-Table For independent t-test, df = N-2

One-Tailed Hypothesis (H1: Condition 1 < Condition 2) Ho (Null Hypothesis): An apology does not decrease anger: ApologyAnger ≥ ControlAnger H1 (Alternative Hypothesis): Anger will be lower in the Apology Condition: ApologyAnger < ControlAnger In H1, b/c Apology < Control Alpha region is on left side “Alpha Region” α = .05, 1-tailed t-critical = -1.734

Defining Symbols in Independent T-test: X̄1 and X̄2 = the means of X1 and X2 (our two conditions), respectively. SS1 and SS2 = Sum of Squared Deviations for X1 and X2, where SS = ΣX² – [(ΣX)²/n] for each group. n = the number of subjects in each condition; n1 + n2 = N. In other words, n ≠ N! tobt = the t statistic, which is compared to tcrit with N-2 df.

Independent Samples T-test: Apology vs. No Apology Example. Anger scores for the 10 subjects in each condition give: SumA = 32, SumNA = 60, SumA² = 142, SumNA² = 434. Means: Apology = 3.2, No Apology = 6.0. SS: Apology = 142 – (32²/10) = 39.6; No Apology = 434 – (60²/10) = 74. First, compute SS for each condition. Then, compute t.

Decision and Conclusion? t-critical = -1.734, “Alpha Region” α = .05, 1-tailed; t-obtained = -2.49. Decision? Because t-obtained (-2.49) < t-critical (-1.734), reject Ho. Conclusion? People report less anger after an apology.
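The independent-samples t follows from the per-group means, SS values, and group sizes given on the previous slides. A minimal pooled-variance sketch (the helper name `independent_t` is mine, not from the slides):

```python
from math import sqrt

def independent_t(mean1, mean2, ss1, ss2, n1, n2):
    """Pooled-variance t for two independent groups."""
    pooled_var = (ss1 + ss2) / (n1 + n2 - 2)   # (39.6 + 74) / 18
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))  # standard error of the difference
    return (mean1 - mean2) / se

# Apology (M = 3.2, SS = 39.6, n = 10) vs. control (M = 6.0, SS = 74, n = 10)
print(round(independent_t(3.2, 6.0, 39.6, 74, 10, 10), 2))  # -2.49
```

Since -2.49 falls beyond the one-tailed critical value of -1.734, Ho is rejected.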

Analysis of Variance (ANOVA) Comparing means of three or more conditions or groups

The F-Ratio: A Ratio of Variances Between and Within Groups

Between Groups Variance (Numerator of F-ratio): in the illustration, three groups (means = 3, 5, and 9) each show variance within the group, while the spread among the three group means reflects the between-groups variance that forms the numerator of the F-ratio.

F-Distribution: a probability distribution; all values are positive (it is a ratio of variances); positively skewed; median near 1; shape varies with the degrees of freedom (within and between). The “Alpha Region” α = .05 falls in the upper tail.

F-critical Table: If we have 3 conditions, N = 14, alpha = .05; F-crit = 3.98. Alpha Level; df numerator = K-1; df denominator = N-K.

Null and Alternative Hypotheses Let’s say a marketing researcher is interested in the impact of music on sales at a new clothing store targeted to tweens. She sets up a mock store in her university’s research lab, gives each subject $50 spending money, and then randomly assigns subjects to one of three conditions. One third of the subjects browse the mock store with no music. One third browse the store with soft music. And the final third browse the store with loud music. The sales figures are shown below. Assume the researcher decides to use an alpha level of .01. Null Hypothesis (Ho): All of the means are equal (µcontrol = µsoft music = µloud music) Alternative (H1): At least two means are different F-critical (based on alpha = .01; df-numerator = 2; df-denominator = 9): 8.02 Decision Rule: If Fobt ≥ Fcritical, then reject Ho. Otherwise, retain Ho

The Data: Sales as a Function of Music Condition

Subtract the group mean from each score, then square and add up; this gives you the SS for that group.

Do this for each of the three conditions

(See Statistics Notes Packet.) For each environment (No Music, Soft Music, Loud Music), take each score’s deviation from its group mean (x – mean) and square it (x – mean)² to build the SS for that condition. Summary values: Grand Mean = 30.333; SS-between = 354.667, df-between = 2, MS-between = 177.333; SS-within = 68, df-within = 9, MS-within = 7.556; F = 23.471. Summarize in a Source Table.

ANOVA - Source Table Source SS df MS F Between 354.667 2 177.33 23.47 Within 68.00 9 7.56   Total 422.67 11
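The source table above can be verified with a short sketch that builds the mean squares and F-ratio from the SS and df entries; the helper name `anova_f` is mine, not from the slides.

```python
def anova_f(ss_between, df_between, ss_within, df_within):
    """F = MS-between / MS-within, where each MS = SS / df."""
    ms_between = ss_between / df_between   # 354.667 / 2 = 177.33
    ms_within = ss_within / df_within      # 68 / 9 = 7.56
    return ms_between / ms_within

# Music-and-sales example: SS-between = 354.667 (df = 2), SS-within = 68 (df = 9)
f_obt = anova_f(354.667, 2, 68.0, 9)
print(round(f_obt, 2))  # 23.47, well beyond F-critical = 8.02
```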

F-critical in our example = 8.02 (K = 3, alpha = .01).

Decision Rule and Conclusion? Reject Null Hypothesis At least two means are different “Alpha Region” α = .01 F-critical 8.02 F-obtained 23.47

Correlation

Differences Between Correlation and Regression. Correlation (r): assessing the direction (+ or -) and degree (strong, medium, weak) of the relationship between two variables. Linear regression (slope, y-intercept): assessing the nature of the relationship between an outcome variable and one or more predictors, and making predictions for Y based on X.

Reading Scatterplots: with X (predictor) on the horizontal axis and Y (criterion) on the vertical axis, a scatterplot can show a negative correlation, zero correlation, or positive correlation.

Two Interpretations of the Correlation Coefficient. (1) Direction & degree of relationship between two variables: r ranges from -1 to +1, with stronger correlations at the extremes; r = -1 (perfect negative relationship), r = 0 (no relationship), r = +1 (perfect positive relationship). (2) Variance explained: r², which ranges from 0 to 1.0, tells us what percent of the variance in Y is explained by X (the model comparison approach).

Problem Statement - A Let’s say we survey 5 shoppers about their level of satisfaction with the service they received from a furniture store (X = satisfaction w/service) and their intention to return to the store in the future (Y = future intentions). Presumably, there should be a positive correlation between these variables. Null Hypothesis (Ho): Satisfaction with service and future shopping intentions are not positively correlated Alternative (H1): Satisfaction with service and future shopping intentions are positively correlated (this is a directional hypothesis) r-critical (based on alpha = .05, one-tailed, df = N-2 = 3): r-critical = .8054 Decision Rule (in this example, because r is predicted to be positive): If robt ≥ rcritical, then reject Ho. Otherwise, retain Ho

r-critical Table If alpha = .05 (1-tailed), N = 5, df = 3, r-critical = .8054. Decision Rule (when r is predicted to be positive): If robt ≥ rcritical, then reject Ho. Otherwise, retain Ho. Decision Rule (when r is predicted to be negative): If robt ≤ -rcritical, then reject Ho. Otherwise, retain Ho. Decision Rule (when H1 is non-directional): If |robt| ≥ rcritical, then reject Ho. Otherwise, retain Ho.

Data and Scatterplot

Raw Score Formula for Pearson’s r Correlation: r = [ΣXY – (ΣX)(ΣY)/N] / √(SSx · SSy), where SSx = ΣX² – (ΣX)²/N and SSy = ΣY² – (ΣY)²/N.

Computing Pearson’s r (and variance explained). Compute SSx and SSy, then compute r. Here r = .313, so r² = (.313)(.313) = .098. So, satisfaction explains 9.8% of the variance in future intentions.
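The raw-score formula can be sketched directly from the summary sums used in this example (ΣX = 22, ΣY = 20, ΣX² = 106, ΣY² = 90, ΣXY = 91, N = 5). The helper name `pearson_r` is mine, not from the slides.

```python
from math import sqrt

def pearson_r(sum_x, sum_y, sum_x2, sum_y2, sum_xy, n):
    """Raw-score Pearson r: [sum XY - (sum X)(sum Y)/N] / sqrt(SSx * SSy)."""
    numerator = sum_xy - sum_x * sum_y / n   # 91 - (22)(20)/5 = 3
    ss_x = sum_x2 - sum_x**2 / n             # 106 - 22^2/5 = 9.2
    ss_y = sum_y2 - sum_y**2 / n             # 90 - 20^2/5 = 10
    return numerator / sqrt(ss_x * ss_y)

# Satisfaction (X) and future intentions (Y) for the five shoppers
r = pearson_r(22, 20, 106, 90, 91, 5)
print(round(r, 3), round(r**2, 3))  # 0.313 0.098
```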

Regression

Problem Statement - B Let’s use the data we just worked with for correlation. Five shoppers were asked their satisfaction with the service they received and their intention to shop at the store in the future. Regression would be used to make predictions for future shopping intentions (Y) based on people’s satisfaction with service (X). For example, what would we predict if a shopper rated their satisfaction with service at a 3? First need to compute regression equation, then use it to make a prediction

Slope and Y-Intercept: on a plot of X (predictor) against Y (criterion), the y-intercept (bo) is the value of Y when X = 0, and the slope (b1) is the change in Y for a 1-unit change in X.

Raw Score Formula for Slope and Y-Intercept: b1 = [ΣXY – (ΣX)(ΣY)/N] / SSx, and bo = Ȳ – b1X̄ (the mean of Y minus the slope times the mean of X).

Computing Regression Equation. Five customers (Hector, Marge, Fredrick, Susie, Gwen) each provide satisfaction w/service (X, the predictor) and future intentions (Y, the criterion), along with X², Y², and XY. Summary: Sum X = 22, Sum Y = 20, Sum X² = 106, Sum Y² = 90, Sum XY = 91. Numerator = Sum XY – [(Sum X)(Sum Y)]/N = 91 – 88 = 3; SSx = 9.2. First compute the slope: b1 = 3/9.2 = 0.326. Then compute the y-intercept from mean Y = 4.0 and mean X = 4.4: bo = 4.0 – 0.326(4.4) = 2.57. So, the regression equation is: Ŷ = 2.57 + 0.326X.
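The slope and intercept follow from the same raw-score sums, and the fitted equation can then generate predictions. A minimal sketch (the helper name `regression` is mine, not from the slides):

```python
def regression(sum_x, sum_y, sum_xy, sum_x2, n):
    """Raw-score slope b1 = [sum XY - (sum X)(sum Y)/N] / SSx; bo = Ybar - b1*Xbar."""
    b1 = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x**2 / n)  # 3 / 9.2
    b0 = sum_y / n - b1 * sum_x / n                              # 4.0 - b1 * 4.4
    return b0, b1

b0, b1 = regression(22, 20, 91, 106, 5)
print(round(b1, 3), round(b0, 2))  # 0.326 2.57
print(round(b0 + b1 * 3, 2))       # predicted Y for X = 3 -> 3.54
```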

Data and Scatterplot

Using the Regression Equation to Make a Prediction. Let’s say a customer rates their satisfaction as a ‘3’ on our 7-point scale. What is their predicted intention of shopping at the store in the future? Ŷ = 2.57 + 0.326(3) ≈ 3.54. So, a person who gives a ‘3’ on the satisfaction scale has a predicted future-intention score of about 3.54.

Y-Predicted & Residuals: if Satisfaction (X) = 3, Predicted Intention (Ŷ) = 3.54. The residual is (Y – Y-predicted), the vertical distance from each actual point to the regression line; when r is strong, residuals are small.

Chi-Square

One-Way vs. Two-Way Chi-Square. Chi-square is appropriate when our data are frequency (count) data. In a “one-way chi-square,” we have one categorical variable (type of shoe) with several levels (Adidas, Asics, Nike, Puma) and we want to know whether the frequency of observations differs between the groups (or conditions, or levels). In a “two-way chi-square,” we have two categorical variables (Gender × Support for New Stadium) and we want to know if these two variables are related.

One-way Chi-Square: X²obt = Σ[(fo – fe)²/fe], where fo = frequency observed, fe = frequency expected, and the sum runs over the K groups.

Problem Statement Let’s say we ask 100 people to pick their favorite brand of shoes among four types. The data are shown below. Clearly, the frequencies are not equal (i.e., not 25 in each). Here, 15 pick Adidas, 30 pick Asics, 45 pick Nike, and 10 pick Puma. The question is whether these frequencies are significantly different. Null Hypothesis (Ho): The frequencies of people choosing the different brands are equal Alternative (H1): Not all the frequencies are equal (this doesn’t mean they’re all different) X²-critical (based on alpha = .05; df = K-1 = 4-1 = 3): X²-critical = 7.815 (X²-critical is always positive). In df, K stands for the number of groups (see critical table, next page). The decision rule is always as follows (b/c chi-square is always positive): If X²obt ≥ X²critical, then reject Ho. Otherwise, retain Ho

Chi-Square Critical Table (for one-way chi-square) df = K-1, where K = # groups

Formula and Frequency Expected. fo = frequency observed (the actual frequencies); fe = frequency expected (typically = total N/K). To compute chi-square, we need to know fe, the expected frequency. Typically, we’ll just assume this represents an equal distribution across the conditions (total N/K). So, we have a total of 100 people and 4 conditions (brands of shoe). Based on chance alone, an equal distribution across the conditions would mean 25 people would select each type of shoe. So, here we’ll assume fe = 25.

Computation Decision: Reject Ho, because X2obt (30) ≥ X2critical (7.815). Conclusion: People do not show an equal preference among the four brands of shoes.
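The one-way chi-square above can be sketched in a few lines from the observed counts (15, 30, 45, 10) and an equal-split expectation of 25 per brand; the helper name `chi_square` is mine, not from the slides.

```python
def chi_square(observed, expected):
    """chi^2 = sum over cells of (fo - fe)^2 / fe."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

observed = [15, 30, 45, 10]          # Adidas, Asics, Nike, Puma
fe = sum(observed) / len(observed)   # equal split: 100 people / 4 brands = 25
print(chi_square(observed, [fe] * 4))  # 30.0, which exceeds 7.815, so reject Ho
```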

Two-Way Chi-Square: X²obt = Σ[(fo – fe)²/fe], where the sum runs over all cells and fe for each cell = (row N × column N)/total N.

Problem Statement Let’s say we’re interested in whether males and females differ in their support for building a new football stadium. We survey 40 people (10 men and 30 women) and ask them a simple (categorical) yes/no question: Do you support building a new football stadium? Of the 10 men surveyed, 8 supported it and 2 didn’t; of the 30 women surveyed, 5 supported it and 25 didn’t. Now we want to know if there is a relationship between gender (male/female) and support for the stadium (yes/no). Null Hypothesis (Ho): There is no relationship between gender and support for the football stadium Alternative (H1): There is a relationship between gender and support for the football stadium X²-critical (based on alpha = .05; df = (Rows-1)*(Columns-1) = (2-1)*(2-1) = 1): X²-critical = 3.841 (X²-critical is always positive) (see critical table, next page) The decision rule is always as follows (b/c chi-square is always positive): If X²obt ≥ X²critical, then reject Ho. Otherwise, retain Ho

Data Of the 10 men surveyed, 8 supported it, and 2 didn’t Of the 30 women surveyed, 5 supported it and 25 didn’t

Chi-Square Critical Table (works for two-way chi-square) df = (Rows-1)*(Columns-1)

Formula and Frequency Expected. fo = frequency observed (the actual frequencies); fe = frequency expected = (row N × column N)/total N. To compute chi-square, we need to know fe for each cell; here fe reflects what we would expect if gender and support were unrelated. The next slide illustrates the computation of frequency expected.

Computing Frequency Expected and Chi-Square. For each of the four cells, fe = (row N × column N)/total N: men-yes = (10 × 13)/40 = 3.25; men-no = (10 × 27)/40 = 6.75; women-yes = (30 × 13)/40 = 9.75; women-no = (30 × 27)/40 = 20.25. Decision: Reject Ho, because X²obt (13.7) ≥ X²critical (3.841). Conclusion: Gender is related to support for the football stadium (men > women).
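The two-way computation above can be sketched from the 2×2 table of observed counts (rows: men, women; columns: support yes, no); the helper name `chi_square_2way` is mine, not from the slides.

```python
def chi_square_2way(table):
    """Expected fe per cell = (row total * column total) / grand total;
    chi^2 = sum over cells of (fo - fe)^2 / fe."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / total
            chi2 += (fo - fe) ** 2 / fe
    return chi2

# Rows: men, women; columns: support yes, support no
print(round(chi_square_2way([[8, 2], [5, 25]]), 1))  # 13.7, which exceeds 3.841
```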