# MKTG 368 All Statistics PowerPoints


MKTG 368 All Statistics PowerPoints
Topics: Setting Up Null and Alternative Hypotheses; One-Tailed vs. Two-Tailed Hypotheses; Single Sample T-Test; Paired Samples T-Test; Independent Samples T-Test; ANOVA; Correlation and Regression; One-Way and Two-Way Chi-Square

Translating a Problem Statement Into the Null and Alternative Hypotheses

Initial Problem Statement
Example: Let’s say we are interested in whether a flyer increases contributions to National Public Radio. We know that last year the average contribution was \$52. This year, we sent out a flyer to 30 people explaining the benefits of NPR and asked for donations. This year’s average contribution with the flyer ended up being \$55, with a standard deviation of \$12. How do we translate this into the null and alternative hypotheses (in terms of both a sentence and a formula)?

Gleaning Information from the Statement
Example: Let’s say we are interested in whether a flyer increases contributions to National Public Radio. We know that last year the average contribution was \$52. This year, we sent out a flyer to 30 people explaining the benefits of NPR and asked for donations. This year’s average contribution with the flyer ended up being \$55, with a standard deviation of \$12.

The statement supplies three pieces of information. Direction of the alternative hypothesis: “increases”. Population information: last year’s average contribution of \$52. Sample information: 30 people, an average contribution of \$55, and a standard deviation of \$12.

Translating Information into Null and Alternative Hypotheses
Set up the alternative hypothesis first. The null is the exact opposite of the alternative, and together the null and alternative must include all possibilities. Hence, the null says ‘less than or equal to’ rather than just ‘less than’.

Ho (Null Hypothesis): A flyer does not increase contributions to NPR: Flyer\$ ≤ Last Year\$

H1 (Alternative Hypothesis): A flyer increases contributions to NPR: Flyer\$ > Last Year\$

In this notation, the names (Flyer, Last Year) are the groups, conditions, or levels of the independent variable, and the subscript (\$) is the dependent variable (what you are comparing them on).

On One-Tailed (Directional) vs. Two-Tailed (Non-Directional) Hypotheses

Basics on the Normal Distribution
[Figure: normal curve with negative values to the left and positive values to the right; regions labeled 68%, 95%, and 99%.]

One-Tailed Hypothesis (H1: Condition 1 > Condition 2)
Ho (Null Hypothesis): A flyer does not increase contributions to NPR: Flyer\$ ≤ Last Year\$

H1 (Alternative Hypothesis): A flyer increases contributions to NPR: Flyer\$ > Last Year\$

In H1, because Flyer > Last Year, the alpha region is on the right side.

[Figure: t-distribution with the alpha region (α = .05, 1-tailed) shaded to the right of the positive t-critical value.]

One-Tailed Hypothesis (H1: Condition 1 < Condition 2)
Ho (Null Hypothesis): A poster does not decrease lbs of litter in park: Posterlbs ≥ Last Yearlbs

H1 (Alternative Hypothesis): A poster decreases lbs of litter in park: Posterlbs < Last Yearlbs

In H1, because Poster < Last Year, the alpha region is on the left side.

[Figure: t-distribution with the alpha region (α = .01, 1-tailed) shaded to the left of the negative t-critical value.]

Two-Tailed Hypothesis (H1: Condition 1 ≠ Condition 2)
Ho (Null Hypothesis): People are willing to pay the same for Nike vs. Adidas: NikeWTP = AdidasWTP

H1 (Alternative Hypothesis): People are not willing to pay the same for Nike vs. Adidas: NikeWTP ≠ AdidasWTP

In H1, because NikeWTP ≠ AdidasWTP, the alpha region is on both sides; half of .05 goes on each side.

[Figure: t-distribution with an alpha region of α = .025 shaded in each tail, beyond the negative and positive t-critical values.]

T-Tests: Single Sample, Paired Samples (Correlated Groups), Independent Samples

Single Sample T-Test (Example 1)
Comparing a sample mean to an existing population mean

Gleaning Information from the Statement
Example: Let’s say we are interested in whether a flyer increases contributions to National Public Radio. We know that last year the average contribution was \$52. This year, we sent out a flyer to 30 people explaining the benefits of NPR and asked for donations. This year’s average contribution with the flyer ended up being \$55, with a standard deviation of \$12.

The statement gives the direction of the alternative hypothesis (“increases”), population information (last year’s average of \$52), and sample information (N = 30, mean = \$55, standard deviation = \$12). Single sample t-test: df = N-1 = 30-1 = 29. Use alpha = .05. How do we get a t-critical value?

Critical T-Table For single sample t-test, df = N-1

One-Tailed Hypothesis (H1: Condition 1 > Condition 2)
Ho (Null Hypothesis): A flyer does not increase contributions to NPR: Flyer\$ ≤ Last Year\$

H1 (Alternative Hypothesis): A flyer increases contributions to NPR: Flyer\$ > Last Year\$

In H1, because Flyer > Last Year, the alpha region is on the right side. With α = .05, 1-tailed, and df = 29, t-critical = 1.699. If t-obtained > t-critical, reject Ho (i.e., if t-obtained falls in the critical region, reject Ho).

Computation of Single Sample T-test
t-obtained = 1.37; t-critical = 1.699 (α = .05, 1-tailed).

Decision? Because t-obtained (1.37) < t-critical (1.699), retain Ho.

Conclusion? The data do not show that the flyer increased contributions to NPR.
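The computation behind t-obtained can be sketched in a few lines of Python. This is a minimal illustration using the summary statistics from the problem statement; the function name `single_sample_t` is hypothetical and not part of the course materials.

```python
import math

def single_sample_t(sample_mean, pop_mean, sample_sd, n):
    """Return (t_obtained, df) for a single-sample t-test from summary statistics."""
    # Standard error of the mean: sample SD divided by the square root of N.
    standard_error = sample_sd / math.sqrt(n)
    t_obtained = (sample_mean - pop_mean) / standard_error
    return t_obtained, n - 1

# NPR flyer example: sample mean $55, population mean $52, SD $12, N = 30.
t_obt, df = single_sample_t(sample_mean=55, pop_mean=52, sample_sd=12, n=30)
print(f"t({df}) = {t_obt:.2f}")  # t(29) = 1.37
```

Because 1.37 is smaller than the tabled critical value of 1.699, the result falls outside the alpha region, which matches the decision to retain Ho.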

Single Sample T-Test (Example 2)
Comparing a sample mean to an existing population mean

Gleaning Information from the Statement
Example: Let’s say we are interested in whether a poster decreases the amount of litter in city parks. We know that last year the average amount of litter in city parks was 115 lbs. This year, we placed posters in 25 parks that said “Did you know that 95% of people don’t litter? Join the crowd.” Later, when we weighed the litter, the average amount of litter was 100 lbs, with a standard deviation of 10 lbs.

The statement gives the direction of the alternative hypothesis (“decreases”), population information (last year’s average of 115 lbs), and sample information (N = 25, mean = 100 lbs, standard deviation = 10 lbs). Single sample t-test: df = N-1 = 25-1 = 24. Use alpha = .01. How do we get a t-critical value?

Critical T-Table For single sample t-test, df = N-1

One-Tailed Hypothesis (H1: Condition 1 < Condition 2)
Ho (Null Hypothesis): A poster does not decrease lbs of litter in park: Posterlbs ≥ Last Yearlbs

H1 (Alternative Hypothesis): A poster decreases lbs of litter in park: Posterlbs < Last Yearlbs

In H1, because Poster < Last Year, the alpha region is on the left side. With α = .01, 1-tailed, and df = 24, t-critical = -2.492.

Computation of Single Sample T-test
t-obtained = -7.50; t-critical = -2.492 (α = .01, 1-tailed).

Decision? Because t-obtained (-7.50) < t-critical (-2.492), reject Ho.

Conclusion? The posters did decrease lbs of litter in the parks.
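The decision rule depends on which tail holds the alpha region. The sketch below encodes that rule for the two single-sample examples; the function `reject_null` and its `tail` labels are illustrative assumptions, and the critical values are the tabled ones quoted on the slides.

```python
def reject_null(t_obtained, t_critical, tail):
    """Return True if t_obtained falls in the alpha (critical) region.

    tail: "right" (H1: >), "left" (H1: <), or "two" (H1: not equal).
    t_critical may be given with either sign; only its magnitude is used.
    """
    magnitude = abs(t_critical)
    if tail == "right":
        return t_obtained > magnitude
    if tail == "left":
        return t_obtained < -magnitude
    if tail == "two":
        return abs(t_obtained) > magnitude
    raise ValueError("tail must be 'right', 'left', or 'two'")

# Example 1 (flyer): right-tailed, alpha = .05, df = 29.
flyer_decision = reject_null(1.37, 1.699, "right")
# Example 2 (poster): left-tailed, alpha = .01, df = 24.
poster_decision = reject_null(-7.50, -2.492, "left")

print(flyer_decision, poster_decision)  # False True
```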

Paired Samples T-Test
Comparing two scores from the same individual (or unit of analysis)

Gleaning Information from the Statement
Example: Let’s say we are interested in whether a brand name (Nike vs. Adidas) affects willingness to pay (WTP) for a sweatshirt. To explore this question, we take 9 people and have them indicate their WTP for a Nike sweatshirt and for an Adidas sweatshirt. The only difference between the sweatshirts is the brand name.

This is a non-directional (two-tailed) alternative hypothesis: the statement doesn’t say “is higher” or “is lower”; it just says “affects”. The scores are paired because both come from the same person. Paired samples t-test: df = N-1 = 9-1 = 8. Use alpha = .05. How do we get a t-critical value?

Two-Tailed Hypothesis (H1: Condition 1 ≠ Condition 2)
Ho (Null Hypothesis): People are willing to pay the same for Nike vs. Adidas: NikeWTP = AdidasWTP

H1 (Alternative Hypothesis): People are not willing to pay the same for Nike vs. Adidas: NikeWTP ≠ AdidasWTP

In H1, because NikeWTP ≠ AdidasWTP, the alpha region is on both sides; half of .05 goes on each side.

[Figure: t-distribution with an alpha region of α = .025 shaded in each tail, beyond the negative and positive t-critical values.]

Critical T-Table For paired samples t-test, df = N-1

Two-Tailed Hypothesis (H1: Condition 1 ≠ Condition 2)
Ho (Null Hypothesis): People are willing to pay the same for Nike vs. Adidas: NikeWTP = AdidasWTP

H1 (Alternative Hypothesis): People are not willing to pay the same for Nike vs. Adidas: NikeWTP ≠ AdidasWTP

In H1, because NikeWTP ≠ AdidasWTP, the alpha region is on both sides; half of .05 goes on each side (α = .025 per tail). With df = 8, t-critical = -2.306 on the left and +2.306 on the right.

Defining Symbols in Paired T-test
$\bar{D}$ = average difference score.

D = difference score (e.g., time 1 vs. time 2; midterm vs. final; husband vs. wife).

N = number of paired scores (not the number of individual scores in front of you).

$\mu_D$ = average difference score in the null hypothesis population (most often 0).

$SS_D$ = Sum of Squared Deviations for the difference scores = $\sum D^2 - (\sum D)^2 / N$.

$t_{obt}$ = the t statistic, which is compared to $t_{crit}$ with N-1 df.

Paired Samples T-test Nike vs. Adidas Sweatshirt Example
[Table: Nike WTP, Adidas WTP, difference score D (Nike – Adidas), and D2 for each of the 9 people. Column totals: Sum of D = 37, Sum of D2 = 281. Mean D = 4.11.]

First, compute SSD. Then, compute t.

Decision and Conclusion?
t-obtained = 3.07; t-critical = ±2.306 (α = .025 in each tail).

Decision? Because t-obtained (3.07) > t-critical (2.306), reject Ho.

Conclusion? People are willing to pay more for Nike than for Adidas. We know the direction because the average difference score (Nike – Adidas) was positive.
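The two computation steps (SSD first, then t) can be reproduced from the column totals alone. The following is a sketch under the definitions given in the symbols slide; the function name `paired_t_from_sums` is hypothetical, and the standard error is written as the square root of SSD / [N(N-1)], which is the usual raw-score form and is assumed to match the slide's formula.

```python
import math

def paired_t_from_sums(sum_d, sum_d_sq, n, mu_d=0.0):
    """Paired-samples t from the sum of D, the sum of D squared, and N pairs."""
    mean_d = sum_d / n
    # Sum of squared deviations of the difference scores.
    ss_d = sum_d_sq - sum_d ** 2 / n
    # Standard error of the mean difference.
    standard_error = math.sqrt(ss_d / (n * (n - 1)))
    t_obtained = (mean_d - mu_d) / standard_error
    return t_obtained, ss_d, n - 1

# Nike vs. Adidas example: sum of D = 37, sum of D squared = 281, N = 9 pairs.
t_paired, ss_d, df_paired = paired_t_from_sums(sum_d=37, sum_d_sq=281, n=9)
print(f"SS_D = {ss_d:.2f}, t({df_paired}) = {t_paired:.2f}")  # SS_D = 128.89, t(8) = 3.07
```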

Independent Samples T-Test
Comparing means of two conditions or groups

Gleaning Information from the Statement
Example: Let’s say we are interested in how consumers respond to service failures, so we decide to run an experiment. We ask people to read about a hypothetical service failure scenario (e.g., delayed service at a restaurant). Then we randomly assign half of the subjects to the “apology” condition (we’ll call this Group 1), and the other half to a “control” condition (we’ll call this Group 2). Those in the apology condition read that the restaurant owner offered a sincere apology for the long wait. After this, we assess subjects’ self-reported anger (1 = not at all angry, 11 = fuming mad). We hypothesize that subjects will report less anger in the apology condition.

This is a directional (one-tailed) alternative hypothesis, and the scores come from two independent groups. Independent samples t-test: df = N-2 = 20-2 = 18. Use alpha = .05. How do we get a t-critical value?

One-Tailed Hypothesis (H1: Condition 1 < Condition 2)
Ho (Null Hypothesis): An apology does not decrease anger: ApologyAnger ≥ ControlAnger

H1 (Alternative Hypothesis): Anger will be lower in the Apology Condition: ApologyAnger < ControlAnger

In H1, because Apology < No Apology, the alpha region is on the left side.

[Figure: t-distribution with the alpha region (α = .05, 1-tailed) shaded to the left of the negative t-critical value.]

Critical T-Table For independent t-test, df = N-2

One-Tailed Hypothesis (H1: Condition 1 < Condition 2)
Ho (Null Hypothesis): An apology does not decrease anger: ApologyAnger ≥ ControlAnger

H1 (Alternative Hypothesis): Anger will be lower in the Apology Condition: ApologyAnger < ControlAnger

In H1, because Apology < No Apology, the alpha region is on the left side. With α = .05, 1-tailed, and df = 18, t-critical = -1.734.

Defining Symbols in Independent T-test
$\bar{X}_1$ and $\bar{X}_2$ = the means of $X_1$ and $X_2$ (our two conditions), respectively.

$SS_1$ and $SS_2$ = Sum of Squared Deviations for $X_1$ and $X_2$, where $SS = \sum X^2 - (\sum X)^2 / n$ for each group.

n = the number of subjects in each condition. $n_1 + n_2 = N$; in other words, n ≠ N.

$t_{obt}$ = the t statistic, which is compared to $t_{crit}$ with N-2 df.

Independent Samples T-test Apology vs. No Apology Example
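[Table: anger score and squared score for each subject in the Apology (A) and No Apology (NA) conditions, 10 subjects per condition.]

| | Apology (A) | No Apology (NA) |
|---|---|---|
| Sum of scores | 32 | 60 |
| Sum of squared scores | 142 | 434 |
| Mean | 3.2 | 6.0 |
| SS | 39.6 | 74 |

First, compute SS for each condition. Then, compute t.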

Decision and Conclusion?
t-obtained = -2.50; t-critical = -1.734 (α = .05, 1-tailed).

Decision? Because t-obtained (-2.50) < t-critical (-1.734), reject Ho.

Conclusion? People report less anger after an apology.
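The independent-samples computation can be checked from the sums in the worksheet. This sketch assumes the pooled-variance form of the t statistic (pooled variance = (SS1 + SS2) / (N-2)), which is the standard raw-score approach; the function name `independent_t_from_sums` is hypothetical. It yields roughly -2.49, in line with the -2.50 reported on the slide (the small gap is presumably rounding).

```python
import math

def independent_t_from_sums(sum1, sum_sq1, n1, sum2, sum_sq2, n2):
    """Pooled-variance independent-samples t from per-group sums."""
    mean1, mean2 = sum1 / n1, sum2 / n2
    # SS for each group: sum of squared scores minus (sum of scores)^2 / n.
    ss1 = sum_sq1 - sum1 ** 2 / n1
    ss2 = sum_sq2 - sum2 ** 2 / n2
    df = n1 + n2 - 2
    pooled_variance = (ss1 + ss2) / df
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, ss1, ss2, df

# Apology (group 1) vs. No Apology (group 2), 10 subjects per condition.
t_ind, ss_a, ss_na, df_ind = independent_t_from_sums(32, 142, 10, 60, 434, 10)
print(f"SS_A = {ss_a:.1f}, SS_NA = {ss_na:.1f}, t({df_ind}) = {t_ind:.2f}")
# SS_A = 39.6, SS_NA = 74.0, t(18) = -2.49
```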

Analysis of Variance (ANOVA)
Comparing means of three or more conditions or groups

The F-Ratio: A Ratio of Variances Between and Within Groups

Between Groups Variance (Numerator of F-ratio)
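[Figure: three group distributions with means of 3, 5, and 9. The spread of scores inside each group is the within-group variance (Within Group 1, Within Group 2, Within Group 3); the spread among the three group means is the between-groups variance, the numerator of the F-ratio.]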

F-Distribution Probability distribution
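All values are positive (it is a ratio of variances). Positively skewed. Typically centered near 1 when the null hypothesis is true. Shape varies with the degrees of freedom (within and between).

[Figure: F-distribution with 1 marked on the horizontal axis and the alpha region (α = .05) shaded in the right tail.]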

F-critical Table
If we have 3 conditions, N = 14, and alpha = .05: df numerator = K-1 = 2, df denominator = N-K = 11, so F-crit = 3.98. The table is entered with the alpha level, the df numerator, and the df denominator.

Null and Alternative Hypotheses
Let’s say a marketing researcher is interested in the impact of music on sales at a new clothing store targeted to tweens. She sets up a mock store in her university’s research lab, gives each subject \$50 spending money, and then randomly assigns subjects to one of three conditions. One third of the subjects browse the mock store with no music. One third browse the store with soft music. And the final third browse the store with loud music. The sales figures are shown below. Assume the researcher decides to use an alpha level of .01.

Null Hypothesis (Ho): All of the means are equal (μcontrol = μsoft music = μloud music)

Alternative (H1): At least two means are different

F-critical (based on alpha = .01; df-numerator = 2; df-denominator = 9): 8.02

Decision Rule: If Fobt ≥ Fcritical, then reject Ho. Otherwise, retain Ho.

The Data: Sales as a Function of Music Condition

Subtract Group Mean from Each Score
Then square and add up. This gives you the SS for that group.

Do this for each of the three conditions

(See Statistics Notes Packet)
[Worksheet: for each condition (No Music, Soft Music, Loud Music), each sales score, its deviation from the group mean (x-mean), and the squared deviation (x-mean)2, summed to give the SS for that condition. Grand Mean = 30.333; SS-within = 68; MS-within = 7.556; F = 23.471.]

Summarize in a Source Table.

ANOVA - Source Table

| Source | SS | df | MS | F |
|---|---|---|---|---|
| Between | 354.667 | 2 | 177.33 | 23.47 |
| Within | 68.00 | 9 | 7.56 | |
| Total | 422.67 | 11 | | |

F-critical in our example = 8.02
K = 3 Alpha = .01

Decision Rule and Conclusion?
Decision: Reject the null hypothesis, because F-obtained (23.47) ≥ F-critical (8.02), so F-obtained falls in the alpha region (α = .01).

Conclusion: At least two means are different.
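The source-table arithmetic can be sketched as follows, starting from the two sums of squares reported in the deck. The function name `anova_source_table` and the dictionary keys are illustrative assumptions; N = 12 is inferred from the total df of 11.

```python
def anova_source_table(ss_between, ss_within, k, n_total):
    """Build the one-way ANOVA source table from SS-between and SS-within."""
    df_between = k - 1          # K - 1
    df_within = n_total - k     # N - K
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "ss_total": ss_between + ss_within,
        "f": ms_between / ms_within,
    }

# Music example: K = 3 conditions, N = 12 subjects.
table = anova_source_table(ss_between=354.667, ss_within=68.0, k=3, n_total=12)
f_critical = 8.02  # tabled value for alpha = .01, df = (2, 9)
reject_ho = table["f"] >= f_critical
print(f"F({table['df_between']}, {table['df_within']}) = {table['f']:.2f}, reject Ho: {reject_ho}")
# F(2, 9) = 23.47, reject Ho: True
```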

Correlation

Differences Between Correlation and Regression
Correlation (r): assesses the direction (+ or -) and degree (strong, medium, weak) of the relationship between two variables.

Linear Regression (slope, y-intercept): assesses the nature of the relationship between an outcome variable and one or more predictors, and makes predictions for Y (cfc) based on X (impuss).

Reading Scatterplots
[Figure: three scatterplots showing a negative correlation, a zero correlation, and a positive correlation, each with X (predictor) on the horizontal axis and Y (criterion) on the vertical axis.]

Two Interpretations of Correlation Coefficient
Direction and Degree of Relationship Between Two Variables: r ranges from –1 to +1, with stronger correlations at the extremes. r = -1 is a perfect negative relationship, r = 0 is no relationship, and r = +1 is a perfect positive relationship.

Variance Explained (Model Comparison Approach): r2 ranges from 0 to +1.0 and answers the question: what percent of the variance in Y is explained by X?

Problem Statement - A
Let’s say we survey 5 shoppers about their level of satisfaction with the service they received from a furniture store (X = satisfaction w/service) and their intention to return to the store in the future (Y = future intentions). Presumably, there should be a positive correlation between these variables.

Null Hypothesis (Ho): Satisfaction with service and future shopping intentions are not positively correlated

Alternative (H1): Satisfaction with service and future shopping intentions are positively correlated (this is a directional hypothesis)

r-critical (based on alpha = .05, one-tailed; df = N-2 = 3): r-critical = .8054

Decision Rule (in this example, because r is predicted to be positive): If robt ≥ rcritical, then reject Ho. Otherwise, retain Ho.

r-critical Table If alpha = .05 (1-tailed), N = 5, df = 3, r-critical = .8054
Decision Rule (when r is predicted to be positive): If robt ≥ rcritical, then reject Ho. Otherwise, retain Ho Decision Rule (when r is predicted to be negative): If robt ≤ rcritical, then reject Ho. Otherwise, retain Ho Decision Rule (when H1 is non-directional): If |robt| ≥ |rcritical|, then reject Ho. Otherwise, retain Ho

Data and Scatterplot
[Figure: data table and scatterplot for the five shoppers.]

Raw Score Formula for Pearson’s r Correlation

Computing Pearson’s r (and variance explained)
First compute SSx and SSy, then compute r. Here r = .313, so r2 = (.313*.313) = .098. So, satisfaction explains 9.8% of the variance in future intentions.
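As a rough check on r = .313, the raw-score computation can be run from the column sums for these five shoppers (Sum X = 22, Sum Y = 20, Sum X2 = 106, Sum Y2 = 90, Sum XY = 91, as listed in the regression worksheet for the same data). The function name `pearson_r_from_sums` is hypothetical, and the formula used is the standard raw-score form of Pearson's r.

```python
import math

def pearson_r_from_sums(sum_x, sum_y, sum_x_sq, sum_y_sq, sum_xy, n):
    """Pearson's r from raw-score sums."""
    ss_x = sum_x_sq - sum_x ** 2 / n
    ss_y = sum_y_sq - sum_y ** 2 / n
    # Numerator: sum of cross-products minus (sum X)(sum Y) / N.
    sp_xy = sum_xy - sum_x * sum_y / n
    return sp_xy / math.sqrt(ss_x * ss_y), ss_x, ss_y

r_obt, ss_x, ss_y = pearson_r_from_sums(22, 20, 106, 90, 91, 5)
r_critical = 0.8054  # alpha = .05, one-tailed, df = 3
print(f"SSx = {ss_x:.1f}, SSy = {ss_y:.1f}, r = {r_obt:.3f}, r^2 = {r_obt ** 2:.3f}")
# SSx = 9.2, SSy = 10.0, r = 0.313, r^2 = 0.098
print("reject Ho:", r_obt >= r_critical)  # reject Ho: False
```

Under the decision rule stated earlier, an r of .313 does not reach the critical value of .8054, so with only five shoppers Ho would be retained.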

Regression

Problem Statement - B
Let’s use the data we just worked with for correlation. Five shoppers were asked their satisfaction with the service they received and their intention to shop at the store in the future. Regression would be used to make predictions for future shopping intentions (Y) based on people’s satisfaction with service (X). For example, what would we predict if a shopper rated their satisfaction with service at a 3? We first need to compute the regression equation, then use it to make a prediction.

Slope and Y-Intercept
Y-intercept (bo): the value of Y when X = 0. Slope (b1): the change in Y for a 1-unit change in X.

[Figure: regression line plotted with X (predictor) on the horizontal axis and Y (criterion) on the vertical axis.]

Raw Score Formula for Slope and Y-Intercept

Computing Regression Equation
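[Table: for each of the five customers (Hector, Marge, Fredrick, Susie, Gwen), satisfaction w/service X (predictor), future intentions Y (criterion), X2, Y2, and XY. For example, Hector: X = 5, Y = 6, X2 = 25, Y2 = 36, XY = 30.]

| Quantity | Value |
|---|---|
| Sum X | 22 |
| Sum Y | 20 |
| Sum X2 | 106 |
| Sum Y2 | 90 |
| Sum XY | 91 |
| [(Sum X)*(Sum Y)]/N | 88 |
| Numerator | 3 |
| SS X | 9.2 |
| b1 (the slope) | 0.326 |
| mean y | 4.0 |
| mean x | 4.4 |
| bo (y-intercept) | 2.57 |

First compute the slope: b1 = Numerator / SS X = 3 / 9.2 = 0.326. Then compute the y-intercept: bo = mean y – b1*(mean x) = 4.0 – (0.326*4.4) = 2.57. So, the regression equation is: Y predicted = 2.57 + 0.326*X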
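The slope and y-intercept can be reproduced from the sums in the worksheet. This is a minimal sketch of the raw-score computation; the function name `regression_from_sums` is hypothetical.

```python
def regression_from_sums(sum_x, sum_y, sum_x_sq, sum_xy, n):
    """Return (slope b1, intercept b0) for simple linear regression from sums."""
    ss_x = sum_x_sq - sum_x ** 2 / n
    # Numerator of the slope: sum XY minus (sum X)(sum Y) / N.
    numerator = sum_xy - sum_x * sum_y / n
    slope = numerator / ss_x
    intercept = sum_y / n - slope * (sum_x / n)  # mean y minus b1 * mean x
    return slope, intercept

b1, b0 = regression_from_sums(sum_x=22, sum_y=20, sum_x_sq=106, sum_xy=91, n=5)
print(f"b1 = {b1:.3f}, b0 = {b0:.3f}")  # b1 = 0.326, b0 = 2.565
```

The intercept prints as 2.565 here; the worksheet rounds it to 2.57.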

Data and Scatterplot
[Figure: data table and scatterplot for the five shoppers.]

Using the Regression Equation to Make a Prediction
Let’s say a customer rates their satisfaction as a ‘3’ on our 7-point scale. What is their predicted intention to shop at the store in the future? Plugging X = 3 into the regression equation (with the unrounded slope and y-intercept) gives a predicted score of about 3.54. So, a person who gives a ‘3’ on the satisfaction scale has a predicted future intention score of about 3.54.

Y-Predicted & Residuals
If Satisfaction (X) = 3, then Predicted Intention (Y predicted) = 3.54. Residual = (Y – Y predicted). When r is strong, residuals are small.

[Figure: scatterplot with the regression line, marking X = 3 and Y predicted = 3.54.]
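Prediction and residuals follow directly from the regression equation. The sketch below recomputes the slope and intercept from the worksheet values (Numerator = 3, SS X = 9.2, mean y = 4.0, mean x = 4.4) and then evaluates one prediction and one residual; the helper name `predict` is hypothetical, and Hector's scores (X = 5, Y = 6) are taken from the data table.

```python
# Slope and y-intercept from the worksheet values.
slope = 3 / 9.2                    # Numerator / SS X
intercept = 4.0 - slope * 4.4      # mean y - b1 * mean x

def predict(x):
    """Predicted future intention (Y) for a given satisfaction score (X)."""
    return intercept + slope * x

y_pred_at_3 = predict(3)
print(f"Y predicted at X = 3: {y_pred_at_3:.2f}")  # Y predicted at X = 3: 3.54

# Residual for Hector: observed Y minus predicted Y.
hector_x, hector_y = 5, 6
hector_residual = hector_y - predict(hector_x)
print(f"Hector's residual: {hector_residual:.2f}")  # Hector's residual: 1.80
```

A residual of about 1.8 points on a 7-point scale is fairly large, which fits the modest correlation (r = .313) in these data.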

Chi-Square

One-Way vs. Two-Way Chi-Square
Chi-square is appropriate when our data are frequency (count) data.

1. In “one-way chi-square”, we have one categorical variable (type of shoe) with several levels (Adidas, Asics, Nikes, Pumas), and we want to know whether the frequency of observations differs between the groups (or conditions, or levels).

2. In “two-way chi-square”, we have two categorical variables (Gender x Support for New Stadium), and we want to know if these two variables are related.

One-way Chi-Square Where…

Problem Statement
Let’s say we ask 100 people to pick their favorite brand of shoes among four types. If preferences were equal, 25 people would pick each brand. Clearly, the observed frequencies are not equal: 15 pick Adidas, 30 pick Asics, 45 pick Nike, and 10 pick Puma. The question is whether these frequencies are significantly different.

Null Hypothesis (Ho): The frequencies of people choosing the different brands are equal

Alternative (H1): Not all the frequencies are equal (this doesn’t mean they’re all different)

χ²-critical (based on alpha = .05; df = K-1 = 4-1 = 3): χ²-critical = 7.815 (χ²-critical is always positive). In df, K stands for the number of groups. (See the critical table on the next slide.)

Decision Rule is always as follows (because chi-square is always positive): If χ²obt ≥ χ²critical, then reject Ho. Otherwise, retain Ho.

Chi-Square Critical Table (for one-way chi-square)
df = K-1, where K = # groups

Formula and Frequency Expected
Frequency Observed (fo) = the actual frequencies. Frequency Expected (fe) = typically Total N/K.

To compute chi-square, we need to know fe, the expected frequency. Typically, we’ll just assume this represents an equal distribution across the conditions (Total N/K). So, we have a total of 100 people and 4 conditions (brands of shoe). Based on chance alone, an equal distribution across the conditions would mean 25 people would select each type of shoe. So, here we’ll assume fe = 25.

Computation
Decision: Reject Ho, because χ²obt (30) ≥ χ²critical (7.815). Conclusion: People do not show an equal preference among the four brands of shoes.
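The obtained value of 30 can be reproduced with a short sketch. It assumes the standard chi-square statistic (the sum over categories of (fo - fe) squared divided by fe) and equal expected frequencies, as described above; the function name `one_way_chi_square` is hypothetical.

```python
def one_way_chi_square(observed):
    """One-way chi-square with equal expected frequencies (Total N / K)."""
    k = len(observed)
    expected = sum(observed) / k
    chi_sq = sum((fo - expected) ** 2 / expected for fo in observed)
    return chi_sq, k - 1

# Favorite shoe brand: Adidas, Asics, Nike, Puma.
chi_sq_one, df_one = one_way_chi_square([15, 30, 45, 10])
chi_critical_one = 7.815  # alpha = .05, df = 3
print(f"chi-square({df_one}) = {chi_sq_one:.1f}, reject Ho: {chi_sq_one >= chi_critical_one}")
# chi-square(3) = 30.0, reject Ho: True
```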

Two-Way Chi-Square Where…

Problem Statement
Let’s say we’re interested in whether males and females differ in their support for building a new football stadium. We survey 40 people (10 men and 30 women) and we ask them a simple (categorical) yes/no question: Do you support building a new football stadium? Now we want to know if there is a relationship between gender (male/female) and support for the stadium (yes/no).

Null Hypothesis (Ho): There is no relationship between gender and support for the football stadium

Alternative (H1): There is a relationship between gender and support for the football stadium

χ²-critical (based on alpha = .05; df = (Rows-1)*(Columns-1) = (2-1)*(2-1) = 1): χ²-critical = 3.841 (χ²-critical is always positive). (See the critical table on the next slide.)

Decision Rule is always as follows (because chi-square is always positive): If χ²obt ≥ χ²critical, then reject Ho. Otherwise, retain Ho.

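Data
Of the 10 men surveyed, 8 supported it and 2 didn’t. Of the 30 women surveyed, 5 supported it and 25 didn’t.

| | Support: Yes | Support: No | Row total |
|---|---|---|---|
| Men | 8 | 2 | 10 |
| Women | 5 | 25 | 30 |
| Column total | 13 | 27 | 40 |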

Chi-Square Critical Table (works for two-way chi-square)
df = (Row-1)*(Columns-1)

Formula and Frequency Expected
Frequency Observed (fo) = the actual frequencies. Frequency Expected (fe) = (Row N*Column N)/Total N.

To compute chi-square, we need to know fe, the expected frequency for each cell. In the two-way case, this is the frequency we would expect in a cell if the two variables were unrelated: (Row N*Column N)/Total N. The next slide illustrates the computation of frequency expected.

Computing Frequency Expected and Chi-Square
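[Computation: frequency expected and chi-square contribution for each of the four cells (Cell 1 through Cell 4).]

Decision: Reject Ho, because χ²obt (13.7) ≥ χ²critical (3.841). Conclusion: Gender is related to support for the football stadium (a larger share of men than women supported it).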
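The cell-by-cell computation can be sketched as below. It assumes the standard two-way chi-square statistic with expected frequencies of (Row N * Column N) / Total N; the function name `two_way_chi_square` is hypothetical, and the table layout (rows = men, women; columns = yes, no) is chosen here for illustration.

```python
def two_way_chi_square(table):
    """Two-way chi-square for a table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total_n = sum(row_totals)
    chi_sq = 0.0
    expected = []
    for i, row in enumerate(table):
        expected_row = []
        for j, fo in enumerate(row):
            # Expected frequency if the two variables were unrelated.
            fe = row_totals[i] * col_totals[j] / total_n
            expected_row.append(fe)
            chi_sq += (fo - fe) ** 2 / fe
        expected.append(expected_row)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi_sq, df, expected

# Rows: men, women. Columns: support yes, support no.
chi_sq_two, df_two, fe_table = two_way_chi_square([[8, 2], [5, 25]])
print(fe_table)  # [[3.25, 6.75], [9.75, 20.25]]
print(f"chi-square({df_two}) = {chi_sq_two:.1f}")  # chi-square(1) = 13.7
```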