
1 Chapter 15 Differences Between Groups and Relationships Among Variables

2 LEARNING OUTCOMES After studying this chapter, you should be able to
Understand what multivariate statistical analysis involves and know the two types of multivariate analysis.
Interpret results from multiple regression analysis.
Interpret results from multivariate analysis of variance (MANOVA).
Interpret basic exploratory factor analysis results.

3 What Is the Appropriate Test of Difference?
Test of Differences: An investigation of a hypothesis that two (or more) groups differ with respect to measures on a variable. The variable may capture behavior, characteristics, beliefs, opinions, emotions, or attitudes.
Bivariate Tests of Differences: Involve only two variables: a variable that acts like a dependent variable and a variable that acts as a classification variable. They test for differences in mean scores between groups, or compare how two groups' scores are distributed across possible response categories.

4 EXHIBIT 15.1 Choosing the Right Statistic

5 EXHIBIT 15.1 Choosing the Right Statistic (cont’d)

6 Common Bivariate Tests
Type of Measurement | Differences between two independent groups | Differences among three or more independent groups
Interval and ratio | Independent groups: t-test or Z-test | One-way ANOVA
Ordinal | Mann-Whitney U-test; Wilcoxon test | Kruskal-Wallis test
Nominal | Z-test (two proportions); Chi-square test | Chi-square test

7 Cross-Tabulation Tables: The χ² Test for Goodness-of-Fit
Cross-Tabulation (Contingency) Table: A joint frequency distribution of observations on two or more variables.
χ² Distribution: Provides a means for testing the statistical significance of a contingency table. Involves comparing observed frequencies (Oi) with expected frequencies (Ei) in each cell of the table. Captures the goodness- (or closeness-) of-fit of the observed distribution with the expected distribution.

8 Example: Papa John’s Restaurants
Univariate Hypothesis: Papa John’s restaurants are more likely to be located in a stand-alone location or in a shopping center. Bivariate Hypothesis: Stand-alone locations are more likely to be profitable than are shopping center locations.

9 Chi-Square Test
$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}, \qquad E_{ij} = \frac{R_i C_j}{n}$$
χ² = chi-square statistic
Oi = observed frequency in the ith cell
Ei = expected frequency in the ith cell
Ri = total observed frequency in the ith row
Cj = total observed frequency in the jth column
n = sample size

10 Degrees of Freedom (d.f.)
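d.f. = (R − 1)(C − 1), where R is the number of rows and C is the number of columns. For a 2 × 2 table: (R − 1)(C − 1) = (2 − 1)(2 − 1) = 1.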
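As a minimal sketch of the chi-square computation for a contingency table, the Python below derives each expected frequency from the row and column totals and then sums the squared deviations. The function name `chi_square` and the 2 × 2 counts are hypothetical illustrations, not the Papa John's figures from the chapter.

```python
def chi_square(observed):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]        # R_i
    col_totals = [sum(col) for col in zip(*observed)]  # C_j
    n = sum(row_totals)                                # sample size

    stat = 0.0
    for i, row in enumerate(observed):
        for j, o_ij in enumerate(row):
            e_ij = row_totals[i] * col_totals[j] / n   # expected frequency
            stat += (o_ij - e_ij) ** 2 / e_ij

    df = (len(observed) - 1) * (len(observed[0]) - 1)  # (R - 1)(C - 1)
    return stat, df


# Hypothetical counts: rows = location type, columns = profitable yes/no.
table = [[60, 40],
         [40, 60]]
stat, df = chi_square(table)
print(f"chi-square = {stat:.3f}, d.f. = {df}")
```

The statistic is then compared with the critical chi-square value for the computed degrees of freedom (3.841 at the .05 level with 1 d.f.).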

11 The t-Test for Comparing Two Means
Independent Samples t-Test: A test of the hypothesis that the mean scores on some interval- or ratio-scaled variable differ between two groups formed by some less-than-interval (categorical) classificatory variable.

12 The t-Test for Comparing Two Means (cont’d)
Determining when an independent samples t-test is appropriate: Is the dependent variable interval or ratio? Can the dependent variable scores be grouped based upon some categorical variable? Does the grouping result in scores drawn from independent samples? Are two groups involved in the research question?

13 The t-Test for Comparing Two Means (cont’d)
Pooled Estimate of the Standard Error An estimate of the standard error for a t-test of independent means that assumes the variances of both groups are equal.
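A rough sketch of how the pooled standard error enters the t statistic is given below. It assumes equal group variances, as the definition above states; the function name `pooled_t` and the two small samples are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Independent samples t statistic using a pooled variance estimate."""
    n1, n2 = len(group_a), len(group_b)
    # Weighted average of the two sample variances (equal-variance assumption).
    pooled_var = ((n1 - 1) * variance(group_a) +
                  (n2 - 1) * variance(group_b)) / (n1 + n2 - 2)
    # Pooled estimate of the standard error of the difference in means.
    std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean(group_a) - mean(group_b)) / std_error
    return t, n1 + n2 - 2  # t statistic and degrees of freedom


# Hypothetical scores for two independent groups.
t, df = pooled_t([2, 4, 6], [1, 2, 3])
print(f"t = {t:.3f} with {df} d.f.")
```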

14 The t-Test for Comparing Two Means (cont’d)

15 EXHIBIT 15.2 Independent Samples t-Test Results

16 What Is ANOVA? Analysis of Variance (ANOVA)
An analysis involving the investigation of the effects of one treatment variable on an interval-scaled dependent variable.
A hypothesis-testing technique to determine whether statistically significant differences in means occur between two or more groups.
A method of comparing variances to make inferences about the means.
ANOVA tests whether “grouping” observations explains variance in the dependent variable.

17 Simple Illustration of ANOVA
How much coffee respondents report drinking each day based on which shift they work (GY stands for Graveyard shift).
Shift | Cups reported by each respondent
Day | 1, 3, 4, 0, 2
GY | 7, 2, 1, 6
Night | 6, 8, 3, 7

18 EXHIBIT 15.3 Illustration of ANOVA Logic

19 Partitioning Variance in ANOVA
Total Variability
Grand mean: The mean of a variable over all observations.
SST: The total observed variation across all groups and individual observations.
SST = Total of (Observed Value − Grand Mean)²

20 Partitioning Variance in ANOVA
Between-groups Variance: The sum of differences between the group mean and the grand mean summed over all groups for a given set of observations.
SSB: Systematic variation of scores between groups due to manipulation of an experimental variable or group classifications of a measured independent variable; also called between-group variance.
SSB = Total of n_group × (Group Mean − Grand Mean)²

21 Partitioning Variance in ANOVA
Within-group Error or Variance: The sum of the differences between observed values and the group mean for a given set of observations; also known as total error variance.
SSE: Variation of scores due to random error, or within-group variance due to individual differences from the group mean. This is the error of prediction.
SSE = Total of (Observed Value − Group Mean)²

22 The F-Test
F-Test: Used to determine whether there is more variability in the scores of one sample than in the scores of another sample.
Variance components are used to compute F-ratios: SSE, SSB, SST.
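The partitioning of variance and the resulting F-ratio can be sketched in a few lines of Python. The scores are the coffee-by-shift values listed on slide 17; the function name `one_way_anova` and the use of (groups − 1) and (n − groups) as degrees of freedom follow the standard one-way ANOVA setup and are not taken from the slides.

```python
from statistics import mean


def one_way_anova(groups):
    """Return SST, SSB, SSE and the F-ratio for a dict of group -> scores."""
    all_scores = [x for scores in groups.values() for x in scores]
    grand_mean = mean(all_scores)

    # Total variability: every observation against the grand mean.
    sst = sum((x - grand_mean) ** 2 for x in all_scores)
    # Between-groups: each group mean against the grand mean, weighted by group size.
    ssb = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in groups.values())
    # Within-group error: every observation against its own group mean.
    sse = sum((x - mean(s)) ** 2 for s in groups.values() for x in s)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ssb / df_between) / (sse / df_within)
    return sst, ssb, sse, f_ratio


coffee = {
    "Day": [1, 3, 4, 0, 2],
    "GY": [7, 2, 1, 6],
    "Night": [6, 8, 3, 7],
}
sst, ssb, sse, f_ratio = one_way_anova(coffee)
print(f"SST={sst:.2f}  SSB={ssb:.2f}  SSE={sse:.2f}  F={f_ratio:.2f}")
```

Note that SSB and SSE add up to SST, which is what "partitioning" the variance means: the larger the share attributed to the grouping, the larger the F-ratio.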

23 EXHIBIT 15.4 Interpreting ANOVA

24 Correlation Coefficient Analysis
A statistical measure of the covariation, or association, between two at-least interval variables. Covariance Extent to which two variables are associated systematically with each other.

25 Simple Correlation Coefficient
Correlation coefficient (r) Ranges from +1 to -1 Perfect positive linear relationship = +1 Perfect negative (inverse) linear relationship = -1 No correlation = 0 Correlation coefficient for two variables (X,Y)

26 Correlation, Covariance, and Causation
When two variables covary, they display concomitant variation. This systematic covariation does not in and of itself establish causality. Rooster’s crow and the rising of the sun Rooster does not cause the sun to rise.

27 Coefficient of Determination
Coefficient of Determination (R²): A measure obtained by squaring the correlation coefficient; the proportion of the total variance of a variable accounted for by knowledge of the value of another variable. It measures the part of the total variance of Y that is accounted for by knowing the value of X.
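To make the link between the correlation coefficient and the coefficient of determination concrete, here is a small sketch that computes r from deviations about the means and then squares it. The function name `correlation` and the five (X, Y) pairs are hypothetical.

```python
from math import sqrt
from statistics import mean


def correlation(xs, ys):
    """Pearson correlation coefficient r for paired observations."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Covariation of X and Y about their means.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = correlation(x, y)
r_squared = r ** 2  # share of the variance in Y accounted for by X
print(f"r = {r:.3f}, r squared = {r_squared:.3f}")
```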

28 Regression Analysis Simple (Bivariate) Linear Regression
A measure of linear association that investigates straight-line relationships between a continuous dependent variable and an independent variable that is usually continuous, but can be a categorical dummy variable.
The Regression Equation (Y = α + βX)
Y = the continuous dependent variable
X = the independent variable
α = the Y intercept (where the regression line intercepts the Y axis)
β = the slope coefficient (rise over run)
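A minimal sketch of estimating α and β by ordinary least squares follows. Least squares is the usual estimation method for this equation, though the slide itself does not name one; the function name `simple_regression` and the data are hypothetical.

```python
from statistics import mean


def simple_regression(xs, ys):
    """Least-squares estimates (intercept, slope) for Y = a + bX."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx                # rise over run
    intercept = y_bar - slope * x_bar  # value of Y when X = 0
    return intercept, slope


# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = simple_regression(x, y)
print(f"Y = {a:.2f} + {b:.2f}X")
```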

29 The Regression Equation
Parameter Estimate Choices β is indicative of the strength and direction of the relationship between the independent and dependent variable. α (Y intercept) is a fixed point that is considered a constant (how much Y can exist without X) Standardized Regression Coefficient (β) Estimated coefficient of the strength of relationship between the independent and dependent variables. Expressed on a standardized scale where higher absolute values indicate stronger relationships (range is from -1 to 1).

30 EXHIBIT 15.5 The Advantage of Standardized Regression Weights

31 The Regression Equation (cont’d)
Parameter Estimate Choices (cont’d) Raw regression estimates (b1) Raw regression weights have the advantage of retaining the scale metric—which is also their key disadvantage. If the purpose of the regression analysis is forecasting, then raw parameter estimates must be used. This is another way of saying when the researcher is interested only in prediction. Standardized regression estimates (β1) Standardized regression estimates have the advantage of a constant scale. Standardized regression estimates should be used when the researcher is testing explanatory hypotheses.
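The contrast between raw and standardized estimates can be illustrated with a short sketch: rescaling X changes the raw weight but leaves the standardized weight untouched. The conversion used here (standardized weight = raw weight × s_X / s_Y) is the standard one for a single predictor; the function names and the data are hypothetical.

```python
from statistics import mean, stdev


def raw_slope(xs, ys):
    """Raw (unstandardized) least-squares slope b1."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    return s_xy / s_xx


def standardized_slope(xs, ys):
    """Standardized slope: the raw slope rescaled by the two standard deviations."""
    return raw_slope(xs, ys) * stdev(xs) / stdev(ys)


# Hypothetical data: the same predictor recorded in dollars and in cents.
x_dollars = [1, 2, 3, 4, 5]
x_cents = [100 * x for x in x_dollars]
y = [2, 4, 5, 4, 5]

b_dollars, b_cents = raw_slope(x_dollars, y), raw_slope(x_cents, y)
beta_dollars = standardized_slope(x_dollars, y)
beta_cents = standardized_slope(x_cents, y)
print(f"raw: {b_dollars:.4f} vs {b_cents:.4f}")
print(f"standardized: {beta_dollars:.4f} vs {beta_cents:.4f}")
```

The raw weight shrinks by a factor of 100 when the unit changes, so it is only meaningful alongside its metric, while the standardized weight is the same in both runs.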

32 Multiple Regression Analysis
An analysis of association in which the effects of two or more independent variables on a single, interval-scaled dependent variable are investigated simultaneously. Dummy variable The way a dichotomous (two group) independent variable is represented in regression analysis by assigning a 0 to one group and a 1 to the other.
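As an illustration of dummy coding, the sketch below regresses a dependent variable on a single 0/1 variable. With only the dummy in the model, the intercept equals the mean of the group coded 0 and the slope equals the difference between the two group means. The sales figures and the function name `fit_line` are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares (intercept, slope) for Y = a + bX."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


# Hypothetical store sales; dummy = 1 if the community has a sales office, else 0.
sales = [10, 12, 14, 15, 17, 19]
has_office = [0, 0, 0, 1, 1, 1]

intercept, slope = fit_line(has_office, sales)
mean_without = mean(s for s, d in zip(sales, has_office) if d == 0)
mean_with = mean(s for s, d in zip(sales, has_office) if d == 1)
print(f"intercept = {intercept:.2f} (mean without office = {mean_without})")
print(f"slope = {slope:.2f} (difference in means = {mean_with - mean_without})")
```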

33 Multiple Regression Analysis (cont’d)
A Simple Example Assume that a toy manufacturer wishes to explain store sales (dependent variable) using a sample of stores from Canada and Europe. Several hypotheses are offered: H1: Competitor’s sales are related negatively to sales. H2: Sales are higher in communities with a sales office than when no sales office is present. H3: Grammar school enrollment in a community is related positively to sales.

34 Multiple Regression Analysis (cont’d)
Statistical Results of the Multiple Regression
Regression Equation:
Coefficient of multiple determination (R²): 0.845
F-value = 14.6; p < .05

35 Multiple Regression Analysis (cont’d)
Regression Coefficients in Multiple Regression
Partial correlation: The correlation between two variables after taking into account the fact that they are correlated with other variables too.
R² in Multiple Regression: The coefficient of multiple determination in multiple regression indicates the percentage of variation in Y explained by all independent variables.
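For the simplest case of one control variable, the partial correlation can be computed directly from the three pairwise correlations. The sketch below uses the standard first-order partial correlation formula, which the slide does not show; the function name `partial_correlation` and the input values are hypothetical.

```python
from math import sqrt


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between X and Y after removing what each shares with Z."""
    numerator = r_xy - r_xz * r_yz
    denominator = sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator


# Hypothetical pairwise correlations among sales (Y), competitor sales (X)
# and school enrollment (Z).
r_partial = partial_correlation(r_xy=0.50, r_xz=0.40, r_yz=0.30)
print(f"partial r = {r_partial:.3f}")
```

When X and Y are both uncorrelated with Z, the partial correlation reduces to the simple correlation between X and Y.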

36 Multiple Regression Analysis (cont’d)
Coefficients of Partial Regression
bn (independent variables correlated with one another): The percentage of variance in the dependent variable that is explained by a single independent variable, holding other independent variables constant.
R²: The percentage of variance in the dependent variable that is explained by the variation in the independent variables.

37 Multiple Regression Analysis (cont’d)
Statistical Significance in Multiple Regression F-test Tests statistical significance by comparing the variation explained by the regression equation to the residual error variation. Allows for testing of the relative magnitudes of the sum of squares due to the regression (SSR) and the error sum of squares (SSE).

38 Multiple Regression Analysis (cont’d)
Degrees of Freedom (d.f.) k = number of independent variables n = number of observations or respondents Calculating Degrees of Freedom (d.f.) d.f. for the numerator = k d.f. for the denominator = n - k - 1
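A small sketch of how these degrees of freedom feed the model F-test is shown below. Because R² = SSR/SST, the ratio (SSR/k) / (SSE/(n − k − 1)) can be rewritten in terms of R² alone. The R² of 0.845 comes from slide 34 and k = 3 matches the three hypotheses on slide 33, but the sample size n = 12 is an assumption made only for illustration, since the slides do not report it.

```python
def f_from_r_squared(r_squared, k, n):
    """Model F statistic and its degrees of freedom, computed from R squared."""
    df_numerator = k            # number of independent variables
    df_denominator = n - k - 1  # observations minus estimated parameters
    f_value = (r_squared / df_numerator) / ((1 - r_squared) / df_denominator)
    return f_value, df_numerator, df_denominator


# n = 12 is a hypothetical sample size, not a figure from the slides.
f_value, df_num, df_den = f_from_r_squared(r_squared=0.845, k=3, n=12)
print(f"F({df_num}, {df_den}) = {f_value:.2f}")
```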

39 F-test

40 EXHIBIT 15.4 Interpreting Multiple Regression Results

41 Steps in Interpreting a Multiple Regression Model
Examine the model F-test.
Examine the individual statistical tests for each parameter estimate.
Examine the model R².
Examine collinearity diagnostics.

42 Other Multivariate Techniques
Multivariate Data Analysis A group of statistical techniques allowing for the simultaneous analysis of three or more variables. Multivariate Techniques Exploratory factor analysis Confirmatory factor analysis Multivariate analysis of variance (MANOVA) Multiple discriminant analysis Cluster analysis.

43 Key Terms and Concepts
Cross-tabulation (contingency table); Univariate analysis; Bivariate analysis; Analysis of variance (ANOVA); Grand mean; Between-groups variance; Within-group error or variance; F-test; Within-group variation; Between-group variance; Total variability (SST); Correlation coefficient; Coefficient of determination (r²); Simple linear regression; Standardized regression coefficient (β); Multiple regression analysis; Multivariate data analysis

