Published by Dortha Hawkins. Modified over 4 years ago.

1
Statistical Analysis: Regression & Correlation
Psyc 250, Winter 2013

2
Review: Types of Variables & Steps in Analysis

3
Variables & Statistical Tests
Variable Type         Example                         Common Stat Method
Nominal by nominal    Blood type by gender            Chi-square
Scale by nominal      GPA by gender, GPA by major     T-test, Analysis of Variance
Scale by scale        Weight by height, GPA by SAT    Regression, Correlation

4
Evaluating an hypothesis
Step 1: What is the relationship in the sample?
Step 2: How confidently can one generalize from the sample to the universe from which it comes? (p < .05)

5
Evaluating an hypothesis
                              Relationship in Sample                  Statistical Significance
2 nom. vars.                  Cross-tab / contingency table           “p value” from Chi Square
Scale dep. & 2-cat indep.     Means for each category                 “p value” from t-test
Scale dep. & 3+ cat indep.    Means for each category                 “p value” from ANOVA F ratio
2 scale vars.                 Regression line; correlation r & r²     “p value” from reg or correlation
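The table above works like a lookup from variable types to a test. A minimal Python sketch of that lookup follows; the function name `choose_test` and its arguments are hypothetical, introduced only for illustration.

```python
def choose_test(dependent, independent, n_categories=None):
    """Suggest a common statistical method from the variable types.

    `dependent` and `independent` are either "nominal" or "scale".
    `n_categories` is the number of categories of a nominal independent
    variable; it is only needed when the dependent variable is scale.
    Hypothetical helper that mirrors the slide's table.
    """
    if dependent == "nominal" and independent == "nominal":
        return "Chi-square"
    if dependent == "scale" and independent == "nominal":
        if n_categories is None or n_categories < 2:
            raise ValueError("need n_categories >= 2 for a nominal predictor")
        return "t-test" if n_categories == 2 else "ANOVA"
    if dependent == "scale" and independent == "scale":
        return "Regression / correlation"
    raise ValueError("combination not covered by the table")


print(choose_test("nominal", "nominal"))    # e.g. blood type by gender
print(choose_test("scale", "nominal", 2))   # e.g. GPA by gender
print(choose_test("scale", "nominal", 5))   # e.g. GPA by major (5 majors assumed)
print(choose_test("scale", "scale"))        # e.g. weight by height
```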

7
Relationships between Scale Variables
Regression
Correlation

8
Regression
Amount that a dependent variable increases (or decreases) for each unit increase in an independent variable.
Expressed as the equation for a line, y = mx + b: the “regression line”
Interpret by the slope of the line: m
(Or: interpret by “odds ratio” in “logistic regression”)

9
Correlation
Strength of association of scale measures
r ranges from -1 through 0 to +1
+1: perfect positive correlation
-1: perfect negative correlation
0: no correlation
Interpret r in terms of variance
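As a rough illustration of the three reference points for r, here is a small Python sketch of the Pearson correlation coefficient. The function name `pearson_r` and the data are hypothetical; they are not taken from the class survey.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data chosen to hit the three reference points.
x = [1, 2, 3, 4, 5]
print(pearson_r(x, [2, 4, 6, 8, 10]))   # perfect positive relationship
print(pearson_r(x, [10, 8, 6, 4, 2]))   # perfect negative relationship
print(pearson_r(x, [3, 1, 4, 1, 3]))    # no linear relationship
```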

10
Mean & Variance

11
Example: Weight & Height
Survey of Class, n = 42
Height, Mother’s height, Mother’s education, SAT, Estimate IQ, Well-being (7 pt. Likert)
Weight, Father’s education, Family income, G.P.A., Health (7 pt. Likert)

12
Frequency Table for: HEIGHT

                                  Valid      Cum
  Value   Frequency   Percent   Percent   Percent
  59.00        1         2.4       2.4       2.4
  61.00        2         4.8       4.8       7.1
  62.00        3         7.1       7.1      14.3
  63.00        3         7.1       7.1      21.4
  65.00        5        11.9      11.9      33.3
  66.00        3         7.1       7.1      40.5
  67.00        4         9.5       9.5      50.0
  68.00        5        11.9      11.9      61.9
  69.00        1         2.4       2.4      64.3
  70.00        6        14.3      14.3      78.6
  71.00        1         2.4       2.4      81.0
  72.00        4         9.5       9.5      90.5
  73.00        3         7.1       7.1      97.6
  74.00        1         2.4       2.4     100.0
            -------   -------   -------
  Total       42       100.0     100.0

Valid cases 42    Missing cases 0

13
Descriptive Statistics for: HEIGHT

                                                                    Valid
Variable     Mean   Std Dev   Variance   Range   Minimum   Maximum      N
HEIGHT      67.33      3.87      14.96   15.00     59.00     74.00     42

14
Variance
$$s^2 = \frac{\sum_i (x_i - \text{Mean})^2}{N - 1}$$
Standard Deviation
$$s = \sqrt{s^2} = \sqrt{\text{variance}}$$
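The formula can be checked against the HEIGHT output. The sketch below recomputes the mean, variance, and standard deviation from the HEIGHT frequency table; the variable names are illustrative, and the data are the frequencies listed in the table.

```python
import math

# HEIGHT frequency table from the class survey: value -> frequency.
height_freq = {59: 1, 61: 2, 62: 3, 63: 3, 65: 5, 66: 3, 67: 4,
               68: 5, 69: 1, 70: 6, 71: 1, 72: 4, 73: 3, 74: 1}

n = sum(height_freq.values())
mean = sum(value * count for value, count in height_freq.items()) / n

# Sum of squared distances from the mean, weighted by frequency.
sum_sq = sum(count * (value - mean) ** 2 for value, count in height_freq.items())

variance = sum_sq / (n - 1)     # s^2, dividing by N - 1
std_dev = math.sqrt(variance)   # s

print(n, round(mean, 2), round(variance, 2), round(std_dev, 2))
# → 42 67.33 14.96 3.87
```

These agree with the values reported under Descriptive Statistics for HEIGHT.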

15
Frequency Table for: WEIGHT

                                  Valid      Cum
  Value   Frequency   Percent   Percent   Percent
 115.00        1         2.4       2.4       2.4
 120.00        1         2.4       2.4       4.8
 124.00        1         2.4       2.4       7.1
 125.00        4         9.5       9.5      16.7
 128.00        1         2.4       2.4      19.0
 130.00        6        14.3      14.3      33.3
 135.00        4         9.5       9.5      42.9
 136.00        1         2.4       2.4      45.2
 140.00        3         7.1       7.1      52.4
 145.00        2         4.8       4.8      57.1
 150.00        3         7.1       7.1      64.3
 155.00        2         4.8       4.8      69.0
 160.00        6        14.3      14.3      83.3
 165.00        2         4.8       4.8      88.1
 170.00        1         2.4       2.4      90.5
 185.00        1         2.4       2.4      92.9
 190.00        2         4.8       4.8      97.6
 210.00        1         2.4       2.4     100.0
            -------   -------   -------
  Total       42       100.0     100.0

Valid cases 42    Missing cases 0

Descriptive Statistics for: WEIGHT

                                                                    Valid
Variable     Mean   Std Dev   Variance   Range   Minimum   Maximum      N
WEIGHT     146.38     21.30     453.80   95.00    115.00    210.00     42

16
Relationship of weight & height: Regression Analysis

18
“Least Squares” Regression Line
Dependent = (B)(Independent) + constant
weight = (B)(height) + constant
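A minimal sketch of how B and the constant can be obtained by least squares, assuming the usual closed-form solution for a single predictor. The function name `least_squares_line` and the four data points are hypothetical, not the survey data.

```python
def least_squares_line(xs, ys):
    """Return (B, constant) for the line that minimizes the sum of
    squared vertical distances between the points and the line."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("independent variable has no variance")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                    # slope
    constant = mean_y - b * mean_x   # line passes through both means
    return b, constant


# Hypothetical heights (inches) and weights (pounds).
heights = [60, 64, 68, 72]
weights = [110, 130, 150, 170]
b, constant = least_squares_line(heights, weights)
print(b, constant)   # → 5.0 -190.0
```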

19
Regression line

20
Regression: WEIGHT on HEIGHT

Multiple R             .59254
R Square               .35110
Adjusted R Square      .33488
Standard Error       17.37332

Analysis of Variance
               DF    Sum of Squares    Mean Square
Regression      1        6532.61322     6532.61322
Residual       40       12073.29154      301.83229

F = 21.64319    Signif F = .0000

------------------ Variables in the Equation ------------------
Variable               B         SE B       Beta        T    Sig T
HEIGHT          3.263587      .701511    .592541    4.652    .0000
(Constant)    -73.367236    47.311093              -1.551

[ Equation: Weight = 3.3 (height) - 73 ]

21
Regression line W = 3.3 H - 73
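To show how the fitted line is used, the sketch below plugs heights into the equation with the unrounded coefficients from the regression output. The function name `predict_weight` is illustrative.

```python
B = 3.263587            # slope for HEIGHT from the regression output
CONSTANT = -73.367236   # constant from the regression output


def predict_weight(height):
    """Predicted weight for a given height, using the fitted line."""
    return B * height + CONSTANT


for height in (60, 67.33, 74):
    print(height, round(predict_weight(height), 1))

# Each additional unit of height adds B to the predicted weight.
print(round(predict_weight(71) - predict_weight(70), 6))
```

At the mean height of 67.33 the prediction is about 146.4, which is close to the mean weight of 146.38, as expected for a least squares line.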

22
Strength of Relationship
“Goodness of Fit”: Correlation
How well does the regression line “fit” the data?

23
Correlation
Strength of association of scale measures
r ranges from -1 through 0 to +1
+1: perfect positive correlation
-1: perfect negative correlation
0: no correlation
Interpret r in terms of variance

25
Descriptive Statistics for: WEIGHT

                                                                    Valid
Variable     Mean   Std Dev   Variance   Range   Minimum   Maximum      N
WEIGHT     146.38     21.30     453.80   95.00    115.00    210.00     42

26
Variance = 454

27
[Figure: regression line and mean]

28
Correlation: “Goodness of Fit”
Variance (average of the squared distances from the mean) = 454
“Least squares” (average of the squared distances from the regression line) = 295

29
[Figure: regression line (l.s. = 295) and mean (s² = 454)]

30
Correlation: “Goodness of Fit”
How much is variance reduced by calculating from the regression line?
454 - 295 = 159
159 / 454 = .35
Variance is reduced 35% by calculating “least squares” from the regression line
r² = .35
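The same arithmetic can be redone from the unrounded sums of squares in the regression output. This sketch assumes both averages divide by N - 1, as in the variance formula; that is an assumption about how the slide's 295 was obtained.

```python
import math

# Sums of squares from the regression output (WEIGHT on HEIGHT, n = 42).
ss_regression = 6532.61322
ss_residual = 12073.29154
n = 42

ss_total = ss_regression + ss_residual
variance_about_mean = ss_total / (n - 1)      # about 453.8 (the slide's 454)
variance_about_line = ss_residual / (n - 1)   # about 294.5 (near the slide's 295)

# Proportional reduction in variance when measuring from the line.
r_squared = (variance_about_mean - variance_about_line) / variance_about_mean
r = math.sqrt(r_squared)   # positive root, because the slope is positive

print(round(variance_about_mean, 1), round(variance_about_line, 1))
print(round(r_squared, 4), round(r, 4))
# → 453.8 294.5
# → 0.3511 0.5925
```

The result matches R Square (.35110) and Multiple R (.59254) in the regression output.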

31
r² = % of variance in WEIGHT “explained” by HEIGHT
Correlation coefficient = r

32
Correlation: HEIGHT with WEIGHT

             HEIGHT      WEIGHT
HEIGHT       1.0000       .5925
             (  42)      (  42)
             P= .        P= .000
WEIGHT        .5925      1.0000
             (  42)      (  42)
             P= .000     P= .

33
r = .59
r² = .35
HEIGHT “explains” 35% of variance in WEIGHT
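The p value attached to r comes from a significance test. As a sketch, the standard t statistic for a correlation, t = r·sqrt(n - 2) / sqrt(1 - r²), can be computed from r and n alone; this formula is a textbook result and is not shown on the slides.

```python
import math

r = 0.592541   # correlation of HEIGHT with WEIGHT (Beta in the regression output)
n = 42         # number of cases

# t statistic for testing whether the population correlation is zero,
# with n - 2 degrees of freedom.
t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# With a single predictor, the ANOVA F ratio equals t squared.
f = t ** 2

print(round(t, 3), round(f, 2))
# → 4.652 21.64
```

Both values match the regression output for WEIGHT on HEIGHT (T = 4.652, F = 21.64319).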

34
Sentence & G.P.A.
Regression: form of relationship
Correlation: strength of relationship
p value: statistical significance

35
Legal Attitudes Study:
1. Relationship of sentence length to G.P.A.?
2. Relationship of sentence length to Liberal-Conservative views?

36
G. P. A.

37
Length of Sentence (simulated data)

38
Scatterplot: Sentence on G.P.A.

39
Regression Coefficients
Sentence = -3.5 (G.P.A.) + 18

40
“Least Squares” Regression Line
Sent = -3.5 (GPA) + 18
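A short sketch of what this line predicts at a few G.P.A. values. The function name `predict_sentence` is illustrative, and the data behind the equation are simulated.

```python
def predict_sentence(gpa):
    """Predicted sentence length from the fitted line: Sent = -3.5 GPA + 18."""
    return -3.5 * gpa + 18


for gpa in (2.0, 3.0, 4.0):
    print(gpa, predict_sentence(gpa))

# A one-unit increase in G.P.A. lowers the predicted sentence by 3.5.
print(predict_sentence(3.0) - predict_sentence(4.0))   # → 3.5
```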

41
Correlation: Sentence & G.P.A.

42
Statistical Significance
p = .31 (regression and correlation)

43
Interpreting Correlations
r = -.22
r² = .05
p = .31
G.P.A. “explains” 5% of the variance in length of sentence
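The two interpretive steps here are squaring r and comparing p with the p < .05 criterion from earlier in the deck. A minimal sketch, with illustrative variable names:

```python
r = -0.22      # correlation between G.P.A. and sentence length
p = 0.31       # p value reported for the regression and correlation
ALPHA = 0.05   # cutoff used earlier in the deck (p < .05)

r_squared = r ** 2        # share of variance "explained"
significant = p < ALPHA   # True only if the result generalizes at p < .05

print(round(r_squared, 2), significant)   # → 0.05 False
```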

44
Write Results
“A regression analysis found that each one-unit increase in GPA was associated with a 3.5-month decrease in sentence length, but the correlation was weak (r = -.22) and not statistically significant (p = .31).”

46
Multiple Regression
Problem: relationship of weight and calorie consumption
Both weight and calorie consumption are related to height
Need to “control for” height, or assess the relative effects of height and calorie consumption

47
Multiple Regression
[Figure: regression line and mean]

48
Multiple Regression
[Figure: regression line, mean, and residuals]

49
Multiple Regression
Regress weight residuals (dependent variable) on caloric intake (independent variable)
Statistically “controls” for height: removes effect or “confound” of height.
How much variance in weight does caloric intake account for over and above height?
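The two-step procedure described above can be sketched as follows. The helper `fit_line` and the four data points are hypothetical. In this toy data, height and caloric intake are uncorrelated by construction, so the residual step recovers the caloric effect exactly; when the predictors are correlated, statistical packages fit all predictors together and the two-step slope can differ from the multiple regression coefficient.

```python
def fit_line(xs, ys):
    """Least squares slope and constant for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: weight = 2 * height + 0.01 * calories.
heights = [60, 60, 70, 70]
calories = [2000, 3000, 2000, 3000]
weights = [140, 150, 160, 170]

# Step 1: regress weight on height and keep the residuals,
# i.e. the part of weight that height does not account for.
b_height, const_height = fit_line(heights, weights)
residuals = [w - (b_height * h + const_height) for h, w in zip(heights, weights)]

# Step 2: regress the weight residuals on caloric intake.
b_calories, const_calories = fit_line(calories, residuals)

print(b_height, residuals, b_calories)
```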

50
Multiple Regression
How much variance in the dependent measure (weight, length of sentence) do all independent variables combined account for? Multiple R²
What is the best “model” for predicting the dependent variable?

51
Malamuth: Sexual Aggression
Dependent Var: self-report aggression
Indep / Predictor Vars:
- Dominance
- Hostility toward women
- Acceptance of violence toward women
- Psychoticism
- Sexual Experience
+ interaction effects

52
Malamuth: multiple regressions
                                 Main effects only          With interactions
                               multiple R   multiple R²        R        R²
Without “tumescence” index        .55          .30            .67      .45
With “tumescence” index           .62          .38            .87      .75
