Sociology 601 Class 17: October 28, 2009


Sociology 601 Class 17: October 28, 2009

Review (linear regression)
–new terms and concepts
–assumptions
–reading regression computer outputs

Correlation (Agresti and Finlay 9.4)
–the correlation coefficient r
–relationship to regression coefficient b
–r-squared: the reduction in error

Review: Linear Regression

New terms and concepts
–slope
–intercept
–negative and positive slopes
–zero slope
–least squares regression
–predicted value
–residuals
–sums of squares error

Review: Linear Regression

Assumptions
–random sample (errors are independent)
–linearity
–no heteroscedasticity
–no outliers

Linearity, heteroscedasticity, and outliers can be checked with scattergrams and crosstabs
–before computing regressions
–on residuals

A Problem with Regression Coefficients

Regression coefficients don’t measure the strength of an association in a way that is easily compared across different models with different variables or different scales. Rescaling one or both axes changes the slope b.

Example: murder rate and poverty rate for the 50 US states.
Ŷ = a + .58X
o where Y = murder rate per 100,000 per year
o and X = poverty rate per 100
If we rescale Y, the murder rate, to murders per 100 persons per year, then Ŷ = a′ + .00058X
o (does this mean the association is now weaker?)
If we rescale X, the poverty rate, to the proportion in poverty (0.00 → 1.00), then Ŷ = a″ + 58X
o (does this mean the association is now stronger?)
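The rescaling claim on this slide can be checked numerically. This is a minimal Python sketch (the course itself uses Stata) with made-up poverty and murder data, not the actual 50-state values: dividing Y by 1,000 divides the slope by 1,000, and dividing X by 100 multiplies the slope by 100, even though the underlying association is unchanged.

```python
import numpy as np

# Hypothetical data loosely mimicking the lecture's example;
# the numbers are invented, only the rescaling behavior matters.
rng = np.random.default_rng(0)
x = rng.uniform(5, 25, size=50)            # poverty rate per 100
y = 0.6 * x + rng.normal(0, 2, size=50)    # murder rate per 100,000

def slope(a, b):
    """Least-squares slope: cov(a, b) / var(a)."""
    return np.cov(a, b)[0, 1] / np.var(a, ddof=1)

b = slope(x, y)
# Murders per 100,000 -> per 100 persons: slope shrinks by 1,000.
assert np.isclose(slope(x, y / 1000), b / 1000)
# Poverty per 100 -> proportion (0.00-1.00): slope grows by 100.
assert np.isclose(slope(x / 100, y), b * 100)
```

So the size of b by itself says nothing about strength; it depends entirely on the units chosen.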

The correlation – a standardized slope

An accepted solution for the problem of scale is to standardize both axes (e.g., change them into z-scores with mean zero and a standard deviation of 1), then calculate the slope.

b = ΔY / ΔX
r = (ΔY/s_Y) / (ΔX/s_X) = (ΔY/ΔX)·(s_X/s_Y) = b·(s_X/s_Y)
where s_X and s_Y are the standard deviations of X and Y.
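The identity on this slide can be verified directly: the slope of the z-scored variables equals b·(s_X/s_Y), which equals the Pearson correlation. A small Python sketch with hypothetical data (the lecture itself uses Stata):

```python
import numpy as np

# Invented data; only the algebraic identity is being illustrated.
rng = np.random.default_rng(1)
x = rng.normal(10, 4, size=50)
y = 0.5 * x + rng.normal(0, 3, size=50)

b = np.cov(x, y)[0, 1] / np.var(x, ddof=1)       # regression slope
r = b * (np.std(x, ddof=1) / np.std(y, ddof=1))  # standardized slope

# Same value as the slope computed on z-scored X and Y...
zx = (x - x.mean()) / np.std(x, ddof=1)
zy = (y - y.mean()) / np.std(y, ddof=1)
b_z = np.cov(zx, zy)[0, 1] / np.var(zx, ddof=1)
assert np.isclose(r, b_z)
# ...and the same value as the Pearson correlation.
assert np.isclose(r, np.corrcoef(x, y)[0, 1])
```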

The Correlation Coefficient, r

r is called…
–the Pearson correlation (or simply the correlation)
–the standardized regression coefficient (or the standardized slope)

r = b·(s_X/s_Y)

r is a sample statistic we use to estimate a population parameter ρ.

Calculating r: an example

Calculating r for the murder and poverty example:
b = .58, s_X = 4.29, s_Y = 3.98
r = b·(s_X/s_Y) = .58·(4.29/3.98) = .629 ≈ .63

alternatively (if the murder rate is per 100 persons),
b = .00058, s_X = 4.29, s_Y = .00398
r = b·(s_X/s_Y) = .00058·(4.29/.00398) = .629 ≈ .63

Properties of the correlation coefficient r:
–−1 ≤ r ≤ 1
–r can be positive or negative, and has the same sign as b.
–r = ±1 when all the points fall exactly on the prediction line.
–The larger the absolute value of r, the stronger the linear association.
–r = 0 when there is no linear trend in the relationship between X and Y.
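The r = ±1 property is easy to demonstrate: when the points lie exactly on a line, r is +1 or −1 regardless of how steep the line is. A quick Python check with invented points:

```python
import numpy as np

x = np.arange(10, dtype=float)

# Points exactly on a line: r = +1 or -1, no matter the slope's size.
assert np.isclose(np.corrcoef(x, 5 * x + 2)[0, 1], 1.0)
assert np.isclose(np.corrcoef(x, -0.2 * x + 7)[0, 1], -1.0)
```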

Properties of the Correlation Coefficient r:
–The value of r does not depend on the units of X and Y.
–The correlation treats X and Y symmetrically (unlike the slope b).
o this means that a correlation implies nothing about causal direction!
–The correlation is valid only when a straight line is a reasonable model for the relationship between X and Y.
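Both properties above can be checked numerically. A minimal Python sketch with made-up data: the correlation is the same whichever variable we call X, and is unchanged by rescaling, while the regression slope is neither.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=100)
y = 0.5 * x + rng.normal(size=100)

def slope(a, b):
    """Least-squares slope of b on a."""
    return np.cov(a, b)[0, 1] / np.var(a, ddof=1)

# r is symmetric in X and Y; the slope is not.
assert np.isclose(np.corrcoef(x, y)[0, 1], np.corrcoef(y, x)[0, 1])
assert not np.isclose(slope(x, y), slope(y, x))
# r is unchanged when either variable is rescaled.
assert np.isclose(np.corrcoef(x, y)[0, 1],
                  np.corrcoef(100 * x, y / 1000)[0, 1])
```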

Examples of the correlation coefficient r:
–b = 1, r = 1
–b = 5, r = 1
–b = .2, r = 1
–b = −1, r = −1
–b = .5, r = .8
–b = .5, r = .3
–b = 0, linear assumption holds
–b = 0, linear assumption does not hold

Calculating a correlation coefficient using Stata

Recall the religion and state control study, where high levels of state regulation were associated with low levels of weekly church attendance.

. correlate attend regul
(obs=18)

             |   attend    regul
-------------+------------------
      attend |   1.0000
       regul |            1.0000

An alternative interpretation of r: proportional reduction in error

Old interpretation for the murder and poverty example: r = .63, so the murder rate for a state is expected to be higher by 0.63 standard deviations for each 1.0 standard deviation increase in the poverty rate.

New interpretation: by using poverty rates to predict murder rates, we explain ?? percent of the variation in states’ murder rates.

Proportional reduction in error:

Predicting Y without using X:
Y = Ȳ + e₁
E₁ = Σe₁² = Σ(observed Y − predicted Y)² = Total Sum of Squares = TSS

Predicting Y using X:
Y = Ŷ + e₂ = a + bX + e₂
E₂ = Σe₂² = Σ(observed Y − predicted Y)² = Sum of Squared Errors = SSE

Proportional reduction in error:
r² = PRE = (E₁ − E₂) / E₁ = (TSS − SSE) / TSS
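The PRE identity above can be verified by computing both sides. A Python sketch with hypothetical data (any simple-regression data would do): TSS from predicting with the mean, SSE from predicting with the least-squares line, and (TSS − SSE)/TSS equals the squared Pearson correlation.

```python
import numpy as np

# Invented data for illustration only.
rng = np.random.default_rng(3)
x = rng.normal(10, 4, size=50)
y = 2 + 0.5 * x + rng.normal(0, 2, size=50)

# E1: predict Y without X, using the mean of Y.
tss = np.sum((y - y.mean()) ** 2)

# E2: predict Y using the least-squares line a + b*x.
b = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
a = y.mean() - b * x.mean()
sse = np.sum((y - (a + b * x)) ** 2)

# Proportional reduction in error equals the squared correlation.
r2 = (tss - sse) / tss
assert np.isclose(r2, np.corrcoef(x, y)[0, 1] ** 2)
```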

Proportional reduction in error.

Calculating r² for the murder and poverty example:
r² = .629² ≈ .395
alternatively (using computer output),
r² = (TSS − SSE) / TSS = (777.7 − 470.4)/777.7 = .395

Interpretation: 39.5% of the variation in states’ murder rates is explained by their linear relationship with states’ poverty rates.
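The slide's arithmetic checks out, as a few lines of Python confirm: the sums of squares from the output give r² = .395, and taking the square root recovers r = .629 from the earlier slide.

```python
# Sums of squares from the lecture's computer output.
tss, sse = 777.7, 470.4
r2 = (tss - sse) / tss

assert round(r2, 3) == 0.395         # proportional reduction in error
assert round(r2 ** 0.5, 3) == 0.629  # recovers r from the earlier slide
```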

R-square

r² is also called the coefficient of determination.

Properties of r²:
–0 ≤ r² ≤ 1
–r² = 1 (its maximum value) when SSE = 0.
–r² = 0 when SSE = TSS (furthermore, b = 0).
–The higher r² is, the stronger the linear association between X and Y.
–r² does not depend on the units of measurement.
–r² takes the same value when X predicts Y as when Y predicts X.

Next Class

Drawing inference to populations from sample b’s and r’s.