Linear Functions 2 Sociology 5811 Lecture 18 Copyright © 2004 by Evan Schofer Do not copy or distribute without permission.

Announcements Proposals due November 15 Today’s class: Linear functions as summaries; introduction to the linear regression model

Review: Scatterplots Question: Can you describe the association? Answer: Negative linear association

Review: Scatterplots Question: Can you describe the association? Answer: Non-linear positive association

Review: Scatterplots Question: Can you describe the association? Answer: No association

Review: Scatterplots No relationship is represented by a “cloud” of evenly distributed points Strong linear relationships are reflected by visible “diagonal lines” on the graph Non-linear (curved) relationships are reflected by various curved patterns –U-shaped, upside down U-shaped –S-shaped, J-shaped

Review: Linear Association The closer points fall to a single line, the higher the linear association –Measured by the correlation coefficient (r) [Two scatterplots: some linear association vs. higher linear association]

Review: Linear Functions Formula: Y = a + bX –This is a linear formula: if you graphed X and Y for any chosen values of a and b, you’d get a straight line –It is a family of functions, like the normal curve: each pair of values for a and b gives a particular line –a is referred to as the “constant” or “intercept” –b is referred to as the “slope” To graph a linear function: pick values for X, compute the corresponding values of Y, then connect the dots to graph the line
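The graphing recipe above can be sketched in code (a hypothetical illustration, not from the lecture; the values of a and b are arbitrary):

```python
# Hypothetical sketch: tabulate points for a linear function Y = a + bX,
# then connect the dots to draw the line.  a and b are arbitrary choices.
a, b = 2.0, 0.5

def y(x):
    # Compute Y = a + bX for a chosen X
    return a + b * x

# Pick values for X and compute the corresponding values of Y
points = [(x, y(x)) for x in range(6)]
print(points)  # [(0, 2.0), (1, 2.5), (2, 3.0), (3, 3.5), (4, 4.0), (5, 4.5)]
```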

Linear Functions: Y = a + bX The “constant” or “intercept” (a) –Determines where the line intersects the Y-axis –If a increases (decreases), the line moves up (down) [Plot: several lines with the same slope but different intercepts]

Linear Functions: Y = a + bX The slope (b) determines the steepness of the line [Plot: lines Y = 3 − 1.5X, Y = 2 + .2X, and Y = 3 + 3X]

Linear Functions: Slopes The slope (b) is the ratio of change in Y to change in X For the line Y = 3 + 3X: change in X = 5, change in Y = 15, so slope: b = 15/5 = 3 The slope tells you how many points Y will increase for any one-point increase in X
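The rise-over-run calculation can be expressed as a short function (an illustrative sketch; the two points are chosen to lie on the slide’s line Y = 3 + 3X):

```python
# Sketch: the slope is the ratio of change in Y to change in X
# ("rise over run").  The points below lie on the slide's line Y = 3 + 3X.
def slope(x1, y1, x2, y2):
    # Change in Y divided by change in X between two points on the line
    return (y2 - y1) / (x2 - x1)

b = slope(0, 3, 5, 18)  # change in X = 5, change in Y = 15
print(b)  # 3.0
```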

Linear Functions as Summaries A linear function can be used to summarize the relationship between two variables: Change in X = 40,000 Change in Y = 2 Slope: b = 2 / 40,000 = .00005 pts/$ If you change units: b = .05 pts/$1K b = .5 pts/$10K b = 5 pts/$100K

Linear Functions as Summaries Slope and constant can be “eyeballed” to approximate a formula: Slope (b): b = 2 / 40,000 = .00005 pts/$ Constant (a) = value where the line hits the Y-axis: a = 2 Happy = 2 + .00005(Income)

Linear Functions as Summaries Linear functions can powerfully summarize data: –Formula: Happy = 2 + .00005(Income) Gives a sense of how the two variables are related –Namely, people get a .00005-point increase in happiness for every extra dollar of income (or 5 pts per $100K) Also lets you “predict” values. What if someone earns $150,000? –Happy = 2 + .00005(150,000) = 9.5 But be careful… You shouldn’t assume that a relationship remains linear indefinitely –Also, negative income or happiness makes no sense…
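The prediction step can be sketched as follows, using the eyeballed coefficients from the slide (a = 2, b = .00005, income in dollars); since the line was eyeballed, treat the result as a rough estimate:

```python
# Sketch of the slide's summary line: Happy = 2 + .00005 * Income,
# with income measured in dollars.  Coefficients were eyeballed from
# the scatterplot, so predictions are approximate.
a, b = 2.0, 0.00005

def predict_happiness(income_dollars):
    # Plug an income into the linear formula to "predict" happiness
    return a + b * income_dollars

print(predict_happiness(150_000))  # ≈ 9.5
```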

Linear Functions as Summaries Come up with a linear function that summarizes this real data: years of education vs. job prestige It isn’t always easy! The line you choose depends on how much you “weight” these points.

Linear Functions as Summaries One estimate of the linear function The line meets the Y-axis at Y=5. Thus a = 5 Formula: Y = 5 + 3X The line increases to about 65 as X reaches 20. The increase is 60 in Y per 20 in X. Thus: b = 60/20 = 3

Linear Functions as Summaries Questions: How much additional job prestige do you get by going to college (an extra 4 years of education)? –Formula: Prestige = 5 + 3*Education Answer: About 12 points of job prestige Change in X is 4… Slope is 3. 3 x 4 = 12 points If X=12, Y=5+3*12 = 41; If X=16, Y=5+3*16 = 53 What is the interpretation of the constant? It is the predicted job prestige of someone with zero years of education… (Prestige = 5)

Linear Functions as Summaries What do you think happens to the relationship between education and job prestige when education exceeds 20? –Would it remain linear? –Or would the effect taper off? Answer: Some would argue that the returns from education diminish beyond a certain point.

Interpreting Linear Functions New Example: In a society, the relationship between education (years) and income (in 1000s of dollars per year) can be summarized by: –Income (in 1000s) = 10 + 3(Education) Questions: What is the general range of salaries? –0 education = 10K, 20 yrs education = 70K What is the economic benefit of college? Would you encourage your child to attend school? –What if it were: Income = (Education)?

Interpreting Linear Functions Example: Income (in 1000s) = 10 + 3(Education) Questions: How would the society be different if the constant were 0? Provide a possible social interpretation. How would the society be different if the constant were 30? How would the society be different if the slope were zero? If it were negative? How would the society be different if the slope were 8?

Linear Functions Many issues remain: 1. How to test for “independence” between two interval measures (like a chi-square test)? –In order to know whether a linear relationship exists 2. How to calculate correlation coefficients (r) to measure linear association? 3. How to calculate the linear formula that best summarizes the relationship between two real variables (i.e., based on actual data)? 4. What kinds of hypothesis tests can be done?

Lines: Summaries and Prediction Recall: Lines can be used to summarize data (dollars in 1000s): Change in X = 40 Change in Y = 2 Slope (b): b = 2 / 40 = .05 pts/K$ Constant (a) = value where the line hits the Y-axis: a = 2 Happy = 2 + .05(Income)

Linear Functions as Prediction Linear functions can summarize the relationship between two variables: –Formula: Happy = 2 + .05(Income in 1,000s) Linear functions can also be used to “predict” (estimate) a case’s value of one variable (Yi) based on its value of another variable (Xi) –If you know the constant and slope “Y-hat” indicates an estimated value: Yi-hat = a + bYX(Xi) bYX denotes the slope of Y with respect to X

Prediction with Linear Functions If Xi (Income) = 60K, what is our estimate of Yi (Happiness)? Happy = 2 + .05(Income) Happiness-hat = 2 + .05(60) = 5 There is a case with Income = 60K The prediction is imperfect… that case falls at Y = 5.3 (above the line).

The Linear Regression Model To model real data, we must take into account that points will miss the line Similar to ANOVA, we refer to the deviation of points from the estimated value as “error” (ei) In ANOVA, the estimated value is the group mean –i.e., the grand mean plus the group effect In regression, the estimated value is derived from the formula Y = a + bX –Estimation is based on the value of X, slope, and constant (assumes a linear relationship between X and Y)

The Linear Regression Model The value of any point (Yi) can be modeled as: Yi = a + bYX(Xi) + ei The value of Y for case (i) is made up of A constant (a) A sloping function of the case’s value on variable X (bYX) An error term (ei), the deviation from the line By adding error (ei), an abstract mathematical function can be applied to real data points

The Linear Regression Model Visually: Yi = a + bXi + ei [Plot of the line Y = 2 + .5X: constant a = 2; for case 7 (X = 3, Y = 5): bX = .5(3) = 1.5, e = 1.5]
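The decomposition Yi = a + bXi + ei can be checked numerically for the slide’s case 7 (a sketch using the slide’s own numbers):

```python
# Sketch using the slide's numbers: line Y = 2 + .5X and case 7
# (X = 3, Y = 5).  The observed value decomposes into the predicted
# value on the line plus an error term.
a, b = 2.0, 0.5
x_i, y_i = 3.0, 5.0

y_hat = a + b * x_i   # predicted value: 2 + .5(3) = 3.5
e_i = y_i - y_hat     # error (deviation from the line): 1.5
assert y_i == a + b * x_i + e_i   # Yi = a + bXi + ei holds exactly
print(y_hat, e_i)  # 3.5 1.5
```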

Estimating Linear Equations Question: How do we choose the best line to describe our real data? –Previously, we just “eyeballed” it Answer: Look at the error If a given line formula misses points by a lot, the observed error will be large If the line is as close to all points as possible, observed error will be small Of course, even the best line has some error –Except when all data points are perfectly on a line

Estimating Linear Equations A poor estimation (big error) Y=1.5-1X

Estimating Linear Equations Better estimation (less error) Y=2+.5X

Estimating Linear Equations Look at the improvement (reduction) in error: [Side-by-side plots: high error vs. low error]

Estimating Linear Equations Idea: The “best” line is the one that has the least error (deviation from the line) Total deviation from the line can be expressed as: Σei = Σ(Yi − Yi-hat) But, to make all deviation positive, we square it, producing the “sum of squares error”: SSE = Σei² = Σ(Yi − Yi-hat)²
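The sum of squared errors can be sketched as a function of a candidate line (the data points here are made up for illustration):

```python
# Sketch: sum of squared errors (SSE) for a candidate line, over some
# made-up data points.  A line close to the points gives a small SSE.
def sse(a, b, data):
    # Sum of squared deviations of each point from the line Y = a + bX
    return sum((y - (a + b * x)) ** 2 for x, y in data)

data = [(1, 2.4), (2, 3.1), (3, 3.4), (4, 4.2)]
# The better-fitting line has less error:
print(sse(2.0, 0.5, data) < sse(1.5, -1.0, data))  # True
```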

Estimating Linear Equations Goal: Find values of the constant (a) and slope (b) that produce the lowest squared error –The “least squares” regression line The formula for the slope (b) that yields the “least squares error” is: b = sYX / s²X Where s²X is the variance of X And sYX is the covariance of Y and X.
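A minimal sketch of the least-squares computation, using the slope formula b = sYX / s²X together with the standard intercept formula a = Y-bar − b(X-bar), which is not shown on this slide; the data values are illustrative:

```python
# Sketch: least-squares estimates from raw data, using the slope formula
# b = s_YX / s2_X and the standard intercept formula a = Ybar - b * Xbar
# (the intercept formula is standard, not from this slide).
def least_squares(xs, ys):
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Covariance of Y and X, and variance of X (both over N - 1)
    s_yx = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    s_xx = sum((x - x_bar) ** 2 for x in xs) / (n - 1)
    b = s_yx / s_xx
    a = y_bar - b * x_bar
    return a, b

xs = [1, 2, 3, 4]
ys = [3.5, 4.0, 4.5, 5.0]     # these lie exactly on Y = 3 + .5X
print(least_squares(xs, ys))  # approximately (3.0, 0.5)
```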

Covariance Variance: sum of squared deviations about Y-bar, divided by N − 1: s²Y = Σ(Yi − Y-bar)² / (N − 1) Covariance (sYX): sum of deviations about Y-bar multiplied by deviations about X-bar: sYX = Σ(Yi − Y-bar)(Xi − X-bar) / (N − 1)

Covariance Covariance: a measure of how much deviation of a case on X is accompanied by deviation on Y It measures whether deviation (from the mean) in X tends to be accompanied by similar deviation in Y Or whether cases with positive deviation in X have negative deviation in Y This is summed up over all cases in the data The covariance is one numerical measure that characterizes the extent of linear association As is the correlation coefficient (r).
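The covariance computation can be sketched directly from its definition (illustrative data; note the positive result when Y rises with X):

```python
# Sketch: covariance from its definition -- summed co-deviation about
# the means, divided by N - 1.  The data values are illustrative.
def covariance(xs, ys):
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

xs = [1, 2, 3]
ys = [2, 4, 6]             # Y rises with X, so the covariance is positive
print(covariance(xs, ys))  # 2.0
```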