Presentation on theme: "Data Analysis: Relationships Continued Regression"— Presentation transcript:
1 Data Analysis: Relationships Continued: Regression. Research Methods, Dr. Gail Johnson
2 Simple Regression
Enables us to estimate:
The strength of the relationship, expressed as the percent of variance explained
How much change you can expect in the dependent variable for a one-unit change in the independent variable
Also enables you to make predictive estimates
3 Relationships
Correlation is not causation.
Statistical measurement includes the measurement of relationships.
There are two ways to measure the strength of a relationship:
1. How great a difference the independent variable (IV) makes in the dependent variable (DV), sometimes called the effect-description. This allows you to predict the effect of the IV on the DV, but it requires interval data.
4 Relationships
2. How completely the dependent variable is explained by the independent variable (correlational; R squared).
5 Simple Regression
Assumes a linear relationship
Interval-level data (or dichotomous, meaning coded 0 or 1)
Independent variable: interval level
Random sample or census
6 Simple Regression
Y = a + bX + error
Where:
a = the constant, or Y intercept
b = the regression coefficient, or slope
Y = predicted value of the dependent variable
X = the independent variable
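To make the roles of a and b concrete, here is a minimal Python sketch of how the intercept and slope of Y = a + bX might be estimated. It assumes ordinary least squares, the usual estimation method for simple regression, which the slides do not name. The mileage and cost figures are hypothetical and are not taken from the slides; the function name fit_simple_regression is likewise an illustration.

```python
# Hypothetical data (NOT from the slides): miles driven and repair cost in dollars.
miles = [10000, 20000, 30000, 40000, 50000]
costs = [150, 310, 480, 610, 800]


def fit_simple_regression(x, y):
    """Return (a, b) for the least-squares line Y = a + bX."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: how X and Y vary together, divided by how much X varies.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    a = mean_y - b * mean_x
    return a, b


a, b = fit_simple_regression(miles, costs)
print(f"Y = {a:.1f} + {b:.4f}X")
```

The slope b is the expected change in Y for a one-unit change in X, and a is the predicted value of Y when X is zero.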
7 Simple Regression
Estimate car repair costs for a motor pool.
Y = car repair costs
X = miles driven
Collect the data and crunch it. You get these results:
Y = -267 + .018X
8 Simple Regression
Estimate car repair costs
Y = -267 + .018X
Interpretation: for every mile driven, repair costs go up by 1.8 cents.
For every 100 miles driven, costs go up by $1.80.
9 Simple Regression
Y = -267 + .018X
If you expect the cars to be driven a total of 100,000 miles, how much will car repair costs likely be?
100,000 x .018 = $1,800
Solve the equation:
Y = -267 + 1,800 = $1,533
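The prediction step can be checked with a few lines of Python. This sketch uses the intercept (-267) and slope (.018) reported on the slide; the function name predict_repair_cost is an assumption for illustration. Plugging in 100,000 miles gives -267 + 1,800, or about $1,533.

```python
INTERCEPT = -267.0  # a, the constant reported on the slide (dollars)
SLOPE = 0.018       # b, dollars of repair cost per mile driven


def predict_repair_cost(miles):
    """Predicted repair cost in dollars from Y = a + bX."""
    return INTERCEPT + SLOPE * miles


estimate = predict_repair_cost(100_000)
print(f"Predicted repair cost: ${estimate:,.0f}")
```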
10 Simple Regression
r = correlation coefficient (overall fit); a measure of association, but non-directional (the zero-order correlation coefficient)
r² = proportion of explained variation
1 - r² = proportion of unexplained variation
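As a rough illustration of these three quantities, the following Python sketch computes the Pearson correlation coefficient r, then r squared and 1 minus r squared. The data are hypothetical (not from the slides), and the function name pearson_r is an assumption.

```python
import math

# Hypothetical data (NOT from the slides): miles driven and repair cost in dollars.
miles = [10000, 20000, 30000, 40000, 50000]
costs = [150, 310, 480, 610, 800]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(miles, costs)
r_squared = r ** 2           # proportion of explained variation
unexplained = 1 - r_squared  # proportion of unexplained variation
print(f"r = {r:.3f}, r squared = {r_squared:.3f}, unexplained = {unexplained:.3f}")
```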
11 Life is more complex
Rarely will any single variable cause something to happen.
Life is inherently multivariate.
What are the possible causes of urban decay?
12 What are the possible causes of urban decay?
Lack of jobs
High % of absentee landlords
Low % of homeowners
Poor quality of schools
Increased concentration of poor
Increase in drugs, crime
Aging housing stock
Flight of middle class to suburbs
Corruption
Aging infrastructure
Business flight to suburbs
13 What caused the drop in crime?
Changing demographics?
Better policing?
Strong economy?
Gun control laws?
Concealed weapons laws?
Increased use of the death penalty?
Increase in number of police?
Rising prison population?
Waning crack epidemic?
Legalization of abortion?
14 Multiple Regression
Multiple regression lets you do four things:
Test your hypothesis
Predict the dependent variable if you know the values of the independent variables
Estimate the independent effect of each independent variable while controlling for the others
See the relative strength of each independent variable, using the beta weights
15 Multiple Regression
Y = a + b1X1 + b2X2 + b3X3 + b4X4 + e
Y = dependent variable
X1 = independent variable 1, controlling for X2, X3, X4
X2 = independent variable 2, controlling for X1, X3, X4
X3 = independent variable 3, controlling for X1, X2, X4
X4 = independent variable 4, controlling for X1, X2, X3
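A minimal Python sketch of how such an equation might be estimated follows. It assumes ordinary least squares solved through the normal equations, a method the slides do not specify, and it uses only two independent variables to stay short. The data are hypothetical and were generated without error from Y = 3 + 2X1 + 0.5X2, so the fit should recover those values; the function names are assumptions.

```python
def solve_linear_system(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [value] for row, value in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back-substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - known) / M[i][i]
    return x


def fit_multiple_regression(rows, y):
    """rows holds [x1, x2, ...] per case; returns [a, b1, b2, ...]."""
    X = [[1.0] + list(row) for row in rows]  # leading 1 estimates the constant a
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve_linear_system(XtX, Xty)


# Hypothetical cases (NOT from the slides).
rows = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6], [6, 5]]
y = [6.0, 7.5, 11.0, 12.5, 16.0, 17.5]

a, b1, b2 = fit_multiple_regression(rows, y)
print(f"Y = {a:.2f} + {b1:.2f}X1 + {b2:.2f}X2")
```

Each slope is estimated with the other independent variable in the model, which is what "controlling for" means here.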
16 Multiple Regression
Income as a function of education and seniority?
Y = income (dependent variable)
Y (income) = a + education + seniority
Y = 6000 + 400X1 + 200X2 (X1 = years of education, X2 = years of seniority)
Based on the Lewis-Beck example
17 Multiple Regression
Y = 6000 + 400X1 + 200X2
R squared = .67
67% of the variation in income is explained by these two variables. Excellent!
For every additional year of education, holding seniority constant, income increases by $400.
For every additional year of seniority, holding education constant, income increases by $200.
18 Multiple Regression
Y = 6000 + 400X1 + 200X2
Example: Estimate the income of someone who has 10 years of education and 5 years of seniority:
Y = 6000 + 400(10) + 200(5)
Y = $11,000
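The same arithmetic can be written as a short Python function. The coefficients come from the slide's equation Y = 6000 + 400X1 + 200X2; the function and parameter names are assumptions for illustration.

```python
def predict_income(education_years, seniority_years):
    """Predicted income in dollars from Y = 6000 + 400*X1 + 200*X2."""
    return 6000 + 400 * education_years + 200 * seniority_years


income = predict_income(10, 5)
print(income)  # 11000
```

Changing one argument while leaving the other fixed shows the "holding constant" interpretation: one more year of education adds $400, and one more year of seniority adds $200.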
19 Multiple Regression
Contributions to political campaigns as a function of age and income?
Y = campaign contribution (dollars)
X1 = age (years)
X2 = income (dollars)
20 Multiple Regression
Contributions to political campaigns as a function of age and income.
Y = 8 + 2X1 + .010X2, where X1 = age and X2 = income
For every one-year increase in age, holding income constant, contributions go up by $2.
For every one-dollar increase in income, holding age constant, contributions go up by $.01.
21 Multiple Regression
Y = 8 + 2X1 + .010X2
Y = campaign contribution (dollars)
But which is stronger?
Need to look at the beta weights:
Age = .15
Income = .45
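The raw slopes (2 and .010) cannot be compared directly because age is measured in years and income in dollars. A beta weight is a standardized slope: the raw slope multiplied by the standard deviation of X and divided by the standard deviation of Y. The slides do not show this calculation or report any standard deviations, so the values in the sketch below are hypothetical, chosen only so that the results line up with the beta weights on the slide. The function name beta_weight is also an assumption.

```python
def beta_weight(b, sd_x, sd_y):
    """Standardized coefficient: raw slope rescaled into standard-deviation units."""
    return b * sd_x / sd_y


# Hypothetical standard deviations (NOT reported in the slides).
SD_AGE = 15.0            # years
SD_INCOME = 9000.0       # dollars
SD_CONTRIBUTION = 200.0  # dollars

beta_age = beta_weight(2.0, SD_AGE, SD_CONTRIBUTION)
beta_income = beta_weight(0.010, SD_INCOME, SD_CONTRIBUTION)

print(f"beta (age) = {beta_age:.2f}, beta (income) = {beta_income:.2f}")
```

On these assumed figures income has the larger beta weight, so it is the stronger of the two predictors even though its raw slope is much smaller.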
23 Quick Analysis
Whenever you are dealing with a correlation (regression analysis):
First check the R squared value. A good study will report it.
If it is low, then you know that it is not a strong model and the authors shouldn’t be drawing grand conclusions.
Make sure the study meets the four conditions necessary for causality.
24 Example
A study tried to determine why some cities introduced reinvention.
The researchers sent out a survey and got a respectable response rate.
They tested 13 factors they thought would explain reinvention.
R squared was .05.
What do you conclude?
25 Example
They ran a second model.
It included managers’ attitudes about innovation.
R squared was .22.
What do you conclude?
26 The Levitt Article
What data does he show?
What kind of question is he asking?
Does he show correlations?
Does he build a multivariate model?
Did anyone see an R squared in this article?