SI0030 Social Research Methods Week 6 Luke Sloan

Presentation transcript:

SI0030 Social Research Methods Week 6, Luke Sloan. Quantitative Data Analysis II: Correlation and Simple Linear Regression

Introduction: Last Week – Recap; Correlation; How To Draw A Line; Simple Linear Regression; Summary

Last Week – Recap: Hypotheses; probability & significance (p < 0.05); the chi-square test for two categorical variables; the t-test for one categorical and one interval variable. What about a test for two interval variables?...

Correlation I Calculates the strength and direction of a linear relationship between two interval variables e.g. is there a relationship between age and income? Measured using the Pearson correlation coefficient (r) Data must be normally distributed (check with a histogram) If not normally distributed use Spearman’s Rank Order Correlation (rho) - consult Pallant (2005:297)
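A minimal sketch of this in Python, assuming scipy is available (the data below are invented for illustration, not the candidates dataset used later in the lecture):

```python
import numpy as np
from scipy import stats

# Illustrative data: candidate age and years of party membership
age = np.array([25, 34, 41, 52, 60, 47, 38, 55])
years_member = np.array([1, 4, 8, 15, 22, 10, 6, 18])

# Pearson's r assumes interval-level, roughly normally distributed data
r, p = stats.pearsonr(age, years_member)
print(f"Pearson r = {r:.2f}, p = {p:.3f}")

# If the data are not normally distributed, use Spearman's rho instead
rho, p_rho = stats.spearmanr(age, years_member)
print(f"Spearman rho = {rho:.2f}, p = {p_rho:.3f}")
```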

Correlation II ‘r’ can take any value from +1 to -1. The sign (+/-) indicates whether the relationship is positive or negative. +1 or -1 is a perfect linear relationship, but in practice it is rarely this clear cut. Rule of thumb: +/- 0.7 = a strong linear relationship; +/- 0.5 = a good linear relationship; +/- 0.3 = a linear relationship; below +/- 0.3 = a weak linear relationship; 0 = no linear relationship. Alternatively: +/- 0.10 to 0.29 = weak; +/- 0.30 to 0.49 = medium; +/- 0.50 to 1.00 = strong
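A minimal Python sketch of turning these rules of thumb into a helper (the function name and labels are illustrative, not part of the lecture materials):

```python
def describe_r(r):
    """Rule-of-thumb label for a Pearson correlation coefficient."""
    strength = abs(r)
    if strength >= 0.7:
        label = "strong"
    elif strength >= 0.5:
        label = "good"
    elif strength >= 0.3:
        label = "moderate"
    else:
        label = "weak"
    direction = "positive" if r > 0 else ("negative" if r < 0 else "no")
    return f"{label} {direction} linear relationship"

print(describe_r(0.425))   # moderate positive linear relationship
print(describe_r(-0.82))   # strong negative linear relationship
```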

Correlation III [Three example scatter plots: positive relationship, no relationship, negative relationship] Formulate hypotheses and use scatter plots!

Correlation IV H1 = There is a relationship between Age and the number of years a candidate has been a member of a political party H0 = There is no relationship between Age and the number of years a candidate has been a member of a political party What do you think?

Correlation V [Histogram used to check normality] Is this normal? Just to prove a point…

Correlation VI

Correlations (SPSS output):
                                          What was your age   Number of years
                                          last birthday       a party member
What was your age     Pearson Correlation        1                .425**
last birthday         Sig. (2-tailed)                             .000
                      N                       4481                1874
Number of years       Pearson Correlation     .425**                 1
a party member        Sig. (2-tailed)         .000
                      N                       1874                1936
**. Correlation is significant at the 0.01 level (2-tailed).

The diagonal shows the perfect correlation of each variable against itself (obviously!) and N gives the number of cases in the analysis. Significance for correlation is problematic (highly dependent on sample size) – report the p-value but ignore the level of significance. Pearson’s correlation coefficient is r = 0.43 – a medium/good positive linear relationship.

Correlation VII Don’t forget to refute or accept the null hypothesis and discuss the relationship Correlation is not causation! The relationship between the number of years a candidate has been a member of a party and candidate age was explored using Pearson’s correlation coefficient. Both variables were confirmed to have normal distributions [?] and a scatter plot revealed a linear relationship. There was a medium-strength, positive relationship between the two variables (r=0.43, n=4481, p<0.05)... [go on to explain the relationship in detail]

How To Draw A Line I Correlation is indicative of a relationship, but it does not allow us to quantify it. What if we wanted to explain how an increase in age leads to an increase in years of party membership? What if we wanted to predict years of party membership based only on age? The line of best fit is predictive – it is the regression line!

How To Draw A Line II The regression line allows us to predict any given value of y when we know x i.e. if we know the age of a candidate we can predict how long they are likely to have been a member of a political party Another (more useful!) example would be years in education and income Using a regression line we can predict someone’s income based on the number of years they have been in education Assumes a causal relationship – that income is ‘caused’ by years in education

How To Draw A Line III But… we don’t simply look very closely at the line and the axes of the scatter plot, because the regression line can be written as an equation: y = a + bx. ‘y’ represents the dependent variable (what we are trying to predict), e.g. income. ‘a’ represents the intercept (where the regression line crosses the vertical ‘y’ axis), aka the constant. ‘b’ represents the slope of the line (the association between ‘y’ & ‘x’), e.g. how income changes in relation to education. ‘x’ represents the independent variable (what we are using to predict ‘y’), e.g. years in education.

How To Draw A Line IV [Plot of regression lines with different slopes through the origin: y = 0 + 2x, y = 0 + 1x, y = 0 + 0.5x. What about… y = 0 + 0.25x, and y = 1 + 1x?]

Simple Linear Regression If we know the slope (b) and the intercept (a), for any given value of ‘x’ we can predict ‘y’. EXAMPLE: predicting income (y) in thousands (£) from years in education (x). Preconditions: intercept (a) = 4, slope (b) = 1.5. Equations: y = a + bx, or… Income = intercept + (slope * years in education). For someone with 10 years of education: Income = 4 + (1.5 * 10) = 19 (£19,000)
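The same worked example as a minimal Python sketch (the function name and default values simply mirror the slide; they are not from any standard library):

```python
def predict_income(years_education, intercept=4.0, slope=1.5):
    """Predict income in £1000s using y = a + b*x."""
    return intercept + slope * years_education

print(predict_income(10))  # 19.0, i.e. £19,000
```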

Simple Linear Regression II Assumptions: interval level data; linearity between ‘x’ and ‘y’; no extreme outliers (check the scatter plot); sample size = 100+? R² measure of ‘model fit’: literally the Pearson’s correlation coefficient squared. R² tells us how much of the variance in the dependent variable is explained by the independent variable, e.g. how much of the variance in income can be explained by age. Expressed as a percentage (1.0 = 100%, 0.5 = 50% etc)
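As a quick check of the point that R² is literally the correlation coefficient squared, using the r value from the correlation table above (a minimal sketch; the value is taken from the slides):

```python
r = 0.425             # Pearson's r from the SPSS correlation table
r_squared = r ** 2
print(round(r_squared, 3))   # 0.181, i.e. about 18% of variance explained
```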

Simple Linear Regression III H0 = There is no relationship between Age and the number of years a candidate has been a member of a political party H1 = There is a relationship between Age and the number of years a candidate has been a member of a political party H2 = As the age of a candidate increases, so will the number of years that they have been a party member ‘Years as Party Member’ = intercept + (slope * ’Age’)

Simple Linear Regression IV

Model Summary (SPSS output):
Model    R        R Square    Adjusted R Square    Std. Error of the Estimate
1        .425a    .181        .180                 11.995
a. Predictors: (Constant), What was your age last birthday

R is Pearson’s correlation coefficient (same value!), and R Square shows that 18% of the variance in party membership (y) is explained by age (x).

ANOVAb (SPSS output):
Model          Sum of Squares    df      Mean Square    F          Sig.
1 Regression   59446.085         1       59446.085      413.170    .000a
  Residual     269339.696        1872    143.878
  Total        328785.781        1873
a. Predictors: (Constant), What was your age last birthday
b. Dependent Variable: Number of years a party member

The ANOVA tests the hypothesis that the model is a better predictor of party membership than if we simply used the mean value of party membership. p < 0.05, so the regression model is a significantly better predictor than the mean value.
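A quick arithmetic check on the ANOVA table (values copied from the output above; this is illustration, not part of the lecture materials):

```python
ss_regression = 59446.085
ss_residual = 269339.696
df_regression = 1          # one predictor
df_residual = 1872

# F = regression mean square / residual mean square
f_stat = (ss_regression / df_regression) / (ss_residual / df_residual)
print(round(f_stat, 2))    # 413.17, matching the SPSS F value

# R squared = regression SS / total SS
print(round(ss_regression / (ss_regression + ss_residual), 3))  # 0.181
```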

Simple Linear Regression V

Coefficientsa (SPSS output):
Model                               B (Unstandardized)   Std. Error   Beta (Standardized)   t        Sig.
1 (Constant)                        -6.899               1.156                              -5.966   .000
  What was your age last birthday   .418                 .021         .425                  20.327   .000
a. Dependent Variable: Number of years a party member

In terms of y = a + bx, the Constant is the intercept (a) and the coefficient for age is the slope (b). p < 0.05, so ‘Age’ has a significant effect on ‘Party Membership’. A one unit increase in age will result in an increase in party membership of 0.42. Or… ‘Party Membership’ = -6.9 + (0.42 * ’Age’)
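A minimal sketch of fitting the same kind of simple regression in Python with scipy (the data here are invented for illustration, so the coefficients will not reproduce the SPSS output above):

```python
import numpy as np
from scipy import stats

# Illustrative data: candidate age (x) and years of party membership (y)
age = np.array([25, 34, 41, 52, 60, 47, 38, 55, 29, 63])
years_member = np.array([2, 5, 9, 15, 21, 11, 7, 17, 3, 24])

model = stats.linregress(age, years_member)
print(f"intercept (a) = {model.intercept:.2f}")
print(f"slope (b)     = {model.slope:.2f}")
print(f"R squared     = {model.rvalue ** 2:.2f}")
print(f"p-value       = {model.pvalue:.4f}")

# Prediction for a 50-year-old candidate: y = a + b*x
print(model.intercept + model.slope * 50)
```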

Simple Linear Regression VI … and this is what we saw in the original scatter plot! The regression line will intercept the vertical (y) axis at -6.9. The regression line rises by 0.42 on the vertical axis (y) for every one unit increase on the horizontal axis (x). The R² value is low because of the fanning effect (remember the histograms!)

Summary How to describe and quantify the relationship between two interval variables: Correlation – the strength and direction of the association; Regression – the causal and quantified effect of an independent variable on a dependent variable