Warsaw Summer School 2017, OSU Study Abroad Program


Warsaw Summer School 2017, OSU Study Abroad Program Correlation 1

Bivariate Correlation (and Regression) Assumptions: random sampling; both variables are continuous (or can be treated as such); a linear relationship between the 2 variables; normally distributed characteristics of X and Y in the population.

Linear Relationship The line = a mathematical function that can be expressed through the formula Y = a + b*X, where Y & X are our variables. Y, the dependent variable, is expressed as a linear function of the independent (explanatory) variable X.

Linear Relationship The constant a = the value of Y at the point at which the line Y = a + bX intersects the Y-axis (also called the intercept). The slope b equals the change in Y for a one-unit increase in X (a one-unit increase in X corresponds to a change of b units in Y). The slope describes the rate of change in Y-values as X increases. Verbal interpretation of the slope of the line: “Rise over run”: the rise divided by the run (the change in the vertical distance divided by the change in the horizontal distance).
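As an illustration of "rise over run", the short Python sketch below recovers the intercept a and the slope b of Y = a + bX from two points on a line. The function name `line_through` and the two points are hypothetical, chosen only for this example.

```python
def line_through(p1, p2):
    """Return (a, b) for the line Y = a + b*X passing through two points."""
    (x1, y1), (x2, y2) = p1, p2
    rise = y2 - y1          # change in the vertical distance
    run = x2 - x1           # change in the horizontal distance
    b = rise / run          # slope: change in Y per one-unit increase in X
    a = y1 - b * x1         # intercept: value of Y where the line crosses X = 0
    return a, b


# Hypothetical points (1, 5) and (3, 9): Y rises by 4 while X runs by 2.
a, b = line_through((1, 5), (3, 9))
print(a, b)  # 3.0 2.0
```

Here a one-unit increase in X corresponds to a change of b = 2 units in Y, and the line crosses the Y-axis at a = 3.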

In reality data are scattered

Scatterplot A scatterplot shows where each person falls in the distribution of X and the distribution of Y simultaneously. Vertical axis: the dependent variable (Y). Horizontal axis: the independent variable (X). Each point is a person, and that person's coordinates (their responses on the X variable and on the Y variable) determine where the point goes in the graph.

If Y and X are expressed in standardized scores (z-scores): Z(y) = B * Z(x) The formula Y = a + bX maps out a straight-line graph with slope b and Y-intercept a. If Y and X are expressed in standard scores (z-scores), then we have: Z(y) = B*Z(x). Remember, we said that the mean of a standardized distribution is zero. The regression line is the best-fitting line, and it goes through the means of Y and of X. Since these are 0 in standardized distributions, it follows that the regression line goes through the origin, the point where the X-axis and the Y-axis intersect. Hence, there is no constant (the constant is zero). When the distributions of the 2 variables are not standardized, the point of means lies somewhere away from the origin, but the slope is read the same way: the change in Y divided by the change in X (rise over run), now in the original units, with b = B * (SD of Y / SD of X).

Table

The Scatterplot

For z scores Remember, we said that the mean of a standardized distribution is zero. The regression line is the best-fitting line, and it goes through the means of Y and of X. Since these are 0 in standardized distributions, it follows that the regression line goes through the origin, the point where the X-axis and the Y-axis intersect. Hence, there is no constant (the constant is zero). When the distributions of the 2 variables are not standardized, the point of means lies somewhere away from the origin, but the slope is read the same way: the change in Y divided by the change in X (rise over run), now in the original units, with b = B * (SD of Y / SD of X).
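The claim that the constant disappears for z-scores can be checked numerically. The sketch below fits a least-squares line twice, once on raw values and once on their z-scores. The five (x, y) pairs are made up for illustration and are not the lecture's table; the helper names `z_scores` and `least_squares` are likewise hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values: subtract the mean, divide by the population SD."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def least_squares(x, y):
    """Return (a, b) of the best-fitting line Y = a + b*X."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    a = my - b * mx      # forces the line through the point of means
    return a, b


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = least_squares(x, y)                         # raw units
a_z, B = least_squares(z_scores(x), z_scores(y))   # standardized units

print(round(a, 2), round(b, 2))   # 2.2 0.6
print(abs(a_z) < 1e-9)            # True: no constant for z-scores
print(round(B, 3))                # 0.775
```

The raw slope and the standardized slope differ only by the ratio of the standard deviations: b equals B * pstdev(y) / pstdev(x) for these data.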

r Correlation is a statistical technique used to measure & describe a relationship between 2 variables (whether X and Y are related to each other). How it works: Correlation is represented by a line through the data points & by a number (the correlation coefficient) that indicates how close the data points are to the line. The stronger the relationship between X & Y, the closer the data points will be to the line; the weaker the relationship, the farther the data points will drift away from the line. For z-scores of Y & X the correlation equals the standardized coefficient, Beta. If X and Y are expressed in standard scores (i.e. z-scores), we have Z(y) = β*Z(x) and r = Σ(Zy*Zx)/N = β.
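The formula r = Σ(Zy*Zx)/N can be sketched in a few lines of Python. Two things here are assumptions of this example rather than part of the slides: the function name `r_from_z_scores`, and the small data set, which is invented. Note that dividing by N only gives Pearson's r when the z-scores are built with the population standard deviation (the one that also divides by N), which is what `statistics.pstdev` computes.

```python
from statistics import mean, pstdev


def r_from_z_scores(x, y):
    """Pearson's r as the average product of paired z-scores: sum(Zx*Zy)/N."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sx, sy = pstdev(x), pstdev(y)   # population SD, to match the division by N
    zx = [(v - mx) / sx for v in x]
    zy = [(v - my) / sy for v in y]
    return sum(a * b for a, b in zip(zx, zy)) / n


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(r_from_z_scores(x, y), 3))  # 0.775
```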

Pearson’s Correlation coefficient Denoted with “r” or “ρ” (Greek letter “rho”). Ranges from –1 to +1 (cannot be outside this range). r = +1 denotes perfect positive correlation between X & Y (all points on the line).

Correlation - measures the size & the direction of the linear relation between 2 variables (i.e. a measure of association) - unitless statistic (it is standardized); we can directly compare the strength of correlations for various pairs of variables. Pearson’s r = the sum of the products of the deviations from each mean, divided by the square root of the product of the sum of squares for each variable. r ranges from -1 to 1. If X and Y are expressed in standard scores (i.e. z-scores), we have Z(y) = β*Z(x) and r = Σ(Zy*Zx)/N = β.

r = -1 denotes perfect negative correlation between X & Y (all points on the line)

r = 0 denotes no relationship between X & Y (points all over the place, no line is decipherable)

Interpretation Information the Pearson’s r gives us: 1. Strength of relationship 2. Direction of relationship

Interpretation of Pearson’s r:
1. Strength of relationship: correlations are on a continuum; any value of r between –1 & 1 we need to describe: weak: |0.10|; moderate: |0.30|; strong: |0.60|; 0 = no relationship.
2. Direction of relationship:
a) Positive correlation: r = positive number (from 0 to 1). Interpretation: the two variables move in the same direction.
- as values on the X variable go up, values on Y increase as well;
- as values on the X variable decrease, values on Y decrease as well.
b) Negative correlation: r = negative number (from -1 to 0). Interpretation: the two variables move in opposite directions:
- as values on the X variable go up, values on Y decrease;
- as values on the X variable decrease, values on Y increase.
Test of Significance: H0: r = 0
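The two pieces of information that r carries can be turned into a small helper. The slides give only three benchmark sizes (|0.10|, |0.30|, |0.60|); treating them as lower cut-offs of bins, and labelling anything below |0.30| as "weak", is an assumption of this sketch, as is the function name `describe_r`.

```python
def describe_r(r):
    """Return (strength, direction) labels for a Pearson correlation r."""
    if not -1 <= r <= 1:
        raise ValueError("r cannot be outside the range -1 to +1")

    size = abs(r)
    # Assumed binning: the benchmarks from the slides are used as cut-offs.
    if size == 0:
        strength = "no relationship"
    elif size < 0.30:
        strength = "weak"
    elif size < 0.60:
        strength = "moderate"
    else:
        strength = "strong"

    if r > 0:
        direction = "positive"   # X and Y move in the same direction
    elif r < 0:
        direction = "negative"   # X and Y move in opposite directions
    else:
        direction = "none"
    return strength, direction


print(describe_r(-0.709))  # ('strong', 'negative')
print(describe_r(0.35))    # ('moderate', 'positive')
```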

Computational table

Calculations and logic of Pearson’s correlation r = the sum of the products of the deviations from each mean, divided by the square root of the product of the sum of squares for each variable.
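The verbal definition above translates directly into code. In this minimal sketch the function name `pearson_r` and the five data pairs are hypothetical; the computational table from the lecture is not reproduced in the transcript, so invented values are used instead.

```python
from math import sqrt


def pearson_r(x, y):
    """Sum of products of deviations / sqrt(product of the sums of squares)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of the products of the deviations from each mean
    sp = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # Sum of squares for each variable
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    return sp / sqrt(ss_x * ss_y)


# Hypothetical data in which Y tends to fall as X rises.
x = [1, 2, 3, 4, 5]
y = [5, 3, 4, 2, 1]
print(round(pearson_r(x, y), 3))  # -0.9
```

For these values the sum of products is -9 and both sums of squares are 10, so r = -9 / sqrt(100) = -0.9.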

Calculation r = –0.709 Interpretation: Strength: strong correlation. Direction: negative: younger people have relatively higher education.

Test Is this correlation statistically significant? We need a hypothesis test: test the null hypothesis that r = 0 (that there is no relationship/correlation between X and Y), using a two-tailed t-test.

Testing Step 1: Assumptions: interval data, random samples, normal population distributions, linear relationship (& not many outliers). Thus, we have to look at the scatterplot before going on with the test. What to search for in the scatterplot: Is the relation linear (do points follow a straight line, or not)? Are there outliers?

Test Step 2: Hypotheses: H0: r = 0. There is no relationship between age (X) & education (Y). H1: r ≠ 0. There is a relationship between X & Y.

Test Step 3: two-tailed t-test, but it’s set up a bit differently from the other t-tests

Test Step 4: alpha = .05; get t critical (use the same table as before). Critical values? (+/- 2.447)

Test Step 5: Get t calculated (we already have r = -.709)

Test Step 6: Decision & Interpretation: Reject the null hypothesis. Interpretation: we are 95% confident that the correlation between age & education exists in the population.
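Steps 4 to 6 can be retraced numerically. The transcript does not preserve the slide's formula for t or the sample size, so this sketch makes two assumptions: it uses the standard test statistic for a correlation, t = r * sqrt((N - 2) / (1 - r²)) with N - 2 degrees of freedom, and it takes N = 8, inferred from the critical value 2.447, which is the two-tailed .05 value for 6 degrees of freedom. The function name `t_for_r` is hypothetical.

```python
from math import sqrt


def t_for_r(r, n):
    """t statistic for H0: no correlation, with n - 2 degrees of freedom."""
    df = n - 2
    return r * sqrt(df / (1 - r ** 2))


r = -0.709          # from the calculation slide
n = 8               # assumption: inferred from t critical = 2.447 (df = 6)
t_critical = 2.447  # two-tailed, alpha = .05, from Step 4

t = t_for_r(r, n)
print(round(t, 3))  # -2.463

# Two-tailed decision: reject H0 when |t| exceeds the critical value.
reject_h0 = abs(t) > t_critical
print(reject_h0)    # True
```

Under these assumptions the calculated t falls just beyond -2.447, which agrees with the decision to reject the null hypothesis.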