Chapters 8 and 9: Correlations Between Data Sets Math 1680.

Presentation transcript:

Chapters 8 and 9: Correlations Between Data Sets Math 1680

Overview: Scatter Plots; Associations; The Correlation Coefficient; Sketching Scatter Plots; Changes of Scale; Summary

Scatter Plots: Often, we are interested in comparing two related data sets, such as heights and weights of students, SAT scores and freshman GPA, or age and fuel efficiency of vehicles. We can draw a scatter plot of the data by plotting the paired data points on a Cartesian plane.

Scatter Plots: Scatter plot for the heights of 1,078 fathers and their adult sons, from Karl Pearson's study of family resemblance.

Scatter Plots: What does the dashed diagonal line represent? Find the point representing a 5'3¼" father who has a 5'6½" son.

Scatter Plots: What does the vertical dashed column represent? Consider the families where the father was 72" tall, to the nearest inch. How tall was the tallest son? The shortest?

Scatter Plots: Was the average height of the fathers around 64", 68" or 72"? Was the SD of the fathers’ heights around 3", 6" or 9"?

Scatter Plots: The points form a swarm that is more or less football-shaped. This indicates that there is a linear association between the fathers’ heights and the sons’ heights.

Scatter Plots: Short fathers tend to have short sons, and tall fathers tend to have tall sons. We say there is a positive association between the heights of fathers and sons. What would it mean for there to be a negative association between the heights?

Scatter Plots: Does knowing the father’s height give a precise prediction of his son’s height? Does knowing the father’s height let you better predict his son’s height?

Scatter Plots: We will generally assume the scatter plots are football-shaped. That is, the association is linear in nature and each data set is approximately normal.

Scatter Plots: Key features of scatter plots. Given two data sets X and Y, the point of averages is the point (μ_X, μ_Y). The average of a data set is denoted by μ (Greek mu, for mean), and the subscript indicates which set is being referenced. The point of averages lies in the center of the cloud. Under the normal approximation, about 95% of the X values fall within 2 SDs of μ_X, and likewise for Y, so the vast majority of the cloud should lie within 2 SDs of the average in both directions.
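The point of averages and the 2 SD window can be checked numerically. The following is a minimal sketch, assuming the population SD (divide by n); the function names and the height/weight values are hypothetical and chosen only for illustration.

```python
from statistics import fmean, pstdev

def point_of_averages(xs, ys):
    """Return (average of X, average of Y), the center of the cloud."""
    return fmean(xs), fmean(ys)

def fraction_within_2_sd(values):
    """Fraction of the values lying within 2 SDs of their average."""
    mu = fmean(values)
    sd = pstdev(values)  # population SD: an assumption of this sketch
    inside = [v for v in values if abs(v - mu) <= 2 * sd]
    return len(inside) / len(values)

# Hypothetical illustrative data (heights in inches, weights in pounds).
heights = [64, 66, 67, 68, 68, 69, 70, 71, 72, 75]
weights = [120, 135, 140, 150, 155, 160, 165, 175, 180, 210]

center = point_of_averages(heights, weights)
share_x = fraction_within_2_sd(heights)
share_y = fraction_within_2_sd(weights)
print(center, share_x, share_y)
```

With real, roughly normal data the two fractions should come out near 0.95; a list this small is only meant to show the mechanics.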

Scatter Plots

Associations: When given a value in one data set, we often want to make a prediction for the other data set. We call our given value the independent variable, and we call the value we are trying to predict the dependent variable.

Associations: If there is indeed a relationship between the two data sets, we can describe their association in several ways. Strong: knowing X helps you a lot in predicting Y, and vice versa. Weak: knowing X doesn’t really help you predict Y, and vice versa. Positive: X and Y tend to rise together; the higher a value is in one set, the higher its partner tends to be in the other. Negative: X and Y tend to move in opposite directions; the higher a value is in one set, the lower its partner tends to be in the other.

Associations: Positive associations: study time/final grade; height/weight; SAT score/GPA; clouds in sky/chance of rain; bowling practice/bowling score; age of husband/age of wife. Negative associations: age of car/fuel efficiency; golfing practice/golf score; dental hygiene/cavities formed; pollution/air quality; speed/mile time.

Associations: What kind of association is this?

Associations: What kind of association is this?

Associations: Remember that even a very strong association does not necessarily imply a causal relationship. There may be a confounding influence at play.

The Correlation Coefficient: While strong/weak and positive/negative give a sense of the association, we want a way to quantify the strength and direction of the association. The correlation coefficient (r) is the statistic that accomplishes this.

The Correlation Coefficient: The correlation coefficient is always between –1 and 1. A positive r means that there is a positive association between the sets; a negative r means that there is a negative association. If r is close to 0, then there is only a weak association between the sets. If r is close to 1 or –1, then there is a strong association between the sets.

The Correlation Coefficient: The following plots each contain 50 points. The only difference between them is the correlation coefficient. Note how the points fall into a line as r approaches 1 or –1.

The Correlation Coefficient: To calculate r: (1) find the average and SD of each data set; (2) multiply the data sets pairwise and find the average of the products; (3) the correlation is the average of the products minus the product of the averages, all divided by the product of the SDs.
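The calculation above can be sketched in a few lines of Python. This is a minimal sketch, assuming the SDs are population SDs (divide by n), which is what makes the formula land between –1 and 1; the function name and the data are illustrative choices for this sketch.

```python
from statistics import fmean, pstdev

def correlation(xs, ys):
    """r = (average of products - product of averages) / (SD of X * SD of Y)."""
    if len(xs) != len(ys):
        raise ValueError("X and Y must be paired, so they need equal lengths")
    avg_x, avg_y = fmean(xs), fmean(ys)
    sd_x, sd_y = pstdev(xs), pstdev(ys)  # population SDs (assumption)
    avg_product = fmean([x * y for x, y in zip(xs, ys)])
    return (avg_product - avg_x * avg_y) / (sd_x * sd_y)

# Illustrative data chosen for this sketch.
xs = [1, 3, 4, 5, 7]
ys = [5, 9, 7, 1, 13]
r = correlation(xs, ys)
print(round(r, 2))
```

By hand: the averages are 4 and 7, the SDs are 2 and 4, and the average of the products is 31.2, so r = (31.2 − 28) / 8 = 0.40.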

The Correlation Coefficient: [Worked example; data table of X and Y values not reproduced.]

The Correlation Coefficient: Compute r for the following data. [Data table of X and Y values not reproduced.]

The Correlation Coefficient: Estimate the correlation.

The Correlation Coefficient: Estimate the correlation.

Sketching Scatter Plots: The SD line is the line consisting of all the points where the standard score in X equals the standard score in Y, that is, z_X = z_Y. To sketch the SD line, draw a line along the long axis of the football shape. Note that the SD line always goes through the point of averages.
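Setting the two standard scores equal and solving for y gives the SD line's slope and intercept: slope = SD of Y / SD of X, and the line passes through the point of averages. The sketch below assumes the positive-association case stated on the slide; the function names and the summary statistics are hypothetical.

```python
def standard_score(value, avg, sd):
    """Number of SDs that a value lies above (+) or below (-) the average."""
    return (value - avg) / sd

def sd_line(avg_x, sd_x, avg_y, sd_y):
    """Slope and intercept of the line where z_X equals z_Y."""
    slope = sd_y / sd_x
    intercept = avg_y - slope * avg_x  # forces the line through the point of averages
    return slope, intercept

# Hypothetical summary statistics, for illustration only.
slope, intercept = sd_line(avg_x=68, sd_x=3, avg_y=150, sd_y=15)

# A point 1 SD above average in X should also be 1 SD above average in Y.
x = 71
y = slope * x + intercept
z_x = standard_score(x, 68, 3)
z_y = standard_score(y, 150, 15)
print(slope, intercept, z_x, z_y)
```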

Sketching Scatter Plots: Given the five-statistic summary (averages, SDs, and correlation) for a pair of data sets, we can sketch the scatter plot. (1) Plot the point of averages in the center. (2) Mark two SDs in both directions, on both axes. (3) Plot the point 1 SD above average for both data sets and draw a line connecting this point and the point of averages; this is the SD line. (4) Draw an ellipse with the SD line as its long axis; the ellipse should go just beyond the 2 SD marks in all directions. The value of r determines how oblong the ellipse is.

Sketching Scatter Plots: A study of the IQs of husbands and wives obtained the following results. Husbands: average IQ = 100, SD = 15. Wives: average IQ = 100, SD = 15. r = 0.6. Sketch the scatter plot.
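The landmarks for this sketch can be worked out from the five-statistic summary alone. The code below is a rough sketch: `sketch_landmarks` and `simulate_pairs` are hypothetical helper names, and the simulation assumes both variables are exactly normal, which is a common textbook construction and not something the slides specify.

```python
import math
import random
from statistics import fmean, pstdev

def sketch_landmarks(avg_x, sd_x, avg_y, sd_y):
    """Landmarks used when sketching a football-shaped scatter plot by hand."""
    return {
        "point_of_averages": (avg_x, avg_y),
        "x_marks_2sd": (avg_x - 2 * sd_x, avg_x + 2 * sd_x),
        "y_marks_2sd": (avg_y - 2 * sd_y, avg_y + 2 * sd_y),
        # Second point on the SD line: 1 SD above average in both sets.
        "sd_line_point": (avg_x + sd_x, avg_y + sd_y),
    }

def simulate_pairs(n, avg_x, sd_x, avg_y, sd_y, r, seed=0):
    """Draw n (x, y) pairs whose correlation is roughly r (normality assumed)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        zx = rng.gauss(0, 1)
        # Mix zx with independent noise so that corr(zx, zy) is r.
        zy = r * zx + math.sqrt(1 - r * r) * rng.gauss(0, 1)
        pairs.append((avg_x + sd_x * zx, avg_y + sd_y * zy))
    return pairs

def correlation(xs, ys):
    """Average of products minus product of averages, over product of SDs."""
    avg_product = fmean([x * y for x, y in zip(xs, ys)])
    return (avg_product - fmean(xs) * fmean(ys)) / (pstdev(xs) * pstdev(ys))

landmarks = sketch_landmarks(100, 15, 100, 15)
pairs = simulate_pairs(5000, 100, 15, 100, 15, r=0.6)
sample_r = correlation([p[0] for p in pairs], [p[1] for p in pairs])
print(landmarks)
print(round(sample_r, 2))
```

The landmarks give the center (100, 100), the 2 SD marks at 70 and 130 on each axis, and (115, 115) as the second point on the SD line. Plotting the simulated pairs with any charting tool would show how oblong the cloud is at r = 0.6.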

Changes of Scale: The correlation coefficient is not affected by changes of scale. Moving: adding the same number to all of the values of one variable. Stretching: multiplying all of the values of one variable by the same positive number. Would r change if we multiplied by a negative number? The correlation coefficient is also unaffected by interchanging the two data sets.
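These invariance claims are easy to check numerically. The following is a small sketch using illustrative data and a hypothetical `correlation` helper that assumes population SDs; it also tries the negative multiplier asked about above.

```python
from statistics import fmean, pstdev

def correlation(xs, ys):
    """Average of products minus product of averages, over product of SDs."""
    avg_product = fmean([x * y for x, y in zip(xs, ys)])
    return (avg_product - fmean(xs) * fmean(ys)) / (pstdev(xs) * pstdev(ys))

# Illustrative data chosen for this sketch.
xs = [1, 3, 4, 5, 7]
ys = [5, 9, 7, 1, 13]

r_original = correlation(xs, ys)
r_moved = correlation([x + 10 for x in xs], ys)      # add the same number
r_stretched = correlation([3 * x for x in xs], ys)   # multiply by a positive number
r_swapped = correlation(ys, xs)                      # interchange the data sets
r_negated = correlation([-2 * x for x in xs], ys)    # multiply by a negative number

print(r_original, r_moved, r_stretched, r_swapped, r_negated)
```

The first four values agree up to rounding error. Multiplying by a negative number keeps the size of r but flips its sign, because high values of X now pair with what used to be low ones.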

Changes of Scale

Compute r for each of the following data sets. [Data tables of X and Y values not reproduced.] r = –0.15

Summary: The relationship between two variables, X and Y, can be graphed in a scatter plot. When the scatter plot is tightly clustered around a line, there is a strong linear association between X and Y. A scatter plot can be characterized by its five-statistic summary: the average and SD of the X values, the average and SD of the Y values, and the correlation coefficient.

Summary: When the correlation coefficient gets closer to 1 or –1, the points cluster more tightly around a line. A positive association has a positive r-value; a negative association has a negative r-value. To calculate the correlation coefficient: take the average of the products, subtract the product of the averages, and divide the difference by the product of the SDs.

Summary: The correlation coefficient is not affected by changes of scale or by transposing the variables. Correlation does not measure causation!