Bivariate Data – Scatter Plots and Correlation Coefficient…… Section 3.1 and 3.2
2 Quantitative Variables…… We represent 2 quantitative variables by using a scatter plot. Scatter Plot – a plot of ordered pairs (x, y) of bivariate data on a coordinate axis system. It is a visual, or pictorial, way to describe the nature of the relationship between 2 variables.
Input and Output Variables…… X: a. Input Variable b. Independent Var c. Controlled Var Y: a. Output Variable b. Dependent Var c. Results from the Controlled variable
Example…… When dealing with height and weight, which variable would you use as the input variable and why? Answer: Height would be used as the input variable because weight is often predicted based on a person’s height.
Constructing a scatter plot…… Do a scatter plot of the following data:
Age (Independent Variable)    Blood Pressure (Dependent Variable)
43                            128
48                            120
56                            135
61                            143
67                            141
70                            152
What do we look for?...... A. Is it a positive correlation, negative correlation, or no correlation? B. Is it a strong or weak correlation? C. What is the shape of the graph?
Correlation…… Definition – a statistical method used to determine whether a relationship exists between variables. 3 Types of Correlation: A. Positive B. Negative C. No Correlation
Positive Correlation: as x increases, y increases or as x decreases, y decreases. Negative Correlation: as x increases, y decreases. No Correlation: there is no relationship between the variables.
Linear Correlation Analysis …… Primary Purpose: to measure the strength of the relationship between the variables. *This is a test question!!!!
Coefficient of Linear Correlation…… The numerical measure of the strength and direction of the linear relationship between 2 variables. This number is called the correlation coefficient. The symbol used to represent the correlation coefficient is “r.”
The range of “r” values…… The range of the correlation coefficient is -1 to +1. The closer to 0 you get, the weaker the correlation.
Range……
Strong Negative        No Linear Relationship        Strong Positive
|______________________________|______________________________|
-1                             0                             +1
Computational Formula using z-scores of x and y……
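A z-score form of the formula that agrees with the list values shown in these slides is sketched below. It assumes sample standard deviations (s_x and s_y) and a divisor of n - 1; the symbols z_x and z_y are notation introduced here for the z-scores of x and y.

```latex
z_x = \frac{x - \bar{x}}{s_x}, \qquad
z_y = \frac{y - \bar{y}}{s_y}, \qquad
r = \frac{\sum z_x \, z_y}{n - 1}
```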
Example 1…… Find the correlation coefficient (r) of the following example. Use the lists in the calculator.
x    y
2    80
5
1    70
4    90
2    60
Find mean and st. dev first…… Since you will be using a formula that uses z-scores, you will need to know the mean and standard deviation of the x and y values.
Put the x’s in L1.
Put the y’s in L2.
Run STAT, CALC, 1-Var Stats on L1 and write down the mean and st. dev.
Run STAT, CALC, 1-Var Stats on L2 and write down the mean and st. dev.
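The same two numbers can be checked off the calculator. The sketch below uses the age and blood pressure data from the scatter plot slide and Python's standard `statistics` module; it assumes the sample standard deviation (the Sx value, not σx) is the one wanted, since that is what reproduces the z-scores listed in these slides. The variable names are illustrative only.

```python
import statistics

# Age (x) and blood pressure (y) data from the scatter plot slide.
ages = [43, 48, 56, 61, 67, 70]
blood_pressures = [128, 120, 135, 143, 141, 152]

# Mean and SAMPLE standard deviation (divisor n - 1) of each list.
mean_x = statistics.mean(ages)
sd_x = statistics.stdev(ages)
mean_y = statistics.mean(blood_pressures)
sd_y = statistics.stdev(blood_pressures)

print(f"x: mean = {mean_x:.3f}, st. dev = {sd_x:.3f}")
print(f"y: mean = {mean_y:.3f}, st. dev = {sd_y:.3f}")
```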
List values you should have…… (n = 6)
L1 (x)   L2 (y)   L3 (z of x)   L4 (z of y)   L5 (L3 × L4)
43       128      -1.368        -0.7458       1.0205
48       120      -0.8965       -1.448        1.2978
56       135      -0.1415       -0.1316       0.01863
61       143      0.33028       0.57031       0.18836
67       141      0.89647       0.39483       0.35395
70       152      1.1796        1.36          1.6042
Sum of L5 = 4.483364073
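The list values for the age and blood pressure data can be reproduced with a short script. This is a minimal sketch, assuming L3 and L4 hold the z-scores of x and y (computed with sample standard deviations), L5 holds their products, and r is the sum of L5 divided by n - 1. The function name `correlation_from_z_scores` is made up for this illustration.

```python
import statistics


def correlation_from_z_scores(xs, ys):
    """Return (r, products), where products are the z_x * z_y terms."""
    n = len(xs)
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sd_x, sd_y = statistics.stdev(xs), statistics.stdev(ys)  # sample st. dev

    z_x = [(x - mean_x) / sd_x for x in xs]       # like list L3
    z_y = [(y - mean_y) / sd_y for y in ys]       # like list L4
    products = [a * b for a, b in zip(z_x, z_y)]  # like list L5

    r = sum(products) / (n - 1)
    return r, products


ages = [43, 48, 56, 61, 67, 70]
blood_pressures = [128, 120, 135, 143, 141, 152]

r, products = correlation_from_z_scores(ages, blood_pressures)
print(f"sum of products = {sum(products):.6f}")
print(f"r = {r:.4f}")
```

With these data r comes out near 0.897, a strong positive correlation.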
List Values you should have…… (n = 7)
L1 (x)   L2 (y)   L3 (z of x)   L4 (z of y)   L5 (L3 × L4)
6        82       -0.4898       0.53626       -0.2626
2        86       -1.404        0.7746        -1.088
15       43       1.5673        -1.788        -2.802
9        74       0.19591       0.05958       0.01167
12       58       0.88158       -0.8938       -0.7879
5        90       -0.7183       1.0129        -0.7276
8        78       -0.0327       0.29792       -0.0097
Sum of L5 = -5.66529102
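Assuming the last list holds the products of the z-scores and the final number is their sum, the correlation coefficient for this data set with n = 7 would work out as follows:

```latex
r = \frac{\sum z_x \, z_y}{n - 1}
  = \frac{-5.66529102}{7 - 1}
  \approx -0.944
```

A value this close to -1 indicates a strong negative correlation.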
List Values you should have…… (n = 9)
Hrs of Exercise   Amt of Milk   L3 (z of x)   L4 (z of y)   L5 (L3 × L4)
3                 48            -0.3015       0.30713       -0.0926
0                 8             -1.206        -1.476        1.7804
2                 32            -0.603        -0.4062       0.24495
5                 64            0.30151       1.0205        0.30768
8                 10            1.206         -1.387        -1.673
5                 32            0.30151       -0.4062       -0.1225
10                56            1.8091        0.66379       1.2008
2                 72            -0.603        1.3771        -0.8304
1                 48            -0.9045       0.30713       -0.2778
Sum of L5 = 0.537689672
Describe It…… Since r = 0.537689672 / (9 - 1) ≈ 0.067, which is very close to 0, no linear correlation exists between hours of exercise and amount of milk.
What is r²?...... It is the coefficient of determination. It is the percentage of the total variation in y that can be explained by the relationship between x and y. A way to think of it: the value tells you how much your ability to predict is improved by using the regression line compared with NOT using the regression line.
For Example…… If r² = 0.89, it means that 89% of the variation in y can be explained by the relationship between x and y. It is a good fit.
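The "variation explained" reading can be checked numerically. The sketch below reuses the age and blood pressure data from the scatter plot slide, fits an ordinary least-squares regression line (an assumption, since the slides do not say how the line is fitted), and compares 1 - SSE/SST with the square of r. The helper name `least_squares_line` is hypothetical.

```python
import math
import statistics

ages = [43, 48, 56, 61, 67, 70]
blood_pressures = [128, 120, 135, 143, 141, 152]


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = least_squares_line(ages, blood_pressures)
mean_y = statistics.mean(blood_pressures)

# Total variation in y, i.e. the error when always predicting the mean.
sst = sum((y - mean_y) ** 2 for y in blood_pressures)
# Variation left over when predicting with the regression line.
sse = sum((y - (intercept + slope * x)) ** 2
          for x, y in zip(ages, blood_pressures))

r_squared = 1 - sse / sst

# r computed directly from the sums of squares, for comparison.
mean_x = statistics.mean(ages)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, blood_pressures))
sxx = sum((x - mean_x) ** 2 for x in ages)
r = sxy / math.sqrt(sxx * sst)

print(f"r^2 from regression line = {r_squared:.4f}")
print(f"r squared directly       = {r ** 2:.4f}")
```

For these data both values come out near 0.804, so roughly 80% of the variation in blood pressure is accounted for by the linear relationship with age.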