4 We are going to use two measures of dependence—covariance and the coefficient of linear correlation—to measure the relationship between two variables. We’ll begin our discussion by examining a set of bivariate data and identifying some related facts as we prepare to define covariance.
5 Example 1 – Understanding and Calculating Covariance Figure 13.1 shows a sample of six bivariate data (ordered pairs): (2, 1), (3, 5), (6, 3), (8, 2), (11, 6), (12, 1). The mean of the six x values (2, 3, 6, 8, 11, 12) is = 7. The mean of the six y values (1, 5, 3, 2, 6, 1) is = 3. Figure 13.1 Graph of Bivariate Data
6 Example 1 – Understanding and Calculating Covariance The point, which is (7, 3), is located as shown on the graph of the sample points in Figure 13.2. Figure 13.2 The Point (7, 3) Is the Centroid cont’d
7 Example 1 – Understanding and Calculating Covariance The point is called the centroid of the data. A vertical and a horizontal line drawn through the centroid divide the graph into four sections, as shown in Figure 13.2. Each point (x, y) lies a certain distance from each of these two lines: is the horizontal distance from (x, y) to the vertical line that passes through the centroid, and is the vertical distance from (x, y) to the horizontal line that passes through the centroid. cont’d
8 Example 1 – Understanding and Calculating Covariance Both the horizontal and vertical distances of each data point from the centroid can be measured, as shown in Figure 13.3. Figure 13.3 Measuring the Distance of Each Data Point from the Centroid cont’d
9 Example 1 – Understanding and Calculating Covariance The distances may be positive, negative, or zero, depending on the position of the point (x, y) in relation to. [Figure 13.3 shows and represented by braces, with positive or negative signs.] cont’d
10 Height Compatibility If we standardize x and y by dividing the distance of each from the respective mean by the respective standard deviation: and then compute the covariance of x’ and y’, we will have a covariance that is not affected by the spread of the data.
11 Height Compatibility This is exactly what is accomplished by the linear correlation coefficient. It divides the covariance of x and y by a measure of the spread of x and by a measure of the spread of y (the standard deviations of x and of y are used as measures of spread). Therefore, by definition, the coefficient of linear correlation is:
12 Height Compatibility The coefficient of linear correlation standardizes the measure of dependency and allows us to compare the relative strengths of dependency of different sets of data. [Formula (13.2) for linear correlation is also commonly referred to as Pearson’s product moment, r.]