1
CHAPTER 7 LINEAR RELATIONSHIPS
KEY TERMS: SCATTERPLOTS, ASSOCIATION, AND CORRELATION
ADDITIONAL REFERENCE READING MATERIAL: Introductory Statistics, CHAPTERS 7 – 8
2
An Example Of A Graphical Display of A Linear Relationship
A Straight Line Graph [Figure: a straight line plotted on x and y axes scaled from -10 to 10]
3
Associations Between Variables
Many interesting examples of the use of statistics involve relationships between pairs of variables. Cases are the objects described by a set of data; they may be customers, companies, subjects in a study, units in an experiment, or other objects. Two variables measured on the same cases are associated if knowing the value of one of the variables tells you something that you would not otherwise know about the value of the other variable.
4
Positive And Negative Association, Linear Relationship
Two variables have a positive association when the values of one variable tend to increase as the values of the other variable increase. Two variables have a negative association when the values of one variable tend to decrease as the values of the other variable increase. Two variables have a linear relationship when the pattern of their relationship resembles a straight line.
5
Linear Relationships
6
Three Tools We Will Use
Scatterplot: A two-dimensional graph of data values.
Correlation: A statistic that measures the strength and direction of a linear relationship between two quantitative variables.
Regression equation: An equation that describes the average relationship between a quantitative response and explanatory variable, that is, between the x and y variables.
7
Scatterplots The most useful graph for displaying the relationship between two quantitative variables is a scatterplot. A scatterplot shows the relationship between two quantitative variables measured on the same cases. The values of one variable appear on the horizontal axis, and the values of the other variable appear on the vertical axis. Each case corresponds to one point on the graph.
8
Scatterplot Response Variable Explanatory Variable
A response variable measures an outcome of a study. The values of the response variable appear on the vertical axis. Algebraically speaking, it is the y-variable or dependent variable. An explanatory variable explains or causes changes in the response variable. The values of the explanatory variable appear on the horizontal axis. Algebraically speaking, it is the x-variable or independent variable.
9
Scatterplots GIVEN A SET OF n OBSERVATIONS,
QUESTION: WHAT LINE “BEST” FITS THE OBSERVATIONS? WE SHALL ANSWER THIS QUESTION GRAPHICALLY USING A SCATTERPLOT, AND ANALYTICALLY USING THE LEAST SQUARES REGRESSION FORMULA.
10
WHAT LINE “BEST FITS” THE SET OF OBSERVATIONS?
GRAPHICAL SOLUTION
11
Scatterplots A SCATTERPLOT IS A PLOT OF THE POINTS
12
How to Make a Scatterplot
Scatterplots
Decide which variable should go on each axis. If a distinction exists, plot the explanatory variable on the x axis and the response variable on the y axis.
Label and scale your axes.
Plot individual data values.
13
EXAMPLE: GIVEN THE SET OF OBSERVATIONS (1,2), (2,5), (3,4), (4,1), (5,8), (6,3), (7,2), PLOT A SCATTERGRAM.
[SCATTERGRAM: the seven observations plotted with X on the horizontal axis and Y on the vertical axis]
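The scattergram for this example can be approximated in plain text. The sketch below is only an illustration: the function name `text_scatter` and the 8-by-8 grid size are assumptions, not part of the slides.

```python
# Text scattergram of the seven observations from the example above.
points = [(1, 2), (2, 5), (3, 4), (4, 1), (5, 8), (6, 3), (7, 2)]

def text_scatter(points, x_max=8, y_max=8):
    """Return the plot as a list of text rows; the top row is the largest y."""
    marked = set(points)
    rows = []
    for y in range(y_max, 0, -1):
        # One cell per integer x value: "X" marks an observation, "." is empty.
        cells = ["X" if (x, y) in marked else "." for x in range(1, x_max + 1)]
        rows.append(f"{y} | " + " ".join(cells))
    # Final row labels the horizontal (X) axis.
    rows.append("    " + " ".join(str(x) for x in range(1, x_max + 1)))
    return rows

for row in text_scatter(points):
    print(row)
```

The points show no clear upward or downward drift, which is the kind of observation the next slides ask you to make.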
14
Looking For Patterns In A Scatterplot
Questions to Ask about a Scatterplot
What is the average pattern? Does it look like a straight line, or is it curved?
What is the direction of the pattern?
How much do individual points vary from the average pattern?
Are there any unusual data points?
15
Height and Handspan Data shown are the first 12 observations of a data set that includes the heights (in inches) and fully stretched handspans (in centimeters) of 167 college students.
16
Height and Handspan Taller people tend to have greater handspan measurements than shorter people do. When two variables tend to increase together, we say that they have a positive association. The handspan and height measurements may have a linear relationship.
17
Example: Driver Age and Maximum Legibility Distance of Highway Signs
A research firm determined the maximum distance at which each of 30 drivers could read a newly designed sign. The 30 participants in the study ranged in age from 18 to 82 years old. We want to examine the relationship between age and the sign legibility distance.
18
Example: Driver Age and Maximum Legibility Distance of Highway Signs
We see a negative association with a linear pattern. We will use a straight-line equation to model this relationship.
Example: 100 Cars on the Lot of a Used-Car Dealership
Question: Would you expect a positive association, a negative association, or no association between the age of the car and the mileage on the odometer?
Positive association
Negative association
No association
19
Interpreting Scatterplots
To interpret a scatterplot, follow the basic strategy of data analysis. Look for patterns and important departures from those patterns. How to Examine a Scatterplot As in any graph of data, look for the overall pattern and for striking deviations from that pattern. You can describe the overall pattern of a scatterplot by the form, direction, and strength of the relationship. An important kind of departure is an outlier, an individual value that falls outside the overall pattern of the relationship.
20
Interpreting Scatterplots
Two variables are positively associated when above-average values of one tend to accompany above-average values of the other and when below-average values also tend to occur together. Two variables are negatively associated when above-average values of one tend to accompany below-average values of the other and vice versa.
21
DIRECTION: POSITIVE, NEGATIVE, NEITHER
POSITIVE: THE PATTERN RUNS FROM THE BOTTOM LEFT TO THE UPPER RIGHT.
NEGATIVE: THE PATTERN RUNS FROM THE UPPER LEFT TO THE LOWER RIGHT.
[three scatterplots]
22
FORM: POSITIVE STRAIGHT DIRECTION
POSITIVELY STRAIGHT RELATIONSHIP [scatterplot]
23
FORM: NEGATIVE STRAIGHT DIRECTION
NEGATIVELY STRAIGHT RELATIONSHIP [scatterplot]
24
Nonlinear Relationships
There are other forms of relationships besides linear. The scatterplot below is an example of a nonlinear form. Note the curvature in the relationship between x and y.
25
FORM: EXOTIC – SHARP POINTS
OUTSTANDING FEATURE – SHARP POINTS [scatterplot]
26
STRENGTH: STRONG, MODERATE, WEAK [three scatterplots]
27
UNUSUAL FEATURES: OUTLIERS, SUBGROUPS [scatterplots]
28
Unusual Features
29
Unusual Features - Outlier
30
Example: Association
Suppose you were to collect data for each pair of variables. You want to make a scatterplot. Which variable would you use as the explanatory variable, and which as the response variable? What would you expect to see in the scatterplot? Discuss the likely direction, form, and strength.
(a) T-Shirts at a store: price; number sold.
(b) Scuba diving: depth; water pressure.
(c) Scuba diving: depth; visibility.
(d) All elementary-school students: weight; score on a reading test.
31
Solution
32
Correlation Correlation coefficient r Properties of r
The Correlation Coefficient Is A Numerical Measure Of The Direction And Strength Of A Linear Association.
33
Measuring Linear associations
A scatterplot displays the strength, direction, and form of the relationship between two quantitative variables. Linear relationships are important because a straight line is a simple pattern that is quite common. Our eyes are not always good judges of how strong a relationship is. Therefore, we use a numerical measure to supplement our scatterplot and help us interpret the strength of the linear relationship.
34
Correlation Coefficient Example
Find The Linear Correlation Coefficient For The Following Bivariate Data: (6,5), (10,3), (14,7), (19,8), (21,12). SOLUTION: See handout
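The handout solution is not reproduced here. As a rough check, the sketch below applies the usual textbook formula, r = Sxy / sqrt(Sxx * Syy), to the five data points above. The function name `correlation` is an assumption made for illustration.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation r from sums of squared and cross deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Bivariate data from the example: (6,5), (10,3), (14,7), (19,8), (21,12)
xs = [6, 10, 14, 19, 21]
ys = [5, 3, 7, 8, 12]

r = correlation(xs, ys)
print(round(r, 3))  # prints 0.855
```

By hand, the means are 14 and 7, so Sxy = 72, Sxx = 154, and Syy = 46, giving r = 72 / sqrt(7084), or about 0.855.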
35
Measuring Linear Associations
We say a linear relationship is strong if the points lie close to a straight line and weak if they are widely scattered about a line. The following facts about r help us further interpret the strength of the linear relationship.
36
Measuring Linear Associations
Properties of Correlation
r is always a number between –1 and 1.
r > 0 indicates a positive association.
r < 0 indicates a negative association.
Values of r near 0 indicate a very weak linear relationship.
The strength of the linear relationship increases as r moves away from 0 toward –1 or 1.
The extreme values r = –1 and r = 1 occur only in the case of a perfect linear relationship.
37
Scatterplots and Correlation
38
Scatterplots and Correlation
39
More Properties of Correlation
Correlation makes no distinction between explanatory and response variables.
r has no units and does not change when we change the units of measurement of x, y, or both.
Positive r indicates positive association between the variables, and negative r indicates negative association.
The correlation r is always a number between –1 and 1.
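The first two properties can be checked numerically. The sketch below reuses the five points from the earlier correlation example. The unit conversions (multiplying x by 2.54, dividing y by 1000) are hypothetical and chosen only to illustrate a change of units, and `correlation` is an assumed helper name.

```python
from math import isclose, sqrt

def correlation(xs, ys):
    """Pearson correlation r from sums of squared and cross deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [6, 10, 14, 19, 21]
ys = [5, 3, 7, 8, 12]

r_xy = correlation(xs, ys)
r_yx = correlation(ys, xs)  # swap explanatory and response roles

# Hypothetical unit changes: rescale x and y by positive constants.
xs_rescaled = [2.54 * x for x in xs]
ys_rescaled = [y / 1000 for y in ys]
r_rescaled = correlation(xs_rescaled, ys_rescaled)

print(isclose(r_xy, r_yx), isclose(r_xy, r_rescaled))  # prints True True
```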
40
Correlation Cautions:
Correlation requires that both variables be quantitative.
Correlation does not describe curved relationships between variables, no matter how strong the relationship is.
The correlation r is not resistant; it can be strongly affected by a few outlying observations.
Correlation is not a complete summary of two-variable data.
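The caution about curved relationships can be seen with a small made-up data set: y is determined exactly by x through y = x squared, yet r comes out as 0. The data and the helper name `correlation` are assumptions for illustration only.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation r from sums of squared and cross deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Illustrative data: a perfect curved (quadratic) relationship.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_curved = correlation(xs, ys)
print(r_curved)  # essentially zero despite the exact relationship
```

The positive and negative deviations cancel because the curve is symmetric about x = 0, so r reports no linear association at all.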
41
Helpful Tips In Determining Strength And Weakness
STRONG: -1 TO -0.8; MODERATE: -0.8 TO -0.5; WEAK: -0.5 TO 0.5; MODERATE: 0.5 TO 0.8; STRONG: 0.8 TO +1
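The cutoffs 0.5 and 0.8 on this scale can be turned into a small helper. This is a rule-of-thumb sketch: the function name `strength_label` is hypothetical, and assigning the boundary values 0.5 and 0.8 to the stronger category is an assumption, since the scale does not say which side they belong to.

```python
def strength_label(r):
    """Label the strength of a linear association from its correlation r."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")
    size = abs(r)  # strength depends on distance from 0, not on sign
    if size >= 0.8:  # assumption: the boundary counts as strong
        return "strong"
    if size >= 0.5:  # assumption: the boundary counts as moderate
        return "moderate"
    return "weak"

for r in (0.855, -0.6, 0.2):
    print(r, strength_label(r))
```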
42
CORRELATION CONDITIONS
QUANTITATIVE VARIABLES CONDITION: CORRELATION APPLIES ONLY TO QUANTITATIVE VARIABLES. BE SURE NOT TO APPLY CORRELATION TO CATEGORICAL DATA MASQUERADING AS QUANTITATIVE. CHECK THE VARIABLES’ UNITS AND WHAT THEY MEASURE.
43
STRAIGHT ENOUGH CONDITION
MAKE SURE THE FORM OF THE SCATTERPLOT IS STRAIGHT ENOUGH THAT A LINEAR RELATIONSHIP MAKES SENSE. CORRELATION MEASURES THE STRENGTH ONLY OF THE LINEAR ASSOCIATION, AND WILL BE MISLEADING IF THE RELATIONSHIP IS NOT LINEAR. IF A RELATIONSHIP IS CURVED, THEN SUMMARIZING ITS STRENGTH WITH A CORRELATION WOULD BE MISLEADING.
44
OUTLIER CONDITION: OUTLIERS CAN DISTORT THE CORRELATION DRAMATICALLY.
AN OUTLIER CAN MAKE AN OTHERWISE WEAK CORRELATION LOOK BIG OR HIDE A STRONG CORRELATION. AN OUTLIER CAN EVEN GIVE AN OTHERWISE POSITIVE ASSOCIATION A NEGATIVE CORRELATION (AND VICE VERSA). WHEN YOU SEE AN OUTLIER, IT’S OFTEN A GOOD IDEA TO REPORT THE CORRELATION WITH AND WITHOUT THAT POINT.
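A minimal sketch of reporting the correlation with and without an outlier follows. The six points are invented for illustration (five points with no linear trend plus one far-off point at (15, 10)), and `correlation` is an assumed helper name.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation r from sums of squared and cross deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Illustrative data: the last point is the outlier.
points = [(1, 2), (2, 1), (3, 2), (4, 1), (5, 2), (15, 10)]

xs_all = [x for x, _ in points]
ys_all = [y for _, y in points]
r_with = correlation(xs_all, ys_all)

# Drop the final point and recompute.
r_without = correlation(xs_all[:-1], ys_all[:-1])

print(f"with outlier:    r = {r_with:.3f}")
print(f"without outlier: r = {r_without:.3f}")
```

Here a single point turns an essentially zero correlation into one that looks strong, which is why both values are worth reporting.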
45
WHICH LINE “BEST FITS” THE SET OF OBSERVATIONS?
THE ANALYTICAL APPROACH
46
FITTING THE MODEL: THE LEAST SQUARES METHOD
CONSIDER THE EXAMPLE: SUPPOSE AN APPLIANCE STORE CONDUCTS A FIVE-MONTH EXPERIMENT TO DETERMINE THE EFFECT OF ADVERTISING ON SALES REVENUE. THE RESULTS ARE SHOWN IN THE TABLE.
TABLE COLUMNS: MONTH; ADVERTISING EXPENDITURE, X ($100s); SALES REVENUE, Y ($1,000s)
1 2 3 4 5
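A minimal sketch of the least squares computation follows. It uses the five bivariate points from the earlier correlation example rather than the appliance store data, and it assumes the standard formulas slope = Sxy / Sxx and intercept = mean of y minus slope times mean of x. The function name `least_squares_line` is hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least squares line y-hat = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the line passes through (mean_x, mean_y)
    return intercept, slope

# Borrowed illustrative data: (6,5), (10,3), (14,7), (19,8), (21,12)
xs = [6, 10, 14, 19, 21]
ys = [5, 3, 7, 8, 12]

intercept, slope = least_squares_line(xs, ys)
print(f"y-hat = {intercept:.3f} + {slope:.3f} x")  # prints y-hat = 0.455 + 0.468 x
```

Among all straight lines, this one minimizes the sum of squared vertical distances between the observed y values and the line.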
47
FIRST STEP IS TO MAKE A SCATTERGRAM
[SCATTERGRAM: SALES REVENUE ($1000s), Y, AGAINST AD. EXPENDITURE ($100s), X]
48
WHAT IS THE BEST FIT? [SCATTERGRAM WITH POSSIBLE FITS]