Chapter 3 Section 3.1 Examining Relationships. Continue to ask the preliminary questions familiar from Chapter 1 and 2 What individuals do the data describe?


Chapter 3 Section 3.1 Examining Relationships

Continue to ask the preliminary questions familiar from Chapters 1 and 2: What individuals do the data describe? What are the variables? How are the variables measured? Are all the variables quantitative, or is at least one a categorical variable? Do you want to explore the nature of the relationship, or do you think some of the variables explain or cause the changes in others?

Bivariate: involving two variables. In particular, an analysis that attempts to show a correlation between two variables is said to be bivariate.

When working with bivariate data, each variable plays a different role. One variable is the explanatory or predictor variable, while the other is the response variable.

Bivariate data is graphed on a scatterplot with an x-axis (horizontal) and a y-axis (vertical). The explanatory variable is graphed on the horizontal axis, and the response variable is graphed on the vertical axis. A scatterplot is a picture of the association between two variables.

Do Problem 3.2 pg 123.

Tips for drawing scatterplots: Scale the horizontal and vertical axes; the intervals must be uniform. If the scale does not begin at zero, use the // symbol to indicate a break. Label both axes. If given a grid, use a scale so that your plot utilizes the whole grid; don’t compress the plot into one corner of the grid.

To analyze a scatterplot, describe the data in terms of: Direction (positive or negative association). Form (linear, clustered, or curved). Strength, or scatter (how closely the points follow the form). Outliers (deviations from the overall pattern).

Do the following problems as example problems 3.6 pg pg pg 139

End of Section 3.1

CHAPTER 3 SECTION 3.2

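Lesson 3.2 Correlation. Correlation measures the direction and strength of the linear relationship between two quantitative variables. It is the average of the products of the standardized values, and it is given by the following equation: $r = \frac{1}{n-1} \sum \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right)$. Here the "average" divides by n - 1 rather than n.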
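The "average of the products of the standardized values" can be sketched in a few lines of Python. This is only an illustration: the data values are made up, the function name correlation is hypothetical, and the n - 1 divisor is assumed to match the sample standard deviation.

```python
from math import sqrt

def correlation(xs, ys):
    """Sample correlation r: sum of products of standardized values, divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sample standard deviations (divisor n - 1).
    s_x = sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    # Standardize each value, multiply pairwise, and average with n - 1.
    products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)

# Hypothetical data, made up for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = correlation(xs, ys)
print(round(r, 4))
```

Because each value is standardized first, r has no units and does not change if x or y is rescaled.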

The correlation computed from the sample data measures the direction and strength of the linear relationship between two quantitative variables. The symbol for the sample correlation coefficient is r. The range of the correlation is from -1 to +1. When r is close to +1, there is a strong positive linear relationship between the variables. When r is close to -1, there is a strong negative linear relationship between the variables. When there is no linear relationship or only a weak one, the value of r will be close to 0. The correlation is not resistant: it is strongly affected by outliers.

If women always married men who were two years older than themselves, what would be the correlation between the ages of husband and wife?
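One way to check an answer to this question numerically is sketched below. The ages are hypothetical, and the helper uses the equivalent sums-of-squares form of r; if every husband is exactly two years older, the points fall on a straight line with positive slope.

```python
from math import sqrt

def correlation(xs, ys):
    """Sample correlation via sums of squares: r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical ages, made up for illustration only.
wife_ages = [22, 25, 31, 38, 47]
husband_ages = [age + 2 for age in wife_ages]  # always two years older

r = correlation(wife_ages, husband_ages)
print(round(r, 6))
```

Adding a constant to every value shifts the points but leaves the standardized values unchanged, which is why the size of the age gap does not matter.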

The gas mileage of an automobile first increases and then decreases as the speed increases. This relationship is very regular as shown by the following data on speed (miles per hour) and the mileage (miles per gallon): Speed: MPG: Make a scatter plot; calculate r.

End of Section 3.2

Section 3.3 Least Squares Regression

LEAST-SQUARES REGRESSION Given a scatter plot, one must be able to draw the line of best fit. Best fit means that the sum of the squares of the vertical distances from each point to the line is minimized.

When the scatterplot appears linear, the line of best fit is the Least-Squares Regression Line (LSRL).

Equation of the Least-Squares Regression Line (LSRL): $\hat{y} = a + bx$. $\hat{y}$ is read “y-hat” and means the predicted value of y. a is the y-intercept. b is the slope. The point $(\bar{x}, \bar{y})$ is on the LSRL.

Equation for the slope of the LSRL: $b = r \frac{s_y}{s_x}$. r is the correlation coefficient. $s_x$ is the standard deviation of x. $s_y$ is the standard deviation of y.

Equation for the y-intercept of the LSRL: $a = \bar{y} - b\bar{x}$. a is the y-intercept. $\bar{y}$ is the mean of the y-values. $\bar{x}$ is the mean of the x-values.
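The slope and intercept can be computed directly from r, the two standard deviations, and the two means, using the standard formulas b = r(s_y / s_x) and a = y-bar - b(x-bar). The sketch below assumes made-up data and a hypothetical function name lsrl.

```python
from statistics import mean, stdev

def lsrl(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    # Correlation: products of standardized values, averaged with n - 1.
    r = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
            for x, y in zip(xs, ys)) / (n - 1)
    b = r * s_y / s_x        # slope
    a = y_bar - b * x_bar    # y-intercept
    return a, b

# Hypothetical data, made up for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = lsrl(xs, ys)
print(f"y-hat = {a:.2f} + {b:.2f}x")

# The line passes through the point (x-bar, y-bar).
print(abs((a + b * mean(xs)) - mean(ys)) < 1e-9)
```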

y: observed value. $\bar{y}$ (y-bar): mean of the observed values. $\hat{y}$ (y-hat): predicted value.

Do problem 3.38 pg 158 refer back to data in FIG 3.1 pg 127

What is $r^2$? How do we interpret $r^2$? If you know nothing about y’s relationship to x, then when you want to predict y, the best you can do is use y-bar. In this case, the TOTAL SUM OF SQUARED ERROR is: $SST = \sum (y_i - \bar{y})^2$

If you know something about the relationship between x and y, then you can predict y with the regression line, and the SUM OF SQUARES FOR ERROR is: $SSE = \sum (y_i - \hat{y}_i)^2$

Which is greater, SST or SSE?

If x is a poor predictor of y, then the sum of squared deviations about the mean $\bar{y}$ and the sum of squared deviations about the regression line $\hat{y}$ would be approximately the same.

$SST - SSE$ is the amount of error you eliminated, and $r^2 = \frac{SST - SSE}{SST}$ is the proportion of error eliminated out of the total error you started with.
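The comparison between predicting with y-bar and predicting with the regression line can be sketched numerically. Here SST is the sum of squared deviations of y about y-bar, and SSE is the sum of squared deviations of y about the line. The data are made up, and the slope is computed with the equivalent sums-of-squares form.

```python
# Hypothetical data, made up for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least-squares slope and intercept.
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
     / sum((x - x_bar) ** 2 for x in xs))
a = y_bar - b * x_bar
y_hat = [a + b * x for x in xs]

# SST: error when predicting every y with y-bar.
sst = sum((y - y_bar) ** 2 for y in ys)
# SSE: error when predicting each y with the regression line.
sse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))

# Proportion of the original error eliminated by the regression.
r_squared = (sst - sse) / sst
print(round(sst, 4), round(sse, 4), round(r_squared, 4))
```

For a least-squares line, SSE can never exceed SST, so this proportion always falls between 0 and 1.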

$r^2$: The Coefficient of Determination. The coefficient of determination, $r^2$, is the fraction of the variation in the values of y that is explained by the least-squares regression of y on x. (It is not the same as the coefficient of variation, which is the ratio of the standard deviation to the mean.)

When you report r, give $r^2$ as a measure of how successful the regression was in explaining the response. When you see r, square it to get a better feel for the strength of the association. Example: r = 0.7, $r^2$ = 0.49. $r^2$ = 0.49 means that 49% of the variation in y is explained by the least-squares regression of y on x.

The correlation between math and verbal SAT scores for this class was 0.66. What percent of the variation in the verbal scores is explained by the math scores?

In a study of the effect of temperature on household heating bills, an investigator said, “Our research shows that about 70% of the variability in the heating units used by a particular house over the years can be explained by outside temperature.” Explain what the investigator meant by this statement. According to this study, what is the correlation between outside temperature and heating bills?

RESIDUALS. Residual = observed y - predicted y = $y - \hat{y}$. The mean of the least-squares residuals always equals zero (allowing for round-off error). An effective tool for testing the goodness of fit of a regression line to a bivariate data set is the residual plot.
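A residual is the observed y minus the predicted y. The sketch below, on made-up data, computes the residuals from a least-squares line and checks that their mean is zero up to round-off error.

```python
# Hypothetical data, made up for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least-squares slope and intercept.
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
     / sum((x - x_bar) ** 2 for x in xs))
a = y_bar - b * x_bar

# Residual = observed y - predicted y.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
mean_residual = sum(residuals) / n

print([round(e, 2) for e in residuals])
print(abs(mean_residual) < 1e-9)  # zero apart from floating-point round-off
```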

Do problem 3.42 pg 167

RESIDUAL PLOT. The residual plot is a scatterplot of the points $(x, y - \hat{y})$, that is, the residuals plotted against the explanatory variable. If the residual plot shows a random dispersion with no apparent pattern, the LSRL fits the data. If the residual plot shows a curved pattern or a fanned pattern, the LSRL is not a good summary of the data.
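To see what a curved residual pattern looks like, the sketch below fits a least-squares line to deliberately curved, made-up data (y = x squared) and lists the points that a residual plot would display. No plotting library is assumed; the pairs are simply printed.

```python
# Hypothetical curved data (y = x squared), made up for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [x ** 2 for x in xs]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least-squares slope and intercept.
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
     / sum((x - x_bar) ** 2 for x in xs))
a = y_bar - b * x_bar

# Points of the residual plot: (x, observed y - predicted y).
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
for x, e in zip(xs, residuals):
    print(x, round(e, 2))
```

The residuals are positive at both ends and negative in the middle. That systematic U shape is the kind of curved pattern that signals a straight line is not a good summary.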

When the TI-83 executes a regression model, the residuals are automatically computed and stored in the list RESID. It will be located alphabetically in the NAMES list.

Do problem 3.48 which refers back to data in Table 3.4 and the equation in Example 3.14 pg 168.

TECHNOLOGY TOOLBOX

Analyzing Data for Two Variables

End of Chapter 3