Chapter 3 Review Two Variable Statistics Veronica Wright Christy Treekhem River Brooks.


The Big Idea
This chapter explains how scatterplots represent the relationship between two quantitative variables. A scatterplot makes the direction, form, and strength of a relationship easy to see and helps to isolate outliers. Residual plots are a further tool for judging whether a linear model suits the data. The LSRL can be used to predict results from the data, while the correlation coefficient r and the coefficient of determination r^2 measure how strong the linear relationship is and how well the line fits. These ideas come up throughout statistics and well beyond it: anyone who reads a scatterplot is already using them to infer how things are likely to develop from the data collected so far. Fields that rely on them include economics, politics, manufacturing, and even sports.

Important Vocabulary
Direction – whether the response variable tends to increase (positive) or decrease (negative) as the explanatory variable increases
Scatterplot – a graph that shows the relationship between 2 quantitative variables that are measured on the same individuals
Response variable – a variable that measures the outcome of a study, also called the dependent variable
Explanatory variable – a variable that influences the response variable, also called the independent variable
Form – the shape that the data resembles when displayed on a scatterplot, e.g. linear, curved, exponential
Strength – how closely the data points follow the form
Outlier – a data point that falls well outside the overall pattern, so that it seems significantly out of place on a scatterplot
Correlation coefficient – a measure of the direction and strength of the linear relationship between two quantitative variables, usually represented as r

Important Vocabulary
Regression line – a line that describes how the response variable changes when the explanatory variable changes
Extrapolation – using the regression line to predict results for values of the explanatory variable outside the range of the actual data
LSRL, the least-squares regression line – the line that makes the sum of the squared vertical distances from the data points as small as possible: ŷ = a + bx
Residual – the difference between an observed value and the value the regression line predicts for it: residual = y - ŷ
Residual plot – a scatterplot of the residuals against the explanatory variable
Coefficient of determination – the fraction of the variation in the response variable that is accounted for by the LSRL; the closer it is to 1, the better the LSRL fits the data. It is usually written r^2 and is always between 0 and 1
Lurking variable – a variable other than the response and explanatory variables that may influence the relationship between them
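To make the definitions of residual and coefficient of determination concrete, here is a minimal Python sketch. The five data points are made up for illustration, and the helper name `fit_lsrl` is our own, not something from the chapter. It fits the LSRL, computes each residual as observed y minus predicted y, and gets r^2 as 1 minus the ratio of the squared residuals to the total squared variation in y.

```python
def fit_lsrl(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # y-intercept
    return a, b


# Hypothetical data, chosen only to keep the arithmetic simple.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_lsrl(xs, ys)
predicted = [a + b * x for x in xs]

# residual = observed y - predicted y
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]

# r^2 = 1 - (sum of squared residuals) / (total squared variation in y)
y_bar = sum(ys) / len(ys)
sse = sum(e ** 2 for e in residuals)
sst = sum((y - y_bar) ** 2 for y in ys)
r_squared = 1 - sse / sst

print("a =", round(a, 3), "b =", round(b, 3))
print("residuals:", [round(e, 3) for e in residuals])
print("r^2 =", round(r_squared, 3))
```

The residuals from a least-squares line always add up to zero (apart from rounding), which is a quick check that the line was fitted correctly.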

Key Topics Covered in the Chapter
How to graph and determine the relationship between independent (explanatory) and dependent (response) variables
Correlation – how to find it, and what it means
Regression line (best fit, LSRL) – how to find it, what it means, and how well it fits the data

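Formulas You Ought to Know
The regression line formula (LSRL): ŷ = a + bx, with ŷ being the predicted response, a being the y-intercept, b being the slope, and x being the explanatory variable.
The formula for the mean: x̄ = (x_1 + x_2 + ... + x_n)/n
The formula for standard deviation: s_x = sqrt( ((x_1 - x̄)^2 + (x_2 - x̄)^2 + ... + (x_n - x̄)^2) / (n - 1) )
The formula for r (correlation coefficient): r = (1/(n - 1)) * [ ((x_1 - x̄)/s_x)((y_1 - ȳ)/s_y) + ... + ((x_n - x̄)/s_x)((y_n - ȳ)/s_y) ]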
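As a rough check on these formulas, the sketch below computes them with Python's standard `statistics` module. It assumes the usual sample (n - 1) standard deviation and the standard textbook relations b = r * s_y / s_x and a = ȳ - b * x̄ for the LSRL. The data and the function names `correlation` and `lsrl` are hypothetical, chosen only for illustration.

```python
from statistics import mean, stdev  # stdev is the sample (n - 1) version


def correlation(xs, ys):
    """r = sum of products of standardized x and y values, divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    products = (((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return sum(products) / (n - 1)


def lsrl(xs, ys):
    """Return (a, b) for y-hat = a + b*x using b = r*s_y/s_x, a = y_bar - b*x_bar."""
    r = correlation(xs, ys)
    b = r * stdev(ys) / stdev(xs)
    a = mean(ys) - b * mean(xs)
    return a, b


# Hypothetical data, not taken from the chapter.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)
a, b = lsrl(xs, ys)
print("mean of x =", mean(xs), " s_x =", round(stdev(xs), 4))
print("r =", round(r, 4), " r^2 =", round(r ** 2, 4))
print("LSRL: y-hat =", round(a, 3), "+", round(b, 3), "x")
```

These are the same quantities the calculator reports, so a sketch like this can be used to double-check a hand calculation.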

Calculator Key Strokes
In this unit, we use the calculator to find Sx, Sy, the mean of x, the mean of y, r, r^2, and the LSRL, and to graph the scatterplot and residual plot.
To find r^2, r, and the LSRL, do the following:
(Enter the data sets into L1 and L2.)
Insert your lists in the order (Explanatory List, Response List).
To find Sx, Sy, the mean of x, or the mean of y, do all of the above, except press “2” instead of “8”:
(Enter the data sets into L1 and L2.)
Insert your lists in the order (Explanatory List, Response List).

Calculator Key Strokes
To plot the scatterplot, do the following:
Enter the data set.
Make sure the plot is On.
Choose the Scatterplot.
To find the residual plot, do all of the above, except change “Ylist” to “Resid”.
(If you cannot find the RESID button in your Statlist, do the following):
And now it should work, but MAKE SURE that you have already calculated the LSRL.
(Scroll down to DiagnosticOn)

Example Problems
A study shows that there is a positive correlation between the size of a hospital and the median number of days patients remain in the hospital. Does this mean you can shorten a stay by choosing a small hospital? Explain.
No. Correlation does not imply causation. The severity of the patient's condition is a likely lurking variable: patients with minor injuries may not feel the need to go to a larger hospital, so smaller hospitals tend to see shorter stays.

Example Problems
The Standard & Poor's 500 index is an average of the price of 500 stocks. There is a moderately strong correlation (r equals approximately 0.6) between how much this index changes in January and how much it changes during the entire year. If we looked instead at data on all 500 individual stocks, we would find a very different correlation. Would the correlation be higher or lower? Why?
The correlation would be lower. Individual stock performances are more variable than an average of 500 stocks, and that extra variability weakens the relationship.

Example Problems
A study of elementary school children ages 6-11 finds a high positive correlation between shoe size x and score y on a test of reading comprehension. What explains this correlation?
Age is a lurking variable. We would expect both shoe size and reading score to increase with age.

Example Problems
A college newspaper interviews a psychologist about student ratings of the teaching of faculty members. The psychologist says, “The evidence indicates that the correlation between research productivity and teaching rating of faculty members is close to zero.” The paper reports this as “Professor McDaniel said that good researchers tend to be poor teachers, and vice versa.” Explain why this is wrong, and explain the psychologist's meaning.
The paper's version describes a negative correlation, which is not what Professor McDaniel said. A correlation close to zero means there is almost no linear relationship between research productivity and teaching rating: good researchers are about as likely to be good teachers as poor ones.

Example Problems
Explain why this is wrong: “There is a high correlation between gender of American workers and their income.”
Gender is categorical, not quantitative. Correlation is defined only for two quantitative variables.

Example Problems
Explain the error: “We found a high correlation (r=1.09) between students' ratings of teaching and ratings made by other faculty members.”
r must be between -1 and 1, so r = 1.09 is impossible.
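The limits on r can be seen in a short Python sketch. The data are made up, and `correlation` and `is_valid_r` are hypothetical helper names. Points lying exactly on a rising line give the largest possible r, and points lying exactly on a falling line give the smallest, so a reported value of 1.09 cannot be right.

```python
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def is_valid_r(r):
    """A reported correlation is only possible if it lies in [-1, 1]."""
    return -1 <= r <= 1


xs = [1, 2, 3, 4, 5]
r_up = correlation(xs, [2 * x + 1 for x in xs])     # points exactly on a rising line
r_down = correlation(xs, [-2 * x + 1 for x in xs])  # points exactly on a falling line

# Rounded because floating-point arithmetic can be off in the last digit.
print("perfect positive line: r =", round(r_up, 6))
print("perfect negative line: r =", round(r_down, 6))
print("is r = 1.09 possible?", is_valid_r(1.09))
```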

Helpful Hints
Some people can’t find the RESID button to get the residual plot, or get an error: that’s because you need to find the LSRL first.
If you can’t see anything when you plot your scatterplot, press Zoom -> 9.
If the RESID plot shows any kind of pattern, a linear model (the LSRL) is not a good fit. A different model – perhaps a power or exponential one, if the pattern is curved – would suit the data better.
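The hint about patterns in the residual plot can be illustrated with a small Python sketch. The data are deliberately curved (y = x^2) and made up for this purpose, and `fit_lsrl` is a hypothetical helper name. Fitting a straight line to them leaves residuals that run positive, then negative, then positive again, which is the kind of pattern that signals a linear model is the wrong choice.

```python
def fit_lsrl(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


# Hypothetical curved data: y = x^2, so a straight line should fit poorly.
xs = [1, 2, 3, 4, 5]
ys = [x ** 2 for x in xs]

a, b = fit_lsrl(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
signs = ["+" if e > 0 else "-" for e in residuals]

print("LSRL: y-hat =", a, "+", b, "x")
print("residuals:", residuals)
print("signs:", " ".join(signs))
```

If the residuals were scattered above and below zero with no pattern, the linear model would be a reasonable fit.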

The End