Transforming to Achieve Linearity


Section 12.2: Transforming to Achieve Linearity

Example 1: A fisheries biologist wants to predict the weight (in grams) of perch (a type of fish) caught in a certain lake from their length (in cm). He catches, measures, and weighs 13 perch whose lengths were between 8 and 48 cm. Below is a scatterplot of his data, along with the residual plot from a linear regression analysis. a) Is a linear model appropriate for these data? Justify your answer. A linear model is not appropriate because there is a curved pattern in both the scatterplot and the residual plot.

If the scatterplot of the logarithm (or natural logarithm) of the response variable against the original explanatory variable has a linear form, then the two variables can be modeled using an exponential function. If the scatterplot of the logarithm (or natural logarithm) of the response variable against the logarithm (or natural logarithm) of the explanatory variable has a linear form, then the two variables can be modeled using a power function.
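The algebra behind these two rules can be written out briefly. This is a sketch using natural logarithms; the symbols a and b for the intercept and slope of the fitted line are notation assumed here, not taken from the slides.

```latex
\begin{align*}
\ln y = a + b\,x \quad &\Longrightarrow \quad y = e^{a + bx} = e^{a}\,(e^{b})^{x} && \text{(exponential model)} \\
\ln y = a + b\,\ln x \quad &\Longrightarrow \quad y = e^{a}\,e^{b \ln x} = e^{a}\,x^{b} && \text{(power model)}
\end{align*}
```

In the first case x ends up in the exponent; in the second case x is the base raised to the fitted slope. The same reasoning holds for base-10 logarithms with 10 in place of e.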

b) Below is a scatterplot of the natural logarithm of weight vs. the natural logarithm of length. This relationship is clearly more linear than the one above. Does this suggest that the relationship between length and weight can be modeled by an exponential function or by a power function? Explain.

The relationship between length and weight can be modeled by a power function because taking the natural logarithm of both variables produced a scatterplot with a linear pattern.

c) Computer output from the regression of ln (Weight) vs. ln (Length) is given below. Use it to predict the weight of a fish that is 75 cm long.
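The computer output is not reproduced in this transcript, so the prediction cannot be worked here. As a hedged sketch of the mechanics only: a line fitted as ln(weight) = intercept + slope · ln(length) is evaluated at ln(75) and then exponentiated to return to grams. The function name and the example coefficients below are hypothetical and are not the values from the perch output.

```python
import math

def predict_from_power_model(intercept, slope, x):
    """Predict y from a fitted line ln(y) = intercept + slope * ln(x)."""
    ln_y_hat = intercept + slope * math.log(x)  # prediction on the ln scale
    return math.exp(ln_y_hat)                   # undo the ln to get y

# Hypothetical coefficients (NOT from the perch output): the line
# ln(y) = ln(2) + 3 ln(x) corresponds to the power model y = 2 * x**3.
example = predict_from_power_model(math.log(2), 3, 2)
print(round(example, 6))  # 2 * 2**3 = 16
```

Note that 75 cm lies well outside the 8 to 48 cm range of the sampled fish, so any prediction at that length is an extrapolation and should be treated with caution.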

Example 2: Is there a link between the amount of cigarette smoking in countries and death rates from coronary heart disease (CHD)? Below is computer output from a regression analysis of this relationship for 14 randomly-selected countries from around the world, along with a residual plot. The explanatory variable is annual consumption of cigarettes per person and the response variable is annual deaths from coronary heart disease per 100,000 people.

a) What is the equation of the least-squares regression line based on these data? Define any variables used. b) Interpret the slope of the regression line. A one-cigarette increase in the annual number of cigarettes consumed per person in a country is associated with a predicted increase of 0.02268 in annual deaths from CHD per 100,000 people.

c) If we are trying to determine the relationship between these two variables throughout the world, is the slope you provided in part (b) a statistic or a parameter? Explain. This is a statistic: it is an estimate of the population regression slope based on this particular random sample of 14 countries.

d) Assuming all conditions have been met, construct and interpret a 90% confidence interval for the slope of the least squares regression of annual CHD deaths on annual cigarette consumption. State: We want to estimate β, the true slope of the population regression line relating annual cigarette consumption to annual deaths from CHD, with 90% confidence. Plan: We are told to assume all conditions for inference have been met, so we will use a t-interval for the slope to estimate β.

Do: df = 14 – 2 = 12. For a 90% confidence level, the critical value is t* = 1.782. So the 90% confidence interval for β is 0.02268 ± 1.782(0.01926) ≈ 0.0227 ± 0.0343 = (–0.0116, 0.0570). Conclude: We are 90% confident that the interval from –0.0116 to 0.0570 captures the actual slope of the population regression line relating annual deaths from CHD to annual cigarette consumption per person in all countries.
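The arithmetic in the Do step can be checked with a few lines of Python. This is a minimal sketch: the slope, its standard error, and the critical value t* are the numbers quoted above, while the function and variable names are illustrative choices.

```python
def slope_confidence_interval(slope, std_error, t_critical):
    """Return (lower, upper) for slope ± t* · SE."""
    margin = t_critical * std_error
    return slope - margin, slope + margin

b = 0.02268      # sample slope used in the interval above
se_b = 0.01926   # standard error of the slope used in the interval above
t_star = 1.782   # t critical value for 90% confidence, df = 12

lower, upper = slope_confidence_interval(b, se_b, t_star)
print(f"({lower:.4f}, {upper:.4f})")  # (-0.0116, 0.0570)
```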

e) If you were to perform a test of the hypotheses H0: β = 0 versus Ha: β ≠ 0 at the α = 0.10 level, what would you conclude? Justify your answer using your result in part (d). Since the 90% confidence interval contains 0, we fail to reject H0 at the α = 0.10 level. We do not have enough evidence to suggest that the slope of the population regression line relating annual cigarette consumption to annual deaths from CHD is different from 0.
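The same conclusion can be reached from the test statistic directly. The sketch below is an illustration, not part of the original solution: it computes t = b / SE from the slope and standard error used in part (d) and compares it with the two-sided critical value t* = 1.782 for df = 12. Variable names are illustrative.

```python
b = 0.02268      # sample slope from part (d)
se_b = 0.01926   # standard error of the slope from part (d)
t_star = 1.782   # two-sided critical value at alpha = 0.10, df = 12

# Test statistic for H0: beta = 0
t_statistic = (b - 0) / se_b

# Reject H0 only if |t| exceeds the critical value
reject_null = abs(t_statistic) > t_star
print(round(t_statistic, 3), reject_null)
```

Because |t| is smaller than t*, the test fails to reject H0, which agrees with the confidence interval containing 0.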

Example 3: Lupe is shopping for a used car and collects data on age (in years) and price (in 1000s of dollars) for Ford Taurus sedans on a used-car web site. The computer output for three different regression models (Price vs. Age, Log (Price) vs. Age, and Log (Price) vs. Log (Age)) is shown below.

I. Price versus Age

II. Log Price versus Age

III. Log Price versus Log Age

a) Explain how the information provided suggests that a linear model may not be appropriate for describing the relationship between car age and price. The scatterplot shows a curved relationship between Price and Age. This is reinforced by the distinctive “U-shaped” pattern in the residual plot: positive residuals for high and low ages and negative residuals in between.

b) Would an exponential model or a power model provide a better description of this relationship? Use the information provided to justify your answer. The plot of Log (Price) versus Age is clearly linear, and the residual plot shows a random scatter of points on either side of the line residual = 0. (The Log (Price) versus Log (Age) scatterplot and residual plot do not suggest a linear relationship.) Because Log (Price) vs. Age is roughly linear, Price vs. Age can be modeled well by an exponential function.

c) Give the equation of the model you chose in part (b), using the transformed variable(s).

d) Use the model you chose in part (c) to predict the price of a 5-year-old Ford Taurus. Show your work!
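The regression output for this example is not included in the transcript, so the numerical answer is not worked here. As a hedged sketch of the method: with a fitted line of the form Log (Price) = intercept + slope · Age, the predicted log price is computed first and then the logarithm is undone. The code assumes "Log" means base 10, which should be checked against the software that produced the output; the function name and example coefficients are hypothetical.

```python
import math

def predict_from_exponential_model(intercept, slope, x, base=10.0):
    """Predict y from a fitted line log_base(y) = intercept + slope * x."""
    log_y_hat = intercept + slope * x  # prediction on the log scale
    return base ** log_y_hat           # undo the log to get y

# Hypothetical coefficients (NOT from the Ford Taurus output):
# log10(y) = 2.0 - 0.2 * x, evaluated at x = 5, gives log10(y) = 1.0.
example = predict_from_exponential_model(2.0, -0.2, 5)
print(round(example, 6))  # 10**1 = 10
```

Since price was recorded in thousands of dollars, the back-transformed prediction is also in thousands of dollars. If the output used natural logarithms instead, passing base=math.e applies the same steps.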