

Non-Linear Regression

The data frame trees is made available in R with > data(trees). It records the girth in inches, height in feet and volume of timber in cubic feet of each of a sample of 31 felled black cherry trees in Allegheny National Forest, Pennsylvania. Note that girth here is the diameter of the tree (in inches), measured at 4 ft 6 in above the ground.

We treat volume as the (continuous) response variable y and seek a reasonable model describing its distribution conditional first on the explanatory variable girth (we will call this x). This might be a first step to prediction of volume based on further observations of the explanatory variables.

A plot of the data suggests that we first try a linear dependence, i.e. that the relationship is approximately y = a + bx + ε for some constants a and b, where ε is a random error term. We will use R to find estimates of a and b, their standard errors, and the residuals.
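A minimal sketch of the R commands for this fit (the name model1 is ours; the summary output supplies the coefficient estimates, their standard errors and the residual standard error discussed below):

```r
# Load the built-in black cherry trees data
data(trees)

# Model 1: volume depends linearly on girth
model1 <- lm(Volume ~ Girth, data = trees)

# Estimated coefficients a and b, standard errors, p-values,
# and the residual standard error
summary(model1)
```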

The fitted model is volume = −36.9 + 5.07 × girth + residual, i.e. y = −36.9 + 5.07x (+ residual). To check its validity, first look at the standard errors.

The standard errors of both a and b are small in comparison with the estimates themselves, and the p-values associated with the coefficients show that neither may reasonably be taken to be zero. Thus there is evidence that the model is appropriate.

Some measure of the success of the fitted model is also given by the residual standard error. For a good fit this should be small in relation to the variation in the response variable itself.

Note: √18.1 ≈ 4.252, i.e. the residual standard error is the square root of the estimated residual variance.

However, a full examination of the residuals, and of the nature of any further dependence they may have on the explanatory variables, is preferable to reliance on any single summary number. This requires graphical analysis, the results of which follow.
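The graphical checks can be produced with commands along the following lines (a sketch; model1 is refitted here so the snippet is self-contained):

```r
data(trees)
model1 <- lm(Volume ~ Girth, data = trees)

# Fitted line over the scatterplot
plot(Volume ~ Girth, data = trees)
abline(model1)

# Residuals against the explanatory variable
plot(residuals(model1) ~ trees$Girth,
     xlab = "Girth", ylab = "residuals from Model 1")
```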

There is slight evidence of non-random behaviour in the residuals, with perhaps the hint of a quadratic curve. We therefore adapt the model.

The residuals from Model 1 show some further, perhaps quadratic, dependence on the explanatory variable girth, so we try introducing a nonlinear term. We consider the model volume = a + b1 × girth + b2 × girth² + residual. The relevant R commands, and associated output, are:
> model2 = lm(Volume ~ Girth + I(Girth^2))
> summary(model2)

The fitted model is therefore volume = 10.8 − 2.09 × girth + 0.255 × girth² + residual.

Consider now the graphs produced by the following commands.
> plot(Volume ~ Girth)
> lines(fitted(model2) ~ Girth)
> plot(residuals(model2) ~ Girth, ylab = "residuals from Model 2")

It is clear that these residuals are both smaller than those from Model 1 and show no further obvious dependence on the explanatory variable girth. Further, the very small p-value associated with the coefficient b2 shows that it cannot reasonably be set equal to zero, so Model 2 is considerably more successful than Model 1.
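The comparison of Models 1 and 2 can also be made formally with an extra sum of squares F-test via anova(); a sketch:

```r
data(trees)
model1 <- lm(Volume ~ Girth, data = trees)
model2 <- lm(Volume ~ Girth + I(Girth^2), data = trees)

# F-test of whether the quadratic term significantly improves the fit;
# a small p-value favours Model 2
anova(model1, model2)
```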

Note also that the residual standard error in Model 2 (about 3.3) is smaller than that in Model 1 (4.252). Further analysis: on physical grounds, we might also consider the simpler model volume = b2 × girth² + residual. For extra justification, look at the R output below.

The R code to fit this model, and brief summary output, are:
> model3 = lm(Volume ~ I(Girth^2) - 1)
> summary(model3)
(The "- 1" in the formula suppresses the intercept.)

We might now ask whether we can find a model involving both explanatory variables, height and girth. Physical considerations suggest that we explore the very simple model volume = b1 × height × girth² + ε. This is essentially the formula for the volume of a cylinder: since girth is a diameter, the constant b1 absorbs the geometric factor and the conversion between inches and feet.
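A sketch of how this model might be fitted in R (the name model4 is ours; as before, the "- 1" suppresses the intercept):

```r
data(trees)

# Model 4: volume proportional to height * girth^2, no intercept
model4 <- lm(Volume ~ I(Height * Girth^2) - 1, data = trees)
summary(model4)
```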

So the fitted equation is: volume = b1 × height × girth² + ε, with the estimate of b1 taken from the R output.

The residuals are considerably smaller than those from any of the previous models considered. Further graphical analysis fails to reveal any further obvious dependence on either of the explanatory variables girth and height. Further analysis also shows that inclusion of a constant term in the model does not significantly improve the fit. Model 4 is thus the most satisfactory of the models considered for these data.

However, this is regression "through the origin", so it may be more satisfactory to rewrite Model 4 as

volume / (height × girth²) = b1 + ε,

so that b1 can then just be regarded as the mean of the observations of volume / (height × girth²); recall that ε is assumed to have location measure (here, mean) zero.

Compare this mean with the estimate of b1 found earlier.
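The comparison can be made directly in R; a sketch (note that the regression-through-the-origin estimate and the mean of the ratios are different estimators, minimising different criteria, so they will generally differ slightly):

```r
data(trees)

# Estimate of b1 from regression through the origin
model4 <- lm(Volume ~ I(Height * Girth^2) - 1, data = trees)
coef(model4)

# Mean of the ratios volume / (height * girth^2): should agree closely
mean(trees$Volume / (trees$Height * trees$Girth^2))
```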

Multiple Regression Example. [Table of data: response y and explanatory variables x1 and x2; values as shown on the original slide.]

So the fitted equation is y = b0 + b1 x1 + b2 x2 + e, with the coefficient estimates taken from the R output.

Adding an extra point:
> ynew = c(y, 12)
> x1new = c(x1, 20)
> x2new = c(x2, 100)
> multregressnew = lm(ynew ~ x1new + x2new)

This single additional point has a very large influence on the fitted regression coefficients.
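Influence can be quantified, for example with Cook's distance. The following sketch uses made-up illustrative values for y, x1 and x2 (the slide's actual data are not reproduced here):

```r
# Illustrative (made-up) data; not the values from the slide
y  <- c(2.1, 3.9, 6.2, 7.8, 10.1)
x1 <- c(1, 2, 3, 4, 5)
x2 <- c(5, 3, 8, 1, 7)

# Add one extreme point and refit, as on the slide
ynew  <- c(y, 12)
x1new <- c(x1, 20)
x2new <- c(x2, 100)
multregressnew <- lm(ynew ~ x1new + x2new)

# Cook's distance: the added sixth point stands out as highly influential
cooks.distance(multregressnew)
```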