Part III The General Linear Model Chapter 9 Regression.


GLM, applied to regression
Example from Snedecor and Cochran (1989)
Interested in the relationship between:
– phosphorus content of corn (Pcorn, in ppm)
– phosphorus levels in soil samples (Psoil, in ppm)

1. Construct Model
Verbal: Phosphorus content of corn (Pcorn) depends on phosphorus content of soil (Psoil).
Graphical: (plot not recovered in this transcript)
Formal: for the response (Pcorn) and the explanatory variable (Psoil), state Name, Units, Dimensions, and Measurement Scale.

2. Execute analysis. Place data in model format:
    lm1 <- lm(Pcorn ~ Psoil, data = corn)
2. Execute analysis. Compute fitted values and residuals:
    fits <- fitted(lm1)
    resid <- residuals(lm1)
    cbind(corn, fits, resid)
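
The fitting step can be tried end to end with the short sketch below. It is only an illustration: the nine (Psoil, Pcorn) pairs are invented placeholders, not the Snedecor and Cochran measurements, and the residuals are stored as res (rather than resid) simply to avoid masking R's built-in resid() function.

```r
# Hypothetical data: these nine pairs are invented for illustration and are
# NOT the Snedecor and Cochran (1989) measurements.
corn <- data.frame(
  Psoil = c(2, 5, 6, 10, 12, 14, 20, 24, 30),
  Pcorn = c(60, 68, 58, 80, 75, 90, 82, 96, 105)
)

# Place data in model format
lm1 <- lm(Pcorn ~ Psoil, data = corn)

# Fitted values and residuals
fits <- fitted(lm1)
res <- residuals(lm1)
print(cbind(corn, fits, res))

# Each observation decomposes as fitted value + residual
max_gap <- max(abs(corn$Pcorn - (fits + res)))
```

max_gap should be zero up to rounding error, which confirms the data = model + residual decomposition used throughout the chapter.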

3. Evaluate Model. Plot residuals against fitted values:
    plot(fits, resid, pch = 16)
Check linear trend: if the straight-line model is adequate, the residuals show no remaining pattern (no bowl or arch).

3. Evaluate Model. Because we use theoretical distributions (χ², t, F) to calculate the p-value, we need to check their assumptions:
– Fixed variance (errors homogeneous)
– Normally distributed errors
– Independent errors
– Unbiased estimate (errors sum to zero)

3. Evaluate Model. Homogeneous errors.

3. Evaluate Model. Normal errors.
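
The homogeneity and normality checks are usually done graphically. The sketch below is one way to draw them in base R; the data frame is an invented placeholder (not the textbook data), and the choice of a residual-versus-fit plot plus a normal quantile plot is an assumption about which diagnostics the slides showed.

```r
# Hypothetical data, invented for illustration only
corn <- data.frame(
  Psoil = c(2, 5, 6, 10, 12, 14, 20, 24, 30),
  Pcorn = c(60, 68, 58, 80, 75, 90, 82, 96, 105)
)
lm1 <- lm(Pcorn ~ Psoil, data = corn)
fits <- fitted(lm1)
res <- residuals(lm1)

# Homogeneous errors: look for a fan or cone shape in the vertical spread
plot(fits, res, pch = 16, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Normal errors: points should fall close to the reference line
qqnorm(res, pch = 16)
qqline(res)

# Least squares with an intercept forces the residuals to sum to zero
res_sum <- sum(res)
```

With only nine points these plots are hard to read, which is one reason the later steps fall back on randomization when a p-value is close to α.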

3. Evaluate Model. Independent errors. This is a textbook example; we do not have information on the spatial layout of the samples or on the collection sequence. We will assume independence.
3. Evaluate Model. Conclusion. Residuals appear to be homogeneous, but not normal. We assume independence because we do not have enough information to evaluate this assumption. We may need to use an empirical distribution to compute p-values or confidence limits.

4. State population and whether sample is representative.
Population? Sample (n = 9)
The population is all values of phosphorus in corn, given knowledge of phosphorus in the soil.
The sample is representative if the 9 sampled soils represent the range of possible soil types.

5. Decide on mode of inference. Is hypothesis testing appropriate? Since the relationship between P in soil and P content in corn is unknown, we proceed.
6. State HA / Ho, test statistic and α.
HA: ____
Ho: ____
Statistic: ____
α: ____

7. ANOVA: Partition df according to model.
n = 9
df total = ________ = _____
df model = 1
df res = df total – df model = _____

7. ANOVA: Calculate SS, partition according to model.
Null model: Pcorn = mean(Pcorn); SS total: ____
Regression model: Pcorn = ____ + ____ * Psoil; SS residual: ____
SS improvement? __________

7. ANOVA: Partition df, SS according to model. Complete ANOVA table.
7. ANOVA: Calculate Type I error from F distribution. Packages compute and place the p-value in the ANOVA table.
p = ____
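
The df and SS partition in step 7 can be reproduced by hand and compared with the table a package prints. The sketch below does this in R with invented placeholder data (not the textbook values), so the numbers it prints are not the ones that belong in the blanks above.

```r
# Hypothetical data, invented for illustration only
corn <- data.frame(
  Psoil = c(2, 5, 6, 10, 12, 14, 20, 24, 30),
  Pcorn = c(60, 68, 58, 80, 75, 90, 82, 96, 105)
)
n <- nrow(corn)

lm0 <- lm(Pcorn ~ 1, data = corn)      # null model: Pcorn = mean(Pcorn)
lm1 <- lm(Pcorn ~ Psoil, data = corn)  # regression model

# Partition degrees of freedom
df_total <- n - 1
df_model <- 1
df_res <- df_total - df_model

# Partition sums of squares
ss_total <- sum(residuals(lm0)^2)
ss_res <- sum(residuals(lm1)^2)
ss_model <- ss_total - ss_res          # improvement due to the regression

# F-ratio and Type I error from the F distribution
f_ratio <- (ss_model / df_model) / (ss_res / df_res)
p_value <- pf(f_ratio, df_model, df_res, lower.tail = FALSE)

# The same quantities as computed by the package
tab <- anova(lm1)
print(tab)
```

The hand-computed f_ratio and p_value should match the "F value" and "Pr(>F)" entries in the first row of the printed table.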

8. Recompute p-value if necessary.
p-values can be inaccurate if assumptions are violated.
Distortion depends on sample size. As a rule of thumb, distortion is:
– greatest if n < 30
– less serious if 30 < n < 100
– usually not serious if n > 100
When assumptions are not met, recompute Type I error if two conditions are met:
1. n small
2. p near α

8. Recompute p-value if necessary.
Due diligence: recompute p-value using randomization
– Free of distributional assumptions
In 4000 randomizations there were 27 instances of an F-ratio greater than the observed F-ratio (value not recovered in this transcript).
– Empirical p-value: 27/4000 = 0.00675
– Theoretical p-value: ____
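
A randomization p-value of this kind can be sketched as follows. The data are invented placeholders, the helper f_from_xy is a hypothetical name, and shuffling the response against the explanatory variable is an assumption about how the 4000 randomizations were generated. For speed the F-ratio is computed from the correlation coefficient, using the identity F = (n - 2) r² / (1 - r²) that holds for simple linear regression.

```r
set.seed(42)

# Hypothetical data, invented for illustration only
corn <- data.frame(
  Psoil = c(2, 5, 6, 10, 12, 14, 20, 24, 30),
  Pcorn = c(60, 68, 58, 80, 75, 90, 82, 96, 105)
)

# F-ratio for a simple linear regression, via the correlation coefficient
f_from_xy <- function(x, y) {
  r <- cor(x, y)
  (length(x) - 2) * r^2 / (1 - r^2)
}

f_obs <- f_from_xy(corn$Psoil, corn$Pcorn)

# Shuffle the response to break any association, then recompute F
n_rand <- 4000
f_rand <- replicate(n_rand, f_from_xy(corn$Psoil, sample(corn$Pcorn)))

# Empirical p-value: share of randomized F-ratios at least as large as observed
p_emp <- mean(f_rand >= f_obs)

# Theoretical p-value from the F distribution, for comparison
p_theory <- anova(lm(Pcorn ~ Psoil, data = corn))[1, "Pr(>F)"]
c(empirical = p_emp, theoretical = p_theory)
```

If the two p-values lead to the same decision at the chosen α, the violation of the normality assumption has not distorted the conclusion.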

9. Declare and report decision about model terms.

10. Report and interpret parameters of biological interest.

Today: Lab 4 due
Monday & Tuesday: No classes
Wednesday: Grad seminar Lecture Quiz 5
Thursday: Lab 5a

Chapter 9.2 Regression. Explanatory Variable Fixed into Classes

GLM, applied to regression: X variable fixed into classes
Example: Galton’s Law
Quantity of interest is the stature (height) of sons in relation to the stature (height) of their fathers.
Data collected by Francis Galton at the end of the 19th century.
First application of regression.

1. Construct Model: Verbal, Graphical, Formal, Data
Verbal: There is a positive relation between heights of sons and fathers.
Explanatory: _____________
Response: _____________
Model: __________________
For each symbol (Hson, Hf), state Units, Dimensions, and Measurement Scale.

2. Execute analysis. Place data in model format:
    lm1 <- lm(Hson ~ Hf, weights = Nfamily, data = Heights)
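
The weights argument matters because each row stands for a class of fathers rather than a single family. The sketch below assumes that Nfamily is the number of families behind each class mean (the transcript does not say so explicitly) and uses invented heights. It shows that weighting by the count gives the same coefficients as repeating each row Nfamily times.

```r
# Hypothetical class data, invented for illustration (not Galton's values)
Heights <- data.frame(
  Hf = c(64, 66, 68, 70, 72),                 # father height class
  Hson = c(66.1, 67.0, 68.2, 69.1, 70.3),     # son height for that class
  Nfamily = c(3, 8, 12, 7, 2)                 # assumed: families per class
)

# Weighted fit, as on the slide
lm_w <- lm(Hson ~ Hf, weights = Nfamily, data = Heights)

# Unweighted fit on a data frame with each row repeated Nfamily times
expanded <- Heights[rep(seq_len(nrow(Heights)), Heights$Nfamily), ]
lm_e <- lm(Hson ~ Hf, data = expanded)

coef(lm_w)
coef(lm_e)
```

The two sets of coefficients agree. The standard errors do not, because the expanded data frame claims more residual degrees of freedom than the five class rows actually provide.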

2. Execute analysis. Compute fitted values and residuals.
    coefficients(lm1)
Output columns: (Intercept), Hf (numeric values not recovered in this transcript).
Fitted model: Hson = ____ + ____ * Hf

3. Evaluate Model □ Straight line model ok? □ Errors homogeneous? □ Errors normal? □ Errors independent?

4. State population and whether sample is representative.
Population is all possible measurements, given the measurement protocol, if we repeated the study thousands of times.
We infer a population consisting of thousands of runs of the same experiment, using the same protocol.

5. Decide on mode of inference. Is hypothesis testing appropriate?
Might expect a 1:1 ratio. Undertake hypothesis testing? Use confidence limits.
10. Report and interpret parameters of biological interest.
Compute confidence limits from standard error of the slope parameter:
    summary(lm1)$coefficients
Output columns: Estimate, Std. Error, t value, Pr(>|t|); rows: (Intercept), Hf. The numeric values were not recovered in this transcript; both p-values are of order e-12 (***).
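
Confidence limits follow from the slope estimate and its standard error as estimate ± t × SE, with the t quantile taken at the residual degrees of freedom. The sketch below works this through on invented class data (not Galton's values) and compares the result with confint(); the 95% level is an assumption.

```r
# Hypothetical class data, invented for illustration (not Galton's values)
Heights <- data.frame(
  Hf = c(64, 66, 68, 70, 72),
  Hson = c(66.1, 67.0, 68.2, 69.1, 70.3),
  Nfamily = c(3, 8, 12, 7, 2)
)
lm1 <- lm(Hson ~ Hf, weights = Nfamily, data = Heights)

# Slope estimate and its standard error from the coefficient table
ctab <- summary(lm1)$coefficients
est <- ctab["Hf", "Estimate"]
se <- ctab["Hf", "Std. Error"]

# 95% limits: estimate +/- t(0.975, residual df) * SE
t_crit <- qt(0.975, df = df.residual(lm1))
ci_manual <- c(est - t_crit * se, est + t_crit * se)

# Same limits from the built-in helper
ci_builtin <- confint(lm1)["Hf", ]

# Does the interval include the 1:1 expectation (slope = 1)?
includes_one <- ci_manual[1] <= 1 && 1 <= ci_manual[2]
```

If includes_one is FALSE, the data are not consistent with a 1:1 relation between the heights of sons and fathers at the chosen confidence level.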


Chapter 9.3 Regression. Explanatory Variable Measured with Error

GLM, applied to regression: Explanatory Variable Measured with Error
Measurement error in the explanatory variable adds bias to regression parameter estimates.
Example:
– Relation between number of eggs and body size in cabezon fish (Box 14.12, Sokal and Rohlf 1995)
– What is the magnitude of the bias?
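
A small simulation gives a feel for the size of the bias. The sketch below assumes classical measurement error (random, independent of the true value and of the response), in which case the fitted slope shrinks toward zero by the factor var(true X) / (var(true X) + var(error)). All numbers are simulated and have nothing to do with the cabezon data.

```r
set.seed(1)
n <- 5000

# Simulated "true" explanatory variable and response (true slope = 3)
x_true <- rnorm(n, mean = 50, sd = 10)
y <- 2 + 3 * x_true + rnorm(n, sd = 5)

# Observed explanatory variable = true value + measurement error (sd = 10)
x_obs <- x_true + rnorm(n, sd = 10)

b_true <- unname(coef(lm(y ~ x_true))[2])  # slope using the error-free X
b_obs <- unname(coef(lm(y ~ x_obs))[2])    # slope using the noisy X

# Expected shrinkage under classical measurement error
reliability <- var(x_true) / (var(x_true) + 10^2)
expected_b_obs <- 3 * reliability

c(b_true = b_true, b_obs = b_obs, expected = expected_b_obs)
```

Here the error variance equals the variance of the true X, so the slope fitted to the noisy variable is roughly half the true slope.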

1. Construct Model
Verbal
– Does egg number Neggs depend on body mass M?
Graphical D V G F
Formal
– Response: Neggs
– Explanatory: M
units? dimensions? measurement scale?

2. Execute analysis. Place data in model format:
    lm1 <- lm(Neggs ~ M, data = data)
Estimate parameters and compute fitted values and residuals.

3. Evaluate Model □ Structure? □ Straight line model ok? □ Errors homogeneous? □ Errors normal? □ Errors independent?
Table for the independence check, with columns M, Neggs, Res, Lag.Res (values not recovered in this transcript; only an NA entry survives).
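
The Res and Lag.Res columns suggest that independence was checked by comparing each residual with the one before it. The sketch below builds such a table; the data frame fish and its values are invented placeholders, and treating row order as the relevant sequence is an assumption (a lag only makes sense if rows are ordered in time or space).

```r
# Hypothetical data for 11 fish, invented for illustration only
fish <- data.frame(
  M = c(12, 15, 20, 22, 26, 30, 33, 36, 39, 43, 45),
  Neggs = c(45, 52, 58, 70, 66, 85, 80, 95, 92, 108, 104)
)
lm1 <- lm(Neggs ~ M, data = fish)

# Residuals and the same residuals shifted down one row
Res <- residuals(lm1)
Lag.Res <- c(NA, Res[-length(Res)])   # first row has no predecessor
print(cbind(fish, Res, Lag.Res))

# Independent errors: no trend when residuals are plotted against their lag
plot(Lag.Res, Res, pch = 16)
abline(h = 0, v = 0, lty = 2)

# Lag-1 correlation of the residuals (near zero if errors are independent)
lag1_cor <- cor(Res, Lag.Res, use = "complete.obs")
```

A strong positive lag1_cor would indicate runs of similar residuals, which is evidence against independent errors.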

4. State population and whether sample is representative.
a) All measurements that could have been made on the fish by this protocol
b) All cabezon fish
c) All fish that could have been collected when the collection was made
d) Measurements from the 11 cabezon fish reported here

5. Decide on mode of inference. Is hypothesis testing appropriate?
We want to know if the relationship between body size and egg count deviates from 1:1. Use confidence limits.
10. Report and interpret parameters of biological interest.
Compute confidence limits:
    confint(lm1)
Output columns: 2.5 %, 97.5 %; rows: (Intercept), M (numeric values not recovered in this transcript).

10. Report and interpret parameters of biological interest.
Neggs = Fits + Res (table of 11 rows; only the first Neggs value, 61, survives in this transcript)
Check limits free of assumptions – randomization
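
One way to check confidence limits without relying on normal errors is to resample the data. The transcript does not say which randomization scheme was used, so the sketch below swaps in a plain bootstrap of (M, Neggs) pairs with percentile limits; the data are invented placeholders and the number of resamples is arbitrary.

```r
set.seed(123)

# Hypothetical data for 11 fish, invented for illustration only
fish <- data.frame(
  M = c(12, 15, 20, 22, 26, 30, 33, 36, 39, 43, 45),
  Neggs = c(45, 52, 58, 70, 66, 85, 80, 95, 92, 108, 104)
)
lm1 <- lm(Neggs ~ M, data = fish)

# Resample rows with replacement and refit the slope each time
n_boot <- 2000
slopes <- replicate(n_boot, {
  idx <- sample(nrow(fish), replace = TRUE)
  unname(coef(lm(Neggs ~ M, data = fish[idx, ]))["M"])
})

# Percentile limits from the resampled slopes
ci_boot <- quantile(slopes, c(0.025, 0.975), na.rm = TRUE)

# Limits based on the t distribution, for comparison
ci_theory <- confint(lm1)["M", ]
rbind(bootstrap = ci_boot, theoretical = ci_theory)
```

If the two sets of limits are similar, the conclusion about the slope does not hinge on the distributional assumptions.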


Chapter 9.4 Exponential Function, using Linear Regression

Exponential functions

Exponential rates are common in biology
Example: specific growth rate
– Growth of 6 lungfish in 2001 in Lake Baringo, Kenya
Data table columns: Initial (kg), End (kg), Time (days); values not recovered in this transcript.

1. Construct Model Verbal – Growth rate of lungfish is exponential, with fixed growth rate k Graphical D V G F

2. Execute analysis.
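
An exponential model M_end = M_init × exp(k × t) becomes linear after taking logs: log(M_end / M_init) = k × t. The sketch below fits that line in R. It is a guess at the analysis rather than a record of it: the column names Minit, Mend and t, the use of the log mass ratio as the response, and all of the numbers (simulated with a hypothetical k of 0.001 per day) are assumptions. The intercept is kept because the confidence-limit output later in the chapter lists both (Intercept) and t.

```r
set.seed(7)

# Simulated data for 6 fish; values are invented, not the Lake Baringo data
k_true <- 0.001                                  # hypothetical rate per day
fish <- data.frame(
  Minit = c(1.2, 2.5, 3.1, 4.0, 5.6, 6.3),       # initial mass, kg
  t = c(30, 60, 90, 120, 150, 180)               # days at liberty
)
fish$Mend <- fish$Minit * exp(k_true * fish$t + rnorm(6, sd = 0.01))

# Linearize: log(Mend / Minit) = k * t
fish$logratio <- log(fish$Mend / fish$Minit)
lm1 <- lm(logratio ~ t, data = fish)

# Slope is the specific growth rate k (per day)
k_hat <- unname(coef(lm1)["t"])
limits <- confint(lm1)["t", ]

# Express as percent per day and as percent growth over 30 days
pct_per_day <- 100 * k_hat
pct_per_month <- 100 * (exp(30 * k_hat) - 1)
```

A rate near 0.1%/day compounds to roughly 3% over a 30-day month, which is the conversion used in the final slide.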

3. Evaluate Model □ Straight line model ok? □ Errors homogeneous? □ Errors normal? □ Errors independent?

4. State population and whether sample is representative. All measurements that could have been made on the fish by this protocol.
5. Decide whether to use hypothesis testing. The research objective is to estimate the specific growth rate of fish. We will examine the parameters and compute confidence limits (skip to step 10).

10. Report and interpret parameters of biological interest.
Compute confidence limits:
    confint(lm1)
Output columns: 2.5 %, 97.5 %; rows: (Intercept), t (numeric values not recovered in this transcript).
L = Lower limit = ____ %/day
U = Upper limit = ____ %/day
The limits include zero, suggesting no growth. Yet all fish were larger upon recapture. Improbable result: ____ (calculation not recovered in this transcript).
But was growth exponential?

10. Report and interpret parameters of biological interest. The estimate of growth rate is approximately 0.1%/day, or about 3% per month – but the estimate is not reliable!