Linear regression models

Purposes:
- To describe the linear relationship between two continuous variables: the response variable (y-axis) and a single predictor variable (x-axis)
- To determine how much of the variation in Y can be explained by the linear relationship with X, and how much of the variation remains unexplained
- To predict new values of Y from new values of X

The linear regression model is

$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$

where
- $\beta_0$ = population intercept (the value of Y when $x_i = 0$)
- $\beta_1$ = population slope, which measures the change in Y per unit change in X
- $\varepsilon_i$ = random or unexplained error associated with the i-th observation

[Figure: linear relationship of Y against X, showing the intercept $\beta_0$ and the slope $\beta_1$ as the rise in Y per 1.0 unit increase in X]

Linear models approximate non-linear functions over a limited domain. [Figure: interpolation within the range of the data versus extrapolation beyond it]

Fitting data to a linear model

$\mu_{y_i} = \beta_0 + \beta_1 x_i$

is the mean of Y at $x_i$; each observation is $y_i = \mu_{y_i} + \varepsilon_i$. [Figure: observed values $y_i$ at $x_1, x_2, \ldots$, the fitted line giving predicted values $\hat{y}_i$, and the vertical deviations $y_i - \hat{y}_i$ = residuals]

The residual is the difference between an observed and a predicted value, $e_i = y_i - \hat{y}_i$.

The residual sum of squares: $RSS = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$

The "best fit" estimates are the parameter values that minimize the residual sum of squares (RSS) between each observed value and the value predicted by the model.
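
As a concrete illustration, here is a minimal Python sketch of the RSS objective on a hypothetical simulated data set (all variable names are illustrative):

```python
import numpy as np

def rss(b0, b1, x, y):
    """Residual sum of squares for the line yhat = b0 + b1 * x."""
    resid = y - (b0 + b1 * x)
    return np.sum(resid ** 2)

# Hypothetical data for illustration.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=30)
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=30)

print(rss(2.0, 0.5, x, y))   # RSS near the true parameters is small
print(rss(0.0, 0.0, x, y))   # a bad candidate line gives a much larger RSS
```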

Sum of squares: $SS_X = \sum_{i=1}^{n} (x_i - \bar{x})^2$

Sum of cross products: $SS_{XY} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$

Variance: $s_X^2 = \dfrac{SS_X}{n-1}$

Covariance: $s_{XY} = \dfrac{SS_{XY}}{n-1}$

Least-squares parameter estimate of the slope:

$\hat{\beta}_1 = \dfrac{SS_{XY}}{SS_X} = \dfrac{s_{XY}}{s_X^2}$

where $\bar{x}$ and $\bar{y}$ are the sample means of X and Y.

To solve for the intercept: $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$
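
A minimal sketch of these closed-form estimates on simulated data, cross-checked against numpy's least-squares polynomial fit (data and names are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=30)
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=30)

# Least-squares estimates from the sums of squares and cross products.
ss_x = np.sum((x - x.mean()) ** 2)                # SS_X
ss_xy = np.sum((x - x.mean()) * (y - y.mean()))   # SS_XY
b1 = ss_xy / ss_x                # slope = covariance / variance
b0 = y.mean() - b1 * x.mean()    # intercept from the sample means

# Cross-check: np.polyfit returns [slope, intercept] for degree 1.
b1_np, b0_np = np.polyfit(x, y, deg=1)
assert np.allclose([b0, b1], [b0_np, b1_np])
```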

Variance of the error of regression: $\hat{\sigma}^2 = MS_{residual} = \dfrac{RSS}{n-2} = \dfrac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{n-2}$

Variance components and coefficient of determination: the total variation in Y partitions into an explained and an unexplained part,

$SS_{total} = SS_{regression} + SS_{residual}$, i.e. $\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$

Coefficient of determination: $r^2 = \dfrac{SS_{regression}}{SS_{total}} = 1 - \dfrac{SS_{residual}}{SS_{total}}$, the proportion of the variation in Y explained by the linear relationship with X.

Product-moment correlation coefficient: $r = \dfrac{s_{XY}}{s_X s_Y} = \dfrac{SS_{XY}}{\sqrt{SS_X \, SS_Y}}$

ANOVA table for regression

Source      df    Sum of squares          Mean square      Expected mean square   F ratio
Regression  1     Σ(ŷᵢ − ȳ)²              SS_reg / 1       σ² + β₁² SS_X          MS_reg / MS_res
Residual    n−2   Σ(yᵢ − ŷᵢ)²             SS_res / (n−2)   σ²
Total       n−1   Σ(yᵢ − ȳ)²

Publication form of ANOVA table for regression

Source      Sum of Squares   df   Mean Square   F   Sig.
Regression
Residual
Total
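
A sketch of how the entries of this table could be computed for a simple regression, assuming simulated data and using scipy for the F probability:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, size=30)
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=30)
n = len(x)

b1, b0 = np.polyfit(x, y, deg=1)
yhat = b0 + b1 * x

ss_reg = np.sum((yhat - y.mean()) ** 2)   # regression SS, df = 1
ss_res = np.sum((y - yhat) ** 2)          # residual SS, df = n - 2
ms_reg = ss_reg / 1
ms_res = ss_res / (n - 2)
f_ratio = ms_reg / ms_res
p_value = stats.f.sf(f_ratio, 1, n - 2)   # upper-tail F probability

print(f"F(1, {n - 2}) = {f_ratio:.2f}, p = {p_value:.4g}")
```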

Variance of the estimated intercept: $s_{\hat{\beta}_0}^2 = \hat{\sigma}^2 \left[ \dfrac{1}{n} + \dfrac{\bar{x}^2}{SS_X} \right]$

Variance of the slope estimator: $s_{\hat{\beta}_1}^2 = \dfrac{MS_{residual}}{SS_X}$

Variance of the fitted value at $x$: $s_{\hat{y}}^2 = \hat{\sigma}^2 \left[ \dfrac{1}{n} + \dfrac{(x - \bar{x})^2}{SS_X} \right]$
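
These three variance formulas translate directly into code; a sketch on hypothetical data:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(0, 10, size=30)
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=30)
n = len(x)

b1, b0 = np.polyfit(x, y, deg=1)
ms_res = np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2)   # error variance estimate
ss_x = np.sum((x - x.mean()) ** 2)

se_b1 = np.sqrt(ms_res / ss_x)                              # SE of the slope
se_b0 = np.sqrt(ms_res * (1 / n + x.mean() ** 2 / ss_x))    # SE of the intercept
x_new = 5.0                                                 # hypothetical new x
se_fit = np.sqrt(ms_res * (1 / n + (x_new - x.mean()) ** 2 / ss_x))  # SE of fitted value
```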

Regression

Assumptions of regression
- The linear model correctly describes the functional relationship between X and Y
- The X variable is measured without error
- For a given value of X, the sampled Y values are independent, with normally distributed errors
- Variances are constant along the regression line

Residual plot for species-area relationship
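
A residual plot like this takes only a few lines to produce; a sketch on simulated data (the species-area data themselves are not reproduced here):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(4)
x = rng.uniform(0, 10, size=50)
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=50)

b1, b0 = np.polyfit(x, y, deg=1)
fitted = b0 + b1 * x
residuals = y - fitted

# Residuals vs. fitted values: look for curvature (wrong functional form)
# or a funnel shape (non-constant variance).
plt.scatter(fitted, residuals)
plt.axhline(0, linestyle="--")
plt.xlabel("Fitted values")
plt.ylabel("Residuals")
plt.show()
```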

The influence function

Logistic regression: for a binary response, model the probability p of the outcome through the logit link, $\ln\!\left(\dfrac{p}{1-p}\right) = \beta_0 + \beta_1 x$

Height vs. survival in Hypericum cumulicola
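
The Hypericum data are not reproduced here; a minimal logistic-regression sketch on a hypothetical height/survival data set, using statsmodels:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
height = rng.uniform(10, 80, size=200)           # hypothetical plant heights (cm)
p = 1 / (1 + np.exp(-(-3.0 + 0.08 * height)))    # true survival probabilities
survived = rng.binomial(1, p)                    # observed 0/1 outcomes

# Logistic regression: ln(p / (1 - p)) = b0 + b1 * height
X = sm.add_constant(height)
fit = sm.Logit(survived, X).fit()
print(fit.summary())
```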

Multiple regression

Relative abundance of C3 and C4 plants (Paruelo & Lauenroth 1996): geographic distribution and the effects of climate variables on the relative abundance of a number of plant functional types (PFTs): shrubs, forbs, succulents, C3 grasses and C4 grasses.

Data: 73 sites across temperate central North America

Response variable:
- Relative abundance of PFTs (based on cover, biomass, and primary production) for each site

Predictor variables:
- Longitude
- Latitude
- Mean annual temperature
- Mean annual precipitation
- Winter (%) precipitation
- Summer (%) precipitation
- Biome (grassland, shrubland)

Box 6.1: Relative abundances were transformed as ln(abundance + 1) because the raw values were positively skewed.

Collinearity
- Causes computational problems: it drives the determinant of the matrix of X variables close to zero, and matrix inversion essentially involves dividing by the determinant, which is very sensitive to small differences in the numbers
- Inflates the standard errors of the estimated regression slopes

Detecting collinearity
- Check tolerance values
- Plot the variables
- Examine a matrix of correlation coefficients between predictor variables
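
A sketch of these checks in Python, on hypothetical uncentered lat/long predictors plus their product (statsmodels reports the VIF, which is the reciprocal of the tolerance):

```python
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(6)
lat = rng.uniform(30, 50, size=73)
lon = rng.uniform(-110, -90, size=73)
df = pd.DataFrame({"lat": lat, "lon": lon, "lat_x_lon": lat * lon})

# Correlation matrix between predictors: values near +/-1 flag trouble.
print(df.corr().round(2))

# Variance inflation factors: VIF = 1 / tolerance; VIF far above 10
# (tolerance far below 0.1) is a common rule of thumb for serious collinearity.
X = df.to_numpy()
for i, name in enumerate(df.columns):
    print(name, variance_inflation_factor(X, i))
```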

Dealing with collinearity: omit predictor variables if they are highly correlated with other predictor variables that remain in the model.

Additive model, with a constant added inside the log so that zero abundances can be transformed:

$\log_{10}(C_3 + 0.1) = \beta_0 + \beta_1(\text{lat}) + \beta_2(\text{long}) + \beta_3(\text{map}) + \beta_4(\text{mat}) + \beta_5(\text{JJAmap}) + \beta_6(\text{DJFmap})$

$\ln(C_3 + 1) = \beta_0 + \beta_1(\text{lat}) + \beta_2(\text{long}) + \beta_3(\text{map}) + \beta_4(\text{mat}) + \beta_5(\text{JJAmap}) + \beta_6(\text{DJFmap})$

After centering both lat and long:

$\ln(C_3 + 1) = \beta_0 + \beta_1(\text{lat}) + \beta_2(\text{long}) + \beta_3(\text{lat} \times \text{long})$

If we omit the interaction and refit the model, the partial regression slope for latitude changes. After centering, the collinearity problems disappear.
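
A sketch of centering before fitting the interaction model, with simulated stand-ins for the Paruelo & Lauenroth variables:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
df = pd.DataFrame({
    "lat": rng.uniform(30, 50, size=73),
    "lon": rng.uniform(-110, -90, size=73),
})
df["lnC3"] = 0.02 * df["lat"] - 0.01 * df["lon"] + rng.normal(scale=0.1, size=73)

# Center the predictors so the interaction term is no longer nearly
# collinear with the main effects.
df["clat"] = df["lat"] - df["lat"].mean()
df["clon"] = df["lon"] - df["lon"].mean()

fit = smf.ols("lnC3 ~ clat + clon + clat:clon", data=df).fit()
print(fit.params)
```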

$R^2 = 0.514$

Matrix algebra approach to OLS estimation of multiple regression models:

$\mathbf{Y} = \mathbf{X}\mathbf{b} + \boldsymbol{\varepsilon}$

The normal equations $\mathbf{X}'\mathbf{X}\,\mathbf{b} = \mathbf{X}'\mathbf{Y}$ give

$\mathbf{b} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y}$
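
A sketch of the normal-equations solution in numpy, on simulated data; in practice a library routine such as lstsq is preferred for numerical stability:

```python
import numpy as np

rng = np.random.default_rng(8)
n = 73
X = np.column_stack([np.ones(n),                       # intercept column
                     rng.uniform(30, 50, size=n),      # e.g. latitude
                     rng.uniform(-110, -90, size=n)])  # e.g. longitude
beta_true = np.array([1.0, 0.05, -0.02])
y = X @ beta_true + rng.normal(scale=0.1, size=n)

# Solve the normal equations X'X b = X'y. Prefer solve() over forming
# the inverse explicitly; it is cheaper and numerically more stable.
b = np.linalg.solve(X.T @ X, X.T @ y)

# np.linalg.lstsq does the same job with better conditioning.
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
assert np.allclose(b, b_lstsq)
```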

Forward selection: start with no predictors; at each step, add the candidate predictor that most improves the fit, stopping when no further addition improves it.

Backward selection: start with all predictors; at each step, drop the predictor whose removal least worsens the fit, stopping when every remaining predictor contributes.
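
The slides do not specify a stopping criterion; a sketch of a forward-selection loop that uses AIC (variable names hypothetical):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(9)
df = pd.DataFrame(rng.normal(size=(73, 4)), columns=["lat", "lon", "map", "mat"])
df["y"] = 0.8 * df["lat"] - 0.5 * df["map"] + rng.normal(scale=0.5, size=73)

def forward_select(df, response, candidates):
    """Greedy forward selection: add the predictor that lowers AIC most."""
    selected = []
    best_aic = smf.ols(f"{response} ~ 1", data=df).fit().aic
    improved = True
    while improved and candidates:
        improved = False
        scores = []
        for var in candidates:
            formula = f"{response} ~ " + " + ".join(selected + [var])
            scores.append((smf.ols(formula, data=df).fit().aic, var))
        aic, var = min(scores)
        if aic < best_aic:           # keep the variable only if AIC drops
            best_aic, improved = aic, True
            selected.append(var)
            candidates.remove(var)
    return selected, best_aic

print(forward_select(df, "y", ["lat", "lon", "map", "mat"]))
```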

Adjusted $r^2$: $r_{adj}^2 = 1 - \dfrac{RSS/(n-p)}{SS_{total}/(n-1)}$, where p is the number of estimated parameters; unlike $r^2$, it does not automatically increase as predictors are added.

Akaike information criterion (AIC): for least-squares fits, $AIC = n\ln(RSS/n) + 2p$; smaller is better, and the $2p$ term penalizes model complexity.

Bayesian (Schwarz) information criterion: $BIC = n\ln(RSS/n) + p\ln(n)$; like AIC, but with a stronger penalty for complexity once $\ln(n) > 2$.
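
A sketch computing all three criteria from the residual sum of squares, using the Gaussian least-squares forms given above (the numbers are hypothetical):

```python
import numpy as np

def fit_criteria(rss, ss_total, n, p):
    """Model-selection scores from the residual SS; p counts all
    estimated coefficients, including the intercept."""
    adj_r2 = 1 - (rss / (n - p)) / (ss_total / (n - 1))
    aic = n * np.log(rss / n) + 2 * p          # Gaussian least-squares form
    bic = n * np.log(rss / n) + p * np.log(n)  # heavier complexity penalty
    return adj_r2, aic, bic

# Hypothetical numbers: adding terms shrinks RSS but raises the penalty.
print(fit_criteria(rss=12.4, ss_total=25.0, n=73, p=3))
print(fit_criteria(rss=12.1, ss_total=25.0, n=73, p=6))
```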

Hierarchical partitioning and model selection

No. pred   Model                     r²    Adj r²   Cp    AIC    Schwarz BIC
1          Lon
1          Lat
1          Lon × Lat
2          Lon + Lat
3          Lon + Lat + Lon × Lat
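
A sketch that rebuilds a comparison table like this for the five candidate models, on simulated stand-in data (the published values are not reproduced):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(10)
df = pd.DataFrame({
    "lat": rng.uniform(30, 50, size=73),
    "lon": rng.uniform(-110, -90, size=73),
})
df["lnC3"] = 0.03 * df["lat"] + rng.normal(scale=0.2, size=73)

models = [
    "lnC3 ~ lon",
    "lnC3 ~ lat",
    "lnC3 ~ lon:lat",
    "lnC3 ~ lon + lat",
    "lnC3 ~ lon + lat + lon:lat",
]
rows = []
for formula in models:
    fit = smf.ols(formula, data=df).fit()
    rows.append({"model": formula, "r2": fit.rsquared,
                 "adj_r2": fit.rsquared_adj, "AIC": fit.aic, "BIC": fit.bic})
print(pd.DataFrame(rows).round(3))
```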