
Regression

Types of Linear Regression Model
Ordinary Least Squares Model (OLS)
– Minimizes the residuals about the regression line
– Most commonly used method
Generalized Linear Model (GLM)
– A flexible generalization of ordinary linear regression that allows response variables with distributions other than the normal distribution
– Can be used with categorical data
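The OLS idea above can be sketched directly from its definition. This is a minimal numpy example with made-up data, using the closed-form simple-regression estimates for the slope and intercept:

```python
import numpy as np

# Hypothetical sample data (made up for illustration)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# OLS picks the slope and intercept that minimize the sum of squared residuals
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

# Residuals about the fitted regression line
residuals = y - (b0 + b1 * x)
```

A property worth checking: the OLS residuals always sum to (numerically) zero when an intercept is included.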

Regression is used to create empirical models.
Common regression – interval/ratio (real) dependent factor
Logistic regression – nominal/ordinal (integer) dependent factor
Basic assumption: there is a set of independent factors X that controls the magnitude of a dependent factor Y.

Basic Form
Y = a + b1·X1 + b2·X2 + b3·X3 + … + bn·Xn + e
The equation assumes that the relationship between Y and each Xi is linear and that the effects of the variables are additive.
Assumptions:
Variables are measured without error
The relationship is linear and all relevant variables are included
No autocorrelation of the independent variables
No perfect collinearity (correlation) between the independent variables
Errors are normally distributed for each variable
The variance of the error term is constant (no heteroscedasticity)
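The basic form above can be fitted with a least-squares solver. A minimal sketch with made-up data and two independent factors; the ones column in the design matrix carries the intercept a:

```python
import numpy as np

# Hypothetical data: Y built from two independent factors X1, X2 (made up)
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
Y = 1.0 + 2.0 * X1 + 0.5 * X2  # noiseless, so the error term e is 0 here

# Design matrix: a column of ones carries the intercept a
X = np.column_stack([np.ones_like(X1), X1, X2])

# Least-squares fit of Y = X @ [a, b1, b2] + e
coeffs, *_ = np.linalg.lstsq(X, Y, rcond=None)
a, b1, b2 = coeffs
```

Because the example data are noiseless, the fit recovers the coefficients exactly; with real data the estimates would differ from the true values by sampling error.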

Simple Linear Regression
The regression line Y = b0 + b1X is uniquely determined by its y-intercept b0 and its slope b1. For any given value of X, we can find a corresponding value of Y on the line.

Simple Linear Regression
If the relationship between X and Y is linear, the average Y value for any given X will lie right on the regression line. Realistically, in any population there is bound to be some variation between observations, due either to the data itself or to error in the measurement. Even in simple linear regression, not all of the data points will fall exactly on the regression line. We therefore account for the observations not falling on the line with the error term e. The error term ei for an observation i is the difference between the observed data point (Xi, Yi) and the theoretical regression line.

So, even though there may be several Y values with the same X value, the relationship can still be considered linear if we assume the average Y value for any given X value is on the regression line. In a regression model we also assume that, for any given value of X, the errors are normally distributed with a mean of zero and a constant variance σ². Negative and positive error values essentially cancel each other out, so their mean is 0.

Linear Regression Example
Now that we have our coefficients, the next question is how well the model fits the data. One way to test this is to compute the coefficient of determination (COD, or R²). We will not go into detail about the COD, but suffice it to say that it represents how well the fitted equation actually describes the data. It takes a value between 0 and 1: the closer the value is to 1, the better the fit.
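The coefficient of determination is straightforward to compute once the line is fitted. A sketch with made-up data, using R² = 1 − SSE/SST:

```python
import numpy as np

# Hypothetical sample data (made up for illustration)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])

# Fit the simple OLS line
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

# R^2 = 1 - SSE/SST: the fraction of the variance in y explained by the line
sse = np.sum((y - y_hat) ** 2)  # sum of squared residuals
sst = np.sum((y - y.mean()) ** 2)  # total variation about the mean
r_squared = 1 - sse / sst
```

With these nearly collinear points, R² comes out close to 1, indicating a good fit.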

Least Squares Regression
Scalar form: y = β0 + β1x + ε
Matrix form: Y = Xβ + ε
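The matrix form leads directly to the normal equations, (XᵀX)β = Xᵀy. A minimal sketch with made-up data lying exactly on y = 1 + 2x:

```python
import numpy as np

# Hypothetical data lying exactly on y = 1 + 2x (made up)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

# Matrix form Y = X beta + epsilon: the design matrix pairs a ones column with x
X = np.column_stack([np.ones_like(x), x])

# Solve the normal equations (X^T X) beta = X^T y for [intercept, slope]
beta = np.linalg.solve(X.T @ X, X.T @ y)
```

Solving the normal equations with `np.linalg.solve` avoids forming an explicit matrix inverse, which is both cheaper and numerically safer.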

Two-Step Process
1. Generalized linear regression for binary or ordinal data, using a GLM: Y = (1, 0), X = set of independent variables
prediction = a0 + a1x1 + a2x2 + a3x3 + …
2. Logit transformation:
Probability (0–1) = 1 / (1 + exp(−(a0 + a1x1 + a2x2 + a3x3 + …)))
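The two steps can be sketched as a small function: build the linear predictor, then squash it into (0, 1) with the logistic (inverse-logit) transform. The coefficient values below are hypothetical:

```python
import math

def predict_probability(a, xs):
    """Step 1: linear predictor a0 + a1*x1 + ...; Step 2: logit transform to (0, 1)."""
    linear = a[0] + sum(ai * xi for ai, xi in zip(a[1:], xs))
    return 1.0 / (1.0 + math.exp(-linear))

# A zero linear predictor maps to a probability of exactly 0.5
p_mid = predict_probability([0.0, 1.0], [0.0])
# A large positive predictor approaches 1; a large negative one approaches 0
p_hi = predict_probability([0.0, 1.0], [10.0])
p_lo = predict_probability([0.0, 1.0], [-10.0])
```

Whatever the linear predictor's range, the output is always a valid probability, which is what makes the transform suitable for binary outcomes.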

Christopherson et al.

The fit statistics for logistic regression can be misleading: the "best"-fitting model may be relatively poor in terms of predictive power. The goal is usually to delineate the area with the best chance of finding the event; the smaller the area containing the most observed events, the better. You can always expand the number of events in the "high"-probability areas by generalizing the model. You should perform additional tests to assess model performance: chi-square and Kolmogorov–Smirnov (K-S) tests can be used as relative indices. If possible, use an independent data set for testing.

Model Strength

Methods – Deforestation Probability Surface
Cell-by-cell logistic regression for each analysis year (1986 to 1999) using 5% stratified random samples (> 1,100,000 cells):
– Dependent variable: deforested (1) / forested (0)
– Independent variables: LN distance to roads, LN distance to settlements, well (1) / poorly (0) draining soils
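The 5% stratified sampling step above can be sketched as follows. The cell counts and class proportions here are made up; only the mechanism (sampling each outcome class separately so both are represented) reflects the method described:

```python
import random

random.seed(42)  # reproducible sketch

# Hypothetical grid: 10,000 cell IDs, 10% labeled deforested (made-up proportions)
cells = list(range(10_000))
labels = [1 if i < 1_000 else 0 for i in cells]

def stratified_sample(cells, labels, frac):
    """Draw a fixed fraction from each class so both outcomes appear in the sample."""
    sample = []
    for cls in set(labels):
        stratum = [c for c, lab in zip(cells, labels) if lab == cls]
        sample.extend(random.sample(stratum, int(len(stratum) * frac)))
    return sample

sample = stratified_sample(cells, labels, 0.05)  # 5% of each stratum
```

Stratifying preserves the deforested/forested proportions in the sample, which matters when one class (deforested cells) is rare.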

Deforestation Probability Surface – Observed Deforestation Results, 1986