
Multiple and non-linear regression

What is what? Regression: one variable is considered dependent on the other(s). Correlation: no variable is considered dependent on the other(s). Multiple regression: more than one independent variable. Linear regression: the dependent variable is scalar and depends linearly on the independent variable(s). Logistic regression: the dependent variable is categorical (ideally with only two levels) and follows an s-shaped relation.

Remember the simple linear regression? If Y is linearly dependent on X, simple linear regression is used: Y = α + βX + ε. α is the intercept, the value of Y when X = 0. β is the slope, the rate at which Y increases when X increases.
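As a small sketch of the estimation (with made-up example data, not the lecture's), the slope and intercept can be computed directly from the least-squares formulas:

```python
import numpy as np

# Illustrative data only (not from the lecture)
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Least-squares estimates: slope = cov(X, Y) / var(X), intercept from the means
beta = np.cov(X, Y)[0, 1] / np.var(X, ddof=1)   # slope
alpha = Y.mean() - beta * X.mean()              # intercept, value of Y at X = 0
```

For these numbers the slope comes out near 1.96 and the intercept near 0.14.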

Is the relation linear?

Multiple linear regression If Y is linearly dependent on more than one independent variable: Y = α + β1X1 + β2X2 + ε. α is the intercept, the value of Y when X1 and X2 = 0. β1 and β2 are termed partial regression coefficients. β1 expresses the change of Y for one unit of X1 when X2 is kept constant.

Multiple linear regression – residual error and estimation As the collected data are not expected to fall exactly in a plane, an error term must be added: Yi = α + β1X1i + β2X2i + εi. The error terms sum to zero. Estimating the dependent variable and the population parameters: Ŷ = a + b1X1 + b2X2, where a, b1 and b2 estimate α, β1 and β2.

Multiple linear regression – general equation In general, a finite number (m) of independent variables may be used to estimate the hyperplane: Y = α + β1X1 + β2X2 + … + βmXm + ε. The number of sample points must be at least two more than the number of variables (n ≥ m + 2).

Multiple linear regression – least sum of squares The principle of the least sum of squares is usually used to perform the fit: the parameters are chosen so that the residual sum of squares, Σ(Yi − Ŷi)², is minimised.
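A minimal sketch of this fit with NumPy (the data are invented and noise-free for clarity; real observations would include the error term):

```python
import numpy as np

# Invented data generated from Y = 1 + 2*X1 + 1*X2 (noise-free for clarity)
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
Y  = np.array([5.0, 6.0, 11.0, 12.0, 17.0, 18.0])

# Design matrix: a column of ones for the intercept, then the predictors
A = np.column_stack([np.ones_like(X1), X1, X2])

# lstsq minimises the residual sum of squares sum (Y_i - Yhat_i)^2
coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
a, b1, b2 = coef   # intercept and partial regression coefficients
```

Because the data were generated without noise, the fit recovers the generating coefficients (a = 1, b1 = 2, b2 = 1) almost exactly.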

Multiple linear regression – An example

Multiple linear regression – The fitted equation

Multiple linear regression – Are any of the coefficients significant? F = regression MS / residual MS, compared to the F distribution with the regression and residual degrees of freedom.

Multiple linear regression – Is it a good fit? R² = 1 − residual SS / total SS is an expression of how much of the variation can be described by the model. When comparing models with different numbers of variables, the adjusted R² should be used: Ra² = 1 − residual MS / total MS. The multiple regression coefficient: R = sqrt(R²). The standard error of the estimate = sqrt(residual MS).
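These goodness-of-fit quantities can be sketched as a small helper (the function name is mine, not the lecture's):

```python
import numpy as np

def goodness_of_fit(Y, Yhat, m):
    """R^2, adjusted R^2 and the standard error of the estimate for a
    model with m predictor variables fitted to n observations."""
    n = len(Y)
    ss_res = np.sum((Y - Yhat) ** 2)          # residual SS
    ss_tot = np.sum((Y - Y.mean()) ** 2)      # total SS
    r2 = 1 - ss_res / ss_tot
    # Adjusted R^2 replaces sums of squares with mean squares (SS / DF)
    r2_adj = 1 - (ss_res / (n - m - 1)) / (ss_tot / (n - 1))
    see = np.sqrt(ss_res / (n - m - 1))       # standard error of the estimate
    return r2, r2_adj, see
```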

Multiple linear regression – Which of the coefficients are significant? sbi is the standard error of the regression parameter bi. A t-test tests whether bi is different from 0: t = bi / sbi, with the residual DF as degrees of freedom. p values can be found in a table.
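A sketch of this t statistic computed from scratch (invented data; the p value would still come from a t table with the residual DF):

```python
import numpy as np

# Invented data: Y roughly 1 + 2*X1 + X2 with small irregular noise
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0])
Y  = np.array([5.12, 5.95, 11.03, 11.90, 17.08, 18.02, 22.93, 23.97])

A = np.column_stack([np.ones_like(X1), X1, X2])
xtx_inv = np.linalg.inv(A.T @ A)
b = xtx_inv @ A.T @ Y                       # least-squares coefficients

resid = Y - A @ b
df_resid = len(Y) - A.shape[1]              # residual DF
ms_resid = resid @ resid / df_resid         # residual mean square
s_b = np.sqrt(ms_resid * np.diag(xtx_inv))  # standard errors of the b_i
t = b / s_b                                 # compare to a t table with df_resid
```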

Multiple linear regression – Which of the variables are most important? The standardized regression coefficient b′ is a normalized version of b: b′i = bi · (sXi / sY), so coefficients can be compared across variables measured on different scales.

Multiple linear regression – multicollinearity If two factors are highly correlated, the estimated b's become inaccurate. Also called collinearity, intercorrelation, non-orthogonality, or ill-conditioning. Tolerance or variance inflation factors can be computed to detect it. Extreme correlation is called singularity, and one of the correlated variables must be removed.
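Variance inflation factors can be sketched as follows (the function name is mine): VIF_j = 1 / (1 − R_j²), where R_j² comes from regressing predictor j on the remaining predictors.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of the predictor matrix X.
    Values near 1 mean little collinearity; large values flag trouble."""
    n, m = X.shape
    out = []
    for j in range(m):
        y = X[:, j]
        # Regress column j on the other columns (with an intercept)
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ coef
        r2 = 1 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
        out.append(1.0 / (1.0 - r2))
    return np.array(out)
```

With orthogonal predictors all VIFs equal 1; with two nearly collinear columns the VIFs explode, which is exactly the warning sign the slide describes.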

Multiple linear regression – Pairwise correlation coefficients

Multiple linear regression – Assumptions The same as for simple linear regression:
1. The Y's are randomly sampled.
2. The residuals are normally distributed.
3. The residuals have equal variance.
4. The X's are fixed factors (their errors are small).
5. The X's are not perfectly correlated.

Logistic regression

Logistic Regression What if the dependent variable is categorical, especially binary? We could try some interpolation method, but linear regression cannot help us.

The sigmoidal curve The probability follows the logistic function p = 1/(1 + e^(−z)), where z = α + β1x1 + … + βmxm. The intercept basically just 'scales' the input variable. A large regression coefficient means the risk factor strongly influences the probability. A positive regression coefficient means the risk factor increases the probability. Logistic regression uses maximum likelihood estimation, not least-squares estimation.
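The curve itself is easy to sketch in code (a minimal illustration with hypothetical coefficients):

```python
import math

def logistic_probability(z):
    """The sigmoidal (logistic) curve: p = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

# z = intercept + coefficient * risk factor (hypothetical values)
alpha, beta = -2.0, 0.5
p_low  = logistic_probability(alpha + beta * 1.0)   # small risk factor value
p_high = logistic_probability(alpha + beta * 10.0)  # large risk factor value
```

With a positive beta, p_high exceeds p_low, matching the slide's point that a positive coefficient increases the probability.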

Does age influence the diagnosis? Continuous independent variable

Variables in the Equation
           B        S.E.    Wald      df   Sig.    Exp(B)   95% C.I. for Exp(B)
                                                            Lower     Upper
Age         0.109   0.010   108.745   1    0.000   1.115    1.092     1.138
Constant   -4.213   0.423    99.097   1    0.000   0.015
a. Variable(s) entered on step 1: Age.

Does previous intake of OCP influence the diagnosis? Categorical independent variable

Variables in the Equation
           B        S.E.    Wald    df   Sig.    Exp(B)   95% C.I. for Exp(B)
                                                          Lower     Upper
OCP(1)     -0.311   0.180   2.979   1    0.084   0.733    0.515     1.043
Constant    0.233   0.123   3.583   1    0.058   1.263
a. Variable(s) entered on step 1: OCP.

Odds ratio
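The slide's worked details are not in the transcript; as a hedged sketch, the odds ratio for a 2×2 table is OR = (a·d)/(b·c), and in the logistic regression output Exp(B) is the odds ratio per one-unit change of the variable.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table:
         a = exposed cases,   b = exposed controls,
         c = unexposed cases, d = unexposed controls."""
    return (a * d) / (b * c)

# Hypothetical counts, not data from the lecture
print(odds_ratio(20, 80, 10, 90))   # 2.25: exposure roughly doubles the odds
```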

Multiple logistic regression

Variables in the Equation
           B        S.E.    Wald      df   Sig.    Exp(B)   95% C.I. for Exp(B)
                                                            Lower     Upper
Age         0.123   0.011   115.343   1    0.000   1.131    1.106     1.157
BMI         0.083   0.019    18.732   1    0.000   1.087    1.046     1.128
OCP         0.528   0.219     5.808   1    0.016   1.695    1.104     2.603
Constant   -6.974   0.762    83.777   1    0.000   0.001
a. Variable(s) entered on step 1: Age, BMI, OCP.

Predicting the diagnosis by logistic regression What is the probability that the tumor of a 50-year-old woman who has been using OCP and has a BMI of 26 is malignant? Using the coefficients of the fitted model (Constant −6.974; Age 0.123; BMI 0.083; OCP 0.528):
z = −6.974 + 0.123 × 50 + 0.083 × 26 + 0.528 × 1 = 1.862
p = 1/(1 + e^(−1.862)) ≈ 0.87
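The same calculation as a short code sketch, using the coefficients reported in the fitted model:

```python
import math

# Coefficients from the fitted model: Constant, Age, BMI, OCP
const, b_age, b_bmi, b_ocp = -6.974, 0.123, 0.083, 0.528

# 50-year-old woman, BMI 26, has used OCP (OCP = 1)
z = const + b_age * 50 + b_bmi * 26 + b_ocp * 1
p = 1.0 / (1.0 + math.exp(-z))   # z ≈ 1.862, p ≈ 0.87
```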

Exercises 20.1,