Multiple and non-linear regression

Presentation transcript:

1 Multiple and non-linear regression

2 What is what? Regression: one variable is considered dependent on the other(s). Correlation: no variable is considered dependent on the other(s). Multiple regression: more than one independent variable. Linear regression: the dependent factor is scalar and linearly dependent on the independent factor(s). Logistic regression: the dependent factor is categorical (hopefully with only two levels) and follows an s-shaped relation.

3 Remember the simple linear regression? If Y is linearly dependent on X, simple linear regression is used: $Y = \alpha + \beta X$. $\alpha$ is the intercept, the value of Y when X = 0. $\beta$ is the slope, the rate at which Y increases when X increases.
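A minimal sketch of how the intercept and slope might be estimated, assuming ordinary least squares (the fitting principle named on a later slide). The function name `fit_simple_linear` and the data are hypothetical and chosen only for illustration.

```python
def fit_simple_linear(x, y):
    """Return (a, b): least-squares estimates of the intercept and slope."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross products and sum of squares around the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx            # slope: change in Y per unit change in X
    a = mean_y - b * mean_x  # intercept: fitted value of Y at X = 0
    return a, b


# Hypothetical data lying exactly on Y = 2 + 3X
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [2.0, 5.0, 8.0, 11.0, 14.0]
a, b = fit_simple_linear(x, y)
print(a, b)
```

Because the example data contain no noise, the estimates recover the intercept 2 and slope 3 used to generate them.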

4 Is the relation linear?

5 Multiple linear regression. If Y is linearly dependent on more than one independent variable: $Y = \alpha + \beta_1 X_1 + \beta_2 X_2$. $\alpha$ is the intercept, the value of Y when $X_1$ and $X_2$ = 0. $\beta_1$ and $\beta_2$ are termed partial regression coefficients. $\beta_1$ expresses the change of Y for one unit of $X_1$ when $X_2$ is kept constant.

6 Multiple linear regression – residual error and estimations. As the collected data are not expected to fall exactly in a plane, an error term must be added: $Y_j = \alpha + \beta_1 X_{1j} + \beta_2 X_{2j} + \epsilon_j$. The error terms sum to zero. Estimating the dependent factor and the population parameters: $\hat{Y}_j = a + b_1 X_{1j} + b_2 X_{2j}$.

7 Multiple linear regression – general equations. In general, a finite number (m) of independent variables may be used to estimate the hyperplane: $Y_j = \alpha + \sum_{i=1}^{m} \beta_i X_{ij} + \epsilon_j$. The number of sample points must be at least two more than the number of variables ($n \ge m + 2$).

8 Multiple linear regression – least sum of squares. The principle of the least sum of squares is usually used to perform the fit: minimize $\sum_{j=1}^{n} (Y_j - \hat{Y}_j)^2$.
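The least-squares fit of a hyperplane can be sketched as follows. This is an illustration only: it solves the normal equations with plain Gaussian elimination, which is one of several possible approaches and not necessarily what a statistics package does internally. The names `solve_linear_system` and `fit_multiple_linear` and the data are hypothetical.

```python
def solve_linear_system(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_multiple_linear(X, y):
    """Return [a, b1, ..., bm] minimizing the sum of squared residuals."""
    rows = [[1.0] + list(r) for r in X]  # column of ones for the intercept
    k = len(rows[0])
    # Normal equations: (X'X) b = X'y
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(XtX, Xty)


# Hypothetical noise-free data generated from Y = 1 + 2*X1 - 0.5*X2
X = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 3), (3, 1)]
y = [1 + 2 * x1 - 0.5 * x2 for x1, x2 in X]
coef = fit_multiple_linear(X, y)
print([round(c, 6) for c in coef])
```

The sketch breaks down when the independent variables are perfectly correlated, because the normal equations then have no unique solution.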

9 Multiple linear regression – An example

10 Multiple linear regression – The fitted equation

11 Multiple linear regression – Are any of the coefficients significant? F = regression MS / residual MS

12 Multiple linear regression – Is it a good fit? $R^2 = 1 - \text{residual SS} / \text{total SS}$ (equivalently, regression SS / total SS) expresses how much of the variation can be described by the model. When comparing models with different numbers of variables, the adjusted R-square should be used: $R_a^2 = 1 - \text{residual MS} / \text{total MS}$. The multiple correlation coefficient: $R = \sqrt{R^2}$. The standard error of the estimate $= \sqrt{\text{residual MS}}$.
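As an illustration, the fit statistics can be computed from observed and fitted values. The sketch uses R² = regression SS / total SS (equivalently 1 − residual SS / total SS) and adjusted R² = 1 − residual MS / total MS, with m regression DF and n − m − 1 residual DF. It assumes the fitted values come from a least-squares fit with an intercept, so that regression SS = total SS − residual SS. The function name and data are hypothetical.

```python
import math


def regression_summary(y, y_hat, m):
    """Fit statistics for a linear model with m independent variables.

    Assumes y_hat comes from a least-squares fit with an intercept.
    """
    n = len(y)
    mean_y = sum(y) / n
    total_ss = sum((yi - mean_y) ** 2 for yi in y)
    residual_ss = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    regression_ss = total_ss - residual_ss
    regression_ms = regression_ss / m          # regression DF = m
    residual_ms = residual_ss / (n - m - 1)    # residual DF = n - m - 1
    total_ms = total_ss / (n - 1)              # total DF = n - 1
    r2 = regression_ss / total_ss
    return {
        "F": regression_ms / residual_ms,
        "R2": r2,
        "R2_adj": 1 - residual_ms / total_ms,
        "R": math.sqrt(r2),
        "SE_estimate": math.sqrt(residual_ms),
    }


# Hypothetical example: y regressed on x = 1..5, fitted line 0.6 + 0.8x
y = [1.0, 3.0, 2.0, 5.0, 4.0]
y_hat = [0.6 + 0.8 * x for x in (1, 2, 3, 4, 5)]
stats = regression_summary(y, y_hat, m=1)
print(stats)
```

Adding a variable always lowers the residual SS, but it also costs one residual DF, so the adjusted value can fall when the new variable explains little.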

13 Multiple linear regression – Which of the coefficients are significant? $s_{b_i}$ is the standard error of the regression parameter $b_i$. A t-test tests whether $b_i$ is different from 0: $t = b_i / s_{b_i}$, with $\nu$ degrees of freedom, where $\nu$ is the residual DF. p values can be found in a table.

14 Multiple linear regression – Which of the coefficients are most important? The standardized regression coefficient, b’, is a normalized version of b: $b'_i = b_i \, s_{X_i} / s_Y$.
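A small sketch of the normalization, assuming the common definition b' = b · s_X / s_Y, where s_X and s_Y are the sample standard deviations of the independent and dependent variable. The slide's own formula is not preserved in the transcript, so this definition is an assumption. The function name and data are hypothetical.

```python
import statistics


def standardized_coefficient(b, x, y):
    """Return b' = b * s_X / s_Y (assumed definition, see the text above)."""
    return b * statistics.stdev(x) / statistics.stdev(y)


# Hypothetical example: the same predictor recorded in metres and in centimetres.
y = [2.0, 4.0, 6.0, 8.0]
x_m = [1.0, 2.0, 3.0, 4.0]
x_cm = [100.0 * v for v in x_m]
b_m = 2.0    # hypothetical coefficient per metre
b_cm = 0.02  # the same relation expressed per centimetre
print(standardized_coefficient(b_m, x_m, y))
print(standardized_coefficient(b_cm, x_cm, y))
```

The raw coefficients differ by a factor of 100, but the standardized values agree, which is why b' is better suited for comparing the importance of variables measured in different units.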

15 Multiple linear regression – multicollinearity. If two factors are well correlated, the estimated b’s become inaccurate. This is also called collinearity, intercorrelation, nonorthogonality, or ill-conditioning. Tolerance or variance inflation factors can be computed. Extreme correlation is called singularity, and one of the correlated variables must be removed.
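For the special case of two independent variables, tolerance and the variance inflation factor (VIF) can be sketched directly from their pairwise correlation r: tolerance = 1 − r² and VIF = 1 / tolerance. With more than two variables, r² would be replaced by the R² from regressing one variable on all the others, which this sketch does not cover. The function names and data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def tolerance_and_vif(x1, x2):
    """Return (tolerance, VIF) for a model with exactly two predictors."""
    r = pearson_r(x1, x2)
    tolerance = 1.0 - r ** 2
    if tolerance == 0.0:
        # Perfect correlation (singularity): one variable must be removed
        return 0.0, math.inf
    return tolerance, 1.0 / tolerance


# Hypothetical predictors: uncorrelated, then perfectly correlated
print(tolerance_and_vif([1, 2, 3, 4], [1, -1, -1, 1]))
print(tolerance_and_vif([1, 2, 3, 4], [2, 4, 6, 8]))
```

A tolerance near 1 (VIF near 1) indicates little collinearity, while a tolerance near 0 means the coefficient estimates are unstable.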

16 Multiple linear regression – Pairwise correlation coefficients

17 Multiple linear regression – Assumptions
The same as for simple linear regression:
1. The Y’s are randomly sampled.
2. The residuals are normally distributed.
3. The residuals have equal variance.
4. The X’s are fixed factors (their errors are small).
5. The X’s are not perfectly correlated.

18 Logistic regression

19 Logistic regression. What if the dependent variable is categorical, and especially binary? Use some interpolation method? Linear regression cannot help us.

20 The sigmoidal curve

21 The sigmoidal curve. The intercept basically just ‘scales’ the input variable.

22 The sigmoidal curve. The intercept basically just ‘scales’ the input variable. Large regression coefficient → risk factor strongly influences the probability.

23 The sigmoidal curve. The intercept basically just ‘scales’ the input variable. Large regression coefficient → risk factor strongly influences the probability. Positive regression coefficient → risk factor increases the probability. Logistic regression uses maximum likelihood estimation, not least squares estimation.
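The sigmoidal curve can be sketched with the standard logistic function p = 1 / (1 + e^(−z)), where z = intercept + coefficient · x. The function name `logistic` and the parameter values are hypothetical and only illustrate the statements about the size and sign of the regression coefficient.

```python
import math


def logistic(x, intercept, coefficient):
    """Probability from the logistic curve for a single input variable x."""
    z = intercept + coefficient * x
    return 1.0 / (1.0 + math.exp(-z))


# A larger coefficient makes the probability react more strongly to x
print(logistic(1.0, 0.0, 1.0), logistic(1.0, 0.0, 5.0))

# A positive coefficient increases the probability as x grows,
# a negative coefficient decreases it
print(logistic(1.0, 0.0, 1.0), logistic(2.0, 0.0, 1.0))
print(logistic(1.0, 0.0, -1.0), logistic(2.0, 0.0, -1.0))

# With a non-zero intercept the probability is 0.5 at x = -intercept / coefficient
print(logistic(2.0, -4.0, 2.0))
```

The output always lies between 0 and 1, which is what makes the curve suitable for a binary dependent variable.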

24 Does age influence the diagnosis? Continuous independent variable.
Variables in the Equation (Step 1 a)
Variable    B        S.E.    Wald      df   Sig.    Exp(B)   95% C.I. for Exp(B): Lower   Upper
Age          0.109   0.010   108.745   1    0.000   1.115    1.092                        1.138
Constant    -4.213   0.423    99.097   1    0.000   0.015
a. Variable(s) entered on step 1: Age.

25 Does previous intake of OCP influence the diagnosis? Categorical independent variable.
Variables in the Equation (Step 1 a)
Variable    B        S.E.    Wald    df   Sig.    Exp(B)   95% C.I. for Exp(B): Lower   Upper
OCP(1)      -0.311   0.180   2.979   1    0.084   0.733    0.515                        1.043
Constant     0.233   0.123   3.583   1    0.058   1.263
a. Variable(s) entered on step 1: OCP.

26 Odds ratio
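The Exp(B) column in the logistic regression tables is the odds ratio, and its confidence interval can be reproduced from B and S.E. The sketch below assumes a Wald-type 95% interval, exp(B ± 1.96 · S.E.), and uses the OCP(1) row from the table on slide 25. The function name is hypothetical.

```python
import math


def odds_ratio_with_ci(b, se, z_crit=1.96):
    """Return (odds ratio, lower, upper) assuming a Wald-type interval."""
    return (
        math.exp(b),
        math.exp(b - z_crit * se),
        math.exp(b + z_crit * se),
    )


# OCP(1) row: B = -0.311, S.E. = 0.180
odds_ratio, lower, upper = odds_ratio_with_ci(-0.311, 0.180)
print(round(odds_ratio, 3), round(lower, 3), round(upper, 3))
```

The results agree with the tabulated Exp(B) of 0.733 and interval 0.515 to 1.043 up to rounding. The interval contains 1, consistent with the non-significant Sig. value of 0.084 in that row.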

27 Multiple logistic regression
Variables in the Equation (Step 1 a)
Variable    B        S.E.    Wald      df   Sig.    Exp(B)   95% C.I. for Exp(B): Lower   Upper
Age          0.123   0.011   115.343   1    0.000   1.131    1.106                        1.157
BMI          0.083   0.019    18.732   1    0.000   1.087    1.046                        1.128
OCP          0.528   0.219     5.808   1    0.016   1.695    1.104                        2.603
Constant    -6.974   0.762    83.777   1    0.000   0.001
a. Variable(s) entered on step 1: Age, BMI, OCP.

28 Predicting the diagnosis by logistic regression. What is the probability that the tumor of a 50 year old woman who has been using OCP and has a BMI of 26 is malignant?
z = -6.974 + 0.123*50 + 0.083*26 + 0.528*1 = 1.862
p = 1/(1 + e^(-1.862)) = 0.8655
Variables in the Equation (Step 1 a)
Variable    B        S.E.    Wald      df   Sig.    Exp(B)   95% C.I. for Exp(B): Lower   Upper
Age          0.123   0.011   115.343   1    0.000   1.131    1.106                        1.157
BMI          0.083   0.019    18.732   1    0.000   1.087    1.046                        1.128
OCP          0.528   0.219     5.808   1    0.016   1.695    1.104                        2.603
Constant    -6.974   0.762    83.777   1    0.000   0.001
a. Variable(s) entered on step 1: Age, BMI, OCP.
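The prediction can be sketched in a few lines, taking the coefficients exactly as listed in the table (Age 0.123, BMI 0.083, OCP 0.528, Constant −6.974). The dictionary and function names are hypothetical. Coding OCP as 1 for previous use and 0 otherwise is an assumption based on the worked example.

```python
import math

# Coefficients as listed in the "Variables in the Equation" table
COEF = {"constant": -6.974, "age": 0.123, "bmi": 0.083, "ocp": 0.528}


def predict_malignancy(age, bmi, ocp):
    """Return (z, p): the linear predictor and the predicted probability.

    ocp is assumed to be coded 1 for previous OCP use and 0 otherwise.
    """
    z = (
        COEF["constant"]
        + COEF["age"] * age
        + COEF["bmi"] * bmi
        + COEF["ocp"] * ocp
    )
    p = 1.0 / (1.0 + math.exp(-z))
    return z, p


# 50-year-old woman, BMI 26, previous OCP use
z, p = predict_malignancy(age=50, bmi=26, ocp=1)
print(round(z, 3), round(p, 4))
```

With these tabulated coefficients, z is 1.862 and the predicted probability is about 0.87.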

29 Exercises 20.1, 20.2

