1 Logistic Regression

2 Theory: Linear regression models a numerical response. Logistic regression models a binary categorical response, e.g. has the disease, or is unaffected by the disease. We are interested in finding the attributes that are associated with the onset of the disease, or in predicting the probability of getting the disease, given a set of attributes.

3 Theory: Logistic regression fits the model log(p / (1 − p)) = β0 + β1 x1 + ... + βk xk, where p is the probability of the response (e.g. getting the disease) and x1, ..., xk are the attributes. This is effectively a linear model for the log odds.
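
The link between a linear model for the log odds and a probability can be sketched in a few lines of Python. This is only an illustration: the function names `logit`, `inv_logit` and `predicted_probability` are chosen here and do not come from the slides.

```python
import math


def logit(p):
    """Log odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inv_logit(eta):
    """Probability corresponding to a log-odds value eta."""
    return 1.0 / (1.0 + math.exp(-eta))


def predicted_probability(intercept, coefs, xs):
    """Evaluate the linear predictor, then map it back to a probability."""
    eta = intercept + sum(b * x for b, x in zip(coefs, xs))
    return inv_logit(eta)
```

Because `inv_logit` undoes `logit`, any value of the linear predictor, however large or small, maps to a probability strictly between 0 and 1.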

4 Lung Cancer: An age effect? Associated with smoking?

5 Logistic Regression: Assess whether a variable is significantly associated with the response. Quantify the association in terms of the odds ratio.

6 Logistic Regression: Consider the equation log(p / (1 − p)) = 0.03 + 1.4 Smoke + 0.02 Gender + 0.01 (Age − 20), where p is the probability of getting lung cancer. The baseline is a non-smoking female of age 20, which implies Smoke = 0 for a non-smoker and Gender = 0 for a female. To interpret the effect of each variable, keep everything else constant.
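
As a rough sketch, the example equation can be evaluated for any individual and converted to a probability. The 0/1 encodings (Smoke = 1 for a smoker, Gender = 1 for a male) are assumptions inferred from the stated baseline, and the function names are hypothetical.

```python
import math


def lung_cancer_log_odds(smoke, male, age):
    """Linear predictor of the example model.

    smoke: 1 for a smoker, 0 otherwise (assumed encoding)
    male:  1 for a male, 0 for a female (assumed encoding)
    age:   age in years; the model is centred at age 20
    """
    return 0.03 + 1.4 * smoke + 0.02 * male + 0.01 * (age - 20)


def lung_cancer_probability(smoke, male, age):
    """Convert the log odds to a probability with the inverse logit."""
    eta = lung_cancer_log_odds(smoke, male, age)
    return 1.0 / (1.0 + math.exp(-eta))


if __name__ == "__main__":
    # Baseline individual: non-smoking female of age 20.
    print(lung_cancer_log_odds(0, 0, 20))
    print(lung_cancer_probability(0, 0, 20))
```

At the baseline every term except the intercept vanishes, so the intercept 0.03 is the baseline log odds.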

7 log(p / (1 − p)) = 0.03 + 1.4 Smoke + 0.02 Gender + 0.01 (Age − 20)
Relative to a non-smoking female of age 20, the odds of getting lung cancer are:
exp(0.02) = 1.02 times as high for a non-smoking male of age 20;
exp(1.4) = 4.06 times as high for a smoking female of age 20;
exp(30 × 0.01) = 1.35 times as high for a non-smoking female of age 50.
Combining the effects: for a smoking male of age 50, the odds are exp(1.4 + 0.02 + 0.01 × 30) = 5.58 times as high as for a non-smoking female of age 20.
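
The odds ratios on this slide can be checked with a short Python sketch. The helper `odds_ratio` is a hypothetical name; it exponentiates the change in the linear predictor relative to the baseline, using the coefficients from the example equation.

```python
import math

# Coefficients from the example equation on the slide.
B_SMOKE, B_GENDER, B_AGE = 1.4, 0.02, 0.01


def odds_ratio(d_smoke=0, d_male=0, d_age=0):
    """Odds ratio versus the baseline for the given changes in each variable."""
    return math.exp(B_SMOKE * d_smoke + B_GENDER * d_male + B_AGE * d_age)


if __name__ == "__main__":
    print(round(odds_ratio(d_male=1), 2))    # non-smoking male, age 20
    print(round(odds_ratio(d_smoke=1), 2))   # smoking female, age 20
    print(round(odds_ratio(d_age=30), 2))    # non-smoking female, age 50
    print(round(odds_ratio(d_smoke=1, d_male=1, d_age=30), 2))  # smoking male, age 50
```

Because the model is additive on the log-odds scale, the combined odds ratio equals the product of the individual odds ratios.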

13 Note the encodings!

15 Interpret based on encodings

21 Summary: There is a large suite of statistical tools for analysing data. It is important to choose the appropriate tools for the kind of data available. Most statistical tests require particular assumptions to be valid, so these assumptions need to be checked.

