Logistic Regression In logistic regression the outcome variable is binary, and the purpose of the analysis is to assess the effects of multiple explanatory.

Slides:



Advertisements
Similar presentations
© Department of Statistics 2012 STATS 330 Lecture 32: Slide 1 Stats 330: Lecture 32.
Advertisements

732G21/732G28/732A35 Lecture computer programmers with different experience have performed a test. For each programmer we have recorded whether.
Logistic Regression.
Simple Logistic Regression
Trashball: A Logistic Regression Classroom Activity Christopher Morrell (Joint work with Richard Auer) Mathematics and Statistics Department Loyola University.
Logistic Regression Example: Horseshoe Crab Data
Logistic Regression Multivariate Analysis. What is a log and an exponent? Log is the power to which a base of 10 must be raised to produce a given number.
© 2010 Pearson Prentice Hall. All rights reserved Least Squares Regression Models.
REGRESSION Want to predict one variable (say Y) using the other variable (say X) GOAL: Set up an equation connecting X and Y. Linear regression linear.
Sociology 601 Class 28: December 8, 2009 Homework 10 Review –polynomials –interaction effects Logistic regressions –log odds as outcome –compared to linear.
Introduction to Logistic Regression. Simple linear regression Table 1 Age and systolic blood pressure (SBP) among 33 adult women.
Log-linear and logistic models Generalised linear model ANOVA revisited Log-linear model: Poisson distribution logistic model: Binomial distribution Deviances.
Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc. Chap 14-1 Chapter 14 Introduction to Multiple Regression Basic Business Statistics 11 th Edition.
Log-linear and logistic models
EPI 809/Spring Multiple Logistic Regression.
Nemours Biomedical Research Statistics April 23, 2009 Tim Bunnell, Ph.D. & Jobayer Hossain, Ph.D. Nemours Bioinformatics Core Facility.
Notes on Logistic Regression STAT 4330/8330. Introduction Previously, you learned about odds ratios (OR’s). We now transition and begin discussion of.
An Introduction to Logistic Regression
Generalized Linear Models
Logistic regression for binary response variables.
Logistic Regression Logistic Regression - Dichotomous Response variable and numeric and/or categorical explanatory variable(s) –Goal: Model the probability.
Regression and Correlation
7.1 - Motivation Motivation Correlation / Simple Linear Regression Correlation / Simple Linear Regression Extensions of Simple.
Chapter 3: Generalized Linear Models 3.1 The Generalization 3.2 Logistic Regression Revisited 3.3 Poisson Regression 1.
April 6 Logistic Regression –Estimating probability based on logistic model –Testing differences among multiple groups –Assumptions for model.
Session 10. Applied Regression -- Prof. Juran2 Outline Binary Logistic Regression Why? –Theoretical and practical difficulties in using regular (continuous)
University of Warwick, Department of Sociology, 2014/15 SO 201: SSAASS (Surveys and Statistics) (Richard Lampard) Week 7 Logistic Regression I.
AN INTRODUCTION TO LOGISTIC REGRESSION ENI SUMARMININGSIH, SSI, MM PROGRAM STUDI STATISTIKA JURUSAN MATEMATIKA UNIVERSITAS BRAWIJAYA.
LOGISTIC REGRESSION A statistical procedure to relate the probability of an event to explanatory variables Used in epidemiology to describe and evaluate.
Logistic (regression) single and multiple. Overview  Defined: A model for predicting one variable from other variable(s).  Variables:IV(s) is continuous/categorical,
Chap 14-1 Copyright ©2012 Pearson Education, Inc. publishing as Prentice Hall Chap 14-1 Chapter 14 Introduction to Multiple Regression Basic Business Statistics.
When and why to use Logistic Regression?  The response variable has to be binary or ordinal.  Predictors can be continuous, discrete, or combinations.
April 4 Logistic Regression –Lee Chapter 9 –Cody and Smith 9:F.
Week 5: Logistic regression analysis Overview Questions from last week What is logistic regression analysis? The mathematical model Interpreting the β.
Assessing Binary Outcomes: Logistic Regression Peter T. Donnan Professor of Epidemiology and Biostatistics Statistics for Health Research.
MBP1010 – Lecture 8: March 1, Odds Ratio/Relative Risk Logistic Regression Survival Analysis Reading: papers on OR and survival analysis (Resources)
Multiple Logistic Regression STAT E-150 Statistical Methods.
Logistic Regression. Linear regression – numerical response Logistic regression – binary categorical response eg. has the disease, or unaffected by the.
Heart Disease Example Male residents age Two models examined A) independence 1)logit(╥) = α B) linear logit 1)logit(╥) = α + βx¡
LOGISTIC REGRESSION Binary dependent variable (pass-fail) Odds ratio: p/(1-p) eg. 1/9 means 1 time in 10 pass, 9 times fail Log-odds ratio: y = ln[p/(1-p)]
Logistic regression (when you have a binary response variable)
1 Say good things, think good thoughts, and do good deeds.
Introduction to Multiple Regression Lecture 11. The Multiple Regression Model Idea: Examine the linear relationship between 1 dependent (Y) & 2 or more.
Chapter 9 Minitab Recipe Cards. Contingency tests Enter the data from Example 9.1 in C1, C2 and C3.
Logistic Regression and Odds Ratios Psych DeShon.
Nonparametric Statistics
Week 7: General linear models Overview Questions from last week What are general linear models? Discussion of the 3 articles.
Logistic Regression Logistic Regression - Binary Response variable and numeric and/or categorical explanatory variable(s) –Goal: Model the probability.
McGraw-Hill/Irwin © 2003 The McGraw-Hill Companies, Inc.,All Rights Reserved. Part Four ANALYSIS AND PRESENTATION OF DATA.
LOGISTIC REGRESSION. Purpose  Logistical regression is regularly used when there are only two categories of the dependent variable and there is a mixture.
Copyright © Leland Stanford Junior University. All rights reserved. Warning: This presentation is protected by copyright law and international.
Nonparametric Statistics
Chapter 4: Basic Estimation Techniques
BINARY LOGISTIC REGRESSION
Chapter 4 Basic Estimation Techniques
Logistic Regression When and why do we use logistic regression?
Chapter 14 Inference on the Least-Squares Regression Model and Multiple Regression.
Logistic Regression APKC – STATS AFAC (2016).
CHAPTER 7 Linear Correlation & Regression Methods
Notes on Logistic Regression
Chapter 13 Nonlinear and Multiple Regression
Least Square Regression
Generalized Linear Models
Multiple logistic regression
Stats Club Marnie Brennan
Nonparametric Statistics
Topic 10 - Categorical Outcomes
Logistic Regression.
Presentation transcript:

Logistic Regression In logistic regression the outcome variable is binary, and the purpose of the analysis is to assess the effects of multiple explanatory variables, which can be numeric and/or categorical, on the outcome variable.

Requirements for Logistic Regression The Following need to be specified: 1) An outcome variable with two possible categorical outcomes (1=success; 0=failure). 2) A way to estimate the probability P of the outcome variable. 3) A way of linking the outcome variable to the explanatory variables. 4) A way of estimating the coefficients of the regression equation, as well as their confidence intervals. 5) A way to test the goodness of fit of the regression model.

Measuring the Probability of Outcome The probability of the outcome is measured by the odds of occurrence of an event. If P is the probability of an event, then (1-P) is the probability of it not occurring. Odds of success = P / 1-P

The Logistic Regression The joint effects of all explanatory variables put together on the odds is The joint effects of all explanatory variables put together on the odds is Odds = P/1-P = e α + β1X1 + β2X2 + …+βpXp Taking the logarithms of both sides Log{P/1-P} = log α+β1X1+β2X2+…+βpXp Logit P = α+β1X1+β2X2+..+βpXp The coefficients β1, β2, βp are such that the sums of the squared distance between the observed and predicted values (i.e. regression line) are smallest.

The Logistic Regression Logit p = α + β1X1 +β2X βpXp Logit p = α + β1X1 +β2X βpXp α represents the overall disease risk β1 represents the fraction by which the disease risk is altered by a unit change in X1 β2 is the fraction by which the disease risk is altered by a unit change in X2 β2 is the fraction by which the disease risk is altered by a unit change in X2 ……. and so on. ……. and so on. What changes is the log odds. The odds themselves are changed by e β If β = 1.6 the odds are e 1.6 = 4.95

Analysis in Logistic Regression - 1 The study to be analysed is about the use of radioisotope thallium while the subject is made to exercise. 100 subjects underwent both thallium exercise and cardiac catheterisation. Some were on propranol. Change in heart rate if more than 85% of maximum, E.C.G. and occurrence of pain during exercise were recorded. The study to be analysed is about the use of radioisotope thallium while the subject is made to exercise. 100 subjects underwent both thallium exercise and cardiac catheterisation. Some were on propranol. Change in heart rate if more than 85% of maximum, E.C.G. and occurrence of pain during exercise were recorded.

Interpreting the Computer Printout Logistic Regression Table Odds 95% CI Predictor Coef SE Coef Z P Ratio Lower Upper Constant Stn Propran HrtRate IscExr Sex PnExr Log-Likelihood = Test that all slopes are zero: G = , DF = 6, P-Value = Goodness-of-Fit Tests Method Chi-Square DF P Pearson Deviance Hosmer-Lemeshow

Interpreting the Computer Printout - 2 Table of Observed and Expected Frequencies: (See Hosmer-Lemeshow Test for the Pearson Chi-Square Statistic) Group Value Total 1 Obs Exp Obs Exp Pairs Number Percent Summary Measures Concordant % Somers' D 0.51 Discordant % Goodman-Kruskal Gamma 0.52 Ties % Kendall's Tau-a 0.23 Total %

Regression Diagnostics In logistic regression Residual = 1− Estimated probability. Residuals for each subject are calculated standardised and plotted against probability. Eight diagnostic plots are available, four dealing with residuals and four with leverage. In logistic regression Residual = 1− Estimated probability. Residuals for each subject are calculated standardised and plotted against probability. Eight diagnostic plots are available, four dealing with residuals and four with leverage. These plots are demonstrated in the slides that follow. These plots are demonstrated in the slides that follow.

Diagnostic plots for residuals

Diagnostic plots for leverage