Logistic Regression For a binary response variable: 1=Yes, 0=No This slide show is a free open source document. See the last slide for copyright information.

Binary outcomes are common and important. The patient survives the operation, or does not. The accused is convicted, or is not. The customer makes a purchase, or does not. The marriage lasts at least five years, or does not. The student graduates, or does not.

For a binary variable, the population mean E[Y] is the probability that Y=1. Make the mean depend on a set of explanatory variables. Consider one explanatory variable, and think of a scatterplot.
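Why the mean equals the probability is one line of algebra (a standard fact about 0/1 variables, added here for completeness):

E[Y] = 1 \cdot P(Y=1) + 0 \cdot P(Y=0) = P(Y=1)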

Least Squares vs. Logistic Regression
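The figure from this slide is not reproduced in the transcript. A minimal sketch of the comparison it makes, assuming the numpy, statsmodels, and matplotlib packages (the simulated data and variable names are illustrative only, not from the slides):

```python
import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt

# Simulated 0/1 data: the probability of Y=1 increases smoothly with x
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(-4, 4, 200))
p = 1 / (1 + np.exp(-(0.5 + 1.2 * x)))
y = rng.binomial(1, p)

X = sm.add_constant(x)
ols_fit = sm.OLS(y, X).fit()      # least squares: a straight line that can leave [0, 1]
logit_fit = sm.Logit(y, X).fit()  # logistic regression: an S-shaped curve inside [0, 1]

plt.scatter(x, y, s=10, alpha=0.4, label="observed 0/1 outcomes")
plt.plot(x, ols_fit.predict(X), label="least squares line")
plt.plot(x, logit_fit.predict(X), label="logistic regression curve")
plt.xlabel("x")
plt.ylabel("estimated P(Y=1)")
plt.legend()
plt.show()
```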

The logistic regression curve arises from an indirect representation of the probability of Y=1 for a given set of x values: the probability of an event is represented by its odds.
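Writing π for P(Y=1), the odds of the event (reconstructed in standard notation, since the slide's formula is not in the transcript) are:

\text{odds} = \frac{\pi}{1 - \pi}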

If P(Y=1) = 1/2, odds = .5/(1-.5) = 1 (to 1)
If P(Y=1) = 2/3, odds = (2/3)/(1/3) = 2 (to 1)
If P(Y=1) = 3/5, odds = (3/5)/(2/5) = 1.5 (to 1)
If P(Y=1) = 1/5, odds = (1/5)/(4/5) = .25 (to 1)

The higher the probability, the greater the odds

Linear model for the log odds. Natural log, not base 10; symbolized ln.

Some facts about ln: The higher the probability, the higher the log odds. ln(e) = 1, where e ≈ 2.71828… The log is only defined for positive numbers, so logistic regression will not work for events of probability exactly zero or exactly one (why not one?)
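Spelling out the boundary cases (added here, since the slide leaves "why not one?" as a question):

If P(Y=1) = 0, the odds are 0/(1-0) = 0, and ln(0) is undefined.
If P(Y=1) = 1, the odds are 1/(1-1) = 1/0, which is already undefined before the log is even taken.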

The log of a product is the sum of logs. This means the log of an odds ratio is the difference between the two log odds quantities.
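In symbols (standard identities, added for reference):

\ln(ab) = \ln a + \ln b, \qquad \ln(a/b) = \ln a - \ln b

\ln\left( \frac{\text{odds}_1}{\text{odds}_2} \right) = \ln(\text{odds}_1) - \ln(\text{odds}_2)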

Linear regression model for the log odds of the event Y=1
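The equation on this slide is not in the transcript; its standard form (a reconstruction, so the notation may differ slightly from the original slide) is:

\ln\left( \frac{\pi}{1-\pi} \right) = \beta_0 + \beta_1 x_1 + \cdots + \beta_{p-1} x_{p-1}, \qquad \pi = P(Y=1 \mid x_1, \ldots, x_{p-1})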

Equivalent Statements
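For a single explanatory variable, the equivalent forms (again reconstructed in standard notation) are:

\ln\left( \frac{\pi}{1-\pi} \right) = \beta_0 + \beta_1 x

\frac{\pi}{1-\pi} = e^{\beta_0 + \beta_1 x}

\pi = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}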

In terms of log odds, logistic regression is like regular regression

In terms of plain odds, logistic regression coefficients represent odds ratios. For example, “Among 50 year old men, the odds of being dead before age 60 are three times as great for smokers.”

Logistic regression with X=1 meaning smoker, X=0 meaning non-smoker, and Y=1 meaning dead, Y=0 meaning alive:
Log odds of death = β0 + β1X
Odds of death = e^(β0 + β1X)
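Working out the odds ratio in this example (added; it follows directly from the two lines above):

\frac{\text{odds of death for smokers}}{\text{odds of death for non-smokers}} = \frac{e^{\beta_0 + \beta_1}}{e^{\beta_0}} = e^{\beta_1}

so "three times as great for smokers" corresponds to e^β1 = 3.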

Exponential function f(t) = e^t. Always positive. e^0 = 1, so an exponent of zero gives odds of one (50-50). f(t) = e^t is increasing.

One more example

For any given disease severity x,

In general, when x_k is increased by one unit and all other explanatory variables are held constant, the odds of Y=1 are multiplied by e^βk. That is, e^βk is an odds ratio: the ratio of the odds of Y=1 when x_k is increased by one unit, to the odds of Y=1 when everything is left alone. As in ordinary regression, we speak of “controlling” for the other variables.
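The one-line calculation behind this statement, written out (standard algebra for the model above):

\frac{\text{odds}(x_k + 1)}{\text{odds}(x_k)} = \frac{e^{\beta_0 + \cdots + \beta_k (x_k + 1) + \cdots}}{e^{\beta_0 + \cdots + \beta_k x_k + \cdots}} = e^{\beta_k}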

The conditional probability of Y=1 is
P(Y=1 | x1, …, xp-1) = e^(β0 + β1x1 + … + βp-1xp-1) / (1 + e^(β0 + β1x1 + … + βp-1xp-1))
This formula can be used to calculate an estimated P(Y=1): just replace the betas by their estimates (the b's). It can also be used to calculate the probability of getting the sample data values we actually did observe.

Maximum likelihood estimation. Likelihood = probability of getting the data values we did observe. Viewed as a function of the parameters (betas), it's called the “likelihood function.” Those parameter values for which the likelihood function is greatest are called the maximum likelihood estimates. Thank you again, Mr. Fisher.

Likelihood Function for Simple Logistic Regression
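The likelihood formula itself is not in the transcript; for independent observations (y1, x1), …, (yn, xn) the standard form (a reconstruction) is:

L(\beta_0, \beta_1) = \prod_{i=1}^{n} \pi_i^{\,y_i} (1 - \pi_i)^{1 - y_i}, \qquad \pi_i = \frac{e^{\beta_0 + \beta_1 x_i}}{1 + e^{\beta_0 + \beta_1 x_i}}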

Maximum likelihood estimates must be found numerically. They lead to nice large-sample chi-square tests. We will mostly use Wald tests.
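A minimal computational sketch of numeric maximum likelihood estimation and Wald tests, assuming the numpy and statsmodels packages (the simulated data and variable names are illustrative, not from the slides):

```python
import numpy as np
import statsmodels.api as sm

# Illustrative data: x is one explanatory variable, y is a 0/1 response
rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = rng.binomial(1, 1 / (1 + np.exp(-(0.3 + 0.8 * x))))

X = sm.add_constant(x)        # adds the intercept column
fit = sm.Logit(y, X).fit()    # maximizes the likelihood numerically (iterative)

print(fit.params)             # maximum likelihood estimates b0, b1
print(np.exp(fit.params))     # estimated odds ratios exp(b0), exp(b1)
print(fit.summary())          # the table includes Wald z-tests for each coefficient
```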

Copyright Information This slide show was prepared by Jerry Brunner, Department of Statistical Sciences, University of Toronto. It is licensed under a Creative Commons Attribution - ShareAlike 3.0 Unported License. Use any part of it as you like and share the result freely. These PowerPoint slides are available from the course website: