Logistic Regression Part One

Logistic Regression Part One
STA2101/442 F 2012
See last slide for copyright information

Logistic Regression
For a binary dependent variable: 1=Yes, 0=No
Data analysis text has a lot of this stuff.
\begin{displaymath}
Pr\{Y=1|\mathbf{X}=\mathbf{x}\} = \pi
\end{displaymath}

Least Squares vs. Logistic Regression

Linear regression model for the log odds of the event Y=1:
\begin{equation*}
\log\left(\frac{\pi}{1-\pi} \right) = \beta_0 + \beta_1 x_1 + \ldots + \beta_{p-1} x_{p-1}
\end{equation*}
$\beta_0$ is the intercept. $\beta_k$ is the increase in the log odds of $Y=1$ when $x_k$ is increased by one unit and all other independent variables are held constant. Note that $\pi$ is a \emph{conditional} probability.

Equivalent Statements
\begin{equation*}
\log\left(\frac{\pi}{1-\pi} \right) = \beta_0 + \beta_1 x_1 + \ldots + \beta_{p-1} x_{p-1}
\end{equation*}
\begin{eqnarray*}
\frac{\pi}{1-\pi} & = & e^{\beta_0 + \beta_1 x_1 + \ldots + \beta_{p-1} x_{p-1}} \\
 & = & e^{\beta_0} e^{\beta_1 x_1} \cdots e^{\beta_{p-1} x_{p-1}}
\end{eqnarray*}
\begin{displaymath}
\pi = \frac{e^{\beta_0 + \beta_1 x_1 + \ldots + \beta_{p-1} x_{p-1}}}
           {1+e^{\beta_0 + \beta_1 x_1 + \ldots + \beta_{p-1} x_{p-1}}}
\end{displaymath}
Odds \emph{ratios} will yield nice cancellation in the product form.
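
As a quick numeric illustration of the equivalence (my own sketch, not from the slides; the coefficient and predictor values are invented), the three forms can be checked directly in Python:

import numpy as np

beta = np.array([-2.0, 0.8])      # hypothetical beta_0 and beta_1
x = np.array([1.0, 3.0])          # leading 1 for the intercept, then x_1
log_odds = x @ beta               # beta_0 + beta_1 x_1
odds = np.exp(log_odds)           # pi / (1 - pi)
pi = odds / (1 + odds)            # e^eta / (1 + e^eta)
print(log_odds, odds, pi)         # np.log(pi / (1 - pi)) recovers log_odds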

A distinctly non-linear function: $\pi$ is non-linear in the betas, so logistic regression is an example of non-linear regression.

Could use any cumulative distribution function. The CDF of the standard normal used to be popular; that choice is called probit analysis, and it can be closely approximated with a logistic regression.
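
A small numeric check of that last claim (my own sketch, not from the slides): the logistic curve is close to the standard normal CDF with its argument rescaled by a constant of roughly 1.7.

import numpy as np
from scipy.stats import norm

x = np.linspace(-4, 4, 801)
logistic_cdf = 1 / (1 + np.exp(-x))
probit_approx = norm.cdf(x / 1.7)                   # rescaled standard normal CDF
print(np.abs(logistic_cdf - probit_approx).max())   # maximum gap is roughly 0.01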

In terms of log odds, logistic regression is like regular regression: $\beta_0$ is the intercept, $\beta_k$ is …

In terms of plain odds, exponential functions of the logistic regression coefficients are odds ratios. For example, “Among 50 year old men, the odds of being dead before age 60 are three times as great for smokers.” Remember, alpha is an odds ratio.
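
A made-up numeric version of that statement: suppose $e^{\beta_1} = 3$ and the non-smokers' odds of death are 1 to 4. Then
\begin{displaymath}
\mbox{odds}_{\mbox{\scriptsize smoker}} = 3 \times \frac{1}{4} = \frac{3}{4},
\quad \mbox{so} \quad
P(\mbox{dead}\,|\,\mbox{smoker}) = \frac{3/4}{1+3/4} = \frac{3}{7} \approx 0.43
\quad \mbox{versus} \quad
P(\mbox{dead}\,|\,\mbox{non-smoker}) = \frac{1/4}{1+1/4} = 0.20.
\end{displaymath}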

Logistic regression with one binary explanatory variable: X=1 means smoker, X=0 means non-smoker; Y=1 means dead, Y=0 means alive.
Log odds of death $= \beta_0 + \beta_1 x$
Odds of death $= e^{\beta_0 + \beta_1 x} = e^{\beta_0} e^{\beta_1 x}$

Cancer Therapy Example: $x$ = disease severity.

For any given disease severity $x$: what if $\beta_1 > 0$? Then chemo alone is better. Reasonable?

In general, when $x_k$ is increased by one unit and all other independent variables are held constant, the odds of $Y=1$ are multiplied by $e^{\beta_k}$. That is, $e^{\beta_k}$ is an odds ratio --- the ratio of the odds of $Y=1$ when $x_k$ is increased by one unit to the odds of $Y=1$ when everything is left alone. As in ordinary regression, we speak of “controlling” for the other variables.
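
To see where the odds ratio comes from (a worked version of the nice cancellation noted on the Equivalent Statements slide), write the odds in product form and divide:
\begin{displaymath}
\frac{\mbox{odds of } Y=1 \mbox{ when } x_k \mbox{ is increased by one}}{\mbox{odds of } Y=1 \mbox{ when everything is left alone}}
= \frac{e^{\beta_0} e^{\beta_1 x_1} \cdots e^{\beta_k (x_k+1)} \cdots e^{\beta_{p-1} x_{p-1}}}
       {e^{\beta_0} e^{\beta_1 x_1} \cdots e^{\beta_k x_k} \cdots e^{\beta_{p-1} x_{p-1}}}
= e^{\beta_k}.
\end{displaymath}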

The conditional probability of Y=1:
\begin{displaymath}
\pi = \frac{e^{\beta_0 + \beta_1 x_1 + \ldots + \beta_{p-1} x_{p-1}}}
           {1+e^{\beta_0 + \beta_1 x_1 + \ldots + \beta_{p-1} x_{p-1}}}
\end{displaymath}
This is a non-linear model for the probability. The formula can be used to calculate a predicted $P(Y=1|\mathbf{x})$; just replace the betas by their estimates. It can also be used to calculate the probability of getting the sample data values we actually did observe, as a function of the betas.
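
A minimal sketch of such a prediction (the estimated coefficients and the new x values below are invented for illustration):

import numpy as np

def predicted_prob(x_new, beta_hat):
    # Estimated P(Y=1|x): plug the estimated betas into the nonlinear formula
    eta = beta_hat[0] + np.dot(beta_hat[1:], x_new)       # estimated log odds
    return np.exp(eta) / (1 + np.exp(eta))

beta_hat = np.array([-1.2, 0.5, 0.03])                    # hypothetical estimates: intercept, x1, x2
print(predicted_prob(np.array([1.0, 40.0]), beta_hat))    # about 0.62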

Likelihood Function. And that's about it.
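
The formula itself is not reproduced in this transcript; the standard likelihood for $n$ independent binary observations is
\begin{displaymath}
L(\beta_0, \ldots, \beta_{p-1}) = \prod_{i=1}^n \pi_i^{y_i} (1-\pi_i)^{1-y_i},
\quad \mbox{where} \quad
\pi_i = \frac{e^{\beta_0 + \beta_1 x_{i1} + \ldots + \beta_{p-1} x_{i,p-1}}}
             {1+e^{\beta_0 + \beta_1 x_{i1} + \ldots + \beta_{p-1} x_{i,p-1}}}.
\end{displaymath}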

Maximum likelihood estimation:
Likelihood = conditional probability of getting the data values we did observe, as a function of the betas.
Maximize the (log) likelihood with respect to the betas. The maximization is done numerically (“iteratively re-weighted least squares”); no details about the numerical method here.
Likelihood ratio and Wald tests as usual.
Divide regression coefficients by their estimated standard errors to get Z-tests of $H_0: \beta_j=0$. These Z-tests are like the t-tests in ordinary regression.
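
A minimal sketch of the numerical maximization (Fisher scoring, the iteratively re-weighted least squares idea) on simulated data; the data and the “true” coefficient values here are all invented for illustration:

import numpy as np

rng = np.random.default_rng(0)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])    # column of ones, then one predictor
beta_true = np.array([-1.0, 2.0])                        # invented true values
y = rng.binomial(1, 1 / (1 + np.exp(-X @ beta_true)))    # binary responses

beta = np.zeros(2)
for _ in range(25):                                      # Newton-Raphson / IRLS iterations
    p = 1 / (1 + np.exp(-X @ beta))                      # current fitted probabilities
    W = p * (1 - p)                                      # weights
    XtWX = X.T @ (W[:, None] * X)                        # Fisher information
    step = np.linalg.solve(XtWX, X.T @ (y - p))          # Newton step
    beta += step
    if np.max(np.abs(step)) < 1e-10:
        break

se = np.sqrt(np.diag(np.linalg.inv(XtWX)))               # estimated standard errors
print("estimates:", beta)
print("Wald Z for H0: beta_j = 0:", beta / se)

The Z statistics printed at the end are just the estimated coefficients divided by their standard errors, as described above.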

Copyright Information. This slide show was prepared by Jerry Brunner, Department of Statistics, University of Toronto. It is licensed under a Creative Commons Attribution - ShareAlike 3.0 Unported License. Use any part of it as you like and share the result freely. These Powerpoint slides will be available from the course website: http://www.utstat.toronto.edu/brunner/oldclass/appliedf12