Logistic Regression
Saed Sayad
www.ismartsoft.com



Definition
Logistic Regression is a type of regression model where the dependent variable (target) has just two values, such as 0/1, Y/N, or F/T.

Sample Dataset
[Table: columns Months in Business, Balance, Default; the row values are truncated in this transcript.]

Linear Regression (Continuous Dependent Variable)
[Plot: Months in Business and Balance]

Linear Regression (Binary Dependent Variable)
[Plot: Default and Months in Business]

Linear Regression Model – Binary Target
If the actual Y is a binary variable, then the predicted Y can be less than zero or greater than 1.
If the actual Y is a binary variable, then the error is not normally distributed.
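The first point can be checked with a small sketch. The data below is hypothetical (it is not the deck's sample dataset), and `fit_line` and `predict` are illustrative names; the fit is plain closed-form least squares.

```python
# Hypothetical toy data: months in business (x) and default flag (y, 1 = default).
months = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
default = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x (closed form)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

a, b = fit_line(months, default)

def predict(x):
    """Predicted Y from the fitted straight line."""
    return a + b * x

# The fitted line leaves the [0, 1] range at both ends of the data.
print(predict(10), predict(100))
```

With this toy data the prediction at 10 months is above 1 and the prediction at 100 months is below 0, so the output cannot be read as a probability.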

Linear Regression Model
[Plot: linear fit of Y against X, with Y marked at 0 and 1]

Frequency Table
[Table: columns Months in Business, Count, Default Count, Default Frequency; the row values are truncated in this transcript.]

Frequency Plot
[Plot: Default Probability by Months in Business bins]
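A frequency table of this shape can be built by binning Months in Business and counting defaults per bin. The sketch below is a minimal illustration: the data, the bin width, and the function name `default_frequency_by_bin` are all assumptions, not values from the deck.

```python
from collections import defaultdict

def default_frequency_by_bin(months, default, bin_width):
    """Return {bin_start: (count, default_count, default_frequency)}."""
    counts = defaultdict(int)
    defaults = defaultdict(int)
    for m, d in zip(months, default):
        bin_start = (m // bin_width) * bin_width  # lower edge of the bin
        counts[bin_start] += 1
        defaults[bin_start] += d
    return {
        b: (counts[b], defaults[b], defaults[b] / counts[b])
        for b in sorted(counts)
    }

# Hypothetical data, for illustration only.
months = [5, 12, 25, 31, 48, 52, 67, 75, 88, 95]
default = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
table = default_frequency_by_bin(months, default, 50)
for bin_start, row in table.items():
    print(bin_start, row)
```

The default frequency per bin is an empirical estimate of the default probability for that range of months.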

Logistic Function
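The slide's formula did not survive in this transcript. As a minimal sketch, the standard logistic function 1 / (1 + e^(-z)) is shown below, applied to a linear predictor b0 + b1*x. The names `logistic`, `predict_probability`, `b0`, and `b1` are illustrative assumptions.

```python
import math

def logistic(z):
    """Map any real number z to a value between 0 and 1."""
    # Use the algebraically equivalent form for negative z so that
    # math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_probability(x, b0, b1):
    """Probability of the target being 1, given one predictor x."""
    return logistic(b0 + b1 * x)

print(logistic(0))                         # 0.5
print(predict_probability(30, 2.0, -0.05))
```

The function is symmetric around z = 0, where it returns 0.5, and it flattens toward 0 and 1 at the extremes.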

Logistic Regression
- The logistic distribution constrains the estimated probabilities to lie between 0 and 1.
- Maximum Likelihood Estimation is a statistical method for estimating the coefficients of a model.

Logistic Regression Model
[Plot: Linear Model and Logistic Model curves of Y against X, with Y marked at 0 and 1]

Maximum Likelihood Estimation (MLE)
MLE maximizes the log likelihood (LL), which reflects how likely the observed values of the dependent variable are, given the independent variables and the model coefficients.
MLE is an iterative algorithm that starts with arbitrary initial values for the coefficients. After this initial function is estimated, the process is repeated until LL does not change significantly.
Copyright iSmartsoft Inc. 2008
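One way to carry out this iteration is sketched below with Newton-Raphson updates for a single predictor. The slides do not name a specific algorithm, so that choice is an assumption, as are the toy data and the names `fit_mle`, `b0`, and `b1`.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(b0, b1, x, y):
    """Sum of log probabilities of the observed 0/1 outcomes."""
    ll = 0.0
    for xi, yi in zip(x, y):
        p = logistic(b0 + b1 * xi)
        ll += yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return ll

def fit_mle(x, y, tol=1e-8, max_iter=50):
    """Iterate until the log likelihood changes by less than tol."""
    b0, b1 = 0.0, 0.0                      # arbitrary starting values
    ll_old = log_likelihood(b0, b1, x, y)
    ll_new = ll_old
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = logistic(b0 + b1 * xi)
            w = p * (1 - p)
            g0 += yi - p                   # gradient of LL w.r.t. b0
            g1 += (yi - p) * xi            # gradient of LL w.r.t. b1
            h00 += w                       # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * step = gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
        ll_new = log_likelihood(b0, b1, x, y)
        if abs(ll_new - ll_old) < tol:     # LL no longer changes significantly
            break
        ll_old = ll_new
    return b0, b1, ll_new

# Hypothetical toy data: months in business and default flag.
months = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
default = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
b0, b1, ll = fit_mle(months, default)
print(b0, b1, ll)
```

This sketch has no safeguards for perfectly separable data, where the maximum likelihood estimate does not exist and the coefficients grow without bound.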

Log Likelihood (LL)
Likelihood is the probability of the observed values of the dependent variable, given the independent variables and the model coefficients.
LL is calculated through iteration, using maximum likelihood estimation (MLE).
Log likelihood is the basis for tests of a logistic model.
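For reference, the usual form of the log likelihood for a binary target is written out below. The slide gives no formula, so the notation is an assumption: $y_i \in \{0, 1\}$ is the observed target, $x_i$ a single predictor, and $\beta_0, \beta_1$ the coefficients.

```latex
p_i = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_i)}}, \qquad
LL(\beta_0, \beta_1) = \sum_{i=1}^{n} \Big[ y_i \ln p_i + (1 - y_i) \ln (1 - p_i) \Big]
```

Each term is the log of the probability the model assigns to the outcome that was actually observed, so LL is always at most 0 and is closer to 0 for a better fit.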

Log Likelihood Test (-2LL)
The log likelihood test is a test of the significance of the difference between -2LL for the reduced (baseline) model and -2LL for the full model. This difference is called the "model chi-square". The test is also called the Likelihood Ratio test.
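A minimal sketch of the test for the case where the full model adds one coefficient to an intercept-only baseline (one degree of freedom). The function names, the data, and the full-model LL value are hypothetical. The p-value uses the identity that the chi-square tail probability with one degree of freedom equals erfc(sqrt(x / 2)).

```python
import math

def null_log_likelihood(y):
    """LL of the intercept-only model, which predicts the overall rate."""
    n = len(y)
    n1 = sum(y)
    rate = n1 / n
    return n1 * math.log(rate) + (n - n1) * math.log(1 - rate)

def likelihood_ratio_test(ll_reduced, ll_full):
    """Model chi-square and p-value for 1 degree of freedom."""
    chi_square = (-2 * ll_reduced) - (-2 * ll_full)
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value

# Hypothetical values: 10 observations with 5 defaults, and an assumed
# log likelihood of -3.5 for a full model with one predictor.
default = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
ll_reduced = null_log_likelihood(default)
ll_full = -3.5
chi_square, p_value = likelihood_ratio_test(ll_reduced, ll_full)
print(chi_square, p_value)
```

A small p-value suggests the predictor improves the fit over the baseline. With more than one added coefficient, a general chi-square distribution function would be needed instead of the one-degree-of-freedom shortcut.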

Wald Test
A Wald test is used to test the statistical significance of each coefficient (β) in the model. A Wald test calculates a Z statistic, which is the estimated coefficient divided by its standard error (SE):

$$Z = \frac{\hat{\beta}}{SE(\hat{\beta})}$$

This Z value is then squared, yielding a Wald statistic with a chi-square distribution.
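The calculation can be sketched as follows. The coefficient and standard error are hypothetical inputs, and `wald_test` is an illustrative name. The p-value assumes a chi-square distribution with one degree of freedom, which is equivalent to a two-sided normal test on Z.

```python
import math

def wald_test(beta, se):
    """Return (Z, Wald statistic, p-value) for one coefficient."""
    z = beta / se
    wald = z ** 2
    # Two-sided normal tail probability, equal to the chi-square(1) tail of z**2.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, wald, p_value

# Hypothetical estimate and standard error, for illustration only.
z, wald, p_value = wald_test(-0.13, 0.065)
print(z, wald, p_value)
```

In practice the standard error comes from the fitted model, typically from the inverse of the information matrix at the maximum likelihood estimate.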

Summary
- Logistic Regression is a classification method. It returns the probability of a binary dependent variable taking a given value, given the independent variables.
- Maximum Likelihood Estimation is a statistical method for estimating the coefficients of the model.
- The Likelihood Ratio test is used to test the statistical significance of the difference between the full model and the simpler model.
- The Wald test is used to test the statistical significance of each coefficient in the model.

Questions?