A priori violations: In the following cases, your data violates the normality and homoskedasticity assumption on a priori grounds: (1) count data → Poisson.

- Word counts
- Speech error counts
- Metaphor counts
- Active construction counts

Moving further: Categorical count data.

Presentation transcript:

A priori violations

In the following cases, your data violates the normality and homoskedasticity assumptions on a priori grounds:
(1) count data → Poisson regression
(2) binary data → logistic regression
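A small simulation can illustrate why count data break the constant-variance assumption before any model is fitted. This is only a sketch: the three means, the sample size and the seed are illustrative assumptions, not values from the slides.

```r
# Simulate Poisson counts at three hypothetical mean levels.
set.seed(1)               # arbitrary seed, for reproducibility only
means <- c(1, 5, 20)      # illustrative means (assumption, not from the slides)
n <- 10000                # illustrative sample size (assumption)

# Sample variance of the simulated counts at each mean level.
sim_var <- sapply(means, function(m) var(rpois(n, lambda = m)))

# For Poisson data the variance equals the mean, so the spread grows
# with the expected count instead of staying constant.
print(data.frame(mean = means, simulated_variance = round(sim_var, 2)))
```

The simulated variances track the means, which is exactly the pattern a linear model with constant error variance cannot accommodate.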

Output example

> summary(xglm)

Call:
glm(formula = error ~ alc, family = "binomial")

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.1073  -0.5495  -0.3257   0.7173   1.8070

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   -3.643      1.123  -3.244 0.001179 **
alc           16.118      4.856   3.319 0.000903 ***

The linear model

Y ~ b0 + b1*X1 + b2*X2

Speaker note: expand this on the blackboard. Make a list with X1 = gender and X2 = focus or no focus, and then 0 times, 1 times, etc.

The logistic model

p(Y) ~ logit^-1(b0 + b1*X1 + b2*X2)

The expression inside logit^-1, b0 + b1*X1 + b2*X2, is the linear predictor.

Speaker note: expand this on the blackboard as for the linear model (X1 = gender, X2 = focus or no focus, then 0 times, 1 times, etc.).

Representative values

Probability   Odds    Log odds (= “logits”)
0.1           0.111   -2.197
0.2           0.25    -1.386
0.3           0.429   -0.847
0.4           0.667   -0.405
0.5           1        0
0.6           1.5      0.405
0.7           2.33     0.847
0.8           4        1.386
0.9           9        2.197

- A probability of 80% of an event occurring means that the odds are “4 to 1” for it occurring.
- What happens if the odds are 50 to 50? -> The ratio is 1, and the log odds are 0.
- If the probability of non-occurrence is higher than that of occurrence, the odds are fractions (below 1) and the log odds are negative.
- If the probability of occurrence is higher, the odds are above 1 and the log odds are positive.
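The representative values can be recomputed directly from the definitions odds = p / (1 - p) and log odds = log(odds). The following is a minimal sketch in base R; the variable names are chosen here for illustration.

```r
# Probabilities 0.1, 0.2, ..., 0.9
p <- (1:9) / 10

odds <- p / (1 - p)        # odds of the event occurring
log_odds <- log(odds)      # log odds ("logits"); identical to qlogis(p)

tab <- data.frame(
  probability = p,
  odds        = round(odds, 3),
  log_odds    = round(log_odds, 3)
)
print(tab)
```

Because the log odds are symmetric around 0, the values for 0.1 and 0.9 (and for every other complementary pair) differ only in sign.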

Snijders & Bosker (1999: 212)

The logistic function = the inverse logit function; in R: plogis()
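As a rough illustration of what plogis() does, the inverse logit can be written out by hand and compared with the built-in. The helper name inv_logit is hypothetical and introduced here only for the comparison; the three log odds are taken from the representative values.

```r
# Inverse logit (logistic) function written out by hand.
# inv_logit is a hypothetical helper name, not part of base R.
inv_logit <- function(x) 1 / (1 + exp(-x))

# Log odds taken from the representative values.
lp <- c(-2.197, 0, 1.386)

inv_logit(lp)          # probabilities, approximately 0.1, 0.5 and 0.8
plogis(lp)             # base R's built-in inverse logit gives the same values
qlogis(plogis(lp))     # qlogis() is the logit, so this recovers lp
```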

            Estimate Std. Error z value Pr(>|z|)
(Intercept)   -3.643      1.123  -3.244 0.001179 **
alc           16.118      4.856   3.319 0.000903 ***

for probabilities: transform the entire LP (linear predictor) with the logistic function, plogis()
for odds: transform the individual coefficients with exp(x)
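Applied to the coefficients in this output, the two transformations might look as follows. The intercept and slope are copied from the output; the alc values are hypothetical, since the slides do not show the scale or range of that predictor.

```r
b0 <- -3.643    # intercept, copied from the coefficient table
b1 <- 16.118    # slope for alc, copied from the coefficient table

# Hypothetical predictor values, chosen only for illustration.
alc <- c(0, 0.1, 0.2, 0.3)

# Probabilities: transform the entire linear predictor with plogis().
lp   <- b0 + b1 * alc      # linear predictor, on the log odds scale
prob <- plogis(lp)

# Odds: exponentiate an individual coefficient.
odds_ratio    <- exp(b1)          # change in odds per one-unit increase in alc
odds_ratio_01 <- exp(b1 * 0.1)    # change in odds per 0.1 increase in alc

print(data.frame(alc, log_odds = lp, probability = round(prob, 3)))
```

Note that plogis() is applied to the whole sum b0 + b1 * alc, never to a single coefficient, whereas exp() is applied to a coefficient on its own.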

General Linear Model Generalized Linear Model

Generalized Linear Model

= “Generalizing” the General Linear Model to cases that don’t include continuous response variables (in particular categorical ones)
= Consists of two things: (1) an error distribution, (2) a link function

Logistic regression: Binomial distribution, logit link function
Poisson regression: Poisson distribution, log link function

lm(response ~ predictor)
glm(response ~ predictor, family = "binomial")
glm(response ~ predictor, family = "poisson")
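To see these calls run end to end, here is a sketch on simulated data. Everything about the data (the predictor, the true coefficients, the sample size and the seed) is an assumption made up for this example and does not come from the slides.

```r
set.seed(42)                 # arbitrary seed, for reproducibility only
n <- 200                     # illustrative sample size
predictor <- runif(n)        # hypothetical continuous predictor

# Hypothetical binary response, generated through the inverse logit.
binary_response <- rbinom(n, size = 1, prob = plogis(-1 + 2 * predictor))

# Hypothetical count response, generated through the inverse of the log link.
count_response <- rpois(n, lambda = exp(0.5 + 1 * predictor))

logistic_fit <- glm(binary_response ~ predictor, family = "binomial")
poisson_fit  <- glm(count_response ~ predictor, family = "poisson")

# Each family carries its default link function.
family(logistic_fit)$link    # "logit"
family(poisson_fit)$link     # "log"

# Coefficients are on the link scale (log odds and log counts).
summary(logistic_fit)$coefficients
summary(poisson_fit)$coefficients
```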

Simple linear regression & multiple regression = generalized linear model with normal error structure and identity link function
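This equivalence can be checked by fitting the same model with lm() and with glm() using a normal (gaussian) family and identity link. The data below are simulated under assumed coefficients purely for the comparison.

```r
set.seed(7)                        # arbitrary seed, for reproducibility only
x <- rnorm(100)                    # hypothetical predictor
y <- 1 + 2 * x + rnorm(100)        # hypothetical response with normal errors

lm_fit  <- lm(y ~ x)
glm_fit <- glm(y ~ x, family = gaussian(link = "identity"))

# The two fits should agree on the coefficients up to numerical tolerance.
all.equal(coef(lm_fit), coef(glm_fit))
```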