Moving further: Categorical count data
 - Word counts
 - Speech error counts
 - Metaphor counts
 - Active construction counts

Hissing Koreans Winter & Grawunder (2012)

No. of Cases Bentz & Winter (2013)

Poisson Model

The Poisson Distribution (Siméon Poisson)
1898: Ladislaus Bortkiewicz
 - Army Corps with few Horses: few deaths, low variability
 - Army Corps with lots of Horses: many deaths, high variability
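A quick simulation can make the few-deaths versus many-deaths contrast concrete. The sketch below uses two made-up rates (0.5 and 20, chosen only for illustration; these are not Bortkiewicz's figures) to show that in a Poisson distribution the variability grows with the mean.

```r
# Illustrative rates only; these are not the historical horse-kick data.
set.seed(1)
few_deaths  <- rpois(10000, lambda = 0.5)  # low mean rate
many_deaths <- rpois(10000, lambda = 20)   # high mean rate

# For a Poisson variable the mean and the variance are approximately equal,
# so a higher mean rate goes together with higher variability.
c(mean = mean(few_deaths),  var = var(few_deaths))
c(mean = mean(many_deaths), var = var(many_deaths))
```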

Poisson Regression = generalized linear model with Poisson error structure and log link function

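The Poisson Model: $\log(\lambda) = b_0 + b_1 X_1 + b_2 X_2$, i.e. $Y \sim \mathrm{Poisson}(\exp(b_0 + b_1 X_1 + b_2 X_2))$, where $\lambda$ is the mean rate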

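In R (lme4): glmer(my_counts ~ my_predictors + (1|subject), data = mydataset, family = "poisson")
Note: current versions of lme4 fit this model with glmer(); calling lmer() with a family argument is deprecated.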
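The call above can be tried on simulated data. This is a minimal sketch, assuming the lme4 package is installed; the data frame d and the names subject, condition and my_counts are hypothetical stand-ins for a real dataset.

```r
# Simulated data; all variable names and effect sizes are made up.
set.seed(42)
n_subj <- 20
n_obs  <- 10
d <- data.frame(
  subject   = factor(rep(seq_len(n_subj), each = n_obs)),
  condition = rep(c(0, 1), times = n_subj * n_obs / 2)
)
subj_shift  <- rnorm(n_subj, sd = 0.3)  # by-subject random intercepts
log_rate    <- 1 + 0.5 * d$condition + subj_shift[as.integer(d$subject)]
d$my_counts <- rpois(nrow(d), lambda = exp(log_rate))

# Fit the Poisson mixed model only if lme4 is available
if (requireNamespace("lme4", quietly = TRUE)) {
  m <- lme4::glmer(my_counts ~ condition + (1 | subject),
                   data = d, family = "poisson")
  print(lme4::fixef(m))  # fixed effects, on the log scale
}
```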

Poisson model output: log values → exponentiate → predicted mean rate
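The back-transformation can be sketched with a plain glm() on simulated counts (the coefficients 0.5 and 0.8 are arbitrary choices for illustration). Exponentiating the linear predictor by hand should agree with what predict() returns on the response scale.

```r
# Simulated counts with known log-scale coefficients (illustrative values)
set.seed(7)
x <- runif(500, 0, 2)
y <- rpois(500, lambda = exp(0.5 + 0.8 * x))
m <- glm(y ~ x, family = "poisson")

b <- coef(m)                       # estimates are log values
rate_at_1 <- exp(b[1] + b[2] * 1)  # exponentiate: predicted mean rate at x = 1

# predict(..., type = "response") applies the same back-transformation
pred_at_1 <- predict(m, newdata = data.frame(x = 1), type = "response")
```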

Poisson Model

Moving further: Binary categorical data
 - Focus vs. no-focus
 - Yes vs. No
 - Dative vs. genitive
 - Correct vs. incorrect

Bentz & Winter (2013) Case yes vs. no ~ Percent L2 speakers

Logistic Regression = generalized linear model with binomial error structure and logit link function

The Logistic Model: $p(Y) = \mathrm{logit}^{-1}(b_0 + b_1 X_1 + b_2 X_2)$

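In R (lme4): glmer(binary_variable ~ my_predictors + (1|subject), data = mydataset, family = "binomial")
Note: current versions of lme4 fit this model with glmer(); calling lmer() with a family argument is deprecated.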
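A minimal sketch of the same call on simulated binary data, again assuming lme4 is installed. The predictor name focus and all effect sizes are invented for illustration.

```r
# Simulated binary responses; names and effect sizes are hypothetical.
set.seed(3)
d <- data.frame(
  subject = factor(rep(1:20, each = 10)),
  focus   = rep(c(0, 1), times = 100)
)
subj_shift <- rnorm(20, sd = 0.5)  # by-subject random intercepts
log_odds   <- -0.5 + 1.2 * d$focus + subj_shift[as.integer(d$subject)]
d$binary_variable <- rbinom(nrow(d), size = 1, prob = plogis(log_odds))

# Fit the logistic mixed model only if lme4 is available
if (requireNamespace("lme4", quietly = TRUE)) {
  m <- lme4::glmer(binary_variable ~ focus + (1 | subject),
                   data = d, family = "binomial")
  print(lme4::fixef(m))  # fixed effects, in log odds
}
```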

Probabilities and Odds
Probability of an Event: $p$
Odds of an Event: $p / (1 - p)$

Intuition about Odds
N = 12 marbles. What are the odds that I pick a blue marble?
Answer: 2/10 (2 blue marbles against 10 others)
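The marble arithmetic can be checked directly. The count of 2 blue marbles is an assumption read off the answer 2/10 with N = 12.

```r
n_total <- 12
n_blue  <- 2   # assumed from the answer 2/10

p_blue    <- n_blue / n_total              # probability: 2 out of 12
odds_blue <- n_blue / (n_total - n_blue)   # odds: 2 against 10

# The same odds computed from the probability
odds_from_p <- p_blue / (1 - p_blue)
```

The probability counts blue marbles against all marbles, while the odds count them against the non-blue ones, which is why 2/12 and 2/10 describe the same bag.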

Log odds = logit function

Representative values (table columns: Probability, Odds, Log odds (= “logits”))
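A table of this kind can be generated in a few lines. The probabilities below are picked for illustration and are not necessarily the ones shown on the original slide.

```r
p <- c(0.1, 0.25, 0.5, 0.75, 0.9)  # illustrative probabilities

tab <- data.frame(
  probability = p,
  odds        = p / (1 - p),
  log_odds    = log(p / (1 - p))
)
print(round(tab, 2))
```

A probability of 0.5 corresponds to odds of 1 and log odds of 0, and probabilities equally far from 0.5 give log odds of equal size and opposite sign.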

Snijders & Bosker (1999: 212)

Bentz & Winter (2013)

Case yes vs. no ~ Percent L2 speakers
Model output (Estimate, Std. Error, z value, Pr(>|z|)) for (Intercept) and Percent.L2
(Intercept) estimate = log odds when Percent.L2 = 0

Bentz & Winter (2013)

Case yes vs. no ~ Percent L2 speakers
Model output (Estimate, Std. Error, z value, Pr(>|z|)) for (Intercept) and Percent.L2
Percent.L2 estimate (= the slope) = how much the log odds decrease for each increase in Percent.L2 by 1%

Bentz & Winter (2013)

Case yes vs. no ~ Percent L2 speakers
The estimates are logits or “log odds”. Two ways to transform them:
 - Exponentiate with exp( ) → Odds
 - Transform by inverse logit → Probabilities

Odds
 - > 1: numerator more likely = event happens more often than not
 - < 1: denominator more likely = event is more likely not to happen


Case yes vs. no ~ Percent L2 speakers
(Intercept) in logits or “log odds”: logit.inv(1.4576) = 0.81

Bentz & Winter (2013): About 80% (makes sense)
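The intercept transformation can be reproduced as follows. Only the value 1.4576 comes from the slides; the variable names are chosen here for illustration.

```r
logit.inv <- function(x) exp(x) / (1 + exp(x))

intercept <- 1.4576                # (Intercept) estimate quoted on the slide
odds_at_0 <- exp(intercept)        # odds of "case yes" when Percent.L2 = 0
prob_at_0 <- logit.inv(intercept)  # probability when Percent.L2 = 0

round(prob_at_0, 2)  # 0.81, as on the slide
```

The odds are above 1, so "case yes" is the more likely outcome at Percent.L2 = 0, consistent with the probability of about 80%.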

Case yes vs. no ~ Percent L2 speakers
Logits or “log odds” transformed to probabilities:
 - logit.inv(1.4576) = 0.81
 - logit.inv( … *0.3) = 0.37

Bentz & Winter (2013)

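$\mathrm{logit}(p) = \log\left(\frac{p}{1 - p}\right)$ = logit function
$\mathrm{logit}^{-1}(x) = \frac{e^{x}}{1 + e^{x}}$ = inverse logit function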

This is the famous “logistic function”: $\mathrm{logit}^{-1}$

Inverse logit function (transforms back to probabilities) logit.inv = function(x){exp(x)/(1+exp(x))} (this defines the function in R)
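Base R already ships both transformations: plogis() is the inverse logit and qlogis() is the logit. The sketch below checks that the hand-written logit.inv agrees with them; the helper logit is added here for illustration and is not defined on the slides.

```r
logit.inv <- function(x) exp(x) / (1 + exp(x))  # as defined on the slide
logit     <- function(p) log(p / (1 - p))       # helper added for illustration

x <- c(-3, 0, 1.4576, 3)
all.equal(logit.inv(x), plogis(x))    # same as base R's inverse logit
all.equal(logit(logit.inv(x)), x)     # logit undoes logit.inv
all.equal(logit(0.81), qlogis(0.81))  # same as base R's logit
```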

General Linear Model
Generalized Linear Model
Generalized Linear Mixed Model


Generalized Linear Model
= “Generalizing” the General Linear Model to cases that don’t include continuous response variables (in particular categorical ones)
= Consists of two things: (1) an error distribution, (2) a link function

(1) Error distribution
 - Logistic regression: Binomial distribution
 - Poisson regression: Poisson distribution

(2) Link function
 - Logistic regression: Logit link function
 - Poisson regression: Log link function

In R:
 - lm(response ~ predictor)
 - glm(response ~ predictor, family="binomial")
 - glm(response ~ predictor, family="poisson")
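The three calls can be run side by side on simulated data. All variable names and coefficients below are invented for illustration; the point is only that the family argument selects the error distribution and, by default, the matching link function.

```r
# Simulated responses of three types, all driven by the same predictor
set.seed(10)
predictor   <- rnorm(200)
cont_resp   <- 2 + 0.5 * predictor + rnorm(200)             # continuous
binary_resp <- rbinom(200, 1, plogis(-0.3 + predictor))      # dichotomous
count_resp  <- rpois(200, exp(0.2 + 0.6 * predictor))        # count

m_lin  <- lm(cont_resp ~ predictor)
m_log  <- glm(binary_resp ~ predictor, family = "binomial")
m_pois <- glm(count_resp ~ predictor, family = "poisson")

# The default link that comes with each family
family(m_log)$link   # "logit"
family(m_pois)$link  # "log"
```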

Categorical Data
 - Dichotomous/Binary → Logistic Regression
 - Count → Poisson Regression

General structure
 - Linear Model: continuous ~ any type of variable
 - Logistic Regression: dichotomous ~ any type of variable
 - Poisson Regression: count ~ any type of variable

For the generalized linear mixed model… you only have to specify the family (and, in current versions of lme4, call glmer() instead of lmer()).
 - lmer(…)
 - glmer(…, family="poisson")
 - glmer(…, family="binomial")

That’s it (for now)