Generalized Linear Model

Generalized Linear Model: A Unified Theory
  Various responses:
    - Binary
    - Ordinal
    - Count
    - Polytomous

Generalized Linear Model A Unified Theory

Mean Structure
  - Ordinary linear model
  - Generalized linear model
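
A sketch of the two mean structures in standard GLM notation. The symbols are assumptions rather than the slide's own: $Y_i$ is the response, $x_i$ the covariate vector, $\beta$ the coefficient vector, and $g$ the link function.

\[
\text{Ordinary linear model:}\quad \mu_i = E(Y_i) = x_i^\top \beta
\]
\[
\text{Generalized linear model:}\quad g(\mu_i) = \eta_i = x_i^\top \beta
\]

The ordinary linear model is the special case in which $g$ is the identity and $Y_i$ is normally distributed.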

Link Functions
  Link function:
    - Log link
    - Logit link
    - Log-log link
    - Probit link
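
For reference, the usual textbook forms of these four links, written with $\mu$ for the mean and $\eta$ for the linear predictor (assumed notation). Sign conventions for the log-log link differ between texts, so the form below is one common choice, not necessarily the one used in the lecture.

\[
\begin{aligned}
\text{Log:}\quad & \eta = \log \mu \\
\text{Logit:}\quad & \eta = \log\frac{\mu}{1-\mu} \\
\text{Log-log:}\quad & \eta = -\log(-\log \mu) \\
\text{Probit:}\quad & \eta = \Phi^{-1}(\mu)
\end{aligned}
\]

Here $\Phi$ is the standard normal distribution function. The closely related complementary log-log link is $\eta = \log(-\log(1-\mu))$.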

Logistic Regression Model
  - Link functions

Poisson Regression Model
  - Link functions
  - Example: auto insurance

Ordinal Regression Model
  - Link functions

Polytomous Regression Model
  - Link functions

Polytomous Regression Properties Therefore,
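
One standard way to write a polytomous model is the baseline-category logit, sketched here under the assumption that the last category $J$ is the reference and that $\pi_j = P(Y = j \mid x)$. Both the notation and the choice of reference category are illustrative.

\[
\log\frac{\pi_j}{\pi_J} = x^\top \beta_j, \qquad j = 1, \dots, J-1
\]

Solving for the probabilities and using $\sum_{j=1}^{J} \pi_j = 1$ gives

\[
\pi_j = \frac{\exp(x^\top \beta_j)}{1 + \sum_{k=1}^{J-1} \exp(x^\top \beta_k)}, \qquad
\pi_J = \frac{1}{1 + \sum_{k=1}^{J-1} \exp(x^\top \beta_k)} .
\]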

Deviance
  - Likelihood function
  - Deviance
  Objective: measuring discrepancy (like the residual sum of squares)
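
In the usual definition (assumed notation, with the dispersion parameter taken as 1), the deviance compares the fitted model with the saturated model that reproduces the data exactly:

\[
D(y; \hat\mu) = 2\,\bigl[\ell(y; y) - \ell(\hat\mu; y)\bigr]
\]

where $\ell(\hat\mu; y)$ is the log-likelihood at the fitted means and $\ell(y; y)$ is the log-likelihood obtained by setting each mean equal to its observation. A deviance of zero means a perfect fit, and larger values mean larger discrepancy.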

Normal Example
  - Likelihood function
  - Deviance

Poisson Example
  - Likelihood function
  - Deviance
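
The two examples above can be sketched numerically. For the normal distribution with unit variance the deviance reduces to the residual sum of squares, $D = \sum_i (y_i - \hat\mu_i)^2$, and for the Poisson distribution it is $D = 2\sum_i \bigl[y_i \log(y_i/\hat\mu_i) - (y_i - \hat\mu_i)\bigr]$, with $y \log y$ taken as 0 when $y = 0$. The function names and the small data set are hypothetical.

```python
import math


def normal_deviance(y, mu):
    """Deviance for the normal model with unit variance: residual sum of squares."""
    return sum((yi - mi) ** 2 for yi, mi in zip(y, mu))


def poisson_deviance(y, mu):
    """Poisson deviance: 2 * sum(y*log(y/mu) - (y - mu)), with y*log(y) = 0 at y = 0."""
    total = 0.0
    for yi, mi in zip(y, mu):
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += log_term - (yi - mi)
    return 2.0 * total


# Hypothetical observed counts and fitted means, for illustration only.
observed = [0, 1, 3, 5]
fitted = [0.5, 1.5, 2.5, 4.5]

print("Normal deviance: ", normal_deviance(observed, fitted))
print("Poisson deviance:", poisson_deviance(observed, fitted))
```

Both functions return 0 when the fitted means equal the observations, which matches the interpretation of the deviance as a measure of discrepancy.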

Analysis of Deviance

  Model      d.f.   Discrepancy   Diff. d.f.   Diff. s.s.
  1          11     1000
  A          8      500           3            500
  A+B        6      200           2            300
  A+B+A*B
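
The differences in an analysis-of-deviance table come from subtracting the degrees of freedom and discrepancies of successive nested models. A minimal sketch using the three models whose figures appear on the slide follows; the function name is hypothetical.

```python
# (model, residual d.f., discrepancy) for the nested sequence on the slide.
fits = [
    ("1", 11, 1000.0),
    ("A", 8, 500.0),
    ("A+B", 6, 200.0),
]


def sequential_differences(nested_fits):
    """Return (model, change in d.f., change in discrepancy) for each added term."""
    rows = []
    for (_, df_prev, dev_prev), (name, df_next, dev_next) in zip(nested_fits, nested_fits[1:]):
        rows.append((name, df_prev - df_next, dev_prev - dev_next))
    return rows


for name, diff_df, diff_dev in sequential_differences(fits):
    print(f"{name:<5} d.f. change = {diff_df}, discrepancy change = {diff_dev}")
```

Under the smaller model in each pair, the change in discrepancy is usually compared with a chi-square distribution on the corresponding change in d.f.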

Pearson Residuals Define where Example: Normal Distribution
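
The usual definition of the Pearson residual, in assumed notation with $V(\cdot)$ denoting the variance function of the response distribution:

\[
r_i^{P} = \frac{y_i - \hat\mu_i}{\sqrt{V(\hat\mu_i)}}
\]

For the normal distribution $V(\mu) = 1$, so $r_i^{P} = y_i - \hat\mu_i$, the ordinary residual. The sum of the squared Pearson residuals is the Pearson $X^2$ statistic.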

Deviance Residual Deviance Define Example: Normal Distribution
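
In the usual definition (assumed notation), write the deviance as a sum of per-observation contributions, $D = \sum_i d_i$. The deviance residual is then

\[
r_i^{D} = \operatorname{sign}(y_i - \hat\mu_i)\,\sqrt{d_i}
\]

For the normal distribution $d_i = (y_i - \hat\mu_i)^2$, so $r_i^{D} = y_i - \hat\mu_i$ and the deviance and Pearson residuals coincide.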

Logistic Regression

Binary Responses
  Examples: credit approval, employment
  Properties:
    - The response can take only one of two possible outcomes
    - Covariates can be of any type

Logistic Regression Model
  - Link functions
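
A minimal sketch of three links commonly used for binary responses, together with their inverses, using only the Python standard library. The function names are hypothetical, and which links the lecture actually compared is an assumption. The logit and probit links appear later in the deck; the complementary log-log link is included for comparison.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the probit link


def logit(p):
    """Logit link: log odds of p."""
    return math.log(p / (1.0 - p))


def inv_logit(eta):
    """Inverse logit: maps a linear predictor back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def probit(p):
    """Probit link: standard normal quantile of p."""
    return _STD_NORMAL.inv_cdf(p)


def inv_probit(eta):
    """Inverse probit: standard normal distribution function."""
    return _STD_NORMAL.cdf(eta)


def cloglog(p):
    """Complementary log-log link."""
    return math.log(-math.log(1.0 - p))


def inv_cloglog(eta):
    """Inverse complementary log-log."""
    return 1.0 - math.exp(-math.exp(eta))


for p in (0.1, 0.5, 0.9):
    print(p, round(logit(p), 4), round(probit(p), 4), round(cloglog(p), 4))
```

The logit and probit links are symmetric about p = 0.5, while the complementary log-log link is not, which is one reason the choice of link can matter for probabilities near 0 or 1.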

Case Study
  Objective: Comparing the site preferences of lizards
  Data source: Fienberg (1970b)
  Variables:
    - Response: site preference (Sunny/Shady)
    - Discretized perch height and diameter
    - Time of the day (Early, Mid, Late)
    - Species: Grahami and Opalinus

Statistical Model

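SAS Program
    /* Binomial GLM with logit link; p, r and type3 request     */
    /* predicted values, residuals and Type 3 tests.            */
    proc genmod data=A0;
      class site diameter height time species;
      freq number;
      model species = diameter height time site / dist=bin link=logit p r type3;
    run;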

Logit Link

Probit Link

Poisson Regression

Count Responses
  Examples: auto accidents, service requests
  Properties:
    - Constant arrival rate
    - Independent waiting times
    - Waiting times are memoryless
  Then the number of requests per unit time is a Poisson random variable.

Poisson Distribution
  - Probability mass function
  - Mean and variance
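
The Poisson probability function is $P(Y = y) = e^{-\lambda}\lambda^{y}/y!$ for $y = 0, 1, 2, \dots$, with mean and variance both equal to $\lambda$. The sketch below checks this numerically for a hypothetical rate $\lambda = 3$ by summing over a truncated support; the truncation point of 100 is an arbitrary choice that leaves a negligible tail.

```python
import math


def poisson_pmf(y, lam):
    """P(Y = y) for a Poisson random variable with rate lam."""
    return math.exp(-lam) * lam ** y / math.factorial(y)


lam = 3.0                 # hypothetical rate, for illustration
support = range(0, 100)   # truncated support; the remaining tail mass is negligible

mean = sum(y * poisson_pmf(y, lam) for y in support)
variance = sum((y - mean) ** 2 * poisson_pmf(y, lam) for y in support)

print("mean     =", round(mean, 6))
print("variance =", round(variance, 6))
```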

Normalization Transformation
  - Define the transformation
  - Limiting distribution (why?)
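
A common choice for the normalizing transformation, assumed here, is standardization of $Y \sim \text{Poisson}(\lambda)$:

\[
Z = \frac{Y - \lambda}{\sqrt{\lambda}} \;\xrightarrow{d}\; N(0, 1) \quad \text{as } \lambda \to \infty .
\]

One way to see why: for integer $\lambda$, $Y$ has the same distribution as a sum of $\lambda$ independent Poisson(1) variables, each with mean 1 and variance 1, so the central limit theorem applies.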

Variance Stabilization Transformation Define It can be obtained then Therefore
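
A sketch of the standard argument, assuming the transformation in question is the square root, $g(Y) = \sqrt{Y}$ with $Y \sim \text{Poisson}(\lambda)$. By the delta method,

\[
\operatorname{Var}\bigl(g(Y)\bigr) \approx \bigl[g'(\lambda)\bigr]^2 \operatorname{Var}(Y)
= \left(\frac{1}{2\sqrt{\lambda}}\right)^{2} \lambda = \frac{1}{4} .
\]

Therefore, for large $\lambda$, $\sqrt{Y}$ is approximately $N(\sqrt{\lambda},\, 1/4)$, with a variance that no longer depends on $\lambda$.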

Poisson Regression Model
  - Link functions
  - Example: auto insurance

Case Study
  Objective: What causes wave damage to cargo ships?
  Data source: Lloyd’s Register of Shipping by J. Crilley and L. N. Heminway
  Variables:
    - Ship type: A – E
    - Year of construction: 60-64, 65-69, 70-74, 75-79
    - Period of operation: 60-74, 75-79
    - Aggregate months of service

Statistical Model
  log(expected number of incidents) = log(aggregate months of service)
                                      + (effect due to ship type)
                                      + (effect due to year of construction)
                                      + (effect due to service period)

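SAS Program
    /* Poisson GLM with log link. logMonth enters here as a covariate   */
    /* with an estimated coefficient; fixing that coefficient at 1, as  */
    /* in the statistical model, would use offset=logMonth instead.     */
    proc genmod data=A0;
      class type year period;
      model number = type year period logMonth / dist=P link=log p r type3;
    run;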

Ordinal Regression

Ordinal Responses
  Example: preference data
  Properties:
    - No numerical meaning
    - Order does matter
    - Consecutive categories can be collapsed into one

Ordinal Regression Model
  - Link functions
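
With a cumulative logit link, an ordinal model specifies $\operatorname{logit} P(Y \le j) = \alpha_j + \eta$ for $j = 1, \dots, J-1$, where the cutpoints $\alpha_j$ are increasing and $\eta$ is the linear predictor. The sketch below turns cutpoints and a linear predictor into category probabilities. The function name, the sign convention, and the numeric values are assumptions for illustration.

```python
import math


def category_probabilities(cutpoints, eta):
    """Category probabilities implied by logit P(Y <= j) = cutpoints[j-1] + eta.

    cutpoints must be increasing; the result has len(cutpoints) + 1 entries.
    """
    # Cumulative probabilities P(Y <= j) for j = 1..J-1, then P(Y <= J) = 1.
    cumulative = [1.0 / (1.0 + math.exp(-(a + eta))) for a in cutpoints]
    cumulative.append(1.0)

    # Successive differences of the cumulative probabilities give P(Y = j).
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs


# Hypothetical cutpoints for a four-category response, with eta = 0.
example = category_probabilities([-1.0, 0.0, 1.0], 0.0)
print([round(p, 4) for p in example])
```

Because the same $\eta$ shifts every cumulative logit, merging two adjacent categories simply removes one cutpoint and leaves the rest of the model unchanged.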

Case Study
  Objective: Which cheese do customers like most?
  Data source: Experiment done by Dr. Graeme Newell
  Variables:
    - Cheese type: A – D
    - Response: 1 – 9, with a larger value = better

Statistical Model
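
A sketch of the cumulative logit model that the case study's variables suggest, in assumed notation. With nine response categories there are eight cutpoints $\alpha_1 < \dots < \alpha_8$, and with four cheese types there is one effect $\tau_{\text{type}}$ per type, with one level constrained (for example, set to zero) for identifiability.

\[
\operatorname{logit} P(\text{pref} \le j \mid \text{type}) = \alpha_j + \tau_{\text{type}}, \qquad j = 1, \dots, 8
\]

Under this sign convention a larger $\tau_{\text{type}}$ shifts probability toward lower preference scores.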

SAS Program
    /* Multinomial model with cumulative logit link for the ordinal response */
    proc genmod data=A0;
      class type;
      freq total;
      model pref = type / dist=multinomial link=cumlogit p r type3;
    run;

Progress Report
  Due: next week, before class
  Requirement: electronic submission by e-mail
    - One zipped file; no other format will be accepted
    - The zip file should contain:
        - Finalized project proposal in Word or PDF format
        - Cleaned data set in SAS format
        - Preliminary/descriptive analysis report
    - Please refer to the sample directory structure on the server

New Project
  An animal study
    - Four treatment groups, one of which is a control
    - Ordinal responses were measured on 17 consecutive days
  Question: Is there a treatment effect?
  Requirement: formal report, as before
  Due: in two weeks (Dec. 13th)