WLS for Categorical Data


SAS – CATMOD Procedure
To fit a model using PROC CATMOD:
WEIGHT statement – specifies the weight variable
WLS option on the MODEL statement – requests weighted least squares (WLS) estimates

Data – Response
REVSIB – whether the investigation of the child also involves further investigation of the siblings:
0 – No
1 – Yes

Data – Covariates
q1a – Relationship to the children:
1 – Biological parent
2 – Common-law partner
3 – Foster parent
4 – Adoptive parent
5 – Step-parent
6 – Grandparent
7 – Other

Data – Covariates
q2a – Gender of the caregiver:
0 – Female
1 – Male
99 – No response

q3a – Age of the caregiver:
1 – Less than 19
2 – 19 to 21
3 – 22 to 25
4 – 26 to 30
5 – 31 to 40
6 – Over 40
99 – No response

SAS Code
Saturated model:

    proc catmod;
      weight wtr;
      model revsib = q1a|q2a|q3a_age / wls;
    run;
    quit;

Output
The CATMOD Procedure

Data Summary
Response            revsib    Response Levels    2
Weight Variable     wtr       Populations        28
Data Set            T2        Total Frequency    6821.55
Frequency Missing   59.54     Observations       1574

Analysis of Variance

Source             DF    Chi-Square    Pr > ChiSq
-------------------------------------------------
Intercept           1          3.70        0.0544
q1a                 5         12.89        0.0244
q2a                 1          0.18        0.6753
q1a*q2a             4*        18.74        0.0009
q3a_age             5         12.35        0.0303
q1a*q3a_age         7*        28.19        0.0002
q2a*q3a_age         3*         5.17        0.1598
q1a*q2a*q3a_age     2*        13.34        0.0013
Residual            0           .           .

NOTE: Effects marked with '*' contain one or more redundant or restricted parameters.

q2a is not significant as a main effect, but it appears in a significant three-way interaction?

Maximum Likelihood Analysis of Variance

Source              DF    Chi-Square    Pr > ChiSq
---------------------------------------------------
Intercept            1       1727.82        <.0001
q1a                  0*         .           .
q2a                  0*         .           .
q1a*q2a              0*         .           .
q3a_age              1*         .           .
q1a*q3a_age          7*         .           .
q2a*q3a_age          1*         .           .
q1a*q2a*q3a_age      6*         .           .
Likelihood Ratio    12          0.00        1.0000

NOTE: Effects marked with '*' contain one or more redundant or restricted parameters.

Without the WEIGHT statement and the WLS option, the results cannot be interpreted.
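For comparison, the maximum likelihood output on this slide would come from a call along the following lines: the same saturated model with the WEIGHT statement and the WLS option removed. This is only a sketch. The data set name T2 is taken from the Data Summary output, and naming it explicitly with DATA= is an assumption, since the slide's own code omits it.

```sas
/* Sketch: saturated model without WEIGHT or WLS.              */
/* With the default generalized logits, PROC CATMOD then uses  */
/* maximum likelihood estimation.                              */
proc catmod data=T2;   /* T2: data set name assumed from the Data Summary */
  model revsib = q1a|q2a|q3a_age;
run;
quit;
```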

Analysis of Maximum Likelihood Estimates

                           Standard     Chi-
Parameter       Estimate   Error        Square    Pr > ChiSq
------------------------------------------------------------
Intercept        -6.8146    0.1639     1727.82        <.0001
q1a       1       3.3370#    .            .           .
          3      19.7614#    .            .           .
          4     -29.8195#    .            .           .
          5       2.8181#    .            .           .
          6      -5.2236#    .            .           .
q2a       0      -4.8953#    .            .           .
q1a*q2a   1 0     5.2304#    .            .           .
          3 0   -19.0829#    .            .           .
          4 0    12.8882#    .            .           .
          5 0    -3.3065#    .            .           .
          6 0     5.6687#    .            .           .
q3a_age   1      12.6303#    .            .           .
          2      -0.0398   500.1          0.00        0.9999
          3      -3.9163#    .            .           .
          4     -15.1158#    .            .           .
          5       3.0629#    .            .           .

These estimates cannot be interpreted.

Reduced Model

Analysis of Variance

Source          DF    Chi-Square    Pr > ChiSq
---------------------------------------------
Intercept        1          6.51        0.0107
q1a              5         15.88        0.0072
q3a_age          5        155.85        <.0001
q1a*q3a_age      7*        13.06        0.0707
Residual         0           .           .

Try the model without q2a. The q1a*q3a_age interaction is not significant at the 0.05 level (p = 0.0707), so perhaps there is no interaction between the caregiver's relationship to the children and the caregiver's age group.

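Main Effect

Analysis of Variance

Source       DF    Chi-Square    Pr > ChiSq
---------------------------------------------
Intercept     1         15.76        <.0001
q1a           5         52.18        <.0001
q3a_age       5        366.53        <.0001
Residual      7         13.06        0.0707

Try the model with main effects only. The residual chi-square (13.06 on 7 DF, p = 0.0707) is the same test as the dropped q1a*q3a_age interaction in the reduced model.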
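The slides show SAS code only for the saturated model. The reduced and main-effects fits could be requested along these lines, as a sketch that mirrors the saturated-model call. The DATA=T2 option is an assumption based on the data set name in the Data Summary output.

```sas
/* Sketch: reduced model, dropping q2a but keeping the q1a*q3a_age interaction */
proc catmod data=T2;   /* T2: data set name assumed from the Data Summary */
  weight wtr;
  model revsib = q1a|q3a_age / wls;
run;
quit;

/* Sketch: main-effects model with q1a and q3a_age only */
proc catmod data=T2;
  weight wtr;
  model revsib = q1a q3a_age / wls;
run;
quit;
```

In the MODEL statement, the bar notation `q1a|q3a_age` expands to both main effects plus their interaction, while listing the variables separately requests main effects only.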

Analysis of Weighted Least Squares Estimates

                         Standard     Chi-
Parameter     Estimate   Error        Square    Pr > ChiSq
------------------------------------------------------------
Intercept      -1.6354    0.4119      15.76        <.0001
q1a       1    -0.1394    0.3190       0.19        0.6622
          3    -0.3338    0.8170       0.17        0.6828
          4     3.8902    1.2238      10.11        0.0015
          5    -2.8567    0.6279      20.70        <.0001
          6    -1.3913    0.3849      13.07        0.0003
q3a_age   1     0.1185    1.2875       0.01        0.9267
          2    -1.5960    0.3706      18.55        <.0001
          3     1.5098    0.2785      29.40        <.0001
          4    -0.8969    0.2780      10.41        0.0013
          5     0.0673    0.2673       0.06        0.8013

Interpreting the estimates: categories with negative estimates are less likely to have an investigation done on the siblings.
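As a rough check on how this table is read (a sketch, not part of the original slides): each chi-square is a Wald statistic, the squared ratio of an estimate to its standard error, and an estimate on the logit scale can be turned into a probability with the inverse logit. The symbols \hat\beta, SE and p are notation introduced here. The probability below assumes, as the slide's reading does, that the logit is for REVSIB = 1. PROC CATMOD's default generalized logit contrasts each response level with the last one, so the direction depends on how the response levels are ordered and is worth confirming in the response profiles of the output. The intercept is also taken here as a rough "average" case, which assumes CATMOD's default deviation-from-mean coding.

```latex
\chi^2_{\text{Wald}} = \left( \frac{\hat\beta}{\mathrm{SE}(\hat\beta)} \right)^{2},
\qquad
\text{Intercept: } \left( \frac{-1.6354}{0.4119} \right)^{2} \approx 15.76

p = \frac{e^{\hat\beta_0}}{1 + e^{\hat\beta_0}}
  = \frac{e^{-1.6354}}{1 + e^{-1.6354}} \approx 0.16
```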

Conclusion
For cases where the caregiver is an adoptive parent, it is highly likely that the siblings will also be investigated.
For caregivers aged 22 to 25, the siblings are also likely to be investigated.
Intercept: when little information about the caregiver is observed, chances are that the siblings will not be reviewed in the case.

Questions
Is WLS more efficient than ML?
Should the records with “no response” be deleted?
Is “99” the best code to indicate “no response”?
How would the model change if each covariate had fewer categories?
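On the “no response” question, one option is to recode 99 as a SAS missing value before modeling, so that those records are left out rather than treated as an ordinary category. A minimal sketch, with several assumptions: the input data set is T2 (from the Data Summary output), the output name T2_clean is hypothetical, both covariates are numeric, and q3a_age carries the same 99 code that the slides list for q3a.

```sas
/* Sketch: recode the "no response" code 99 to missing.          */
/* T2_clean is a hypothetical name; q2a and q3a_age are assumed  */
/* to be numeric variables that use 99 for "no response".        */
data T2_clean;
  set T2;
  if q2a = 99 then q2a = .;
  if q3a_age = 99 then q3a_age = .;
run;
```

Observations with a missing value in a model variable are then dropped from the fit. Whether deletion is preferable to keeping “no response” as its own category remains the open question the slide raises.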

Thank you