Applied Bayesian Inference, KSU, April 29, 2012
§❶ Review of Likelihood Inference
Robert J. Tempelman
Likelihood Inference
Necessary prerequisites to understanding Bayesian inference:
– Distribution theory
– Calculus
– Asymptotic theory (e.g., Taylor expansions)
– Numerical methods/optimization
– Simulation-based analyses
– Programming skills
SAS PROC ???? or R package ???? is only really a start to understanding data analysis. I don't think that SAS PROC MCMC (version 9.3)/WinBUGS is a fix to all of your potential Bayesian inference problems.
Data analysts: don't throw away that math stats text just yet! Meaningful computing skills are a plus!
The "simplest" model
Basic mean model: $y_i = \mu + e_i$, $i = 1, \ldots, n$
– Common distributional assumption: $e_i \overset{iid}{\sim} N(0, \sigma_e^2)$. What does this mean? Think pdf! (pdf: probability density function)
Under conditional independence, the joint pdf is the product of the independent pdfs:
$p(\mathbf{y} \mid \mu, \sigma_e^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma_e^2}} \exp\left(-\frac{(y_i - \mu)^2}{2\sigma_e^2}\right)$
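To make the "product of independent pdfs" idea concrete, here is a minimal SAS sketch (not from the original slides; the sample size and the values $\mu = 10$, $\sigma_e = 2$ are assumed for illustration) that simulates from the mean model and evaluates the joint log pdf as a sum of log densities:

data sim;
   call streaminit(2012);             /* fix the seed for reproducibility */
   mu = 10; sigma = 2; loglik = 0;
   do i = 1 to 25;
      y = rand("Normal", mu, sigma);               /* y_i = mu + e_i */
      loglik + log(pdf("Normal", y, mu, sigma));   /* sum of log pdfs */
   end;
   put loglik=;                       /* joint log pdf at the true values */
run;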
Likelihood function
Simplify the joint pdf further, and regard it as a function of the parameters given the data:
$L(\mu, \sigma_e^2 \mid \mathbf{y}) \propto (\sigma_e^2)^{-n/2} \exp\left(-\frac{\sum_{i=1}^{n}(y_i - \mu)^2}{2\sigma_e^2}\right)$
('proportional to': multiplicative constants not involving the parameters can be dropped)
Maximum likelihood estimation
Maximize $L(\mu, \sigma_e^2 \mid \mathbf{y})$ with respect to the unknowns.
– Well, actually, we directly maximize the log likelihood $l(\mu, \sigma_e^2 \mid \mathbf{y}) = \log L(\mu, \sigma_e^2 \mid \mathbf{y})$.
– One strategy: use first derivatives, i.e., determine $\frac{\partial l}{\partial \mu}$ and $\frac{\partial l}{\partial \sigma_e^2}$ and set them to 0.
– Result? $\hat\mu = \bar{y}$ and $\hat\sigma_e^2 = \frac{\sum_{i=1}^{n}(y_i - \bar{y})^2}{n}$
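The result follows from a standard derivation (the intermediate steps were not on the original slide):
$l(\mu, \sigma_e^2) = -\frac{n}{2}\log\sigma_e^2 - \frac{\sum_{i}(y_i - \mu)^2}{2\sigma_e^2} + \text{const}$
$\frac{\partial l}{\partial \mu} = \frac{\sum_i (y_i - \mu)}{\sigma_e^2} = 0 \;\Rightarrow\; \hat\mu = \bar{y}$
$\frac{\partial l}{\partial \sigma_e^2} = -\frac{n}{2\sigma_e^2} + \frac{\sum_i (y_i - \mu)^2}{2\sigma_e^4} = 0 \;\Rightarrow\; \hat\sigma_e^2 = \frac{\sum_i (y_i - \hat\mu)^2}{n}$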
Example: Data, Log Likelihood & Maximum Likelihood Estimates
[Figure: example data with the corresponding log likelihood and ML estimates; not recovered]
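As a hedged illustration (the data values below are invented, since the slide's example data did not survive), the closed-form ML estimates can be checked in SAS; note vardef=n, which uses the ML divisor $n$ rather than $n-1$:

data mle;
   input y @@;
   datalines;
9.1 10.3 11.8 8.7 10.9 9.6
;
run;

proc means data=mle n mean var vardef=n;   /* mean = mu_hat, var = sigma2_hat */
   var y;
run;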
Likelihood inference for discrete data
Consider the binomial distribution: $p(y \mid \pi) = \binom{n}{y} \pi^{y}(1-\pi)^{n-y}$
so that $l(\pi) = y \log \pi + (n-y)\log(1-\pi) + \text{const}$.
Set $\frac{\partial l}{\partial \pi} = \frac{y}{\pi} - \frac{n-y}{1-\pi}$ to zero → $\hat\pi = \frac{y}{n}$
Sometimes iterative solutions are required
First-derivative-based methods can be slow for some problems.
Second-derivative methods are often desirable, e.g., Newton-Raphson:
– Generally faster
– Provide asymptotic standard errors as a useful by-product
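For a scalar parameter $\theta$, the generic Newton-Raphson update (standard; the plant genetics slides below apply it) is
$\theta^{[t+1]} = \theta^{[t]} - \left[l''(\theta^{[t]})\right]^{-1} l'(\theta^{[t]})$
where $l'$ and $l''$ are the first and second derivatives of the log likelihood.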
Plant Genetics Example (Rao, 1971)
$y_1$, $y_2$, $y_3$, and $y_4$ are the observed numbers of 4 different phenotypes involving genotypes different at two loci from the progeny of self-fertilized heterozygotes (AaBb). Under genetic theory, the distribution of the four different phenotypes (with complete dominance at each locus) is multinomial.
Probabilities
Probability             Genotype    Data (counts)
$\frac{2+\theta}{4}$    A_B_        $y_1$ = 1997
$\frac{1-\theta}{4}$    aaB_        $y_2$ = 906
$\frac{1-\theta}{4}$    A_bb        $y_3$ = 904
$\frac{\theta}{4}$      aabb        $y_4$ = 32
$0 \leq \theta \leq 1$
$\theta \to 0$: close linkage in repulsion
$\theta \to 1$: close linkage in coupling
Genetic Illustration of Coupling/Repulsion
Coupling ($\theta = 1$): parental gametes A B / a b
Repulsion ($\theta = 0$): parental gametes A b / a B
Likelihood function
Given: $p(\mathbf{y} \mid \theta) \propto \left(\frac{2+\theta}{4}\right)^{y_1} \left(\frac{1-\theta}{4}\right)^{y_2+y_3} \left(\frac{\theta}{4}\right)^{y_4}$
so that
$l(\theta) = y_1 \log(2+\theta) + (y_2+y_3)\log(1-\theta) + y_4 \log(\theta) + \text{const}$
First and second derivatives
First derivative: $l'(\theta) = \frac{y_1}{2+\theta} - \frac{y_2+y_3}{1-\theta} + \frac{y_4}{\theta}$
Second derivative: $l''(\theta) = -\frac{y_1}{(2+\theta)^2} - \frac{y_2+y_3}{(1-\theta)^2} - \frac{y_4}{\theta^2}$
Recall the Newton-Raphson algorithm: $\theta^{[t+1]} = \theta^{[t]} - \frac{l'(\theta^{[t]})}{l''(\theta^{[t]})}$
Newton-Raphson: SAS data step and output

data newton;
   y1 = 1997; y2 = 906; y3 = 904; y4 = 32;
   theta = 0.01;   /* try starting value of 0.50 too */
   do iterate = 1 to 5;
      loglike  = y1*log(2+theta) + (y2+y3)*log(1-theta) + y4*log(theta);
      firstder = y1/(2+theta) - (y2+y3)/(1-theta) + y4/theta;
      secndder = (-y1/(2+theta)**2 - (y2+y3)/(1-theta)**2 - y4/theta**2);
      theta = theta + firstder/(-secndder);
      output;
   end;
   asyvar = 1/(-secndder);   /* asymptotic variance of theta_hat at convergence */
   output;
run;

proc print data=newton;
   var iterate theta loglike;
run;

[Output: iterate, theta, loglike by iteration; values not recovered]
Asymptotic standard errors
Given the observed information $I(\hat\theta) = -l''(\theta)\big|_{\theta=\hat\theta}$:
$\widehat{\mathrm{var}}(\hat\theta) = I(\hat\theta)^{-1}$, so $\mathrm{SE}(\hat\theta) = \sqrt{I(\hat\theta)^{-1}}$

proc print data=newton;
   var asyvar;
run;
Alternative to Newton-Raphson
Fisher's scoring:
– Substitute the expected information $E[-l''(\theta)]$ for the observed $-l''(\theta)$ in Newton-Raphson.
– Now, with $E[y_1] = n\frac{2+\theta}{4}$, $E[y_2+y_3] = n\frac{2(1-\theta)}{4}$, and $E[y_4] = n\frac{\theta}{4}$:
$E[l''(\theta)] = -\frac{n}{4}\left(\frac{1}{2+\theta} + \frac{2}{1-\theta} + \frac{1}{\theta}\right)$
– Then: $\theta^{[t+1]} = \theta^{[t]} - \frac{l'(\theta^{[t]})}{E[l''(\theta)]\big|_{\theta=\theta^{[t]}}}$
Fisher scoring: SAS data step and output

data newton;
   y1 = 1997; y2 = 906; y3 = 904; y4 = 32;
   n = y1 + y2 + y3 + y4;   /* total count; needed for the expected information */
   theta = 0.01;   /* try starting value of 0.50 too */
   do iterate = 1 to 5;
      loglike  = y1*log(2+theta) + (y2+y3)*log(1-theta) + y4*log(theta);
      firstder = y1/(2+theta) - (y2+y3)/(1-theta) + y4/theta;
      secndder = (n/4)*(-1/(2+theta) - 2/(1-theta) - 1/theta);
      theta = theta + firstder/(-secndder);
      output;
   end;
   asyvar = 1/(-secndder);   /* asymptotic variance of theta_hat at convergence */
   output;
run;

proc print data=newton;
   var iterate theta loglike;
run;

In some applications, Fisher's scoring is easier than Newton-Raphson, but observed information is probably more reliable than expected information (Efron and Hinkley, 1978).
Extensions to multivariate $\boldsymbol\theta$
Suppose that $\boldsymbol\theta$ is a $p \times 1$ vector.
Newton-Raphson: $\boldsymbol\theta^{[t+1]} = \boldsymbol\theta^{[t]} - \left[\frac{\partial^2 l(\boldsymbol\theta)}{\partial \boldsymbol\theta \, \partial \boldsymbol\theta'}\right]^{-1}_{\boldsymbol\theta=\boldsymbol\theta^{[t]}} \left[\frac{\partial l(\boldsymbol\theta)}{\partial \boldsymbol\theta}\right]_{\boldsymbol\theta=\boldsymbol\theta^{[t]}}$
Fisher's scoring: replace the Hessian by its expectation $E\left[\frac{\partial^2 l(\boldsymbol\theta)}{\partial \boldsymbol\theta \, \partial \boldsymbol\theta'}\right]$, i.e., the negative of the expected information matrix.
Generalized linear models
For multifactorial analysis of non-normal (binary, count) data.
Consider the probit link binary model:
– Implies the existence of normally distributed latent (underlying) variables $\ell_i$.
– Could do something similar for the logistic link binary model.
Consider a simple population mean model:
– $\ell_i = \mu + e_i$; $e_i \sim N(0, \sigma_e^2)$
– Let $\mu = 10$ and $\sigma_e = 2$
The liability (latent variable) concept
$\tau = 12$ ("THRESHOLD"), $\mu = 10$, $\sigma_e = 2$
$P(Y = 1) = P(\ell_i > \tau) = 1 - \Phi\left(\frac{\tau - \mu}{\sigma_e}\right) = 1 - \Phi(1)$, i.e., probability of "success" = 15.87%
[Figure: pdf of $\ell_i$, with the area above $\tau$ labeled Y=1 ("success") and the area below labeled Y=0 ("failure")]
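A quick check of the 15.87% figure in SAS (a minimal sketch, not part of the original deck):

data _null_;
   p = 1 - probnorm((12 - 10)/2);   /* P(liability > threshold) */
   put p=;                           /* approximately 0.1587 */
run;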
Inferential utopia!
Suppose we were able to measure the liabilities directly.
– Also suppose a more general multi-population (trt) model: $\boldsymbol\ell = \mathbf{X}\boldsymbol\beta + \mathbf{e}$; $\mathbf{e} \sim N(\mathbf{0}, \mathbf{R})$; typically $\mathbf{R} = \mathbf{I}\sigma_e^2$
– $\hat{\boldsymbol\beta}_{ML} = \hat{\boldsymbol\beta}_{OLS} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\boldsymbol\ell$
But (sigh...), we of course don't generally observe $\boldsymbol\ell$.
Suppose there are 3 subclasses
Mean liabilities: Herd 1: $\mu_1 = 9$; Herd 2: $\mu_2 = 10$; Herd 3: $\mu_3 = 11$
Use the "corner parameterization": $\boldsymbol\ell = \mathbf{X}\boldsymbol\beta + \mathbf{e}$
Probability of success as a function of effects (we can't observe liabilities, just the observed binary data)
$P(y_{ij} = 1) = P(\ell_{ij} > \tau) = 1 - \Phi\left(\frac{\tau - \mathbf{x}_i'\boldsymbol\beta}{\sigma_e}\right)$
[Figure: shaded areas above $\tau$ under each herd's liability density]
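A small hedged sketch (assuming the herd liability means 9, 10, and 11 with $\tau = 12$ and $\sigma_e = 2$ from the preceding slides) computing each herd's success probability:

data herdprob;
   tau = 12; sigmae = 2;
   do mu = 9, 10, 11;                              /* herd liability means */
      p_success = 1 - probnorm((tau - mu)/sigmae);
      output;
   end;
run;

proc print data=herdprob; run;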
Reparameterize the model
Let $\boldsymbol\beta' = (\mu \;\; \boldsymbol\beta^{*'})$ and $\mathbf{x}_i'\boldsymbol\beta = \mu + \mathbf{x}_i^{*'}\boldsymbol\beta^{*}$.
$\boldsymbol\beta$ cannot be estimated separately from $\sigma_e^2$, i.e., $\sigma_e^2$ is not identifiable.
Reparameterize the model again
Consider the remaining parameters as standardized ratios: $\tau^* = \tau/\sigma_e$, $\mu^* = \mu/\sigma_e$, and $\boldsymbol\beta^{**} = \boldsymbol\beta^*/\sigma_e$ → the same as constraining $\sigma_e = 1$.
Notice that the new threshold is now 12/2 = 6, whereas the mean responses for the three herds are now 9/2, 10/2, and 11/2.
There is still another identifiability problem
Between $\tau$ and $\mu$.
One solution? "Zero out" $\tau$ (i.e., set $\tau = 0$).
Notice that the new threshold is now 0, whereas the mean responses for the three herds are now -1.5, -1, and -0.5.
Note that higher values of $\mathbf{x}_i'\boldsymbol\beta$ translate into lower probabilities of disease (i.e., higher probabilities of "success").
Deriving the likelihood function
Given: $P(y_i = 1 \mid \boldsymbol\beta) = \Phi(\mathbf{x}_i'\boldsymbol\beta)$, i.e.,
$p(y_i \mid \boldsymbol\beta) = \left[\Phi(\mathbf{x}_i'\boldsymbol\beta)\right]^{y_i}\left[1 - \Phi(\mathbf{x}_i'\boldsymbol\beta)\right]^{1-y_i}$, $y_i = 0, 1$
Suppose you have a second animal ($i'$), and suppose animals $i$ and $i'$ are conditionally independent:
$p(y_i, y_{i'} \mid \boldsymbol\beta) = p(y_i \mid \boldsymbol\beta)\, p(y_{i'} \mid \boldsymbol\beta)$
Deriving the likelihood function
More general case (conditional independence), so the likelihood function for the probit model is:
$L(\boldsymbol\beta \mid \mathbf{y}) = \prod_{i=1}^{n}\left[\Phi(\mathbf{x}_i'\boldsymbol\beta)\right]^{y_i}\left[1 - \Phi(\mathbf{x}_i'\boldsymbol\beta)\right]^{1-y_i}$
Alternative, the logistic model: replace $\Phi(\mathbf{x}_i'\boldsymbol\beta)$ with $\frac{\exp(\mathbf{x}_i'\boldsymbol\beta)}{1 + \exp(\mathbf{x}_i'\boldsymbol\beta)}$
Small probit regression example
Data: pairs $(Y_i, X_i)$ [table values not recovered]
Link function = probit
Log likelihood
$l(\boldsymbol\beta) = \sum_{i=1}^{n}\left\{y_i \log \Phi(\mathbf{x}_i'\boldsymbol\beta) + (1-y_i)\log\left[1 - \Phi(\mathbf{x}_i'\boldsymbol\beta)\right]\right\}$
Newton-Raphson and Fisher's scoring updates can be written as on the multivariate slide above, using the first and second derivatives (or their expectations) of $l(\boldsymbol\beta)$.
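The slide's scoring equations did not survive; as a hedged stand-in, here is a minimal PROC IML sketch of Fisher scoring for probit regression (the response vector y and covariate values are invented for illustration):

proc iml;
y = {1, 0, 1, 1, 0};                      /* hypothetical binary responses */
X = j(5, 1, 1) || {1, 2, 3, 4, 5};        /* intercept + one covariate     */
beta = j(2, 1, 0);                        /* starting values               */
do iter = 1 to 25;
   eta   = X * beta;
   p     = cdf("Normal", eta);            /* Phi(x_i'beta)                 */
   phi   = pdf("Normal", eta);            /* standard normal density       */
   w     = phi##2 / (p # (1 - p));        /* probit-link GLM weights       */
   score = X` * (phi # (y - p) / (p # (1 - p)));   /* dl/dbeta             */
   info  = X` * (w # X);                  /* expected information X'WX     */
   beta  = beta + solve(info, score);     /* Fisher scoring update         */
end;
print beta;
quit;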
A SAS program

data example_binary;
   input x y;
   cards;
   [data lines not recovered]
;
run;

proc genmod data=example_binary descending;
   class y;
   model y = x / dist=bin link=probit;
   contrast 'slope' x 1;
run;
Key output
Criteria For Assessing Goodness Of Fit: Criterion, DF, Value, Value/DF (including Log Likelihood) [values not recovered]
Analysis Of Maximum Likelihood Parameter Estimates: Parameter (Intercept, x, Scale), DF, Estimate, Standard Error, Wald 95% Confidence Limits, Wald Chi-Square, Pr > ChiSq [values not recovered]
Contrast Results: Contrast (slope), DF, Chi-Square, Pr > ChiSq, Type = LR [values not recovered]
Wald test
Asymptotic inference: $\hat{\boldsymbol\beta} \sim N\left(\boldsymbol\beta,\, \mathbf{I}(\hat{\boldsymbol\beta})^{-1}\right)$, approximately.
– Reported standard errors are square roots of the diagonals of $\mathbf{I}(\hat{\boldsymbol\beta})^{-1}$.
Hypothesis test of $H_0$: $\mathbf{K}'\boldsymbol\beta = \mathbf{0}$:
$W = (\mathbf{K}'\hat{\boldsymbol\beta})'\left[\mathbf{K}'\mathbf{I}(\hat{\boldsymbol\beta})^{-1}\mathbf{K}\right]^{-1}(\mathbf{K}'\hat{\boldsymbol\beta}) \sim \chi^2_{rank(\mathbf{K})}$ under $H_0$
When is n "large enough" for this to be trustworthy?
Likelihood ratio test
Reduced model:

proc genmod data=example_binary descending;
   class y;
   model y = / dist=bin link=probit;
run;

Criteria For Assessing Goodness Of Fit: Log Likelihood [values not recovered]
$-2(\log L_{reduced} - \log L_{full}) = 2.84$
The test of $H_0$: $\beta_1 = 0$ is $P(\chi^2_1 > 2.84) = 0.09$.
Again... asymptotic.
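A one-line check of the quoted p-value (a sketch, not from the deck):

data _null_;
   p = 1 - probchi(2.84, 1);   /* upper-tail chi-square probability, 1 df */
   put p=;                     /* approximately 0.09 */
run;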
A PROC GLIMMIX "fix" for uncertainty: use asymptotic F-tests rather than $\chi^2$-tests

proc glimmix data=example_binary;
   model y = x / dist=bin link=probit;
   contrast 'slope' x 1;
run;

Type III Tests of Fixed Effects: Effect (x), Num DF, Den DF, F Value, Pr > F [values not recovered]
Contrasts: Label (slope), Num DF, Den DF, F Value, Pr > F [values not recovered]
"Less asymptotic?"
Ordinal Categorical Data
How I learned this:
– "Sire evaluation for ordered categorical data with a threshold model" by Daniel Gianola and Jean-Louis Foulley (1983) in Genetics, Selection, Evolution 15:201-224 (GF83)
– See also Harville and Mee (1984), Biometrics (HM84)
Application:
– Calving ease scores (0 = unassisted, 5 = Caesarean)
– Determined by an underlying continuous liability relative to a set of thresholds: $\tau_1 < \tau_2 < \cdots < \tau_{m-1}$
Liabilities: consider three different herds/subclasses
$\sigma_e = 2$
[Figure: liability densities for the three herds with thresholds $\tau_1$ and $\tau_2$; numeric details not recovered]
Underlying normal densities for each of the three herds; probabilities highlighted for Herd 2.
[Figure not recovered]
Constraints
It is not really possible to separately estimate $\sigma_e$ from $\tau_1$, $\tau_2$, $\mu_1$, $\mu_2$, and $\mu_3$.
Define $\ell^* = \ell/\sigma_e$, $\tau_1^* = \tau_1/\sigma_e$, $\tau_2^* = \tau_2/\sigma_e$, $\mu_1^* = \mu_1/\sigma_e$, $\mu_2^* = \mu_2/\sigma_e$, and $\mu_3^* = \mu_3/\sigma_e$.
Yet another constraint required
Suppose we use the corner parameterization: the intercept $\mu$, when expressed as a ratio over $\sigma_e$, is $\mu^* = \mu/\sigma_e = 5.5$, such that $\tau_1^*$ or $\tau_2^*$ are not separately identifiable from $\mu^*$.
One solution: "zero out" $\mu^*$:
$\tau_1^{**} = \tau_1^* - \mu^* = 4.0 - 5.5 = -1.5$
$\tau_2^{**} = \tau_2^* - \mu^* = 6.0 - 5.5 = 0.5$
[Figure: liability densities after zeroing out $\mu^*$, with $\tau_1^{**} = -1.5$ and $\tau_2^{**} = 0.5$]
Alternative constraint
Estimate $\mu^*$ but "zero out" one of $\tau_1$ or $\tau_2$, say $\tau_1$.
Start with $\mu^* = 5.5$, $\tau_1^* = 4.0$, and $\tau_2^* = 6.0$. Then:
$\mu^{**} = \mu^* - \tau_1^* = 5.5 - 4.0 = 1.5$
$\tau_2^{**} = \tau_2^* - \tau_1^* = 6.0 - 4.0 = 2.0$
One last constraint possibility
Set $\tau_1 = 0$ and $\tau_2$ to an arbitrary value $> \tau_1$, and infer upon $\sigma_e$.
Say $\sigma_e = 2$; $\tau_1$ fixed to 0; $\tau_2$ fixed to 4.
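A hedged sanity check (assuming, consistent with the earlier slides, original thresholds $\tau_1 = 8$ and $\tau_2 = 12$, so that $\tau_1/\sigma_e = 4.0$ and $\tau_2/\sigma_e = 6.0$ with $\sigma_e = 2$): all of these constraint choices leave the category probabilities unchanged. For Herd 1 ($\mu_1 = 9$):
$P(y = 1) = \Phi\left(\frac{\tau_1 - \mu_1}{\sigma_e}\right) = \Phi\left(\frac{8 - 9}{2}\right) = \Phi(-0.5) \approx 0.31$
and after shifting so that $\tau_1 = 0$ (the Herd 1 mean becomes $9 - 8 = 1$):
$P(y = 1) = \Phi\left(\frac{0 - 1}{2}\right) = \Phi(-0.5) \approx 0.31$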
Likelihood function for Ordinal Categorical Data
Based on the multinomial (m categories):
$P(y_i = k \mid \boldsymbol\beta, \boldsymbol\tau) = \Phi(\tau_k - \mathbf{x}_i'\boldsymbol\beta) - \Phi(\tau_{k-1} - \mathbf{x}_i'\boldsymbol\beta)$
where $\tau_0 = -\infty$ and $\tau_m = +\infty$.
Likelihood: $L(\boldsymbol\beta, \boldsymbol\tau \mid \mathbf{y}) = \prod_{i=1}^{n}\prod_{k=1}^{m}\left[\Phi(\tau_k - \mathbf{x}_i'\boldsymbol\beta) - \Phi(\tau_{k-1} - \mathbf{x}_i'\boldsymbol\beta)\right]^{1(y_i = k)}$
Log likelihood: $l(\boldsymbol\beta, \boldsymbol\tau) = \sum_{i=1}^{n}\sum_{k=1}^{m} 1(y_i = k)\,\log\left[\Phi(\tau_k - \mathbf{x}_i'\boldsymbol\beta) - \Phi(\tau_{k-1} - \mathbf{x}_i'\boldsymbol\beta)\right]$
Hypothetical small example
Ordinal outcome having 3 possible categories; two subjects in the dataset:
– The first subject has a response of 1 whereas the second has a response of 3.
– Their contribution to the log likelihood:
$\log\left[\Phi(\tau_1 - \mathbf{x}_1'\boldsymbol\beta)\right] + \log\left[1 - \Phi(\tau_2 - \mathbf{x}_2'\boldsymbol\beta)\right]$
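A minimal hedged sketch evaluating that contribution in SAS, with assumed values $\tau_1 = -1.5$, $\tau_2 = 0.5$, and $\mathbf{x}_i'\boldsymbol\beta = 0$ for both subjects (these match the NLMIXED starting values used later):

data _null_;
   tau1 = -1.5; tau2 = 0.5;
   eta1 = 0; eta2 = 0;                      /* assumed linear predictors */
   ll = log(probnorm(tau1 - eta1))          /* subject 1: y = 1 */
      + log(1 - probnorm(tau2 - eta2));     /* subject 2: y = 3 */
   put ll=;
run;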
Solving for ML
Let's use Fisher's scoring.
– For a three-or-more-category problem, the unknowns $\boldsymbol\theta' = (\tau_1, \ldots, \tau_{m-1}, \boldsymbol\beta')$ are updated jointly:
$\boldsymbol\theta^{[t+1]} = \boldsymbol\theta^{[t]} + \left[\mathbf{I}(\boldsymbol\theta^{[t]})\right]^{-1}\frac{\partial l}{\partial \boldsymbol\theta}\bigg|_{\boldsymbol\theta = \boldsymbol\theta^{[t]}}$
Setting up Fisher's scoring: 2nd derivatives
The required second derivatives of $l$ with respect to $(\boldsymbol\tau, \boldsymbol\beta)$, and the expected information built from them, are given in GF83 and HM84.
Setting up Fisher's scoring: 1st derivatives
The required first derivatives of $l$ with respect to $(\boldsymbol\tau, \boldsymbol\beta)$ are given in GF83.
Fisher's scoring algorithm
So: iterate the joint update of $(\boldsymbol\tau, \boldsymbol\beta)$ above, combining the score vector with the inverse expected information, until convergence.
Data from GF (1983)
[Table: 28 records with columns H, A, G, S, Y; the flattened listing is not fully recoverable]
H: Herd (1 or 2)
A: Age of dam (2 = young heifer, 3 = older cow)
G: Gender or sex (M and F)
S: Sire of calf (1, 2, 3, or 4)
Y: Ordinal response (1, 2, or 3)
SAS code: let's just consider sex in the model

proc glimmix data=gf83;
   model y = sex / dist=mult link=cumprobit solutions;
   estimate 'Category 1 Female'   intercept 1 0 sex 1 / ilink;
   estimate 'Category 1 Male'     intercept 1 0 sex 0 / ilink;
   estimate 'Category <=2 Female' intercept 0 1 sex 1 / ilink;
   estimate 'Category <=2 Male'   intercept 0 1 sex 0 / ilink;
run;

Subtle difference in parameterization between Gianola & Foulley (1983) and PROC GLIMMIX. Here sex = 1 if female, 0 if male.
Parameter Estimates: Effect (Intercept 1, Intercept 2, Sex), Estimate, Standard Error, DF, t Value, Pr > |t| [values not recovered]
Type III Tests of Fixed Effects: Effect (sex), Num DF, Den DF, F Value, Pr > F [values not recovered]
Estimated Cumulative Probabilities: Label (Category 1 Female, Category 1 Male, Category <=2 Female, Category <=2 Male), Estimate, Standard Error, DF, t Value, Pr > |t|, Mean, Standard Error Mean [values not recovered]
Asymptotics?
PROC NLMIXED (fix $\mu = 0$, $\sigma_e = 1$; estimate $\beta_1$, $\tau_1$, $\tau_2$)

proc nlmixed data=gf83;
   parms beta1=0 thresh1=-1.5 thresh2=0.5;
   eta = beta1*sex;
   if (y=1) then p = probnorm(thresh1-eta) - 0;
   else if (y=2) then p = probnorm(thresh2-eta) - probnorm(thresh1-eta);
   else if (y=3) then p = 1 - probnorm(thresh2-eta);
   if (p > 1e-8) then ll = log(p);
   else ll = -1e100;
   model y ~ general(ll);
   estimate 'Category 1 Female'   probnorm(thresh1-beta1);
   estimate 'Category 1 Male'     probnorm(thresh1-0);
   estimate 'Category <=2 Female' probnorm(thresh2-beta1);
   estimate 'Category <=2 Male'   probnorm(thresh2-0);
run;
Key output from PROC NLMIXED
Parameter Estimates: Parameter (beta1, thresh1, thresh2), Estimate, Standard Error, DF, t Value, Pr > |t| [values not recovered]
Additional Estimates: Label (Category 1 Female, Category 1 Male, Category <=2 Female, Category <=2 Male), Estimate, Standard Error [values not recovered]
Yet another alternative (fix $\tau_1$, $\tau_2$; estimate $\beta_1$, $\sigma_e$, $\mu$)

proc nlmixed data=gf83;
   parms beta1=0 sigmae=1 mu=0;
   thresh1 = 0; thresh2 = 0.5;
   eta = mu + beta1*sex;
   if (y=1) then p = probnorm((thresh1-eta)/sigmae);
   else if (y=2) then p = probnorm((thresh2-eta)/sigmae) - probnorm((thresh1-eta)/sigmae);
   else if (y=3) then p = 1 - probnorm((thresh2-eta)/sigmae);
   if (p > 1e-8) then ll = log(p);
   else ll = -1e100;
   model y ~ general(ll);
   estimate 'Category 1 Female'   probnorm((thresh1-(mu+beta1))/sigmae);
   estimate 'Category 1 Male'     probnorm((thresh1-mu)/sigmae);
   estimate 'Category <=2 Female' probnorm((thresh2-(mu+beta1))/sigmae);
   estimate 'Category <=2 Male'   probnorm((thresh2-mu)/sigmae);
run;
Parameter Estimates: Parameter (beta1, sigmae, mu), Estimate, Standard Error [values not recovered]
Additional Estimates: Label (Category 1 Female, Category 1 Male, Category <=2 Female, Category <=2 Male), Estimate, Standard Error [values not recovered]
This is not inference on overdispersion! It's merely a reparameterization.
What is overdispersion from an experimental design perspective?
No overdispersion is identifiable for binary data... so why is overdispersion possible for binomial data?
– It's merely a cluster (block) effect.
Binomial responses:
– Consist of a y/n response.
– Actually, each "response" is a combined total for a cluster with n contributing binary responses: y of them successes, n - y failures.
Similar arguments hold for overdispersion in Poisson data and in n = 1 vs. n > 1 multinomials.
Hessian Fly Data Example (Gotway and Stroup, 1997)
[Table: Obs, Y, n, block, entry, lat, lng, rep; values not recovered]
Available from the SAS PROC GLIMMIX documentation.
PROC GLIMMIX code

title "G side independence";
proc glimmix data=HessianFly;
   class block entry rep;
   model y/n = entry;
   random rep / subject=intercept;
run;

A much richer (e.g., spatial) analysis is provided by Gotway and Stroup (1997) and Stroup's workshop (2011).
Key portions of output
Number of Observations Read: 64; Number of Observations Used: 64; Number of Events: 396; Number of Trials: 736
Covariance Parameter Estimates: Cov Parm = rep, Subject = Intercept; Estimate and Standard Error [values not recovered]
Hessian Fly Data in "individual" binary form
[Table: Obs, entry, rep, z; one binary record per trial; values not recovered]
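The deck does not show how the 736 binary records were created; a minimal hedged sketch (assuming the events/trials dataset is named HessianFly with variables y and n) is:

data HessianFlyIndividual;
   set HessianFly;
   do i = 1 to n;
      z = (i <= y);   /* one record per trial: 1 for each success, 0 otherwise */
      output;
   end;
   drop i;
run;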
PROC GLIMMIX code for "individual" data

title "G side independence";
proc glimmix data=HessianFlyIndividual;
   class rep entry;
   model z = entry / dist=bin;
   random intercept / subject=rep;   /* equivalently: random rep; */
run;
Key portions of output
Number of Observations Read: 736; Number of Observations Used: 736
Covariance Parameter Estimates: Cov Parm = Intercept, Subject = rep; Estimate and Standard Error [values not recovered]