# Comparing Margins of Multivariate Binary Data

Bernhard Klingenberg, Assoc. Prof. of Statistics, Williams College, MA
www.williams.edu/~bklingen


## Outline

- Challenges:
  - Associations of various degrees among the binary variables
  - Simultaneous inference
  - Sparse and/or unbalanced data
  - Test statistics with discrete support
  - Asymptotic theory questionable
- Setup:
  - Two independent groups
  - Response: vector of k correlated binary variables (multivariate binary)
- Goal: inference about the k margins
  - Marginal risk differences
  - Marginal risk ratios

## Motivating Examples

- From drug safety or animal toxicity/carcinogenicity studies

Source: http://us.gsk.com/products/assets/us_advair.pdf

Source: http://www.pfizer.com/files/products/uspi_lipitor.pdf

## Example: Adverse Events (AEs) from a Vaccine Trial (Flu Shot)

    > head(Y1)  # ACTIVE treatment, n1 = 1971
    ID HEADACHE PAIN MYALGIA ARTHRALGIA MALAISE FATIGUE CHILLS
     2        1    1       1          1       1       1      1
     4        0    1       1          0       0       1      0
     5        1    0       0          0       0       0      0
     6        1    1       1          1       1       1      1
     7        0    0       0          0       0       1      0
     9        1    0       1          1       1       1      1

    > head(Y2)  # PLACEBO treatment, n2 = 1554
    ID HEADACHE PAIN MYALGIA ARTHRALGIA MALAISE FATIGUE CHILLS
     1        0    0       0          0       0       0      0
     3        0    0       0          0       0       0      0
     8        0    0       0          0       1       0      0
    10        0    0       0          0       0       0      0
    11        0    0       0          0       0       0      0
    15        0    0       1          0       0       1      0

## Notation and Setup

- k-dimensional response vectors, one for Group 1 and one for Group 2: [formulas not preserved]
- Random sample in each group (Group 1, Group 2): [formulas not preserved]
- The joint distribution in each group depends on $2^k - 1$ parameters.
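One conventional way to write this setup is sketched below. The symbols $\mathbf{Y}_i$, $\pi_i$, $p_{ij}$ are assumptions chosen for illustration and need not match the notation on the original slide.

$$
\begin{aligned}
\mathbf{Y}_i &= (Y_{i1}, \dots, Y_{ik}) \in \{0,1\}^k, \qquad i = 1, 2 \ \text{(group)},\\
\pi_i(\mathbf{a}) &= P(\mathbf{Y}_i = \mathbf{a}), \quad \mathbf{a} \in \{0,1\}^k, \qquad \sum_{\mathbf{a}} \pi_i(\mathbf{a}) = 1,\\
p_{ij} &= P(Y_{ij} = 1) = \sum_{\mathbf{a}:\, a_j = 1} \pi_i(\mathbf{a}), \qquad j = 1, \dots, k.
\end{aligned}
$$

The $2^k$ cell probabilities sum to one, which leaves $2^k - 1$ free parameters per group, while the margins involve only $k$ quantities. For the $k = 7$ adverse events in the vaccine example that is 127 joint parameters versus 7 margins per group.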

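## Comparing Margins

- Usually only interested in the k margins of Group 1 and Group 2.
- With just two (k = 2) adverse events, each group is summarized by a 2×2 table.

[Tables: for each of Group 1 and Group 2, a 2×2 cross-classification of Headache (No/Yes) by Pain (No/Yes); cell entries not preserved]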

## Comparing Margins

Differences in marginal incidence rates between Group 1 (Treatment) and Group 2 (Control):

| Adverse event       | Group 1 | Group 2 | Diff   |
|---------------------|---------|---------|--------|
| HEADACHE            | 0.2603  | 0.2407  | 0.0196 |
| INJECTION SITE PAIN | 0.6088  | 0.1384  | 0.4705 |
| MYALGIA             | 0.2588  | 0.1088  | 0.1500 |
| ARTHRALGIA          | 0.0893  | 0.0579  | 0.0314 |
| MALAISE             | 0.2085  | 0.1332  | 0.0753 |
| FATIGUE             | 0.2476  | 0.2098  | 0.0378 |
| CHILLS              | 0.0928  | 0.0463  | 0.0465 |
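The marginal incidence rates are simply column means of the 0/1 data matrices. A minimal sketch in R, using only the six rows per group printed by `head(Y1)` and `head(Y2)` in the vaccine example, so the numbers will not match the full-sample table; the names `p1`, `p2`, and `d` are illustrative:

```r
ae <- c("HEADACHE", "PAIN", "MYALGIA", "ARTHRALGIA", "MALAISE", "FATIGUE", "CHILLS")

# First six subjects of each group, as printed on the example slide
Y1 <- matrix(c(1, 1, 1, 1, 1, 1, 1,
               0, 1, 1, 0, 0, 1, 0,
               1, 0, 0, 0, 0, 0, 0,
               1, 1, 1, 1, 1, 1, 1,
               0, 0, 0, 0, 0, 1, 0,
               1, 0, 1, 1, 1, 1, 1),
             ncol = 7, byrow = TRUE, dimnames = list(NULL, ae))
Y2 <- matrix(c(0, 0, 0, 0, 0, 0, 0,
               0, 0, 0, 0, 0, 0, 0,
               0, 0, 0, 0, 1, 0, 0,
               0, 0, 0, 0, 0, 0, 0,
               0, 0, 0, 0, 0, 0, 0,
               0, 0, 1, 0, 0, 1, 0),
             ncol = 7, byrow = TRUE, dimnames = list(NULL, ae))

p1 <- colMeans(Y1)   # marginal incidence rates, Group 1
p2 <- colMeans(Y2)   # marginal incidence rates, Group 2
d  <- p1 - p2        # marginal risk differences

round(cbind(Group1 = p1, Group2 = p2, Diff = d), 4)
```

With the full data matrices in place of these six-row excerpts, the same three lines would give the rates and differences for all subjects.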

## Family of Tests

- j-th null hypothesis: [formula not preserved]
- Unrestricted and restricted MLEs: [formulas not preserved]

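## Comparing Margins

Estimates of marginal incidence rates and test statistics comparing Group 1 (Treatment) and Group 2 (Control):

| Adverse event | $\hat{p}_1$ | $\hat{p}_2$ | $\check{p}$ | $\tilde{p}$ | Wald  | Local | Global |
|---------------|-------|-------|-------|-------|-------|-------|--------|
| HEADACHE      | 0.260 | 0.241 | 0.252 | 0.260 | 1.34  | 1.33  | 1.32   |
| PAIN          | 0.609 | 0.138 | 0.401 | 0.405 | 33.47 | 28.29 | 28.26  |
| MYALGIA       | 0.259 | 0.109 | 0.193 | 0.210 | 11.87 | 11.21 | 10.85  |
| ARTHRALGIA    | 0.089 | 0.058 | 0.076 | 0.082 | 3.59  | 3.50  | 3.37   |
| MALAISE       | 0.209 | 0.133 | 0.175 | 0.196 | 5.99  | 5.84  | 5.60   |
| FATIGUE       | 0.248 | 0.210 | 0.231 | 0.244 | 2.66  | 2.64  | 2.59   |
| CHILLS        | 0.093 | 0.046 | 0.072 | 0.085 | 5.51  | 5.29  | 4.93   |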
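The Wald column can be recomputed from the four-decimal incidence rates and the group sizes n1 = 1971, n2 = 1554. The sketch below also computes a score-type statistic that uses the pooled estimate in the standard error. Reading the p-check column as this pooled estimate, and the "Local" column as the corresponding statistic, is an assumption: the slide does not define them, although the numbers appear to agree up to rounding. The "Global" column is not reproduced here.

```r
n1 <- 1971; n2 <- 1554   # group sizes from the vaccine example
ae <- c("HEADACHE", "PAIN", "MYALGIA", "ARTHRALGIA", "MALAISE", "FATIGUE", "CHILLS")
p1 <- c(0.2603, 0.6088, 0.2588, 0.0893, 0.2085, 0.2476, 0.0928)  # Group 1 rates
p2 <- c(0.2407, 0.1384, 0.1088, 0.0579, 0.1332, 0.2098, 0.0463)  # Group 2 rates
names(p1) <- names(p2) <- ae

d <- p1 - p2   # marginal risk differences

# Wald statistic: standard error from the unrestricted estimates
se_wald <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
wald <- d / se_wald

# Score-type statistic (assumed form): standard error from the pooled estimate
p_pool <- (n1 * p1 + n2 * p2) / (n1 + n2)
se_pool <- sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
score_pooled <- d / se_pool

round(cbind(p_pool, wald, score_pooled), 3)
```

The pooled standard error is larger than the Wald one when the two rates differ strongly, which is why the gap between the two statistics is widest for PAIN.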

## Asymptotic Test

- Note: [formula not preserved]
- Asymptotically multivariate normal, with covariance matrix determined by [formula not preserved]

## Asymptotic Test

Correlation matrix:

    > round(cov2cor(Sigma), 2)
         d1   d2   d3   d4   d5   d6   d7
    d1 1.00 0.04 0.29 0.26 0.38 0.41 0.27
    d2      1.00 0.18 0.09 0.08 0.10 0.01
    d3           1.00 0.46 0.35 0.36 0.30
    d4                1.00 0.33 0.33 0.32
    d5                     1.00 0.51 0.44
    d6                          1.00 0.37
    d7                               1.00
    > qmvnorm(0.95, tail = "both.tails", corr = cov2cor(Sigma))
    $quantile
    [1] 2.656222

## Asymptotic Test

Correlation matrix:

    > round(cov2cor(Sigma), 2)
         d1   d2   d3   d4   d5   d6   d7
    d1 1.00 0.06 0.33 0.28 0.41 0.41 0.29
    d2      1.00 0.28 0.11 0.15 0.12 0.09
    d3           1.00 0.46 0.41 0.36 0.35
    d4                1.00 0.32 0.34 0.28
    d5                     1.00 0.50 0.47
    d6                          1.00 0.37
    d7                               1.00
    > qmvnorm(0.95, tail = "both.tails", corr = cov2cor(Sigma))
    $quantile
    [1] 2.653783
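The quantile returned by `qmvnorm()` is the value c with P(max_j |Z_j| ≤ c) = 0.95 for a multivariate normal vector Z with the given correlation matrix. As a rough cross-check that needs only base R, the same quantity can be approximated by simulation. This is a Monte Carlo sketch, not the numerical integration that `mvtnorm` performs. It uses the second correlation matrix as printed, rounded to two decimals, so agreement is only approximate.

```r
k <- 7
# Upper-triangle correlations of the second matrix, read row by row
vals <- c(0.06, 0.33, 0.28, 0.41, 0.41, 0.29,
          0.28, 0.11, 0.15, 0.12, 0.09,
          0.46, 0.41, 0.36, 0.35,
          0.32, 0.34, 0.28,
          0.50, 0.47,
          0.37)

# lower.tri() fills column by column, which matches the row-wise upper triangle
R <- matrix(0, k, k)
R[lower.tri(R)] <- vals
R <- R + t(R)
diag(R) <- 1

set.seed(1)
n_sim <- 200000
# Rows of Z are draws from N(0, R): chol(R) is upper triangular U with t(U) %*% U = R
Z <- matrix(rnorm(n_sim * k), nrow = n_sim) %*% chol(R)
max_abs <- apply(abs(Z), 1, max)

# Two-sided equicoordinate 95% critical value (Monte Carlo approximation)
c_mc <- quantile(max_abs, 0.95, names = FALSE)
c_mc
```

The simulated value should land close to the 2.65 reported on the slide, with the remaining difference due to simulation error and the rounding of the correlations.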

## Permutation Approach

- When testing [formula not preserved], can use a permutation approach.
- This assumes the distributions are exchangeable (i.e. identical), a much stronger assumption than the null requires.
- Need two extra conditions:
  1. Sequences of all 0's are as likely or more likely to occur under Group 2 (Control).
  2. Sequences of all 1's are as likely or more likely to occur under Group 1 (Treatment).
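A permutation critical value for the maximum statistic can be sketched as follows: shuffle the group labels, recompute all k statistics, record the largest in absolute value, and take the 95% quantile over many shuffles. Everything specific here is an assumption for illustration: the data are simulated rather than the trial data, `score_stats` is a hypothetical helper using a pooled-variance statistic, and the two-sided maximum is used. The code does not check the two extra conditions listed above.

```r
# Hypothetical helper: pooled-variance statistic for each of the k margins
score_stats <- function(Y, g) {
  n1 <- sum(g == 1)
  n2 <- sum(g == 2)
  p1 <- colMeans(Y[g == 1, , drop = FALSE])
  p2 <- colMeans(Y[g == 2, , drop = FALSE])
  pp <- colMeans(Y)   # pooled estimate, unchanged by permuting labels
  (p1 - p2) / sqrt(pp * (1 - pp) * (1 / n1 + 1 / n2))
}

set.seed(42)
n1 <- 200; n2 <- 150; k <- 3
n <- n1 + n2

# Simulated correlated binary responses: a shared latent term induces association
u <- rnorm(n)
Y <- sapply(1:k, function(j) as.integer(u + rnorm(n) > 0.8))
g <- rep(1:2, c(n1, n2))   # group labels

# Permutation distribution of the maximum absolute statistic
B <- 2000
max_perm <- replicate(B, max(abs(score_stats(Y, sample(g)))))
c_perm <- quantile(max_perm, 0.95, names = FALSE)

obs <- score_stats(Y, g)
reject <- abs(obs) > c_perm   # simultaneous tests at overall level 0.05
c_perm
```

Because whole response vectors are permuted together, the dependence among the k binary variables is preserved in every shuffle, which is what makes the resulting critical value a multiplicity adjustment.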

## Permutation vs. Asymptotic

- Permutation vs. asymptotic distribution of [formula not preserved]
- Critical values (α = 0.05): $c_{\text{perm}} = 2.655$, $c_{\text{asympt}} = 2.654$, $c_{\text{Bonf}} = 2.690$

[Figure: permutation distribution and asymptotic distribution]
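The Bonferroni value can be checked directly: with k = 7 two-sided tests at overall level α = 0.05, each test uses level α/k, so the critical value is the 1 − α/(2k) standard normal quantile. A short sketch in base R (variable names are illustrative):

```r
alpha <- 0.05
k <- 7   # number of adverse events (margins)

c_bonf  <- qnorm(1 - alpha / (2 * k))   # two-sided Bonferroni critical value
c_unadj <- qnorm(1 - alpha / 2)         # unadjusted two-sided critical value

round(c(unadjusted = c_unadj, bonferroni = c_bonf), 3)
```

The Bonferroni value should agree with the 2.690 quoted above. The permutation and asymptotic values are slightly smaller because they account for the positive correlation among the statistics.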

## Simultaneous Confidence Intervals

- Invert the family of tests to obtain a confidence region: [formula not preserved]
- Simplifies to simultaneous confidence intervals if [condition not preserved]

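## Simultaneous Confidence Intervals

Results: inverting the score test

| Adverse event | diff   | LB      | UB     |
|---------------|--------|---------|--------|
| HEADACHE      | 0.0196 | -0.0196 | 0.0583 |
| PAIN          | 0.4705 | 0.4323  | 0.5069 |
| MYALGIA       | 0.1500 | 0.1162  | 0.1835 |
| ARTHRALGIA    | 0.0314 | 0.0078  | 0.0547 |
| MALAISE       | 0.0753 | 0.0416  | 0.1086 |
| FATIGUE       | 0.0378 | -0.0002 | 0.0752 |
| CHILLS        | 0.0465 | 0.0239  | 0.0692 |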

## Simultaneous Confidence Intervals

- We used (and recommend) the score statistic.
- Could use the Wald statistic instead.
- This is equivalent to fitting a marginal model via GEE: [formula not preserved]
- Asymptotically multivariate normal, with (sandwich) covariance matrix (same as before)
- Use the distribution of [formula not preserved] for multiplicity adjustment.

## Simultaneous Confidence Intervals

Results: GEE approach (= inverting the Wald test)

| Adverse event | diff   | LB      | UB     |
|---------------|--------|---------|--------|
| HEADACHE      | 0.0196 | -0.0194 | 0.0586 |
| PAIN          | 0.4705 | 0.4331  | 0.5078 |
| MYALGIA       | 0.1500 | 0.1164  | 0.1836 |
| ARTHRALGIA    | 0.0314 | 0.0082  | 0.0546 |
| MALAISE       | 0.0753 | 0.0419  | 0.1087 |
| FATIGUE       | 0.0378 | 0.0001  | 0.0755 |
| CHILLS        | 0.0465 | 0.0241  | 0.0689 |
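Inverting the Wald test gives intervals of the familiar form: estimate ± critical value × standard error, with the multiplicity-adjusted critical value in place of 1.96. The sketch below recomputes the limits from the four-decimal incidence rates and n1 = 1971, n2 = 1554. It assumes the critical value 2.656222 from the first `qmvnorm()` output is the one used for the Wald intervals; the slides do not say which of the two quantiles applies, and at this precision either gives nearly the same limits.

```r
n1 <- 1971; n2 <- 1554   # group sizes from the vaccine example
ae <- c("HEADACHE", "PAIN", "MYALGIA", "ARTHRALGIA", "MALAISE", "FATIGUE", "CHILLS")
p1 <- c(0.2603, 0.6088, 0.2588, 0.0893, 0.2085, 0.2476, 0.0928)  # Group 1 rates
p2 <- c(0.2407, 0.1384, 0.1088, 0.0579, 0.1332, 0.2098, 0.0463)  # Group 2 rates
names(p1) <- names(p2) <- ae

c_crit <- 2.656222   # assumed: multiplicity-adjusted quantile from qmvnorm()

d  <- p1 - p2
se <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)   # unrestricted (Wald) SE

ci <- cbind(diff = d, LB = d - c_crit * se, UB = d + c_crit * se)
round(ci, 4)
```

Small discrepancies in the last digit relative to the table are expected, since the inputs here are already rounded to four decimals.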
