Matched designs need matched analysis



Incorrect unmatched analysis

. cc cc exp, exact

[Stata output: a 2x2 table of Cases and Controls by Exposed and Unexposed, with totals and the proportion exposed, followed by the point estimate and exact 95% confidence interval for the odds ratio, the attributable fractions (exposed and population), and the 1-sided and 2-sided Fisher's exact P-values. The numeric values are not preserved.]

This analysis ignores the fact that a matching control was found for each case. Notice that the ‘sample size’ looks to be 1242, and yet there is no evidence of a disease-exposure relationship.

Correct classical analysis

. reshape wide cc exp, i(pair) j(ct)
(note: j = 1 2)

Data                      long   ->   wide
Number of obs.            1242   ->   621
Number of variables          4   ->   5
j variable (2 values)       ct   ->   (dropped)
xij variables:
                            cc   ->   cc1 cc2
                           exp   ->   exp1 exp2

. mcc exp2 exp1

[Stata output: a 2x2 table of Cases by Controls (Exposed and Unexposed), with a row total of 106 for exposed cases and 621 pairs overall, followed by McNemar's chi2(1) = 5.76 with its Prob > chi2, the exact McNemar significance probability, and the exact odds ratio. The remaining numeric values are not preserved.]

The ‘sample size’ is now 21, the number of discordant pairs! But the p-value is less than 5%, and the estimated odds ratio is very different from that of the incorrect analysis.
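The McNemar statistic quoted above can be checked by hand. The sketch below assumes the 21 discordant pairs split 16 to 5; that split is inferred from the observed k = 16 on the exact-binomial slide and is not shown explicitly in the table, so treat it as an assumption. Here b and c denote the two discordant cell counts.

```latex
\[
\chi^2_{\text{McNemar}} \;=\; \frac{(b-c)^2}{b+c}
\;=\; \frac{(16-5)^2}{16+5}
\;=\; \frac{121}{21} \;\approx\; 5.76,
\qquad
\widehat{\mathrm{OR}} \;=\; \frac{b}{c}
\]
```

Under this assumed split the matched odds ratio is 16/5 = 3.2, or its reciprocal if the cells are labelled the other way round. The 600 concordant pairs appear in neither formula.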

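Exact p-value is just the binomial

. bitesti (arguments not preserved)

[Stata output: N, Observed k, Expected k, Assumed p and Observed p, followed by Pr(k >= 16) (one-sided test), Pr(k <= 16) (one-sided test) and the two-sided test probability. The numeric values are not preserved.]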
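A minimal sketch of how this exact test could be reproduced with Stata's immediate commands. The arguments are assumptions inferred from the slides (21 discordant pairs, observed k = 16, null probability 0.5), since the original command line is not preserved.

```stata
* Exact binomial test on the discordant pairs only.
* Assumed inputs: N = 21 discordant pairs, k = 16, null p = 0.5.
bitesti 21 16 0.5

* The same two-sided p-value by hand: the null distribution is symmetric,
* so it is twice the upper tail Pr(K >= 16), roughly 0.027.
display 2*binomialtail(21, 16, 0.5)
```

Because the assumed null probability is 0.5, doubling the upper tail matches the two-sided value that bitesti reports; with an asymmetric null the two would generally differ.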

Conditional logistic regression version of the correct classical analysis

. clogit exp cc, group(pair)
note: multiple positive outcomes within groups encountered.
note: 600 groups (1200 obs) dropped due to all positive or all negative outcomes.

Conditional (fixed-effects) logistic regression     Number of obs = 42

[Stata output: LR chi2(1), then Coef., Std. Err., z, P>|z| and [95% Conf. Interval] for cc. The numeric values are not preserved.]

. clogit exp cc, group(pair) or
note: multiple positive outcomes within groups encountered.
note: 600 groups (1200 obs) dropped due to all positive or all negative outcomes.

Conditional (fixed-effects) logistic regression     Number of obs = 42

[Stata output: LR chi2(1), then Odds Ratio, Std. Err., z, P>|z| and [95% Conf. Interval] for cc. The numeric values are not preserved.]

P-values and CIs are based on the normal approximation to the binomial. The 600 concordant pairs are correctly ‘dropped’.
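To see that the concordant pairs carry no information, one can rebuild a toy version of the paired data and rerun the model. This is only a sketch: the 16/5 split of the discordant pairs is inferred from the slides, the direction of that split is assumed, and the 600 concordant pairs are arbitrarily coded as both unexposed because their true breakdown is not preserved. The helper variables case_exposed and ctrl_exposed are hypothetical names.

```stata
* Toy reconstruction of 621 matched pairs (assumptions noted above).
clear
set obs 621
generate pair = _n

* Pairs 1-16: case exposed, control unexposed (assumed direction).
generate byte case_exposed = (pair <= 16)
* Pairs 17-21: control exposed, case unexposed.
generate byte ctrl_exposed = (pair > 16 & pair <= 21)
* Pairs 22-621: concordant, here both unexposed (arbitrary choice).

* One row per person: the second copy within each pair is the case.
expand 2
bysort pair: generate byte cc = (_n == 2)
generate byte exp = cond(cc, case_exposed, ctrl_exposed)

* Same model as on the slide; the 600 concordant pairs should be dropped.
clogit exp cc, group(pair) or

* For 1:1 matching the conditional estimate is the discordant-pair ratio.
display 16/5
```

Recoding the concordant pairs as both exposed instead should leave the estimate unchanged, which is the point of the slide.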

4 matching controls per case: LA Study of Endometrial Cancer

. use "C:\Mdsc643.02\la.dta", clear
(LA Study of Endometrial Cancer)

. desc

Contains data from C:\Mdsc643.02\la.dta
  obs:   315                        LA Study of Endometrial Cancer
 vars:   [count and timestamp only partly preserved: Nov ... :43]
 size:   15,120 (98.6% of memory free)   (_dta has notes)

variable   storage   display   value
name       type      format    label    variable label
row        float     %9.0g
age        float     %9.0g              Age (yr)
gbd        float     %9.0g     yn       Gall Bladder Disease
hyp        float     %9.0g     yn       Hypertension
obe        float     %9.0g     yn       Obesity
est        float     %9.0g     yn       Estrogen (Any) Use
conj       float     %9.0g     cl       Conjugated Dose
dur        float     %9.0g              Estrogen Duration (mo)
ned        float     %9.0g     yn       Non Estrogen Drug
cc         float     %9.0g     ccl      Case/Control
quint      float     %9.0g              4 Controls: 1 Case

Sorted by: quint

Incorrect analysis

. cc cc est, exact

[Stata output: a 2x2 table of Cases and Controls by Exposed and Unexposed, with 56 exposed and 7 unexposed cases; the control counts and totals are not preserved. It is followed by the odds ratio with its exact 95% confidence interval, Attr. frac. ex. = .873 and Attr. frac. pop = .776, and the 1-sided and 2-sided Fisher's exact P-values, whose values are not preserved.]

Classical analysis

. drop row
. sort quint cc
. by quint: gen otf=_n
. reshape wide cc age gbd hyp obe est conj dur ned, i(quint) j(otf)
(note: j = 1 2 3 4 5)

Data                      long   ->   wide
Number of obs.             315   ->   63
Number of variables         11   ->   46
j variable (5 values)      otf   ->   (dropped)
xij variables:
                            cc   ->   cc1 cc2 ... cc5
                           age   ->   age1 age2 ... age5
                           gbd   ->   gbd1 gbd2 ... gbd5
                           hyp   ->   hyp1 hyp2 ... hyp5
                           obe   ->   obe1 obe2 ... obe5
                           est   ->   est1 est2 ... est5
                          conj   ->   conj1 conj2 ... conj5
                           dur   ->   dur1 dur2 ... dur5
                           ned   ->   ned1 ned2 ... ned5

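A new table

. gen sumcon=est1+est2+est3+est4
. gen sumcas=est5
. table sumcas sumcon

[Stata output: a cross-tabulation of sumcas (whether the case is exposed) by sumcon (the number of exposed controls, 0 to 4). The cell counts are not preserved.]

There are 5 concordant sets, in which the case and all four controls have the same exposure. Exact p-values for the remaining sets are based on the binomial with p = 1/5, 2/5, 3/5 and 4/5, one for each possible number of exposed members in a matched set of five.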
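The binomial probabilities 1/5 to 4/5 follow from a short conditioning argument, sketched here in notation that is not on the slides: m is the number of exposed members in a matched set of five, n_m the number of sets with that m, and K_m the number of those sets in which the case is exposed.

```latex
\[
\Pr(\text{case exposed} \mid m \text{ of the 5 members exposed},\ H_0) \;=\; \frac{m}{5},
\qquad m = 0, 1, \dots, 5,
\]
\[
K_m \mid n_m \;\sim\; \mathrm{Binomial}\!\left(n_m,\ \tfrac{m}{5}\right)
\quad \text{under } H_0 .
\]
```

Under the null hypothesis each of the five members is equally likely to be the case, which gives m/5. For m = 0 and m = 5 the probability is 0 or 1, so those sets (the concordant ones) contribute nothing to the test.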

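Components of the p-value

. bitesti (arguments not preserved)

[Stata output: N, Observed k, Expected k, Assumed p and Observed p, followed by Pr(k >= 3) (one-sided test), Pr(k <= 3) (one-sided test) and Pr(k >= 3) (two-sided test). The numeric values are not preserved.]
note: lower tail of two-sided p-value is empty

. bitesti (arguments not preserved)

[Stata output: N, Observed k, Expected k, Assumed p and Observed p, followed by Pr(k >= 17) (one-sided test), Pr(k <= 17) (one-sided test) and Pr(k >= 17) (two-sided test). The numeric values are not preserved.]
note: lower tail of two-sided p-value is empty

. return list

scalars:
    r(p) = [mantissa not preserved]e-06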

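Next 2 p-values

. bitesti (arguments not preserved)

[Stata output: N, Observed k, Expected k, Assumed p and Observed p, followed by Pr(k >= 16) (one-sided test), Pr(k <= 16) (one-sided test) and the two-sided test probability. The numeric values are not preserved.]

. bitesti (arguments not preserved)

[Stata output: N, Observed k, Expected k, Assumed p and Observed p, followed by Pr(k >= 15) (one-sided test), Pr(k <= 15) (one-sided test) and the two-sided test probability. The numeric values are not preserved.]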

Correct p-value

TITLE
    STB-49 sbe28. Meta-analysis of p values.

DESCRIPTION/AUTHOR(S)
    STB insert by Aurelio Tobias, Statistical Consultant, Madrid, Spain.
    Support:
    After installation, see help metap.

INSTALLATION FILES                    (click here to install)
    sbe28/metap.ado
    sbe28/metap.hlp

ANCILLARY FILES                       (click here to get)
    sbe28/fleiss.dta

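Using the STB ado file

. input pvar
[the p-values entered are not preserved; at least one was given in scientific notation]
. end

. metap pvar

Meta-analysis of p_values

Method |  chi2   p_value   studies
Fisher |  [values not preserved; the p_value is in scientific notation]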
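The "Fisher" row in the metap output refers to Fisher's method for combining p-values. As a reminder of what that method computes, here is the standard formula, with k denoting the number of p-values combined (k = 4 if the four binomial components from the preceding slides are the inputs, which is an assumption since the entered values are not preserved).

```latex
\[
X^2 \;=\; -2 \sum_{i=1}^{k} \ln p_i \;\sim\; \chi^2_{2k}
\quad \text{under } H_0,
\qquad \text{so } k = 4 \text{ gives } 8 \text{ degrees of freedom.}
\]
```

The chi-square reference distribution assumes independent p-values that are uniform under the null; with discrete exact p-values such as these it holds only approximately.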

Conditional logistic version

. clogit est cc, group(quint)
note: multiple positive outcomes within groups encountered.
note: 5 groups (25 obs) dropped due to all positive or all negative outcomes.

Conditional (fixed-effects) logistic regression     Number of obs = 290

[Stata output: LR chi2(1), then Coef., Std. Err., z, P>|z| and [95% Conf. Interval] for cc. The numeric values are not preserved.]

. clogit est cc, group(quint) or
note: multiple positive outcomes within groups encountered.
note: 5 groups (25 obs) dropped due to all positive or all negative outcomes.

Conditional (fixed-effects) logistic regression     Number of obs = 290

[Stata output: LR chi2(1), Prob > chi2, Log likelihood and Pseudo R2, then Odds Ratio, Std. Err., z, P>|z| and [95% Conf. Interval] for cc. The numeric values are not preserved.]

Reversing Case/Control and Exposure

. clogit cc est, group(quint)

Conditional (fixed-effects) logistic regression     Number of obs = 315

[Stata output: LR chi2(1), Prob > chi2, Log likelihood and Pseudo R2, then Coef., Std. Err., z, P>|z| and [95% Conf. Interval] for est. The numeric values are not preserved.]

. clogit cc est, group(quint) or

Conditional (fixed-effects) logistic regression     Number of obs = 315

[Stata output: LR chi2(1), Prob > chi2, Log likelihood and Pseudo R2, then Odds Ratio, Std. Err., z, P>|z| and [95% Conf. Interval] for est. The numeric values are not preserved.]

Assessment of potential confounder

. clogit est hyp cc, group(quint) or
note: multiple positive outcomes within groups encountered.
note: 5 groups (25 obs) dropped due to all positive or all negative outcomes.

[Iteration log: iterations 0 to 3; the log likelihood values are not preserved.]

Conditional (fixed-effects) logistic regression     Number of obs = 290

[Stata output: LR chi2(2), Prob > chi2, Log likelihood and Pseudo R2, then Odds Ratio, Std. Err., z, P>|z| and [95% Conf. Interval] for hyp and cc. The numeric values are not preserved.]

Assessment of age as a potential modifier (even though age was a part of the matching criteria)

. gen ac=age*cc
. clogit est hyp cc ac, group(quint) or
note: multiple positive outcomes within groups encountered.
note: 5 groups (25 obs) dropped due to all positive or all negative outcomes.

[Iteration log: iterations 0 to 3; the log likelihood values are not preserved.]

Conditional (fixed-effects) logistic regression     Number of obs = 290

[Stata output: LR chi2(3), Prob > chi2, Log likelihood and Pseudo R2, then Odds Ratio, Std. Err., z, P>|z| and [95% Conf. Interval] for hyp, cc and ac. The numeric values are not preserved.]

Notice that age*cc is included in the model even though age itself is not. This is a special case in which we CAN interpret a model with an interaction term even though one of the constituents of that interaction is not included in the model.
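One way to read such a model is to compute the case/control odds ratio at particular ages from the cc and age*cc coefficients. The sketch below assumes the LA data are in memory in long form with the variable names used on the slides; the ages 60 and 75 are hypothetical values chosen only for illustration.

```stata
* Assumes la.dta is loaded in long form (est, hyp, cc, age, quint present).
capture drop ac
generate ac = age*cc

* Refit the interaction model from the slide.
clogit est hyp cc ac, group(quint) or

* Wald test of the age-by-cc interaction term.
test ac

* Odds ratio for cc at two illustrative (hypothetical) ages.
lincom cc + 60*ac, or
lincom cc + 75*ac, or
```

If the test of ac gives little evidence of interaction, the simpler model without ac may be the more sensible one to report.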