1
Matching, Stratification, Regression
Heejung Bang, PhD UC-Davis
2
Rimm & Bortin (1978). Clinical trials as religion
3
Motivating episode
A surgeon came to me; his aim/hypothesis was clear.
The Excel dataset had small N & few variables.
I judged that multiple regression was the way to go.
The surgeon honestly said: “Previous authors already used multiple regression. To publish, I need to use PS. If not, I may give up.”
After that, I say: we need PS-match.com, and we live in a PS nation.
4
Let’s be honest
Perfect world: if we have a ~perfect (e.g., well-designed & conducted, long-term) RCT, we don’t need an obs study to compare A vs. B.
vs.
Even with an RCT, we still get the average causal effect, not the individual effect or the effect on me.
- Why should I care about others’ or average health?
- The Median isn't the Message… Gould
In real life, everyone takes A (or B) in different manners.
Alternatives in a more perfect world: time machine or avatar/clone + patience
5
Pros & Cons (as always!)
RCT | Observational study
Causality (glorious) | Association/Correlation
Expensive or very expensive | Expensive or cheap
Tons of rules & checklist (+/- audit) | Less strict, more freedom
Experimentation (no matter what they say); Coin determines my treatment | More naturalistic; I and/or Doctor determine my treatment
Protocol/Registration (less flexible) | Not yet required (good & bad)
Analysis: Simpler | Less simple
Blinding, noncompliance, dropout, Hawthorne | More freedom for patients/docs/analysts/publication
High stakes, esp., pharma & nutrition supplement | Small or high financial gain (e.g., Costco vitamins, Starbucks)
More loved by top journals/WHO. Practice can change immediately | Slow but possibly wider impact in daily life
6
Causal vs. Association
Association: 2 populations
Causal: 1 population
The only person you should ever compete with is yourself. You can't hope for a fairer match… Ruthman
causal: biological mechanism
association/correlation: data relationships devoid of mechanism
causal: me
7
Nearly causal (n=1.5)?
8
Classic vs. Modern Matchmakers
Classic: Match/Stratify/Regress. Less adjustment. If 2 groups are close. Tall-dark-handsome.
Modern: PS/IV/MSM/SNM. More adjustment. If 2 groups are different. Total score.
9
Why Match/Stratify/Regress
All about: Comparability* & Fairness (feasibility & price)
Power of Coin: the coin solves so many problems (at baseline) and makes the statistician’s life easier.
Causal method must be fancy and complicated?
in 30 games in 2016 vs. Network-meta
Confounding vs. Confounder*
PS may be a modern broker for these methods under 1 roof.
* I hate definitions….. Disraeli
personal: My admiration for RCT has been dampened…
10
(Unbiased) effect of Veneers & Diet?
11
Matching Epi 101, after Restriction
Easy to say, not easy to do; see match.com.
1:1, 1:4? Case-control (as well as cohort)
Key questions: Should I match…. at all? How/what to match. How many. How to analyze.
Price of matching age/sex/race: Default? Match b/c others do?
For Efficiency more than Validity
Design & Analysis should go hand in hand (if possible), e.g., M-H, conditional logistic, McNemar, paired-t
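For 1:1 matched pairs with a binary exposure, the "design & analysis hand in hand" point can be sketched with McNemar's test, which uses only the discordant pairs. This is a minimal illustration, not the speaker's code; the function names and the counts `b` and `c` are assumptions made up for the example.

```python
from math import comb

def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from discordant pair counts.

    b = pairs where only the case is exposed,
    c = pairs where only the control is exposed.
    Under the null, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

def matched_odds_ratio(b: int, c: int) -> float:
    """Conditional odds ratio estimate for 1:1 matched pairs (b / c)."""
    return float("inf") if c == 0 else b / c

# Hypothetical counts: 15 pairs with only the case exposed, 5 with only the control exposed
print(matched_odds_ratio(15, 5), mcnemar_exact(15, 5))
```

Concordant pairs never enter the calculation, which is one reason a matched design analyzed as if unmatched can give a different answer.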
12
Matching: Beyond textbook
Matching can be ignored in analysis? Only for cohort?
Over/Under/Useless matching. How about Accidental matching?
Perfect is the enemy of Good. Partial matching better?
When OR(confounder-Y) is weak to moderate, matching -> no benefit or even slight harm.
PS+matching ‘marginal’ effect
Easier in the Registry Era; more resources but more room for malpractice?
Big difference in matching for cohort vs. case-control (common epi mistake made by biostatisticians…)
13
Example: Carpal Tunnel Syndrome
The objective of “Sex Differences in Musculoskeletal Conditions Across the Lifespan” ….in diagnosis & treatment for both men and women and lead to improvements in women’s health. P50 AR063043 a hypothesis: The cross-sectional area of the median nerve in individuals of both sexes with and without CTS will differ, when adjusted for age, sex & BMI.
14
Sex | Case vs. Control* | N
F | Control | 50
M | Control | 38
F | Case | 60
M | Case | 22
What is known: sex, age & BMI are important.
Key objective: Sex-difference --- that’s why funded.
Issues in practice:
Research question important & robust?
PI and statistician change
What to do after 3 yrs of screening/enrollment?
Women’s Right vs. Balance in Ns (where are “Male cases”?!)
Sex=match, Age & BMI=regressor?
*‘When I use a word, it means just what I choose it to mean, neither more nor less.’ Humpty-Dumpty
15
Stratification Matching is a form of stratification
a matched set = a stratum, e.g., twins
We also learn from Stat 202: Sampling
Idea is so intuitive: if M vs. W are so different, analyze separately (…. then combine or not).
Cochran-Mantel-Haenszel
Useful for heterogeneity & effect modification [Confounding is bad; Modification is good? more paper]
Closely related to “Standardization”, another method toward valid comparison
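The "analyze separately, then combine" idea can be sketched with the Mantel-Haenszel pooled odds ratio. This is a minimal sketch under assumptions: the function name, the cell layout `(a, b, c, d)`, and the two strata are invented for illustration, and the accompanying CMH test statistic is not shown.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio over a list of 2x2 tables.

    Each table is (a, b, c, d):
      a = exposed cases,   b = exposed non-cases,
      c = unexposed cases, d = unexposed non-cases.
    """
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Hypothetical strata (e.g., men and women), made up for illustration
tables = [(10, 20, 5, 40), (30, 30, 10, 40)]
print(round(mantel_haenszel_or(tables), 3))
```

With a single stratum the estimate reduces to the ordinary odds ratio `a*d / (b*c)`, which is a quick sanity check on any implementation.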
16
Beware Yule-Simpson Paradox
Famous “Berkeley sex-bias case” Stratified is correct but Unstratified is misleading/useless? ORcrude ≠ ORstratified: Simpson or Jensen? I want to see all, including 2x2.
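A small numerical sketch of ORcrude ≠ ORstratified: the counts are hypothetical, invented only so that the exposure looks beneficial within each stratum and harmful once the strata are pooled. They are not the Berkeley data, and the helper names are assumptions.

```python
# Hypothetical counts, invented only to illustrate the reversal.
# Each stratum is (exposed events, exposed non-events,
#                  unexposed events, unexposed non-events).
strata = {
    "stratum 1": (90, 10, 800, 200),
    "stratum 2": (300, 700, 20, 80),
}

def odds_ratio(a, b, c, d):
    """Odds ratio of a single 2x2 table."""
    return (a * d) / (b * c)

def collapse(tables):
    """Add the 2x2 tables cell by cell, ignoring the stratifier."""
    return tuple(sum(cells) for cells in zip(*tables))

stratum_ors = {name: odds_ratio(*t) for name, t in strata.items()}
crude_or = odds_ratio(*collapse(strata.values()))

for name, value in stratum_ors.items():
    print(f"{name}: OR = {value:.2f}")
print(f"crude (unstratified): OR = {crude_or:.2f}")
```

Printing every table, including the collapsed 2x2, is in the spirit of "I want to see all": the reversal comes from the very unequal group sizes across strata.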
17
Almost Simpson or Paradox nothing Canto JAMA 2011
Yang (JAMA 2012) showed exactly opposite direction.
18
Headline: Blacks are 40% less? Schwartz (NEJM 1999)
19
Regression Calculus:Math ≈ Regression:Stat
[Cheater may love regression but hate calculus…]
OLS = King of econometrics
OR/Logistic = King of biostatistics
Multiple regression: β = slope = ΔY/ΔX, while fixing S1 & S2.
Can we adjust 20 Xs? Income vs. education, Variance inflation? OR & p attenuated?
Evolution & expansion: GLM/GLM, LMM/GLMM, GMM/GEE, quantile, nonlinear, splines, lowess, Lasso, Bayesian, Cox, …… stay tuned.
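What "slope while fixing S1" means can be sketched by fitting OLS with and without the extra covariate. This is an illustration under assumptions: the data are synthetic and noise-free (y = 2 + 3x + 5·s1 by construction), and `solve` and `ols` are hypothetical helper names written in plain Python rather than taken from any library.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def ols(columns, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    X = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(X[0])
    XtX = [[sum(row[j] * row[k] for row in X) for k in range(p)] for j in range(p)]
    Xty = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(p)]
    return solve(XtX, Xty)

# Synthetic data: x and s1 are correlated, and y depends on both
s1 = [0, 0, 1, 1, 2, 2]
x = [0, 1, 1, 2, 2, 3]
y = [2 + 3 * xi + 5 * si for xi, si in zip(x, s1)]

crude = ols([x], y)          # slope of y on x alone
adjusted = ols([x, s1], y)   # slope of y on x, holding s1 fixed
print(crude[1], adjusted[1])
```

The unadjusted slope absorbs part of the s1 effect because x and s1 move together; the adjusted slope recovers the coefficient used to build the data.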
20
Correlation vs. Regression
What are the differences? In my QE in 1996, I got 0. I still think the solution was not correct…. Mathematically close Correlation/Association ≠ Causation: tired? With randomization, fingers-toes, t-test or 2x2 may be sufficient. [“If you need statistics, your finding is not significant!”] When Genius Errs: R. A. Fisher & the lung cancer controversy (1991) Regression (=Gauss/Legendre/Galton) is older than Correlation (=Galton/Edgeworth/Pearson).
21
What to adjust, how to adjust
DAG may help.
age/sex/race – default again?
We adjust b/c they are in the dataset? Or for reviewers?
Obsessed to adjust? e.g., direct or indirect
Variable selection: Stat (e.g., by p, computer) vs. Epi (e.g., 10%, p<.20, theory) teach differently?
“Independent” risk factors, multi-collinearity
Rule #1: Do not adjust for factors that are affected by X. Dangerously easy?
Weaker than matching?
Important issue in PS as well
22
Regression: Beyond textbook
Adjust or not, e.g., Lord’s paradox, horse-racing bias ANOVA is a scientific method, ANCOVA is not…Kempthorne vs. Is ANOVA obsolete? … Gelman Always be wary, when you Co-vary or Time-dependent. Economists may view a b-o-r-i-n-g topic. I want to see both! Shall we shrink? 1000 proc logistic in your SAS program? When to stop adjustment (until meeting theory or p<0.05?) Pre-specify as much as possible and/or honest reporting (wishing minimal penalty…)
23
All you need to know: logistic?
When can odds ratios mislead?
Odds ratios should be used only in case control studies & logistic regression analyses… Deeks, BMJ 1998
When the outcome is rare (<10%), OR ≈ RR.
Can convert to other measures (risks, RR, RD).
Guidance on their interpretation is of more use than outright rejection… Davies et al.
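The conversion from an odds ratio to other measures can be sketched as follows, assuming the baseline risk p0 in the unexposed group is known. The function names are hypothetical, and the identity RR = OR / (1 − p0 + p0·OR) is exact only for a single unadjusted 2x2 comparison; applying it to adjusted odds ratios is an approximation.

```python
def or_to_rr(odds_ratio: float, p0: float) -> float:
    """Risk ratio implied by an odds ratio and the unexposed risk p0."""
    return odds_ratio / (1 - p0 + p0 * odds_ratio)

def or_to_rd(odds_ratio: float, p0: float) -> float:
    """Risk difference implied by an odds ratio and the unexposed risk p0."""
    return or_to_rr(odds_ratio, p0) * p0 - p0

# Same OR, different baseline risks (values chosen for illustration)
for p0 in (0.01, 0.10, 0.50):
    print(p0, round(or_to_rr(2.0, p0), 3), round(or_to_rd(2.0, p0), 3))
```

With a rare outcome the risk ratio stays close to the odds ratio; as the baseline risk grows, the odds ratio increasingly overstates the risk ratio.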
24
Dark side of your fav stat? Boos & Stefanski (2013)
25
(Clustering) By doc, hospital, community, or CRT --- We are the world
Longitudinal/Repeated measures/Spatial Mixed model, GEE, conditional logistic, etc. Within vs. Between effect: Within is more right. Population vs. Subject: Jensen’s inequality Clusters ≈ Strata? Wgt/Strata/Cluster important in complex survey design*. *Note: a lot of what we know/use are true for simplest design.
26
Propensity Score (PS) What is PS? Why do we want or need?
Logistic regression for PS: most popular
PS use via match/strata/reg/wgt
Tons of guides available
Match.com: If 2 conditions, use traditional matching. If 10 conditions, use PS matching. If 100 conditions, membership refund?
Do you even know what "propensity" means?... C. Rock (Everybody Hates Chris)
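A minimal sketch of "logistic regression for PS, then match": one covariate, a Newton-Raphson fit, and greedy 1:1 nearest-neighbor matching without replacement. Everything here is an assumption for illustration (the toy data, the function names, the absence of a caliper and of balance checks); real analyses typically use many covariates and a statistics package.

```python
from math import exp

def fit_logistic(x, a, iters=25):
    """Fit logit P(A=1 | x) = b0 + b1 * x by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        p = [1 / (1 + exp(-(b0 + b1 * xi))) for xi in x]
        g0 = sum(ai - pi for ai, pi in zip(a, p))               # score, intercept
        g1 = sum((ai - pi) * xi for ai, pi, xi in zip(a, p, x))  # score, slope
        w = [pi * (1 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def greedy_match(ps, a):
    """Pair each treated unit with the closest unused control on the PS."""
    treated = [i for i, ai in enumerate(a) if ai == 1]
    controls = [i for i, ai in enumerate(a) if ai == 0]
    pairs = []
    for t in treated:
        if not controls:
            break
        c = min(controls, key=lambda j: abs(ps[j] - ps[t]))
        pairs.append((t, c))
        controls.remove(c)
    return pairs

# Toy data: treatment is more likely at higher x, with overlap between groups
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
a = [0, 0, 1, 0, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(x, a)
ps = [1 / (1 + exp(-(b0 + b1 * xi))) for xi in x]
pairs = greedy_match(ps, a)
print(pairs)
```

The same estimated scores could instead be used for stratification, as a regression covariate, or as weights; matching is only one of the four uses listed on the slide.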
27
Seeing is Believing! Weintraub (NEJM 2012)
28
Regression & PS: Beyond textbook
High-dimensional, e.g., claims data, adjusting 500 Xs
- Among 500 potential confounders, mediator, IV, collider?
- Over-adjustment? Associated with Y vs. imbalance in A vs. B?
> 2 groups, Continuous treatment
How to match 30 hospitals
Beyond logistic: Machine learning
Parsimony vs. model as large as elephant
PS for case-control/case-cohort: double beware!
Double robustness (DR)
29
DR: more harm or good? Demystifying, e.g., 1 ∕ 0 = +/-∞
Easily wrong twice or more
Statistical programming/implementation:
- Cross-sectional: easier (or timeless?)
- Longitudinal: still difficult
Do we have all ‘complete’ data in ‘right’ structure? e.g., monotone missingness [Q: Monotonicity is English?]
Methods & applications: still evolving/expanding (e.g., DR of ACE on censored costs, 2016).
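For the cross-sectional case, one common form of a doubly robust (augmented inverse-probability-weighted) estimator of the average causal effect can be written out as a sketch. The notation is an assumption, not taken from the slides: A_i is the binary treatment, Y_i the outcome, X_i the covariates, \hat e the estimated propensity score, and \hat m_1, \hat m_0 the fitted outcome regressions under treatment and control.

```latex
\hat{\Delta}_{DR}
  = \frac{1}{n}\sum_{i=1}^{n}
    \left[
      \frac{A_i Y_i}{\hat e(X_i)}
      - \frac{A_i - \hat e(X_i)}{\hat e(X_i)}\,\hat m_1(X_i)
    \right]
  - \frac{1}{n}\sum_{i=1}^{n}
    \left[
      \frac{(1 - A_i)\,Y_i}{1 - \hat e(X_i)}
      + \frac{A_i - \hat e(X_i)}{1 - \hat e(X_i)}\,\hat m_0(X_i)
    \right]
```

The estimator is consistent if either the propensity model or the outcome model is correctly specified. The denominators also show the "1 ∕ 0" concern: when \hat e(X_i) is close to 0 or 1, a single observation can receive an enormous weight.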
30
The War of Biases – Rule of thumb?
Pros vs. Cons
Bias1 > Bias2 << Bias3?
Direct is Direct? Direct=Total-Indirect?
Landmark vs. time-dependent vs. : for immortal time bias
Do not adjust: factors affected by X; mediator; collider; highly collinear
vs.
Adjust: confounders; when you want "net" effect
Oops, I only have one observation per person.
We pray all biases cancel each other?
Equivalence of the mediation, confounding & suppression effect… MacKinnon 2000
31
Common Complaints/Feedback
Causal methods less transparent. Traditional vs. newer: provide similar results. Why we need newer/advanced methods? Even experts don’t agree (including Rubin’s advice on PS). Some methods: too difficult to understand or implement IV & DAG: our students don’t know but speaker assumes they know… Causal methods are not in standard curricula; hardcore stat dismiss. Casual inference, Causal interference, Inverse probability (Bayes, 1837) What Effect Is Really Being Measured? Mediate & Moderate are synonyms in English dictionary!
32
Lies, damn lies or Cookbook nonsense?
Different methods Different results p-value/R2/CC/AUC/AIC, the many options of adjustment of potential confounders, multiple testing/modeling: e.g., p↓ & R2↓; R2↓& AUC↑; AIC vs BIC; pseudo R2↓↓ Bonferroni vs. FDR AUC=next king p-value? higher & higher P-value of p-values? 46656 Varieties of Bayesians ... Good(1971) How to guarantee significance ... Mantel(1976) Are statistical contributions to medicine undervalued? … Breslow(2003) Is statistical method of any value in medical research?... Greenwood(1924) Publication as prostitution… Frey(2003)
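The "Bonferroni vs. FDR" contrast can be sketched on one set of p-values. The function names and the p-values are made up for illustration; the FDR procedure shown is the Benjamini-Hochberg step-up rule, which is an assumption about which FDR method is meant.

```python
def bonferroni(pvals, alpha=0.05):
    """Reject p-values at or below alpha / m (controls family-wise error)."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]

def benjamini_hochberg(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up rule (controls the false discovery rate)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0  # largest rank whose p-value passes its own threshold
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * alpha / m:
            k = rank
    reject = [False] * m
    for i in order[:k]:
        reject[i] = True
    return reject

# Hypothetical p-values from ten tests
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(sum(bonferroni(pvals)), sum(benjamini_hochberg(pvals)))
```

Bonferroni holds every test to the same strict threshold, while Benjamini-Hochberg loosens the threshold with rank, so it never rejects fewer hypotheses than Bonferroni at the same alpha.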
33
https://datascience.nih.gov/bd2k
Not even secondary; or already collected somewhere N/power, Sampling, Causal: no more? A rose by any other name: Data science, etc…. Cook Data analysis vs. Statistics.… Tukey Data cleaner (for Dirta) = new profession Beautiful Visual, Google over Newton? Examples: EMR, Google flu, Coupons & best restaurants Do we know more about Pluto than Indian ocean? p-value is still (when no tiger, rabbit is king.) e.g., birth month & insect bite: adjusted p=0.001. The Unbearable Lightness of information/data?
34
No shortcut to good performance
mg/dL (CDC) vs. mg/L (AHA) Sorry! kg vs. pound Oops! there is no age. I will have to enter. hemoglobin vs. hematocrit Many paid $999 on Dec 31! Vload=20. Excel Depression & Holy coding error, Batman… Krugman FOOLED by typo or IT? Missing, mismeasured, runaway, fake, repeats, frozen An idiot with a computer is often more powerful than a statistician with a pencil…. Confuseus New method = click-click or
35
Yes, I am biased… …more fundamental progress is more likely to be made by very focused, relatively small scale, intensive investigations than collecting millions of bits of information on millions of people,…. collect such large data now, but it depends on the quality, which may be very high or not, and if it is not, what do you do about it?... D. R. Cox My Little EXCEL Cult of Bigness, Small is Beautiful … Kohr/Taleb Big will be Bigger… as we ain’t getting younger. Stat/epi are slow in BigData game/war… Causal Inference from big data …. Bareinboim-Pearl (2015) (btw, do you have BigData?)
36
Cereal & baby boy: Fooled by Randomness?
37
Same basic principles; small or large
Measure what is measurable; make measurable what is not so … Galilei No free lunch…………….or Garbage-I-G-O/BIBO Multiple testing, multiple modeling, IN-N-OUT We are Aimless Fishermen? Do you remember your data? Data sharing/transparency/look for other evidence Can we escape from theory & math? Can you publish “~50% of newborns are boys”? α=50% (not 5%) Era: already? We reject important confirmations. No-one is incentivised to be right. Scientists are incentivised to be productive & innovative ... Horton
38
BigData vs. n-of-1 Antonym? Dear Both, You Clean. Both are
Personalized, Precision medicine? How about n-of-3 or magic number 30? n=1 is not acceptable, and n≥3 is preferable….. Editor of Science Signaling What I tell you 3 times is true… Lewis Carroll Rule of Three… Darwin, Pearson/Biometrika… n=30 was from correlation & t-test (Student 1908a,b)
39
Secret, Hostage or Monopoly Science? Data are cheap-N-fair? Rich get richer?
40
Kilimanjaro or Pluto? Are we p<0.05? Publish or perish? log(100)=2