# Biostatistics Case Studies 2006 Peter D. Christenson Biostatistician Session 5: Reporting Subgroup Results.


Biostatistics Case Studies 2006 Peter D. Christenson Biostatistician http://gcrc.LAbiomed.org/Biostat Session 5: Reporting Subgroup Results

Subgroup Issues

- Measuring subgroup effect
  - Subgroups separately
  - Interaction
- Selection of subgroups
  - A priori
  - Post-hoc
  - Based on data
- Significance/strength of conclusions
  - Transparency of analysis
  - Formal statistical comparisons; p-values, CIs.

Case Study pp. 1667-69 Editorial:

Case Study: Abstract

Main Subgroup Result

Separate Subgroup Comparisons

% with events:

| Subgroup | N | Combination | Aspirin Only | Δ | p | RR | CI |
|---|---|---|---|---|---|---|---|
| Symptomatic | 12153 | 6.9 | 7.9 | 1.0 | 0.05 | 0.88 | 0.77 to 1.0 |
| Asymptomatic | 3284 | 6.6 | 5.5 | -1.0 | 0.20 | 1.2 | 0.91 to 1.59 |
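The relative risks and intervals on this slide can be approximately reproduced with the usual log-scale (Wald) interval for a ratio of two proportions. The sketch below is illustrative only: the function name `relative_risk_ci` is hypothetical, and it assumes each subgroup was split evenly between the two arms, which the slide does not state.

```python
import math

def relative_risk_ci(p_treat, p_ctrl, n_treat, n_ctrl, z=1.96):
    """Relative risk with a Wald interval computed on the log scale."""
    rr = p_treat / p_ctrl
    # Standard error of log(RR) for two independent binomial proportions
    se_log = math.sqrt((1 - p_treat) / (n_treat * p_treat)
                       + (1 - p_ctrl) / (n_ctrl * p_ctrl))
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper

# Assumption: equal allocation to the two arms within each subgroup
symptomatic = relative_risk_ci(0.069, 0.079, 12153 / 2, 12153 / 2)
asymptomatic = relative_risk_ci(0.066, 0.055, 3284 / 2, 3284 / 2)

print("Symptomatic  RR, CI:", [round(x, 2) for x in symptomatic])
print("Asymptomatic RR, CI:", [round(x, 2) for x in asymptomatic])
```

Under these assumptions the intervals land close to the slide's values; small differences are expected because the actual per-arm counts are not given here.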

Separate Subgroup Conclusions

- Symptomatic group: Combination better.
  - Large N. Is magnitude of effect relevant? See CIs.
- Asymptomatic group: Inconclusive (0.91 ≤ RR ≤ 1.59).
  - Same magnitude, apparent inverse from symptomatics.
  - Much smaller N; less power.
- Have not demonstrated subgroup difference. Use interaction to do so. Need to, based on CIs?

Subgroup Interaction

- Symptomatic: N = 12153, Δ = 1.0, p = 0.05, RR = 0.88 (0.77 to 1.0)
- vs. Asymptomatic: N = 3284, Δ = -1.0, p = 0.20, RR = 1.2 (0.91 to 1.59)
- Interaction = ΔΔ = 2.0%, with 95% CI ~ 0.65% to 3.35%

Why is interaction relevant? See next slide.
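To make the ΔΔ idea concrete, here is a minimal sketch of a difference-of-differences estimate with a Wald interval. The helper names are hypothetical and the even split between arms is an assumption. Because the slide's interval is marked approximate and its method is not given, this sketch is not expected to reproduce 0.65% to 3.35%; it also uses the unrounded asymptomatic difference (5.5 - 6.6 = -1.1), so ΔΔ comes out at 2.1% rather than 2.0%.

```python
import math

def risk_difference(p_ctrl, p_treat, n_ctrl, n_treat):
    """Risk difference (control minus treatment) and its variance."""
    delta = p_ctrl - p_treat
    var = p_ctrl * (1 - p_ctrl) / n_ctrl + p_treat * (1 - p_treat) / n_treat
    return delta, var

def interaction_ci(sub_a, sub_b, z=1.96):
    """Difference between two subgroup risk differences, with a Wald CI."""
    (delta_a, var_a), (delta_b, var_b) = sub_a, sub_b
    dd = delta_a - delta_b
    # Subgroups contain different patients, so the variances add
    se = math.sqrt(var_a + var_b)
    return dd, dd - z * se, dd + z * se

# Assumption: equal allocation to the two arms within each subgroup
symptomatic = risk_difference(0.079, 0.069, 12153 / 2, 12153 / 2)
asymptomatic = risk_difference(0.055, 0.066, 3284 / 2, 3284 / 2)

dd, lower, upper = interaction_ci(symptomatic, asymptomatic)
print(f"Interaction = {100 * dd:.1f}% ({100 * lower:.2f}% to {100 * upper:.2f}%)")
```

The point of the sketch is the structure: the interaction compares the two subgroup effects directly, and its interval excluding zero is what supports a claim that the subgroups differ.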

Subgroup Conclusions with Interaction

- Symptomatic group: Combination better.
  - Large N. Is magnitude of effect relevant? See CIs.
- Asymptomatic group: Inconclusive (0.91 ≤ RR ≤ 1.59).
  - Same magnitude, apparent inverse from symptomatics.
  - Much smaller N; less power.
- Difference between subgroups: Significant according to interaction.
  - Inverse “non-effect” nevertheless incorporated.

Change Data to Give Non-Significant Interaction

Suppose, % with events:

| Subgroup | N | Combination | Aspirin Only | Δ | p | RR | CI |
|---|---|---|---|---|---|---|---|
| Symptomatic | 12153 | 6.9 | 7.9 | 1.0 | 0.05 | 0.88 | 0.77 to 1.0 |
| Asymptomatic | 3284 | 6.6 | 6.4 | -0.2 | 0.80 | 1.03 | 0.40 to 1.4 |

→ P for interaction ~ 0.50. Change conclusions?

Changed Data Subgroup Conclusions

- Symptomatic group: Combination better.
  - Large N. Is magnitude of effect relevant? See CIs.
- Asymptomatic group: Inconclusive (0.40 ≤ RR ≤ 1.40).
  - Apparently negligible, but not proven.
  - Much smaller N; less power.
- Difference between subgroups: Not demonstrated.
  - Use CI for ΔΔ to quantify magnitude of difference.

Change Data Again: New Changes

Suppose, % with events:

| Subgroup | N | Combination | Aspirin Only | Δ | p | RR | CI |
|---|---|---|---|---|---|---|---|
| Symptomatic | 12153 | 6.9 | 7.9 | 1.0 | 0.05 | 0.88 | 0.77 to 1.0 |
| Asymptomatic | 10000 (was 3284) | 6.6 | 6.4 | -0.2 | 0.80 | 1.03 | 0.96 to 1.1 (was 0.40 to 1.4) |

→ P for interaction will be small.

Twice-Changed Data Subgroup Conclusions

- Symptomatic group: Combination better.
  - Large N. Is magnitude of effect relevant? See CIs.
- Asymptomatic group: Negligible (0.96 ≤ RR ≤ 1.1).
  - Negligible, proven.
  - Larger N → smaller CI; power not relevant.
- Difference between subgroups: Significantly demonstrated with interaction.
  - Use CI for ΔΔ to quantify magnitude of difference.

Many Subgroup Analyses 12 Subgroups + Overall

Formal Multiple Comparison Adjustment

- Number of comparisons: $k$.
- Individual comparison false positive error rate = $\alpha$.
- Experiment-wise error rate = $\alpha^*$.
- Bonferroni adjustment:
  - Assume the $k$ comparisons are independent.
  - True negative rate = specificity = $1 - \alpha$.
  - Set $\alpha^* = 1 - (1 - \alpha)^k$ → solve for $\alpha = 1 - (1 - \alpha^*)^{1/k} \approx \alpha^*/k$.
  - So, typically p < 0.05/(# tests) = 0.05/13 = 0.004 here.
  - Conservative if comparisons are correlated; can improve if correlation is known.
- No adjustment: $\text{Prob}[\geq 1 \text{ false pos}] = 1 - 0.95^k = 0.49$ if $k = 13$. See next slide.
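The arithmetic on this slide can be checked with a few lines of code. This is a minimal sketch with hypothetical function names; it assumes independent comparisons, as the slide does. The exact solution for α is commonly called the Šidák correction, and α*/k is its Bonferroni approximation.

```python
def familywise_error(alpha, k):
    """Probability of at least one false positive in k independent tests."""
    return 1 - (1 - alpha) ** k

def exact_per_test_alpha(alpha_star, k):
    """Per-test alpha solving alpha* = 1 - (1 - alpha)^k (Sidak form)."""
    return 1 - (1 - alpha_star) ** (1 / k)

def bonferroni_alpha(alpha_star, k):
    """Bonferroni approximation to the per-test alpha."""
    return alpha_star / k

k = 13  # 12 subgroups + overall
print(f"No adjustment: P[>=1 false pos] = {familywise_error(0.05, k):.2f}")
print(f"Exact per-test alpha:      {exact_per_test_alpha(0.05, k):.4f}")
print(f"Bonferroni per-test alpha: {bonferroni_alpha(0.05, k):.4f}")
```

The two per-test thresholds agree to about three decimal places for k = 13, which is why the simpler α*/k rule is typically used.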

Likelihood of False Positive Conclusions

Subgroup Multiple Comparison Comments

- Many other specialized methods.
- Pre-specified comparisons count just as post-hoc, if post-hoc not based on results.
- Why limit “experiment-wise” count to subgroup comparisons?
  - No formal comparisons in this paper (but what if a large difference was observed?):
    - Tables 1-3: 22 + 20 + 26 potential covariates.
  - P-values:
    - Table 4: 12 efficacy and safety comparisons.
    - Figure 2: 12 subgroups. At least one explicit test.

Subgroup Multiple Comparison Conclusions

- Obviously, subgroups usually need to be examined.
- To claim more than observations, need to adjust in a well-defined way.
- Typically, report as observational and:
  - Explain decisions and choices of subgroups.
  - Formal adjustment typically not necessary.
  - Avoid p-values. Emphasize CI range.
  - Separate planned from data mining results.
  - Number of comparisons should be explicit.

Recommendations for Reporting on Subgroups

See the editorial. Use it to justify the following approach to the journal:

- Do not make multiple comparison adjustment.
- Be transparent about all analyses.
- State where conclusions are based on interactions.
- Report the number of comparisons planned before looking at the data that are (1) included and (2) not included in the paper.
- Report which results were a consequence of looking at the data; no p-values.
- Report if alternate definitions for a subgroup were examined.
- For subgroups, give confidence intervals for effects that are compatible with the data, not p-values.

Recommendations: Example of a Start at Them Cohan(2005) Crit Care 23;10:2359-66.
