1 Clinical Investigation and Outcomes Research Statistical Issues in Designing Clinical Research Marcia A. Testa, MPH, PhD Department of Biostatistics.

Presentation transcript:

1 Clinical Investigation and Outcomes Research Statistical Issues in Designing Clinical Research Marcia A. Testa, MPH, PhD Department of Biostatistics Harvard School of Public Health

2 Objective of Presentation Introduce statistical issues that are critical for designing a clinical research study and developing a research protocol, with a special focus on Power and sample size –Readings: Textbook, Designing Clinical Research, Chapter 6, Estimating Sample Size and Power: Applications and Examples and Chapter 19, Writing and Funding a Research Proposal.

3 Research Proposal Carefully planning the analytical and statistical methods is critical to any clinical research study. An outline of the main elements of a research proposal is given in Table 19.1 of your textbook. Two very important components of the “Research Methods” section are “Measurements” and “Statistical Issues”.

4 Measurement and Statistical Components of the Research Proposal Measurements – you first must define: –Main predictor/independent variables (intervention, if an experiment) –Potential confounding variables –Outcome/dependent variables Statistical Issues – you should outline: –Approach to statistical analyses –Hypothesis, sample size and power

5 Power and Sample Size Depends upon: –measurements and study hypotheses –statistical test used on primary outcome –study design –variability and precision of the dependent measure –alpha (type 1 error) –effect size –number of hypotheses that you want to test

6 Types of Errors Confidence

7 What is power analysis? Statistical power: –the probability of correctly identifying a trend or effect (being correct that there is a trend or effect) Statistical confidence: –the probability of not identifying a false trend or effect, i.e., a false alarm (being correct that there is no trend)

8 Why is power analysis useful in research planning? Clinical research is primarily concerned with detecting improvements or worsening due to interventions or risk factors. Power analysis answers the question: “How likely is my statistical test to detect important clinical effects given my research design?”

9 Elements of power analysis Variability (stochastic noise in the data) Sample size (accumulated information) –time horizon (e.g., survival analysis) –sampling frequency –replication Confidence level/statistical test [Slide labels: Beyond our control; Within our control]

10 Dealing with Variability Variability is often a barrier to detection, so minimizing variability is often the goal. Choose variables with a high signal-to-noise ratio. Caution: these variables may be less sensitive to change. Sample within a more homogeneous population. Caution: greater homogeneity often means we are limiting the inferences we can make. At the extreme we would have highly reliable results that are for the most part clinically irrelevant.

11 The Balancing of Cost and Power [Figure: power curve plotted from low cost to high cost, with regions labeled “low return on investment”, “optimal use of resources”, and “effective but inefficient use of resources”]

12 Limitations of power analysis Power analysis is only as good as the information you provide: –How appropriate is the statistical test? –How accurate are estimates of variability? Power analysis can’t tell you: –How much power is enough? –What’s a meaningful change?

13 How much power is enough? There is no universal standard. What is more important? Not missing a trend? → Power > Confidence. Reporting a false trend? → Confidence > Power. Usual range for confidence and power: 80-95%.

14 What’s a meaningful change? effect size Power = 95% for declines = -17% Example: You want to be able to detect the withdrawal (decline in participation) from a diet and exercise program under “usual care”.

15 What’s a meaningful change? effect size Power = 80% for decline = -13%

16 What’s a meaningful change? effect size Power = 60% for decline = -10%

17 Is a 17% annual withdrawal rate clinically meaningful? Example – Start with 100 patients and a 17% withdrawal after one year. [Table: Year vs. No. of individuals] After 5 years, more than 50% of your original population has withdrawn from the program.
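The compounding behind this slide can be sketched in a few lines of Python. This is a minimal sketch that assumes a constant 17% withdrawal rate applied each year to whoever is still enrolled; that is one reading of the slide, not necessarily how its table was produced, and the function name is illustrative.

```python
def remaining_by_year(start, annual_withdrawal_rate, years):
    """Expected number still enrolled at the end of each year (year 0 = start),
    assuming the same fraction of the remaining participants withdraws yearly."""
    retained = 1.0 - annual_withdrawal_rate
    return [start * retained ** year for year in range(years + 1)]


if __name__ == "__main__":
    # Assumed inputs taken from the slide: 100 patients, 17% per year, 5 years.
    for year, n in enumerate(remaining_by_year(100, 0.17, 5)):
        print(f"Year {year}: {n:.0f} remaining")
```

Under this assumption fewer than half of the original 100 patients remain by year 5, which agrees with the slide's conclusion.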

18 What is a meaningful change? Most people would concur that a withdrawal of 17% per year from a diet and exercise program is large enough to be considered clinically meaningful. However, how meaningful are smaller withdrawal rates (13%, 10%, 5%, 1%)? This cannot be answered using a formula. The answer will depend on the research objectives, the clinical objectives, and the research budget.

19 1. Choose Statistical Hypothesis Set up Null Hypotheses: Examples 1. Compare sample group mean to a known value μ_0 –Mean of group = Known population mean (H_0: μ = μ_0) vs (H_A: μ ≠ μ_0) 2. Compare two sample group means –Mean Group (1) = Mean Group (2) (H_0: μ_1 = μ_2) vs (H_A: μ_1 ≠ μ_2) Note – because you are testing “not equal” in the alternative hypothesis (≠) you have selected a “two-tailed test”.

20 2. Choose Statistical Test There are many statistical tests that are used in clinical research; however, for this presentation we will restrict ourselves to the following:

21 3. Choose Alpha Level and Effect Size Alpha = 0.05 – probability of rejecting the null when the null is true = 5% –You will conclude that there was a difference 5% of the time when there really was no difference. Effect size – you would like to detect a difference of X units or more in one group as compared to the other.

22 4. Need SD of the Dependent Variable Use historical data if available Use the sample data from a feasibility study (e.g. 15 subjects) If you have no data to serve as a reference, you have to make an educated guess. Here’s a trick if your data is mound shaped and approximately normal. –Choose a representative low and high from your clinical experience, take the difference and divide by 4. –= ((high) – (low))/4 = SD estimate
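The divide-by-4 trick rests on the fact that, for roughly normal data, about 95% of values fall within two standard deviations of the mean, so a "typical low to typical high" span covers about four SDs. A minimal sketch follows; the function name and the example glucose values are made up for illustration and do not come from the presentation.

```python
def sd_from_range(low, high):
    """Rough SD guess for mound-shaped data: a typical low-to-high span
    covers about 4 standard deviations (mean +/- 2 SD)."""
    if high <= low:
        raise ValueError("high must be greater than low")
    return (high - low) / 4.0


if __name__ == "__main__":
    # Hypothetical clinical guess: typical values run from 80 to 220 mg/dL.
    print(sd_from_range(80, 220))
```

This is only a planning heuristic; an SD from historical or feasibility data should take precedence when available.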

23 5. Calculate a Standardized Effect Size Effect size/standard deviation = standardized effect size Choose the β error –Remember Power = 1 - β, so a type 2 error of 0.20 yields a power of 0.80 –β is the probability of failing to reject the null hypothesis when the null hypothesis is false (concluding no difference when there really is a difference); power is the probability of correctly rejecting it.

24 Power and Sample Size Example Continuous Glucose Monitoring Diabetes Study

25 CGM Study Two-group Comparison How many subjects do we need to be able to detect a difference in CGM mean daily glucose between patients on Lantus and Apidra insulin versus Premix analogue insulin? –Before you can answer this question, you must gather some more information.

26 Break down the problem CGM glucose at Week 12 = dependent variable of interest Want to compare two groups – each group has different patients Simple independent t-test Need SD of daily glucose Need to specify how large an effect you want to detect

27 Data from feasibility study Week 12 Data

28 CGM Study Two-group Comparison Compare Lantus & Apidra to Premix at 12 weeks Feasibility data available on 15 patients Independent t test will be used Alpha = 0.05, beta = 0.20, 2-tailed test Power = 0.80 –Null: Mean L & A = Mean Premix (H_0: μ_1 = μ_2) vs (H_A: μ_1 ≠ μ_2)

29 CGM Study Two-group Comparison SD from 15 patient feasibility study = 33

30 Estimating Sample Size of CGM Study Alpha = 0.05 for the 2-sided test Beta = 0.20, hence, power = 0.80 Clinically meaningful effect = 10 mg/dL difference (based upon clinical judgement) SD CGM glucose = 33 (from feasibility study) Standardized effect = 10/33 = 0.30 Check Appendix 6A in the textbook: Table 6A says you need 176 subjects per treatment group for a total of 352 subjects.
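As a cross-check on the table lookup, the per-group sample size can be approximated with the usual normal-approximation formula n = 2(z_{1-α/2} + z_{1-β})² / d², where d is the standardized effect. This is a sketch under that assumption, not the method used to build the textbook table; the function name is illustrative.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate subjects per group for a two-sided, two-sample comparison
    of means, using the normal approximation to the t-test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_beta = z.inv_cdf(power)           # quantile matching power = 1 - beta
    return ceil(2 * (z_alpha + z_beta) ** 2 / d ** 2)


if __name__ == "__main__":
    # Standardized effect from the slide: 10 mg/dL / 33 mg/dL, rounded to 0.30.
    print(n_per_group(0.30))
```

The approximation lands within a subject or two of the tabulated 176 per group; small gaps of this kind are typically due to the t-distribution correction that the normal approximation ignores.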

31 This is a directory of where you can find sample size and power programs

32 Useful Power Calculator Website

33 Online Power/Sample Size Power = 0.8, detect ES = 0.3 (10 mg/dL) N = 175 per group Power = 0.9, detect ES = 0.35 (11.6 mg/dL) N = 175 per group
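The same relationship can be turned around to ask what power a fixed sample size buys for a given standardized effect. The sketch below uses the normal approximation to the two-sample t-test, which is an assumption here; the online calculator on the slide may use an exact noncentral t computation, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def power_two_group(n_per_group, d, alpha=0.05):
    """Approximate power of a two-sided, two-sample comparison of means
    with n_per_group subjects in each arm and standardized effect d."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = d * sqrt(n_per_group / 2)  # expected test statistic under H_A
    # Probability of landing in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


if __name__ == "__main__":
    for d in (0.30, 0.35):
        print(f"n = 175 per group, ES = {d}: power = {power_two_group(175, d):.2f}")
```

With 175 subjects per group this gives power close to 0.8 for ES = 0.30 and close to 0.9 for ES = 0.35, in line with the two scenarios on the slide.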

34 Online Power/Sample Size Power = 0.8, detect ES = 0.5 (16.5 mg/dL) Sample size = 64/group Power = 0.8, detect ES = 1.57 (52 mg/dL) Sample size = N1 = 7, N2 = 8

35 CGM Study Paired Comparison Useful for longitudinal assessments CGM Study – You want to detect a decrease between Week 12 and Week 24 of 10 mg/dL You only have one group of patients, but they are measured on two separate occasions (Week 12 and Week 24).

36 15 patient feasibility study What is the mean glucose, parameter for the subjects at Week 12 versus Week 24? For simplicity, we are going to use the single value summary mean glucose levels at Wk 12 and Wk 24. Wk 0 Wk 12 Wk 24

37 Power and Sample Size for Paired t-test Power = 0.8, detect ES = 0.30 Need 92 subjects or “pairs” (Wk 12 and Wk 24) data. Remember with two independent groups we needed 175 subjects per group for a total of 350 subjects. When patients serve as their own control, you need “fewer” subjects to detect an equivalent effect size (ES) with the same power.

38 HRV Study Correlation and Multiple Regression Single-Group Study –Session 1 – Signal 1 → HRV –Session 1 – Signal 2 → BP –Demographic variables = Age, Gender –Clinical characteristics = Disease Status Suppose you want to look at associations between HRV, BP, demographic and clinical characteristics → use the bivariate correlation coefficient for 2 variables, or the multiple regression R² for multiple predictors.

39 Power and Sample Size for Correlations (H_0: r = 0) Power = , r = 0.3, ES = R² = 0.09, Sample size = 85 Power = 0.97, r = 0.4, ES = R² = 0.16, Sample size = 85 Only 1 “regressor” or predictor
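Power for a test of a single correlation can be approximated with the Fisher z-transformation, under which atanh(r) is roughly normal with standard error 1/sqrt(n - 3). This is a sketch under that assumption; the slide does not say which method its calculator uses, and the function name is illustrative.

```python
from math import atanh, sqrt
from statistics import NormalDist


def power_correlation(n, r, alpha=0.05):
    """Approximate power of a two-sided test of H_0: r = 0 with n subjects,
    using the Fisher z-transformation of the correlation coefficient."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = atanh(r) * sqrt(n - 3)  # expected test statistic under H_A
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


if __name__ == "__main__":
    for r in (0.3, 0.4):
        print(f"n = 85, r = {r}: power = {power_correlation(85, r):.2f}")
```

For n = 85 and r = 0.4 the approximation gives a value near the 0.97 reported on the slide.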

40 Power and Sample Size for Correlations (H_0: r = 0) Power = 0.80, r = 0.3, ES = R² = 0.09, Sample size = 139, if number of predictor variables = 5 Power = 0.80, r = 0.3, ES = R² = 0.09, Sample size = 177, if number of predictor variables = 10

41 Power and Sample Size for Test of Two Proportions You want to detect a difference between two proportions. Example: How many patients do you need in each group to detect a difference in the proportion of patients who adhere to diet and exercise at the end of 5 years? Old Program = 0.5 Adhere New Program = 0.7 Adhere Alpha = 0.05, Power = 0.8. You will need 103 individuals in each group.
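The slide does not state which formula produced 103. One standard choice, assumed here, is the normal-approximation formula for two independent proportions with a continuity correction, which reproduces that figure; the function name and the `continuity` flag are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_two_proportions(p1, p2, alpha=0.05, power=0.80, continuity=True):
    """Approximate subjects per group for a two-sided test comparing two
    independent proportions, optionally with a continuity correction."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    p_bar = (p1 + p2) / 2
    diff = abs(p1 - p2)
    n = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
         + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2 / diff ** 2
    if continuity:
        # Continuity correction inflates n to better match the exact test.
        n = n / 4 * (1 + sqrt(1 + 4 / (n * diff))) ** 2
    return ceil(n)


if __name__ == "__main__":
    print(n_two_proportions(0.5, 0.7))                    # with correction
    print(n_two_proportions(0.5, 0.7, continuity=False))  # without correction
```

Without the correction the same inputs call for roughly ten fewer subjects per group, so it is worth stating in a protocol which variant was used.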

42 Final Points Design your study such that you will have a sufficient number of subjects to be able to detect the effects that are clinically meaningful (high power). If you have a limited budget, cannot afford to increase your sample size to the necessary level, and cannot feasibly lower the variability, you should consider alternative designs and hypotheses rather than proceeding with a study design with low power.