Sample Size Robert F. Woolson, Ph.D. Department of Biostatistics, Bioinformatics & Epidemiology Joint Curriculum.

NIH Study Section l Significance l Approach l Innovation l Environment l Investigators

Approach l Feasibility l Study Design: Controls, Interventions l Study Size: Sample Size, Power l Data Analysis

Sample Size l # of Animals l # of Measurement Sites/Animal l # of Replications

Sample Size l What #s are Proposed? l Adequacy of #s? l Compelling Rationale for Adequacy? l Do We need More? l Can We Answer Questions With Fewer?

Sample Size l Simple Question to Ask l Answer May Involve: Assumptions; Pilot Data; Simplification of Overall Aims to a Single Question

Simplification l What Is The Question? l What Is The Primary Outcome Variable? l What Is The Principal Hypothesis?

Pilot Data l Relationship To Question. l Relationship To Primary Variable. l Relationship To Hypothesis.

Sample Size/Power Freeware on Web: l http://www.stat.uiowa.edu/~rlenth/Power/ l http://hedwig.mgh.harvard.edu/sample_size/size.html l http://www.bio.ri.ccf.org/power.html l http://www.dartmouth.edu/~chance/

Sample Size l Purchase Software: http://www.powerandprecision.com/ l Nquery: www.statsolusa.com

Animal Studies l Differences usually large l Variability usually small l Small sample sizes l Many groups l Repeated measures

Sample Size (# Animals Required) l Excerpts from the MUSC Vertebrate Animal Review Application Form: “A power analysis or other statistical justification is required where appropriate. Where the number of animals required is dictated by other than statistical considerations… justify the number… on this basis.”

Sample Size: Ethical Issues in Animal Studies l Ethical Issues: a study that is too large implies some animals are needlessly sacrificed; a study that is too small implies potential for misleading conclusions and unnecessary experimentation l Mann MD, Crouse DA, Prentice ED. Appropriate animal numbers in biomedical research in light of animal welfare considerations. Laboratory Animal Science, 1991, 41:

Ethical Issues Cont. l Human studies - the same rationale holds for studies that are too large or too small.

Sample Size: Specifying the Hypothesis l Specifying the hypothesis: difference from control? differences among groups over time? differences among groups at a particular point in time? l A “non-hypothesis”: Animals in Group A will do better than animals in Group B

Sample Size: Specifying the Hypothesis l Ho: Mean blood pressure on drug A = mean blood pressure on drug B measured six hours after start of treatment. l Ha: Mean blood pressure on drug A < mean blood pressure on drug B measured six hours after start of treatment.

Example (SHR) l Animal blood pressures measured at baseline l Animals randomly assigned to placebo or minoxidil l Animals measured 6 hours post treatment l Changes from baseline calculated for each animal

Example (Continued) l Placebo changes thought to be centered at 0 l Expect minoxidil to lower blood pressure, we think by 10 mm Hg l Blood pressure changes have a standard deviation of 5 mm Hg

Example (Continued) l How many animals/group needed to have 90 % power to detect the 10 mm Hg mean difference? l How would this sample size change if the standard deviation is 10 mm Hg rather than 5 mm Hg?

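Example (Continued) l Suppose we change the endpoint to a binary one: did the animal achieve a reduction in blood pressure of 10 mm Hg or more? l Minoxidil changes are centered at a 10 mm Hg reduction, so 50% of those on minoxidil would be expected to have a reduction of 10 or more. l A 10 mm Hg reduction is two standard deviations (2 x 5 mm Hg) from the placebo mean of 0, so about 2.5% of those on placebo would have a reduction of 10 or more.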

Example (Continued) l How many animals/group required to have 90% power to detect the 50% vs. 2.5%? l Why the difference in sample sizes for the same experiment? Comment on: Assumptions; Endpoint; Specific hypothesis.

Sample Size: Distribution of Response l Nominal/binary (Binomial): dead, alive l Ordinal (Non-parametric): inflammation (mild, moderate, severe) l Continuous (Normal*): blood pressure l * may require transformation

Sample Size: Distribution of Response l Binomial: N is a function of the probability of response in control animals and the probability of response in treated animals l Normal: N is a function of the difference in means and the standard deviation

Sample Size: One Sample or 2-sample Test l One sample: change from baseline in one group; comparison to standard (historical controls) l Two sample: two independent study groups

Sample Size: One or Two sided test l One sided test: Ha: a > 0 or Ha: a < 0 l Two sided test: Ha: a ≠ 0

Sample Size: Choosing α l α = probability of Type I error l probability of rejecting Ho when Ho is true l significance level usually 0.01 or 0.05 l “calling an innocent person guilty” l “concluding two groups are different when they are not”

Sample Size: Choosing α l Multiple testing can lead to inflated α (Type I) errors. l For pre-specified hypotheses, may not need to adjust. l If all pairwise comparisons are of interest, adjust α (α/#tests)
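The α/#tests adjustment can be sketched in a few lines. This is only an illustration: the helper name `bonferroni_alpha` and the four-group example are assumptions, not part of the slides.

```python
from math import comb


def bonferroni_alpha(overall_alpha: float, n_groups: int) -> float:
    """Per-comparison alpha when every pair of groups is compared."""
    n_tests = comb(n_groups, 2)  # number of distinct pairwise comparisons
    return overall_alpha / n_tests


# Hypothetical study with 4 groups: 6 pairwise comparisons share alpha = 0.05.
print(comb(4, 2), round(bonferroni_alpha(0.05, 4), 4))
```

A smaller per-comparison α raises the required sample size, which is one reason to pre-specify a small number of hypotheses.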

Sample Size: Choosing β l β = probability of Type II error l probability of failing to reject Ho when a true difference exists l “Calling guilty person innocent” l “Missing a true difference” l Power = 1 - β l Large clinical trials use 0.9 or 0.95; animal studies usually use 0.8 (80% power).

Sample Size: Power l Concluding groups do not differ when power is low is risky. True difference may have been missed. l 80% power implies a 20% chance of missing a true difference. l 40% power implies a 60% chance of missing a true difference.

Sample Size: Calculation l Calculate N: specify difference to be detected; specify variability (continuous data) OR l Calculate detectable difference: specify N; specify variability (continuous) or control %

Sample Size: Putting it all together l Continuous (Normal) Distribution l Need all but one: α, β, σ², δ, N l Z_α = 1.96 (2 sided, 0.05); Z_β = 1.645 (always one-sided, 0.05, 95% power) l δ = difference between means l σ² = pooled variance l $2n = \dfrac{4\,(Z_\alpha + Z_\beta)^2\,\sigma^2}{\delta^2}$ (n per group, 2n in total)
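"Need all but one" can be made concrete with a short sketch. It assumes the usual normal-approximation relation for two equal groups, n = 2(Z_α + Z_β)²σ²/δ² per group, and solves it for n, for power, or for the detectable difference. The function names are illustrative, and the power function ignores the negligible contribution of the opposite tail in a two-sided test.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and SD 1


def z_alpha(alpha, two_sided=True):
    """Critical value Z_alpha for a one- or two-sided test."""
    return STD_NORMAL.inv_cdf(1 - alpha / 2 if two_sided else 1 - alpha)


def n_per_group(delta, sigma, alpha=0.05, power=0.80, two_sided=True):
    """Unrounded n per group; round up in practice."""
    z_b = STD_NORMAL.inv_cdf(power)
    return 2 * (z_alpha(alpha, two_sided) + z_b) ** 2 * sigma ** 2 / delta ** 2


def power_for_n(n, delta, sigma, alpha=0.05, two_sided=True):
    """Approximate power for a given n per group."""
    standard_error = sigma * sqrt(2 / n)  # SE of the difference in means
    return STD_NORMAL.cdf(abs(delta) / standard_error - z_alpha(alpha, two_sided))


def detectable_difference(n, sigma, alpha=0.05, power=0.80, two_sided=True):
    """Smallest difference in means detectable with the stated power."""
    z_b = STD_NORMAL.inv_cdf(power)
    return (z_alpha(alpha, two_sided) + z_b) * sigma * sqrt(2 / n)


# Hypothetical planning question: 10 animals per group, SD = 5.
print(round(detectable_difference(10, 5), 2))
print(round(power_for_n(10, delta=5, sigma=5), 2))
```

The same three functions also serve a negative study: plug in the hypothesized difference and the observed variability to report the power the study actually had.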

[Figure: Power versus difference (P1 - P2); α = 0.05, one-sided test, N per group = 100, P1 = 0.5]

[Figure: Power versus sample size; α = 0.05, one-sided test, P1 = 0.5, P2 = 0.3]
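The two power curves for binary outcomes can be approximated numerically. The sketch below assumes the standard normal approximation for comparing two independent proportions with equal group sizes; the function name and the grid of values are illustrative choices, not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power_two_proportions(p1, p2, n, alpha=0.05):
    """Approximate power of a one-sided two-sample test of proportions."""
    z_a = STD_NORMAL.inv_cdf(1 - alpha)
    p_bar = (p1 + p2) / 2
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n)         # SE under Ho
    se_alt = sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n)  # SE under Ha
    return STD_NORMAL.cdf((abs(p1 - p2) - z_a * se_null) / se_alt)


# Power as the difference grows (N per group = 100, P1 = 0.5).
for diff in (0.05, 0.10, 0.15, 0.20, 0.25):
    print(diff, round(power_two_proportions(0.5, 0.5 - diff, 100), 2))

# Power as the sample size grows (P1 = 0.5, P2 = 0.3).
for n in (25, 50, 100, 150):
    print(n, round(power_two_proportions(0.5, 0.3, n), 2))
```

When P1 = P2 the function returns α itself, which is a quick sanity check on the formula.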

Nquery Advisor l About $700 l Many more options than many other programs l Available in student room in our department

Nquery Advisor l Under “file” choose “New” l Choices: means, proportions, agreement, survival (time to event), regression l # groups (1, 2, >2) l testing, confidence intervals, equivalence

Examples l Continuous response l Binary response

Sample Size: Specifying the Hypothesis l Ho: Mean blood pressure on drug A = mean blood pressure on drug B measured six hours after start of treatment. l Ha: Mean blood pressure on drug A < mean blood pressure on drug B measured six hours after start of treatment.

Example (SHR) l Animal blood pressures measured at baseline l Animals randomly assigned to placebo or minoxidil l Animals measured 6 hours post treatment l Changes from baseline calculated for each animal

Example (Continued) l Placebo changes thought to be centered at 0 l Expect minoxidil to lower blood pressure, we think by 10 mm Hg l Blood pressure changes have a standard deviation of 5 mm Hg

Example (Continued) l How many animals/group needed to have 90 % power to detect the 10 mm Hg mean difference? l How would this sample size change if the standard deviation is 10 mm Hg rather than 5 mm Hg?
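One way to answer both questions is the normal-approximation formula n = 2(Z_α + Z_β)²σ²/δ² per group. The slides do not state α for this example, so the sketch assumes α = 0.05 and a one-sided test (matching the one-sided Ha); the function name is illustrative.

```python
from math import ceil
from statistics import NormalDist


def animals_per_group(delta, sigma, alpha=0.05, power=0.90, two_sided=False):
    """Animals per group, rounded up, by the normal approximation."""
    norm = NormalDist()
    z_a = norm.inv_cdf(1 - alpha / 2 if two_sided else 1 - alpha)
    z_b = norm.inv_cdf(power)
    return ceil(2 * (z_a + z_b) ** 2 * sigma ** 2 / delta ** 2)


print(animals_per_group(delta=10, sigma=5))   # SD = 5 mm Hg
print(animals_per_group(delta=10, sigma=10))  # SD = 10 mm Hg
```

Under these assumptions the answers are 5 and 18 animals per group: doubling the standard deviation quadruples the unrounded sample size. With groups this small the normal approximation is optimistic, so a t-based calculation from a dedicated program would give a slightly larger number.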

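Example (Continued) l Suppose we change the endpoint to a binary one: did the animal achieve a reduction in blood pressure of 10 mm Hg or more? l Minoxidil changes are centered at a 10 mm Hg reduction, so 50% of those on minoxidil would be expected to have a reduction of 10 or more. l A 10 mm Hg reduction is two standard deviations (2 x 5 mm Hg) from the placebo mean of 0, so about 2.5% of those on placebo would have a reduction of 10 or more.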

Example (Continued) l How many animals/group required to have 90% power to detect the 50% vs. 2.5%? l Why the difference in sample sizes for the same experiment? Comment on: Assumptions; Endpoint; Specific hypothesis.
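For the binary endpoint, a rough answer comes from the standard normal-approximation formula for two independent proportions. As before, α = 0.05 one-sided is an assumption, and the function name is illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist


def animals_per_group_binary(p1, p2, alpha=0.05, power=0.90):
    """Animals per group for a one-sided comparison of two proportions."""
    norm = NormalDist()
    z_a = norm.inv_cdf(1 - alpha)
    z_b = norm.inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_a * sqrt(2 * p_bar * (1 - p_bar))
                 + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


print(animals_per_group_binary(0.50, 0.025))
```

Under these assumptions the formula gives 13 animals per group, compared with about 5 for the continuous endpoint with SD = 5 mm Hg: dichotomizing the same measurement discards information. With counts this small the approximation is crude, so an exact method is advisable before committing to a number.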

Sample Size: More Than One Primary Response l Use largest sample size.

Sample Size: Food for Thought l Is detectable difference biologically meaningful? l Is sample size too small to be believable? l N = 5 is a “rule of thumb”, but is this valid for the experiment being planned?

Sample Size: Misunderstandings l “Larger the difference, smaller the sample size”: ignores contribution of variability l Failing to report power for a negative study: calculate it based on the hypothesized difference and observed variability

Sample Size: Keeping It Small l Study continuous rather than binary outcome (if variability does not increase), e.g. change in tumor size instead of recurrence l Study surrogate outcome where effect is large, e.g. cholesterol reduction rather than mortality

Examples Of Surrogate Outcome Measures? l Bone density l Quality of life l Patency l Pain relief l Functional Status l Cholesterol

Sample Size: Keeping It Small l Decrease variability: change from baseline or analysis of covariance; training; equipment; choice of animal model

Sample Size: Keeping It Small l α = 0.05, 2-sided test l β = 0.2; power = 0.8 (80%) l Difference between two means = 1 l Standard deviation = 2; N = 64/group l Standard deviation = 1; N = 17/group
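The two sample sizes on this slide can be checked with a short sketch. The plain normal-approximation formula n = 2(Z_α + Z_β)²σ²/δ² gives 63 and 16 per group; adding Z_α²/4, a commonly used small-sample correction for the two-sample t test, reproduces 64 and 17. The correction is an assumption about how such figures are typically obtained, not something the slides state.

```python
from math import ceil
from statistics import NormalDist


def n_normal(delta, sigma, alpha=0.05, power=0.80):
    """Unrounded n per group, two-sided test, normal approximation."""
    norm = NormalDist()
    z_a = norm.inv_cdf(1 - alpha / 2)
    z_b = norm.inv_cdf(power)
    return 2 * (z_a + z_b) ** 2 * sigma ** 2 / delta ** 2


def n_with_t_correction(delta, sigma, alpha=0.05, power=0.80):
    """Normal-approximation n plus Z_alpha^2 / 4 (assumed t-test correction)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    return n_normal(delta, sigma, alpha, power) + z_a ** 2 / 4


for sd in (2, 1):
    print(sd, ceil(n_normal(1, sd)), ceil(n_with_t_correction(1, sd)))
```

Halving the standard deviation cuts the requirement roughly fourfold, which is the point of the slide: reducing variability is often the cheapest route to a smaller study.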

Sample Size Estimation l Parameters are estimates l Estimate of relative effectiveness based on other populations l Effectiveness overstated l Patients in trials do better l Assuming mathematical models l Compromise between available resources/objectives

Sample Size: Pilot Studies l No information on variability l No information on efficacy l Use effect size from similar studies or gather pilot data for estimation

Simplification l What Is The Question? l What Is The Primary Outcome Variable? l What Is The Principal Hypothesis?

Sample Size/Power Freeware on Web: l http://www.stat.uiowa.edu/~rlenth/Power/ l http://hedwig.mgh.harvard.edu/sample_size/size.html l http://www.bio.ri.ccf.org/power.html l http://www.dartmouth.edu/~chance/

Sample Size l Purchase Software: http://www.powerandprecision.com/ l Nquery: www.statsolusa.com

Additional Comments

Pilot Studies l Complication rate: $P = 1 - (1 - r)^N$, where P = probability of observing at least one complication, r = complication rate, N = sample size l If desired P and N are known, can solve for r l If desired r and P are known, can solve for N

Example to Work l Want to have 90% probability of detecting at least one complication, given a 25% complication rate. What N do you need? l You are studying 25 people and want 80% probability of detecting at least one complication. What is the complication rate that would yield this probability?
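Both exercises follow from rearranging P = 1 - (1 - r)^N, which gives N = ln(1 - P) / ln(1 - r) and r = 1 - (1 - P)^(1/N). A minimal sketch, with illustrative function names:

```python
from math import ceil, log


def n_for_detection(r, p):
    """Smallest N giving probability >= p of seeing at least one complication."""
    return ceil(log(1 - p) / log(1 - r))


def rate_for_detection(n, p):
    """Complication rate r at which N subjects give detection probability p."""
    return 1 - (1 - p) ** (1 / n)


print(n_for_detection(r=0.25, p=0.90))               # first exercise
print(round(rate_for_detection(n=25, p=0.80), 3))    # second exercise
```

The first exercise needs N = 9: eight subjects give a probability just under 90%, so the result must be rounded up. The second gives a complication rate of roughly 6%.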

Pilot Studies l Use larger alpha (>0.05, e.g. 0.15 or 0.2) to compute sample size; if the null hypothesis is rejected, the treatment will be tested in a future study l Underlying concept: futility; ensure new treatment is not worse than standard.

Pilot Studies l Can reformulate hypothesis: Ho: new treatment = placebo; Ha: new treatment < placebo l Continue to larger study if fail to reject Ho.

Avoid Data Driven Comparisons l Test here

Randomization: Bias Due to Order of Observations l Learning effect l Change in laboratory techniques l Different litters l Carry-over effects: carry-over is under-estimated; with two treatments in the same animal (give A & B), can only test effect of B after A

Randomization: Order Effects Continued l System fatigue: rabbit heart’s ability to function after two different treatments

Randomization: Order Effects Continued l Seasonal variability: all rats male, same weight, same age; media temperature and other incubation conditions identical; housed in identical conditions l Outcome - unstimulated renin release from kidneys (in vitro), samples at 30 minutes l Outcome - Metastasis - winter 16% (n=767); summer 8% (n=142); logistic regression p<0.03 for season
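One simple guard against the order effects listed above is to randomize both the treatment allocation and the order in which animals are measured. The sketch below is a minimal illustration; the function name, animal IDs, and seed are hypothetical.

```python
import random


def randomize_animals(animal_ids, treatments, seed=None):
    """Return (allocation, measurement_order) chosen at random.

    Allocation is balanced: shuffled animals are dealt to treatments in turn.
    """
    rng = random.Random(seed)  # fixed seed makes the plan reproducible

    shuffled = list(animal_ids)
    rng.shuffle(shuffled)
    allocation = {
        animal: treatments[i % len(treatments)]
        for i, animal in enumerate(shuffled)
    }

    # A separate shuffle, so measurement order is unrelated to treatment.
    measurement_order = list(animal_ids)
    rng.shuffle(measurement_order)
    return allocation, measurement_order


allocation, order = randomize_animals(range(1, 11), ["placebo", "minoxidil"], seed=2024)
print(allocation)
print(order)
```

Recording the seed alongside the protocol lets the allocation be audited later. Randomizing within litter or season (blocking) would be a natural extension when those factors are known in advance.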
