Biostatistics in Practice Peter D. Christenson Biostatistician http://gcrc.LABioMed.org/Biostat Session 1: Quantitative Needs in Biological Research
Session 1 Objectives General quantitative needs in biological research. Local statistics resources. Broad overview of concepts today that will be detailed in later sessions: Statistical thinking Study design issues Sources of variation Randomization
General Quantitative Needs Descriptive: Appropriate summarization to meet scientific questions: e.g., changes or % changes or reaching threshold? mean or minimum or natural range of response? avg. time to death or chances of dying by a fixed time? Inferential: Could results be spurious, a fluke, due to “natural” variations or chance? Sensitivity/Power: How many subjects are needed? Validity: Issues such as bias and correct inference are generally scientific, but addressed statistically.
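As a rough illustration of the descriptive choices on this slide, the sketch below summarizes the same before/after measurements three ways: mean change, mean % change, and the fraction of subjects reaching a threshold. The cholesterol values and the threshold of 200 are hypothetical, chosen only for illustration, and `summarize` is not a name from the presentation.

```python
from statistics import mean

def summarize(before, after, threshold):
    """Summarize paired measurements three different ways."""
    changes = [a - b for b, a in zip(before, after)]
    pct_changes = [100 * (a - b) / b for b, a in zip(before, after)]
    reached = [a < threshold for a in after]
    return {
        "mean_change": mean(changes),
        "mean_pct_change": mean(pct_changes),
        "fraction_below_threshold": sum(reached) / len(reached),
    }

# Hypothetical cholesterol values for five subjects (not real data).
before = [240, 210, 265, 198, 230]
after = [215, 205, 230, 199, 190]

for name, value in summarize(before, after, threshold=200).items():
    print(f"{name}: {value:.2f}")
```

Which summary is appropriate depends on the scientific question, not on the data.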
Beginning, Middle, and End Structure and examples of: Protocols Software Publications
Typical Statistical Section of Protocol 11.1 Overview 11.2 Randomization/Treatment assignment 11.3 Study size 11.4 Missing data / subject withdrawal or incompletion 11.5 Definitions / outcomes 11.6 Analysis populations 11.7 Data analysis methods 11.8 Interim analyses
Professional Statistics Software Package Output Enter code; syntax. Stored data; accessible.
Typical Statistics Software Package Select Methods from Menus Output after menu selection Data in spreadsheet www.ncss.com www.minitab.com
Microsoft Excel for Statistics Primarily for descriptive statistics. Limited output. No analyses for %s.
Almost Free On-Line Statistics Software Package Run from browser, not local. Can store data and results on the statcrunch server. $5 per 6 months of usage.
Local Biostatistical Resources Biostatistician: Peter Christenson Assist with study design, protocol development Minor, limited analysis of data Major analysis as investigator with %FTE on funded studies GCRC and non-GCRC studies Biostatistics short courses: 6 weeks 2x/yr GCRC computer laboratory in RB-3 For GCRC studies Statistical, graphics, database software Webpage: http://gcrc.LABioMed.org/Biostat Has links to software, online texts, etc.
Readings for Session 1 from StatisticalPractice.com Is statistics hard? An all-important foundation Cause and effect Study design Random samples / randomization
Is Statistics Hard? Statistics is used for: 1. Description. 2. Inference. Inference requires a peculiar way of thinking: 1. It is backwards. 2. The statistical methods are convoluted. 3. No evidence is not evidence of no effect.
Backwards and Convoluted Example: We measure cholesterol in 100 subjects before and after drug therapy. They all decrease. Common sense conclusion: Therapy probably works. Statistical approach is backwards: Step 1: First, assume a conclusion of no therapy effect. Step 2: Does the data contradict this assumption? If so, then conclude that therapy works. Analogy: U.S. Courts and Burden of Proof
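One way to make the two steps concrete is an exact sign test, sketched below. It is an illustrative choice, not necessarily the method the presenter has in mind. It assumes that under "no therapy effect" each subject is equally likely to decrease or increase, and that there are no ties. The function name `sign_test_p_value` is hypothetical.

```python
from math import comb

def sign_test_p_value(n_decreased: int, n_total: int) -> float:
    """Two-sided exact sign test p-value.

    Step 1: assume no therapy effect, so a decrease and an increase
    each have probability 0.5 for every subject.
    Step 2: ask how likely a split at least this lopsided would be.
    """
    k = max(n_decreased, n_total - n_decreased)
    # Probability of k or more subjects moving in the same given direction.
    tail = sum(comb(n_total, i) for i in range(k, n_total + 1)) / 2 ** n_total
    # Double for the two-sided test; cap at 1.
    return min(1.0, 2 * tail)

# The slide's example: all 100 subjects decrease.
p = sign_test_p_value(100, 100)
print(f"Chance of a result this extreme if therapy had no effect: {p:.3g}")
```

A chance this small contradicts the no-effect assumption, which is the formal route to the common-sense conclusion.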
Statistics Can Work AGAINST Good Science Good science: We again assume no therapy effect, where this effect is pre-specified and specific, i.e., %Δ in LDL is 0. We observe data that has only a 5% chance of occurring under this assumption just due to uncontrollable random error, so we conclude that therapy works. OK (5% chance of reaching this conclusion when there is truly no effect) Bad science: We again assume no therapy effect, but this effect is defined by looking around all of the measured characteristics after performing the study. The assumption is contradicted for one characteristic, and we conclude that therapy works. NOT OK (This is almost guaranteed to happen if many characteristics are examined.) Can be subtle in practice.
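The "almost guaranteed" claim can be checked with simple arithmetic. The sketch below assumes the examined characteristics are independent and each is tested at the 5% level; real measurements are usually correlated, so the exact numbers would differ. The function name is hypothetical.

```python
def chance_of_false_positive(n_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one of n_tests independent tests
    contradicts the no-effect assumption purely by chance."""
    return 1 - (1 - alpha) ** n_tests

for k in (1, 5, 20, 50):
    print(f"{k:>2} characteristics examined: "
          f"{chance_of_false_positive(k):.0%} chance of a spurious finding")
```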
Bad Science That Seems So Good 1. Re-examining data, seeming to be due diligence. 2. Adding subjects to a study that is showing marginal effects; stopping early due to strong results. 3. Emphasizing effects in subgroups. See NEJM 2006; 354(16):1667-1669. Actually bad practice? Could be negligent NOT to do these, but need to account for them.
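The effect of item 2 can be seen in a small simulation, sketched below under simplifying assumptions: the true mean change is 0, the standard deviation is known to be 1, and the data are tested with a z-test each time a batch of subjects is added, stopping at the first "significant" result. The batch size, study size, and function name are illustrative choices, not from the presentation.

```python
import random
from statistics import NormalDist

def false_positive_rate(n_max=100, look_every=10, n_sims=2000,
                        alpha=0.05, seed=1):
    """Fraction of simulated no-effect studies that declare an effect
    when the data are tested after every `look_every` subjects."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    hits = 0
    for _ in range(n_sims):
        total = 0.0
        for n in range(1, n_max + 1):
            total += rng.gauss(0.0, 1.0)  # true mean change is 0
            if n % look_every == 0:
                z = total / n ** 0.5      # z statistic, known SD = 1
                if abs(z) > z_crit:
                    hits += 1             # stop early and claim an effect
                    break
    return hits / n_sims

print("One planned analysis at 100 subjects:", false_positive_rate(look_every=100))
print("Testing after every 10 subjects:     ", false_positive_rate(look_every=10))
```

Repeated looks are not forbidden; they are typically planned as interim analyses with adjusted thresholds so that the overall error rate stays at the intended level.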
No Evidence ≠ Evidence of No Effect Example, continued: Cholesterol changes range from -20 to +15 in 100 subjects from before to after drug therapy. Conclusion: The data probably does not contradict the assumption of no therapy effect. Lack of proof of therapy effectiveness does not imply the drug is not effective. Cholesterol changes range from -2 to +1 in 100 subjects from before to after drug therapy. Here, lack of proof of effectiveness does imply that the drug is practically ineffective.
No Evidence ≠ Evidence of No Effect Continued. It actually is more subtle and needs statistics: Almost certain that the mean change is between -20 and +15 from before to after drug therapy. Conclusion: The data probably does not contradict the assumption of no therapy effect. Lack of proof of therapy effectiveness does not imply the drug is not effective. Almost certain that the mean change is between -2 and +1 from before to after drug therapy. Here, lack of proof of effectiveness does imply that the drug is practically ineffective.
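The "almost certain" ranges on this slide correspond to confidence intervals for the mean change. The sketch below computes a normal-approximation interval and then compares it with the smallest change that would matter in practice. That threshold (5 units here) is a hypothetical value that must come from the science, not the data, and the function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_change_ci(changes, confidence=0.95):
    """Normal-approximation confidence interval for the mean change."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev(changes) / sqrt(len(changes))
    m = mean(changes)
    return m - half_width, m + half_width

def interpret(ci, meaningful_change):
    """Classify an interval relative to 0 and to a practical threshold."""
    low, high = ci
    if low > 0 or high < 0:
        return "effect detected"
    if -meaningful_change < low and high < meaningful_change:
        return "practically ineffective"
    return "inconclusive"

# The two intervals from the slide, with a hypothetical threshold of 5.
print(interpret((-20, 15), meaningful_change=5))  # wide interval
print(interpret((-2, 1), meaningful_change=5))    # narrow interval
```

Both intervals contain 0, so neither contradicts the no-effect assumption, but only the narrow one rules out changes large enough to matter.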
Cause and Effect See p. 1 of readings. Association is not causation. Examples: 1. Crimes ↑ as church membership ↑. 2. Welfare causes paupers? Another example: Pre-marital cohabitation causes divorce? See next slide. The only hard data is that the divorce rate is greater for those who cohabited than for those who did not. The rest is speculation.
Cause and Effect: Limitations of Data Itself See pp 2-3 of readings. Data analysis can often only lead, not prove. Need mechanisms, randomization, independent sources of evidence (See p. 3 of readings, smoking and lung cancer). Scientific fields with decreasing ability to assign causality: Physical sciences. Clinical trials and some basic biology. Epidemiology and sociology; observational. History.
Study Design: General Requirements See pp 2-7 of readings. 1. Clearly stated research question and primary outcome measure. 2. Project must be feasible. 3. Carefully consider all facets of the protocol and data items; you will be held to them. 4. Keep it simple. 5. Research has consequences. Papers are sometimes misread or misused.
Study Design: Types See pp. 7-13 of readings. 1. Observational studies 2. Surveys 3. Cross sectional / longitudinal 4. Cohort / case-control 5. Interventional / controlled clinical trials
Study Design Issues See pp 13-18 of readings. 1. Comparison group: differs how? Use placebo? 2. Study size considerations 3. Many subjects vs. many measurements 4. Paired vs. unpaired data 5. Parallel groups vs. cross-over studies 6. Repeated measures 7. Intention-to-treat and meta-analysis
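Study size and the paired vs. unpaired choice can be illustrated with the standard normal-approximation sample-size formulas, sketched below. The inputs (a difference of 10 units to detect, standard deviation 20, two-sided 5% level, 80% power) are hypothetical, and the function names are illustrative. Exact t-based calculations give slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist

def _z_sum(alpha: float, power: float) -> float:
    z = NormalDist()
    return z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)

def subjects_per_group(delta, sd, alpha=0.05, power=0.80):
    """Unpaired (parallel groups): subjects needed in EACH group to
    detect a true difference `delta` when the between-subject SD is `sd`."""
    return ceil(2 * (_z_sum(alpha, power) * sd / delta) ** 2)

def subjects_paired(delta, sd_diff, alpha=0.05, power=0.80):
    """Paired: total subjects needed when `sd_diff` is the SD of the
    within-subject differences."""
    return ceil((_z_sum(alpha, power) * sd_diff / delta) ** 2)

print("Parallel groups, per group:", subjects_per_group(delta=10, sd=20))
print("Paired design, total:      ", subjects_paired(delta=10, sd_diff=20))
```

A paired design pays off most when within-subject differences vary much less than the measurements vary between subjects.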
Random Samples / Randomization See pp 1-2 of readings. Random samples of subjects Randomization of treatment - masking (blinding). Validity Generalizability
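As one concrete way to randomize treatment, the sketch below generates a permuted-block allocation list. This is a common technique chosen for illustration; the slide itself only calls for randomization and masking. The block size, arm labels, and function name are assumptions.

```python
import random

def block_randomize(n_subjects, block_size=4, arms=("A", "B"), seed=None):
    """Permuted-block randomization: within each block every arm appears
    equally often, in random order, so group sizes stay balanced."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    per_arm = block_size // len(arms)
    assignments = []
    while len(assignments) < n_subjects:
        block = list(arms) * per_arm
        rng.shuffle(block)
        assignments.extend(block)
    return assignments[:n_subjects]

# Allocation list for 12 subjects; the seed is fixed only so the
# illustration is reproducible.
print(block_randomize(12, seed=2024))
```

In practice the list is prepared in advance and kept from whoever enrolls subjects, so the next assignment is unknown at enrollment.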
Review Questions and Reference Questions on the next five slides are from an excellent biostatistics textbook: Martin Bland, Introduction to Medical Statistics, 3rd ed., Oxford University Press, 2000. The author states that “… some questions are quite hard. If you score +1 for a correct answer, -1 for an incorrect answer, and 0 for a part you omitted, I would regard 40% at the pass level, 50% as good, 60% as very good, and 70% as excellent. … Some questions may be ambiguous, so you will not score 100%.”
Question A: Answer True or False for each part (5 answers) When testing a new medical treatment, suitable control groups include patients who: 1. Are treated by a different MD at the same time. 2. Are treated by a different hospital. 3. Are not willing to receive the new treatment. 4. Were treated by the same doctor in the past. 5. Are not suitable for the new treatment.
Question B: Answer True or False for each part (5 answers) In an experiment to compare 2 treatments, subjects are allocated using random numbers so that: 1. The sample may be referred to a known population. 2. When enrolling a potential subject, we do not know which treatment the subject will receive. 3. The subjects will get the treatment best suited to them. 4. The two groups will be similar, apart from treatment. 5. Treatments are assigned to meet subject characteristics.
Question C: Answer True or False for each part (5 answers) In a double-blind clinical trial: 1. The patients do not know which treatment they receive. 2. Each patient receives a placebo. 3. The patients are blind to the fact that they are in a trial. 4. Each patient receives both treatments. 5. The clinician making assessment does not know which treatment the patient receives.
Question D: Answer True or False for each part (5 answers) Cross-over designs for clinical trials: 1. May be used to compare several treatments. 2. Involve no randomization. 3. Require fewer patients than do designs comparing independent groups. 4. Are useful for comparing treatments intended to alleviate chronic symptoms. 5. Use the patient as his or her own control.
Question E: Answer True or False for each part (5 answers) Placebos are useful in clinical trials: 1. When 2 apparently similar active treatments are to be compared. 2. To guarantee comparability in non-randomized trials. 3. Because the fact of being treated may itself produce a response. 4. Because they may help to conceal the subject’s treatment from clinicians who assess them. 5. When an active treatment is compared to no treatment.
Answers to Questions A: F,F,F,F,F. B: F,T,F,T,F. C: T,F,F,F,T. D: T,F,T,T,T. E: F,F,T,T,T.
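For self-scoring, the sketch below applies the quoted rule (+1 for a correct part, -1 for an incorrect part, 0 for an omitted part) to the answer key on this slide. The response format (a string of T/F per question, with "-" for an omitted part) and the sample responses are hypothetical conventions introduced here.

```python
# Answer key transcribed from the answers slide.
ANSWER_KEY = {
    "A": "FFFFF",
    "B": "FTFTF",
    "C": "TFFFT",
    "D": "TFTTT",
    "E": "FFTTT",
}

def score_percent(responses):
    """Score as a percentage of the 25 parts: +1 correct, -1 incorrect,
    0 for a part marked '-' or a question left out entirely."""
    total = 0
    for question, key in ANSWER_KEY.items():
        given = responses.get(question, "-" * len(key))
        for correct, answer in zip(key, given):
            if answer == "-":
                continue  # omitted part scores 0
            total += 1 if answer == correct else -1
    n_parts = sum(len(key) for key in ANSWER_KEY.values())
    return 100 * total / n_parts

# Hypothetical responses: questions D and E skipped, one part of B
# omitted, one part of C wrong.
my_answers = {"A": "FFFFF", "B": "FTFT-", "C": "TTFFT"}
print(f"Score: {score_percent(my_answers):.0f}%")
```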