Power and Sample Size + Principles of Simulation Benjamin Neale March 4 th, 2010 International Twin Workshop, Boulder, CO.

Slides:



Advertisements
Similar presentations
Introduction to Hypothesis Testing
Advertisements

Inferential Statistics and t - tests
PTP 560 Research Methods Week 9 Thomas Ruediger, PT.
CHAPTER 21 Inferential Statistical Analysis. Understanding probability The idea of probability is central to inferential statistics. It means the chance.
Environmental Data Analysis with MatLab Lecture 23: Hypothesis Testing continued; F-Tests.
Statistical Significance What is Statistical Significance? What is Statistical Significance? How Do We Know Whether a Result is Statistically Significant?
HYPOTHESIS TESTING Four Steps Statistical Significance Outcomes Sampling Distributions.
Business 205. Review Sampling Continuous Random Variables Central Limit Theorem Z-test.
Chapter Seventeen HYPOTHESIS TESTING
Hypothesis Testing Steps of a Statistical Significance Test. 1. Assumptions Type of data, form of population, method of sampling, sample size.
DATA ANALYSIS I MKT525. Plan of analysis What decision must be made? What are research objectives? What do you have to know to reach those objectives?
Cal State Northridge  320 Ainsworth Sampling Distributions and Hypothesis Testing.
Statistical Significance What is Statistical Significance? How Do We Know Whether a Result is Statistically Significant? How Do We Know Whether a Result.
Hypothesis Testing: Type II Error and Power.
Independent Sample T-test Often used with experimental designs N subjects are randomly assigned to two groups (Control * Treatment). After treatment, the.
Today Concepts underlying inferential statistics
Independent Sample T-test Classical design used in psychology/medicine N subjects are randomly assigned to two groups (Control * Treatment). After treatment,
Chapter 14 Inferential Data Analysis
Chapter 12 Inferential Statistics Gay, Mills, and Airasian
Chapter Nine: Evaluating Results from Samples Review of concepts of testing a null hypothesis. Test statistic and its null distribution Type I and Type.
Hypothesis Testing:.
Jeopardy Hypothesis Testing T-test Basics T for Indep. Samples Z-scores Probability $100 $200$200 $300 $500 $400 $300 $400 $300 $400 $500 $400.
Hypothesis testing. Want to know something about a population Take a sample from that population Measure the sample What would you expect the sample to.
Sampling Distributions and Hypothesis Testing. 2 Major Points An example An example Sampling distribution Sampling distribution Hypothesis testing Hypothesis.
Chapter 8 Introduction to Hypothesis Testing
Power & Sample Size and Principles of Simulation Michael C Neale HGEN 691 Sept Based on Benjamin Neale March 2014 International Twin Workshop,
Chapter 7 Statistical Issues in Research Planning and Evaluation.
Sample size determination Nick Barrowman, PhD Senior Statistician Clinical Research Unit, CHEO Research Institute March 29, 2010.
1 Power and Sample Size in Testing One Mean. 2 Type I & Type II Error Type I Error: reject the null hypothesis when it is true. The probability of a Type.
Elementary Statistical Methods André L. Souza, Ph.D. The University of Alabama Lecture 22 Statistical Power.
Statistical Decision Making. Almost all problems in statistics can be formulated as a problem of making a decision. That is given some data observed from.
Instructor Resource Chapter 5 Copyright © Scott B. Patten, Permission granted for classroom use with Epidemiology for Canadian Students: Principles,
Educational Research: Competencies for Analysis and Application, 9 th edition. Gay, Mills, & Airasian © 2009 Pearson Education, Inc. All rights reserved.
1 Lecture 19: Hypothesis Tests Devore, Ch Topics I.Statistical Hypotheses (pl!) –Null and Alternative Hypotheses –Testing statistics and rejection.
1 rules of engagement no computer or no power → no lesson no SPSS → no lesson no homework done → no lesson GE 5 Tutorial 5.
1 Lecture note 4 Hypothesis Testing Significant Difference ©
No criminal on the run The concept of test of significance FETP India.
1 Chapter 10: Introduction to Inference. 2 Inference Inference is the statistical process by which we use information collected from a sample to infer.
Inferential Statistics Body of statistical computations relevant to making inferences from findings based on sample observations to some larger population.
Power and Sample Size Boulder 2004 Benjamin Neale Shaun Purcell.
RDPStatistical Methods in Scientific Research - Lecture 41 Lecture 4 Sample size determination 4.1 Criteria for sample size determination 4.2 Finding the.
Educational Research Chapter 13 Inferential Statistics Gay, Mills, and Airasian 10 th Edition.
Education 793 Class Notes Decisions, Error and Power Presentation 8.
Issues concerning the interpretation of statistical significance tests.
Module 15: Hypothesis Testing This modules discusses the concepts of hypothesis testing, including α-level, p-values, and statistical power. Reviewed.
Power and Sample Size Anquan Zhang presents For Measurement and Statistics Club.
Power Simulation Ascertainment Benjamin Neale March 6 th, 2014 International Twin Workshop, Boulder, CO Denotes practical Denotes He-man.
Statistical Inference Statistical inference is concerned with the use of sample data to make inferences about unknown population parameters. For example,
Education 793 Class Notes Inference and Hypothesis Testing Using the Normal Distribution 8 October 2003.
Sampling Distributions Statistics Introduction Let’s assume that the IQ in the population has a mean (  ) of 100 and a standard deviation (  )
Chapter 13 Understanding research results: statistical inference.
If we fail to reject the null when the null is false what type of error was made? Type II.
Lec. 19 – Hypothesis Testing: The Null and Types of Error.
Educational Research Inferential Statistics Chapter th Chapter 12- 8th Gay and Airasian.
Statistical principles: the normal distribution and methods of testing Or, “Explaining the arrangement of things”
McGraw-Hill/Irwin © 2003 The McGraw-Hill Companies, Inc.,All Rights Reserved. Part Four ANALYSIS AND PRESENTATION OF DATA.
Statistical Decision Making. Almost all problems in statistics can be formulated as a problem of making a decision. That is given some data observed from.
Methods of Presenting and Interpreting Information Class 9.
Chapter 8: Inferences Based on a Single Sample: Tests of Hypotheses
Power and p-values Benjamin Neale March 10th, 2016
Part Four ANALYSIS AND PRESENTATION OF DATA
Hypothesis Testing.
Tests of Significance The reasoning of significance tests
Hypothesis Testing: Hypotheses
Statistical Process Control
Chapter 9: Hypothesis Tests Based on a Single Sample
Power and Sample Size I HAVE THE POWER!!! Boulder 2006 Benjamin Neale.
Power and Error What is it?.
Type I and Type II Errors
Statistical Power.
Presentation transcript:

Power and Sample Size + Principles of Simulation Benjamin Neale March 4 th, 2010 International Twin Workshop, Boulder, CO

Slide of Questions What is power? What affects power? How do we calculate power? What is simulation? Why do we simulate? How do we simulate?

What is power? Definitions of power The probability that the test will reject the null hypothesis if the alternative hypothesis is true The chance the your statistical test will yield a significant result when the effect you are testing exists

Key Concepts and Terms Null Hypothesis Alternative Hypothesis Distribution of test statistics

Key Concepts and Terms Null Hypothesis ◦The baseline hypothesis, generally assumed to be the absence of the tested effect Alternative Hypothesis ◦The hypothesis for the presence of an effect Distribution of test statistics ◦The frequencies of the values of the tests statistics under the null and the alternative

Practical 1 We are going to simulate a normal distribution using R We can do this with a single line of code, but let’s break it up

Simulation functions R has functions for many distributions Normal, χ 2, gamma, beta (others) Let’s start by looking at the random normal function: rnorm()

rnorm Documentation In R: ?rnorm

rnorm syntax rnorm(n, mean = 0, sd = 1) Function name Number of Observations to simulate Mean of distribution with default value Standard deviation of distribution with default value

R script: Norm_dist_sim.R This script will plot 4 samples from the normal distribution Look for changes in shape Thoughts?

What did we learn? You have to comment The presentation will not continue without audience participation ◦No this isn’t a game of chicken

One I made earlier

Concepts Sampling variance ◦We saw that the ‘normal’ distribution from 100 observations looks stranger than for 1,000,000 observations Where else may this sampling variance happen? How certain are we that we have created a good distribution?

Mean estimation Rather than just simulating the normal distribution, let’s simulate what our estimate of a mean looks like as a function of sample size We will run the R script mean_estimate_sim.R

R script: mean_estimate_sim.R This script will plot 4 samples from the normal distribution Look for changes in shape Thoughts?

What did we learn? You have to comment The presentation will not continue without audience participation ◦No this isn’t a game of chicken

One I made earlier

Standard Error We see an inverse relationship between sample size and the variance of the estimate This variability in the estimate can be calculated from theory SE x = s/√n SE x is the standard error, s is the sample standard deviation, and n is the sample size

Existential Crisis! What does this variability mean? Again—this is where you comment

Key Concept 1 The sampling variability in my estimate affects my ability to declare a parameter as significant (or significantly different)

Power definition again The probability that the test will reject the null hypothesis if the alternative hypothesis is true

Hypothesis Testing Mean different from 0 hypotheses: ◦h o (null hypothesis) is μ=0 ◦h a (alternative hypothesis) is μ ≠ 0  Two-sided test, where μ > 0 or μ < 0 are one- sided Null hypothesis usually assumes no effect Alternative hypothesis is the idea being tested

Possible scenarios Reject H 0 Fail to reject H 0 H 0 is true 1- H a is true 1- =type 1 error rate =type 2 error rate 1-=statistical power

Rejection of H 0 Non-rejection of H 0 H 0 true H A true Nonsignificant result (1- ) Type II error at rate  Significant result (1-) Type I error at rate  Statistical Analysis Truth

Expanded Power Definition The probability of rejection of the null hypothesis depends on: ◦The significance criterion () ◦The sample size (N) ◦The effect size (Δ) The probability of detecting a given effect size in a population from a sample size, N, using a significance criterion, .

T alpha 0.05 Sampling distribution if H A were true Sampling distribution if H 0 were true   POWER: 1 -  Standard Case Non-centrality parameter

T   POWER: 1 -  ↑ Increased effect size Non-centrality parameter alpha 0.05 Sampling distribution if H A were true Sampling distribution if H 0 were true

T alpha 0.05 Sampling distribution if H A were true Sampling distribution if H 0 were true   POWER: 1 -  Standard Case Non-centrality parameter

T   More conservative α Sampling distribution if H A were true Sampling distribution if H 0 were true POWER: 1 -  ↓ alpha 0.01

T   Non-centrality parameter Less conservative α Sampling distribution if H A were true Sampling distribution if H 0 were true POWER: 1 -  ↑ alpha 0.10

T alpha 0.05 Sampling distribution if H A were true Sampling distribution if H 0 were true   POWER: 1 -  Standard Case Non-centrality parameter

T   POWER: 1 -  ↑ Increased sample size Non-centrality parameter alpha 0.05 Sampling distribution if H A were true Sampling distribution if H 0 were true

T   POWER: 1 -  ↑ Increased sample size Non-centrality parameter alpha 0.05 Sampling distribution if H A were true Sampling distribution if H 0 were true Sample size scales linearly with NCP

Additional Factors Type of Data ◦Continuous > Ordinal > Binary ◦Do not turn “true” binary variables into continuous Multivariate analysis Remove confounders and biases MZ:DZ ratio

Effects Recap Larger effect sizes ◦Reduce heterogeneity Larger sample sizes Change significance threshold ◦False positives may become problematic

Power of Classical Twin Model

You can do this by hand But machines can be much more fun Martin, Eaves, Kearsey, and Davies Power of the Twin Study, Heredity, 1978

How can we determine our power to detect variance components?

Simulation is king Just like we simulated estimates of means we can simulate chi squares from dropping C We get to play God [well more than usual] ◦We fix the means and variances as parameters to simulate ◦We fit the model ACE model ◦We drop C ◦We generate our alternative sampling distribution of statistics

Script: test_C_sim.R This script does not produce anything, but rather creates a function I will walk through the script explaining the content We will make this function and then run the function once to see what happens Then we can generate a few more simulations (though I’m not super keen on the computation power of the machines)

Script: run_C_sim.R Now we understand what the function is doing, we can run it many times We are going to use sapply again This time we will sapply the new function we made In addition to generating our chis, we create an object of the chis to assess our power and see what our results look like

Principles of Simulation “All models are wrong but some are useful” --George Box Simulation is useful for refining intuition Helpful for grant writing Calibrates need for sample size Many factors affect power Ascertainment Measurement error Many others

Why R is great Simulation is super easy in R ClassicMx did not have such routines We can evaluate our power easily using R as well We can generate pictures of our power easily

Additional Online Resources Genetic Power Calculator ◦Good for genetic studies with genotype data ◦Also includes a probability function calculator ◦ Wiki has pretty good info on statistical power ◦

Citations for power in twin modeling Visscher PM, Gordon S, Neale MC., Power of the classical twin design revisited: II detection of common environmental variance. Twin Res Hum Genet Feb;11(1): Neale BM, Rijsdijk FV, Further considerations for power in sibling interactions models. Behav Genet Sep 35(5):671-4 Visscher PM., Power of the classical twin design revisited., Twin Res Oct;7(5): Rietveld MJ, Posthuma D, Dolan CV, Boomsma DI, ADHD: sibling interaction or dominance: an evaluation of statistical power. Behav Genet May 33(3): Posthuma D, Boomsma DI., A note on the statistical power in extended twin designs. Behav Genet Mar;30(2): Neale MC, Eaves LJ, Kendler KS. The power of the classical twin study to resolve variation in threshold traits. Behav Genet May;24(3): Nance WE, Neale MC., Partitioned twin analysis: a power study. Behav Genet Jan;19(1): Martin NG, Eaves LJ, Kearsey MJ, Davies P., The power of the classical twin study. Heredity Feb;40(1):