Sampling, sample size estimation, and randomisation

Name: Sampling, sample size estimation, and randomisation
Uploaded: 2017-07-15T14:29:35+00:00
Duration: PTM12S58
Channel: Clement Peters
Description: Sampling, sample size estimation, and randomisation

Sampling, sample size estimation, and randomisation
PS302

Overview
Sampling
- representative sampling (e.g. for surveys)
- homogeneous sampling (e.g. for experiments)
Sample size estimation
- Based on power
- Gathering the information you need
- Power calculations (G*Power software): ANOVA, regression
- Rules of thumb for multivariate tests
- Presentation of power analysis in your report
Practical randomising
- Random selection (e.g. for surveys)
- Random allocation (e.g. for experiments)

Getting a representative sample
Survey of UK households: want a sample from
- each SES group
- each age group
- each sex
Proportions should match the population

Matching the population
Percent of population → percent of sample
Assume sample size = 1200
Population: Women 60%, Men 40% → Sample: Women 720, Men 480

Problem for you to try
Population figures:
- Men 65 years+ = 1 million
- Women 65 years+ = 1.5 million
- Men 25-65 years = 8 million
- Women 25-65 years = 8.5 million
- Men < 25 years = 5 million
- Women < 25 years = 5.2 million
Total population size = 29.2 million
Percent women < 25 years = (5.2 / 29.2) * 100 = 17.8%
Given a sample size of 200, how many women < 25 years should be included?
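The proportional matching described above can be sketched in a few lines of Python. This is only an illustration: the function name `proportional_quotas` is hypothetical, the "25-65" label for the middle age band is an assumption, and simple rounding is used, which in general is not guaranteed to sum exactly to the target sample size.

```python
def proportional_quotas(population, sample_size):
    """Return how many participants to recruit from each stratum.

    Each stratum gets the same share of the sample as it has of the population.
    """
    total = sum(population.values())
    return {name: round(sample_size * size / total)
            for name, size in population.items()}


# Population figures from the slide, in millions.
# The "25-65" band label is an assumption (illustrative only).
population = {
    "Men 65+": 1.0,
    "Women 65+": 1.5,
    "Men 25-65": 8.0,
    "Women 25-65": 8.5,
    "Men <25": 5.0,
    "Women <25": 5.2,
}

quotas = proportional_quotas(population, 200)
print(quotas["Women <25"])  # 17.8% of 200, rounded to a whole person

# The earlier 60% / 40% example with a sample of 1200.
sex_quotas = proportional_quotas({"Women": 60, "Men": 40}, 1200)
print(sex_quotas)
```

With a sample of 200, 17.8% corresponds to about 36 women under 25.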

Quota sampling
Recruiters are given a quota for each stratum
Problem – biased selection by recruiter/interviewer
Advantage – random selection is very difficult to achieve, so quota sampling is a good compromise

Homogeneous sampling
Restrict sampling to a narrow group
- Sample only Warwick students
- Sample only one sex
- Sample only one age group
Advantage – reduces error variance by reducing individual differences

Homogeneous sampling ctd
Disadvantage – may reduce generalisability
- generalisability will need to be considered and assessed separately
Suitability
- experimental work
- studies where individual differences are not directly relevant and power is the more important concern

Power
Probability that any particular (random) sample will produce a statistically significant effect, given that the effect really exists
E.g. power = 0.9 → 90% chance of detecting an effect if there really is an effect
Researchers usually aim for power of 80-90%

Power and sample size
All else being equal, to get more power you need more participants
Where “all else” means:
- reliability of measures
- other sources of error variance
- significance level (the acceptable p-value)
- the true size of the effect

These concepts are inter-related
Each of these changes increases the sample size (N) needed:
Desired power ↑ → N ↑
Acceptable p-value ↓ → N ↑
Effect size to detect ↓ → N ↑
Reliability of measures ↓ → N ↑
Other error variance ↑ → N ↑
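To see these relationships in numbers, here is a rough sketch using a normal approximation for a two-sided comparison of two group means. This is not the calculation G*Power performs; the function `n_per_group` and the use of Cohen's d (a standardised mean difference) are assumptions chosen for illustration. Lower reliability and extra error variance act by shrinking the standardised effect size, so they are represented here by a smaller d.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate participants needed per group (normal approximation).

    d     -- standardised mean difference to detect (assumed: Cohen's d)
    alpha -- two-sided significance level
    power -- desired probability of detecting the effect
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


baseline = n_per_group(0.5)
print("baseline:", baseline)
print("more power (0.90):", n_per_group(0.5, power=0.90))
print("stricter alpha (.01):", n_per_group(0.5, alpha=0.01))
print("smaller effect (d = 0.3):", n_per_group(0.3))
```

Each of the last three calls returns a larger N than the baseline, matching the direction of the arrows above.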

If you know these…
- effect size
- variance of measures
you can often work out what the sample size should be
So where can you find them? Previous research studies

Calculating using G*Power
First step: assemble the figures needed
For between-subjects ANOVA:
- Effect size (Cohen’s f, or partial eta squared)
- Significance level [.05, usually]
- Power [.8, usually]
- Numerator degrees of freedom (df)
- Number of cells in design (groups)

1. Effect size … from previous studies
Easy – they reported effect sizes
“There was not a significant main effect of Sex on response time, F(1, 42) = 2.03, p = .16, η² = 0.046”
Harder – they reported only the F and df, so you have to make a calculation
partial η² = (df_effect * F) / [(df_effect * F) + df_error]
           = (1 * 2.03) / [(1 * 2.03) + 42]
           = 0.046
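The "harder" calculation is easy to script. The sketch below recovers partial eta squared from a reported F and its degrees of freedom, then converts it to Cohen's f using the standard relation f = sqrt(η² / (1 − η²)). The function names are hypothetical, chosen for this example.

```python
from math import sqrt


def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from a reported F statistic and its df."""
    return (df_effect * f_value) / (df_effect * f_value + df_error)


def cohens_f(eta_sq):
    """Convert (partial) eta squared to Cohen's f."""
    return sqrt(eta_sq / (1 - eta_sq))


# Values from the quoted result: F(1, 42) = 2.03
eta_sq = partial_eta_squared(2.03, df_effect=1, df_error=42)
print(round(eta_sq, 3))            # 0.046
print(round(cohens_f(eta_sq), 2))  # 0.22
```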

Measures of effect size for ANOVA
Roughly, the correlation between an effect and the outcome (DV)
eta squared
- The proportion of variance in the outcome variable (DV) that is explained by the IV
- SS_effect / SS_total (corrected)
partial eta squared (SPSS prints this out)
- The proportion of the effect + error variance explained by the effect
- SS_effect / (SS_effect + SS_error)
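The difference between the two measures shows up as soon as a design has more than one effect. The sums of squares below are hypothetical numbers for a balanced two-way design, invented purely for illustration, and the function names are likewise assumptions.

```python
def eta_squared(ss_effect, ss_total):
    """Share of the total (corrected) variance explained by the effect."""
    return ss_effect / ss_total


def partial_eta_squared_from_ss(ss_effect, ss_error):
    """Share of the effect + error variance explained by the effect."""
    return ss_effect / (ss_effect + ss_error)


# Hypothetical sums of squares (illustrative only, balanced design assumed).
ss_a, ss_b, ss_ab, ss_error = 20.0, 30.0, 10.0, 140.0
ss_total = ss_a + ss_b + ss_ab + ss_error

print(eta_squared(ss_a, ss_total))                  # 0.1
print(partial_eta_squared_from_ss(ss_a, ss_error))  # 0.125
```

Partial eta squared is the larger of the two here because the variance explained by the other effects is left out of its denominator.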

4. Numerator df
“There was a non-significant main effect of Gender on response time, F(1, 42) = 2.03, p = .16, η² = 0.046”
The numerator df is the first value in the parentheses after F (here, 1)

5. Number of cells (groups)
Two-way ANOVA
2 x 3 ANOVA → 6 cells
4 x 2 ANOVA → 8 cells
Etc.

Calculating using G*Power
First step: assemble the figures needed
For this 2 x 3 between-subjects ANOVA:
- Effect size (η² = 0.046)
- Significance level [.05, as usual]
- Power [.8, as usual]
- Numerator degrees of freedom (df = 1 and 2 for the respective main effects, 2 for the interaction)
- Number of cells in design (groups = 6)
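The cell count and numerator df values follow directly from the number of levels of each factor. A small sketch, with the hypothetical helper `two_way_design`, makes the arithmetic explicit:

```python
def two_way_design(levels_a, levels_b):
    """Number of cells and numerator df for each effect in a two-way ANOVA."""
    return {
        "cells": levels_a * levels_b,
        "df_A": levels_a - 1,                     # main effect of factor A
        "df_B": levels_b - 1,                     # main effect of factor B
        "df_AxB": (levels_a - 1) * (levels_b - 1),  # interaction
    }


design = two_way_design(2, 3)
print(design)  # {'cells': 6, 'df_A': 1, 'df_B': 2, 'df_AxB': 2}
```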

Tip: power & ANOVA
Each effect in the ANOVA has its own power
E.g. 2 x 3 ANOVA
- Main effect A
- Main effect B
- Interaction effect A * B
Tip: power is lower for interactions than for main effects

Sample size – ethical issues
Too small a sample
- can’t detect significant effects → wastes all participants’ time
Too large a sample
- wastes resources
- wastes the extra participants’ time

Sample size – practical issues
Resources
- Time
- Cost of running each participant
Availability
- Clinical populations are often small
- Access can take time & require permission

Choosing an appropriate sample size for established laboratory paradigms
Shortcut: base the sample size on the sample sizes used in previous research
This is often perfectly appropriate (but make sure the previous research is of high quality!)

Rules of thumb for multivariate tests
Multiple regression: cases (N) / predictors (p)
- N at least p for R²
- N at least p for testing a predictor
Need more cases if the outcome is skewed, the anticipated effect size is small, the measures are less reliable…

Rules of thumb for multivariate tests
PCA (exploratory FA)
- 50: no good
- 100: poor
- 300: good, but ideally need more

Random allocation
For example: 3 between-subjects conditions (e.g. control, happy, sad)
Who does which condition? First come? Interviewer choice?
Must avoid confounds, but can’t check all possible confounds.
Solution is random allocation.

Random allocation needs truly random numbers
Different ways to do that:
- SPSS
- random.org
- Research randomiser
- a scripting language like Python

Python: randomly assigning 9 participants to 3 conditions
    from random import shuffle

    # Three slots for each of the 3 conditions
    numbers = [1, 1, 1, 2, 2, 2, 3, 3, 3]
    shuffle(numbers)  # shuffles the list in place
    print(numbers)
    # e.g. [3, 2, 1, 2, 3, 1, 3, 1, 2] (the order differs on every run)

Research randomiser http://www.randomizer.org/form.htm
3 conditions, 48 planned participants: randomly allocate each participant (identified by order of recruitment) to one of 3 conditions
- How many sets of numbers to generate? [1]
- How many numbers per set? [48]
- Number range? [From 1 To 3]
- Do you wish each number in a set to remain unique? [No]
- [Don’t “sort”!]

Result
Set #1: 3, 3, 1, 3, 3, 1, 2, 2, 2, 1, 2, 2, 1, 1, 2, 1, 3, 3, 1, 3, 2, 1, 3, 3, 1, 2, 1, 3, 1, 2, 2, 2, 3, 3, 1, 3, 3, 1, 1, 3, 1, 3, 3, 2, 3, 3, 1, 2
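The same kind of allocation can be sketched in Python. Drawing each participant's condition independently, as with the settings above, does not guarantee equal group sizes: the result set above contains 16, 13 and 19 participants in conditions 1, 2 and 3. The function name `simple_allocation` and its `seed` parameter are hypothetical, added so that a run can be reproduced.

```python
import random
from collections import Counter


def simple_allocation(n_participants, n_conditions, seed=None):
    """Draw one condition number per participant, independently and with replacement."""
    rng = random.Random(seed)  # a fixed seed reproduces the same allocation
    return [rng.randint(1, n_conditions) for _ in range(n_participants)]


# Position in the list = order of recruitment.
allocation = simple_allocation(48, 3)
print(allocation)
print(Counter(allocation))  # group sizes will usually be unequal
```

If equal group sizes matter, shuffling a list that already contains the right number of each condition is the safer choice.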

3 sentence types, 48 sentences, 16 in each group: create a random sequence, but limit runs of the same type
- How many sets of numbers to generate? [16]
- How many numbers per set? [3]
- Number range? [From 1 To 3]
- Do you wish each number in a set to remain unique? [Yes]

3 types, 48 sentences, 16 of each type: limit runs of a given type, while still randomising order of presentation
16 Sets of 3 Unique Numbers Per Set
Range: From 1 to 3 -- Unsorted
Set #1: 2, 3, 1
Set #2: 3, 1, 2
Set #3: ….
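This blocked approach can also be sketched in Python: each block is an independent random ordering of the 3 types, and the blocks are joined end to end. The function name `blocked_sequence` and its parameters are hypothetical. Because every block contains each type exactly once, each type appears 16 times and the same type can occur at most twice in a row (at a boundary between two blocks).

```python
import random


def blocked_sequence(n_blocks, types=(1, 2, 3), seed=None):
    """Concatenate n_blocks independently shuffled copies of `types`."""
    rng = random.Random(seed)
    sequence = []
    for _ in range(n_blocks):
        block = list(types)
        rng.shuffle(block)  # random order within this block
        sequence.extend(block)
    return sequence


def longest_run(sequence):
    """Length of the longest run of identical consecutive items."""
    longest = current = 1
    for previous, item in zip(sequence, sequence[1:]):
        current = current + 1 if item == previous else 1
        longest = max(longest, current)
    return longest


sequence = blocked_sequence(16)
print(sequence)
print(longest_run(sequence))  # never more than 2
```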
