The following lecture has been approved for University Undergraduate Students. This lecture may contain information, ideas, concepts and discursive anecdotes that may be thought provoking and challenging. It is not intended for the content or delivery to cause offence. Any issues raised in the lecture may require the viewer to engage in further thought, insight, reflection or critical evaluation.

Validity of Research: Threats to Validity
Dr. Craig Jackson
Senior Lecturer in Health Psychology
School of Health and Policy Studies
Faculty of Health & Community Care
University of Central England
craig.jackson@uce.ac.uk

Validity: an important consideration
Example project:
- access to 300 workers
- workers’ ability is assessed
- workers attend a 1-week training course
- workers’ ability is assessed again
- classic within-subjects design (pre-post test design)

Design Concept - Between-subjects method
[Flow diagram: 300 subjects randomised into a control group (n = 150) and an intervention group (n = 150); ability is assessed in each group, giving control results and intervention results; the mean scores are compared.]
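
The random allocation in the between-subjects design can be sketched in a few lines of Python. This is a minimal illustration, not part of the original lecture: the function names `randomise` and `mean_difference`, the fixed seeds and the ability scores are all hypothetical and invented for the example.

```python
import random
from statistics import mean


def randomise(subject_ids, seed=0):
    """Shuffle the subjects and split them into two equal-sized groups."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def mean_difference(scores, group_a, group_b):
    """Mean score of group_b minus mean score of group_a."""
    return mean(scores[s] for s in group_b) - mean(scores[s] for s in group_a)


# 300 subjects, numbered 1..300, split into control and intervention groups.
control, intervention = randomise(range(1, 301))

# Hypothetical ability scores, invented for illustration only.
rng = random.Random(42)
scores = {s: rng.gauss(60, 10) for s in range(1, 301)}

print(len(control), len(intervention))
print(round(mean_difference(scores, control, intervention), 2))
```

Because allocation depends only on the shuffle, nothing about a worker (ability, age, motivation) can systematically steer them into one group, which is what makes the two groups comparable on average.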

Design Concept - Within-subjects method - better
[Flow diagram: 300 subjects randomised; assess ability #1 (300 control group); training course; assess ability #2 (300 treatment group).]

Threats to within-subjects designs
[Graph: score axis from 0 to 100.]
- observe an increase after the training course: a gain from test #1 to test #2 scores
- the student concludes that the outcome (improvement) is due to training
- could this be wrong?
- there are some threats to internal validity that critics (examiners) might raise, and some plausible alternative explanations for the observed effects

History threats
- Some “historical” event caused the increase, not the training
- TV & other media: Sesame Street, Countdown, Tomorrow’s World, Open University
- Elementary intellectual content
- Can be mundane or extraordinary
- “Specific event / chain of events”
- British Journal of Psychiatry (2000) 177: pp469-72

Maturation threats
- “Age is the key to wisdom”
- Improvement would occur without any training course
- Measuring natural maturation / growth of understanding
- Effects up to a certain limit
- Differential maturation
- Similar to “history threat”?

Testing threats
- Specific to pre-post test designs
- Taking a test can increase knowledge
- Taking test #1 may teach participants
- Priming: test #1 makes participants ready for training in a way they would not otherwise be
- Analogy: Heisenberg’s Uncertainty Principle (1927)

Instrumentation threats
- Specific to pre-post test designs
- “Making the goals bigger”
- Taking a test twice can increase knowledge
- So studies do not use the same test twice, to avoid testing threats
- Perhaps the 2 versions of the test are not really similar
- The instrument causes the changes, not the training course

Instrumentation threats (further)
- Specific to pre-post test designs
- Especially likely with human “instruments”: observations or clinical assessment
- 3 factors: observers fatigue over time; observers improve over time; different observers

Mortality threats
- Metaphorical: dropping out of the study
- Obvious problem?
- Especially when drop-out is non-trivial
- N = 300 take test #1
- N = 50 drop out after taking test #1
- N = 250 remain and take test #2
- What if the drop-outs were low scorers on test #1? (self-esteem)

Mortality threats (further)
- Mean gain from test #1 to test #2, using all of the scores available on each occasion
- This includes the 50 low test #1 scorers (soon-to-be drop-outs) in the test #1 score
              Test #1 (n = 300)    Test #2 (n = 250)
Mean score    60.5 (± 9.7)         81.6 (± 8.9)
- Problem: this drops the potential low scorers out of test #2, which inflates the mean test #2 score over what it would be if the poor scorers took it
- Solution: compare mean test #1 and test #2 scores for only those workers who stayed in the whole study (n = 250)?
- No: a sub-sample would certainly not be representative of the original sample
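
The inflation described on this slide can be reproduced with a small simulation. The sketch below is illustrative only and rests on two loud assumptions that are not in the lecture: the scores are invented (drawn from a normal distribution), and the training has zero true effect, so every remaining worker scores exactly the same on test #2 as on test #1.

```python
import random
from statistics import mean

rng = random.Random(1)  # fixed seed only so the sketch is repeatable

# Hypothetical test #1 scores for all 300 workers (invented data).
test1 = [rng.gauss(60, 10) for _ in range(300)]

# Assumption: the 50 lowest scorers on test #1 drop out of the study.
ranked = sorted(test1)
dropouts, completers = ranked[:50], ranked[50:]

# Assumption: zero true training effect, so test #2 repeats test #1.
test2 = list(completers)

# Naive comparison: all available scores on each occasion (n=300 vs n=250).
naive_gain = mean(test2) - mean(test1)

# Completers-only comparison: the same 250 workers on both occasions.
completer_gain = mean(test2) - mean(completers)

print(round(naive_gain, 2), round(completer_gain, 2))
```

The naive comparison shows a positive "gain" even though nobody improved. The completers-only comparison shows none here, but as the slide notes it describes only the 250 who stayed, not the original sample of 300.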

Mortality threats (further)
- Degree of this threat is gauged by comparison
- Compare the drop-out group (n = 50) with the non-drop-out group (n = 250), e.g. using test #1 scores and demographic data, especially age & sex
- If there are no major differences between groups: reasonable to assume mortality occurred across the entire sample; reasonable to assume mortality was not biasing results
- Depends greatly on the size of mortality N
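
One simple way to make the drop-out versus non-drop-out comparison concrete is a standardised mean difference on test #1 scores. The lecture does not prescribe a particular statistic; the choice of Cohen's d, the function name `standardised_difference` and the scores below are assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, variance


def standardised_difference(group_a, group_b):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical test #1 scores, invented for illustration only.
dropouts = [48, 52, 55, 50, 45]
completers = [58, 62, 65, 60, 55]

d = standardised_difference(dropouts, completers)
print(round(d, 2))  # negative: the drop-outs scored lower on test #1
```

A value near zero would support the assumption that mortality was spread across the whole sample; a large negative value, as in this invented data, would suggest that low scorers were the ones leaving.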

Regression threats
- Things can only get better; things can only get worse
- “Regression artefact”, “Regression to the mean”
- Purely statistical phenomenon
- Occurs whenever there is: a non-random sample from a population; two measures (test #1 and test #2 scores) that are imperfectly correlated with each other

Regression threats (further)
- Few measurements stay exactly the same. Confusing?
- e.g. if a training program only includes people who are in the lowest 10% of the class on test #1, what are the chances that they would constitute exactly the lowest 10% on test #2? Not very likely!
- Most of them would score low on the post-test, but they are unlikely to be the lowest 10% twice
- The lowest 10% on test #1 can't get any lower than being the lowest: they can only go up from there, relative to the larger population from which they were selected
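
Regression to the mean can be demonstrated with simulated scores and no training at all. The sketch below is an illustration under stated assumptions, none of which come from the lecture: each observed score is a stable "true ability" plus independent test-day noise, and the means, spreads and seed are invented.

```python
import random
from statistics import mean

rng = random.Random(7)  # fixed seed only so the sketch is repeatable
N = 300

# Assumption: observed score = stable true ability + independent noise,
# and NO training happens between the two tests.
ability = [rng.gauss(60, 8) for _ in range(N)]
test1 = [a + rng.gauss(0, 6) for a in ability]
test2 = [a + rng.gauss(0, 6) for a in ability]

# Select the lowest 10% on test #1 (30 of the 300 workers).
by_test1 = sorted(range(N), key=lambda i: test1[i])
lowest = by_test1[: N // 10]

# How many of them are also in the lowest 10% on test #2?
by_test2 = sorted(range(N), key=lambda i: test2[i])
still_lowest = set(lowest) & set(by_test2[: N // 10])

# Mean change for the selected group, with no intervention whatsoever.
gain = mean(test2[i] for i in lowest) - mean(test1[i] for i in lowest)

print(len(still_lowest), "of", len(lowest), "are in the lowest 10% twice")
print("mean change for the selected group:", round(gain, 2))
```

The selected group's mean rises on test #2 simply because part of what made them the lowest on test #1 was bad luck on the day, which does not repeat.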

Summary of single-group threats
- History threats
- Maturation threats
- Testing threats
- Instrumentation threats
- Mortality threats
- Regression threats

Multiple Group threats
- Comparison of 2 different methods
- Training course to aid factory workers in living a healthy lifestyle
- Example of an MSc project: the student has access to 300 workers
1. Workers’ lifestyle is assessed (test #1)
2. 50% of workers attend a 1-week healthy lifestyle program; the other 50% of workers are shown healthy lifestyle software
3. Workers’ lifestyle is assessed again (test #2)

Design Concept - Between and Within subjects method
[Flow diagram: randomisation of 300 subjects into a software group (n = 150) and a training course group (n = 150). Software group: complete lifestyle assessment test #1, then trained on software. Training course group: complete lifestyle assessment test #1, then attend training course. Both groups then complete lifestyle assessment test #2.]

Factory workers and Healthy Lifestyle Training
[Graph: Healthy Lifestyle Score (HLS), on an axis from 0 to 100, for the software group and the course group at Test #1 and Test #2.]
What does the graph show?

Selection comparability threats
- What if there is an overall change from test #1 to test #2, and the level of change is different between the two groups?
- Student concludes: the outcome is due to the different styles of risk assessment program
- How could this be wrong?
- Key validity issue: the degree to which the groups are comparable before the study
- If the groups are comparable and the only difference between them is the program, post-test differences can be attributed to the program: a big “IF”

Selection comparability threats (further)
- If the groups are not comparable to begin with, how much of the change can be attributed to the training programs, and how much to the initial differences between groups?
- The only multiple-group threat to internal validity
- This threat is a selection bias or selection threat
- Selection threat: “any factor other than the program that leads to post-test differences between groups”

Selection History threats
- Any other event that occurs between test #1 and test #2 that the 2 groups experience differently
- “selection threat”: the groups differ in some way
- “history threat”: the way the groups differ is with respect to their reactions to / experiences of “historical” events
- e.g. the groups may differ in their reading habits: perhaps the training course group read “health” more frequently than those in the software group
- A higher test #2 score for the training course group doesn't indicate the effect of lifestyle training... it's really an effect of the two groups differentially experiencing a relevant event (TV)

Selection Instrumentation threats
- Any differential change in the test used for each group between pre-course and post-course, e.g. the test may change differently for the two groups
- Especially with observers: differential changes between groups

Selection Mortality threats
- Arises when there is differential (non-random) dropout between the two groups, from test #1 to test #2
- Different types of workers might drop out of each group
- More may drop out of one group than the other
- Possibly based on how they were selected
- Observed differences in results might be due to the different types of dropouts (the selection-mortality) and not to the different training programs
- If the selection into groups was not random, a bias will often exist

Selection Regression threats
- Occurs when there are different rates of regression to the mean in the two groups
- This might happen if one group scores more extremely on test #1 than the other group: bias again
- Perhaps the software group is getting a disproportionate number of low-ability workers (factory managers think they need the “new” tutoring)
- Managers don't understand the need for “comparable” program and comparison groups!
- Since the software group has more extreme low scorers at test #1, their mean will regress (increase) a greater distance toward the overall population mean at test #2, and they will appear to “gain” more than the training course group
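
The selection-regression effect can also be simulated. The sketch below is a deliberately extreme illustration, not the lecture's own example: it assumes managers send the lower-scoring half of workers on test #1 to the software group, that neither program has any real effect, and that scores are a stable ability plus test-day noise. All numbers are invented.

```python
import random
from statistics import mean

rng = random.Random(11)  # fixed seed only so the sketch is repeatable
N = 300

# Assumption: observed score = stable true ability + independent noise,
# and NEITHER program changes anyone's ability.
ability = [rng.gauss(60, 8) for _ in range(N)]
test1 = [a + rng.gauss(0, 6) for a in ability]
test2 = [a + rng.gauss(0, 6) for a in ability]

# Hypothetical non-random allocation: the lower-scoring half on test #1
# is sent to the software group, the rest to the training course.
order = sorted(range(N), key=lambda i: test1[i])
software, course = order[: N // 2], order[N // 2:]


def mean_gain(group):
    """Average change from test #1 to test #2 within a group."""
    return mean(test2[i] - test1[i] for i in group)


software_gain = mean_gain(software)
course_gain = mean_gain(course)

print("software group mean gain:", round(software_gain, 2))
print("course group mean gain:  ", round(course_gain, 2))
```

With random allocation the two groups would regress by about the same amount and the artefact would cancel out of the comparison; with allocation based on test #1 scores, the software group appears to "gain" more purely through regression.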
