Reliability and Validity


1 Reliability and Validity
Introduction to Study Skills & Research Methods (HL10040) Dr James Betts

2 Lecture Outline
Definition of Terms
Types of Validity
Threats to Validity
Types of Reliability
Threats to Reliability
Introduction to Measurement Error

3 Commonly used terms…
“She has a valid point”
“My car is unreliable”
…in science…
“The conclusion of the study was not valid”
“The findings of the study were not reliable”

4 Some definitions… Validity
“The soundness or appropriateness of a test or instrument in measuring what it is designed to measure” (Vincent 1999)

5 Some definitions… Validity
“Degree to which a test or instrument measures what it purports to measure” (Thomas & Nelson 1996)

6 Some definitions… Reliability
“…the degree to which a test or measure produces the same scores when applied in the same circumstances…” (Nelson 1997)

7 Some definitions… Objectivity
“…the degree to which different observers agree on measurements…” (Atkinson & Nevill 1998)

8 Types of Experimental Validity
Internal: Is the experimenter measuring the effect of the independent variable on the dependent variable?
External: Can the results be generalised to the wider population?

9 Validity and Reliability
Validity
-Logical: Face, Content
-Statistical (AKA Criterion): Concurrent, Predictive
-Construct
Reliability
-Consistency
-Objectivity

10 Logical Validity: Face Validity
Infers that a test is valid by definition: it is clear that the test measures what it is supposed to.
e.g. If you want to assess reaction time, measuring how long it takes an individual to react to a given stimulus would have face validity.
Externally valid?

11 Logical Validity: Face Validity
Infers that a test is valid by definition: it is clear that the test measures what it is supposed to.
Assessing face validity is therefore a subjective process.
e.g. Would assessing 15 m sprint time be a valid means of assessing reaction time?

12 Logical Validity: Content Validity
Infers that the test measures all aspects contributing to the variable of interest
…also a subjective process.
e.g. Who is the most physically fit? VO2 max test? Wingate test? 1 RM?

13 Overall: A logically valid test simply appears to measure the right variable in its entirety?

14 Statistical Validity: Concurrent Validity
Infers that the test produces similar results to a previously validated test
e.g. VO2 max: Incremental Treadmill Protocol with expired gas analysis versus Multi-Stage Fitness (Beep) Test

15 Statistical Validity: Predictive Validity
Infers that the test provides a valid reflection of future performance using a similar test
e.g. Can performance during test A be used to predict future performance in test B?

16 Overall: A statistically valid test produces results that agree with other similar tests?

17 Logical/Statistical Validity: Construct Validity
Infers not only that the test is measuring what it is supposed to, but also that it is capable of detecting what should exist, theoretically.
Therefore relates to hypothetical or intangible constructs
e.g. Team Rivalry, Sportsmanship

18 Logical/Statistical Validity: Construct Validity
Infers not only that the test is measuring what it is supposed to, but also that it is capable of detecting what should exist, theoretically.
Therefore relates to hypothetical or intangible constructs.
This makes assessment difficult, i.e. if what should exist cannot be detected, this could mean:
a) Test Invalid?
b) Theory Incorrect?
c) Sensitivity/Specificity Issues?

19 Interesting Example: Breast Cancer
Incidence: ~1 % (0.8 %) (i.e. approximately 1 in every 100 women tested has breast cancer)
Sensitivity: ~90 % (87 %) (the mammogram is sensitive enough that approximately 90 in every 100 breast cancer patients will receive a positive result)
Specificity: ~90 % (93 %) (the mammogram is specific enough that approximately 90 in every 100 healthy patients will receive a negative result)
Data from Kerlikowske et al. (1996)

20 Quick Test What is the probability that a patient receiving a positive result actually has breast cancer?
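
The Quick Test can be worked through with Bayes' theorem. The sketch below is only an illustration: the function name is hypothetical, and the inputs are the rounded (1 %, 90 %, 90 %) and reported (0.8 %, 87 %, 93 %) figures quoted on the breast cancer slide.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability of disease given a positive test, via Bayes' theorem."""
    # Share of everyone tested who has the disease AND tests positive
    true_positives = prevalence * sensitivity
    # Share of everyone tested who is healthy BUT tests positive
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives / (true_positives + false_positives)

rounded = positive_predictive_value(0.01, 0.90, 0.90)
reported = positive_predictive_value(0.008, 0.87, 0.93)
print(f"Rounded figures:  {rounded:.1%}")   # about 8.3%
print(f"Reported figures: {reported:.1%}")  # about 9.1%
```

Because healthy patients vastly outnumber patients with cancer, most positive results come from the healthy group, so fewer than 1 in 10 positive results reflects an actual case.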

22 Threats to Validity (and possible solutions?)

23 Threats to Internal Validity
Maturation: changes in the DV over time, irrespective of the IV

24 Threats to Internal Validity
Maturation
e.g. One Group Pre-test Post-test
O1 T O2

25 Threats to Internal Validity
Maturation (possible solution)
Time series: repeated observations (O1 to O6) around the treatment (T)

26 Threats to Internal Validity
Maturation (possible solution)
Pre-test Post-test Randomised Group Comparison
R O1 T O2
R O3 P O4 (P = placebo)
n.b. RCT

27 Threats to Internal Validity
Maturation (possible solution)
Repeated measures designs can occasionally be an inappropriate solution, even when randomised and counterbalanced
e.g. Muscle Damage (repeated bout effect), Vitamin Supplementation (wash-out period)
In which case independent measures designs could be used.

28 Threats to Internal Validity
History: unplanned events between measurements

29 Threats to Internal Validity
History
O1 T O2
e.g. exercise?
Therefore, solution = control extraneous variables!

30 Threats to Internal/External Validity
Pre-testing: interactive effects due to the pre-test (e.g. learning, sensitisation, etc.)
Also influences External Validity

31 Threats to Internal/External Validity
Pre-testing
e.g.
R O1 T O2
R O3 P O4 (P = placebo)
Assessing muscle mass at the pre-test could make them train harder in both trials…
…but then respond better to the T than the P…
…so it is actually T+O1 that is better than P, not T alone.

32 Threats to Internal/External Validity
Pre-testing (possible solution)
Solomon Four-Group Design
R O1 T O2
R O3 P O4 (P = placebo)
R T O5
R P O6

33 Threats to Internal Validity
Statistical Regression (AKA regression to the mean)
An initial extreme score is likely to be followed by less extreme subsequent scores
e.g. Sophomore Slump & SI ‘Cover Jinx’
e.g. Training has the greatest effect on untrained individuals.
Therefore, solution = effective sampling.
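
Regression to the mean can be seen in a short simulation. This is a sketch under assumed values, not data from the lecture: each observed score is modelled as a stable true ability plus random noise, and the population mean (50), both standard deviations (5) and the sample size are hypothetical.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical model: observed score = stable true ability + random noise
true_ability = [random.gauss(50, 5) for _ in range(2000)]
test1 = [t + random.gauss(0, 5) for t in true_ability]
test2 = [t + random.gauss(0, 5) for t in true_ability]

# Select the "extreme" group: roughly the top 10% of scorers on the first test
cutoff = sorted(test1)[int(0.9 * len(test1))]
extreme = [i for i, score in enumerate(test1) if score >= cutoff]

mean_first = statistics.mean(test1[i] for i in extreme)
mean_second = statistics.mean(test2[i] for i in extreme)
print(f"Top group, test 1: {mean_first:.1f}")
print(f"Top group, test 2: {mean_second:.1f}")
```

No treatment was applied between the two tests, yet the group selected for its extreme first score drifts back towards the population mean on the second. A study that samples only extreme scorers could mistake this drift for a treatment effect.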

34 Threats to Internal Validity
Instrumentation: a difference in the way 2 comparable variables were measured
e.g. Uncalibrated equipment
Therefore, solution = calibrate!

35 Threats to Internal Validity
Selection Bias: the groups for comparison are not equivalent

36 Threats to Internal Validity
Selection Bias
e.g. Groups not randomly assigned
Static Group Comparison
T O1
P Oa (P = placebo)
i.e. Group T were resistance trained to start with

37 Threats to Internal Validity
Selection Bias (possible solution)
T O1
P Oa (P = placebo)
Either:
-Randomise group assignment,
-Pre-test and post-test difference,
-Repeated Measures Design.
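
Randomised group assignment, the first solution listed above, can be sketched in a few lines. The function name, participant IDs and seed below are hypothetical and serve only as an illustration.

```python
import random

def randomise_groups(participants, seed=None):
    """Shuffle the participants and split them into two equal-sized groups."""
    rng = random.Random(seed)      # a seed makes the allocation reproducible
    shuffled = list(participants)  # copy so the original order is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

participants = [f"S{i:02d}" for i in range(1, 21)]  # hypothetical IDs S01..S20
treatment, placebo = randomise_groups(participants, seed=42)
print("Treatment:", sorted(treatment))
print("Placebo:  ", sorted(placebo))
```

Because chance alone decides who receives T or P, pre-existing differences such as training status are expected to balance out across groups on average, rather than being tied to one group.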

38 Threats to Internal/External Validity
Experimental Mortality: missing data due to subject drop-out
Reduced n = reduced statistical power
This not only challenges the quality of the data gathered (Internal Validity) but also our ability to generalise (External Validity).
Therefore, solution = recruit sufficient participants (young?)
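
The link between drop-out and statistical power can be illustrated with a rough calculation. The sketch below uses a normal approximation for a two-sided comparison of two equal-sized independent groups. The effect size of 0.8, the alpha of 0.05 and the group sizes are assumptions chosen for illustration, and the function name is hypothetical.

```python
from statistics import NormalDist

def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison (normal approximation)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected size of the standardised test statistic for this effect and n
    shift = effect_size * (n_per_group / 2) ** 0.5
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

# Hypothetical drop-out: each group shrinks from 30 to 15 participants
for n in (30, 25, 20, 15):
    print(f"n = {n} per group: power ≈ {approx_power(0.8, n):.2f}")
```

Under these assumptions power falls steadily as participants are lost, which is why recruiting above the minimum required n offers some protection against drop-out.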

39 Threats to External Validity
Inadequate description
5th characteristic of research… …should be replicable
If nobody can replicate the methods of a given study, then it is irrefutable and therefore lacks external validity.
Therefore, solution = comprehensive methodology

40 Threats to External Validity
Biased sampling (linked to statistical regression)
Sample does not reflect target population (n ≠ N)
e.g. Results generalised across gender
Therefore, solution = random sample (of target population).

41 Threats to External Validity
Hawthorne Effect: DV is influenced by the fact that it is being recorded
e.g. Fastest sprint when professor enters lab
Therefore, solution = control the lab environment.

42 Threats to External Validity
Demand Characteristics: participants detect the purpose of the study and behave accordingly
e.g. Sports Science students already know that the carbohydrate drink (CHO) is supposedly superior to water (H2O)
Therefore, solution = double or single blinding.

43 Threats to External Validity
Operationalisation (AKA Ecological Validity)
The DV must have some relevance in the ‘real world’
e.g. TTE (time to exhaustion) has no Olympic equivalent
Therefore, solution = choose your DV carefully.

44 Reliability
Reliability is a pre-requisite of validity
e.g. Direct versus Indirect measures of VO2 max
Direct: -Gold Standard -Expensive -Complex (i.e. valid and reliable)
Indirect: -Predictive -Cheap -Easy

45 Reliability: Valid and Reliable
[Table: repeated measurements per subject, in ml.kg-1.min-1]

46 Reliability: Not Valid but Reliable
[Table: repeated measurements per subject, in ml.kg-1.min-1]
5 ml.kg-1.min-1 correction?

47 Reliability: Not Valid and not Reliable
[Table: repeated measurements per subject, in ml.kg-1.min-1]
i.e. a test can never be valid without being reliable?

48 Types of Reliability
Relative
Absolute
Rater reliability (Objectivity)
-Intrarater reliability
-Interrater reliability

49 Relative Reliability: Relatively Reliable
[Table: repeated measurements per subject, in ml.kg-1.min-1]
i.e. Individuals maintain position in the group

50 Absolute Reliability: Not Absolutely Reliable
[Table: repeated measurements per subject, in ml.kg-1.min-1]
i.e. Test-Retest within individuals
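
The difference between relative and absolute reliability can be shown numerically. The scores below are hypothetical (the values from the original slides are not available), and using Pearson's r for relative reliability and the mean test-retest difference for absolute reliability is one simple choice among several.

```python
import statistics

# Hypothetical VO2 max scores (ml.kg-1.min-1) for five subjects on two occasions
trial_1 = [40.0, 45.0, 50.0, 55.0, 60.0]
trial_2 = [46.0, 50.5, 56.0, 61.5, 66.0]

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

r = pearson_r(trial_1, trial_2)
differences = [b - a for a, b in zip(trial_1, trial_2)]
mean_difference = statistics.mean(differences)
print(f"Relative reliability (Pearson r): {r:.3f}")
print(f"Absolute reliability (mean test-retest difference): {mean_difference:.1f} ml.kg-1.min-1")
```

In this made-up data set every subject keeps their rank, so the correlation is close to 1, yet each individual's score shifts by about 6 ml.kg-1.min-1 between trials. The test is relatively reliable but not absolutely reliable.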

51 Rater Reliability: Intrarater reliability
The consistency of a given observer or measurement tool on more than one occasion

52 Rater Reliability: Interrater reliability
The consistency of a given measurement from more than one observer or measurement tool
e.g. Score for the American Gymnast
British Judge = 9.9
French Judge = 4.4
Japanese Judge = 7.0
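
The disagreement between the three judges can be quantified with simple spread statistics. This is only a sketch: the range and standard deviation are crude stand-ins for a formal interrater statistic, which would need more than one rated performance.

```python
import statistics

# Judges' scores for the same routine, as listed on the slide
scores = {"British": 9.9, "French": 4.4, "Japanese": 7.0}

values = list(scores.values())
score_range = max(values) - min(values)
score_sd = statistics.stdev(values)  # sample standard deviation
print(f"Range between judges: {score_range:.1f}")
print(f"Standard deviation between judges: {score_sd:.2f}")
```

A spread of 5.5 points on a 10-point scale suggests poor objectivity, since the score depends more on who is judging than on the performance itself.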

53 Threats to Reliability
Fatigue
[Table: one subject measured repeatedly from 8 am onwards, in ml.kg-1.min-1]
Therefore, solution = increase time between tests.

54 Threats to Reliability
Habituation
[Table: repeated measurements for one subject, in ml.kg-1.min-1]
Therefore, solution = familiarise prior to test.

55 Threats to Reliability
Standardisation of Procedures
Control of extraneous variables
Precision of Measurements
i.e. if we are happy to measure VO2 max to the nearest 10 ml.kg-1.min-1, then it could probably be reliably predicted from your training volume and age.

56 Measurement Errors
Ultimately, reliability is dependent on the degree of measurement error in a given study.
The overall error in any measurement comprises both systematic and random error.
We will address measurement error further next week…
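
The split between systematic and random error can be illustrated with a small simulation. All values here are hypothetical: a true score of 50 ml.kg-1.min-1, a constant bias of 5 (for example from uncalibrated equipment) and random noise with a standard deviation of 1.5.

```python
import random
import statistics

random.seed(7)  # fixed seed so the sketch is repeatable

TRUE_VALUE = 50.0        # hypothetical true VO2 max (ml.kg-1.min-1)
SYSTEMATIC_ERROR = 5.0   # constant bias added to every measurement
RANDOM_ERROR_SD = 1.5    # trial-to-trial noise

# Each measurement = true value + systematic error + random error
measurements = [
    TRUE_VALUE + SYSTEMATIC_ERROR + random.gauss(0, RANDOM_ERROR_SD)
    for _ in range(1000)
]

estimated_bias = statistics.mean(measurements) - TRUE_VALUE
estimated_noise = statistics.stdev(measurements)
print(f"Estimated systematic error: {estimated_bias:.2f}")
print(f"Estimated random error (SD): {estimated_noise:.2f}")
```

Averaging many trials cancels most of the random error but leaves the systematic error untouched. A constant bias can therefore be corrected once it is known, whereas random error limits how reliable any single measurement can be.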

57 Literature Search Assignment
The handout lists 8 questions which can be answered by retrieving the corresponding source articles.
Answer as many as possible and bring them to next week’s lecture.
DO NOT contact the authors or order articles.

58 Selected Reading
Atkinson, G. and A. M. Nevill. Statistical methods for assessing measurement error (Reliability) in variables relevant to sports medicine. Sports Medicine. 26: , 1998.
Holmes, T. H. Ten categories of statistical errors: a guide for research in endocrinology and metabolism. American Journal of Physiology. 286: E
Thomas, J. R. and J. K. Nelson. Research Methods in Physical Activity, 4th edition. Champaign, Illinois: Human Kinetics, 2001.
