Reliability and Validity Designs

1 Reliability and Validity Designs

2 Accurate and consistent measures are needed
In research and clinical practice, it is very important to be able to measure patient characteristics accurately and consistently. Such measures are needed in clinical trials to effectively assess differences between groups, and in practice to help make clinical decisions and to track patients’ progress.
Evidence-based Chiropractic

3 Reliability
The ability of a test to provide consistent results when repeated by the same examiner, or by more than one examiner testing the same attribute on the same group of subjects. Specific research designs are used to determine the degree to which tests are reliable.

4 Validity
The degree to which a test truly measures what it was intended to measure. In valid tests, when the characteristic being measured changes, corresponding changes occur in the test measurement. In contrast, tests with reduced validity do not reflect patient changes very well.

5 Measurement error
All measurements have some degree of error. Thus, any given test score will consist of a true score plus an error component:
Observed score = True score + Error
The true score is a theoretical concept involving a measurement derived from a perfect instrument in an ideal environment.

6 True score theory
In a group of subjects, variation of scores occurs because of individual differences among the subjects (variation in true scores) plus an error component. Consequently, group scores will always be variable, and the variability will result in a distribution of true scores plus error that conforms to a normal curve when the sample size is large enough.

7 Random errors
Errors that are attributable to the examiner, the subject, or the measuring instrument. They have little effect on the group’s mean score because the errors are just as likely to be high as they are low. An example is blood pressure, which varies depending on a number of factors.

8 Systematic errors
Errors that cause scores to move in only one direction in response to a factor that has a constant effect on the measurement system. They are considered to be a form of bias. An example is a sphygmomanometer that is out of calibration and always generates high BP readings.

9 Error components
[Figure: error components. Not reproduced in the transcript.]

10 Estimating reliability
Reliability is estimated as the true score variance divided by the observed score variance. True score variance: real differences between subjects’ scores due to biologically different people. Error variance: the portion of variability that is due to faults in measurement. Observed score variance is the sum of the two.

11 Observed score variance
[Figure: observed score variance. Not reproduced in the transcript.]

12 The reliability coefficient
Reliability coefficient = True score variance / (True score variance + Error variance)
The coefficient becomes larger (increased reliability) as error variance gets smaller, equals 1.0 when error variance is 0.0, and becomes smaller (decreased reliability) as error variance gets larger.
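As a minimal sketch of the ratio above, the following Python function computes the coefficient from a true score variance and an error variance. The function name and the example variances are hypothetical and chosen only for illustration; in practice true score variance is not observed directly and has to be estimated from a reliability study.

```python
def reliability_coefficient(true_score_variance, error_variance):
    """True score variance divided by observed (true + error) variance."""
    observed_variance = true_score_variance + error_variance
    if observed_variance <= 0:
        raise ValueError("observed variance must be positive")
    return true_score_variance / observed_variance


# Hypothetical variances: 75 units of true variance, 25 units of error variance.
example = reliability_coefficient(75.0, 25.0)
print(example)  # 0.75

# With no error variance the coefficient reaches its maximum of 1.0.
print(reliability_coefficient(75.0, 0.0))  # 1.0
```

The first call reproduces the 0.75 case: 75% of the score variance is true variance and 25% is error variance.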

13 Interpretation of the reliability coefficient
A reliability coefficient of 0.75 means that 75% of the variance in the scores is due to the true variance of the trait being measured and 25% is due to the error variance.

14 Interpretation of the reliability coefficient (cont.)
Ranges from 0.0 to 1.0, where 0.0 represents no reliability and 1.0 perfect reliability.
Implications:
0.75 or greater: good reliability
0.5 to 0.75: moderate reliability
<0.5: poor reliability

15 Inter-examiner reliability
When 2 or more examiners test the same subjects for the same characteristic using the same measure, scores should match. Inter-examiner reliability is the degree to which their findings agree.

16 Intra-examiner reliability
Scores should also match when the same examiner tests the same subjects on two or more occasions. Intra-examiner reliability is the degree to which the examiner agrees with himself or herself.

17 Quantifying inter-examiner and intra-examiner reliability
Correlation: there should be a high degree of correlation between the scores of 2 examiners testing the same group of subjects, or of 1 examiner testing the same group on 2 occasions. However, it is possible to have good correlation and concurrent poor agreement. This occurs when 1 examiner consistently scores subjects higher or lower than the other examiner.

18 Graphing reliability
[Figure: scatter plot of Examiner 2 scores against Examiner 1 scores, both axes from 10 to 50, showing very good correlation.]

19 Good correlation and concurrent poor agreement
[Figure: scatter plot of Examiner 2 scores against Examiner 1 scores, both axes from 10 to 50, showing good correlation but no agreement.]
Plotted points: Examiner 1 = 10, Examiner 2 = 20; Examiner 1 = 20, Examiner 2 = 30; Examiner 1 = 30, Examiner 2 = 40; Examiner 1 = 40, Examiner 2 = 50.
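The four score pairs from this slide can be checked numerically. The sketch below is illustrative only: it computes Pearson’s r by hand with the standard library, then counts exact agreements and the mean difference between examiners. The helper name `pearson_r` is an assumption, not something from the presentation.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length score lists."""
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Score pairs taken from the slide: Examiner 2 is always 10 points higher.
examiner_1 = [10, 20, 30, 40]
examiner_2 = [20, 30, 40, 50]

r = pearson_r(examiner_1, examiner_2)
exact_agreements = sum(a == b for a, b in zip(examiner_1, examiner_2))
mean_difference = mean(b - a for a, b in zip(examiner_1, examiner_2))

print(f"r = {r:.2f}")                          # r = 1.00
print(f"exact agreements = {exact_agreements}")  # exact agreements = 0
print(f"mean difference = {mean_difference}")    # mean difference = 10
```

The correlation is perfect even though the examiners never record the same score, which is why correlation alone is not enough to establish agreement.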

20 Test-retest reliability
A test is administered to the same group of subjects on more than one occasion. Test scores should be consistent when repeated and should correlate well. Test-retest reliability is used to assess self-administered questionnaires, which are not directly controlled by the examiner.

21 Test-retest reliability (cont.)
It is assumed that the condition being considered has not changed between tests. Conditions that noticeably change over time (e.g., pain and disability status) are not good candidates for test-retest reliability studies.

22 Test-retest reliability (cont.)
[Figure: the same 10-item questionnaire completed at Time 1 and at Time 2. Are the two sets of responses equal?]

23 Parallel forms reliability a.k.a. Alternate forms reliability
Two versions of a questionnaire or test that measure the same construct are compared. Both versions are administered to the same subjects, and scores are compared to determine the level of correlation.

24 Parallel forms reliability (cont.)
[Figure: two 10-item questionnaires, Version 1 and Version 2. Are the two sets of responses equal?]

25 Internal consistency reliability
The degree to which each of the items in a questionnaire measures the targeted construct. All questions should measure various characteristics of the construct and nothing else.

26 Internal consistency reliability (cont.)
A questionnaire is administered to 1 group of subjects on 1 occasion. The results are examined to see how well the questions correlate. If the questionnaire is reliable, each question contributes in a similar way to the questionnaire’s overall score.

27 Internal consistency reliability (cont.)
Does Q1 correlate well with Q8, Q1 with Q9, Q2 with Q7? Also, do Q1, Q7, Q9, etc. correlate well with the total score?
[Figure: a 10-item questionnaire with a total score.]

28 Cronbach’s coefficient alpha
A measure of internal consistency that evaluates items in a questionnaire to determine the degree to which they measure the same construct. It is essentially the mean correlation between each of a set of items.

29 Cronbach’s alpha (cont.)
Values range from 1, representing perfect internal consistency, to less than zero when a questionnaire includes many negatively correlating items. Alpha values ≥0.70 are generally considered to be acceptable.
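The slides do not give a formula for alpha. As a hedged sketch, the function below uses the commonly cited variance form, alpha = k/(k-1) × (1 - sum of item variances / variance of total scores), where k is the number of items. The function name and the response data are hypothetical and serve only to illustrate the calculation.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows (one row per subject, one column per item)."""
    k = len(responses[0])                       # number of items
    items = list(zip(*responses))               # transpose to one tuple per item
    item_variances = [variance(item) for item in items]
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: three subjects answer three items identically within each row,
# so every item ranks the subjects the same way.
consistent = [
    [1, 1, 1],
    [2, 2, 2],
    [3, 3, 3],
]
print(round(cronbach_alpha(consistent), 3))  # 1.0

# Hypothetical data in which the third item runs against the other two.
mixed = [
    [1, 1, 3],
    [2, 2, 2],
    [3, 3, 1],
]
print(cronbach_alpha(mixed) < 0.70)  # True
```

In the second data set the negatively correlating item pulls alpha well below the 0.70 threshold mentioned above.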

30 2 X 2 contingency table to compare results of examiners
Useful to visualize the results of two examiners who are evaluating the same group of patients. Inter-examiner reliability articles often present their findings in the form of a 2 X 2 contingency table. If not, the table is fairly easy to create from the data presented in the article.

31 2 X 2 contingency table (cont.)
                      Rater 2
                      Test +    Test -    Row total
Rater 1   Test +      a         b         a+b
          Test -      c         d         c+d
Column total          a+c       b+d       a+b+c+d (grand total)
Agreements: a and d. Disagreements: b and c.

32 The kappa statistic (κ)
Agreement between examiners evaluating the same patients can be represented by the percentage of agreement of paired ratings. However, percentage of agreement does not account for agreement that would be expected to occur by chance.

33 The kappa statistic (cont.)
Even using unreliable measures, a few agreements are expected to occur just by chance. Only agreement that occurs beyond chance levels represents true agreement, and this is what the kappa statistic represents. It is appropriate for use with dichotomous or nominal data.

34 The kappa statistic (cont.)
Kappa = (observed agreement - chance agreement) / (1 - chance agreement)
where observed agreement (PO) is the total proportion of observations where there is agreement:
PO = number of exact agreements / number of possible agreements = (a + d) / (a + b + c + d)

35 The kappa statistic (cont.)
Chance agreement (PC) is the proportion of agreements that would be expected by chance:
PC = number of expected agreements / number of possible agreements = (a_expected + d_expected) / (a + b + c + d)
a_expected and d_expected can be found using the same procedure used to calculate expected cell values in the chi-square test: for each of cells a and d, multiply the row total by the column total and then divide by the grand total.

36 The kappa statistic (cont.)
The values of PO and PC are then used in the following formula to calculate the kappa statistic:
Kappa = (PO - PC) / (1 - PC)
When the amount of observed agreement exceeds chance agreement, kappa will be positive, and the strength of agreement is determined by the magnitude of kappa. If kappa is negative, agreements are less than chance.
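The kappa calculation described on these slides can be sketched in a few lines of Python. The cell names a, b, c, and d follow the 2 X 2 contingency table; the function name and the example counts are hypothetical.

```python
def kappa_from_2x2(a, b, c, d):
    """Kappa for two raters from the cells of a 2 X 2 contingency table."""
    n = a + b + c + d                      # grand total
    po = (a + d) / n                       # observed agreement
    a_expected = (a + b) * (a + c) / n     # row total x column total / grand total
    d_expected = (c + d) * (b + d) / n
    pc = (a_expected + d_expected) / n     # chance agreement
    return (po - pc) / (1 - pc)


# Hypothetical counts: both raters positive 20 times, both negative 15 times,
# and 15 disagreements (b = 5, c = 10) among 50 patients.
kappa = kappa_from_2x2(a=20, b=5, c=10, d=15)
print(f"kappa = {kappa:.2f}")  # kappa = 0.40
```

Here PO is 35/50 = 0.70 and PC is (15 + 10)/50 = 0.50, so kappa is (0.70 - 0.50)/(1 - 0.50) = 0.40, noticeably lower than the 70% raw agreement.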

37 Interpretation of kappa values
Kappa        Agreement beyond chance
0            None
0–0.2        Slight
0.2–0.4      Fair
0.4–0.6      Moderate
0.6–0.8      Substantial
0.8–1.0      Almost perfect
Maclure, M. and W.C. Willett, Misinterpretation and misuse of the kappa statistic. Am J Epidemiol, (2): p

38 Kappa example
Reliability of McKenzie classification of patients with cervical or lumbar pain: 50 spinal pain patients (25 lumbar and 25 cervical) were simultaneously assessed by 2 physical therapists (14 in total) to classify patients into syndromes and subsyndromes.
κ = 0.84 for syndrome classification
κ = 0.87 for subsyndrome classification
Clare, Adams, and Maher, Reliability of McKenzie Classification of Patients With Cervical or Lumbar Pain. Journal of Manipulative and Physiological Therapeutics, (2): p

39 Intraclass Correlation Coefficient (ICC)
Another measure of inter-examiner reliability, for use with continuous variables. It can be used to evaluate 2 or more raters. Pearson’s r can be used, but ICC is preferred when the sample size is small (<15) or more than two tests are involved.

40 ICC (cont.)
There are three models of ICC that may utilize one of two different forms. Thus, there are 6 possible types of ICC, depending on how raters are chosen and how subjects are assigned. The type of ICC used should always be presented in research papers. The first number represents the ICC model; the second represents the form used.

41 ICC (cont.)
For example, Clare et al reported on the reliability of detection of lumbar lateral shift and found it to be moderate, with ICC [2,1] values ranging from 0.48 to 0.64. In ICC [2,1], 2 is the model and 1 is the form.
Clare, H.A., R. Adams, and C.G. Maher, Reliability of detection of lumbar lateral shift. J Manipulative Physiol Ther, (8): p

42 ICC is an index of reliability
Can range from below 0.0 to +1.0, with ≈0.0 indicating weak reliability and ≈1.0 strong reliability.
Suggested interpretation:
ICC value        Degree of reliability
>0.75            Excellent
0.40 to 0.75     Fair to good
<0.4             Poor
Some clinical measures require ≥0.90.

43 ICC is based on variance
ICC is the ratio of between-groups variance to total variance, where between-groups variance is due to different subjects having test scores that truly differ, and total variance also includes score differences resulting from inter-rater unreliability of two or more examiners rating the same person. Two-way ANOVA is used to calculate ICC.
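The slides do not spell out the ICC formulas. As an illustrative sketch only, the code below assumes the commonly cited Shrout and Fleiss single-rater formulas built from two-way ANOVA mean squares: ICC [2,1] = (MSR - MSE) / (MSR + (k-1)·MSE + k·(MSC - MSE)/n) and ICC [3,1] = (MSR - MSE) / (MSR + (k-1)·MSE), with n subjects, k raters, and MSR, MSC, and MSE the subject, rater, and residual mean squares. The function name and the data are hypothetical.

```python
def icc_single_rater(scores):
    """Return (ICC[2,1], ICC[3,1]) for rows of subjects and columns of raters."""
    n, k = len(scores), len(scores[0])
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    subject_means = [sum(row) / k for row in scores]
    rater_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares (one score per subject-rater cell).
    ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_raters = n * sum((m - grand_mean) ** 2 for m in rater_means)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_subjects - ss_raters

    msr = ss_subjects / (n - 1)
    msc = ss_raters / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))

    icc_2_1 = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    icc_3_1 = (msr - mse) / (msr + (k - 1) * mse)
    return icc_2_1, icc_3_1


# Hypothetical scores: the second rater is always 10 points higher than the first.
scores = [[10, 20], [20, 30], [30, 40], [40, 50]]
icc_2_1, icc_3_1 = icc_single_rater(scores)
print(f"ICC[2,1] = {icc_2_1:.2f}, ICC[3,1] = {icc_3_1:.2f}")  # ICC[2,1] = 0.77, ICC[3,1] = 1.00
```

Under these assumptions the model that treats rater differences as error (ICC [2,1]) penalizes the constant 10-point offset, while the model that ignores rater offsets (ICC [3,1]) does not, which illustrates why the type of ICC should always be reported.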

44 Validity
The ability of tests and measurements to in fact evaluate the traits that they were intended to evaluate. Validity is vital in research as well as in clinical practice. The extent of a test’s validity depends on the degree to which systematic error has been controlled for.

45 Validity (cont.)
The greater the validity, the more likely it is that test results will reflect true differences between scores and not systematic error. Validity is a matter of degree, not black-and-white: it is technically incorrect to say a test is “valid” or “invalid”, and better to use categories like highly valid, moderately valid, etc.

46 Validity (cont.)
Test validity depends on the test’s intended purpose. For example, a hand-grip dynamometer is valid to measure grip strength, but it is not valid to measure the qualities of hand tremor.

47 Validity (cont.)
An invalid test can still be reliable. For example, a test that used skull circumference to predict intelligence would probably have excellent reliability, but it would not be a valid predictor of intelligence. An unreliable test, however, can never be considered valid.

48 Methods to estimate the extent of test validity
Can be divided into 3 major categories:
Self-evident: does the test appear to measure what it is supposed to measure?
Pragmatic: does the test actually work as hypothesized?
Construct validity: does the test adequately measure the theoretical construct involved?

49 Self-evident methods
Face validity: simply deciding whether a test appears to have merit based on “face value”. For example, if a headache questionnaire asked about the location of head pain it would have face validity; if it asked about hair color, it probably would not. This is the lowest level of test validation and is often assessed when researchers are first exploring a topic.

50 Self-evident methods (cont.)
Content validity: the ability of a test to include or represent all of the content of a construct. Another definition: the content of a test is compared to the literature that is already available on the topic, and the test is said to have good content validity if it accurately reflects what is in the literature.

51 Pragmatic methods
Criterion-related validity: the degree to which a test corresponds with an external criterion that is an independent measure of the characteristic being tested. A criterion is the standard by which a measure is judged. A valid test should correlate well with or predict some relevant criterion. Concurrent and predictive validity are subgroups of criterion-related validity.

52 Pragmatic methods (cont.)
Concurrent validity: the results of a new test are compared with an established test (gold standard) to see if they are well correlated. Both tests are given at the same time. An example is a study that compares a clinical test to detect spondylolisthesis with x-ray findings.

53 Pragmatic methods (cont.)
Gold standard test (a.k.a. reference standard): a test that is generally acknowledged to be the best available. The value of a concurrent validity trial depends greatly on the quality of the gold standard that is used.

54 Pragmatic methods (cont.)
Construct validity: the extent to which a test effectively measures a theoretical construct, like pain or disability. The characteristic is not observed directly. Rather, an abstraction of the characteristic that corresponds to the construct under consideration is observed, e.g., a pain scale or disability questionnaire.

55 Pragmatic methods (cont.)
Construct validity can be thought of as the accumulation of evidence that points to the ability of a test to actually measure what it claims to measure. That evidence is accumulated by establishing some of the other types of validity. The validity of a test is supported if the results of these studies agree with one another.

56 Pragmatic methods (cont.)
Construct validity is determined by comparing a new test with other tests that measure a similar construct. Another way to evaluate construct validity is to compare the new test with other tests that are different, but related, which should not correlate well.

57 Pragmatic methods (cont.)
Convergent validity: the degree of correlation that exists between a new test and another measure of the same or similar constructs. A test that has good convergent validity correlates well with another measure of the same construct.

58 Pragmatic methods (cont.)
Discriminant validity: the opposite of convergent validity, where the new test is weakly related or unrelated to another measure that it should in fact be different from. A test with good discriminant validity should be able to separate patients into different groups, e.g., normal vs. abnormal.

59 The concept of validity and reliability
Can be compared with scores on a target.
Scores may be systematically off center. This results from bias: the test environment is faulty, causing all scores to be inaccurate, and scores miss the bull’s eye in one direction.
Scores may be randomly off center. In this case scores miss the bull’s eye in any direction.

60 The concept of validity and reliability (cont.)
When test scores miss the bull’s eye in any direction, it is caused by random error. Some subjects are affected while others are not. Accurate tests are free from bias. Precise tests are free from random error.

61 Accuracy and precision
An accurate and precise test hits the bull’s eye and is tightly grouped. An inaccurate test systematically misses the bull’s eye in one direction. An imprecise test misses the bull’s eye randomly.

62 Cutoff points
Test results involving ordinal or continuous measures are often converted to a dichotomous scale (dichotomized). This is achieved by establishing a cutoff point at a specified value. Scores above the specified value are considered positive; scores below the value are negative.

63 The ideal diagnostic test
Would always correctly discriminate between those with and those without the condition: always positive for those with the condition and always negative for those without it.

64 The ideal test
[Figure: the ideal test. Always negative for those without the condition; always positive for those with the condition.]

65 Real-world test
[Figure: a real-world test, showing false negatives and false positives.]

66 Sensitivity and Specificity
Commonly used to assess the validity of tests.
Sensitivity: the ability of a test to correctly identify people who have the target disorder.
Specificity: the ability of a test to correctly identify people who do not have the target disorder.

67 Sensitivity and Specificity (cont.)
Expressed as a percentage, where 0% represents no sensitivity or specificity and 100% is perfect sensitivity or specificity. A 2 X 2 contingency table can be used to calculate these indices.

68 2 X 2 contingency table
                          Condition (per “gold standard”)
                          Present        Absent         Row total
Test result   Positive    a (True +)     b (False +)    a+b
              Negative    c (False -)    d (True -)     c+d
Column total              a+c            b+d            a+b+c+d (grand total)

69 Sensitivity and Specificity (cont.)
Sensitivity = a/(a+c)
Specificity = d/(b+d)

70 SnOUT (Sensitivity rules OUT)
In tests that have very high sensitivity, a negative test will rule out the condition under consideration. This is because there are very few false negatives in tests with very high sensitivity. If a test with very high sensitivity is negative, it is very likely a true negative.

71 SpIN (SPecificity rules IN)
In tests that have very high specificity, a positive test will rule in the condition under consideration. This is because there are very few false positives in tests with very high specificity. If a test with very high specificity is positive, it is very likely a true positive.

72 The cutoff point influences a test’s sensitivity & specificity
Higher scores point to a worsening condition. If the cutoff point is raised, specificity increases, but there are more false negatives.
[Figure labels: false negatives, false positives.]

73 The cutoff point and sensitivity & specificity (cont.)
If the cutoff point is lowered, sensitivity increases, but there are more false positives.
[Figure labels: false negatives, false positives.]

74 The cutoff point and sensitivity & specificity (cont.)
Because increasing sensitivity will decrease specificity, and increasing specificity will decrease sensitivity, the cutoff point that is set depends on whether it is best to maximize sensitivity at the expense of specificity, or to maximize specificity at the expense of sensitivity.
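The tradeoff can be seen by sweeping a cutoff across two small groups of scores. This is a minimal sketch with made-up scores; the function name is hypothetical, and it assumes that a score strictly above the cutoff counts as a positive test (the slides do not say how a score equal to the cutoff is handled).

```python
def sens_spec_at_cutoff(scores_with, scores_without, cutoff):
    """Dichotomize scores at `cutoff` and return (sensitivity, specificity).

    Assumption: a score strictly above the cutoff is a positive test.
    """
    true_positives = sum(score > cutoff for score in scores_with)
    true_negatives = sum(score <= cutoff for score in scores_without)
    return true_positives / len(scores_with), true_negatives / len(scores_without)


# Hypothetical scores; higher scores point to a worsening condition.
scores_with_condition = [4, 5, 6, 7, 8]
scores_without_condition = [1, 2, 3, 4, 5]

for cutoff in (3, 4, 5):
    sens, spec = sens_spec_at_cutoff(
        scores_with_condition, scores_without_condition, cutoff
    )
    print(f"cutoff={cutoff}: sensitivity={sens:.2f}, specificity={spec:.2f}")
# cutoff=3: sensitivity=1.00, specificity=0.60
# cutoff=4: sensitivity=0.80, specificity=0.80
# cutoff=5: sensitivity=0.60, specificity=1.00
```

Raising the cutoff from 3 to 5 trades sensitivity for specificity; plotting sensitivity against 1 - specificity for each cutoff would trace out the points of a ROC curve.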

75 Receiver Operating Characteristic (ROC) curves
Graphically depict the tradeoff between sensitivity and specificity. In accurate tests, the curve closely follows the left-hand border and the top border of the ROC space. In less accurate tests, the curve is closer to the 45-degree diagonal of the ROC space.

76 ROC curves (cont.)
[Figure: ROC curve. Not reproduced in the transcript.]

77 ROC curves (cont.)
Cut-off low = high sensitivity, but more false positives.
Cut-off high = low sensitivity, but fewer false positives.

78 Implications of sensitivity & specificity
In tests with low sensitivity, people with the target disorder will be missed (false negatives). In tests with low specificity, people who do not actually have the target disorder will be identified as having it (false positives).

79 Implications of sensitivity & specificity (cont.)
Tests with high sensitivity may be suitable when the consequences of reporting false positive findings to a patient are minor, e.g., incorrectly reporting to a patient that their triglycerides are elevated, which results in them shifting to a healthier lifestyle.

80 Implications of sensitivity & specificity (cont.)
Tests with high specificity are better when false positive findings lead to painful or expensive treatment, e.g., a test that leads to surgical intervention. In this case false positives must be minimized.

81 Implications of sensitivity & specificity (cont.)
Screening for rare conditions: many false positives may result since very few cases have the potential to be detected, even when highly specific tests are used. This is not a serious problem if positive screening leads to confirmatory testing.
Screening for common conditions: many cases may be overlooked, even when a highly sensitive test is used.

82 What is an acceptable level of sensitivity and specificity?
There is no general agreement, and it depends on the clinical situation. The acceptable level changes when the intent of the test or the setting changes, when the prevalence of the condition is different in the group being tested, or when alternate methods of testing are available.

83 Predictive value of a test
Positive predictive value: the probability that a positive test will correctly identify people who have the target disorder.
Positive predictive value = a/(a+b)
(a = true positives, b = false positives, c = false negatives, d = true negatives)

84 Predictive value of a test (cont.)
Negative predictive value: the probability that a negative test will correctly identify people who do not have the target disorder.
Negative predictive value = d/(c+d)

85 Evidence-based Chiropractic
                          Condition (per “gold standard”)
                          Present        Absent         Row total
Test result   Positive    a              b              a+b
              Negative    c              d              c+d
Column total              a+c            b+d            a+b+c+d (grand total)
Sensitivity = a/(a+c)
Specificity = d/(b+d)
Positive predictive value = a/(a+b)
Negative predictive value = d/(c+d)
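The four indices can be computed directly from the cells of the table. The following is a small sketch with hypothetical counts; the function name and dictionary keys are assumptions made for the example.

```python
def diagnostic_indices(a, b, c, d):
    """Indices from a 2 X 2 table: a = true +, b = false +, c = false -, d = true -."""
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "positive_predictive_value": a / (a + b),
        "negative_predictive_value": d / (c + d),
    }


# Hypothetical study: 100 people with the condition and 200 without it.
indices = diagnostic_indices(a=90, b=30, c=10, d=170)
for name, value in indices.items():
    print(f"{name}: {value:.2f}")
# sensitivity: 0.90
# specificity: 0.85
# positive_predictive_value: 0.75
# negative_predictive_value: 0.94
```

Sensitivity and specificity are calculated down the columns of the table, while the predictive values are calculated across the rows, so the predictive values shift when the proportion of people with the condition in the sample changes.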

86 Likelihood ratio (LR)
The probability that a given test result would be expected in a patient with the condition of interest, compared with the probability of the same result in a patient without the condition. LRs are calculated from the test’s sensitivity and specificity, and they apply to positive as well as negative tests.

87 Likelihood ratio (cont.)
LR of a positive test (LR+): the ratio of the probability of a positive test in a person with the condition to the probability of a positive test in a person without the condition.
LR+ = [a/(a+c)] / [1 - d/(b+d)] = sensitivity / (1 - specificity)

88 Likelihood ratio (cont.)
In a positive test:
LR >1: the probability that the condition is present is increased.
LR <1: the probability that the condition is present is decreased.
LR =1: the probability that the condition is present is unchanged.

89 Likelihood ratio (cont.)
LR of a negative test (LR-): the ratio of the probability of a negative test in a person with the condition to the probability of a negative test in a person without the condition.
LR- = [1 - a/(a+c)] / [d/(b+d)] = (1 - sensitivity) / specificity
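As a brief illustration of the two formulas, the sketch below computes LR+ and LR- from a sensitivity and a specificity. The function name and the example values (sensitivity 0.90, specificity 0.85) are hypothetical; the guard for a specificity of exactly 1 or 0 is an added assumption to avoid dividing by zero.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity given as proportions."""
    # LR+ = sensitivity / (1 - specificity); infinite if there are no false positives.
    lr_positive = (
        float("inf") if specificity == 1 else sensitivity / (1 - specificity)
    )
    # LR- = (1 - sensitivity) / specificity; infinite if there are no true negatives.
    lr_negative = (
        float("inf") if specificity == 0 else (1 - sensitivity) / specificity
    )
    return lr_positive, lr_negative


# Hypothetical test with 90% sensitivity and 85% specificity.
lr_pos, lr_neg = likelihood_ratios(0.90, 0.85)
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}")  # LR+ = 6.0, LR- = 0.12
```

A positive result on this hypothetical test is about six times more likely in a person with the condition than in a person without it.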

90 Likelihood ratio (cont.)
LRs have been referred to as the most useful single indicator of a test’s diagnostic strength. They can be used to help make decisions about the need for further testing and about the appropriate time to begin treatment.
Worster, A., G. Innes, and R. Abu-Laban, Diagnostic testing: an emergency medicine perspective. Can J Emerg Med, (5).

91 Meaning of LRs
LR >10 or <0.1: generates large and conclusive changes in the probability of a given diagnosis.
LR in the range of 5 to 10 or 0.1 to 0.2: generates a moderate and usually important change in the probability of a given diagnosis.
LR in the range of 2 to 5 or 0.2 to 0.5: generates a small but sometimes important change in the probability of a given diagnosis.
LR in the range of 1 to 2 or 0.5 to 1: changes the probability of a given diagnosis to a small and rarely important degree.

92 Meaning of LRs (cont.)
LRs >10 indicate that the test can be used to rule the condition in.
LRs ~ 1 provide no useful information for ruling the condition in or out.
LRs <0.1 indicate that the test can be used to rule the condition out.

93 Pre-test probability
The probability that a patient has a condition before the test is carried out. It is based on the clinician’s experience, the prevalence of the condition, and published literature, and may be modified up or down if the patient has risk factors.

94 Post-test probability
Is generated by combining a patient’s pre-test probability of having the condition with the test’s LR. A high pre-test probability coupled with a high LR produces a very high post-test probability. A low pre-test probability coupled with a low LR produces a very low post-test probability.

95 Using LRs with Pre-test & Post-test probabilities
A practitioner’s confidence about a correct diagnosis would be higher after positive results of a test with a high LR, especially if the pre-test probability was high. Thus, clinicians can use LRs in making decisions about the need for further testing and when to begin treatment.

96 Using LRs with Pre-test & Post-test probabilities (cont.)
When the post-test probability is very high, the condition is very likely present and treatment should be initiated. When it is very low, the condition can be ruled out and no further diagnostic or therapeutic action is necessary.

97 Using a nomogram
First, the pre-test probability is estimated. Next, the test’s LR is obtained from an article. Then draw a line between the pre-test probability and the LR, extending to the post-test probability.

98 Using LRs with Pre-test & Post-test probabilities (cont.)
LRs and post-test probabilities can be used serially. The post-test probability resulting from one test can be used as the pre-test probability for the next one.
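The slides read the post-test probability off a nomogram. As an alternative sketch, the same result can be obtained arithmetically with the standard odds form of the calculation: convert the pre-test probability to odds, multiply by the LR, and convert back to a probability. The function name and all numbers below are hypothetical, and chaining tests this way assumes that the results of the two tests are independent of each other.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Combine a pre-test probability (0 < p < 1) with an LR via odds."""
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Hypothetical patient with a 50% pre-test probability and a positive test with LR+ = 6.
after_first_test = post_test_probability(0.50, 6.0)
print(f"{after_first_test:.3f}")  # 0.857

# Serial use: the first post-test probability becomes the next pre-test probability.
after_second_test = post_test_probability(after_first_test, 2.0)
print(f"{after_second_test:.3f}")  # 0.923

# An LR of 1 leaves the probability unchanged.
print(f"{post_test_probability(0.30, 1.0):.2f}")  # 0.30
```

The serial example mirrors the step described on the last slide: 0.50 becomes about 0.86 after the first positive test, and that value is then used as the starting point for the second test.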

99 Clinical disagreement
Practitioners can still disagree about clinical findings, even when valid and reliable tests are used. There are 3 sources of clinical disagreement: the examiner (practitioner), the examined (patient), and the examination.

100 Clinical disagreement due to the examiner
Biological variation of the senses: many tests rely on the examiner’s abilities, and some people have better hearing, sight, more skill at palpation, etc.
Tendency to record inferences rather than evidence: examiners may “pre-diagnose” patients based on visible cues before the actual examination.

101 Clinical disagreement due to the examiner (cont.)
Ensnarement by diagnostic classification schemes: vague diagnostic criteria and the tendency to pigeon-hole patients.
Entrapment by prior expectation: the tendency for examiners to find what they hope to find (e.g., chiropractors find back problems, urologists find kidney problems).
Examiner incompetence.

102 Clinical disagreement due to the examined
Biological variation: many conditions vary from day to day.
Effects of illness and medications: a patient with severe pain is very difficult to examine, and pain medications may mask the true findings.

103 Clinical disagreement due to the examined (cont.)
Memory and rumination: chronic patients may include everything under the sun, or only what they think caused the problem (selective memory, recall bias).
Toss-ups: deals with conflicting ways to manage a patient’s condition.

104 Clinical disagreement due to the examination
Disruptive environment, e.g., an athletic field or a child crying during a parent’s examination.
Disruptive interactions between examiner and patient: patients won’t confide in a doctor they don’t like or trust.
Dysfunctional or incorrectly used diagnostic tools.

105 Appraising reliability and validity articles
First decide whether the purpose of the study is to assess the test’s reliability or validity (or both). Reliability studies assess the consistency of tests within or between examiners or questionnaires. Validity studies compare test results with established tests, or assess how accurately the test predicts a future outcome.

106 Appraising reliability and validity articles (cont.)
Was the test adequately described? The description should mention how patients prepared for the test (e.g., fasting prior to a blood test); what patients had to endure (e.g., drugs given for routine colonoscopy), since patient inconvenience, cost, and harm must be weighed against the need for information; and how the results were analyzed and interpreted.

107 Appraising reliability and validity articles (cont.)
Did the study sample include a full range of subjects with and without the condition? All types of patients should be included, like one would see in everyday clinical practice. If too many severely ill patients are included, there is a greater chance that those with the disease will test positive, which inflates the test’s apparent sensitivity. Such tests may be able to identify obviously ill patients, but not those who are only mildly ill.

108 Appraising reliability and validity articles (cont.)
If the study utilized a gold standard for comparison, was it an acceptable one? The credibility of a validity study depends on the soundness of the gold standard. It is often difficult to find an ideal gold standard, since most tests do not have both high sensitivity and high specificity. This is especially complex for spinal function tests.

109 Appraising reliability and validity articles (cont.)
Were the test results and the gold standard assessed independently in a blinded fashion? Raters should be unaware of the results of previous testing, because this knowledge can greatly affect the interpretation of tests. Expectation bias: when raters are influenced by knowledge of certain features of the case.

110 Appraising reliability and validity articles (cont.)
Verification bias: when the decision to carry out the gold standard test is influenced by the results of the test that is being evaluated. Be wary of studies that use more than one type of gold standard test (e.g., some patients are biopsied, while others wait to see if the condition develops).

111 Appraising reliability and validity articles (cont.)
Do the results of this study apply to the patient before me? The study’s population should be comparable to the patient on factors such as age, gender, and condition severity. Prevalence or severity of the condition may be higher in an academic environment. As a result, the test’s sensitivity may be higher than if it were studied in the general population.

112 Appraising reliability and validity articles (cont.)
Will patients benefit as a result of being tested? Is the new test really preferable to the old one? It may be less convenient, more expensive, and provide little or no added information. Beware of studies on diagnostic tests that have commercial ties. Test results should benefit the patient and actually result in a change in the way their condition is managed.

113 Appraising reliability and validity articles (cont.)
One must also consider the consequences of not performing the test, for instance, when a test is designed to detect a condition that is potentially very harmful if left undiagnosed (e.g., arterial dissection or abdominal aneurysm). The risk associated with the test should be proportional to the importance of the information to be gained.

114 Appraising reliability and validity articles (cont.)
Is the test reliable? Coefficients of agreement should be within acceptable ranges. P values or confidence intervals should point to statistically significant findings.
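As one illustration of a coefficient of agreement, Cohen's kappa for two examiners rating the same patients can be computed as sketched below. The function name and the ratings are hypothetical, and kappa is only one of several agreement coefficients a reliability study might report; what counts as an acceptable range depends on the interpretation scale the study adopts.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two examiners rating the same subjects.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement)
    """
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("Both examiners must rate the same non-empty set of subjects")

    n = len(ratings_a)

    # Proportion of subjects on which the two examiners gave the same rating.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Agreement expected by chance, from each examiner's marginal frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[category] * counts_b[category] for category in counts_a) / (n * n)

    if expected == 1.0:
        raise ValueError("Kappa is undefined when chance agreement is 100%")

    return (observed - expected) / (1.0 - expected)


# Hypothetical data: 1 = finding present, 0 = finding absent, 10 patients.
examiner_a = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
examiner_b = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]
print(round(cohen_kappa(examiner_a, examiner_b), 2))  # 0.6
```

Here the examiners agree on 8 of 10 patients (80%), but half of that agreement would be expected by chance alone, so kappa is 0.6 rather than 0.8.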

115 Appraising reliability and validity articles (cont.)
Is the test valid? P values or confidence intervals should be reported and should be significant. The gold standard should be a valid marker for what is being tested. Sensitivity and specificity should be sufficiently high; what counts as sufficient depends on the planned use of the test.
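When checking the sensitivity and specificity a validity study reports, it can help to recompute them from the 2x2 table of test results against the gold standard. The sketch below assumes such a table is available; the function name and the counts are hypothetical.

```python
def diagnostic_accuracy(true_pos, false_pos, false_neg, true_neg):
    """Sensitivity, specificity, and likelihood ratios from a 2x2 table.

    Counts compare the test under study with the gold standard:
      true_pos  = test positive, condition present
      false_pos = test positive, condition absent
      false_neg = test negative, condition present
      true_neg  = test negative, condition absent
    """
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)

    # Likelihood ratios are undefined at the extremes, so return None there.
    lr_positive = sensitivity / (1.0 - specificity) if specificity < 1.0 else None
    lr_negative = (1.0 - sensitivity) / specificity if specificity > 0.0 else None

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": lr_positive,
        "lr_negative": lr_negative,
    }


# Hypothetical study: 50 patients with the condition, 50 without.
result = diagnostic_accuracy(true_pos=45, false_pos=10, false_neg=5, true_neg=40)
print(round(result["sensitivity"], 2), round(result["specificity"], 2))  # 0.9 0.8
```

With these made-up counts the test detects 90% of patients who have the condition and correctly clears 80% of those who do not, giving a positive LR of about 4.5.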

