Clinical practice involves measuring quantities for a variety of purposes, such as: aiding diagnosis, predicting future patient outcomes, serving as endpoints.


1 Validity, reliability, reproducibility of an index test: Definitions and Assessment

2 Clinical practice involves measuring quantities for a variety of purposes, such as: aiding diagnosis, predicting future patient outcomes, serving as endpoints in clinical studies. Measurements are always prone to various sorts of errors, which cause the measured value to differ from the true value. Pre-analytical factors are a major source of variability in laboratory results: failure to identify these factors can lead to falsely increased or decreased results and to erroneous clinical decisions.

3 Trueness and Precision
Trueness (accuracy) refers to the closeness between the mean of a large number of results and the true value or an accepted reference value. Precision (agreement) refers to the closeness between repeated measurements on identical subjects. Different factors may contribute to the variability found in repeated measurements: observer, instrument, environment, time interval between measurements, …
Precision consists of both:
- repeatability (factors constant)
- reproducibility (factors variable)

4 [Figure: accuracy and precision, contrasting true values with error-prone measurements]

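5 Systematic error: $\text{Bias} = \frac{\sum x}{n} - x_0$, where $x_0$ is the true (or accepted reference) value
Random error: $\text{SD} = \left[\frac{\sum (x - m)^2}{n}\right]^{1/2}$, where $m$ is the mean of the measurements
ANOVA (random effect): $\text{ICC} = \frac{SD_B^2}{SD_B^2 + SD_W^2}$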
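As a minimal sketch of the first two formulas, the snippet below computes the bias (mean minus true value) and the SD with divisor n, as written on the slide. The readings and the true value of 100 are hypothetical illustration data, not taken from the presentation.

```python
import math

def bias_and_sd(measurements, true_value):
    """Return (bias, sd) using the slide's formulas (divisor n, not n - 1)."""
    n = len(measurements)
    mean = sum(measurements) / n
    bias = mean - true_value  # systematic error
    sd = math.sqrt(sum((x - mean) ** 2 for x in measurements) / n)  # random error
    return bias, sd

# Hypothetical repeated readings of a reference standard whose true value is 100.
readings = [102, 98, 101, 103, 101]
bias, sd = bias_and_sd(readings, true_value=100)
print(f"bias = {bias:.2f}, SD = {sd:.2f}")
```

A non-zero bias shifts every reading in the same direction, whereas the SD describes the scatter of the readings around their own mean.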

6 Method comparison Before we use a new measurement method in clinical practice, we must ensure that the measurements it gives are sufficiently similar to those generated by the reference measurement method currently in use. It is often of interest to use measurements to differentiate between subjects or groups of subjects: if we have a choice of two measurement methods, using the method with higher reliability will give greater statistical power to detect differences between subjects or groups of subjects.

7 Bland JM, Altman DG. Lancet 1986; i: 307–10

8 Plotting the data The first step in the analysis is to plot the data. The simplest plot is of subjects’ measurements from the new method against those from the established method. If both measurements were completely free from error, we would expect the points to lie on the diagonal line of equality. Visual assessment of the disagreements between the two methods is often easier when the difference between a subject’s two measurements is plotted against their mean.

9 Association between difference and mean
There may be an association between the paired differences and means. We can perform a statistical test to assess the evidence for a linear association, either by testing whether the correlation coefficient between the paired differences and means differs significantly from zero or by linear regression of the differences against the means.
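The two approaches can be sketched as follows. The function name `difference_mean_association` and the four paired values are hypothetical; the code returns the Pearson correlation, the regression slope of differences on means, and the t statistic (n − 2 degrees of freedom) that would be compared against a t table.

```python
import math

def difference_mean_association(new, ref):
    """Correlation, regression slope and t statistic for differences vs means."""
    diffs = [a - b for a, b in zip(new, ref)]
    means = [(a + b) / 2 for a, b in zip(new, ref)]
    n = len(diffs)
    mx, my = sum(means) / n, sum(diffs) / n
    sxx = sum((x - mx) ** 2 for x in means)
    syy = sum((y - my) ** 2 for y in diffs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(means, diffs))
    r = sxy / math.sqrt(sxx * syy)                   # Pearson correlation
    slope = sxy / sxx                                # least-squares slope
    t_stat = r * math.sqrt((n - 2) / (1 - r ** 2))   # undefined if |r| == 1
    return r, slope, t_stat

# Hypothetical paired measurements in which the difference grows with the mean.
new_method = [2, 4, 6, 9]
ref_method = [1, 2, 3, 4]
r, slope, t_stat = difference_mean_association(new_method, ref_method)
print(f"r = {r:.3f}, slope = {slope:.3f}, t = {t_stat:.2f}")
```

Both tests address the same linear association, so they lead to the same t statistic; the regression additionally gives the size of the trend.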

10 Causes of an observed association
1) There is a real association between the difference in measurements from the two methods and the true value being measured: the bias between methods changes over the range of true values.
2) The within-subject SDs of the two methods differ. This produces an association even in the absence of changing bias, whenever the new method has smaller or larger measurement errors than the standard method.
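The second cause can be illustrated with a small simulation. This is only a sketch under assumed values: true values are drawn from a Normal distribution with mean 100 and SD 10, neither method is biased, and the error SDs (5 versus 5, then 1 versus 5) are chosen arbitrarily.

```python
import math
import random

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def diff_mean_correlation(sd_new, sd_ref, n=5000, seed=0):
    """Correlation between paired differences and means for two unbiased methods."""
    rng = random.Random(seed)
    diffs, means = [], []
    for _ in range(n):
        true_value = rng.gauss(100, 10)          # error-free value
        new = true_value + rng.gauss(0, sd_new)  # no bias, error SD = sd_new
        ref = true_value + rng.gauss(0, sd_ref)  # no bias, error SD = sd_ref
        diffs.append(new - ref)
        means.append((new + ref) / 2)
    return pearson(means, diffs)

r_equal = diff_mean_correlation(sd_new=5, sd_ref=5)
r_unequal = diff_mean_correlation(sd_new=1, sd_ref=5)
print(f"equal error SDs:   r = {r_equal:.3f}")
print(f"unequal error SDs: r = {r_unequal:.3f}")
```

With equal error SDs the correlation is close to zero; with unequal error SDs it is clearly non-zero although the bias never changes, because the covariance between difference and mean equals half the difference of the two error variances.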

11 Limits of agreement The limits of agreement give a range within which we expect 95% of future differences in measurements between the two methods to lie. To estimate them, we first calculate the mean and SD of the paired differences. If the paired differences are Normally distributed, we expect 95% of them to fall within: $\text{mean difference} \pm 1.96 \times \text{SD(differences)}$. Under the same assumption, the standard error of each limit of agreement is approximately $\text{SD}\sqrt{3/n}$.
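A minimal sketch of this calculation from raw paired data is shown below. The function name and the four pairs are hypothetical, and the sample SD (divisor n − 1) is an assumption, since the slide does not state the divisor.

```python
import math
import statistics

def limits_of_agreement(new, ref):
    """Return (lower, upper, se_limit) for the 95% limits of agreement."""
    diffs = [a - b for a, b in zip(new, ref)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)        # sample SD, divisor n - 1 (assumption)
    lower = mean_diff - 1.96 * sd_diff
    upper = mean_diff + 1.96 * sd_diff
    se_limit = sd_diff * math.sqrt(3 / n)    # approximate SE of each limit
    return lower, upper, se_limit

# Hypothetical paired measurements from a new and a reference method.
new_method = [1, 2, 3, 4]
ref_method = [1, 1, 4, 3]
lower, upper, se_limit = limits_of_agreement(new_method, ref_method)
print(f"limits of agreement: {lower:.2f} to {upper:.2f} (SE {se_limit:.2f})")
```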

12 Bias between methods In contrast to the repeatability coefficient, which assumes no bias exists between measurements, the limits of agreement method relaxes this assumption. The mean of the paired differences tells us whether on average one method tended to underestimate or overestimate measurements relative to the measurements of the second method, which we refer to as a bias between the methods.

13 Differences (W − w) = d: mean = −2.1 L/min, SD = 38.76 L/min
95% of differences: …
SE(d) = $38.76/\sqrt{17}$ = …
95% CI(d) = …
95% CI (agreement limits): $L \pm t_{n-1}\,[s\sqrt{3/n}]$
LL: … ± 2.12 × … = …
UL: … ± 2.12 × … = …
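The quantities on this slide can be recomputed from the summary statistics it does give (mean −2.1, SD 38.76, n = 17, t = 2.12). The sketch below assumes the 1.96 multiplier from the limits-of-agreement formula; the slide's own printed results are not available, so the output is a recalculation rather than a copy of them.

```python
import math

# Summary statistics and t quantile (16 degrees of freedom) as given on the slide.
mean_d, sd_d, n = -2.1, 38.76, 17
t_crit = 2.12

# Standard error and 95% CI of the mean difference (bias).
se_mean = sd_d / math.sqrt(n)
ci_mean = (mean_d - t_crit * se_mean, mean_d + t_crit * se_mean)

# 95% limits of agreement, assuming the multiplier 1.96.
lower = mean_d - 1.96 * sd_d
upper = mean_d + 1.96 * sd_d

# Approximate standard error of each limit and the resulting 95% CIs.
se_limit = sd_d * math.sqrt(3 / n)
ci_lower = (lower - t_crit * se_limit, lower + t_crit * se_limit)
ci_upper = (upper - t_crit * se_limit, upper + t_crit * se_limit)

print(f"SE(d) = {se_mean:.2f}, 95% CI(d) = {ci_mean[0]:.2f} to {ci_mean[1]:.2f}")
print(f"limits of agreement: {lower:.2f} to {upper:.2f}")
print(f"95% CI lower limit: {ci_lower[0]:.2f} to {ci_lower[1]:.2f}")
print(f"95% CI upper limit: {ci_upper[0]:.2f} to {ci_upper[1]:.2f}")
```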

14 Study types
1) In a repeatability study we investigate and quantify the repeatability of measurements made by a single instrument. The conditions of measurement remain constant.
2) In a reproducibility study measurements are made by different observers (fixed or random). Systematic bias may exist between observers, and their measurement SDs may differ.

15 Repeatability studies
For an appropriately selected sample, make at least two measurements per subject under identical conditions, that is, by the same measurement method and the same observer. The possibility of bias between measurements must be excluded. The agreement between measurements made on the same subject then depends only on the within-subject SD (the estimate of measurement error).

16 Repeat measurements (L/min)

Subject   1st   2nd   Diff (1st - 2nd)   Diff²
 1        494   490     4                  16
 2        395   397    -2                   4
 3        516   512     4                  16
 4        434   401    33                1089
 5        476   470     6                  36
 6        557   611   -54                2916
 7        413   415    -2                   4
 8        442   431    11                 121
 9        650   638    12                 144
10        433   429     4                  16
11        417   420    -3                   9
12        656   633    23                 529
13        267   275    -8                  64
14        478   492   -14                 196
15        178   165    13                 169
16        423   372    51                2601
17        427   421     6                  36

S²D = 468.… (Reference); S²D = 792.… (New)
SD = … (Reference); Repeatability Coefficient = 43.23
SD = 28.16 (New); Repeatability Coefficient = 56.32
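The calculation behind these figures can be sketched from the 17 pairs of first and second measurements listed above. The sketch assumes no bias between repeats (the mean difference is taken as zero) and uses the multiplier 1.96·√2 for the repeatability coefficient, so its result need not match the slide's rounded 43.23 exactly.

```python
import math

# First and second measurements (L/min) for the 17 subjects on the slide.
first = [494, 395, 516, 434, 476, 557, 413, 442, 650,
         433, 417, 656, 267, 478, 178, 423, 427]
second = [490, 397, 512, 401, 470, 611, 415, 431, 638,
          429, 420, 633, 275, 492, 165, 372, 421]

diffs = [a - b for a, b in zip(first, second)]
n = len(diffs)
sum_sq = sum(d ** 2 for d in diffs)

# Mean squared difference, taking the mean difference as zero (no bias assumed).
mean_sq_diff = sum_sq / n
sd_diff = math.sqrt(mean_sq_diff)

# Within-subject SD: each difference carries two measurement errors.
s_within = math.sqrt(sum_sq / (2 * n))

# Repeatability coefficient with the 1.96 * sqrt(2) multiplier (assumption).
repeatability = 1.96 * math.sqrt(2) * s_within

print(f"sum of squared differences = {sum_sq}")
print(f"mean squared difference = {mean_sq_diff:.2f}")
print(f"within-subject SD = {s_within:.2f}")
print(f"repeatability coefficient = {repeatability:.2f}")
```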

17 To estimate the within-subject SD (measurement error), we can fit a one-way analysis of variance (ANOVA) model to the data containing the repeated measurements made on subjects. The model separates two sources of variability: differences between subjects and differences within subjects. Fitting the ANOVA model yields estimates of the between-subject variance $s_B^2$ and the within-subject variance $s_W^2$. The within-subject SD estimate can be used to give an estimate of the repeatability coefficient.
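For a balanced design (the same number of repeats per subject) the variance components can be obtained from the ANOVA mean squares by hand. The sketch below assumes such a design; the function name and the three-subject data set are hypothetical.

```python
import math

def variance_components(data):
    """Between- and within-subject variances from a balanced one-way ANOVA.

    data: list of per-subject lists, each holding the same number k of repeats.
    """
    n = len(data)
    k = len(data[0])
    subject_means = [sum(row) / k for row in data]
    grand_mean = sum(subject_means) / n
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum((x - m) ** 2
                    for row, m in zip(data, subject_means)
                    for x in row) / (n * (k - 1))
    s2_within = ms_within
    # Method-of-moments estimate, truncated at zero if it comes out negative.
    s2_between = max((ms_between - ms_within) / k, 0.0)
    return s2_between, s2_within

# Hypothetical data: three subjects, two repeat measurements each.
repeats = [[10, 12], [20, 22], [30, 28]]
s2_b, s2_w = variance_components(repeats)
repeatability = 1.96 * math.sqrt(2 * s2_w)
print(f"s2_B = {s2_b:.2f}, s2_W = {s2_w:.2f}, repeatability = {repeatability:.2f}")
```

With unbalanced data or additional factors such as observers, a mixed-model routine from a statistics package would be the more usual choice.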

18 Reporting repeatability
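The within-subject SD, $s_W$, summarizes the measurement error. The repeatability coefficient, $1.96\sqrt{2}\,s_W \approx 2.77\,s_W$, is the value below which the absolute difference between two measurements made on the same subject is expected to lie for 95% of pairs.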

19 The ANOVA model assumes that the measurement errors are statistically independent of the true ‘error free’ value, and that the SD of the errors is constant throughout the range of ‘error-free’ values. Sometimes the SD of errors increases with the true value being measured (check by plotting paired differences between measurements against their mean). The “repeatability coefficient” relies on the differences between measurements being approximately Normally distributed (check by a histogram or Normal plot of the differences in paired measurements on each subject).

20 Reliability in method comparison studies
As discussed previously, reliability may be a useful parameter with which to compare two different measurement methods. To estimate each method’s reliability, we must make at least two measurements of each subject with each of the two methods. The repeat measurements from each method can then be analyzed as two separate repeatability studies, giving estimates of each method’s reliability, which can be compared. Because reliability depends on the heterogeneity of the true error-free values in the sampled population, it is essential that reliability ICCs are compared only if they have been estimated from the same population.

21 Reliability Relates the magnitude of the measurement error in observed measurements to the inherent variability in the ‘error-free’ level of the quantity between subjects:
$\text{Reliability} = \frac{(\text{SD of subjects' true values})^2}{(\text{SD of subjects' true values})^2 + (\text{SD of measurement error})^2}$
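A short sketch of this ratio shows why reliabilities should only be compared within the same population. The SD values are hypothetical: the measurement error is identical in both cases, and only the spread of the true values differs.

```python
def reliability(sd_true, sd_error):
    """Share of the observed variance that is due to true between-subject variability."""
    return sd_true ** 2 / (sd_true ** 2 + sd_error ** 2)

# Same measurement error (SD 5), two hypothetical populations.
heterogeneous = reliability(sd_true=10, sd_error=5)  # widely spread true values
homogeneous = reliability(sd_true=5, sd_error=5)     # narrowly spread true values
print(f"heterogeneous population: {heterogeneous:.2f}")
print(f"homogeneous population:   {homogeneous:.2f}")
```

The method is equally precise in both populations, yet its reliability is lower where subjects differ less from one another.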

22 From healthy volunteers
Factors influencing ammonia measurements:
- sample temperature
- centrifugation temperature (0° vs 25°)
- storage time, temperature, conditions (30’ vs 60’; 4° vs 25°; open vs closed tubes)
- patient covariates (biochemical and hematological)

23 20 healthy outpatient volunteers, 19–47 years of age
- 4 subsamples: K3 EDTA HEPA: NH3-1, NH3-2, NH3-3
- Conservation 30’: icy water vs room temperature
- Centrifugation: 0° vs 25° C (measurement 1)
- Conservation 30’: 4° vs 20° C, closed/opened (measurement 2)
- Y: (NH3-n − NH3-1)/NH3 × 100%
- Median, IQR
- Multiple Linear Regression Analysis

24 Conclusions
- As measurement techniques may be used in a variety of settings and different populations, it is advisable to report estimates of between- and within-subject SDs.
- If the reliabilities of two methods are to be compared, each method’s reliability should be estimated separately, by making at least two measurements on each subject with each measurement method.
- An association between paired differences and means is not necessarily caused by changing bias between two methods. It may also be caused by a difference in the methods’ measurement error SDs.
- Where measurements involve an observer or rater, measurement error studies must use an adequate number of observers (reproducibility studies).

25 References
1) Bartlett JW, Frost C (2008): Reliability, repeatability and reproducibility: analysis of measurement errors in continuous variables. Ultrasound Obstet Gynecol; 31: 466–75.
2) Bland JM, Altman DG (1999): Measuring agreement in method comparison studies. Stat Methods Med Res; 8: 135–60.
3) Bland JM, Altman DG (1986): Statistical methods for assessing agreement between two methods of clinical measurement. Lancet; i: 307–10.

