# Copyright 2004 David J. Lilja1 Comparing Two Alternatives Use confidence intervals for Before-and-after comparisons Noncorresponding measurements.

## Presentation on theme: "Copyright 2004 David J. Lilja1 Comparing Two Alternatives Use confidence intervals for Before-and-after comparisons Noncorresponding measurements."— Presentation transcript:

Copyright 2004 David J. Lilja1 Comparing Two Alternatives Use confidence intervals for Before-and-after comparisons Noncorresponding measurements

Copyright 2004 David J. Lilja2 Comparing more than two alternatives ANOVA Analysis of Variance Partitions total variation in a set of measurements into Variation due to real differences in alternatives Variation due to errors

Copyright 2004 David J. Lilja3 Comparing Two Alternatives 1. Before-and-after Did a change to the system have a statistically significant impact on performance? 2. Non-corresponding measurements Is there a statistically significant difference between two different systems?

Copyright 2004 David J. Lilja4 Before-and-After Comparison Assumptions Before-and-after measurements are not independent Variances in two sets of measurements may not be equal → Measurements are related Form obvious corresponding pairs Find mean of differences

Copyright 2004 David J. Lilja5 Before-and-After Comparison

Copyright 2004 David J. Lilja6 Before-and-After Comparison Measurement (i) Before (b i ) After (a i ) Difference (d i = b i – a i ) 18586 28388-5 394904 4 95-5 58891-3 687834

Copyright 2004 David J. Lilja7 Before-and-After Comparison From mean of differences, appears that change reduced performance. However, standard deviation is large.

Copyright 2004 David J. Lilja8 95% Confidence Interval for Mean of Differences a n0.900.950.975 ………… 51.4762.0152.571 61.4401.9432.447 ………… ∞1.2821.6451.960

Copyright 2004 David J. Lilja9 95% Confidence Interval for Mean of Differences

Copyright 2004 David J. Lilja10 95% Confidence Interval for Mean of Differences c 1,2 = [-5.36, 3.36] Interval includes 0 → With 95% confidence, there is no statistically significant difference between the two systems.

Copyright 2004 David J. Lilja11 Noncorresponding Measurements No direct correspondence between pairs of measurements Unpaired observations n 1 measurements of system 1 n 2 measurements of system 2

Copyright 2004 David J. Lilja12 Confidence Interval for Difference of Means 1. Compute means 2. Compute difference of means 3. Compute standard deviation of difference of means 4. Find confidence interval for this difference 5. No statistically significant difference between systems if interval includes 0

Copyright 2004 David J. Lilja13 Confidence Interval for Difference of Means

Copyright 2004 David J. Lilja14 Why Add Standard Deviations? x1x1 x2x2 s2s2 s1s1 x 1 – x 2

Copyright 2004 David J. Lilja15 Number of Degrees of Freedom

Copyright 2004 David J. Lilja16 Example – Noncorresponding Measurements

Copyright 2004 David J. Lilja17 Example (cont.)

Copyright 2004 David J. Lilja18 Example (cont.): 90% CI

Copyright 2004 David J. Lilja19 A Special Case If n 1 < ≈ 30 or n 2 < ≈ 30 And errors are normally distributed And s 1 = s 2 (standard deviations are equal) OR If n 1 = n 2 And errors are normally distributed Even if s 1 is not equal s 2 Then special case applies…

Copyright 2004 David J. Lilja20 A Special Case Typically produces tighter confidence interval Sometimes useful after obtaining additional measurements to tease out small differences

Copyright 2004 David J. Lilja21 Comparing Proportions

Copyright 2004 David J. Lilja22 Comparing Proportions Since it is a binomial distribution

Copyright 2004 David J. Lilja23 Comparing Proportions

Copyright 2004 David J. Lilja24 OS Example (cont) Initial operating system n1 = 1,300,203 interrupts (3.5 hours) m1 = 142,892 interrupts occurred in OS code p1 = 0.1099, or 11% of time executing in OS Upgrade OS n2 = 999,382 m2 = 84,876 p2 = 0.0849, or 8.5% of time executing in OS Statistically significant improvement?

Copyright 2004 David J. Lilja25 OS Example (cont) p = p1 – p2 = 0.0250 s p = 0.0003911 90% confidence interval (0.0242, 0.0257) Statistically significant difference?

Copyright 2004 David J. Lilja26 Important Points Use confidence intervals to determine if there are statistically significant differences Before-and-after comparisons Find interval for mean of differences Noncorresponding measurements Find interval for difference of means Proportions If interval includes zero → No statistically significant difference

Copyright 2004 David J. Lilja27 Comparing More Than Two Alternatives Naïve approach Compare confidence intervals

Copyright 2004 David J. Lilja28 One-Factor Analysis of Variance (ANOVA) Very general technique Look at total variation in a set of measurements Divide into meaningful components Also called One-way classification One-factor experimental design Introduce basic concept with one-factor ANOVA Generalize later with design of experiments

Copyright 2004 David J. Lilja29 One-Factor Analysis of Variance (ANOVA) Separates total variation observed in a set of measurements into: 1. Variation within one system Due to random measurement errors 2. Variation between systems Due to real differences + random error Is variation(2) statistically > variation(1)?

Copyright 2004 David J. Lilja30 ANOVA Make n measurements of k alternatives y ij = ith measurment on jth alternative Assumes errors are: Independent Gaussian (normal)

Copyright 2004 David J. Lilja31 Measurements for All Alternatives Alternatives Measure ments 12…j…k 1y 11 y 12 …y 1j …y k1 2y 21 y 22 …y 2j …y 2k ………………… iy i1 y i2 …y ij …y ik ………………… ny n1 y n2 …y nj …y nk Col meany.1 y.2 …y.j …y.k Effectα1α1 α2α2 …αjαj …αkαk

Copyright 2004 David J. Lilja32 Column Means Column means are average values of all measurements within a single alternative Average performance of one alternative

Copyright 2004 David J. Lilja33 Column Means Alternatives Measure ments 12…j…k 1y 11 y 12 …y 1j …y k1 2y 21 y 22 …y 2j …y 2k ………………… iy i1 y i2 …y ij …y ik ………………… ny n1 y n2 …y nj …y nk Col meany.1 y.2 …y.j …y.k Effectα1α1 α2α2 …αjαj …αkαk

Copyright 2004 David J. Lilja34 Deviation From Column Mean

Copyright 2004 David J. Lilja35 Error = Deviation From Column Mean Alternatives Measure ments 12…j…k 1y 11 y 12 …y 1j …y k1 2y 21 y 22 …y 2j …y 2k ………………… iy i1 y i2 …y ij …y ik ………………… ny n1 y n2 …y nj …y nk Col meany.1 y.2 …y.j …y.k Effectα1α1 α2α2 …αjαj …αkαk

Copyright 2004 David J. Lilja36 Overall Mean Average of all measurements made of all alternatives

Copyright 2004 David J. Lilja37 Overall Mean Alternatives Measure ments 12…j…k 1y 11 y 12 …y 1j …y k1 2y 21 y 22 …y 2j …y 2k ………………… iy i1 y i2 …y ij …y ik ………………… ny n1 y n2 …y nj …y nk Col meany.1 y.2 …y.j …y.k Effectα1α1 α2α2 …αjαj …αkαk

Copyright 2004 David J. Lilja38 Deviation From Overall Mean

Copyright 2004 David J. Lilja39 Effect = Deviation From Overall Mean Alternatives Measure ments 12…j…k 1y 11 y 12 …y 1j …y k1 2y 21 y 22 …y 2j …y 2k ………………… iy i1 y i2 …y ij …y ik ………………… ny n1 y n2 …y nj …y nk Col meany.1 y.2 …y.j …y.k Effectα1α1 α2α2 …αjαj …αkαk

Copyright 2004 David J. Lilja40 Effects and Errors Effect is distance from overall mean Horizontally across alternatives Error is distance from column mean Vertically within one alternative Error across alternatives, too Individual measurements are then:

Copyright 2004 David J. Lilja41 Sum of Squares of Differences: SSE

Copyright 2004 David J. Lilja42 Sum of Squares of Differences: SSA

Copyright 2004 David J. Lilja43 Sum of Squares of Differences: SST

Copyright 2004 David J. Lilja44 Sum of Squares of Differences

Copyright 2004 David J. Lilja45 Sum of Squares of Differences SST = differences between each measurement and overall mean SSA = variation due to effects of alternatives SSE = variation due to errors in measurments

Copyright 2004 David J. Lilja46 ANOVA – Fundamental Idea Separates variation in measured values into: 1. Variation due to effects of alternatives SSA – variation across columns 2. Variation due to errors SSE – variation within a single column If differences among alternatives are due to real differences, SSA should be statistically > SSE

Copyright 2004 David J. Lilja47 Comparing SSE and SSA Simple approach SSA / SST = fraction of total variation explained by differences among alternatives SSE / SST = fraction of total variation due to experimental error But is it statistically significant?

Copyright 2004 David J. Lilja48 Statistically Comparing SSE and SSA

Copyright 2004 David J. Lilja49 Degrees of Freedom df(SSA) = k – 1, since k alternatives df(SSE) = k(n – 1), since k alternatives, each with (n – 1) df df(SST) = df(SSA) + df(SSE) = kn - 1

Copyright 2004 David J. Lilja50 Degrees of Freedom for Effects Alternatives Measure ments 12…j…k 1y 11 y 12 …y 1j …y k1 2y 21 y 22 …y 2j …y 2k ………………… iy i1 y i2 …y ij …y ik ………………… ny n1 y n2 …y nj …y nk Col meany.1 y.2 …y.j …y.k Effectα1α1 α2α2 …αjαj …αkαk

Copyright 2004 David J. Lilja51 Degrees of Freedom for Errors Alternatives Measure ments 12…j…k 1y 11 y 12 …y 1j …y k1 2y 21 y 22 …y 2j …y 2k ………………… iy i1 y i2 …y ij …y ik ………………… ny n1 y n2 …y nj …y nk Col meany.1 y.2 …y.j …y.k Effectα1α1 α2α2 …αjαj …αkαk

Copyright 2004 David J. Lilja52 Degrees of Freedom for Errors Alternatives Measure ments 12…j…k 1y 11 y 12 …y 1j …y k1 2y 21 y 22 …y 2j …y 2k ………………… iy i1 y i2 …y ij …y ik ………………… ny n1 y n2 …y nj …y nk Col meany.1 y.2 …y.j …y.k Effectα1α1 α2α2 …αjαj …αkαk

Copyright 2004 David J. Lilja53 Variances from Sum of Squares (Mean Square Value)

Copyright 2004 David J. Lilja54 Comparing Variances Use F-test to compare ratio of variances

Copyright 2004 David J. Lilja55 F-test If F computed > F table → We have (1 – α) * 100% confidence that variation due to actual differences in alternatives, SSA, is statistically greater than variation due to errors, SSE.

Copyright 2004 David J. Lilja56 ANOVA Summary

Copyright 2004 David J. Lilja57 ANOVA Example Alternatives Measurements123Overall mean 10.09720.13820.7966 20.09710.14320.5300 30.09690.13820.5152 40.19540.17300.6675 50.09740.13830.5298 Column mean0.11680.14620.60780.2903 Effects-0.1735-0.14410.3175

Copyright 2004 David J. Lilja58 ANOVA Example

Copyright 2004 David J. Lilja59 Conclusions from example SSA/SST = 0.7585/0.8270 = 0.917 → 91.7% of total variation in measurements is due to differences among alternatives SSE/SST = 0.0685/0.8270 = 0.083 → 8.3% of total variation in measurements is due to noise in measurements Computed F statistic > tabulated F statistic → 95% confidence that differences among alternatives are statistically significant.

Copyright 2004 David J. Lilja60 Contrasts ANOVA tells us that there is a statistically significant difference among alternatives But it does not tell us where difference is Use method of contrasts to compare subsets of alternatives A vs B {A, B} vs {C} Etc.

Copyright 2004 David J. Lilja61 Contrasts Contrast = linear combination of effects of alternatives

Copyright 2004 David J. Lilja62 Contrasts E.g. Compare effect of system 1 to effect of system 2

Copyright 2004 David J. Lilja63 Construct confidence interval for contrasts Need Estimate of variance Appropriate value from t table Compute confidence interval as before If interval includes 0 Then no statistically significant difference exists between the alternatives included in the contrast

Copyright 2004 David J. Lilja64 Variance of random variables Recall that, for independent random variables X 1 and X 2

Copyright 2004 David J. Lilja65 Variance of a contrast c Assumes variation due to errors is equally distributed among kn total measurements

Copyright 2004 David J. Lilja66 Confidence interval for contrasts

Copyright 2004 David J. Lilja67 Example 90% confidence interval for contrast of [Sys1- Sys2]

Copyright 2004 David J. Lilja68 Important Points Use one-factor ANOVA to separate total variation into: – Variation within one system Due to random errors – Variation between systems Due to real differences (+ random error) Is the variation due to real differences statistically greater than the variation due to errors?

Copyright 2004 David J. Lilja69 Important Points Use contrasts to compare effects of subsets of alternatives

Download ppt "Copyright 2004 David J. Lilja1 Comparing Two Alternatives Use confidence intervals for Before-and-after comparisons Noncorresponding measurements."

Similar presentations