Comparing Two Alternatives (Copyright 2004 David J. Lilja)


Slide 1: Comparing Two Alternatives
Use confidence intervals for:
- Before-and-after comparisons
- Noncorresponding measurements

Slide 2: Comparing More Than Two Alternatives
ANOVA (Analysis of Variance) partitions the total variation in a set of measurements into:
- Variation due to real differences among the alternatives
- Variation due to errors

Slide 3: Comparing Two Alternatives
1. Before-and-after: Did a change to the system have a statistically significant impact on performance?
2. Noncorresponding measurements: Is there a statistically significant difference between two different systems?

Slide 4: Before-and-After Comparison
Assumptions:
- Before-and-after measurements are not independent
- Variances in the two sets of measurements may not be equal
→ The measurements are related, so form the obvious corresponding pairs and find the mean of the differences.

Slide 5: Before-and-After Comparison
(Equation slide; the quantities defined are the mean of the differences, d̄ = (1/n) Σᵢ dᵢ, and the sample standard deviation of the differences, s_d.)

Slide 6: Before-and-After Comparison
Measurement (i) | Before (bᵢ) | After (aᵢ) | Difference (dᵢ = bᵢ − aᵢ)
(The numeric entries of the example table were not captured in the transcript.)

Slide 7: Before-and-After Comparison
From the mean of the differences, it appears that the change reduced performance. However, the standard deviation of the differences is large.

Slide 8: 95% Confidence Interval for Mean of Differences
(Slide shows an excerpt of the t table, indexed by a and n, ending with the n = ∞ row.)

Slide 9: 95% Confidence Interval for Mean of Differences
(Equation slide: c₁,₂ = d̄ ∓ t[a/2; n−1] · s_d / √n.)

Slide 10: 95% Confidence Interval for Mean of Differences
c₁,₂ = [−5.36, 3.36]
The interval includes 0 → with 95% confidence, there is no statistically significant difference between the two systems.
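As a sketch of this paired procedure, the mean-of-differences interval can be computed directly. The measurement values below are hypothetical (not the slide's example), and the t value would be looked up in a t table:

```python
import math
import statistics

def paired_diff_ci(before, after, t_crit):
    """Confidence interval for the mean of before-minus-after differences.
    t_crit is t[a/2; n-1] from a t table for the desired confidence level."""
    d = [b - a for b, a in zip(before, after)]
    n = len(d)
    mean_d = statistics.mean(d)   # mean of the differences
    s_d = statistics.stdev(d)     # sample standard deviation of the differences
    half = t_crit * s_d / math.sqrt(n)
    return (mean_d - half, mean_d + half)

# Hypothetical before/after execution times; t[0.025; 5] = 2.571 from a t table.
before = [85, 83, 94, 90, 88, 87]
after  = [86, 88, 90, 95, 91, 83]
lo, hi = paired_diff_ci(before, after, t_crit=2.571)
# The interval straddles 0, so this data shows no statistically
# significant difference at 95% confidence.
```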

Slide 11: Noncorresponding Measurements
- No direct correspondence between pairs of measurements (unpaired observations)
- n₁ measurements of system 1
- n₂ measurements of system 2

Slide 12: Confidence Interval for Difference of Means
1. Compute the means
2. Compute the difference of the means
3. Compute the standard deviation of the difference of the means
4. Find the confidence interval for this difference
5. If the interval includes 0, there is no statistically significant difference between the systems
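These five steps can be sketched as follows. The data are hypothetical; the t value would still come from a table, using the approximate degrees of freedom computed in the second function (the unequal-variance approximation used in the slides that follow):

```python
import math
import statistics

def diff_means_ci(x1, x2, t_crit):
    """Confidence interval for the difference of means of two unpaired samples."""
    n1, n2 = len(x1), len(x2)
    diff = statistics.mean(x1) - statistics.mean(x2)       # steps 1-2
    s_x = math.sqrt(statistics.variance(x1) / n1 +
                    statistics.variance(x2) / n2)          # step 3
    return (diff - t_crit * s_x, diff + t_crit * s_x)      # step 4

def approx_df(x1, x2):
    """Approximate degrees of freedom when the variances may be unequal."""
    n1, n2 = len(x1), len(x2)
    a = statistics.variance(x1) / n1
    b = statistics.variance(x2) / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
```

For example, `diff_means_ci([10, 12, 11, 13], [8, 9, 7, 10], t_crit=2.447)` (hypothetical samples, t for 6 degrees of freedom) yields an interval that lies entirely above 0, so step 5 would report a statistically significant difference.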

Slide 13: Confidence Interval for Difference of Means
(Equation slide: x̄ = x̄₁ − x̄₂, s_x = √(s₁²/n₁ + s₂²/n₂), c₁,₂ = x̄ ∓ t[a/2; n_df] · s_x.)

Slide 14: Why Add Standard Deviations?
(Figure: two distributions with means x̄₁ and x̄₂ and spreads s₁ and s₂; the spread of the difference x̄₁ − x̄₂ reflects the variability of both.)

Slide 15: Number of Degrees of Freedom
(Equation slide: n_df = (s₁²/n₁ + s₂²/n₂)² / [ (s₁²/n₁)²/(n₁−1) + (s₂²/n₂)²/(n₂−1) ], rounded to an integer.)

Slide 16: Example – Noncorresponding Measurements

Slide 17: Example (cont.)

Slide 18: Example (cont.): 90% Confidence Interval

Slide 19: A Special Case
If n₁ or n₂ is less than about 30, the errors are normally distributed, and s₁ = s₂ (the standard deviations are equal),
OR
if n₁ = n₂ and the errors are normally distributed, even if s₁ ≠ s₂,
then the special case applies…

Slide 20: A Special Case
- Typically produces a tighter confidence interval
- Sometimes useful after obtaining additional measurements, to tease out small differences
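A minimal sketch of this special case, pooling the two sample standard deviations and using n₁ + n₂ − 2 degrees of freedom (data and t value below are hypothetical):

```python
import math
import statistics

def pooled_ci(x1, x2, t_crit):
    """Special-case CI for the difference of means: pool the two sample
    standard deviations; t_crit uses n1 + n2 - 2 degrees of freedom."""
    n1, n2 = len(x1), len(x2)
    v1, v2 = statistics.variance(x1), statistics.variance(x2)
    # Pooled standard deviation, weighting each variance by its df
    s_p = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    diff = statistics.mean(x1) - statistics.mean(x2)
    half = t_crit * s_p * math.sqrt(1 / n1 + 1 / n2)
    return (diff - half, diff + half)

# Hypothetical samples; t[0.025; 6] = 2.447 from a t table.
lo, hi = pooled_ci([10, 12, 11, 13], [8, 9, 7, 10], t_crit=2.447)
```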

Slide 21: Comparing Proportions

Slide 22: Comparing Proportions
Since the number of events m out of n follows a binomial distribution, the proportion p = m/n has variance p(1 − p)/n, and for large n the binomial can be approximated by a normal distribution.

Slide 23: Comparing Proportions

Slide 24: OS Example (cont)
Initial operating system:
- n₁ = 1,300,203 interrupts (3.5 hours)
- m₁ = 142,892 interrupts occurred in OS code
- p₁ = m₁/n₁ = 0.110, i.e. about 11% of time executing in OS code
Upgraded OS:
- n₂ = 999,382
- m₂ = 84,876
- p₂ = m₂/n₂ = 0.085, i.e. about 8.5% of time executing in OS code
Statistically significant improvement?

Slide 25: OS Example (cont)
p = p₁ − p₂ ≈ 0.025
s_p = √(p₁(1 − p₁)/n₁ + p₂(1 − p₂)/n₂)
90% confidence interval: (0.0242, …) (upper endpoint not captured in the transcript)
The interval lies above 0 → the difference is statistically significant.
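The whole proportion comparison can be reproduced from the counts given on the previous slide, using the normal approximation; `NormalDist` supplies the z value for 90% confidence:

```python
import math
from statistics import NormalDist

n1, m1 = 1_300_203, 142_892   # initial OS: total interrupts, interrupts in OS code
n2, m2 = 999_382, 84_876      # upgraded OS

p1, p2 = m1 / n1, m2 / n2
p = p1 - p2                                               # difference of proportions
s_p = math.sqrt(p1 * (1 - p1) / n1 +
                p2 * (1 - p2) / n2)                       # std. dev. of the difference
z = NormalDist().inv_cdf(0.95)                            # 90% confidence -> a/2 = 0.05
ci = (p - z * s_p, p + z * s_p)
# ci excludes 0, so the reduction in time spent in OS code is
# statistically significant at 90% confidence.
```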

Slide 26: Important Points
Use confidence intervals to determine if there are statistically significant differences:
- Before-and-after comparisons: find the interval for the mean of the differences
- Noncorresponding measurements: find the interval for the difference of the means
- Proportions: find the interval for the difference of the proportions
If the interval includes zero → no statistically significant difference

Slide 27: Comparing More Than Two Alternatives
Naïve approach: compare confidence intervals.

Slide 28: One-Factor Analysis of Variance (ANOVA)
- Very general technique: look at the total variation in a set of measurements and divide it into meaningful components
- Also called one-way classification, or a one-factor experimental design
- Introduces the basic concept with one-factor ANOVA; generalized later with design of experiments

Slide 29: One-Factor Analysis of Variance (ANOVA)
Separates the total variation observed in a set of measurements into:
1. Variation within one system, due to random measurement errors
2. Variation between systems, due to real differences plus random error
Is variation (2) statistically greater than variation (1)?

Slide 30: ANOVA
Make n measurements of each of k alternatives.
y_ij = ith measurement on the jth alternative
Assumes the errors are:
- Independent
- Gaussian (normal)

Slide 31: Measurements for All Alternatives

Measurements |  1    2   …   j   …   k    (alternatives)
      1      | y11  y12  …  y1j  …  y1k
      2      | y21  y22  …  y2j  …  y2k
      …      |  …    …   …   …   …   …
      i      | yi1  yi2  …  yij  …  yik
      …      |  …    …   …   …   …   …
      n      | yn1  yn2  …  ynj  …  ynk
Col. mean    | ȳ.1  ȳ.2  …  ȳ.j  …  ȳ.k
Effect       | α1   α2   …  αj   …  αk

Slide 32: Column Means
Column means are the average values of all measurements within a single alternative, ȳ.j = (1/n) Σᵢ y_ij: the average performance of one alternative.

Slide 33: Column Means
(Same measurement table as Slide 31, with the column-mean row highlighted.)

Slide 34: Deviation From Column Mean
(Equation slide: the error in each measurement is its deviation from the column mean, e_ij = y_ij − ȳ.j.)

Slide 35: Error = Deviation From Column Mean
(Same measurement table as Slide 31, with the entries within a single column highlighted.)

Slide 36: Overall Mean
Average of all measurements made of all alternatives: ȳ.. = (1/(kn)) Σⱼ Σᵢ y_ij.

Slide 37: Overall Mean
(Same measurement table as Slide 31; the overall mean averages every entry.)

Slide 38: Deviation From Overall Mean
(Equation slide: the effect of alternative j is the deviation of its column mean from the overall mean, αj = ȳ.j − ȳ..)

Slide 39: Effect = Deviation From Overall Mean
(Same measurement table as Slide 31, with the effects row highlighted.)

Slide 40: Effects and Errors
- An effect is the distance from the overall mean (horizontally, across alternatives)
- An error is the distance from the column mean (vertically, within one alternative); errors occur across alternatives, too
- Individual measurements are then: y_ij = ȳ.. + αj + e_ij

Slide 41: Sum of Squares of Differences: SSE
(Equation slide: SSE = Σⱼ Σᵢ (y_ij − ȳ.j)².)

Slide 42: Sum of Squares of Differences: SSA
(Equation slide: SSA = n Σⱼ (ȳ.j − ȳ..)² = n Σⱼ αj².)

Slide 43: Sum of Squares of Differences: SST
(Equation slide: SST = Σⱼ Σᵢ (y_ij − ȳ..)².)

Slide 44: Sum of Squares of Differences
(Equation slide: SST = SSA + SSE.)

Slide 45: Sum of Squares of Differences
- SST = differences between each measurement and the overall mean
- SSA = variation due to the effects of the alternatives
- SSE = variation due to errors in the measurements
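A sketch of this decomposition on a small hypothetical data set (each inner list holds the n measurements of one alternative, i.e. one column of the measurement table):

```python
import statistics

def anova_sums(columns):
    """columns[j] is the list of n measurements of alternative j.
    Returns (SSA, SSE, SST); SST equals SSA + SSE."""
    k, n = len(columns), len(columns[0])
    col_means = [statistics.mean(col) for col in columns]
    overall = statistics.mean([y for col in columns for y in col])
    ssa = n * sum((m - overall) ** 2 for m in col_means)            # across columns
    sse = sum((y - m) ** 2
              for col, m in zip(columns, col_means) for y in col)   # within columns
    sst = sum((y - overall) ** 2 for col in columns for y in col)
    return ssa, sse, sst

# Hypothetical measurements of k = 3 alternatives, n = 3 each
ssa, sse, sst = anova_sums([[10, 12, 11], [14, 15, 16], [9, 8, 10]])
```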

Slide 46: ANOVA – Fundamental Idea
Separates the variation in the measured values into:
1. Variation due to the effects of the alternatives: SSA, the variation across columns
2. Variation due to errors: SSE, the variation within a single column
If the differences among the alternatives are due to real differences, SSA should be statistically greater than SSE.

Slide 47: Comparing SSE and SSA
Simple approach:
- SSA / SST = fraction of total variation explained by differences among the alternatives
- SSE / SST = fraction of total variation due to experimental error
But is it statistically significant?

Slide 48: Statistically Comparing SSE and SSA

Slide 49: Degrees of Freedom
- df(SSA) = k − 1, since there are k alternatives
- df(SSE) = k(n − 1), since there are k alternatives, each with (n − 1) degrees of freedom
- df(SST) = df(SSA) + df(SSE) = kn − 1

Slide 50: Degrees of Freedom for Effects
(Same measurement table as Slide 31; the k entries of the effects row give k − 1 degrees of freedom.)

Slide 51: Degrees of Freedom for Errors
(Same measurement table as Slide 31; the n entries within each column give n − 1 degrees of freedom per column.)

Slide 52: Degrees of Freedom for Errors
(Same measurement table as Slide 31, repeated.)

Slide 53: Variances from Sum of Squares (Mean Square Value)
(Equation slide: s_a² = SSA / (k − 1), s_e² = SSE / (k(n − 1)).)

Slide 54: Comparing Variances
Use the F-test to compare the ratio of the variances: F = s_a² / s_e².

Slide 55: F-test
If F_computed > F_table → we have (1 − α) × 100% confidence that the variation due to actual differences in the alternatives, SSA, is statistically greater than the variation due to errors, SSE.
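The F comparison then reduces to a ratio of the two mean squares. The SSA/SSE numbers below are hypothetical, and the tabulated critical value would be looked up in an F table for the chosen α:

```python
def f_statistic(ssa, sse, k, n):
    """F = s_a^2 / s_e^2, the ratio of the two mean squares."""
    s_a2 = ssa / (k - 1)          # mean square across alternatives
    s_e2 = sse / (k * (n - 1))    # mean square error
    return s_a2 / s_e2

# Hypothetical sums of squares from a k = 3, n = 3 experiment:
F = f_statistic(ssa=56, sse=6, k=3, n=3)   # 28.0
# Compare F against F[1-a; k-1, k(n-1)] = F[0.95; 2, 6] from an F table;
# F_computed > F_table -> the differences are significant at 95% confidence.
```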

Slide 56: ANOVA Summary

Variation      | Sum of squares | Deg. of freedom | Mean square            | F
Alternatives   | SSA            | k − 1           | s_a² = SSA/(k − 1)     | s_a²/s_e²
Error          | SSE            | k(n − 1)        | s_e² = SSE/(k(n − 1))  |
Total          | SST            | kn − 1          |                        |

Slide 57: ANOVA Example
(Example table of measurements for three alternatives, with column means, effects, and the overall mean; the numeric values were not captured in the transcript.)

Slide 58: ANOVA Example

Slide 59: Conclusions from Example
- SSA/SST = 0.917 → 91.7% of the total variation in the measurements is due to differences among the alternatives
- SSE/SST = 0.083 → 8.3% of the total variation in the measurements is due to noise in the measurements
- Computed F statistic > tabulated F statistic → 95% confidence that the differences among the alternatives are statistically significant

Slide 60: Contrasts
- ANOVA tells us that there is a statistically significant difference among the alternatives, but it does not tell us where the difference lies
- Use the method of contrasts to compare subsets of the alternatives: A vs. B, {A, B} vs. {C}, etc.

Slide 61: Contrasts
A contrast is a linear combination of the effects of the alternatives: c = Σⱼ wⱼ αⱼ, with weights chosen so that Σⱼ wⱼ = 0.

Slide 62: Contrasts
E.g., to compare the effect of system 1 to the effect of system 2: w₁ = 1, w₂ = −1, all other wⱼ = 0, so c = α₁ − α₂.

Slide 63: Construct Confidence Interval for Contrasts
Need:
- An estimate of the variance
- The appropriate value from the t table
Compute the confidence interval as before. If the interval includes 0, then no statistically significant difference exists between the alternatives included in the contrast.

Slide 64: Variance of Random Variables
Recall that, for independent random variables X₁ and X₂: Var(X₁ + X₂) = Var(X₁) + Var(X₂).

Slide 65: Variance of a Contrast c
(Equation slide: s_c² = (Σⱼ wⱼ²) · s_e² / (kn).)
Assumes the variation due to errors is equally distributed among the kn total measurements.

Slide 66: Confidence Interval for Contrasts
(Equation slide: c₁,₂ = c ∓ t[a/2; k(n−1)] · s_c.)
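Putting the contrast formulas together in one sketch (the effects, error variance s_e², and t value below are hypothetical placeholders, not the slides' example):

```python
import math

def contrast_ci(effects, weights, s_e2, k, n, t_crit):
    """CI for the contrast c = sum_j w_j * alpha_j, with variance
    s_c^2 = (sum_j w_j^2) * s_e^2 / (k*n); t_crit uses k(n-1) df."""
    c = sum(w * a for w, a in zip(weights, effects))
    s_c = math.sqrt(sum(w * w for w in weights) * s_e2 / (k * n))
    return (c - t_crit * s_c, c + t_crit * s_c)

# Hypothetical: compare system 1 to system 2, so w = [1, -1, 0]
lo, hi = contrast_ci(effects=[2.0, -1.0, -1.0], weights=[1, -1, 0],
                     s_e2=1.2, k=3, n=5, t_crit=1.782)
# For these numbers the interval excludes 0, so systems 1 and 2
# would differ significantly.
```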

Slide 67: Example
90% confidence interval for the contrast [Sys1 − Sys2]

Slide 68: Important Points
Use one-factor ANOVA to separate the total variation into:
- Variation within one system, due to random errors
- Variation between systems, due to real differences (plus random error)
Is the variation due to real differences statistically greater than the variation due to errors?

Slide 69: Important Points
Use contrasts to compare the effects of subsets of the alternatives.