Errors in Experimental Measurements. Copyright 2004 David J. Lilja.


Slide 1: Errors in Experimental Measurements. Sources of errors. Accuracy, precision, resolution. A mathematical model of errors. Confidence intervals: for means, for proportions. How many measurements are needed for a desired error?

Slide 2: Why do we need statistics? 1. Noise, noise, noise, noise, noise! (OK, not really this type of noise.)

Slide 3: Why do we need statistics? 2. Aggregate data into meaningful information.

Slide 4: What is a statistic? “A quantity that is computed from a sample [of data].” (Merriam-Webster) → A single number used to summarize a larger collection of values.

Slide 5: What are statistics? “A branch of mathematics dealing with the collection, analysis, interpretation, and presentation of masses of numerical data.” (Merriam-Webster) → We are most interested in analysis and interpretation here. “Lies, damn lies, and statistics!”

Slide 6: Goals. Provide an intuitive conceptual background for some standard statistical tools. Draw meaningful conclusions in the presence of noisy measurements. Allow you to correctly and intelligently apply techniques in new situations. → Don’t simply plug and crank from a formula.

Slide 7: Goals. Present techniques for aggregating large quantities of data. Obtain a big-picture view of your results. Obtain new insights from complex measurement and simulation results. → E.g., how does a new feature impact the overall system?

Slide 8: Sources of Experimental Errors. Accuracy, precision, resolution.

Slide 9: Experimental errors. Errors → noise in measured values. Systematic errors: the result of an experimental “mistake”; typically produce a constant or slowly varying bias; controlled through the skill of the experimenter. Examples: a temperature change causes clock drift; forgetting to clear the cache before a timing run.

Slide 10: Experimental errors. Random errors: unpredictable and non-deterministic; unbiased → equal probability of increasing or decreasing the measured value. They result from limitations of the measuring tool, the observer reading the output of the tool, and random processes within the system. They typically cannot be controlled; use statistical tools to characterize and quantify them.

Slide 11: Example: Quantization → Random error.

Slide 12: Quantization error. Timer resolution → quantization error. Repeated measurements give X ± Δ. Completely unpredictable.
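
A minimal sketch of how timer resolution alone can produce random error. It assumes a timer that counts only whole ticks and an event that starts at a uniformly random point within a tick. The function name `measure_ticks`, the 3.7-tick duration, and the seed are illustrative assumptions, not taken from the slides.

```python
import math
import random

def measure_ticks(duration, rng):
    """Measure an event lasting `duration` ticks with a whole-tick timer.

    The start phase within a tick is random (an assumption), so the number
    of tick boundaries crossed is either floor(duration) or ceil(duration).
    """
    start = rng.random()              # random start phase in [0, 1) ticks
    end = start + duration
    return math.floor(end) - math.floor(start)

rng = random.Random(1)                # fixed seed so the run is repeatable
true_duration = 3.7                   # hypothetical true duration, in ticks
samples = [measure_ticks(true_duration, rng) for _ in range(10_000)]
mean = sum(samples) / len(samples)
print(sorted(set(samples)), round(mean, 2))
```

Every reading is off by less than one tick, and which way it is off cannot be predicted for a single run, yet the average over many runs approaches the true duration.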

Slide 13: A Model of Errors
Error   Measured value   Probability
-E      x - E            ½
+E      x + E            ½

Slide 14: A Model of Errors
Error 1   Error 2   Measured value   Probability
-E        -E        x - 2E           ¼
-E        +E        x                ¼
+E        -E        x                ¼
+E        +E        x + 2E           ¼

Slide 15: A Model of Errors

Slide 16: Probability of Obtaining a Specific Measured Value

Slide 17: A Model of Errors. Pr(X = x_i) = Pr(measure x_i) ∝ number of paths from the real value to x_i. Pr(X = x_i) ~ binomial distribution. As the number of error sources becomes large (n → ∞), Binomial → Gaussian (Normal). Thus, the bell curve.
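
The path-counting argument can be checked by brute force. The sketch below assumes, as in the error model above, that each error source independently adds +E or -E with probability ½; the function name `error_distribution` is hypothetical.

```python
from collections import Counter
from itertools import product
from math import comb

def error_distribution(n_sources, E=1.0):
    """Enumerate every combination of n independent +E / -E errors.

    Returns {total_error: probability}, where each probability is the
    number of paths reaching that total divided by the 2**n equally
    likely paths.
    """
    paths = Counter(sum(signs) * E for signs in product((-1, +1), repeat=n_sources))
    total = 2 ** n_sources
    return {err: count / total for err, count in sorted(paths.items())}

two = error_distribution(2)
print(two)

# With n sources, the number of paths with k positive errors is C(n, k),
# so the probabilities are binomial.
n = 10
dist = error_distribution(n)
for k in range(n + 1):
    assert dist[(2 * k - n) * 1.0] == comb(n, k) / 2 ** n
```

For two sources this reproduces the ¼, ½, ¼ split over x - 2E, x, and x + 2E; plotting `dist` for larger n shows the bell shape emerging.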

Slide 18: Frequency of Measuring Specific Values. [Figure labels: mean of measured values, true value, resolution, precision, accuracy.]

Slide 19: Accuracy, Precision, Resolution. Systematic errors → accuracy: how close the mean of the measured values is to the true value. Random errors → precision: repeatability of measurements. Characteristics of tools → resolution: smallest increment between measured values.

Slide 20: Quantifying Accuracy, Precision, Resolution. Accuracy: hard to determine true accuracy; relative to a predefined standard, e.g. the definition of a “second”. Resolution: dependent on tools. Precision: quantify the amount of imprecision using statistical tools.

Slide 21: Confidence Interval for the Mean. [Figure labels: c1, c2, 1 - α, α/2.]

Slide 22: Normalize x

Slide 23: Confidence Interval for the Mean. The normalized z follows a Student’s t distribution with (n - 1) degrees of freedom. Area left of c2 = 1 - α/2. Tabulated values for t. [Figure labels: c1, c2, 1 - α, α/2.]

Slide 24: Confidence Interval for the Mean. As n → ∞, the normalized distribution becomes Gaussian (normal). [Figure labels: c1, c2, 1 - α, α/2.]

Slide 25: Confidence Interval for the Mean
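
For reference, a sketch of the standard t-based interval that the surrounding slides describe. The notation is an assumption (the usual textbook symbols, not confirmed by the transcript): x̄ is the sample mean, s the sample standard deviation, x the true mean, and t the tabulated Student's t value with n - 1 degrees of freedom.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}, \qquad
z = \frac{\bar{x} - x}{s/\sqrt{n}}

c_1 = \bar{x} - t_{1-\alpha/2;\,n-1}\,\frac{s}{\sqrt{n}}, \qquad
c_2 = \bar{x} + t_{1-\alpha/2;\,n-1}\,\frac{s}{\sqrt{n}}
```

The interval is symmetric about the sample mean, and its half-width shrinks as 1/√n.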

Slide 26: An Example
Experiment   Measured value
1            8.0 s
2            7.0 s
3            5.0 s
4            9.0 s
5            9.5 s
6            (missing) s
7            5.2 s
8            8.5 s

Slide 27: An Example (cont.)

Slide 28: An Example (cont.). 90% CI → 90% chance the actual value is in the interval. 90% CI → α = 0.10, 1 - α/2 = 0.95. n = 8 → 7 degrees of freedom. [Figure labels: c1, c2, 1 - α, α/2.]

Slide 29: 90% Confidence Interval. [Table of t values indexed by a and n, up to n = ∞; entries not recoverable.]

Slide 30: 95% Confidence Interval. [Table of t values indexed by a and n, up to n = ∞; entries not recoverable.]

Slide 31: What does it mean? 90% CI = [6.5, 9.4]: 90% chance the real value is between 6.5 and 9.4. 95% CI = [6.1, 9.7]: 95% chance the real value is between 6.1 and 9.7. Why is the interval wider when we are more confident?
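
A sketch of how these two intervals can be computed from the eight measurements. Two things are assumptions rather than facts from the transcript: the sixth measurement is missing, and 11.3 s is used because it reproduces the stated mean (7.94 s) and standard deviation (2.14 s); and the t values are hard-coded from a standard t table for 7 degrees of freedom. The function name is hypothetical.

```python
import math
import statistics

# Measured times in seconds. The sixth value (11.3) is an ASSUMPTION: it is
# missing from the transcript and was chosen to match the stated mean and
# standard deviation.
times = [8.0, 7.0, 5.0, 9.0, 9.5, 11.3, 5.2, 8.5]

# Student's t critical values t(1 - alpha/2; 7 d.o.f.) from a standard table,
# keyed by confidence level.
T_7 = {0.90: 1.895, 0.95: 2.365}

def mean_confidence_interval(samples, confidence):
    """Return (c1, c2) for the mean of 8 samples using the t distribution."""
    n = len(samples)
    if n != 8:
        raise ValueError("T_7 only holds t values for n = 8 samples")
    mean = statistics.mean(samples)
    s = statistics.stdev(samples)            # sample std dev (n - 1 divisor)
    half_width = T_7[confidence] * s / math.sqrt(n)
    return mean - half_width, mean + half_width

ci90 = mean_confidence_interval(times, 0.90)
ci95 = mean_confidence_interval(times, 0.95)
print([round(c, 1) for c in ci90], [round(c, 1) for c in ci95])
```

Rounded to one decimal place this gives [6.5, 9.4] and [6.1, 9.7]. The 95% interval is wider because a larger t value is needed to capture more of the distribution.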

Slide 32: Higher Confidence → Wider Interval? [Figure]

Slide 33: Key Assumption. Measurement errors are Normally distributed. Is this true for most measurements on real computer systems? [Figure labels: c1, c2, 1 - α, α/2.]

Slide 34: Key Assumption. Saved by the Central Limit Theorem: the sum of a “large number” of values from any distribution will be approximately Normally (Gaussian) distributed. What is a “large number”? Typically assumed to be ≳ 6 or 7.
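
A small simulation sketch of the Central Limit Theorem claim. The exponential distribution, the choice of 30 values per sum, the sample counts, and the seed are all illustrative assumptions. The check used is the fraction of values within one standard deviation of the mean, which is about 68% for a Normal distribution.

```python
import random
import statistics

rng = random.Random(42)                      # fixed seed for repeatability

def sample_sum(k):
    """Sum of k draws from a deliberately non-Normal (exponential) distribution."""
    return sum(rng.expovariate(1.0) for _ in range(k))

def fraction_within_one_sigma(values):
    """Fraction of values within one standard deviation of their mean."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    return sum(abs(v - mu) <= sigma for v in values) / len(values)

single = [sample_sum(1) for _ in range(20_000)]    # raw exponential values
summed = [sample_sum(30) for _ in range(20_000)]   # sums of 30 values each

frac_single = fraction_within_one_sigma(single)
frac_summed = fraction_within_one_sigma(summed)
print(round(frac_single, 3), round(frac_summed, 3))
```

The raw exponential values put noticeably more than 68% of their mass within one standard deviation, while the sums land close to the Normal figure.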

Slide 35: How many measurements? The width of the interval is inversely proportional to √n. Want to minimize the number of measurements. Find a confidence interval for the mean such that Pr(actual mean in interval) = (1 - α).

Slide 36: How many measurements?

Slide 37: How many measurements? But n depends on knowing the mean and standard deviation! Estimate s with a small number of measurements. Use this s to find the n needed for the desired interval width.

Slide 38: How many measurements? Mean = 7.94 s. Standard deviation = 2.14 s. Want 90% confidence that the mean is within a 7% interval around the actual mean.

Slide 39: How many measurements? Mean = 7.94 s. Standard deviation = 2.14 s. Want 90% confidence that the mean is within a 7% interval around the actual mean. α = 0.10, (1 - α/2) = 0.95. Error = ± 3.5%, e = 0.035.

Slide 40: How many measurements? 213 measurements → 90% chance the true mean is within the ± 3.5% interval.
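
A sketch of the arithmetic behind the 213 figure. It assumes the half-width of the interval is set equal to e times the mean and solved for n, and that the critical value is the tabulated t value 1.895 (1 - α/2 = 0.95, 7 degrees of freedom). The transcript does not state which critical value was used, so that choice is an assumption; the function name is hypothetical.

```python
import math

def measurements_needed(mean, stdev, half_width_fraction, critical_value):
    """Smallest n giving an interval of mean ± half_width_fraction * mean.

    Obtained by setting critical_value * stdev / sqrt(n) equal to
    half_width_fraction * mean and solving for n.
    """
    n = (critical_value * stdev / (half_width_fraction * mean)) ** 2
    return math.ceil(n)

# Mean, standard deviation, and e come from the slides; the critical value
# 1.895 is an assumed t-table entry (0.95 quantile, 7 degrees of freedom).
n = measurements_needed(mean=7.94, stdev=2.14, half_width_fraction=0.035,
                        critical_value=1.895)
print(n)
```

Because n grows with the square of s/(e·mean), halving the allowed error roughly quadruples the number of measurements.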

Slide 41: Proportions. p = Pr(success) in n trials of a binomial experiment. Estimate the proportion: p = m/n, where m = number of successes and n = total number of trials.

Slide 42: Proportions

Slide 43: Proportions. How much time does the processor spend in the OS? Interrupt every 10 ms and increment counters: n = number of interrupts, m = number of interrupts when the PC is within the OS.

Slide 44: Proportions. How much time does the processor spend in the OS? Interrupt every 10 ms and increment counters: n = number of interrupts, m = number of interrupts when the PC is within the OS. Run for 1 minute: n = 6000, m = 658.

Slide 45: Proportions. 95% confidence interval for the proportion. So we are 95% certain the processor spends […]% of its time in the OS.
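
A sketch of one common way to compute such an interval from n = 6000 and m = 658: the normal approximation to the binomial. That this is the formula the slide used is an assumption, and the function name is hypothetical.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(m, n, confidence):
    """Normal-approximation interval for a binomial proportion p = m / n."""
    p = m / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96 for 95%
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

c1, c2 = proportion_confidence_interval(m=658, n=6000, confidence=0.95)
print(f"{100 * c1:.1f}% to {100 * c2:.1f}%")
```

Under these assumptions the interval comes out at roughly 10.2% to 11.8%.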

Slide 46: Number of measurements for proportions

Slide 47: Number of measurements for proportions. How long to run the OS experiment? Want 95% confidence ± 0.5%.

Slide 48: Number of measurements for proportions. How long to run the OS experiment? Want 95% confidence ± 0.5%. e = […], p = […]

Slide 49: Number of measurements for proportions. 10 ms interrupts → 3.46 hours.
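
A sketch of how a run time of about this length can be derived. It assumes that ± 0.5% is a relative error on p (e = 0.005), that p is estimated as 658/6000 from the one-minute run, and that the normal approximation applies; none of these is stated in the transcript. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def trials_needed(p, relative_error, confidence):
    """Smallest n giving an interval of p ± relative_error * p.

    Obtained by setting z * sqrt(p * (1 - p) / n) equal to
    relative_error * p and solving for n (normal approximation).
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / (relative_error * p) ** 2)

p = 658 / 6000                      # assumed estimate from the 1-minute run
n = trials_needed(p, relative_error=0.005, confidence=0.95)
hours = n * 0.010 / 3600            # one sample per 10 ms interrupt
print(n, round(hours, 2))
```

This gives about 1.25 million samples, or roughly three and a half hours at one interrupt every 10 ms. The exact second decimal depends on how p and z are rounded.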

Slide 50: Important Points. Use statistics to deal with noisy measurements and to aggregate large amounts of data. Errors in measurements are due to the accuracy, precision, and resolution of tools, and to other sources of noise → systematic and random errors.

Slide 51: Important Points: Model errors with bell curve. [Figure labels: true value, precision, mean of measured values, resolution, accuracy.]

Slide 52: Important Points. Use confidence intervals to quantify precision. Confidence intervals for the mean of n samples and for proportions. Confidence level: Pr(actual mean within computed interval). Compute the number of measurements needed for a desired interval width.