Handling Data and Figures of Merit

Presentation transcript:

Handling Data and Figures of Merit. Data comes in different formats: time series, histograms, lists. But all of them can contain the same information about quality. What is meant by quality (figures of merit)? Precision, separation (selectivity), limits of detection, linear range.

My weight, plotted as a function of the time the data was acquired:

Do not use curved lines to connect data points; that assumes you know more about the relationship of the data than you really do. Comments: the background is white (less ink); the font size is larger than the Excel default (use 14 or 16).

A bin is the range of weights that are grouped together. For example, in a grade curve, the number of students who scored between 95 and 100 points would be one bin.

Assume my weight is a single, random set of similar data. Make a frequency chart (histogram) of the data. Create a “model” of my weight and determine the average weight and how consistent my weight is.
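As a rough illustration of the steps on this slide, the sketch below bins a set of weights into a histogram and computes the average and standard deviation. The weights, the bin width, and the helper name `histogram` are made up for the example and are not the data from the slides.

```python
from collections import Counter
from statistics import mean, stdev


def histogram(values, bin_width, start):
    """Count how many values fall in each bin [start + k*w, start + (k+1)*w)."""
    counts = Counter(int((v - start) // bin_width) for v in values)
    # Key each bin by its lower edge.
    return {start + k * bin_width: counts[k] for k in sorted(counts)}


# Hypothetical weights in lbs (illustrative only).
weights = [150.2, 151.0, 149.5, 150.8, 152.1, 150.0, 149.1, 151.4]

bins = histogram(weights, bin_width=1.0, start=149.0)
avg = mean(weights)   # peak location of the model
s = stdev(weights)    # sample standard deviation (n - 1 in the denominator)

for lower_edge, count in bins.items():
    print(f"{lower_edge:.0f}-{lower_edge + 1:.0f} lbs: {'#' * count}")
print(f"average = {avg:.2f} lbs, s = {s:.2f} lbs")
```

The average and s are the two numbers that define the normal "model" of the data; the histogram is only a way to look at it.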

s = standard deviation = a measure of the consistency, or similarity, of the weights. Here s = 1.4 lbs. On the curve, the peak is at the average and s is measured at the inflection point.

Characteristics of the model population (random, normal): peak height, A; peak location (mean or average), μ; peak width at baseline, W; peak width at half height, W1/2. Related concepts: the standard deviation, s, estimates the variation in an infinite population, σ.

Width is measured at the inflection point (= s) or at half height (W1/2). Triangulated peak: base width is 2s < W < 4s.

Area within +/- 1s = 68.3%. Area within +/- 2s = 95.4%. Area within +/- 3s = 99.7%. pp = peak to peak, or the largest separation of measurements. Peak to peak is sometimes easier to “see” on the data vs. time plot.
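The areas quoted for a normal population can be checked with the error function. This is a small sketch using only the Python standard library; the function name `area_within` is chosen for the example.

```python
import math


def area_within(k):
    """Fraction of a normal population lying within +/- k standard deviations."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"+/- {k}s: {100 * area_within(k):.1f}%")
```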

Estimating s from peak to peak: s ≈ pp/6 ≈ 0.9 (calculated s = 1.4).
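A minimal sketch of the peak-to-peak shortcut next to the calculated standard deviation. The weights are hypothetical, and the helper name is made up for the example. With only a handful of points the range rarely spans a full 6s, so pp/6 tends to come out smaller than the calculated s, as it does on the slide.

```python
from statistics import stdev


def s_from_peak_to_peak(values):
    """Rough estimate of s: the full spread of the data is taken as about 6s."""
    return (max(values) - min(values)) / 6


# Hypothetical weights in lbs (illustrative only).
weights = [150.2, 151.0, 149.5, 150.8, 152.1, 150.0, 149.1, 151.4]

print(f"pp/6 estimate: {s_from_peak_to_peak(weights):.2f} lbs")
print(f"calculated s:  {stdev(weights):.2f} lbs")
```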

There are some other important characteristics of a normal (random) population. Scale up the first derivative and second derivative to see them better. Curves shown: 1st derivative, 2nd derivative.

Population (0th derivative). 1st derivative: its peak is at the inflection point of the population curve, which determines the std. dev. 2nd derivative: its peak is at the inflection point of the first derivative; it should be symmetrical for a normal population and goes to zero at the std. dev.
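The derivative relationships can be checked numerically. The sketch below uses the analytical first and second derivatives of a normal curve; the mean of 150 and s of 1.4 are illustrative values, and the function names are chosen for the example.

```python
import math


def gauss(x, mu, s):
    """Normal error curve (0th derivative)."""
    return math.exp(-((x - mu) ** 2) / (2 * s ** 2)) / (s * math.sqrt(2 * math.pi))


def d1(x, mu, s):
    """First derivative of the normal curve."""
    return -(x - mu) / s ** 2 * gauss(x, mu, s)


def d2(x, mu, s):
    """Second derivative of the normal curve."""
    return ((x - mu) ** 2 / s ** 4 - 1 / s ** 2) * gauss(x, mu, s)


mu, s = 150.0, 1.4  # illustrative values

# The second derivative crosses zero one standard deviation from the mean,
# which is where the first derivative reaches its largest magnitude.
print(f"d2 at mu + s: {d2(mu + s, mu, s):.2e}")
print(f"|d1| at mu + 0.5s, mu + s, mu + 1.5s: "
      f"{abs(d1(mu + 0.5 * s, mu, s)):.4f}, "
      f"{abs(d1(mu + s, mu, s)):.4f}, "
      f"{abs(d1(mu + 1.5 * s, mu, s)):.4f}")
```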

Asymmetry can be determined from principal component analysis. A.F. (≠ Alanah Fitch) = asymmetry factor.

Is there a difference between my “baseline” weight and school weight? Can you “detect” a difference? Can you “quantitate” a difference? Comparing TWO populations of measurements

Exactly the same information displayed differently, but now we divide the data into different measurement populations: baseline and school. Model of the data as two normal populations.

Figure labels: average baseline weight; average school weight; standard deviation of the baseline weight; standard deviation of the school weight.

We have two models to describe the population of measurements of my weight. In one we assume that all measurements fall into a single population. In the second we assume that the measurements have sampled two different populations. Which is the better model? How do we quantify “better”?

Did I gain weight? Compare how closely the measured data fit the model. The red bars represent the difference between the two-population model and the data. The purple lines represent the difference between the single-population model and the data. Which model has the smaller summed differences?

This process (summing the squares of the differences) is essentially what occurs in an ANOVA (analysis of variance). Normally you sum the squares of the differences in order to account for both positive and negative differences. In the bad old days you had to work out all the sums of squares. In the good new days you can ask Excel to do it for you.

Test: is F < F critical? If true, the hypothesis of a single population is retained. If false, the hypothesis is rejected: the data cannot be explained by a single population at the 5% significance level.
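For readers without Excel at hand, the F test can be sketched in a few lines. This is a standard one-way ANOVA (between-group versus within-group sums of squares), which may not match the exact spreadsheet layout used in the slides. The two weight lists are hypothetical, and the critical value 5.32 is the tabulated F for 1 and 8 degrees of freedom at the 5% level.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of measurements."""
    all_values = [v for g in groups for v in g]
    grand = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of each measurement around its own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical weights in lbs (illustrative only).
baseline = [150.0, 151.0, 149.0, 150.5, 149.5]
school = [153.0, 154.0, 152.5, 153.5, 152.0]

f, df_b, df_w = one_way_anova_f([baseline, school])
F_CRITICAL = 5.32  # tabulated F(0.05; 1, 8); look up the value for your own df
single_population = f < F_CRITICAL
print(f"F = {f:.1f} with df = ({df_b}, {df_w}); single population: {single_population}")
```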

In an analysis of variance you test the hypothesis that the sample is best described as a single population.
1. Create the expected frequency (Gaussian from the normal error curve).
2. Measure the deviation between each histogram point and the expected frequency.
3. Square to remove signs.
4. SS = sum of squares.
5. Compare to the expected SS, which scales with population size.
6. If larger than expected, the deviations cannot be explained assuming a single population.
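Steps 1 to 4 above can be sketched as follows. The code builds the expected frequencies from a normal curve, once assuming a single population and once assuming two, and sums the squared deviations from the histogram. The data, the one-pound bins, and the function names are assumptions made for the example. Step 5, the comparison against an expected SS, is not implemented here.

```python
import math
from statistics import mean, stdev


def normal_pdf(x, mu, s):
    return math.exp(-((x - mu) ** 2) / (2 * s ** 2)) / (s * math.sqrt(2 * math.pi))


def expected_counts(centers, bin_width, groups):
    """Expected frequency per bin when each group is modelled as its own normal population."""
    return [
        sum(len(g) * bin_width * normal_pdf(c, mean(g), stdev(g)) for g in groups)
        for c in centers
    ]


def sum_squares(observed, expected):
    """Square each deviation to remove signs, then add them up."""
    return sum((o - e) ** 2 for o, e in zip(observed, expected))


# Hypothetical weights in lbs, rounded to whole pounds (illustrative only).
baseline = [149, 150, 150, 150, 151]
school = [154, 155, 155, 155, 156]
everything = baseline + school

centers = list(range(149, 157))  # one-pound bins centred on whole pounds
observed = [everything.count(c) for c in centers]

ss_single = sum_squares(observed, expected_counts(centers, 1, [everything]))
ss_two = sum_squares(observed, expected_counts(centers, 1, [baseline, school]))
print(f"SS single population: {ss_single:.2f}")
print(f"SS two populations:   {ss_two:.2f}")
```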

The sum of the squared differences for the assumption of a single population is larger than for the assumption of two individual populations.

There are other measurements which describe the two populations. Resolution of two peaks, R, is calculated from the mean (or average) of each peak and the baseline widths.

Peaks at x_a and x_b: in this example, peaks are baseline resolved when R > 1.

Peaks at x_a and x_b: in this example, peaks are just baseline resolved when R = 1.

Peaks at x_a and x_b: in this example, peaks are not baseline resolved when R < 1.
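The transcript does not preserve the resolution formula itself. Assuming the usual chromatographic definition, R is the separation of the two means divided by the average baseline width. The sketch below also assumes a baseline width of about 4s per peak, and the means and s values are illustrative only.

```python
def resolution(x_a, x_b, w_a, w_b):
    """Assumed definition: separation of the means over the average baseline width."""
    return abs(x_b - x_a) / ((w_a + w_b) / 2)


# Illustrative values: two peaks with s = 1.0 lb each, so W is taken as 4s = 4.0 lbs.
r = resolution(x_a=150.0, x_b=154.0, w_a=4.0, w_b=4.0)
print(f"R = {r:.2f}")
```

With these numbers the peaks sit exactly one baseline width apart, the case described above as just baseline resolved.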

2008 Data What is the R for this data?

Visually less resolved vs. visually better resolved. Anonymous 2009 student analysis of Needleman data.

Other measures of the quality of separation of the peaks: 1. Limit of detection 2. Limit of quantification 3. Signal to noise (S/N)

x_blank and x_limit of detection: 99.74% of the observations of the blank will lie below the mean of the first detectable signal (LOD).
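A minimal sketch of a signal limit of detection. It assumes the common convention of placing the LOD three standard deviations above the blank mean, which is what the 99.74% figure suggests but is not spelled out in the transcript. The blank readings and the function name are hypothetical.

```python
from statistics import mean, stdev


def signal_lod(blank, k=3):
    """Signal LOD: blank mean plus k standard deviations of the blank (k = 3 assumed)."""
    return mean(blank) + k * stdev(blank)


# Hypothetical blank (baseline) weights in lbs.
baseline = [150.0, 151.0, 149.0, 150.5, 149.5]
print(f"signal LOD = {signal_lod(baseline):.2f} lbs")
```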

Two peaks are visible when all the data is summed together

Estimate the LOD (signal) of this data

Other measures of the quality of separation of the peaks: 1. Limit of detection 2. Limit of quantification 3. Signal to noise (S/N)

Limit of quantification requires absolute certainty that no blank is part of the measurement. Your book suggests 10 standard deviations of the blank.
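The same idea with the factor of 10, sketched under the assumption that the limit of quantification is placed ten blank standard deviations above the blank mean. The blank readings and the function name are hypothetical.

```python
from statistics import mean, stdev


def signal_loq(blank, k=10):
    """Signal LOQ: blank mean plus k standard deviations of the blank (k = 10 assumed)."""
    return mean(blank) + k * stdev(blank)


# Hypothetical blank (baseline) weights in lbs.
baseline = [150.0, 151.0, 149.0, 150.5, 149.5]
print(f"signal LOQ = {signal_loq(baseline):.2f} lbs")
```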

Estimate the LOQ (signal) of this data

Other measures of the quality of separation of the peaks: 1. Limit of detection 2. Limit of quantification 3. Signal to noise (S/N). Signal = x_sample - x_blank. Noise = N = standard deviation, s.
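The signal-to-noise definition on this slide translates directly into code. The sketch takes the noise from the blank population; both data sets and the function name are hypothetical.

```python
from statistics import mean, stdev


def signal_to_noise(sample, blank):
    """S/N with signal = mean(sample) - mean(blank) and noise = s of the blank."""
    return (mean(sample) - mean(blank)) / stdev(blank)


# Hypothetical weights in lbs (illustrative only).
baseline = [150.0, 151.0, 149.0, 150.5, 149.5]
school = [153.0, 154.0, 152.5, 153.5, 152.0]
print(f"S/N = {signal_to_noise(school, baseline):.1f}")
```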

Estimate the S/N of this data (this assumes pp_school ≈ pp_baseline).

Can you “tell” where the switch between red and white potatoes begins? What is the signal (length of white)? What is the background (length of red)? What is the S/N?

Effect of sample size on the measurement

Error curve: peak height grows with the number of measurements. The range +/- 1s always contains the same proportion of the total number of measurements. However, the uncertainty in the mean (the standard error, s/√n) decreases as the number of measurements grows.
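One way to see the effect of sample size is to draw simulated samples of increasing size from the same normal population. In this sketch the sample standard deviation stays near the population value while the standard error of the mean, s/√n, shrinks. The population mean of 150 and s of 1.4 are illustrative, and the seed is fixed only to make the run repeatable.

```python
import math
import random
from statistics import mean, stdev


def standard_error(values):
    """Standard error of the mean: s divided by the square root of n."""
    return stdev(values) / math.sqrt(len(values))


random.seed(0)  # fixed seed so the simulated samples are repeatable
for n in (10, 100, 1000):
    sample = [random.gauss(150.0, 1.4) for _ in range(n)]
    print(f"n = {n:4d}: mean = {mean(sample):.2f}, "
          f"s = {stdev(sample):.2f}, s/sqrt(n) = {standard_error(sample):.3f}")
```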

2008 Data

Calibration Curve. A calibration curve is based on a selected measurement that responds linearly to the concentration of the analyte. Or… it is a prediction of the measurement due to some change. Can we predict my weight change if I had spent a longer time on vacation?

The calibration curve contains information about the sampling of the population. (Figure label: 5 days.)

You can get this by using a “trend line”.

This is just a trendline from “format” data. Using the data analysis pack, you get an error associated with the intercept.
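What the regression tool reports can be reproduced by hand. The sketch below is an ordinary least-squares fit that returns the slope, the intercept, and the standard error of each. The days and weights are hypothetical, and `linear_fit` is a name chosen for the example rather than a spreadsheet function.

```python
import math


def linear_fit(x, y):
    """Ordinary least squares for y = intercept + slope * x, with standard errors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid_ss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s_yx = math.sqrt(resid_ss / (n - 2))  # standard error of the regression
    return {
        "slope": slope,
        "intercept": intercept,
        "se_slope": s_yx / math.sqrt(sxx),
        "se_intercept": s_yx * math.sqrt(1 / n + mx ** 2 / sxx),
    }


# Hypothetical calibration: days on vacation vs. weight in lbs.
days = [0, 1, 2, 3, 4, 5]
weight = [150.1, 150.9, 151.4, 152.2, 152.6, 153.5]
fit = linear_fit(days, weight)
print(f"slope = {fit['slope']:.3f} +/- {fit['se_slope']:.3f} lbs/day")
print(f"intercept = {fit['intercept']:.2f} +/- {fit['se_intercept']:.2f} lbs")
```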

In the best of all worlds you should have a series of blanks that determine the “noise” associated with the background. Sometimes you forget, so, to fall back and punt, estimate the standard deviation of the “blank” from the linear regression. But remember, in doing this you are acknowledging a failure to plan ahead in your analysis.

Extrapolation of the associated error can be obtained from the linear regression data. Sensitivity = slope. The concentration LOD depends on BOTH the standard deviation of the blank and the sensitivity. !!Note!! Signal LOD ≠ concentration LOD. We want the concentration LOD.
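The difference between the two limits of detection can be sketched as below, assuming the 3s convention. The signal LOD sits three blank standard deviations above the blank, and the concentration LOD is that three-standard-deviation increment divided by the calibration slope. All numbers and function names are hypothetical.

```python
def signal_lod(blank_mean, s_blank, k=3):
    """Signal at the limit of detection (k = 3 assumed)."""
    return blank_mean + k * s_blank


def concentration_lod(s_blank, slope, k=3):
    """Concentration at the limit of detection: k * s_blank divided by the sensitivity."""
    return k * s_blank / slope


# Hypothetical values: the blank statistics could come from a series of blanks
# or, as a fallback, from the intercept and its error in the linear regression.
blank_mean, s_blank, slope = 150.0, 0.5, 2.0

print(f"signal LOD        = {signal_lod(blank_mean, s_blank):.2f} (signal units)")
print(f"concentration LOD = {concentration_lod(s_blank, slope):.2f} (concentration units)")
```

A steeper slope (higher sensitivity) lowers the concentration LOD even when the blank noise is unchanged.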

Selectivity: the difference in slope is one measure of selectivity. In a perfect method the sensing device would have zero slope for the interfering species. Curves shown: Pb2+ and H+.

Limit of linearity (LOL): the point at which the response deviates from the linear calibration by 5%.
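A rough sketch of the 5% criterion: walk up the calibration points and report the first concentration whose signal departs from the fitted line by more than 5% of the predicted value. The slope and intercept are assumed to come from a fit to the low, linear part of the curve. The data and the function name are made up for the example, and other definitions of the limit of linearity are possible.

```python
def limit_of_linearity(conc, signal, slope, intercept, tolerance=0.05):
    """Return the first concentration deviating from the line by more than tolerance, or None."""
    for c, y in zip(conc, signal):
        predicted = intercept + slope * c
        if abs(y - predicted) > tolerance * abs(predicted):
            return c
    return None


# Hypothetical calibration data that rolls off at high concentration.
conc = [1, 2, 4, 8, 16]
signal = [2.0, 4.0, 7.9, 15.0, 26.0]

lol = limit_of_linearity(conc, signal, slope=2.0, intercept=0.0)
print(f"response leaves the 5% band at concentration {lol}")
```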

Summary: Figures of Merit. Thus far: R = resolution; S/N; LOD (both signal and concentration); LOQ; LOL; sensitivity (calibration curve slope); selectivity (essentially the difference in slopes). These can be expressed in terms of signal, but the better expression is in terms of concentration. Tests: ANOVA. Why is the limit of detection important? Why has the limit of detection changed so much in the last 20 years?

The End

Which of these two data sets would be likely to have the better numerical value for the ability to distinguish between two different populations? Needleman’s data.

2008 data. Height for a normalized bell curve is < 1. Which population is more variable? How can you tell?

In this data, increasing the sample size decreases the standard deviation and increases the separation of the populations. Notice that the means also change, and will continue to do so until we have a reasonable sample of the population.