
Presentation transcript:

Statistics: Introduction
1.) All measurements contain random error, so results always have some uncertainty.
2.) Uncertainties are used to determine whether two or more experimental results are equivalent or different. Statistics is used to accomplish this task.
Example (Masuzaki, H., et al. Science (2001), 294(5549), 2166): Is the mutant (transgenic) mouse significantly fatter than the normal (wild-type) mouse? Statistical methods provide unbiased means to answer such questions.

Statistics: Gaussian Curve
1.) For a series of experimental results with only random error:
(i) A large number of experiments done under identical conditions will yield a distribution of results.
(ii) The distribution of results is described by a Gaussian or normal error curve.
[Figure: Gaussian curve of number of occurrences versus value, with a high population near the correct value and a low population far from it.]

Statistics: Gaussian Curve
2.) Any set of data (and its corresponding Gaussian curve) can be characterized by two parameters:
(i) Mean or average value ($\bar{x}$): $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$, where n = number of data points, $x_i$ = value of data point number i, and $\sum_i x_i$ = value 1 + value 2 + value 3 + … + value n.
(ii) Standard deviation (s): $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$. The smaller the standard deviation, the more precise the measurement.

Statistics: Gaussian Curve
3.) Other Terms Used to Describe a Data Set
(i) Variance: related to the standard deviation and used to describe how "wide" or precise a distribution of results is. variance = $s^2$, where s = standard deviation.
(ii) Range: the difference between the highest and lowest values in a set of data.
Example: measurements of 4 light bulb lifetimes: 821, 783, 834, 855 hours. High value = 855 hours; low value = 783 hours; range = high value − low value = 855 − 783 = 72 hours.

Statistics: Gaussian Curve
3.) Other Terms Used to Describe a Data Set
(iii) Median: the value in a set of data that has an equal number of data values above it and below it.
- For an odd number of data points, the median is the middle value.
- For an even number of data points, the median is the value halfway between the two middle values.
Example: Data set: 1.19, 1.23, 1.25, 1.45, 1.51; mean ($\bar{x}$) = 1.33, median = 1.25.
Data set: 1.19, 1.23, 1.25, 1.45; mean ($\bar{x}$) = 1.28, median = 1.24.

Statistics: Gaussian Curve
(iii) Example: For the bowling scores 116.0, 97.9, 114.2, and 108.3, find the mean, median, range, and standard deviation.
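
The bowling-score example can be checked with a short Python sketch using only the standard library. The variable names are illustrative and not from the slides. Both the sample standard deviation (dividing by n − 1) and the population form (dividing by n) are computed, because the value of 7.1 quoted for these scores later in the slides appears to match the divide-by-n form.

```python
import statistics

# Bowling scores from the example
scores = [116.0, 97.9, 114.2, 108.3]

mean = statistics.mean(scores)
median = statistics.median(scores)        # even n: halfway between the two middle values
data_range = max(scores) - min(scores)    # highest value minus lowest value
s_sample = statistics.stdev(scores)       # divides by n - 1
s_population = statistics.pstdev(scores)  # divides by n

print(f"mean   = {mean:.1f}")
print(f"median = {median:.2f}")
print(f"range  = {data_range:.1f}")
print(f"s (n - 1 denominator) = {s_sample:.2f}")
print(f"s (n denominator)     = {s_population:.2f}")
```

Under these assumptions the mean comes out at 109.1, and the two standard deviations differ noticeably because the data set has only four points.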

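Statistics: Gaussian Curve
4.) Relating Terms Back to the Gaussian Curve
(i) Formula for a Gaussian curve: $y = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x - \mu)^2 / (2\sigma^2)}$, where e = base of the natural logarithm (2.71828…), $\mu \approx \bar{x}$ (mean), and $\sigma \approx s$ (standard deviation).
[Figure: Gaussian curve marked at the mean ± one standard deviation.] The entire area under the curve is normalized to one.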

Statistics: Standard Deviation and Probability
1.) Knowing the standard deviation (s) and the mean ($\bar{x}$) of a set of results (and the corresponding Gaussian curve):
(i) The probability of the next result falling in any given range can be calculated using $z = \frac{x - \bar{x}}{s}$.
(ii) The probability of a result falling in a given portion of the Gaussian curve is equal to the normalized area of the curve in that portion.
(iii) Example: 68.3% of the area of a Gaussian curve lies between −1s and +1s ($\bar{x} \pm 1s$). Thus, any new result has a 68.3% chance of falling within this range.
Standard deviation (s) | Probability
±1s | 68.3%
±2s | 95.5%
±3s | 99.7%
±4s | >99.9%
The probability of measuring a value in a certain range is equal to the area of that range.
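
The tabulated percentages can be reproduced from the Gaussian error function. The sketch below assumes an ideal Gaussian distribution; the function name `fraction_within` is illustrative and not from the slides.

```python
import math

def fraction_within(k: float) -> float:
    """Fraction of a Gaussian distribution within +/- k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3, 4):
    print(f"+/-{k}s: {100 * fraction_within(k):.2f}%")
```

The identity used is that the area between −k and +k standard deviations equals erf(k/√2).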

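Statistics: Standard Deviation and Probability
- The tabulated area is the area under the curve between the mean value and the result.
- The total area on each side of the mean (half the curve) is 0.5; the remaining area beyond the result is 0.5 − area.
- Example: z = 1.3. The area from the mean to z = 1.3 is 0.4032, so the area from z = 1.3 to infinity is 0.5 − 0.4032 = 0.097.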
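
The z = 1.3 example can be verified without a printed table. This is a minimal sketch assuming an ideal Gaussian curve; the helper names are illustrative.

```python
import math

def area_mean_to_z(z: float) -> float:
    """Area under the normalized Gaussian curve between the mean (z = 0) and z."""
    return 0.5 * math.erf(abs(z) / math.sqrt(2))

def tail_area(z: float) -> float:
    """Area beyond z, i.e. 0.5 minus the area between the mean and z."""
    return 0.5 - area_mean_to_z(z)

z = 1.3
print(f"area from mean to z = {z}: {area_mean_to_z(z):.4f}")
print(f"area beyond z = {z}:       {tail_area(z):.4f}")
```

For a real data set, z would first be computed as the distance of the result from the mean in units of the standard deviation.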

Statistics Standard Deviation and Probability (iii) Example: A bowler has a mean score of and a standard deviation of 7.1. What fraction of the bowler’s scores will be less than 80.2?

Statistics: Standard Deviation and Probability
2.) Knowing the standard deviation (s) of a data set indicates the precision of a measurement.
(i) Common intervals used for expressing analytical results are shown below:
Range | Percent of results expected in range
$\bar{x} \pm 1s$ | 68.3% (31.7% outside)
$\bar{x} \pm 2s$ | 95.5% (4.5% outside)
$\bar{x} \pm 3s$ | 99.7% (0.3% outside)
(ii) The precision of many analytical measurements is expressed as $\bar{x} \pm 2s$. There is only a ~5% chance (1 out of 20) that any given measurement on the sample will be outside of this range.

Statistics: Standard Deviation and Probability
4.) The precision of a mean (average) result is expressed using a confidence interval.
(i) The relationship between the true mean value ($\mu$) and the measured mean ($\bar{x}$) is given by the confidence interval $\mu = \bar{x} \pm \frac{t s}{\sqrt{n}}$, where s = standard deviation, n = number of measurements, and t = Student's t value for (n − 1) degrees of freedom.
Note: As n increases, the confidence interval becomes smaller ($\mu$ becomes more precisely known).

Statistics: Standard Deviation and Probability
4.) The precision of a mean (average) result is expressed using a confidence interval.
(ii) Student's t: a statistical tool frequently used to express confidence intervals. It is a probability distribution that addresses the problem of estimating the mean of a normally distributed population when the sample size is small. The population standard deviation ($\sigma$) is unknown and has to be estimated from the data using s.
[Table: Student's t values by confidence level and degrees of freedom, where degrees of freedom = number of measurements − 1 (n − 1).]

Statistics: Standard Deviation and Probability
4.) The precision of a mean (average) result is expressed using a confidence interval.
(iii) The meaning of a confidence interval
- Determining the "true" mean would require collecting an infinite number of data points, which is obviously not possible.
- A confidence interval tells us the probability that the range of numbers contains the "true" mean. A 50% confidence interval contains the true mean only 50% of the time; a 90% confidence interval contains the true mean 90% of the time.
[Figure: confidence intervals from repeated data sets compared with the "true" mean; at 50% confidence, 50% of the data sets do not contain the true mean.]

Statistics: Standard Deviation and Probability
(iii) Example: For the bowling scores 116.0, 97.9, 114.2, and 108.3, a bowler has a mean score of 109.1 and a standard deviation of 7.1. What is the 90% confidence interval for the mean?
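
One way to work this example is sketched below. It takes the standard deviation of 7.1 as quoted in the example and the mean of 109.1 computed from the four scores. The Student's t value of 2.353 (90% confidence, 3 degrees of freedom) is an assumption taken from standard t tables, since the table itself is not reproduced in the transcript.

```python
import math

mean = 109.1   # mean of the four bowling scores
s = 7.1        # standard deviation as quoted in the example
n = 4          # number of measurements
t_90 = 2.353   # assumed table value: Student's t, 90% confidence, n - 1 = 3 degrees of freedom

# Confidence interval: mean +/- t * s / sqrt(n)
half_width = t_90 * s / math.sqrt(n)
lower, upper = mean - half_width, mean + half_width

print(f"90% confidence interval: {mean:.1f} +/- {half_width:.1f} ({lower:.1f} to {upper:.1f})")
```

Using the sample standard deviation with the n − 1 denominator instead of 7.1 would widen the interval somewhat.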

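Statistics: Standard Deviation and Probability
5.) Comparison of Two Data Sets
(i) To determine if two results obtained by the same method are statistically the same, use the following formula to determine a calculated t: $t = \frac{|\bar{x}_1 - \bar{x}_2|}{s_{pooled}} \sqrt{\frac{n_1 n_2}{n_1 + n_2}}$, where $\bar{x}_1, \bar{x}_2$ = mean results of samples 1 & 2, $n_1, n_2$ = number of measurements of samples 1 & 2, and $s_{pooled}$ = "pooled" standard deviation, $s_{pooled} = \sqrt{\frac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}}$.
This test requires that the standard deviations of the two data sets be similar.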

Statistics: Standard Deviation and Probability
5.) Comparison of Two Data Sets
(ii) Compare the calculated t to the corresponding value in the Student's t probability table. Use the desired % confidence level at the appropriate degrees of freedom: degrees of freedom = ($n_1 + n_2 - 2$).
(iii) If the calculated t is greater than the value in the Student's t probability table, then the two results are significantly different at the given % confidence level. This is easier to achieve at a lower % confidence level. For the two results to be considered the same, the calculated t needs to be less than the table value.

Statistics: Standard Deviation and Probability
5.) Comparison of Two Data Sets
(iv) Example: The amount of $^{14}\mathrm{CO}_2$ in a plant sample is measured to be 28, 32, 27, 39, and 40 counts/min (mean = 33.2). The amount of radioactivity in a blank is found to be 28, 21, 28, and 20 counts/min (mean = 24.25). Are the mean values significantly different at a 95% confidence level?

Statistics: Standard Deviation and Probability
5.) Comparison of Two Data Sets
(iv) Example: Degrees of freedom = (5 + 4 − 2) = 7. From the Student's t probability table, the value for 7 degrees of freedom at the 95% confidence level is 2.365. Calculated t (2.48) > 2.365. The results are significantly different at a 95% confidence level, but not at 98% or higher confidence levels.
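
The calculation behind this example can be sketched in Python using the pooled-standard-deviation form of the t test. The function name `pooled_t` is illustrative, and the critical value 2.365 (95% confidence, 7 degrees of freedom) is assumed from standard Student's t tables. Carried at full precision the calculated t comes out near 2.47; the 2.48 on the slide presumably reflects rounding of intermediate values.

```python
import math
import statistics

def pooled_t(sample1, sample2):
    """Calculated t for comparing two means using a pooled standard deviation."""
    n1, n2 = len(sample1), len(sample2)
    var1 = statistics.variance(sample1)  # s1 squared, n - 1 denominator
    var2 = statistics.variance(sample2)  # s2 squared, n - 1 denominator
    s_pooled = math.sqrt((var1 * (n1 - 1) + var2 * (n2 - 1)) / (n1 + n2 - 2))
    diff = abs(statistics.mean(sample1) - statistics.mean(sample2))
    return diff / s_pooled * math.sqrt(n1 * n2 / (n1 + n2))

plant = [28, 32, 27, 39, 40]   # counts/min
blank = [28, 21, 28, 20]       # counts/min

t_calc = pooled_t(plant, blank)
t_table_95 = 2.365             # assumed table value: 95% confidence, 5 + 4 - 2 = 7 degrees of freedom
significant = t_calc > t_table_95

print(f"calculated t = {t_calc:.2f}, table t = {t_table_95}")
print("significantly different at 95%" if significant else "not significantly different at 95%")
```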

Statistics: Standard Deviation and Probability
6.) Comparison of Two Methods
(i) To determine if the results of two methods for the same sample are the same, use the following formula to determine a calculated t: $t = \frac{|\bar{d}|}{s_d} \sqrt{n}$, where $\bar{d}$ = difference in the mean values of the two methods (the mean of the individual differences $d_i$), n = number of samples analyzed by each method, and $s_d = \sqrt{\frac{\sum_{i=1}^{n} (d_i - \bar{d})^2}{n - 1}}$.
(ii) Degrees of freedom = (n − 1)
(iii) If the calculated t is greater than the value in the Student's t probability table, then the two methods are significantly different at the given % confidence level.

Statistics: Standard Deviation and Probability
6.) Comparison of Two Methods
(iv) Example: Two methods for measuring cholesterol in blood provide the following results. Are these methods significantly different at the 95% confidence level?
[Table: cholesterol content (g/L) for each plasma sample by Method A and Method B, with the difference ($d_i$) between the methods.]

Statistics: Standard Deviation and Probability
6.) Comparison of Two Methods
(iv) Example: For (6 − 1) = 5 degrees of freedom at the 95% confidence level, the Student's t table value is 2.571. Calculated t (1.20) ≤ 2.571. The results are not significantly different at a 95% confidence level.
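
The paired comparison can be sketched as follows. The cholesterol values from the slide's table are not present in this transcript, so the numbers below are hypothetical placeholders chosen only to illustrate the arithmetic; they are not the slide's data and do not reproduce its t of 1.20. The critical value 2.571 (95% confidence, 5 degrees of freedom) is assumed from standard Student's t tables.

```python
import math
import statistics

def paired_t(method_a, method_b):
    """Calculated t for paired results: |mean difference| / s_d * sqrt(n)."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    d_mean = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)          # n - 1 denominator
    return abs(d_mean) / s_d * math.sqrt(len(diffs))

# Hypothetical illustrative values (g/L), NOT the data from the slide
method_a = [1.50, 2.20, 2.80, 2.00, 1.10, 2.40]
method_b = [1.45, 2.30, 2.70, 1.85, 1.05, 2.30]

t_calc = paired_t(method_a, method_b)
t_table_95 = 2.571                         # assumed table value: 95% confidence, 6 - 1 = 5 degrees of freedom
significant = t_calc > t_table_95

print(f"calculated t = {t_calc:.2f}, table t = {t_table_95}")
print("methods differ at 95%" if significant else "methods do not differ at 95%")
```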

Statistics: Dealing with Bad Data
1.) Q Test
(i) Method used to decide whether or not to reject a "bad" data point.
(ii) Procedure:
1. Arrange the data in order of increasing value.
2. Determine the lowest and highest values and the total range of values.
3. Determine the difference (gap) between the "bad" data point and the nearest value.
- Calculate the "Q value": $Q = \frac{\text{gap}}{\text{range}}$
Example: range = 0.20 and gap to the questionable point = 0.11, so Q = 0.11 / 0.20 = 0.55.

Statistics: Dealing with Bad Data
1.) Q Test
(ii) Procedure:
4. Compare the calculated Q value to those in tables at the same value of n and the desired % confidence level.
- n: total number of values or observations
- For example, at n = 5 and 90% confidence, the value of Q is 0.64. Since Q (calculated) ≤ Q (table), 0.55 ≤ 0.64, the data point cannot be rejected at the 90% confidence level.
(iii) Although the Q test is valuable in eliminating bad data, common sense and repeating experiments with questionable results are usually more helpful.
[Table: values of Q (90% confidence) for rejection of data, by number of observations.]

Statistics: Dealing with Bad Data
1.) Q Test
(ii) Example: For the bowling scores 116.0, 97.9, 114.2, and 108.3, a bowler has a mean score of 109.1 and a standard deviation of 7.1. Using the Q test, decide whether the number 97.9 should be discarded.
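
The Q test procedure can be sketched as a small function. The 90% critical values in `Q_TABLE_90` are an assumption: they are the commonly tabulated values, not numbers recovered from the slide's table. The function and variable names are illustrative.

```python
# Assumed 90% confidence critical Q values, keyed by number of observations
Q_TABLE_90 = {3: 0.94, 4: 0.76, 5: 0.64, 6: 0.56, 7: 0.51, 8: 0.47, 9: 0.44, 10: 0.41}

def q_test(values, suspect):
    """Return (Q calculated, Q table, reject?) for a suspect lowest or highest value."""
    data = sorted(values)                      # step 1: arrange in increasing order
    if suspect not in (data[0], data[-1]):
        raise ValueError("the questionable point must be the lowest or highest value")
    data_range = data[-1] - data[0]            # step 2: total range
    if suspect == data[0]:
        gap = data[1] - data[0]                # step 3: gap to the nearest value
    else:
        gap = data[-1] - data[-2]
    q_calc = gap / data_range
    q_table = Q_TABLE_90[len(data)]            # step 4: compare with the table value
    return q_calc, q_table, q_calc > q_table

scores = [116.0, 97.9, 114.2, 108.3]
q_calc, q_table, reject = q_test(scores, 97.9)

print(f"Q calculated = {q_calc:.2f}, Q table (n = {len(scores)}, 90%) = {q_table}")
print("discard 97.9" if reject else "retain 97.9")
```

With these assumed table values, the calculated Q for 97.9 falls below the critical value for four observations, so the score would be retained at the 90% confidence level.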