Information Analysis Gaussian or Normal Distribution.


μ = mean, estimated as x̄
x̄ = observed sample mean = Σx/n
σ = standard deviation, estimated as s
s = observed standard deviation
n = sample size
[Figure: normal curve marked with μ and σ] Area under curve = 1

Coefficient of Variation
[Figure: normal curves marked with μ and σ]
C_v = 150/20 = 7.5
C_v = 150/60 = 2.5

Example
100 kg of glass is recovered from municipal refuse and processed. The glass is crushed and sieved. Plot the cumulative distribution of particle size from the data below.

4 mm holes: 10 kg of glass remained on the sieve (90 kg went through)
3 mm holes: 25 kg remained on the sieve
2 mm holes: 35 kg remained on the sieve
1 mm holes: 20 kg remained on the sieve
No holes: 10 kg went all the way through

Sieve size (mm)    Fraction retained
4                  10/100 = 0.10
3                  25/100 = 0.25
2                  35/100 = 0.35
1                  20/100 = 0.20
<1                 10/100 = 0.10

Cumulative Distribution

Sieve size (mm)    Fraction smaller than sieve size
4                  1 - 0.10 = 0.90
3                  1 - (0.10 + 0.25) = 0.65
2                  1 - (0.10 + 0.25 + 0.35) = 0.30
1                  1 - (0.10 + 0.25 + 0.35 + 0.20) = 0.10
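
The cumulative calculation for the glass example can be sketched in Python. This is only an illustration: the variable names are not from the slides, and the masses are the ones given in the example.

```python
# Mass retained on each sieve (kg), taken from the glass example.
retained_kg = {4: 10, 3: 25, 2: 35, 1: 20}  # sieve opening (mm) -> kg retained
passed_all_kg = 10                           # went through the 1 mm sieve
total_kg = sum(retained_kg.values()) + passed_all_kg

fraction_smaller = {}
coarser_kg = 0
for size in sorted(retained_kg, reverse=True):  # start with the coarsest sieve
    coarser_kg += retained_kg[size]             # everything retained at or above this size
    fraction_smaller[size] = 1 - coarser_kg / total_kg

for size, frac in fraction_smaller.items():
    print(f"{size} mm: fraction smaller = {frac:.2f}")
```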

Graphs
Independent variable: abscissa (x-axis)
Dependent variable: ordinate (y-axis)
A variable is independent if its value is chosen, like sieve size in the previous example. A variable is dependent if its value is determined by experiment.

Probability Paper
The x-axis is linear. The y-axis is scaled so that if the data are normally distributed (Gaussian), the cumulative probability plots as a straight line. If this is the case, the mean is at 0.5 (50%), and one standard deviation on either side of the mean is read at cumulative probabilities of 0.16 and 0.84. You can also calculate s by:
s = (2/5)(x90 - x10)
where x90 and x10 are the values at cumulative probabilities of 0.90 and 0.10.
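
The shortcut s = (2/5)(x90 - x10) can be checked against an exact normal distribution. The sketch below assumes a hypothetical distribution (mean 10, standard deviation 2) chosen only for the check; the function name is illustrative.

```python
from statistics import NormalDist

def s_from_percentiles(x10, x90):
    """Shortcut estimate from the slide: s = (2/5)(x90 - x10)."""
    return 0.4 * (x90 - x10)

# Hypothetical normal distribution, used only to test the shortcut.
dist = NormalDist(mu=10.0, sigma=2.0)
x10 = dist.inv_cdf(0.10)  # value below which 10% of the distribution lies
x90 = dist.inv_cdf(0.90)  # value below which 90% of the distribution lies

s_est = s_from_percentiles(x10, x90)
ratio = s_est / dist.stdev
print(f"estimate / true standard deviation = {ratio:.3f}")
```

For a normal distribution the ratio does not depend on the mean or standard deviation chosen; the shortcut overestimates s by roughly 2.5%, which is well within the accuracy of reading values off a graph.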

Example
Consider the recycled glass data from the previous example. What are the mean, the standard deviation, and the 95% interval?
The mean is the value on the x-axis where the y-axis value is 0.5: 2.4 mm.
The standard deviation is the spread around the mean such that 68% of the data fall within the range (about 34% on either side of the mean). 0.5 + 0.34 = 0.84, which corresponds to 3.5 mm, so s = 3.5 - 2.4 = 1.1 mm. Alternatively, s = (2/5)(x90 - x10) = 1.16 mm.
The 95% interval is the range containing 95% of the data, that is, between cumulative probabilities of 0.025 and 0.975, or 0.2 mm and 4.8 mm.
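
The slide reads these values off a hand-drawn line on probability paper. A rough numerical equivalent, offered as a sketch and not as the slide's method, is to convert each cumulative fraction to a standard normal score and fit a least-squares straight line. The cumulative fractions below are the ones computed from the sieve masses (0.10, 0.30, 0.65, 0.90); the variable names are illustrative.

```python
from statistics import NormalDist

sizes_mm = [1, 2, 3, 4]
fraction_smaller = [0.10, 0.30, 0.65, 0.90]  # from the glass sieve example

# Probability paper spaces the y-axis by standard normal score (z).
std_normal = NormalDist()
z = [std_normal.inv_cdf(p) for p in fraction_smaller]

# Least-squares line z = a + b * size (stands in for the hand-drawn line).
n = len(sizes_mm)
x_bar = sum(sizes_mm) / n
z_bar = sum(z) / n
b = sum((x - x_bar) * (zi - z_bar) for x, zi in zip(sizes_mm, z)) / sum(
    (x - x_bar) ** 2 for x in sizes_mm
)
a = z_bar - b * x_bar

mean_mm = -a / b  # size at z = 0, i.e. cumulative probability 0.5
s_mm = 1 / b      # change in size per one unit of z

print(f"mean = {mean_mm:.2f} mm, s = {s_mm:.2f} mm")
```

The fitted values land close to the graphical readings (a mean near 2.5 mm and s near 1.2 mm); small differences are expected because a line drawn by eye is not a least-squares fit.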

Return Period
The return period is how often an event is expected to recur. If the annual probability of an event occurring is 5%, the event can be expected to occur once every 20 years, that is, it has a return period of 20 years:
Return period = 1/(fractional probability)
To determine return periods, first rank the time-variant data (smallest to largest or largest to smallest), then calculate the probabilities and plot the data.
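
As a small sketch of the return period formula (the function name is illustrative, not from the slides):

```python
def return_period(fractional_probability):
    """Return period = 1 / fractional probability of the event per period."""
    if not 0 < fractional_probability <= 1:
        raise ValueError("probability must be in the interval (0, 1]")
    return 1 / fractional_probability

# The slide's example: a 5% annual probability.
years = return_period(0.05)
print(f"{years:.0f} years")
```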

Return Period Example
The data below are from a wastewater treatment plant. BOD (biochemical oxygen demand) is a measure of organic pollution in water. The BOD is measured daily. Do these data fit the normal distribution? Can it be used to calculate the mean and standard deviation? What is the worst quality expected in 30 days?

First, rank the data.
Now plot the data. We will plot m/n (which is the probability) versus the BOD, where m is the rank and n is the number of observations.
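
The ranking step can be sketched as follows. The BOD readings here are hypothetical placeholders, since the slide's data table is not reproduced in this transcript; only the m/n calculation follows the slide.

```python
# Hypothetical daily BOD readings (mg/L); NOT the data from the slides.
bod_mg_per_l = [38.0, 21.0, 55.0, 30.0, 44.0]

ranked = sorted(bod_mg_per_l)  # smallest to largest
n = len(ranked)

# Pair each value with its cumulative probability m/n, where m is the rank.
plotting_positions = [(m / n, value) for m, value in enumerate(ranked, start=1)]

for probability, value in plotting_positions:
    print(f"m/n = {probability:.2f}  BOD = {value:.0f} mg/L")
```

With m/n, the largest value lands at a probability of 1, which has no finite position on a normal probability scale; m/(n + 1) is a common alternative that avoids this.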

It does fit the normal distribution fairly well.
The mean is about 35 mg/L BOD.
To find the worst quality in a 30-day period, calculate 29/30 = 0.967. This is the fraction of days on which the quality is better than on the worst day out of 30.
Enter the graph at 0.967 and find the answer: 67 mg/L BOD.
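
The fraction used to enter the graph is (T - 1)/T for an event expected once in T periods. A minimal sketch, with an illustrative function name:

```python
def fraction_better_than_worst(periods):
    """Fraction of periods with better quality than the single worst one."""
    if periods < 1:
        raise ValueError("periods must be at least 1")
    return (periods - 1) / periods

p = fraction_better_than_worst(30)
print(f"{p:.3f}")  # 0.967
```

The BOD value itself still has to be read from the plotted line at this probability, as the slide does.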

Sometimes data are analyzed after being grouped. Often the mean is used to analyze the data.
Example: Using the data from the previous problem, estimate the highest BOD expected to occur once every 30 days using grouped data analysis.
First, define groups of BOD values.
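
Grouping can be sketched as below. Both the BOD readings and the 10 mg/L group width are hypothetical, chosen only to show the mechanics; the slide's own groups are not reproduced in this transcript.

```python
from collections import Counter

# Hypothetical BOD readings (mg/L) and group width; NOT the slide's data.
bod_mg_per_l = [12.0, 18.0, 25.0, 27.0, 33.0, 36.0, 38.0, 41.0, 52.0, 67.0]
group_width = 10

# Count how many readings fall in each group [k*width, (k+1)*width).
counts = Counter(int(value // group_width) for value in bod_mg_per_l)

n = len(bod_mg_per_l)
cumulative = 0
grouped = []  # (lower bound, upper bound, count, cumulative fraction)
for k in sorted(counts):
    cumulative += counts[k]
    grouped.append((k * group_width, (k + 1) * group_width, counts[k], cumulative / n))

for lower, upper, count, fraction in grouped:
    print(f"{lower}-{upper} mg/L: count = {count}, cumulative fraction = {fraction:.1f}")
```

The cumulative fraction for each group is what would be plotted against the group's BOD value on probability paper.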

Now plot these data.
Notice how the data points form a curve. This means the data don't really fit the normal distribution, but we'll go ahead anyway.
Now P = 29/30 = 0.967, and we read 67 mg/L BOD from the graph.