Lecture II-2: Probability Review

Lecture II-2: Probability Review
Lecture Outline:
- Random variables and probability distributions
- Functions of a random variable, moments
- Multivariate probability
- Marginal and conditional probabilities and moments
- Multivariate normal distributions
- Application of probabilistic concepts to data assimilation

Random Variables and Probability Density Functions
A random variable is a variable whose possible values are distributed over a specified range. The variable's probability density function (PDF) f_y(y) describes how these values are distributed, i.e. it determines the probability that the variable's value falls within any particular interval.
[Figures: (1) a uniform distribution, where all values between 0 and 1 are equally likely (e.g. soil texture); (2) an exponential distribution, where the smallest values are most likely (e.g. event rainfall); (3) a discrete distribution, where only discrete values (integers) are possible and each bar height is a probability such as Prob(y = 2) (e.g. number of severe storms).]
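
As a concrete illustration (not part of the original slides), the sketch below evaluates the uniform and exponential densities mentioned above with Python's scipy.stats; the parameter values are arbitrary illustrative choices.

    # A minimal sketch, assuming scipy is available; parameter values are
    # illustrative, not taken from the lecture.
    from scipy import stats

    uniform = stats.uniform(loc=0.0, scale=1.0)   # uniform on [0, 1]
    expon = stats.expon(scale=1.0 / 0.1)          # exponential, rate a = 0.1 (scipy uses scale = 1/a)

    print(uniform.pdf(0.5))   # density = 1 everywhere on [0, 1]
    print(expon.pdf(0.0))     # density is largest at y = 0 and decays for larger y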

Interval Probabilities
The probability that y falls in the interval (y_1, y_2]:

Continuous PDF:  P(y_1 < y \le y_2) = \int_{y_1}^{y_2} f_y(y) \, dy

Discrete PDF:  P(y_1 < y \le y_2) = \sum_{y_1 < y_i \le y_2} f_y(y_i)

The probability that y takes on some value in the range (-\infty, +\infty) is 1.0; that is, the total area under the PDF must equal 1:

\int_{-\infty}^{+\infty} f_y(y) \, dy = 1

Example: Calculating Interval Probabilities from a Continuous PDF
Historical data indicate that the average rainfall intensity y during a particular storm follows an exponential distribution:

f_y(y) = a e^{-a y},  y \ge 0

What is the probability that a given storm will produce more than 10 mm of rainfall if a = 0.1 mm^{-1}?

P(y > 10) = \int_{10}^{\infty} a e^{-a y} \, dy = e^{-10 a} = e^{-1} \approx 0.37

[Figure: exponential PDF with a = 0.1 mm^{-1}, y in mm.]
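
A quick numerical check of this example (a sketch using scipy, not from the lecture):

    import numpy as np
    from scipy import stats

    a = 0.1                              # rate parameter, mm^-1
    expon = stats.expon(scale=1.0 / a)   # scipy parameterizes by scale = 1/a
    print(expon.sf(10.0))                # survival function P(y > 10) ~ 0.3679
    print(np.exp(-a * 10.0))             # analytic answer e^{-1}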

Cumulative Distribution Functions
The cumulative distribution function (CDF) of y is the probability that y is less than or equal to a given value \xi:

Continuous PDF:  F_y(\xi) = \int_{-\infty}^{\xi} f_y(y) \, dy

Discrete PDF:  F_y(\xi) = \sum_{y_i \le \xi} f_y(y_i)

F_y(\xi) equals the area under the PDF to the left of \xi, and F_y(\xi) \to 1.0 as \xi \to +\infty.
[Figures: a continuous PDF with its CDF, and a discrete PDF with its staircase CDF.]

Constructing PDFs and CDFs From Data
How are a set of observations, e.g. 50 monthly streamflows, distributed over the range of observed values? Rank the data from smallest to largest value and divide them into bins (sample PDF, or histogram), or plot the normalized rank (rank/50) vs. value (sample CDF). The sample CDF may then be fit with a standard function (e.g. Gaussian).
[Figures: a 50-month streamflow time series, the resulting histogram (sample PDF), and the sample CDF.]
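
The following sketch shows one way to carry out these steps with numpy; the "streamflow" data are synthetic stand-ins, not the values from the slide.

    import numpy as np

    rng = np.random.default_rng(0)
    y = rng.normal(size=50)                   # stand-in for 50 monthly streamflows

    counts, edges = np.histogram(y, bins=10)  # sample PDF (histogram)

    y_sorted = np.sort(y)                     # rank data from smallest to largest
    cdf = np.arange(1, len(y) + 1) / len(y)   # normalized rank (rank/50) = sample CDF
    print(counts, edges)
    print(y_sorted[:3], cdf[:3])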

Expectation of a Random Variable
The expectation of a function z = g(y) of the random variable y is defined as:

Continuous:  E[g(y)] = \int_{-\infty}^{+\infty} g(y) f_y(y) \, dy

Discrete:  E[g(y)] = \sum_i g(y_i) f_y(y_i)

Expectation is a linear operator:

E[a g_1(y) + b g_2(y)] = a E[g_1(y)] + b E[g_2(y)]

Note that the expectation of y is not a random variable but a property of the PDF f_y(y).

Moments and Other Properties of Random Variables
Non-central moments of y:  E[y^n]  (the mean m_y = E[y] is the first; the second moment is E[y^2])
Central moments of y:  E[(y - m_y)^n]
Variance:  \sigma_y^2 = E[(y - m_y)^2]
Standard deviation:  \sigma_y = \sqrt{\sigma_y^2}
Integrals are replaced by sums when the PDF is discrete.
[Figure: a skewed PDF annotated with its mode (peak), median (Prob(y > median) = Prob(y \le median) = 0.5), mean, standard deviation \sigma, and the 95th percentile y_95, for which Prob(y > y_95) = 0.05.]

Expectation Example
The mean and variance of a random variable distributed uniformly between 0 and 1 are:

Mean:  m_y = \int_0^1 y \, dy = 1/2
Variance:  \sigma_y^2 = \int_0^1 (y - 1/2)^2 \, dy = 1/12
Standard deviation:  \sigma_y = 1/\sqrt{12} \approx 0.29

The mean defines the "center" of the distribution and the standard deviation measures its spread.
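
A Monte Carlo check of these values (a sketch, not from the lecture):

    import numpy as np

    rng = np.random.default_rng(1)
    y = rng.uniform(0.0, 1.0, size=1_000_000)
    print(y.mean())   # ~ 0.5  (mean)
    print(y.var())    # ~ 1/12 ~ 0.0833  (variance)
    print(y.std())    # ~ 0.289  (standard deviation)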

Multiple (Jointly Distributed) Random Variables
We frequently work with groups of related random variables. Discrete example:
y1 = number of storms in June (0, 1, or 2)
y2 = number of storms in July (0, 1, or 2)
Multiple random variables can be assembled in vectors: y = [y1, y2, ..., yn].
Shorthand: f_y(y) = f_{y_1 y_2 ... y_n}(y_1, y_2, ..., y_n)
A table of joint (multivariate) probabilities f_{y_1 y_2}(y_1, y_2) can be plotted as a discrete joint PDF with two independent variables y1 and y2; each bar height is a joint probability such as f_{y_1 y_2}(0, 2).
[Figure: the joint probability table plotted as a discrete joint PDF over y1 and y2.]

Interval Probabilities for Multivariate Random Variables
In multivariate problems interval probabilities are replaced by the probability that the n random variables fall in a specified region R of the n-dimensional space with coordinates (y_1, y_2, ..., y_n). Bivariate case: the probability that the pair of variables (y_1, y_2) lies in a region R in the y_1-y_2 plane is:

Continuous PDF:  P[(y_1, y_2) \in R] = \int\int_R f_{y_1 y_2}(y_1, y_2) \, dy_1 \, dy_2

Discrete PDF:  P[(y_1, y_2) \in R] = \sum_{(y_1, y_2) \in R} f_{y_1 y_2}(y_1, y_2)

[Figures: a contour plot of a continuous joint PDF and a discrete joint PDF, each with a region R marked.]

General Multivariate Moments
The mean of a vector of n random variables y = [y1, y2, ..., yn] is the n-vector:

m_y = E[y] = [E[y_1], E[y_2], ..., E[y_n]]

The second central moment of the vector y is an n by n matrix, called the covariance matrix:

C_{yy} = E[(y - m_y)(y - m_y)^T]

The correlation coefficient between any two scalar random variables (e.g. two elements of the vector y) is:

\rho_{ik} = C_{y_i y_k} / (\sigma_{y_i} \sigma_{y_k})

If C_{y_i y_k} = \rho_{ik} = 0 then y_i and y_k are uncorrelated.
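
A sketch of these definitions applied to samples with numpy (the mean and covariance below are arbitrary illustrative choices):

    import numpy as np

    rng = np.random.default_rng(2)
    y = rng.multivariate_normal(mean=[1.0, -1.0],
                                cov=[[2.0, 0.8],
                                     [0.8, 1.0]],
                                size=10_000)          # rows = samples

    m_y = y.mean(axis=0)                              # sample mean vector
    C_yy = np.cov(y, rowvar=False)                    # sample covariance matrix
    rho = C_yy[0, 1] / np.sqrt(C_yy[0, 0] * C_yy[1, 1])  # correlation coefficient
    print(m_y, C_yy, rho)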

Marginal and Conditional PDFs
The marginal PDF of any one of a set of jointly distributed random variables is obtained by integrating the joint density over all possible values of the other variables. In the bivariate case the marginal density of y_1 is:

Continuous PDF:  f_{y_1}(y_1) = \int_{-\infty}^{+\infty} f_{y_1 y_2}(y_1, y_2) \, dy_2

Discrete PDF:  f_{y_1}(y_1) = \sum_{y_2} f_{y_1 y_2}(y_1, y_2)

The conditional PDF of a random variable y_i for a given value of some other random variable y_k is defined as:

f_{y_i | y_k}(y_i | y_k) = f_{y_i y_k}(y_i, y_k) / f_{y_k}(y_k)

The conditional density of y_i given y_k is a valid probability density function (e.g. the area under this function must equal 1).

Discrete Marginal and Conditional Probability Example
For the discrete example described earlier the marginal probabilities are obtained by summing the joint probability table over columns [to get f_{y_1}(y_1)] or over rows [to get f_{y_2}(y_2)]:

f_{y_1}(y_1) = \sum_{y_2} f_{y_1 y_2}(y_1, y_2),   f_{y_2}(y_2) = \sum_{y_1} f_{y_1 y_2}(y_1, y_2)

[Table: joint probabilities with the marginal densities shown in the last row and last column.]
The conditional density of y1 (June storms) given that y2 = 1 (one storm in July) is obtained by dividing the entries in the y2 = 1 column by f_{y_2}(y_2 = 1) = 0.3:

f_{y_1 | y_2}(y_1 | y_2 = 1) = f_{y_1 y_2}(y_1, 1) / 0.3
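
The sketch below reproduces this calculation on a joint probability table. The table entries are hypothetical (the slide's actual table is not reproduced in the transcript); only the marginal value f_y2(1) = 0.3 is taken from the text.

    import numpy as np

    # rows index y1 = 0, 1, 2 (June storms); columns index y2 = 0, 1, 2 (July storms)
    # hypothetical values, chosen so the y2 = 1 column sums to 0.3
    f_y1y2 = np.array([[0.10, 0.10, 0.10],
                       [0.15, 0.15, 0.05],
                       [0.20, 0.05, 0.10]])

    f_y1 = f_y1y2.sum(axis=1)                    # marginal of y1 (sum over y2)
    f_y2 = f_y1y2.sum(axis=0)                    # marginal of y2 (sum over y1)
    f_y1_given_y2_1 = f_y1y2[:, 1] / f_y2[1]     # condition on y2 = 1
    print(f_y2[1])                               # 0.3, as in the slide
    print(f_y1_given_y2_1, f_y1_given_y2_1.sum())  # a valid PDF: sums to 1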

Conditional Moments
Conditional moments are defined in the same way as regular moments, except that the unconditional density [e.g. f_{y_1}(y_1)] is replaced by the conditional density [e.g. f_{y_1|y_2}(y_1 | y_2 = 1)] in the appropriate definitions. For the discrete example, the unconditional mean and variance of y1 may be computed directly from the f_{y_1}(y_1) table. The conditional mean and variance of y1 given that y2 = 1 may be computed directly from the f_{y_1|y_2}(y_1 | y_2 = 1) table. Note that the conditional variance (uncertainty) of y1 is smaller than the unconditional variance. This reflects the decrease in uncertainty we gain by knowing that y2 = 1.

Independent Random Variables
Two random vectors y and z are independent if any of the following equivalent expressions holds:

f_{yz}(y, z) = f_y(y) f_z(z)
f_{y|z}(y | z) = f_y(y)
f_{z|y}(z | y) = f_z(z)

Independent variables are also uncorrelated, although the converse is not necessarily true. In the discrete example described above, the two random variables y1 and y2 are not independent because the joint probabilities do not factor into products of the marginals. For example, for the combination (y1 = 0, y2 = 0) we have:

f_{y_1 y_2}(0, 0) \ne f_{y_1}(0) f_{y_2}(0)

Functions of a Random Variable
A function z = g(y) of a random variable is also a random variable, with its own PDF f_z(z). The range of possible y values maps to a corresponding range of z values. For example, if y is normally distributed, then z = g(y) = e^y is lognormally distributed. The basic concept also applies to multivariate problems, where y and z are random vectors and z = g(y) is a vector transformation.
[Figures: the transformation z = e^y, a normal f_y(y), and the resulting lognormal f_z(z).]
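
A sampling sketch of this normal-to-lognormal example (illustrative, not from the lecture):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    y = rng.normal(0.0, 1.0, size=100_000)   # y ~ Normal(0, 1)
    z = np.exp(y)                            # z = g(y) = e^y is lognormal

    # Sample statistics of z match the analytic lognormal distribution:
    print(z.mean(), stats.lognorm(s=1.0).mean())        # both ~ e^(1/2) ~ 1.649
    print(np.median(z), stats.lognorm(s=1.0).median())  # both ~ 1.0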

Derived Distributions
The PDF f_z(z) of the random variable z = g(y) may sometimes be derived in closed form from g(y) and f_y(y). When this is not possible, Monte Carlo (stochastic simulation) methods may be used. If y and z are scalars and z = g(y) has a unique solution y = g^{-1}(z) for all permissible y, then:

f_z(z) = f_y(g^{-1}(z)) / |g'(g^{-1}(z))|,  where g'(y) = dg(y)/dy

If z = g(y) has multiple solutions, the right-hand side is replaced by a sum of such terms evaluated at the different solutions. This result extends to vectors of random variables and a vector transformation z = g(y) if the derivative g' is replaced by the Jacobian of g(y). An important example for data assimilation purposes is the simple scalar linear transformation z = g(\epsilon) = a + \epsilon, where \epsilon is a random variable with PDF f_\epsilon(\epsilon) and a is a constant. Then g^{-1}(z) = z - a and the PDF of the random variable z is:

f_z(z) = f_\epsilon(z - a)

Bayes Theorem
The definition of the conditional PDF may be applied twice to obtain Bayes Theorem, which is very important in data assimilation. To illustrate, suppose that we seek the PDF of a state vector y given that a measurement vector has the value z. This conditional PDF may be computed as follows:

f_{y|z}(y | z) = f_{z|y}(z | y) f_y(y) / f_z(z)

This expression is useful because it may be easier to determine f_{z|y}(z | y) and then compute f_{y|z}(y | z) from Bayes Theorem than to derive f_{y|z}(y | z) directly. For example, suppose that the measurement is the state plus a random measurement error \epsilon:

z = y + \epsilon

Then if y is given (not random), f_{z|y}(z | y) = f_\epsilon(z - y). If the unconditional PDFs f_\epsilon(\epsilon) and f_y(y) are specified, they can be substituted into Bayes Theorem to give the desired PDF f_{y|z}(y | z). The specified PDFs can be viewed as prior information about the uncertain measurement error and state.
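
A scalar grid-based sketch of this Bayes update, using the additive-error model z = y + \epsilon from the slide; the Gaussian prior and error PDFs are assumptions made for illustration:

    import numpy as np
    from scipy import stats

    y_grid = np.linspace(-5.0, 5.0, 1001)
    dy = y_grid[1] - y_grid[0]
    prior = stats.norm(0.0, 2.0).pdf(y_grid)               # f_y(y), assumed
    z_obs = 1.2                                            # observed measurement
    likelihood = stats.norm(0.0, 1.0).pdf(z_obs - y_grid)  # f_eps(z - y), assumed

    posterior = prior * likelihood                         # numerator of Bayes Theorem
    posterior /= posterior.sum() * dy                      # divide by f_z(z) (normalize)
    print((y_grid * posterior).sum() * dy)                 # conditional mean ~ 0.96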

Multivariate Normal (Gaussian) PDFs
The most widely used continuous joint PDF is the multivariate normal (or Gaussian). The multivariate normal PDF of the n-vector y = [y1, y2, ..., yn] is completely determined by the mean m_y and covariance C_{yy} of y:

f_y(y) = (2\pi)^{-n/2} |C_{yy}|^{-1/2} \exp[-\frac{1}{2}(y - m_y)^T C_{yy}^{-1} (y - m_y)]

where |C_{yy}| is the determinant of C_{yy} and C_{yy}^{-1} is its inverse.
[Figure: a bivariate normal PDF f_{y_1 y_2}(y_1, y_2). The mean of a normal PDF is at its peak, and contours of equal PDF form ellipses in the y_1-y_2 plane.]
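
A sketch evaluating a bivariate normal PDF with scipy (the mean and covariance are arbitrary illustrative values):

    import numpy as np
    from scipy import stats

    m_y = np.array([0.0, 0.0])
    C_yy = np.array([[1.0, 0.6],
                     [0.6, 2.0]])
    mvn = stats.multivariate_normal(mean=m_y, cov=C_yy)

    print(mvn.pdf(m_y))          # the PDF peaks at the mean
    print(mvn.pdf([1.0, 1.0]))   # lower density away from the mean
    # Points of equal PDF value lie on ellipses determined by C_yy.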

Important Properties of Multivariate Normal Random Variables
The following properties of multivariate normal random variables are frequently used in data assimilation:
- A linear combination z = a_1 y_1 + a_2 y_2 + ... + a_n y_n = a^T y of jointly normal random variables y = [y_1, y_2, ..., y_n]^T is also a normal random variable, with mean and variance:
  E[z] = a^T m_y,   Var[z] = a^T C_{yy} a
- If y and z are multivariate normal random vectors with a joint PDF f_{yz}(y, z), then the marginal PDFs f_y(y) and f_z(z) and the conditional PDFs f_{y|z}(y | z) and f_{z|y}(z | y) are also multivariate normal.
- Suitably normalized sums of independent random variables become normally distributed as the number of variables approaches infinity (the Central Limit Theorem). In practice, many functions of multiple independent random variables have nearly normal PDFs even when the number of variables is relatively small (e.g. 10-100); see the sketch below. For this reason environmental variables are often observed to be approximately normally distributed.
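
A quick Central Limit Theorem sketch (illustrative): sums of as few as 12 independent uniform variables are already nearly normal.

    import numpy as np

    rng = np.random.default_rng(4)
    n = 12
    sums = rng.uniform(0.0, 1.0, size=(100_000, n)).sum(axis=1)

    # A uniform on [0, 1] has mean 1/2 and variance 1/12, so standardize by those:
    standardized = (sums - n * 0.5) / np.sqrt(n / 12.0)
    print(standardized.mean(), standardized.var())   # ~ 0 and ~ 1, like a standard normal
    print(np.mean(standardized**3))                  # sample skewness ~ 0, as for a normal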

Conditional Multivariate Normal PDFs and Moments
Consider two vectors of random variables which are all jointly normal:
y = [y1, y2, ..., yn]  (e.g. a vector of n states)
z = [z1, z2, ..., zm]  (e.g. a vector of m measurements)
The conditional PDF of y given z is also multivariate normal, with:

Conditional mean:  E[y | z] = m_y + C_{yz} C_{zz}^{-1} (z - m_z)
Conditional covariance:  C_{yy|z} = C_{yy} - C_{yz} C_{zz}^{-1} C_{zy}
y, z cross-covariance:  C_{yz} = E[(y - m_y)(z - m_z)^T] = C_{zy}^T

The normalization constant is the usual multivariate normal constant, with C_{yy|z} in place of C_{yy}. The conditional covariance is "smaller" than the unconditional y covariance (the difference matrix [C_{yy} - C_{yy|z}] is positive semi-definite). This decrease in uncertainty about y reflects the additional information provided by z.
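
A numpy sketch of the conditional mean and covariance formulas above; all numerical values are illustrative:

    import numpy as np

    m_y = np.array([1.0, 0.0])                   # unconditional state mean
    m_z = np.array([1.0])                        # unconditional measurement mean
    C_yy = np.array([[1.0, 0.3],
                     [0.3, 0.5]])
    C_yz = np.array([[0.8],
                     [0.2]])                     # y, z cross-covariance
    C_zz = np.array([[1.2]])

    z = np.array([2.0])                          # observed measurement
    gain = C_yz @ np.linalg.inv(C_zz)            # weight applied to (z - m_z)
    m_y_given_z = m_y + gain @ (z - m_z)         # conditional mean
    C_yy_given_z = C_yy - gain @ C_yz.T          # conditional covariance
    print(m_y_given_z)
    print(C_yy_given_z)                          # "smaller" than C_yy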

Application of Probabilistic Concepts to Data Assimilation
Data assimilation seeks to characterize the true but unknown state of an environmental system. Physically based models help to define a reasonable range of possible states, but uncertainties remain because the model structure may be incorrect and the model's inputs may be imperfect. These uncertainties can be accounted for in an approximate way if we assume that the model's inputs and states are random vectors.
Suppose we use a model and a postulated unconditional PDF f_u(u) for the input u to derive an unconditional PDF f_y(y) for the state y. f_y(y) characterizes our knowledge of the state before we include any measurements.
Now suppose that we want to include the information contained in the measurement vector z. This measurement is also a random vector because it depends on the random state y and the random measurement error \epsilon. The measurement PDF is f_z(z).
Our knowledge of the state after we include measurements is characterized by the conditional PDF f_{y|z}(y | z). This density can be derived from Bayes Theorem. When y and z are multivariate normal, f_{y|z}(y | z) can be readily obtained from the multivariate normal expressions presented earlier. In other cases approximations must be made.
The estimates (or analyses) provided by most data assimilation methods are based in some way on the conditional density f_{y|z}(y | z).