 # FREQUENCY ANALYSIS Basic Problem: To relate the magnitude of extreme events to their frequency of occurrence through the use of probability distributions.

FREQUENCY ANALYSIS Basic Problem: To relate the magnitude of extreme events to their frequency of occurrence through the use of probability distributions.

FREQUENCY ANALYSIS Basic Assumptions: (a) The data analyzed are statistically independent and identically distributed; this governs the selection of data (time dependence, time scale, mechanisms). (b) Changes over time due to man-made (e.g., urbanization) or natural processes do not alter the frequency relation; that is, the data show no temporal trend (stationarity).

FREQUENCY ANALYSIS Practical Problems: (a) Selection of reasonable & simple distribution. (b) Estimation of parameters in distribution. (c) Assessment of risk with reasonable accuracy.

REVIEW OF BASIC CONCEPTS Probabilistic: The outcome of a hydrologic event (e.g., rainfall amount and duration, flood peak discharge, wave height) is random and cannot be predicted with certainty. Terminologies - Population: the collection of all possible outcomes relevant to the process of interest. Examples: (1) max. 2-hr rainfall depth: all non-negative real numbers; (2) no. of storms in June: all non-negative integers. - Sample: a measured segment (or subset) of the population.

REVIEW OF BASIC CONCEPTS Terminologies - Random Variable: A variable describable by a probability distribution, which specifies the chance that the variable will assume a particular value. Convention: a capital letter denotes a random variable (say, X), whereas the lower-case letter (say, x) denotes a numerical realization that X takes. Example: X = rainfall amount in 2 hours (a random variable); x = 100.2 mm/2hr (a realization). - Random variables can be discrete (e.g., no. of rainy days in June) or continuous (e.g., max. 2-hr rainfall amount, flood discharge).

REVIEW OF BASIC CONCEPTS Terminologies - Frequency & Relative Frequency. For discrete random variables: frequency is the number of occurrences of a specific event, and relative frequency is the frequency divided by the total number of events. For example, let n = no. of years having exactly 50 rainy days and N = total no. of years. With n = 10 years and N = 100 years, the frequency of having exactly 50 rainy days is 10 and the relative frequency is n/N = 0.1. For continuous random variables: frequency needs to be defined for a class interval. A plot of frequency or relative frequency versus class intervals is called a histogram or probability polygon. As the sample size becomes infinitely large and the class interval length approaches zero, the histogram (with relative frequency scaled by the class interval length) becomes a smooth curve, called the probability density function.
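The frequency and relative-frequency definitions above can be sketched in a few lines of Python. The rainy-day record and the rainfall depths below are made-up values chosen only to reproduce the n = 10, N = 100 example; the function names are illustrative, not part of the presentation.

```python
from collections import Counter


def relative_frequencies(observations):
    """Relative frequency of each distinct value: count / total observations."""
    total = len(observations)
    return {value: n / total for value, n in Counter(observations).items()}


def histogram(values, lower, width, n_bins):
    """Relative frequency per class interval [lower + i*width, lower + (i+1)*width)."""
    bins = [0] * n_bins
    for v in values:
        i = int((v - lower) // width)
        if 0 <= i < n_bins:
            bins[i] += 1
    return [b / len(values) for b in bins]


# Hypothetical 100-year record: 10 years had exactly 50 rainy days.
rainy_days = [50] * 10 + [45] * 40 + [55] * 50
rel = relative_frequencies(rainy_days)
print(rel[50])  # 0.1

# Hypothetical max. 2-hr rainfall depths (mm), grouped into 10-mm class intervals.
depths = [12.0, 18.5, 22.1, 25.0, 31.7, 38.2, 41.0, 47.9]
print(histogram(depths, lower=10.0, width=10.0, n_bins=4))
```

For a continuous variable the result depends on the chosen class interval, which is why the slide defines frequency per interval rather than per value.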

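REVIEW OF BASIC CONCEPTS Terminologies - Probability Density Function (PDF): For a continuous random variable, the PDF $f(x)$ must satisfy $\int_{-\infty}^{\infty} f(x)\,dx = 1$ and $f(x) \ge 0$ for all values of $x$. For a discrete random variable, the probability mass function (PMF) $p(x)$ must satisfy $\sum_{\text{all } x} p(x) = 1$ and $1 \ge p(x) \ge 0$ for all values of $x$.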

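REVIEW OF BASIC CONCEPTS Terminologies - Cumulative Distribution Function (CDF): For a continuous random variable, $F(x) = P(X \le x) = \int_{-\infty}^{x} f(u)\,du$. For a discrete random variable, $F(x_n) = P(X \le x_n) = \sum_{i \le n} p(x_i)$.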
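As a small illustration of the discrete case, the sketch below checks the two conditions a probability mass function must meet (each value between 0 and 1, total equal to 1) and then accumulates it into a CDF. The PMF values are invented for the example.

```python
from itertools import accumulate

# Hypothetical PMF for the number of storms in June (values are assumptions).
pmf = {0: 0.1, 1: 0.3, 2: 0.4, 3: 0.2}


def is_valid_pmf(p, tol=1e-12):
    """True if every probability lies in [0, 1] and they sum to 1."""
    in_range = all(0.0 <= v <= 1.0 for v in p.values())
    return in_range and abs(sum(p.values()) - 1.0) < tol


def cdf_from_pmf(p):
    """F(x_n) = sum of p(x_i) over all x_i <= x_n, returned as {x_n: F(x_n)}."""
    xs = sorted(p)
    return dict(zip(xs, accumulate(p[x] for x in xs)))


cdf = cdf_from_pmf(pmf)
for x in sorted(cdf):
    print(x, round(cdf[x], 3))
```

The CDF is non-decreasing and reaches 1 at the largest possible value, mirroring the integral form used for continuous variables.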

Statistical Properties of Random Variables. Population: synonymous with sample space; the complete assemblage of all the values representative of a particular random process. Sample: any subset of the population. Parameters: quantities that describe the population in a statistical model. Greek letters are normally used to denote statistical parameters. Sample statistics (or simply statistics): quantities calculated from sample observations.

Statistical Moments of Random Variables. Descriptors commonly used to show the statistical properties of a random variable (RV) are those indicative of (1) central tendency; (2) dispersion; and (3) asymmetry. Frequently used descriptors in these three categories are related to the statistical moments of an RV. Two types of statistical moments are commonly used in hydrosystem engineering applications: (1) product-moments and (2) L-moments.

Product-Moments. The rth-order product-moment of X about any reference point $X = x_o$ is defined, for the continuous case, as

$$E[(X - x_o)^r] = \int_{-\infty}^{\infty} (x - x_o)^r f_x(x)\,dx,$$

whereas for the discrete case,

$$E[(X - x_o)^r] = \sum_{k} (x_k - x_o)^r\, p_x(x_k),$$

where $E[\cdot]$ is the statistical expectation operator. In practice, the first three moments (r = 1, 2, 3) are used to describe the central tendency, variability, and asymmetry. Two types of product-moments are commonly used: - Raw moments: $\mu_r' = E[X^r]$, the rth-order moment about the origin; and - Central moments: $\mu_r = E[(X - \mu_x)^r]$, the rth-order central moment. - Relations between the two types of product-moments are:

$$\mu_r = \sum_{i=0}^{r} (-1)^i\, C_{r,i}\, \mu_x^{\,i}\, \mu_{r-i}', \qquad \mu_r' = \sum_{i=0}^{r} C_{r,i}\, \mu_x^{\,i}\, \mu_{r-i},$$

where $C_{n,x}$ = binomial coefficient = $n!/(x!\,(n-x)!)$. The main disadvantages of the product-moments are: (1) estimation from sample observations is sensitive to the presence of outliers; and (2) the accuracy of sample product-moments deteriorates rapidly as the order of the moments increases.
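The link between raw and central moments can be checked numerically. The sketch below treats a sample as an equally weighted discrete distribution (1/n weights, not bias-corrected estimators) and uses the binomial expansion of $(X - \mu_x)^r$ to rebuild central moments from raw ones. The flood-peak values are hypothetical.

```python
from math import comb, isclose


def raw_moment(data, r):
    """r-th moment about the origin, E[X^r], with equal 1/n weights."""
    return sum(x ** r for x in data) / len(data)


def central_moment(data, r):
    """r-th moment about the mean, E[(X - mean)^r], with equal 1/n weights."""
    mean = raw_moment(data, 1)
    return sum((x - mean) ** r for x in data) / len(data)


def central_from_raw(data, r):
    """Central moment via sum_i (-1)^i * C(r, i) * mean^i * raw_moment(r - i)."""
    mean = raw_moment(data, 1)
    return sum(
        (-1) ** i * comb(r, i) * mean ** i * raw_moment(data, r - i)
        for i in range(r + 1)
    )


# Hypothetical annual flood peaks (m^3/s).
peaks = [120.0, 95.0, 210.0, 150.0, 175.0]
for r in (2, 3):
    direct = central_moment(peaks, r)
    via_raw = central_from_raw(peaks, r)
    print(r, direct, via_raw, isclose(direct, via_raw, rel_tol=1e-9, abs_tol=1e-6))
```

Because the raw-moment route subtracts large numbers of similar size, it loses precision as r grows, which is one practical face of the accuracy problem noted for higher-order moments.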

Mean, Mode, Median, and Quantiles. Expectation (1st-order moment) measures the central tendency of a random variable X. - Mean ($\mu_x$) = expectation = $\mu_1'$ = location of the centroid of the PDF or PMF. - Two operational properties of the expectation are useful:

$$E\left[\sum_{k=1}^{K} a_k X_k\right] = \sum_{k=1}^{K} a_k\, \mu_k,$$

in which $\mu_k = E[X_k]$ for k = 1, 2, …, K. For independent random variables,

$$E\left[\prod_{k=1}^{K} X_k\right] = \prod_{k=1}^{K} \mu_k.$$

Mode ($x_{mo}$): the value of an RV at which its PDF is peaked. The mode can be obtained by solving $\left[\partial f_x(x)/\partial x\right]_{x = x_{mo}} = 0$. Median ($x_{md}$): the value that splits the distribution into two equal halves, i.e., $F_x(x_{md}) = \int_{-\infty}^{x_{md}} f_x(x)\,dx = 0.5$. Quantiles: the 100p-th quantile of an RV X is a quantity $x_p$ that satisfies $P(X \le x_p) = F_x(x_p) = p$. A PDF can be uni-modal, bimodal, or multi-modal. Generally, the mean, median, and mode of a random variable are different, unless the PDF is symmetric and uni-modal.
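The quantile definition, $F_x(x_p) = p$, can be turned into a short numerical routine. The sketch below solves it by bisection for any continuous, increasing CDF. The exponential distribution and its mean of 20 are assumptions made only to have a closed-form answer to compare against; they do not come from the presentation.

```python
import math


def quantile(cdf, p, lo, hi, tol=1e-10):
    """Solve cdf(x) = p by bisection, assuming cdf is continuous and increasing on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


THETA = 20.0  # hypothetical mean of an exponential random variable


def exp_cdf(x):
    """CDF of an exponential distribution with mean THETA."""
    return 1.0 - math.exp(-x / THETA)


median = quantile(exp_cdf, 0.5, 0.0, 1000.0)   # the 50th-percentile quantile
x_90 = quantile(exp_cdf, 0.9, 0.0, 1000.0)     # the 90th-percentile quantile

print(median, THETA * math.log(2.0))    # numerical vs. closed-form median
print(x_90, -THETA * math.log(0.1))     # numerical vs. closed-form 90% quantile
```

In this example the median falls below the mean, the usual ordering for a positively skewed distribution.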

Uni-modal and bi-modal distributions

Variance, Standard Deviation, and Coefficient of Variation. Variance is the second-order central moment, measuring the spread of an RV over its range:

$$\mathrm{Var}[X] = \sigma_x^2 = E[(X - \mu_x)^2] = \int_{-\infty}^{\infty} (x - \mu_x)^2 f_x(x)\,dx.$$

Standard deviation ($\sigma_x$) is the positive square root of the variance. Coefficient of variation, $\Omega_x = \sigma_x/\mu_x$, is a dimensionless measure, useful for comparing the degree of uncertainty of two RVs with different units. Three important properties of the variance are: - (1) $\mathrm{Var}[c] = 0$ when c is a constant. - (2) $\mathrm{Var}[X] = E[X^2] - E^2[X]$. - (3) For multiple independent random variables,

$$\mathrm{Var}\left[\sum_{k=1}^{K} a_k X_k\right] = \sum_{k=1}^{K} a_k^2\, \sigma_k^2,$$

where $a_k$ = a constant and $\sigma_k$ = standard deviation of $X_k$, k = 1, 2, ..., K.
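A brief sketch of these dispersion measures follows. It uses equal 1/n weights, so the results are moments of the sample's empirical distribution rather than bias-corrected estimators. Both data sets are hypothetical. The coefficient of variation is what lets the two be compared despite their different units.

```python
import math


def describe(data):
    """Return (mean, variance, standard deviation, coefficient of variation)."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / n
    sd = math.sqrt(var)
    return mean, var, sd, sd / mean


def variance_shortcut(data):
    """Variance computed as E[X^2] - (E[X])^2."""
    n = len(data)
    mean = sum(data) / n
    return sum(x * x for x in data) / n - mean ** 2


# Hypothetical samples in different units.
rain_mm = [42.0, 55.5, 38.2, 61.0, 47.3]          # max. 2-hr rainfall depth, mm
flood_cms = [120.0, 95.0, 210.0, 150.0, 175.0]    # flood peak discharge, m^3/s

for name, data in (("rainfall", rain_mm), ("flood", flood_cms)):
    mean, var, sd, cv = describe(data)
    print(name, round(mean, 2), round(sd, 2), round(cv, 3))
```

The shortcut form is convenient by hand but subtracts two large, nearly equal numbers, so the direct definition is usually the safer choice in code.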

Skewness Coefficient. Measures the asymmetry of the PDF of a random variable. The skewness coefficient, $\gamma_x$, is defined as

$$\gamma_x = \frac{\mu_3}{\mu_2^{1.5}} = \frac{E[(X - \mu_x)^3]}{\sigma_x^3}.$$

The sign of the skewness coefficient indicates the direction of asymmetry of the probability distribution function: positive for a longer right tail, negative for a longer left tail, and zero for a symmetric distribution. Pearson skewness coefficient: $\gamma_1 = (\mu_x - x_{mo})/\sigma_x$. In practice, product-moments higher than 3rd order are less used because they are unreliable and inaccurate when estimated from a small number of samples. See Table for equations to compute the sample product-moments.

Relative locations of mean, median, and mode for positively-skewed, symmetric, and negatively-skewed distributions.

Product-moments of random variables

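Kurtosis ($\kappa_x$). A measure of the peakedness of a distribution. It is related to the 4th central product-moment as

$$\kappa_x = \frac{\mu_4}{\mu_2^{2}} = \frac{E[(X - \mu_x)^4]}{\sigma_x^4}.$$

For a normal RV, the kurtosis is equal to 3. Sometimes the coefficient of excess, $\varepsilon_x = \kappa_x - 3$, is used. For all feasible distribution functions, the skewness coefficient and kurtosis must satisfy $\gamma_x^2 + 1 \le \kappa_x$.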
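The shape descriptors can be computed from central moments in a few lines. This sketch uses the ratio definitions (third central moment over the 1.5 power of the second, and fourth central moment over the square of the second), again with equal 1/n weights rather than the bias-corrected formulas from the referenced table. It also checks that kurtosis is at least the squared skewness plus one. The skewed sample is hypothetical.

```python
def central_moment(data, r):
    """r-th central moment with equal 1/n weights."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** r for x in data) / len(data)


def skewness(data):
    """Skewness coefficient: mu_3 / mu_2^1.5."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def kurtosis(data):
    """Kurtosis: mu_4 / mu_2^2 (equals 3 for a normal distribution)."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2


symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_skewed = [10.0, 12.0, 13.0, 15.0, 18.0, 25.0, 60.0]  # hypothetical flood-like sample

for name, data in (("symmetric", symmetric), ("right-skewed", right_skewed)):
    g, k = skewness(data), kurtosis(data)
    print(name, round(g, 3), round(k, 3), g ** 2 + 1 <= k)
```

The single large value in the second sample dominates the third and fourth moments, which illustrates why higher-order sample moments are sensitive to outliers.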

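Some Commonly Used Distributions NORMAL DISTRIBUTION:

$$f_N(x) = \frac{1}{\sqrt{2\pi}\,\sigma_x} \exp\left[-\frac{1}{2}\left(\frac{x - \mu_x}{\sigma_x}\right)^2\right], \quad -\infty < x < \infty.$$

Standardized variable: $Z = (X - \mu_x)/\sigma_x$. Z has mean 0 and standard deviation 1.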

Some Commonly Used Distributions STANDARD NORMAL DISTRIBUTION:

$$\phi(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right), \quad -\infty < z < \infty.$$
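Standardization reduces any normal probability to a standard normal one. The sketch below computes the standard normal CDF through the error function, $\Phi(z) = \tfrac{1}{2}\left[1 + \mathrm{erf}(z/\sqrt{2})\right]$, and applies it to a normal variable whose mean and standard deviation (100 and 20) are assumed purely for illustration.

```python
import math


def standardize(x, mean, sd):
    """Z = (X - mean) / sd, which has mean 0 and standard deviation 1."""
    return (x - mean) / sd


def std_normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical normal variable: mean 100, standard deviation 20.
MEAN, SD = 100.0, 20.0
z = standardize(140.0, MEAN, SD)       # 2.0
p_non_exceed = std_normal_cdf(z)       # P(X <= 140)
print(z, round(p_non_exceed, 5))       # 2.0 0.97725
```

Python's `statistics.NormalDist` provides the same CDF; the explicit version is shown to keep the link to the standardized variable visible.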

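Some Commonly Used Distributions LOG-NORMAL DISTRIBUTION [Figure: log-normal PDF $f_{LN}(x)$ plotted against $x$ for $0 \le x \le 6$. Panel (a): one parameter fixed at 1.0, with curves for a second parameter equal to 0.3, 0.6, and 1.3. Panel (b): a parameter fixed at 1.30.]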

Some Commonly Used Distributions Gumbel (Extreme-Value Type I) Distribution

Some Commonly Used Distributions Log-Pearson Type 3 Distribution

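Some Commonly Used Distributions Log-Pearson Type 3 Distribution:

$$f_{LP3}(x) = \frac{1}{x\,|\beta|\,\Gamma(\alpha)} \left[\frac{\ln x - \xi}{\beta}\right]^{\alpha - 1} \exp\left[-\frac{\ln x - \xi}{\beta}\right],$$

with $\alpha > 0$, $x \ge e^{\xi}$ when $\beta > 0$, and with $\alpha > 0$, $x \le e^{\xi}$ when $\beta < 0$.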
