Dept of Bioenvironmental Systems Engineering National Taiwan University Lab for Remote Sensing Hydrology and Spatial Modeling STATISTICS Random Variables.

1 Dept of Bioenvironmental Systems Engineering National Taiwan University Lab for Remote Sensing Hydrology and Spatial Modeling STATISTICS Random Variables and Probability Distributions Professor Ke-Sheng Cheng Department of Bioenvironmental Systems Engineering National Taiwan University

2 Definition of random variable (RV) For a given probability space $(\Omega, \mathcal{A}, P[\cdot])$, a random variable, denoted by $X$ or $X(\cdot)$, is a function with domain $\Omega$ and counterdomain the real line. The function $X(\cdot)$ must be such that the set $A_r$, defined by $A_r = \{\omega : X(\omega) \le r\}$, belongs to $\mathcal{A}$ for every real number $r$.

4 Cumulative distribution function (CDF) The cumulative distribution function of a random variable $X$, denoted by $F_X(\cdot)$, is defined to be $F_X(x) = P[X \le x]$ for every real number $x$.

5 Consider the experiment of tossing two fair coins. Let random variable X denote the number of heads. The CDF of X is $F_X(x) = 0$ for $x < 0$, $1/4$ for $0 \le x < 1$, $3/4$ for $1 \le x < 2$, and $1$ for $x \ge 2$.
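The two-coin CDF can be checked by brute-force enumeration. The following is a minimal sketch, not part of the original slides; the names `outcomes` and `cdf` are illustrative.

```python
from fractions import Fraction
from itertools import product

# The four equally likely outcomes of tossing two fair coins.
outcomes = list(product("HT", repeat=2))


def cdf(x):
    """Return F_X(x) = P[X <= x], where X is the number of heads."""
    favorable = sum(1 for outcome in outcomes if outcome.count("H") <= x)
    return Fraction(favorable, len(outcomes))


for x in (-1, 0, 0.5, 1, 1.5, 2, 3):
    print(x, cdf(x))
```

The step function jumps at 0, 1 and 2, the only values X can take.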

7 Indicator function or indicator variable Let $\Omega$ be any space with points $\omega$ and $A$ any subset of $\Omega$. The indicator function of $A$, denoted by $I_A(\cdot)$, is the function with domain $\Omega$ and counterdomain equal to the set consisting of the two real numbers 0 and 1 defined by $I_A(\omega) = 1$ if $\omega \in A$ and $I_A(\omega) = 0$ if $\omega \notin A$.

8 Discrete random variables A random variable X will be defined to be discrete if the range of X is countable. If X is a discrete random variable with values $x_1, x_2, \ldots, x_n, \ldots$, then the function denoted by $f_X(\cdot)$ and defined by $f_X(x) = P[X = x_j]$ if $x = x_j$, $j = 1, 2, \ldots$, and $f_X(x) = 0$ otherwise, is defined to be the discrete density function of X.

9 Continuous random variables A random variable X will be defined to be continuous if there exists a function $f_X(\cdot)$ such that $F_X(x) = \int_{-\infty}^{x} f_X(u)\,du$ for every real number x. The function $f_X(\cdot)$ is called the probability density function of X.

10 Properties of a CDF $F_X(\cdot)$ is continuous from the right, i.e. $\lim_{h \to 0^+} F_X(x + h) = F_X(x)$.

11 Properties of a PDF

12 Example 1 Determine which of the following are valid distribution functions:

15 Example 2 Determine the real constant a, for arbitrary real constants m and b > 0, such that the given function is a valid density function.

16 The function is symmetric about m.

17 Characterizing random variables: cumulative distribution function, probability density function, expectation (expected value), variance, moments, quantile, median, mode.

18 Expectation of a random variable The expectation (or mean, expected value) of X, denoted by $\mu_X$ or $E(X)$, is defined by: $E(X) = \sum_j x_j f_X(x_j)$ if X is discrete with values $x_1, x_2, \ldots$, and $E(X) = \int_{-\infty}^{\infty} x f_X(x)\,dx$ if X is continuous with density $f_X(\cdot)$.

19 Rules for expectation Let X and $X_i$ be random variables and c be any real constant. Then $E(c) = c$, $E(cX) = c\,E(X)$, and $E(\sum_i X_i) = \sum_i E(X_i)$.

20 Variance of a random variable

21 $\sigma_X = \sqrt{\mathrm{Var}(X)}$ is called the standard deviation of X.

22 Rules for variance

23 Two random variables are said to be independent if knowledge of the value assumed by one gives no clue to the value assumed by the other. Events A and B are defined to be independent if and only if $P[A \cap B] = P[A]\,P[B]$.

24 Moments and central moments of a random variable

25 Properties of moments

27 Quantile The q-th quantile of a random variable X, denoted by $\xi_q$, is defined as the smallest number $\xi$ satisfying $F_X(\xi) \ge q$.

28 Median and mode The median of a random variable is the 0.5-th quantile, or $\xi_{0.5}$. The mode of a random variable X is defined as the value u at which $f_X(u)$ is the maximum of $f_X(\cdot)$.

29 Note: For a positively skewed distribution with a single mode, the mean is typically the highest measure of central tendency and the mode the lowest. For a negatively skewed distribution with a single mode, the mean is typically the lowest and the mode the highest. In either case the median usually falls between the mean and the mode.

30 Moment generating function

32 Usage of MGF The MGF can be used to express moments in terms of the PDF parameters, and such expressions can in turn be used to express the mean, variance, coefficient of skewness, etc. in terms of the PDF parameters. Random variables with the same MGF have the same probability distribution.

33 The moment generating function of a sum of independent random variables is the product of the moment generating functions of the individual random variables.

34 Expected value of a function of a random variable

35 If Y=g(X)

36 [Figure: plot of Y = g(X) against X, marking a value y and the corresponding points x1, x2, x3]

37 Theorem

38 Chebyshev Inequality For a random variable X with finite variance, $P[|X - \mu_X| \ge r\sigma_X] \le 1/r^2$ for every $r > 0$.

39 The Chebyshev inequality gives a bound, which does not depend on the distribution of X, for the probability of particular events described in terms of a random variable and its mean and variance.
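To make the distribution-free nature of the bound concrete, the sketch below compares the exact tail probability $P[|X - \mu| \ge r\sigma]$ with the Chebyshev bound $1/r^2$. The fair six-sided die is an illustrative choice, not an example from the slides.

```python
from fractions import Fraction
import math

# Illustrative distribution: a fair six-sided die.
values = range(1, 7)
p = Fraction(1, 6)

mean = sum(v * p for v in values)               # 7/2
var = sum((v - mean) ** 2 * p for v in values)  # 35/12
sigma = math.sqrt(var)


def tail_prob(r):
    """Exact P[|X - mean| >= r * sigma] for the die."""
    return sum(p for v in values if abs(v - mean) >= r * sigma)


def chebyshev_bound(r):
    """Upper bound 1 / r**2 from the Chebyshev inequality."""
    return 1 / r ** 2


for r in (1, 1.2, 1.5, 2):
    print(r, float(tail_prob(r)), chebyshev_bound(r))
```

The bound holds for every r but can be loose, which is the price of not using the shape of the distribution.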

40 Probability density functions of discrete random variables: discrete uniform distribution, Bernoulli distribution, binomial distribution, negative binomial distribution, geometric distribution, hypergeometric distribution, Poisson distribution.

41 Discrete uniform distribution $f_X(x) = 1/N$ for $x = 1, 2, \ldots, N$, where N ranges over the positive integers.

42 Bernoulli distribution $f_X(x) = p^x (1-p)^{1-x}$ for $x = 0, 1$, where $0 \le p \le 1$. $1-p$ is often denoted by q.

43 Binomial distribution $f_X(x) = \binom{n}{x} p^x q^{n-x}$ for $x = 0, 1, \ldots, n$. The binomial distribution represents the probability of having exactly x successes in n independent and identical Bernoulli trials.

44 Negative binomial distribution The negative binomial distribution represents the probability that the r-th success occurs on the x-th of a sequence of independent and identical Bernoulli trials. Unlike the binomial distribution, for which the number of trials is fixed, here the number of successes is fixed and the number of trials varies from experiment to experiment. The negative binomial random variable represents the number of trials needed to obtain exactly r successes.
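Under the trial-counting definition above, the r-th success lands on trial x exactly when the first x - 1 trials contain r - 1 successes and trial x is a success. A minimal sketch of the resulting mass function follows; the function name and the parameter values are illustrative.

```python
from math import comb


def neg_binomial_pmf(x, r, p):
    """P[the r-th success occurs on trial x], with success probability p.

    Trial-counting form: C(x-1, r-1) * p**r * (1-p)**(x-r) for x >= r.
    """
    if x < r:
        return 0.0
    return comb(x - 1, r - 1) * p ** r * (1 - p) ** (x - r)


# With r = 1 the formula reduces to the geometric distribution p * q**(x-1).
print(neg_binomial_pmf(3, 2, 0.5))
print(neg_binomial_pmf(4, 1, 0.3))
```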

46 Geometric distribution The geometric distribution represents the probability of obtaining the first success on the x-th of a sequence of independent and identical Bernoulli trials.

47 Hypergeometric distribution $f_X(x) = \binom{K}{x}\binom{M-K}{n-x} / \binom{M}{n}$ for $x = 0, 1, \ldots, n$, where M is a positive integer, K is a nonnegative integer that is at most M, and n is a positive integer that is at most M.

48 Let X denote the number of defective products in a sample of size n when sampling without replacement from a box containing M products, K of which are defective.
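The defective-products setting can be sketched as a counting argument: choose x of the K defectives and n - x of the M - K good items, out of all ways to choose n from M. The box sizes below (M = 10, K = 3, n = 2) are made-up illustration values.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(x, M, K, n):
    """P[X = x]: x defectives in a sample of n drawn without replacement
    from M products, K of which are defective."""
    if x < 0 or x > n or x > K or n - x > M - K:
        return Fraction(0)
    return Fraction(comb(K, x) * comb(M - K, n - x), comb(M, n))


# Hypothetical box: 10 products, 3 defective, sample of 2.
for x in range(3):
    print(x, hypergeom_pmf(x, M=10, K=3, n=2))
```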

49 Poisson distribution The Poisson distribution provides a realistic model for many random phenomena for which the number of occurrences within a given scope (time, length, area, volume) is of interest. For example, the number of fatal traffic accidents per day in Taipei, the number of meteorites that collide with a satellite during a single orbit, the number of defects per unit of some material, the number of flaws per unit length of some wire, etc.

51 Assume that we are observing the occurrence of certain happenings in time, space, region or length. Also assume that there exists a positive quantity $\lambda$ which satisfies the following properties: 1. The probability that exactly one happening will occur in a small time interval of length h is approximately equal to $\lambda h$.

52 2. The probability of more than one happening in a small time interval of length h is negligible compared with the probability of just one happening. 3. The numbers of happenings in non-overlapping time intervals are independent.

60 Comparison of Poisson and Binomial distributions

61 Example Suppose that the average number of telephone calls arriving at the switchboard of a company is 30 calls per hour. (1) What is the probability that no calls will arrive in a 3-minute period? (2) What is the probability that more than five calls will arrive in a 5-minute interval? Assume that the number of calls arriving during any time period has a Poisson distribution.

62 Assuming time is measured in minutes.

63 Assuming time is measured in seconds.

64 The first property provides the basis for transferring the mean rate of occurrence between different observation scales. The "small time interval of length h" can be measured in different observation scales. For each scale, the interval length is expressed in that scale's unit and is paired with the mean rate of occurrence for that scale.

65 If the first property holds for various observation scales, then the probability of exactly one happening in a small time interval h can be approximated by the product of the mean rate and the interval length, both expressed in the same scale. The probability of more than one happening in time interval h is negligible.

66 With a mean of 2.5 calls per 5-minute interval, the probability that more than five calls will arrive in a 5-minute interval is $1 - \sum_{k=0}^{5} e^{-2.5}\,2.5^k / k! = 0.042021$.
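Both parts of the switchboard example can be reproduced numerically. This is a minimal sketch: the rate of 30 calls per hour comes from the example, while the helper name `poisson_pmf` and the variable names are illustrative.

```python
from math import exp, factorial

RATE_PER_MINUTE = 30 / 60  # 30 calls per hour, expressed per minute


def poisson_pmf(k, mean):
    """P[X = k] for a Poisson random variable with the given mean."""
    return exp(-mean) * mean ** k / factorial(k)


# (1) No calls in a 3-minute period: the mean is 0.5 * 3 = 1.5.
p_no_calls_3min = poisson_pmf(0, RATE_PER_MINUTE * 3)

# (2) More than five calls in a 5-minute interval: the mean is 0.5 * 5 = 2.5.
p_more_than_5_in_5min = 1 - sum(
    poisson_pmf(k, RATE_PER_MINUTE * 5) for k in range(6)
)

print(round(p_no_calls_3min, 6), round(p_more_than_5_in_5min, 6))
```

Measuring time in seconds instead changes the rate to 30/3600 per second and the interval to 300 seconds, but their product, and hence the answer, is unchanged.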

67 Probability density functions of continuous random variables: uniform or rectangular distribution, normal distribution (also known as the Gaussian distribution), exponential distribution (or negative exponential distribution), gamma distribution (Pearson Type III), chi-squared distribution, lognormal distribution.

68 Uniform or rectangular distribution

69 PDF of U(a,b)

70 Normal distribution (Gaussian distribution)

75 Commonly used values of normal distribution

76 Exponential distribution (negative exponential distribution) The parameter is the mean rate of occurrence in a Poisson process.

79 Gamma distribution One parameter represents the mean rate of occurrence in a Poisson process; it is equivalent to the rate parameter in the exponential density.

80 The exponential distribution is a special case of the gamma distribution with shape parameter equal to 1. The sum of n independent identically distributed exponential random variables with rate parameter $\lambda$ has a gamma distribution with shape parameter n and the same rate parameter $\lambda$.
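The special-case relationship can be checked numerically. The sketch below assumes the shape/rate parameterization $f(x) = \lambda^{r} x^{r-1} e^{-\lambda x} / \Gamma(r)$ for $x \ge 0$, which may differ in notation from the slides; the function names are illustrative.

```python
from math import exp, gamma


def exponential_pdf(x, rate):
    """Exponential density with the given rate."""
    return rate * exp(-rate * x) if x >= 0 else 0.0


def gamma_pdf(x, shape, rate):
    """Gamma density in shape/rate form (shape >= 1 assumed here)."""
    if x < 0:
        return 0.0
    return rate ** shape * x ** (shape - 1) * exp(-rate * x) / gamma(shape)


# With shape = 1 the gamma density coincides with the exponential density.
for x in (0, 0.5, 2.0):
    print(x, gamma_pdf(x, 1, 1.5), exponential_pdf(x, 1.5))
```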

81 Pearson Type III distribution The parameters can be expressed in terms of the mean, standard deviation and skewness coefficient of X. It reduces to the gamma distribution if the location parameter equals 0.

82 The Pearson Type III distribution is widely applied in stochastic hydrology. Total rainfall depths of storm events can be characterized by the Pearson Type III distribution.

83 Chi-squared distribution The chi-squared distribution with k degrees of freedom is a special case of the gamma distribution with shape parameter k/2 and rate parameter 1/2.

85 Log-normal distribution and log-Pearson Type III distribution A random variable X is said to have a log-normal distribution if log(X) has a normal distribution. A random variable X is said to have a log-Pearson Type III distribution if log(X) has a Pearson Type III distribution.

86 Lognormal distribution

88 Approximations between random variables: approximation of the binomial distribution by the Poisson distribution, approximation of the binomial distribution by the normal distribution, approximation of the Poisson distribution by the normal distribution.

89 Approximation of binomial distribution by Poisson distribution
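The Poisson approximation works best when n is large and p is small, with the mean np held moderate. The sketch below compares the two mass functions for two made-up parameter pairs sharing the mean np = 2; the function names are illustrative.

```python
from math import comb, exp, factorial


def binomial_pmf(x, n, p):
    """P[X = x] for a binomial random variable with parameters n and p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def poisson_pmf(x, mean):
    """P[X = x] for a Poisson random variable with the given mean."""
    return exp(-mean) * mean ** x / factorial(x)


def max_abs_difference(n, p, upto=20):
    """Largest pointwise gap between Binomial(n, p) and Poisson(n * p)."""
    mean = n * p
    return max(
        abs(binomial_pmf(x, n, p) - poisson_pmf(x, mean))
        for x in range(min(n, upto) + 1)
    )


print(max_abs_difference(10, 0.2))
print(max_abs_difference(1000, 0.002))
```

The gap shrinks as n grows and p shrinks with np fixed.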

90 Approximation of binomial distribution by normal distribution Let X have a binomial distribution with parameters n and p. If $n \to \infty$, then for fixed a<b, $P[a \le (X - np)/\sqrt{npq} \le b] \to \Phi(b) - \Phi(a)$, where $\Phi(\cdot)$ is the cumulative distribution function of the standard normal distribution.

91 Equivalently, as n approaches infinity, X can be approximated by a normal distribution with mean np and variance npq.

92 Approximation of Poisson distribution by normal distribution Let X have a Poisson distribution with parameter $\lambda$. If $\lambda \to \infty$, then for fixed a<b, $P[a \le (X - \lambda)/\sqrt{\lambda} \le b] \to \Phi(b) - \Phi(a)$.

93 Equivalently, as $\lambda$ approaches infinity, X can be approximated by a normal distribution with mean $\lambda$ and variance $\lambda$.

95 Example Suppose that two fair dice are tossed 600 times. Let X denote the number of times a total of 7 occurs. What is the probability that X lies in a specified interval?
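Because the interval asked for in this example did not survive in the transcript, the sketch below uses a hypothetical interval, 90 to 110, purely for illustration. A total of 7 has probability 6/36 on each toss, so X is binomial with n = 600 and p = 1/6. The continuity correction of 0.5 is an added refinement that the slides do not mention.

```python
from math import comb, erf, sqrt

N_TOSSES = 600
P_SEVEN = 6 / 36  # six of the 36 equally likely outcomes total 7


def std_normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def normal_approx(lo, hi, n=N_TOSSES, p=P_SEVEN):
    """Approximate P[lo <= X <= hi] using a normal with mean np, variance npq."""
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    return std_normal_cdf((hi + 0.5 - mean) / sd) - std_normal_cdf(
        (lo - 0.5 - mean) / sd
    )


def exact_binomial(lo, hi, n=N_TOSSES, p=P_SEVEN):
    """Exact P[lo <= X <= hi] by summing the binomial mass function."""
    return sum(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(lo, hi + 1))


# Hypothetical interval, not taken from the slides.
print(normal_approx(90, 110), exact_binomial(90, 110))
```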

97 Transformation of random variables [Theorem] Let X be a continuous RV with density $f_X$. Let Y=g(X), where g is strictly monotonic and differentiable. The density for Y, denoted by $f_Y$, is given by $f_Y(y) = f_X(g^{-1}(y))\,\left| \frac{d g^{-1}(y)}{dy} \right|$.

98 Proof: Assume that Y=g(X) is a strictly monotonic increasing function of X.

100 Using the moment ratio diagram (MRD) for goodness-of-fit (GOF) test A two-dimensional plot of the coefficient of skewness versus the coefficient of kurtosis is called a moment ratio diagram. Each distribution type maps to its own location in the MRD, so by examining the scatter of sample moment ratios we can identify the distribution type of the random samples.

101 Product (ordinary) moment ratio diagram

102 Simulation Given a random variable X with CDF $F_X(x)$, there are situations in which we want to obtain a set of n random numbers (i.e., a random sample of size n) from $F_X(\cdot)$. Advances in computer technology have made it possible to generate such random numbers using computers. Work of this nature is termed "simulation", or more precisely "stochastic simulation".

104 Pseudo-random number generation Pseudorandom number generation (PRNG) is the technique of generating a sequence of numbers that appears to be a random sample of random variables uniformly distributed over (0,1).

105 A commonly applied approach to PRNG starts with an initial seed $x_0$ and the following recursive algorithm (Ross, 2002): $x_n = a x_{n-1}$ modulo m, where a and m are given positive integers, and where the above means that $a x_{n-1}$ is divided by m and the remainder is taken as the value of $x_n$.

106 The quantity $x_n / m$ is then taken as an approximation to the value of a uniform (0,1) random variable. Such an algorithm deterministically generates a sequence of values that eventually repeats itself. Consequently, the constants a and m should be chosen to satisfy the following criteria:

107 1. For any initial seed, the resultant sequence has the "appearance" of being a sequence of independent uniform (0,1) random variables. 2. For any initial seed, the number of random variables that can be generated before repetition begins is large. 3. The values can be computed efficiently on a digital computer.

108 A guideline for the selection of a and m is that m be chosen to be a large prime number that fits the computer word size. For a 32-bit word computer, $m = 2^{31} - 1$ and $a = 7^5 = 16807$ result in the desired properties (Ross, 2002).
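The multiplicative congruential recursion can be sketched in a few lines. The constants below, $m = 2^{31} - 1$ and $a = 7^5 = 16807$, are assumed here as the widely quoted choice for 32-bit machines; the function name `lcg` and the seed are illustrative. This is a teaching sketch, not a generator for production use.

```python
A = 7 ** 5       # multiplier a = 16807 (assumed constant)
M = 2 ** 31 - 1  # modulus m = 2147483647, a prime (assumed constant)


def lcg(seed, count):
    """Yield `count` pseudo-uniform values in (0, 1) from x_n = a * x_{n-1} mod m.

    The seed must be an integer between 1 and m - 1.
    """
    x = seed
    for _ in range(count):
        x = (A * x) % M  # remainder of a * x_{n-1} divided by m
        yield x / M      # scale to the unit interval


print(list(lcg(seed=12345, count=5)))
```

The same seed always reproduces the same sequence, which is what makes simulation runs repeatable.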

109 Simulating a continuous random variable: probability integral transformation
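The probability integral transformation says that if U is uniform on (0,1) and F is a continuous, strictly increasing CDF, then $F^{-1}(U)$ has CDF F. A minimal sketch for the exponential distribution follows, where $F(x) = 1 - e^{-\lambda x}$ inverts in closed form. The function names are illustrative, and Python's built-in `random.Random` supplies the uniform numbers.

```python
import math
import random


def exponential_inverse_cdf(u, rate):
    """Invert F(x) = 1 - exp(-rate * x) for u in [0, 1)."""
    return -math.log(1.0 - u) / rate


def simulate_exponential(n, rate, seed=0):
    """Draw n exponential variates by transforming uniform (0,1) numbers."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    return [exponential_inverse_cdf(rng.random(), rate) for _ in range(n)]


sample = simulate_exponential(20000, rate=2.0)
print(sum(sample) / len(sample))  # should be close to the mean 1 / rate
```

Distributions without a closed-form inverse CDF, such as the normal, need a numerical inverse or a different method.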
