Virtual University of Pakistan


Virtual University of Pakistan Lecture No. 24 of the course on Statistics and Probability by Miss Saleha Naghmi Habibullah

$F(a) = P(X \le a)$

IN THE LAST LECTURE, YOU LEARNT: Graphical Representation of the Distribution Function of a Discrete Random Variable; Mathematical Expectation; Mean, Variance and Moments of a Discrete Probability Distribution; Properties of Expected Values

TOPICS FOR TODAY: Chebychev's Inequality; Concept of Continuous Probability Distribution; Mathematical Expectation, Variance & Moments of a Continuous Probability Distribution

We begin with Chebychev's Inequality, in the case of a discrete probability distribution:

Chebychev's Inequality: If X is a random variable having mean $\mu$ and variance $\sigma^2 > 0$, and k is any positive constant, then the probability that a value of X falls within k standard deviations of the mean is at least $1 - \frac{1}{k^2}$. That is: $P(\mu - k\sigma < X < \mu + k\sigma) \ge 1 - \frac{1}{k^2}$

Alternatively, we may state Chebychev's theorem as follows:

Given the probability distribution of the random variable X with mean $\mu$ and standard deviation $\sigma$, the probability of observing a value of X that differs from $\mu$ by k or more standard deviations cannot exceed $1/k^2$.

As indicated earlier, this inequality is due to the Russian mathematician P.L. Chebychev (1821-1894), and it provides a means of understanding how the standard deviation measures variability about the mean of a random variable. It holds for all probability distributions having finite mean and variance.
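As a quick illustration of how the bound behaves, the following minimal Python sketch tabulates the guaranteed minimum probability $1 - 1/k^2$ for a few values of k. The function name `chebyshev_lower_bound` is an illustrative choice and does not come from the lecture.

```python
def chebyshev_lower_bound(k: float) -> float:
    """Lower bound on P(|X - mu| < k*sigma) given by Chebychev's inequality."""
    if k <= 0:
        raise ValueError("k must be a positive constant")
    # For k <= 1 the expression 1 - 1/k^2 is zero or negative,
    # so the inequality gives no useful information there.
    return max(0.0, 1.0 - 1.0 / k**2)


if __name__ == "__main__":
    for k in (1, 1.5, 2, 3, 4):
        print(f"k = {k}: probability is at least {chebyshev_lower_bound(k):.4f}")
```

The bound depends only on k, not on the shape of the distribution, which is why it applies to any distribution with finite mean and variance.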

Let us apply this concept to the example of the number of petals on the flowers of a particular species that we considered earlier: EXAMPLE: If a biologist is interested in the number of petals on a particular flower, this number may take the values 3, 4, 5, 6, 7, 8, 9, and each one of these numbers will have its own probability.

The probability distribution of the random variable X is:

The mean of this distribution is: $\mu = E(X) = \sum X\,P(X) = 5.925 \approx 5.9$. And the standard deviation of this distribution is: $\sigma = \sqrt{\sum (X - \mu)^2 P(X)} \approx 1.3$

According to Chebychev's inequality, the probability is at least $1 - 1/2^2 = 1 - 1/4 = 3/4 = 0.75$ that X will lie between $\mu - 2\sigma$ and $\mu + 2\sigma$, i.e. between 5.9 - 2(1.3) and 5.9 + 2(1.3), i.e. between 3.3 and 8.5.

Let us have another look at the probability distribution:

According to this distribution, the probability that X lies between 3.3 and 8.5 is 0.10 + 0.20 + 0.30 + 0.25 + 0.075 = 0.925, which is greater than 0.75 (as indicated by Chebychev's inequality).
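The check above can be reproduced in a few lines of Python. The probability table itself is not shown in this transcript, so the values below are an assumption: P(4) to P(8) are the five probabilities quoted above, while P(3) = 0.05 and P(9) = 0.025 are the only pair consistent with the remaining probability of 0.075 and the stated mean of 5.925.

```python
from math import sqrt

# Assumed table (reconstructed as described above): number of petals -> probability.
petals = {3: 0.05, 4: 0.10, 5: 0.20, 6: 0.30, 7: 0.25, 8: 0.075, 9: 0.025}

mu = sum(x * p for x, p in petals.items())
variance = sum((x - mu) ** 2 * p for x, p in petals.items())
sigma = sqrt(variance)

# Use the lecture's rounded figures (5.9 and 1.3) for the two-sigma interval.
low, high = 5.9 - 2 * 1.3, 5.9 + 2 * 1.3
inside = sum(p for x, p in petals.items() if low < x < high)

print(f"mean = {mu:.3f}, standard deviation = {sigma:.2f}")
print(f"P({low:.1f} < X < {high:.1f}) = {inside:.3f} (Chebychev guarantees at least 0.75)")
```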

Finally, and most importantly, we will use the concepts in Chebychev's Rule and the Empirical Rule to build the foundation for statistical inference-making. The method is illustrated in the next example.

EXAMPLE Suppose you invest a fixed sum of money in each of five business ventures. Assume you know that 70% of such ventures are successful, the outcomes of the ventures are independent of one another, and the probability distribution for the number, x, of successful ventures out of five is:

x      0     1     2     3     4     5
P(x)  .002  .029  .132  .309  .360  .168

a) Find $\mu = E(X)$. Interpret the result. b) Find $\sigma = \sqrt{E[(X - \mu)^2]}$. Interpret the result.

c) Graph P(x). d) Locate $\mu$ and the interval $\mu \pm 2\sigma$ on the graph. Use either Chebychev's Rule or the Empirical Rule to approximate the probability that x falls in this interval. Compare this result with the actual probability. e) Would you expect to observe fewer than two successful ventures out of five?

SOLUTION a) Applying the formula, $\mu = E(X) = \sum x\,P(x) = 0(.002) + 1(.029) + 2(.132) + 3(.309) + 4(.360) + 5(.168) = 3.50$

INTERPRETATION: On average, the number of successful ventures out of five will equal 3.5. (It should be remembered that this expected value has meaning only when the experiment – investing in five business ventures – is repeated a large number of times.)

b) Now we calculate the variance of X:

We know that $\sigma^2 = E[(X - \mu)^2] = \sum (x - \mu)^2 P(x)$

Hence, we will need to construct a column of $x - \mu$:

Thus, the variance is $\sigma^2 = 1.05$ and the standard deviation is $\sigma = \sqrt{1.05} \approx 1.02$

This value measures the spread of the probability distribution of X, the number of successful ventures out of five.
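The column-by-column calculation of the variance can be sketched in Python as follows. The probabilities are the ones used in the solution to part (a); the variable names are illustrative.

```python
from math import sqrt

# P(x) for the number of successful ventures out of five.
ventures = {0: 0.002, 1: 0.029, 2: 0.132, 3: 0.309, 4: 0.360, 5: 0.168}

mu = sum(x * p for x, p in ventures.items())

# One row per x: the deviation x - mu, its square, and the weighted square.
print(" x   x-mu  (x-mu)^2  (x-mu)^2 * P(x)")
variance = 0.0
for x, p in ventures.items():
    deviation = x - mu
    term = deviation ** 2 * p
    variance += term
    print(f"{x:2d}  {deviation:5.2f}  {deviation ** 2:8.4f}  {term:15.5f}")

sigma = sqrt(variance)
print(f"mu = {mu:.2f}, variance = {variance:.3f}, sigma = {sigma:.2f}")
```

Summing the last column gives the variance, and its square root gives the standard deviation.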

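c) and d) The graph of P(x) is shown in the following figure, with the mean $\mu$ and the interval $\mu \pm 2\sigma = 3.50 \pm 2(1.02) = 3.50 \pm 2.04 = (1.46, 5.54)$ shown on the graph.

[Figure: graph of P(x), with $\mu$ and the interval $\mu \pm 2\sigma$ from 1.46 to 5.54 marked]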

Note particularly that $\mu = 3.5$ locates the centre of the probability distribution.

Since this distribution is a theoretical relative frequency distribution that is moderately mound-shaped, we expect (from Chebychev's Rule) at least 75% and, more likely (from the Empirical Rule), approximately 95% of observed x values to fall in the interval $\mu \pm 2\sigma$, that is, between 1.46 and 5.54.

It can be seen from the above figure that the actual probability that X falls in the interval $\mu \pm 2\sigma$ is the sum of P(x) for the values X = 2, X = 3, X = 4, and X = 5.

This probability is P(2) + P(3) + P(4) + P(5) = .132 + .309 + .360 + .168 = .969. Therefore, 96.9% of the probability distribution lies within 2 standard deviations of the mean.

This percentage is consistent with both Chebychev's Rule and the Empirical Rule.
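The comparison between the actual probability and the two rules can be sketched as below. The rounded values of the mean and standard deviation (3.50 and 1.02) are taken from the lecture; the 0.95 figure for the Empirical Rule is the approximate value quoted above.

```python
ventures = {0: 0.002, 1: 0.029, 2: 0.132, 3: 0.309, 4: 0.360, 5: 0.168}
mu, sigma, k = 3.50, 1.02, 2       # rounded values from the lecture

low, high = mu - k * sigma, mu + k * sigma
inside = sum(p for x, p in ventures.items() if low < x < high)

chebychev = 1 - 1 / k**2           # guaranteed minimum for any distribution
empirical = 0.95                   # approximate figure for mound-shaped distributions

print(f"interval: ({low:.2f}, {high:.2f})")
print(f"actual probability = {inside:.3f}")
print(f"Chebychev lower bound = {chebychev:.2f}, Empirical Rule = about {empirical:.2f}")
```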

e) Fewer than two successful ventures out of five implies that x = 0 or x = 1. Since both these values of x lie outside the interval $\mu \pm 2\sigma$, we know from the Empirical Rule that such a result is unlikely (with approximate probability of only .05).

The exact probability, $P(x \le 1)$, is P(0) + P(1) = .002 + .029 = .031. Consequently, in a single experiment where we invest in five business ventures, we would not expect to observe fewer than two successful ones.
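The lecture does not name the model behind the table, but the stated conditions (five independent ventures, each successful with probability 0.7) suggest a binomial distribution. Treating that as an assumption, the sketch below compares the binomial probabilities with the quoted table; the two agree to within rounding.

```python
from math import comb

n, p = 5, 0.7   # assumed binomial model: five independent ventures, success probability 0.7
table = {0: 0.002, 1: 0.029, 2: 0.132, 3: 0.309, 4: 0.360, 5: 0.168}


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


for x, quoted in table.items():
    print(f"x = {x}: binomial = {binomial_pmf(x, n, p):.5f}, table = {quoted:.3f}")

fewer_than_two = binomial_pmf(0, n, p) + binomial_pmf(1, n, p)
print(f"P(x <= 1) under the binomial model = {fewer_than_two:.5f}")
```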

The key question: what is the significance of Chebychev's Inequality and the Empirical Rule? Both rules give us an idea of how much of the data lies between the mean minus a certain number of standard deviations and the mean plus that same number of standard deviations.

Given any data-set, the moment we compute the mean and standard deviation, we have an idea of the two points (i.e. the mean minus two standard deviations, and the mean plus two standard deviations) between which the bulk of our data lies.

If our data-set is hump-shaped, we obtain this idea through the Empirical Rule; if we have no reason to believe that it is hump-shaped, we obtain it through Chebychev's Rule.
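To make this concrete, here is a small sketch using a made-up data-set (the numbers are hypothetical and chosen only for illustration). It computes the mean and standard deviation, forms the two points, and counts how much of the data falls between them.

```python
from statistics import mean, pstdev

# Hypothetical data-set, made up for illustration; note the one extreme value.
data = [2, 4, 4, 5, 5, 5, 6, 6, 7, 7, 8, 9, 10, 12, 25]

centre = mean(data)
spread = pstdev(data)              # population standard deviation (divides by n)
low, high = centre - 2 * spread, centre + 2 * spread

within = sum(1 for value in data if low < value < high)
fraction = within / len(data)

print(f"mean = {centre:.2f}, standard deviation = {spread:.2f}")
print(f"{within} of {len(data)} values lie between {low:.2f} and {high:.2f}")
print(f"fraction = {fraction:.3f} (Chebychev guarantees at least 0.75)")
```

This data-set is not hump-shaped, so only Chebychev's Rule applies, and the observed fraction is indeed above the guaranteed 75%.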

Next, we begin the discussion of CONTINUOUS RANDOM VARIABLES.

In this regard, the first point to note is that up to now we have discussed discrete random variables, that is, quantities that are countable.

We now begin the discussion of CONTINUOUS RANDOM VARIABLES – quantities that are measurable.

As stated in the very first lecture, continuous variables result from measurement, and can therefore take any value within a certain range. For example, the height of a normal Pakistani adult male may take any value between 5 feet 4 inches and 6 feet.

The temperature at a place, the amount of rainfall, time to failure for an electronic system, etc. are all examples of continuous random variables. Formally speaking, a continuous random variable can be defined as follows:

CONTINUOUS RANDOM VARIABLE: A random variable X is defined to be continuous if it can assume every possible value in an interval [a, b], a < b, where a and b may be $-\infty$ and $+\infty$ respectively.

The function f(x) is called the probability density function, abbreviated to p.d.f., or simply density function of the random variable X.

A continuous probability distribution looks something like this:

[Figure: a density curve f(x) plotted against X]

A p.d.f. has the following properties:

i) $f(x) \ge 0$ for all x.
ii) $\int_{-\infty}^{\infty} f(x)\,dx = 1$
iii) The probability that X takes on a value in the interval [c, d], c < d, is given by: $P(c < X < d) = \int_c^d f(x)\,dx$, which is the area under the curve y = f(x) between X = c and X = d, as shown in the following figure:

[Figure: the curve f(x) with the region between c and d labelled P(c < X < d)] The total area under the curve is 1.

In other words: 1) f(x) is a non-negative function, 2) the integration takes place over all possible values of the random variable X between the specified limits, and 3) the probabilities are given by appropriate areas under the curve.
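These three points can be checked numerically for any candidate density. The sketch below uses a hypothetical density, $f(x) = 3x^2$ on [0, 1], which is not from the lecture, together with a simple midpoint-rule integrator written for this illustration.

```python
def integrate(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def f(x):
    # Hypothetical density used only for illustration: 3x^2 on [0, 1], 0 elsewhere.
    return 3 * x**2 if 0 <= x <= 1 else 0.0


# 1) Non-negativity, checked on a grid of points from -1 to 2.
non_negative = all(f(i / 100) >= 0 for i in range(-100, 201))

# 2) Total area under the curve over all values where f is non-zero.
total_area = integrate(f, 0, 1)

# 3) A probability as an area: P(0.2 < X < 0.5).
prob = integrate(f, 0.2, 0.5)

print(f"non-negative: {non_negative}")
print(f"total area = {total_area:.6f}")
print(f"P(0.2 < X < 0.5) = {prob:.6f}")
```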

Since $P(X = k) = \int_k^k f(x)\,dx = 0$, it should be noted that the probability of a continuous random variable X taking any particular value k is always zero. That is why probability for a continuous random variable is measurable only over a given interval.

Further, since for a continuous random variable X, P(X = x) = 0 for every x, the following four probabilities are regarded as the same: $P(c \le X \le d)$, $P(c \le X < d)$, $P(c < X \le d)$ and $P(c < X < d)$. They may be different for a discrete random variable.

The values (expressed as intervals) of a continuous random variable and their associated probabilities can be expressed by means of a formula.

We now discuss the distribution function of a continuous random variable:

CONTINUOUS RANDOM VARIABLE A random variable X may also be defined as continuous if its distribution function F(x) is continuous and is differentiable everywhere except at isolated points in the given range.

In contrast with the graph of the distribution function of a discrete variable, the graph of F(x) in the case of a continuous variable has no jumps or steps but is a continuous function for all x-values, as shown in the following figure:

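[Figure: graph of the distribution function F(x) against X, with the values F(a), F(b) and 1 marked on the vertical axis]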

The relationship between f(x) and F(x) is as follows:

f(x) is obtained by finding the derivative of F(x), i.e. $f(x) = \frac{d}{dx} F(x)$

Let us now explain the above concepts with the help of an example:

EXAMPLE a) Find the value of k so that the function f(x), defined as follows, may be a density function: f(x) = kx for 0 < x < 2, and f(x) = 0 elsewhere. b) Compute P(X = 1). c) Compute P(X > 1). d) Compute the distribution function F(x). e)

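SOLUTION a) The function f(x) will be a density function if i) $f(x) \ge 0$ for every x, and ii) $\int_{-\infty}^{\infty} f(x)\,dx = 1$

The first condition is satisfied when k > 0. The second condition will be satisfied if $\int_0^2 kx\,dx = k\left[\frac{x^2}{2}\right]_0^2 = 2k = 1$, i.e. if k = 1/2.

We had f(x) = kx for 0 < x < 2, and f(x) = 0 elsewhere; since we have obtained k = 1/2, hence: $f(x) = \frac{x}{2}$ for 0 < x < 2, and f(x) = 0 elsewhere.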

b) Since X is a continuous random variable, P(X = 1) = 0.

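c) P(X > 1) is obtained by computing the area under the curve (in this case, a straight line) between X = 1 and X = 2:

[Figure: graph of f(x) = x/2 against X, with X = 1 and X = 2 marked]

This area is obtained as follows: $P(X > 1) = \int_1^2 \frac{x}{2}\,dx = \left[\frac{x^2}{4}\right]_1^2 = 1 - \frac{1}{4} = \frac{3}{4}$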
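Parts (a) and (c) can be checked numerically. The sketch below uses a simple midpoint-rule integrator written for this illustration (the helper name `integrate` is not from the lecture): it first finds k from the requirement that the total area be 1, and then computes the area between X = 1 and X = 2.

```python
def integrate(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


# Part (a): the area under y = x over (0, 2) is 2, so k must be 1/2
# for the area under y = kx to equal 1.
area_with_k_one = integrate(lambda x: x, 0, 2)
k = 1 / area_with_k_one


def f(x):
    """Density from the example: kx on (0, 2), zero elsewhere."""
    return k * x if 0 < x < 2 else 0.0


# Part (c): P(X > 1) is the area under f between 1 and 2.
p_greater_than_one = integrate(f, 1, 2)

print(f"k = {k:.4f}")
print(f"P(X > 1) = {p_greater_than_one:.4f}")
```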

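d) To compute the distribution function, we need to find: $F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$

We do so step by step, as shown below:

For any x such that $-\infty < x < 0$, $F(x) = \int_{-\infty}^{x} 0\,dt = 0$.
For any x such that $0 \le x < 2$, $F(x) = \int_{-\infty}^{0} 0\,dt + \int_0^x \frac{t}{2}\,dt = \frac{x^2}{4}$.
For any x such that $x \ge 2$, $F(x) = \int_{-\infty}^{0} 0\,dt + \int_0^2 \frac{t}{2}\,dt + \int_2^x 0\,dt = 1$.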
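Integrating the density f(x) = x/2 piece by piece gives F(x) = 0 for x < 0, F(x) = x²/4 for 0 ≤ x < 2, and F(x) = 1 for x ≥ 2. The sketch below encodes that piecewise function and checks two things: that its numerical derivative recovers f(x) inside (0, 2), and that 1 - F(1) agrees with P(X > 1). The helper `numerical_derivative` is an illustrative addition.

```python
def f(x: float) -> float:
    """Density from the example: x/2 on (0, 2), zero elsewhere."""
    return x / 2 if 0 < x < 2 else 0.0


def F(x: float) -> float:
    """Distribution function obtained by integrating f from -infinity to x."""
    if x < 0:
        return 0.0
    if x < 2:
        return x**2 / 4
    return 1.0


def numerical_derivative(g, x: float, h: float = 1e-6) -> float:
    """Central-difference approximation to g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)


for x in (-1.0, 0.5, 1.0, 1.5, 3.0):
    print(f"x = {x:4.1f}: F(x) = {F(x):.4f}, F'(x) = {numerical_derivative(F, x):.4f}, f(x) = {f(x):.4f}")

print(f"P(X > 1) = 1 - F(1) = {1 - F(1.0):.2f}")
```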

We will discuss the computation of the conditional probability in the next lecture.

IN TODAY'S LECTURE, YOU LEARNT: Chebychev's Inequality; Concept of Continuous Probability Distribution; Distribution Function of a Continuous Probability Distribution

IN THE NEXT LECTURE, YOU WILL LEARN: Mathematical Expectation, Variance & Moments of a Continuous Probability Distribution; Bivariate Probability Distribution