Probability Distributions. Random Variables: Finite and Continuous. A Review. MAT174, Spring 2004.
Finite Random Variables: We want to associate probabilities with the values that a random variable takes on. Two types of functions allow us to do this: probability mass functions (p.m.f.) and cumulative distribution functions (c.d.f.).
Probability Distributions: The pattern of probabilities for a random variable is called its probability distribution. In the case of a finite random variable we call this the probability mass function (p.m.f.), $f_X(x)$, where $f_X(x) = P(X = x)$.
Probability Mass Function: A p.m.f. can be drawn as a histogram representing the probabilities. The bars are centered above the values of the random variable. The heights of the bars are equal to the corresponding probabilities (when the width of each rectangle is 1).
Cumulative Distribution Function: The same probability information is often given in a different form, called the cumulative distribution function (c.d.f.), $F_X$, where $F_X(x) = P(X \le x)$ and $0 \le F_X(x) \le 1$ for all x. In the finite case, the graph of a c.d.f. should look like a step function, where the maximum is 1 and the minimum is 0.
Binomial Random Variable: Let X stand for the number of successes in n Bernoulli trials; X is called a binomial random variable. Binomial setting: 1. You have n repeated trials of an experiment. 2. On a single trial, there are only two possible outcomes. 3. The probability of success, p, is the same from trial to trial. 4. The outcomes of the trials are independent of one another. The expected value of a binomial random variable is $E(X) = np$.
BINOMDIST: BINOMDIST is a built-in Excel function that gives values of the p.m.f. and c.d.f. of any binomial random variable. It is located under Statistical in the Function menu. – BINOMDIST(x, n, p, FALSE) = P(X = x) – BINOMDIST(x, n, p, TRUE) = P(X ≤ x)
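For readers working outside Excel, the two modes of BINOMDIST can be sketched in Python. The function names `binom_pmf` and `binom_cdf` are made up for this illustration; only `math.comb` comes from the standard library.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x): the analogue of BINOMDIST(x, n, p, FALSE)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(x, n, p):
    """P(X <= x): the analogue of BINOMDIST(x, n, p, TRUE)."""
    # The c.d.f. accumulates the p.m.f. over every value up to and including x.
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

# Four tosses of a fair coin, counting heads as successes.
print(binom_pmf(2, 4, 0.5))  # 0.375
print(binom_cdf(2, 4, 0.5))  # 0.6875
```

The c.d.f. is built by summing the p.m.f., which is why its graph climbs in steps from 0 to 1.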
Expected Value: This is the average value of X (what happens on average in infinitely many repeated trials of the underlying experiment). It is denoted by $\mu_X$. For a binomial random variable, $E(X) = np$, where n is the number of independent trials and p is the probability of success.
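As a quick check of the binomial formula, the expected value can also be computed directly as a probability-weighted sum over all possible values. This is a minimal sketch; the values of n and p are illustrative and are not taken from the course files.

```python
from math import comb

def binomial_expected_value(n, p):
    """Sum x * P(X = x) over every possible value x = 0, 1, ..., n."""
    return sum(x * comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1))

n, p = 10, 0.3  # illustrative values only
# The weighted sum and the shortcut n * p should agree up to rounding.
print(binomial_expected_value(n, p), n * p)
```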
Continuous Random Variable: Continuous random variables take on values in an interval; you cannot list all the possible values. Examples: 1. Let X be a randomly selected number between 0 and 1. 2. Let R be a future value of a weekly ratio of closing prices for IBM stock. 3. Let W be the exact weight of a randomly selected student. Meaningful probabilities can only be calculated for intervals of values of X. The probability of any single value, P(X = x), is 0; however, we can still look at the c.d.f., $F_X(x)$.
Probability Density Function (p.d.f.): Represented by $f_X(x)$. – $f_X(x)$ is the height of the graph of the density at an input of x. – This function does not give probabilities directly: for any continuous random variable X, P(X = a) = 0 for every number a. – Instead, look at probabilities associated with X taking on an interval of values, P(a ≤ X ≤ b).
Probability Density Function (p.d.f.): To find P(a ≤ X ≤ b), we need to look at the portion of the graph that corresponds to this interval. How can we relate this to integration? [Figure: graph of $f_X$ with the region A under the curve between a and b shaded.]
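The connection to integration is that P(a ≤ X ≤ b) is the area under the density between a and b. The sketch below approximates that area with a midpoint rule. The density used here, f(x) = 2x on [0, 1], is a hypothetical example chosen for illustration and does not come from the course files; `integrate` is likewise an illustrative helper.

```python
def integrate(f, a, b, steps=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

def density(x):
    # Hypothetical p.d.f. for illustration: f(x) = 2x on [0, 1], 0 elsewhere.
    return 2 * x if 0 <= x <= 1 else 0.0

prob = integrate(density, 0.2, 0.5)   # P(0.2 <= X <= 0.5) as an area
total = integrate(density, 0.0, 1.0)  # the total area under a density is 1
print(prob, total)
```

For this density the exact answers are 0.5² − 0.2² = 0.21 and 1, so the printed values should be very close to those.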
Cumulative Distribution Function (c.d.f.): – $F_X(x) = P(X \le x)$ – $0 \le F_X(x) \le 1$ for all x. NOTE: Regardless of whether the random variable is finite or continuous, the c.d.f., $F_X$, has the same interpretation, i.e., $F_X(x) = P(X \le x)$.
Cumulative Distribution Function: For the finite case, our c.d.f. graph was a step function. For the continuous case, our c.d.f. graph will be a continuous graph.
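Fundamental Theorem of Calculus (FTC): Given that $\int_a^x f(t)\,dt = F(x) - F(a)$, differentiate both sides and what happens? For a continuous random variable, the c.d.f. is the accumulated area under the p.d.f.: $F_X(x) = P(X \le x) = \int_{-\infty}^{x} f_X(t)\,dt$. If we differentiate both sides, we get $F_X'(x) = f_X(x)$. What does this say? How can we verify this claim?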
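One way to check numerically that the derivative of a c.d.f. is the corresponding p.d.f. is to compare the slope of the c.d.f. with the height of the p.d.f. at a few points. The sketch assumes a hypothetical density f(x) = 2x on [0, 1], whose c.d.f. is F(x) = x² on that interval; neither function is from the course files.

```python
def cdf(x):
    # c.d.f. of the hypothetical density f(x) = 2x on [0, 1].
    if x < 0:
        return 0.0
    if x > 1:
        return 1.0
    return x * x

def pdf(x):
    # The hypothetical density itself.
    return 2 * x if 0 <= x <= 1 else 0.0

def slope(F, x, h=1e-5):
    """Central-difference estimate of the derivative F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

for x in (0.25, 0.5, 0.75):
    # The two printed numbers on each row should nearly coincide.
    print(x, slope(cdf, x), pdf(x))
```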
Example 7 from Course Files: Define the following function: – What are the possible values of X? – Set up an integral that would give you each of the following probabilities: P(X < 0.5), P(X > 0.6), P(0.1 ≤ X ≤ 0.9), P(0.1 ≤ X ≤ 5). – Verify that the function is a density function. – What is E(X)?
Expected Value: For a finite random variable, we summed $x \cdot f_X(x)$ over all possible values of x. For a continuous random variable, we want to integrate over all possible values of x. This implies that $E(X) = \int_{-\infty}^{\infty} x \, f_X(x)\,dx$.
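The integral form of the expected value, E(X) = ∫ x·f(x) dx, can be approximated numerically. This sketch again assumes the hypothetical density f(x) = 2x on [0, 1], for which the exact expected value is 2/3; the helper `integrate` is an illustrative midpoint rule.

```python
def integrate(f, a, b, steps=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

def density(x):
    # Hypothetical p.d.f. for illustration: f(x) = 2x on [0, 1], 0 elsewhere.
    return 2 * x if 0 <= x <= 1 else 0.0

# Integrate x * f(x) over the interval where the density is nonzero.
expected_value = integrate(lambda x: x * density(x), 0.0, 1.0)
print(expected_value)  # close to 2/3
```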
Example 8 from the Course Files: Let T be the amount of time between consecutive computer crashes, with the following p.d.f. and c.d.f. – What type of r.v. is T? – Calculate P(1 < T < 5) in two different ways. – What is E(T)?
Exponential Distribution: Exponential random variables usually describe the waiting time between consecutive events. In general, the p.d.f. and c.d.f. for an exponential random variable X are given as follows: Any EXPONENTIAL random variable X, with parameter , has How can we verify this?
Continuous R.V. with exponential distribution: How can we verify that the graph on the left is the graph of a p.d.f.?
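A numerical check is one way to approach these verification questions. The sketch below assumes the common parameterization in which the parameter α is the mean waiting time, so that f(x) = (1/α)e^(−x/α) and F(x) = 1 − e^(−x/α) for x ≥ 0. That parameterization, the value of `ALPHA`, and the helper names are assumptions for illustration and may differ from the formulas on the original slides.

```python
from math import exp

def integrate(f, a, b, steps=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

ALPHA = 2.0  # assumed mean waiting time; illustrative value only

def exp_pdf(x):
    # Assumed density: (1/alpha) * e^(-x/alpha) for x >= 0, 0 otherwise.
    return exp(-x / ALPHA) / ALPHA if x >= 0 else 0.0

def exp_cdf(x):
    # Assumed c.d.f.: 1 - e^(-x/alpha) for x >= 0, 0 otherwise.
    return 1 - exp(-x / ALPHA) if x >= 0 else 0.0

upper = 20 * ALPHA  # the tail beyond this point is negligible
total_area = integrate(exp_pdf, 0, upper)             # should be close to 1
by_integral = integrate(exp_pdf, 1, 5)                # P(1 < X < 5) as an area
by_cdf = exp_cdf(5) - exp_cdf(1)                      # P(1 < X < 5) from the c.d.f.
mean = integrate(lambda x: x * exp_pdf(x), 0, upper)  # should be close to ALPHA
print(total_area, by_integral, by_cdf, mean)
```

A function qualifies as a p.d.f. when it is never negative and its total area is 1; the two ways of computing P(1 < X < 5) agree because the c.d.f. is the accumulated area under the p.d.f.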
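Uniform Distribution: If the probability that X assumes a value is the same for all equal subintervals of an interval [0, u], then we have a continuous uniform random variable. X is equally likely to assume any value in [0, u]. If X is uniform on the interval [0, u], then we have the following formulas: $f_X(x) = 1/u$ for $0 \le x \le u$ (and 0 otherwise), and $F_X(x) = x/u$ for $0 \le x \le u$ (with $F_X(x) = 0$ for $x < 0$ and $F_X(x) = 1$ for $x > u$).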
Continuous R.V. with uniform distribution: In general, if X is a continuous random variable with a UNIFORM distribution on [0, u], then $E(X) = u/2$.
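The standard formulas for a uniform random variable on [0, u] (density 1/u, c.d.f. x/u, expected value u/2) can be compared against a simulation. This is a rough sketch; the endpoint `U`, the seed, and the sample size are arbitrary choices for illustration.

```python
import random

U = 10.0  # illustrative right endpoint of the interval [0, U]

def uniform_pdf(x):
    # Constant height 1/U on [0, U], 0 elsewhere.
    return 1 / U if 0 <= x <= U else 0.0

def uniform_cdf(x):
    # Fraction of the interval [0, U] lying at or below x.
    if x < 0:
        return 0.0
    if x > U:
        return 1.0
    return x / U

random.seed(174)  # fixed seed so the run is repeatable
samples = [random.uniform(0, U) for _ in range(100_000)]

sample_mean = sum(samples) / len(samples)
fraction_below_3 = sum(1 for s in samples if s <= 3) / len(samples)

print(sample_mean, U / 2)                # sample mean vs. u/2
print(fraction_below_3, uniform_cdf(3))  # empirical vs. theoretical P(X <= 3)
```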
Focus on the Project: Look at the file Auction Focus.xls in the course files. – This file contains 22 prior leases. – Looking at each prior lease, we see that if each company bid their signal, every company that won the auction would have lost money. – We want to devise a new bidding strategy using this data: use the data to simulate thousands of similar auctions.
Identify Random Variables: We need random variables. – Let V be the continuous random variable that gives the fair profit value, in millions of dollars, for an oil lease similar to the 22 tracts. Look through Auction Focus.xls to see the statistics for the sample. – Each signal is an observation of the continuous random variable $S_v$, where v is the actual fair value of the tract. It is assumed that $E(S_v) = v$ for every lease. – $R_v$ gives the error in a company's signal, given by the signal minus the actual fair profit value of the lease. $E(R_v) = 0$ for every value of v.
What should you do? From slide 65 in MBD 2 Proj 2.ppt: 1. Start an Excel file which incorporates the historical data on the lease values and your team's particular set of signals. 2. Use these to compute the complete sample of signal errors, and then analyze this sample. Specifically, you should compute the maximum, minimum, and sample mean of the errors. You should also plot a histogram that approximates the actual p.d.f., $f_R$, of R. – Go to slide 50 to see information about relative frequencies.
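The project itself is done in Excel, but the same analysis steps can be sketched in Python. Every number below is made up for illustration; the real lease values and signals are in Auction Focus.xls and in each team's data. The helper `relative_frequencies` and the bin choices are also assumptions, not the method prescribed in the course slides.

```python
# Made-up data for illustration only (millions of dollars).
fair_values = [100.0, 80.0, 120.0]  # fair profit value of each lease
signals = [
    [104.0, 97.0],   # company signals for lease 1
    [78.0, 85.0],    # company signals for lease 2
    [121.0, 113.0],  # company signals for lease 3
]

# Signal error = signal minus the actual fair profit value of the lease.
errors = [s - v for v, row in zip(fair_values, signals) for s in row]

print("max error:", max(errors))
print("min error:", min(errors))
print("mean error:", sum(errors) / len(errors))

def relative_frequencies(data, low, high, bins):
    """Fraction of observations falling in each of `bins` equal-width bins."""
    width = (high - low) / bins
    counts = [0] * bins
    for x in data:
        # Clamp so that an observation equal to `high` lands in the last bin.
        index = min(int((x - low) / width), bins - 1)
        counts[index] += 1
    return [c / len(data) for c in counts]

low, high, bins = -8.0, 8.0, 4
freqs = relative_frequencies(errors, low, high, bins)
bin_width = (high - low) / bins
# Dividing by the bin width rescales the bars so that their total area is 1,
# which is what lets the histogram stand in for a p.d.f.
heights = [f / bin_width for f in freqs]
print(freqs)
print(heights)
```

With a real sample, more bins and many more observations would give a smoother approximation to the density of the signal errors.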