Random Sampling Approximations of E(X), p.m.f, and p.d.f.


1 Random Sampling Approximations of E(X), p.m.f, and p.d.f

2 Important Read through the simulation slides for Thursday. Homework #8 is due on Thursday. Check the web page on Wednesday night and print off any simulation worksheets that might be there for Thursday. The study guide put online had major mistakes; there is a new one there now. It gives a different definition of the p.d.f. and c.d.f. for the uniform random variable than what was given in class, but they mean the same thing. P.M.F. versus P.D.F.: this needs clarification because I misspoke.

3 P.M.F versus P.D.F Either graph can be a histogram. I was assuming that the bin width will always be 1 for a finite random variable, but that is not necessarily the case: take X = 0, ½, 1, etc. Probability Mass Function: the values along the y-axis of a histogram represent probabilities. If you sum up the probabilities, they should add up to 1 every time (regardless of the bin width). Thus, to determine if a graph is a p.m.f., add up the heights of the rectangles; if they add up to 1, then it is a p.m.f.

4 P.M.F versus P.D.F A probability density function can also be a histogram. The values along the y-axis do not represent probabilities for a continuous random variable. However, the area under the graph must be equal to 1. How can you check if a histogram represents a p.d.f.? Add up the areas of the rectangles (height times bin width); if the areas sum to 1, it is a p.d.f., even though the heights themselves generally do not add up to 1.

5 In conclusion Both a p.m.f. and a p.d.f. graph can be histograms. To tell if a histogram represents a p.m.f., the sum of the HEIGHTS of the rectangles must equal 1 (because the heights represent probabilities). To tell if a histogram represents a p.d.f., the sum of the AREAS of the rectangles must equal 1; the heights generally do not sum to 1. Let’s look at number 4b from the homework just turned in…
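The height-versus-area check can be sketched in a few lines of Python. This is only an illustration, assuming equal-width bins; the function name `classify_histogram` and the example heights are hypothetical and do not come from the slides or the homework.

```python
import math

def classify_histogram(heights, bin_width, tol=1e-9):
    """Classify an equal-width histogram as 'pmf', 'pdf', 'both', or 'neither'."""
    total_height = sum(heights)                       # p.m.f. test: heights sum to 1
    total_area = sum(h * bin_width for h in heights)  # p.d.f. test: areas sum to 1
    is_pmf = math.isclose(total_height, 1.0, abs_tol=tol)
    is_pdf = math.isclose(total_area, 1.0, abs_tol=tol)
    if is_pmf and is_pdf:
        return "both"      # only possible when bin_width is 1
    if is_pmf:
        return "pmf"
    if is_pdf:
        return "pdf"
    return "neither"

# Made-up heights with bin width 1/2, as in the X = 0, 1/2, 1 case
print(classify_histogram([0.2, 0.5, 0.3], 0.5))  # heights sum to 1
print(classify_histogram([0.4, 1.0, 0.6], 0.5))  # areas sum to 1
```

When the bin width is exactly 1 the two tests coincide, which is why the distinction is easy to miss.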

6 Why do we use Random Sampling? In business, we identify a random variable. We want its probability information. Problem: We do not know its distribution OR expected value. Solution: Estimate E(X), and estimate $F_X(x)$ and $f_X(x)$, using random sampling.

7 Definitions A number x that results from a trial of the process is called an observation of X. A set $\{x_1, x_2, \ldots, x_n\}$ of n independent observations of the same random variable X is called a random sample of size n.

8 Example #1 Suppose that X is the number of assembly line stoppages that occur during an 8-hour shift in a manufacturing plant. We could obtain a random sample of size 10 by watching the line for 10 different shifts and recording the number of stoppages during each 8-hour shift

9 Example #1 (continued) Looking above, we see the information recorded during the 10 different shift observations. We can compute the sample mean of the observations. The sample mean is denoted by $\bar{x}$, where $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$.

10 Statistics and Probability There is a difference between probability and statistics, even though people use the terms interchangeably. A number that describes a sample is called a statistic. THEOREM: The statistic $\bar{x}$ (the sample mean) can be used as an estimate of E(X). In general, the larger the sample size n, the better the estimate will be.
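A small simulation can illustrate the theorem. The sketch below uses a fair six-sided die, for which E(X) = 3.5 is known, as a stand-in random variable; the die, the seed, and the helper names are assumptions made for illustration and are not part of the stoppage example.

```python
import random

def sample_mean(observations):
    """Return the sample mean: the sum of the observations divided by n."""
    return sum(observations) / len(observations)

rng = random.Random(0)  # fixed seed so the run is repeatable

def draw_sample(n):
    """Simulate n independent rolls of a fair die (E(X) = 3.5)."""
    return [rng.randint(1, 6) for _ in range(n)]

# Larger samples tend to give estimates closer to E(X) = 3.5
for n in (10, 1000, 100000):
    print(n, sample_mean(draw_sample(n)))
```

The printed means for small n can land noticeably away from 3.5, while the large-sample mean should sit very close to it.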

11 Sample Mean We can find the mean of example #1

12 Approximating Probability Mass and Density Functions If we have a large enough sample, we can approximate functions, i.e., we can approximate a p.m.f. or a p.d.f., depending on the random variable. If we approximate a p.m.f. or p.d.f., we can also look at the corresponding graphs.

13 Example #3 Suppose that the assembly line discussed in Example #1 runs 24 hours a day, with workers in three shifts. Observations of the number of stoppages during an 8-hour shift were recorded for a nine-month period, i.e., 819 different shifts were observed and recorded in the file Stoppages.xls.

14 Relative Frequencies Relative frequencies were plotted to obtain the histogram seen in Stoppages.xls. The relative frequency of each value x in the sample gives an estimate for the probability that X will assume that value. WHY? How did we obtain the relative frequencies? A histogram will give a good approximation for the graph of $f_X$.
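One way to obtain relative frequencies is to count how often each value occurs and divide by the sample size. The sketch below assumes a tiny made-up sample of 10 shifts; these numbers are hypothetical and are not taken from Stoppages.xls.

```python
from collections import Counter

def relative_frequencies(sample):
    """Map each observed value to (number of occurrences) / (sample size)."""
    counts = Counter(sample)
    n = len(sample)
    return {value: counts[value] / n for value in sorted(counts)}

# Hypothetical stoppage counts for 10 shifts (illustration only)
shifts = [2, 0, 1, 3, 1, 2, 1, 0, 4, 1]
pmf_estimate = relative_frequencies(shifts)

for value, rel_freq in pmf_estimate.items():
    print(value, rel_freq)  # estimate of P(X = value)
```

Because every observation is counted exactly once, the relative frequencies always sum to 1, which is what a p.m.f. requires.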

15 Continuous Random Variables A large random sample can also be used to approximate the p.d.f. of a continuous random variable. One way we can obtain our p.d.f. is by looking at smaller and smaller bin widths of our data. Use the HISTOGRAM function in Excel to find the approximation of the graph of the p.d.f.

16 Example #4 The manager of the plant described in the previous examples wants to get a better understanding of the delays caused by stoppages of the assembly line. So, in addition to knowing how many stoppages there are, the manager wants to know how long they last.

17 Example #4 (continued) Let T be the (exact) length of time, in minutes, that a randomly selected stoppage will last. QUESTION: Is T a continuous random variable? In Stoppages.xls, the duration of each stoppage was also recorded for all 819 shifts. Therefore, we have a random sample of observations for T.

18 Example #4 (continued) We used the function HISTOGRAM in Excel to plot an approximation of the p.d.f., $f_T$. In Stoppages.xls, bin widths of 2 minutes are used. Since our bin width is 2, to make the area under the graph be 1, we had to divide each relative frequency by 2 and then plot those “new” relative frequencies. Note: Here you are dividing the relative frequency by the bin width, not the frequency by the bin width as stated in class. Thus, you find the relative frequency as you did before and then divide it by the bin width. Connecting the midpoints of the tops of the rectangles gives us an approximate curve.
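The rescaling step (relative frequency divided by bin width) can be sketched outside Excel as follows. This is an illustrative Python version, not the course's spreadsheet; the durations are made up, the function name is hypothetical, and bins are taken as right-closed intervals such as (2, 4].

```python
import math

def density_histogram(sample, bin_width, start=0.0):
    """Return (midpoints, heights) with height = relative frequency / bin width.

    Bins are (start, start + w], (start + w, start + 2w], ...;
    assumes every observation is at least `start`.
    """
    n = len(sample)
    num_bins = max(1, math.ceil((max(sample) - start) / bin_width))
    counts = [0] * num_bins
    for t in sample:
        index = max(0, math.ceil((t - start) / bin_width) - 1)
        counts[index] += 1
    heights = [count / n / bin_width for count in counts]
    midpoints = [start + (i + 0.5) * bin_width for i in range(num_bins)]
    return midpoints, heights

# Hypothetical stoppage durations in minutes (illustration only)
durations = [0.5, 1.2, 2.5, 3.0, 3.9, 4.0, 5.1, 6.7, 1.8, 2.2]
midpoints, heights = density_histogram(durations, bin_width=2.0)
area = sum(h * 2.0 for h in heights)
print(midpoints, heights, area)
```

Plotting `heights` against `midpoints` as a line graph corresponds to connecting the midpoints of the tops of the rectangles.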

19 Using the approximated p.d.f. We can use our plot to calculate probabilities. For example, if we wanted to know $P(2 < T \le 4)$, we could look at the corresponding area under the graph. Note: $P(2 < T \le 4)$ corresponds to the area under the graph over the interval (2, 4], which is a rectangle. So, to find our probability, find the area of the rectangle.
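As a rough sketch of the area calculation, the probability for a run of whole bins is the sum of the rectangle heights times the bin width. The density heights below are hypothetical values for bins (0, 2], (2, 4], (4, 6], (6, 8]; they are not read from Stoppages.xls.

```python
def probability_from_density(heights, bin_width, first_bin, last_bin):
    """Sum the rectangle areas for bins first_bin..last_bin (inclusive, 0-indexed)."""
    return sum(heights[first_bin:last_bin + 1]) * bin_width

# Hypothetical density heights for 2-minute bins (illustration only)
heights = [0.15, 0.25, 0.05, 0.05]

# P(2 < T <= 4) is the area of the single rectangle over (2, 4]
p = probability_from_density(heights, 2.0, 1, 1)
print(p)
```

This only works for intervals that line up with bin edges; an interval that cuts through a bin would need a further assumption about how the probability is spread inside that bin.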

20 Focus on the Project We have a continuous random variable $R_{norm}$, which gives the normalized ratio of weekly closing prices on Disney stock (class project). Option Focus.xls contains 417 values of $R_{norm}$ from 417 weekly closing ratios. They are considered to be independent observations. Thus, they make up a random sample of size 417 for $R_{norm}$.

21 Focus on the Project We can calculate the sample mean, which we know should be equal to what? We can create a plot using the relative frequencies. Note: If your bin width is not 1, you will have to divide the relative frequency by your bin width to make the area under the curve be 1. Graphing the midpoints at the tops of the bars will produce a line graph approximation for $f_{norm}$.

22 What should you do? Plot an approximation of the probability density function for the normalized ratios of weekly closing prices. The plot should be a line graph, where you are connecting the midpoints of the tops of the bars. Remember, if your bin width is not 1, you will have to divide the relative frequencies by that width before you plot. Find the sample mean of the normalized ratios; you already know what it should equal.

