Probability plots.


1 Probability plots

2 Application to data
An investigator sometimes obtains a numerical sample and wants to determine if it is plausible that it came from a certain distribution. This may be necessary because many procedures for statistical inference are based on the assumption that the population distribution is of a specific type. Knowing the distribution can sometimes give insight into the physical mechanism that generates the data.

3 Application to data (continued)
An effective way to check a distributional assumption is to construct what is called a probability plot. The essence of such a plot is that if the distribution on which the plot is based is correct, the points in the plot should fall on a straight line. The details in constructing the plot differ a bit from source to source.

4 Sample percentiles
Recall that the (100p)th percentile of a continuous distribution with cdf F is the number $\eta(p)$ such that $F(\eta(p)) = p$. Sample percentiles are handled in roughly the same way: the (100p)th sample percentile should have (100p)% of the data to its left. This can't happen exactly for every percentile because a sample contains only finitely many data points. For example, if n = 10, what is the 50th percentile?

5 Definition of sample percentiles
Order the n sample observations from smallest to largest. Then the j-th smallest observation in the list is taken to be the $[100(j - 0.5)/n]$th sample percentile.
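A minimal sketch of this definition, assuming the common convention that the j-th smallest of n observations is the 100(j - 0.5)/n-th sample percentile (other sources use slightly different formulas). The function name is illustrative only.

```python
def sample_percentile_levels(n):
    """Percentile level (in percent) assigned to the j-th smallest of n observations.

    Assumes the 100 * (j - 0.5) / n convention.
    """
    return [100 * (j - 0.5) / n for j in range(1, n + 1)]


levels = sample_percentile_levels(10)
print(levels)
```

For n = 10 the levels step from 5 to 95 in increments of 10, so under this convention no single observation is the 50th sample percentile; it falls between the 5th and 6th ordered observations.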

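6 What is plotted?
We then plot in the x-y plane the n pairs ($[100(j - 0.5)/n]$th percentile of the distribution, j-th smallest sample observation), for j = 1, ..., n. If the sample percentiles are close to the population percentiles, the points will fall close to the 45° line y = x, and the distribution is plausible.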
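As a rough illustration, the pairs can be built as follows for a fully specified distribution. This sketch assumes the 100(j - 0.5)/n percentile convention and uses an exponential distribution with rate 1 as the hypothesized model; the data values and function names are made up for illustration.

```python
import math


def probability_plot_pairs(sample, quantile):
    """Return (population percentile, sample percentile) pairs.

    `quantile(p)` is the inverse cdf of the hypothesized distribution.
    Assumes the j-th smallest observation is the 100*(j - 0.5)/n-th sample percentile.
    """
    ordered = sorted(sample)
    n = len(ordered)
    return [(quantile((j - 0.5) / n), x) for j, x in enumerate(ordered, start=1)]


def exponential_quantile(p, rate=1.0):
    # Inverse cdf of an exponential distribution: solve 1 - exp(-rate * x) = p for x.
    return -math.log(1.0 - p) / rate


# Hypothetical data, invented purely for illustration.
sample = [0.05, 0.21, 0.34, 0.52, 0.70, 0.95, 1.20, 1.61, 2.10, 3.05]
pairs = probability_plot_pairs(sample, exponential_quantile)

# Largest vertical distance of any point from the line y = x.
max_gap = max(abs(y - x) for x, y in pairs)
print(round(max_gap, 3))
```

Points with small gaps lie near the line y = x, which is what one would expect to see if the hypothesized distribution is plausible.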

7 Does some member of family of distributions fit?
An investigator is typically not interested in knowing whether a specified probability distribution is a plausible model, but whether some member of a family of distributions supplies a plausible model. As an example, one may be interested in whether some normal distribution is a good fit.

8 Example: normal distribution
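The standard normal percentiles and the percentiles of an arbitrary normal distribution with mean $\mu$ and standard deviation $\sigma$ are related by: percentile for the normal ($\mu$, $\sigma$) distribution $= \mu + \sigma \cdot$ (corresponding standard normal percentile).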
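A quick numerical check of this relation, using Python's standard-library `statistics.NormalDist`. The values of `mu`, `sigma`, and `p` are arbitrary illustrative choices, not taken from the slides.

```python
from statistics import NormalDist

mu, sigma = 50.0, 4.0  # illustrative parameters
p = 0.90               # look at the 90th percentile

z_p = NormalDist().inv_cdf(p)              # standard normal percentile
direct = NormalDist(mu, sigma).inv_cdf(p)  # percentile of N(mu, sigma), computed directly
via_z = mu + sigma * z_p                   # same percentile via the linear relation

print(round(direct, 4), round(via_z, 4))
```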

9 Normal probability plot
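A plot of the n pairs ($[100(j - 0.5)/n]$th standard normal percentile, j-th smallest observation) is a normal probability plot. If the sample observations are drawn from a normal distribution with mean $\mu$ and standard deviation $\sigma$, the points should fall close to a line with slope $\sigma$ and intercept $\mu$.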
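The sketch below builds the pairs for a normal probability plot and fits a least-squares line to them. The least-squares fit is one convenient way to read off a slope and intercept, not something the slides prescribe. It assumes the 100(j - 0.5)/n convention and uses an idealized "sample" made of exact normal percentiles, so the fitted line recovers the parameters almost exactly; real data would scatter around the line.

```python
from statistics import NormalDist, mean


def normal_plot_pairs(sample):
    """Return (standard normal percentile, j-th smallest observation) pairs."""
    ordered = sorted(sample)
    n = len(ordered)
    std_normal = NormalDist()
    return [(std_normal.inv_cdf((j - 0.5) / n), x) for j, x in enumerate(ordered, start=1)]


def least_squares_line(pairs):
    """Slope and intercept of the ordinary least-squares line through the pairs."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Idealized sample: exact percentiles of N(mu=50, sigma=4) at the plotting levels.
mu, sigma, n = 50.0, 4.0, 20
sample = [NormalDist(mu, sigma).inv_cdf((j - 0.5) / n) for j in range(1, n + 1)]

slope, intercept = least_squares_line(normal_plot_pairs(sample))
print(round(slope, 6), round(intercept, 6))
```

Here the fitted slope plays the role of the standard deviation and the fitted intercept the role of the mean.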

10 Some categories for non-normal distributions and relation to plot
Symmetric with lighter tails than the normal distribution: points below the line on the right end and above the line on the left end (see Figure 4.34).
Symmetric with heavier tails than the normal distribution: points above the line on the right end and below the line on the left end (see Figure 4.37(a)).
Short left tail and long right tail (positively skewed): both the smallest and the largest observations will be above the line (see Figure 4.37(b)).
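These three patterns can be checked numerically with idealized samples, i.e. exact percentiles of a non-normal distribution rather than random data. The sketch assumes a reference line drawn through the quartile points of the plot (a choice made here for illustration) and uses the uniform, Laplace, and exponential distributions as stand-ins for light-tailed, heavy-tailed, and positively skewed populations.

```python
import math
from statistics import NormalDist

Z = NormalDist()  # standard normal, supplies the x-coordinates of the plot


def tail_residuals(quantile, n=20):
    """Return (left, right): the idealized smallest and largest observation
    minus the reference line, for a sample of size n.

    The reference line passes through the quartile points of the plot
    (an assumed choice); positive means the point lies above the line.
    """
    z_lo, z_hi = Z.inv_cdf(0.25), Z.inv_cdf(0.75)
    slope = (quantile(0.75) - quantile(0.25)) / (z_hi - z_lo)
    intercept = quantile(0.25) - slope * z_lo

    def residual(p):
        return quantile(p) - (intercept + slope * Z.inv_cdf(p))

    return residual(0.5 / n), residual((n - 0.5) / n)


def uniform_q(p):
    # Uniform on (-1, 1): symmetric, lighter tails than the normal.
    return 2.0 * p - 1.0


def laplace_q(p):
    # Standard Laplace: symmetric, heavier tails than the normal.
    return math.log(2.0 * p) if p < 0.5 else -math.log(2.0 * (1.0 - p))


def exponential_q(p):
    # Exponential with rate 1: short left tail, long right tail.
    return -math.log(1.0 - p)


light = tail_residuals(uniform_q)
heavy = tail_residuals(laplace_q)
skewed = tail_residuals(exponential_q)
print(light, heavy, skewed)
```

The signs of the residuals match the descriptions: (above, below) for light tails, (below, above) for heavy tails, and (above, above) for positive skew.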

11 How close is close?
Even when the population distribution is normal, the sample percentiles will not coincide exactly with the theoretical percentiles because of sampling variability. How much can the points in the probability plot deviate from a straight-line pattern before the assumption of population normality is no longer plausible?

12 Small sample versus large sample
There is typically greater variation in the appearance of a probability plot for sample sizes smaller than 30, and only for a much larger sample size does a linear pattern generally predominate. When a plot is based on a small sample, only a very substantial departure from linearity should be taken as conclusive evidence of non-normality.
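A small simulation can make this concrete. The sketch below draws repeated samples from a standard normal population and, for each one, measures how straight the normal probability plot is using the correlation between the z percentiles and the ordered observations. The correlation is used here as an illustrative summary of linearity, not a procedure from the slides. It assumes the 100(j - 0.5)/n convention; the sample sizes, trial count, and seed are arbitrary.

```python
import math
import random
from statistics import NormalDist, mean


def plot_correlation(sample):
    """Correlation between z percentiles and ordered observations.

    Values near 1 mean the normal probability plot is close to a straight line.
    """
    ordered = sorted(sample)
    n = len(ordered)
    std_normal = NormalDist()
    zs = [std_normal.inv_cdf((j - 0.5) / n) for j in range(1, n + 1)]
    z_bar, x_bar = mean(zs), mean(ordered)
    szz = sum((z - z_bar) ** 2 for z in zs)
    sxx = sum((x - x_bar) ** 2 for x in ordered)
    szx = sum((z - z_bar) * (x - x_bar) for z, x in zip(zs, ordered))
    return szx / math.sqrt(szz * sxx)


def simulate(n, trials, rng):
    """Plot correlations for `trials` samples of size n from a standard normal."""
    return [
        plot_correlation([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(trials)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
small = simulate(10, 200, rng)
large = simulate(200, 200, rng)

print("n = 10 :", round(min(small), 3), "to", round(max(small), 3))
print("n = 200:", round(min(large), 3), "to", round(max(large), 3))
```

Even though every sample here comes from a normal population, the small-sample plots should show a noticeably wider range of correlations, which is why only a very substantial departure from linearity is convincing when n is small.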

