1 Probability & Statistics for Engineers & Scientists, by Walpole, Myers, Myers & Ye ~ Chapter 10 Notes. Class notes for ISE 201, San Jose State University, Industrial & Systems Engineering Dept. Steve Kennedy.

2 Statistical Hypotheses
A statistical hypothesis is an assertion or conjecture concerning one or more populations. To "prove" a hypothesis statistically, we generally set up its opposite and see whether we can reject it. Accepting a hypothesis just means there is not enough evidence to refute it; rejection implies the evidence truly does refute it. The hypothesis we wish to test is called the null hypothesis, denoted H0. Rejection of H0 leads to acceptance of the alternative hypothesis H1. Acceptance or rejection is based on the value of a sample statistic. The critical region is the range of the sample statistic for which we reject the null hypothesis; the range for which we accept is called the acceptance region.

3 Testing a Hypothesis There are two types of error that can be made when testing a null hypothesis. Type I error: Rejecting the null hypothesis when it is true. Type II error: Accepting the null hypothesis when it is false. We define  = P(type I error), and  = P(type II error). In hypothesis testing, we generally want to minimize , the probability of making a type I error. The value of  can be reduced by adjusting the critical region. Decreasing  generally causes  to increase and vice versa. Increasing the sample size n will decrease both  and .

4 More Hypothesis Testing
The power of a hypothesis test is the probability of (correctly) rejecting H0 given that H1 is true. For a given α, we would like a test with a high power value. Summary of hypothesis test outcomes:

    Decision \ Reality   H0 true                      H1 true
    Accept H0            Correct acceptance (1 - α)   Type II error (β)
    Reject H0            Type I error (α)             Correct rejection (power = 1 - β)
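The power in the table can also be computed analytically. A sketch (not from the notes) under the same illustrative assumptions as the simulation above: under μ = μ1 the z statistic is normal with mean equal to the standardized shift and unit variance, so the power is the probability of that shifted normal falling beyond ±z_α/2.

    import numpy as np
    from scipy import stats

    mu0, mu1, sigma, n, alpha = 50.0, 52.0, 5.0, 25, 0.05
    z_crit = stats.norm.ppf(1 - alpha / 2)
    shift = (mu1 - mu0) / (sigma / np.sqrt(n))    # mean of z under H1
    power = stats.norm.cdf(-z_crit - shift) + stats.norm.sf(z_crit - shift)
    print(f"power = 1 - beta = {power:.3f}")      # chance of correctly rejecting H0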

5 One-Tailed and Two-Tailed Tests
For a one-tailed test, the alternative hypothesis is in a single direction, for example, H0: μ = μ0 and H1: μ > μ0. For a two-tailed test, the alternative hypothesis can be in either direction, for example, H0: μ = μ0 and H1: μ ≠ μ0. Here α is the sum of the probabilities in the two tails (typically α/2 in each tail). To calculate β, a specific alternative value, H1: μ = μ1 (either high or low), must be used. For a two-sided test, the high and low values will generally be symmetric around μ0.

6 Pre-selecting α vs. using a P-Value
In classical hypothesis testing, we typically pre-select α to be .05 or .01 and then determine the critical region. We can then reject the hypothesis with that level of significance. (Remember that in a two-sided test with α = .05, for example, the critical region would have .025 in each tail.) The alternative is calculating the P-value: the probability of obtaining a result at least as extreme as the one observed, given that H0 is true. The P-value provides more information than just whether the hypothesis was rejected. If rejected, the P-value may be much less than .05 or .01, giving us additional confidence in our decision. If not rejected, the P-value may be very close to .05 or .01, allowing us the option of rejecting at a slightly reduced level. The judgment of the experimenter is used to interpret the calculated P-value.
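As a sketch of the difference, the following computes a two-sided P-value for a made-up observed statistic z_obs = 2.31 and compares it against pre-selected levels; the numbers are illustrative only.

    from scipy import stats

    z_obs = 2.31                                # made-up observed z statistic
    p_value = 2 * stats.norm.sf(abs(z_obs))     # both tails of N(0, 1)
    print(f"P-value = {p_value:.4f}")           # about .021
    print("reject at .05:", p_value < 0.05)     # True
    print("reject at .01:", p_value < 0.01)     # False: close call, judgment needed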

7 Single Mean Tests (σ known)
Suppose we have a population with unknown distribution, mean μ, and known variance σ², and that H0: μ = μ0 and H1: μ ≠ μ0. Compute z = (xbar - μ0) / (σ/√n). If either z < -z_α/2 or z > z_α/2, then reject H0. In this case, the probability of making a type I error (rejecting H0 when it is true) is α. If z falls within the above limits, then accept H0. For a single-sided test with H1: μ > μ0, reject H0 if z > z_α. For a single-sided test with H1: μ < μ0, reject H0 if z < -z_α. The critical region for xbar can also be written in terms of μ0 and σ rather than z.
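A minimal sketch of this test in Python; the data values, mu0, and sigma below are illustrative, not from the notes.

    import numpy as np
    from scipy import stats

    x = np.array([48.2, 51.3, 50.1, 49.7, 52.4, 50.8, 49.5, 51.0])
    mu0, sigma, alpha = 50.0, 1.5, 0.05                 # sigma assumed known
    z = (x.mean() - mu0) / (sigma / np.sqrt(len(x)))    # test statistic
    z_crit = stats.norm.ppf(1 - alpha / 2)              # z_alpha/2
    print(f"z = {z:.3f}, reject H0: {abs(z) > z_crit}")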

8 Single Mean Tests (σ unknown)
Suppose we have a normal population with mean μ and unknown variance, and that H0: μ = μ0 and H1: μ ≠ μ0. Compute t = (xbar - μ0) / (s/√n), which has n - 1 degrees of freedom. If either t < -t_α/2,n-1 or t > t_α/2,n-1, then reject H0. In this case, the probability of making a type I error (rejecting H0 when it is true) is α. If t falls within the above limits, then accept H0. For a single-sided test with H1: μ > μ0, reject H0 if t > t_α,n-1. For a single-sided test with H1: μ < μ0, reject H0 if t < -t_α,n-1. The critical region for xbar can also be written in terms of μ0 and s rather than t. If n ≥ 30, we can still use the normal distribution.
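With σ unknown, this test maps onto scipy.stats.ttest_1samp; a sketch with illustrative data:

    import numpy as np
    from scipy import stats

    x = np.array([48.2, 51.3, 50.1, 49.7, 52.4, 50.8, 49.5, 51.0])
    mu0, alpha = 50.0, 0.05
    t_stat, p_value = stats.ttest_1samp(x, popmean=mu0)   # two-sided by default
    print(f"t = {t_stat:.3f}, P = {p_value:.4f}, reject H0: {p_value < alpha}")

    # The same statistic by hand: t = (xbar - mu0) / (s / sqrt(n)), nu = n - 1
    t_manual = (x.mean() - mu0) / (x.std(ddof=1) / np.sqrt(len(x)))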

9 Difference of 2 Means (σ1, σ2 known)
Suppose we have two populations with unknown distributions, means μ1 and μ2, and known variances σ1² and σ2², and that H0: μ1 - μ2 = d0 and H1: μ1 - μ2 ≠ d0. Compute z = ((xbar1 - xbar2) - d0) / √(σ1²/n1 + σ2²/n2). If either z < -z_α/2 or z > z_α/2, then reject H0. In this case, the probability of making a type I error (rejecting H0 when it is true) is α. If z falls within the above limits, then accept H0. For a single-sided test with H1: μ1 - μ2 > d0, reject H0 if z > z_α. For a single-sided test with H1: μ1 - μ2 < d0, reject H0 if z < -z_α. The critical region for xbar1 - xbar2 can also be written in terms of d0, σ1, and σ2 rather than z.
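SciPy has no one-line z-test for known variances, so a sketch computes the statistic directly; the two samples, σ1, σ2, and d0 below are illustrative.

    import numpy as np
    from scipy import stats

    x1 = np.array([20.1, 19.8, 21.2, 20.5, 19.9, 20.7])
    x2 = np.array([18.9, 19.4, 19.1, 18.7, 19.6, 19.0])
    sigma1, sigma2, d0, alpha = 0.8, 0.7, 0.0, 0.05      # sigmas assumed known
    se = np.sqrt(sigma1**2 / len(x1) + sigma2**2 / len(x2))
    z = ((x1.mean() - x2.mean()) - d0) / se
    print(f"z = {z:.3f}, reject H0: {abs(z) > stats.norm.ppf(1 - alpha / 2)}")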

10 Test of a Single Proportion
Suppose we have a binomial experiment with binomial random variable X and success probability p, and that H0: p = p0 and H1: p ≠ p0. Compute z = (x - n·p0) / √(n·p0·q0), where q0 = 1 - p0. If either z < -z_α/2 or z > z_α/2, then reject H0. In this case, the probability of making a type I error (rejecting H0 when it is true) is α. If z falls within the above limits, then accept H0. For a single-sided test with H1: p > p0, reject H0 if z > z_α. For a single-sided test with H1: p < p0, reject H0 if z < -z_α. The critical region for x can also be written in terms of n, p0, and q0 rather than z. Note: for small n, use the binomial distribution directly.
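A sketch of the normal-approximation version with made-up counts; scipy.stats.binomtest (SciPy 1.7+) gives the exact small-n alternative mentioned in the note.

    import numpy as np
    from scipy import stats

    x, n, p0, alpha = 58, 100, 0.5, 0.05
    q0 = 1 - p0
    z = (x - n * p0) / np.sqrt(n * p0 * q0)     # normal approximation
    print(f"z = {z:.3f}, reject H0: {abs(z) > stats.norm.ppf(1 - alpha / 2)}")

    # Exact test, preferred for small n:
    print("exact P-value:", stats.binomtest(x, n, p0).pvalue)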

11 Goodness of Fit Test
A goodness of fit test helps answer questions like "Is a die fair?" or "Is a population normally distributed?". For any situation comparing expected and observed frequencies in k different categories, if ei is the expected and oi the observed frequency in category i, then χ² = Σ (oi - ei)² / ei is approximately χ² distributed with ν = k - 1 degrees of freedom. The expected frequency ei in each category must be ≥ 5. Categories with ei < 5 may be combined, but without reference to the observed frequencies (don't cheat!). If the frequencies don't match, the χ² statistic will be large, so we generally set up a one-sided test in that direction.

12 Goodness of Fit Test Summary
Steps for a χ² goodness of fit test of the hypothesis H0 that the data follow a given distribution (a worked sketch follows the list):
1. Break the observed data up into a logical group of categories based on the range of the data. Do not base the categories in any way on the observed frequency values.
2. Determine the total number of observations, n, for the observed data.
3. For the given distribution, calculate the probability of a randomly selected observation falling in each category. (These probabilities add to 1.)
4. Multiply each probability by n to get the expected number of observations, ei, in each category. (These, like the oi's, add to n.)
5. Determine the observed frequencies, oi, in each category.
6. Combine categories if necessary to ensure that each ei ≥ 5. Do not use the oi's when deciding which categories to combine.
7. Calculate χ² and reject H0 if the χ² value is too high.
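As a worked sketch of these steps, a fairness test for a die with made-up counts (the expected frequencies are uniform and all ≥ 5, so no combining is needed):

    import numpy as np
    from scipy import stats

    observed = np.array([18, 22, 16, 25, 19, 20])    # o_i over k = 6 faces
    n = observed.sum()                                # n = 120 rolls
    expected = np.full(6, n / 6)                      # e_i = n * (1/6), all >= 5
    chi2, p_value = stats.chisquare(observed, expected)   # nu = k - 1 = 5
    print(f"chi2 = {chi2:.2f}, P-value = {p_value:.3f}")  # reject H0 if chi2 too high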

13 Degrees of Freedom/Test for Independence
The χ² goodness of fit test can apply to many situations. However, the number of degrees of freedom must be adjusted for parameters estimated from the observed data and used to calculate the expected data. For example, to test data for normality, if we estimate μ and σ using xbar and s, we must subtract 2 additional degrees of freedom and use ν = k - 3. In a test for independence of two discrete variables, we write the data in a table and determine the expected table frequencies if the data variables are independent. For this case with a 2-dimensional table, we use the row and column sums to determine the expected frequencies in the table. Here, we use ν = (r - 1)(c - 1) degrees of freedom for a table with r rows and c columns.
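A sketch of the independence test using scipy.stats.chi2_contingency, which computes the expected table from the row and column sums exactly as described above; the 2 × 3 counts are illustrative.

    import numpy as np
    from scipy import stats

    table = np.array([[30, 14, 6],     # rows: levels of one variable
                      [20, 26, 14]])   # columns: levels of the other
    chi2, p_value, dof, expected = stats.chi2_contingency(table)
    print(f"chi2 = {chi2:.2f}, dof = {dof}, P-value = {p_value:.4f}")
    # dof = (r - 1)(c - 1) = 2 here, matching the formula above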

