Presentation on theme: "11.1 – Significance Tests: The Basics"— Presentation transcript:
1 11.1 – Significance Tests: The Basics
Purpose: to assess the evidence provided by data about some claim concerning a population.
Basic Idea: an outcome that would rarely happen if a claim were true is good evidence that the claim is not true.
A significance test is a formal procedure for comparing observed data with a hypothesis whose truth we want to assess. The hypothesis is a statement about a population parameter, like the population mean µ or population proportion p.
2 The statement being tested in a significance test is called the null hypothesis, Ho. The significance test is designed to assess the strength of the evidence against the null hypothesis. Usually the null hypothesis is a statement of ‘no effect’, ‘no difference’, or no change from historical values. It is the status quo.
A significance test assesses the evidence provided by data against a null hypothesis, Ho, in favor of the alternative hypothesis, Ha. The claim about the population we are trying to find evidence for is the alternative hypothesis.
3 Hypotheses always refer to some population, not to a particular outcome. Be sure to state Ho and Ha in terms of a population parameter. We are trying to make a statement about an unknown population value and don’t need to make hypotheses about things we already know.
The alternative hypothesis should express the hopes or suspicions we have before we see the data. It is cheating to first look at the data and then frame Ha to fit what the data show.
4 Example:
Because of variation in the manufacturing process, tennis balls produced by a particular machine do not have identical diameters. Let µ denote the true average diameter for tennis balls currently being produced.
Suppose that the machine was initially calibrated to achieve the design specification µ = 3 inches. The manufacturer is now concerned that the diameters no longer conform to the specification (i.e., µ ≠ 3 inches must now be considered a possibility). If sample evidence suggests that µ ≠ 3 inches, the production process will have to be halted while the machine is recalibrated. Because stopping production is costly, the manufacturer wants to be quite sure that µ ≠ 3 before undertaking recalibration.
5 Under these circumstances, a sensible choice of hypotheses is:
Ho: µ = 3 (the spec is being met, recalibration is unnecessary)
Ha: µ ≠ 3 (the spec isn’t being met, recalibration is necessary)
Only compelling sample evidence would then result in Ho being rejected.
6 Stating Ha is not always straightforward. It is not always clear whether Ha should be one- or two-sided.
Example 11.3 on page 162 in your book concerns a job satisfaction survey. The parameter of interest is the mean µ of the difference in scores. The authors of the study wanted to know if the 2 work conditions have different levels of job satisfaction. They did not specify the direction of the difference. Here the null hypothesis will be Ho: µ = 0. Since the direction of the difference was not specified, the alternative could be µ < 0 or µ > 0. For simplicity it is just written as Ha: µ ≠ 0.
7 Do Now: Exercise 11.4, page 693
Each of the following situations calls for a significance test. State the appropriate Ho and Ha in each case. Be sure to define your parameter each time.
a) Larry’s car averages 26 miles per gallon on the highway. He switches to a new brand of motor oil that is advertised to increase gas mileage. After driving 3000 highway miles with the new oil, he wants to determine if the average gas mileage has increased.
8 b) A May 2005 Gallup Poll report on a national survey of 1028 teenagers revealed that 72% of teens said they rarely or never argue with their friends. You wonder whether this national result will be true in your school. You conduct your own survey of a random sample of students at your school.
9 Solution:
a) µ = the mean gas mileage for Larry’s car on the highway
Ho: µ = 26 mpg
Ha: µ > 26 mpg
b) p = the proportion of teens in your school who rarely or never argue with their friends
Ho: p = 0.72
Ha: p ≠ 0.72
10 Verify that 3 conditions are met before you begin your calculations: SRS, Normality, independence.
Normality check for means: the population distribution is Normal, or the sample size is large (n ≥ 30).
Normality check for proportions: np ≥ 10 and n(1 – p) ≥ 10.
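The Normality checks on this slide can be sketched in a few lines of Python. This is only an illustration and not part of the original slides: the function names and the sample size of 150 are hypothetical, the threshold of 10 is applied to both expected counts, and p0 = 0.72 is the null value from the Gallup exercise.

```python
def normality_ok_for_mean(n, population_is_normal=False):
    """Means: the population is Normal, or the sample is large (n >= 30)."""
    return population_is_normal or n >= 30


def normality_ok_for_proportion(n, p0, threshold=10):
    """Proportions: both expected counts, n*p0 and n*(1 - p0), reach the threshold."""
    expected_successes = n * p0
    expected_failures = n * (1 - p0)
    # Show the actual numbers instead of only claiming they are large enough.
    print(f"n*p0 = {expected_successes:.1f}, n*(1 - p0) = {expected_failures:.1f}")
    return expected_successes >= threshold and expected_failures >= threshold


# Hypothetical school survey of 150 students, tested against p0 = 0.72.
print(normality_ok_for_proportion(150, 0.72))
# A small sample from a population of unknown shape fails the check for means.
print(normality_ok_for_mean(12))
```

Printing the two expected counts mirrors the later advice to show the result of the np check rather than just asserting that it passes. The SRS and independence conditions depend on how the data were collected, so they cannot be checked by a calculation like this.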
11 A test is based on a test statistic. A test statistic is the function of sample data on which a conclusion to reject or fail to reject Ho is based.
Some principles that apply to most tests:
The test is based on a statistic that compares the value of the parameter as stated in the null hypothesis with an estimate of the parameter from the sample data.
Values of the estimate far from the parameter value in the direction specified by the alternative hypothesis give evidence against Ho.
To assess how far the estimate is from the parameter, standardize the estimate. In many common situations, the test statistic has the form:
z = (estimate – hypothesized value) / (standard deviation of the estimate)
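As a rough illustration of the standardized form of the test statistic, here is a short Python sketch. The function name is hypothetical, and the numbers for Larry's car (a sample mean of 27.1 mpg from 30 fill-ups, with a population standard deviation assumed known to be 3 mpg) are invented for the example; only the null value of 26 mpg comes from the exercise.

```python
import math


def z_statistic(estimate, hypothesized_value, sd_of_estimate):
    """Standardize the estimate: how many standard deviations is it from the null value?"""
    return (estimate - hypothesized_value) / sd_of_estimate


# Made-up data: sample mean 27.1 mpg from n = 30 fill-ups,
# population standard deviation assumed known (3 mpg).
n = 30
sigma = 3.0
z = z_statistic(27.1, 26.0, sigma / math.sqrt(n))
print(round(z, 2))
```

With these invented numbers the sample mean sits about 2 standard deviations above the hypothesized value, in the direction that Ha: µ > 26 specifies.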
12 The P-value is a quantitative measure of just how unlikely a given finding is, assuming the null hypothesis is true. We may compare this value to a significance level in order to decide whether or not a finding is significantly different from what was expected. The P-value is a measure of the rarity of a finding.
13 The P-value is the probability (computed supposing Ho to be true) that the test statistic will take a value at least as extreme as that actually observed. It is the conditional probability of observing results at least as extreme as ours if Ho were true.
We can compare the P-value with a value we regard as decisive. The decisive value of P is called a significance level. It is written as the Greek letter alpha, α. The most commonly used significance level is α = 0.05, but it may be preferable to choose a different level based on the situation.
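For a z statistic, "at least as extreme" translates into a tail area under the standard Normal curve. The sketch below shows one way to compute it with Python's standard library; the function name, the labels for the alternative, and the observed value z = 2.1 are assumptions made for the example.

```python
from statistics import NormalDist


def p_value(z, alternative):
    """P-value for a z statistic, computed assuming Ho is true."""
    phi = NormalDist().cdf  # standard Normal cumulative distribution function
    if alternative == "greater":      # Ha: parameter > null value (upper tail)
        return 1 - phi(z)
    if alternative == "less":         # Ha: parameter < null value (lower tail)
        return phi(z)
    if alternative == "two-sided":    # Ha: parameter != null value (both tails)
        return 2 * (1 - phi(abs(z)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")


alpha = 0.05                      # significance level, chosen before seeing the data
p = p_value(2.1, "greater")       # z = 2.1 is a made-up observed statistic
print(f"P-value = {p:.4f}")
print("reject Ho" if p < alpha else "fail to reject Ho")
```

The same z gives a two-sided P-value twice as large as the one-sided value, which is one reason the choice between a one-sided and two-sided Ha has to be made before looking at the data.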
14 Small P-values indicate strong evidence against Ho. Calculating P-values requires knowledge of the sampling distribution of the test statistic when Ho is true.
15 11.2 – Carrying out significance tests
There is a four-step process to test hypotheses. Each step has several requirements, and following it will help you organize your answers so that you don’t lose points on the AP exam.
1. Hypotheses
a) Write the null hypothesis.
b) Write the alternative hypothesis – is it one-sided or two-sided? Are we interested in the upper tail, the lower tail, or both?
16 2. Model
a) Decide which inference procedure is correct. There are many.
b) List the assumptions and check them. If you are checking np, show the result – don’t just say it’s greater than 10.
c) Name the test.
17 3. Mechanics
a) Write down the statistics and use proper notation.
b) Draw a curve depicting the model – mark the hypothesized parameter and the observed statistic, and shade the appropriate tail.
c) Calculate the value of the test statistic. It could be z, t, or χ². Show the formula, substitute all the proper values, and give the final result. (You can do the calculations in the calculator and just show the results.)
d) Find the P-value. Often you will be able to use the TI and copy down the P-value.
18 4. Conclusion
a) Link the P-value to the decision. You need to be clear how the calculated P-value led to your decision.
b) State the decision about the null hypothesis. Either you reject it or fail to reject it – never accept it.
c) Interpret the decision in the proper context.
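The four steps can be traced in code for one specific case, a two-sided one-proportion z test. The sketch below is an illustration under stated assumptions: the function name and the survey result (96 of 150 students) are invented, and p0 = 0.72 is the null value from the Gallup exercise. Drawing the curve and writing the conclusion in context still have to be done by hand.

```python
import math
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alpha=0.05):
    """Two-sided one-proportion z test, organized by the four steps."""
    # 1. Hypotheses: Ho: p = p0 versus Ha: p != p0.
    # 2. Model: one-proportion z test; keep the counts used in the check.
    expected = (n * p0, n * (1 - p0))
    conditions_ok = min(expected) >= 10
    # 3. Mechanics: test statistic and P-value.
    p_hat = successes / n
    sd = math.sqrt(p0 * (1 - p0) / n)   # standard deviation of p_hat assuming Ho
    z = (p_hat - p0) / sd
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # 4. Conclusion: link the P-value to the decision (never "accept Ho").
    decision = "reject Ho" if p_value < alpha else "fail to reject Ho"
    return {"expected_counts": expected, "conditions_ok": conditions_ok,
            "p_hat": p_hat, "z": z, "p_value": p_value, "decision": decision}


# Made-up school survey: 96 of 150 students rarely or never argue with friends.
result = one_proportion_z_test(96, 150, 0.72)
for name, value in result.items():
    print(name, "=", value)
```

Returning the intermediate values, not just the decision, matches the advice above: the expected counts, the test statistic, and the P-value all need to appear in a written answer.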
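19 Confidence intervals and two-sided significance tests are closely connected, provided that the significance level for the test and the confidence level for the interval add to 100%. For example, a two-sided test at α = 0.05 rejects Ho: µ = µ0 exactly when µ0 falls outside the 95% confidence interval for µ computed from the same data.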
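The connection between intervals and two-sided tests can be checked numerically for the z procedures for a mean. In this sketch the function names and the tennis ball data (36 balls, mean diameter 3.02 inches, standard deviation assumed known to be 0.05 inches) are invented for illustration.

```python
import math
from statistics import NormalDist


def two_sided_z_test_rejects(x_bar, mu0, sigma, n, alpha):
    """True if the two-sided z test of Ho: mu = mu0 rejects at level alpha."""
    z = (x_bar - mu0) / (sigma / math.sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha


def z_confidence_interval(x_bar, sigma, n, alpha):
    """100(1 - alpha)% z confidence interval for mu."""
    z_star = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_star * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin


# Made-up data: 36 balls, mean diameter 3.02 in, sigma assumed known (0.05 in).
x_bar, sigma, n, alpha = 3.02, 0.05, 36, 0.05
low, high = z_confidence_interval(x_bar, sigma, n, alpha)
for mu0 in (3.00, 3.01, 3.03, 3.04):
    outside = not (low <= mu0 <= high)
    rejects = two_sided_z_test_rejects(x_bar, mu0, sigma, n, alpha)
    print(mu0, "outside interval:", outside, "| test rejects:", rejects)
```

For every hypothesized value tried, the test at α = 0.05 rejects exactly when that value lies outside the 95% interval, because both are built from the same critical value.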
20 11.3 – Use and abuse of tests
Points to keep in mind when using or interpreting significance tests.
1. Choosing a level of Significance
How small a P-value is convincing evidence against Ho? If Ho represents an assumption that people have believed for years, strong evidence will be needed (small P). What are the consequences of rejecting Ho?
21 2. Statistical Significance and Practical Importance
Statistical significance is not the same thing as practical importance (see ex , pg 722). Pay attention to the actual data as well as the P-value.
3. Don’t Ignore Lack of Significance
There is a tendency to infer that there is no effect whenever a P-value fails to attain the usual 5% standard. In some areas of research small effects that are detectable only with large sample sizes can be of great practical significance. When planning a study, verify that the test you plan to use has a high probability of detecting an effect of the size you hope to find.
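22 4. Statistical Inference is not valid for all sets of data.
Badly designed experiments or surveys often produce invalid results.
5. Beware of Multiple Analyses
Statistical significance ought to mean that you have found an effect that you were looking for. If you run many tests and then pick out the few that reach significance, some of those results are likely to have occurred by chance alone.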
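The warning about multiple analyses can be made concrete with a small calculation. The sketch below assumes that every null hypothesis is true and that the tests are independent of one another, which real analyses rarely satisfy exactly; the function name is hypothetical.

```python
def chance_of_false_alarm(num_tests, alpha=0.05):
    """P(at least one significant result) when every Ho is true and tests are independent."""
    # Each test avoids a false alarm with probability 1 - alpha.
    return 1 - (1 - alpha) ** num_tests


for k in (1, 5, 20, 100):
    print(k, "tests:", round(chance_of_false_alarm(k), 3))
```

Under these assumptions, 20 tests at the 5% level have about a 64% chance of producing at least one "significant" result even though there is no effect anywhere.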
23 11.4 – Using Inference to Make Decisions
Type 1 error: Reject Ho when Ho is actually true.
Type 2 error: Fail to reject Ho when Ho is false.
Which error is the more serious depends upon the circumstances.
The significance level α of any fixed level test is the probability of a Type 1 error. That is, α is the probability that the test will reject the null hypothesis Ho when Ho is in fact true.
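One way to see that the significance level is the Type 1 error probability is to simulate many samples from a population in which the null hypothesis really is true and count how often a fixed level test rejects. The sketch below does this for a two-sided z test; the function name, the population values (mean 3.0, standard deviation 0.05), the sample size, and the random seed are all assumptions chosen for illustration.

```python
import math
import random
from statistics import NormalDist, fmean


def type1_error_rate(mu0, sigma, n, alpha, trials=10_000, seed=1):
    """Fraction of simulated samples, drawn with Ho true, in which Ho is rejected."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a Normal population whose mean really is mu0.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / math.sqrt(n))
        if abs(z) > z_star:
            rejections += 1        # Ho is true here, so this is a Type 1 error
    return rejections / trials


print(type1_error_rate(mu0=3.0, sigma=0.05, n=20, alpha=0.05))
```

The printed fraction should land close to 0.05, with some simulation noise. Changing alpha changes the rejection rate to match, while changing n does not.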
24 The probability that a fixed level significance test will reject Ho when a particular alternative value of the parameter is true is called the power of the test against the alternative.
The power of a test against any alternative is 1 minus the probability of a Type 2 error for that alternative; that is, power = 1 - β.
Increasing the size of the sample increases the power (reduces the probability of a Type 2 error) when the significance level remains fixed. We can also increase the power of a test by using a higher significance level.
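The two claims about power can be checked with a short calculation for an upper-tailed z test with known standard deviation. This is a sketch under stated assumptions: the function name is hypothetical, and the setup (null value 26 mpg, a true mean of 27 mpg, standard deviation 3 mpg) is invented around the gas mileage exercise.

```python
import math
from statistics import NormalDist


def power_upper_tailed_z(mu0, mu_alt, sigma, n, alpha):
    """Power of the z test of Ho: mu = mu0 vs Ha: mu > mu0 when the true mean is mu_alt."""
    std_normal = NormalDist()
    z_star = std_normal.inv_cdf(1 - alpha)            # reject when z >= z_star
    shift = (mu_alt - mu0) / (sigma / math.sqrt(n))   # mean of z when mu = mu_alt
    beta = std_normal.cdf(z_star - shift)             # P(Type 2 error)
    return 1 - beta


# Made-up setup: mu0 = 26 mpg, true mean 27 mpg, sigma assumed known (3 mpg).
for n in (10, 30, 100):
    print("n =", n, "power =", round(power_upper_tailed_z(26, 27, 3, n, 0.05), 3))
print("n = 30, alpha = 0.10:", round(power_upper_tailed_z(26, 27, 3, 30, 0.10), 3))
```

The power rises as n grows with α held at 0.05, and it also rises when α is raised from 0.05 to 0.10 at the same n. The price of the second route is a larger probability of a Type 1 error.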