4-1 Statistical Inference Statistical inference is to make decisions or draw conclusions about a population using the information contained in a sample.


1

2 4-1 Statistical Inference Statistical inference means making decisions or drawing conclusions about a population using the information contained in a sample from that population. Its two major areas are: 1. Parameter Estimation 2. Hypothesis Testing

3 4-2 Point Estimation A point estimate is the observed value of a point estimator (a statistic) computed from a particular sample.

4 4-2 Interval Estimation - Confidence Interval

5 Note: L (lower confidence limit) and U (upper confidence limit) are statistics and hence random variables. Ex) Confidence level = 95% (1-α = 0.95) ⇔ P(C.I. will contain the true parameter) = 0.95 ⇔ 95% of all the C.I.s will contain the true parameter. The general formula for all confidence intervals is: Point Estimate ± (Critical Value)(Standard Error). Lower confidence limit: L = Point Estimate - Critical Value*S.E. Upper confidence limit: U = Point Estimate + Critical Value*S.E. Width of confidence interval = U - L = 2*Critical Value*S.E.
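
The statement that 95% of all such intervals contain the true parameter can be checked by simulation. The following is a minimal sketch, not part of the original slides: the true mean, σ, sample size, repetition count, and seed are all hypothetical values chosen for illustration, and the interval is the known-variance Z interval.

```python
import random
from statistics import NormalDist, fmean


def z_interval(sample, sigma, confidence=0.95):
    """Point estimate ± critical value × standard error (sigma known)."""
    alpha = 1 - confidence
    critical_value = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = sigma / len(sample) ** 0.5
    point_estimate = fmean(sample)
    return (point_estimate - critical_value * standard_error,
            point_estimate + critical_value * standard_error)


def coverage(true_mean=10.0, sigma=2.0, n=30, repetitions=4000, seed=1):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(repetitions):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        lower, upper = z_interval(sample, sigma)
        hits += lower <= true_mean <= upper
    return hits / repetitions


print(f"empirical coverage: {coverage():.3f}")  # expected to be near 0.95
```

Each simulated sample gives a different (L, U), which is what it means for the limits to be random variables; the true mean stays fixed.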

6 Chap 8-6 Confidence Interval for a Mean μ when the variance σ² is known. Assumptions: –Population standard deviation σ is known –Population is normally distributed –If population is not normal, use large sample (CLT). 100(1-α)% (two-sided) Confidence Interval for μ: X̄ ± Z_{α/2}·σ/√n, or (X̄ - Z_{α/2}·σ/√n, X̄ + Z_{α/2}·σ/√n), where Z_{α/2} is the standard normal critical value that leaves a probability of α/2 in each tail.

7 100(1-α)% Upper-Confidence Bound for μ: μ ≤ X̄ + Z_α·σ/√n. 100(1-α)% Lower-Confidence Bound for μ: X̄ - Z_α·σ/√n ≤ μ.

8 Chap 8-8 Critical Value: Z_{α/2}. Consider a 95% confidence interval: 1-α = 0.95, so an area of α/2 = 0.025 lies in each tail of the standard normal distribution (centered at 0): Z_{1-α/2} = -1.96 and Z_{α/2} = 1.96. Commonly used confidence levels are 90%, 95%, and 99%. Note: Z_{1-α/2} = -Z_{α/2}

9 Chap 8-9 Example A sample of 11 circuits from a normal population has a mean resistance of 2.20 ohms. We know from past testing that the population standard deviation is 0.35 ohms. Determine a 95% confidence interval for the true mean resistance of the population. With Z_{0.025} = 1.96, the interval is 2.20 ± 1.96(0.35/√11) = 2.20 ± 0.2068 = (1.9932, 2.4068).
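
The circuit example can be reproduced in a few lines. This is a sketch using only the Python standard library; the variable names are illustrative, and the critical value comes from `statistics.NormalDist` rather than a printed table.

```python
from math import sqrt
from statistics import NormalDist

# Values from the example: n circuits, sample mean, known population sigma.
n, x_bar, sigma, confidence = 11, 2.20, 0.35, 0.95

alpha = 1 - confidence
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for 95%
half_width = z_crit * sigma / sqrt(n)         # critical value × standard error

lower, upper = x_bar - half_width, x_bar + half_width
print(f"({lower:.4f}, {upper:.4f})")  # → (1.9932, 2.4068)
```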

10 Chap 8-10 Confidence Interval for a Mean μ when the variance σ² is unknown. If the population standard deviation σ is unknown, we can substitute the sample standard deviation, S. This introduces extra uncertainty, since S varies from sample to sample => use the t distribution instead of the normal distribution. Assumptions: –Population standard deviation σ is unknown –Population is normally distributed –If population is not normal, use large sample. 100(1-α)% Confidence Interval for μ: X̄ ± t_{α/2, n-1}·S/√n, or (X̄ - t_{α/2, n-1}·S/√n, X̄ + t_{α/2, n-1}·S/√n), where t_{α/2, n-1} is the critical value of the t distribution with n-1 d.f. and an area of α/2 in each tail.

11 100(1-α)% Upper-Confidence Bound for μ: μ ≤ X̄ + t_{α, n-1}·S/√n. 100(1-α)% Lower-Confidence Bound for μ: X̄ - t_{α, n-1}·S/√n ≤ μ.

12 Chap 8-12 Student’s t Distribution [Figure: density curves of t (df = 5), t (df = 13), and the standard normal (t with df = ∞), all centered at 0.] t-distributions are symmetric and bell shaped but have heavier tails than the normal. The t value depends on degrees of freedom (d.f.). As d.f. goes to infinity, the t-distribution -> N(0, 1²).

13 Table of t-distribution

14 Chap 8-14 Example A random sample of n = 25 has the sample mean 50 and the sample variance 8. Form a 95% confidence interval for μ. –d.f. = n – 1 = 24, so t_{α/2, n-1} = t_{0.025, 24} = 2.0639 –The confidence interval is 50 ± 2.0639·√8/√25 = (48.832, 51.168)
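
The arithmetic of this t interval can be sketched as follows. The Python standard library has no t-distribution quantile function, so the critical value 2.0639 is entered by hand as the table value t_{0.025, 24}; the variable names are illustrative.

```python
from math import sqrt

# Values from the example: sample size, sample mean, sample variance.
n, x_bar, s_squared = 25, 50.0, 8.0

t_crit = 2.0639  # t_{0.025, 24}, read from a t table (not computed here)
half_width = t_crit * sqrt(s_squared) / sqrt(n)  # t × S/√n

lower, upper = x_bar - half_width, x_bar + half_width
print(f"({lower:.3f}, {upper:.3f})")  # → (48.832, 51.168)
```

Note that the example gives the sample variance, so S = √8 must be taken before dividing by √n.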

15 (16.457, 17.483)

16 Chap 8-16 Confidence Intervals for the variance of a normal population

17

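18 100(1-α)% Confidence Interval for σ²: (n-1)S²/χ²_{α/2, n-1} ≤ σ² ≤ (n-1)S²/χ²_{1-α/2, n-1}. 100(1-α)% Upper-Confidence Bound for σ²: σ² ≤ (n-1)S²/χ²_{1-α, n-1}. 100(1-α)% Lower-Confidence Bound for σ²: (n-1)S²/χ²_{α, n-1} ≤ σ². Here χ²_{a, n-1} denotes the value with upper-tail area a under the chi-square distribution with n-1 d.f.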

19 19

20 Chap 8-20 Confidence Intervals for the Population Proportion, p

21 100(1-α)% Confidence Interval for p: p̂ ± Z_{α/2}·√(p̂(1-p̂)/n). 100(1-α)% Upper-Confidence Bound for p: p ≤ p̂ + Z_α·√(p̂(1-p̂)/n). 100(1-α)% Lower-Confidence Bound for p: p̂ - Z_α·√(p̂(1-p̂)/n) ≤ p.

22 Chap 8-22 [Example] A random sample of 100 people shows that 25 wear glasses. Form a 95% confidence interval for the true proportion of the population who wear glasses. With p̂ = 25/100 = 0.25, the interval is 0.25 ± 1.96·√(0.25(0.75)/100) = 0.25 ± 0.0849 = (0.1651, 0.3349). Note: We are 95% confident that the true percentage of people wearing glasses in the population is between 16.51% and 33.49%. Although the interval from 0.1651 to 0.3349 may or may not contain the true proportion, 95% of intervals formed from samples of size 100 in this manner will contain the true proportion.
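
The glasses example follows the same point estimate ± critical value × standard error pattern, with the standard error of a sample proportion. This is a short sketch using the standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values from the example: 25 of 100 sampled people wear glasses.
n, successes, confidence = 100, 25, 0.95

p_hat = successes / n
alpha = 1 - confidence
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
standard_error = sqrt(p_hat * (1 - p_hat) / n)  # large-sample approximation

lower = p_hat - z_crit * standard_error
upper = p_hat + z_crit * standard_error
print(f"({lower:.4f}, {upper:.4f})")  # → (0.1651, 0.3349)
```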

23

24 4-3 Hypothesis Testing 4-3.1 Statistical Hypotheses A (statistical) hypothesis is a statement or claim about a population parameter (not about a sample statistic). Ex) The mean electric bill per household of this city is μ = $132. The proportion of adults in this city with full-time jobs is p = 0.61. Hypothesis testing is a procedure leading to a decision about a hypothesis based on a random sample. The Null Hypothesis (H0) states the assumption to be tested; a hypothesis test begins with the assumption that H0 is true. The Alternative Hypothesis (H1) is the opposite of the null hypothesis; it is the hypothesis that the researcher is trying to support. Ex) H0: The mean age of smart phone users is 28. (H0: μ = 28) H1: The mean age of smart phone users is not 28. (H1: μ ≠ 28)

25 Example - Insight into Hypothesis Testing Suppose that we are interested in the burning rate of a solid propellant used to power aircrew escape systems, and that our interest focuses on the mean burning rate μ (a parameter of the distribution of the burning rate). Two-sided Alternative Hypothesis: if we are interested in deciding whether or not the mean burning rate is 50 centimeters per second, H0: μ = 50 cm/s, H1: μ ≠ 50 cm/s. One-sided Alternative Hypotheses: if we are trying to prove that the mean burning rate is less than 50 centimeters per second, H0: μ = 50 cm/s, H1: μ < 50 cm/s. Note: If H1: μ < 50 cm/s, then we can write the null hypothesis as H0: μ = 50 cm/s or H0: μ ≥ 50 cm/s. Both expressions lead to the same testing procedure and the same decision.

26 4-3.2 Testing Statistical Hypotheses Hypothesis-testing procedures rely on using the information in a random sample from the population of interest. If this information is consistent with the hypothesis, we will not reject the hypothesis; if this information is inconsistent with the hypothesis, we will conclude that the hypothesis is false. Sample the population and find the sample mean. Suppose the claimed population mean is 50 and the sample mean is X̄ = 20. This is significantly lower than the claimed population mean. If the null hypothesis were true, the probability of getting a sample mean this different would be very small, so you reject the null hypothesis. In other words, a sample mean of 20 would be so unlikely if the population mean were 50 that you conclude the population mean must not be 50.

27 [Figure: sampling distribution of X̄ if H0 is true (μ = 50), with the observed sample mean X̄ = 20 marked.]

28 Chap 9-28 The Test Statistic and Rejection Region If the sample mean is close to the assumed population mean, the null hypothesis is not rejected. If the sample mean is far from the assumed population mean, the null hypothesis is rejected. How far is “far enough” to reject H0? A test statistic is a statistic computed from the sample data to make a decision about the hypothesis, ex) sample mean, sample variance, sample proportion, etc. If the test statistic value falls in the rejection region, we reject H0. The boundaries that define the rejection regions are called the critical values. [Figure: distribution of the test statistic with the critical values and the rejection region marked.]

29 How to decide the rejection region (critical values)?
Lower-tail test: H0: μ ≥ 50, H1: μ < 50 (rejection region of area α in the lower tail)
Upper-tail test: H0: μ ≤ 50, H1: μ > 50 (rejection region of area α in the upper tail)
Two-tail test: H0: μ = 50, H1: μ ≠ 50 (rejection regions of area α/2 in each tail)
The critical values are decided by i) the distribution of the test statistic ii) the significance level α (see next page)

30 Errors in Decision Making The conclusion from a hypothesis test may be an error since it is based on a random sample (random experiment).
Type I Error: Rejecting the null hypothesis when it is true. The probability of a Type I Error is called the significance level or size of the test, denoted by α. The significance level is usually set by researchers in advance.
Type II Error: Failing to reject the null hypothesis when it is false. The probability of a Type II Error is denoted by β. 1-β is called the power of the test.

Decision         | Actual Situation: H0 True    | Actual Situation: H0 False
Do Not Reject H0 | No Error (probability 1-α)   | Type II Error (probability β)
Reject H0        | Type I Error (probability α) | No Error (probability 1-β)
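
To make β and the power concrete, the sketch below computes them for a two-sided Z-test of H0: μ = μ0 with known σ. It is an illustration under stated assumptions, not a calculation from the slides: the function name and the numbers (μ0 = 50, true mean 51, σ = 2, n = 25) are hypothetical. It relies on the fact that, when the true mean is μ, the Z statistic is normal with mean (μ - μ0)√n/σ and variance 1.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(mu_true, mu_0, sigma, n, alpha=0.05):
    """P(do not reject H0: mu = mu_0) when the true mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = (mu_true - mu_0) * sqrt(n) / sigma  # mean of Z0 under mu_true
    # Z0 lands inside (-z_crit, z_crit), the non-rejection region.
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)


# Hypothetical setting: H0 says 50, but the true mean is 51.
beta = type_ii_error(mu_true=51.0, mu_0=50.0, sigma=2.0, n=25)
print(f"beta = {beta:.4f}, power = {1 - beta:.4f}")

# When H0 is actually true, the non-rejection probability is 1 - alpha.
print(f"{type_ii_error(50.0, 50.0, 2.0, 25):.2f}")  # → 0.95
```

Unlike α, β is not a single number: it depends on how far the true mean is from μ0, and it shrinks as that distance or the sample size grows.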

31

32 Chap 9-32 Hypothesis Testing procedure using Rejection Region
1. State the null hypothesis, H0, and the alternative hypothesis, H1.
2. Choose the significance level, α.
3. Determine the test statistic to use / convert the sample statistic (ex. X̄) to a test statistic (ex. Z-statistic).
4. Find the critical values and determine the rejection region(s).
5. Collect data and compute the test statistic value from the sample result.
6. Compare the test statistic to the critical value to determine whether the test statistic falls in the region of rejection. Make the statistical decision: reject H0 if the test statistic falls in the rejection region.

33 4-3.3 P-Values in Hypothesis Testing The p-value is the probability of obtaining a test statistic equal to or more extreme than the observed sample value when H0 is true. It is sometimes referred to as “the observed level of significance” or the “smallest value of α for which H0 can be rejected”. The p-value measures the plausibility of the null hypothesis, H0: “The smaller the p-value, the less plausible is the null hypothesis.”

34 Chap 9-34 Hypothesis Testing procedure using P-value
1. State the null hypothesis, H0, and the alternative hypothesis, H1.
2. Choose the significance level, α.
3. Determine the test statistic to use / convert the sample statistic (ex. X̄) to a test statistic (ex. Z-statistic).
4. Collect data and compute the test statistic from the sample result.
5. Obtain the p-value from a distribution table of the test statistic (or by using Excel, Minitab, etc.).
6. Compare the p-value with α: if p-value < α, reject H0; if p-value ≥ α, do not reject H0.

35 Hypothesis Testing on the Mean

36 4-4 Inference on the Mean of a Population, Variance Known Assumptions

37 4-4.1 Hypothesis Testing on the Mean, Variance Known

38 Ex: Hypothesis Testing: σ Known, two-sided. H0: μ = μ0, H1: μ ≠ μ0. Convert the sample statistic (X̄) to the test statistic Z0 = (X̄ - μ0)/(σ/√n). Determine the critical Z values for a specified level of significance α: the lower critical value is -Z_{α/2} and the upper critical value is +Z_{α/2}, with an area of α/2 in each rejection tail. Decision Rule: If the test statistic falls in the rejection region (Z0 < -Z_{α/2} or Z0 > +Z_{α/2}), reject H0; otherwise do not reject H0.

39 Chap 9-39 Example To test the claim that the mean weight of chocolate bars manufactured in a factory is 3 ounces, we weighed 100 chocolate bars and the average weight was 2.84. Suppose that, from past records, the standard deviation is known to be 0.8.
1) State the null and alternative hypotheses: H0: μ = 3, H1: μ ≠ 3 (two-sided test)
2) Choose the desired level of significance: suppose that α = 0.05 is chosen for this test
3) Determine the test statistic: σ is known, so this is a Z-test with z0 = (2.84 - 3)/(0.8/√100) = -2.0
4) Find the critical values and determine the rejection region(s): for α = 0.05, the critical Z-values are ±1.96; reject H0 if z0 < -1.96 or z0 > 1.96
5) Reach a decision and interpret the result: since z0 = -2.0 < -1.96, you reject the null hypothesis. (That is, there is sufficient evidence that the mean weight of chocolate bars is not equal to 3.)
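
The rejection-region steps of the chocolate-bar example can be sketched as a small function. The function name and return shape are illustrative choices, not something the slides define; the critical value is taken from `statistics.NormalDist` instead of a table.

```python
from math import sqrt
from statistics import NormalDist


def z_test_two_sided(x_bar, mu_0, sigma, n, alpha=0.05):
    """Two-sided Z-test of H0: mu = mu_0 with known sigma.

    Returns the test statistic, the critical value, and the decision.
    """
    z_0 = (x_bar - mu_0) / (sigma / sqrt(n))      # test statistic
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    reject = abs(z_0) > z_crit                    # falls in either tail
    return z_0, z_crit, reject


z_0, z_crit, reject = z_test_two_sided(x_bar=2.84, mu_0=3.0, sigma=0.8, n=100)
print(f"z0 = {z_0:.2f}, critical values = ±{z_crit:.2f}, reject H0: {reject}")
# → z0 = -2.00, critical values = ±1.96, reject H0: True
```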

40 Example - revisit To test the claim that the mean weight of chocolate bars manufactured in a factory is 3 ounces, we weighed 100 chocolate bars and the average weight was 2.84. Suppose that, from past records, the standard deviation is known to be 0.8. Test at α = 0.05 using the p-value. X̄ = 2.84 is translated to a Z score of z0 = -2.0. p-value = 2P(Z > |z0|) = 2P(Z > 2.0) = 2*0.0228 = 0.0456. Since p-value = 0.0456 < α (= 0.05), we reject the null hypothesis. [Figure: standard normal curve with critical values ±1.96 (α/2 = .025 in each tail) and tail areas of .0228 beyond Z = ±2.0.]
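
The two-sided p-value can also be computed directly rather than read from a table. This is a minimal sketch with illustrative variable names; it gives about 0.0455, while the slide's 0.0456 comes from doubling the table value 0.0228, which is already rounded.

```python
from math import sqrt
from statistics import NormalDist

# Values from the chocolate-bar example.
x_bar, mu_0, sigma, n, alpha = 2.84, 3.0, 0.8, 100, 0.05

z_0 = (x_bar - mu_0) / (sigma / sqrt(n))
# Two-sided p-value: probability of a Z at least as extreme as |z0| in either tail.
p_value = 2 * (1 - NormalDist().cdf(abs(z_0)))

print(f"z0 = {z_0:.2f}, p-value = {p_value:.4f}")  # → z0 = -2.00, p-value = 0.0455
print("reject H0" if p_value < alpha else "do not reject H0")  # → reject H0
```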

41 Chap 9-41 Example A phone industry manager thinks that customer monthly cell phone bills have increased, and now average more than $52 per month. Past company records indicate that the standard deviation is about $10. He collects a sample of n = 64, and the sample mean is 53.1. Test this claim at α = 0.10.
1) H0: μ ≤ 52 vs H1: μ > 52
2) Test statistic: Z0 = (53.1 - 52)/(10/√64) = 0.88
3) Rejection region: critical value Z_{0.10} = 1.28; if Z0 > 1.28, then reject H0
4) Since Z0 = 0.88 < 1.28, we cannot reject H0
5) We cannot say that the mean bill is greater than $52
[Figure: standard normal curve with the upper-tail rejection region of area α = .10 beyond 1.28, area 1-α = .90 below it, and Z0 = .88 in the non-rejection region.]

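42 Chap 9-42 P-value method: Let’s calculate the p-value and compare it to α. p-value = P(Z > 0.88) = 0.1894. We do not reject H0 since p-value = 0.1894 > α (= .10). [Figure: standard normal curve with the rejection region of area α = .10 beyond 1.28 and the upper-tail area of 0.1894 beyond Z = .88.]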
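
Both the rejection-region and the p-value versions of the phone-bill test can be checked with a short sketch. The variable names are illustrative; for an upper-tail test only the upper tail counts, so the p-value is not doubled.

```python
from math import sqrt
from statistics import NormalDist

# Values from the phone-bill example: H0: mu <= 52 vs H1: mu > 52.
x_bar, mu_0, sigma, n, alpha = 53.1, 52.0, 10.0, 64, 0.10

std_normal = NormalDist()
z_0 = (x_bar - mu_0) / (sigma / sqrt(n))
z_crit = std_normal.inv_cdf(1 - alpha)  # upper-tail critical value
p_value = 1 - std_normal.cdf(z_0)       # upper-tail area beyond z0

print(f"z0 = {z_0:.2f}, critical value = {z_crit:.2f}, p-value = {p_value:.4f}")
# → z0 = 0.88, critical value = 1.28, p-value = 0.1894
print("reject H0" if z_0 > z_crit else "do not reject H0")  # → do not reject H0
```

The two decision rules always agree: Z0 > Z_α exactly when the p-value is below α.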

43 4-5 Inference on the Mean of a Population, Variance Unknown Student’s t Distribution: t-distributions are symmetric and bell shaped but have heavier tails than the normal. The t value depends on degrees of freedom (d.f.). As d.f. goes to infinity, the t-distribution -> N(0, 1²).

44 4-5.1 Hypothesis Testing on the Mean, Variance Unknown Assumptions: Population standard deviation is unknown. Population is normally distributed; if population is not normal, use a large sample.

45 Calculating the P-value

46 Chap 9-46 Example The mean cost of a hotel room in LA is said to be $168 per night. A random sample of 25 hotels resulted in X̄ = 172.50 and S = 15.40. Test at the α = 0.05 level, assuming the data are normally distributed. H0: μ = 168, H1: μ ≠ 168. σ is unknown, so use a t-statistic: t0 = (172.50 - 168)/(15.40/√25) = 1.46. Critical values: ±t_{0.025, 24} = ±2.0639. Reject H0 if t0 > 2.0639 or t0 < -2.0639. Since t0 = 1.46 does not fall in the rejection region, we cannot reject H0.
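
The hotel example can be worked through as below. As with the t interval, the standard library offers no t quantile, so this sketch takes the table value 2.0639 as given; the variable names are illustrative.

```python
from math import sqrt

# Values from the hotel example: H0: mu = 168 vs H1: mu != 168.
n, x_bar, s, mu_0 = 25, 172.50, 15.40, 168.0

t_crit = 2.0639  # t_{0.025, 24}, read from a t table (not computed here)
t_0 = (x_bar - mu_0) / (s / sqrt(n))  # t statistic with n - 1 = 24 d.f.
reject = abs(t_0) > t_crit

print(f"t0 = {t_0:.3f}, reject H0: {reject}")  # → t0 = 1.461, reject H0: False
```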

47 Relationship between Tests of Hypotheses and Confidence Intervals
Two-sided (H0: μ = μ0 vs H1: μ ≠ μ0): the test of significance level α will lead to rejection of H0 ⇔ the hypothesized value μ0 is not in the 100(1-α) percent confidence interval [l, u].
Lower-tail (H0: μ = μ0 vs H1: μ < μ0): the test of significance level α will lead to rejection of H0 ⇔ the hypothesized value μ0 is not in the 100(1-α) percent confidence interval (-∞, u].
Upper-tail (H0: μ = μ0 vs H1: μ > μ0): the test of significance level α will lead to rejection of H0 ⇔ the hypothesized value μ0 is not in the 100(1-α) percent confidence interval [l, ∞).
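
The two-sided case of this equivalence can be illustrated with the chocolate-bar numbers from slide 39 (X̄ = 2.84, σ = 0.8, n = 100). In this sketch the function name is a hypothetical helper and the second hypothesized value, 2.9, is made up for contrast.

```python
from math import sqrt
from statistics import NormalDist


def compare_test_and_interval(x_bar, mu_0, sigma, n, alpha=0.05):
    """Return (reject by Z-test, mu_0 outside the 100(1-alpha)% interval)."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = sigma / sqrt(n)
    reject_by_test = abs((x_bar - mu_0) / standard_error) > z_crit
    lower = x_bar - z_crit * standard_error
    upper = x_bar + z_crit * standard_error
    outside_interval = not (lower <= mu_0 <= upper)
    return reject_by_test, outside_interval


# mu_0 = 3 is rejected by the test and lies outside the interval (2.683, 2.997).
print(compare_test_and_interval(2.84, 3.0, 0.8, 100))  # → (True, True)
# A hypothetical mu_0 = 2.9 is not rejected and lies inside the interval.
print(compare_test_and_interval(2.84, 2.9, 0.8, 100))  # → (False, False)
```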

48 4-6 Inference on the Variance of a Normal Population 4-6.1 Hypothesis Testing on the Variance of a Normal Population

49

50 4-7 Inference on Population Proportion 4-7.1 Hypothesis Testing on a Binomial Proportion We will consider testing:

51

52

53

