Class notes for ISE 201 San Jose State University


Probability & Statistics for Engineers & Scientists, by Walpole, Myers, Myers & Ye: Chapter 8 Notes. Class notes for ISE 201, San Jose State University, Industrial & Systems Engineering Dept. Steve Kennedy

Populations & Samples
A population is the set (possibly infinite) of all possible observations. Each observation is a random variable $X$ having some (often unknown) probability distribution $f(x)$. If the population distribution is known, the population may be named after it, for example a normal population. A sample is a subset of a population. Our goal is to make inferences about the population based on an analysis of the sample. A biased sample, usually obtained by taking convenient rather than representative observations, will consistently over- or under-estimate some characteristic of the population. Observations in a random sample are made independently and at random, so the random variables $X_1, X_2, \ldots, X_n$ in the sample all have the same distribution as the population random variable $X$.

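Sample Statistics
Any function of the random variables $X_1, X_2, \ldots, X_n$ making up a random sample is called a statistic. The most important statistics, as we have seen, are the sample mean, sample variance, and sample standard deviation:

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2, \qquad S = \sqrt{S^2}$$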
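The three statistics named above can be computed directly from their definitions. The following is a minimal sketch; the function names and the five measurements are hypothetical, invented only for illustration.

```python
import math
import statistics

def sample_mean(xs):
    """Sample mean: the sum of the observations divided by n."""
    return sum(xs) / len(xs)

def sample_variance(xs):
    """Sample variance, using the n - 1 divisor."""
    xbar = sample_mean(xs)
    return sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)

def sample_std(xs):
    """Sample standard deviation: square root of the sample variance."""
    return math.sqrt(sample_variance(xs))

# Hypothetical diameters in cm (not data from the notes).
diameters = [6.1, 5.9, 6.0, 6.2, 5.8]

xbar = sample_mean(diameters)
s2 = sample_variance(diameters)
s = sample_std(diameters)
print(xbar, s2, s)

# Cross-check against the standard library, which also uses the n - 1 divisor.
print(statistics.mean(diameters), statistics.variance(diameters), statistics.stdev(diameters))
```

In practice the `statistics` module functions would normally be used directly; the explicit versions are shown only to make the formulas visible.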

Sampling Distributions
Starting with an unknown population distribution, we can study the sampling distribution, that is, the distribution of a sample statistic (like $\bar{X}$ or $S$) calculated from a sample of size $n$ from that population. The sample consists of independent and identically distributed observations $X_1, X_2, \ldots, X_n$ from the population. Based on the sampling distributions of $\bar{X}$ and $S$ for samples of size $n$, we will make inferences about the population mean $\mu$ and variance $\sigma^2$. We could approximate the sampling distribution of $\bar{X}$ by taking a large number of random samples of size $n$ and plotting the distribution of the $\bar{X}$ values.
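The "many samples, then plot" idea can be sketched with a small simulation. Everything specific here is an assumption made for illustration: the population is taken to be uniform on [0, 10], the sample size is 25, and the helper names are hypothetical. The plot is a crude text histogram so that no plotting library is needed.

```python
import random
import statistics

def simulate_sample_means(draw, n, reps, rng):
    """Return `reps` sample means, each from a fresh random sample of size n."""
    return [statistics.fmean([draw(rng) for _ in range(n)]) for _ in range(reps)]

def text_histogram(values, bins=10, width=40):
    """Build a simple horizontal bar chart of the values as a string."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    for v in values:
        k = min(int((v - lo) / (hi - lo) * bins), bins - 1)
        counts[k] += 1
    peak = max(counts)
    lines = []
    for k, c in enumerate(counts):
        left_edge = lo + k * (hi - lo) / bins
        lines.append(f"{left_edge:6.2f} | " + "#" * round(width * c / peak))
    return "\n".join(lines)

rng = random.Random(201)  # fixed seed so the run is repeatable

# Assumed population: uniform on [0, 10], so the population mean is 5.
xbars = simulate_sample_means(lambda r: r.uniform(0, 10), n=25, reps=2000, rng=rng)

print(text_histogram(xbars))
print("mean of the sample means:", statistics.fmean(xbars))
print("std dev of the sample means:", statistics.stdev(xbars))
```

The histogram should look roughly bell-shaped and centered near the population mean, even though the assumed population itself is flat.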

Sampling Distribution Summary
Normal distribution: Sampling distribution of $\bar{X}$ when $\sigma$ is known, for any population distribution. Also the sampling distribution for the difference of the means of two different samples.
t-distribution: Sampling distribution of $\bar{X}$ when $\sigma$ is unknown and $S$ is used. Population must be normal. Also the sampling distribution for the difference of the means of two different samples when $\sigma$ is unknown.
Chi-square ($\chi^2$) distribution: Sampling distribution of $S^2$. Population must be normal.
F-distribution: The distribution of the ratio of two $\chi^2$ random variables. Sampling distribution of the ratio of the variances of two different samples. Populations must be normal.

Central Limit Theorem
The central limit theorem is the most important theorem in statistics. It states that if $\bar{X}$ is the mean of a random sample of size $n$ from a population with an arbitrary distribution with mean $\mu$ and finite variance $\sigma^2$, then as $n \to \infty$, the sampling distribution of $\bar{X}$ approaches a normal distribution with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$. Equivalently, $Z = (\bar{X} - \mu)/(\sigma/\sqrt{n})$ approaches the standard normal distribution. As a rule of thumb, the normal approximation is adequate under the following conditions:
For any population distribution if $n \ge 30$.
For $n < 30$, if the population distribution is generally shaped like a normal distribution.
For any value of $n$ if the population distribution is normal.
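One way to see the $n \ge 30$ rule of thumb at work is to standardize sample means from a clearly non-normal population and check how often they fall within $\pm 1.96$, which a standard normal variable does about 95% of the time. This is only a sketch: the exponential population (mean 1, standard deviation 1), the seed, and the function name are assumptions chosen for illustration.

```python
import math
import random

def standardized_mean(sample, mu, sigma):
    """Z = (xbar - mu) / (sigma / sqrt(n)) for one sample."""
    n = len(sample)
    xbar = sum(sample) / n
    return (xbar - mu) / (sigma / math.sqrt(n))

rng = random.Random(8)  # fixed seed so the run is repeatable
n, reps = 30, 10000

# Assumed population: exponential with rate 1, which is strongly skewed
# and has mean 1 and standard deviation 1.
mu = sigma = 1.0

z_values = [
    standardized_mean([rng.expovariate(1.0) for _ in range(n)], mu, sigma)
    for _ in range(reps)
]

coverage = sum(abs(z) <= 1.96 for z in z_values) / reps
print(f"fraction with |Z| <= 1.96: {coverage:.3f} (standard normal gives 0.950)")
```

Rerunning with a much smaller `n` should show the fraction drifting away from 0.95, since the skew of the population has not yet averaged out.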

Inferences About the Population Mean
We often want to test hypotheses about the population mean (hypothesis testing will be formalized later). Example: Suppose a manufacturing process is designed to produce parts with $\mu = 6$ cm in diameter, and suppose $\sigma$ is known to be 0.15 cm. If a random sample of 80 parts has $\bar{x} = 6.046$ cm, what is the probability (P-value) that a sample mean this far from $\mu$ could occur by chance if $\mu$ is truly 6 cm?
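The example can be worked numerically using the normal sampling distribution of the sample mean. The sketch below takes "this far from the mean" to mean a deviation in either direction, so it reports a two-tailed probability; that reading is an assumption, and the one-tailed value is computed as well.

```python
import math
from statistics import NormalDist

mu0 = 6.0      # hypothesized population mean (cm)
sigma = 0.15   # known population standard deviation (cm)
n = 80         # sample size
xbar = 6.046   # observed sample mean (cm)

# Standard deviation of the sample mean, then the standardized distance.
standard_error = sigma / math.sqrt(n)
z = (xbar - mu0) / standard_error

# Tail areas under the standard normal curve.
p_one_tail = 1.0 - NormalDist().cdf(abs(z))
p_two_tail = 2.0 * p_one_tail

print(f"z = {z:.2f}")
print(f"one-tailed P = {p_one_tail:.4f}, two-tailed P = {p_two_tail:.4f}")
```

The z value comes out near 2.74 and the two-tailed probability is well under 1%, so a sample mean this far from 6 cm would be unusual if the process were on target.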

Difference Between Two Means
In addition, we can make inferences about the difference between two population means based on the difference between two sample means. The central limit theorem also holds in this case. If independent samples of size $n_1$ and $n_2$ are drawn at random from two populations, discrete or continuous, with means $\mu_1$ and $\mu_2$ and variances $\sigma_1^2$ and $\sigma_2^2$, then as $n_1$ and $n_2$ get large, the sampling distribution of $\bar{X}_1 - \bar{X}_2$ approaches a normal distribution with

$$\mu_{\bar{X}_1 - \bar{X}_2} = \mu_1 - \mu_2, \qquad \sigma^2_{\bar{X}_1 - \bar{X}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$

Difference of Two Means Example
Suppose we record the drying time in hours of 20 samples each of two types of paint, type A and type B. Suppose we know that the population standard deviations are both equal to 1/2 hour. Assuming that the population means are equal, what is the probability that the difference in the sample means is greater than 1/2 hour? The standard deviation of $\bar{X}_A - \bar{X}_B$ is $\sqrt{0.25/20 + 0.25/20} \approx 0.158$, so $z = 0.5/0.158 \approx 3.16$ and $P(Z > 3.16) = 0.0008$. This is very unlikely to happen by chance.
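The paint calculation can be reproduced in a few lines. This is a sketch using the numbers stated in the example; the function name `diff_of_means_z` is hypothetical.

```python
import math
from statistics import NormalDist

def diff_of_means_z(diff, mu_diff, sigma1, n1, sigma2, n2):
    """Standardize an observed difference of two independent sample means."""
    standard_error = math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    return (diff - mu_diff) / standard_error

# 20 samples of each paint, both population standard deviations 0.5 hour,
# population means assumed equal (so the expected difference is 0).
z = diff_of_means_z(diff=0.5, mu_diff=0.0, sigma1=0.5, n1=20, sigma2=0.5, n2=20)

# Upper-tail area: probability of a difference larger than 0.5 hour.
p_upper = 1.0 - NormalDist().cdf(z)

print(f"z = {z:.2f}, P(Z > z) = {p_upper:.4f}")
```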

t-Distribution (when $\sigma$ is Unknown)
A limitation of using the central limit theorem as above is that it assumes that $\sigma$ is known. Generally, if $\mu$ is being estimated from the sample, $\sigma$ must be estimated from the sample as well. The t-distribution can be used if $\sigma$ is unknown, but it requires that the original population be normally distributed. Let $X_1, X_2, \ldots, X_n$ be independent, normally distributed random variables with mean $\mu$ and standard deviation $\sigma$. Then the random variable $T$ below has a t-distribution with $\nu = n - 1$ degrees of freedom:

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

The t-distribution is like the normal, but with greater spread, since the estimates of both $\mu$ and $\sigma$ ($\bar{X}$ and $S$) fluctuate from sample to sample.

Using the t-Distribution
Observations on the t-distribution:
The t-statistic is like the normal z-statistic, but using $S$ rather than $\sigma$.
The table value depends on the sample size (degrees of freedom).
The t-distribution is symmetric with $\mu = 0$, but $\sigma^2 > 1$. As would be expected, $\sigma^2$ is largest for small $n$.
It approaches the normal distribution ($\sigma^2 = 1$) as $n$ gets large.
For a given probability $\alpha$, the table shows the value $t_\alpha$ that has area $\alpha$ to the right of it.
The t-distribution can also be used for hypotheses concerning the difference of two means where $\sigma_1$ and $\sigma_2$ are unknown, as long as the two populations are normally distributed.
Usually if $n \ge 30$, $S$ is a good enough estimator of $\sigma$, and the normal distribution is typically used instead.
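The claim that the t-statistic is centered at 0 but has variance greater than 1 can be checked by simulation. The sketch below assumes a normal population with mean 6 and standard deviation 0.15 and samples of size 10; those values and the function name are illustrative assumptions. For $\nu > 2$ degrees of freedom the variance of the t-distribution is $\nu/(\nu - 2)$, which is 9/7 (about 1.29) here.

```python
import math
import random
import statistics

def t_statistic(sample, mu):
    """T = (xbar - mu) / (S / sqrt(n)), with S computed from the sample."""
    n = len(sample)
    xbar = sum(sample) / n
    s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)
    return (xbar - mu) / math.sqrt(s2 / n)

rng = random.Random(9)  # fixed seed so the run is repeatable
n, reps = 10, 20000

# Assumed normal population (values chosen only for illustration).
mu, sigma = 6.0, 0.15

t_values = [
    t_statistic([rng.gauss(mu, sigma) for _ in range(n)], mu)
    for _ in range(reps)
]

mean_t = statistics.fmean(t_values)
var_t = statistics.variance(t_values)
nu = n - 1

print(f"simulated mean of T: {mean_t:.3f} (theory: 0)")
print(f"simulated variance of T: {var_t:.3f} (theory: {nu / (nu - 2):.3f})")
```

Increasing `n` should bring the simulated variance closer to 1, in line with the t-distribution approaching the normal.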

Sampling Distribution of $S^2$
If $S^2$ is the variance of a random sample of size $n$ taken from a normal population with variance $\sigma^2$, then the statistic $\chi^2$ below has a chi-squared distribution with $\nu = n - 1$ degrees of freedom:

$$\chi^2 = \frac{(n-1)S^2}{\sigma^2} = \sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{\sigma^2}$$

The $\chi^2$ table is the reverse of the normal table: it shows the $\chi^2$ value that has probability $\alpha$ to the right of it. Note that the $\chi^2$ distribution is not symmetric. Degrees of freedom: As before, you can think of degrees of freedom as the number of independent pieces of information. We use $\nu = n - 1$, since one degree of freedom is subtracted due to the fact that the sample mean is used to estimate $\mu$ when calculating $S^2$. If $\mu$ is known and used in place of the sample mean, use $n$ degrees of freedom.
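A simulation can illustrate the degrees-of-freedom statement. A chi-squared variable with $\nu$ degrees of freedom has mean $\nu$ and variance $2\nu$, so with samples of size 8 the statistic should average about 7. The population parameters, seed, and function name below are assumptions for illustration only.

```python
import random
import statistics

def chi_square_statistic(sample, sigma2):
    """(n - 1) * S^2 / sigma^2 for one sample, given the population variance."""
    n = len(sample)
    xbar = sum(sample) / n
    s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)
    return (n - 1) * s2 / sigma2

rng = random.Random(105)  # fixed seed so the run is repeatable
n, reps = 8, 20000

# Assumed normal population (values chosen only for illustration).
mu, sigma = 6.0, 0.15

chi2_values = [
    chi_square_statistic([rng.gauss(mu, sigma) for _ in range(n)], sigma ** 2)
    for _ in range(reps)
]

mean_chi2 = statistics.fmean(chi2_values)
var_chi2 = statistics.variance(chi2_values)
nu = n - 1

print(f"simulated mean: {mean_chi2:.2f} (theory: {nu})")
print(f"simulated variance: {var_chi2:.2f} (theory: {2 * nu})")
```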

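F-Distribution: Comparing Sample Variances
The F statistic is the ratio of two independent $\chi^2$ random variables, each divided by its number of degrees of freedom. If $S_1^2$ and $S_2^2$ are the variances of independent samples of size $n_1$ and $n_2$ taken from normal populations with variances $\sigma_1^2$ and $\sigma_2^2$, then

$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}$$

has an F-distribution with $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$ degrees of freedom. For the F-distribution table, lower-tail values are obtained from

$$f_{1-\alpha}(\nu_1, \nu_2) = \frac{1}{f_{\alpha}(\nu_2, \nu_1)}$$

Note that since the ratio is inverted, $\nu_1$ and $\nu_2$ are reversed.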
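The reversal of the degrees of freedom when the ratio is inverted can be checked empirically. The sketch below simulates the F statistic for sample sizes 8 and 12, then again with the sizes swapped, and compares the lower 5% point of the first with the reciprocal of the upper 5% point of the second. The sample sizes, population standard deviations, seed, and helper names are all assumptions chosen for illustration, and the quantiles are simulation estimates rather than table values.

```python
import random

def sample_variance(xs):
    """Sample variance with the n - 1 divisor."""
    xbar = sum(xs) / len(xs)
    return sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)

def simulate_f(n1, n2, reps, rng, sigma1=1.0, sigma2=2.0):
    """Sorted simulated values of (S1^2 / sigma1^2) / (S2^2 / sigma2^2)."""
    values = []
    for _ in range(reps):
        s1_sq = sample_variance([rng.gauss(0.0, sigma1) for _ in range(n1)])
        s2_sq = sample_variance([rng.gauss(0.0, sigma2) for _ in range(n2)])
        values.append((s1_sq / sigma1 ** 2) / (s2_sq / sigma2 ** 2))
    return sorted(values)

def empirical_quantile(sorted_values, p):
    """Value below which roughly a fraction p of the sorted values fall."""
    return sorted_values[int(p * (len(sorted_values) - 1))]

rng = random.Random(107)  # fixed seed so the run is repeatable
alpha = 0.05

f_7_11 = simulate_f(8, 12, 20000, rng)   # nu1 = 7, nu2 = 11
f_11_7 = simulate_f(12, 8, 20000, rng)   # nu1 = 11, nu2 = 7

# Lower-tail point of F(7, 11): area 1 - alpha lies to its right.
lower = empirical_quantile(f_7_11, alpha)
# Upper-tail point of F(11, 7): area alpha lies to its right.
upper_swapped = empirical_quantile(f_11_7, 1 - alpha)

print(f"lower 5% point of F(7, 11):         {lower:.3f}")
print(f"1 / (upper 5% point of F(11, 7)):   {1 / upper_swapped:.3f}")
```

The two printed numbers should agree to within simulation noise, which is why printed F tables only need to list upper-tail values.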

F-Distribution Usage
The F-distribution is used in two-sample situations to draw inferences about the population variances. In an area of statistics called analysis of variance, sources of variability are considered, for example:
Variability within each of the two samples.
Variability between the two samples (variability between the means).
The overall question is whether the variability within the samples is large enough that the variability between the means is not significant.