Chapter 5 Sampling and Statistics Math 6203 Fall 2009 Instructor: Ayona Chatterjee.


5.1 Sampling and Statistics
Typical statistical problem: we have a random variable X whose pdf f(x) or pmf p(x) is unknown. Either
– f(x) or p(x) is completely unknown, or
– the form of f(x) or p(x) is known up to a parameter θ, where θ may be a vector.
Here we consider the second case. Example: X has an exponential distribution with θ unknown.

Since θ is unknown, we want to estimate it. Estimation is based on a sample, so we formalize the sampling plan:
– Sampling with replacement. The draws are independent and the X's have the same distribution.
– Sampling without replacement. The draws are not independent, but the X's still have the same distribution.
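The two sampling plans can be sketched in Python as follows. This is a minimal illustration, not part of the original slides: the population of ten labelled units and the helper names `draw_with_replacement` and `draw_without_replacement` are assumptions made for the example.

```python
import random

def draw_with_replacement(population, n, seed=None):
    """Draw n units; every draw is made from the full population,
    so the draws are independent and identically distributed."""
    rng = random.Random(seed)
    return rng.choices(population, k=n)

def draw_without_replacement(population, n, seed=None):
    """Draw n distinct units; a unit drawn earlier cannot be drawn again,
    so the draws are dependent (but each still has the same marginal law)."""
    rng = random.Random(seed)
    return rng.sample(population, n)

# Hypothetical population of 10 labelled units.
population = list(range(1, 11))
with_repl = draw_with_replacement(population, 5, seed=0)
without_repl = draw_without_replacement(population, 5, seed=0)
print("with replacement:   ", with_repl)
print("without replacement:", without_repl)
```

Sampling with replacement can return the same unit more than once and allows n to exceed the population size; sampling without replacement cannot do either.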

Random Sample
The random variables X_1, X_2, …, X_n constitute a random sample on the random variable X if they are independent and each has the same distribution as X. We abbreviate this by saying that X_1, X_2, …, X_n are iid, i.e. independent and identically distributed.
– The joint pdf of the sample is
$$f(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} f(x_i).$$

Statistic
Suppose the n random variables X_1, X_2, …, X_n constitute a sample from the distribution of a random variable X. Then any function T = T(X_1, X_2, …, X_n) of the sample is called a statistic. A statistic T may convey information about the unknown parameter θ; when it is used to estimate θ, we call the statistic a point estimator of θ.
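As a small illustration of a statistic used as a point estimator, the sketch below computes the sample mean of a made-up data set. It assumes, purely for the example, that X is exponential with mean θ (the slides do not fix a parametrization), in which case the sample mean is a natural estimator of θ. The data values and the name `sample_mean` are hypothetical.

```python
from statistics import fmean

def sample_mean(xs):
    """A statistic: a function T(x_1, ..., x_n) of the sample alone."""
    return fmean(xs)

# Hypothetical observations, assumed to come from an exponential
# distribution with unknown mean theta.
data = [0.8, 2.1, 0.3, 1.7, 1.1]
theta_hat = sample_mean(data)   # point estimate of theta
print(f"theta_hat = {theta_hat:.3f}")
```

The function depends only on the observed sample and not on θ, which is what makes it a statistic.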

5.2 Order Statistics

Notation
Let X_1, X_2, …, X_n denote a random sample from a distribution of the continuous type having a pdf f(x) with support S = (a, b), where -∞ ≤ a < x < b ≤ ∞. Let Y_1 be the smallest of these X_i, Y_2 the next X_i in order of magnitude, …, and Y_n the largest of the X_i. That is, Y_1 < Y_2 < … < Y_n represent X_1, X_2, …, X_n when the latter are arranged in ascending order of magnitude. We call Y_i the ith order statistic of the random sample X_1, X_2, …, X_n.

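Theorem
Let Y_1 < Y_2 < … < Y_n denote the n order statistics based on the random sample X_1, X_2, …, X_n from a continuous distribution with pdf f(x) and support (a, b). Then the joint pdf of Y_1, Y_2, …, Y_n is given by
$$g(y_1, y_2, \ldots, y_n) = \begin{cases} n!\, f(y_1) f(y_2) \cdots f(y_n), & a < y_1 < y_2 < \cdots < y_n < b, \\ 0, & \text{elsewhere.} \end{cases}$$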

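Note
The joint pdf of any two order statistics, say Y_i < Y_j, can be written in terms of the pdf f and the cdf F of X as
$$g_{ij}(y_i, y_j) = \frac{n!}{(i-1)!\,(j-i-1)!\,(n-j)!}\,[F(y_i)]^{i-1}\,[F(y_j) - F(y_i)]^{j-i-1}\,[1 - F(y_j)]^{n-j}\, f(y_i)\, f(y_j)$$
for a < y_i < y_j < b, and zero elsewhere.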

Note
– Y_n - Y_1 is called the range of the random sample.
– (Y_1 + Y_n)/2 is called the mid-range.
– If n is odd, then Y_{(n+1)/2} is called the median of the random sample.
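The quantities above are straightforward to compute once the sample is sorted. The following is a minimal sketch; the function names and the five data values are made up for illustration.

```python
def order_statistics(sample):
    """Return (y_1, ..., y_n): the sample arranged in ascending order."""
    return sorted(sample)

def sample_range(sample):
    """Y_n - Y_1."""
    y = order_statistics(sample)
    return y[-1] - y[0]

def mid_range(sample):
    """(Y_1 + Y_n) / 2."""
    y = order_statistics(sample)
    return (y[0] + y[-1]) / 2

def median_odd(sample):
    """Y_{(n+1)/2}; defined here only for odd n, as in the note above."""
    y = order_statistics(sample)
    n = len(y)
    if n % 2 == 0:
        raise ValueError("this sketch only covers odd sample sizes")
    return y[(n + 1) // 2 - 1]   # shift from 1-based to 0-based indexing

x = [4.2, 1.5, 3.3, 0.9, 2.7]    # hypothetical observations
print(order_statistics(x))
print(sample_range(x), mid_range(x), median_odd(x))
```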

5.4 More on Confidence Intervals

The Statistical Problem
We have a random variable X with density f(x, θ), where θ is unknown and belongs to the parameter space Ω. We estimate θ with some statistic T, where T is a function of the random sample X_1, X_2, …, X_n. It is unlikely that the value of T equals the true value of θ.
– If T has a continuous distribution, then P(T = θ) = 0.
What is needed is an estimate of the error of estimation.
– By how much did we miss θ?

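Central Limit Theorem
Let θ_0 denote the true, unknown value of the parameter θ. Suppose T is an estimator of θ_0 such that
$$\sqrt{n}\,(T - \theta_0) \xrightarrow{D} N(0, \sigma_T^2).$$
Assume that σ_T^2 is known. Then an approximate 95% confidence interval for θ_0 is
$$\left(T - 1.96\,\frac{\sigma_T}{\sqrt{n}},\ T + 1.96\,\frac{\sigma_T}{\sqrt{n}}\right).$$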

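Note
When σ_T is unknown, we estimate it from the sample (for a mean, by the sample standard deviation s). The interval has the same form as before, with σ_T replaced by its estimate s_t and T by its observed value t:
$$\left(t - 1.96\,\frac{s_t}{\sqrt{n}},\ t + 1.96\,\frac{s_t}{\sqrt{n}}\right).$$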

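Confidence Interval for Mean μ
Let X_1, X_2, …, X_n be a random sample from a distribution with unknown mean μ and unknown standard deviation σ. With x̄ the observed sample mean and s the sample standard deviation, an approximate 95% confidence interval for μ is
$$\left(\bar{x} - 1.96\,\frac{s}{\sqrt{n}},\ \bar{x} + 1.96\,\frac{s}{\sqrt{n}}\right).$$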

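Note
We can find confidence intervals for any confidence level. Let z_{α/2} denote the upper α/2 quantile of a standard normal variable. Then the approximate (1 - α)100% confidence interval for θ_0 is
$$\left(t - z_{\alpha/2}\,\frac{s_t}{\sqrt{n}},\ t + z_{\alpha/2}\,\frac{s_t}{\sqrt{n}}\right).$$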
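For the special case of a mean, the large-sample interval x̄ ± z_{α/2} s/√n can be sketched in Python with the standard library. The function name `mean_confidence_interval` and the simulated data (200 draws from a normal distribution with mean 10 and standard deviation 2) are assumptions made only for this illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean, stdev

def mean_confidence_interval(xs, alpha=0.05):
    """Approximate (1 - alpha)100% interval for the mean:
    xbar -/+ z_{alpha/2} * s / sqrt(n), a large-sample approximation."""
    n = len(xs)
    xbar = fmean(xs)
    s = stdev(xs)                              # sample standard deviation
    z = NormalDist().inv_cdf(1 - alpha / 2)    # upper alpha/2 quantile
    half_width = z * s / sqrt(n)
    return xbar - half_width, xbar + half_width

# Simulated data, for illustration only.
rng = random.Random(1)
data = [rng.gauss(10, 2) for _ in range(200)]

lo95, hi95 = mean_confidence_interval(data, alpha=0.05)
lo99, hi99 = mean_confidence_interval(data, alpha=0.01)
print(f"95% interval: ({lo95:.3f}, {hi95:.3f})")
print(f"99% interval: ({lo99:.3f}, {hi99:.3f})")
```

Lowering α raises z_{α/2}, so the 99% interval is wider than the 95% interval computed from the same data.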

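Confidence Interval for Proportions
Let X be a Bernoulli random variable with probability of success p. Let X_1, X_2, …, X_n be a random sample from the distribution of X, and let p̂ denote the sample proportion of successes. Then the approximate (1 - α)100% confidence interval for p is
$$\left(\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\ \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right).$$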
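A minimal sketch of the large-sample interval for a proportion, p̂ ± z_{α/2} √(p̂(1 - p̂)/n), follows. The function name and the counts (60 successes in 100 trials) are hypothetical and chosen only to illustrate the arithmetic.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(successes, n, alpha=0.05):
    """Approximate (1 - alpha)100% interval for a Bernoulli p:
    p_hat -/+ z_{alpha/2} * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

# Hypothetical data: 60 successes in 100 trials.
lo, hi = proportion_confidence_interval(60, 100)
print(f"approximate 95% interval for p: ({lo:.3f}, {hi:.3f})")
```

The interval relies on the normal approximation, so it is a sketch for moderately large n and p̂ not too close to 0 or 1.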

5.5 Introduction to Hypothesis Testing

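Introduction
Our interest centers on a random variable X which has density function f(x, θ), where θ belongs to Ω. Due to theory or a preliminary experiment, suppose we believe that θ ∈ ω_0 or θ ∈ ω_1, where ω_0 and ω_1 are disjoint subsets of Ω with ω_0 ∪ ω_1 = Ω. We label these hypotheses as
$$H_0: \theta \in \omega_0 \quad \text{versus} \quad H_1: \theta \in \omega_1.$$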

The hypothesis H_0 is referred to as the null hypothesis, while H_1 is referred to as the alternative hypothesis. The null hypothesis represents "no change". The alternative hypothesis is often referred to as the research worker's hypothesis.

Error in Hypothesis Testing
The decision rule to take H_0 or H_1 is based on a sample X_1, X_2, …, X_n from the distribution of X, and hence the decision could be wrong.

                True State of Nature
Decision        H_0 is true          H_1 is true
Reject H_0      Type I error         Correct decision
Accept H_0      Correct decision     Type II error

The goal is to select, from all possible critical regions, one which minimizes the probabilities of these errors. In general this is not possible; the probabilities of these errors have a see-saw effect.
– Example: if the critical region is the empty set ∅, then we would never reject the null hypothesis, so the probability of a Type I error would be zero, but the probability of a Type II error would be 1.
Type I error is considered the worse of the two.

Critical Region
We fix the probability of Type I error and try to select a critical region that minimizes the probability of Type II error. We say a critical region C is of size α if
$$\alpha = \max_{\theta \in \omega_0} P_\theta[(X_1, \ldots, X_n) \in C].$$
Over all critical regions of size α, we want to consider critical regions which have lower probabilities of Type II error.

We want to maximize
$$1 - P_\theta[\text{Type II error}] = P_\theta[(X_1, \ldots, X_n) \in C], \quad \theta \in \omega_1.$$
The probability on the right-hand side is called the power of the test at θ. It is the probability that the test detects the alternative θ when θ ∈ ω_1 is the true parameter. So maximizing power is the same as minimizing the probability of Type II error.

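Power of a test
We define the power function of a critical region C to be
$$\gamma_C(\theta) = P_\theta[(X_1, \ldots, X_n) \in C], \quad \theta \in \omega_1.$$
Hence, given two critical regions C_1 and C_2 which are both of size α, C_1 is better than C_2 if
$$\gamma_{C_1}(\theta) \ge \gamma_{C_2}(\theta) \quad \text{for all } \theta \in \omega_1.$$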
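To make the power function concrete, the sketch below evaluates it for one specific test that is not taken from the slides: a normal population with known σ, H_0: μ = μ_0 against H_1: μ > μ_0, and the critical region x̄ > μ_0 + z_α σ/√n. Under these assumptions the power at μ is 1 - Φ(z_α - (μ - μ_0)√n/σ). The function name and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_upper_z_test(mu, mu0, sigma, n, alpha=0.05):
    """Power at mu of the size-alpha test that rejects H0: mu = mu0
    when xbar > mu0 + z_alpha * sigma / sqrt(n) (sigma assumed known)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)
    shift = (mu - mu0) * sqrt(n) / sigma
    return 1 - std_normal.cdf(z_alpha - shift)

# Hypothetical setting: mu0 = 0, sigma = 1, n = 25.
for mu in (0.0, 0.2, 0.4, 0.6):
    print(f"mu = {mu:.1f}  power = {power_upper_z_test(mu, 0.0, 1.0, 25):.3f}")
```

At μ = μ_0 the function returns the size α, and it increases as μ moves further into the alternative, which is the see-saw trade-off with the Type II error probability described earlier.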

Note
A hypothesis of the form H_0: p = p_0 is called a simple hypothesis. A hypothesis of the form H_1: p < p_0 is called a composite hypothesis. Also remember that α is called the significance level of the test associated with that critical region.

Test Statistics for Mean

5.7 Chi-Square Tests

Introduction
– Originally proposed by Karl Pearson in 1900.
– Used to check for goodness of fit and independence.

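Goodness of fit test
Consider the simple hypothesis
– H_0: p_1 = p_{10}, p_2 = p_{20}, …, p_{k-1} = p_{k-1,0}
where p_i is the probability of category i and X_i is the observed frequency of category i in n independent trials (so p_{k0} = 1 - p_{10} - … - p_{k-1,0}). If the hypothesis H_0 is true, the random variable
$$Q_{k-1} = \sum_{i=1}^{k} \frac{(X_i - n p_{i0})^2}{n p_{i0}}$$
has an approximate chi-square distribution with k-1 degrees of freedom.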
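The goodness-of-fit statistic is the sum over categories of (observed - expected)² / expected, with expected count n p_{i0}. A minimal sketch follows; the function name and the counts for 60 rolls of a die are illustrative and not taken from the slides. The critical value 11.07 is the upper 5% point of the chi-square distribution with 5 degrees of freedom, read from a standard table.

```python
def chi_square_gof(observed, null_probs):
    """Pearson statistic: sum over categories of (X_i - n*p_i0)^2 / (n*p_i0)."""
    if len(observed) != len(null_probs):
        raise ValueError("need one null probability per category")
    n = sum(observed)
    return sum((x - n * p) ** 2 / (n * p) for x, p in zip(observed, null_probs))

# Illustrative counts for 60 rolls of a die; H0 says each face has p = 1/6.
observed = [13, 19, 11, 8, 5, 4]
null_probs = [1 / 6] * 6
q = chi_square_gof(observed, null_probs)
df = len(observed) - 1

critical_value = 11.07   # upper 5% point of chi-square(5), from a table
print(f"Q = {q:.2f} on {df} degrees of freedom")
print("reject H0 at the 5% level" if q > critical_value else "do not reject H0")
```

Here each expected count is 10, so Q = (9 + 81 + 1 + 4 + 25 + 36)/10 = 15.6, which exceeds 11.07.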

Test for Independence
Let the result of a random experiment be classified by two attributes. Let A_i, i = 1, …, a, denote the outcomes of the first kind and B_j, j = 1, …, b, denote the outcomes of the second kind. Let p_ij = P(A_i ∩ B_j). The random experiment is repeated n independent times, and X_ij denotes the frequency of the event A_i ∩ B_j.

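With row totals X_{i·} = Σ_j X_ij and column totals X_{·j} = Σ_i X_ij, under the hypothesis that the two attributes are independent the random variable
$$\sum_{j=1}^{b} \sum_{i=1}^{a} \frac{\left[X_{ij} - n\,(X_{i\cdot}/n)(X_{\cdot j}/n)\right]^2}{n\,(X_{i\cdot}/n)(X_{\cdot j}/n)}$$
has an approximate chi-square distribution with (a-1)(b-1) degrees of freedom, provided n is large.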
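The independence statistic compares each cell count with the count expected under independence, (row total × column total)/n. A minimal sketch, with a hypothetical 2×2 table and a hypothetical function name:

```python
def chi_square_independence(table):
    """Return (Q, df) for an a-by-b table of counts, where Q sums
    (X_ij - E_ij)^2 / E_ij with E_ij = row_total_i * col_total_j / n."""
    a, b = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(a)) for j in range(b)]
    q = 0.0
    for i in range(a):
        for j in range(b):
            expected = row_totals[i] * col_totals[j] / n
            q += (table[i][j] - expected) ** 2 / expected
    return q, (a - 1) * (b - 1)

# Hypothetical 2x2 table of counts (rows: A_1, A_2; columns: B_1, B_2).
table = [[20, 30],
         [30, 20]]
q, df = chi_square_independence(table)
print(f"Q = {q:.2f} on {df} degree(s) of freedom")
```

For this table every expected count is 25 and every cell deviates by 5, so Q = 4 × 25/25 = 4 on 1 degree of freedom; the tabulated upper 5% point of chi-square(1) is about 3.84.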