Previous Lecture: Distributions. Introduction to Biostatistics and Bioinformatics Estimation I This Lecture By Judy Zhong Assistant Professor Division.


Previous Lecture: Distributions

This Lecture: Estimation I
Introduction to Biostatistics and Bioinformatics
By Judy Zhong, Assistant Professor, Division of Biostatistics, Department of Population Health

Statistical inference
- Statistical inference can be subdivided into two main areas: estimation and hypothesis testing.
- Estimation is concerned with estimating the values of specific population parameters.
- Hypothesis testing is concerned with testing whether the value of a population parameter is equal to some specific value.

Two examples of estimation
- Suppose we measure the systolic blood pressure (SBP) of a group of patients and we believe the underlying distribution is normal. How can the parameters of this distribution ($\mu$, $\sigma^2$) be estimated? How precise are our estimates?
- Suppose we look at people living within a low-income census tract in an urban area and we wish to estimate the prevalence of HIV in the community. We assume that the number of cases among $n$ people sampled is binomially distributed, with some parameter $p$. How is the parameter $p$ estimated? How precise is this estimate?

Point estimation and interval estimation
- Sometimes we are interested in obtaining specific values as estimates of our parameters, along with a measure of their precision. These values are referred to as point estimates.
- Sometimes we want to specify a range within which the parameter values are likely to fall. If the range is narrow, we may feel that our point estimate is good. These ranges are called interval estimates.

From sample to population
- Purpose of inference: make decisions about population characteristics when it is impractical to observe the whole population and we only have a sample of data drawn from it.

Towards statistical inference
- Parameter: a number describing the population.
- Statistic: a number describing a sample.
- Statistical inference: statistic $\rightarrow$ parameter (use the sample statistic to draw conclusions about the population parameter).

[Figure: Inference process, relating population, sample, sample statistic, and estimates & tests]

Section 6.5: Estimation of the population mean
- We have a sample $(x_1, x_2, \ldots, x_n)$ randomly drawn from a population.
- The population mean $\mu$ and variance $\sigma^2$ are unknown.
- Question: how can we use the observed sample $(x_1, \ldots, x_n)$ to estimate $\mu$ and $\sigma^2$?

Point estimators of the population mean and variance
- A natural estimator of the population mean $\mu$ is the sample mean $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$.
- A natural estimator of the population standard deviation $\sigma$ is the sample standard deviation $s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2}$.

Sampling distribution of the sample mean
- To understand what properties of $\bar{X}$ make it a desirable estimator of $\mu$, we need to forget about our particular sample for the moment and consider all possible samples of size $n$ that could have been selected from the population.
- The values of $\bar{X}$ in different samples will be different. These values will be denoted by $\bar{x}_1, \bar{x}_2, \ldots$
- The sampling distribution of $\bar{X}$ is the distribution of the values $\bar{x}$ over all possible samples of size $n$ that could have been selected from the study population.

[Figure: An example of a sampling distribution]

The sample mean is an unbiased estimator of the population mean
- We can show that the average of these sample means ($\bar{x}$ over all possible samples) is equal to the population mean $\mu$.
- Unbiasedness: Let $X_1, X_2, \ldots, X_n$ be a random sample drawn from some population with mean $\mu$. Then $E(\bar{X}) = \mu$.
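As a small illustration of unbiasedness, the sketch below enumerates every possible sample of size 2 drawn with replacement from a tiny made-up population and checks that the average of all sample means equals the population mean. The population values and the sample size are hypothetical, chosen only to keep the enumeration small; they are not from the lecture.

```python
from itertools import product
from statistics import mean

# Hypothetical toy population (illustrative values only, not lecture data).
population = [2, 4, 6, 8, 10]
n = 2  # sample size

# Every ordered sample of size n drawn with replacement is equally likely,
# so the plain average over all of them is the expectation of the sample mean.
samples = list(product(population, repeat=n))
sample_means = [mean(s) for s in samples]

mu = mean(population)                      # population mean
mean_of_sample_means = mean(sample_means)  # average over all possible samples

print("population mean:        ", mu)
print("average of sample means:", mean_of_sample_means)
```

Individual sample means differ from one sample to the next, but their average over all possible samples coincides with the population mean.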

$\bar{X}$ is the minimum variance unbiased estimator of $\mu$
- Unbiasedness alone is not a sufficient reason to use the sample mean as an estimator of $\mu$.
- There are many other unbiased estimators, such as the sample median and the average of the minimum and maximum (for a symmetric population such as the normal).
- We can show (but not here) that, when the underlying distribution is normal, the sample mean has the smallest variance among all unbiased estimators.
- Now, what is the variance of the sample mean $\bar{X}$?

Standard error of the mean
- The variance of the sample mean measures the precision of the estimate.
- Theorem: Let $X_1, \ldots, X_n$ be a random sample from a population with mean $\mu$ and variance $\sigma^2$. The set of sample means in repeated random samples of size $n$ from this population has variance $\sigma^2/n$. The standard deviation of this set of sample means is thus $\sigma/\sqrt{n}$ and is referred to as the standard error of the mean or the standard error.
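The variance in the theorem can be derived in one line. This is a sketch that assumes the $X_i$ are independent with common variance $\sigma^2$, so that the variance of the sum is the sum of the variances:

```latex
\operatorname{Var}(\bar{X})
  = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
  = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i)
  = \frac{n\,\sigma^2}{n^2}
  = \frac{\sigma^2}{n},
\qquad
\operatorname{se}(\bar{X}) = \sqrt{\operatorname{Var}(\bar{X})} = \frac{\sigma}{\sqrt{n}}.
```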

Use $s/\sqrt{n}$ to estimate $\sigma/\sqrt{n}$
- In practice, the population variance $\sigma^2$ is rarely known. We will see in Section 6.7 that the sample variance $s^2$ is a reasonable estimator of $\sigma^2$.
- Therefore, the standard error of the mean can be estimated by $s/\sqrt{n}$ (recall that $s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}$).
- NOTE: the larger the sample size, the smaller the standard error and the more precise the estimate.

An example of standard error
- A sample of 10 birthweights: 97, 125, 62, 120, 132, 135, 118, 137, 126, 118 (sample mean $\bar{x} = 117.0$ and sample standard deviation $s = 22.44$).
- To estimate the population mean $\mu$, a point estimate is the sample mean $\bar{x} = 117.0$, with standard error given by $s/\sqrt{n} = 22.44/\sqrt{10} \approx 7.1$.
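The arithmetic for the birthweight sample can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# The ten birthweights listed in the example.
birthweights = [97, 125, 62, 120, 132, 135, 118, 137, 126, 118]

n = len(birthweights)
x_bar = mean(birthweights)   # point estimate of the population mean
s = stdev(birthweights)      # sample standard deviation (divides by n - 1)
se = s / sqrt(n)             # estimated standard error of the mean

print(f"n = {n}, mean = {x_bar:.1f}, s = {s:.2f}, se = {se:.2f}")
```

Note that `statistics.stdev` uses the $n-1$ denominator, matching the definition of $s$, whereas `statistics.pstdev` would divide by $n$.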

Summary of the sampling distribution of $\bar{X}$
- Let $X_1, \ldots, X_n$ be a random sample from a population with mean $\mu$ and variance $\sigma^2$. Then the mean and variance of $\bar{X}$ are $\mu$ and $\sigma^2/n$, respectively.
- Furthermore, if $X_1, \ldots, X_n$ is a random sample from a normal population with mean $\mu$ and variance $\sigma^2$, then by the properties of linear combinations $\bar{X}$ is also normally distributed, that is, $\bar{X} \sim N(\mu, \sigma^2/n)$.
- Now the question is: if the population is NOT normal, what is the distribution of $\bar{X}$?

The Central Limit Theorem
- Let $X_1, X_2, \ldots, X_n$ denote $n$ independent random variables sampled from some population with mean $\mu$ and variance $\sigma^2$.
- When $n$ is large, the sampling distribution of the sample mean $\bar{X}$ is approximately normal, with mean $\mu$ and variance $\sigma^2/n$, even if the underlying population is not normal.
- By standardization: $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ is approximately $N(0, 1)$.
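A quick simulation can make the theorem concrete. The sketch below draws repeated samples from a strongly skewed population and checks that the standardized sample means behave roughly like a standard normal variable. The exponential population, the sample size of 50, the number of repetitions, and the seed are all illustrative assumptions, not part of the lecture.

```python
import random
from math import sqrt
from statistics import mean

random.seed(0)  # fixed seed so the run is reproducible

n = 50        # size of each sample
reps = 5000   # number of repeated samples

# Exponential(1) population: mean 1, variance 1, strongly right-skewed.
mu, sigma = 1.0, 1.0
sample_means = [
    mean(random.expovariate(1.0) for _ in range(n))
    for _ in range(reps)
]

# Standardize each sample mean: Z = (x_bar - mu) / (sigma / sqrt(n)).
z_values = [(m - mu) / (sigma / sqrt(n)) for m in sample_means]

# For a standard normal variable, about 95% of values lie within +/- 1.96.
coverage = sum(abs(z) <= 1.96 for z in z_values) / reps

print("average of sample means:", round(mean(sample_means), 3))
print("share of |Z| <= 1.96:   ", round(coverage, 3))
```

If the normal approximation is reasonable, the reported share should be close to 0.95 even though the individual observations are far from normal.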

[Figure: Illustration of the Central Limit Theorem (CLT)]

An example of using the CLT
- Example 6.27 (Obstetrics example continued): Compute the …

Interval estimation
- Let $X_1, X_2, \ldots, X_n$ denote $n$ independent random variables sampled from some population with mean $\mu$ and variance $\sigma^2$.
- Our goal is to estimate $\mu$. We know that $\bar{X}$ is a good point estimate.
- Now we want a confidence interval $(c_1, c_2)$ such that $\Pr(c_1 < \mu < c_2) = 1 - \alpha$.

Motivation for the t-distribution
- From the Central Limit Theorem, $\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ is approximately $N(0, 1)$.
- But we still cannot use this to construct an interval estimate for $\mu$, because $\sigma$ is unknown.
- If we replace $\sigma$ by the sample standard deviation $s$, what is the distribution of $\dfrac{\bar{X} - \mu}{s/\sqrt{n}}$?

T-distribution
- If $X_1, \ldots, X_n \sim N(\mu, \sigma^2)$ and are independent, then $\dfrac{\bar{X} - \mu}{s/\sqrt{n}} \sim t_{n-1}$, where $t_{n-1}$ is called the t-distribution with $n-1$ degrees of freedom.

T-table
- See Table 5 in the Appendix.
- The $(100 \times u)$th percentile of a t distribution with $d$ degrees of freedom is denoted by $t_{d,u}$. That is, $\Pr(t_d < t_{d,u}) = u$.

[Figure: Normal density and t densities]

Comparison of normal and t distributions
- The larger the degrees of freedom, the closer the t distribution is to the standard normal distribution.

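Critical values
- Define the critical values $t_{1-\alpha/2}$ and $-t_{1-\alpha/2}$ as follows: the area under the t density between $-t_{1-\alpha/2}$ and $t_{1-\alpha/2}$ is $100\% \times (1-\alpha)$, with area $\alpha/2$ in each tail.
- By symmetry, $t_{\alpha/2} = -t_{1-\alpha/2}$.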

Our goal is to get a 95% interval estimate
- We start from $\Pr\left(-t_{n-1,0.975} < \dfrac{\bar{X} - \mu}{s/\sqrt{n}} < t_{n-1,0.975}\right) = 0.95$.

Develop a confidence interval formula
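One way to carry out the algebra is sketched below. It assumes the sample comes from a normal population, so that $(\bar{X} - \mu)/(s/\sqrt{n})$ follows a t distribution with $n-1$ degrees of freedom, and it rearranges the inequalities to isolate $\mu$. Setting $\alpha = 0.05$ gives the 95% case.

```latex
\begin{aligned}
1-\alpha
 &= \Pr\!\left(-t_{n-1,1-\alpha/2} < \frac{\bar{X}-\mu}{s/\sqrt{n}} < t_{n-1,1-\alpha/2}\right) \\
 &= \Pr\!\left(-t_{n-1,1-\alpha/2}\,\frac{s}{\sqrt{n}} < \bar{X}-\mu < t_{n-1,1-\alpha/2}\,\frac{s}{\sqrt{n}}\right) \\
 &= \Pr\!\left(\bar{X}-t_{n-1,1-\alpha/2}\,\frac{s}{\sqrt{n}} < \mu < \bar{X}+t_{n-1,1-\alpha/2}\,\frac{s}{\sqrt{n}}\right)
\end{aligned}
```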

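Confidence interval
- Confidence interval for the mean of a normal distribution
- A $100\% \times (1-\alpha)$ CI for the mean $\mu$ of a normal distribution with unknown variance is given by $\left(\bar{x} - t_{n-1,1-\alpha/2}\, s/\sqrt{n},\ \bar{x} + t_{n-1,1-\alpha/2}\, s/\sqrt{n}\right)$. A shorthand notation for the CI is $\bar{x} \pm t_{n-1,1-\alpha/2}\, s/\sqrt{n}$.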
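To see the formula in action, the sketch below computes a 95% CI for the ten birthweights from the standard error example. The critical value $t_{9,0.975} = 2.262$ is hard-coded as read from a standard t table, since the Python standard library has no t quantile function; that hard-coding is an assumption of this sketch.

```python
from math import sqrt
from statistics import mean, stdev

# Birthweight sample from the standard error example.
birthweights = [97, 125, 62, 120, 132, 135, 118, 137, 126, 118]

n = len(birthweights)
x_bar = mean(birthweights)
s = stdev(birthweights)

# t_{n-1, 1-alpha/2} for n - 1 = 9 degrees of freedom and alpha = 0.05,
# taken from a t table (hard-coded assumption, valid only for this n and alpha).
t_crit = 2.262

half_width = t_crit * s / sqrt(n)
lower, upper = x_bar - half_width, x_bar + half_width

print(f"95% CI for mu: ({lower:.2f}, {upper:.2f})")
```

For a different sample size or confidence level, the critical value would have to be looked up again in the t table.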

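Confidence interval (when n is large)
- Confidence interval for the mean of a normal distribution (large sample case)
- A $100\% \times (1-\alpha)$ CI for the mean $\mu$ of a normal distribution with unknown variance is given by $\left(\bar{x} - z_{1-\alpha/2}\, s/\sqrt{n},\ \bar{x} + z_{1-\alpha/2}\, s/\sqrt{n}\right)$. A shorthand notation for the CI is $\bar{x} \pm z_{1-\alpha/2}\, s/\sqrt{n}$.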
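A minimal sketch of the large-sample interval follows. The function name `large_sample_ci` and the simulated data (200 values from a normal distribution with mean 120 and standard deviation 20) are hypothetical choices made only for illustration; the normal quantile comes from `statistics.NormalDist` in the standard library.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

def large_sample_ci(data, alpha=0.05):
    """Return (lower, upper) for x_bar +/- z_{1-alpha/2} * s / sqrt(n)."""
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # standard normal percentile
    half_width = z * s / sqrt(n)
    return x_bar - half_width, x_bar + half_width

# Hypothetical data: simulated, not taken from the lecture.
random.seed(1)
data = [random.gauss(120, 20) for _ in range(200)]

lower, upper = large_sample_ci(data)
print(f"95% CI for mu: ({lower:.2f}, {upper:.2f})")
```

With $\alpha = 0.05$ the normal percentile is about 1.96, slightly smaller than the corresponding t percentile for any finite degrees of freedom.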

Factors affecting the length of a CI
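From the CI formula, the length of the interval is twice the half-width, $2\, t_{n-1,1-\alpha/2}\, s/\sqrt{n}$, so it depends on the sample size $n$, the sample standard deviation $s$, and the confidence level $1-\alpha$. The sketch below illustrates each effect using the large-sample (normal) percentile in place of the t percentile; the function name `ci_length` and all input values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def ci_length(n, s, alpha):
    """Length of the large-sample CI: 2 * z_{1-alpha/2} * s / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return 2 * z * s / sqrt(n)

# Hypothetical baseline: n = 100, s = 20, 95% confidence.
base = ci_length(n=100, s=20.0, alpha=0.05)
more_data = ci_length(n=400, s=20.0, alpha=0.05)        # larger sample
more_spread = ci_length(n=100, s=40.0, alpha=0.05)      # larger standard deviation
more_confidence = ci_length(n=100, s=20.0, alpha=0.01)  # 99% instead of 95%

print("baseline:               ", round(base, 2))
print("n quadrupled:           ", round(more_data, 2))
print("s doubled:              ", round(more_spread, 2))
print("99% instead of 95% level:", round(more_confidence, 2))
```

Quadrupling $n$ halves the length, doubling $s$ doubles it, and asking for more confidence widens the interval.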