Statistical Sampling & Analysis of Sample Data


Statistical Sampling & Analysis of Sample Data (Lesson - 04/A) Understanding the Whole from Pieces Dr. C. Ertuna

Sampling
Sampling is collecting sample data from a population and estimating population parameters. It is an important tool in business decisions since it is an effective and efficient way of obtaining information about the population.

Sampling (Cont.)
How good is the estimate obtained from the sample? The means of multiple samples of a fixed size (n) from some population form a distribution called the sampling distribution of the mean. The standard deviation of the sampling distribution of the mean is called the standard error of the mean.

Sampling (Cont.)
Standard error of the mean = $\sigma / \sqrt{n}$. Estimates from larger sample sizes provide more accurate results. If the sample size is large enough, the sampling distribution of the mean is approximately normal, regardless of the shape of the population distribution (Central Limit Theorem).

Sampling Distribution of the Mean
THE CENTRAL LIMIT THEOREM: For samples of n observations taken from a population with mean $\mu$ and standard deviation $\sigma$, regardless of the population's distribution, provided the sample size is sufficiently large, the distribution of the sample mean $\bar{x}$ will be approximately normal with a mean equal to the population mean $\mu$. Further, the standard deviation will equal the population standard deviation divided by the square root of the sample size, $\sigma / \sqrt{n}$. The larger the sample size, the better the approximation to the normal distribution.
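The theorem can be checked with a small simulation. The sketch below is illustrative only: the exponential population (mean 1, standard deviation 1), the sample size of 50, the 2000 repetitions, and the function name `simulate_sample_means` are all assumptions chosen for the demonstration, not values from the lesson.

```python
import math
import random
import statistics


def simulate_sample_means(n, num_samples, rng):
    """Draw num_samples samples of size n from an exponential population
    (mean 1, standard deviation 1) and return the list of sample means."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(num_samples)
    ]


rng = random.Random(42)   # fixed seed so the run is repeatable
n = 50                    # assumed sample size
means = simulate_sample_means(n, 2000, rng)

mean_of_means = statistics.fmean(means)   # should be close to mu = 1
observed_se = statistics.stdev(means)     # spread of the sample means
theoretical_se = 1.0 / math.sqrt(n)       # sigma / sqrt(n)

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"observed standard error: {observed_se:.3f}")
print(f"sigma / sqrt(n): {theoretical_se:.3f}")
```

Even though the exponential population is strongly skewed, the sample means should cluster around 1 with a spread close to $\sigma / \sqrt{n}$.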

Sampling Statistics
Sampling statistics are statistics that are based on values created by repeated sampling from a population, such as: the mean of the sampling means, the standard error of the sampling mean, and the sampling distribution of the means.

Sampling: Key Issues
Key sampling issues are: sample design (planning), sampling methods (schemes), sampling error, and sample size determination.

Sampling: Design
Sample design (sample planning) describes: the objective of sampling, the target population, the population frame, the method of sampling, and the statistical tools for data analysis.

Sampling Methods (Sampling Schemes)
Subjective methods: judgment sampling, convenience sampling.
Probabilistic methods: simple random sampling, systematic sampling, stratified sampling, cluster sampling.

Sampling: Methods (Cont.)
Simple random sampling refers to a method of selecting items from a population, with or without replacement, such that every possible sample of a specified size has an equal chance of being selected.
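As a minimal sketch of the two variants, Python's standard `random` module offers `sample` (without replacement) and `choices` (with replacement). The population of 100 numbered items and the sample size of 10 are made up for illustration.

```python
import random

rng = random.Random(7)             # fixed seed so the run is repeatable
population = list(range(1, 101))   # hypothetical population: items 1..100

# Without replacement: no item can appear twice in the sample.
sample_without = rng.sample(population, 10)

# With replacement: the same item may be drawn more than once.
sample_with = rng.choices(population, k=10)

print("without replacement:", sample_without)
print("with replacement:   ", sample_with)
```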

Sampling: Methods (Cont.)
Stratified sampling method: the population is divided into natural subsets (strata), and items are randomly selected from each stratum in proportion to the size of the stratum.

Stratified Sampling Example
Population: cash holdings of all financial institutions in the country.
Stratified population: Stratum 1 = large institutions (select n1); Stratum 2 = medium-size institutions (select n2); Stratum 3 = small institutions (select n3).
Result: a stratified sample of cash holdings of financial institutions.
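Proportional selection from each stratum can be sketched as follows. The stratum sizes (50 large, 150 medium, 300 small institutions), the total sample size of 50, and the helper names are hypothetical; the lesson does not give the actual counts.

```python
import random


def proportional_allocation(stratum_sizes, total_sample_size):
    """Split total_sample_size across strata in proportion to stratum size.
    Rounding can make the parts miss the total by one or two in general."""
    population_size = sum(stratum_sizes.values())
    return {
        name: round(total_sample_size * size / population_size)
        for name, size in stratum_sizes.items()
    }


def stratified_sample(strata, total_sample_size, rng):
    """Draw a simple random sample (without replacement) inside each stratum."""
    sizes = {name: len(items) for name, items in strata.items()}
    plan = proportional_allocation(sizes, total_sample_size)
    return {name: rng.sample(strata[name], plan[name]) for name in strata}


# Hypothetical strata of financial institutions (labels only).
strata = {
    "large": [f"L{i}" for i in range(50)],
    "medium": [f"M{i}" for i in range(150)],
    "small": [f"S{i}" for i in range(300)],
}

rng = random.Random(0)
sample = stratified_sample(strata, 50, rng)
allocation = {name: len(items) for name, items in sample.items()}
print(allocation)
```

With these made-up sizes the allocation corresponds to n1, n2, and n3 in the example, each equal to 10% of its stratum.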

Cluster Sampling
Cluster sampling refers to a method by which the population is divided into groups, or clusters, that are each intended to be mini-populations. A random sample of m clusters is selected.

Cluster Sampling Example
Mid-level managers by location for a company.
Counts shown: 42, 22, 105, 20, 36, 52, 76.
Locations shown: Algeria, Scotland, California, Alaska, New York, Florida, Mexico.

SAMPLING ERROR - SINGLE MEAN
The difference between a value (a statistic) computed from a sample and the corresponding value (a parameter) computed from a population.
Sampling error = $\bar{x} - \mu$, where $\bar{x}$ = sample mean and $\mu$ = population mean.

Sampling: Error (Cont.)
Sampling error is inherent in any sampling process because samples are only a subset of the total population. Sampling error depends on the relative size of the sample. Sampling error can be minimized but not eliminated.

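Sampling: Error (Cont.)
If the sample size is more than 5% of the population, the "with replacement" assumption of the Central Limit Theorem is violated, and hence so are the standard error calculations. The standard error must then be multiplied by the finite population correction factor $\sqrt{(N - n)/(N - 1)}$, where N = population size and n = sample size.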
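The sketch below assumes the correction meant here is the standard finite population correction factor, $\sqrt{(N - n)/(N - 1)}$. The population size of 1000, the sample size of 100, and $\sigma$ = 0.20 are hypothetical values chosen so that the sample exceeds 5% of the population.

```python
import math


def finite_population_correction(population_size, sample_size):
    """Standard finite population correction factor sqrt((N - n) / (N - 1))."""
    return math.sqrt((population_size - sample_size) / (population_size - 1))


def standard_error(sigma, sample_size, population_size=None):
    """Standard error of the mean; applies the correction only when the
    sample is more than 5% of a known finite population."""
    se = sigma / math.sqrt(sample_size)
    if population_size is not None and sample_size > 0.05 * population_size:
        se *= finite_population_correction(population_size, sample_size)
    return se


fpc = finite_population_correction(1000, 100)
se_uncorrected = standard_error(0.20, 100)
se_corrected = standard_error(0.20, 100, population_size=1000)

print(f"correction factor: {fpc:.4f}")
print(f"standard error without correction: {se_uncorrected:.4f}")
print(f"standard error with correction:    {se_corrected:.4f}")
```

The factor is always below 1, so the corrected standard error is smaller: sampling a large share of a finite population leaves less uncertainty.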

Sampling: Size
Sample size determination: $n = z_{\alpha/2}^2 \sigma^2 / E^2$
where n = sample size; z = z-score = a factor representing probability in terms of standard deviation; $\alpha$ = 100% - confidence level; E = interval on either side of the mean (margin of error); $\sigma$ = population standard deviation.

Estimation
Estimation (inference) is assessing the value of a population parameter using sample data. There are two types of estimation: point estimates and interval estimates.

FOR ESTIMATION, ALWAYS USE THE STANDARD NORMAL DISTRIBUTION

Estimation (Cont.)
The most common point estimates are the descriptive statistical measures. If the expected value of an estimator equals the population parameter, the estimator is called unbiased.

Estimation (Cont.)
That means that we can use unbiased sample estimates in place of population parameters without committing a systematic error; any single estimate is still subject to sampling error.

Estimation (Cont.)
An interval estimate provides a range within which the population parameter falls with a certain likelihood. The confidence level is the probability (likelihood) that an interval constructed this way contains the population parameter. The most commonly used confidence levels are 90%, 95%, and 99%.

Confidence Interval
A confidence interval (CI) is an interval estimate specified from the perspective of the point estimate. In other words, a CI is an interval on either side (+/-) of the point estimate whose half-width is a multiple (the t- or z-score) of the standard deviation of the point estimate, that is, its standard error.

Confidence Intervals
[Figure: an interval running from the Lower Confidence Limit to the Upper Confidence Limit, with the Point Estimate between them.]

95% Confidence Intervals
[Figure: standard normal curve with a central area of 0.95 between $-z_{.025} = -1.96$ and $z_{.025} = 1.96$.]

CI for Proportions
For categorical variables having only two possible outcomes, proportions are important. An unbiased estimate of the population proportion ($\pi$) is the sample statistic p = x/n, where x = number of observations in the sample with the desired characteristic and n = sample size.

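Confidence Interval - From General to Specific Format -
Point Estimate ± (Critical Value)(Standard Error), where the critical value is based on the confidence level (CL).
CI unit value = $\bar{x} \pm z_{\alpha/2}\, \sigma / \sqrt{n}$
CI proportion = $p \pm z_{\alpha/2} \sqrt{p(1 - p)/n}$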

Confidence Interval - From Statistical Expression to Excel Formula -
where $z_{\alpha/2}$ = NORMSINV(1 - α/tails); when n < 30, replace z with t, where $t_{\alpha/2,\, n-1}$ = TINV(2α/tails, n-1).

Confidence Interval of the Mean
[Diagram: logic of the CI-of-the-mean computation, built on the CLT and the unbiased estimator.]

CI of the Mean (Cont.)
where
z = z-score = a critical factor (critical value) representing probability in terms of standard deviation (standard error for sampling); valid for the normal distribution
t = t-score = a critical factor (critical value) representing probability in terms of standard deviation (or standard error); valid for the t-distribution
$\alpha$ = 100% - confidence level

CI of the Mean (Cont.)
E = Margin of Error
E unit value = $z_{\alpha/2}\, \sigma / \sqrt{n}$
E proportion = $z_{\alpha/2} \sqrt{p(1 - p)/n}$

Z-score
A z-score is a critical factor indicating how many standard deviations (standard errors for sampling) away from the mean a value must be to observe a particular (cumulative) probability. The z-score is related to probability by p(x) = (1 - NORMSDIST(z)) * tails, and to the value of the random variable by $z = (x - \mu)/\sigma$.
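Both relationships can be sketched with Python's standard `statistics.NormalDist`, whose `cdf` method plays the role of Excel's NORMSDIST. The function names are illustrative, and the numbers (x = 800, mean 750, standard deviation 100) are borrowed from the demand example later in the lesson.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def z_score(x, mu, sigma):
    """Distance of x from the mean, measured in standard deviations."""
    return (x - mu) / sigma


def tail_probability(z, tails=1):
    """Counterpart of (1 - NORMSDIST(z)) * tails for a non-negative z."""
    return (1 - STANDARD_NORMAL.cdf(z)) * tails


z = z_score(800, 750, 100)
p_one_tail = tail_probability(z, tails=1)
p_two_tails = tail_probability(z, tails=2)

print(f"z = {z:.2f}")
print(f"one-tailed probability: {p_one_tail:.4f}")
print(f"two-tailed probability: {p_two_tails:.4f}")
```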

Z-score (Cont.)
Since the z-score is a measure of distance from the mean in terms of standard deviation (standard error for sampling), it provides information that a cumulative probability alone could not. For example, the larger the z-score in absolute value, the more unusual the observation.

Student’s t-Distribution
The t-distribution is a family of distributions that is bell-shaped and symmetric like the Standard Normal Distribution but with greater area in the tails. Each distribution in the t-family is defined by its degrees of freedom. As the degrees of freedom increase, the t-distribution approaches the normal distribution.

Degrees of freedom
Degrees of freedom (df) refers to the number of independent data values available to estimate the population’s standard deviation. If k parameters must be estimated before the population’s standard deviation can be calculated from a sample of size n, the degrees of freedom are equal to n - k.

Example of a CI Interval Estimate for $\mu$
A sample of 100 cans, from a population with $\sigma$ = 0.20, produced a sample mean equal to 12.09. A 95% confidence interval would be $12.09 \pm 1.96 \times 0.20/\sqrt{100} = 12.09 \pm 0.039$, that is, from 12.051 ounces to 12.129 ounces.

Example of Impact of Sample Size on Confidence Intervals
Suppose that, instead of a sample of 100 cans, a sample of 400 cans from a population with $\sigma$ = 0.20 produced a sample mean equal to 12.09. A 95% confidence interval would be $12.09 \pm 1.96 \times 0.20/\sqrt{400} = 12.09 \pm 0.0196$.
n = 400: from 12.0704 ounces to 12.1096 ounces
n = 100: from 12.051 ounces to 12.129 ounces
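The two can examples can be reproduced with a short sketch. It uses `statistics.NormalDist().inv_cdf` as the counterpart of Excel's NORMSINV; the function name `mean_confidence_interval` is illustrative, and a known population standard deviation is assumed, as in the example.

```python
import math
from statistics import NormalDist


def mean_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Two-sided z-based confidence interval for the mean (sigma known)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for 95%
    margin = z * sigma / math.sqrt(n)         # margin of error E
    return sample_mean - margin, sample_mean + margin


low_100, high_100 = mean_confidence_interval(12.09, 0.20, 100)
low_400, high_400 = mean_confidence_interval(12.09, 0.20, 400)

print(f"n = 100: {low_100:.4f} to {high_100:.4f} ounces")
print(f"n = 400: {low_400:.4f} to {high_400:.4f} ounces")
```

Quadrupling the sample size halves the margin of error, because the standard error shrinks with the square root of n.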

Example of CI for Proportion
62 out of a sample of 100 individuals who were surveyed by Quick-Lube returned within one month to have their oil changed. A 90% confidence interval for the true proportion of customers who actually returned is $0.62 \pm 1.645 \sqrt{0.62 \times 0.38 / 100} = 0.62 \pm 0.08$, that is, from 0.54 to 0.70.
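A minimal sketch of the same calculation, again using the normal approximation; the function name is illustrative.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.90):
    """Two-sided z-based confidence interval for a population proportion."""
    p = successes / n                          # point estimate p = x/n
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)    # about 1.645 for 90%
    margin = z * math.sqrt(p * (1 - p) / n)    # margin of error E
    return p - margin, p + margin


low, high = proportion_confidence_interval(62, 100, confidence=0.90)
print(f"90% CI for the proportion: {low:.2f} to {high:.2f}")
```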

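From Margin of Error to Sample Size
E unit value = $z_{\alpha/2}\, \sigma / \sqrt{n}$, so $n = z_{\alpha/2}^2 \sigma^2 / E^2$
E proportion = $z_{\alpha/2} \sqrt{p(1 - p)/n}$, so $n = z_{\alpha/2}^2\, p(1 - p) / E^2$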

Sampling: Size
Sample size determination: $n = z_{\alpha/2}^2 \sigma^2 / E^2$
where n = sample size; z = z-score = a factor representing probability in terms of standard deviation; $\alpha$ = 100% - confidence level; E = interval on either side of the mean (margin of error); $\sigma$ = population standard deviation.

Pilot Samples
A pilot sample is a sample taken from the population of interest, of a size smaller than the anticipated sample size, that is used to provide an estimate of the population standard deviation.

Example of Determining Required Sample Size
The manager of the Georgia Timber Mill wishes to construct a 90% confidence interval with a margin of error of 0.50 inches in estimating the mean diameter of logs. A pilot sample of 100 logs yields a sample standard deviation of 4.8 inches.
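The required sample size follows from $n = z_{\alpha/2}^2 \sigma^2 / E^2$, with the pilot standard deviation standing in for $\sigma$. The sketch below rounds up to the next whole log, which is the usual convention but is an assumption here since the transcript does not show the answer.

```python
import math
from statistics import NormalDist


def required_sample_size(sigma, margin_of_error, confidence):
    """Smallest n for which the z-based margin of error is at most E."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil((z * sigma / margin_of_error) ** 2)


# Pilot standard deviation of 4.8 inches used as the estimate of sigma.
n_required = required_sample_size(sigma=4.8, margin_of_error=0.50, confidence=0.90)
print(f"required sample size: {n_required} logs")
```

Under these assumptions the calculation gives about 249.3, so 250 logs in total would be needed.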

RANGE versus CI
Example: The customer’s demand is normally distributed with a mean of 750 units/month and a standard deviation of 100 units/month. What is the probability that the demand will be between 700 units/month and 800 units/month?

RANGE versus CI (Cont.)
1) A RANGE is GIVEN, probability asked (population $\mu$ and $\sigma$ given)
The customer’s demand is normally distributed with a mean of 750 units/month and a standard deviation of 100 units/month. What is the probability that the demand will be between 700 units/month and 800 units/month?
Answer: p(700≤x≤800) = p(x≤800) - p(x≤700) = NORMDIST(800,750,100,TRUE) - NORMDIST(700,750,100,TRUE)
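The same probability can be computed outside Excel. In this sketch `NormalDist(750, 100).cdf(x)` stands in for NORMDIST(x, 750, 100, TRUE).

```python
from statistics import NormalDist

demand = NormalDist(mu=750, sigma=100)   # monthly demand distribution

# p(700 <= x <= 800) = p(x <= 800) - p(x <= 700)
probability = demand.cdf(800) - demand.cdf(700)
print(f"p(700 <= x <= 800) = {probability:.4f}")
```

Both limits lie half a standard deviation from the mean, so the result is the area within ±0.5 standard deviations, roughly 0.38.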

NORMDIST versus CI (Cont.)
PROBABILITY IS GIVEN, upper and lower limits are asked (sample mean, s, n)
What would be the confidence interval for an expected sales level of 750 units/month if you wish to have a 90% confidence level based on 30 observations?
U/LL(x) = $\bar{x}$ ± NORMSINV(1-(α/tails))*(s/SQRT(n))
U/LL(x) = 750 ± NORMSINV(0.95)*100/SQRT(30)
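The Excel expression translates directly, keeping its `tails` parameter. This sketch follows the slide in using the z-score with s = 100 and n = 30; the function name `upper_lower_limits` is illustrative.

```python
import math
from statistics import NormalDist


def upper_lower_limits(sample_mean, s, n, alpha, tails=2):
    """Mirror of x_bar +/- NORMSINV(1 - alpha/tails) * (s / SQRT(n))."""
    critical_value = NormalDist().inv_cdf(1 - alpha / tails)
    margin = critical_value * s / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


# 90% confidence level means alpha = 0.10; two tails give NORMSINV(0.95).
lower, upper = upper_lower_limits(750, 100, 30, alpha=0.10, tails=2)
print(f"90% CI: {lower:.2f} to {upper:.2f} units/month")
```

The margin of error comes out at about 30 units/month, giving an interval of roughly 720 to 780 units/month.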

Next Lesson (Lesson - 04/B)
Hypothesis Testing