1 1 Ratio estimation under SRS
- Assume:
  - Absence of nonsampling error
  - SRS of size n from a population of size N
- Ratio estimation is an alternative to the usual SRS estimators (e.g., the sample mean $\bar{y}$) that uses "auxiliary" information (X)
- Sample data: observe $y_i$ and $x_i$
- Population information:
  - Have $y_i$ and $x_i$ on all individual units, or
  - Have summary statistics from the population distribution of X, such as the population mean or total of X
- Ratio estimation is also used to estimate a population parameter called a ratio (B)

2 2 Uses
- Estimate a ratio
  - Tree volume or bushels per acre
  - Per capita income
  - Liability to asset ratio
- More precise estimator of population parameters
  - If X and Y are correlated, can improve upon the SRS estimator $\bar{y}$
- Estimating totals when population size N is unknown
  - Avoids the need to know N in the formula for $\hat{t}$
- Domain estimation
  - Obtaining estimates for subsamples
- Incorporate known information into estimates
  - Poststratification
- Adjust for nonresponse

3 3 Estimating a ratio, B
- Population parameter for the ratio: $B = t_y / t_x = \bar{y}_U / \bar{x}_U$
- Examples:
  - Number of bushels harvested (y) per acre (x)
  - Number of children (y) per single-parent household (x)
  - Total usable weight (y) relative to total shipment weight (x) for chickens

4 4 Estimating a ratio
- SRS of n observation units (OUs)
- Collect data on y and x for each OU
- Natural estimator for B?

5 5 Estimating a ratio – 2
- Estimator for B: $\hat{B} = \bar{y} / \bar{x} = \hat{t}_y / \hat{t}_x$
- $\hat{B}$ is a biased estimator for B
- $\hat{B}$ is a ratio of random variables
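A minimal Python sketch of the natural estimator, the ratio of sample means (equivalently, of sample totals). The function name `ratio_estimate` and the toy data are illustrative assumptions, not taken from the slides.

```python
def ratio_estimate(y, x):
    """Estimate B as ybar / xbar, which equals sum(y) / sum(x)."""
    if len(y) != len(x) or not y:
        raise ValueError("y and x must be non-empty and the same length")
    return sum(y) / sum(x)


# Hypothetical sample: bushels harvested (y) and acres (x) on four farms.
y = [120.0, 200.0, 90.0, 150.0]
x = [10.0, 16.0, 8.0, 12.0]
b_hat = ratio_estimate(y, x)  # bushels per acre
```

Because both the numerator and the denominator vary from sample to sample, `b_hat` is a ratio of random variables, which is the source of its bias.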

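6 6 Bias of $\hat{B}$
- Standard approximation: $\mathrm{Bias}(\hat{B}) \approx \left(1 - \frac{n}{N}\right) \frac{1}{n \bar{x}_U^2} \left(B S_x^2 - R S_x S_y\right)$, where R is the population correlation between X and Y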

7 7 Bias of $\hat{B}$ – 2
- Bias is small if:
  - Sample size n is large
  - Sampling fraction n/N is large
  - $\bar{x}_U$ is large
  - $S_x$ is small (population standard deviation for x)
  - High positive correlation between X and Y (see Lohr p. 67)

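8 8 Estimated variance of estimator for B
- Estimator for $V(\hat{B})$: $\hat{V}(\hat{B}) = \left(1 - \frac{n}{N}\right) \frac{s_e^2}{n \bar{x}_U^2}$, where $s_e^2 = \frac{1}{n-1} \sum_{i \in S} e_i^2$ and $e_i = y_i - \hat{B} x_i$
- If $\bar{x}_U$ is unknown? Substitute the sample mean $\bar{x}$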
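A minimal Python sketch of this variance estimate, assuming the standard SRS form $(1 - n/N)\, s_e^2 / (n \bar{x}_U^2)$ with residuals $e_i = y_i - \hat{B} x_i$. The function name `ratio_variance`, its arguments, and the toy data are hypothetical.

```python
import math


def ratio_variance(y, x, N, xbar_U=None):
    """Estimated variance of B_hat = ybar / xbar under SRS without replacement."""
    n = len(y)
    b_hat = sum(y) / sum(x)
    xbar = sum(x) / n
    # If the population mean of x is unknown, fall back to the sample mean.
    mean_x = xbar if xbar_U is None else xbar_U
    # Residuals about the fitted line through the origin; they sum to zero.
    e = [yi - b_hat * xi for yi, xi in zip(y, x)]
    s2_e = sum(ei ** 2 for ei in e) / (n - 1)
    return (1 - n / N) * s2_e / (n * mean_x ** 2)


# Hypothetical sample of n = 3 units from a population of N = 30.
v = ratio_variance([1.0, 2.0, 4.0], [1.0, 2.0, 3.0], N=30)
se = math.sqrt(v)  # standard error of B_hat
```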

9 9 Variance of $\hat{B}$
- Variance is small if:
  - Sample size n is large
  - Sampling fraction n/N is large
  - Deviations about the line, e = y − Bx, are small
  - Correlation between X and Y is close to ±1
  - $\bar{x}_U$ is large

10 10 Ag example – 1
- Frame: 1987 Agricultural Census
- Take an SRS of 300 counties from 3078 counties to estimate conditions in 1992
- Collect data on y (1992); have data on x (1987) for the sample
- Existing knowledge about the population

11 11 Ag example – 2
- Estimate: $\hat{B} = 0.9866$ farm acres in 1992 relative to 1987 farm acres

12 12 Ag example – 3
- Need to calculate the variance of the $e_i$'s

13 13 Ag example – 4
- For each county i, calculate $e_i = y_i - \hat{B} x_i$
- Coffee Co., AL example
- Sum of squares for $e_i$

14 14 Ag example – 5

15 15 Estimating proportions
- If the denominator variable is random, use a ratio estimator to estimate the proportion p
- Example (p. 72):
  - 10 plots under protected oak trees used to assess the effect of feral pigs on native vegetation on Santa Cruz Island, CA
  - Count live seedlings y and total number of seedlings x per plot
  - Y and X are correlated due to common environmental factors
  - Estimate the proportion of live seedlings to total number of seedlings

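16 16 Estimating population mean
- Estimator for $\bar{y}_U$: $\hat{\bar{y}}_r = \hat{B} \bar{x}_U = \bar{y} \cdot (\bar{x}_U / \bar{x})$
- "Adjustment factor" for the sample mean: $\bar{x}_U / \bar{x}$
  - A measure of discrepancy between sample and population information, $\bar{x}$ and $\bar{x}_U$
  - Improves precision if X and Y are positively correlated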

17 17 Underlying model with B > 0
- Model: $y = Bx$
- B is a slope
- B > 0 indicates X and Y are positively correlated
- Absence of intercept implies the line must go through the origin (0, 0)
- [Figure: line through the origin on axes y versus x]

18 18 Using population mean of X to adjust sample mean
- Discrepancy between sample and population information for X is viewed as evidence that the same relative discrepancy exists between $\bar{y}$ and $\bar{y}_U$

19 19 Bias of $\hat{\bar{y}}_r$
- Ratio estimator for the population mean is biased
- Rules of thumb for bias of $\hat{B}$ apply

20 20 Estimator for variance of $\hat{\bar{y}}_r$
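A minimal Python sketch of the ratio estimator of the population mean and its standard error, assuming the usual forms $\hat{\bar{y}}_r = \hat{B} \bar{x}_U$ and $\hat{V} = (1 - n/N)(\bar{x}_U/\bar{x})^2 s_e^2 / n$. The function name `ratio_mean` and the toy data are hypothetical.

```python
import math


def ratio_mean(y, x, xbar_U, N):
    """Return (ratio estimate of the population mean of y, its standard error)."""
    n = len(y)
    xbar = sum(x) / n
    b_hat = sum(y) / sum(x)
    estimate = b_hat * xbar_U  # same as ybar * (xbar_U / xbar)
    e = [yi - b_hat * xi for yi, xi in zip(y, x)]
    s2_e = sum(ei ** 2 for ei in e) / (n - 1)
    var = (1 - n / N) * (xbar_U / xbar) ** 2 * s2_e / n
    return estimate, math.sqrt(var)


# Hypothetical sample; here xbar_U equals the sample mean of x, so the
# adjustment factor is 1 and the estimate coincides with ybar.
est, se = ratio_mean([1.0, 2.0, 4.0], [1.0, 2.0, 3.0], xbar_U=2.0, N=30)
```

Multiplying both the estimate and the standard error by N gives the corresponding figures for the population total.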

21 21 Ag example – 6

22 22 Ag example - 8

23 23 Ag example – 9
- Expect a linear relationship between X and Y (Figure 3.1)
- Note that the sample mean is not equal to the population mean for X

24 24 MSE under ratio estimation
- Recall: MSE = Variance + Bias$^2$
- SRS estimators are unbiased, so MSE = Variance
- Ratio estimators are biased, so MSE > Variance
- Use MSE to compare design/estimation strategies
- Example: compare the sample mean under SRS with the ratio estimator for the population mean under SRS

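25 25 Sample mean vs. ratio estimator of mean
- $\mathrm{MSE}(\hat{\bar{y}}_r)$ is smaller than $V(\bar{y})$ if and only if $R > \frac{B S_x}{2 S_y} = \frac{CV(x)}{2\, CV(y)}$
- For example, if $CV(x) \approx CV(y)$ and $R > 1/2$, ratio estimation will be better than SRS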
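A rough Python check of this comparison, assuming the usual large-sample condition that the ratio estimator beats the sample mean when $R > CV(x) / (2\, CV(y))$ (the slide's own formula did not survive extraction). The sketch plugs in sample estimates for the population quantities and assumes both sample means are positive; the function name and data are hypothetical.

```python
import math


def ratio_beats_sample_mean(y, x):
    """True if the sample correlation exceeds CV(x) / (2 * CV(y))."""
    n = len(y)
    ybar, xbar = sum(y) / n, sum(x) / n
    s_y = math.sqrt(sum((yi - ybar) ** 2 for yi in y) / (n - 1))
    s_x = math.sqrt(sum((xi - xbar) ** 2 for xi in x) / (n - 1))
    s_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / (n - 1)
    r = s_xy / (s_x * s_y)                 # sample correlation
    cv_x, cv_y = s_x / xbar, s_y / ybar    # sample coefficients of variation
    return r > cv_x / (2 * cv_y)


# y proportional to x: equal CVs and correlation 1, so ratio estimation helps.
helps = ratio_beats_sample_mean([2.0, 4.0, 6.0, 8.0], [1.0, 2.0, 3.0, 4.0])
# y nearly constant while x varies a lot: the threshold exceeds 1.
hurts = ratio_beats_sample_mean([10.0, 9.0, 11.0, 10.0], [1.0, 2.0, 3.0, 4.0])
```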

26 26 Estimating the MSE
- Estimate MSE with sample estimates of the bias and variance of the estimator
  - This tends to underestimate MSE
  - The bias and variance formulas are approximations
- Estimated MSE is less biased if:
  - Bias is small (see earlier slide)
    - Large sample size or sampling fraction
    - High positive correlation between X and Y
  - $\bar{x}$ is a precise estimate (small CV for $\bar{x}$)
  - We have a reasonably large sample size (n > 30)

27 27 Ag example – 10

28 28 Estimating population total t
- Estimator for t: $\hat{t}_{yr} = \hat{B} t_x$
- Is $\hat{t}_{yr}$ biased?
- Estimator for $V(\hat{t}_{yr})$

29 29 Ag example – 11

30 30 Summary of ratio estimation

31 31 Summary of ratio estn – 2

32 32 Regression estimation
- What if the relationship between y and x is linear, but does NOT pass through the origin?
- Better model in this case is $y = B_0 + B_1 x$
- [Figure: line of y versus x with intercept $B_0$ and slope $B_1$]

33 33 Regression estimation – 2
- New estimator is a regression estimator
- To estimate $\bar{y}_U$, $\hat{\bar{y}}_{reg}$ is the predicted value from the regression of y on x at $x = \bar{x}_U$
- Adjustment factor for the sample mean is linear, rather than multiplicative

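34 34 Estimating population mean
- Regression estimator: $\hat{\bar{y}}_{reg} = \hat{B}_0 + \hat{B}_1 \bar{x}_U = \bar{y} + \hat{B}_1 (\bar{x}_U - \bar{x})$
- Estimating regression parameters: $\hat{B}_1 = \frac{\sum_{i \in S} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i \in S} (x_i - \bar{x})^2}$, $\hat{B}_0 = \bar{y} - \hat{B}_1 \bar{x}$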

35 35 Estimating pop mean – 2 Sample variances, correlation, covariance
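A minimal Python sketch of the regression estimator of the population mean, assuming the ordinary least-squares slope and intercept and the form $\hat{\bar{y}}_{reg} = \hat{B}_0 + \hat{B}_1 \bar{x}_U$. The function name `regression_mean` and the toy data are hypothetical.

```python
def regression_mean(y, x, xbar_U):
    """Return (regression estimate of the mean of y, intercept, slope)."""
    n = len(y)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sxy / sxx          # least-squares slope
    b0 = ybar - b1 * xbar   # least-squares intercept
    # Predicted value at the population mean of x; equivalently
    # ybar + b1 * (xbar_U - xbar), an additive adjustment to ybar.
    return b0 + b1 * xbar_U, b0, b1


# Hypothetical sample lying exactly on y = 3 + 2x.
est, b0, b1 = regression_mean([5.0, 7.0, 9.0, 11.0], [1.0, 2.0, 3.0, 4.0], xbar_U=5.0)
```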

36 36 Bias in regression estimator

37 37 Estimating variance
- Residual: $e_i = y_i - (\hat{B}_0 + \hat{B}_1 x_i)$
- Note: This is a different residual than in ratio estimation (predicted values differ)
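A minimal Python sketch of the standard error of the regression estimator, assuming the form $(1 - n/N)\, s_e^2 / n$ with residuals taken about the fitted line with intercept. The divisor for $s_e^2$ is an assumption: `df_loss=2` gives the regression mean squared error (n − 2), while `df_loss=1` gives the n − 1 version. All names and data are hypothetical.

```python
import math


def regression_se(y, x, N, df_loss=2):
    """Standard error of the regression estimator of the mean under SRS."""
    n = len(y)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    # Residuals from the line with intercept, not from a line through the origin.
    e = [yi - (b0 + b1 * xi) for yi, xi in zip(y, x)]
    s2_e = sum(ei ** 2 for ei in e) / (n - df_loss)
    return math.sqrt((1 - n / N) * s2_e / n)


# Hypothetical sample of n = 4 from N = 40.
se = regression_se([1.0, 3.0, 2.0, 4.0], [1.0, 2.0, 3.0, 4.0], N=40)
```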

38 38 Estimating the MSE Plugging sample estimates into Lohr, equation 3.13:

39 39 Estimating population total t Is regression estimator for t unbiased?

40 40 Tree example
- Goal: obtain a precise estimate of the number of dead trees in an area
- Sample:
  - Select n = 25 out of N = 100 plots
  - Make a field determination of the number of dead trees per plot, $y_i$
- Population:
  - For all N = 100 plots, have a photo determination of the number of dead trees per plot, $x_i$
  - Calculate $\bar{x}_U$ = 11.3 dead trees per plot

41 41 Tree example – 2
- Lohr, p. 77-78:
  - Data
  - Plot of y vs. x
  - Output from PROC REG
  - Components for calculating estimators and estimating the variance of the estimators
- We will use PROC SURVEYREG, which will give you the correct output for regression estimators

42 42 Tree example – 3 Estimated mean number of dead trees/plot Estimated total number of dead trees

43 43 Tree example – 4
- Due to the small sample size, Lohr uses a t-distribution with n − 2 degrees of freedom
- Half-width for 95% CI
- Approximate 95% CI for $t_y$ is (1115, 1283) dead trees

44 44 Related estimators
- Ratio estimator
  - $B_0 = 0$ gives the ratio model
  - Ratio estimator corresponds to a regression estimator with no intercept
- Difference estimation
  - $B_1 = 1$: slope is assumed to be 1
- [Figure: line of y versus x with intercept $B_0$ and slope $B_1$]

45 45 Domain estimation under SRS
- Usually interested in estimates and inferences for subpopulations, called domains
- If we have not used stratification to set the sample size for each domain, then we should use domain estimation
  - We will assume SRS for this discussion
- If we use stratified sampling with strata = domains, then use stratum estimators (Ch 4)
  - To use stratification, need to know the domain assignment for each unit in the sampling frame prior to sampling

46 46 Stratification vs. domain estimation
- In stratified random sampling:
  - Define the sample size in each stratum before collecting data
  - Sample size in stratum h is fixed, or known
  - In other words, the sample size $n_h$ is the same for each sample selected under the specified design
- In domain estimation:
  - $n_d$ = sample size in domain d is random
  - Don't know $n_d$ until after the data have been collected
  - The value of $n_d$ changes from sample to sample

47 47 Population partitioned into domains
- Recall U = index set for population = {1, 2, …, N}
- Domain index set for domain d = 1, 2, …, D:
  - $U_d$ = {1, 2, …, $N_d$}, where $N_d$ = number of OUs in domain d in the population
- In a sample of size n:
  - $n_d$ = number of sample units from domain d
  - $S_d$ = index set for the sample units belonging to domain d
- [Figure: population partitioned into domains d = 1, d = 2, ..., d = D]

48 48 Boat owner example
- Population: N = 400,000 boat owners (currently licensed)
- Sample: n = 1,500 owners selected using SRS
- Divide the universe (population) into 2 domains:
  - d = 1: own an open motor boat > 16 ft. (large boat)
  - d = 2: do not own this type of boat
- Of the n = 1,500 sample owners:
  - $n_1$ = 472 own an open motor boat > 16 ft.
  - $n_2$ = 1,028 do not own this kind of boat

49 49 New population parameters
- Domain mean: $\bar{y}_{U_d} = \frac{1}{N_d} \sum_{i \in U_d} y_i$
- Domain total: $t_d = \sum_{i \in U_d} y_i$

50 50 Boat owner example – 2
- Estimate population domain mean:
  - Estimate the average number of children for boat owners from domain 1
  - Estimate the proportion of boat owners from domain 1 who have children
- Estimate population domain total:
  - Estimate the total number of children for large boat owners (domain 1)

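51 51 New population parameter – 2
- Ratio form of population domain mean: $\bar{y}_{U_d} = t_u / t_x = B$
- Numerator variable: $u_i = y_i$ if unit i is in domain d, 0 otherwise
- Denominator variable: $x_i = 1$ if unit i is in domain d, 0 otherwise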

52 52 Boat owner example – 3
- Estimate the mean number of children for owners from domain 1
- Zero values for OUs that are not in domain 1
- Applies to the whole population

53 53 Boat example – 4

54 54 Estimator for population domain mean

55 55 Boat example – 5 Domain 1 data

56 56 Boat example – 6
- Domain 1 and domain 2 data combined
- 1104 zeros = 76 zeros from domain 1 + 1028 zeros from domain 2

57 57 Boat example – 7
- Two ways of estimating the mean:
  - Whole data set
  - Domain 1 data only
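A minimal Python sketch showing that the two routes give the same domain mean: a ratio computed on the whole data set (with zeros outside the domain) and a plain mean over the domain's own units. The variable names `u` and `x` follow the numerator/denominator convention, and the data are hypothetical.

```python
def domain_mean_whole_data(y, in_domain):
    """Ratio estimator on the whole sample: sum(u) / sum(x)."""
    u = [yi if d else 0.0 for yi, d in zip(y, in_domain)]  # y inside domain, else 0
    x = [1.0 if d else 0.0 for d in in_domain]             # domain indicator
    return sum(u) / sum(x)


def domain_mean_domain_only(y, in_domain):
    """Plain mean over the sampled units that fall in the domain."""
    values = [yi for yi, d in zip(y, in_domain) if d]
    return sum(values) / len(values)


# Hypothetical sample: number of children (y) and a domain-1 flag.
y = [2.0, 0.0, 1.0, 3.0, 5.0]
in_domain = [True, True, False, True, False]
whole = domain_mean_whole_data(y, in_domain)
only = domain_mean_domain_only(y, in_domain)
```

The denominator `sum(x)` is the domain sample size $n_d$, which is random under SRS; that is why the domain mean behaves like a ratio estimator.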

58 58 Estimator for variance of $\bar{y}_d$

59 59 Boat example – 8

60 60 Boat example – 9

61 61 Approximation for estimator of variance of $\bar{y}_d$
- Domain 1 data only
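A minimal Python sketch comparing the ratio-based variance estimate for a domain mean with the approximation that uses only the domain's own data. It assumes the standard forms $(1 - n/N)\, s_e^2 / (n \bar{x}^2)$, with $e_i = u_i - \bar{y}_d x_i$, and $(1 - n/N)\, s_{yd}^2 / n_d$; the function names and data are hypothetical.

```python
def domain_var_ratio(y, in_domain, N):
    """Ratio-based variance estimate, computed from the whole sample."""
    n = len(y)
    n_d = sum(1 for d in in_domain if d)
    ybar_d = sum(yi for yi, d in zip(y, in_domain) if d) / n_d
    # Residuals are y_i - ybar_d inside the domain and 0 outside it.
    e = [(yi - ybar_d) if d else 0.0 for yi, d in zip(y, in_domain)]
    s2_e = sum(ei ** 2 for ei in e) / (n - 1)
    xbar = n_d / n  # sample mean of the domain indicator
    return (1 - n / N) * s2_e / (n * xbar ** 2)


def domain_var_approx(y, in_domain, N):
    """Approximation from domain data only: (1 - n/N) * s2_yd / n_d."""
    n = len(y)
    values = [yi for yi, d in zip(y, in_domain) if d]
    n_d = len(values)
    ybar_d = sum(values) / n_d
    s2_yd = sum((v - ybar_d) ** 2 for v in values) / (n_d - 1)
    return (1 - n / N) * s2_yd / n_d


# Hypothetical tiny sample (n = 5, N = 50); the two values differ noticeably
# here and move closer together as the sample grows.
y = [2.0, 0.0, 1.0, 3.0, 5.0]
in_domain = [True, True, False, True, False]
v_ratio = domain_var_ratio(y, in_domain, N=50)
v_approx = domain_var_approx(y, in_domain, N=50)
```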

62 62 Estimated variance of $\hat{B}$
- Estimator for $V(\hat{B})$
- Domain variance estimator is directly related

63 63 Relationship to estimating a ratio with
- Population mean of X
- Residual

64 64 Relationship to estimating a ratio with – 2
- Residual variance

65 65 Estimator for variance of

66 66 Estimating a population domain total
- If we know the domain sizes, $N_d$: $\hat{t}_d = N_d \bar{y}_d$

67 67 Estimating a population domain total – 2
- If we do NOT know the domain sizes: $\hat{t}_d = N \bar{u}$
- Standard SRS estimator using u as the variable
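A minimal Python sketch of the domain total when the domain size is unknown, assuming the standard SRS total estimator applied to u (the y value inside the domain, 0 outside): $\hat{t}_d = N \bar{u}$ with standard error $N \sqrt{(1 - n/N)\, s_u^2 / n}$. The function name and data are hypothetical.

```python
import math


def domain_total_unknown_size(y, in_domain, N):
    """Return (estimated domain total, its standard error) under SRS."""
    n = len(y)
    u = [yi if d else 0.0 for yi, d in zip(y, in_domain)]
    ubar = sum(u) / n
    s2_u = sum((ui - ubar) ** 2 for ui in u) / (n - 1)
    total = N * ubar
    se = N * math.sqrt((1 - n / N) * s2_u / n)
    return total, se


# Hypothetical sample of n = 5 from N = 50.
total, se = domain_total_unknown_size(
    [2.0, 0.0, 1.0, 3.0, 5.0], [True, True, False, True, False], N=50
)
```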

68 68 Boat example – 10
- Do not know the domain size, $N_1$

69 69 Comparing 2 domain means
- Suppose we want to test the hypothesis that two domain means are equal
- Construct a z-test with Type I error rate $\alpha$ (for falsely rejecting the null hypothesis)
- Test statistic: $z = (\bar{y}_1 - \bar{y}_2) / \sqrt{\hat{V}(\bar{y}_1) + \hat{V}(\bar{y}_2)}$
- Critical value: $z_{\alpha/2}$
- Reject $H_0$ if $|z| > z_{\alpha/2}$
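A minimal Python sketch of this z-test. It assumes the two domain mean estimates are approximately uncorrelated, so the variance of their difference is taken as the sum of the two estimated variances; the function name and the input numbers are hypothetical. The critical value comes from the standard library's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def compare_domain_means(ybar1, var1, ybar2, var2, alpha=0.05):
    """Return (z statistic, two-sided critical value, reject H0?)."""
    z = (ybar1 - ybar2) / math.sqrt(var1 + var2)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 quantile
    return z, z_crit, abs(z) > z_crit


# Hypothetical domain means and estimated variances.
z, z_crit, reject = compare_domain_means(1.0, 0.04, 1.3, 0.05)
```

With `alpha=0.05` the critical value is the upper 0.025 quantile of the standard normal, about 1.96.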

70 70 Boat example - 10 Large boat owners (d = 1) Other boat owners (d = 2)

71 71 Boat example – 11
- Test whether the domain means are equal at $\alpha$ = 0.05
- Calculate the z-statistic
- Critical value: $z_{\alpha/2} = z_{0.025} = 1.96$
- Apply rejection rule: |z| = |−1.04| = 1.04 < 1.96 = $z_{0.025}$
- Fail to reject $H_0$

72 72 Overview
- Population parameters:
  - Mean
  - Total
  - Proportion (with fixed denominator)
  - Ratio
    - Includes proportion with random denominator
  - Domain mean
  - Domain total

73 73 Overview – 2
- Estimation strategies:
  - No auxiliary information
  - Auxiliary information X, no intercept
    - Y and X positively correlated
    - Linear relationship passes through origin
  - Auxiliary information X, intercept
    - Y and X positively correlated
    - Linear relationship does not pass through origin

74 74 Overview – 3
- Make a table of population parameters (rows) by estimation strategy (columns)
- In each cell, write down:
  - Estimator for the population parameter
  - Estimator for the variance of the estimated parameter
  - Residual $e_i$
- Notes:
  - Some cells will be blank
  - Look for the relationship between mean and total, and between mean and proportion
  - Look at how the variance formulas for many of the estimators have essentially the same form

