Using unequal probability sampling to limit anticipated variances of regression estimators
Anders Holmberg, ICES III 07


1 Using unequal probability sampling to limit anticipated variances of regression estimators
Anders Holmberg, ICES III 07
Anders Holmberg
Department of Research & Development
Statistics Sweden
SE-701 89 Örebro, Sweden
Tel: +46 19 176905
Fax: +46 19 177084
E-mail: Anders.Holmberg@scb.se

2 Outline
–Background
–The problem
–Some theory
–Auxiliary Information
–An application in a business survey
–Comparisons and Results
–Comments

3 Background (1)
–Prepare the sampling frame
–Derive and analyse diagnostic data
–Decide on a sampling design, sampling scheme and estimator
–Launch the survey

4 Background (2)
Prerequisites
–A well-defined business population
–Several parameters of interest
–Design-based inference
–An up-to-date frame from the business register
–Admin. data available as auxiliary information
–Attempt to find the most efficient/(robust) design

5 Background (6)
Auxiliary variables at (t-2) and (t-1):
(1) Number of employees (u1)
(2) Turnover (u2)
(3) Personnel expenses (u3)
(4) Investments (u4)
Study variables at (t):
(1) Number of employees (y1)
(2) Turnover (y2)
(3) Personnel expenses (y3)
(4) Investments (y4)

6 Optimal design in the single variable case
A design that minimizes [formula not preserved] is such that [formula not preserved].
Minimum of [formula not preserved] is [formula not preserved].
Brewer, Hajek, Cassel et al., Rosén
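
For orientation, the standard single-variable result associated with the authors cited on this slide can be written as below. This is a sketch under assumptions, not a reconstruction of the slide's own formulas: it assumes a superpopulation model with unit variances \sigma_k^2, a fixed expected sample size n, and n\sigma_k \le \sum_j \sigma_j for every unit. The symbols ANV, \pi_k and \sigma_k are notation chosen here.

```latex
% Anticipated variance (approximate) for inclusion probabilities \pi_k
\mathrm{ANV}(\pi) \approx \sum_{k \in U} \left( \frac{1}{\pi_k} - 1 \right) \sigma_k^2 ,
\qquad \sum_{k \in U} \pi_k = n .

% Minimizing design: inclusion probabilities proportional to \sigma_k
\pi_k = \frac{n \, \sigma_k}{\sum_{j \in U} \sigma_j} .

% Resulting minimum value
\mathrm{ANV}_{\min} = \frac{1}{n} \Bigl( \sum_{k \in U} \sigma_k \Bigr)^{2} - \sum_{k \in U} \sigma_k^2 .
```

The minimum follows by substituting the optimal \pi_k into the first expression: \sum_k \sigma_k^2 / \pi_k becomes (\sum_k \sigma_k)^2 / n.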

7 Population plot
[Plot not preserved]
E.g. if [formula not preserved]: 'guesstimate' to find size measures
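
The step from a guessed heteroscedasticity pattern to size measures can be sketched as follows. The sketch assumes the model variance is proportional to x_k ** gamma for a positive auxiliary variable x, so that inclusion probabilities proportional to x_k ** (gamma / 2) are the natural choice. The function name, the capping rule for units whose probability would exceed 1, and the example data are all illustrative assumptions, not taken from the presentation.

```python
def size_measure_probabilities(x, gamma, n):
    """Return pi_k proportional to x_k ** (gamma / 2), capped at 1.

    Assumption: model variance is proportional to x_k ** gamma, so the
    model standard deviation is proportional to x_k ** (gamma / 2).
    All x_k are assumed to be strictly positive.
    """
    if not 0 < n < len(x):
        raise ValueError("n must satisfy 0 < n < N")
    size = [xk ** (gamma / 2.0) for xk in x]
    take_all = set()  # units whose inclusion probability is fixed at 1
    while True:
        remaining_n = n - len(take_all)
        total = sum(s for k, s in enumerate(size) if k not in take_all)
        newly_capped = {
            k for k, s in enumerate(size)
            if k not in take_all and remaining_n * s / total >= 1.0
        }
        if not newly_capped:
            break
        take_all |= newly_capped
    return [
        1.0 if k in take_all else remaining_n * s / total
        for k, s in enumerate(size)
    ]


# Hypothetical frame with one very large unit; guessed gamma = 1.
x = [1.0, 4.0, 9.0, 16.0, 400.0]
pi = size_measure_probabilities(x, gamma=1.0, n=2)
print(pi)
```

In this toy frame the largest unit becomes a take-all unit and the remaining expected sample size is spread over the other units in proportion to the square root of x.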

8 The multivariate case?
Auxiliary variables at (t-2) and (t-1):
(1) Number of employees (u1)
(2) Turnover (u2)
(3) Personnel expenses (u3)
(4) Investments (u4)
Study variables at (t):
(1) Number of employees (y1)
(2) Turnover (y2)
(3) Personnel expenses (y3)
(4) Investments (y4)

9 The multivariate case
The least we should do is to analyse the various designs' possible effects on different estimators before we make the design choice.
Derive inclusion probabilities as a function of standardized (univariate) size measures.
Maximal Brewer selection
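
One possible reading of the maximal Brewer selection is sketched below. It assumes that each study variable first gets its own single-variable optimal probabilities (proportional to a model standard deviation, scaled to the target sample size n) and that each unit then receives the largest of these. That interpretation is an assumption; it is consistent with the later remark that this selection needs a larger sample. The function names and the numbers are hypothetical.

```python
def brewer(sigma, n):
    """Single-variable probabilities proportional to sigma_k, capped at 1."""
    total = sum(sigma)
    return [min(1.0, n * s / total) for s in sigma]


def maximal_brewer(sigma_by_variable, n):
    """Per unit, take the largest of the single-variable probabilities.

    Assumption: this is what 'maximal Brewer selection' refers to.
    """
    per_variable = [brewer(sigma, n) for sigma in sigma_by_variable]
    return [max(column) for column in zip(*per_variable)]


# Two hypothetical study variables with opposite size rankings.
sigma_by_variable = [
    [1.0, 2.0, 3.0, 4.0],
    [4.0, 3.0, 2.0, 1.0],
]
pi_max = maximal_brewer(sigma_by_variable, n=2)
expected_sample_size = sum(pi_max)
print(pi_max, expected_sample_size)
```

Because every unit gets at least its single-variable optimum for each variable, the expected sample size (the sum of the probabilities) can only grow beyond n; in this toy example it is about 2.8 instead of 2.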

10 The multivariate case
Try to find a design that in some sense is optimal for all important parameters?
There is no evident criterion of optimality, but some are better than others.
Minimize [formula not preserved] under the restrictions [formula not preserved].

11 The multivariate case: some optimisation approaches
Minimizing a weighted sum of relative efficiency losses: [formula not preserved] is minimized when [formula not preserved].
Scale effects are neutralized: the relations between the ANV_q values and the corresponding single-parameter minimum values (the Brewer selection) are used.
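
The weighted-sum idea can be illustrated with a small sketch. It assumes ANV_q(pi) = sum_k (1/pi_k - 1) * sigma2_qk, that the loss for variable q is ANV_q(pi) / ANV_q,min - 1, and that the weights H_q sum to one. Under those assumptions a Lagrange argument gives probabilities proportional to sqrt(sum_q H_q * sigma2_qk / ANV_q,min). The names `anorel` and `min_anorel_probabilities`, the weights, and the data are illustrative; capping at 1 is ignored for brevity.

```python
import math


def anv(pi, sigma2):
    """Anticipated variance for probabilities pi and model variances sigma2."""
    return sum((1.0 / p - 1.0) * s2 for p, s2 in zip(pi, sigma2))


def brewer(sigma2, n):
    """Single-variable optimum: pi_k proportional to sqrt(sigma2_k)."""
    sd = [math.sqrt(s2) for s2 in sigma2]
    total = sum(sd)
    return [n * s / total for s in sd]


def anorel(pi, sigma2_by_var, weights, n):
    """Weighted sum of relative efficiency losses against each optimum."""
    loss = 0.0
    for h, sigma2 in zip(weights, sigma2_by_var):
        anv_min = anv(brewer(sigma2, n), sigma2)
        loss += h * (anv(pi, sigma2) / anv_min - 1.0)
    return loss


def min_anorel_probabilities(sigma2_by_var, weights, n):
    """Probabilities minimizing the weighted loss (no capping at 1)."""
    anv_min = [anv(brewer(s2, n), s2) for s2 in sigma2_by_var]
    n_units = len(sigma2_by_var[0])
    root = [
        math.sqrt(sum(h * s2[k] / m
                      for h, s2, m in zip(weights, sigma2_by_var, anv_min)))
        for k in range(n_units)
    ]
    total = sum(root)
    return [n * r / total for r in root]


# Two hypothetical variables with conflicting optimal designs.
sigma2_by_var = [[1.0, 4.0, 9.0, 16.0], [16.0, 9.0, 4.0, 1.0]]
weights = [0.5, 0.5]
n = 2
pi_opt = min_anorel_probabilities(sigma2_by_var, weights, n)
loss_opt = anorel(pi_opt, sigma2_by_var, weights, n)
loss_var1 = anorel(brewer(sigma2_by_var[0], n), sigma2_by_var, weights, n)
print(pi_opt, loss_opt, loss_var1)
```

Dividing each ANV_q by its own minimum is what removes scale effects: a variable measured in large units does not dominate the compromise.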

12 The multivariate case: some optimisation approaches
If we want to put restrictions on certain parameters, e.g. [formula not preserved]
Optimization model: [formula not preserved]
Then a design that minimizes ANOREL can be obtained through non-linear programming.
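
A restricted optimisation model of this kind might be written as below. This is a sketch under assumptions rather than the model shown on the slide: ANOREL is taken to be the weighted sum of relative efficiency losses, and the restrictions are taken to be upper bounds c_q on the loss for a chosen subset R of the parameters. The symbols H_q, c_q and R are notation introduced here.

```latex
\begin{aligned}
\min_{\pi} \quad & \mathrm{ANOREL}(\pi)
  = \sum_{q=1}^{Q} H_q \left( \frac{\mathrm{ANV}_q(\pi)}{\mathrm{ANV}_{q,\min}} - 1 \right) \\
\text{subject to} \quad
  & \sum_{k \in U} \pi_k = n, \\
  & 0 < \pi_k \le 1 \quad \text{for all } k \in U, \\
  & \frac{\mathrm{ANV}_q(\pi)}{\mathrm{ANV}_{q,\min}} - 1 \le c_q \quad \text{for } q \in R .
\end{aligned}
```

The objective and the loss restrictions are non-linear in the \pi_k, which is why a non-linear programming solver is needed once restrictions are added.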

13 An Application
The 4 variables studied for three branches (strata):
–SNI25: Manufacturers of food products & beverages, N=749
–SNI28: Manufacturers of metal goods (except machines and devices), N=2292
–SNI33: Manufacturers of optical instruments, N=323
Analysis and comparisons made on admin data from previous reference times.
Plots, estimated correlations and gamma coefficients

14 An Application
–A common ratio model pictures the relationships reasonably well if the corresponding older variable is used as regressor variable. (Strongest pairwise correlation over branches and time, although doubts exist for the investment variable.)
–Estimates of the gamma coefficient are sensitive. Estimates ranged between 0.2 and 0.9 and sometimes deteriorated!?
–For investments, very weak or no heteroscedasticity.
–For the other three variables, [formula not preserved] "cannot be ruled out" and is simple as a guesstimate.

15 An Application
Auxiliary/size variable by study variable
Strata: Food, Metal, Optic
employees    0.5
turnover     0.5
P-costs      0.5
investment   0.20

16 An Application
–Computations of inclusion probabilities and the anticipated variances using the Brewer selection (Maximal Brewer selection)
–Computation of the optimisation-based approaches, with the extra condition that [formula not preserved]

17 Food & Beverages
                                                    Study variables
Considered design                     Employees  Turnover    P-cost    Invest      Mean
Opt. on Empl                                  0      24.3       3.5      24.4      13.0
Opt. on Turn                               24.5         0      19.1      74.4      29.5
Opt. on P-cost                              3.3      16.4         0      43.0      13.0
Opt. on Invest                             34.4      91.7      45.9         0      43.0
Minimizing Anorel                           2.8      13.9       2.9      19.5       9.8
Minimizing Anorel with restrictions         5.7      15.0       6.5      15.0      10.6

18 Optical Instruments
                                                    Study variables
Considered design                     Employees  Turnover    P-cost    Invest      Mean
Opt. on Empl                                  0      13.0       3.4      30.6      11.8
Opt. on Turn                               11.0         0       8.2      51.6      17.7
Opt. on P-cost                              3.0       8.3         0      37.1      12.1
Opt. on Invest                             44.7      73.3      53.4         0      42.8
Minimizing Anorel                           2.9       7.7       3.1      20.1       8.5
Minimizing Anorel with restrictions         4.5      10.9       5.3      15.0       8.9

19 Metal goods
                                                    Study variables
Considered design                     Employees  Turnover    P-cost    Invest      Mean
Opt. on Empl                                  0       7.3       2.0      21.0       7.6
Opt. on Turn                                6.1         0       4.3      33.0      10.9
Opt. on P-cost                              1.8       5.0         0      24.4       7.8
Opt. on Invest                             31.6      51.2      36.1         0      29.7
Minimizing Anorel                           1.7       4.9       1.9      14.0       5.6
Minimizing Anorel with restrictions         3.4       7.0       4.0      10.0       6.1
Maximal Brewer selection satisfies the criteria but with a 25% larger sample.

20 Does it work on the estimator variances?
In most cases we will never know.
However, for these variables we can check against admin. data (coming in 1.5 years later).
Using [formula not preserved], where [symbol not preserved] is the Taylor expanded variance of the ratio estimator under Poisson sampling.
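
A check of this kind can be sketched as follows. The sketch uses the textbook linearization for a ratio estimator under Poisson sampling: with R = t_y / t_x and residuals e_k = y_k - R * x_k, the approximate variance is sum_k (1/pi_k - 1) * e_k ** 2. The exact formula on the slide is not preserved, so this form is an assumption, as are the function names, the toy data, and the reading of the comparison as a percentage excess over the smallest variance.

```python
def taylor_variance_ratio_poisson(y, x, pi):
    """Linearized variance of the ratio estimator under Poisson sampling.

    y, x: full-population values (e.g. later admin data and its regressor).
    pi: inclusion probabilities of the design being evaluated.
    """
    ratio = sum(y) / sum(x)
    return sum(
        (1.0 / p - 1.0) * (yk - ratio * xk) ** 2
        for yk, xk, p in zip(y, x, pi)
    )


def excess_over_smallest_percent(variances):
    """Percentage by which each variance exceeds the smallest one.

    Assumes the smallest variance is strictly positive.
    """
    smallest = min(variances)
    return [100.0 * (v / smallest - 1.0) for v in variances]


# Toy population of three units and a design with equal probabilities.
y = [1.0, 2.0, 6.0]
x = [1.0, 2.0, 3.0]
v = taylor_variance_ratio_poisson(y, x, pi=[0.5, 0.5, 0.5])
losses = excess_over_smallest_percent([2.0, 3.0, 2.5])
print(v, losses)
```

Evaluating the same function for each candidate design, variable by variable, gives the kind of comparison needed to see whether a design that looked good on anticipated variances also performs well on the realized data.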

21 Metal goods
Ratios of the Taylor expanded variances to the smallest variance of each estimator (%)
                                                    Study variables
Considered design                     Employees  Turnover    P-cost    Invest Mean loss
Opt. on Empl                                  0        19         9        72        25
Opt. on Turn                                  9         0        36        67        28
Opt. on P-cost                                8         6         0        68        21
Opt. on Invest                        146 22224134
Minimizing Anorel                     223109
Minimizing Anorel with restrictions   10845818
(Last three rows: column boundaries not recoverable from the transcript.)

22 Summary
–Carefully choosing appropriate size measures limits the anticipated variances of regression estimators, and Brewer's results can be extended to a multivariate situation.
–If there is a multivariate issue and you intend to use auxiliary information in the design, diagnostic computations are important.
–With an optimization approach we know what we are aiming to minimize, and with the non-linear programming approach some practical troubles in designing a pps sample are avoided.

