# Length-Biased Sampling: A Review of Applications. Termeh Shafie, Department of Statistics, Umeå University


Length-Biased Sampling: A Review of Applications. Termeh Shafie, Department of Statistics, Umeå University. termeh.shafie@stat.umu.se

Outline

1. Length-Biased Sampling & the Estimation Problem
2. Applications & Suggested Solutions
3. Simulation under Misspecified Sampling Inclusion Probabilities

Length-Biased Sampling

- The probability of sample inclusion of a population unit is related to the value of the variable measured.
- Cox (1969): textile fibre sampling.
- A simple illustration of the problem when estimating the population mean.

The Estimation Problem

Assume there is a population with $N$ elements $y_1, \ldots, y_N$. The mean of the population is

$$\mu = \frac{1}{N}\sum_{i=1}^{N} y_i .$$

Suppose $n$ observations form a sample with sample mean

$$\bar{y} = \frac{1}{n}\sum_{i=1}^{N} I_i\, y_i ,$$

where $I_i = 1$ if individual $i$ is sampled and $I_i = 0$ otherwise.

The expected value of the sample mean is

$$E(\bar{y}) = \frac{1}{n}\sum_{i=1}^{N} \pi_i\, y_i ,$$

where $\pi_i = E(I_i)$ are the inclusion probabilities of the population units.

Using simple random sampling, $\pi_i = n/N$ and thus $E(\bar{y}) = \mu$.

However, in general $\pi_i$ is unknown and need not equal $n/N$, and thus $E(\bar{y}) \neq \mu$. The sample mean becomes a biased estimator of the population mean.
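As a small numerical check of this argument, the sketch below evaluates $E(\bar{y}) = \frac{1}{n}\sum_i \pi_i y_i$ exactly for a toy population. The population values, the sample size and the function name `expected_sample_mean` are hypothetical choices made for illustration, and the size-proportional probabilities $\pi_i = n\,y_i / \sum_j y_j$ are just one assumed example of a length-biased design.

```python
from fractions import Fraction

def expected_sample_mean(values, inclusion_probs, n):
    """Return E(ybar) = (1/n) * sum(pi_i * y_i) for a fixed sample size n."""
    return sum(p * y for p, y in zip(inclusion_probs, values)) / n

# Hypothetical toy population (not taken from the presentation).
population = [1, 2, 3, 4, 5]
N, n = len(population), 2
mu = Fraction(sum(population), N)

# Simple random sampling: every unit has inclusion probability n/N.
srs_probs = [Fraction(n, N)] * N

# Assumed length-biased design: inclusion probability proportional to y_i.
total = sum(population)
size_probs = [Fraction(n * y, total) for y in population]

srs_expectation = expected_sample_mean(population, srs_probs, n)
biased_expectation = expected_sample_mean(population, size_probs, n)

print("population mean:", mu)
print("E(ybar) under SRS:", srs_expectation)
print("E(ybar) under size-proportional inclusion:", biased_expectation)
```

Exact fractions are used so that the equality under simple random sampling holds without rounding error; under the size-proportional design the expectation exceeds the population mean.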

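Cox (1969)

Cox derived the length-biased (weighted) pdf and looked at the estimation of the population mean from a length-biased sample. Assume $x_1, \ldots, x_n$ is a random sample with pdf

$$g(x) = \frac{x\, f(x)}{\mu}, \qquad x > 0,$$

where $f(x)$ is the pdf in the original population and $\mu = \int_0^\infty x f(x)\,dx$ is its mean.

It can be shown that

$$E_g\!\left(\frac{1}{X}\right) = \int_0^\infty \frac{1}{x}\cdot\frac{x f(x)}{\mu}\,dx = \frac{1}{\mu}.$$

An unbiased estimator of $1/\mu$ is

$$\frac{1}{n}\sum_{i=1}^{n}\frac{1}{x_i}.$$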

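The estimator $\frac{1}{n}\sum_{i=1}^{n} 1/x_i$ of $1/\mu$ has variance

$$\operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n}\frac{1}{x_i}\right) = \frac{1}{n}\left(\frac{E_f(1/X)}{\mu} - \frac{1}{\mu^2}\right),$$

where $E_f$ denotes expectation under the original pdf $f(x)$ and $\mu = E_f(X)$.

Note: when this variance is finite, the estimator is approximately normally distributed for large $n$, with mean $1/\mu$ and the variance above.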
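The following sketch illustrates the estimator on simulated data. It assumes, purely for illustration, that the original distribution $f$ is Gamma with shape 3 and scale 1 (so $\mu = 3$); length-biasing a Gamma(k, θ) density gives a Gamma(k + 1, θ) density, which is what the code samples from. The function name `cox_estimator`, the seed and the sample size are hypothetical. Note that $n / \sum 1/x_i$ is the reciprocal of the unbiased estimator of $1/\mu$, so it is consistent for $\mu$ rather than exactly unbiased.

```python
import random

def cox_estimator(sample):
    """Harmonic-mean estimator n / sum(1/x_i) of the population mean."""
    return len(sample) / sum(1.0 / x for x in sample)

rng = random.Random(2024)

# Assumed original distribution f = Gamma(shape=3, scale=1), so mu = 3.
shape, scale = 3.0, 1.0
true_mean = shape * scale

# Length-biased draws: x * f(x) / mu is the Gamma(shape + 1, scale) density.
sample = [rng.gammavariate(shape + 1.0, scale) for _ in range(20000)]

naive_mean = sum(sample) / len(sample)   # targets E_g(X) = 4, not mu
cox_mean = cox_estimator(sample)         # targets mu = 3

print("true mean:", true_mean)
print("sample mean of length-biased data:", round(naive_mean, 3))
print("Cox (harmonic-mean) estimate:", round(cox_mean, 3))
```

A shape parameter above 2 is chosen for $f$ so that $E_f(1/X)$, and hence the variance of the estimator, is finite.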

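Cox (1969)

Relation between the moments of $g(x)$ and $f(x)$:

$$E_g(X^r) = \frac{E_f(X^{r+1})}{\mu}, \qquad \mu = E_f(X).$$

With $r = 1$, $E_g(X) = \mu + \sigma^2/\mu$, where $\sigma^2$ is the variance under $f$. The relative bias of the sample mean is thus

$$\frac{E_g(X) - \mu}{\mu} = \frac{\sigma^2}{\mu^2}.$$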
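The moment relation can be checked exactly on a small discrete example. In the sketch below, the three-point distribution `f` and the helper `moment` are hypothetical; the code builds the length-biased distribution $g(x) = x f(x)/\mu$ and compares the relative bias of its mean with the squared coefficient of variation $\sigma^2/\mu^2$ of $f$.

```python
from fractions import Fraction

# Hypothetical discrete distribution f on positive values (illustration only).
f = {1: Fraction(1, 2), 2: Fraction(1, 3), 4: Fraction(1, 6)}

def moment(pmf, r):
    """Return the r-th raw moment of a discrete distribution."""
    return sum(p * x**r for x, p in pmf.items())

mu = moment(f, 1)
sigma2 = moment(f, 2) - mu**2

# Length-biased version of f: g(x) = x * f(x) / mu.
g = {x: x * p / mu for x, p in f.items()}

relative_bias = (moment(g, 1) - mu) / mu

print("sum of g:", sum(g.values()))
print("E_g(X):", moment(g, 1), " E_f(X^2)/mu:", moment(f, 2) / mu)
print("relative bias:", relative_bias, " sigma^2/mu^2:", sigma2 / mu**2)
```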

2. Applications

Technical/Industrial Sampling

Cox (1969): sampling textile fibres and the estimation of the fibre length distribution.

Marketing

Shopping Center Sampling & Mall Intercept Surveys:

- Keillor et al. (2001): global consumer tendencies.
- Sudman (1980): quota sampling techniques and weighting procedures to correct for frequency bias.
- Nowell et al. (1991): correction techniques for length-biased sampling in two situations: when the total length of stay is known or estimated, and when only the recurrence time is known.

Epidemiology

- Sampling procedures for the collection of positive-valued or lifetime data are length-biased (Simon 1980, Zelen et al. 1969).
- Wang (1996): statistical analysis of length-biased data under the proportional hazards model. A pseudo-likelihood approach for estimating the parameters from length-biased data is presented.

Resource Economics

On-site sampling:

- Deriving demand functions for a recreational site (Bockstael 1990, Ovaskainen et al. 2001)
- Charting trip-taking behavior (Bowker 1998)
- Travel cost models of recreational demand (Moons et al. 2001)
- Contingent valuation surveys for eliciting the value of non-market goods (Cameron et al. 1987, Nowell et al. 1988)

Resource Economics

Shaw (1988): three problems with on-site samples' regression:

1. Non-negative integers
2. Truncation
3. Endogenous stratification

Resource Economics

Shaw (1988): recreational demand modeling under two assumptions about the dependent variable's distribution:

1. Normal distribution
2. Poisson distribution: $y = 1, 2, \ldots$
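To make the Poisson case concrete, the sketch below applies the generic size-biasing rule $y\,f(y)/E(Y)$ to a Poisson($\lambda$) count. This is an illustration of the mechanism, not a transcription of the slide's formula: the function names and the value $\lambda = 2.5$ are hypothetical. Under that rule the on-site density reduces to $e^{-\lambda}\lambda^{y-1}/(y-1)!$ for $y = 1, 2, \ldots$, that is, a Poisson count shifted up by one, with mean $\lambda + 1$.

```python
import math

def poisson_pmf(y, lam):
    """Ordinary Poisson probability P(Y = y)."""
    return math.exp(-lam) * lam**y / math.factorial(y)

def on_site_poisson_pmf(y, lam):
    """Size-biased (on-site) probability y * f(y) / E(Y), for y = 1, 2, ..."""
    return y * poisson_pmf(y, lam) / lam

lam = 2.5  # hypothetical rate, for illustration only

# The size-biased pmf at y coincides with the ordinary pmf at y - 1.
for y in range(1, 6):
    print(y, on_site_poisson_pmf(y, lam), poisson_pmf(y - 1, lam))

# Mean of the on-site distribution (the series is truncated at y = 59).
on_site_mean = sum(y * on_site_poisson_pmf(y, lam) for y in range(1, 60))
print("on-site mean:", on_site_mean, " lambda + 1:", lam + 1)
```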

Resource Economics

Englin & Shonkwiler (1995): the negative binomial model. The truncated, stratified model is defined for $y = 1, 2, \ldots$
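The same size-biasing rule can be sketched for a negative binomial count. The code below assumes the common NB2 parameterization with mean $\lambda$ and variance $\lambda + \alpha\lambda^2$; this parameterization, the function names and the parameter values are assumptions for illustration and are not taken from the slide. Weighting the NB2 probabilities by $y/\lambda$ gives a distribution on $y = 1, 2, \ldots$ whose mean is $1 + \lambda + \alpha\lambda$.

```python
import math

def negbin_pmf(y, lam, alpha):
    """NB2 probability with mean lam and variance lam + alpha * lam**2."""
    r = 1.0 / alpha
    log_p = (math.lgamma(y + r) - math.lgamma(y + 1) - math.lgamma(r)
             + y * math.log(alpha * lam)
             - (y + r) * math.log(1.0 + alpha * lam))
    return math.exp(log_p)

def on_site_negbin_pmf(y, lam, alpha):
    """Size-biased (truncated, stratified) probability y * f(y) / lam, y >= 1."""
    return y * negbin_pmf(y, lam, alpha) / lam

lam, alpha = 2.0, 0.5  # hypothetical parameter values

# The series are truncated at y = 399; the tail beyond that is negligible here.
support = range(1, 400)
total_prob = sum(on_site_negbin_pmf(y, lam, alpha) for y in support)
on_site_mean = sum(y * on_site_negbin_pmf(y, lam, alpha) for y in support)

print("total probability:", total_prob)
print("on-site mean:", on_site_mean, " 1 + lam + alpha*lam:", 1 + lam + alpha * lam)
```

Working on the log scale with `math.lgamma` avoids overflow in the gamma functions for large counts.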

Resource Economics

Nunes (2003): binary choice models. The count variable is described by a Poisson distribution with an unobservable heterogeneity term that is correlated with the error term in a probit binary choice model.

3. Misspecification of Sampling Probabilities: A Simulation

Aim: to see whether the effect of misspecified sampling probabilities is large. What happens if time per visit is correlated with frequency of visits when estimating the expected number of visits?

Misspecification of Sampling Probabilities: A Simulation

Time is modeled as a function of frequency of visits when estimating the population mean. The distributions used are Poisson, Exponential and Gamma. The inclusion probabilities are proportional to the time spent at the site: $\pi_i \propto t_i$, where $t_i$ is the time individual $i$ spends at the site.
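A minimal sketch of this kind of sampling mechanism is given below. It is not the presentation's simulation design, whose details are not recoverable from the transcript: the Poisson rate, the link "total time = visits × exponential time per visit", sampling with replacement via `random.choices`, and all names are assumptions made for illustration. The point is only that drawing individuals with probability proportional to time on site over-represents frequent visitors.

```python
import math
import random

def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

rng = random.Random(7)

# Hypothetical population: number of visits per individual.
visits = [poisson_draw(rng, 3.0) for _ in range(5000)]

# Assumed link between time and frequency: total time on site is the number
# of visits times an exponential time per visit (mean 1).
time_on_site = [y * rng.expovariate(1.0) for y in visits]

# On-site sample: selection probability proportional to time on site
# (drawn with replacement, for simplicity).
sample = rng.choices(visits, weights=time_on_site, k=2000)

population_mean = sum(visits) / len(visits)
on_site_mean = sum(sample) / len(sample)

print("population mean of visits:", round(population_mean, 3))
print("on-site sample mean of visits:", round(on_site_mean, 3))
```

Individuals with zero visits receive zero weight and can never be sampled, which mirrors the truncation problem of on-site samples.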

Misspecification of Sampling Probabilities: A Simulation

The three estimators used for the simulation are:

- The sample mean: $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$
- Shaw's estimator: $\bar{y} - 1$
- Cox's estimator: $n \big/ \sum_{i=1}^{n} (1/y_i)$
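The three estimators can be written as short functions, sketched here under the assumption that Shaw's estimator is the sample mean minus one (the correction implied by a Poisson count shifted up by one in on-site samples) and that Cox's estimator is the harmonic mean. The function names and the small data set are hypothetical and serve only to show how the estimators differ on the same on-site sample.

```python
def sample_mean(ys):
    """Uncorrected mean of the on-site observations."""
    return sum(ys) / len(ys)

def shaw_estimator(ys):
    """Sample mean minus one (assumed Poisson on-site correction)."""
    return sample_mean(ys) - 1.0

def cox_estimator(ys):
    """Harmonic mean n / sum(1/y_i); requires strictly positive values."""
    return len(ys) / sum(1.0 / y for y in ys)

# Hypothetical on-site counts of visits (all at least 1).
on_site_counts = [1, 2, 2, 4, 4]

estimates = {
    "sample mean": sample_mean(on_site_counts),
    "Shaw": shaw_estimator(on_site_counts),
    "Cox": cox_estimator(on_site_counts),
}
for name, value in estimates.items():
    print(f"{name}: {value:.3f}")
```

Both corrections pull the estimate below the raw sample mean, but by different amounts: Shaw's shift is constant, while the harmonic mean down-weights large observations.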

Simulation Results

| Estimator | | | |
|---|---|---|---|
| Sample mean | 0.689 | 0.964 | 1.058 |
| | (0.481) | (0.939) | (1.131) |
| | 0.780 | 0.983 | 1.118 |
| | (0.656) | (1.016) | (1.301) |
| Shaw's estimator | -0.311 | -0.036 | 0.058 |
| | (0.103) | (0.011) | (0.015) |
| | -0.220 | -0.017 | 0.118 |
| | (0.096) | (0.050) | (0.065) |
| Cox's estimator | 0.398 | 0.567 | 0.642 |
| | (0.162) | (0.327) | (0.419) |
| | 0.155 | 0.036 | 0.176 |
| | (0.100) | (0.081) | (0.112) |

Summary If the probabilities of sample inclusion of population units are related to the values of the variable measured, the parameter estimates will be biased and inconsistent. Thus correctly specified sampling inclusion mechanisms should not be neglected!

References

- Bockstael, N.E., Strand, I.E., McConnell, K.E., Arsanjani, F., 1990. Sample Selection Bias in the Estimation of Recreational Demand Functions: An Application to Sportfishing. Land Economics 66(1), 40-49.
- Bowker, J.M., Leeworthy, V.R., 1998. Accounting for Ethnicity in Recreation Demand: A Flexible Count Data Approach. Journal of Leisure Research 30(1), 64-78.
- Bush, A.J., Hair, J.F., 1985. An Assessment of the Mall Intercept as a Data Collection Method. Journal of Marketing Research 22, 158-167.
- Cameron, T.A., James, M.D., 1987. Efficient Estimation Methods for "Closed-Ended" Contingent Valuation Surveys. The Review of Economics and Statistics 69, 269-276.
- Cox, D.R., 1969. "Some Sampling Problems in Technology", in New Developments in Survey Sampling, N.L. Johnson and H. Smith, eds. New York: Wiley Interscience.
- Englin, J., Shonkwiler, J.S., 1995. Estimating Social Welfare Using Count Data Models: An Application to Long-Run Recreation Demand under Conditions of Endogenous Stratification and Truncation. Review of Economics and Statistics 77, 104-112.
- Keillor, B.D., D'Amico, M., Horton, V., 2001. Global Consumer Tendencies. Psychology and Marketing 18, 1-19.
- Laitila, T., 1998. Estimation of Combined Site-Choice and Trip-Frequency Models of Recreational Demand Using Choice-Based and On-Site Samples. Economics Letters 64, 17-23.
- Moons, E., Loomis, J., Proost, S., Eggermont, K., Hermy, M., 2001. Travel Cost and Time Measurement in Travel Cost Models. Faculty of Economics and Applied Economic Sciences, Working Paper Series, no. 2001-22.
- Nakanishi, M., 1978. Frequency Bias in Shopper Surveys, in Proceedings of the American Marketing Association Educators' Conference. Chicago: American Marketing Association, 67-70.
- Nowell, C., Evans, M.A., McDonald, L., 1988. Length-Biased Sampling in Contingent Valuation Studies. Land Economics 64 (November), 367-371.
- Nowell, C., Stanley, L.R., 1991. Length-Biased Sampling in Mall Intercept Surveys. Journal of Marketing Research 28, 475-479.
- Nunes, L.C., 2003. Estimating Binary Choice Models with On-Site Samples. Faculdade de Economia, Universidade Nova de Lisboa.
- Ovaskainen, V., Mikkola, J., Pouta, E., 2001. Estimating Recreation Demand with On-Site Data: An Application of Truncated and Endogenously Stratified Count Data Models. Journal of Forest Economics 7(2), 125-144.
- Santos Silva, J.M.C., 1997. Unobservables in Count Data Models for On-Site Samples. Economics Letters 54, 217-220.
- Satten, G.A., Kong, F., Wright, D.J., Glynn, S.A., Schreiber, G.B., 2004. How Special is a 'Special' Interval: Modeling Departure from Length-Biased Sampling in Renewal Processes. Biostatistics 5(1), 145-151.
- Shaw, D., 1988. On-Site Samples' Regression: Problems of Non-negative Integers, Truncation, and Endogenous Stratification. Journal of Econometrics 37, 211-223.
- Simon, R., 1980. Length-Biased Sampling in Etiological Studies. American Journal of Epidemiology 111, 444-452.
- Sudman, S., 1980. Improving the Quality of Shopping Center Sampling. Journal of Marketing Research 17, 423-431.
- Wang, M-C., 1996. Hazards Regression Analysis for Length-Biased Data. Biometrika 83(2), 343-354.
- Zelen, M., Feinleib, M., 1969. On the Theory of Screening for Chronic Diseases. Biometrika 56, 601-614.

And finally she stops…
