Length-Biased Sampling: A Review of Applications
Termeh Shafie, Department of Statistics, Umeå University



Outline
1. Length-Biased Sampling & the Estimation Problem
2. Applications & Suggested Solutions
3. Simulation under Misspecified Sampling Inclusion Probabilities

Length-Biased Sampling
- The probability of sample inclusion of a population unit is related to the value of the variable measured.
- Cox (1969): textile fibre sampling.
- A simple illustration of the problem arises when estimating the population mean.

The Estimation Problem
Assume there is a population with $N$ elements $y_1, \dots, y_N$. The mean of the population is $\bar{Y} = \frac{1}{N}\sum_{i=1}^{N} y_i$.

Suppose $n$ observations form a sample with sample mean $\bar{y} = \frac{1}{n}\sum_{i=1}^{N} I_i y_i$, where $I_i = 1$ if individual $i$ is sampled and $I_i = 0$ otherwise.

The expected value of the sample mean is $E(\bar{y}) = \frac{1}{n}\sum_{i=1}^{N} \pi_i y_i$, where $\pi_i = E(I_i)$ are the inclusion probabilities of the population units.

Using simple random sampling, $\pi_i = n/N$ and thus $E(\bar{y}) = \bar{Y}$.

However, in general $\pi_i$ is unknown and need not equal $n/N$, and thus $E(\bar{y}) \neq \bar{Y}$. The sample mean becomes a biased estimator of the population mean.
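The effect of the inclusion probabilities on the expected sample mean can be checked with a small exact calculation. The sketch below is only an illustration: the population values, the sample size, and the helper name `expected_sample_mean` are assumptions made here, not part of the presentation. It evaluates the expected sample mean as the inclusion-probability-weighted sum of the population values divided by the sample size, once under simple random sampling and once with inclusion probabilities proportional to the values themselves.

```python
def expected_sample_mean(y, pi, n):
    """Expected sample mean: (1/n) * sum of pi_i * y_i over the population."""
    return sum(p * v for p, v in zip(pi, y)) / n

# Hypothetical population values (illustrative only).
y = [1.0, 2.0, 3.0, 4.0, 10.0]
N = len(y)
n = 2  # sample size
pop_mean = sum(y) / N

# Simple random sampling: every unit has inclusion probability n/N.
pi_srs = [n / N] * N

# Length-biased design: inclusion probability proportional to y_i,
# scaled so the probabilities sum to n (all of them stay <= 1 here).
total = sum(y)
pi_lb = [n * v / total for v in y]

print("population mean:           ", pop_mean)
print("E(sample mean), SRS:       ", expected_sample_mean(y, pi_srs, n))
print("E(sample mean), length-biased:", expected_sample_mean(y, pi_lb, n))
```

With these particular values the length-biased expectation equals the sum of squared values divided by the sum of values, which exceeds the population mean whenever the values are not all equal.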

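Cox (1969)
Derived the length-biased or weighted pdf and looked at the estimation of the population mean from a length-biased sample. Assume $X_1, \dots, X_n$ is a random sample with pdf $g(x) = \frac{x f(x)}{\mu}$, where $f(x)$ is the pdf of the variable in the population and $\mu$ is its mean.

It can be shown that $E_g(1/X) = 1/\mu$. An unbiased estimator of $1/\mu$ is $\frac{1}{n}\sum_{i=1}^{n} \frac{1}{X_i}$, with variance $\frac{1}{n}\left[E_g(1/X^2) - 1/\mu^2\right]$. Note: for large $n$ the estimator is approximately normally distributed.

Relation between the moments of g(x) and f(x): $E_g(X^r) = E_f(X^{r+1})/\mu$. The relative bias of the sample mean is thus $\frac{E_g(X) - \mu}{\mu} = \frac{\sigma^2}{\mu^2}$, where $\sigma^2$ is the variance under $f$.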
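A quick simulation can illustrate how the naive sample mean and a reciprocal-based estimator behave on a length-biased sample. The following is a minimal sketch under assumptions chosen here for convenience: the population density is taken to be Gamma with shape 3 and scale 1, whose length-biased version is Gamma with shape 4 and the same scale, and the function name `cox_mean_estimate` is hypothetical. The estimator shown is the reciprocal of the mean of 1/x, which is consistent for the population mean but not exactly unbiased for it.

```python
import random

def cox_mean_estimate(xs):
    """n / sum(1/x_i): reciprocal of the unbiased estimator of 1/mu."""
    return len(xs) / sum(1.0 / x for x in xs)

rng = random.Random(0)

# Assumed population density f: Gamma(shape=3, scale=1).
shape, scale = 3.0, 1.0
mu = shape * scale            # population mean
sigma2 = shape * scale ** 2   # population variance

# Length-biasing a Gamma(shape, scale) density gives Gamma(shape + 1, scale),
# so a length-biased sample can be drawn directly.
xs = [rng.gammavariate(shape + 1.0, scale) for _ in range(20000)]

naive = sum(xs) / len(xs)     # targets mu * (1 + sigma2 / mu**2)
cox = cox_mean_estimate(xs)   # targets mu
relative_bias_theory = sigma2 / mu ** 2

print("true mean:", mu)
print("naive sample mean:", round(naive, 3))
print("reciprocal-based estimate:", round(cox, 3))
print("theoretical relative bias of the sample mean:", relative_bias_theory)
```

The shape parameter was chosen larger than 1 on purpose: with an exponential population density the variance of 1/X under the length-biased density is infinite, and the reciprocal-based estimate converges much more slowly.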

2. APPLICATIONS
Technical/Industrial Sampling
Cox (1969): Sampling textile fibres and the estimation of fibre length distribution.

Marketing
Shopping Center Sampling & Mall Intercept Surveys:
- Keillor et al. (2001): Global consumer tendencies.
- Sudman (1980): Quota sampling techniques and weighting procedures to correct for frequency bias.
- Nowell et al. (1991): Correction techniques for length-biased sampling in two situations: when total length of stay is known or estimated, and when only the recurrence time is known.

Epidemiology
- Sampling procedures for the collection of positive-valued or lifetime data are length-biased (Simon 1980, Zelen et al. 1969).
- Wang (1996): Statistical analysis of length-biased data under the proportional hazards model. A pseudo-likelihood approach for estimating the parameters from length-biased data is presented.

Resource Economics
On-site sampling:
- Deriving demand functions for a recreational site (Bockstael 1990, Ovaskainen et al. 2001)
- Charting trip-taking behavior (Bowker 1998)
- Travel cost models of recreational demand (Moons et al. 2001)
- Contingent valuation surveys for the elicitation of non-market goods (Cameron et al. 1987, Nowell et al. 1988)

Resource Economics
Shaw (1988): Three problems with on-site samples’ regression:
1. Non-negative integers
2. Truncation
3. Endogenous stratification

Resource Economics
Shaw (1988): recreational demand modeling under two assumptions about the dependent variable’s distribution:
1. Normal distribution
2. Poisson distribution, with y = 1, 2, …
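For the Poisson case, the standard size-biased result is that weighting a Poisson probability by y and renormalising gives a Poisson distribution shifted up by one, so the on-site counts start at 1 and have mean equal to the Poisson mean plus one. The sketch below checks this numerically; it is an illustration of that general result rather than a reproduction of the slide's formula, and the value of `lam` and the function names are assumptions made here.

```python
import math

def poisson_pmf(y, lam):
    """Ordinary Poisson probability P(Y = y)."""
    return math.exp(-lam) * lam ** y / math.factorial(y)

def onsite_pmf(y, lam):
    """Size-biased Poisson: y * P(Y = y) / E(Y), defined for y = 1, 2, ..."""
    return y * poisson_pmf(y, lam) / lam

lam = 2.5               # hypothetical population mean number of visits
support = range(1, 60)  # the tail beyond 59 is negligible for this lam

total = sum(onsite_pmf(y, lam) for y in support)
mean = sum(y * onsite_pmf(y, lam) for y in support)

# The size-biased pmf at y should equal the ordinary Poisson pmf at y - 1.
shifted_ok = all(
    math.isclose(onsite_pmf(y, lam), poisson_pmf(y - 1, lam))
    for y in support
)

print("sums to one:", math.isclose(total, 1.0))
print("on-site mean:", round(mean, 6), "vs lam + 1 =", lam + 1)
print("equals shifted Poisson:", shifted_ok)
```

Under this model, subtracting one from the mean of an on-site sample recovers the population mean, which is why a simple shift correction works when the Poisson assumption and the visit-proportional inclusion mechanism both hold.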

Resource Economics
Englin & Shonkwiler (1995):
- The Negative Binomial Model. The truncated, stratified model is defined for y = 1, 2, …

Resource Economics
Nunes (2003): Binary Choice Models
The count variable is described by a Poisson distribution with an unobservable heterogeneity term correlated with the error term in a probit binary choice model.

3. Misspecification of Sampling Probabilities: A Simulation
Aim: To see whether the effect of misspecified sampling probabilities is large.
What happens if time per visit is correlated with frequency of visits when estimating the expected number of visits?

Misspecification of Sampling Probabilities: A Simulation
- Time is modeled as a function of frequency of visits when estimating the population mean.
- The simulated variables are drawn from Poisson, Exponential, and Gamma distributions.
- The inclusion probabilities are proportional to the time spent at the site.

Misspecification of Sampling Probabilities: A Simulation
The three estimators used for the simulation are:
- The sample mean
- Shaw’s estimator
- Cox’s estimator
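The design of such a simulation can be sketched as follows. This is not the presentation's actual setup, whose parameter values and exact estimator formulas are not recoverable from the transcript. Everything here is an assumption made for illustration: visits follow a Poisson distribution with mean `lam`, time per visit is exponential and either independent of the number of visits or with mean proportional to it, inclusion is proportional to total time on site, "Shaw's estimator" is taken to be the sample mean minus one, and "Cox's estimator" is taken to be n divided by the sum of reciprocals. Sampling is done with replacement for simplicity.

```python
import math
import random

def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate by multiplying uniforms (Knuth's method)."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def simulate(correlated, lam=4.0, pop_size=20000, n=2000, seed=1):
    """Return the three estimates from one on-site sample (hypothetical design)."""
    rng = random.Random(seed)
    visits = [poisson_draw(rng, lam) for _ in range(pop_size)]
    weights = []
    for y in visits:
        e = rng.expovariate(1.0)
        # Time per visit: exponential with mean y if correlated, else mean 1.
        time_per_visit = y * e if correlated else e
        weights.append(y * time_per_visit)  # total time spent at the site
    # Inclusion proportional to total time; people with zero visits
    # have weight zero and are never drawn.
    sample = rng.choices(visits, weights=weights, k=n)
    ybar = sum(sample) / n
    return {
        "sample_mean": ybar,
        "shaw": ybar - 1.0,                          # assumed form
        "cox": n / sum(1.0 / y for y in sample),     # assumed form
    }

independent = simulate(correlated=False)
correlated = simulate(correlated=True)
print("time independent of frequency:", independent)
print("time correlated with frequency:", correlated)
```

In this sketch the shift correction sits close to `lam` when time per visit is independent of frequency, and drifts upward once time per visit grows with frequency, because the effective inclusion weights then grow faster than linearly in the number of visits.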

Simulation Results
Sample mean: (0.481) (0.939) (1.131) (0.656) (1.016) (1.301)
Shaw’s estimator: (0.103) (0.011) (0.015) (0.096) (0.050) (0.065)
Cox’s estimator: (0.162) (0.327) (0.419) (0.100) (0.081) (0.112)

Summary
If the probabilities of sample inclusion of population units are related to the values of the variable measured and this is not accounted for, the parameter estimates will be biased and inconsistent. Thus correctly specified sampling inclusion mechanisms should not be neglected!

References
Bockstael, N.E., Strand, I.E., McConnell, K.E., Arsanjani, F. (1990). Sample Selection Bias in the Estimation of Recreational Demand Functions: An Application to Sportfishing. Land Economics 66(1), 40-49.
Bowker, J.M., Leeworthy, V.R. (1998). Accounting for Ethnicity in Recreation Demand: A Flexible Count Data Approach. Journal of Leisure Research 30(1).
Bush, A.J., Hair, J.F. An Assessment of the Mall Intercept as a Data Collection Method. Journal of Marketing Research 22.
Cameron, T.A., James, M.D. (1987). Efficient Estimation Methods for "Closed-Ended" Contingent Valuation Surveys. The Review of Economics and Statistics 69.
Cox, D.R. (1969). "Some Sampling Problems in Technology" in New Developments in Survey Sampling, N. L. Johnson and H. Smith, eds. New York: Wiley Interscience.
Englin, J., Shonkwiler, J.S. (1995). Estimating Social Welfare Using Count Data Models: An Application to Long-Run Recreation Demand under Conditions of Endogenous Stratification and Truncation. Review of Economics and Statistics 77.
Keillor, B.D., D'Amico, M., Horton, V. (2001). Global Consumer Tendencies. Psychology and Marketing 18.
Laitila, T. Estimation of Combined Site-Choice and Trip-Frequency Models of Recreational Demand using Choice-based and On-Site Samples. Economics Letters 64.
Moons, E., Loomis, J., Proost, S., Eggermont, K., Hermy, M. (2001). Travel Cost and Time Measurement in Travel Cost Models. Faculty of Economics and Applied Economic Sciences, Working Paper series.
Nakanishi, M. Frequency Bias in Shopper Surveys, in Proceedings of the American Marketing Association Educators' Conference. Chicago: American Marketing Association.
Nowell, C., Evans, M.A., McDonald, L. (1988). Length-Biased Sampling in Contingent Valuation Studies. Land Economics 64 (November).
Nowell, C., Stanley, L.R. (1991). Length-Biased Sampling in Mall Intercept Surveys. Journal of Marketing Research 28.
Nunes, L.C. (2003). Estimating Binary Choice Models With On-Site Samples. Faculdade de Economia, Universidade Nova de Lisboa.
Ovaskainen, V., Mikkola, J., Pouta, E. (2001). Estimating Recreation Demand with On-Site Data: An Application of Truncated and Endogenously Stratified Count Data Models. Journal of Forest Economics 7:2.
Santos Silva, J.M.C. Unobservables in Count Data Models for On-Site Samples. Economics Letters 54.
Satten, G.A., Kong, F., Wright, D.J., Glynn, S.A., Schreiber, G.B. How Special is a 'Special' Interval: Modeling Departure from Length-Biased Sampling in Renewal Processes. Biostatistics 5, 1.
Shaw, D. (1988). On-Site Samples' Regression: Problems of Non-negative Integers, Truncation, and Endogenous Stratification. Journal of Econometrics 37.
Simon, R. (1980). Length-Biased Sampling in Etiological Studies. Am. J. Epidem. 111.
Sudman, S. (1980). Improving the Quality of Shopping Center Sampling. Journal of Marketing Research 17.
Wang, M-C. (1996). Hazards Regression Analysis for Length-Biased Data. Biometrika 2.
Zelen, M., Feinleib, M. (1969). On the Theory of Screening for Chronic Diseases. Biometrika 56.

And finally she stops…