N ational T ransfer A ccounts 1 Data and Estimation Issues Sang-Hyop Lee University of Hawaii at Manoa.

Slides:



Advertisements
Similar presentations
The Wealth Index MICS3 Data Analysis and Report Writing Workshop.
Advertisements

Module B-4: Processing ICT survey data TRAINING COURSE ON THE PRODUCTION OF STATISTICS ON THE INFORMATION ECONOMY Module B-4 Processing ICT Survey data.
Methods of Economic Investigation Lecture 2
N ational T ransfer A ccounts National Transfer Accounts: A Quick Overview Andrew Mason University of Hawaii – Manoa East-West Center.
N ational T ransfer A ccounts Data Review (Hands On) Amonthep Chawla East-West Center & Nihon University Population Research Institute.
Track A: NTA Orientation and Getting Started Gretchen Donehower The Tenth Meeting of Working Group on Macroeconomic Aspects of Intergenerational Transfer.
CHAPTER 1: THE NATURE OF REGRESSION ANALYSIS
United States Country Team Report Third NTA Workshop Honolulu, Hawaii January 20, 2006.
NLSCY – Non-response. Non-response There are various reasons why there is non-response to a survey  Some related to the survey process Timing Poor frame.
Demystifying Data Reference Helping non-specialists make sense of data.
Bridging the Gaps: Dealing with Major Survey Changes in Data Set Harmonization Joint Statistical Meetings Minneapolis, MN August 9, 2005 Presented by:
SAMPLING DISTRIBUTIONS. SAMPLING VARIABILITY
Palestinian Central Bureau of Statistics (PCBS) Palestine Poverty Maps 2009 March
N ational T ransfer A ccounts Comments on Intergenerational Transfers, Age Structure, and Macroeconomics Sang-Hyop Lee University of Hawaii at Manoa WEAI.
FINAL REPORT: OUTLINE & OVERVIEW OF SURVEY ERRORS
Market-based NTA Labor Income and Consumption by Gender Gretchen Donehower Day 4, Session 1, NTA Time Use and Gender Workshop Thursday, May 24, 2012 Institute.
Labor Income Profiles Sang-Hyop Lee November 5, 2007 Prepared for NTA 5 th Workshop SKKU, Seoul, Korea.
Consumption calculations with real data – CORRECTED VERSION (CORRECTIONS IN RED) Gretchen Donehower Day 3, Session 2, NTA Time Use and Gender Workshop.
Sampling : Error and bias. Sampling definitions  Sampling universe  Sampling frame  Sampling unit  Basic sampling unit or elementary unit  Sampling.
N ational T ransfer A ccounts 1 National Transfer Accounts: Private Transfers Marjorie Pajaron University of Hawaii at Manoa & East-West Center.
Fundamentals of Data Analysis. Four Types of Data Alphabetical / Categorical / Nominal data: –Information falls only in certain categories, not in-between.
Dr. Engr. Sami ur Rahman Assistant Professor Department of Computer Science University of Malakand Research Methods in Computer Science Lecture: Research.
N ational T ransfer A ccounts 1 The Lifecycle Deficit: A Review Sang-Hyop Lee University of Hawaii at Manoa.
Track A: Transfers Gretchen Donehower The Tenth Meeting of Working Group on Macroeconomic Aspects of Intergenerational Transfer Beijing, China Tuesday,
Definitions Observation unit Target population Sample Sampled population Sampling unit Sampling frame.
GETTING STARTED Workshop Track A Wednesday, June 5, 9am-10am Gretchen Donehower University of California at Berkeley, Demography United States.
Manual on National Transfer Accounts: Lifecycle Account Training Workshop 10 th NTA Meeting Beijing, November 2014 Andrew Mason University of Hawaii at.
Analyzing Reliability and Validity in Outcomes Assessment (Part 1) Robert W. Lingard and Deborah K. van Alphen California State University, Northridge.
Gender and the Demographic Dividend Karen Oppenheim Mason East-West Center.
3 rd Meeting Macroeconomic Aspects of Intergenerational Transfers Country Report: Indonesia Honolulu January 2006 Maliki Turro Wongkaren Suahasil Nazara.
Data Analysis to NTA Sang-Hyop Lee 41 Summer Seminar June 8, 2010.
N ational T ransfer A ccounts 1 Private Asset-based Reallocations: An Introduction Andrew Mason University of Hawaii and the East-West Center.
Poverty measurement: experience of the Republic of Moldova UNECE, Measuring poverty, 4 May 2015.
Statistical analysis Prepared and gathered by Alireza Yousefy(Ph.D)
1 Introduction to Survey Data Analysis Linda K. Owens, PhD Assistant Director for Sampling & Analysis Survey Research Laboratory University of Illinois.
NTA by SES (NTASES, N project) Sang-Hyop Lee University of Hawii at Manoa East-West Center November 12, 2014 NTA 10, Beijing, PRC.
Welcome and time use data orientation Gretchen Donehower Day 1, Session 1, NTA Time Use and Gender Workshop Monday, May 21, 2012 Institute for Labor, Science.
1 Dealing with Item Non-response in a Catering Survey Pauli Ollila Statistics Finland Kaija Saarni Finnish Game and Fisheries Research Institute Asmo Honkanen.
Handling Attrition and Non- response in the 1970 British Cohort Study Tarek Mostafa Institute of Education – University of London.
N ational T ransfer A ccounts National Transfer Accounts: An Overview Andrew Mason East-West Center University of Hawaii at Manoa.
1 Introduction to Survey Data Analysis Linda K. Owens, PhD Assistant Director for Sampling & Analysis Survey Research Laboratory University of Illinois.
Consumption calculations with real data Gretchen Donehower Day 3, Session 2, NTA Time Use and Gender Workshop Wednesday, May 23, 2012 Institute for Labor,
Application 3: Estimating the Effect of Education on Earnings Methods of Economic Investigation Lecture 9 1.
DATA PREPARATION: PROCESSING & MANAGEMENT Lu Ann Aday, Ph.D. The University of Texas School of Public Health.
SAMPLE SELECTION in Earnings Equation Cheti Nicoletti ISER, University of Essex.
N ational T ransfer A ccounts 1 Public Asset-based Reallocations Amonthep Chawla East-West Center Nihon University Population Research Institute.
Design and Assessment of the Toronto Area Computerized Household Activity Scheduling Survey Sean T. Doherty, Erika Nemeth, Matthew Roorda, Eric J. Miller.
PEP-PMMA Training Session Introduction to Distributive Analysis Lima, Peru Abdelkrim Araar / Jean-Yves Duclos 9-10 June 2007.
N ational T ransfer A ccounts Data and Estimation Issues May 15, 2009 Sang-Hyop Lee University of Hawaii at Manoa.
Transfers and Asset-based Age Profiles by Gender Gretchen Donehower Day 4, Session 2, NTA Time Use and Gender Workshop Thursday, May 24, 2012 Institute.
Expert Group Meeting on MDG, Astana, 5-8 Oct.2009 MDG 3.2: Share of women in wage employment in the non-agricultural sector Sources of discrepancies between.
The Most Important Graph in the World: US Life Cycle Deficits, Gretchen Donehower UC Berkeley Department of Demography September 27, 2006.
Item-Non-Response and Imputation of Labor Income in Panel Surveys: A Cross-National Comparison ITEM-NON-RESPONSE AND IMPUTATION OF LABOR INCOME IN PANEL.
Finalizing Results, Review and Sensitivity Testing Gretchen Donehower Day 2, Session 2, NTA Time Use and Gender Workshop Tuesday, May 22, 2012 Institute.
Demand Forecasting Prof. Ravikesh Srivastava Lecture-11.
N ational T ransfer A ccounts 1 Consumption and Labor Income Profiles: Results Sang-Hyop Lee University of Hawaii at Manoa.
Regression method (basic level) Regression method (basic level) Jo z e Sambt NTA Hands-On Workshop Berkeley, CA January 14, 2009.
: LSS1 Longitudinal Studies Seminars: Longitudinal Analyses Using STATA Stirling University, Data and Variable Management Paul Lambert.
This research is funded in part through a U.S. Health Resources and Services Administration, State Planning Grant to the Hawaii State Department of Health.
Workshop on MDG, Bangkok, Jan.2009 MDG 3.2: Share of women in wage employment in the non-agricultural sector National and global data.
1 Consumption and Production Profiles Andrew Mason Sang-Hyop Lee Maliki January 14, 2005 NTA meeting at Berkeley.
Handling Attrition and Non-response in the 1970 British Cohort Study
The Lifecycle Deficit: A Review
Introduction to Survey Data Analysis
Uninsured Population Hawai`i: Adults Age 19-64
Generational Wealth Accounts Workshop
Lifecycle Deficit (Consumption & Labor Income)
Asset-based Reallocations
Lecture 1: Descriptive Statistics and Exploratory
Calculating NTTA Production Profiles
Presentation transcript:

N ational T ransfer A ccounts 1 Data and Estimation Issues Sang-Hyop Lee University of Hawaii at Manoa

N ational T ransfer A ccounts 2 Data Sets for Statistical Analysis ► Cross section ► Time series ► Cross section time series; useful for aggregate cohort analysis ► Panel (longitudinal)  Repeated cross-section design: most common  Rotating panel design (Cote d’Ivore 1985 data)  Supplemental cross-section design (Kenya & Tanzania 1982/83 data, MFLS) ► Cross section with retrospective information ► Micro vs. Macro

N ational T ransfer A ccounts 3 Quality of Survey Data ► Constructing NTA requires individual or household micro survey data sets. ► A good survey data set has the properties of  Extent (richness): it has the variables of interest at a certain level of details.  Reliability: the variables are measured without error.  Validity: the data set is representative.

N ational T ransfer A ccounts 4 Data Problem (An example) ► FIES (64,433 household with 233,225 individuals)  Measured for only urban area (Valid?)  No single person household (Valid?)  No individual level income, only household level (Rich?)  No information of income for family owned business (Rich?)  Measured for up to 8 household members: discrepancy between the sum of individual and household income (Valid? Rich?)

N ational T ransfer A ccounts 5 Extent (Richness): Missing/Change of Variables ► Not measured in the data  Only measured for a certain group  Labor portion of self-employed income ► Change of variables over time  Institutional/policy change  New consumption items, new jobs, etc ► Change of survey instrument/collapsing

N ational T ransfer A ccounts 6 Reliability: Measurement Error ► Response error  Respondents do not know what is required  Incentive to understate/overstate  Recall bias: related with period of survey  Using wrong/different reporting units ► Reporting error: heaping or outliers ► Coding error ► Overestimate/Underestimate  Parents do not report their children until the children have name  Detect by checking survival rate of single age ► Discrepancy between aggregate value and individual value

N ational T ransfer A ccounts 7 Validity: Censoring ► Selection based on characteristics ► Top/Bottom coding ► Censoring due to the time of survey  Duration of unemployment (left and right censoring)  Completed years of schooling ► Attrition (Panel data)

N ational T ransfer A ccounts 8 Categorical/Qualitative Variables ► Converting categorical to single continuous variables  Grouped by age (population, public education consumption)  Income category (FPL) ► Inconsistency over time ► Categorical  continuous, and vice versa

N ational T ransfer A ccounts 9 Units, Real vs. Nominal ► Be careful about the reporting unit  Measurement units  Reporting period units (reference period, seasonal fluctuation, recall bias) ► Nominal vs. Real  Aggregation across items  Quality change (e.g. computer)  Where inflation is a substantial problem

N ational T ransfer A ccounts 10 Solution for Missing Variables ► Ignore it; random non-response ► Give up: find other source of data (FIES vs. LFS) ► Impute  Based on their characteristics or mean value  Based on the value of other peer group  Modified zero order regressions (y on x) - Create dummy variable for missing variables of x (z) - Replace missing variable with 0 (x’) - Regress y on x’ and z, rather than y on x

N ational T ransfer A ccounts 11 Households vs. Individuals ► Consumption and income measurement are individual level ► But a lot of data are gathered from household  Allocating household consumption (income) to individual household members is a critical part of estimation  Adjusting using aggregate (macro) control

N ational T ransfer A ccounts 12 Headship (Thailand, 1996)

N ational T ransfer A ccounts 13 Measuring Consumption ► Underestimation: e.g. British FES  Using aggregate control mitigate the problem. ► Home produced items: both income and consumption. ► Allocation across individuals is difficult ► Estimating some profiles, such as health expenditure are also difficult in part due to various source of financing.

N ational T ransfer A ccounts 14 Measuring Income ► “All of the difficulties of measuring consumption apply with greater force to the measurement of income” (Deaton, p. 29).  Need detailed information on “transactions” (inflow and outflow): an enormous task  Incentive to understate: using aggregate control mitigate the problem.  Some surveys did not attempt to collect information on asset income (e.g. NSS of India) ► Allocating self-employment income across individuals is difficult.

N ational T ransfer A ccounts 15 Data Cleaning ► Case by case ► Find out what data sets are available and choose the best one (template for workshop) ► Detect outliers and examine them carefully ► A serious examination is required when inflation matters to check whether actual estimation process generate a variable ► Make variables consistent ► Convert categorical variable to continuous variable, etc.

N ational T ransfer A ccounts 16 Weighting and Clustering ► Weight should be used in the summary of variables/direct tabulation/regression/smoothing. ► Frequency Weights; fw indicate replicated data. The weight tells the command how many observations each observation really represents.. tab edu [w=wgt]  tab edu [fw=wgt] ► Analytic Weights; aw are inversely proportional to the variance of an observation. It is appropriate when you are dealing with data containing averages.. su edu [w=wgt]  su edu [aw=wgt]. reg wage edu [w=wgt]  reg wage edu [aw=wgt]

N ational T ransfer A ccounts 17 Weighting and Clustering (cont’d) ► Probability Weights; pw are the sample weight which is the inverse of the probability that this observation was sampled.. reg wage edu [pw=wgt]  reg wage edu [(a)w=wgt], robust. reg wage edu [pw=wgt], cluster(hhid)  reg wage edu [(a)w=wgt], cluster(hhid)  reg wage edu [(a)w=wgt], cluster(hhid)

N ational T ransfer A ccounts 18 Smoothing ► Shows the pattern more clearly by reducing sampling variance ► Should not eliminate real features of the data  Avoid too much smoothing (e.g. old-age health expenditure.)  We don’t want to smooth some profiles (e.g. education)  Basic components should be smoothed, but not aggregations ► Type of smoothing  “lowess” smoothing (Stata) does not incorporate sample weight  Friedman’s super smoothing (R) does.

N ational T ransfer A ccounts 19 Discussion ► Data type/quality varies across countries. ► Estimation method could vary across countries depending on data. ► However, some standard measure could be applied.  Definition  Specification  Estimation using weight  Smoothing  Macro control  Present your work!  If some component vary substantially by age, then it is estimated separately (education, health, etc) (education, health, etc)

N ational T ransfer A ccounts 20 Acknowledgement Support for this project has been provided by the following institutions: ► the John D. and Catherine T. MacArthur Foundation; ► the National Institute on Aging: NIA, R37-AG and NIA, R01-AG025247; ► the International Development Research Centre (IDRC); ► the United Nations Population Fund (UNFPA); ► the Academic Frontier Project for Private Universities: matching fund subsidy from MEXT (Ministry of Education, Culture, Sports, Science and Technology), , granted to the Nihon University Population Research Institute.

N ational T ransfer A ccounts 21 The End