Creating a collection of standardized datasets on household consumption Olivier Dupriez World Bank, Development Data Group 6 June.


1 Creating a collection of standardized datasets on household consumption Olivier Dupriez World Bank, Development Data Group odupriez@worldbank.org 6 June 2013

2 Initial objective: calculate poverty PPPs. Had price data at basic heading level from the ICP; needed consumption shares “at the poverty line” for the same breakdown, to be used as weights. See: A. Deaton and O. Dupriez, “Purchasing power parity exchange rates for the global poor”, American Economic Journal: Applied Economics, vol. 3, pp. 137-166 (2011), and also “Global Poverty and Global Price Indexes”.

3 Intermediary output – data files A collection of “standard” files – Individual level: age, sex – Household level: region, total expenditure (before and after fixing outliers), adult equivalents, household size, etc. – Household + product level: product code (original as in questionnaire, with labels) and COICOP code; value purchased, home produced, received, and total; deflated (when available) and non-deflated; NO information on quantities – Format/structure of the data files is standard; content not so much

4 Multiple uses and users Many potential applications – IFC “Business Opportunities at the Base of the Pyramid” – Micro-macro modeling – Poverty/inequality analysis – Assessment of reliability and relevance of surveys (e.g., list all items related to health with the percentage of respondents, for each survey; e.g., list all categories not covered by questionnaires) – And many more

5 Method Use household consumption/expenditure surveys – A VERY diverse set of surveys (HBS, LSMS, HIES, etc.) – Ex-post harmonization has limits. Map all products and services to COICOP – From 6000+ items in the Brazil survey to fewer than 50 in other countries… Annualize values by product/service and household. Fix outliers. No attempt to fill gaps (no imputation of values for missing products/services). Generate the 3 standard files.
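The annualization step can be illustrated with a minimal Python sketch. It is not the Stata code used for the collection: the recall periods, the multipliers (for example 52 weeks per year), and the record layout are all assumptions made for the example.

```python
# Hypothetical recall periods and annual multipliers (assumptions, not from the slides).
PERIODS_PER_YEAR = {"week": 52, "month": 12, "quarter": 4, "year": 1}


def annualize(value, recall_period):
    """Scale a value reported for one recall period to a full year."""
    return value * PERIODS_PER_YEAR[recall_period]


def annual_totals(records):
    """Sum annualized values by (household, product).

    records: iterable of (household_id, product_code, value, recall_period).
    """
    totals = {}
    for household_id, product_code, value, recall_period in records:
        key = (household_id, product_code)
        totals[key] = totals.get(key, 0.0) + annualize(value, recall_period)
    return totals


# Made-up example records.
example_records = [
    ("h1", "rice", 10.0, "week"),
    ("h1", "rice", 5.0, "month"),
    ("h2", "rent", 100.0, "month"),
]
example_totals = annual_totals(example_records)
```

A fixed multiplier per recall period is the simplest possible rule; the problematic items listed on the annualization slide (durables, ceremonies, and so on) would need item-specific treatment.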

6 Principle – Full replicability One single Stata program per survey – Calls one “generic” program to detect and fix outliers Controlled vocabulary for file names, folder names Survey ID to link to on-line metadata catalog

7 Mapping to COICOP ICP/COICOP: 110 basic headings for household consumption; 105 are relevant for household surveys. Situations: – Many to one (e.g., long list of vegetables) – One to one – One to many (lack of detail in questionnaire) – No data to one (questionnaire missed items)
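The four mapping situations can be sketched in Python as follows. The item names and heading labels are made-up placeholders, not real COICOP or ICP codes, and the dictionary layout is an assumption for illustration only.

```python
# Hypothetical questionnaire items mapped to placeholder heading labels.
# A one-element tuple is a many-to-one or one-to-one mapping;
# a longer tuple is a one-to-many ("grouped") category.
ITEM_TO_HEADINGS = {
    "tomatoes": ("vegetables",),
    "onions": ("vegetables",),
    "rice": ("cereals",),
    "meat, all kinds": ("beef", "poultry"),
}


def map_to_headings(item_values, item_to_headings, all_headings):
    """Aggregate item values by target heading group and list uncovered headings."""
    totals = {}
    for item, value in item_values.items():
        group = item_to_headings[item]
        totals[group] = totals.get(group, 0.0) + value
    covered = {heading for group in totals for heading in group}
    missing = sorted(set(all_headings) - covered)  # "no data to one"
    return totals, missing


# Made-up example values and heading list.
example_values = {"tomatoes": 40.0, "onions": 10.0, "rice": 100.0, "meat, all kinds": 60.0}
example_headings = ["cereals", "vegetables", "beef", "poultry", "tobacco"]
example_totals, example_missing = map_to_headings(
    example_values, ITEM_TO_HEADINGS, example_headings
)
```

Keeping grouped categories as a tuple postpones the decision on how to split them, and the list of missing headings gives a simple coverage report per questionnaire.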

8 Grouped categories One to many: items in questionnaires are not always detailed enough to be mapped to one single COICOP basic heading

9 Missing categories No questionnaire found to cover all 105 categories of products and services. On average, N basic headings missing – Sometimes for known reasons (e.g., pork in Muslim countries) – But questionnaire design needs improvement in all countries

10 Splitting grouped categories Used breakdown from national accounts to split grouped categories (data obtained from ICP)
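A minimal Python sketch of such a split, assuming a proportional allocation based on national accounts values for the headings in the group. The function name, the equal-split fallback, and the numbers are assumptions for illustration; the slides do not describe the exact rule.

```python
def split_grouped(value, headings, sna_values):
    """Split a grouped survey value across headings in proportion to SNA values.

    Falls back to an equal split when the SNA total for the group is not
    positive (an assumption made for this sketch).
    """
    sna_total = sum(sna_values.get(h, 0.0) for h in headings)
    if sna_total <= 0:
        return {h: value / len(headings) for h in headings}
    return {h: value * sna_values.get(h, 0.0) / sna_total for h in headings}


# Made-up example: 100 units reported for a group covering two headings.
example_split = split_grouped(100.0, ("beef", "poultry"), {"beef": 30.0, "poultry": 10.0})
```

Because the split inherits the national accounts proportions, its quality depends on how well those accounts agree with the survey.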

11 Correlation between SNA and surveys From almost perfect (very few cases) to very low (many countries)
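One way such a comparison could be computed is a Pearson correlation between survey-based and national accounts consumption shares across the headings both sources cover. This is only a sketch: the slides do not say which statistic or which level of aggregation was used, and the figures below are made up.

```python
from math import sqrt


def shares(values):
    """Convert values by heading into shares of the total."""
    total = sum(values.values())
    return {k: v / total for k, v in values.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def share_correlation(survey, sna):
    """Correlate survey and SNA shares over the headings present in both."""
    common = sorted(set(survey) & set(sna))
    survey_shares = shares({k: survey[k] for k in common})
    sna_shares = shares({k: sna[k] for k in common})
    return pearson([survey_shares[k] for k in common], [sna_shares[k] for k in common])


# Made-up example in which the two sources have identical shares.
example_r = share_correlation(
    {"food": 50.0, "housing": 30.0, "health": 20.0},
    {"food": 500.0, "housing": 300.0, "health": 200.0, "tobacco": 10.0},
)
```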

12 Annualization challenges Some problematic items: – Durables (use value/expenditure) – Imputed rents – Out of pocket health expenditure – Ceremonies, etc. – Food away from home Validation: compare with official estimates when available, and with PovCal aggregates – Never replicate exactly

13 Detecting and fixing outliers – Top outliers only – Tried multiple options – Based on per capita or per household values, depending on the item – Threshold: 75th percentile + 5 times the interquartile range – Replace with the maximum valid value (zero values not included in calculations) – If a household is an outlier for multiple items, consider it a “rich” household and do not fix – Would deserve a specific research project
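The threshold rule above can be sketched in Python for a single item. This is an illustration, not the generic Stata program mentioned earlier: the percentile interpolation method is an assumption, and the “rich household” exception and the per capita versus per household choice are left out.

```python
from statistics import quantiles


def fix_top_outliers(values, k=5.0):
    """Cap top outliers at the largest valid value.

    Threshold = 75th percentile + k * interquartile range, computed on
    non-zero values only. Values above the threshold are replaced with
    the maximum value that does not exceed it.
    """
    positive = [v for v in values if v > 0]
    if len(positive) < 2:
        return list(values)  # too few observations to compute quartiles
    # The "inclusive" interpolation method is an assumption of this sketch.
    q1, _, q3 = quantiles(positive, n=4, method="inclusive")
    threshold = q3 + k * (q3 - q1)
    cap = max(v for v in positive if v <= threshold)
    return [cap if v > threshold else v for v in values]


# Made-up example: one household reports an implausibly high value.
example_fixed = fix_top_outliers([0, 10, 12, 11, 13, 9, 10, 500])
```

With these example values, the quartiles of the non-zero observations are 10 and 12.5, so the threshold is 25 and the value 500 is replaced by 13, the largest valid value.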

14 Outliers fixing – Significant impact Example: change in Ginis http://datavizint.worldbank.org/t/DECDG/views/GiniAnalyses/Ginis?:embed=y&:display_count=no

15 Past and future – 160 datasets “standardized” (90+ low and middle-income countries) – Many more survey datasets available at WB; could expand and update the collection if resources are available – Conduct in-depth research work on outliers and formulate recommendations to countries – Feedback to countries on issues in questionnaire design – Dissemination of microdata?

