Further Updating Poverty Mapping in Albania Gianni Betti*, Andrew Dabalen**, Celine Ferrè** and Laura Neri* * University of Siena, Italy, ** The World Bank, Washington, USA Poverty and Social Inclusion in the Western Balkans WBalkans 2010, Brussels, December 14-15, 2010
2 Scope of the presentation - Introduction to the basic concepts of poverty mapping - Concepts of updating poverty mapping without new census data - Application to Albania: 2002-2005-2008 - Because of time constraints, the results for 2008 and the comparisons with 2005 and 2002 are reported only in the paper
3 THE METHODOLOGY Combines Census and Survey Data to produce disaggregated maps of poverty and inequality (Elbers, Lanjouw and Lanjouw, 2003, Econometrica). THE APPLICATION PROPOSED HERE Census (2001) and LSMS (2002) in Albania, first updated with the LSMS (2005) and then with the LSMS (2008).
4 What is “Poverty Mapping”? Citing the papers of Elbers, Lanjouw and Lanjouw (ELL, 2002 and 2003): poverty and inequality maps are spatial descriptions of the distribution of poverty and inequality and are most useful to policy-makers and researchers when they are finely disaggregated, i.e. when they represent small geographic units, such as cities, municipalities, regions or other administrative partitions of a country.
5 Figure 3. Head Count Ratio by Municipality, 2002.
6 BASIC IDEA OF “POVERTY MAPPING” ESTIMATION (Stage 1): a linear regression model with local variance components is estimated on the LSMS data (the dependent variable is a monetary variable). IMPUTATION or SIMULATION (Stage 2): the estimated distribution of the dependent variable is used to generate its distribution for any subpopulation in the Census, conditional on the observed data. The variables used in the Census and in the LSMS should be comparable (i.e. same categories, etc.), so a so-called Stage 0 is needed before the linear regression model is estimated in Stage 1.
7 Stage 0: are the LSMS and the Census comparable? A full analysis of the two data sources is carried out to construct common variables. In Albania we identified 38 common variables: housing and dwelling conditions and presence of durable goods (23); household head characteristics (8); household socio-demographic characteristics (7). Missing values in the LSMS were imputed (IVEware, Raghunathan et al. 2001). The Census and LSMS distributions of the common variables should then be compared.
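One simple way to carry out the comparison mentioned in Stage 0 is to contrast the category shares of each candidate common variable in the two sources. The sketch below is only an illustration under stated assumptions: the function names, the toy variable and the sampling weights are hypothetical and do not come from the presentation, and the authors may have used different diagnostics.

```python
from collections import Counter


def category_shares(values, weights=None):
    """Return the (optionally weighted) share of each category in `values`."""
    if weights is None:
        weights = [1.0] * len(values)
    totals = Counter()
    for value, weight in zip(values, weights):
        totals[value] += weight
    grand_total = sum(totals.values())
    return {category: total / grand_total for category, total in totals.items()}


def max_share_gap(census_values, survey_values, survey_weights=None):
    """Largest absolute difference in category shares between the two sources."""
    census = category_shares(census_values)
    survey = category_shares(survey_values, survey_weights)
    categories = set(census) | set(survey)
    return max(abs(census.get(c, 0.0) - survey.get(c, 0.0)) for c in categories)


# Hypothetical toy data for one candidate common variable (a dwelling characteristic).
census_water = ["piped", "piped", "well", "piped", "other", "well"]
lsms_water = ["piped", "well", "piped", "other"]
lsms_weights = [2.0, 1.0, 1.0, 1.0]  # hypothetical survey sampling weights

gap = max_share_gap(census_water, lsms_water, lsms_weights)
print(f"largest share gap: {gap:.3f}")
```

A variable with a large gap would be a candidate for recoding or for exclusion from the set of common variables.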
8 Stage 1: Estimation The model is a linear approximation to the conditional distribution of the logarithm of consumption expenditure (or income) $y_{ch}$ of household $h$ in cluster $c$: $\ln y_{ch} = x_{ch}^{\top}\beta + u_{ch}$ (1). The error component $u_{ch}$ is specified to allow for within-cluster correlation in the disturbances. IMPORTANT: separate models are estimated for different strata rather than a single model for the whole LSMS survey.
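The estimation stage can be illustrated with a deliberately simplified sketch: ordinary least squares with a single covariate, followed by a split of each residual into a cluster component (the cluster mean residual) and a household component. This is an assumption made for illustration only; the ELL approach uses a richer specification with many covariates, and the function name and toy data below are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def fit_stage1(log_y, x, cluster):
    """OLS of log consumption on one covariate, then split the residuals into
    a cluster component (cluster mean) and a household component."""
    x_bar, y_bar = mean(x), mean(log_y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, log_y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, log_y)]

    # Group the residuals by cluster to estimate the cluster effect.
    by_cluster = defaultdict(list)
    for c, u in zip(cluster, resid):
        by_cluster[c].append(u)
    eta = {c: mean(us) for c, us in by_cluster.items()}
    eps = [u - eta[c] for c, u in zip(cluster, resid)]
    return intercept, slope, eta, eps


# Hypothetical survey data: six households in two clusters.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
log_y = [8.1, 8.3, 8.2, 8.9, 9.0, 9.2]
cluster = ["a", "a", "a", "b", "b", "b"]

intercept, slope, eta, eps = fit_stage1(log_y, x, cluster)
print(intercept, slope, eta)
```

The cluster component is what generates correlation between the disturbances of households in the same cluster.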
9 Stage 2: Simulation The estimates obtained are applied to the Census data to simulate the expenditure of each household in the Census. A certain number of simulations (i.e. 100) is conducted. The simulated values are: $\hat{y}_{ch} = \exp(x_{ch}^{\top}\tilde{\beta} + \tilde{\eta}_c + \tilde{\varepsilon}_{ch})$ (4), where $\tilde{\eta}_c$ and $\tilde{\varepsilon}_{ch}$ are the simulated cluster and household components of the error. The beta coefficients, $\tilde{\beta}$, are drawn from a multivariate normal distribution with mean $\hat{\beta}$ and variance-covariance matrix equal to the one associated with $\hat{\beta}$.
10 For the residuals, no specific distributional assumption is made: they are drawn directly from the estimated residuals. For each of the simulated consumption expenditure distributions a set of poverty and inequality measures is calculated. The mean over all the simulations gives the point estimates; the standard deviation over all the simulations gives the bootstrap standard errors.
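The simulation stage can be sketched as follows for a single poverty measure, the head count ratio. Everything in the example is an assumption made for illustration: the function name, the toy census records, the coefficient values, the Cholesky factor of the coefficient covariance matrix, the residual pools and the poverty line are all hypothetical.

```python
import math
import random
from statistics import mean, stdev


def simulate_headcount(census_x, census_cluster, beta_hat, beta_chol,
                       eta_pool, eps_pool, poverty_line, n_sim=100, seed=0):
    """Return (point estimate, bootstrap standard error) of the head count ratio."""
    rng = random.Random(seed)
    k = len(beta_hat)
    clusters = sorted(set(census_cluster))  # sorted for reproducible draws
    headcounts = []
    for _ in range(n_sim):
        # Draw beta from N(beta_hat, L L^T), with L lower triangular.
        z = [rng.gauss(0.0, 1.0) for _ in range(k)]
        beta = [beta_hat[j] + sum(beta_chol[j][m] * z[m] for m in range(j + 1))
                for j in range(k)]
        # Draw cluster and household residuals from the estimated residuals.
        eta = {c: rng.choice(eta_pool) for c in clusters}
        poor = 0
        for row, c in zip(census_x, census_cluster):
            log_y = sum(b * v for b, v in zip(beta, row)) + eta[c] + rng.choice(eps_pool)
            if math.exp(log_y) < poverty_line:
                poor += 1
        headcounts.append(poor / len(census_x))
    return mean(headcounts), stdev(headcounts)


# Hypothetical census records: intercept plus one covariate, two clusters.
census_x = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
census_cluster = ["a", "a", "b", "b"]
beta_hat = [8.0, 0.3]
beta_chol = [[0.05, 0.0], [0.01, 0.02]]
eta_pool = [-0.1, 0.1]
eps_pool = [-0.2, 0.0, 0.2]

point_estimate, standard_error = simulate_headcount(
    census_x, census_cluster, beta_hat, beta_chol,
    eta_pool, eps_pool, poverty_line=4000.0)
print(point_estimate, standard_error)
```

The same loop could compute any other poverty or inequality measure on each simulated distribution before averaging.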
11 Updating the 2002 Poverty Mapping In 2005, Dabalen and Ferrè proposed to update the poverty mapping in two phases. First Phase: construct the so-called “counterfactual population distribution”, i.e. the distribution that would have prevailed in 2002 if the parameters of consumption and the distribution of observed and unobserved covariates had been as they were in 2005 (now 2008). Second Phase: apply the ELL methodology described in the previous slides using the “counterfactual population distribution” [CM].
12 CM – logic behind the model - 1 In this case, Dabalen and Ferrè (2005) proposed to construct a counterfactual consumption distribution for the old household survey, using information from both the old and the new household surveys, and to match the corresponding estimates with the old census data, following the methodology proposed by Lemieux (2002).
13 CM – logic behind the model - 2 To construct the counterfactual wealth distribution, first consider a consumption model estimated on the new survey: $y_i^{05} = X_i^{05}\beta^{05} + \varepsilon_i^{05}$ (5), where $y_i^{05}$ denotes consumption in year 2005, $i$ indexes the household, $\beta^{05}$ is a parameter vector (that captures the “returns” to or “price” of covariates in 2005), $X_i^{05}$ is a vector of covariates and $\varepsilon_i^{05}$ is the unobserved component of consumption.
14 CM – logic behind the model - 3 Note that using this new survey without additional adjustment and applying the ELL estimator would be problematic, because the returns to covariates (the parameters of the consumption model) may have changed between 2002 and 2005. In addition, the profile of the population, that is covariates such as education levels, age composition, and so on, may also have changed. Finally, the returns to unobserved covariates may also have changed. To recreate a consumption distribution that resembles consumption of 2002, CM has to account for these changes. Therefore, the counterfactual consumption distribution is constructed in three steps.
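15 CM – First step - 1 The first step is to create the consumption distribution that would have prevailed in 2002 if the parameters had been as in 2005. That is, $\tilde{y}_i^{02} = X_i^{02}\hat{\beta}^{05}$ (6), where $X_i^{02}$ is the covariate vector of household $i$ in the 2002 survey and $\hat{\beta}^{05}$ is the parameter vector estimated on the 2005 survey. Equation (6) accounts for changes in the parameters of the covariates by using the estimated parameters from the new survey to estimate the consumption distribution in the old survey.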
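As a minimal illustration of the first step, the parameters estimated on the new survey are simply applied to the covariates of the old survey. The function name, the coefficient values and the covariate (years of schooling of the household head) are hypothetical and serve only to show the mechanics.

```python
def counterfactual_first_step(x_old, beta_new):
    """Predicted consumption of old-survey households under new-survey parameters."""
    return [sum(b * v for b, v in zip(beta_new, row)) for row in x_old]


# Hypothetical values: intercept plus one covariate (years of schooling of the head).
beta_2005 = [8.0, 0.05]                            # as if estimated on the new survey
x_2002 = [[1.0, 4.0], [1.0, 8.0], [1.0, 12.0]]     # old-survey households

y_step1 = counterfactual_first_step(x_2002, beta_2005)
print(y_step1)
```

At this point only the parameters have been updated; the covariate levels and the residuals still reflect the old survey.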
16 CM – First step - 2 However, in addition to these parameters, levels of covariates may have changed because, for instance, the population is now more educated, etc…
17 CM – Second step - 1 Instead, the CM methodology creates a score that reduces the dimension of the data, by stacking the new and old surveys and then running a probit model: (7) In principle, a large set of observable household-level characteristics can be included, as well as the migration status of the household or any suitable variable that captures the scale of migration, which is of crucial concern when trying to update poverty maps.
18 CM – Second step - 2 Equation (7) allows us to obtain a propensity score, i.e. the predicted probability of being in period $t$ conditional on the observable characteristics. (8) Here $P_t$ is the unconditional probability that an observation belongs to period $t$, i.e. the share of year 2005 observations in total observations (that is, both years).
19 CM – Second step - 3 In this framework, accounting for changes in the distribution of observable characteristics is equivalent to reweighting the consumption distribution estimated in equation (6), so that the CM model becomes: (9)
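The reweighting idea can be sketched with a toy example. Two assumptions are made and should be read as such. First, the weight uses the standard propensity-score form, p / (1 - p) times (1 - P) / P, where p is the probability of belonging to the new survey given the characteristics and P is the share of new-survey observations; this may differ in notation or direction from equation (8) of the presentation. Second, with a single categorical covariate the conditional probability is computed from cell frequencies rather than from a probit model. All names and data are hypothetical.

```python
from collections import Counter


def reweighting_factors(x_old, x_new):
    """Weights that make the old sample resemble the new one in its covariate."""
    n_old, n_new = len(x_old), len(x_new)
    share_new = n_new / (n_old + n_new)       # unconditional P(new survey)
    count_old, count_new = Counter(x_old), Counter(x_new)
    factors = []
    for x in x_old:
        # P(new survey | x), here from cell frequencies instead of a probit fit.
        p = count_new[x] / (count_new[x] + count_old[x])
        factors.append(p / (1.0 - p) * (1.0 - share_new) / share_new)
    return factors


# Hypothetical education of the household head in the old and new surveys.
edu_2002 = ["primary"] * 6 + ["secondary"] * 4
edu_2005 = ["primary"] * 3 + ["secondary"] * 5

weights = reweighting_factors(edu_2002, edu_2005)
reweighted_secondary_share = (
    sum(w for w, e in zip(weights, edu_2002) if e == "secondary") / sum(weights)
)
print(reweighted_secondary_share)
```

After reweighting, the share of secondary education in the old sample matches the share observed in the new sample, which is the sense in which the covariate distribution has been updated.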
20 CM – Third step - 1 The only step remaining is to add a measure of the unobserved component of consumption. If the dispersion in unobserved consumption is due to random events that are unrelated to systematic differences across households, then there would be nothing more to say about the error term. However, one reason to add a measure of the unobserved consumption is that the residual is unlikely to be just a random component of consumption.
21 CM – Third step - 2 CM first estimates a consumption model on the 2002 data and ranks all households by their residual for that year. CM then assigns to each 2002 household the residual that occupies the same rank in the empirical distribution of residuals for 2005 (equation (5)). We now have the counterfactual consumption: the consumption that would have been observed in 2002 if the parameters, the distribution of covariates and the unmeasured determinants of consumption had been as in 2005.
22 CM – Third step - 3 From equations (5) and (9), this counterfactual wealth distribution can be rewritten as: (10) where the residual term denotes the value of the ranked residual in 2005 assigned to a household with the same residual rank in year 2002.
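The rank-based assignment of residuals can be sketched as below. Because the two surveys generally have different sample sizes, the example matches on relative rank (the quantile position), which is an assumption of this sketch and not a detail stated in the presentation; the function name and the residual values are hypothetical.

```python
def match_residuals_by_rank(resid_old, resid_new):
    """For each old-survey household, return the new-survey residual located at
    the same relative rank of the residual distribution."""
    sorted_new = sorted(resid_new)
    n_old, n_new = len(resid_old), len(sorted_new)
    # Indices of old-survey households ordered from lowest to highest residual.
    order = sorted(range(n_old), key=lambda i: resid_old[i])
    matched = [0.0] * n_old
    for rank, i in enumerate(order):
        quantile = (rank + 0.5) / n_old            # mid-rank position in (0, 1)
        j = min(int(quantile * n_new), n_new - 1)  # same position in the new sample
        matched[i] = sorted_new[j]
    return matched


# Hypothetical residuals from the 2002 and 2005 consumption models.
resid_2002 = [0.3, -0.1, 0.0, -0.4]
resid_2005 = [-0.6, -0.2, 0.1, 0.5, 0.9, -0.9]

matched = match_residuals_by_rank(resid_2002, resid_2005)
print(matched)  # [0.9, -0.2, 0.1, -0.9]
```

Adding these matched residuals to the reweighted predictions of the earlier steps would give one version of the counterfactual consumption described above.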
23 Here we have further updated the poverty mapping using the new LSMS conducted in 2008. Clearly, the counterfactual distribution for 2008 is less accurate than the one for 2005. However, the results are still good and the errors remain under control.
24 IMPLEMENTATION OF THE POVERTY MAPPING IN ALBANIA THE DATA: Population and Housing Census (2001) Reference Time: 31 March 2001 Number of Households: 726,895 Number of Persons: 3,069,275 Collected Information: Building, Dwellings, Household, Individuals.
25 Living Standard Measurement Study (LSMS, 2002, 2005 & 2008) Reference Time: Spring 2002, 2005 and 2008 Sampling Frame: 4 Strata, 450 PSUs (corresponding to the EAs in the Census), 8 Households per PSU Number of Households: 3599 Collected Information: Household, Food Consumption, Diary, Community, Price
26 POVERTY MEASURES: The procedure for estimating the poverty measures has been applied for the whole of Albania and disaggregated at seven levels: a) Rural – urban level; b) The four strata used in sampling the LSMS; c) The six strata for which the linear regression models have been estimated; d) The 12 Prefectures (or Counties); e) The 36 Districts; f) The 374 Communes/Municipalities; g) The 11 Mini-municipalities in which the city of Tirana is divided.
27 Future work: New Poverty Mapping using the fresh 2011 Census and the fresh 2011 LSMS THANK YOU FOR YOUR ATTENTION!!