
Published by Conrad Barefield. Modified about 1 year ago.

1
Evaluating district-level income distribution for India using nighttime satellite imagery and other datasets
Tilottama Ghosh, CIRES, University of Colorado, Boulder, USA, and Indicus Analytics Private Limited, New Delhi, India
Mayuri Chaturvedi, Indicus Analytics Private Limited, New Delhi, India
Laveesh Bhandari, Indicus Analytics Private Limited, New Delhi, India
Chris Elvidge, NOAA National Geophysical Data Center (NGDC), Boulder, Colorado, USA
Kim Baugh, CIRES, University of Colorado, Boulder, USA
India Geospatial Forum, Gurgaon, Haryana, 8th February, 2012

2
Overview
Introduction
Research objective
Methods – data used
Analysis
– Step 1: State-level graphical analysis
– Step 2: Model 1
– Step 3: Model 2
Results
Discussion
Conclusion and Future considerations

3
Why use nightlights to study income distribution?
Inclusive growth is one of the major policy thrust areas in the current as well as the next Five-Year Plan
Income distribution data are not easy to come by
Limitations include:
– Under-reporting, over-reporting, misreporting
– Inappropriate sampling and/or weighting
– Lack of standardization across sampling organizations
– Enormous expense involved in data collection
– Political and economic situations in some areas inhibiting data collection
– Huge time lags between collection and publication, and low frequency of data collection
– Coarse spatial resolution, Modifiable Areal Unit Problem
Nightlights (NL) can help circumvent these problems
Introduction

4
Research objective
In this paper, we examine the relationship between nightlights and income distribution, as captured by the number of households in different income brackets. We then include other datasets to improve the estimation.
– Use multinomial regression techniques to study the statistical relationship
– Map the prediction errors to identify the regions with the largest estimation errors
– Use socio-economic insights to understand the probable reasons behind the errors

5
Data used
Radiance-calibrated nighttime image of India, 2004 (Source: NOAA, NGDC)
LandScan population data, 2004 (Source: Oak Ridge National Laboratory with United States Department of Energy)
State and district shapefiles of India (Source: Indicus Analytics Pvt. Ltd.)
Methods

6
Data used
Map panels: Upper Income Households, Middle Income Households, Lower Income Households
Three categories of households defined on the basis of annual household income:
– Upper income households (earning more than Rs 10 lakh per annum)
– Middle income households (earning Rs 3-10 lakh per annum)
– Lower income households (earning less than Rs 3 lakh per annum)
Sum of lights extracted for the states and the districts
Area calculated for the districts
Total population extracted for the districts
Percentage of rural population in each district calculated from Indicus’ data repository, which comprises urban, rural, and total population
Sum of lights and number of households in each income category graphed at the state level
Methods
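
The three income brackets above can be written as a small classification rule. The following is a minimal Python sketch, not the authors' code: the function names and sample incomes are hypothetical, and assigning incomes of exactly Rs 3 lakh or Rs 10 lakh to the middle bracket is an assumption, since the slide does not specify the boundaries.

```python
LAKH = 100_000  # 1 lakh = 100,000 rupees


def income_category(annual_income_rs: float) -> str:
    """Assign a household to an income bracket by annual income in rupees.

    Assumption: incomes of exactly Rs 3 lakh and Rs 10 lakh count as "middle".
    """
    if annual_income_rs > 10 * LAKH:
        return "upper"
    if annual_income_rs >= 3 * LAKH:
        return "middle"
    return "lower"


def count_by_category(incomes):
    """Count households per bracket, the quantity the models try to estimate."""
    counts = {"lower": 0, "middle": 0, "upper": 0}
    for income in incomes:
        counts[income_category(income)] += 1
    return counts


# Made-up household incomes for one hypothetical district.
sample_incomes = [150_000, 450_000, 1_200_000, 300_000, 1_000_000]
counts = count_by_category(sample_incomes)
print(counts)
```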

7
State-level graphical analysis
Lower income households: R² = 0.61
Analysis – Step 1

8
State-level graphical analysis
Middle income households: R² = 0.81
Analysis – Step 1

9
State-level graphical analysis
Upper income households: R² = 0.77
Analysis – Step 1

10
State-level graphical analysis – inferences
Lights clearly have a relationship with the number of households in the different income categories, but they cannot capture the entire picture at the state level
Examples highlight the need for analysis at a finer spatial resolution:
– Maharashtra and Andhra Pradesh (similar lights, dissimilar incomes)
– Madhya Pradesh and Rajasthan (similar incomes, dissimilar lights)
– Uttar Pradesh in the graph and in the NL image (variegated lighting pattern)
The complex role of population is highlighted
Analysis – Step 1

11
Model 1: Using nighttime lights and dummy variables
Plots of nighttime lights against household income suggested a logarithmic relationship
Scatter plot for 585 districts: X = sum of lights, Y = number of households in the upper income group
Scatter plot for 585 districts: X = natural log of sum of lights, Y = natural log of number of households in the upper income group
Analysis – Step 2, Developing Model 1
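
Why a log transform helps can be illustrated with a small Python sketch on made-up data. It assumes a power-law relationship (households = 2 × lights^0.8), which is a hypothetical choice and not a result from the presentation. A straight line fits such data imperfectly on the raw scale but exactly on the log-log scale.

```python
import math


def fit_line(x, y):
    """Least-squares line y = intercept + slope * x, returned with its R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Synthetic data only: households = 2 * lights^0.8 (hypothetical power law).
sum_of_lights = [10.0, 100.0, 1_000.0, 10_000.0, 100_000.0]
households = [2.0 * s ** 0.8 for s in sum_of_lights]

# Fit on the raw scale, then on the natural-log scale.
_, _, r2_raw = fit_line(sum_of_lights, households)
log_lights = [math.log(s) for s in sum_of_lights]
log_households = [math.log(h) for h in households]
slope_log, intercept_log, r2_log = fit_line(log_lights, log_households)

print(f"raw-scale R^2: {r2_raw:.4f}")
print(f"log-log slope: {slope_log:.3f}, log-log R^2: {r2_log:.4f}")
```

On the log-log scale the fitted slope recovers the exponent of the power law, which is how the coefficient on Ln X1 in the models can be read.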

12
Model 1
Dummy variables were created for commercially and administratively important districts, which are also high-population zones
1. Metropolitan Districts: Delhi, Mumbai, Kolkata, Chennai, Pune, Bangalore, Ahmadabad, Hyderabad
2. Suburbs of Metros: Noida, Thane, Gurgaon, Faridabad, Nagpur, Hugli, Surat, …
3. Large Industrialized Towns: Agra, Jamnagar, Kanpur, Rangareddi, Cuttack, Krishna, …
4. State Capitals: Patna, Khordha, Mumbai, Jaipur, …
Other Districts: All the remaining …
Analysis – Step 2, Developing Model 1

13
Model 1
Hypotheses of the model
While we can have data on households in different income brackets, we can obtain information only on the total sum of lights in a region
Hypothesis One: NL should be more closely associated with the richer in any given region than with the poorer
Hypothesis Two: NL will most likely tend to under-estimate the number of poor households and over-estimate the rich households
Logarithmic multivariate regression model used for all three income categories, using the same predictor variables
Diagram labels: Number of households and Contribution to Nightlights, each for Upper Income, Middle Income, and Lower Income
Analysis – Model 1

14
Model 1
Model coefficients
\[ \ln Y = \alpha + \beta_1 \ln X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \beta_5 X_5 \]
Significance markers: * significant at the 99% confidence level, $ significant at the 95% confidence level, # significant at the 90% confidence level
Analysis – Model 1
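
The Model 1 equation can be fitted by ordinary least squares. The sketch below is an illustration on synthetic data, not the authors' procedure or results. It assumes that X1 is the district sum of lights, that X2 to X5 are the four dummy variables (metropolitan, suburb of metro, large industrialized town, state capital), and that each district belongs to at most one category; the slides do not confirm any of these. The district values and "true" coefficients are made up so that the fit can be checked.

```python
import math


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Assumed dummy order for X2..X5; "other" districts are the baseline.
DUMMY_ORDER = ["metro", "suburb", "industrial", "capital"]


def design_row(sum_of_lights, category):
    """Build [1, ln X1, X2, X3, X4, X5] for one district."""
    dummies = [1.0 if category == name else 0.0 for name in DUMMY_ORDER]
    return [1.0, math.log(sum_of_lights)] + dummies


# Synthetic districts: (sum of lights, category). Values are made up.
districts = [
    (50_000.0, "metro"), (80_000.0, "metro"),
    (20_000.0, "suburb"), (35_000.0, "suburb"),
    (9_000.0, "industrial"), (15_000.0, "industrial"),
    (12_000.0, "capital"), (6_000.0, "capital"),
    (500.0, "other"), (1_500.0, "other"), (3_000.0, "other"),
]

# Hypothetical coefficients [alpha, beta1, ..., beta5] used to generate ln Y.
TRUE_COEFS = [1.0, 0.9, 2.0, 1.5, 1.0, 0.5]
rows = [design_row(s, c) for s, c in districts]
ln_households = [sum(b * v for b, v in zip(TRUE_COEFS, row)) for row in rows]

estimated = ols(rows, ln_households)
print([round(b, 4) for b in estimated])
```

Because the synthetic data contain no noise, the fit returns the generating coefficients. With real district data a statistics package would normally be used, since it also reports standard errors and significance levels.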

15
Model 1
Inferences
The relationship between NL and the household categories tightens as income goes up, as seen in the higher adjusted R² values for the middle and upper income category models
Magnitude of the coefficient for NL (β₁) increases as we move from the lower to the higher income segments
Most of the predictor variables are significant at the 99% level
Coefficients of all dummy variables go up monotonically for higher income groups
Lights are better able to estimate households in the more affluent categories (Hypothesis One)
β’s are consistently highest for the Metropolitan dummy, followed by the dummy for Suburbs of Metros, in all three models
Analysis – Model 1
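
Adjusted R² penalizes R² for the number of predictors, which matters when models with different numbers of variables are compared. The sketch below uses the standard formula. The R² value of 0.80 is purely hypothetical (the coefficient table is not reproduced in this transcript), and using the 585 districts from the scatter plots as the sample size is an assumption.

```python
def adjusted_r2(r2: float, n_obs: int, n_predictors: int) -> float:
    """Standard adjusted R^2: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


N_DISTRICTS = 585          # assumed sample size, from the scatter plots
MODEL_1_PREDICTORS = 5     # ln X1 plus four dummies in the Model 1 equation
hypothetical_r2 = 0.80     # made-up value for illustration only

adj = adjusted_r2(hypothetical_r2, N_DISTRICTS, MODEL_1_PREDICTORS)
print(f"adjusted R^2 = {adj:.4f}")
```

With several hundred observations and only a handful of predictors, the adjustment is small, so differences in adjusted R² between the income categories mostly reflect differences in fit.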

16
Model 1
Discussion
Error maps were created to study the pattern of the relationship between nighttime lights and the number of households in each income category
Under-estimation of the number of households was observed in the lower income category for highly populated states with over 80% rural population
Under-estimation of upper income households by NL was observed in the high population density states of UP, Bihar, and Kerala
Under-estimation was smaller for upper- and middle-income households
Over-estimation of lower income households in border districts of Rajasthan
Over-estimation of lower income households in the agriculturally rich states of Punjab and Haryana
Thus, both Hypothesis One and Hypothesis Two proved to be true
Analysis – Model 1

17
Model 2: Using nighttime lights, population density data & including another dummy variable
Population density calculated at the district level
A dummy variable created for districts with percentage of rural population greater than 80%
Analysis – Step 3, Developing Model 2
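
The two additional Model 2 predictors can be derived from district totals. This is a minimal sketch with hypothetical district records and field names. It assumes density is persons per square kilometre and that a rural share of exactly 80% does not set the dummy, following the "greater than 80%" wording.

```python
def population_density(total_population: float, area_sq_km: float) -> float:
    """Persons per square kilometre (unit is an assumption)."""
    if area_sq_km <= 0:
        raise ValueError("area must be positive")
    return total_population / area_sq_km


def rural_dummy(rural_population: float, total_population: float,
                threshold_pct: float = 80.0) -> int:
    """1 if the rural share is strictly greater than the threshold, else 0."""
    rural_pct = 100.0 * rural_population / total_population
    return 1 if rural_pct > threshold_pct else 0


# Made-up district records for illustration.
districts = [
    {"name": "District A", "population": 2_000_000,
     "rural": 1_700_000, "area_sq_km": 4_000.0},
    {"name": "District B", "population": 5_000_000,
     "rural": 500_000, "area_sq_km": 1_000.0},
]

features = {
    d["name"]: (
        population_density(d["population"], d["area_sq_km"]),
        rural_dummy(d["rural"], d["population"]),
    )
    for d in districts
}
print(features)
```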

18
Model 2
Model coefficients
\[ \ln Y = \alpha + \beta_1 \ln X_1 + \beta_2 \ln X_2 + \beta_3 X_3 + \beta_4 X_4 + \beta_5 X_5 + \beta_6 X_6 + \beta_7 X_7 \]
Significance markers: * significant at the 99% confidence level, $ significant at the 95% confidence level, # significant at the 90% confidence level
Analysis – Model 2

19
Model 2
Inferences
Inclusion of population density and the dummy variable for districts with rural population greater than 80% increases the R² for all three income categories
The highest percentage increase in R² (about 13%) is seen for households in the lowest income category
Magnitude of the coefficient for NL (β₁) is highest for the higher income group
Magnitude of the coefficient for population density (β₂) is lowest for the higher income group
The rural population indicator is most significant for the lowest income group
In fact, the rural indicator is negatively correlated with the middle and upper income households
Coefficients of all other dummy variables go up monotonically for higher income groups
Analysis – Model 2

20
Results
Comparing error maps of Model 1 and Model 2
Error maps – Lower income households
Map panels: Model 1, Model 2

21
Results
Comparing error maps of Model 1 and Model 2
Error maps – Middle income households
Map panels: Model 1, Model 2

22
Results
Comparing error maps of Model 1 and Model 2
Error maps – Upper income households
Map panels: Model 1, Model 2

23
Discussion
A good relationship exists between nighttime lights and income distribution at the district level, and it is stronger for households in the highest income category
Inclusion of population density and the dummy variable for districts with rural population greater than 80% produces the greatest improvement in the estimates of lower income households
A study of the error maps shows that, in general, Model 2 expands the yellow areas in the maps (-5 to +5% error), which we treat as ‘acceptable’ percentage errors, across all the income groups
Characteristics noticed in districts with anomalous estimates of economic activity by nightlights include:
– High population density in urban areas
– A big share of rural population and large expanses of cultivated areas which are not lit
– Lack of government provision of public amenities
– Presence of affluent farmers
– Presence of military bases along border areas
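
The "acceptable" band of -5 to +5% can be applied to each district as a simple classification. The sketch below is an assumption-laden illustration: it defines percentage error as (estimated − observed) / observed × 100 on household counts, which the slides do not state, and the district names and values are made up.

```python
def percentage_error(estimated: float, observed: float) -> float:
    """Signed percentage error; negative means under-estimation (assumed convention)."""
    return 100.0 * (estimated - observed) / observed


def error_class(pct_error: float, band: float = 5.0) -> str:
    """Classify an error against the +/- band treated as acceptable."""
    if pct_error < -band:
        return "under-estimate"
    if pct_error > band:
        return "over-estimate"
    return "acceptable"


# Hypothetical (estimated, observed) household counts per district.
districts = {
    "District A": (9_000, 10_000),
    "District B": (10_300, 10_000),
    "District C": (12_000, 10_000),
}

classes = {
    name: error_class(percentage_error(est, obs))
    for name, (est, obs) in districts.items()
}
print(classes)
```

Counting the districts that fall in the acceptable class under each model gives a numeric counterpart to the visual comparison of the error maps.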

24
Conclusion and Future considerations
Conclusion
Analysis of nightlights at a finer spatial resolution is more effective for understanding and using this remotely sensed spatial data as a proxy for economic activity
The same holds true for spatial population data
The developed models (with further improvements) can be used to estimate households in different income categories for years when such data are not available
These models can be useful in studying income inequality
Land use, land cover, and vegetation cover are some of the variables that can be considered for improving the model

25
Thank You!! Questions?
