
2 TerraPop Goals
Lower barriers to conducting interdisciplinary human-environment interactions research by making data with different formats from different scientific domains easily interoperable.
Provide an organizational and technical framework to preserve, integrate, disseminate, and analyze global-scale spatiotemporal data describing population and the environment.

3 Source Data Domains & Formats
Population Microdata; Area-Level Data

4 Terra Populus Data Domains
Disparate scientific domains – interrelated processes
Multiple data formats
Diagram labels: Population; Environment; Microdata (individuals and households); Areal data; Land cover; Land use; Climate

5 Population Microdata Structure
Trent Alexander, CIC Conference Presentation on IPUMS, 10/11/07
Rows: household records; person records within households
Columns: variables
Key points: preserve richness of microdata; location based on administrative units
Figure labels: Geographic and housing characteristics; Relationship; Age; Sex; Race; Birthplace; Mother’s birthplace; Occupation

6 Microdata Availability
Thanks to attending partners: Czech Republic*, Mexico*, Brazil, University of Barcelona, Slovenia*, Netherlands*, Spain*, Italy*, Poland*
*anticipating 2010 or 2011 data
Croatia – legislation approved?
Norway & Sweden: NAPP only
Bulgaria no data provided Finland, Slovakia, Kosovo
Denmark?

7 Area-level Data Sources
Census tables, especially where microdata is unavailable
Other types of surveys, data: agricultural surveys; economic surveys, data; election data; legal information

8 Environmental Data (Rasters)
TerraPop Prototype:
Land cover data from satellite images (Global Land Cover 2000)
Agricultural land use data from satellites and government records (Global Landscapes Initiative)
Climate data from weather stations (WorldClim)

9 Location-Based Integration
Microdata  Area-level  Raster

10 Location-Based Integration
Integration across domains and formats hinges on geography
Users get any type of data in a format useful to them
Requires boundary files, with boundaries harmonized over time
Diagram labels: Microdata; Rasters; Area-level data

11 Location-Based Integration
Microdata: individuals and households with their environmental and social context
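One way to picture this microdata output: each person record keeps its own variables and gains the context values of the administrative unit it is located in. The sketch below is a minimal illustration, assuming hypothetical record layouts, unit codes, variable names, and values; it is not taken from the TerraPop system.

```python
# Hypothetical person records; "unit" is the administrative unit code.
persons = [
    {"person_id": 1, "unit": "A01", "age": 34, "occupation": "farmer"},
    {"person_id": 2, "unit": "A01", "age": 8, "occupation": None},
    {"person_id": 3, "unit": "B07", "age": 51, "occupation": "teacher"},
]

# Hypothetical context per unit, e.g. summarized from rasters and census tables.
unit_context = {
    "A01": {"mean_ann_temp": 21.2, "pct_cropland": 40.5},
    "B07": {"mean_ann_temp": 24.3, "pct_cropland": 12.0},
}


def attach_context(persons, unit_context):
    """Return new person records with the context of their unit merged in.

    Records whose unit has no context entry are returned unchanged, so no
    microdata rows are dropped.
    """
    return [{**p, **unit_context.get(p["unit"], {})} for p in persons]


enriched = attach_context(persons, unit_context)
```

Because the join key is the unit code, the quality of this output depends on the codes in the microdata matching the codes on the boundary data.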

12 Location-Based Integration
Area-level data: summarized environmental and population characteristics for administrative districts
County ID  Mean Ann. Temp.  Max. Ann. Precip.  Rent, Rural  Rent, Urban  Own, Rural  Own, Urban
G          21.2             768                3129         1063         637         365
G          23.4             589                2949         1075         1469        717
G          24.3             867                3418         1589         1108        617
G          21.5             943                1882         425          202         142
G          24.1                                2416         572          426         197
G          24.4             697                2560         934          950         563
G          25.6             701                2126         653          321         215
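A per-district environmental summary such as a mean annual temperature column can be produced by averaging raster cells within each unit. The following is a minimal sketch, assuming the raster and the unit boundaries have already been converted to two aligned grids; the grids, unit codes, and function name are hypothetical.

```python
from collections import defaultdict


def zonal_mean(values, zones, nodata=None):
    """Mean of raster cell values per zone.

    values and zones are equally sized 2-D lists; zones holds the unit
    code of each cell. Cells equal to nodata, or with no zone, are skipped.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value_row, zone_row in zip(values, zones):
        for value, zone in zip(value_row, zone_row):
            if value == nodata or zone is None:
                continue
            sums[zone] += value
            counts[zone] += 1
    return {zone: sums[zone] / counts[zone] for zone in sums}


# Hypothetical 2 x 3 temperature raster and matching grid of unit codes.
temperature = [
    [20.0, 22.0, 24.0],
    [21.0, 23.0, None],
]
units = [
    ["A", "A", "B"],
    ["A", "B", "B"],
]
means = zonal_mean(temperature, units)
```

A unit whose cells are all missing does not appear in the result, which would leave an empty cell in a summary table.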

13 Location-Based Integration
Rasters: rasters of population and environment data

14 Boundaries are Key
Linkages across data formats rely on administrative unit boundaries
Particular needs: lower-level boundaries; historical boundaries

15 Administrative Unit Boundary Processing
Obtaining; linking to microdata; temporal harmonization; regionalization

16 Obtaining Boundary Data
Potential sources of digital data: National Statistical Offices; Global Administrative Areas data (e.g., SALB, GAUL)
Digitizing from images or paper maps
Challenges: lower-level and historical data; date mismatches with census data; code matching to microdata

17 Digitizing Boundaries
Leveraging available digital data
Script input: existing digital data; rough digitized boundaries
Script output: relevant boundaries from digital data; relationship between digital and digitized units
Advantages: preserve accuracy and detail; flag areas needing more work

18 Code Matching
Codes link boundaries to microdata records, connect people to places
Boundary data may or may not include codes
Approach: name matching, when possible; map observations – digitizing script captures codes; research on boundary changes
Diagram labels: Boundary shape attributes; IPUMS microdata
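Name matching can be illustrated with a small sketch: normalize unit names from the boundary file and from the microdata documentation, then assign codes where the normalized names agree. The function names, unit names, and codes below are hypothetical, and exact matching after normalization is an assumed simplification; unmatched names are returned for manual follow-up.

```python
import unicodedata


def normalize(name):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in no_accents.lower())
    return " ".join(cleaned.split())


def match_codes(boundary_names, microdata_codes):
    """Assign microdata codes to boundary units by normalized name.

    Returns (matched, unmatched): a dict of boundary name -> code, and a
    list of boundary names that still need manual work.
    """
    lookup = {normalize(name): code for name, code in microdata_codes.items()}
    matched, unmatched = {}, []
    for name in boundary_names:
        code = lookup.get(normalize(name))
        if code is None:
            unmatched.append(name)
        else:
            matched[name] = code
    return matched, unmatched


# Hypothetical code list from the microdata and names from a boundary file.
microdata_codes = {"Sao Paulo": 35, "Parana": 41}
matched, unmatched = match_codes(["São Paulo", "PARANÁ", "Goiás"], microdata_codes)
```

The unmatched list corresponds to the cases where the other approaches on this slide, such as map observations or research on boundary changes, would be needed.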

19 Temporal Harmonization
Purpose: create consistent units for time-series analysis
Top-down strategy: start with first administrative level units; harmonize 2nd level units within 1st level “containers”
Script to create “least common denominator” units: applicable when maps from multiple years are available; creates aggregate units encompassing areas with boundary changes; constructs source-harmonized crosswalk

20 “Erase” interior boundaries applicable to only one census
Apply harmonized codes
Also aids in code matching
Crosswalk
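One way to think about “least common denominator” units is as connected groups: if a unit from one census shares territory with units from another census, all of them fall into the same harmonized unit. The sketch below assumes the pairs of overlapping units have already been derived from the maps (a real overlay would need a tolerance for sliver overlaps); the unit identifiers and the "H1", "H2" labels are hypothetical, and this is not the project's script.

```python
def harmonize(overlaps):
    """Group units from different censuses into harmonized units.

    overlaps: iterable of (unit_a, unit_b) pairs meaning the two units
    share territory. Each connected group becomes one harmonized unit.
    Returns a crosswalk {source_unit: harmonized_id}.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in overlaps:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    roots = {}
    crosswalk = {}
    for unit in sorted(parent):
        root = find(unit)
        crosswalk[unit] = roots.setdefault(root, "H%d" % (len(roots) + 1))
    return crosswalk


# Hypothetical example: unit X is split into X1 and X2 between censuses,
# while unit Y is unchanged.
overlaps = [
    ((1990, "X"), (2000, "X1")),
    ((1990, "X"), (2000, "X2")),
    ((1990, "Y"), (2000, "Y")),
]
crosswalk = harmonize(overlaps)
```

The returned dictionary plays the role of a source-to-harmonized crosswalk: every source unit from either year maps to exactly one harmonized unit.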

21 Regionalization
Confidentiality concerns require minimum 20,000 population in each unit disseminated
REDCAP tool: constructs regions by combining units; regions meet minimum population threshold; contiguity constrained; combines units that are similar in terms of a selected variable
Currently in testing phase: REDCAP algorithms and parameters; optimization variables (e.g., pop. density, education, occupation); testing on Malawi TAs, Brazil 2000 municipios
Guo’s NSF grant number is , in case they ask
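The constraints listed for regionalization (minimum population, contiguity, similarity on a selected variable) can be illustrated with a simple greedy procedure. This is not the REDCAP algorithm; it is a stand-in sketch that repeatedly merges the smallest under-threshold region into its most similar neighbor. All unit names, populations, and attribute values are hypothetical, and adjacency is assumed to be symmetric.

```python
def regionalize(population, attribute, neighbors, min_pop=20000):
    """Greedy contiguity-constrained aggregation (not REDCAP).

    population: {unit: count}; attribute: {unit: value of the selected
    variable}; neighbors: {unit: set of adjacent units}.
    Returns a sorted list of regions, each a sorted list of unit ids.
    """
    members = {u: {u} for u in population}
    pop = dict(population)
    attr = dict(attribute)
    adj = {u: set(neighbors.get(u, ())) for u in population}

    while True:
        small = [r for r in members if pop[r] < min_pop and adj[r]]
        if not small:
            break
        region = min(small, key=lambda r: (pop[r], r))
        # Merge into the adjacent region most similar on the attribute.
        target = min(adj[region], key=lambda n: (abs(attr[n] - attr[region]), n))
        total = pop[target] + pop[region]
        if total:
            # Population-weighted mean of the attribute for the merged region.
            attr[target] = (
                attr[target] * pop[target] + attr[region] * pop[region]
            ) / total
        pop[target] = total
        members[target] |= members.pop(region)
        for n in adj.pop(region):
            adj[n].discard(region)
            if n != target:
                adj[n].add(target)
                adj[target].add(n)
        del pop[region], attr[region]
    return sorted(sorted(m) for m in members.values())


# Hypothetical units: a and b are low density, c and d are high density.
population = {"a": 9000, "b": 12000, "c": 30000, "d": 12000}
density = {"a": 10.0, "b": 12.0, "c": 90.0, "d": 80.0}
neighbors = {"a": {"b", "c"}, "b": {"a", "d"}, "c": {"a", "d"}, "d": {"b", "c"}}
regions = regionalize(population, density, neighbors)
```

In this example the two low-density units are combined with each other and the two high-density units with each other, so both resulting regions exceed 20,000 people. A greedy pass like this gives no guarantee of an optimal grouping, which is one reason to test different algorithms and optimization variables.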

22 Regionalization - Lilongwe, Malawi
Units < 20K combined with neighbors to meet threshold
Specific aggregation depends on: optimization variable; algorithm
Need to check this map on the projector… may be tough to see

23 Beyond Administrative Boundaries
Arbitrary boundaries; rasterization

24 Arbitrary Boundaries
Watersheds, buffers around features, etc.
Near-term: summarize rasters to user-supplied boundaries; identify administrative units intersecting user-supplied boundaries
Future: reallocation based on uniform distribution assumption; reallocation based on other assumptions
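Reallocation under a uniform distribution assumption is commonly done by areal weighting: a zone receives the share of each unit's count equal to the share of that unit's area falling inside the zone. The sketch below assumes the intersection areas have already been computed by a GIS overlay; the function name, unit names, counts, and areas are hypothetical.

```python
def reallocate_uniform(unit_counts, intersections):
    """Estimate counts for target zones by areal weighting.

    unit_counts: {unit: count}
    intersections: list of (unit, zone, area) giving the area of each
    unit that falls inside each target zone. A zone of None stands for
    the part of a unit outside every target zone.
    """
    unit_area = {}
    for unit, _zone, area in intersections:
        unit_area[unit] = unit_area.get(unit, 0.0) + area

    zone_counts = {}
    for unit, zone, area in intersections:
        if zone is None or unit_area[unit] == 0:
            continue
        share = area / unit_area[unit]
        zone_counts[zone] = zone_counts.get(zone, 0.0) + unit_counts[unit] * share
    return zone_counts


# Hypothetical overlay: a watershed covers a quarter of each of two units.
unit_counts = {"U1": 1000, "U2": 400}
intersections = [
    ("U1", "watershed", 25.0),
    ("U1", None, 75.0),
    ("U2", "watershed", 10.0),
    ("U2", None, 30.0),
]
estimate = reallocate_uniform(unit_counts, intersections)
```

Here the watershed is assigned 250 people from U1 and 100 from U2. The estimate is only as good as the assumption that people are spread evenly within each unit, which is why reallocation based on other assumptions is listed separately.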

25 Rasterization
Prototype - all cells in unit get the same value: use lowest level units available; rates only, not counts
Future – distribute based on ancillary data: requires research on available methods; may provide as service, where users select ancillary data, weights, and spatial distribution parameters
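The difference between the prototype and the future approach can be sketched in a few lines. In the first function every cell takes the rate of its unit, which is safe for rates but would inflate counts, since a count repeated in every cell sums to many times the original. The second function shows one possible way to distribute counts using an ancillary weight grid; it is an assumed illustration, not a method the project has selected. Grids, unit codes, and values are hypothetical.

```python
def rasterize_rates(zones, unit_rates, nodata=None):
    """Build a value grid in which every cell takes its unit's rate."""
    return [[unit_rates.get(zone, nodata) for zone in row] for row in zones]


def distribute_counts(zones, unit_counts, weights):
    """Split each unit's count across its cells in proportion to weights.

    Assumes every zone code in the grid has an entry in unit_counts.
    Units whose weights sum to zero receive 0.0 in all cells.
    """
    totals = {}
    for zone_row, weight_row in zip(zones, weights):
        for zone, w in zip(zone_row, weight_row):
            totals[zone] = totals.get(zone, 0.0) + w
    grid = []
    for zone_row, weight_row in zip(zones, weights):
        row = []
        for zone, w in zip(zone_row, weight_row):
            total = totals[zone]
            row.append(unit_counts[zone] * w / total if total else 0.0)
        grid.append(row)
    return grid


# Hypothetical 2 x 3 grid of unit codes.
zones = [
    ["A", "A", "B"],
    ["A", "B", "B"],
]
rate_grid = rasterize_rates(zones, {"A": 0.25, "B": 0.5})

# Hypothetical ancillary weights, e.g. a built-up area indicator per cell.
weights = [
    [1, 1, 0],
    [2, 3, 1],
]
count_grid = distribute_counts(zones, {"A": 400, "B": 200}, weights)
```

The cells of `count_grid` sum back to the original unit totals (400 and 200), which is the property a count raster needs and the uniform-value prototype cannot provide.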

26 Acknowledgements
IPUMS-International Participating Countries: Brazil, Bulgaria, Czech Republic, Germany, Italy, Ireland, Mexico, Netherlands, Poland, Slovenia, Spain
IPUMS-International Supporters & Partners: Eurostat, Universitat Autònoma de Barcelona
