1
Building Historical Social Science Infrastructure: Data Integration Projects of the Minnesota Population Center
Robert McCaa and Steven Ruggles
Minnesota Population Center
2
How to get data (once approved)
https://www.ipums.org/international (also SAS, STATA)
1. Access website study documentation
2. Make and submit extract
3. Get email: extract ready
4. Retrieve extract
5. Decompress extract
6. Analyze using statistical package
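Steps 5 and 6 can be sketched in Python. This is a minimal illustration, not the project's own tooling. It assumes the extract arrives as a gzip-compressed, fixed-width text file. The slide does not name the compression format, and the column layout below (`year`, `age`, `sex` and their positions) is hypothetical. A real extract's layout comes from the documentation delivered with it.

```python
import gzip
import io

# Hypothetical column layout (name, start, end). These positions are an
# assumption for illustration, not an actual IPUMS record layout.
LAYOUT = [("year", 0, 4), ("age", 4, 7), ("sex", 7, 8)]


def parse_record(line, layout=LAYOUT):
    """Slice one fixed-width line into a dict of integer fields."""
    return {name: int(line[start:end]) for name, start, end in layout}


def read_extract(fileobj, layout=LAYOUT):
    """Decompress a gzip stream (assumed format) and parse each non-empty line."""
    with gzip.open(fileobj, mode="rt", encoding="ascii") as text:
        return [parse_record(line, layout) for line in text if line.strip()]


def make_demo_extract():
    """Build a tiny in-memory gzip file standing in for a downloaded extract."""
    raw = "19700341\n19700672\n"
    return io.BytesIO(gzip.compress(raw.encode("ascii")))


if __name__ == "__main__":
    for record in read_extract(make_demo_extract()):
        print(record)
```

In practice `read_extract` would be given an open file handle for the retrieved extract, and the parsed records would then go to whatever statistical package is used for analysis.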
3
Outline
History of Public Use Census Microdata
IPUMS
IPUMS-International
NAPP
Differences among the projects
–Data format
–Harmonization
–Administration, work processes, and legal constraints
4
History of U.S. Public Use Census Microdata
The 1960 One-In-One-Thousand Public Use Sample
The 1970 Public Use Samples
DUALabs, Beresford, and the harmonized and expanded 1960 sample
The new historical samples: Preston, Winsborough, Ruggles
The 1980, 1990, and 2000 PUMS: incompatible
5
Table 1. Census files incorporated in the original version of IPUMS 1991: eight census years, four investigators, six performance sites, seven record layouts
6
IPUMS
1987-1992: SHRL Common format FORTRAN programs
–Limitations: lost information, false cognates, poor documentation, expensive custom datasets
IPUMS was an attempt to do it right
–Single harmonized database, comprehensive integrated documentation, no lost information
–Beta release 1993, full public release 1995
Internet dissemination
–FTP in 1993, web-based interactive extraction in 1995
7
Table 2. Current and Planned IPUMS-USA Data Files
8
IPUMS-International
After 1960, most censuses around the world were tabulated by computer
McCaa decided that the IPUMS model should be applied to other countries
Began with a project for Colombia, then in 1999 an NSF Infrastructure grant to add six more countries
2003-2005: three major new grants to increase the database to 50+ countries
9
IPUMS-International Tasks
Inventory and preservation of data and documentation
Processing (standardizing format, correcting format errors, drawing samples, adding confidentiality protections, harmonizing codes, etc.)
Documentation (especially comparability)
Dissemination: obtain licenses that allow us to disseminate data for educational and scholarly use, and set up a secure web-based dissemination system
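The "harmonizing codes" step can be pictured as a crosswalk lookup. The sketch below is illustrative only and does not reproduce the project's actual procedure. The sample names, source codes, harmonized codes, and the `UNKNOWN` value are all hypothetical. It keeps the original value next to the harmonized one, in the spirit of the "no lost information" goal stated earlier in the talk.

```python
# Hypothetical crosswalk: (sample, source code) -> harmonized code.
# Two samples record the same variable with different coding schemes.
CROSSWALK = {
    ("sample_a", "M"): 1,
    ("sample_a", "F"): 2,
    ("sample_b", "1"): 1,
    ("sample_b", "2"): 2,
}

UNKNOWN = 9  # hypothetical harmonized code for values with no mapping


def harmonize(sample, source_code):
    """Look up the harmonized code for one source value."""
    return CROSSWALK.get((sample, source_code), UNKNOWN)


def harmonize_records(records):
    """Add a harmonized field to each record while keeping the original code."""
    return [
        dict(record, sex_harmonized=harmonize(record["sample"], record["sex"]))
        for record in records
    ]


if __name__ == "__main__":
    demo = [
        {"sample": "sample_a", "sex": "F"},
        {"sample": "sample_b", "sex": "1"},
        {"sample": "sample_b", "sex": "X"},
    ]
    for row in harmonize_records(demo):
        print(row)
```

Because the lookup is keyed by sample as well as by code, identical source codes that mean different things in different samples (the "false cognates" noted earlier in the talk) can map to different harmonized values.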
10
Table 3. Current IPUMS-International Samples
11
[Map: IPUMS-International, August 2005. Legend: dark green = disseminating; medium green = harmonizing; light green = negotiating. Mollweide projection.]
12
Table 4. Status of IPUMS-International Countries
13
North Atlantic Population Project
IMAG 1999: LDS data for Britain, Canada, U.S.
Minneapolis 2000: meetings to define scope of a harmonization project
–Added Norway and Iceland
–Adopted decentralized structure with coding work carried out at seven sites, coordination and programming at Minneapolis
2003-2005: preliminary datasets for all countries released
2006-2009: planned expansions (funding pending)
14
Table 5a. Phase I NAPP datasets
15
Table 5b. Phase II NAPP datasets
16
Differences
Data Format Problems
Harmonization
Project administration and work process
Ownership and dissemination restrictions
17
Merging the databases
Current compatibility and incompatibilities
Two formats
Integration of web access tools
18
Thank you.
http://ipums.org/usa
https://ipums.org/international
http://nappdata.org