Disseminating integrated census microdata to academic researchers and policy makers at no cost (plus we pay US$1,000-5,000 per census to the NSO-owner)

1 Disseminating integrated census microdata to academic researchers and policy makers at no cost (plus we pay US$1,000-5,000 per census to the NSO-owner) Robert McCaa Minnesota Population Center rmccaa@umn.edu www.ipums.org/international www.hist.umn.edu/~rmccaa/IPUMSI

2 IPUMS-International, 2009 dark green = disseminating medium green = integrating lightest green = negotiating Mollweide projection Special thanks to: CSO-Vietnam NIS-Cambodia BPS-Indonesia PCO-Pakistan NBS-China NSSO-India BBS-Bangladesh DOS-Malaysia NSO-Mongolia CBS-Nepal NSO-Philippines NSO-Thailand

3 Integrating Asian Census Microdata dark green = disseminating medium green = integrating lightest green = talking Respectful invitation to the National Statistical Offices of: Afghanistan Bhutan Iran DPR Korea DPR Laos Maldives Sri Lanka Timor Leste

4 Outline: Disseminating Census Microdata 1. What are census microdata 3 slides 2. Electronic archiving of census microdata: do it now! 4 slides 3. Why are census microdata essential? 2 slides 4. IPUMS-International: invitation to participate 10 slides What is IPUMS? What are the benefits? How are the integrated metadata and microdata constructed and accessed? 5. Conclusions 3 slides

5 1. What are census microdata? And how do they differ from “raw data”? (3 slides)

6 www.ipums.org/international 16th century Aztec census written on fig-bark paper, in Nahuatl, will survive another 500 years Sources: Museo Nacional de Antropología e Historia (Mexico City). "Libro de Tributos," Colección Antigua, ms. 549 bis. Sarah Cline, The Book of Tributes. Early Sixteenth-Century Nahuatl Censuses from Morelos. Los Angeles: 1993. Manuscript, transcribed, translated and converted to microdata

7 12100102600700720000011210000104 22200202600700720000011210000104 32300100600700720000012123000000 42300200400700000000000000000000 52300200200700000000000000000000 62300200000700000000000000000000 What are “census microdata”?: anonymized, computerized census records of individuals, households & dwellings Easier to integrate than tables. Study any desired set of characteristics. Facilitates comparative research.
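Records like the six shown on this slide are fixed-width: each variable occupies a set range of character positions defined in a data dictionary. As a minimal sketch in Python, assuming a purely hypothetical layout (the slide gives neither the real column positions nor the variable names), one such line could be split into fields like this:

```python
# Hypothetical layout: (name, start, end), 0-based and end-exclusive.
# The real positions come from the census data dictionary, which the
# slide does not show; these names and widths are placeholders only.
HYPOTHETICAL_LAYOUT = [
    ("field_a", 0, 1),
    ("field_b", 1, 4),
    ("remainder", 4, 32),
]


def parse_record(line, layout=HYPOTHETICAL_LAYOUT):
    """Split one fixed-width microdata line into a dict of string fields."""
    return {name: line[start:end] for name, start, end in layout}


# First record from the slide, 32 characters long.
record = parse_record("12100102600700720000011210000104")
```

Keeping the fields as strings preserves leading zeros, which often carry meaning in coded census variables.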

8 12100102600700720000011210000104 22200202600700720000011210000104 32300100600700720000012123000000 42300200400700000000000000000000 52300200200700000000000000000000 62300200000700000000000000000000 How do census microdata differ from “raw data”?: 1. detailed geography is suppressed and 2. strict measures are implemented to protect privacy of individuals, households, dwellings & other entities Note absence of detailed geography
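The first of the two differences, suppressing detailed geography, can be illustrated with a short Python sketch. The field names and values below are invented for illustration, and real confidentiality protections are customized for each census and go well beyond dropping a few fields:

```python
def suppress_detailed_geography(record, detailed_geo_fields):
    """Return a copy of the record without the named detailed-geography fields."""
    return {key: value for key, value in record.items()
            if key not in detailed_geo_fields}


# Hypothetical raw record; none of these names come from a real census file.
raw = {"province": "07", "municipality": "0026", "block": "104", "age": "26"}

# Keep only coarse geography in the version released to researchers.
public = suppress_detailed_geography(raw, {"municipality", "block"})
```

The original record is left unchanged, so the raw file can stay with the statistical office while only the reduced copy is disseminated.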

9 2. Digital Archiving (4 slides) Census Tablet (digital image): Assyria, 2700 B.P. Library of King Ashurbanipal

10 www.ipums.org/international Bangladesh Bureau of Statistics Tape Archive photo: April 14, 2006 2009: Census data on most of these tapes were recovered.

11 www.ipums.org/international Archiving: no longer a problem for recent censuses --generally excellent in Asian agencies I have visited-- » Documentation (forms, instructions, definitions, dictionaries, methodological reports, etc.): Preserve at least two copies in at least two institutes. Census docs: send 1 copy by pre-paid courier to MPC » Paper » .PDF » and one of the following: .HTML, .DOC, .XLS, or .TXT » Data: Preserve at least two copies in at least two institutes on the most stable media (CDs and servers). Census microdata: send a copy by pre-paid courier to MPC » Un-edited “Raw Data” (ASCII) » Edited Data (ASCII) 1981 census of Bangladesh: 3 tapes containing microdata. Even the moldy one was recovered!

12 IPUMSi RECOVERS: Centro Latino Americano y Caribeño de Demografía (CELADE: Santiago, Chile) ~3000 microdata tapes recovered and fully documented (funded by NSF)

13 IPUMSi RECOVERS: Centro Latino Americano y Caribeño de Demografía (CELADE) ~3000 microdata tapes recovered and fully documented (funded by NSF) IPUMS now has the largest collection of census documentation in the world, having acquired paper/electronic archives from: » United Nations Statistical Division » United Nations Population Division » CELADE (Latin America) » East-West Center (Asia/Pacific) » U.S. Census Bureau International Programs Center

14 www.ipums.org/international Archived census microdata by region and decade: % of censuses conducted (inventory by IPUMS-International)
Region/continent    Countries  2000s  1990s  1980s  1970s  1960s
Latin America       21         100%          89%    81%    72%
North America       27         100%   72%    64%    24%    8%
Africa              58         100%   53%    46%    25%    2%
Asia                44         100%   54%    34%    30%    13%
Europe              46         100%   67%    55%    41%    13%
Pacific (pop>.5m)   7          100%                 43%    29%
Note: cases confirmed by the corresponding official statistical institute. Some datasets remain to be certified. Some countries have not responded to the invitation to inventory their stocks of data. Source: http://www.hist.umn.edu/~rmccaa/IPUMS/country6.htm March 15, 2009

15 www.ipums.org/international Archived census microdata by region and decade. What Asian census microdata and documentation still exist …for the 1960s? …1970s? …1980s? …1990s? » How much will be lost before they can be recovered, documented and archived? » Help preserve these treasures now: IPUMS pays costs of shipping and recovery.

16 3. Why is the dissemination of census microdata essential? (2 slides)

17 www.ipums.org/international Julia Lane, European Statisticians Conference (2003) 6 benefits from disseminating microdata » 1. Analyze more realistic questions » 2. Acquire new constituencies and stakeholders » 3. Build trust; reduce suspicion » 4. Replicate findings » a. use standards of UNSD, Eurostat, ISCO, ISCED, etc. » b. facilitate comparative research in time and space » 5. Calculate marginal effects » 6. Assess data quality » …and much, much more….

18 www.ipums.org/international UNSD Principles and Recommendations (Rev. 1, 1997) endorse dissemination of census microdata » §1.218: “There are a range of methods…that can be used to make such microdata available while still protecting individuals’ rights to privacy.” » 2006 Africa Symposium on Statistical Development (Cape Town, Jan 30-Feb. 2, 2006) » “microdata may be disseminated provided that confidentiality is preserved” » Most (all?) advanced statistical agencies make census microdata available (some more widely than others). Since the: » 1960s: USA, Finland, France, Korea, plus 18 Latin American countries » 1970s: Canada, Czechoslovakia, Japan, Malaysia, Norway, Philippines » 1980s: Australia, Italy, Spain, Thailand, plus many Asian countries » 1990s: Germany, Russia, Switzerland, UK, plus many other countries » In four decades of distributing census microdata there is not a single allegation of violation of confidentiality or privacy.

19 4. Invitation to participate in IPUMS-International (10 slides)

20 www.ipums.org/international What is IPUMS-International? …a global collaboratory of National Statistical Institutes & Universities to: » 1. Inventory the world’s census microdata » 2. Archive census microdata and documentation » 3. Integrate census microdata » a. use standards of UNSD, Eurostat, ISCO, ISCED, etc. » b. facilitate comparative research in time and space » 4. Anonymize census microdata to preserve statistical confidentiality, using highest standards » 5. Disseminate restricted-access, custom extracts to approved researchers/research projects at no cost

21 www.ipums.org/international IPUMS-International (2009): 130 high precision samples, 44 countries, 279.5 million person records
Country            Censuses   Samples
Argentina          1970-2001  4
Armenia            2001       1
Austria            1971-2001  4
Belarus            1999       1
Bolivia            1976-2001  3
Brazil             1960-2001  3
Cambodia           1998       1
Canada             1971-2001  4
Chile              1960-2002  5
China              1982-1990  2
Colombia           1964-2005  5
Costa Rica         1963-2000  4
Ecuador            1962-2001  5
Egypt              1996       1
France             1962-1999  6
Ghana              2002       1
Greece             1981-2001  3
Guinea (Conakry)   1983-1996  1
Hungary            1970-2001  4
India              1983-1999  4
Iraq               1996       1
Israel             1972-1995  3
Italy              2001       1
Jordan             2004       1
Kenya              1989-1999  2
Kyrgyz Republic    1999       1
Malaysia           1970-2000  4
Mexico             1960-2005  5
Mongolia           1989-2000  2
Netherlands        1960-2001  3
Palestine          1997       1
Panama             1960-2000  5
Philippines        1990-2000  3
Portugal           1981-2001  3
Romania            1977-2001  3
Rwanda             1991-2002  2
Slovenia           2001       1
South Africa       1991-2007  3
Spain              1981-2001  3
Uganda             1991-2002  2
United Kingdom     1991-2001  2
United States      1960-2005  6
Venezuela          1971-2001  4
Vietnam            1989-1999  2

22 IPUMS-International strengths 1. Uniform legal authorization with each National Statistical Office 2. Access restricted to bona fide researchers with need 3. MPC Experienced integration teams 4. MPC Proven web-based distribution system 5. High user satisfaction 6. NSO: Improved research and empirically based policy-making 7. Sustainable: NSF, NIH funded through 2014

23 Legal: NSO (Austria) and U. of Minnesota

24 www.ipums.org/international IPUMS microdata integration method: composite (multi-digit) codes not only retain significant distinctions but also integrate comparable concepts
Code  Label                                      Chile 1992  Chile 2002  México 1990  México 2000
0     NIU                                        X           X           X            X
      ACTIVE (In Labor Force)
100   EMPLOYED, not specified                    ·           ·           ·            ·
110   At work                                    X           X           X            X
111   At work, and 'student'                     ·           ·           ·            X
112   At work, and 'housework'                   ·           ·           ·            X
113   At work, and 'seeking work'                ·           ·           ·            X
114   At work, and 'retired'                     ·           ·           ·            X
115   At work, and 'no work'                     ·           ·           ·            X
116   At work, and 'other'                       ·           ·           ·            X
117   At work, family holding, not specified     ·           ·           ·            ·
118   At work, family holding, not agricultural  ·           ·           ·            ·
119   At work, family holding, agricultural      ·           ·           ·            ·
120   Have job, not at work last week            X           X           X            X

25 www.ipums.org/international Metadata: Employment Status EMPSTAT Employment status Description EMPSTAT indicates whether or not the respondent was part of the labor force -- working or seeking work -- over a specified period of time. Depending on the sample, EMPSTAT can also convey further information. The first digit of EMPSTAT is fully comparable, and classifies the population into three groups: employed, unemployed, and inactive. The combination of employed and unemployed yields the total labor force. The second and third digits of EMPSTAT preserve additional information available for some countries and census years but not for others. Employment status is sometimes referred to in other sources as "activity status." Comparability -- General The age of persons to whom the question applies varies across the samples (see Universe). The reference period for the employment status question varies. For most samples, employment status was reported with respect to the day of the census or…
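The role of the first digit can be sketched in Python. The slides show 0 as NIU and the 100-series as employed; treating 2 as unemployed and 3 as inactive is an assumption made here for illustration, following the order in which the description lists the three groups:

```python
# First-digit labels. 0 (NIU) and 1 (employed) follow the code table on the
# earlier slide; 2 and 3 are assumed, not confirmed by the slides.
GENERAL_LABELS = {0: "NIU", 1: "employed", 2: "unemployed", 3: "inactive"}


def general_empstat(code):
    """Return the first digit of a three-digit composite code (0 stays 0)."""
    return code // 100


def in_labor_force(code):
    """Employed plus unemployed make up the total labor force."""
    return general_empstat(code) in (1, 2)


# Detailed codes from different samples collapse to one comparable group.
groups = [GENERAL_LABELS[general_empstat(c)] for c in (110, 111, 120, 0)]
```

Collapsing to the first digit gives the comparable three-way classification, while the full three-digit code stays available for samples that carry the extra detail.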

26 www.ipums.org/international Comparability -- Mexico The universe and reference period are fully comparable across the Mexico samples. The 1970 Census did not provide detail on the inactive population except for "houseworkers," while the later samples have numerous subcategories. In 1990, the employment status question refers to "Principal Activity" and therefore under-reports secondary economic activity by students, housewives, family-workers, the semi-retired, and others. The 2000 Census sought to overcome deficiencies in reporting work status for people whose primary activity was not work (students, housewives, retirees, etc.), but who in fact were working according to international definitions. A second question introduced for the first time in 2000 sought to capture this secondary economic activity. For strict comparability with earlier Mexican censuses, this recovered activity (coded “at work and …”) should be considered "inactive." … Integrate: retain all significant detail, harmonize everything. Not standardize: force square pegs in round holes.
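The recommendation for strict comparability can be written as a small recode. This is a sketch only, assuming the "at work and …" categories are the codes 111 through 116 listed in the integration table on the earlier slide; the function name and return labels are illustrative and not part of IPUMS:

```python
# Codes 111..116 are the "At work, and ..." categories that appear only in
# the Mexico 2000 sample, according to the code table on the earlier slide.
RECOVERED_ACTIVITY_CODES = range(111, 117)


def status_for_strict_comparison(code):
    """Label a Mexico 2000 employment code for comparison with earlier censuses.

    Returns "inactive" for recovered secondary activity, "employed" for the
    other 100-series codes, and None for codes this sketch does not cover.
    """
    if code in RECOVERED_ACTIVITY_CODES:
        return "inactive"
    if 100 <= code <= 199:
        return "employed"
    return None


labels = [status_for_strict_comparison(c) for c in (110, 111, 116, 117, 120)]
```

Because the detailed codes are retained rather than standardized away, a researcher can choose either definition: the international one (keep 111-116 as employed) or the strictly comparable one shown here.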

27 www.ipums.org/international 6 steps using https://www.ipums.org/international: 1. Logon w/ password 2a. Study documentation 2b. Design extract 3. Receive email; logon with p/word 4. Download extract (SSL encrypted) 5. UnZip data (also SAS, STATA) 6. Analyze

28 Asian initiative work plan (3 years): 1. Establish legal foundations: negotiate Memorandum of Understanding: National Statistical Institute (NSI) & Minnesota Population Center (MPC) 2. Entrust copies of microdata and documentation to project (NSI) 3. License microdata (MPC pays US$5,000 per census to NSI, upon receipt of microdata, documentation and invoice) 4. Design regional harmonization protocols census-by-census, concept-by-concept, code-by-code and write integrated metadata (MPC) 5. Impose confidentiality protections customized for each census 6. Disseminate microdata to licensed users (MPC, NSI) free of charge 7. Cooperate with regional partners in education and training Project pays all costs, including: » License fee to participating National Statistical Institute » Producer/User workshop, Durban, South Africa, 2009 (ISI)

29 IPUMS-EurAsia: Will your statistical institute participate? » Formalities: 1. Sign Memorandum of understanding 2. Entrust Microdata and documentation to project 3. Collect license fee » 2009+: advise on technical details as needed; workshops as funding permits » 2011: ISI meeting, Dublin, Ireland Inauguration of integrated database with 180 census samples » 2013: ISI meeting, Hong Kong SAR, China Inauguration of integrated database with 220 census samples

30 5. Conclusions (3 slides)

31 Benefits from Disseminating Census Microdata » National Statistical Institutes 1. Manage statistics for more equitable societies 2. Increase trust, transparency and stakeholders 3. Increase usage, better science and policy 4. Enhance cost-benefit ratio 5. Little marginal cost (project pays $5,000 per census) » Citizens, Society and Government: 1. Who we are 2. What the future may bring 3. How policies might be improved

32 Integrating Asian Census Microdata dark green = disseminating medium green = integrating lightest green = talking Respectful invitation to the National Statistical Offices of: Afghanistan Bhutan Iran DPR Korea DPR Laos Maldives Sri Lanka Timor Leste

33 What needs to be done to participate? » National Statistical Office: 1. Endorse project Memorandum of Understanding (80+ countries) 2. Entrust copies of documentation & microdata to MPC (75+ countries) 3. Invoice for each census: $1,000 per census for fewer than one million person records; $5,000 per census for one million or more person records » MPC: 1. Endorse project Memorandum of Understanding: Afghanistan?, Bhutan?, Iran?, DPR Korea?, DPR Laos?, Maldives?, Timor Leste? 2. Pay license fee for microdata and documentation: Indonesia!! 3. Digitize metadata and translate to English: Pakistan!! 4. Harmonize microdata: Cambodia!! 5. Disseminate microdata with copies on CDs to each NSO: Vietnam!!
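The invoice rule on this slide amounts to a two-tier schedule. A minimal Python sketch, with a function name chosen here for illustration:

```python
def license_fee_usd(person_records):
    """Fee per census in US dollars, following the two tiers on the slide."""
    if person_records < 0:
        raise ValueError("person_records must be non-negative")
    # Fewer than one million person records: $1,000; otherwise $5,000.
    return 1000 if person_records < 1_000_000 else 5000


small_census_fee = license_fee_usd(250_000)
large_census_fee = license_fee_usd(1_000_000)
```

A census with exactly one million person records falls in the higher tier, since the slide puts the boundary at "one million or more".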

34 www.ipums.org/international Thank you!! https://www.ipums.org/international additional information at: www.hist.umn.edu/~rmccaa/IPUMSI Contact: rmccaa@umn.edu

