Accessing Data Collected by the Census Bureau THURSDAY, April 26, 2012.

1 Accessing Data Collected by the Census Bureau THURSDAY, April 26, 2012

2 Quick Review
2010 decennial data is “short-form” only – limited demographic characteristics; the ACS is now the source of “long-form” type data
Census data is released in two “flavors”
– Aggregate data
– Microdata
A third type of data product identifies geographic boundaries
Aggregate data is released in a variety of products, differing in content, geographic specificity and temporal coverage
Microdata has the flexibility of individual-level information, but balances this with only gross geographic detail

3 Access and Resources
Aggregate data resources
Microdata resources
Geography resources
Local resources

4 Access and Resources
Aggregate resources: American Factfinder 2, Social Explorer, DataFerrett, Uexplore/Dexter, NHGIS, Historical Census Browser, Geolytics NCDB
Microdata resources
– Online Analysis: SDA & IPUMS
– Extract/Download: IPUMS, ICPSR, NBER, DataFerrett, Census, Unicon
– Restricted Use: California Census Research Data Center
Geographic: Census, MABLE/Geocorr, IPUMS, NHGIS
Documentation: IPUMS, AFF2, ICPSR
Visualization: Social Explorer, Historical Census Browser, AFF2
Local Resources: DOF/DRU, SDCs, UC DATA, DataLab, CCRDC

5 Resources: Aggregate Census Data
Resource | Datasets | Temporal Coverage | Geographic Grain | Ease of Use | Context | Notes
American FactFinder II | Decennial, ACS, Economic Census, Pop Est., ++ | 2000-Current | Block to National | Mixed | Glossaries, Links to Tech Doc | First point of release; broad array of datasets; multiple ways to narrow search; “Deep Linking”; limited historic data
Social Explorer | Decennial, ACS | 1790-2010 | Tract to National | Very Good | Links under “Data” | Both Map and Data interface; “Canned” reports; UCB Library (5 concurrent, proxy)
Dataferrett | Decennial (old), ACS, SAIPE, CBP, ++ | 1990-2010 (varies by dataset) | Block to Nation | Mixed | Limited Metadata | Mixed in terms of currency; can be very slow; both Aggregate and Microdata
Uexplore/Dexter | Decennial, ACS, PopEst, SAIPE, BLS, CBP | 1980-2010 | Block to Nation | Non-Intuitive | Very nice guides on Geography | Great if familiar with interface, but steep learning curve
NHGIS | | 1790-2010 | Only US, State, and County until 1910; 1910 onward larger lists | Clunky | Limited | Register for account (free); my “go-to” site for historical geography; (now partners with Social Explorer)
Historical Census Browser | | 1790-1960 | State, County (no US totals) | Easy | Limited | Quick comparison over decades

6 Just When You thought it was safe….. American Factfinder II

7 The “new” American Factfinder

8 Data Search Strategy
Specifications act as a “sieve” – eliminating non-conforming tables, data, geographies, etc. So, pick your most limiting conditions first.
– Need very detailed geography? Pick that first.
– Know your base dataset? Pick that early.
– Need data for 2010? Limit your search from the start.
Think about “bookmarking” geographies or items you’ll return to.

13 Alternative to AFF: FTP Full Files

14 AFF 2: Deep Linking

15 Deep Linking in AFF2
http://factfinder2.census.gov/legacy/AFF_deep_linking_guide.pdf
factfinder2.census.gov/bkmk/table/version/lang/program/dataset/product[/geo_id[|geo_id]*][/codetype~code[|code]*]*
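The URL template on this slide can be assembled mechanically. The following is a minimal sketch, not an official client: the function name `build_deep_link` and its parameter names are hypothetical, chosen to mirror the path segments in the template. It only builds the string and does not check that the service responds.

```python
from typing import Dict, Optional, Sequence

# Base path taken from the template on the slide.
BASE = "http://factfinder2.census.gov/bkmk/table"


def build_deep_link(
    version: str,
    lang: str,
    program: str,
    dataset: str,
    product: str,
    geo_ids: Sequence[str] = (),
    codes: Optional[Dict[str, Sequence[str]]] = None,
) -> str:
    """Assemble a deep link: required segments first, then optional ones."""
    parts = [BASE, version, lang, program, dataset, product]
    if geo_ids:
        # Multiple geographies share one path segment, separated by "|".
        parts.append("|".join(geo_ids))
    for codetype, values in (codes or {}).items():
        # Each code filter is written as codetype~code|code...
        parts.append(codetype + "~" + "|".join(values))
    return "/".join(parts)


url = build_deep_link(
    "1.0", "en", "ACS", "10_5YR", "B01001A",
    geo_ids=["1400000US06001406100", "1400000US06001406201"],
)
print(url)
```

Keeping the geographies as a list makes it easy to add or drop tracts without hand-editing a long URL.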

16 Deep Linking

18 Same FTP Options for ACS

23 Historical Census Data Browser

24 Example: Class Assignment
Describe, in broad terms, the demographics of the Fruitvale community: population size, SES, race, ethnicity, nativity, age, education, occupation, etc. Over time?
Questions:
– What is Fruitvale? A place? A CDP? A neighborhood?
– How will we define our geography? Does this limit anything?
– What data sets are available for evaluation?
– What data items do we want?

25 http://www.acphd.org/media/53462/fruitvale.pdf Tracts 4061-4063, 4065-4066, 4070-4072

27 Deep Linking approach

28 http://factfinder2.census.gov/bkmk/table/1.0/en/ACS/10_5YR/B01001A/1400000US06001406100|1400000US06001406201|1400000US06001406202|1400000US06001406300|1400000US06001406400|1400000US06001406500|1400000US06001406601|1400000US06001406602|1400000US06001407000|1400000US06001407101|1400000US06001407102|1400000US06001407200
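Each geography identifier in this link packs several codes into one string. The sketch below splits a tract-level identifier into its parts, assuming the usual Census layout: a 7-character prefix beginning with summary level 140 (census tract), the literal "US", then a 2-digit state FIPS code, a 3-digit county FIPS code and a 6-digit tract code. The names `TractGeoId` and `parse_tract_geo_id` are hypothetical.

```python
from typing import NamedTuple


class TractGeoId(NamedTuple):
    summary_level: str  # "140" is the census tract summary level
    state_fips: str     # e.g. "06" for California
    county_fips: str    # e.g. "001" for Alameda County
    tract: str          # 6 digits, with two implied decimal places

    @property
    def tract_label(self) -> str:
        # Render 406201 the way tract numbers are usually written: 4062.01
        return f"{int(self.tract[:4])}.{self.tract[4:]}"


def parse_tract_geo_id(geo_id: str) -> TractGeoId:
    """Split an identifier such as 1400000US06001406201 into its components."""
    prefix, sep, code = geo_id.partition("US")
    if not sep or len(prefix) != 7 or len(code) != 11 or not code.isdigit():
        raise ValueError(f"not a tract-level identifier: {geo_id!r}")
    if prefix[:3] != "140":
        raise ValueError(f"expected summary level 140, got {prefix[:3]!r}")
    return TractGeoId(prefix[:3], code[:2], code[2:5], code[5:])


parsed = parse_tract_geo_id("1400000US06001406201")
print(parsed, parsed.tract_label)
```

Reading the identifiers this way makes it easier to check that every tract in the link belongs to the intended county before running the query.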

29 Micro-data Resources

30 Survey Documentation and Analysis (SDA) and the Integrated Public Use Microdata Samples (IPUMS)

31 The Integrated Public Use Microdata Samples

32 www.ipums.org at the Minnesota Population Center
IPUMS-USA: Harmonized data on people in the U.S. census and American Community Survey, from 1850 to the present.
IPUMS-CPS: Harmonized data on people in the Current Population Survey, every March from 1962 to the present.
Important!
Harmonized: Questions asked change over time. How to make data comparable?
Integrated: Multiple data collections & surveys simultaneously available.
Microdata: The underlying individual-level data is available, not just pre-defined tables.

33 The American Community Survey and the Current Population Survey
CPS – Long-running monthly survey (dating back to the 1940s) focused on labor force characteristics (unemployment, earnings, hours worked).
~55,000 sample HHs, multiple interviews, personal
In addition to the basic monthly questions, additional modules are “piggy-backed” onto the survey to provide more depth on particular topics.
The most widely used supplement is the Annual Social and Economic Supplement (ASEC), aka the Annual Demographic Survey or the March Files (~100,000 HHs).
In-depth survey – lots of detail about sources of income, work, occupation, hours, etc. (as well as core demographic information on race/ethnicity, nativity, age, sex, education)

34 The American Community Survey and the Current Population Survey
ACS – “New” continuous survey, replaces the long form of the decennial census; first fully implemented in 2005 (non-institutionalized) and 2006 (institutionalized).
~2,000,000 HHs annually, mixed mail-in/personal interviews
Substantial overlapping content with CPS
Broader range of content, somewhat less detail
Larger sample sizes allow for greater geographic detail

35 The American Community Survey and the Current Population Survey
Aggregate vs. Microdata

36 The Integrated Public Use Microdata Samples
www.ipums.org at the Minnesota Population Center
Strengths:
Tremendous centralized documentation
Many “value-added” data items
Wonderful extraction engine (if downloading data)
Multiple statistical packages supported
Online analysis also possible

37 The Integrated Public Use Microdata Samples Online Analysis Links

38 The Basics of SDA
What is SDA?
What can you do with SDA?
The parts of the SDA interface: Menu, Variable List, Active variables, Analysis Specification

39 Part II. Working with SDA
1. Parts of the SDA interface
2. Finding data/variables/subjects – search, documentation
3. Analysis
Components – rows, columns, selection, controls
Procedures – crosstabs, means, correlations
4. Aids in Analysis – Recoding, Saving new variables, Downloading

40 The Basics of SDA
What is SDA?
SDA (Survey Documentation and Analysis) is a set of programs for the documentation and Web-based analysis of survey data. It was developed and is maintained by the Computer-assisted Survey Methods Program (CSM) at UC Berkeley.
It was developed as a companion program to CASES (Computer-Assisted Survey Execution System), a package for collecting survey data based on structured questionnaires, using a variety of modes of data collection.
It operates on a transposed file structure, which makes analysis of datasets, especially large datasets, extremely fast.

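To see why a transposed file structure speeds up analysis, compare storing one record per respondent with storing one list per variable. The sketch below is a general illustration of column-oriented storage with made-up records; it is not a description of SDA's actual file format, which the slides do not specify.

```python
# Row-oriented layout: one record per respondent (illustrative data only).
rows = [
    {"age": 34, "sex": "F", "income": 52000},
    {"age": 61, "sex": "M", "income": 38000},
    {"age": 12, "sex": "F", "income": 0},
]


def transpose(records):
    """Turn a list of records into one list of values per variable."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = transpose(rows)

# A statistic on one variable now reads a single contiguous list and
# never touches the other variables.
mean_age = sum(columns["age"]) / len(columns["age"])
print(columns["age"], mean_age)
```

With hundreds of variables and millions of cases, a table that uses two or three variables only has to read those two or three columns.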
42 Part I. The Basics of SDA
What data is available in SDA? LOTS!
Many popular social science datasets (e.g. the GSS, the ANES, the PUMS from the Decennial Census, the ACS, the CPS Annual Demographic Files, …) can be found in SDA format.
Many archives (ICPSR, IPUMS, CPANDA, Roper, SDA, UCDATA, …) provide at least some of their holdings in SDA format.

43 Multiple Census Samples at IPUMS (http://usa.ipums.org/usa/sda/)

44 And CPS (March files) data, as well (http://cps.ipums.org/cps/sda/)

45 The Basics of SDA
What can you do with SDA? SDA can be used to:
learn about a dataset (metadata, paradata)
search for variables of interest
investigate sample sizes and variable distributions
perform statistical analyses
transform, manipulate and create variables for each unit
extract and download subsets or full datasets

46 Part I. The Basics of SDA
The four parts of the SDA interface:
Action Menu
Variable List
Active Variable
Analysis Specification

47 Action Menu

49 Collapsed Variable Tree

50 Active Variables

51 Analysis Specification

52 2. Finding data/variables/subjects
Online SDA codebook
IPUMS detailed documentation

53 Working with SDA
Analysis
Components – rows, columns, selection, controls
Procedures – crosstabs, means, correlations
Screens will vary depending upon what procedure you are using.
Start with exploratory analysis – frequencies, cross-tabulations
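The roles of row, column and selection in a cross-tabulation can be shown in a few lines. This is a sketch of the idea only, using made-up person records and a hypothetical `crosstab` helper; it is not how SDA computes its tables. A control variable would simply produce one such table per control value and is left out for brevity.

```python
from collections import defaultdict

# Illustrative microdata: one record per person, with a sampling weight.
people = [
    {"sex": "F", "educ": "BA+", "age": 34, "weight": 100.0},
    {"sex": "M", "educ": "HS", "age": 61, "weight": 150.0},
    {"sex": "F", "educ": "HS", "age": 45, "weight": 120.0},
    {"sex": "M", "educ": "BA+", "age": 16, "weight": 90.0},
    {"sex": "F", "educ": "HS", "age": 29, "weight": 80.0},
]


def crosstab(records, row, column, selection=lambda r: True, weight=None):
    """Sum weights (or count cases) in each (row value, column value) cell."""
    table = defaultdict(float)
    for record in records:
        if not selection(record):
            continue  # the selection filter decides who is in the table
        table[(record[row], record[column])] += record[weight] if weight else 1.0
    return dict(table)


# Education by sex, adults only, weighted.
adults = crosstab(
    people, "educ", "sex",
    selection=lambda r: r["age"] >= 18,
    weight="weight",
)
# Same table with everyone included, unweighted case counts.
counts = crosstab(people, "educ", "sex")
print(adults)
print(counts)
```

Comparing the weighted and unweighted tables is a quick way to check sample sizes before trusting a cell estimate.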

54 The variables you are interested in Who to include in the table

56 Part II. Working with SDA
Aids in Analysis
Recoding
Saving new variables
Downloading

57 Recoding variables – on the fly
age (5-18): Selects, but does not collapse
age (r: 5-18): Selects AND collapses
age (d: 5-18): Collapses, but does not select
age (c:st,w): Collapses into categories of width w, starting with value st, e.g. age (c:13,5)
Can be used in row, column, control (Crosstabs)
Recoding variables – Web interface
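The difference between selecting and collapsing can be made concrete with a small sketch. It illustrates the two ideas in plain Python and does not reproduce SDA's recode engine. It reads the slide's age (c:st,w) form as start value then width, and it drops values below the start value; both are assumptions. The helper names are hypothetical.

```python
def select_range(values, low, high):
    """Selecting: keep only cases in [low, high]; values stay ungrouped."""
    return [v for v in values if low <= v <= high]


def collapse_by_width(values, start, width):
    """Collapsing: group values into fixed-width categories from `start`.

    Returns {(category_low, category_high): [values...]}. Values below
    `start` are dropped here, which is an assumption of this sketch.
    """
    groups = {}
    for v in values:
        if v < start:
            continue
        low = start + ((v - start) // width) * width
        groups.setdefault((low, low + width - 1), []).append(v)
    return groups


ages = [3, 5, 13, 14, 18, 22, 40]

selected = select_range(ages, 5, 18)        # analogous to age (5-18)
collapsed = collapse_by_width(ages, 13, 5)  # analogous to age (c:13,5)
print(selected)
print(collapsed)
```

Selecting changes who is in the table, while collapsing changes how many categories the table has; the r: form on the slide does both at once.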

58 Question 1: Use the CPS or ACS?
Question 2: What is the desired level of analysis (person, family, household)?
Question 3: Who should be excluded? (How to limit to family households, or only particular age groups, or…?)

60 DataFerrett Content

64 Geographic Resources

70 Selected Data Resources at Berkeley
Library Data Lab: http://sunsite3.berkeley.edu/wikis/datalab//
SDA (Survey Documentation & Analysis): http://sda.berkeley.edu/
Statewide Database: http://swdb.berkeley.edu/
California Census Research Data Center: http://www.ccrdc.ucla.edu/
The Econometrics lab: http://emlab.berkeley.edu/data2.shtml
Thomas J. Long Business & Economics Library: http://www.lib.berkeley.edu/BUSI/electres.html

71 Questions/Comments
email me at: jons@berkeley.edu
http://ucdata.berkeley.edu

