Working with the data

Where to begin?
Have you come across any ACS data issues in your work?
1. Sample Error (90% Confidence)
2. Collapsing
3. Period Estimates
4. Reliability
5. Dollar Values
6. Trend Analysis
7. Weighting Change
8. Light Rail
9. Reweighting
10. CTPP Issues
11. Block Group data

You must do Statistical Significance Tests
Sampling Error
To avoid false statements like:
“Based upon data from the 2000 Census (CTPP) and the ACS, the total number of workers who live in Flagstaff increased along with the number who took transit to work. During the same time, the number of people who worked at home increased along with those who drove alone and carpooled.”
The World Gazette: Commutes increase for all modes

How do you do a Significance Test?
1. Get the Margin of Error (MOE) from ACS
2. Calculate the Standard Error (SE) [SE = MOE / 1.645]
3. Solve for Z, where A and B are the two estimates [Z = (A - B) / sqrt(SE_A^2 + SE_B^2)]
4. If Z < -1.645 or Z > 1.645, the Difference is Significant at 90% confidence
It is simpler than it looks and there are a lot of guides
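The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names and the sample numbers are made up, and it assumes both MOEs are published at the 90% confidence level.

```python
import math

Z_90 = 1.645  # critical value for 90% confidence, as used on the slide


def standard_error(moe):
    """Step 2: convert a published 90% MOE into a standard error."""
    return moe / Z_90


def z_score(est_a, moe_a, est_b, moe_b):
    """Step 3: Z statistic for the difference between two estimates."""
    se_a = standard_error(moe_a)
    se_b = standard_error(moe_b)
    return (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)


def is_significant(est_a, moe_a, est_b, moe_b):
    """Step 4: the difference is significant if |Z| exceeds 1.645."""
    return abs(z_score(est_a, moe_a, est_b, moe_b)) > Z_90


# Hypothetical estimates and MOEs, for illustration only
small_gap = is_significant(1200, 300, 1000, 400)
large_gap = is_significant(2000, 300, 1000, 400)
print(small_gap, large_gap)
```

With these made-up inputs the first pair differs by less than the combined sampling error and the second pair does not, which is exactly the kind of distinction the newspaper headline ignores.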

Some things to keep in mind
Obtaining Standard Errors is the Key
Formulas vary depending on the comparison:
-- Sum or Difference of Estimates
-- Proportions and Percents
-- Means and Other Ratios
Working with 2000 data will be a little more involved
There are resources to help

The ACS Compass handbooks
A Compass for Understanding and Using ACS Data
-- Set of user-specific handbooks
-- Train-the-trainer materials
-- E-learning ACS Tutorial
-- Annotated Presentations
Especially Appendix 3

NY State Data Center Calculator t-to-calculate-acs-margins-of-error-and-statistical- significance-for-sums-proportions-and-ratios/

But what if I am using 2000 non-ACS Data?
You will need to estimate the MOE and know the Survey Design Factor
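As a rough sketch of what estimating the MOE can look like, the snippet below assumes the standard approximation from the Census 2000 long-form technical documentation: an unadjusted standard error of sqrt(5 * Y * (1 - Y / N)) for an estimated total Y in an area of size N, multiplied by the published design factor. The function names and the input values are hypothetical; the real design factor must be looked up for the specific table and geography.

```python
import math

Z_90 = 1.645  # 90% confidence, to match ACS margins of error


def census2000_se(estimate, area_total, design_factor):
    """Approximate SE of a Census 2000 long-form estimated total.

    Assumes the 1-in-6 sample approximation sqrt(5 * Y * (1 - Y / N)),
    scaled by a design factor taken from the published tables.
    """
    unadjusted = math.sqrt(5.0 * estimate * (1.0 - estimate / area_total))
    return design_factor * unadjusted


def census2000_moe(estimate, area_total, design_factor):
    """Turn the approximate SE into a 90% margin of error."""
    return Z_90 * census2000_se(estimate, area_total, design_factor)


# Hypothetical inputs: 1,000 transit commuters in an area of 50,000 workers,
# with an assumed design factor of 1.2
se_2000 = census2000_se(1000, 50000, 1.2)
moe_2000 = census2000_moe(1000, 50000, 1.2)
print(round(se_2000, 1), round(moe_2000, 1))
```

Once the 2000 figure has an MOE, it can go into the same significance test as any ACS estimate.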

The CUTR Guide has you covered
There’s a Report and a Spreadsheet Calculator

Transportation resources linepubs/nchrp/nchrp_rpt_ 588.pdf

Understanding the MOE
Part 1, Profile 1 (Resident data)
Using the MOE: We know the number of workers has changed, but what is the range of that change?
A. 5,744?
B. 5,072 to 6,416?
C. 3,888 to 7,600?
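One way to get a range for a change is to combine the two MOEs into an MOE for the difference and apply it to the point change. The sketch below uses the usual root-sum-of-squares rule for a difference of two estimates; the function names and the numbers are hypothetical and are not the Flagstaff figures from the profile.

```python
import math


def moe_of_difference(moe_a, moe_b):
    """90% MOE of a difference (or sum) of two estimates."""
    return math.sqrt(moe_a ** 2 + moe_b ** 2)


def change_range(est_new, moe_new, est_old, moe_old):
    """Return (low, high) bounds for the change between two estimates."""
    change = est_new - est_old
    moe = moe_of_difference(moe_new, moe_old)
    return change - moe, change + moe


# Hypothetical worker counts and MOEs, for illustration only
low, high = change_range(10000, 400, 8000, 300)
print(low, high)
```

The point change alone (answer A on the slide) hides this range, which is why the MOE has to travel with the estimate.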

Another Flagstaff point
Part 1, Profile 1 (Resident data)
Part 2, Profile 1 (Workplace data)
Between the reference periods, what has the number of people who took transit to work in Flagstaff done?
A. Gone Up?
B. Gone Down?
C. No Significant Change?
Which Table would you use and why?

Two types of Collapsing

C MEANS OF TRANSPORTATION TO WORK - Universe: WORKERS 16 YEARS AND OVER
Data Set: American Community Survey 3-Year Estimates
Population = 26,566
Collapsed table; full table not available
Sometimes neither table exists
And MOEs are greater than the estimate

“B” and “C” Tables
B08006 and C08006: Means of Transportation

“B” and “C” Tables

Full and collapsed table
What do you notice about the Table?

Some things to be aware of
What year is the data?
Period Estimate

Reliability/Currency
What data is more reliable?
Which is more current?

Dollar Values and Income tables
ACS asks-- What was your income during the last 12 months?
Single Year Estimates
-- 12 different periods
-- Each adjusted to single period (Jan to Dec)
Multiyear Estimates
-- Each year adjusted to current year

About Trend Analysis
Trend analysis (overlapping syndrome)
If you are doing trend analysis with multi-year estimates, you cannot compare successive period estimates because of the overlapping middle years.
Also, you cannot compare a 3-year estimate with a 5-year estimate.
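Both rules are easy to check mechanically before running a trend comparison. The helper below is a hypothetical sketch: it describes each period estimate by its end year and its length in years, and treats two estimates as comparable only when the lengths match and no survey year is shared.

```python
def period_years(end_year, length):
    """Set of survey years covered by a period estimate."""
    return set(range(end_year - length + 1, end_year + 1))


def comparable(end_a, length_a, end_b, length_b):
    """True if two period estimates can be compared for trend analysis."""
    if length_a != length_b:
        # e.g. a 3-year estimate against a 5-year estimate
        return False
    # Any shared year means the same sample is counted in both estimates
    return period_years(end_a, length_a).isdisjoint(period_years(end_b, length_b))


overlapping = comparable(2007, 3, 2008, 3)   # 2005-2007 vs 2006-2008
separate = comparable(2007, 3, 2010, 3)      # 2005-2007 vs 2008-2010
mixed = comparable(2009, 5, 2010, 3)         # 5-year vs 3-year
print(overlapping, separate, mixed)
```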

Change in Weighting
In 2009, the weighting changed to using sub-county totals as opposed to just county totals

Change in Weighting
Detroit Example
“Detroit is the poster child for odd looking data”

Change in Weighting (Analysis)
In 2009, the weighting changed to using sub-county totals as opposed to just county totals

Light Rail Conundrum
Impact of New “Light Rail” systems might not be showing up
Source: 2000 CTPP and 2007 ACS3, CTPP Data Profile 1

One more thing on Pop Estimates
The older estimates get revised every year, but the ACS does not get reweighted
Maricopa County Population Estimates

Now let’s focus on the CTPP data
But First a word on Disclosure - 3 year tables
DRB Said…
“Too many variables” crossed with Means of Transportation (Mode)
…makes for a micro data record
…and with a micro data record you could identify an individual

The Battle Ensued
We Said… No, you can’t identify an individual
-- Hired a statistical consultant (< 0.01%)
-- Had a hearing with DRB Bosses
-- Made every argument possible
Census Said… Tough Luck
-- Compress your Modes and improve your chances of passing our rules
-- Chop your cross tabs to 5 variables

What we ended up with – for 3 year Tables
Five (5) Variables crossed with Means of Transportation to Work (MOT)
…and

A boat load of collapsing of the Modes …and

Disclosure Rules
For the 5-year CTPP
7. For Worker Flows
-- Must have 3 unweighted records for each O-D pair
-- Does not apply to Total Workers or Workers by Mode to Work (all 18 modes) (means of transportation)
Rule 7 was the killer

So What Did We Do?
NCHRP Web Report 180 ($550K): Producing Transportation Data Products from the ACS that Comply With Disclosure Rules
5-year CTPP will have two types of tables
-- Tables that passed Census Rules
-- Tables with Perturbation done to them
Privacy Protection

Table Summary using 5-year Table list
Tables Using Perturbed Data Set
-- Means of transportation
-- Aggregate Vehicles Used
-- Aggregate Travel Time
-- Mean HH Income
-- Aggregate HH Income
-- Aggregate Carpools
-- Almost all Part 3 Tables

Still left with some Disclosure Rules
For All tables: Regular (A) + Perturbed (B)
1. All Tables Rounded: 0 = 0, 1-7 = 4, 8 or > = nearest multiple of 5
2. Any number that ends in 5 or 0 stays as is
3. Aggregate dollar values rounded to nearest
4. Aggregate minutes to work and aggregate vehicles use standard rounding
5. Totals Rounded independently of cells
6. Medians or quintiles not subject to rounding
7. Percentages and rates calculated after rounding
8. Medians and aggregates must be based on 3 or more values
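Rules 1, 2 and 5 can be illustrated with a short sketch. The function name `ctpp_round` and the sample cell values are hypothetical; the code only applies the count-rounding scheme as stated on the slide and does not cover the aggregate or median rules.

```python
def ctpp_round(count):
    """Round a cell count: 0 stays 0, 1-7 becomes 4,
    8 or more goes to the nearest multiple of 5."""
    if count < 0:
        raise ValueError("counts cannot be negative")
    if count == 0:
        return 0
    if count <= 7:
        return 4
    # Integer arithmetic; values already ending in 0 or 5 are unchanged (rule 2)
    return ((count + 2) // 5) * 5


# Hypothetical cell values for one table row
cells = [0, 3, 7, 8, 12, 13, 25]
rounded_cells = [ctpp_round(c) for c in cells]

# Rule 5: the total is rounded on its own, so it need not equal the sum of rounded cells
rounded_total = ctpp_round(sum(cells))
print(rounded_cells, rounded_total, sum(rounded_cells))
```

Because totals are rounded independently, a table's cells will often not add up to the published total, and percentages computed afterwards inherit the rounding (rule 7).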

Still left with some Disclosure Rules
For Regular (A) Tables Only
1. Cell Suppression: For Tables (unweighted sample count of the population), (percent of population in sample), (total housing units sampled), and (percent of housing units sampled), there must be 0 or at least 3 occupied housing units in sample to show the table
2. Table Suppression: Aggregates and Means must have at least 3 unweighted cases to be shown. The policy of the ACS program is that if any one cell in a table is suppressed, the whole table is suppressed

Some early issues with the 5-year ACS?
-- Some Very Large MOEs
-- Block Group data only in download area
-- Reliability of tract estimates is much lower than the 2000 LF
-- NO Workplace Tables! (Ask Again Later)
Standard Data Products
The Census Bureau says: BG data should ONLY be used to build up larger geographic areas because the Margins of Error (MOEs) are too large otherwise (JSM Conference August 2010)
Ken Hodges, Nielsen (Claritas), ACS 5-Year Data: A First Look at the First Release (4.5 MB, ppt)

Let’s talk about Block Group Data for a moment
Source: Tract Data - Missouri State Data Center (MSDC), Block Group Data - AFF
AFF: all 21 Modes
MSDC: all 21, but also collapsed with Total Commuters added
MSDC put a value to MOEs.

First: Let’s consider MOEs
What do you notice?
Don’t forget, if this were CTPP data it would be Rounded too

Now let’s fill in the table
CB does not give you Total Commuters, but you would like that.
Can we talk about that for a moment?

Now let’s fill in the table
How would we get Total Commuters and, more importantly, the MOEs?
For the Estimate totals, just add the relevant estimates.
But for MOEs you have some decisions to make

Now let’s fill in the table
488
Two different MOE approaches available
1. Calculate the 90% margin of error of the sum of more than two estimates
2. Calculate the 90% margin of error of the sum or difference between two estimated values (What two values would you use?)
Approach 1 gives you an MOE of either 245 when including the MOE for ‘Other Means’ or 214 without it
Approach 2 gives you an MOE of 209
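Both approaches use the same root-sum-of-squares rule; they differ only in which component MOEs go in. The sketch below is illustrative: the function name and the input MOEs are hypothetical, and it does not reproduce the 245, 214 or 209 from the slide because the underlying table values are not in this transcript.

```python
import math


def moe_of_sum(moes):
    """90% MOE of a sum of estimates: square root of the summed squared MOEs."""
    return math.sqrt(sum(m ** 2 for m in moes))


# Approach 1: one MOE per mode being added (hypothetical values)
mode_moes = [3, 4, 12]
approach_1 = moe_of_sum(mode_moes)

# Approach 2: only two published values, e.g. a total and the one
# category being removed from it (hypothetical values)
approach_2 = moe_of_sum([12, 5])
print(approach_1, approach_2)
```

Adding more components always widens the MOE, which is why dropping a near-empty category such as 'Other Means' changes the answer and why the two-value approach tends to come out smaller.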

What data should I use?
Travel Times for the 6 counties in NE Illinois
1. To compare with 1970, ‘80, ‘90 and 2000 Travel Times?
2. To compare with my town of 52K people?
3. To validate my 2008 vintage travel demand model?
Learn how to do the Coefficient of Variation Test
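The coefficient of variation (CV) expresses the standard error as a percentage of the estimate, so estimates of different sizes can be judged on one scale. A minimal sketch follows, assuming a 90% MOE as published by the ACS; the function name and the inputs are hypothetical, and any cutoff for "reliable enough" is the analyst's choice rather than something stated in these slides.

```python
Z_90 = 1.645  # 90% confidence, as used for ACS margins of error


def coefficient_of_variation(estimate, moe):
    """CV in percent: (MOE / 1.645) / estimate * 100."""
    if estimate == 0:
        raise ValueError("CV is undefined for an estimate of zero")
    standard_error = moe / Z_90
    return standard_error / estimate * 100.0


# Hypothetical estimates: a county-sized figure and a tract-sized figure
county_cv = coefficient_of_variation(1000, 164.5)
tract_cv = coefficient_of_variation(100, 82.25)
print(round(county_cv, 1), round(tract_cv, 1))
```

A small-area estimate with a CV several times larger than the county figure is a signal to move up to a larger geography or a longer period estimate.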

The Upside - Data Evolution
Once you know all the data issues, it is possible to use the data intelligently
It’s ignorance that kills you
Slides available at: edthefed.com