Some ACS Data Issues and Statistical Significance (MOEs) Table Release Rules Statistical Filtering & Collapsing Disclosure Review Board Statistical Significance.

Presentation transcript:

Some ACS Data Issues and Statistical Significance (MOEs) Table Release Rules Statistical Filtering & Collapsing Disclosure Review Board Statistical Significance Testing & Margins of Error (MOEs)

Table Release Rules February 28, 2007

“B” and “C” Tables

Full Table – PASSED FILTERING Statistically too Small

Collapsed Table

The Census Bureau Story Why did we collect all this data if we were not going to release it?

ACS Data Release Rules Doug Hillmer Data Products Area American Community Survey Office U.S. Census Bureau October 11, 2006

The Census Bureau Will Not Release All Available Estimates to the Public
Limitation of Disclosure Risk
– The Census Bureau’s Disclosure Review Board (DRB) must clear all data products prior to their release to the public.
Assurance of Statistical Reliability
– Data users need to be able to use ACS estimates as official Census Bureau data. Thus, some rules must be in place to ensure minimum reliability of estimates.
– Statistical reliability is assured by:
  population size thresholds below which estimates are not released
  data release testing and collapsing of tables that fail

The ACS “Identity Crisis” on Reliability
Ultimately, the 5-year estimates, with no “data release rules”, act as a long-form replacement.
The single-year ACS sample is more like a current demographic survey, although much larger in size.
Question to answer for single-year estimates: do we accept less detail in our measures of characteristics, or do we allow more detail but with data release rules in place?
Less detail punishes those areas with the diversity to support the detail.

Choices for displaying estimates in ACS data products
No suppression
1. Publish full detail with no suppression but a higher population threshold (e.g., 500,000)
2. Publish a limited set of estimates for all areas with 65,000+ population
3. Publish more detailed estimates for a higher population threshold and a limited set for a lower threshold
With suppression or warnings
4. Define a very detailed set of estimates for all geographic areas with 65,000+ population and suppress estimates that fail the reliability test
5. Define a very detailed set of estimates for all geographic areas with 65,000+ population and flag estimates that fail the reliability test

Filtering
Goal: to identify “weak” tables
Some tables have many zero or “near zero” cells and relatively large standard errors
Filtering rule used during ACS: drop tables if…
– Universe is less than 500 (weighted)
– Average cell size is less than 2 cases (unweighted)
Filtering rule used now:
– Accept if median coefficient of variation is less than or equal to 61%
– Otherwise, collapse and review again

Why not just use cell suppression as is done for the Economic products?
Advantages
– Gets rid of the “bad” estimates
– Keeps the “good” estimates (depends on complementary suppression)
Disadvantages
– Creates “holes” in distributions
– Makes new problems for combined estimates (e.g., in derived products, such as data profiles)
– Produces a new set of problems for year-to-year comparisons

Data Release Testing – Step by Step
Compute coefficients of variation
– Coefficient of variation = standard error / estimate
– Standard error = (upper bound – estimate) / 1.65
– If the estimate = 0, set coefficient of variation = 100%
Ignore total and sub-total lines in base table
Sort coefficients of variation in descending order
Find the middle value (the median)
If the median is greater than 61%, the table FAILS (median > 61% means more than half of the cells have a lower bound of 0; i.e., these cells are not statistically different from 0)
If the median is 61% or less, the table PASSES
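The step-by-step test above can be sketched in a few lines of Python. This is only an illustration of the rule as the slide states it, not Census Bureau code: the function names `cell_cv` and `table_passes` are hypothetical, the input is assumed to be a list of (estimate, upper bound) pairs for detail lines only (totals and sub-totals already dropped), and for a table with an even number of cells the median is assumed to be the average of the two middle values.

```python
from statistics import median

def cell_cv(estimate, upper_bound, z=1.65):
    """Coefficient of variation for one table cell, per the slide's definitions."""
    if estimate == 0:
        return 1.0  # slide rule: an estimate of 0 gets a CV of 100%
    standard_error = (upper_bound - estimate) / z
    return standard_error / estimate

def table_passes(cells, threshold=0.61):
    """cells: list of (estimate, upper_bound) pairs, detail lines only.

    Returns True when the median CV is at or below the threshold (61%).
    """
    cvs = [cell_cv(estimate, upper) for estimate, upper in cells]
    return median(cvs) <= threshold

# Hypothetical tables, invented for illustration only.
strong_table = [(1000, 1165), (500, 665), (0, 50)]
weak_table = [(100, 265), (0, 10), (50, 215)]
print(table_passes(strong_table), table_passes(weak_table))
```

A table that fails would then be collapsed and run through the same check again.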

Collapsing Goal: release a simplified version of a base table for a geographic area that otherwise would get nothing Decisions on design of collapsed tables are made by subject-matter experts at the Census Bureau For operational reasons, only one collapsed version of each base table will be available regardless of geographic area

How the Data Release Rules will Work with Collapsed Versions of Base Tables

More About Collapsing
Collapsed tables are designed to assure that derived products (profiles, ranking tables, subject tables, …) can still be sourced from the base tables
2005 tables: if a table passes filtering and a collapsed version exists, publish both the original version and the collapsed version for that geographic area

Problems to fix in the current implementation of the data release rules
– Collapsed versions missing in some cases
– Collapsed versions that aren’t working
– Poor choices in “sourcing” for derived products (e.g., profiles)

Statistical Significance Testing Why should I do it? When should I do it? How do I do it?

Testing is Important

Statements you might want to make
– Estimate X is bigger than Y
– Estimate X this year is larger than X last year
– Estimate X is smaller than Census 2000 value
– State Z has the highest value

How do I do a significance test?
1. Get the Margin of Error (MOE) from ACS
2. Calculate the Standard Error (SE) [SE = MOE / 1.645]
3. Solve for Z, where A and B are the two estimates: $Z = (A - B) / \sqrt{SE_A^2 + SE_B^2}$
4. If Z < -1.645 or Z > 1.645, the difference is significant at 90% confidence
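The four steps can be sketched as a short Python routine. It assumes the usual Z statistic for two independent estimates (the difference divided by the square root of the sum of the squared standard errors) and MOEs published at the 90% confidence level. The function names and the sample numbers are hypothetical.

```python
import math

Z_90 = 1.645  # critical value for 90% confidence

def se_from_moe(moe, z=Z_90):
    """Step 2: convert a published 90% margin of error to a standard error."""
    return moe / z

def z_score(a, moe_a, b, moe_b):
    """Step 3: Z statistic for the difference between estimates a and b."""
    se_a = se_from_moe(moe_a)
    se_b = se_from_moe(moe_b)
    return (a - b) / math.sqrt(se_a ** 2 + se_b ** 2)

def significantly_different(a, moe_a, b, moe_b, critical=Z_90):
    """Step 4: True when |Z| exceeds the 90% critical value."""
    return abs(z_score(a, moe_a, b, moe_b)) > critical

# Invented estimates and MOEs, for illustration only.
print(significantly_different(15000, 1000, 13000, 1200))
print(significantly_different(15000, 1500, 14000, 1500))
```

The same routine covers "X is bigger than Y" and "X this year is larger than X last year", as long as both estimates come with MOEs and can be treated as independent.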

Obtaining Standard Errors is the Key
Simple Formulas
– Sum or Difference of Estimates
– Proportions and Percents
– Means and Other Ratios
Where….
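The formulas themselves did not survive in this transcript. As a stand-in, the sketch below uses the approximation formulas commonly published in Census Bureau guidance for derived ACS estimates. Whether they match the slide exactly is an assumption, and the function names are hypothetical. Here `a` is the numerator estimate, `b` the denominator estimate, and `se_a`, `se_b` their standard errors.

```python
import math

def se_sum_or_difference(se_a, se_b):
    """Approximate SE of A + B or A - B."""
    return math.sqrt(se_a ** 2 + se_b ** 2)

def se_ratio(a, b, se_a, se_b):
    """Approximate SE of R = A / B when A is not a subset of B."""
    r = a / b
    return math.sqrt(se_a ** 2 + (r ** 2) * (se_b ** 2)) / b

def se_proportion(a, b, se_a, se_b):
    """Approximate SE of P = A / B when A is a subset of B.

    If the value under the square root is negative, fall back to the
    ratio formula, as the published guidance recommends.
    """
    p = a / b
    radicand = se_a ** 2 - (p ** 2) * (se_b ** 2)
    if radicand < 0:
        return se_ratio(a, b, se_a, se_b)
    return math.sqrt(radicand) / b

# Invented inputs, for illustration only.
print(se_sum_or_difference(3.0, 4.0))
print(se_proportion(50, 100, 5.0, 6.0))
```

For a percent, multiply the proportion and its standard error by 100.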

There is HELP off in the wings

But what if I am using 2000 non-ACS Data? Where are my MOEs?

Let’s get to work on the Standard Error
$SE = \sqrt{5Y\left(1 - \frac{Y}{N}\right)} \times \text{Survey Design Factor}$
N = Size of publication area (population)
Y = Estimate of characteristic

xx=fl Mode to Work

N = Size of publication area (population) = 362,563
Y = Estimate of characteristic = 126,540
5Y = 5 × 126,540 = 632,700
Y/N = 126,540 / 362,563 ≈ 0.349 (35%)
Unadjusted SE = $\sqrt{5Y(1 - Y/N)} = \sqrt{632{,}700 \times 0.651} \approx 642$
Survey Design Factor = 0.7
Final Adjusted SE = 0.7 × 642 ≈ 450
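The worked example can be checked with a few lines of Python. The sketch assumes the slide's terms 5Y and Y/N combine as the square root of 5Y(1 − Y/N), multiplied by the survey design factor, which is the form that reproduces the slide's final figure. The function name is hypothetical.

```python
import math

def census2000_sample_se(y, n, design_factor):
    """Approximate SE of a Census 2000 sample-based total.

    y: estimate of the characteristic
    n: size of the publication area (population)
    design_factor: survey design factor looked up for the characteristic
    """
    unadjusted = math.sqrt(5 * y * (1 - y / n))
    return unadjusted * design_factor

# Values from the slide: Y = 126,540, N = 362,563, design factor = 0.7
se = census2000_sample_se(126_540, 362_563, 0.7)
print(round(se))  # 449, which the slide rounds to 450
```

Multiplying the result by 1.645 gives a 90% margin of error that can be compared directly with ACS MOEs.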

Tempting Green is OK This is NOT

Want to do an exercise on your own?