1 National Center for Health Statistics Record Linkage Program Christine S. Cox, Chief, Special Projects Branch (SPB) Office of Analysis & Epidemiology.


1 National Center for Health Statistics Record Linkage Program
Christine S. Cox, Chief, Special Projects Branch (SPB), Office of Analysis & Epidemiology (OAE)
NCHS Data Users Conference, August 12, 2008
U.S. DEPARTMENT OF HEALTH AND HUMAN SERVICES, Centers for Disease Control and Prevention, National Center for Health Statistics

2 Overview
- NCHS Record Linkage Program
- Analytic Issues & Tools
- Comparative Analysis of Public vs Restricted Linked Mortality Files
- Accessing the Restricted-use Linked Data

3 NCHS Record Linkage Program
- Links survey data with data collected from administrative records
- Designed to maximize the scientific value of the NCHS population-based surveys
- Examine factors that influence chronic disease, disability, health care utilization, morbidity, and mortality

4 Why Do Linkage?
- Augments available information for major diseases, risk factors, and health service utilization
- Links exposures to outcomes
- Provides longitudinal component to survey data
- Reduces cost burden
  - Re-contacting survey respondents for follow-up information can be expensive
- Increases accuracy and detail of data collected

5 How Records are Linked
- NCHS records: SSN, name, DoB, sex, state of birth, race, state of residence, marital status
- Administrative records: SSN, name, DoB, sex, state of birth, race, state of residence, marital status
- Comparing the two sets of records yields non-matches and potential matches
- Potential matches go through a scoring system and clerical review, yielding true matches and non-matches
- True matches make up the linked data file
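The scoring step in the slide above can be illustrated with a minimal sketch. The slides do not give the actual NCHS scoring rules, so the field weights, the two thresholds, and the function names below are all hypothetical; the sketch only shows the general idea of summing agreement weights over identifiers and routing borderline scores to clerical review.

```python
# Hypothetical agreement weights per identifier; NOT the actual NCHS values.
WEIGHTS = {"ssn": 10.0, "name": 4.0, "dob": 4.0, "sex": 1.0, "state_of_birth": 1.0}


def match_score(survey_rec, admin_rec, weights=WEIGHTS):
    """Sum the weights of the fields on which both records agree."""
    score = 0.0
    for field, weight in weights.items():
        a, b = survey_rec.get(field), admin_rec.get(field)
        # Missing identifiers contribute nothing to the score.
        if a is not None and b is not None and a == b:
            score += weight
    return score


def classify(score, lower=8.0, upper=15.0):
    """Map a score to an outcome; both thresholds are illustrative assumptions."""
    if score >= upper:
        return "true match"
    if score >= lower:
        return "clerical review"
    return "non-match"


survey = {"ssn": "123", "name": "A", "dob": "1950-01-01", "sex": "F", "state_of_birth": "OH"}
admin_full = dict(survey)
admin_partial = {"ssn": None, "name": "A", "dob": "1950-01-01", "sex": "M", "state_of_birth": "PA"}
print(classify(match_score(survey, admin_full)))
print(classify(match_score(survey, admin_partial)))
```

In practice the weights would be estimated from the data rather than fixed by hand; the linkage methodology reports are the place to check what was actually done.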

6 Research Potential of NCHS Linked Data
- Aging
  - Risk factors for poor health outcomes (hip fractures, stroke, etc.)
- Disability
  - Effects of chronic illness and obesity on disability and mortality
- Disparities
  - Mortality patterns by race/ethnicity or socioeconomic status
- Health services
  - Functional impairment and health care costs
- Methodologic Studies
  - Validation of self-reports vs. administrative records
- Genetics
  - Genetic variants and health outcomes

7 Record Linkage Activities
- Mortality
  - National Death Index
- Social Security Retirement and Disability
  - Data from the Retirement, Survivors, Disability Insurance (RSDI) and Supplemental Security Income (SSI) programs
- Medicare enrollment and payments
  - Enrollment and claims data

8 NCHS Linked Mortality Data Files
[Table of NCHS Health Surveys by linkage status; the X marks in the cells are not recoverable from the transcript.]
Column headings: Completed Linkage (death data through 2000/2002); Future Linkage (death data through 2006); Public-use; Restricted-use
Surveys listed: NHIS 1986-2000; LSOA II (1994-2000); NHEFS (1971-1992); NHANES II (1976-1980); NHANES III (1988-1994); NHANES 1999-2004; NNHS 1995, 1997, 2004; NNHS 1985; NHIS 2001-2004
† Children included

9 Number of Deaths by Survey
NCHS Survey        Total Deaths
NHIS 1986-2000     121,138
LSOA II            3,958
NHEFS              6,656
NHANES II          4,143
NHANES III         3,384
NHIS and LSOA II have mortality follow-up through 12/31/2002. NHEFS, NHANES II and III have mortality follow-up through 12/31/2000.

10 Public-use Linked Mortality Files
- In 2007, released public-use files with a limited amount of perturbed data and reduced number of mortality variables
  - NHIS 1986-2000
  - NHANES III
  - LSOA II
- Study comparing analyses from public-use and restricted-use linked mortality files demonstrated similar results
  - Lochner et al. Am. J. Epidemiol. 2008 168: 336-344

11 Mortality Data Elements
- Vital status
- Date of death or follow-up time
- Underlying cause of death
- Multiple cause of death*
- Age at death*
- Age last presumed alive*
*only available on restricted-use files

12 Research Potential of Linked Mortality Data
- Excess Deaths Associated with Underweight, Overweight, and Obesity. KM Flegal, BI Graubard, DF Williamson, MH Gail; JAMA, 2005;293:1861-1867.
- Living and Dying in the USA: Behavioral, Health, and Social Differentials of Adult Mortality. RG Rogers, CB Nam, RA Hummer; 2000.
- Suicide among male veterans: a prospective population-based study. MS Kaplan, N Huguet, BH McFarland, JT Newsom; J Epidemiol Community Health, 2007; 61:619-624.

13 NCHS Linked Medicare Data Files
[Table of NCHS surveys by linkage status; the X marks in the cells are not recoverable from the transcript.]
Column headings: Completed Linkage (CMS data 1991-2000); Future Linkage (CMS data 1999-2007)
Surveys listed: NHEFS (1971-1992); LSOA II (1994-2000); NHIS 1999-2005; NHIS 1994-1998; NHANES III (1988-1994); NHANES 1999-2004; NNHS 1997, 2004; NHANES II (1976-1980)

14 Medicare Linkage
- Medicare enrollment and claims data for the years 1991-2000
  - Denominator file
  - MEDPAR Inpatient hospitalization
  - MEDPAR Skilled nursing facility (SNF)
  - Hospital outpatient
  - Home Health Agency (HHA)
  - Hospice
  - Carrier (physician/supplier Part B file)
  - Durable Medical Equipment (DMERC)
- Next data release (1999-2007)
  - All of the above files
  - Chronic Conditions Warehouse
  - Medicare Part D (Prescription Drugs)

15 Summary Medicare Data File
- Summary Medicare Enrollment and Claims Files (SMEC) for 1991-2000
- Enrollment information from the Denominator file plus summary variables of claims and payments
- Variables modeled after MCBS cost and use files
  - Total reimbursements per year
  - Total number of claims by Medicare record type
  - Summary of charges by Medicare record type
  - Termination status & reason for termination
  - Monthly HMO enrollment
  - Medicare status code (i.e. Part A, B or both)

16 Research Potential of Linked Medicare Data
- Examine risk factors for health conditions
- Examine reliability of survey data
  - Compare survey reported Medicare enrollment to Medicare claims records
  - Examine survey report of disability with program participation eligibility criteria
- Examine disparities in Medicare service utilization

17 NCHS Linked SSA Data Files
[Table of NCHS surveys by linkage status; the X marks in the cells are not recoverable from the transcript.]
Column headings: Completed Linkage (SSA data 1962-2003); Future Linkage (SSA data 1962-2007)
Surveys listed: NHEFS (1971-1992); NNHS 1995, 1997, 2004; LSOA II (1994-2000); NHIS 1999-2005; NHIS 1994-1998; NNHS 1985; NHANES III (1988-1994); NHANES 1999-2004

18 Social Security Linkage
- Old Age, Survivor, & Disability Income
  - Master Beneficiary Record (MBR), 1962-2003
    - Program eligibility, benefit amount, payment status, dual entitlement
  - Payment History Update System (PHUS), 1984-2003
    - Benefit payment amounts, including withholding information for Medicare Part B premiums
- Supplemental Security Income
  - Supplemental Security Record (SSR), 1974-2003
    - Program eligibility, benefit information, and payment status

19 Research Potential of Linked Social Security Data
- Examine reliability of survey information for SSA program participation and benefits
- Compare the health characteristics of early retirees (age 62) to those who postpone benefits
- Policy analysis using validated survey data
  - Predicting the number of people who will become disabled based upon survey reported health conditions
  - Determining whether current disability entitlement funding levels will be adequate as the population ages

20 Future Linkage Activities
- Linkage of 1999-2004 Medicaid enrollment and claims data to the 1999-2004 NHIS and NHANES
- NCHS series report comparing the mortality experience of the 1986-2000 National Health Interview Survey participants with the U.S. population

21 Overview
- NCHS Record Linkage Program
- Analytic Issues & Tools
- Comparative Analysis of Public vs Restricted Linked Mortality Files
- Accessing the Restricted-use Linked Data

22 National Center for Health Statistics Record Linkage Program: Analytic Issues and Tools
Kimberly A. Lochner, SPB, OAE
NCHS Data Users Conference, August 12, 2008
U.S. DEPARTMENT OF HEALTH AND HUMAN SERVICES, Centers for Disease Control and Prevention, National Center for Health Statistics

23 Analytic Issues: Overview
- Linkage eligibility
- Linkage match status
- Combining survey years for the linked mortality files
- Changes in surveys or administrative data over time
- Issues with administrative data

24 Mortality: Analytic Issues
- Eligibility status
  - Sample weights
- Combining survey years for the linked mortality files
  - Variance estimation
- Changes over time
  - ICD-9 and ICD-10 codes
Most of these issues apply only to the NHIS Linked Mortality Files

25 25

26 Eligibility Status
- What determines eligibility for mortality follow-up?
  - Age
    - Non “adult” survey respondents are INELIGIBLE
    - Future linkages will include children
  - Sufficient data for matching
    - Lack of identifying data makes you INELIGIBLE
- Drop INELIGIBLE survey respondents
  - Variable indicating eligibility status on files

27 Mortality Ineligibility: Lack of Matching Data (adults only)
NCHS Health Survey    % Ineligible
NHIS 1986-1991        < 2.0
NHIS 1992-1996        2.0 – 3.0
NHIS 1997-2000        8.0 – 11.0
NHANES II             0.0
NHANES III            0.13
NHEFS                 0.0

28 Eligibility Status
- Ineligibility a problem for NHIS
- Created new sample weights to account for ineligibility due to insufficient identifying data
  - Original NHIS sample weights (WTFA)
  - New NHIS sample weights (WGT_NEW)
    - Only for core/person files
- Recommend using WGT_NEW
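As a rough illustration of the two recommendations above (drop ineligible respondents, then weight with WGT_NEW), here is a minimal sketch. WGT_NEW is the weight named on the slide; the field names ELIGIBLE and DECEASED and the record layout are hypothetical stand-ins, since the slides do not name the eligibility or vital status variables. It computes only a weighted proportion and does not address variance estimation, which needs the survey design variables.

```python
def weighted_death_proportion(records):
    """Weighted share of eligible respondents who died.

    Each record is a dict with:
      ELIGIBLE  - hypothetical flag for eligibility for mortality follow-up
      DECEASED  - hypothetical vital status flag
      WGT_NEW   - eligibility-adjusted sample weight (named on the slide)
    """
    # Step 1: drop respondents who are ineligible for mortality follow-up.
    eligible = [r for r in records if r["ELIGIBLE"]]
    # Step 2: weight with WGT_NEW rather than the original weight (WTFA).
    total_weight = sum(r["WGT_NEW"] for r in eligible)
    dead_weight = sum(r["WGT_NEW"] for r in eligible if r["DECEASED"])
    return dead_weight / total_weight


sample = [
    {"ELIGIBLE": True, "DECEASED": True, "WGT_NEW": 100.0},
    {"ELIGIBLE": True, "DECEASED": False, "WGT_NEW": 300.0},
    {"ELIGIBLE": False, "DECEASED": False, "WGT_NEW": 500.0},  # excluded
]
print(weighted_death_proportion(sample))
```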

29 Combining Survey Years
- NHIS linked mortality files cover two design periods (1986-1994 and 1995-2000)
- Follow guidelines on pooling NHIS years
  - http://www.cdc.gov/nchs/nhis/methods.htm
- Created new stratum and PSU variables for NHIS Linked Mortality files to allow combining across NHIS design years

30 Changes in Data Over Time
- ICD-9 (deaths 1979 – 1998) and ICD-10 (deaths 1999 to present) cover linked mortality files
- Use both sets of codes to obtain full counts of cause-specific deaths
  - Individual codes (ICD_9REV, ICD_10REV)
  - Recodes
    - UCOD_282 (ICD-9)
    - UCOD_72 (ICD-9)
    - UCOD_34 (ICD-9)
    - UCOD_358 (ICD-10)
    - UCOD_113 - recodes deaths before 1998 using ICD-10 guidelines
- Refer to vital statistics report on ICD comparability
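To make the "use both sets of codes" point concrete, here is a minimal sketch that counts diabetes deaths across the ICD-9 and ICD-10 eras. ICD_9REV and ICD_10REV are the variable names from the slide; how the codes are stored on the files (assumed here to be strings without a decimal point) and the helper names are assumptions. Diabetes mellitus is ICD-9 category 250 and ICD-10 categories E10 through E14.

```python
ICD10_DIABETES = {"E10", "E11", "E12", "E13", "E14"}


def is_diabetes_death(rec):
    """True if the underlying cause is diabetes under either ICD revision."""
    # Assumption: codes are strings such as "2500" or "E119"; a missing
    # revision is stored as None or an empty string.
    icd9 = rec.get("ICD_9REV") or ""
    icd10 = rec.get("ICD_10REV") or ""
    return icd9.startswith("250") or icd10[:3] in ICD10_DIABETES


def count_diabetes_deaths(records):
    return sum(1 for r in records if is_diabetes_death(r))


deaths = [
    {"ICD_9REV": "2500", "ICD_10REV": None},  # death coded under ICD-9
    {"ICD_9REV": None, "ICD_10REV": "E119"},  # death coded under ICD-10
    {"ICD_9REV": None, "ICD_10REV": "I219"},  # not diabetes
]
print(count_diabetes_deaths(deaths))
```

Checking only one of the two variables would silently drop every death from the other coding era, which is the undercount the slide warns about.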

31 Medicare: Analytic Issues
- Eligibility status
- Eligible but not matched
- Death
- Linked but no Medicare data
- Managed care enrollment
- Non covered services
- Gaps in coverage
- Issues with Medicare data files
  - See the NCHS-CMS linkage web page under “Analytic/Programming Support”

32 Medicare Ineligible Population and Linkage Rates (65+ years)
NCHS Health Survey    % Ineligible    % Linked among eligible
NHIS 1994             17.9            92.8
NHIS 1995             19.3            92.8
NHIS 1996             22.2            92.1
NHIS 1997             30.7            93.7
NHIS 1998             40.3            92.4
LSOA II               20.4            96.2
NHEFS                 7.1             84.9
NHANES II             0.0             81.0
NHANES III            1.9             95.9

33 Ineligibles and Non-Matches
- Must be excluded from your sample
- Identify using the variable (CMS_MATCH) on the Feasibility Study Data files

34 Identifying Deaths
- Survey participants interviewed before the availability of linked Medicare files could have died before 1991
  - E.g. NHEFS, NHANES II or NHANES III respondents interviewed in Phase I (1988-91)
- Persons may die during study period and cease to have Medicare records
  - Enrolled in Medicare in 1991 but died before 2000

35 Identifying Deaths
- Survey respondents who died before 1991 (e.g. from NHANES) can be identified by merging mortality information from the Linked Mortality files
  - Needed to create analytic sample
- Persons who died during 1991-2000 should no longer have Medicare records after date of death
  - Look for a CMS date of death (DOD) on each of the Denominator or SMEC files (1991 to 2000)

36 Linked but no Medicare data
- No denominator file because
  - Loss of entitlement during 1991-2000
  - Deceased prior to 1991
  - CMS record keeping inconsistencies
- No claims data
  - Not utilizing Medicare in 1991-2000
  - No reimbursable claims
  - CMS record keeping inconsistencies

37 No Denominator Record
- Lack of denominator record can affect your analytic sample – why?
  - Can’t determine managed care enrollment
  - In general, managed care enrollees are excluded from sample (more on this to come)

38 Managed Care Enrollment
- Medicare does not receive claims for beneficiaries enrolled in managed care plans (HMO)
  - Do not have complete information on payments or services received
  - Could miss health events that are being counted based upon submitted claims
- Complex issue. Refer to ResDAC
  - http://www.resdac.umn.edu/

39 How managed care enrollees affect your research depends upon your question…
- Studies on reimbursements/charges
  - Option may be to exclude those with any managed care enrollment because you don’t have complete information on payments or services received
- Studies on health outcomes/events
  - Option may be to exclude those with any managed care enrollment because you could miss events
  - Option may be to censor observations at time of first HMO enrollment
- Other methods for addressing HMO enrollment possible depending upon research question
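The censoring option mentioned above can be sketched as follows. This is an illustration under stated assumptions, not a prescribed method: the function name, its arguments, and the choice to measure time in days are all hypothetical, and the date of first HMO enrollment would in practice have to be derived from the monthly HMO enrollment indicators.

```python
from datetime import date


def censored_follow_up(start, end_of_study, event_date=None, first_hmo_date=None):
    """Return (days_at_risk, event_observed) for one beneficiary.

    Observation stops at the earliest of: the event, the first HMO
    enrollment (claims are incomplete afterwards), or the end of the study.
    """
    stop = end_of_study
    if first_hmo_date is not None and first_hmo_date < stop:
        stop = first_hmo_date
    # An event counts only if it occurs while the person is still observed.
    observed = event_date is not None and event_date <= stop
    if observed:
        stop = event_date
    return (stop - start).days, observed


# Event in 1996, but HMO enrollment began in 1995: censored, event not counted.
print(censored_follow_up(date(1991, 1, 1), date(2000, 12, 31),
                         event_date=date(1996, 6, 1),
                         first_hmo_date=date(1995, 1, 1)))
```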

40 Services not covered in Medicare 1991-2000 files
- Out-patient prescription drugs
- Routine physical and dental exams
- Dentures
- Eye glasses
- Out-of-pocket expenses for Medicare beneficiaries (e.g. deductibles, coinsurance)

41 SSA: Analytic Issues
- Eligibility status
- Eligible but not matched
- Linked but no benefit history data
- Records are extracted from files designed for program administration - not for research

42 SSA Ineligible Population and Linkage Rates
NCHS Health Surveys    % Ineligible    % Linked among Eligible
NHIS 1994              18.6            91.7
NHIS 1995              20.5            90.2
NHIS 1996              25.3            89.1
NHIS 1997              31.4            88.0
NHIS 1998              37.7            86.7
LSOA II                19.1            97.6
NHEFS                  6.0             94.6
NHANES III             2.9             95.3
NNHS 1985              5.6             93.3

43 Ineligibles and Non-Matches
- Must be excluded from your sample
- Identify using the variable (SSA_MATCH) on the Feasibility Study Data files

44 Linked but no SSA Data
- Linkage is to SSA NUMIDENT file
- Linked to NUMIDENT file but may not be eligible for Social Security benefits
  - Not age eligible for retirement
  - Defer retirement benefits because working full-time
  - Not eligible for Social Security

45 Issues with Administrative Data
- Administrative data updates
  - Payment history updates
  - Previously denied claims may be overridden
- Changes to type of benefit status
  - Individuals receiving disability (DI) switch to retirement (R) benefits at age 65 in RSDI program
- Complicated data
  - File layouts are complex, e.g. each MBR record has 2 parts
  - Calculation of benefits not straightforward, e.g. SSI benefits come from both federal and state programs

46 Final Tips
- Read relevant documentation!!!
  - Survey file layouts & detailed notes
  - Linkage methodology reports
  - Sample SAS & STATA input statements for public-use linked mortality files
  - Analytic guidelines
- Consult basic program information
  - CMS – http://www.cms.gov
  - ResDAC – http://www.resdac.umn.edu (Medicare)
  - SSA – http://www.ssa.gov and http://www.ssa.gov/regulations/index.htm

47 Final Tips
- Determine NCHS public-use files needed
- Determine RDC linked files needed
- Determine feasibility of research question based upon successfully linked respondents
  - Public-use Feasibility Study Data files available indicating whether respondent was linked to Medicare or SSA data and whether there is a record on the various Medicare and/or SSA files
  - Match status (SSA_MATCH & CMS_MATCH)

48 Overview
- NCHS Record Linkage Program
- Analytic Issues & Tools
- Comparative Analysis of Public vs Restricted Linked Mortality Files
- Accessing the Restricted-use Linked Data

49 National Center for Health Statistics Record Linkage Program: Comparative Analysis of the Public-use and Restricted-use Linked Mortality Files
Kimberly A. Lochner, SPB, OAE
NCHS Data Users Conference, August 12, 2008
U.S. DEPARTMENT OF HEALTH AND HUMAN SERVICES, Centers for Disease Control and Prevention, National Center for Health Statistics

50 Objectives
- Present an overview of the newly available public-use linked mortality files
  - National Health Interview Survey (NHIS) 1986 to 2000
  - Third National Health and Nutrition Examination Survey (NHANES III)
  - The Second Longitudinal Study of Aging (LSOA II)
- Demonstrate the analytic comparability between the public-use and restricted-use versions of the linked mortality files

51 Background
- Mortality follow-up studies are a major focus of NCHS record linkage activities
- NCHS linked mortality files created in 2004 made available through NCHS Research Data Center (RDC)
  - Protects confidentiality of survey participants
  - May minimize access to highly utilized data sources

52 Background
- NCHS plan for public-use linked mortality files included
  - Releasing a reduced number of key mortality variables
  - Perturbing date or cause of death for select records
  - Determining that survey participants could not be reidentified
  - Comparing the analytic utility of the public-use file to the restricted-use file

53 Public-use Linked Mortality Files
- NHIS (1986 – 2000)
  - Each NHIS year is a nationally representative survey of the civilian non-institutionalized U.S. population
  - Questionnaire content
    - Basic socio-demographic characteristics
    - Health conditions and utilization
    - Health status, health care services, and behavior
  - Mortality follow-up through December 2002

54 Public-use Linked Mortality Files
- NHANES III (1988 – 1994)
  - Includes survey and examination information designed to assess the health and nutritional status of U.S. adults and children
  - Study content
    - Basic socio-demographic characteristics
    - Medical and dental examinations
    - Laboratory tests
    - Environmental exposures
  - Mortality follow-up through December 2000

55 Public-use Linked Mortality Files
- LSOA II
  - Prospective survey of persons 70 years of age and over at the time of their baseline interview (1994 NHIS)
  - Follow-up interviews in 1997-98 and 1999-00
  - Questionnaire content
    - Basic socio-demographic characteristics
    - Health conditions, functional health status and disability
    - Health care utilization
  - Mortality follow-up through December 2002

56 Data Elements: NHIS Linked Mortality Files
Survey Variables             Restricted-use            Public-use
Final mortality status       Yes                       Yes
Death date                   Yes (month, day, year)    Yes (quarter, year)
Underlying cause-of-death    Yes                       Yes (grouped recode)
Multiple cause-of-death      Yes                       Yes*
Age at interview             Yes                       Yes (top coded at 85+)
Age at death                 Yes                       No
Age last presumed alive      Yes                       No
Date of birth                Yes (month, day, year)    Yes (month, year)**
Interview date               Yes (month, day, year)    Yes (quarter, year)**
* MCOD flags only for diabetes, hypertension, and hip fracture
** Available on the public-use NHIS survey data files

57 Data Elements: NHANES III Linked Mortality Files
Survey Variables             Restricted-use            Public-use
Final mortality status       Yes                       Yes
Underlying cause-of-death    Yes                       Yes (grouped recode)
Multiple cause-of-death      Yes                       Yes*
Age at interview/exam        Yes                       Yes**
Age at death                 Yes                       No
Age last presumed alive      Yes                       No
Date of birth                Yes (month, day, year)    No
Death date                   Yes (month, day, year)    No
Interview date               Yes (month, day, year)    No
Person months FU             No                        Yes
Mortality source             Yes                       Yes
* MCOD flags only for diabetes, hypertension, and hip fracture
** Available on the public-use NHANES III survey data files

58 Data Elements: LSOA II Linked Mortality Files
Survey Variables             Restricted-use            Public-use
Final mortality status       Yes                       Yes
Death date                   Yes (month, day, year)    Yes (quarter, year)
Underlying cause-of-death    Yes                       Yes (grouped recode)
Multiple cause-of-death      Yes                       Yes*
Age at interview             Yes                       Yes
Age at death                 Yes                       No
Age last presumed alive      Yes                       No
Date of birth                Yes (month, day, year)    Yes (month, year)**
Interview date               Yes (month, day, year)    Yes (month, year)
Mortality source             Yes                       Yes
* MCOD flags only for diabetes, hypertension, and hip fracture
** Available on the public-use LSOA II survey data files

59 Comparative Analyses

60 Statistical Methods
- Compared mean follow-up times and distributions for select causes of death
- Compared the mortality risk for a standard set of socio-demographic covariates for all-cause as well as cause-specific mortality
  - Cox proportional hazard models
  - SUDAAN to take into account complex survey design
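For reference, the generic form of a Cox proportional hazards model is written out below. The slides do not give the exact specification used in the comparison, so the covariate vector $x$ and coefficients $\beta$ are placeholders for whatever socio-demographic covariates enter a given model.

```latex
% Hazard at follow-up time t for a respondent with covariate vector x
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right)

% Hazard ratio for a one-unit increase in covariate x_j, other covariates held fixed
\mathrm{HR}_j = \exp(\beta_j)
```

Here $h_0(t)$ is the unspecified baseline hazard. Comparing the public-use and restricted-use files amounts to fitting the same model on each and comparing the estimated hazard ratios.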

61 61 Analytic Samples
  Eligible for mortality follow-up
  At least 25 years of age at the time of the survey interview
  Non-Hispanic white, non-Hispanic black, or Hispanic
  Non-missing values for cause of death or other covariates
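
The inclusion criteria on this slide could be expressed as a simple filter. The sketch below is only an illustration: the `Respondent` record and its field names are hypothetical, not NCHS variable names, and the check for a missing cause of death among decedents is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical record layout; field names are illustrative, not NCHS variable names.
@dataclass
class Respondent:
    eligible: bool                    # eligible for mortality follow-up
    age_at_interview: Optional[int]   # None means the value is missing
    race_ethnicity: Optional[str]
    education: Optional[str]
    sex: Optional[str]

ALLOWED_GROUPS = {"Non-Hispanic white", "Non-Hispanic black", "Hispanic"}

def in_analytic_sample(r: Respondent) -> bool:
    """Return True if the respondent meets the inclusion criteria listed on the slide."""
    covariates = (r.age_at_interview, r.race_ethnicity, r.education, r.sex)
    if any(value is None for value in covariates):
        return False  # drop records with missing covariates
    return (
        r.eligible
        and r.age_at_interview >= 25
        and r.race_ethnicity in ALLOWED_GROUPS
    )
```

Filtering once, before any model is fitted, keeps the public-use and restricted-use comparisons on the same set of respondents.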

62 62 Covariates
  Socio-demographic characteristics reported at time of interview and taken from public-use survey data files:
    Age
    Sex
    Race and ethnicity
    Educational attainment
    Marital status (except NHANES III)
    Region of the country (except NHANES III)

63 63 Outcomes
  All-cause and cause-specific mortality
    Cause-specific deaths based on underlying cause of death from the ICD-10 113 grouped recode
  Duration of follow-up calculated from time of interview until death or censored at end of the follow-up period
    Restricted-use files use complete information on interview and death month, day, and year
    Public-use files use less detailed information on timing of death, some of which is perturbed
      NHIS/LSOA II: use interview year and death year only
      NHANES III: use person-time follow-up provided on the file
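
The difference between the two follow-up definitions on this slide can be illustrated with a small sketch. The function names, the 365.25-day year, and the example dates are assumptions made for illustration; they are not the presenters' code, and the public-use perturbation is not modeled.

```python
from datetime import date
from typing import Optional

DAYS_PER_YEAR = 365.25  # assumed conversion factor

def followup_years_exact(interview: date, death: Optional[date],
                         end_of_followup: date) -> float:
    """Restricted-use style: full interview and death dates are available."""
    # Censor at the end of follow-up when there is no death within the period.
    if death is not None and death <= end_of_followup:
        end = death
    else:
        end = end_of_followup
    return (end - interview).days / DAYS_PER_YEAR

def followup_years_coarse(interview_year: int, death_year: Optional[int],
                          end_year: int) -> int:
    """Public-use style (NHIS/LSOA II): only interview year and death year are used."""
    if death_year is not None and death_year <= end_year:
        end = death_year
    else:
        end = end_year
    return end - interview_year

# Hypothetical respondent: interviewed 15 March 1995, died 2 November 2001,
# with follow-up assumed to end on 31 December 2002.
exact = followup_years_exact(date(1995, 3, 15), date(2001, 11, 2), date(2002, 12, 31))
coarse = followup_years_coarse(1995, 2001, 2002)
print(f"exact: {exact:.2f} years, coarse: {coarse} years")
```

For any one person the two durations can differ by up to about a year in either direction, which is consistent with the small differences in mean follow-up reported on the results slides.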

64 64 NHIS Results
  Sample (n = 897,232)
  Deaths (n = 114,264)
    11.8% weighted
  Follow-up (mean)
    Restricted-use = 8.6 years
    Public-use = 8.7 years

65 65 NHIS Linked Mortality Files: Cause-specific Deaths

66 66 NHIS Linked Mortality Files: Relative Hazards for All-Cause Mortality
  Covariates                                 Restricted-use   Public-use
  Age (years)                                1.09             1.09
  Sex (female)
    Male                                     1.69             1.69
  Race/ethnicity (NHW)
    NHB                                      1.15             1.15
    Hispanic                                 0.89             0.89
  Educational attainment (College grad +)
    Less than high school                    1.68             1.68
    High school or GED                       1.41             1.41
    Some college                             1.28             1.28
  Note: Models also adjusted for marital status and region of the country.

67 67 NHIS Linked Mortality Files: Relative Hazards for Homicide Mortality
  Covariates                                        Restricted-use   Public-use
  Age (years)                                       0.99             0.98
  Sex (female)
    Male                                            2.70             2.70
  Race/ethnicity (NHW)
    NHB                                             3.90             4.00
  Educational attainment (More than high school)
    Less than high school                           2.31             2.44
    High school or GED                              1.55             1.65
  Note: Models are restricted to Non Hispanic Whites and Blacks (n = 802,307). Models also adjusted for marital status and region of the country.

68 68 NHANES III Results
  Sample (n = 16,048)
  Deaths (n = 3,209)
    12.1% weighted
  Follow-up (mean)
    Restricted-use = 104.1 months
    Public-use = 103.8 months

69 69 NHANES III Linked Mortality Files: Cause-specific Deaths
                     Restricted-use                                Public-use
  Causes of Death    Percentage (weighted)   Number (unweighted)   Percentage (weighted)   Number (unweighted)
  Heart disease      34.8                    1,158                 35.5                    1,188
    Ischemic         11.3                    336                   11.4                    344
  Cancer (all)       25.3                    698                   25.0                    689
    Lung             7.6                     179                   7.6                     180
  Stroke             7.0                     269                   6.9                     266

70 70 NHANES III Linked Mortality File: Relative Hazards for All-Cause Mortality
  Covariates                                        Restricted-use   Public-use
  Age (years)                                       1.09             1.09
  Sex (Female)
    Male                                            1.46             1.46
  Race/ethnicity (Non Hispanic White)
    Non Hispanic Black                              1.38             1.38
    Mexican-American                                0.99             0.99
  Educational attainment (More than high school)
    Less than high school                           1.39             1.40
    High school                                     1.28             1.28

71 71 NHANES III Linked Mortality File: Relative Hazards for Cerebrovascular Mortality
  Covariates                                        Restricted-use   Public-use
  Age (years)                                       1.12             1.12
  Sex (Female)
    Male                                            1.10             1.11
  Race/ethnicity (Non Hispanic White)
    Non Hispanic Black                              1.55             1.50
  Educational attainment (More than high school)
    Less than high school                           0.81             0.81
    High school                                     0.87             0.87
  Note: Models restricted to Non Hispanic Whites and Blacks (n = 11,985).

72 72 LSOA II Results
  Sample (n = 8,867)
  Deaths (n = 3,671)
    41.4% weighted
  Follow-up (mean)
    Restricted-use = 4.4 years
    Public-use = 4.4 years

73 73 LSOA II Linked Mortality Files: Cause-specific Deaths
                     Restricted-use                                Public-use
  Causes of Death    Percentage (weighted)   Number (unweighted)   Percentage (weighted)   Number (unweighted)
  Heart disease      34.2                    1,273                 34.9                    1,302
    Ischemic         9.0                     338                   9.0                     340
  Cancer (all)       22.0                    808                   21.7                    797
    Lung             5.7                     204                   5.7                     204
  Stroke             8.3                     313                   8.3                     312

74 74 LSOA II Linked Mortality File: Relative Hazards for All-Cause Mortality
  Covariates                                        Restricted-use   Public-use
  Age (70-79)
    80-89                                           1.92             1.91
    90+                                             3.09             3.11
  Sex (Female)
    Male                                            1.52             1.53
  Race/ethnicity (Non Hispanic White)
    Non Hispanic Black                              1.08             1.08
    Hispanic                                        0.79             0.78
  Educational attainment (More than high school)
    Less than high school                           1.22             1.24
    High school                                     1.22             1.23
  Note: Models also adjusted for marital status and region of the country.

75 75 LSOA II Linked Mortality File: Relative Hazards for Cancer Mortality
  Covariates                                        Restricted-use   Public-use
  Age (70-79)
    80-89                                           1.30             1.29
    90+                                             0.66             0.62
  Sex (Female)
    Male                                            1.75             1.77
  Educational attainment (More than high school)
    Less than high school                           1.14             1.16
    High school                                     1.28             1.29
  Marital status (Married)
    Widowed                                         1.04             1.05
    Divorced/separated                              1.08             1.06
    Never married                                   1.11             1.14
  Note: Models restricted to Non Hispanic Whites (n = 7,586). Models also adjusted for region of the country.

76 76 Conclusions
  Public-use linked mortality files yield similar results as the restricted-use data
    Public-use and restricted-use files yield similar hazard ratios and confidence intervals, particularly for common causes of death
    Results for less common causes of death remain consistent, although there tends to be less agreement in the estimates
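
One simple way to put a number on "less agreement" is the relative difference between paired estimates. The sketch below applies it to a few relative hazards quoted on the NHIS slides, reading the first value in each pair as the restricted-use estimate. The helper name and the choice of metric are illustrative assumptions, not the presenters' method.

```python
def percent_difference(restricted: float, public: float) -> float:
    """Relative difference of the public-use estimate from the restricted-use one, in percent."""
    return 100.0 * (public - restricted) / restricted

# (restricted-use, public-use) relative hazards as read from the NHIS slides.
examples = {
    "All-cause, less than high school": (1.68, 1.68),
    "Homicide, less than high school": (2.31, 2.44),
    "Homicide, high school or GED": (1.55, 1.65),
}

for label, (restricted, public) in examples.items():
    print(f"{label}: {percent_difference(restricted, public):+.1f}%")
```

The all-cause estimates are identical to two decimals, while the homicide estimates differ by several percent, in line with the caution about rare causes of death.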

77 77 Conclusions
  Caution is urged for analyses of very rare causes of death or small population subgroups
  Users of the public-use linked mortality files may request to verify their results through the NCHS Research Data Center

78 78 Public-use Linked Mortality Files Can Be Downloaded http://www.cdc.gov/nchs/data_access/data_linkage_activities.htm

79 79 Acknowledgements
  American Journal of Epidemiology 2008 168(3):336-344
  SPB data linkage team
    Stephanie Bartee
    Jim Brittain
    Cordell Golden
    Donna Miller
    Gloria Wheatcroft

80 80 Overview
  NCHS Record Linkage Program
  Analytic Issues & Tools
  Comparative Analysis of Public vs Restricted Linked Mortality Files
  Accessing the Restricted-use Linked Data

81 81 NCHS Record Linkage Activities: Accessing Restricted Linked Data at the NCHS Research Data Center
  Christine Cox
  NCHS Data Users Conference, August 12, 2008
  U.S. DEPARTMENT OF HEALTH AND HUMAN SERVICES
  Centers for Disease Control and Prevention
  National Center for Health Statistics

82 82 Why can’t you just give me the data?
  NCHS does not “own” the linked administrative data
  NCHS data confidentiality rules prohibit the release of potentially identifiable data – special considerations concerning the protection of linked data
  The RDC is the only option for access to restricted-use data files

83 83 Research Data Center
  The RDC is an organizational unit located at NCHS headquarters in Hyattsville, MD
  Provides access to restricted-use data files

84 84 Restricted Data Files Include…
  Linked administrative data
    Medicare
    SSA
  Restricted-use linked mortality files
  Detailed geographic data or contextual data
    Census tract & State/county level data
    EPA air pollution data

85 85 What to Expect?
  To gain access to NCHS restricted data, a user must:
    Submit a research proposal
    Sign an affidavit of confidentiality
    Promise not to use any method to attempt to identify respondents

86 86 What to Expect?
  How long for a proposal to be reviewed?
    Usually within 2 weeks, if proposing to use public-use survey data with the linked data
    Up to 1-2 months, if proposing to use non-public survey data with the linked data

87 87 Access Methods
  Once approved, three methods to access restricted data
    On-site - use local computing resources in the NCHS RDC, Hyattsville, MD
    Remote - submit programs electronically to be executed in the RDC with output returned by email
    Census RDC - access NCHS data using any one of the nine Census RDCs
  For all methods of access, restricted data files remain in RDC and output is inspected for disclosure violations

88 88 On-Site Access Method
  On-site Facilities
    Four user workstations (expandable as needed)
    Pentium IV computers
    Windows XP
    SAS, STATA, SUDAAN, LIMDEP, SPSS, Watcom Fortran 77, & HLM
    No removable media
    Secure printer
    Open only during normal working hours
  RDC staff constructs necessary data files, including merged user data

89 89 Remote Access Method
  RDC staff constructs necessary data files, including merged user data
  SAS programs only, including SAS-callable SUDAAN (certain procedures and functions not allowed)
  Both submitted programs and output undergo a programmed disclosure limitation review
  Ability to submit analytical computer programs via email from anywhere in the world with access available 24 hrs/day

90 90 Census RDC Access Method
  9 Census RDCs
    Los Angeles, Berkeley, Boston, Durham, Ann Arbor, Ithaca, NYC, Chicago, & DC
  Separate Census research proposal is not needed
  May have to follow additional security requirements at Census Bureau facilities

91 91 User Fees: Linked Data Access
  Type of Data Access          User Fees
  Guest Researcher (on site)   Minimum $250 fee per day for analytic file creation and $200 per day on-site user fee (2-day minimum; 10-day maximum).
  Remote Access                Minimum $250 fee per day for analytic file creation and $250 per month remote access fee.
  Census RDC Access            Minimum $250 fee per day for analytic file creation.
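
As a rough budgeting aid, the fee schedule on this slide can be turned into a small calculator. This is a sketch under stated assumptions: the function name and access-type labels are hypothetical, the amounts are the 2008 minimums quoted above (actual charges may be higher), and the 2-day/10-day limits are assumed to apply to on-site days.

```python
FILE_CREATION_PER_DAY = 250   # minimum analytic file creation fee, per day
ONSITE_PER_DAY = 200          # on-site user fee, per day
REMOTE_PER_MONTH = 250        # remote access fee, per month

def minimum_fee(access: str, file_creation_days: int,
                onsite_days: int = 0, remote_months: int = 0) -> int:
    """Return the minimum fee in dollars for one of the three access types."""
    if file_creation_days < 1:
        raise ValueError("at least one day of analytic file creation is assumed")
    fee = FILE_CREATION_PER_DAY * file_creation_days
    if access == "onsite":
        if not 2 <= onsite_days <= 10:
            raise ValueError("on-site visits are limited to 2-10 days")
        fee += ONSITE_PER_DAY * onsite_days
    elif access == "remote":
        fee += REMOTE_PER_MONTH * remote_months
    elif access != "census_rdc":
        raise ValueError(f"unknown access type: {access}")
    return fee

# Example: one day of file creation plus a two-day on-site visit.
print(minimum_fee("onsite", file_creation_days=1, onsite_days=2))
```

Because every fee on the slide is stated as a minimum, the result is best read as a lower bound on the cost of a project.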

92 92 Proposal Requirements
  Proposal is evaluated by review committee
  Review criteria
    Scientific and technical feasibility
    Availability of RDC resources
    Disclosure risk for restricted information
    The extent to which project is in accordance with the mission of NCHS
  Special note: NCHS does not try to determine if proposals are duplicative

93 93 Proposal Requirements: Helpful Tips
  Be clear about research and data requirements (helps to determine feasibility of project)
  Clearly identify the sample to be included in the analytic file
  Provide data dictionaries for both
    Public-use data
    Restricted-use data
  Provide examples of expected output

94 94 Visit the RDC at: http://www.cdc.gov/nchs/r&d/rdc.htm or email: rdca@cdc.gov

95 95 Where to get Help?
  RDC website contains:
    Proposal Checklist
    Sample Proposal
    List of available restricted data files
    Detail on Census RDC locations and contact information
    FAQs regarding proposal review process, on-site procedures, area information and contact information
  Email: rdca@cdc.gov

96 96 Questions?

