Making Large Data Sets Work for You: Advantages and Challenges. Lesley H Curtis, Soko Setoguchi, Bradley G Hammill.


1 Making Large Data Sets Work for You: Advantages and Challenges
Lesley H Curtis, Soko Setoguchi, Bradley G Hammill

2 Presenter disclosure information
Lesley H Curtis
Large Data Sets: An Overview
FINANCIAL DISCLOSURE: None
UNLABELED/UNAPPROVED USES DISCLOSURE: None

3 Agenda
- Large Data Sets: An Overview
- Prescription Drug Data: Advantages, Availability, and Access
- Linking Large Data Sets: Why, How, and What Not to Do
- Practical Examples

4 Which large data sets?
- Relevant for cardiovascular research
- Available to researchers
- Potential for linkage
- Claims data (federal and commercial)
- Inpatient registries
- Longitudinal cohort studies

5 Claims data
- Derived from payment of bills
- Payor-centric
- Examples
  - Medicare
  - Medicaid
  - Thomson-Reuters
  - United Health Care

6 Medicare claims data
- Inpatient services (Part A)
- Outpatient services (Part B)
- Physician services (Carrier, Part B)
- Durable medical equipment
- Home health care
- Skilled nursing facilities
- Hospice

7 Medicare claims data elements
- What data are available
  - Demographics
  - Service dates
  - Diagnoses
  - Procedures
  - Hospital / Physician
- What data are not available
  - Physiological measures
  - Test results
  - Times of admission, procedures, etc.
  - Medications

8 Medicare claims data coverage
- National scope
- What patients will be represented?
  - Patients enrolled in traditional (fee-for-service) Medicare
- What patients will not be represented?
  - Patients receiving care through the Veterans Health Administration
  - Patients enrolled in Medicare managed care plans

9 Medicare claims data quality
- Main point
  - Reliability of specific claims data elements depends on importance for reimbursement
- Good data on…
  - Major procedures
  - Hospitalizations
  - Mortality
- Inconsistent data on…
  - Comorbidities and illness severity
  - Procedures with low reimbursement rates

10 Acquiring CMS claims data
- All requests begin with ResDAC (www.resdac.umn.edu)
- Cost
  - $15K per year of inpatient+denominator data
  - $20K per year of 5% data across all files
  - $30K+ per year of data for custom requests
- Detailed approval process
  - Prepare request packet for ResDAC review (4-6 weeks)
  - Review by CMS privacy board (4 weeks)
  - Request processed by contractor (6-8 weeks)
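
For planning purposes, the figures on this slide can be turned into a rough lead-time and budget estimate. The sketch below is illustrative only: it assumes the three approval steps run strictly one after another (the slide does not say whether they can overlap), and the names `APPROVAL_STEPS`, `PRICE_PER_YEAR_K`, `total_lead_time`, and `estimate_cost_k` are hypothetical. Custom requests are left out because the slide gives only a lower bound ($30K+).

```python
# Figures are copied from the slide; the back-to-back assumption is ours.
APPROVAL_STEPS = [
    # (step, minimum weeks, maximum weeks)
    ("ResDAC review of request packet", 4, 6),
    ("CMS privacy board review", 4, 4),
    ("Contractor processing", 6, 8),
]

# Price per year of data, in thousands of dollars (slide figures).
PRICE_PER_YEAR_K = {
    "inpatient_plus_denominator": 15,
    "five_percent_all_files": 20,
}


def total_lead_time(steps):
    """Return (min_weeks, max_weeks), assuming steps run one after another."""
    low = sum(min_weeks for _, min_weeks, _ in steps)
    high = sum(max_weeks for _, _, max_weeks in steps)
    return low, high


def estimate_cost_k(product, years):
    """Return the estimated cost in thousands of dollars for `years` of data."""
    return PRICE_PER_YEAR_K[product] * years


low, high = total_lead_time(APPROVAL_STEPS)
print(f"Lead time: {low}-{high} weeks")
cost = estimate_cost_k("inpatient_plus_denominator", 5)
print(f"Five years of inpatient+denominator data: ${cost}K")
```

Under these assumptions the approval process alone takes 14 to 18 weeks, which is worth building into any project timeline before analysis can start.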

11 Preparing for CMS claims data
- Make space
  - 16 GB for 100% denominator and inpatient files
  - 57 GB for 5% denominator, inpatient, outpatient, and carrier* files
- Manage expectations
  - Time to process files
  - Transforming raw claims into usable information
    - Coding algorithms
    - Coding changes
  - Learning curve
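
To give a sense of what a "coding algorithm" involves, here is a minimal sketch that flags heart failure hospitalizations from raw claim records using the ICD-9-CM diagnosis category 428 (heart failure). It is not the presenters' method. The `Claim` record, its field names, and the sample data are hypothetical, and the sketch assumes diagnosis codes are stored without the decimal point. Real algorithms usually also consider secondary diagnoses, claim type, and changes in coding over time.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    """Hypothetical, simplified inpatient claim record."""
    patient_id: str
    admit_date: str   # ISO date string
    primary_dx: str   # ICD-9-CM code without the decimal point, e.g. "4280"


# ICD-9-CM category 428 covers heart failure (428.0, 428.1, 428.2x, ...).
HF_PREFIXES = ("428",)


def is_hf_hospitalization(claim):
    """Return True if the primary diagnosis falls in the heart failure category."""
    return claim.primary_dx.startswith(HF_PREFIXES)


# Made-up records for illustration only.
claims = [
    Claim("A", "2004-03-01", "4280"),    # heart failure
    Claim("B", "2004-03-02", "41071"),   # acute myocardial infarction
    Claim("C", "2004-03-05", "42823"),   # heart failure
    Claim("D", "2004-03-09", "486"),     # pneumonia
]

hf_claims = [c for c in claims if is_hf_hospitalization(c)]
print([c.patient_id for c in hf_claims])
```

Keeping the code list in one named constant makes it easier to revise the definition when coding conventions change.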

12 The Learning Curve

13 Claims data
- Derived from payment of bills
- Payor-centric
- Examples
  - Medicare
  - Medicaid
  - Thomson-Reuters
  - United Health Care

14 Commercial claims data elements
- What data are typically available
  - Demographics
  - Service dates
  - Diagnoses
  - Procedures
  - Medications
  - Hospital / Physician
- What data may not be available
  - Physiological measures
  - Test results

15 Commercial claims data coverage
- National scope
- What patients will be represented?
  - Individuals who are commercially insured
- What patients will not be represented?
  - The uninsured
  - Medicare managed care?

16 Commercial claims data quality
- Similar to Medicare claims data
  - Reliability of specific claims data elements depends on importance for reimbursement
- Good data on…
  - Major procedures
  - Hospitalizations
- Inconsistent data on…
  - Mortality
  - Comorbidities and illness severity
  - Procedures with low reimbursement rates

17 Preparing for commercial claims data
- Cost
  - $25-70K, depending on size and scope of data request
- Size
  - 100 GB per year of data
  - Analysis sample sizes will differ from advertised sample sizes
- Manage expectations!

18 Registry data
- Observational cohorts of patients undergoing specific treatments or having specific conditions
- Purpose may be to assess…
  - Quality of care
  - Provider performance
  - Treatment safety/effectiveness
- Of interest today are hospital-based registries

19 OPTIMIZE-HF registry
- Hospital-based quality improvement program and internet-based registry for heart failure
- 2002-2005: 50,000 patients; >250 hospitals
- Transitioned to GWTG-HF in 2005

20 Registry data coverage
- Only patients treated at participating hospitals will be included
  (+) All patients at these hospitals included regardless of payor
  (-) Participating hospitals may not be representative of hospitals nationwide

% of group in selected states

State          US Elderly   Medicare FFS   OPTIMIZE-HF
California     10.1%        7.7%           13.8%
Florida        7.4%         7.0%           8.7%
Michigan       3.4%         4.0%           9.5%
New York       6.6%         6.1%           3.5%
Pennsylvania   5.2%         4.4%           6.7%
Texas          6.0%         6.5%           5.4%
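
One way to read the state shares on this slide is as a representation ratio: the registry's share of patients in a state divided by that state's share of the Medicare fee-for-service population. The sketch below computes that ratio from the slide's figures. The choice of Medicare FFS as the reference column and the names `STATE_SHARES` and `representation_ratio` are assumptions made for illustration, not something the presenters specify.

```python
# Percent of each group residing in the state, copied from the slide:
# (US elderly, Medicare FFS, OPTIMIZE-HF)
STATE_SHARES = {
    "California": (10.1, 7.7, 13.8),
    "Florida": (7.4, 7.0, 8.7),
    "Michigan": (3.4, 4.0, 9.5),
    "New York": (6.6, 6.1, 3.5),
    "Pennsylvania": (5.2, 4.4, 6.7),
    "Texas": (6.0, 6.5, 5.4),
}


def representation_ratio(registry_pct, reference_pct):
    """Ratio > 1 means the state is over-represented in the registry."""
    return registry_pct / reference_pct


ratios = {
    state: representation_ratio(registry, ffs)
    for state, (_us_elderly, ffs, registry) in STATE_SHARES.items()
}

# Print states from most over-represented to most under-represented.
for state, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    label = "over" if ratio > 1 else "under"
    print(f"{state:<13} {ratio:.2f} ({label}-represented)")
```

On these figures Michigan is the most over-represented state (9.5% of the registry versus 4.0% of Medicare FFS) and New York the most under-represented (3.5% versus 6.1%), which illustrates the slide's caution about generalizing from participating hospitals.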

21 Registry data quality
- Good data on…
  - Many of the things not included in Medicare data: labs, medications, treatment timing, process measures, contraindications (if collected)
- Inconsistent data on…
  - Post-hospitalization follow-up care
  - Outcomes, particularly long-term

22 Accessing registry data
- Networking and partnering
  - Many require that analyses be performed at selected analytical centers, which may have long queues
- Approval process via steering or executive committee

23 NHLBI longitudinal cohort studies
- Atherosclerosis Risk in Communities Study (ARIC)
- Cardiovascular Health Study (CHS)
- Framingham Heart Study
- Jackson Heart Study
- Multi-Ethnic Study of Atherosclerosis
- Women’s Health Initiative

24 Cardiovascular Health Study (CHS)
- Prospective, observational study of CV disease in the elderly (Washington Co., Maryland; Forsyth Co., NC; Sacramento Co., CA; and Pittsburgh, PA)
- Baseline exams occurred in 1989-90
- Minority cohort added at Year 5
- Annual exams, with ‘major’ exams occurring at year 5 (1992-93) and year 9 (1996-97). Last exam was year 11 (1998-99).
- 5,201 participants at baseline; 687 additional minority participants, for a total of 5,888

25 Cardiovascular Health Study data elements
- What data are available
  - Demographics
  - Medical, personal history
  - Physiological measures, test results
  - QOL, depression
  - Cognitive function
- What data are not available
  - Service dates
  - Procedures
  - Hospital/physician

26 Cardiovascular Health Study data quality
- Main point
  - Data collected are of high quality
- Good data regarding…
  - Cardiovascular risk factors
  - Cardiovascular endpoints
  - General health
- Limited data on…
  - Non-cardiovascular risk factors
  - Non-cardiovascular endpoints

27 Accessing NHLBI cohort studies
- Via the NHLBI data repository
  - HIPAA identifiers, geography removed
- Via Coordinating Center for identifiable data
- Size
  - 20 MB per year of data

28 NHLBI-Medicare linked data sets
- CMS linked with…
  - CHS (1991-2004, 2005-2009 pending)
  - Framingham (2000-2009 pending)
  - Jackson Heart Study (2000-2009 pending)
  - Multi-Ethnic Study of Atherosclerosis (2000-2009 pending)
  - Atherosclerosis Risk in Communities
  - Women’s Health Initiative

29 Conclusion
- Large data sets abound
- Do yourself a favor…manage expectations!

30 Contact Information
Lesley Curtis
Lesley.curtis@duke.edu
