Harnessing Health.Data.gov Data to Address Diabetes in the US Dr. Brand Niemann Director and Senior Data Scientist Semantic Community


1 Harnessing Health.Data.gov Data to Address Diabetes in the US Dr. Brand Niemann Director and Senior Data Scientist Semantic Community http://semanticommunity.info/ AOL Government Blogger http://gov.aol.com/bloggers/brand-niemann/ April 17, 2013 http://semanticommunity.info/Health_Datapalooza_IV#Health.Data.gov 1

2 Background HealthData.gov and Health Datapalooza III Knowledge Base and Data Ecosystem: – Two Published Stories, Two Spreadsheets, and Two Spotfire Dashboards. My Note: HealthData.gov 194 Data Sets in 2012 and 399 now in 2013. Health Datapalooza IV Technology Development Track: – Knowledge Graph, Metadata, RPI Watson, Bootcamp, and Linked Data. See Next Slide My Process: – Harness Data for Diabetes Knowledge Base – Data Ecosystem Spreadsheet – Data Ecosystem Spotfire My Results: – Story – Slides – Spotfire Dashboard – Research Notes 2

3 HealthData.gov and Health Datapalooza III Knowledge Base 3 http://semanticommunity.info/HealthData.gov

4 HealthData.gov and Health Datapalooza III Spotfire Data Ecosystem 4 https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?HealthData.gov-Spotfire

5 Health Datapalooza IV Technology Development Track Open Health Knowledge Graphs: – This session will describe healthdata.gov platform components, including new functionality that programmatically exposes tabular and graph-oriented data. healthdata.gov Lifting Schemes: – We will describe the ‘bottom up’ automation tools and techniques employed in the winning submission for the healthdata.gov Metadata Domain Challenge. Open Government Data: – We will present emerging solution standards and transitioning academic technologies, including innovative work conducted by the ‘Watson’ research group at Rensselaer Polytechnic Institute on using Watson as a ‘data advisor’. Health Industry Bootcamp - A Real-World Crash Course: – An interactive, games-based bootcamp designed to get participants up and running the same day with their own real-world portfolio covering how to use public data to create market value, how to navigate perverse incentives in the industry, and how to deliver public and social good. Cooperation Without Coordination: Managing Distributed Clinical Trial Data: – TBA. See http://health.data.gov/cqld/ and http://reference.data.gov/cqld/about.html Linked Data – Structured Data on the Web: – TBA. See http://sw.appliedinformaticsinc.com/fct/facet_doc.html 5 http://healthdatapalooza.org/agenda/tech-development-track/

6 Vocab.Data.gov: Government Data Vocabulary 6 http://vocab.data.gov/gd

7 Health Data Platform Metadata Challenge 7 http://www.health2con.com/devchallenge/health-data-platform-metadata-challenge/ http://www.healthdata.gov/blog/domain-challenge-1-metadata Mirrored http://hub.healthdata.gov to improve the CKAN-metadata and RDF. Created three levels of metadata for http://healthdata.gov datasets. Created a set of ontologies to link several datasets from HealthData.gov.

8 IBM Watson at RPI What is Watson?: – The underlying “DeepQA” architecture is designed to find the meaning behind a question posed in natural language and deliver a single, precise answer. IBM’s Watson goes to school: A Q&A with RPI’s Jim Hendler: – A version of the system similar to the one used on “Jeopardy!” will be housed at RPI for three years as part of a Shared University Research Award from IBM Research. The system at RPI will have 15 terabytes of hard disk storage and give 20 users access to the system simultaneously, making it, according to a release, “an innovation hub” for the campus. – One thing we want to explore is how Watson can interact with social media, especially things such as “tweets” where the language is not as carefully constructed as it is in the documents Watson has used in the Jeopardy game. – I run a group that does a lot of work with Open Government Data systems (like the US data.gov) and we’re excited about the possibility of using Watson to help researchers around the world find relevant government data and documents for their work. – Our goal for the next few years is to gain an understanding of what having the new ways of bringing unstructured data and documents into our computational lives will be. 8 http://watson.rpi.edu/ My Note: See Our Semantic Medline Work with New Cray Graph Computer.

9 Health.Data.gov 9 http://www.healthdata.gov/ My Note: Promotes the Diabetes Challenge, But Does Not Provide Much Data For It!

10 Health.Data.gov: Search for Diabetes 10 http://www.healthdata.gov/dataset/search/diabetes http://statesnapshots.ahrq.gov/snaps09/allStatesallMeasures.jsp?menuId=63&state= My Note: Found One Data Set and Downloaded Two Excel Files and Added Them to the Diabetes Ecosystem Spreadsheet. See Slide 18.

11 HealthData.gov Catalog Hub 11 http://hub.healthdata.gov/ My Note: 402 datasets instead of 399. My Note: Found Same State AHRQ Snapshots and CDC WONDER Births. See Next Slide.

12 HealthData.gov Catalog Hub: CDC WONDER Births 12 http://hub.healthdata.gov/dataset/wonder-births

13 HealthData.tw.rpi.edu Catalog Hub: CDC WONDER Births 13 http://healthdata.tw.rpi.edu/hub/dataset/wonder-births-1 “We mirrored the http://hub.healthdata.gov CKAN instance using its API to our own instance at http://healthdata.tw.rpi.edu/hub. This allowed us to both improve the CKAN-based metadata, including adding Data Dictionaries and Technical Documentation as Resources, and to improve the RDF generated by CKAN.” Source: Health Data Platform Metadata Challenge Source: See Next Slide
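
The read side of mirroring a CKAN catalog through its API, as quoted on slide 13, might look roughly like the sketch below. It is not the RPI team's actual tooling, which the slides do not describe. It assumes the source instance exposes the CKAN Action API (version 3) with the `package_list` and `package_show` actions; the 2013 hub may have offered only an older API version. The helper names `action_url`, `call_action`, and `harvest` are hypothetical, and writing the harvested records into the mirror (which needs an API key on the target instance) is left out.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

SOURCE = "http://hub.healthdata.gov"  # source CKAN instance named on the slide


def action_url(base, action, **params):
    """Build a CKAN Action API URL such as <base>/api/3/action/package_show?id=..."""
    url = f"{base.rstrip('/')}/api/3/action/{action}"
    return f"{url}?{urlencode(params)}" if params else url


def call_action(base, action, fetch=None, **params):
    """Call one action and return its 'result', raising if CKAN reports failure."""
    # 'fetch' is a hypothetical hook so the HTTP call can be replaced in tests.
    fetch = fetch or (lambda u: urlopen(u).read().decode("utf-8"))
    payload = json.loads(fetch(action_url(base, action, **params)))
    if not payload.get("success"):
        raise RuntimeError(f"CKAN action {action} failed: {payload.get('error')}")
    return payload["result"]


def harvest(base, fetch=None):
    """Yield the full metadata record of every dataset on the source instance."""
    for name in call_action(base, "package_list", fetch=fetch):
        yield call_action(base, "package_show", fetch=fetch, id=name)
```

Each harvested record is a plain dictionary, so resources such as data dictionaries or technical documentation could be added to it before it is written to the mirror.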

14 CDC WONDER: Natality Information Live Births 14 http://wonder.cdc.gov/natality.html My Note: Data Description contains Maternal Risk Factors: Diabetes - Yes, No, Not Stated, Not Reported. My Note: A Data Access Agreement is Required.

15 CDC WONDER: Natality Data Live Births - Diabetes 15 http://wonder.cdc.gov/controller/datarequest/D66;jsessionid=A7C4A365FB2F877955A61D7BF9C5EC5C

16 CDC WONDER: Natality Data Live Births - Diabetes 16 http://wonder.cdc.gov/controller/datarequest/D66;jsessionid=A7C4A365FB2F877955A61D7BF9C5EC5C My Note: Export to Text File And Remove Metadata and Import to Spreadsheet.
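
The export, clean, and import step in the note on slide 16 can be sketched as follows. This is a minimal illustration, not the procedure actually used for the spreadsheet. It assumes the exported text file is tab-delimited with quoted fields and that the descriptive metadata begins at a line containing only "---"; both are assumptions about the export layout and should be checked against a real file. The function names are hypothetical.

```python
import csv
import io


def strip_wonder_footer(text, separator='"---"'):
    """Return only the non-blank lines before the first separator line (assumed footer start)."""
    kept = []
    for line in text.splitlines():
        if line.strip() == separator:
            break
        if line.strip():
            kept.append(line)
    return "\n".join(kept)


def read_wonder_export(text):
    """Parse the cleaned, tab-delimited export into a list of row dicts."""
    reader = csv.DictReader(io.StringIO(strip_wonder_footer(text)), delimiter="\t")
    return list(reader)


def to_csv(rows):
    """Serialize the rows as CSV text that a spreadsheet program can open."""
    if not rows:
        return ""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

Reading the saved export as a string and passing it through `read_wonder_export` and then `to_csv` gives a file with one header row and no trailing notes, ready to import into a spreadsheet.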

17 Harness Health.Data.gov Data to Address Diabetes in the US Knowledge Base 17 http://semanticommunity.info/Health_Datapalooza_IV#Health.Data.gov My Note: Did not find CAHMI! My Note: Only found one!

18 Diabetes Data Ecosystem Spreadsheet 18 http://semanticommunity.info/@api/deki/files/23811/Diabetes.xlsx

19 NHQR State Snapshots 2009 19 https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?AHRQFocusonDiabetes-Spotfire.dxp

20 AHRQ State Snapshots Conclusion Getting started on quality improvement is not an easy task. One strategy a State may find helpful is to identify other States with populations similar to those targeted for a quality improvement effort. For example, a State seeking to improve rates of pneumonia vaccination for people discharged from hospitals may want to model its efforts on those of a State that has previously implemented an improvement program in this area and demonstrated success. In many cases, the greatest value in comparison may lie in identifying States that have started from relatively low performance and made incremental improvements. The State with the greatest improvements may have the most to contribute in demonstrating to other States how to encourage delivery system change that improves quality of care. 20 http://statesnapshots.ahrq.gov/snaps09/interpretation.jsp?menuId=67&state=AL#conclusion

21 AHRQ Quality of Care for Diabetes by Region and State for 2005-2006 by Conditions 21 https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?AHRQFocusonDiabetes-Spotfire.dxp

22 CDC WONDER Births Natality Diabetes 22 https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?AHRQFocusonDiabetes-Spotfire.dxp

23 Diabetes Data Ecosystem Spotfire 23 https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?AHRQFocusonDiabetes-Spotfire.dxp My Note: Can See All the Data Sets and Their Data Elements To Do Joins, Mappings, and Rule-Driven Visualizations.
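
The kind of join mentioned in the note on slide 23 is done interactively in Spotfire. As a rough illustration of the idea only, the sketch below joins two tables of row dictionaries on a shared column. The function name `inner_join` and any column names passed to it (for example a state column shared by the AHRQ and CDC WONDER tables) are hypothetical and are not taken from the spreadsheet.

```python
def inner_join(left, right, key):
    """Join two lists of row dicts on a shared key column, keeping matching rows only."""
    # Index the right-hand table by key so each left row is matched in one lookup.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    joined = []
    for row in left:
        for match in index.get(row[key], []):
            merged = dict(row)
            # Columns from the right table other than the key are added as-is;
            # a column name present in both tables takes the right-hand value.
            merged.update({k: v for k, v in match.items() if k != key})
            joined.append(merged)
    return joined
```

Rows whose key value appears in only one of the two tables are dropped, which makes gaps in coverage between data sets easy to spot by comparing row counts before and after the join.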

24 Conclusions and Recommendations A Health.Data.gov search for “diabetes” gives only one data set. A search of the HealthData.gov Catalog Hub gives two data sets. The Health Datapalooza IV Technology Development Track objectives are shown in this work. I prefer both human-readable and machine-readable metadata instead of just the latter, which is what I find at the HealthData.gov Catalog Hub. Next is First Lady Michelle Obama on Exercise and Dr. Amen on Natural Supplements Data in Preventing and Treating Diabetes. 24

