Harnessing Health.Data.gov Data to Address Diabetes in the US Dr. Brand Niemann Director and Senior Data Scientist Semantic Community

Presentation transcript:

Harnessing Health.Data.gov Data to Address Diabetes in the US Dr. Brand Niemann Director and Senior Data Scientist Semantic Community AOL Government Blogger April 17,

Background 2
HealthData.gov and Health Datapalooza III Knowledge Base and Data Ecosystem:
– Two Published Stories, Two Spreadsheets, and Two Spotfire Dashboards.
– My Note: HealthData.gov 194 Data Sets in 2012 and 399 now.
Health Datapalooza IV Technology Development Track:
– Knowledge Graph, Metadata, RPI Watson, Bootcamp, and Linked Data. See Next Slide.
My Process:
– Harness Data for Diabetes Knowledge Base
– Data Ecosystem Spreadsheet
– Data Ecosystem Spotfire
My Results:
– Story
– Slides
– Spotfire Dashboard
– Research Notes

HealthData.gov and Health Datapalooza III Knowledge Base 3

HealthData.gov and Health Datapalooza III Spotfire Data Ecosystem 4

Health Datapalooza IV Technology Development Track 5
Open Health Knowledge Graphs:
– This session will describe healthdata.gov platform components, including new functionality that programmatically exposes tabular and graph-oriented data.
healthdata.gov Lifting Schemes:
– We will describe the ‘bottom up’ automation tools and techniques employed in the winning submission for the healthdata.gov Metadata Domain Challenge.
Open Government Data:
– We will present emerging solution standards and transitioning academic technologies, including innovative work conducted by the ‘Watson’ research group at Rensselaer Polytechnic Institute on using Watson as a ‘data advisor’.
Health Industry Bootcamp - A Real-World Crash Course:
– An interactive, games-based bootcamp designed to get participants up and running the same day with their own real-world portfolio covering how to use public data to create market value, how to navigate perverse incentives in the industry, and how to deliver public and social good.
Cooperation Without Coordination: Managing Distributed Clinical Trial Data:
– TBA
Linked Data – Structured Data on the Web:
– TBA

Vocab.Data.gov: Government Data Vocabulary 6

Health Data Platform Metadata Challenge
– Mirrored the CKAN instance to improve the CKAN metadata and RDF.
– Created three levels of metadata for datasets.
– Created a set of ontologies to link several datasets from HealthData.gov.

IBM Watson at RPI 8
What is Watson?:
– The underlying “DeepQA” architecture is designed to find the meaning behind a question posed in natural language and deliver a single, precise answer.
IBM’s Watson goes to school: A Q&A with RPI’s Jim Hendler:
– A version of the system similar to the one used on “Jeopardy!” will be housed at RPI for three years as part of a Shared University Research Award from IBM Research. The system at RPI will have 15 terabytes of hard disk storage and give 20 users access to the system simultaneously, making it, according to a release, “an innovation hub” for the campus.
– One thing we want to explore is how Watson can interact with social media, especially things such as “tweets” where the language is not as carefully constructed as it is in the documents Watson has used in the Jeopardy game.
– I run a group that does a lot of work with Open Government Data systems (like the US data.gov) and we’re excited about the possibility of using Watson to help researchers around the world find relevant government data and documents for their work.
– Our goal for the next few years is to gain an understanding of what having the new ways of bringing unstructured data and documents into our computational lives will be.
My Note: See Our Semantic Medline Work with New Cray Graph Computer.

Health.Data.gov 9 My Note: Promotes the Diabetes Challenge, But Does Not Provide Much Data For It!

Health.Data.gov: Search for Diabetes My Note: Found One Data Set and Downloaded Two Excel Files and Added Them to the Diabetes Ecosystem Spreadsheet. See Slide 18.

HealthData.gov Catalog Hub 11 My Note: 402 datasets instead of 399. My Note: Found Same State AHRQ Snapshots and CDC WONDER Births. See Next Slide.

HealthData.gov Catalog Hub: CDC WONDER Births 12

HealthData.tw.rpi.edu Catalog Hub: CDC WONDER Births 13
“We mirrored the CKAN instance using its API to our own instance at [...]. This allowed us to both improve the CKAN-based metadata, including adding Data Dictionaries and Technical Documentation as Resources, and to improve the RDF generated by CKAN.”
Source: Health Data Platform Metadata Challenge
Source: See Next Slide
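The mirroring step quoted above can be sketched roughly as follows. This is not the challenge team's code; it is a minimal illustration that assumes the source catalog exposes the standard CKAN action API (`package_list` and `package_show`). The function names `ckan_action` and `mirror_metadata`, and the injectable `fetch` parameter, are hypothetical and exist only for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def http_get(url):
    """Default fetcher: plain HTTP GET returning the body as text."""
    with urlopen(url) as resp:
        return resp.read().decode("utf-8")


def ckan_action(base_url, action, fetch=None, **params):
    """Call one CKAN action-API endpoint and return its 'result' payload."""
    fetch = fetch or http_get
    url = f"{base_url.rstrip('/')}/api/3/action/{action}"
    if params:
        url += "?" + urlencode(params)
    payload = json.loads(fetch(url))
    if not payload.get("success"):
        raise RuntimeError(f"CKAN action {action} failed: {payload.get('error')}")
    return payload["result"]


def mirror_metadata(base_url, fetch=None):
    """Return {dataset_name: metadata_dict} for every dataset in the catalog."""
    names = ckan_action(base_url, "package_list", fetch=fetch)
    return {
        name: ckan_action(base_url, "package_show", fetch=fetch, id=name)
        for name in names
    }
```

The returned dictionaries could then be edited (for example, to add a data dictionary as an extra resource) before being loaded into a separate CKAN instance; that loading step needs write access and is left out here.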

CDC WONDER: Natality Information Live Births 14 My Note: Data Description contains Maternal Risk Factors: Diabetes - Yes, No, Not Stated, Not Reported. My Note: A Data Access Agreement is Required.

CDC WONDER: Natality Data Live Births - Diabetes 15

CDC WONDER: Natality Data Live Births - Diabetes 16 My Note: Export to Text File And Remove Metadata and Import to Spreadsheet.
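The export-and-clean step in the note above could also be scripted instead of done by hand. The sketch below assumes the exported text file is tab-delimited, with the data rows followed by a line containing only `---` and then the metadata notes; check that assumption against an actual export before relying on it. The function names are hypothetical.

```python
import csv
import io


def strip_export_metadata(text):
    """Return (header, rows) from a tab-delimited export, dropping the footer."""
    data_lines = []
    for line in text.splitlines():
        # Assumption: a line holding only --- (optionally quoted) starts the
        # metadata footer, so everything from there on is discarded.
        if line.strip().strip('"') == "---":
            break
        if line.strip():
            data_lines.append(line)
    table = list(csv.reader(io.StringIO("\n".join(data_lines)), delimiter="\t"))
    if not table:
        return [], []
    return table[0], table[1:]


def to_csv(header, rows):
    """Serialize the cleaned table as CSV text ready for spreadsheet import."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return out.getvalue()
```

Writing the result of `to_csv` to a `.csv` file gives something a spreadsheet can open directly, without the trailing notes ending up as stray rows.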

Harness Health.Data.gov Data to Address Diabetes in the US Knowledge Base 17 My Note: Did not find CAHMI! My Note: Only found one!

Diabetes Data Ecosystem Spreadsheet 18

NHQR State Snapshots

AHRQ State Snapshots Conclusion Getting started on quality improvement is not an easy task. One strategy a State may find helpful is to identify other States with populations similar to those targeted for a quality improvement effort. For example, a State seeking to improve rates of pneumonia vaccination for people discharged from hospitals may want to model its efforts on those of a State that has previously implemented an improvement program in this area and demonstrated success. In many cases, the greatest value in comparison may lie in identifying States that have started from relatively low performance and made incremental improvements. The State with the greatest improvements may have the most to contribute in demonstrating to other States how to encourage delivery system change that improves quality of care. 20

AHRQ Quality of Care for Diabetes by Region and State for by Conditions 21

CDC WONDER Births Natality Diabetes 22

Diabetes Data Ecosystem Spotfire 23 My Note: Can See All the Data Sets and Their Data Elements To Do Joins, Mappings, and Rule-Driven Visualizations.
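For readers without Spotfire, the kind of join mentioned in the note above can be illustrated in a few lines of Python. This is only a sketch of an inner join on a shared column; the `state` key and the other column names used in the usage note are placeholders, not fields taken from the actual data sets.

```python
def inner_join(left, right, key):
    """Join two lists of dict rows on a shared key column (inner join)."""
    # Index the right-hand table by key so each left row is matched quickly.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    joined = []
    for lrow in left:
        for rrow in index.get(lrow[key], []):
            merged = dict(rrow)
            merged.update(lrow)  # on a column-name clash, the left value wins
            joined.append(merged)
    return joined
```

Calling `inner_join(births_rows, quality_rows, "state")` on two tables that share a state column would give one combined row per matching state, which is the shape needed for a side-by-side visualization.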

Conclusions and Recommendations 24
– A Health.Data.gov search for “diabetes” gives only one data set.
– A search of the HealthData.gov Catalog Hub gives two data sets.
– The Health Datapalooza IV Technology Development Track objectives are shown in this work.
– I prefer both human-readable and machine-readable metadata instead of just the latter, which is what I find at the HealthData.gov Catalog Hub.
– Next is First Lady Michelle Obama on Exercise and Dr. Amen on Natural Supplements Data in Preventing and Treating Diabetes.