Research on US Federal Government Handling of Data Dr. Brand Niemann Director and Senior Enterprise Architect – Data Scientist Semantic Community

Presentation transcript:

Research on US Federal Government Handling of Data Dr. Brand Niemann Director and Senior Enterprise Architect – Data Scientist Semantic Community AOL Government Blogger January 21, (Password Protected)

Research Specification
Analysis of current open government initiatives and programs in the U.S.A./research on rules governing data and information disclosure:
– The research should cover the latest trends in regulations and operational rules governing the disclosure of federal agency and other government-held data and information, including the handling of highly sensitive data related to national security, in light of the Obama administration’s open government policy, and the latest information available on the operation of the federal cloud.
– Rules governing public access to data via DATA.gov.
NIEM content and implementation:
– Investigate the content and implementation of NIEM.
– Investigate and analyze the realities of current usage by U.S. government departments and agencies and overseas: What are the actual content and usage by U.S. federal institutions, state and local governments, foreign governments, and private-sector corporations in the case of DHS and NIEM?
– In particular, conduct an in-depth survey of current usage and its issues/challenges by the New York City Government, Pennsylvania State Government, and New York State Government. Good examples from any other state government would also be appreciated.

Big Picture
We all have:
– Content that we now call “Big Data”
– Technology that we now call “BYOX” (bring your own device, data, etc.)
Governments have data:
– Statistical (collected to answer questions)
– Open (just happens)
– Classified (national security and proprietary work)
Governments have concerns:
– Privacy (personal information)
– Security (need to know)
– “Mosaic effect” (aggregation can reveal a privacy or security breach)
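
The “mosaic effect” named on this slide can be made concrete with a minimal sketch. It assumes two hypothetical releases: a health table with names removed and a public roll with names but no health data. Every record, field name, and the `mosaic_join` helper are invented for illustration and do not come from the presentation. Neither release exposes a named diagnosis alone, but joining them on shared attributes can.

```python
# Hypothetical data: all records and field names are invented for illustration.
health_release = [
    {"zip": "20001", "birth_year": 1950, "sex": "F", "diagnosis": "diabetes"},
    {"zip": "20001", "birth_year": 1982, "sex": "M", "diagnosis": "asthma"},
    {"zip": "20002", "birth_year": 1950, "sex": "F", "diagnosis": "flu"},
]
public_roll = [
    {"name": "Resident A", "zip": "20001", "birth_year": 1950, "sex": "F"},
    {"name": "Resident B", "zip": "20001", "birth_year": 1982, "sex": "M"},
    {"name": "Resident C", "zip": "20003", "birth_year": 1975, "sex": "M"},
]

# Attributes that appear in both releases and can act as a join key.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def key(record):
    """Build the join key from the shared attributes."""
    return tuple(record[name] for name in QUASI_IDENTIFIERS)


def mosaic_join(anonymous, identified):
    """Pair each anonymous record with a name when its key matches exactly one person."""
    people_by_key = {}
    for person in identified:
        people_by_key.setdefault(key(person), []).append(person)

    matches = []
    for record in anonymous:
        candidates = people_by_key.get(key(record), [])
        if len(candidates) == 1:  # an unambiguous match re-identifies the record
            matches.append((candidates[0]["name"], record["diagnosis"]))
    return matches


reidentified = mosaic_join(health_release, public_roll)
for name, diagnosis in reidentified:
    print(name, diagnosis)
```

In this toy data two of the three "anonymous" health records become attributable to a named person, which is the kind of aggregation risk a release review would need to check for.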

Big Solution
Governments need:
– Chief Data Officers
– Teams of Data Scientists and Statisticians
– Return on Investment
Governments should:
– Work with the right data
– Work with the right people
– Work on the right projects

My Examples
Current:
– Japan Statistical Yearbook 2012
– CKAN Japan Open Government Data
– The US District of Columbia Data Catalog and 311 Message Service
Future (proposed):
– Traffic Monitoring with GPS on Cabs in Tokyo
– MetaTags for Classified Data Before Encryption
  Add facets to metadata that can be searched
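
The proposal to add searchable facets to metadata before encryption could look roughly like the sketch below. It is an illustration under stated assumptions, not the author's design: the `TaggedItem` type, the facet names, and the `search` helper are all hypothetical, and the `ciphertext` field is an opaque placeholder because no real encryption is performed. The point shown is only that a search can run over the facets without reading the protected content.

```python
from dataclasses import dataclass


@dataclass
class TaggedItem:
    item_id: str
    facets: dict       # searchable metadata, assigned before encryption
    ciphertext: bytes  # opaque placeholder; this sketch does not encrypt anything


def search(items, **wanted):
    """Return the ids of items whose facets match every requested facet value."""
    return [
        item.item_id
        for item in items
        if all(item.facets.get(name) == value for name, value in wanted.items())
    ]


# Hypothetical catalog: facet names and values are invented for illustration.
catalog = [
    TaggedItem("doc-1", {"level": "secret", "topic": "transport", "year": 2012}, b"<opaque>"),
    TaggedItem("doc-2", {"level": "secret", "topic": "health", "year": 2012}, b"<opaque>"),
    TaggedItem("doc-3", {"level": "confidential", "topic": "transport", "year": 2012}, b"<opaque>"),
]

hits = search(catalog, topic="transport", year=2012)
print(hits)
```

A real system would also have to decide which facets are themselves safe to expose, since facet values can leak information in combination.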

Research Specification Response 1
Analysis of current open government initiatives and programs in the U.S.A./research on rules governing data and information disclosure:
– The research should cover the latest trends in regulations and operational rules governing the disclosure of federal agency and other government-held data and information, including the handling of highly sensitive data related to national security, in light of the Obama administration’s open government policy, and the latest information available on the operation of the federal cloud.
GENERAL RESPONSE: al_Report and ee/Cloud_Computing_AND_Big_Data_Forum_and_Workshop_January_15-17_2013
– Rules governing public access to data via DATA.gov.
GENERAL RESPONSE: and

Research Specification Response 2
NIEM content and implementation:
– Investigate the content and implementation of NIEM.
– Investigate and analyze the realities of current usage by U.S. government departments and agencies and overseas: What are the actual content and usage by U.S. federal institutions, state and local governments, foreign governments, and private-sector corporations in the case of DHS and NIEM?
– In particular, conduct an in-depth survey of current usage and its issues/challenges by the New York City Government, Pennsylvania State Government, and New York State Government. Good examples from any other state government would also be appreciated.
GENERAL RESPONSE:
– ment_Report
– ual_Report
– nsible_Information_Sharing:_Engaging_Industry_to_Improve_Standards-Based_Acquisition_.26_Interoperability
– MY NOTE: I do not know of more states than New York and Pennsylvania. Oracle may be a larger and better implementer of NIEM than “NIEM” itself!

Research Specification Response
MY NOTE: This is password-protected and under review by IBT and METI.

Research Specification Response 4
Governments need:
– Chief Data Officers (IBT and METI)
– Teams of Data Scientists and Statisticians (Me to Start)
– Return on Investment (My Consulting Fee)
Governments should:
– Work with the right data (Statistical and Open)
– Work with the right people (Data Scientists)
– Work on the right projects (Pilot Dashboards)

My 5-Step Method
To illustrate (data science) and explain (data journalism), I like to follow these steps, like a recipe:
– Put the Best Content into a Knowledge Base (e.g. MindTouch*)
  The Japan Statistical Yearbook 2012
– Put the Knowledge Base into a Spreadsheet (Excel*)
  Linked Data to Subparts of the Knowledge Base
– Put the Spreadsheet into a Dashboard (Spotfire*)
  Data Integration and Interoperability Interface
– Put the Dashboard into a Semantic Model (Excel*)
  Data Dictionaries and Models
– Put the Semantic Model into Dynamic Case Management (Be Informed*)
  Structured Process for Updating Data in the Dashboard
* Examples of tools used.

To Get to 5-Stars With Open Data
Star | Definition | Example / Tool*
★ | Make your stuff available on the Web (whatever format) under an open license | This Story / MindTouch
★★ | Make it available as structured data (e.g., Excel instead of image scan of a table) | Spreadsheet / Excel
★★★ | Use non-proprietary formats (e.g., CSV instead of Excel) | Table / MindTouch and Spotfire
★★★★ | Use URIs to identify things, so that people can point at your stuff | Table of Contents / MindTouch and Spotfire
★★★★★ | Link your data to other data to provide context | Table / MindTouch and Spotfire
* Examples of tools used.
Source of Star and Definition:
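
Because each star in this scheme builds on the one before it, a dataset's rating can be read as the number of consecutive criteria it meets, starting from the first. The sketch below is one hypothetical way to express that. The function name and its boolean parameters are assumptions made for illustration and are not part of the presentation or of any tool it mentions.

```python
def open_data_stars(on_web_open_license, structured, non_proprietary,
                    uses_uris, linked_to_other_data):
    """Count how many of the five criteria are met in order, stopping at the first gap."""
    criteria = [
        on_web_open_license,   # 1 star: on the Web under an open license
        structured,            # 2 stars: structured data (e.g., Excel, not a scan)
        non_proprietary,       # 3 stars: non-proprietary format (e.g., CSV)
        uses_uris,             # 4 stars: URIs identify things
        linked_to_other_data,  # 5 stars: linked to other data for context
    ]
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars


# Hypothetical examples of publishing the same table in different ways.
scanned_pdf = open_data_stars(True, False, False, False, False)
excel_file = open_data_stars(True, True, False, False, False)
csv_file = open_data_stars(True, True, True, False, False)
linked_data = open_data_stars(True, True, True, True, True)
print(scanned_pdf, excel_file, csv_file, linked_data)
```

Stopping at the first unmet criterion reflects the cumulative reading of the table: a CSV file that is not published under an open license earns no stars in this sketch, even though it is structured and non-proprietary.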

Japan Statistical Yearbook

METI Open Data Dashboard-Spotfire

CKAN Japan Open Government Data

The US District of Columbia Data Catalog and 311 Message Service
MY NOTE: I did not find a Japanese city with a data catalog, so I used a US city with an excellent one that I had already done.

Conclusion
The Open Government Data/Linked Data initiatives in the US (Data.gov) and elsewhere started with the wrong data. I recommended that Data.gov start with the best US government data, from the Federal Statistical Agencies (e.g. US Census Bureau), and make it the standard for high-quality data and metadata rendered in the new linked open data way. Experience has borne this out: the statistical community calls Data.gov and similar efforts essentially ‘IT projects.’ The solution is to first render a country’s high-quality statistical data in the new way and then render the open data the same way as much as possible. I have done this for the US and Europe, and now for Japan.