U.S. Federal Government Handling of Data for Open Government Data in Japan Dr. Brand Niemann Director and Senior Enterprise Architect – Data Scientist.


1 U.S. Federal Government Handling of Data for Open Government Data in Japan Dr. Brand Niemann Director and Senior Enterprise Architect – Data Scientist Semantic Community http://semanticommunity.info/ AOL Government Blogger http://gov.aol.com/bloggers/brand-niemann/ February 12, 2013 http://semanticommunity.info/A_Japan_METI_Open_Data_Dashboard/Final_Report 1

2 Outline Data Science Team Semantic Community: Mission Statement for 2013 Why We Are Here Current US Government Semantic Web Strategy International Linked Open Data Strategy: Linked Open Data Cloud Data Our Semantic Web Strategy for Data: Simple Explanation My 5-Step Method To Get to 5-Stars With Open Data for a System of Systems Architecture Spotfire for Data Science Analytics A Japan METI Open Data Dashboard Summary: Building a Digital Government by Example 2

3 Data Science Team Dr. Brand Niemann, Director and Senior Data Scientist, Semantic Community Dr. Tom Rindflesch, Research Group Lead for Semantic Medline, National Library of Medicine Dr. Victor Pollara, Senior Principal Scientist, Noblis Dr. Eric Little, Director of Information Management, Orbis Technologies Mark Guiton, Director, Government Relations, Cray Inc. – Cray-YarcData announced semifinalists last month http://www.yarcdata.com/press-release-12-3-12.html 3

4 Semantic Community: Mission Statement for 2013 Help the Data Transparency Coalition help the 113th Congress with the reintroduction of the Data Act by Building the Federal Financial Information Network in the Cloud for the 113th Congress, January 4, Slides. Continue to work with Big Data Analytics (e.g. Recorded Future, Spotfire, etc.), Content Analytics and Knowledge Management (e.g. MindTouch), and Semantic Technologies (e.g. Be Informed, Semantic Insights, etc.) for data science and data journalism. Slides. Help start Open Government Data for Japan (and the US and Europe) with the Right Data (Statistical) with the Right People (Data Scientists) Working on the Right Business Problems (Return on Investment): January 21, Slides. Help the Federal Big Data Senior Steering Group with A Semantic Web Strategy for Big Data and to move From the Year of Big Data to the Year of the Data Scientist Working With Big Data, January 24, Slides. Help the ACT-IAC AMWG, C&T SIG, and ET-SIG with Big Data on Mobile Devices, Collaboration and Transformation, & Government Challenges With Big Data, January 16 and February 23, Slides. http://semanticommunity.info/#Welcome_to_Semantic_Community.info:_Community_Infrastructure_Sandbox_for_2013 4

5 Why We Are Here NIEM and Open Government Data Strategy Final Report – Abstract – Introduction – Open Data in Japan – My Stories – Recent Research – Results – Conclusions and Recommendations – Appendices Open Government Data In Japan January 2013 My Stories Recent Research February 12-14 2013 Meetings Upcoming Meeting Questions for NIEM and ODG Leaders 5 For details see: http://semanticommunity.info/A_Japan_METI_Open_Data_Dashboard/Final_Report

6 Current US Government Semantic Web Strategy Data.gov Advocates RDFa 1.1 Lite for Semantic Web Strategy. – See Comment From Owen Ambur on Next Slide. I believe there is a better way to handle this that I showed the W3C eGov Special Interest Group on January 21st and have recommended for the reintroduction of the Data Act to the 113th Congress. – Create a Semantic Index of Strong Relationships (SR) in RDF Format in a Spreadsheet. See next slide for example (spreadsheet and words) – Integrate That With Other Spreadsheets and Relational Databases in An Interoperability Interface (e.g. Dashboard) That Can Be Searched. Essentially: – Computer Scientists Use RD2RDF (James Hendler) – Data Scientists Use SR2Excel2RDF (Brand Niemann) 6

7 Comment From Owen Ambur OMB's official guidance to agencies on implementation of section 10 of the GPRA Modernization Act (GPRAMA) says they may use XML, JSON, spreadsheets or CSVs in order to meet the requirement to publish their strategic and performance plans and reports in machine-readable format... but not PDF or HTML -- at least not without "enhanced structural elements".[1] I couldn't help but chuckle at how [1] is a PDF. I get your point however, which I think reinforces mine, that there is no US federal policy that prefers RDFa 1.1 over HTML Microdata for publishing metadata in HTML. – [1] RDFa Lite 1.1, W3C Recommendation, June 7, 2012, Manu Sporny, editor, see http://www.w3.org/TR/rdfa-lite/ Source: Owen Ambur, December 18, 2012, W3C eGov Mailing List. 7

8 International Linked Open Data Strategy: Linked Open Data Cloud Data 8 http://semanticommunity.info/@api/deki/files/8824/=VIVO.xlsx My Question: Is it easy to add columns for who links to who? Answer: Not in a single table. SPARQL can't do cross-tabulation (Richard Cyganiak).
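The "who links to who" question on this slide amounts to a cross-tabulation of a link list. A minimal sketch of that pivot outside SPARQL, assuming a hypothetical list of (source, target) dataset pairs rather than the actual contents of VIVO.xlsx, might look like this:

```python
from collections import Counter

# Hypothetical (source, target) link pairs; not taken from VIVO.xlsx.
links = [
    ("DatasetA", "DatasetB"),
    ("DatasetA", "DatasetC"),
    ("DatasetB", "DatasetC"),
    ("DatasetA", "DatasetB"),
]

def cross_tabulate(pairs):
    """Return (sorted names, matrix) where matrix[i][j] counts links i -> j."""
    counts = Counter(pairs)
    names = sorted({name for pair in pairs for name in pair})
    matrix = [[counts[(src, dst)] for dst in names] for src in names]
    return names, matrix

names, matrix = cross_tabulate(links)
for name, row in zip(names, matrix):
    print(name, row)
```

Each row of the matrix is one source dataset and each column one target, which is the "columns for who links to who" layout the question asks about.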

9 International Linked Open Data: Comments to David Wood The Linked Open Data Cloud is not actually “linked data”. – RDF at Data.gov is not linked data. The analytical and statistical communities view Data.gov and Linked Open Data as “IT projects”. – Former Census Bureau Director Robert Groves. Conventional tools can do linked data and data integration. – Spotfire Information Designer, Informatica, Information Builders, etc. 9 http://manning.com/dwood/LinkedData_MEAP_ch1.pdf http://semanticommunity.info/AOL_Government/Exploiting_Linked_Data_with_BI_Tools

10 Our Semantic Web Strategy for Data: Simple Explanation One Table: – Two Columns Example: Column 1: Section and Column 2: URL Note: A Column 3: Description could be in the URL Example: See Next Slides – Three Columns: Example: Column 1: Subject, Column 2: Object, and Column 3: Predicate Note: This is the Semantic Web’s Linked Open Data Cloud as Linked Open Data for Network Analytics! Example: See Next Slides – Four Columns: Examples: Column 1: Subject, Column 2: Attribute, Column 3: From, and Column 4: To, or Column 1: City, Column 2: Country, Column 3: Longitude, and Column 4: Latitude Note: This is the format for Spotfire’s Network Analytics Module developed for the CIA Example: See Semantic Medline 10
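The three-column table described on this slide can be turned into RDF triples and into a network edge list with very little code. The sketch below is an illustration only: it assumes a hypothetical CSV with the slide's column order (Subject, Object, Predicate), and the example.org URIs are placeholders, not data from the presentation.

```python
import csv
import io

# Hypothetical three-column table in the slide's column order:
# Subject, Object, Predicate. The example.org URIs are placeholders.
TABLE = """Subject,Object,Predicate
http://example.org/dataset/A,http://example.org/dataset/B,http://example.org/vocab/linksTo
http://example.org/dataset/B,http://example.org/dataset/C,http://example.org/vocab/linksTo
"""

def table_to_ntriples(text):
    """Convert Subject/Object/Predicate rows into N-Triples lines."""
    rows = csv.DictReader(io.StringIO(text))
    return [
        f"<{row['Subject']}> <{row['Predicate']}> <{row['Object']}> ."
        for row in rows
    ]

def table_to_edges(text):
    """Return (subject, object) pairs, the edge list a network view needs."""
    rows = csv.DictReader(io.StringIO(text))
    return [(row["Subject"], row["Object"]) for row in rows]

triples = table_to_ntriples(TABLE)
edges = table_to_edges(TABLE)
print("\n".join(triples))
```

The same rows serve both purposes: the triples form is what a Semantic Web tool would load, and the edge list is what a network analytics view would draw.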

11 Knowledge Base in Spreadsheet http://semanticommunity.info/@api/deki/files/21577/METI2013.xlsx 11 MY NOTE: Several Apps and Statistical Data Has Lots of Excel Data That I Used.

12 Knowledge Base in Wiki 12 http://semanticommunity.info/A_Japan_METI_Open_Data_Dashboard/Final_Report

13 Alpha.Data.gov in Spreadsheet 13 http://semanticommunity.info/@api/deki/files/21577/METI2013.xlsx MY NOTE: Most Services and Not Data Downloads, But Uploads! See More Detail in Examples of Alpha.Data.gov Slide.

14 Alpha.Data.gov in Wiki 14 http://semanticommunity.info/A_Japan_METI_Open_Data_Dashboard/Final_Report/alpha.data.gov

15 Alpha.Data.gov 15 http://alpha.data.gov/ New Focus!

16 Examples of Alpha.Data.gov 16 http://semanticommunity.info/@api/deki/files/21577/METI2013.xlsx MY NOTE: Most Services and Not Data Downloads, But Uploads!

17 Our Semantic Web Strategy for Data: Spotfire Network Analytics 17 http://semanticommunity.info/AOL_Government/Social_Media_-_Six_Degrees_of_Separation_and_Now_Even_Less

18 My 5-Step Method So what I like to do to illustrate (data science) and explain (data journalism) is the following (like a recipe): – Put the Best Content into a Knowledge Base (e.g. MindTouch*) NASA Big Data – Put the Knowledge Base into a Spreadsheet (Excel*) Linked Data to Subparts of the Knowledge Base – Put the Spreadsheet into a Dashboard (Spotfire*) Data Integration and Interoperability Interface – Put the Dashboard into a Semantic Model (Excel*) Data Dictionaries and Models – Put the Semantic Model into Dynamic Case Management (Be Informed*) Structured Process for Updating Data in the Dashboard 18 * Examples of tools used.

19 To Get to 5-Stars With Open Data
Star: Definition (Example / Tool*)
– 1 star: Make your stuff available on the Web (whatever format) under an open license (This Story / MindTouch)
– 2 stars: Make it available as structured data (e.g., Excel instead of image scan of a table) (Spreadsheet / Excel)
– 3 stars: Use non-proprietary formats (e.g., CSV instead of Excel) (Table / MindTouch and Spotfire)
– 4 stars: Use URIs to identify things, so that people can point at your stuff (Table of Contents / MindTouch and Spotfire)
– 5 stars: Link your data to other data to provide context (Table / MindTouch and Spotfire)
* Examples of tools used. Source of Star and Definition: http://www.w3.org/DesignIssues/LinkedData.html

20 System of Systems Architecture 20 S Semantic Index of Linked Data (e.g. Excel) Dynamic Case Management (e.g. Be Informed) Data Science Library (e.g. Spotfire) Data Science Products (e.g. Spotfire)

21 Data Federation in Spotfire: In-Memory and In-Database Data In-Memory Data – When you are working with in-memory data tables (text files, Excel files, information links, etc.) you have access to all the functionality of Spotfire. You have the opportunity to use all columns as filters and perform any number of calculations. You can also use any of the tools within Spotfire to cluster data, calculate new columns, bin columns, make predictions etc. See Working With Large Data Volumes for some tips on how to improve the performance of an analysis with lots of data. In-Database Data – When a connection to an external source is set up, all calculations are done using the external system and not with the Spotfire data engine. This will allow you to work with data volumes too large to fit into primary memory and take advantage of the power of the external system. When working with external data connections, you access only the current selection of data and all aggregations and calculations are made in-database (in-db). 21
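The in-memory versus in-database distinction can be illustrated without Spotfire itself. In the rough sketch below, an in-memory SQLite database stands in for the external system, and the sales table and its values are hypothetical; the point is only where the aggregation runs.

```python
import sqlite3

# An in-memory SQLite database stands in for the external system here;
# the table and values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 10.0), ("East", 5.0), ("West", 7.5)],
)

def totals_in_memory(connection):
    """Pull every row into the client, then aggregate locally."""
    totals = {}
    for region, amount in connection.execute("SELECT region, amount FROM sales"):
        totals[region] = totals.get(region, 0.0) + amount
    return totals

def totals_in_database(connection):
    """Let the database aggregate; only the summary rows come back."""
    query = "SELECT region, SUM(amount) FROM sales GROUP BY region"
    return dict(connection.execute(query).fetchall())

print(totals_in_memory(conn))
print(totals_in_database(conn))
```

Both functions return the same totals, but the second transfers one row per region instead of one row per record, which is why the in-database route suits data too large for primary memory.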

22 Data Federation in Spotfire: Database Connections, Information Links, & Analytics Library 22 A database connection dialog is used to set up a connection to say a Teradata database, where you can analyze data from the database without bringing it into your analysis. An information link is a structured request for data which can be sent to the database. These specifications include one or more columns, and may include one or more filters. – Stated in plain English, an information link could be: "Fetch the Name, Address and Phone_number for employees that pass the filter High_Income." – Information links can also be used to limit what data to open in an analysis in a number of different ways. The library provides publishing capabilities for all of your analysis materials, so you can share data with your colleagues. The library can be used directly from Spotfire by anyone who has at least read privileges.
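The plain-English information link quoted on this slide maps naturally onto a column list plus a named filter. The following is a sketch of that idea in generic SQL, not Spotfire's own format: the employees table, its rows, and the income threshold behind High_Income are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical employee table; the income threshold behind "High_Income"
# is an assumption made for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (Name TEXT, Address TEXT, Phone_number TEXT, Income REAL)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [
        ("Ann", "1 Main St", "555-0100", 150000.0),
        ("Bob", "2 Oak Ave", "555-0101", 40000.0),
    ],
)

# Named filters map to a parameterized WHERE clause.
FILTERS = {"High_Income": ("Income >= ?", (100000.0,))}

def run_link(connection, columns, filter_name):
    """Fetch the requested columns for rows passing the named filter."""
    allowed = {"Name", "Address", "Phone_number", "Income"}
    if not set(columns) <= allowed:
        raise ValueError("unknown column requested")
    where, params = FILTERS[filter_name]
    query = f"SELECT {', '.join(columns)} FROM employees WHERE {where}"
    return connection.execute(query, params).fetchall()

rows = run_link(conn, ["Name", "Address", "Phone_number"], "High_Income")
print(rows)
```

Keeping the filter as a named, reusable object is what lets the same request also limit which data is opened in an analysis.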

23 Spotfire for Big Data Analytics: Microscope 23 http://semanticommunity.info/Emerging_Technology_SIG_Big_Data_Committee/Government_Challenges_With_Big_Data#Spotfire_Dashboard NASA GCMD: Gateway to Big Data NSF Big Data Awards: Follow the Work OSTP Harnessing The Power of Digital Data Report: Well-Defined URLs PCAST Designing a Digital Future Report: Interoperability Interface NITRD Dashboards: Live Demonstrations Four Clicks: See, Sort/Search, Download, & Share (iPad)

24 Data Science Analytics Library: Telescope & Library 24 https://silverspotfire.tibco.com/us/library#/users/bniemann/Public Live Links to Outside Data Sources Live Information Links Between Analytics

25 A Japan METI Open Data Dashboard: Knowledge Base 25 http://semanticommunity.info/A_Japan_METI_Open_Data_Dashboard MY NOTE: How It Was Built. MY NOTE: Complete Japan Statistical Yearbook 2012 In Knowledge Base, Spreadsheet, and Spotfire Dashboard!

26 A Japan METI Open Data Dashboard: Spreadsheet 26 http://semanticommunity.info/@api/deki/files/19936/METI2012.xlsx

27 A Japan METI Open Data Dashboard: Spotfire 27 https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?METIOpenDataDashboard-Spotfire

28 Summary Semantic Community has a NIEM and Open Government Data Strategy: – Open Source Platform for Creating Knowledge Bases: HTML 5 Well-defined URLs APIs – Machine-readable Semantic Web – Linked Data Formats: Subject Object Predicate – Data Science Analytics: Federation Visualizations and Statistics Network Graphs This Supports Building a Digital Government by Example: – See Next Slide 28

29 Building a Digital Government by Example 29 http://semanticommunity.info/AOL_Government/Building_a_Digital_Government

30 Q & A Contact Information: – Brand Niemann bniemann@cox.net – Tom Rindflesch trindflesch@mail.nih.gov – Victor Pollara victor.pollara@noblis.org – Mark Guiton mguiton@cray.com – Eric Little elittle@orbistechnologies.com 30

