Data Science for NIST Big Data Framework Dr. Brand Niemann Director and Senior Data Scientist/Data Journalist Semantic Community


1 Data Science for NIST Big Data Framework. Dr. Brand Niemann, Director and Senior Data Scientist/Data Journalist, Semantic Community. http://semanticommunity.info/ http://www.meetup.com/Federal-Big-Data-Working-Group/ http://www.meetup.com/Virginia-Big-Data-Meetup http://www.meetup.com/Northern-Virginia-Semantic-Web-Meetup/ http://semanticommunity.info/Data_Science/Federal_Big_Data_Working_Group_Meetup May 21, 2015

2 Introduction NIST is seeking feedback on the Version 1 draft of the NIST Big Data Interoperability Framework. Once public comments are received, compiled, and addressed by the NBD-PWG, and reviewed and approved by the NIST internal editorial board, Version 1 of Volumes 1 through 7 will be published as final. Three versions are planned, with Versions 2 and 3 building on the first. My Comment: I complimented the NIST team on its excellent work over a long period of time and told them that I had asked the 700+ members of our Federal Big Data Working Group Meetup to review the DRAFT documents and provide comments. I said I think this will take us longer than the May 21st deadline and that we plan to hold a Meetup on this in July. We are looking especially for the 6 Use Cases that have data sets, according to a recent email we saw from the NIST Big Data Workgroup participants.

3 Federal Big Data Working Group Meetup http://www.meetup.com/Federal-Big-Data-Working-Group/events/222458479/

4 NIST Requests Comments on NIST Big Data Interoperability Framework http://bigdatawg.nist.gov/V1_output_docs.php

5 NIST Big Data Interoperability Framework: Seven Volumes The NIST Big Data Interoperability Framework consists of seven volumes, each of which addresses a specific key topic, resulting from the work of the NBD-PWG. The seven volumes are as follows: – Volume 1, Definitions – Volume 2, Taxonomies – Volume 3, Use Cases and General Requirements – Volume 4, Security and Privacy – Volume 5, Architectures White Paper Survey – Volume 6, Reference Architecture – Volume 7, Standards Roadmap My Comment: Volumes 1 and 2 support the Knowledge Base, Volume 3 supports the Data Science Data Publication, and Volumes 1-7 all support the Massive Open Online Course (MOOC).
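The volume-to-deliverable mapping in the comment above can be written down as a small lookup table. This is only an illustrative sketch: the volume titles and the mapping come from the slide, while the names `VOLUMES`, `DELIVERABLES`, and `deliverables_for` are hypothetical and not part of any NIST or Meetup tooling.

```python
from typing import List

# Volume numbers and titles as listed on the slide.
VOLUMES = {
    1: "Definitions",
    2: "Taxonomies",
    3: "Use Cases and General Requirements",
    4: "Security and Privacy",
    5: "Architectures White Paper Survey",
    6: "Reference Architecture",
    7: "Standards Roadmap",
}

# Which volumes feed which deliverable, per the slide's comment.
# The dictionary layout itself is an assumption made for illustration.
DELIVERABLES = {
    "Knowledge Base": {1, 2},
    "Data Science Data Publication": {3},
    "MOOC": set(VOLUMES),  # Volumes 1-7 all support the MOOC
}


def deliverables_for(volume: int) -> List[str]:
    """Return the deliverables that a given volume supports."""
    if volume not in VOLUMES:
        raise KeyError(f"unknown volume: {volume}")
    return [name for name, vols in DELIVERABLES.items() if volume in vols]


if __name__ == "__main__":
    for number, title in VOLUMES.items():
        print(f"Volume {number}, {title}: {', '.join(deliverables_for(number))}")
```

Running the script lists each volume with the deliverables it feeds, which makes it easy to see that only Volumes 1 through 3 contribute to more than the MOOC.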

6 NIST Big Data Interoperability Framework: Three Stages The NIST Big Data Interoperability Framework will be released in three versions, which correspond to the three stages of the NBD-PWG work. The three stages aim to achieve the following: – Stage 1: Identify the high-level Big Data reference architecture key components, which are technology-, infrastructure-, and vendor-agnostic. – Stage 2: Define general interfaces between the NIST Big Data Reference Architecture (NBDRA) components. – Stage 3: Validate the NBDRA by building Big Data general applications through the general interfaces. My Comment: The Federal Big Data Working Group Meetup is creating an interface (Stage 2) and applications (Stage 3) by doing Data Science for NIST Big Data Framework!

7 Purpose While I have started a Comment Template for detailed comments, my focus is to use the excellent content for the Federal Big Data Working Group Meetup as follows: – Build a Knowledge Base (especially using the Definitions and Taxonomies). – Build a Data Science Data Publication (especially using Use Cases and General Requirements). – Build a MOOC (Massive Open Online Course) (using the above and Security and Privacy, Architectures White Paper Survey, Reference Architecture, and Standards Roadmap).

8 Data Mining Standard Process Data Science for NIST Big Data Framework will be done by data mining, following the six-step CRISP-DM standard: – CRISP-DM Step 1: Business (Organizational) Understanding – CRISP-DM Step 2: Data Understanding – CRISP-DM Step 3: Data Preparation – CRISP-DM Step 4: Modeling – CRISP-DM Step 5: Evaluation – CRISP-DM Step 6: Deployment

9 Method and Results The method and results are documented in the Slides and Spotfire Dashboard. The Knowledge Base Index and selected tables will be documented in the NIST Big Data Spreadsheet. The Meetup date and agenda will be announced soon.

10 Data Mining Standard Results CRISP-DM Step 1: Business (Organizational) Understanding: – Knowledge Base: 7 Word Documents to MindTouch CRISP-DM Step 2: Data Understanding: – MindTouch Index to Spreadsheet CRISP-DM Step 3: Data Preparation: – Report Tables and Use Case Data Sets CRISP-DM Step 4: Modeling: – Spotfire Exploratory Data Analysis CRISP-DM Step 5: Evaluation: – Data Science Answer to Four Questions CRISP-DM Step 6: Deployment: – Data Science Data Publication and MOOC
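The pairing of each CRISP-DM step with its result can be sketched as an ordered list of records. The step names and results below are taken from the slide; the `CrispDmStep` class and the `next_step` and `format_plan` helpers are hypothetical names introduced only for illustration, not part of any CRISP-DM tool or of the author's workflow.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class CrispDmStep:
    number: int
    name: str
    result: str  # the result the slide attaches to this step


# Steps and results exactly as listed on the slide, in order.
STEPS = (
    CrispDmStep(1, "Business (Organizational) Understanding",
                "Knowledge Base: 7 Word Documents to MindTouch"),
    CrispDmStep(2, "Data Understanding", "MindTouch Index to Spreadsheet"),
    CrispDmStep(3, "Data Preparation", "Report Tables and Use Case Data Sets"),
    CrispDmStep(4, "Modeling", "Spotfire Exploratory Data Analysis"),
    CrispDmStep(5, "Evaluation", "Data Science Answer to Four Questions"),
    CrispDmStep(6, "Deployment", "Data Science Data Publication and MOOC"),
)


def next_step(current: int, steps: Sequence[CrispDmStep] = STEPS) -> Optional[CrispDmStep]:
    """Return the step that follows `current`, or None after the last step."""
    for step in steps:
        if step.number == current + 1:
            return step
    return None


def format_plan(steps: Sequence[CrispDmStep] = STEPS) -> str:
    """Render the plan as one line per step."""
    return "\n".join(
        f"CRISP-DM Step {s.number}: {s.name} -> {s.result}" for s in steps
    )


if __name__ == "__main__":
    print(format_plan())
```

Keeping the steps in a single ordered tuple makes the sequence explicit: each result (for example, the MindTouch index exported to a spreadsheet) is the input to the step that follows it.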

11 Data Science for NIST Big Data Framework: MindTouch Knowledge Base Index. Links: Data Science for NIST Big Data Framework; NIST Big Data Framework

12 Data Science for NIST Big Data Framework: MindTouch Knowledge Base Find. Links: Data Science for NIST Big Data Framework; NIST Big Data Framework. Google Chrome Find: Data sets

13 Data Science for NIST Big Data Framework: Spreadsheet Knowledge Base: Find. Link: NIST Big Data Spreadsheet

14 Data Science for NIST Big Data Framework: Spreadsheet Knowledge Base: Other. Link: NIST Big Data Spreadsheet. Report Tables and Use Case Data Sets

15 Data Science for NIST Big Data Framework: Spotfire Cover Page. Link: Web Player

16 Data Science for NIST Big Data Framework: Spotfire Tab 1. Link: Web Player

17 Data Science for NIST Big Data Framework: Spotfire Tab 2. Link: Web Player

18 Data Science for NIST Big Data Framework: Spotfire Tab 3. Link: Web Player

19 Data Science for NIST Big Data Framework: Spotfire Tab 4. Link: Web Player

20 Conclusions and Recommendations The Version 1 DRAFT NIST Big Data Interoperability Framework (7 volumes) has been reviewed for detailed comments and repurposed by the Federal Big Data Working Group Meetup. A Knowledge Base, Data Science Data Publication, and Massive Open Online Course (MOOC) have been created from the excellent content using the CRISP-DM data mining standard. The methods and results are documented to aid the NIST Big Data Work Group and Federal Big Data Working Group Meetup in future activities. The Federal Big Data Working Group Meetup is creating an interface (Stage 2) and applications (Stage 3) by doing Data Science for NIST Big Data Framework! The Federal Big Data Working Group Meetup is focused on Use Cases with Government Data and Workforce Education of Data Scientists and Chief Data Officers.

