Data Science Publication for NSF Polar Cyberinfrastructure Dr. Brand Niemann Director and Senior Data Scientist/Data Journalist Semantic Community

Presentation transcript:

Data Science Publication for NSF Polar Cyberinfrastructure Dr. Brand Niemann Director and Senior Data Scientist/Data Journalist Semantic Community November 4,

Preface
Some prep work is already underway (if you scour the Open Science Codefest site you will find some) to prepare some datasets of relevance to the Polar community. We will provide some of this prepared data to interested parties ahead of the workshop in the next few weeks in case folks want to start hacking early. We will tweet under the hashtag #nsfpolardatavis.
Source: Chris Mattmann

Overview
Build the knowledge base (in MindTouch) and spreadsheet (in Excel) first, which then makes it easier to “storify” the results in the Spotfire (data browser) application.
Follow the Cross-Industry Standard Process for Data Mining (CRISP-DM):
– 1 Business Understanding (of the Hackathon),
– 2 Data Understanding (by mining the Sessions),
– 3 Data Preparation (by screen scraping and downloading),
– 4 Modeling (enough data for statistical significance?),
– 5 Evaluation (How collected? Where stored? What results? Believe them?), and
– 6 Deployment (Story and Demo).
The documentation will be in the form of the Data Science Publication for NSF Polar Cyberinfrastructure.
My goal is to see if I can integrate and federate these multiple data sources.

Data Science for Business: Data Mining Process
Source: Data Science for Business: Chapter 2, 2014

Data Science for NSF Polar Cyberinfrastructure: Knowledge Base

Possible Data Sets
Scour the Open Science Codefest:
– codefest/issues/26
The Polar Data Catalogue (YES)
BCO-DMO (YES)
Polar Hub (NO)
The AMRC at University of Wisconsin-Madison (YES):
– ftp://amrc.ssec.wisc.edu/pub/requests/DVPC/
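The candidate list above can be captured as a small table, which is roughly what the spreadsheet step of the knowledge base calls for. The sketch below is illustrative only: the field names, the `CANDIDATES` list, and the `flagged` and `to_csv` helpers are assumptions introduced here, and the YES/NO flags are copied from the slide without interpreting what they mean.

```python
import csv
import io

# Candidate sources and their YES/NO flags as listed on the slide.
# The field names ("source", "flag") are illustrative assumptions.
CANDIDATES = [
    {"source": "The Polar Data Catalogue", "flag": "YES"},
    {"source": "BCO-DMO", "flag": "YES"},
    {"source": "Polar Hub", "flag": "NO"},
    {"source": "The AMRC at University of Wisconsin-Madison", "flag": "YES"},
]


def flagged(candidates, flag):
    """Return the names of the sources carrying the given flag."""
    return [c["source"] for c in candidates if c["flag"] == flag]


def to_csv(candidates):
    """Serialize the candidate list as CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["source", "flag"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(candidates)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(CANDIDATES))
    print("Marked YES:", flagged(CANDIDATES, "YES"))
```

Keeping the inventory as plain rows means the same text can be pasted into Excel or loaded into a data browser without conversion.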

Open Science Codefest: NASA/NSF/NSIDC Data Sets
NASA Antarctic Master Directory – A master directory for arctic data sets
tal=amd&KeywordPath=Parameters|CRYOSPHERE&MetadataType=0&lbnode=mdlb2
NSF ACADIS Gateway – NSF data repository for arctic/polar data
NSIDC Arctic Data Explorer – National Snow and Ice Data Center repository
Source: Link to presentation given at Open Science Codefest

Polar Data Catalogue: Home Page

Polar Data Catalogue: Collections

Polar Data Catalogue: Search

Polar Data Catalogue: Canadian Lake Ice Database MB MDB

Polar Data Catalogue: Sea Ice Thickness in Southern Beaufort Sea
Downloaded 5 files: 4 text and 1 ZIP (Shape), 1.3 MB

Polar Data Catalogue: Spreadsheet
– Canadian Lake Ice Database
– Sea Ice Thickness in Southern Beaufort Sea

BCO-DMO
Tutorial PDF

Data Access Tutorial 2014 OCB PI Summer Workshop
How to Submit Data
Data access: TEXT-BASED SEARCH scenario 1:
– You have a general idea of what you are looking for.
Data access: MAP BROWSE scenario 2:
– You are interested in data from a particular geographic region.
Data access: MAP KEYWORD SEARCH scenario 3:
– You are interested in data of a particular type from a particular geographic area.
Data access: MAP SEMANTIC SEARCH scenario 4:
– You have an idea what you are looking for, but you do not know the Program, Project, or Deployment name.
Glossary of Terms
Acknowledgments
Follow BCO-DMO
My Question: Could Spotfire do all of this?
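As a rough illustration of scenarios 1 to 3, text search, map browse, and their combination can be expressed as simple filters over dataset records. Everything in this sketch is hypothetical: the `Dataset` record, its fields, the two sample entries, and the helper names are invented for illustration and do not describe BCO-DMO's actual data model or interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """Hypothetical catalog record: a title plus one representative point."""
    title: str
    lat: float
    lon: float


def text_search(records, keyword):
    """Scenario 1: case-insensitive keyword match on the title."""
    needle = keyword.lower()
    return [r for r in records if needle in r.title.lower()]


def map_browse(records, south, north, west, east):
    """Scenario 2: keep records inside a latitude/longitude box.

    Assumes the box does not cross the antimeridian (west <= east).
    """
    return [
        r for r in records
        if south <= r.lat <= north and west <= r.lon <= east
    ]


def map_keyword_search(records, keyword, south, north, west, east):
    """Scenario 3: apply the region filter, then the keyword filter."""
    return text_search(map_browse(records, south, north, west, east), keyword)


# Made-up sample records, for illustration only.
SAMPLE = [
    Dataset("Example chlorophyll casts", lat=-64.8, lon=-64.1),
    Dataset("Example nutrient bottle data", lat=71.3, lon=-156.8),
]

if __name__ == "__main__":
    # Everything south of 60 S whose title mentions "chlorophyll".
    print(map_keyword_search(SAMPLE, "chlorophyll", -90, -60, -180, 180))
```

Scenario 4 (semantic search) is not covered here, since it depends on a vocabulary linking terms to Programs, Projects, and Deployments that this sketch does not attempt to model.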

BCO-DMO Datasets

BCO-DMO MapServer Geospatial Interface

Polar Hub: A Global Hub for Polar Data Discovery
My Question: Where is the Data?

The AMRC at University of Wisconsin-Madison
Index of /pub/requests/DVPC/
ftp://amrc.ssec.wisc.edu/pub/requests/DVPC/
Name                Size    Date Modified
[parent directory]
Ant_IR_area/                9/30/14, 8:18:00 PM
Ant_IR_netCDF/              9/30/14, 8:37:00 PM
AWS_dat_MAY_2014/           9/30/14, 3:23:00 PM   Text Files: Spotfire?
AWS_q10_MAY_2014/           9/30/14, 3:24:00 PM   Text Files: Spotfire?
AWS_q1h_MAY_2014/           9/30/14, 3:25:00 PM   Text Files: Spotfire?
AWS_q3h_MAY_2014/           9/30/14, 3:27:00 PM   Text Files: Spotfire?
AWS_r_MAY_2014/             9/30/14, 3:22:00 PM   Text Files: Spotfire?
readme.txt          4.6 kB  10/6/14, 6:37:00 PM
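Before deciding which of these folders to try in Spotfire, it can help to turn the directory listing above into rows. The sketch below hard-codes the entries shown on the slide rather than contacting the FTP server; the `LISTING` constant and the `parse_listing` and `aws_dirs` helpers are assumptions made for illustration, and the size column and the "Text Files: Spotfire?" notes are left out to keep the parsing simple.

```python
from datetime import datetime

# Entries transcribed from the slide: a name, then a modification timestamp.
LISTING = """\
Ant_IR_area/       9/30/14, 8:18:00 PM
Ant_IR_netCDF/     9/30/14, 8:37:00 PM
AWS_dat_MAY_2014/  9/30/14, 3:23:00 PM
AWS_q10_MAY_2014/  9/30/14, 3:24:00 PM
AWS_q1h_MAY_2014/  9/30/14, 3:25:00 PM
AWS_q3h_MAY_2014/  9/30/14, 3:27:00 PM
AWS_r_MAY_2014/    9/30/14, 3:22:00 PM
readme.txt         10/6/14, 6:37:00 PM
"""

# Timestamp layout used in the listing, e.g. "9/30/14, 8:18:00 PM".
STAMP_FORMAT = "%m/%d/%y, %I:%M:%S %p"


def parse_listing(text):
    """Split each non-empty line into a name and a parsed timestamp."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Names contain no spaces, so the first whitespace run ends the name.
        name, stamp = line.split(None, 1)
        rows.append({
            "name": name,
            "is_dir": name.endswith("/"),
            "modified": datetime.strptime(stamp.strip(), STAMP_FORMAT),
        })
    return rows


def aws_dirs(rows):
    """Return the directories whose names start with "AWS_"."""
    return [r["name"] for r in rows if r["is_dir"] and r["name"].startswith("AWS_")]


if __name__ == "__main__":
    for name in aws_dirs(parse_listing(LISTING)):
        print(name)
```

The five `AWS_` directories are the ones annotated as text files on the slide, so they are the natural first candidates to load.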

Data Science for NSF Polar Cyberinfrastructure: Spreadsheet Knowledge Base

Data Science for NSF Polar Cyberinfrastructure: Spotfire Cover Page
Web Player

Data Science for NSF Polar Cyberinfrastructure: Spotfire Visualizations
Web Player