Data Science for RDA Climate Change Data Challenge and Meetup

Presentation transcript:

Data Science for RDA Climate Change Data Challenge and Meetup
Dr. Brand Niemann
Director and Senior Data Scientist/Data Journalist
Semantic Community
Data Science
Data Science for RDA Climate Change Data Challenge
September 28, 2015

NSF Graduate Data Science Workshop & Community Building, August 5-7, Seattle
This NSF-sponsored 2.5-day workshop, held August 5-7 on the University of Washington campus in Seattle, will bring together 100 graduate students from diverse domain sciences and engineering with data scientists from industry and academia to discuss and collaborate on Big Data / Data Science challenges. In addition to keynote presentations from high-profile speakers, the participants will present posters covering their own research and work collaboratively to begin to solve some of the Grand Challenge problems facing Data Enabled Science & Engineering disciplines. After the workshop, the output from the collaborative teams will be published in an open access environment. Through the shared work at the workshop and beyond, the participants will form lasting, collaborative relationships with their peers, the senior academic partners, and the industry participants, including those from Amazon, Google, and Microsoft. The workshop Grand Challenge topics will be selected from the highest-scoring white paper submissions. During the workshop, attendees will form teams to work on the Grand Challenges. The authors of the very highest-scoring white papers will be invited to give lightning talks of a few slides during the plenary session to describe their challenges or methods.
http://depts.washington.edu/dswkshp/

Purpose
I plan to hold a meetup (or a series of meetups like this one) to support the NSF Data Science / Big Data Community, using the data sets I am preparing from the RDA Climate Change Data Challenge, climate.data.gov, and the U.S. Climate Resilience Toolkit to jump-start our meetup members and other data science meetup participants.
Data Sets:
RDA Climate Data Challenge: Only 17 of 64 could be used so far.
NTRD: 36 Shape (problem reading largest file).
Climate.Data.gov: 16 of 38 used so far.
U.S. Climate Resilience Toolkit: 63 data sets used in 80 Case Studies. The case study "Using Climate Data, Satellite Imagery, and Local Knowledge to Prevent Famine" uses 6 data sets (the maximum for any case study), so it would be the best one for integrating multiple data sets.
National Climate Assessment: 2377 data sets, in addition to the 36 data tables I extracted from the report itself.
See: Spreadsheet
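The audit counts on this slide can be turned into usable-share percentages with a little arithmetic. The following is a minimal sketch: only the counts (17 of 64, 16 of 38) come from the slide, while the names AUDIT_COUNTS, usable_share, and audit_summary are illustrative and not part of the original workflow.

```python
# Counts taken from the slide: (data sets used so far, data sets listed).
AUDIT_COUNTS = {
    "RDA Climate Data Challenge": (17, 64),
    "Climate.Data.gov": (16, 38),
}


def usable_share(used: int, total: int) -> float:
    """Return the percentage of listed data sets that could be used."""
    return 100.0 * used / total


def audit_summary(counts: dict) -> dict:
    """Map each source name to its usable share, rounded to one decimal."""
    return {
        name: round(usable_share(used, total), 1)
        for name, (used, total) in counts.items()
    }


if __name__ == "__main__":
    for name, pct in audit_summary(AUDIT_COUNTS).items():
        print(f"{name}: {pct}% usable so far")
```

A summary like this makes it easier to see which source still needs the most audit work before integration.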

Data Science for RDA Climate Change Data Challenge and Meetup
Goal 1: Digital Catalog - Done
Goal 2: Data Audit - Done
Goal 3: Individual Data Sets in Spotfire - Done (RDA and NTRD)
Goal 4: Integration/Applications - IN PROCESS (See right box)
Goal 5: Meetups/Data Science Publication/MOOCs - IN PROCESS (See right box)
An additional goal is to integrate climate.data.gov and the U.S. Climate Resilience Toolkit into one “seamless” system, which we will call "a Data Science Data Publication". This will be my challenge submission and experimentation day demo for the 6th RDA Plenary in Paris on September 23-25, and it will support the NSF Meetup of Data Science Meetups on November 6-7 in Washington, DC. Our Meetup of Data Science Meetups, in preparation for the November 6-7 Meetup, is tentatively planned for September 28th.

NSF Big Data Hubs and Data Science Meetups
Initial Schedule:
Data Science Call I: June 12, 2015
Data Science Call II: June 18, 2015
In-person Meetup Workshop: Washington, DC, November 6-7, 2015
Big Data Regional Innovation Hubs (Accelerating the Big Data Innovation Ecosystem): Midwest, Northeast, South, West
Initial Ideas:
Data Science YouTube channel or Podcast
Angie's List for Data Scientists
Gathering groups working around the same domain, e.g. connecting people working on different global climate challenges
Groups Participating:
Group | Location | Type
Bayes Impact | San Francisco, CA | Non-profit
Big Data Utah | Salt Lake City, Utah | Collaboration
Boston Predictive Analytics | Boston, MA | Meetup
Data Community DC | Washington, DC | Meetup
Data Science ATL | Atlanta, GA | Meetup
Data Science for Social Good | Chicago, IL | Fellowship Program
DataKind | New York, NY | Non-profit
District Data Lab | Washington, DC | Meetup
NYC Data Science | New York, NY | Meetup
SF Data Mining | San Francisco, CA | Meetup
Data Science Chicago | Chicago, IL | Meetup
Data Science MD | Baltimore, MD | Meetup
U.S. Ignite | Nation-wide Communities | Non-profit
Analytics Club | Boston, MA | Meetup
Data Science for Social Good Atlanta | Atlanta, GA | Fellowship Program
https://bdhub.info/
http://data-science.meetup.com/

http://data-science.meetup.com/
My Note: Start joining these groups and invite them to the September 28th Meetup.

Data Mining Data.gov and U.S. Climate Resilience Toolkit
http://www.data.gov/climate/ menu: Themes Data Resources Challenges FAQ Contact Climate Other?
http://toolkit.climate.gov/ menu: Get Started, Taking Action, Tools, Topics, Expertise, About, Contact, Funding Opportunities, FAQ

http://www.data.gov/climate/

Spreadsheet My Note: Requested and received spreadsheet of 547 data sets and all 100,000+ data sets so I can integrate the catalog and the actual data sets.

Spreadsheet
My Note: See it imported and filtered in Spotfire on the next slide.

My Note: First example in next Tab (in process)

http://toolkit.climate.gov/

http://toolkit.climate.gov/help/partners

Expertise My Note: From map popups to MindTouch to spreadsheet to Spotfire.

Spreadsheet

http://toolkit.climate.gov/training-courses

Spreadsheet My Note: These can be filtered in spreadsheet and Spotfire.

My Note: Filter by Type of Training and/or Difficulty Scale.
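The same kind of filter can be sketched outside a spreadsheet or Spotfire. The example below is only an illustration using Python's standard csv module: the sample rows, the column names "Type of Training" and "Difficulty Scale", and the assumption that difficulty is a small integer are all hypothetical, since the actual spreadsheet layout is not shown here.

```python
import csv
import io

# Hypothetical sample rows; the real spreadsheet's columns and values may differ.
SAMPLE_CSV = """Title,Type of Training,Difficulty Scale
Course A,Online course,1
Course B,Webinar,2
Course C,Online course,3
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def filter_courses(rows, training_type=None, max_difficulty=None):
    """Keep rows matching the training type and/or at or below a difficulty."""
    selected = []
    for row in rows:
        if training_type is not None and row["Type of Training"] != training_type:
            continue
        if max_difficulty is not None and int(row["Difficulty Scale"]) > max_difficulty:
            continue
        selected.append(row)
    return selected


if __name__ == "__main__":
    courses = load_rows(SAMPLE_CSV)
    for course in filter_courses(courses, "Online course", max_difficulty=2):
        print(course["Title"])
```

Either filter can be omitted by leaving its argument as None, which mirrors filtering on one column or on both.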

Climate Explorer—Visualizing Climate Data in Maps and Graphs Climate Explorer is a research application built to support the U.S. Climate Resilience Toolkit. The tool offers interactive visualizations for exploring maps and data related to the toolkit's Taking Action case studies. Map layers in the tool represent geographic information available through climate.data.gov. Each layer's source and metadata can be accessed through its information icon. Climate Explorer graphs display 1981-2010 U.S. Climate Normals for temperature and precipitation, overlain with daily observations from the Global Historical Climatology Network-Daily (GHCN-D) database. Please note that GHCN-D data have been checked for obvious inaccuracies, but they have not been adjusted to account for the influences of historical changes in instrumentation and observing practices. GHCN-D data are useful for comparing weather and climate, but for long-term climate change analyses, we recommend the National Climatic Data Center's Climate at a Glance. Climate Explorer
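The overlay of daily observations on climate normals described above amounts to comparing each observed value with the normal for the same calendar day. The sketch below illustrates that comparison under stated assumptions: the temperature values are invented for illustration (they are not real GHCN-D or Climate Normals data), and the names DAILY_NORMALS, DAILY_OBSERVATIONS, and departures_from_normal are hypothetical.

```python
# Hypothetical values in degrees Celsius; not real GHCN-D or Climate Normals data.
# Normals are keyed by month-day, observations by full ISO date.
DAILY_NORMALS = {"01-01": 2.0, "01-02": 2.1, "01-03": 2.2}
DAILY_OBSERVATIONS = {"2015-01-01": 4.5, "2015-01-02": 1.1, "2015-01-03": 2.2}


def departures_from_normal(observations, normals):
    """Return observed minus normal for each date that has a matching normal."""
    result = {}
    for date, value in observations.items():
        month_day = date[5:]  # "YYYY-MM-DD" -> "MM-DD"
        if month_day in normals:
            result[date] = round(value - normals[month_day], 2)
    return result


if __name__ == "__main__":
    for date, departure in departures_from_normal(
        DAILY_OBSERVATIONS, DAILY_NORMALS
    ).items():
        print(f"{date}: {departure:+.2f} relative to the normal")
```

As the slide notes, unadjusted daily observations suit weather-versus-climate comparisons like this one, not long-term climate change analysis.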

http://toolkit.climate.gov/climate-explorer/
My Note: This is like what I just did in Spotfire with the NTRD! I can reproduce these in Spotfire.

http://toolkit.climate.gov/crt-search

http://toolkit.climate.gov/crt-search?query=*&resource=18
My Note: The search finds the word “datasets” but not the data! Spreadsheets and Spotfire show you the data (e.g. CSV)!

Conclusions and Recommendations
In support of the NSF Data Science / Big Data Community and the Research Data Alliance (RDA), Semantic Community has prepared four collections of multiple data sets, drawn from the RDA Climate Change Data Challenge, the U.S. National Transportation Atlas Database (NTRD), Climate.data.gov, and the U.S. Climate Resilience Toolkit. They are meant to jump-start the Federal Big Data Working Group Meetup and other data science meetup participants at our September 28th Meetup of Data Science Meetups, in preparation for the NSF Meetup of Data Science Meetups on November 6-7, 2015. All of the information is provided as a Data FAIRPort (Findable, Accessible, Interoperable, and Reusable) in a Data Science Commons or Hub as a community service. Suggestions and feedback are welcome.