Data-enabled Science: Challenges and Opportunities. 2013 Tropical Cyclone Research Forum, 67th Interdepartmental Hurricane Conference, 6 March 2013, College Park, MD.

Presentation transcript:

Data-enabled Science: Challenges and Opportunities
2013 Tropical Cyclone Research Forum, 67th Interdepartmental Hurricane Conference
6 March 2013, College Park, MD
Dr. Mohan Ramamurthy, Unidata Program Center, University Corporation for Atmospheric Research

The Era of Data-Intensive Science
Data is the lifeblood of science, but we need to move from creating data to discovering knowledge.

BIG DATA BUZZ

Digital Universe According to a study by IDC, 1.8 Zettabytes of information was created in The Internet of things.

“Sea of Data”
GOES-R (2017); JPSS (2014); ~3 Tb of data/day.
Phased Array Radar, with 20- to 30-second volume scans, compared with 5-7 minutes with current radars.
Global, high-resolution coupled models integrated in ensemble mode from days to decades.
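The figures on this slide can be turned into rough rates with a few lines of arithmetic. The sketch below is only a back-of-the-envelope illustration: it assumes "Tb" means decimal terabytes (the slide does not say whether terabits or terabytes is meant, or which instrument the figure belongs to), and the function names are made up for this example.

```python
# Back-of-the-envelope arithmetic for the "Sea of Data" slide.
# Assumption: "~3 Tb/day" is read as 3 decimal terabytes per day.

def scan_rate_increase(fast_s=(20, 30), slow_s=(300, 420)):
    """Return (low, high) factor by which volume scans become more frequent.

    fast_s: phased array scan time range in seconds (20 to 30 s on the slide)
    slow_s: current radar scan time range in seconds (5 to 7 minutes on the slide)
    """
    low = min(slow_s) / max(fast_s)    # most conservative pairing: 300 s vs 30 s
    high = max(slow_s) / min(fast_s)   # most aggressive pairing: 420 s vs 20 s
    return low, high


def yearly_volume_pb(tb_per_day=3.0, days=365):
    """Convert a daily volume in terabytes to petabytes per year (decimal units)."""
    return tb_per_day * days / 1000.0


low, high = scan_rate_increase()
print(f"Scans arrive {low:.0f}x to {high:.0f}x more often")
print(f"About {yearly_volume_pb():.3f} PB per year at 3 TB/day")
```

Under these assumptions, the scan interval alone implies roughly an order of magnitude more radar volumes per hour, before any change in the size of each volume is considered.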

CMIP-5 & IPCC Fifth Assessment
Source: Bryan Lawrence, British Atmospheric Data Centre

Expected Increase in Data Volume
Source: NCDC, NOAA

A Provocative Suggestion
Wired, 23 June 2008 issue

Data Challenges: The Five “V”s
Volume: Explosion of data.
Variety: Different types of data (e.g., multidisciplinary, societal information, etc.); interoperability.
Velocity: Rate of change; speed of discovery, access, analysis, integration, and visualization.
Views: Many consumers of data (e.g., researchers, educators, policy makers, social scientists, and the public); diverse applications; multiple devices.
Virtual Communities: Globally distributed; different practices and policies for data sharing.

Services in a Cloud; Client Access from any Device
Many types of data services running in the cloud.
A range of client devices.

“Cloud” Computing & Virtualization
Data volumes are too large to bring all of the data to your local environment.
Need to keep data close to the point of origin or dissemination.
Will need to move more of the processing, applications, and computations onto the server (e.g., GDS, NCO, etc.).
Impractical to store/serve data in multiple formats, so need built-in translators, brokers, and mediation services.
Industry is well ahead of the science community in dealing with Big Data, Cloud Computing, and Virtualization (e.g., SaaS, Application Service Providers, MapReduce, Hadoop).
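The point about moving computation to the server can be illustrated with a toy comparison. This is a minimal sketch, not the interface of GDS, NCO, or any real data service: `ToyDataServer`, `download_all`, and `subset_mean` are hypothetical names, and the grid is synthetic. It contrasts pulling a whole grid to the client with asking the server for a regional mean.

```python
from typing import List, Tuple


class ToyDataServer:
    """Stand-in for a remote data service holding one 2-D grid (hypothetical)."""

    def __init__(self, nlat: int, nlon: int) -> None:
        # Synthetic values; a real service would hold model or observation data.
        self.grid = [[float(i * nlon + j) for j in range(nlon)] for i in range(nlat)]

    def download_all(self) -> List[List[float]]:
        """Client-side pattern: ship every value across the network."""
        return [row[:] for row in self.grid]

    def subset_mean(self, lat: Tuple[int, int], lon: Tuple[int, int]) -> float:
        """Server-side pattern: subset and reduce next to the data, return one number."""
        values = [v for row in self.grid[lat[0]:lat[1]] for v in row[lon[0]:lon[1]]]
        return sum(values) / len(values)


server = ToyDataServer(nlat=100, nlon=200)

# Pattern 1: download everything, then subset and average locally.
full = server.download_all()
local_values = [v for row in full[10:20] for v in row[50:60]]
local_mean = sum(local_values) / len(local_values)
values_moved_local = sum(len(row) for row in full)

# Pattern 2: ask the server for the regional mean only.
remote_mean = server.subset_mean((10, 20), (50, 60))
values_moved_remote = 1

print(values_moved_local, values_moved_remote, local_mean == remote_mean)
```

Both patterns give the same answer; the difference is how many values have to leave the server, which is the cost the slide argues becomes prohibitive at large data volumes.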

End-to-End Data Services Chained by Workflows
Emergency Response; Ensemble Predictions; Coastal Environments; GIS Integration.
Predicting societal impact of flooding from hurricanes involves integrating data from atmospheric sciences, oceanography, hydrology, geology, and social sciences and interfacing the results with decision support systems.
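One way to picture services "chained by workflows" is as a sequence of steps, each consuming the previous step's output. The sketch below is purely illustrative: the step names, the fixed runoff fraction, and all numbers are placeholder assumptions, not real hydrology and not anything described in the talk.

```python
from functools import reduce
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Step = Callable[[State], State]


def ensemble_mean(state: State) -> State:
    """Atmospheric step: collapse ensemble rainfall members to a mean."""
    members = state["rain_mm_members"]
    return {**state, "rain_mm": sum(members) / len(members)}


def runoff(state: State) -> State:
    """Hydrology step (toy): a fixed fraction of rainfall becomes runoff."""
    return {**state, "runoff_mm": state["rain_mm"] * state["runoff_fraction"]}


def flood_flag(state: State) -> State:
    """Decision-support step: flag a warning above a threshold."""
    return {**state, "flood_warning": state["runoff_mm"] > state["threshold_mm"]}


def run_workflow(steps: List[Step], state: State) -> State:
    """Apply each step in order, feeding its output to the next."""
    return reduce(lambda acc, step: step(acc), steps, state)


result = run_workflow(
    [ensemble_mean, runoff, flood_flag],
    {
        "rain_mm_members": [100.0, 200.0, 300.0],  # placeholder ensemble values
        "runoff_fraction": 0.5,                    # placeholder parameter
        "threshold_mm": 80.0,                      # placeholder threshold
    },
)
print(result["rain_mm"], result["runoff_mm"], result["flood_warning"])
```

Because every step takes and returns the same kind of state, a step from another discipline can be inserted or swapped without changing the others, which is the property an end-to-end workflow needs.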

Earth System/Multidisciplinary Science
It requires data and information integration and knowledge synthesis across “systems” or domains.
Challenge: Providing the right data, in the right format, to the right application.

GIS Integration: An Enormous Opportunity
Need geospatially-enabled cyberinfrastructure so that information can be integrated for location-based understanding of events, processes, interactions, and impacts.
GIS integration should not be an afterthought.
Scientific data systems need to directly enable GIS tools.

The Long Tail Problem
– NSF Guideline: All proposals must include a Data Management Plan.
– By some estimates, only 5% of the data generated by individual PIs is shared.
– There are many reasons for this, both sociological and technical:
  Lack of incentives
  Inadequate resources; burden of unfunded mandate
  Protectiveness: PIs don’t want to be “scooped”
  Absence of campus or community data repositories
– Need to give PIs the tools and incentives for sharing their data and adding metadata.
Need to change the culture.

Data Citation: Opportunity & Challenge
Scientific publications should be accompanied by data, algorithms, models, and parameters: need comprehensive data citation.
Need transparency. Important for reproducibility.
This is not just a technical challenge; it is also a major cultural and organizational challenge:
New way of thinking for authors, software developers, data providers, publishers, and other stakeholders
Incentives for authors and publishers
Copyright and intellectual property right issues
Persistence of all components: datasets, tools, processing services, and interfaces

Unidata 2020 Vision: Geoscience at the Speed of Thought through accelerated data discovery, access, analysis, and visualization.
Reduce “data friction”, lower the barriers, and reduce “time to research”.
Accelerate user workflows (manual or automated).
Contribute toward flipping the 4:1 ratio (the current situation).

EarthCube Vision
Transform the conduct of geosciences research with the next generation CI.
Create effective community-driven cyberinfrastructure.
Enable global data discovery within the geosciences.
Achieve interoperability and data integration across disciplines.
Dynamic Earth; Changing Climate; Earth & Life; Geosphere-Biospheric Connection; Water: Changing Perspectives.

EarthCube Community Workshop: Shaping the Development of EarthCube to Enable Advances in Data Assimilation and Ensemble Prediction
December 2012, Boulder CO
Recommendations:
A pilot project on coordinated, distributed national ensemble prediction that involves universities that are interested in participating.
Develop a prototype system that links data sets and assimilation and prediction systems together, involving the most used projects.
Create a concrete plan for greater coordination of ongoing and future programs and facilities, developing a next-generation testbed facility to advance the science.
Organize meetings to leverage and expand communication, enhance data sharing, and facilitate sustained interactions.
Entrain undergraduate and graduate students into research and educational activities related to “big data.”

Concluding Remarks
We live in an exciting era in which advances in computing and communication technologies, coupled with a new generation of geoinformatics, are accelerating scientific research, creating new knowledge, and leading to new discoveries at an unprecedented rate.
Contact:
Acknowledgement: NSF Award