Big data and Open data for solid Earth science Massimo Cocco & EPOS Team Istituto Nazionale di Geofisica e Vulcanologia EPOS PP Coordinator Big Data &

Slides:



Advertisements
Similar presentations
Research Infrastructures WP 2012 Call 10 e-Infrastructures part Topics: Construction of new infrastructures (or major upgrades) – implementation.
Advertisements

Massimo Cocco EPOS PP Coordinator TABOO Meeting Aprile 2013 Integrazione delle infrastrutture di ricerca in EPOS: data services per la ricerca e la società
1 Ideas About the Future of HPC in Europe “The views expressed in this presentation are those of the author and do not necessarily reflect the views of.
Presentation at WebEx Meeting June 15,  Context  Challenge  Anticipated Outcomes  Framework  Timeline & Guidance  Comment and Questions.
FP6 Thematic Priority 2: Information Society Technologies Dr. Neil T. M. Hamilton Executive Director.
1 Ideas About the Future of HPC in Europe “The views expressed in this presentation are those of the author and do not necessarily reflect the views of.
EPOS participation to the INFRASTRUCTURES 2011 call Three coordinated proposals VERCE, EUDAT, ENVRI J.-P. Vilotte (CNRS-INSU), A. Michelini (INGV), T.
GEO Geohazard Supersites and Natural Laboratories (GSNL): Building data infrastructures for science Massimo Cocco EPOS PP Coordinator INGV, Rome GEO-X.
Massimo Cocco & Joern Lauterjung EPOS PP Council Rome September 19 th 2013 The Technical architecture Material prepared by WG7.
The European Plate Observing System (EPOS): Integrated Services for solid Earth Science Massimo Cocco and the EPOS Consortium EPOS PP Coordinator
Research and Innovation Research and Innovation Research and Innovation Research and Innovation Research Infrastructures and Horizon 2020 The EU Framework.
WP5 Strategy Domenico Giardini SED ETHZ. WP5 Objectives Harmonize national implementation Integrate the European scientific community Establish Centres.
GEO Work Plan Symposium 2012 ID-05 Resource Mobilization for Capacity Building (individual, institutional & infrastructure)
The Preparatory Phase Proposal a first draft to be discussed.
European Life Sciences Infrastructure for Biological Information ELIXIR
EGI-Engage EGI-Engage Engaging the EGI Community towards an Open Science Commons Project Overview 9/14/2015 EGI-Engage: a project.
Scientific Data Infrastructure: activities in the Capacities Programme of FP7 Presentation at euroCRIS Workshop, Brussels 15 September 2009 "The views.
Data Infrastructures Opportunities for the European Scientific Information Space Carlos Morais Pires European Commission Paris, 5 March 2012 "The views.
1 European policies for e- Infrastructures Belarus-Poland NREN cross-border link inauguration event Minsk, 9 November 2010 Jean-Luc Dorel European Commission.
WP7 - Architecture and implementation plan Objectives o Integrating the legal, governance and financial plans with technological implementation through.
Session Chair: Peter Doorn Director, Data Archiving and Networked Services (DANS), The Netherlands.
ESPON Seminar 15 November 2006 in Espoo, Finland Review of the ESPON 2006 and lessons learned for the ESPON 2013 Programme Thiemo W. Eser, ESPON Managing.
European Broadband Portal Phase II Application of the Blueprint for “bottom-up” broadband initiatives.
A complementary view from the DIGOIDUNA study Paolo Bouquet, University of Trento, Italy SMART 2010/0054.
GEO Geohazard Supersites Current Status and key challenges EGU 2013 Vienna Supersites Splinter Meeting.
EPOS a long term integration plan of research infrastructures for solid Earth Science in Europe Preparatory Phase Project
A new start for the Lisbon Strategy Knowledge and innovation for growth.
From GEANT to Grid empowered Research Infrastructures ANTONELLA KARLSON DG INFSO Research Infrastructures Grids Information Day 25 March 2003 From GEANT.
Seismic Hazard Map EPOS European Plate Observing System RESEARCH INFRASTRUCTURE AND E-SCIENCE FOR DATA AND OBSERVATORIES ON EARTHQUAKES, VOLCANOES, SURFACE.
EPOS European Plate Observing System Research Infrastructure and e-science for Data and Observatories on Earthquakes, Volcanoes, Surface Dynamics and Tectonics.
A public-private partnership building a multidisciplinary cloud platform for data intensive science Bob Jones Head of openlab IT dept CERN This document.
What is GEO? launched in response to calls for action by the 2002 World Summit on Sustainable Development, Earth Observation Summits, and by the G8 (Group.
1 e-Infrastructures e-Infrastructures Taking stock and looking ahead an European perspective Bernhard Fabianek European Commission - DG INFSO GÉANT & e-Infrastructure.
EPOS where we are, where we go ! Massimo Cocco INGV Rome September EPOS Meeting.
26/05/2005 Research Infrastructures - 'eInfrastructure: Grid initiatives‘ FP INFRASTRUCTURES-71 DIMMI Project a DI gital M ulti M edia I nfrastructure.
TCS-ICS interactions Kuvvet Atakan 1 and the WP6 and WP7 Teams 1 University of Bergen / Department of Earth Science.
WP3 Harmonization & Integration J. Lauterjung & WP 3 Group.
Why are we here? projectsnational coordinationWorking groupsTCS.
EPOS IP Roadmap Massimo Cocco & PDB. EPOS IP project Timeline Implementation Validation Pre-operation.
EPOS IP Validation and pre-operational phases Massimo Cocco & PDB.
WP6 Technical Work J Lauterjung GFZ Potsdam. Objective The main objective is the development of a novel and efficient e- infrastructure concept addressing.
EGI-InSPIRE RI EGI-InSPIRE EGI-InSPIRE RI EGI strategy and Grand Vision Ludek Matyska EGI Council Chair EGI InSPIRE.
Splinter Session 1a : Identify topics Europe would like to have included in the GEO WP Chair: Luigi Fusco, ESA Reporting: Luca Demicheli, EuroGeoSurveys.
The Global Scene Wouter Los University of Amsterdam The Netherlands.
Creating Global Information Commons for Science Presentation at the EUROCRIS meeting September 2006, Brussels by Jean-Jacques Royer CODATA Treasurer.
European Science Cloud for Research Towards a common vision Per Öster CSC – IT Center for Science Ltd.
EGI-Engage is co-funded by the Horizon 2020 Framework Programme of the European Union under grant number EGI vision for the EOSC Tiziana.
1 Kostas Glinos European Commission - DG INFSO Head of Unit, Géant and e-Infrastructures "The views expressed in this presentation are those of the author.
EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-Infrastructures. Contract No EPOS and EUDAT.
EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-Infrastructures. Contract No EUDAT Aalto Data.
Towards a European Shared Environmental Information System in Support of Environmental Policies: INSPIRE: an Inspired revolution for a knowledge-based.
European Perspective on Distributed Computing Luis C. Busquets Pérez European Commission - DG CONNECT eInfrastructures 17 September 2013.
COST Action and European GBIF Nodes Anne-Sophie Archambeau.
EGI-Engage EGI Webinar - Introduction - Gergely Sipos EGI.eu / MTA SZTAKI 6/26/
EGI-InSPIRE EGI-InSPIRE RI EGI strategy towards the Open Science Commons Tiziana Ferrari EGI-InSPIRE Director at EGI.eu.
Geo-energy Test Beds: part of the European Plate Observing System Michael H. Stephenson, David Schofield, Chris Luton, Florian Haslinger, Jan Henninges.
Implementation of the European Plate Observing System (EPOS) Infrastructure Kirsten Elger1, Jörn Lauterjung1, Damian Ulbricht1, Massimo Cocco2, Kuvvet.
Towards a European Open Science Cloud for research
EPOS European Plate Observing System
Alessandro Spinuso, Andreas Rietbrock, Andrè Gemuend,
EUMEDCONNECT3-ASREN, 27 June 2014
Summit 2017 Breakout Group 2: Data Management (DM)
Similarities between Grid-enabled Medical and Engineering Applications
Presentation on Copernicus Dissemination
EGI-Engage Engaging the EGI Community towards an Open Science Commons
Summit 2017 Breakout Group 1: Advanced Research Computing (ARC)
EGI Webinar - Introduction -
Common Solutions to Common Problems
League of Advanced European Neutron Sources
EOSC-hub Contribution to the EOSC WGs
Presentation transcript:

Big data and Open data for solid Earth science Massimo Cocco & EPOS Team Istituto Nazionale di Geofisica e Vulcanologia EPOS PP Coordinator Big Data & Open Data – Brussels – May

Solid Earth Science KEYWORDS Multidisciplinary contributions Services to society Community building Geo-Hazards Geo-Resources Environmental changes

European Plate Observing System: Mission EPOS is a long-term plan for the integration of research infrastructures for solid Earth Science in Europe EPOS will integrate the existing advanced European facilities into a single, sustainable, distributed infrastructure taking full advantage of new e-science opportunities EPOS has the ambitious goal to facilitate research by providing open access to data, modeling tools, and facilities trough an efficient and multidisciplinary research platform. This platform will facilitate innovative research for accurate, durable, and sustainable answers to societal questions relevant to the environment and human welfare.

EPOS Community EPOS will increase their efficiency, improve and simplify their use, and allow multilateral strategic coordination for their sustainability, operation, and development EPOS integrates a large number of infrastructures and communities

The EPOS Integrated Core Services will provide access to multidisciplinary data, data products, synthetic data from simulations, processing and visualization tools,.... The EPOS Integrated Core Services will serve scientists and other stakeholders, young researchers (training), professionals and industry EPOS is more than a mere data portal: it will provide not just data but means to integrate, analyze, compare, interpret and present data and information about Solid Earth Thematic Core Services are infrastructures to provide data services to specific communities (they can be international organizations, such as ORFEUS for seismology) National Research Infrastructures and facilities provide services at national level and send data to the European thematic data infrastructures. Topological Architecture

The National RIs MAP OF: - Seismic/GPS stations - Laboratories -- etc…. Diversity in data type and formats Research Infrastructure LIst 244 Research Infrastructures 138 Institutions 22 countries 2272 GPS receivers 4939 seismic stations 464 TB Seismic data PB Storage capacity 828 instruments in 118 Laboratories

Access to Data Products (Taxonomy)  Level 0: raw data, or basic data  Level 1: data products coming from nearly automated procedures  Level 2: data products resulting by scientists’ investigations  Level 3: integrated data products coming from complex analyses or community shared products  Level 4. Software, IT tools seismograms Earthquake locations Interferograms Seismic hazard map

EPOS Data, Access, and IPR policy Balance: Legal risk : Openness : Traceability IPR Terms & Conditions Restrictions IPR Terms & Conditions Restrictions Licensing Data & Service Providers Data & Service Providers EPOS Data & Service Users Data & Service Users Open Access deposit terms Open Access license Data & Data Products Level 0,1, 2, 3 Data & Data Products Level 0,1, 2, 3 Tools & Software Open: Restricted:Embargoed Users Anonymous: Registered:Authorized Categorization as needed for legal aspects Categorization as needed for legal aspects mix and match as required Guiding principles: – open access – licensing – no charges Protect EPOS legally Trace EPOS use & users Unrestricted use & access

Big Data Open Data

Functional Architecture Higher level data products Compatibility Layer is the TCS-ICS Interface and it guarantees integration & interoperability

Data Timeline Data acquisition, validation & standardization Data collection & preservation (PID, DOI) Accessibility, integration, computation

12 HPC GRID Cloud EUDAT EPOS Architecture

EPOS Challenges Providing services to solid Earth community – Engaging data providers & users (future data products providers) Involve other scientific communities – Environmental science (marine, atmosphere,....) e-science community – IT innovation for developing e-RIs – Access to services for distributed resources (different timelines) Involve private sector with a clear strategy

14 A Paradigm Shift: from Data Driven to Data intensive Research The earthquake data-driven research has entered a fundamental paradigm shift. Data intensive applications. To exploit the full potential of this rapidly growing European and Global data-rich environment, To guarantee optimal operation and design of the high-cost monitoring facilities,. Data-intensive research is rapidly spreading in the community. Large volumes of time-continuous seismograms contain a wealth of hidden information about the Earth’s interior properties and wave sources, and their variation through time. Mining, analyzing and modelling, this abundance of digital data will reveal new insights at all depths in the planetary interior and at higher resolution than is possible by any other approach. European Plate Observing SystemBig Data Open Data workshop, Brussels May VERCE : Virtual Earthquake and Seismology Research Community e-science environment in Europe Big Data Open Data

Data Intensive simulation and inversion Global scale: Waveform prediction for large earthquakes Full waveform inversion tomography: new inside in the deep Earth Regional scale: Wave propagation in complex geological media Full waveform inversion Extended earthquake sources imaging Strong motion prediction: Physically-based hazard assessment Earthquake source dynamics Stochastic wave simulation Aero-acoustic wave simulation in a volcano Käser et al. (2009) Seismic wave propagation and tomography Komatisch et al. (2009) Capdeville et al. (2003) Strong motion simulation: Grenoble Valley Chaljub et al. (2009); Delavaud et al. (2009), Käser et al. (2009) Fichtner et al. (2009)

The long “Heavy” tail Integrated & Federated Data Infrastructures for Solid Earth Sciences in Europe (EPOS) 1 ≈ PB/year & data diversity (type & formats)

Ethic Issues

18 e-IRG Workshop, May, 2013, Dublin (IRL) Summary Individual communities have their own thematic services developed throughout many years and, in general, they are happy with them (!) ➜ ad hoc solutions In solid Earth sciences (EPOS), data sharing has enormous potential but there may not yet be enough consciousness of the scientific problems that can be addressed, i.e., a new typology of scientists targeting multidisciplinary problems is to be formed Building an e-infrastructure is very demanding given the diversification of the communities in terms of different levels of data organization development/maturity and willingness to be part of Must not loose pieces (communities) along the way ➜ capitalize on the existing developments and introduce novelties by making synergy with the different projects and the communities ➜ efficient communication policy.

19 e-IRG Workshop, May, 2013, Dublin (IRL) Summary (cont’d) To achieve the best results, it needed continuous orchestration between scientific communities and ITs (e.g., scalability, AAI) EUDAT can represent a data organization model and services which can be instrumental toward EPOS e-infrastructures EUDAT and VERCE are posing particular attention to large-to-huge data volumes analyses The communities are undergoing a positive, maturation process and the ITs are understanding progressively the problems of the formers and envisaging solutions ➜ mutual trust and synergy Interactions with industry in Earth sciences require effective strategies and particular attention (ethic issues, use and re-use of scientific data)

Thank you for attention Research infrastructures and e-science for data and observatories on Geo-Hazards and Geo-Resources

The EPOS chain: high gain/high-but manageable riskrisk

22 e-IRG Workshop, May, 2013, Dublin (IRL) Comments on data sharing in EPOS EPOS (sub-)communities feature very different levels of data organization development/maturity Most communities have developed in-house their own data services Many communities are already striving for their own data archive and services and they are afraid and in some cases difficult to share their data (e.g., why should I put resources in changing what I am doing if I can barely keep track of the services I am compelled to provide ?) Many communities think they have already the best services (i.e., they can carry out their own research!) and they do not see why the data should be shared (or better qualified). Overall, it is a slow process to introduce new concepts, to adopt the same jargon and users/scientists often not yet ready BUT it is a positive maturation process

23 e-IRG Workshop, May, 2013, Dublin (IRL) EPOS KEYWORDS Integration of the existing national and trans-national RIs Interoperability of thematic (community) services across several multidisciplinary communities Open access to a multidisciplinary research infrastructure for promoting cross-disciplinary research Acknowledgment of the data source Progress in Science through prompt and continuous availability of high quality data and the means to process and interpret them (e.g., explore and mine large data volumes, results easily reproducible/replicable) Data infrastructures and novel core services will contribute to information, dissemination, education and training. Implementation plans, which require strategic investment in research infrastructures at national and international levels. Societal contributions, e.g., hazard assessment and risk mitigation

I.Data and service providers from the solid Earth sciences (  National data and service providers  International data and service providers  Data products providers II.Scientific User Community  Researchers from solid Earth Science  Solid Earth science community projects (NERA, SHARE, REAKT,....)  Training and educational institutions, projects and initiatives  Researchers and organizations from outside the solid Earth sciences III.Governmental Organizations  National governments  Funding agencies  Civil protections authorities  European Commission IV.Other data and service providers and users  IT projects and experts, Industry, Private data and service providers V.General Public EPOS Stakeholders

EPOS Solid Earth RIs Data Providers from solid Earth Science Governments & Funding Agencies Other Stakeholders ICT Industry / Public Other disciplines Users Community (academia) Training & Education data modeling (new products) Stakeholder Strategy

Thematic Services (TCS) WG1 - Seismology WG2 - Volcanology WG3 – Geological Data WG4 – GNSS Data WG6 – Analytical and Experimental Laboratories WG8 – Satellite Data WG10 - Infrastructures for Georesources WG 9 – Geomagnetic Observ. Governance Data Products Services WG5 Near Fault Observatories