DI4R, 30th September 2016, Krakow


1 DI4R, 30th September 2016, Krakow
Collection and Analysis of Ocean Big Data: Building the EMSODEV Data Management Platform using EGI Federated Cloud. Pasquale Andriani

2 ..about EMSODEV

3 EMSODEV scenario
[Diagram: DATA MANAGEMENT PLATFORM. Data ingestion (real-time, asynchronous; ingestion speed). Data access (DMP tools).]

4 DMP Design 1/2
ENVRI Reference Model v2.0 phases: data acquisition, data curation, data publishing, data processing, data use. The Computational Viewpoint (CV) was used to identify a standard set of components (CV Objects) and interfaces that inspired the design of the EMSODEV DMP architecture in its different phases.

5 DMP Design 2/2
EMSODEV DATA MANAGEMENT PLATFORM: mapping of ENVRI CV Objects to EMSODEV DMP architectural components.
Phase labels in the diagram: data use, data acquisition, data curation, data processing, data publishing.
<<external resource>> EMSO Regional Nodes Data Files
<<experimental lab>> EMSODEV API
<<security service>> Authentication & Authorization Tool
<<virtual lab>> DMP Tools
<<experimental lab>> Data Analysis Tool
<<instrument controller>> Sensor Observation Service
<<data transfer service>> Transfer Flow Orchestrator
<<raw data collector>> Push Transfer Flow, Pull Transfer Flow
<<data store controller>> NoSQL DBs, Streaming Store Controller, Distributed File System, Time Series DB
<<catalogue service>> Metadata and Service Repository
<<data exporter>> Dataset Exporter
<<data importer>> Processing Results Importer, Regional Node Importer
<<data stager>> Stager Engine
<<process controller>> Batch Processor Engine, Streaming Processor Engine
<<coordination service>> Analysis Manager
<<data broker>> Broker Engine
Diagram note: Instrument Controller in the Data Acquisition

6 DMP infrastructure on EGI FedCloud
EMSODEV DMP current prototype. Test VO: fedcloud.egi.eu. Cloud Compute: 8 VMs (8 CPUs + 16GB RAM + 40GB HD).
EMSODEV DMP (SLA requested in early August 2016). Production VO: vo.emsodev.eu. Cloud Compute: ~10 VMs (8 CPUs + 16GB RAM + 40GB HD). File Storage: 5 TB.

7 DMP Operation – Apache Ambari
Dashboard for provisioning, managing, monitoring and securing the EMSODEV cluster hosting the EMSODEV DMP.

8 Data Acquisition and Curation
At this stage, two raw data collectors exist:
A Pull Transfer Flow: data is retrieved via the API exposed by an OGC SOS server at the OBSEA observatory, located in Vilanova and managed by Universitat Politècnica de Catalunya.
[Diagram: the EMSODEV Data Management Platform calls the SOS server API (GetCapabilities, GetObservation, DescribeSensor) to obtain OBSEA data.]
A Push Transfer Flow: data is sent to a DMP service which "listens" for near-real-time updates to XML files describing sensor data and measurements.
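The three SOS operations named in the Pull Transfer Flow above can be illustrated with a minimal Python sketch that only builds the request URLs. It assumes the key-value (GET) binding with SOS 2.0 parameter names; the slides do not state the SOS version or binding used at OBSEA. The endpoint, sensor identifier, offering and property names are hypothetical placeholders, not values from the presentation.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the real OBSEA SOS URL is not given in the slides.
SOS_ENDPOINT = "https://sos.example.org/service"


def sos_url(request, version="2.0.0", **params):
    """Build a key-value GET URL for one SOS operation.

    Parameter names follow the SOS 2.0 KVP binding (an assumption).
    GetCapabilities is sent without a fixed version so the server
    can negotiate one.
    """
    query = {"service": "SOS", "request": request}
    if request != "GetCapabilities":
        query["version"] = version
    query.update(params)
    return SOS_ENDPOINT + "?" + urlencode(query)


# 1. Discover what the server offers.
capabilities_url = sos_url("GetCapabilities")

# 2. Fetch the description of one sensor (identifier is made up).
sensor_url = sos_url(
    "DescribeSensor",
    procedure="urn:example:sensor:ctd-1",
    procedureDescriptionFormat="http://www.opengis.net/sensorml/1.0.1",
)

# 3. Pull observations for one offering and property (names are made up).
observation_url = sos_url(
    "GetObservation",
    offering="example-offering",
    observedProperty="urn:example:property:sea_water_temperature",
)

for url in (capabilities_url, sensor_url, observation_url):
    print(url)
```

A pull collector would typically call GetCapabilities once, then poll GetObservation on a schedule; the actual fetching and scheduling are left out here because the slides do not describe them.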

9 Data Publishing and Use
Real-time dashboard solution for time-series data analysis.
After the storing phase, data is visualized in a real-time dashboard; in particular, we are testing Elasticsearch with Grafana. These two tools were chosen because Elasticsearch allows real-time data search, Grafana provides real-time summaries and charting, and both are open source under the Apache 2 license, one of the common license types in research projects.
Features highlighted: real-time data search; real-time advanced analytics; schema-free; real-time summary and charting; Apache 2 open source license; distributed, scalable, and highly available.
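As an illustration of how stored time-series readings could reach Elasticsearch for charting in Grafana, the sketch below serializes a few readings into the newline-delimited body accepted by the Elasticsearch `_bulk` endpoint. It is not the EMSODEV implementation: the index name, field names and sample values are made up for the example, and older Elasticsearch releases (such as those current in 2016) would additionally require a `_type` in the action line.

```python
import json
from datetime import datetime, timezone


def to_bulk_payload(index, readings):
    """Serialize readings into an Elasticsearch _bulk request body.

    Each document is preceded by an "index" action line, and the body
    must end with a newline. Field names are hypothetical.
    """
    lines = []
    for reading in readings:
        lines.append(json.dumps({"index": {"_index": index}}))
        doc = {
            # A timestamp field is what a Grafana time-series panel plots against.
            "@timestamp": datetime.fromtimestamp(
                reading["epoch"], tz=timezone.utc
            ).isoformat(),
            "sensor": reading["sensor"],
            "parameter": reading["parameter"],
            "value": reading["value"],
        }
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"


# Made-up sample readings, for illustration only.
readings = [
    {"epoch": 1475193600, "sensor": "ctd-1", "parameter": "temperature", "value": 18.4},
    {"epoch": 1475193660, "sensor": "ctd-1", "parameter": "temperature", "value": 18.5},
]

payload = to_bulk_payload("emsodev-demo", readings)
print(payload)
```

The resulting string would be sent with content type `application/x-ndjson` to the cluster's `_bulk` endpoint; the HTTP call is omitted because the slides give no cluster address.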

10 DMP preliminary REST API
Built and managed through: Swagger Editor, Swagger UI, Swagger Codegen

11 Advantages of using EGI FedCloud
A ready-to-use IaaS on which to deploy on-demand IT services
Easy VM and security management via OpenStack Horizon
Scalable according to community needs (within the boundaries established through the SLA)
Secure VM access via a mechanism (VOMS credentials) based on proxy credentials issued and verified by EGI
Fast and reliable support (ggus.eu trouble-ticketing and by mail)

12 Questions
Pasquale Andriani, Engineering Ingegneria Informatica SpA, Italy

