BR 1 SIMDAT HALO meeting –11.07.06 Meteo Activity of the SIMDAT project: Building components of the WIS Baudouin Raoult ECMWF.

1 Meteo Activity of the SIMDAT project: Building components of the WIS
Baudouin Raoult, ECMWF

2 Data Grids for Process and Product Development using Numerical Simulation and Knowledge Discovery
4-year project funded by the EU
-Contract with the EU was signed on 1 September 2004
SIMDAT focuses on 4 application areas:
-product design in automotive and aerospace
-process design in pharmacology
-service provision in meteorology
Budget of 11 M
Phase 1: Connectivity, Phase 2: Interoperability, Phase 3: Knowledge
-Deployment of Grid infrastructure with particular attention to data transport and management
-Distributed DB access
-Virtual Data Repository
-Introduction of Grid technologies research
-Introduction of VO
-Integration of analysis services, workflows, discovery and data mining

3 SIMDAT Meteorology Partners
22 members in the consortium
Deutscher Wetterdienst (DWD), ECMWF, EUMETSAT, Météo France, UK Met Office
Intel, Ontoprise, IBM, IT Innovation, NEC

4 Meteo activity
To build an integrated and scalable framework for the collection and sharing of distributed data (WIS building blocks)
-Instead of each National Met Service having a GISC, a virtual GISC
-2 DCPCs: ECMWF, EUMETSAT
Service-oriented framework targeting meteorology, hydrology, climate and environment and offering transparent access to distributed resources
-Grid-enabled software
-Services to process the data, elaborate products and visualize those products
Some key elements of the project are:
-A single view of meteorological information which is distributed amongst the 5 partners
-Improve visibility of and access to meteorological data through a comprehensive discovery service
-Offer a variety of reliable services for routine dissemination and for collection of data
-Provide a global access control policy managed by the partners and integrated into their existing security infrastructure
320 man-months, taking into account the technology contribution to the meteo application

5 Virtual meteorological Centre - functional view
Through the Distributed Portal, users search for and retrieve data, and subscribe to services such as routine dissemination, subject to authentication and authorization
The Virtual Database Service provides a single view of the partners' databases

6 Architectural Choices
Catalogue duplicated and synchronized at each site
-To have fast discovery (browse & search phase) and a reliable system (client redirection to another node)
Build an open and flexible framework integrating technologies from different areas
-Allows picking the best components of each Grid middleware (Globus, OGSA-DAI)
-Associate J2EE and Grid/Web Services technologies to build solid components
QoS and robustness are amongst the top priorities of the project
-Framework based on J2EE components
-Use pipelining, priority and queuing mechanisms to process users' requests
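
As a rough illustration of the priority and queuing idea on this slide, the following Python sketch serves user requests by priority and then by arrival order. The `RequestQueue` class, the priority values and the request names are hypothetical; the slides describe the actual framework as J2EE-based and do not show its queuing code.

```python
import heapq
import itertools


class RequestQueue:
    """Hypothetical priority queue for user requests (lower number = served first)."""

    def __init__(self):
        self._heap = []
        # Monotonic counter used as a tie-breaker so that requests with the
        # same priority are served in arrival (FIFO) order.
        self._counter = itertools.count()

    def submit(self, priority, request):
        heapq.heappush(self._heap, (priority, next(self._counter), request))

    def next_request(self):
        """Return the most urgent pending request, or None if the queue is empty."""
        if not self._heap:
            return None
        _, _, request = heapq.heappop(self._heap)
        return request


queue = RequestQueue()
queue.submit(5, "archive retrieval")
queue.submit(1, "real-time observation")
queue.submit(5, "second archive retrieval")

served = [queue.next_request() for _ in range(3)]
print(served)
```

The real-time request is served first even though it was submitted second; the two archive requests keep their submission order.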

7 Architecture
3 main components to build the virtual database: Data Repository, Catalogue Node and Portal
-Installed on each partner site and interconnected through a dedicated secure connection channel
Data Repository
-Interface to the partners' databases
-Offers metadata information to describe, search and locate data
-Offers an interface to retrieve data from the associated local databases
Catalogue Node
-Maintains the registry and ensures synchronisation
-Harvests metadata and requests data from the Data Repository
-Ingests data and maintains the cache of the real-time data
-Serves clients: Portal or other Nodes
-Monitors the execution of the requests
Distributed Portal
-Offers an interface to search/browse the catalogue

8 Architecture (cont.)

9 WMO Core metadata standard
WMO Core Profile 0.2, a profile of ISO 19115 on geo-referenced data
Not scalable
-Records are large and contain redundant information, slowing down the database hosting the catalogue
-Same information repeated in all metadata records
Unnecessary information is circulating over the network
-Some documents are orders of magnitude larger than the data itself
-Cannot represent very large archives with small granularity
Cannot fulfil all requirements to build the Virtual Meteorological Centre; it lacks:
-Information on how to retrieve data from local databases
-Information to create a directory (taxonomy of documents)
-Information to sub-select data from a dataset

10 Solutions
Split XML documents into fragments to solve the scalability issue
-WMO core metadata is structured
-Some parts are shared amongst many documents
Add specific extensions to define all relevant information needed to implement the system and not defined by the WMO core
-Internal unique ID
-Hierarchy relationship
-Physical location (which node holds the data)
-Information used to generate a valid request to retrieve data from the end system
-Information used to create a web interface for the end user
Work with WMO to integrate the extensions in future releases of the standards
Example fragments: Core: WMO; Owner: UKMO; Data type: Synop; Location: Heathrow; Date: 2005-10-12
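
The fragment idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the element names (`record`, `owner`, `type`, `location`, `date`), the `fragment_store` dictionary and the use of an MD5 digest as fragment identifier are all hypothetical and are not taken from the WMO Core Profile or the SIMDAT software. The point is that parts shared by many records are stored once and referenced by ID.

```python
import hashlib
import xml.etree.ElementTree as ET

# Hypothetical store: fragment id -> serialized XML fragment, kept only once.
fragment_store = {}


def split_record(xml_text):
    """Split a record into its top-level child fragments and return their ids."""
    ids = []
    for child in ET.fromstring(xml_text):
        serialized = ET.tostring(child, encoding="unicode").strip()
        frag_id = hashlib.md5(serialized.encode("utf-8")).hexdigest()
        # A fragment already seen in another record is not stored again.
        fragment_store.setdefault(frag_id, serialized)
        ids.append(frag_id)
    return ids


def rebuild_record(ids):
    """Reassemble a full record from its fragment ids."""
    return "<record>" + "".join(fragment_store[i] for i in ids) + "</record>"


rec_a = ("<record><owner>UKMO</owner><type>Synop</type>"
         "<location>Heathrow</location><date>2005-10-12</date></record>")
rec_b = ("<record><owner>UKMO</owner><type>Synop</type>"
         "<location>Heathrow</location><date>2005-10-13</date></record>")

ids_a = split_record(rec_a)
ids_b = split_record(rec_b)

# Eight fragments were referenced, but only five distinct ones are stored.
print(len(ids_a) + len(ids_b), len(fragment_store))
```

Here the two records differ only in their date, so the owner, data type and location fragments are shared between them.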

11 WMO Information System (WIS) Requirements
Support a variety of data types (common to all WMO Programmes)
Support archive and real-time datasets
Build a catalogue of all the meteorological data for exchange to support WMO programmes
Support ad-hoc requests for data and products: Pull model
Support routine dissemination of all observed data and products, both real-time and non-real-time: Push model
Support network security
Support different user profiles and data policies
Use different types of communication links (GTS, satellite, dedicated links)

12 WIS Requirements
Support variety of data types

13 Data Repository Functions
Interface to the existing meteorological databases
-It provides access to any kind of database (RDBMS, bespoke, flat files)
Metadata provider
-Provides metadata information to discover, locate and describe data, in accordance with a defined XML metadata format
-Answers Catalogue Node metadata harvesting messages
Data provider
-Provides an interface to asynchronously request data from the associated existing database (to support real-time & archive datasets)
-Transforms the XML data request into the real database request
-Offers a data channel (HTTP, FTP, …) to send the retrieved data to the Catalogue Node

14 Data Repository Implementation
Implemented as a web service using a document-based interface
-Protocol entirely described in an XML message
-Independent of the network transport (HTTP, SOAP, etc.)
Three transport methods are supported
-OGSA-DAI WSRF
-Web Services (WS-I, WSDL, SOAP)
-REST (XML over HTTP)
VMCMessage Protocol
-A set of XML messages has been defined for metadata harvesting (Info, GetMetadataRecord)
-A set of XML messages has been defined for data requesting (Submit, GetSubmitStatus, DeleteRequest)
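
To make the document-based interface more concrete, here is a minimal Python sketch of building and reading such a message. Only the operation names (Info, GetMetadataRecord, Submit, GetSubmitStatus, DeleteRequest) come from the slide; the `VMCMessage` envelope element, the `requestId` field and the helper functions are assumptions made for illustration, since the actual message schema is not shown.

```python
import xml.etree.ElementTree as ET

# Operation names as listed on the slide.
OPERATIONS = {"Info", "GetMetadataRecord", "Submit", "GetSubmitStatus", "DeleteRequest"}


def build_message(operation, **fields):
    """Build a transport-independent XML message (hypothetical layout)."""
    if operation not in OPERATIONS:
        raise ValueError(f"unknown operation: {operation}")
    root = ET.Element("VMCMessage")          # assumed envelope name
    op = ET.SubElement(root, operation)
    for name, value in fields.items():
        ET.SubElement(op, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


def parse_operation(xml_text):
    """Return the operation name carried by a message."""
    return ET.fromstring(xml_text)[0].tag


message = build_message("GetSubmitStatus", requestId="42")
print(message)
print(parse_operation(message))
```

Because the whole protocol lives in the XML document, the same string could be sent unchanged over any of the three transports listed above.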

15 WIS Requirements
Support real-time data
Datasets: Era40 Reanalysis Data, IAA NWP Outputs Data, Unidart Climate Data, JEDDS Aeronautical Data, UMARF Satellite Data

16 Realtime Data Repository
A GTS Data Repository is being developed by Meteo-France
-Interfaced with the GTS (through an MSS)
-It publishes GTS collections
For phase II: one source providing GTS data
-No data replication over the SIMDAT infrastructure
For phase III: several sources plugged onto SIMDAT
-Strategy to uniquely identify the datasets (using MD5 hash codes)
-Real-time data replication using the metadata synchronization mechanism
Generic solution which can be used by all the partners
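
A minimal Python sketch of identifying datasets by MD5 hash code is shown below. The slide does not say exactly what is hashed, so hashing the raw bytes of a bulletin is an assumption, and the sample payloads and the `dataset_id` helper are hypothetical.

```python
import hashlib


def dataset_id(payload: bytes) -> str:
    """Return an MD5-based identifier for a dataset (assumed: hash of the raw bytes)."""
    return hashlib.md5(payload).hexdigest()


# The same bulletin received from two different sources gets the same id,
# so a node can recognise it as a duplicate instead of storing it twice.
from_source_1 = b"example bulletin content"
from_source_2 = b"example bulletin content"
other_bulletin = b"another bulletin content"

seen = {}
for source, payload in [("source 1", from_source_1),
                        ("source 2", from_source_2),
                        ("source 1", other_bulletin)]:
    seen.setdefault(dataset_id(payload), source)

print(len(seen))  # number of distinct datasets
```

With several sources plugged in, identical content arriving twice collapses to one entry, which is what makes replication across nodes tractable.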

17 WIS Requirements
Build a Catalogue of all the available meteorological products

18 Catalogue Node
The Catalogue is built using the metadata harvested from the Data Repositories
The Catalogue is synchronized and replicated on each Catalogue Node
The Catalogue offers discovery services accessible to the user through the Distributed Portal
The Catalogue contains the necessary information to retrieve and sub-select the data

19 SIMDAT Infrastructure
Support ad-hoc requests for data & products: Pull model

20 Distributed Portal
A Portal is deployed on each site and offers a unique view of all the datasets available
The Portal offers discovery mechanisms to the users
-Full text, temporal and geographical search (Google-like)
-Directory browsing (Yahoo-like browsing)
The Portal provides request handling mechanisms to the users
-Submitted requests can be asynchronous to manage long-lived requests
-A user can manage their requests (check status, delete them …)
-A user retrieves the associated data when the request is complete
The Portal uses the information contained in the metadata to create the data sub-selection forms
-The metadata/data providers define how to access their datasets

21 How to create the database requests?
Keep the request language of the different databases
-Non-intrusive solution
Add information in the metadata extension to build the end system request:
-One extension element holds the information specific to generating a valid request to the data repository
-A second extension element holds the information on how to create a web interface that lets the user select items from the dataset
The web portal uses the latter element to present selection dialogues to the user
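
One possible way to picture this mechanism is sketched below in Python. Everything in it is hypothetical: the `dataset_extension` dictionary stands in for the two metadata extension elements, and the request syntax (`retrieve,param=...`) is invented for illustration rather than taken from any partner's database. The sketch shows the portal validating the user's selections against the allowed values and then filling in the provider's own request template, so the native request language is kept.

```python
from string import Template

# Hypothetical metadata extension for one dataset: a request template in the
# database's native language, plus the values the portal may offer to the user.
dataset_extension = {
    "request_template": Template("retrieve,param=$param,date=$date,target=$target"),
    "selection": {
        "param": ["temperature", "wind"],
        "date": ["2006-07-10", "2006-07-11"],
    },
}


def build_request(extension, choices, target):
    """Validate the user's choices and fill in the provider's request template."""
    missing = set(extension["selection"]) - set(choices)
    if missing:
        raise ValueError(f"missing selections: {sorted(missing)}")
    for name, value in choices.items():
        allowed = extension["selection"].get(name)
        if allowed is None or value not in allowed:
            raise ValueError(f"invalid choice for {name}: {value}")
    return extension["request_template"].substitute(target=target, **choices)


request = build_request(dataset_extension,
                        {"param": "wind", "date": "2006-07-11"},
                        "out.dat")
print(request)
```

The same `selection` information could drive the sub-selection form shown to the user, which is why a single metadata record can serve both purposes.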

25 WIS Requirements
Support routine dissemination of all observed data and products, both real-time and non-real-time: Push model
Dissemination/Subscription
-Will be addressed in phase III of the project

26 WIS Requirements
Support network security
-Inter-node communications secured using SSL

27 WIS Requirements
Support of different user profiles & data policies
Virtual Organization implementation:
-Framework study and investigation in Phase II
-First stable version to be delivered for Nov 06

28 VO Domains
Domain
-Group of organisations that share a common policy (e.g. the RA-VI V-GISC)
-The VO might contain a number of sub-domains
Authentication (AuthN)
-Users register with a node
-Users are known to all the nodes in the same domain
-Any node within the domain should be able to authenticate a user of the domain
Authorisation (AuthZ)
-AuthZ is performed at the node level to allow/deny access to the data
-Data access policy is expressed within the metadata
[Figure: nodes A, B, C, F, E grouped into VO domains D1 and D2]
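
The domain rules above can be sketched in a few lines of Python. This is an illustration only: the assignment of nodes to domains, the user names and the `allowed_domains` metadata field are all hypothetical (the figure does not survive in this transcript, and the slides do not show how the policy is encoded).

```python
# Hypothetical assignment of nodes to domains and of users to the domain
# of the node they registered with.
NODE_DOMAIN = {"A": "D1", "B": "D1", "C": "D1", "E": "D2", "F": "D2"}
USER_DOMAIN = {"alice": "D1", "bob": "D2"}


def authenticate(node, user):
    """AuthN: any node of a domain can authenticate a user of that domain."""
    domain = USER_DOMAIN.get(user)
    return domain is not None and domain == NODE_DOMAIN.get(node)


def authorise(node, user, metadata_record):
    """AuthZ: decided at the node, using the access policy carried in the metadata."""
    if not authenticate(node, user):
        return False
    return USER_DOMAIN[user] in metadata_record["allowed_domains"]


record = {"title": "example dataset", "allowed_domains": {"D1"}}

print(authenticate("B", "alice"))       # alice registered in D1, B is in D1
print(authorise("A", "alice", record))  # policy allows D1
print(authorise("F", "bob", record))    # bob is in D2, policy allows only D1
```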

29 Cross-domain issues
Metadata is visible across all domains
-But some metadata can be explicitly hidden
Cross-domain authorisation involves user registration
-A user from domain D2 wanting to access data which is limited to domain D1 will have to register with domain D1
Cross-domain authentication will be recognised on the basis of a previously established trust relationship
-Authenticated users coming from D2 into D1 will be checked against the trusted CA domains
The concept of domain needs to be validated by the VO working group
[Figure: nodes A, B, C, F, E grouped into VO domains D1 and D2]

30 WIS Requirements
Use different types of communication links
-Currently deployed on the Internet
-Phase II: study on a dual RMDCN/Internet deployment for production
-Phase III: RMDCN deployment and Eumetcast integration study

31 What do you need to publish data?
Installation
-Install a Catalogue
-Install a Data Repository
Develop a module to request data from the existing database
-With the zero-development Data Repository, it can simply be a shell script calling the database client
Define the metadata describing the datasets
-Define the discovery information (keyword, geographical, temporal)
-Define how to request the database: the static information necessary to access the database, and how to sub-select data
-A metadata definition wizard is being developed

32 Milestones
Synchronization Engine Enhancements - June 06
Mesh Network Management Software - June 06
-Led by Intel and fully compatible with the new synchronization engine
WSRF interfaces implementation - Sep 06
Metadata Manager migration toward ebXML
-Led by UKMO, feasibility study by June 06
Development of a Real-time Data Repository
-To acquire GTS observations: led by Meteo-France, first implementation by Sep 06
Implementation of the security services of the VO - Feb 07
Ontology-based discovery service
-First Thesaurus implementation Sep 06, discovery interface Mar 07

33 CBS conference demonstration
Meshed network of GISCs and DCPCs
Based on SIMDAT software and including the 5 European partners, JMA, CMA, BoM, NCAR, NODC
-JMA, CMA, BoM fully integrated in the grid architecture
-NCAR acting as DCPC and providing metadata information via OAI
-NODC currently investigating the SIMDAT software

34 Results Achieved
Unified Catalogue based on WMO Core Profile v0.2
First element of the security infrastructure
Five (+2.5) Meteorological Centres interconnected and exchanging data and metadata
Users able to search, browse and retrieve data distributed amongst the partners
Datasets: Era40 Data, IAA Data, UNIDART Data, JEDDS Data, UMARF Satellite Data

35 Results Achieved (cont.)
Flexible, non-intrusive architecture
-Supports any kind of database (RDBMS, XML, flat file, object, bespoke)
-Zero-development Data Repository
-Supports asynchronous requests (archive, long requests)
Interest shown by the meteorological community:
-JMA (Japan) and CMA (China) fully integrated
-BoM (Australia), KMA (Korea) and NODC (Russia) in progress
-NCAR (US) catalogue is harvested using OAI; users are redirected to the NCAR portal
SIMDAT work feeds back into WMO through expert teams:
-ET-WISC: SIMDAT Meteo requirements are now used as the WIS requirements
-IPET-MI: findings have been used for the definition of the WMO Core Profile 0.3
-ET-CTS: SIMDAT infrastructure is seen as a major infrastructure for implementing the WIS
