
Distributed Databases and metadata G. Bégni, H. Makhmara - MEDIAS-France July 18, 2004 ENVIROMIS Tomsk.


1 Distributed Databases and metadata G. Bégni, H. Makhmara - MEDIAS-France July 18, 2004 ENVIROMIS Tomsk

2 Aims of the presentation
- Understanding the principles of metadata and databases.
- Making the scientific community aware of the efforts expected in terms of data documentation.
- Highlighting the positive impacts of such efforts.
- Demonstrating the need for an easy way to access distributed databases.

3 Approach
- Presentation of the AMMA context and its constraints: status of the problem.
- Reflection on a solution.
- Abstract description of the various elements that make up the solution.
- Selection and justification of standards and techniques.
- Assessment of selections.

4 AMMA context
- Scientific level: multi-disciplinary, multi-scale.
- Technical level: multi-format, multi-volume, multi-structure, multi-location.
- Cultural level: multi-language, multi-usage, multi-possibilities.

5 Constraints involved
- Providing the various communities with the best-suited access to data (language, medium, cost, services…).
- Guaranteeing the durability of data wherever they are produced.
- Ensuring the durability of services as time goes by (technological developments).

6 Access services
- Easy web interface for searching and locating data (geographical, temporal, thematic, keywords).
- Transparent service to access heterogeneous distributed data (possibilities of compiling…).
- Homogeneous documentation for heterogeneous data in order to optimise their exploitation.

7 Data durability
- Multiple and systematic back-up procedure.
- Data transparency in relation to technological changes (hardware, software).
- Transparent data exploitation as time goes by.

8 A solution
- Fully defined back-up process.
- Data storage in standardised formats.
- Clear data documentation for future exploitation.

9 Service durability
- Services should not depend on any proprietary or « exotic » software.
- The quality of a service should not deteriorate as technology changes.

10 A solution
- Services based on standards.
- Services based on open-source software.

11 To sum up
- Standardise storage.
- Standardise services.
- Standardise exploitation.
- However, some data formats cannot be standardised (satellite imaging).
- Neither can the related services.

12 Principles applied
- Every item liable to be standardised should be standardised.
- There should be a system gateway based on standards only.
- Every item that cannot be standardised should be described in a standardised way.

13 A standard for each element
- Data storage: ANSI/ISO, SQL, XML.
- Data description: FGDC-STD or ISO.
- Service description: W3C SOAP.
- Catalogue: ANSI/ISO (Z39.50).

14 Data description: metadata
- Formed from a Greek root (« meta »).
- What surpasses, encompasses a subject, a science. (Le Robert Dictionary).
- Denoting a nature of a higher order or more fundamental kind. (Oxford Talking Dictionary).
- English: metadata. French: métadonnées.
- Literally speaking, metadata are data about data.
- To be more precise, they are structured sets of information that describe resources.
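A minimal sketch of what "a structured set of information that describes a resource" can look like in practice. The field names, the element names and the example values below are assumptions chosen for illustration; they are not FGDC or ISO elements and do not come from the presentation.

```python
from dataclasses import dataclass, field
import xml.etree.ElementTree as ET


@dataclass
class MetadataRecord:
    """Illustrative record; field names are hypothetical, not taken from a standard."""
    title: str
    abstract: str
    keywords: list = field(default_factory=list)
    bbox: tuple = (0.0, 0.0, 0.0, 0.0)  # west, south, east, north (degrees)
    start_date: str = ""                # ISO 8601 date strings
    end_date: str = ""


def to_xml(record: MetadataRecord) -> str:
    """Serialise the record to a small, self-describing XML document."""
    root = ET.Element("record")
    ET.SubElement(root, "title").text = record.title
    ET.SubElement(root, "abstract").text = record.abstract
    keywords = ET.SubElement(root, "keywords")
    for kw in record.keywords:
        ET.SubElement(keywords, "keyword").text = kw
    west, south, east, north = record.bbox
    ET.SubElement(root, "bbox", west=str(west), south=str(south),
                  east=str(east), north=str(north))
    ET.SubElement(root, "period", start=record.start_date, end=record.end_date)
    return ET.tostring(root, encoding="unicode")


# Hypothetical dataset description, for illustration only.
example = MetadataRecord(
    title="Daily rainfall, hypothetical station network",
    abstract="Gauge measurements described for later reuse.",
    keywords=["rainfall", "in situ"],
    bbox=(-20.0, 0.0, 30.0, 25.0),
    start_date="2000-01-01",
    end_date="2002-12-31",
)
xml_text = to_xml(example)
```

The point of the sketch is only that the description is structured: each piece of information sits in a named field, so software can search, validate or convert it.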

15 Metadata standards
- Metadata have always existed.
- An effort of world-wide standardisation has been undertaken for several years.
- Several (georeferenced) standards:
  1. Content Standard for Digital Geospatial Metadata: FGDC-STD ISO since the end of
- FGDC is a de facto standard.

16 Advantages
- Homogeneous presentation.
- Pooled developments.
- Possibility to automate data processing.
- Comparison of examples:
  1. GeoConnections Portal, Canada
  2. Portal on desertification monitoring (OSS/Medias/SCOT)


22 Efforts asked from data providers
- Be aware of standards.
- Endeavour to describe data as completely as possible.
- Use data exchange formats as simple and consistent as possible.
- Data providers do not have to care about the technical or formal aspects of standards.
- Database managers will provide them with easy and user-friendly tools to describe their data.


25 AMMA information system architecture
Components (diagram labels):
- MetaCatalog (portal to the AMMA I.S.)
- Meta database (ISO and/or FGDC)
- DB AMMASAT, DB LOP, DB SOP
- Exchange protocol
Steps:
1. Search by criteria (user-friendly interface)
2. Query metadata
3. Retrieve metadata
4. Choose datasets
4. Query data
5. Locate and query datasets from relevant data sources
6. Retrieve datasets
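The numbered steps of the architecture can be sketched roughly as follows, assuming the meta database is reduced to an in-memory list. The dataset identifiers, source names, keywords and coordinates are all hypothetical; a real portal would query the catalogue through the exchange protocol instead.

```python
from datetime import date

# Hypothetical metadata entries standing in for the meta database.
CATALOGUE = [
    {"id": "ds1", "source": "source_a", "keywords": {"rainfall"},
     "bbox": (-10.0, 5.0, 10.0, 20.0),
     "period": (date(2000, 1, 1), date(2002, 12, 31))},
    {"id": "ds2", "source": "source_b", "keywords": {"aerosol", "satellite"},
     "bbox": (-20.0, 0.0, 30.0, 25.0),
     "period": (date(2003, 1, 1), date(2004, 6, 30))},
    {"id": "ds3", "source": "source_a", "keywords": {"rainfall", "satellite"},
     "bbox": (40.0, 50.0, 60.0, 60.0),
     "period": (date(2001, 1, 1), date(2001, 12, 31))},
]


def boxes_overlap(a, b):
    """True if two (west, south, east, north) boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def periods_overlap(a, b):
    """True if two (start, end) date ranges intersect."""
    return a[0] <= b[1] and b[0] <= a[1]


def search(catalogue, bbox=None, period=None, keyword=None):
    """Steps 1 to 3: apply the user's criteria to the metadata only."""
    hits = []
    for entry in catalogue:
        if bbox is not None and not boxes_overlap(entry["bbox"], bbox):
            continue
        if period is not None and not periods_overlap(entry["period"], period):
            continue
        if keyword is not None and keyword not in entry["keywords"]:
            continue
        hits.append(entry)
    return hits


def group_by_source(hits):
    """Step 5: work out which data source must be queried for which dataset."""
    plan = {}
    for entry in hits:
        plan.setdefault(entry["source"], []).append(entry["id"])
    return plan


hits = search(CATALOGUE, bbox=(-5.0, 10.0, 5.0, 15.0), keyword="rainfall")
plan = group_by_source(hits)
```

Only metadata are touched until the user has chosen datasets; the data sources themselves are contacted afterwards, one request per source in `plan`.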

26 Technical diagram
Diagram labels:
- Catalogue service (any user)
- Edition service (data provider)
- Metadata creation - validation: web forms, import XML
- XML records
- Zebra indexer
- Zebra server
- Z39.50
- YAZ / PHP
- ZOOM
- ZAP client
- Other catalogues (GCMD, Clearinghouse FGDC)

27 Characteristics
- Management of multi-standard metadata:
  - ISO
  - FGDC
  - DIF if XML schema.
- Transparent to the data provider.
- Transparent to the user.
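One possible way to keep several metadata standards transparent to the user is to map each one onto a common view through a lookup table. The sketch below is an assumption about how that could be done, not a description of the MEDIAS-France software. The two sample records are simplified, namespace-free fragments written for this example; the FGDC paths follow the CSDGM short element names and the DIF record uses its `Entry_Title` and `Summary` elements. ISO records could be added to the same table with namespaced paths.

```python
import xml.etree.ElementTree as ET

# Simplified sample records, invented for this illustration.
FGDC_SAMPLE = (
    "<metadata><idinfo>"
    "<citation><citeinfo><title>Example FGDC title</title></citeinfo></citation>"
    "<descript><abstract>Example FGDC abstract</abstract></descript>"
    "</idinfo></metadata>"
)
DIF_SAMPLE = (
    "<DIF><Entry_Title>Example DIF title</Entry_Title>"
    "<Summary>Example DIF summary</Summary></DIF>"
)

# Root element name -> where each common field lives in that standard.
FIELD_PATHS = {
    "metadata": {"standard": "FGDC",
                 "title": "idinfo/citation/citeinfo/title",
                 "abstract": "idinfo/descript/abstract"},
    "DIF": {"standard": "DIF",
            "title": "Entry_Title",
            "abstract": "Summary"},
}


def normalise(xml_text):
    """Return a standard-independent view of one metadata record."""
    root = ET.fromstring(xml_text)
    mapping = FIELD_PATHS.get(root.tag)
    if mapping is None:
        raise ValueError(f"unsupported metadata root element: {root.tag}")
    return {
        "standard": mapping["standard"],
        "title": root.findtext(mapping["title"], default=""),
        "abstract": root.findtext(mapping["abstract"], default=""),
    }


records = [normalise(FGDC_SAMPLE), normalise(DIF_SAMPLE)]
```

Supporting another standard then means adding one entry to `FIELD_PATHS`, while the search interface keeps working on the common fields.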

28 Data access services
- Médias-France is developing generic data access services.
- These services have to be self-describing, registered, and exposed through well-known interfaces.
- For the moment, we focus our efforts on software permitting access to geographically distant databases (distributed databases).

29 Principle
- Each service is registered within a directory server.
- Each data source declares what data it serves.
- A web portal is used by scientists to locate and request data from different sources.
- Data is sent back to the user in a standardised format.
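The registration principle can be illustrated with a very small in-memory directory. This is a sketch under stated assumptions: the class name, method names, service names, variable names and endpoints are all hypothetical, and a real directory server would be a networked service rather than a Python object.

```python
class Directory:
    """Toy directory: each data source registers the variables it serves."""

    def __init__(self):
        self._services = {}

    def register(self, name, endpoint, variables):
        """A data source declares its endpoint and what data it serves."""
        self._services[name] = {"endpoint": endpoint, "variables": set(variables)}

    def locate(self, variable):
        """Return the names of all registered sources serving a variable."""
        return sorted(
            name for name, info in self._services.items()
            if variable in info["variables"]
        )

    def endpoint(self, name):
        """Return the endpoint the portal should contact for a source."""
        return self._services[name]["endpoint"]


directory = Directory()
# Hypothetical sources and endpoints.
directory.register("source_a", "http://example.org/a", ["rainfall", "temperature"])
directory.register("source_b", "http://example.org/b", ["aerosol", "rainfall"])

rain_sources = directory.locate("rainfall")
targets = [directory.endpoint(name) for name in rain_sources]
```

The portal never needs to know the sources in advance: it asks the directory which ones serve a variable, then sends its requests to the returned endpoints.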

30 Implementation
- Data sources are PostgreSQL databases, flat files or other RDBMS systems.
- Each data server is a DODS servlet (Distributed Oceanographic Data System).
- The servlet container is Apache Tomcat.
- Metadata are in XML files.
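A DODS server is addressed through URLs, so a client request can be sketched as plain string construction. The sketch assumes the usual DODS convention in which a suffix selects the response (`.dds` for the structure, `.das` for the attributes, `.dods` for the data) and an optional constraint expression follows a `?`. The host, dataset name and variable names are hypothetical, and the helper `dods_url` is not part of any DODS library.

```python
def dods_url(base, response, projection=(), selection=()):
    """Build a request URL for a dataset served by a DODS servlet.

    base       -- dataset URL without suffix (hypothetical in the example)
    response   -- "dds", "das" or "dods"
    projection -- variable names to return
    selection  -- relational clauses such as "rain>10"
    """
    if response not in ("dds", "das", "dods"):
        raise ValueError(f"unknown response type: {response}")
    url = f"{base}.{response}"
    constraint = ",".join(projection)
    for clause in selection:
        constraint += "&" + clause
    if constraint:
        url += "?" + constraint
    return url


BASE = "http://example.org/dods/rainfall"  # hypothetical dataset URL

structure_url = dods_url(BASE, "dds")
data_url = dods_url(BASE, "dods",
                    projection=["station", "rain"],
                    selection=["rain>10"])
```

The sketch leaves the constraint unescaped for readability; a production client would URL-encode it and would normally rely on an existing DODS client library rather than building URLs by hand.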


35 Prospects
- Develop Web services based on the W3C SOAP recommendation.
- Implement a directory service for services.
- We hope to share development efforts with other organisations, within the framework of international projects (funded by EC, INTAS…).
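To give an idea of what a SOAP-based data access request might look like, the sketch below builds a SOAP 1.2 envelope with the Python standard library. Only the envelope namespace comes from the W3C recommendation; the service namespace, the operation name `GetDataset` and its parameters are hypothetical placeholders, since the presentation does not define the planned services.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 envelope namespace
SERVICE_NS = "http://example.org/data-access"        # hypothetical service namespace


def build_request(operation, params):
    """Wrap one operation call and its parameters in a SOAP envelope."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical operation and parameters.
request_xml = build_request("GetDataset", {"datasetId": "ds1", "format": "xml"})
```

Because the envelope is plain XML defined by a public recommendation, any client or server toolkit can produce or parse it, which is the property the durability argument relies on.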

