Life On The Edge - Global Geoscience Data Delivery - DGAL 24 Oct 2007 Life On The Edge: Global Geoscience Data Delivery Ollie Raymond with Nick Ardlie,


1 Life On The Edge - Global Geoscience Data Delivery - DGAL 24 Oct 2007 Life On The Edge: Global Geoscience Data Delivery Ollie Raymond with Nick Ardlie, Dale Percival, Lesley Wyborn, and Aaron Sedgmen

2 Don't you hate it when…
- You can't exchange geological data with your project partners because you use different systems?
- You didn't realise that the shapefile you downloaded last year has been superseded by an updated version?
- You know there's useful information out there, but you can't find it?
- You waste valuable time converting formats of datasets you do find?
- You cannot add real-time data from other sources to your information systems?
- You keep emailing and burning CDs to publish your data to clients who need it urgently?
YOU NEED A WEB-DELIVERED DATA STANDARD!

3 Outline
- Living on the scientific fringe
- A short history of digital geological map data standards at BMR-AGSO-GA
- Customers, the web, and why we need digital data standards
- GeoSciML and O&M: what are they?
- Developing web services using data standards: testbeds and living on the technological edge

4 Data modellers and standards developers have always been regarded by geologists as a quaint lunatic fringe who are a bit of an annoyance to their important scientific research…
…until those same geologists want to exchange their data with other geologists…
…and they spend the next week reformatting data from different sources.

5 Life on the fringe is exacerbated by data modelling technobabble…
Ontology: "The specification of one's conceptualisation of a knowledge domain"
Que?

6 Ontology
- a set of controlled vocabularies (i.e., lists of agreed terms) which describe concepts in a field of interest, e.g., mineral names and lithology names describing rocks in geology
- the relationships between concepts and between the agreed terms used to describe those concepts, e.g., geological units are composed of rocks; granite is a type of felsic intrusive rock
- a set of rules about how to specify the terms and relationships
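The three parts of an ontology listed on this slide (agreed terms, relationships, rules) can be sketched in a few lines of Python. This is only an illustration: the class name `Ontology`, its methods, and the extra relationship "felsic intrusive rock is a type of rock" are assumptions made for the example and do not come from GeoSciML.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Toy ontology: agreed terms, relationships between them, and one rule."""
    vocabulary: set = field(default_factory=set)
    # Relationships are stored as (subject, predicate, object) triples.
    relations: set = field(default_factory=set)

    def add_term(self, term: str) -> None:
        self.vocabulary.add(term)

    def relate(self, subject: str, predicate: str, obj: str) -> None:
        # Rule: a relationship may only use terms from the controlled vocabulary.
        for term in (subject, obj):
            if term not in self.vocabulary:
                raise ValueError(f"'{term}' is not an agreed term")
        self.relations.add((subject, predicate, obj))

    def is_a(self, term: str, ancestor: str) -> bool:
        """Follow 'is a type of' links transitively."""
        seen = set()
        frontier = [term]
        while frontier:
            current = frontier.pop()
            if current == ancestor:
                return True
            if current in seen:
                continue
            seen.add(current)
            frontier.extend(
                o for s, p, o in self.relations
                if s == current and p == "is a type of"
            )
        return False


onto = Ontology()
for agreed_term in ("granite", "felsic intrusive rock", "rock", "geological unit"):
    onto.add_term(agreed_term)
onto.relate("granite", "is a type of", "felsic intrusive rock")
onto.relate("felsic intrusive rock", "is a type of", "rock")  # assumed for the example
onto.relate("geological unit", "is composed of", "rock")
```

With these definitions, `onto.is_a("granite", "rock")` holds, while relating a term that is not in the vocabulary raises an error.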

7 A short history of GA's digital geoscience map data standards…
Pre-1990s
- Geological data and the standards that govern it have come a long way since the days of the old BMR cartographic symbols book
- Last printed in 1989, the symbols book described the appearance, but rarely the meaning, of every line and symbol on a printed geological map

8 A short history of GA's digital geoscience map data standards…
The 1990s - the decade of GIS
- BMR's first proposed GIS data dictionary for geological map data was written in 1992 by a very young Robyn Gallagher, when BMR realised that the new digital GIS products had no quality control
- Some basic map data themes:
  - geological unit polygons and boundaries
  - structures including faults, veins and dykes, and folds
- It also described some point-located datasets, including outcrop locations, structural measurements, geochemical analyses and mineral occurrences
- some cartographic frames and graticules
- less than 8 pages long

9 A short history of GA's digital geoscience map data standards…
- The data dictionary was extended over the next 5 years until AGSO merged with AUSLIG and we geologists were exposed to the much more rigorous standards definitions used by AUSLIG
- The result was the GA Geoscience Data Dictionary for Spatial Data:
  - 86 different spatial data themes
  - minerals, petroleum, regolith and marine geology and geophysics
  - mines, wells/drillholes, topography, urban, cultural and infrastructure themes
  - cartographic layers

10 The customers…
- They want the best and most current geoscience data, they want it free, and they want it NOW, 24-7
- And they want to take GA's and other Federal Govt data, the States' data, CSIRO's data, international data, and combine it with their own data
- And they want to use all of this data in any number of 2D and 3D modelling and display software applications
- Our software-specific, agency-specific data standards don't cut the mustard when customers are trying to integrate data across jurisdictions

11 Delivering Government Digital Geoscience Data
The problem: access to Government geoscience information is fragmented and inefficient
Minerals Exploration Action Agenda …
- existing information is distributed across eight state and federal agencies, each with its own information management systems and data formats
- up to 80% of time acquiring pre-competitive data is taken up by reformatting disparate data from government sources
- a disincentive to exploration

12 Currently in Australia…
Data sources: VIC, QLD, NSW, NT, SA, WA, GA, TAS (+ CSIRO, consultants)
Users: Environment, Minerals Industry, Resource Management, Public, Government, Universities, Petroleum Industry
Web tools: Mapserver, ArcIMS, IMF, Geoserver
- data accessible online
- view, query and download data
- 8 different data structures
- data in several software formats
- cannot view more than one state's data at a time

13 Government Geoscience Online
- 8 online geoscience delivery systems
- 8 data structures
- 2 proprietary (software-specific) data formats
- cannot access more than one agency's data at a time

14 Rationalising data sources
WANT: Description, Label, Age

15 Rationalising data sources
WANT
ESRI, MAPINFO

16 The problem
How do you get 8 Australian jurisdictions to provide digital geoscience map data in the same format?
- You will never get them to agree to change their agency database structures to a single structure
- You will never get them to agree to use the same software for data maintenance and delivery
The solution
- You CAN get them to agree on a software-independent DATA TRANSFER STANDARD

17 What is a Digital Data Standard good for?
It gives you a common data structure in which to deliver your data, and it is software-independent, but most of all, a digital data standard enables … Interoperability
"My stuff works with your stuff" (Lesley Wyborn, 2005)

18 GeoSciML: GeoScience Markup Language
O&M: Observations and Measurements

19 Commission for the Management and Application of Geoscience Information (CGI)
Interoperability Working Group
Australia, USA, Canada, France, UK, Sweden, Italy, Japan

20 CGI Interoperability Working Group: geologists, geophysicists, information modellers, web programmers

21 What is GeoSciML? (Part 1)
Geological Data Model
- a logical data structure
- a complex model (hierarchical, relational)
- tells users what geological information goes where and what terminology is to be used (vocabularies)
- scientifically robust, developed by the scientific community
- internationally agreed
- data providers need only to map their own local data structures to the data transfer structure
- data providers don't need to change their local database structures to use the transfer standard

22 The GeoSciML Data Model
- Constructed using UML (Unified Modelling Language) tools
- Presented as a series of class diagrams which show the attributes of and relationships between geological features and other data types
e.g. Geologic(al) units:
- composition (earth materials)
- metamorphism
- weathering character
- physical properties
- related structures
- unit types (e.g., lithostratigraphic, chronostratigraphic)
- age and geological history (events)
- unit parts (child/parent relations)

23 Mapping your database to GeoSciML
Example source database columns:
- GEODX.STRATNAMES.TOPMINAGENAME
- GEODX.STRATNAMES.BASEMAXAGENAME
- GEODX.STRATLITHS.LITHOLOGY
- GEODX.RANKSYNONYMS.RANKNAME
- SDE.CDI_VICSTRATS.FORMTYPE
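The idea of mapping a local database to the transfer standard, rather than restructuring the database, can be sketched as a simple column-to-property lookup. The column names below are the ones shown on the slide. The target property names (`youngerAge`, `olderAge`, and so on), the sample row, and the extra `INTERNAL_ID` column are hypothetical placeholders, not actual GeoSciML element names or GA data.

```python
# Local column names are taken from the slide; the target property names
# are hypothetical placeholders, not real GeoSciML element names.
COLUMN_TO_PROPERTY = {
    "GEODX.STRATNAMES.TOPMINAGENAME": "youngerAge",
    "GEODX.STRATNAMES.BASEMAXAGENAME": "olderAge",
    "GEODX.STRATLITHS.LITHOLOGY": "lithology",
    "GEODX.RANKSYNONYMS.RANKNAME": "rank",
    "SDE.CDI_VICSTRATS.FORMTYPE": "unitType",
}


def to_transfer_record(local_row: dict) -> dict:
    """Rename the columns of one local row to transfer-standard properties.

    Columns without a mapping are simply left out of the delivered record,
    so the local database structure itself never has to change.
    """
    return {
        COLUMN_TO_PROPERTY[column]: value
        for column, value in local_row.items()
        if column in COLUMN_TO_PROPERTY
    }


# Hypothetical sample row, for illustration only.
sample_row = {
    "GEODX.STRATNAMES.TOPMINAGENAME": "Silurian",
    "GEODX.STRATNAMES.BASEMAXAGENAME": "Ordovician",
    "GEODX.STRATLITHS.LITHOLOGY": "sandstone",
    "GEODX.STRATNAMES.INTERNAL_ID": 42,  # assumed unmapped local column
}
transfer_record = to_transfer_record(sample_row)
```

Each agency would maintain its own lookup table of this kind, while the delivered record shape stays the same for every provider.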

24 What is GeoSciML? (Part 2)
XML encoding
- the markup language used to deliver the model to the internet
- builds on established internet standards such as GML (Geography Markup Language)
- open source
- software independent
- machine readable

25 Example rock description encoded in GeoSciML (XML markup not preserved in this transcript; element values only):
Medium to fine-grained lithic sandstone to siltstone; Site 95846001; Rock #1; grey-green; siliciclastic; clastic sedimentary; medium (1-5mm); normal grading.
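Since the original markup of this example is lost, the following is only a rough sketch of how such values might be wrapped in XML so that they become machine readable. The element names (`RockSample`, `description`, `property`) and the property labels (`colour`, `grainSize`, and so on) are assumptions for illustration and are not the GeoSciML schema.

```python
import xml.etree.ElementTree as ET


def rock_description_xml(site_id, rock_label, description, properties):
    """Encode one rock description as a small XML document.

    Element and attribute names are hypothetical, not the GeoSciML schema.
    """
    root = ET.Element("RockSample", attrib={"site": site_id, "label": rock_label})
    ET.SubElement(root, "description").text = description
    for name, value in properties.items():
        ET.SubElement(root, "property", attrib={"name": name}).text = value
    return ET.tostring(root, encoding="unicode")


# Values come from the slide; the property labels are assumed.
xml_text = rock_description_xml(
    "95846001",
    "Rock #1",
    "Medium to fine-grained lithic sandstone to siltstone",
    {
        "colour": "grey-green",
        "composition": "siliciclastic",
        "lithology": "clastic sedimentary",
        "grainSize": "medium (1-5mm)",
        "bedding": "normal grading",
    },
)

# Any XML-aware client can read the values back without custom parsing.
parsed = ET.fromstring(xml_text)
```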

26 What is O&M?
- Like GeoSciML, it is a data model and GML schema
- A data model for any type of scientific observation, measurement and sampling frame
- It is a more generic model, not just for geoscience; it is less prescriptive than GeoSciML
- It provides a platform on which individual science communities can build more domain-specific types of observation, measurement and sampling
- For example, the GeoSciML working group has adopted the sampling point and sampling curve models of the O&M standard for geological use in delivering outcrop sample locations and boreholes

27 What is O&M?
- The O&M standard is more mature than GeoSciML
- It is nearing full ratification by the Open Geospatial Consortium
- Two Australian members on the review panel: Simon Cox (CSIRO), senior editor, and Nick Ardlie (GA)
- Aim to be submitted to ISO this year
- Already used in GA in development of the Located Sample Data SPOT

28 International Testbeds
Testbed 1 (2005): a borehole demonstrator between UK and France
Testbed 2 (2006): a six-nation demonstrator delivering geological map data from globally distributed sources using GeoSciML v1.1
- successfully demonstrated WMS/WFS delivery, display and download of distributed data sources and simple query functions
- but lacked true interoperability between data sources
- leading edge technology:
  - suffered a little from immature and shifting standards in WMS/WFS, GML and GeoSciML
  - developing web services software
  - and no-one had done this before with such a complex data model

29 GeoSciML Testbed2: accessing GeoSciML data using a web client in Canada
Map labels: Vancouver, CA; Uppsala, SV; Canberra, AU; Ottawa, CA; Reston, VA; Keyworth, UK; Portland, OR; Orleans, FR

30 GeoSciML Testbed2 (Canadian client)
Display data from distributed sources in a map and query a feature

31 International Testbeds
Testbed 3 (2007/8, in progress): an eight-nation demonstrator using GeoSciML v2.0
Aims:
- to test true interoperability of WMS and WFS services using both the GeoSciML and O&M data standards
- to test WFS query functionality within a complex data model
- to test the ability of various software applications to consume data in GeoSciML format
- to test registry services to discover and deliver geoscience information from distributed sources

32 GeoSciML Testbed3: registry services
Map labels: Vancouver, CA; Uppsala, SV; Canberra, AU; Ottawa, CA; Reston, VA; Keyworth, UK; Portland, OR; Orleans, FR; Japan; Italy
REGISTRY
- Multilingual vocabularies
- Map legends (StyledLayerDescriptors)
- Lists of available WMS and WFS services from distributed sources

33 Achieving Interoperability with Map Data
Levels of Map Data Interoperability (after Brodaric, 2007), combining Map Data Service 1 and Map Data Service 2 into an Integrated Map:
- systems: Data Services (WMS, WFS)
- semantic: Data Content (Vocabularies)
- schematic: Data Structure (GeoSciML, O&M)
- syntax: Data Language (GML)
- pragmatic: Data Context (Geologist)

34 Lessons Learnt from Testbed2
The 3 most important things to consider in constructing an interoperable web service testbed:
1. compliance - to OGC web standards (WMS, WFS)
2. compliance - to the data model schema (GeoSciML)
3. compliance - to agreed vocabularies

35 Semantic Interoperability
- The GeoSciML data model contains much interpretive and text-based data. There is not a large amount of relatively simple numerical data
- This means that semantic compliance (i.e., compliance to many controlled vocabularies) is not a trivial exercise
- But compliance to vocabularies (e.g., for Age) is crucial to be able to construct standardised WFS / WMS requests on distributed data
- This became evident very quickly in Testbed2 in trying to execute the agreed use cases, e.g., select geologic features where Age = xxx
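To make the "Age = xxx" use case concrete, here is a minimal sketch of building an OGC Filter Encoding `PropertyIsEqualTo` filter of the kind a WFS GetFeature request can carry. The OGC filter namespace and element names are standard. The property name `age` is a placeholder assumed for the example, since the real property path depends on the GeoSciML version being served.

```python
import xml.etree.ElementTree as ET

OGC_NS = "http://www.opengis.net/ogc"  # OGC Filter Encoding namespace


def age_equals_filter(property_name: str, age_term: str) -> str:
    """Build an OGC filter selecting features whose age equals a given term."""
    ET.register_namespace("ogc", OGC_NS)
    filter_el = ET.Element(f"{{{OGC_NS}}}Filter")
    comparison = ET.SubElement(filter_el, f"{{{OGC_NS}}}PropertyIsEqualTo")
    ET.SubElement(comparison, f"{{{OGC_NS}}}PropertyName").text = property_name
    ET.SubElement(comparison, f"{{{OGC_NS}}}Literal").text = age_term
    return ET.tostring(filter_el, encoding="unicode")


# "age" is a hypothetical property name, not a real GeoSciML path.
filter_xml = age_equals_filter("age", "Ordovician")
filter_root = ET.fromstring(filter_xml)
```

The same filter sent to several servers returns comparable results only if every server stores the literal ("Ordovician" here) using the same agreed vocabulary term, which is why vocabulary compliance matters so much.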

36 Semantic Interoperability
Cainozoic? Palaeozoic? Archaean? Bolindian? Eastonian? Gisbornian? Late? Early?
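One way a data provider might handle such local terms is to translate them to the agreed vocabulary before serving the data, and to flag anything that has no agreed equivalent. The sketch below assumes, purely for illustration, that the agreed vocabulary uses the spellings "Cenozoic", "Paleozoic" and "Archean". The lookup tables and function name are hypothetical.

```python
# Assumed agreed vocabulary (illustrative subset only).
AGREED_AGE_TERMS = {"Cenozoic", "Paleozoic", "Archean"}

# Local spelling variants mapped to the assumed agreed terms.
LOCAL_SYNONYMS = {
    "Cainozoic": "Cenozoic",
    "Palaeozoic": "Paleozoic",
    "Archaean": "Archean",
}


def to_agreed_term(local_term: str):
    """Return the agreed vocabulary term, or None if no mapping is known."""
    term = local_term.strip()
    if term in AGREED_AGE_TERMS:
        return term
    return LOCAL_SYNONYMS.get(term)


local_terms = ["Cainozoic", "Archaean", "Bolindian"]
# Terms with no agreed equivalent are collected for a geologist to review.
unmapped = [t for t in local_terms if to_agreed_term(t) is None]
```

Regional stage names such as "Bolindian" fall through as unmapped here, which mirrors the problem on the slide: someone has to decide how each local term relates to the shared vocabulary.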

37 Schematic Interoperability
Flexibility in data representation
- a feature of the GeoSciML model allows representation of some data in different ways according to a user's need
- e.g., geologic age:
  - single numeric value (e.g., 455 Ma)
  - single defined text value (e.g., Ordovician)
  - lower and upper value range (e.g., 420 to 460 Ma; Silurian to Ordovician)

38 Schematic Interoperability
- This pattern is flexible and entirely representative of how geologists use Age information, BUT…
- It is an issue for interoperability: how do you process a query on Age if the data in different datasets is in different, but still schema compliant, formats?
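One possible answer, sketched here under stated assumptions, is for a client to normalise every age representation to a numeric (older, younger) range in Ma before comparing. This is not how Testbed2 handled it. The period bounds below are rounded, illustrative values, and the function names are made up for the example.

```python
from typing import Tuple

# Approximate period bounds in Ma as (older, younger); illustrative values only.
PERIOD_BOUNDS_MA = {
    "Ordovician": (485.0, 444.0),
    "Silurian": (444.0, 419.0),
}


def to_range_ma(age) -> Tuple[float, float]:
    """Normalise a number, a period name, or a pair of either to (older, younger)."""
    if isinstance(age, tuple):
        bounds = [to_range_ma(part) for part in age]
        # The combined range spans from the oldest bound to the youngest bound.
        return (max(b[0] for b in bounds), min(b[1] for b in bounds))
    if isinstance(age, str):
        return PERIOD_BOUNDS_MA[age]
    return (float(age), float(age))


def ages_overlap(a, b) -> bool:
    """True if the two ages, in any supported representation, overlap in time."""
    older_a, younger_a = to_range_ma(a)
    older_b, younger_b = to_range_ma(b)
    return younger_a <= older_b and younger_b <= older_a


# The three representations from the previous slide:
single_value = 455                    # single numeric value, in Ma
single_term = "Ordovician"            # single defined text value
numeric_range = (420, 460)            # lower and upper numeric values, in Ma
term_range = ("Silurian", "Ordovician")
```

With this normalisation, a query such as "features overlapping the Ordovician" can be evaluated against all three representations, at the cost of the client needing a numeric time scale for every vocabulary term.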

39 Schematic Interoperability
Testbed2 example of a WFS query on age
Client's decision to query on upper age only

40 Schematic Interoperability + pragmatism
GeoSciML v2.0 now contains a preferredAge attribute - a single value attribute designed purely to allow simpler and more straightforward queries on Age

41 Software capabilities
- Existing proprietary vendor software and open source software aim to support the detail of OGC web service specifications (e.g. GML and complex features)…
- …but they are still being developed
- Much collaborative work was done with software developers during Testbed2 to be able to serve the complex feature model needed for geological information

42 Lessons learnt from Life on the Edge (a.k.a. GeoSciML Testbed2)
- Highlighted both the capabilities and the limitations of Web Feature Service and OGC standards in a real-world, complex feature environment
- Highlighted technical challenges for software developers and vendors to be able to deliver and consume OGC-compliant, complex feature WFS services

43 Lessons learnt from Life on the Edge (a.k.a. GeoSciML Testbed2)
- Highlighted the need to establish well-defined limits on use cases for any web data services. Unlimited interoperability of complex geoscience data is not realistic
- Highlighted the importance of rigorous documentation of the data model to guide participants in a distributed network
- Risk analysis at the pointy end of R&D testbed projects is crucial
- Success in Testbed3 is vital to achieve wide up-take of web services in the production environment

44 Where to from here?
[Figure 4: IM Strategic Goals (GA IMSP 2004-2009, p12); labels: Machine to Machine Interfaces, Human Interfaces]

45 Where to from here?
OneGeology
- ~1:1 million scale digital geology of the world
- over 50 nations on all continents
Other Geoscience MLs under development involving GA
- Landslides
- Mineral Occurrences
- Geochronology
- Geochemistry
- many more that GA could be involved in

46 Where to from here?
Within Australia… An Australian Geoscience Portal?
- All government geoscience map data
- Data served from distributed state and federal sites to a single portal using the GeoSciML and O&M data transfer standards

47 Benefits of Web Services for Government Geoscience
Open data standards
- not dependent on proprietary software; internationally agreed
Efficiencies for industry
- data from government providers is up-to-date, easily discoverable, and standard; no reformatting required
Efficiencies for government
- no need to change local data structures; just map each database to GeoSciML
- new data is immediately available to the internet as a web service
- no need to maintain data in several different software formats
- standard format for industry mandatory reporting
Benefits for the wider geoscience community
- same methodologies used to develop the GeoSciML standard can be used by other scientific communities

48 Questions

