1 Ontology Work @ GeoConnections’ CGDI & CCRS’ NRCan
Brian McLeod, Canada Centre for Remote Sensing
Salutations …..
2 Intelec Geomatics Inc. (Montreal, Quebec)
GeoInnovations (technology development program)
The M3GO project is a small project funded by the GeoConnections program. It is a proof of concept of the ontology and semantic work needed when developing an NSDI such as the Canadian Geospatial Data Infrastructure (CGDI), with the aim of helping data discovery through portals such as the GeoConnections Discovery Portal.
3 Overview
- Semantic interoperability background
- Ontology Service Project
- Context
- Introduction
- Objectives
- Methodology
- Architecture
- Software
- Demonstration
- Next Steps
4 Introduction [Brodeur]
- Multiplication of geospatial data sources and increased usage of geospatial information technologies: NTDB, VMap, DCF, BDTQ, OBM, Geographic Data BC;
- Geospatial data and services are more and more accessible on the Web: Canadian Geospatial Data Infrastructure (CGDI), NSDI;
- Today, users turn to various geospatial data sources to fulfill their needs;
- Interoperability of geospatial data and geoprocessing, proposed at the beginning of the nineties, constitutes a solution for the sharing, re-use, and integration of geospatial data (McKee and Buehler 1998; Sondheim, Gardels and Buehler 1999).
Currently, we note a multiplication of geospatial data sources and an increased use of geospatial information technologies. Among others, there are the NTDB, the VMap libraries, the digital cartographic files, and several provincial geospatial data sources.
Geospatial data and services are more and more available on the Web because of infrastructures specially developed for this purpose. Here, we think specifically of the CGDI and of the NSDI in the United States.
Today, users of geospatial data increasingly make use of multiple heterogeneous data sources to satisfy their specific needs.
At the beginning of the nineties, interoperability of geospatial data and geoprocessing was proposed as a solution for sharing, re-using and integrating geospatial data.
5 Problem [Brodeur]
- Availability of multiple geospatial databases on the Web;
- Each database or information community uses a specific vocabulary;
- Databases are heterogeneous at syntactic, structural and semantic levels;
- Many users benefit from more than one geospatial database to satisfy their needs;
- Many problems, such as the difficulty of locating geospatial data;
- Locating: search, identification, selection and extraction of geospatial data from external sources.
At present, the problem is that various geospatial databases exist and are accessible on the Web.
Each database makes use of a specific vocabulary, chosen according to the needs of those who developed the database or of the information community to which the database adheres.
Therefore, databases are typically heterogeneous at the syntactic, schematic and semantic levels.
However, users of geospatial data who access more than one database to fulfill their specific needs experience many problems, such as access, sharing, and integration, but more specifically the problem of locating the exact data they need; this is the problem that I have addressed in my research work.
By locating the exact data, I mean the search, identification, selection, and extraction of geospatial data from external sources.
6 Problem
How does someone assess whether the result he/she gets from a request corresponds to the initial perception of reality he/she had in mind when sending that request?
Here, we have a sample of data types extracted from six distinct geospatial data sources. They depict the same phenomena of topographic reality differently.
(Click) For instance, for the hydrographic theme, the NTDB uses … whereas VMap uses …. For the vegetation theme, VMap uses … whereas British Columbia uses ….
If I were a geospatial data user, I could have the following questions:
- What geospatial data fit my needs best?
- How do I integrate these data into a consistent set?
- Even more, (Click) how do I assess whether the result I get from a query fits the initial perception I had of reality when I submitted that query?
Spatial pictogram descriptions: 0D; 1D; 2D; unknown geometry (?); multiple geometry; alternate geometry (see Bédard 1999 and Brodeur 2000 for more details).
Sources: 1 Natural Resources Canada 1996; 2 VMap 1995; 3 BC Ministry of Environment, Lands and Parks (Geographic Data BC) 1992; 4 OBM 1996; 5 Québec 2000; 6 New Brunswick 2000.
7 Context – Metadata discovery
- To bridge terminology and language gaps
- Users must search with exactly the same concepts, vocabulary and language that the database uses; otherwise, their search may not yield relevant results.
One of the important problems facing users and providers of geospatial information is bridging the terminology and language gaps that currently hinder the flow of information.
At the moment, users searching for information in geospatial databases must know and use in their search exactly the same concepts, vocabulary and language that the database uses; otherwise, their search may not yield relevant results. (Jean Brodeur’s slide is a very good example to showcase.)
For instance, users searching a database that is documented in English may not get results if they enter a keyword in French.
Similarly, they may not get results if they enter a keyword that is singular instead of plural, or that relates to but does not match a term recognized by the provider.
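The gap described on this slide can be illustrated with a minimal Python sketch. It contrasts a search that requires the catalogue's exact keyword with a lookup that goes through a small table of names per concept. The table contents and the names `CONCEPTS`, `normalize`, `exact_search` and `ontology_search` are assumptions made for illustration; they are not taken from M3GO or from any catalogue named here.

```python
import unicodedata

# Illustrative data only: each concept lists its names per language.
CONCEPTS = {
    "lake": {"en": ["lake", "lakes"], "fr": ["lac", "lacs"]},
    "forest": {"en": ["forest", "forests"], "fr": ["forêt", "forêts"]},
}


def normalize(term: str) -> str:
    """Lower-case a term and strip accents, so 'Forêts' becomes 'forets'."""
    decomposed = unicodedata.normalize("NFKD", term.strip().lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def exact_search(keyword: str) -> list:
    """Naive search: the keyword must equal the catalogue's own keyword."""
    return [concept for concept in CONCEPTS if keyword == concept]


def ontology_search(keyword: str) -> list:
    """Match the keyword against every known name, in any language."""
    wanted = normalize(keyword)
    return [
        concept
        for concept, names_by_language in CONCEPTS.items()
        if any(
            normalize(name) == wanted
            for names in names_by_language.values()
            for name in names
        )
    ]


if __name__ == "__main__":
    print(exact_search("Lacs"))       # the French plural finds nothing
    print(ontology_search("Lacs"))    # the same keyword resolves to a concept
    print(ontology_search("forets"))  # a missing accent is tolerated
```

Listing plural forms explicitly keeps the sketch dependency-free; a real service would more likely rely on curated synonym and translation relationships than on string normalization alone.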
8 Project – Multiusage, Multistandard, and Multilingual Geospatial Ontology Service
Develop a geospatial ontology service that can be used by applications and other services
The project was funded in March under the CGDI GeoInnovations program.
- An ontology is an explicit specification of some topic.
- It is a formal and declarative representation which includes the vocabulary (or names) for referring to the terms in that subject area and the logical statements that describe what the terms are, how they are related to each other, and how they can or cannot be related to each other.
- Ontologies therefore provide a vocabulary for representing and communicating knowledge about some topic and a set of relationships that hold among the terms in that vocabulary (e.g. the wine example in Ontology 101).
- Thesauri and taxonomies are specific cases of ontologies.
9 Objectives
- Examine requirements related to geospatial ontologies
- Identify the operations that a service must fulfill to meet requirements
- Define Web protocols to access the service
- Develop the service using interoperability standards
- Technology assessment
- Implement a server with ontologies in at least 2 languages (French and English)
- Test the results with project partners
- Integrate the M3GO service into portal services (not implemented yet)
- Integrate the M3GO service into the M3Cat cataloguing tool (not implemented yet)
10 Participants
Developers
- CRG, Université Laval
- Intelec Geomatics
Users
- Ministry of National Defence
- Ministère des Ressources naturelles du Québec
- Ministry of Fisheries and Oceans (CHS)
- Natural Resources Canada (CTI-S & CCRS)
- NatureServe Canada
- Environment Canada
- Commission for Environmental Cooperation
CRG – Centre for Research in Geomatics, Laval University, Quebec
CTI-S – Centre for Topographic Information in Sherbrooke (Natural Resources Canada)
CCRS – Canada Centre for Remote Sensing (Natural Resources Canada)
11 Inputs Scope
- Ontology in text or DBMS
- Language known by client (service)
- Ontology of keywords
- Initial content (GCMD bilingual, IHO B6 and S57)
- Guide for building ontologies
- UTF-8 for character encoding
GCMD – Global Change Master Directory Science Keywords in French and English
IHO B6 – International Hydrographic Organization feature names gazetteers, e.g. Ridge, Valley, Slope, Fans
IHO S57 – International Hydrographic Organization marine objects, e.g. Fog Signal, Ice area, Inshore traffic zone, Light float, Pipeline, Tidal stream
12 Protégé - software related
- Free, open source, Java
- Customizable editor
- Plugins can be added
- Database can be accessed by an API
13 Protégé can be used for the following
- Class modeling. Protégé provides a graphical user interface (GUI) that models classes (domain concepts) and their attributes and relationships.
- Instance editing. From these classes, Protégé automatically generates interactive forms that enable you or domain experts to enter valid instances.
- Model processing. Protégé has a library of plug-ins that help you define semantics, perform queries, and define logical behavior.
- Model exchange. The resulting models (classes and instances) can be loaded and saved in various formats, including XML, UML, and RDF (Resource Description Framework). Protégé also provides a scalable database back end.
14 Data Model
This data model was developed by JM Proulx (CRG, Laval University) as a guideline for the M3GO project.
The Intelec team uses the model to guide the development of metaclasses, classes and subclasses (11) within Protégé 2000. Three metaclasses were developed within Protégé 2000: Ontology, Concept and Name.
The classes and subclasses were then populated with the GCMD Science Keywords provided by CCRS (French and English), relationships were built between the French and English terms, etc.
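To make the three-part structure (Ontology, Concept, Name) more concrete, here is a minimal Python sketch. It assumes, hypothetically, that a Concept groups several Names, each tagged with a language, and that an Ontology is a collection of Concepts. The field names, the `translate` helper and the sample French labels are illustrative assumptions; the actual M3GO model, with its 11 subclasses, is not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Name:
    """One label for a concept in one language (hypothetical fields)."""
    text: str
    language: str  # e.g. "en" or "fr"


@dataclass
class Concept:
    """A concept groups all the names that refer to the same idea."""
    identifier: str
    names: list = field(default_factory=list)

    def names_in(self, language: str) -> list:
        return [n.text for n in self.names if n.language == language]


@dataclass
class Ontology:
    """A titled collection of concepts, indexed by identifier."""
    title: str
    concepts: dict = field(default_factory=dict)

    def add(self, concept: Concept) -> None:
        self.concepts[concept.identifier] = concept

    def translate(self, term: str, source: str, target: str) -> list:
        """Return the target-language names of the concept naming `term`."""
        wanted = term.strip().lower()
        for concept in self.concepts.values():
            if wanted in (t.lower() for t in concept.names_in(source)):
                return concept.names_in(target)
        return []


# Sample content: the French labels are this sketch's own translations,
# not entries copied from the GCMD keyword list.
gcmd = Ontology("GCMD Science Keywords (sample)")
gcmd.add(Concept("ozone", [Name("Ozone", "en"), Name("Ozone", "fr")]))
gcmd.add(Concept("methane", [Name("Methane", "en"), Name("Méthane", "fr")]))

if __name__ == "__main__":
    print(gcmd.translate("methane", "en", "fr"))
```

Hanging the English and French terms off one shared concept is one simple way to express the relationship between French and English terms mentioned on this slide; other designs, such as explicit translation links between names, are equally plausible.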
15 Technologies
A Java API is used to populate the Protégé 2000 classes with GCMD content.
17 Demonstration
http://intelecgeomatics.com:8080/ogm3/default.jsp
- Use terms related to “Climate Change”, for instance: Snow Mass, Skin temperature, Methane, Ozone
- Select GetPreferred with GCMD in English for the term Ozone; the service returns that the term is preferred
- Select GetSimilar …..
- Try to find a definition by using the GetDefinition operation with term = Guyot and GCMD as the ontology; no definition is provided within GCMD
- Use the same term with IHO B6; a definition is provided
- Use Skin temperature with the GetGraph operation, or Ottawa with Gazetteer
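The behaviour walked through in this demonstration can be mimicked with a small in-memory Python sketch. The operation names GetPreferred and GetDefinition come from the slide, but the function signatures, the return values, the data layout and the placeholder definition text are assumptions; the slide does not show the real request and response format of the M3GO service.

```python
from typing import Optional

# Toy stand-in for two ontologies. The entries and the definition string are
# placeholders, not actual GCMD or IHO B6 content.
ONTOLOGIES = {
    "GCMD": {
        "ozone": {"preferred": True, "definition": None},
        "guyot": {"preferred": True, "definition": None},
    },
    "IHO B6": {
        "guyot": {"preferred": True, "definition": "<placeholder definition>"},
    },
}


def _lookup(ontology: str, term: str) -> Optional[dict]:
    """Find a term's record in one ontology, ignoring case and padding."""
    return ONTOLOGIES.get(ontology, {}).get(term.strip().lower())


def get_preferred(ontology: str, term: str) -> Optional[bool]:
    """Hypothetical GetPreferred: True/False, or None if the term is unknown."""
    entry = _lookup(ontology, term)
    return None if entry is None else entry["preferred"]


def get_definition(ontology: str, term: str) -> Optional[str]:
    """Hypothetical GetDefinition: the text, or None if none is recorded."""
    entry = _lookup(ontology, term)
    return None if entry is None else entry["definition"]


if __name__ == "__main__":
    print(get_preferred("GCMD", "Ozone"))     # term is marked as preferred
    print(get_definition("GCMD", "Guyot"))    # no definition in this ontology
    print(get_definition("IHO B6", "Guyot"))  # a definition is available
```

The sketch only shows that the same term can yield different answers depending on which ontology is queried, which is the point of the Guyot example.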
37 Metaclasses
The M3GO implementation inside Protégé is composed of 3 metaclasses:
- ONTOLOGIE
- CONCEPT
- NOM
A metaclass is a template, or a class whose instances are themselves classes.
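The idea of a class whose instances are themselves classes can be shown with a short Python sketch. This is only an analogy in a different language: Protégé is a Java tool with its own frame-based model, and the names `ConceptMeta`, `Lake` and `slot_names` are invented for illustration.

```python
class ConceptMeta(type):
    """A metaclass: every instance of ConceptMeta is itself a class."""

    def describe(cls) -> str:
        # This method is available on the classes created by the metaclass.
        return f"{cls.__name__} with slots {sorted(cls.slot_names)}"


# Calling the metaclass creates a new class, much as instantiating a
# metaclass in an ontology editor creates a new class.
Lake = ConceptMeta("Lake", (), {"slot_names": ("name", "depth_m")})

# Calling the resulting class creates an ordinary instance.
ontario = Lake()
ontario.name = "Lake Ontario"

if __name__ == "__main__":
    print(isinstance(Lake, ConceptMeta))     # Lake is an instance of the metaclass
    print(isinstance(ontario, Lake))         # ontario is an instance of Lake
    print(isinstance(ontario, ConceptMeta))  # but not of the metaclass
    print(Lake.describe())
```

There are three levels here: the metaclass, the class it produces, and the individual created from that class.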
38 Each metaclass is defined by a set of attributes called slots
39 Subclasses
M3GO uses 11 subclasses to implement the model. Each subclass is also defined by a series of properties (slots).
40 Adding a slot
Slots are properties or relationships between classes.
41 Building an Ontology
Building an ontology is done by implementing the previously defined metaclasses in a hierarchical manner.