
1 Technical Projects
Ecoinformatics International Technical Collaboration
Seattle, Washington, USA
January 26, 2010
Bruce Bargmeyer
Lawrence Berkeley National Laboratory and Berkeley Water Center
University of California, Berkeley
Tel: +1 510-495-2905

2 EITC Technical Projects Work Group
- WG pursues information exchange, collaborative R&D, and collaborative activities under the umbrella of EITC
- Met as part of the EITC Strategic Planning meeting in Copenhagen, March 2009
- Met after the Ecoterm meeting in Rome in October 2009
- Holds telecons monthly (more or less) during the year
- Individuals hold more frequent telecons on specific activities

3 Some Tech. Proj. WG Activities
- Ecoterm
- GEMET
- Terminology
- Standards Array
- Abu Dhabi initiative
- Collaborative R&D
- Metadata Registry and Semantics Management
- Web deployment
- SciScope, Eye on Earth
- Standards

4 ECOTERM V WORKSHOP
- Ecoterm started meeting in 2004 with a focus on identifying and implementing best practices for environmental terminologies and knowledge organization systems on the Web.
- This meeting was held at the U.N. Food and Agriculture Organization, Rome, Italy, 5-6 October 2009
  - Attended by approximately 30 people from 18 organizations and 8 countries, both in person and virtually
  - Presentations are on the Ecoinformatics web site
- Agreement: the Ecoterm product/service to be developed is a federated approach to accessing terminology and knowledge organization systems in the area of the environment, one that would allow them to be accessed, interchanged, and used in traditional indexing and search approaches as well as in semantic web applications.

5 Ecoterm V Workshop
To achieve this, several requirements must be addressed:
- A common way to find these resources
- Common descriptions (metadata) for these resources so that others (especially systems) can know how to use them
- Common Web services to provide external access and consistent functionality
- Unique IDs at the system and concept/term level to clearly and uniquely establish the links
- An approach that will work globally
- An approach that is based on open standards
- An approach that can be implemented by organizations somewhat independently of one another, since the Ecoinformatics International Technical Collaboration, while based on an international agreement, does not have an extensive governance structure or funding support
- An approach that can begin as a prototype but is robust enough to grow into a useful product/service
- Technologies that are being used by other communities, so that this effort can link to initiatives in other disciplines that may eventually be relevant to the environment

6 Ecoterm V Workshop
- Technical approach
  - SKOS (Simple Knowledge Organization System) would be used to publish the thesauri and other terminology systems
  - The SKOS files would also be published as linked data
  - These would be complemented with voiD files, which are used to describe RDF resources in a linked data environment
  - A simple umbrella web page would be created to aid in the management, promotion, and access to this network of linked data
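To make the SKOS and voiD part of the technical approach concrete, here is a minimal sketch that writes two SKOS concepts and a voiD dataset description as Turtle text using only the Python standard library. The `http://example.org/thesaurus/` URIs, the concept identifiers, the labels, and the helper names are illustrative assumptions; they are not taken from GEMET or from any Ecoterm deliverable. Only the SKOS, voiD, and Dublin Core terms namespaces are real.

```python
# Illustrative sketch only: URIs, concept IDs and labels below are hypothetical.
SKOS = "http://www.w3.org/2004/02/skos/core#"
VOID = "http://rdfs.org/ns/void#"
DCTERMS = "http://purl.org/dc/terms/"
BASE = "http://example.org/thesaurus/"  # hypothetical base URI

PREFIXES = "\n".join([
    f"@prefix skos: <{SKOS}> .",
    f"@prefix void: <{VOID}> .",
    f"@prefix dcterms: <{DCTERMS}> .",
])


def turtle_literal(text, lang):
    """Return a language-tagged Turtle string literal with basic escaping."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


def skos_concept(concept_id, labels, broader=None):
    """Describe one concept: a URI, one prefLabel per language, optional parent."""
    lines = [
        f"<{BASE}concept/{concept_id}> a skos:Concept ;",
        f"    skos:inScheme <{BASE}scheme> ;",
    ]
    for lang, text in sorted(labels.items()):
        lines.append(f"    skos:prefLabel {turtle_literal(text, lang)} ;")
    if broader is not None:
        lines.append(f"    skos:broader <{BASE}concept/{broader}> ;")
    # Turn the trailing " ;" of the last statement into the closing " ."
    lines[-1] = lines[-1][:-2] + " ."
    return "\n".join(lines)


def void_description(title, dump_url):
    """Describe the published SKOS file as a voiD dataset."""
    return "\n".join([
        f"<{BASE}dataset> a void:Dataset ;",
        f"    dcterms:title {turtle_literal(title, 'en')} ;",
        f"    void:dataDump <{dump_url}> ;",
        f"    void:vocabulary <{SKOS}> .",
    ])


def build_document():
    """Assemble prefixes, two concepts and the dataset description."""
    parts = [
        PREFIXES,
        skos_concept("chemical-contamination", {"en": "chemical contamination"}),
        skos_concept(
            "mercury",
            {"en": "mercury", "de": "Quecksilber", "fr": "mercure"},
            broader="chemical-contamination",
        ),
        void_description("Example environmental thesaurus", BASE + "dump.ttl"),
    ]
    return "\n\n".join(parts) + "\n"


if __name__ == "__main__":
    print(build_document())
```

The unique concept URIs address the "unique IDs" requirement, and the voiD description gives systems a common way to find the SKOS file. A production publisher would more likely use an RDF library than string formatting.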

7 GEneral Multilingual Environmental Thesaurus (GEMET)
- Created by EEA
- A thesaurus of 5,000+ concepts expressed in 23 languages
- The INSPIRE initiative gives new impetus; GEMET is included in legislation.
- Extend GEMET:
  - Ecoterm action
  - EPA update?
  - EEA invests in the translations and in some work to advance to SKOS.
  - EPA and UNEP have written letters of support to China MEP to encourage development of a Chinese translation (with the help of Wuhan University)

8 Technical Projects WG Activities
- Terminology
  - Work underway by EPA, USGS, EEA
- Standards Array
  - Initiated at the EITC Strategic Directions meeting in Copenhagen, March 2009
  - Scope described during this meeting
  - Work underway; the Technical Projects WG will spend a half day (Thursday) on this.

9 Technical Projects WG Activities
- Abu Dhabi initiative
  - Introduced in this meeting. We will pursue it further in the Tech. Proj. WG meeting on Tuesday afternoon, Wednesday, and Thursday
- Collaborative R&D
  - Have developed a proposal outline; looking for funding possibilities

10 SciScope
- Developed in a collaborative effort between Microsoft Research, Berkeley Water Center (UCB), and Lawrence Berkeley National Laboratory
- EITC leadership wanted a demo of how semantics management (terminology, ontologies, and metadata) could be used to help prepare indicators and assessments.
- SciScope demonstrates capabilities that help users to discover, evaluate, and access water data for analysis, presentation, indicators, assessment, …
- SciScope shows the use of a water ontology, linked to water “variables” (data elements), with metadata descriptions, and an easy-to-use geographic interface.

11 Broad Use Case: Semantics Management - Linking Ontologies, Models, Metadata and Data
Register, curate, interrelate and manage semantics.

Data
ID | Date | Temp | Hg
A | 06-09-13 | 4.4 | 4
B | 06-09-13 | 9.3 | 2
X | 06-09-13 | 6.7 | 78

Metadata
Name | Datatype | Definition | Units
ID | text | Monitoring Station Identifier | not applicable
Date | date | Date | yy-mm-dd
Temp | number | Temperature (to 0.1 degree C) | degrees Celsius
Hg | number | Mercury contamination | micrograms per liter

Ontology
- Contamination
  - Biological
  - Radioactive
  - Chemical
    - lead
    - cadmium
    - mercury

Model
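As a rough illustration of why linking data to registered metadata matters, the sketch below uses the slide's metadata entries to interpret one raw data row. The dictionary layout, the `PARSERS` table, and the `interpret` helper are assumptions made for this example, not part of any metadata registry standard or of SciScope. The sample row is station A as read from the slide's data table, and the ontology path for `Hg` is an assumed reading of the slide's diagram.

```python
from datetime import date, datetime

# Metadata entries as given on the slide (one per data column).
METADATA = {
    "ID": {"datatype": "text", "definition": "Monitoring Station Identifier",
           "units": None},
    "Date": {"datatype": "date", "definition": "Date", "units": "yy-mm-dd"},
    "Temp": {"datatype": "number", "definition": "Temperature (to 0.1 degree C)",
             "units": "degrees Celsius"},
    "Hg": {"datatype": "number", "definition": "Mercury contamination",
           "units": "micrograms per liter"},
}

# Assumed link from a data element to a path in the contamination ontology.
ONTOLOGY_LINKS = {"Hg": ("Contamination", "Chemical", "mercury")}

# Hypothetical converters, chosen by the registered datatype.
PARSERS = {
    "text": str,
    "number": float,
    "date": lambda raw: datetime.strptime(raw, "%y-%m-%d").date(),
}


def interpret(row):
    """Convert raw string values to typed values using the registered metadata."""
    typed = {}
    for name, raw in row.items():
        entry = METADATA[name]  # KeyError means the column is not registered
        typed[name] = PARSERS[entry["datatype"]](raw)
    return typed


def describe(name):
    """Build a human-readable description of a column from its metadata."""
    entry = METADATA[name]
    units = entry["units"] or "not applicable"
    path = " > ".join(ONTOLOGY_LINKS.get(name, ()))
    suffix = f"; ontology: {path}" if path else ""
    return f"{name}: {entry['definition']} [{units}]{suffix}"


if __name__ == "__main__":
    raw_row = {"ID": "A", "Date": "06-09-13", "Temp": "4.4", "Hg": "4"}
    print(interpret(raw_row))
    print(describe("Hg"))
```

Without the metadata, a value such as `4` in the `Hg` column carries no definition, unit, or link to the concept it measures.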

12 Data sources…
USGS, EPA, CIMS, TCEQ, NADP
Source: Bora Beran

13 What are we after?
A search engine that creates a unified view over multiple heterogeneous data repositories, allowing scientists to discover and retrieve data in a simple and intuitive way.
In technical terms:
- A searchable metadata repository/aggregator
- An ontology-based interface for data discovery
- Links to metadata describing the available data
- A mediator (semantics/syntax/structure)
- A light-weight web GIS
Adapted from: Bora Beran

14 Data retrieval
What is behind SciScope?
- SciScope currently hosts only metadata.
- Data are requested on the fly from the original publisher using web service wrappers written specifically for each data source.
- Data are reformatted to provide a unified view over the repositories.
Source: Bora Beran
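The per-source wrapper idea can be sketched as follows. This is not SciScope's code: the two sources, their payload formats, the field names, and the `Observation` record are all hypothetical, and the `fetch_*` functions return canned payloads in place of real web service calls. The point is only to show how source-specific wrappers can map differing syntax, names, and units onto one unified record.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """Unified record returned to the user, whatever the original source."""
    source: str
    station: str
    variable: str
    value: float
    unit: str


# Hypothetical mapping from source-specific variable names to one vocabulary.
VARIABLE_NAMES = {"Hg": "mercury", "mercury": "mercury"}


def fetch_source_a():
    """Stand-in for a web service call; this source answers with CSV in ug/L."""
    return "site,param,result_ug_l\nA,Hg,4\nB,Hg,2\n"


def fetch_source_b():
    """Stand-in for a web service call; this source answers with dicts in mg/L."""
    return [{"stationId": "X", "analyte": "mercury", "mg_per_l": 0.078}]


def wrap_source_a(payload):
    """Wrapper for source A: parse CSV and rename fields."""
    rows = csv.DictReader(io.StringIO(payload))
    return [
        Observation("source_a", r["site"], VARIABLE_NAMES[r["param"]],
                    float(r["result_ug_l"]), "ug/L")
        for r in rows
    ]


def wrap_source_b(payload):
    """Wrapper for source B: rename fields and convert mg/L to ug/L."""
    return [
        Observation("source_b", r["stationId"], VARIABLE_NAMES[r["analyte"]],
                    round(r["mg_per_l"] * 1000.0, 6), "ug/L")
        for r in payload
    ]


# One (fetch, wrap) pair per data source; adding a source means adding a pair.
WRAPPERS = [(fetch_source_a, wrap_source_a), (fetch_source_b, wrap_source_b)]


def unified_view():
    """Request data from every source on the fly and merge the results."""
    observations = []
    for fetch, wrap in WRAPPERS:
        observations.extend(wrap(fetch()))
    return observations


if __name__ == "__main__":
    for obs in unified_view():
        print(obs)
```

Keeping the unit conversion and name mapping inside each wrapper means downstream code only ever sees one record shape.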

15 Knowledge Base
What is behind SciScope?
- Relationships are stored as RDF triples in a relational database
- Supports transitive, symmetric and inverse properties
- Inferred statements are pre-computed
Example statements:
‘Escherichia coli’ same-as ‘E. coli’
‘E. coli’ is-a ‘Indicator Organism’
‘Nitrogen’ is-a ‘Macronutrient’
‘Macronutrient’ is-a ‘Nutrient’
‘Hypoxia’ isMeasuredUsing ‘DissolvedOxygen’
‘Hypoxia’ isRelatedTo ‘Eutrophication’
Source: Bora Beran
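One simple way to hold subject-predicate-object statements in a relational database is a single three-column table. The sketch below uses Python's built-in `sqlite3` module with the slide's example statements. The table name, column names, and helper functions are assumptions for illustration; the slide does not describe SciScope's actual schema or database engine.

```python
import sqlite3

# Example statements from the slide, as (subject, predicate, object).
TRIPLES = [
    ("Escherichia coli", "same-as", "E. coli"),
    ("E. coli", "is-a", "Indicator Organism"),
    ("Nitrogen", "is-a", "Macronutrient"),
    ("Macronutrient", "is-a", "Nutrient"),
    ("Hypoxia", "isMeasuredUsing", "DissolvedOxygen"),
    ("Hypoxia", "isRelatedTo", "Eutrophication"),
]


def build_store(triples):
    """Create an in-memory triple table (hypothetical schema) and load it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE triples (
            subject   TEXT NOT NULL,
            predicate TEXT NOT NULL,
            object    TEXT NOT NULL,
            PRIMARY KEY (subject, predicate, object)
        )
        """
    )
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", triples)
    conn.commit()
    return conn


def objects_of(conn, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    rows = conn.execute(
        "SELECT object FROM triples "
        "WHERE subject = ? AND predicate = ? ORDER BY object",
        (subject, predicate),
    )
    return [row[0] for row in rows]


def subjects_of(conn, predicate, obj):
    """Return every subject that has the given predicate and object."""
    rows = conn.execute(
        "SELECT subject FROM triples "
        "WHERE predicate = ? AND object = ? ORDER BY subject",
        (predicate, obj),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    store = build_store(TRIPLES)
    print(objects_of(store, "Nitrogen", "is-a"))
    print(subjects_of(store, "is-a", "Nutrient"))
```

If inferred statements are pre-computed and inserted into the same table, a lookup such as `subjects_of(store, "is-a", "Nutrient")` can return indirect members without any reasoning at query time.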

16 Inference
- Transitive
  ‘Nitrogen’ is-a ‘Macronutrient’
  ‘Macronutrient’ is-a ‘Nutrient’
  Inference: ‘Nitrogen’ is-a ‘Nutrient’
- Symmetric
  ‘Hypoxia’ isRelatedTo ‘Eutrophication’
  Inference: ‘Eutrophication’ isRelatedTo ‘Hypoxia’
- Inverse
  ‘Macronutrient’ is-a ‘Nutrient’
  Inference: ‘Nutrient’ isBroaderThan ‘Macronutrient’
Source: Bora Beran
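The three inference rules can be pre-computed with a small fixed-point loop. The sketch below is a naive forward-chaining closure, not SciScope's implementation. The property tables declare only what the slide shows (`is-a` as transitive, `isRelatedTo` as symmetric, `isBroaderThan` as the inverse of `is-a`); any other property characteristics would be additional assumptions.

```python
# Property characteristics taken from the slide's three examples.
TRANSITIVE = {"is-a"}
SYMMETRIC = {"isRelatedTo"}
INVERSE = {"is-a": "isBroaderThan"}  # one direction only, as on the slide

FACTS = {
    ("Nitrogen", "is-a", "Macronutrient"),
    ("Macronutrient", "is-a", "Nutrient"),
    ("Hypoxia", "isRelatedTo", "Eutrophication"),
}


def infer(triples):
    """Apply the three rules repeatedly until no new statement appears."""
    closed = set(triples)
    while True:
        new = set()
        for s, p, o in closed:
            if p in SYMMETRIC:
                new.add((o, p, s))
            if p in INVERSE:
                new.add((o, INVERSE[p], s))
            if p in TRANSITIVE:
                # Chain s -p-> o with any o -p-> o2 to get s -p-> o2.
                for s2, p2, o2 in closed:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
        if new <= closed:  # fixed point reached
            return closed
        closed |= new


if __name__ == "__main__":
    for triple in sorted(infer(FACTS) - FACTS):
        print(triple)
```

Because the rules feed each other, the closure also contains statements the slide does not list, such as (‘Nutrient’, isBroaderThan, ‘Nitrogen’), which follows from the inferred `is-a` statement. The nested loop is quadratic per pass, which is acceptable for a sketch but would need indexing for a large knowledge base.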

17 Geographical Features Catalog
What is behind SciScope?
- Collection of features such as dams, aquifers, geologic formations, watersheds, sensors
- Based on data and maps from USGS, EPA, National Atlas
Source: Bora Beran

18
- SciScope provides access to observations from approximately 1.65 million sensors in the US, adding up to 358 million observations.
- Tutorials
  1. Managing map layers
  2. Setting layer transparency
  3. Browsing geographical features
  4. Drawing polygons
  5. Search and data retrieval
- Code available for download at:

19 SciScope
- Demos given to
  - EITC in December 2008 & March 2009
  - EPA Office of Environmental Information
  - Computational Methods for Water Resources (Conference)
  - Japan Construction Information Center (JACIC)
  - Japan Life Cycle Data Management (LCDM) Forum
  - Open Forum on Metadata Registries, Seoul, Korea, June 2009
  - Ecoterm V, Rome, Italy, October 2009
  - Earth Sciences Information Partners, Winter Meeting, January 2010
  - Many online accesses at
  - …

20 SciScope & Eye on Earth
- Usability for end users
- Demonstrates utility of the underlying technologies and standards
- Already described at this meeting
- Question: how can Eye on Earth and SciScope play together?
- Topic of the Technical Projects WG meeting this afternoon

21 Technical Projects Activities
- Metadata Registry and Semantics Management
  - ISO/IEC 11179-3 is in Final Committee Draft
  - Considerable work on ontology: ISO/IEC 19763 Metamodel Framework for Interoperability
  - Wuhan University is leading work on several standards. We will hear from Rong PENG in the Technical Projects WG

22 Acknowledgement
This material is based upon work supported by the National Science Foundation, under Grant No. 0637122, and by USEPA. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation or USEPA.

