Extended Metadata Registries and Semantics (Part 2: Implementation) Karlo Berket Ecoterm IV Environmental Terminology Workshop April 18, 2007 Diplomatic.


1 Extended Metadata Registries and Semantics (Part 2: Implementation) Karlo Berket Ecoterm IV Environmental Terminology Workshop April 18, 2007 Diplomatic Academy of Vienna Vienna, Austria

2 printed 7/14/2006 9:05 AM page 2 of xxx XMDR-Prototype-Progress-July-2006-v2.ppt
XMDR Prototype Progress Outline
– REST API
– Revised packaging of XMDR prototype code
– Content loading
– Demonstrate current XMDR Prototype

3 XMDR Prototype Modular Architecture
Components shown in the architecture diagram:
– Users: web browsers, client software
– Human User Interface (HTML from JSP and JavaScript; Exhibit)
– Application Program Interface (REST)
– Authentication Service
– Registry Store
– Search & Content Serving (Jena, Lucene)
– Logic Index
– Text Index
– Logic Indexer (Jena & Pellet)
– Text Indexer (Lucene)
– Content Loading & Transformation (Lexgrid & custom)
– Validation (XML Schema)
– Mapping Engine
– Metadata Sources: concept systems, data elements
– XMDR metamodel (OWL & XML Schema)
– Standard XMDR files
– Metamodel specs (UML & Editing) (Poseidon, Protege)
– XMDR data model & exchange format: XML, RDF, OWL

4 REST API (Search Methods)

5 REST API (Search Results)

6 REST API (Registry Methods)

7 REST API (Registry Results)

8 REST API (Method Parameters)

9 Revised XMDR Text and SPARQL Searches run using REST API
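As a rough illustration of running text and SPARQL searches through a REST interface, the sketch below builds search URLs with Python's standard library. The base URL, the `text` and `sparql` path segments, and the `query` parameter name are all assumptions made for this example; the slides do not list the prototype's actual endpoints or parameters.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical base URL: the real XMDR endpoint paths are not given in the slides.
BASE_URL = "http://xmdr.example.org/search"


def build_search_url(method: str, query: str, base_url: str = BASE_URL) -> str:
    """Return a GET URL for a text or SPARQL search (illustrative names only)."""
    if method not in ("text", "sparql"):
        raise ValueError(f"unknown search method: {method!r}")
    # urlencode percent-encodes the query so it is safe to place in a URL.
    return f"{base_url}/{method}?{urlencode({'query': query})}"


text_url = build_search_url("text", "country codes")
sparql_url = build_search_url("sparql", "SELECT ?s WHERE { ?s ?p ?o } LIMIT 10")

# Decoding the query string recovers the original SPARQL text.
decoded = parse_qs(urlsplit(sparql_url).query)["query"][0]
```

Sending both search types through one URL-building function mirrors the idea of a single REST entry point for both; a real client would need the method names and parameters documented by the registry itself.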

10 Results Display

11 Faceted Browsing of Search Results
NOTE: only interface-specified attributes are included in results from text searches.

12 Content loading, with XMDR metamodel used for inferred indexing and validation
Stage labels on the diagram: CONTENT, VALIDATION, TRANSFORMATION, REGISTRY INFORMATION, INDEXING
– Content: Terminology A, Terminology D, Terminology B, Thesaurus C, Data Element Source E, Terminology Source F, Ontology Source G, External Source H
– Transformation: Lexgrid; XSLT script E, XSLT script F, XSLT script G, XSLT script G
– XMDR files: A, D, B, C, E, F, G, H (virtual)
– Subversion
– XMDR metamodel in XML Schema, from OWL
– Indexing: Reasoner (Pellet), Text Indexing (Lucene)
– Indexes: Inferred Logic Index, Asserted Logic Index, Full Text, Text Index
– Search & Inference Framework (Jena)

13 Streamlined Content Loading
Data loading process enables parallel processing
– (1) raw files create inferred triples
– (2) load everything to DB
– (3) create text index from DB rather than from raw XMDR files
– the old process created the text index from raw XMDR files, then inferred and loaded
XML Schema is not needed to know item mappings
XMDR software uses separate Jena models for different concept systems, so loading can be done in parallel on different machines
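The three loading steps above can be sketched in miniature as follows. This is only a toy model under stated assumptions: the concept system names and triples are invented, a simple transitive closure stands in for the reasoner, a dictionary stands in for the database, and a thread pool stands in for separate machines. None of the function names come from the XMDR code.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy stand-ins for raw files: concept system name -> asserted (child, parent) pairs.
RAW_FILES = {
    "system_a": [("dog", "mammal"), ("mammal", "animal")],
    "system_b": [("EUR", "currency")],
}


def infer_triples(asserted):
    """Step 1: add transitive parent links (a toy stand-in for a reasoner)."""
    triples = set(asserted)
    changed = True
    while changed:
        changed = False
        for a, b in list(triples):
            for c, d in list(triples):
                if b == c and (a, d) not in triples:
                    triples.add((a, d))
                    changed = True
    return triples


def load_all(raw_files):
    """Steps 1-2: infer per concept system in parallel, then load into one store."""
    names = list(raw_files)
    with ThreadPoolExecutor() as pool:
        results = pool.map(infer_triples, (raw_files[n] for n in names))
        return dict(zip(names, results))  # the dict plays the role of the DB


def build_text_index(store):
    """Step 3: build the text index from the loaded store, not from raw files."""
    index = {}
    for name, triples in store.items():
        for triple in triples:
            for term in triple:
                index.setdefault(term.lower(), set()).add(name)
    return index


store = load_all(RAW_FILES)
text_index = build_text_index(store)
```

Because each concept system is inferred independently, nothing in step 1 needs to see another system's data, which is what makes the per-system work safe to distribute.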

14 Example content currently loaded into XMDR Prototype
Concept Systems via Lexgrid
– NBII_2002-2003 biodiversity
– NCI_Thesaurus_06.02d health
– GEMET_2001.0 Multilingual Environmental Thesaurus
– ISO4217_1981 currency codes
– ISO3166_V-10 country codes (only 2 letter codes)
– Mouse_1.32 anatomy
– DTIC_1.0 Department of Defense
– Portions of EPA controlled vocabulary
– SIC and NAICS
Concept Systems & Ontologies via special purpose scripts
– Omega ontology (reloading)
11179 Data Element Registries
– caDSR (full NCI Cancer Data Standards Registry via ca-Core API) (reloading)

15 Load Times

16 Load Times (w/out NCI Thesaurus)

17 Load Times (zoom in)

18 Loading time for Omega Ontology
Omega is a “terminological ontology”
– reorganization & synthesis of WordNet & Mikrokosmos
– adds higher level ontology to organize multiple ontologies
Ready to try reloading
– 1st try required over a week to process & load
– 4m files, ~250k/24 hrs
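A quick back-of-the-envelope check of the figures above, assuming "4m" means four million files and "~250k/24 hrs" means roughly 250,000 files per day (both readings are interpretations of the slide's shorthand):

```python
TOTAL_FILES = 4_000_000    # "4m files" from the slide (assumed to mean four million)
FILES_PER_DAY = 250_000    # "~250k/24 hrs" from the slide (assumed files per day)

days_at_quoted_rate = TOTAL_FILES / FILES_PER_DAY
print(f"{days_at_quoted_rate:.0f} days")  # prints "16 days"
```

At the quoted rate the full load would take about 16 days, which is consistent with the slide's "over a week".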

19 Demo Tomorrow morning

20 XMDR Software
Split into 2 packages
– XMDR Core code (general purpose for RDF/OWL): can be used with any RDF/OWL data files
– 11179-specific code: smaller set of software is easier to replace when model changes, etc.
Release featuring these changes and more
– End of April 2007

21 XMDR Prototype Web Site has downloadable code & content
http://xmdr.lbl.gov/
Note tabs for other sections!

