Revelytix SICoP Presentation DRM 3.0 with WordNet Senses in a Semantic Wiki Michael Lang February 6, 2007.


1 Revelytix SICoP Presentation DRM 3.0 with WordNet Senses in a Semantic Wiki Michael Lang February 6, 2007

2 Agenda
► Semantic Matching using WordNet
► Bootstrapping COI-based vocabularies
 – With WordNet
► DRM 3.0 in a semantic Wiki
 – With WordNet integration
 – DRM implementation tool for the agencies
► Demonstration

3 DRM Mission
► Facilitate information sharing
► How can I know that a data element or service I have discovered is the one I really want?
 – Description
 – Context
► How can I describe and provide sufficient context for anyone to know they have found what they want?
► Knowledge model
 – Excel, ISO 11179, and DRM 2.0 will not be sufficient
 – OWL

4 Semantic Matching

5 ► MatchIT
 – Extracts terms from databases
 – Creates a MatchIT vocabulary based on the collection of terms
 – Uses WordNet to match terms in disparate systems
 – Uses WordNet to match terms to domain vocabularies (NIEM)
 – Attaches WordNet “senses” to the vocabulary terms
► MatchIT vocabularies can be exported as OWL models
 – With the WordNet senses and synsets
► MatchIT can use other knowledge bases to facilitate matching

6 Bootstrapping COI Vocabularies
► MatchIT vocabularies are imported
 – Either as OWL classes for vocabulary development
 – Or as OWL individuals for DRM development
► These vocabularies can be enriched using Knoodl.com for community-based development
 – Data dictionary
 – Vocabulary
 – Knowledge base

7 [Diagram; labels only] Conceptual / Logical / Physical Data Models; Relational; XML; Ontologies [OWL/RDF]; Domain [UML/ER]; Data Harmonization; Complete Metadata Access; Data/Content Access; Ontological Semantics Access; OWL / RDF Model; Complete Import; Export Representations; Find Matches; Enterprise Information Sources; Custom; Any Source; File System; JDBC; RDMS; Semantic Ontology Platform; Fact Repositories; Onomasticons; Lexicons; Domain Ontology Models & Files [versioned]; Search Index; Web Reporting; Instance-level Match; Schema-level Match; Build; Knoodl.com; Third-Party Modeling Tool; MatchIT Vocabulary Manager

8 Collaborative OWL editor

9 Information Management
► Knoodl is a new kind of modeling tool for modeling the structure, semantics and knowledge of any domain
 – The modeling process is necessarily collaborative
 – The process is necessarily extensible and additive
 – Community of Interest (COI) based tool
 – OWL based

10 Knoodl.com is …
► An internet application where people can collaborate with others in their communities of interest to
 – Create, edit, share and find
 – Vocabularies / ontologies
► OWL Repository
 – Free, but licensing controlled by COIs
► Social Computing Paradigm
 – Users contribute content and benefit from the content
 – Vocabularies capture much of the institutional knowledge of an enterprise or community
 – Gain value over time
 – Used by people and machines

11 Knoodl.com
► Knoodl is a collaborative framework
► Interoperability depends on three groups of stakeholders contributing to the description and context of the services
► Businesspeople
► Technical people
► Data people
 – Knoodl provides the features for the businesspeople to participate

12 FEADRM Person Person Harmonization Workgroup Data Architecture Subcommittee Meeting January 11, 2007

13 Gathering Information We asked those on the workgroup to share their models of PERSON with us. We received documents from the Department of the Interior (DOI), the Veterans’ Administration (VA), the Federal Aviation Administration (FAA), and the Environmental Protection Agency (EPA). You can view them on CORE.gov at https://collab.core.gov/CommunityBrowser.aspx?id=10833

14 Analyzing the Data We compared the entities and attributes from all the documentation. We created an Excel Workbook.
– The first sheet contains all the entities and attributes from each model.
– The second sheet contains a mapping of the entities from the other agencies to those of the Social Security Administration (SSA).
– The third sheet contains the entities, attributes, and their definitions from the SSA FEADRM Model.
The Excel document is named ‘Person Entities and Attributes from Various Feds’ and you can view it on CORE.gov at https://collab.core.gov/CommunityBrowser.aspx?id=11682

15 Observations A data model should have a point of view; we should have a common one at the Federal level. Everyone should be modeling business data rather than creating logical database models. PERSON is probably the area in which most of the non-administrative sharable data resides. This is what we at SSA call “common shared.” The definitions of business concepts represented by entities at the “top” of the data model should not be in terms so rigorously tied to the business of any one agency. Data that are “regulated” require formal agreement to be sharable. PERSON cannot be addressed in a vacuum. The concepts of organization, party, and role should be addressed at the same time.

16 DRM 3.0

17 Communities of Interest (COI) Vision Each COI will implement the 3 pillar framework strategy. Business & Data Goals drive Information Sharing/Exchange (Services) Governance Data Strategy Data Architecture (Structure)

18 The FEA Data Reference Model 2.0 Source: Expanding E-Government, Improved Service Delivery for the American People Using Information Technology, December 2005, pages 2-3. http://www.whitehouse.gov/omb/budintegration/expanding_egov_2005.pdf NIEM 1.0 NIEM Roadmap Pilot

19 DRM 2.0 Implementation Metamodel
► Definitions:
 – Metamodel: Precise definitions of constructs and rules needed for abstraction, generalization, and semantic models.
 – Model: Relationships between the data and its metadata (W3C).
 – Metadata: Data about the data for: Discovery, Integration, and Execution.
 – Data: Structured e.g. Table, Semi-Structured e.g. Email, and Unstructured e.g. Paragraph.
Source: Professor Andreas Tolk, 2005.

20 The Revelytix Solution
► OWL MetaModel:
 – owl:Class
 – owl:Property
► DRM Model:
 – Topic (owl class)
 – Entity (owl class)
 – Relationship (owl object property)
Use existing MetaModel languages to model the FEA DRM – OWL
Model the DRM in a collaborative environment – Knoodl
Extend the DRM to model the type of information that will be created – JDBC metadata, WordNet synset and word data
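
A minimal sketch of what the DRM model listed on this slide could look like when written out as OWL in Turtle syntax. It uses only the Python standard library and plain string building. The `drm:` prefix and the `http://example.org/drm#` namespace are placeholders invented for illustration, not the namespace used by Revelytix or the FEA.

```python
# Sketch: emit the slide's DRM model (Topic, Entity, Relationship) as Turtle.
# The drm: namespace below is a hypothetical placeholder.
PREFIXES = (
    "@prefix owl: <http://www.w3.org/2002/07/owl#> .\n"
    "@prefix drm: <http://example.org/drm#> .\n"
)

DRM_CLASSES = ("Topic", "Entity")          # listed as OWL classes on the slide
DRM_OBJECT_PROPERTIES = ("Relationship",)  # listed as an OWL object property


def drm_turtle() -> str:
    """Return a Turtle document declaring the DRM classes and property."""
    lines = [PREFIXES]
    for name in DRM_CLASSES:
        lines.append(f"drm:{name} a owl:Class .")
    for name in DRM_OBJECT_PROPERTIES:
        lines.append(f"drm:{name} a owl:ObjectProperty .")
    return "\n".join(lines) + "\n"


turtle = drm_turtle()
print(turtle)
```

The output can be pasted into any Turtle-aware OWL editor to inspect the three declarations.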

21 DRM Implementation: Data Description Area
► Model JDBC Metadata to Data Description Area
[Diagram; labels only] Entity; Attribute; DRM v2.0 Vocabulary; View; Column; Table; MatchIT Data Dictionary Vocabulary; Relationship; ForeignKey
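
A rough sketch of the idea on this slide: read a database catalog and describe it in Data Description terms. It assumes that tables and views map to Entities, columns to Attributes, and foreign keys to Relationships; the slide's diagram suggests this pairing but does not spell it out. SQLite's catalog stands in for JDBC metadata so the example runs with the Python standard library alone, and the `person`/`address` schema is invented for illustration.

```python
import sqlite3


def describe_schema(conn):
    """Map catalog metadata to (entities, relationships).

    entities: {table_or_view_name: [(column_name, declared_type), ...]}
    relationships: [(from_table, from_column, to_table, to_column), ...]
    """
    entities = {}
    relationships = []
    names = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type IN ('table', 'view') AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name")]
    for name in names:
        # table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f'PRAGMA table_info("{name}")').fetchall()
        entities[name] = [(col[1], col[2]) for col in columns]
        # foreign_key_list rows: (id, seq, table, from, to, ...)
        for fk in conn.execute(f'PRAGMA foreign_key_list("{name}")'):
            relationships.append((name, fk[3], fk[2], fk[4]))
    return entities, relationships


# Hypothetical two-table schema used only to exercise the function.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (person_id INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE address (
        address_id INTEGER PRIMARY KEY,
        person_id INTEGER REFERENCES person(person_id),
        city TEXT
    );
""")
entities, relationships = describe_schema(conn)
for entity, attributes in entities.items():
    print("Entity", entity, "Attributes", attributes)
for rel in relationships:
    print("Relationship", rel)
```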

22 DRM Implementation: Data Context Area
► Model WordNet data to Data Context Area
[Diagram; labels only] Relationship; Topic; DRM v2.0 Vocabulary; Hyponym; Synset; Hypernym; MatchIT Data Dictionary Vocabulary; Taxonomy; WordNet
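
One way to picture this mapping, as a sketch only: treat each synset as a Topic and each hypernym/hyponym link as a broader/narrower relationship, so that the links together form a taxonomy. The pairing of labels is an assumption read from the diagram, and the three synset records below are toy data written for this example, not entries taken from WordNet.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    name: str
    gloss: str
    broader: list = field(default_factory=list)   # hypernym links
    narrower: list = field(default_factory=list)  # hyponym links


# Toy records: (synset id, gloss, hypernym id or None). Not real WordNet data.
SYNSETS = [
    ("entity", "something that exists", None),
    ("person", "a human being", "entity"),
    ("employee", "a worker who is hired", "person"),
]


def build_taxonomy(synsets):
    """Create one Topic per synset and wire up hypernym/hyponym links."""
    topics = {sid: Topic(sid, gloss) for sid, gloss, _ in synsets}
    for sid, _, hypernym in synsets:
        if hypernym is not None:
            topics[sid].broader.append(hypernym)
            topics[hypernym].narrower.append(sid)
    return topics


def ancestors(topics, name):
    """Walk hypernym links upward, nearest ancestor first."""
    chain = []
    frontier = list(topics[name].broader)
    while frontier:
        current = frontier.pop(0)
        if current not in chain:
            chain.append(current)
            frontier.extend(topics[current].broader)
    return chain


topics = build_taxonomy(SYNSETS)
print(ancestors(topics, "employee"))
```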

23 SICoP Knowledge Reference Model The point of this graph is that Increasing Metadata (from glossaries to ontologies) is highly correlated with Increasing Search Capability (from discovery to reasoning).

24 Demonstration

25 Contextualize (Interpret)
► Automated term tokenization
► Automated semantic linking using the default knowledge-base contained within MatchIT
[Diagram; labels only] ArticleAmount; AmountArticle; Sum; Assets; Creation; Synonym; Type-of
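
The tokenization step can be illustrated with a short sketch. How MatchIT actually tokenizes terms is not described in the slides; the function below is a generic approach that splits database-style identifiers on separators and on case changes, which is enough to turn a term such as ArticleAmount into words that can be looked up in a knowledge base.

```python
import re


def tokenize_term(term: str) -> list[str]:
    """Split a database-style identifier into lowercase word tokens."""
    tokens = []
    # Break on underscores, hyphens, and whitespace first.
    for part in re.split(r"[_\-\s]+", term):
        # Then split on case changes, keeping acronyms and digit runs whole.
        tokens.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return [token.lower() for token in tokens]


print(tokenize_term("ArticleAmount"))
print(tokenize_term("AmountArticle"))
```

Both example terms yield the same two words in a different order, which is why a match on tokens rather than on the raw string can link them.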

26 Semantic Matching (Mediate)
► Relationships pre-established within the knowledge-base…
Identify the Target and the Source(s) and run the match.
[Diagram; labels only] ArticleAmount; ProductShares; Automatically linked by a specific % distance
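
A sketch of how a percentage score between a source and a target term might be derived from pre-established relationships. The slides do not say how MatchIT computes its "% distance", so the scoring rule here (closeness of 1 / (1 + number of links), averaged over the source tokens) is an assumption chosen for simplicity, and the links in the toy knowledge base are invented rather than taken from WordNet.

```python
from collections import deque

# Toy knowledge base of undirected links. Invented for illustration only.
LINKS = {
    ("article", "product"),
    ("amount", "sum"),
    ("sum", "shares"),
}


def neighbors(word):
    """Yield every word directly linked to the given word."""
    for a, b in LINKS:
        if a == word:
            yield b
        elif b == word:
            yield a


def distance(source, target):
    """Breadth-first search: number of links between words, or None."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        word, dist = queue.popleft()
        for nxt in neighbors(word):
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def match_score(source_tokens, target_tokens):
    """Percentage score: mean best closeness of each source token."""
    if not source_tokens:
        return 0.0
    total = 0.0
    for s in source_tokens:
        best = 0.0
        for t in target_tokens:
            d = distance(s, t)
            if d is not None:
                best = max(best, 1.0 / (1 + d))
        total += best
    return 100.0 * total / len(source_tokens)


score = match_score(["article", "amount"], ["product", "shares"])
print(f"ArticleAmount vs ProductShares: {score:.1f}%")
```

With this toy data, "article" reaches "product" in one link and "amount" reaches "shares" in two, so the terms match partially rather than exactly.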

27 Semantic Matching (Mediate) Not all direct matches are the most relevant… In many cases the most valuable matches are the distant ones. By adding a domain knowledge-base these relationships become more obvious. [Diagram; labels only] Abstraction; Evidence

