4 North Park Suite 106 Hunt Valley, MD 21030 410-584-0009 www.revelytix.com Ontology Based Information Management MatchIT 1.1: Data Integration with Semantic Mapping Technologies

Presentation transcript:

MatchIT 1.1: Data Integration with Semantic Mapping Technologies
Michael Schidlowsky, Sr. Software Architect
Ontology Based Information Management
4 North Park Suite 106, Hunt Valley, MD

Data Integration
Motivated by:
- Organizational Changes
  - Mergers and Acquisitions
  - Internal reorganizations (e.g., DHS)
- Data Mining
- Standards Conformance
- Migration Efforts
  - Legacy Systems
- Decouple data sources from application code

Data Integration
Challenges for integration specialists include:
- Domain-specific terms
- Unfamiliarity with source schemas
- Large size of schema set
- Semantics often not captured
- Captured semantics
  - Stored in ad-hoc formats
  - Cannot be reused to facilitate future data integration efforts

Data Integration: Example
Background: Acme Inc. merges with CompuGlobalHyperMeganet.
Technical Challenge: Need a "Virtual Database" of all sales for all stores in real time.
- Which fields represent customers? CUSTOMERID, CUST_ID, SSN
- Which fields represent 'Price'? Sale_Amt, Total_Sale
- What if your database has 10,000 columns?

Data Integration: Example
Background: HR needs to use employee information for a new company portal.
Technical Challenge: Data must be in XML and conform to a standard HR schema.
- Which fields relate to Address? RESIDENCE, PREV_RESIDENCE
- What if your database has 10,000 columns?

Ideal Matching Solution
- Finds lexical relationships
- Captures semantic information
- Finds semantic relationships
- Provides programmatic access to results (API)
- Fast
- Scalable
- Human Involvement

MatchIT Philosophy
Best Matching tool already exists!
What is meant by "ID"?
- "PLEASE PRESENT ID"
- NY, NJ, ID
- SUPEREGO, EGO, ID

MatchIT
MatchIT is a semantic and lexical matching tool.
Session Outline:
- Import and process schemas
- Perform lexical matching
- Create and manage a semantic vocabulary
- Perform semantic matching
- Demonstrate 3rd Party integration with Data Integration tool (MetaMatrix)

Import & Process Schemas
- Revelytix Models are RDF/OWL
  - Flexible model architecture
  - Extensible
  - Interoperable
- Current Importers:
  - JDBC
  - XML Schema
  - MetaMatrix XMI Models
- Importer Demo

Lexical Matching
- Uses lexical distance measures to determine lexical similarity.
- Fastest matching technique
- Requires no work other than importing schemas
- Often yields interesting results
- Lexical Matching Demo
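
The slides do not say which lexical distance measures MatchIT uses. As a minimal sketch, assuming Levenshtein edit distance as the measure, lexical matching of column names could look like this. The function names (`levenshtein`, `lexical_similarity`, `rank_lexical`) are hypothetical and are not part of the MatchIT API.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def lexical_similarity(a: str, b: str) -> float:
    """Scale edit distance to a score in [0, 1]; case is ignored."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def rank_lexical(target, candidates):
    """Return candidates ordered from most to least lexically similar."""
    return sorted(candidates,
                  key=lambda c: lexical_similarity(target, c),
                  reverse=True)


# Column names taken from the slides' sales example.
ranked = rank_lexical("CUSTOMERID", ["SSN", "CUST_ID", "Total_Sale"])
```

With this measure CUST_ID ranks first for CUSTOMERID, while SSN scores low even though the slides list it as a customer field. That is the kind of gap the semantic matching slides address.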

Create Vocabulary from Schemas
A Vocabulary is:
- A set of symbols
- Occurrences of those symbols in your schemas
- Binding of each symbol to one or more semantic concepts
Created by MatchIT from schemas using tokenization algorithms.
Reusable
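
The three parts of a Vocabulary listed above map naturally onto a small data structure. The following is only an illustrative sketch: the `Vocabulary` class, its method names, the EMPLOYEE_ID column, and the concept labels are assumptions made for the example, not MatchIT's actual model (which the slides say is RDF/OWL).

```python
from collections import defaultdict


class Vocabulary:
    """Symbols, where they occur, and which concepts each symbol is bound to."""

    def __init__(self):
        self.occurrences = defaultdict(set)  # symbol -> schema elements containing it
        self.bindings = defaultdict(set)     # symbol -> semantic concepts

    def add_occurrence(self, symbol, schema_element):
        self.occurrences[symbol].add(schema_element)

    def bind(self, symbol, concept):
        self.bindings[symbol].add(concept)

    def symbols(self):
        return sorted(self.occurrences)

    def ambiguous_symbols(self):
        """Symbols bound to more than one concept (candidates for human review)."""
        return sorted(s for s, concepts in self.bindings.items()
                      if len(concepts) > 1)


vocabulary = Vocabulary()
vocabulary.add_occurrence("CUST", "CUST_ID")
vocabulary.add_occurrence("ID", "CUST_ID")
vocabulary.add_occurrence("ID", "EMPLOYEE_ID")   # hypothetical column
vocabulary.bind("CUST", "customer")
vocabulary.bind("ID", "identifier")
vocabulary.bind("ID", "Idaho")                   # the "NY, NJ, ID" reading
```

Keeping bindings as a set per symbol reflects the "one or more semantic concepts" wording: a symbol such as ID can stay bound to several concepts until a person chooses among them.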

Tokenization Algorithms
Different schemas require different tokenization techniques.
Tokenization algorithms determine how symbols are extracted from schemas:
- Capitalization
- Delimiters
- English Language
Vocabulary Demo
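
The three tokenization techniques named above can be sketched as follows. This is an assumption about how such algorithms might work, not MatchIT's implementation: the function names are hypothetical, and the "English Language" step is approximated by a greedy longest-prefix split against a caller-supplied word list.

```python
import re


def split_on_delimiters(name):
    """Delimiters: break on any run of non-alphanumeric characters."""
    return [part for part in re.split(r"[^A-Za-z0-9]+", name) if part]


def split_on_capitalization(token):
    """Capitalization: split camel case, keeping acronyms such as XML whole."""
    return re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+", token)


def split_on_dictionary(token, words):
    """English language: greedy longest-prefix split against a word list.

    Returns the token unsplit if any part is not in the word list.
    """
    lowered = token.lower()
    parts, start = [], 0
    while start < len(lowered):
        for end in range(len(lowered), start, -1):
            if lowered[start:end] in words:
                parts.append(token[start:end])
                start = end
                break
        else:
            return [token]
    return parts


def tokenize(name, words=frozenset()):
    """Apply the three techniques in turn and return upper-cased symbols."""
    symbols = []
    for chunk in split_on_delimiters(name):
        for piece in split_on_capitalization(chunk):
            symbols.extend(split_on_dictionary(piece, words) if words else [piece])
    return [symbol.upper() for symbol in symbols]
```

For example, `tokenize("CUST_ID")` relies on the delimiter, `tokenize("GenderCode")` on capitalization, and `tokenize("CUSTOMERID", {"customer", "id"})` on the word list, which is why different schemas need different techniques.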

Matching Techniques
MatchIT currently uses two types of matching techniques:
- Lexical Matching: attempts to determine the similarity of two symbols based on the lexical distance between them.
- Semantic Matching: attempts to determine the similarity of two symbols based on the ontological distance between them within a semantic knowledge base.

Parts Supplier Schema (as seen by a person)

Parts Supplier Schema (as seen by a computer)

Semantic Matching How semantically similar are two concepts?

Semantic Matching
- Uses knowledge base distance measures to determine semantic similarity.
- Presents ranked candidate matches
- Based on semantics captured in Vocabularies
- The only way to effectively find relationships between lexically dissimilar symbols:
  - GenderCode / SexCode
  - Provider / Supplier
  - Amount / Quantity
- Semantic Matching Demo
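
The slides do not specify the knowledge base or the distance measure. As a rough sketch, assuming the knowledge base is an undirected graph of concepts and that ontological distance is the shortest path length between two concepts, ranked semantic matching might look like this. The graph contents, the `1 / (1 + distance)` score, and all names are hypothetical.

```python
from collections import deque

# Toy knowledge base: each concept maps to its directly related concepts.
KNOWLEDGE_BASE = {
    "sex": {"gender"},
    "gender": {"sex", "attribute"},
    "provider": {"supplier"},
    "supplier": {"provider", "organization"},
    "organization": {"supplier", "entity"},
    "amount": {"quantity"},
    "quantity": {"amount", "attribute"},
    "attribute": {"gender", "quantity", "entity"},
    "entity": {"organization", "attribute"},
}


def ontological_distance(start, goal, graph=KNOWLEDGE_BASE):
    """Shortest path length between two concepts, or None if unconnected."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, distance = queue.popleft()
        if node == goal:
            return distance
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, distance + 1))
    return None


def semantic_similarity(a, b):
    """Map distance to a score in (0, 1]; unrelated concepts score 0."""
    distance = ontological_distance(a, b)
    return 0.0 if distance is None else 1.0 / (1 + distance)


def rank_semantic(target, candidates):
    """Return candidates ordered from most to least semantically similar."""
    return sorted(candidates,
                  key=lambda c: semantic_similarity(target, c),
                  reverse=True)
```

Here `rank_semantic("amount", ["supplier", "quantity", "sex"])` puts quantity first although the two words share almost no characters, which is the situation the Amount / Quantity pair above illustrates.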

3rd Party Integration
MatchIT Integration
- MatchIT Java API
- Stand-alone application
- Embeddable application (as Eclipse plug-ins)
- Hides unapproved matches
Useful for various 3rd Party applications:
- Data Integration
- Data Discovery
- Ontology Mediation
- Search
- Metadata Management
- Data Cleansing
MetaMatrix Demo

Questions?
MatchIT 30-day trial available at
Michael Schidlowsky