CoopIS2001 Trento, Italy The Use of Machine-Generated Ontologies in Dynamic Information Seeking Giovanni Modica Avigdor Gal Hasan M. Jamil.

Motivating example

Preliminaries
Definition: An ontology is an explicit representation of a conceptualization (Gruber 1993).
Conjecture I: Applications in a given domain base their information exchange on some (shared) underlying ontology.
Observation: Applications in a given domain use different ontology representations.
Conjecture II: Given an application A that utilizes an ontology representation O_A, and an ontology O, there exists an invertible mapping f_A such that f_A(O_A) = O.

Problem description
Given two applications A and B, such that A utilizes an ontology representation O_A and B utilizes an ontology representation O_B, introduce a mapping f_BA such that f_BA(O_B) = O_A.
In a perfect world:
- O is known.
- f_A is known.
- f_B is known.
Then, applying Conjecture II to both A and B: O_A = f_A^{-1}(f_B(O_B)).
Alas:
- O is unknown. At best, an approximation of O exists, in the form of a standard.
- f_A and f_B are unknown: lack of documentation, the mental state of a designer, etc.
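The perfect-world case can be illustrated with a toy example. This is a minimal sketch, not the authors' method: the term names and the dictionaries `f_A` and `f_B` are hypothetical, and each mapping is assumed to be a one-to-one lookup table from an application's terms to the shared ontology O.

```python
# Hypothetical toy data: both applications name the same shared concepts differently.
f_A = {"pickup_date": "PickUpDate", "return_date": "ReturnDate"}    # O_A term -> O term
f_B = {"Pick-up Date": "PickUpDate", "Dropoff Date": "ReturnDate"}  # O_B term -> O term


def invert(mapping):
    """Return the inverse of a one-to-one mapping, or fail if it is not invertible."""
    inverse = {value: key for key, value in mapping.items()}
    if len(inverse) != len(mapping):
        raise ValueError("mapping is not invertible")
    return inverse


def compose_f_BA(f_A, f_B):
    """Build f_BA = f_A^{-1} composed with f_B, mapping O_B terms to O_A terms."""
    f_A_inv = invert(f_A)
    return {b_term: f_A_inv[o_term] for b_term, o_term in f_B.items()}


f_BA = compose_f_BA(f_A, f_B)
```

In practice neither table is available, which is why the approach falls back on approximate matching with a degree of confidence.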

Proposed solution
Given two applications A and B, such that A utilizes an ontology representation O_A and B utilizes an ontology representation O_B, introduce a mapping f_BA such that f_BA depends on the ontology representation.
Each matching is associated with a degree of confidence:
- 0 identifies non-matching terms.
- 1 identifies a crisp matching.

Ontology representation
Dynamic information seeking:
- HTML forms: labels, input fields, scripts
- Assumptions:
  Labels represent terms in an ontology (e.g., Pick-up Date).
  Input fields provide constraints on the value domains (e.g., {Day, 1, …, 31}).
  Scripts, among other things, suggest a precedence relationship (e.g., Pick-up Locations is required before selecting a Car Type).
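The first two assumptions can be sketched with Python's standard `html.parser` module. This is only an illustration: the class name `FormTermExtractor` and the sample form are hypothetical, and the sketch assumes the form marks its labels with `<label>` elements, which many real forms do not.

```python
from html.parser import HTMLParser


class FormTermExtractor(HTMLParser):
    """Collects <label> texts (candidate terms) and <select> options (value domains)."""

    def __init__(self):
        super().__init__()
        self.labels = []          # raw label texts, one entry per <label>
        self.domains = {}         # select name -> list of option values
        self._in_label = False
        self._current_select = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "label":
            self._in_label = True
            self.labels.append("")
        elif tag == "select":
            self._current_select = attrs.get("name", "")
            self.domains[self._current_select] = []
        elif tag == "option" and self._current_select is not None:
            self.domains[self._current_select].append(attrs.get("value", ""))

    def handle_endtag(self, tag):
        if tag == "label":
            self._in_label = False
        elif tag == "select":
            self._current_select = None

    def handle_data(self, data):
        # Only text inside a <label> counts as a candidate term.
        if self._in_label:
            self.labels[-1] += data


# Hypothetical form fragment, shortened to three day values.
SAMPLE_FORM = """
<form>
  <label for="day">Pick-up Date</label>
  <select name="day">
    <option value="1">1</option><option value="2">2</option><option value="31">31</option>
  </select>
</form>
"""

extractor = FormTermExtractor()
extractor.feed(SAMPLE_FORM)
labels = [text.strip() for text in extractor.labels]
```

Precedence relationships suggested by scripts are not covered here, since recovering them would require analysing the page's script code.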

Ontology representation: conceptual modeling approach
Based on Bunge:
- Terms (things)
- Values
- Composition
- Precedence
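One way to picture these four constructs is as a small data structure. The `Term` class and its field names below are assumptions made for illustration; the slides do not show how the tool stores its ontologies.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """A term (thing) with its value domain, sub-terms, and precedence links."""
    name: str
    values: list = field(default_factory=list)    # Values: allowed value domain
    parts: list = field(default_factory=list)     # Composition: sub-terms of this term
    precedes: list = field(default_factory=list)  # Precedence: names of terms that come after


# Hypothetical car-rental terms, taken from the examples in the slides.
day = Term("Day", values=list(range(1, 32)))
pickup_date = Term("Pick-up Date", parts=[day])
car_type = Term("Car Type")
pickup_location = Term("Pick-up Location", precedes=[car_type.name])
```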

Ontology extraction and matching
[Figure: system architecture. Phases: 1 Parsing, 2 Labeling, 3 Ontology Creation, 4 Merging. Components shown: URL, HTML parsing, DOM tree, HTML elements, label identification, FORM elements rules, form rendering, KB, submission, matching algorithms, thesaurus, target ontology, candidate ontology, refined ontology.]

Phase 1: Parsing

Phase 2: Labeling

Merging
Heuristics for ontology merging (Frakes and Baeza-Yates, 1992):
- Textual matching: Date → date; Pickup → pickup
- Ignorable characters removal: *Country → country
- De-hyphenation: Pick-up → Pickup; Pickup → Pick up
- Stop terms removal: Date of Return → Return Date
  Stop terms: a, to, do, does, the, in, or, and, this, those, that, … etc.
- Substring matching: Pickup Location Code ↔ Pick-up location (66%)
- Content matching: Dropoff Day (1,..,31) ↔ Return Day (1,..,31) (100%); Dropoff ↔ Return
- Thesaurus matching: Dropoff Location ↔ Return Location (100%)
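Several of these heuristics can be sketched in a few lines. This is an illustration under stated assumptions, not the tool's algorithm: the function names are hypothetical, the stop-term set adds "of" (implied by the Date of Return example but not in the listed terms), the thesaurus holds a single hypothetical entry, and the confidence score is taken to be the share of words in the longer term that also occur in the other term.

```python
import re

# Listed stop terms plus "of" (an assumption, see the lead-in).
STOP_TERMS = {"a", "to", "do", "does", "the", "in", "or", "and",
              "this", "those", "that", "of"}

# Hypothetical one-entry thesaurus mapping a word to its preferred synonym.
THESAURUS = {"dropoff": "return"}


def normalize(term):
    """Apply the textual, character, hyphen, stop-term and thesaurus heuristics."""
    term = term.lower()                      # textual matching: ignore case
    term = term.replace("-", "")             # de-hyphenation: "pick-up" -> "pickup"
    term = re.sub(r"[^a-z0-9 ]", "", term)   # drop ignorable characters such as "*"
    words = [w for w in term.split() if w not in STOP_TERMS]
    return [THESAURUS.get(w, w) for w in words]


def match_confidence(term_a, term_b):
    """Share of words in the longer normalized term that the two terms have in common."""
    a, b = set(normalize(term_a)), set(normalize(term_b))
    if not a or not b:
        return 0.0
    return len(a & b) / max(len(a), len(b))


score = match_confidence("Pickup Location Code", "Pick-up location")
```

Under these assumptions the substring example scores two shared words out of three, and the stop-term and thesaurus examples score 1.0. Content matching on value domains and the "Pick up" spacing variant are left out of the sketch.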

Phase 4: Merging

Preliminary Results
Two metrics are used for performance analysis (Frakes and Baeza-Yates, 1992):
- Recall (completeness)
- Precision (soundness)
Parameters:
- t_r: number of terms retrieved
- t_m: number of terms matched
- t_e: number of terms effectively matched
Recall = t_m / t_r
Precision = t_e / t_m

Preliminary Results
Example:
- Number of terms in Ontology1: 20
- Number of matches identified: 15
- Recall: 15/20 = 75%
- Number of effective matches: 10
- Precision: 10/15 ≈ 67%
A third metric combines recall and precision. For a precision value P, a recall value R and an importance measure b, the combined metric E is calculated as (Frakes and Baeza-Yates, 1992):
E = 1 - ((1 + b^2) P R) / (b^2 P + R)
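The example figures can be checked with a short script. The function names are hypothetical. The recall and precision ratios are inferred from the 15/20 and 10/15 figures in the example, and the E formula is the form usually given for this measure in the information-retrieval literature, so treat it as an assumption about what the slide displayed.

```python
def recall(t_m, t_r):
    """Matched terms divided by retrieved terms."""
    return t_m / t_r


def precision(t_e, t_m):
    """Effectively matched terms divided by matched terms."""
    return t_e / t_m


def e_measure(p, r, b=1.0):
    """Combined metric E; lower is better, and b weights precision against recall."""
    denominator = b * b * p + r
    if denominator == 0:
        return 1.0  # no precision and no recall: worst possible score
    return 1.0 - ((1 + b * b) * p * r) / denominator


r = recall(t_m=15, t_r=20)      # 20 terms in Ontology1, 15 matches identified
p = precision(t_e=10, t_m=15)   # 10 of the 15 matches are effective
e = e_measure(p, r)             # b = 1 weights precision and recall equally
```

The choice b = 1 is made here only for the illustration; the slide does not state which value of b was used.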

Summary and Future Work
We have introduced:
- Automatic ontology creation
- Automatic matching process
- Preliminary results
Future work is oriented towards:
- Incorporation of query facilities into the tool
- Automatic navigation of web sites for ontology extraction
- Dynamic translation of queries against the target ontology into queries against the multiple candidate ontologies