ISDSI 2009 Francesco Guerra– Università di Modena e Reggio Emilia 1 DB unimo Searching for data and services F. Guerra 1, A. Maurino 2, M. Palmonari.

Presentation transcript:

ISDSI 2009 Francesco Guerra – Università di Modena e Reggio Emilia, DB unimo

Searching for data and services
F. Guerra (1), A. Maurino (2), M. Palmonari (2), G. Pasi (2), A. Sala (3)
(1) DEA - Università di Modena e Reggio Emilia, v.le Sarca 336, Milano, Italy
(2) DISCO - Università di Milano Bicocca, v.le Risorgimento 2, Bologna, Italy
(3) DII - Università di Modena e Reggio Emilia, via Vignolese 905, Modena, Italy
1st International Workshop on Interoperability through Semantic Data and Service Integration
25 June 2009, Camogli, Italy

Outline
1. Motivation
2. Building the Global Data and Service View at Set-up Time
3. Data and eService Retrieval
4. Conclusion and future work

Motivation
Research on data integration and on service discovery has, from the beginning, involved different (not always overlapping) communities.
– Data and services are described with different models, and different techniques to retrieve data and services have been developed.
From a user perspective, the border between data and services is often not so definite, since data and services provide complementary views of the available resources.
Users need new techniques to manage data and services in a unified way.
Integration of data and services can be tackled from different perspectives:
– Access to data is guaranteed through Service Oriented Architectures (SOA), and Web services are exploited to provide information integration platforms;
– A global view of the data sources and of the eServices available in the peer is provided, to support access to the two complementary kinds of resources at the same time.

Motivation (2)
The problem we address is to retrieve, among the many services available, the ones that are related to the query, according to the semantics of the terms involved in the query.
Select Name, Country from Accommodation Where City=Modena

The approach (overview)
We assume a mediator-based data integration system which provides a global virtual view of data: the Semantic Peer Data Ontology (SPDO).
We assume a set of semantically annotated service descriptions.
– Ontologies used in the service descriptions can be developed outside the peer and are not known in advance in the integration process.
We propose a semantic-based approach to perform data and service integration:
– given a SQL-like query expressed in the terminology of the SPDO, retrieve all the services that can be considered related to the query on the data sources.
The approach developed is based on:
– a mediator-based data integration system, the MOMIS system (Mediator envirOnment for Multiple Information Sources);
– a service retrieval engine based on IR techniques, performing semantic indexing of service descriptions and keyword-based semantic search.

The approach (overview)
The integration of data and services is achieved by:
1. building the SPDO (a functionality already provided by MOMIS),
2. building a Global Service Ontology (GSO) consisting of the ontologies used in the service semantic descriptions,
3. defining a set of mappings between the SPDO and the GSO,
4. exploiting, at query time, query rewriting techniques based on these mappings to build a keyword-based query for service retrieval, expressed in the GSO terminology, starting from a SQL-like query on the data sources.

Building the Global Data and Service View
The Global Light Service Ontology is built by means of the following steps:
– service indexing,
– Global Service Ontology (GSO) construction,
– Global Light Service Ontology (GLSO) construction,
– Semantic Similarity Matrix (SSM) definition.
The SPDO is built by exploiting the MOMIS integration system.

MOMIS

Service Indexing
Our approach requires a formal representation of the service descriptions. It is based on full-text indexing, which extracts terms from six specific sections of the service description:
– service name,
– service description,
– input,
– output,
– pre-condition,
– post-condition.
A set of index terms I, which will be part of the dictionary, is extracted.
– I_O = the set of index terms consisting of ontology terms
– I_T = the set of index terms extracted from textual descriptions
The indexing structure is based on a structured document approach, where the inverted file structure consists of:
– a dictionary file based on I,
– a posting file, with a list of references to the service sections where the considered term occurs.
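The inverted file structure described above can be sketched as follows. This is only an illustration under stated assumptions: the section key names, the tokenizer, and the two sample services are hypothetical and do not come from the presentation, and the distinction between I_O and I_T is not modelled.

```python
import re
from collections import defaultdict

# The six sections named on the slide; the key names are illustrative assumptions.
SECTIONS = ("name", "description", "input", "output", "precondition", "postcondition")


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (a simplifying assumption)."""
    return [token for token in re.split(r"[^a-z0-9]+", text.lower()) if token]


def build_index(services):
    """Build (dictionary, postings) from a mapping service_id -> {section: text}."""
    postings = defaultdict(set)
    for service_id, sections in services.items():
        for section in SECTIONS:
            for term in tokenize(sections.get(section, "")):
                # Each posting refers to the service section where the term occurs.
                postings[term].add((service_id, section))
    dictionary = sorted(postings)
    return dictionary, dict(postings)


# Hypothetical service descriptions, for illustration only.
services = {
    "s1": {"name": "HotelBooking", "description": "Book a hotel in a city",
           "input": "City", "output": "Hotel"},
    "s2": {"name": "CityWeather", "description": "Weather forecast for a city",
           "input": "City", "output": "Forecast"},
}
dictionary, postings = build_index(services)
```

Keeping the section in each posting is what makes this a structured-document index: a later ranking step could weight a match in the output section differently from a match in the free-text description.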

GSO construction
The GSO is built by:
– loosely merging each service ontology O such that i belongs to O for some i in I_O,
– associating a concept C_i with each i in I_T, introducing a class Terms, subclass of Thing, in the GSO, and stating that for every i in I_T, C_i is a subclass of Terms.
Loosely merging means that the service ontologies (SOs) are merged without attempting to integrate similar concepts across the different integrated ontologies.
– If the source SOs are consistent, the GSO can be assumed to be consistent.
– Loose merging is clearly not the optimal choice with respect to ontology integration.
– Since the XIRE component is based on approximate IR techniques and semantic similarity, approximate solutions to the ontology integration problem can be considered acceptable; on the other hand, the whole GSO building process needs to be fully automated.
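A minimal sketch of loose merging, assuming each service ontology is reduced to a set of (subclass, superclass) pairs. The function names, the name-prefixing scheme, and the sample ontologies are hypothetical; the presentation does not say how concepts are kept apart in practice.

```python
def qualify(ontology_name, concept):
    """Prefix a concept with its ontology name; the shared root Thing is left as-is."""
    return concept if concept == "Thing" else f"{ontology_name}:{concept}"


def loose_merge(service_ontologies, ontology_index_terms, text_index_terms):
    """Return the GSO as a set of (subclass, superclass) pairs."""
    gso = set()
    for name, axioms in service_ontologies.items():
        concepts = {concept for axiom in axioms for concept in axiom}
        # Merge only ontologies containing at least one index term from I_O.
        if concepts & ontology_index_terms:
            # Qualified names keep same-named concepts from different ontologies apart:
            # no attempt is made to integrate similar concepts.
            gso |= {(qualify(name, sub), qualify(name, sup)) for sub, sup in axioms}
    # Terms from textual descriptions (I_T) become subclasses of a new class Terms.
    gso.add(("Terms", "Thing"))
    gso |= {(f"term:{term}", "Terms") for term in text_index_terms}
    return gso


# Hypothetical ontologies, for illustration only.
ontologies = {
    "travel": {("Hotel", "Accommodation"), ("Accommodation", "Thing")},
    "tourism": {("Hotel", "Lodging"), ("Lodging", "Thing")},
    "finance": {("Loan", "Thing")},
}
gso = loose_merge(ontologies, {"Hotel"}, {"booking"})
```

In this example the two Hotel concepts stay distinct (travel:Hotel and tourism:Hotel), which is the sense in which the merge is loose, and the finance ontology is left out because none of its concepts is an index term.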

GLSO construction and Semantic Similarity Matrix
The GSO may turn out to be extremely large: only a subset of the terms of the ontologies is relevant to the Semantic Web Service (SWS) descriptions.
– A technique to reduce the ontology size is exploited, and a GLSO (Global Light Service Ontology) is obtained.
– We extract from the GSO the subontology that preserves the meanings of the terms explicitly used in the service descriptions, namely the set of index terms I.
The Semantic Similarity Matrix (SSM), which is exploited later for query expansion at query time, is computed.
– The SSM is defined by analyzing the GLSO structure according to a semantic similarity measure proposed in the literature; it takes into account subclass paths, domain and range restrictions on properties, membership of instances, and so on.
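The presentation does not name the similarity measure it uses. As a stand-in, the sketch below uses a simple path-based measure, 1 / (1 + d), where d is the length of the shortest subclass path between two terms. It considers subclass paths only and ignores property restrictions and instances. All names and the sample hierarchy are hypothetical.

```python
from collections import deque


def shortest_path_length(edges, source, target):
    """Breadth-first search over subclass links, treated as undirected edges."""
    graph = {}
    for sub, sup in edges:
        graph.setdefault(sub, set()).add(sup)
        graph.setdefault(sup, set()).add(sub)
    if source not in graph or target not in graph:
        return None
    queue = deque([(source, 0)])
    seen = {source}
    while queue:
        node, distance = queue.popleft()
        if node == target:
            return distance
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, distance + 1))
    return None  # the two terms are not connected


def similarity_matrix(edges, terms):
    """Map each ordered pair of terms to 1 / (1 + path length), or 0.0 if unconnected."""
    ssm = {}
    for a in terms:
        for b in terms:
            distance = shortest_path_length(edges, a, b)
            ssm[(a, b)] = 0.0 if distance is None else 1.0 / (1 + distance)
    return ssm


# Hypothetical GLSO fragment, for illustration only.
edges = {("Hotel", "Accommodation"), ("Hostel", "Accommodation"), ("Flight", "Transport")}
ssm = similarity_matrix(edges, ["Hotel", "Hostel", "Accommodation", "Flight"])
```

Recomputing the graph for every pair keeps the sketch short; for a realistic GLSO one would build the adjacency structure once and run a single search per term.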

Mapping of Data and Service Ontologies
Mappings between the elements of the SPDO and the GLSO are generated by exploiting and properly modifying the MOMIS clustering algorithm.
The clustering algorithm takes as input the SPDO and the GLSO, with their associated metadata, and generates a set of clusters of classes belonging to the SPDO and the GLSO.
Mappings are automatically generated from the clustering result. Three cases are possible:
– A cluster contains only SPDO classes: it is not exploited for the mapping generation. Such a cluster is caused by the selection of a clustering threshold less selective than the one chosen in the SPDO creation process.
– A cluster contains only GLSO classes: it is not exploited for the mapping generation. It means that there are descriptions of Web Services which are strongly related.
– A cluster contains classes belonging to both the SPDO and the GLSO: this cluster produces, for each SPDO class, a mapping to each GLSO class.
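The three cases above can be sketched as a single function over already-computed clusters. The clustering itself is not shown, and the function name, the "SPDO:" and "GLSO:" prefixes, and the sample clusters are hypothetical.

```python
def mappings_from_clusters(clusters, spdo_classes, glso_classes):
    """Return (spdo_class, glso_class) pairs for every mixed cluster."""
    mappings = []
    for cluster in clusters:
        spdo = sorted(c for c in cluster if c in spdo_classes)
        glso = sorted(c for c in cluster if c in glso_classes)
        # A cluster with classes from only one ontology yields an empty product,
        # so it contributes no mapping; a mixed cluster maps each SPDO class
        # to each GLSO class.
        mappings.extend((s, g) for s in spdo for g in glso)
    return mappings


# Hypothetical clustering result, for illustration only.
spdo_classes = {"SPDO:Accommodation", "SPDO:Restaurant", "SPDO:Trattoria"}
glso_classes = {"GLSO:Hotel", "GLSO:Hostel", "GLSO:Flight", "GLSO:Airline"}
clusters = [
    {"SPDO:Accommodation", "GLSO:Hotel", "GLSO:Hostel"},  # mixed: two mappings
    {"SPDO:Restaurant", "SPDO:Trattoria"},                # SPDO only: ignored
    {"GLSO:Flight", "GLSO:Airline"},                      # GLSO only: ignored
]
mappings = mappings_from_clusters(clusters, spdo_classes, glso_classes)
```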

Example
The following mappings are generated by applying our technique:
Accommodation --> Hotel
Accommodation.Name --> Hotel.Denomination
Accommodation.City --> Hotel.Location
Accommodation.Country --> Hotel.Country
[Figure: GLSO fragment (Hotel, Hotel.Denomination, Hotel.Location, Hotel.Country) and SPDO fragment]

Data and eService Retrieval
select ... from ... where ...
The answer to this query is a data set from the data sources, together with a set of services which are potentially useful, since they are related to the concepts appearing in the query and hence to the retrieved data.
The query processing is divided into two simultaneously executed steps:
– A data set from the data sources is obtained by processing the query on an integrated view. The results are obtained by exploiting the MOMIS Query Manager, which rewrites the global query as an equivalent set of queries expressed on the local schemata (local queries) by means of an unfolding process.
– A set of services related to the query is obtained by exploiting the mappings between the SPDO and the GLSO and the concept of relevant service mapping. Services are retrieved by the XIRE (eXtended Information Retrieval Engine) component, which is a service search engine based on the vector space model.

Data and eService Retrieval (overview)

Managing keywords
Given a query in an SQL-like notation expressed in the SPDO terminology, the set of keywords extracted consists of:
– all the classes given in the FROM clause,
– all the attributes and the values used in the SELECT and WHERE clauses,
– all their ranges defined by ontology classes.
The set of keywords is rewritten into GLSO terms by exploiting the mappings between the SPDO and the GLSO.
Semantic similarity between GLSO terms, as defined in the SSM, is exploited to expand the keyword set into a set of weighted terms.
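A rough sketch of the three steps (keyword extraction, rewriting through the mappings, expansion through the SSM), using the example query and mappings from the earlier slides. Several points are assumptions rather than the authors' method: the query is given as an already-parsed dictionary with class-qualified attributes, ranges defined by ontology classes are not handled, keywords without a mapping (such as the value Modena) are kept unchanged, and the similarity values and the threshold are invented for illustration.

```python
def extract_keywords(query):
    """Collect FROM classes, SELECT attributes, and WHERE attributes and values."""
    keywords = set(query["from"]) | set(query["select"])
    for attribute, value in query["where"]:
        keywords |= {attribute, value}
    return keywords


def rewrite(keywords, mappings):
    """Replace SPDO keywords with their GLSO terms; unmapped keywords are kept."""
    rewritten = set()
    for keyword in keywords:
        rewritten |= set(mappings.get(keyword, [keyword]))
    return rewritten


def expand(terms, ssm, threshold=0.3):
    """Give original terms weight 1.0 and add similar terms weighted by similarity."""
    weighted = {term: 1.0 for term in terms}
    for (a, b), similarity in ssm.items():
        if a in terms and b not in terms and similarity >= threshold:
            weighted[b] = max(weighted.get(b, 0.0), similarity)
    return weighted


# Select Name, Country from Accommodation Where City=Modena, pre-parsed by hand.
query = {
    "select": ["Accommodation.Name", "Accommodation.Country"],
    "from": ["Accommodation"],
    "where": [("Accommodation.City", "Modena")],
}
mappings = {
    "Accommodation": ["Hotel"],
    "Accommodation.Name": ["Hotel.Denomination"],
    "Accommodation.City": ["Hotel.Location"],
    "Accommodation.Country": ["Hotel.Country"],
}
# Invented similarity values, for illustration only.
ssm = {("Hotel", "Hostel"): 0.5, ("Hotel", "Flight"): 0.1}

glso_terms = rewrite(extract_keywords(query), mappings)
weighted_terms = expand(glso_terms, ssm)
```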

eServices retrieval
Query evaluation is based on the vector space model:
– In this model, both documents (that is, Web Service descriptions) and queries (the extracted keywords) are represented as vectors in an n-dimensional space.
– Each document vector has non-zero weights for those keywords which are index terms for that description.
– Relevance weights are used to modify the weights in the list resulting from the keyword evaluation process.
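The presentation does not state which similarity function XIRE applies between query and document vectors. The sketch below assumes cosine similarity, the customary choice for the vector space model, over sparse vectors stored as dictionaries. The function names, sample weights, and service identifiers are hypothetical.

```python
import math


def cosine(query, document):
    """Cosine similarity between two sparse vectors given as {term: weight}."""
    dot = sum(weight * document.get(term, 0.0) for term, weight in query.items())
    query_norm = math.sqrt(sum(w * w for w in query.values()))
    document_norm = math.sqrt(sum(w * w for w in document.values()))
    if query_norm == 0.0 or document_norm == 0.0:
        return 0.0
    return dot / (query_norm * document_norm)


def rank(query, documents):
    """Return (service_id, score) pairs sorted by decreasing score."""
    scores = {service_id: cosine(query, vector) for service_id, vector in documents.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical weighted query and service vectors, for illustration only.
query_vector = {"Hotel": 1.0, "Hotel.Location": 1.0, "Hostel": 0.5}
service_vectors = {
    "weather": {"Forecast": 1.0, "Hotel.Location": 0.5},
    "hotel_booking": {"Hotel": 1.0, "Hotel.Location": 1.0},
}
ranking = rank(query_vector, service_vectors)
```

Here hotel_booking ranks above weather because it shares two full-weight terms with the query, while weather shares only one term and at a lower weight.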

Conclusion and future work
In this paper we introduced a technique for publishing and retrieving a unified view of data and services.
Such a unified view may be exploited to improve the user's knowledge of a set of sources and to retrieve a list of Web services related to a data set.
The approach is semi-automatic, and works jointly with the tools which are typically provided for searching for data and services separately.
Future work will focus on evaluating the effectiveness of the approach in the real cases provided within the NeP4B project, and against the OWLS-TC benchmark.