D2I, Modena, 27 April 2001: Methodologies and techniques for the extraction, the representation and the integration of structured and semi-structured information sources


1 D2I, Modena, 27 April 2001
Methodologies and techniques for the extraction, the representation and the integration of structured and semi-structured information sources
Responsible unit: CS-RC
Participating units: BO, CS-RC, MI, MO, RM

2 Synthesis
Aim: developing a framework for uniformly and semi-automatically handling information sources having large sizes and different formats and structures.
The proposed framework consists of three steps:
– The representation of involved information sources through a conceptual model
– The exploitation of the conceptual model for extracting interscheme properties
– The exploitation of interscheme properties for obtaining an integrated and uniform representation of involved information sources
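The three steps above can be sketched as a small pipeline. This is only a toy illustration, not any of the project's algorithms: the function names (represent, extract_interscheme_properties, integrate) are hypothetical, the "conceptual model" is reduced to a set of concept names, and the only interscheme property detected is a name match that is labeled "synonymy" for the sake of the example.

```python
def represent(raw_fields):
    """Step 1 (toy): the conceptual model is a set of normalised concept names."""
    return {f.strip().lower() for f in raw_fields}


def extract_interscheme_properties(models):
    """Step 2 (toy): report a 'synonymy' for every concept name shared by two sources."""
    props = []
    names = sorted(models)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            for concept in sorted(models[a] & models[b]):
                props.append(("synonymy", f"{a}.{concept}", f"{b}.{concept}"))
    return props


def integrate(models, props):
    """Step 3 (toy): merge concepts linked by a property; keep the others qualified by source."""
    merged = {p[1].split(".", 1)[1] for p in props}
    integrated = set(merged)
    for source, concepts in models.items():
        for concept in concepts - merged:
            integrated.add(f"{source}.{concept}")
    return sorted(integrated)


# Hypothetical sources, invented for this example only.
models = {
    "hr": represent(["Employee", "Dept"]),
    "payroll": represent(["employee", "Salary"]),
}
props = extract_interscheme_properties(models)
print(integrate(models, props))  # ['employee', 'hr.dept', 'payroll.salary']
```

Each step consumes only the output of the previous one, which is the point of the sketch; a real implementation would replace each function with the model and algorithms of one of the three approaches.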

3 Synthesis
The framework stores all the necessary information in a Metadata Repository.
The repository contains all information about the involved sources, their concepts, the properties existing among concepts, etc.
All steps of the proposed framework take their inputs from the Metadata Repository and store their outputs in it.
In addition, the Metadata Repository is used by applications, such as Data Warehousing and Data Mining, that exploit the integrated and uniform representation of the involved information sources, which constitutes the output of our framework.

4 Synthesis
Many implementations of the proposed framework, based on completely different conceptual models and algorithms, can be designed.
In this report we propose three approaches which implement the general ideas of the framework:
– A graph-based approach, which extends, to semi-structured data, the ideas at the basis of the system DIKE
– An object-oriented approach, which extends, to semi-structured data, the ideas underlying the system MOMIS
– A Description Logic based approach

5 Synthesis
Conceptual models for representing and handling information sources having different formats and structures:
– The graph-based approach: the SDR-Network
– The object-oriented approach: the ODL_I3 data model
– The Description Logic approach: the DLR Description Logic

6 Synthesis
Metadata Repository Architectures
A Metadata Repository Architecture based on the SDR-Network
– It is composed of:
  – A metascheme, storing the information about involved sources, their concepts and interscheme properties among concepts
  – A set of meta-operators, for querying and modifying the metascheme
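The split between a metascheme and a set of meta-operators can be illustrated with a minimal in-memory sketch. Everything named here (Metascheme, MetadataRepository, InterschemeProperty, and the four methods) is hypothetical and chosen for illustration; the slides do not specify the actual structure of the metascheme or the list of meta-operators.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class InterschemeProperty:
    kind: str        # illustrative label, e.g. "synonymy"
    concept_a: str   # qualified name, e.g. "S1.Person"
    concept_b: str


@dataclass
class Metascheme:
    # Assumed layout: source name -> set of concept names, plus known properties.
    sources: dict = field(default_factory=dict)
    properties: set = field(default_factory=set)


class MetadataRepository:
    """Hypothetical repository: the methods play the role of meta-operators."""

    def __init__(self):
        self.metascheme = Metascheme()

    # Meta-operators that modify the metascheme
    def add_source(self, name, concepts):
        self.metascheme.sources[name] = set(concepts)

    def add_property(self, kind, concept_a, concept_b):
        self.metascheme.properties.add(
            InterschemeProperty(kind, concept_a, concept_b))

    # Meta-operators that query the metascheme
    def concepts_of(self, source):
        return sorted(self.metascheme.sources.get(source, ()))

    def properties_involving(self, concept):
        hits = [p for p in self.metascheme.properties
                if concept in (p.concept_a, p.concept_b)]
        return sorted(hits, key=lambda p: (p.kind, p.concept_a, p.concept_b))
```

In this sketch every framework step would read from and write to the same MetadataRepository instance, which mirrors the role the slides assign to the repository.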

7 Synthesis A Metadata Repository Architecture based on the SDR-Network

8 Synthesis A Metadata Repository Architecture based on the ODL_I3 data model

9 Synthesis
Extraction of interscheme properties
– The graph-based approach both extracts and represents interscheme properties by exploiting the SDR-Network and the related metrics
– The object-oriented approach both extracts and represents interscheme properties by exploiting the ODL_I3 data model
– Both of them store extracted interscheme properties in the corresponding Metadata Repositories
– The Description Logic based approach can suitably represent interscheme properties derived by other approaches

10 Synthesis
Integration of involved information sources
– The graph-based approach exploits interscheme properties for carrying out a scheme integration
– The object-oriented approach exploits derived interscheme properties for carrying out a scheme integration
– The Description Logic based approach is able to carry out a data integration

11 Open Problems and Future Work
While there is a clear convergence on the general structure of the framework to be adopted for the project, each unit brings its own perspective to the issues.
We have three different approaches to the problem on the table, each of which concentrates on certain aspects of the problem.
Each approach uses its own formalism and technical grounding.
As future work, it is necessary to harmonize those approaches to attain a single, well-structured and detailed framework to support integration activities.
This will be possible only in the context of a more general agreement on the features of the Metadata Repository.

