Intersection Schemas as a Dataspace Integration Technique. 8/21/2014. Richard Brownlow, Alex Poulovassilis.


1 Intersection Schemas as a Dataspace Integration Technique. 8/21/2014. Richard Brownlow, Alex Poulovassilis

2 Contribution A new methodology for lightweight data integration in an incremental pay-as-you-go environment, based on the concept of “Intersection Schemas” and utilising bidirectional transformations at the schema level. An improvement on existing workflows for data integration, to increase the productivity of the incremental data integration process. Development of a demonstrator and user interface to aid the data integrator.
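To illustrate what "bidirectional" can mean at the schema level, here is a minimal sketch under a toy model: a schema is a set of object names, and a pathway is a list of add/delete steps that can be reversed automatically. The names `Step`, `apply_pathway` and `reverse_pathway`, and the object names, are hypothetical and are not taken from the slides or from the AutoMed API.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Step:
    """One toy schema transformation step (hypothetical model)."""
    kind: str  # "add" or "delete"
    obj: str   # name of the schema object affected

    def reverse(self) -> "Step":
        # An add is undone by a delete of the same object, and vice versa.
        return Step("delete" if self.kind == "add" else "add", self.obj)


def apply_pathway(schema: FrozenSet[str], pathway: List[Step]) -> FrozenSet[str]:
    """Apply each step in order and return the resulting schema."""
    result = set(schema)
    for step in pathway:
        if step.kind == "add":
            result.add(step.obj)
        elif step.kind == "delete":
            result.discard(step.obj)
        else:
            raise ValueError(f"unknown step kind: {step.kind}")
    return frozenset(result)


def reverse_pathway(pathway: List[Step]) -> List[Step]:
    """Reverse the order of the steps and reverse each step."""
    return [step.reverse() for step in reversed(pathway)]


# Invented object names, for illustration only.
source = frozenset({"protein", "peptide"})
pathway = [Step("add", "hit"), Step("delete", "peptide")]
target = apply_pathway(source, pathway)
round_trip = apply_pathway(target, reverse_pathway(pathway))
```

In this toy model, applying a pathway and then its reverse returns the original schema, which is the property that lets a single pathway definition serve both directions.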

3 Intersection Schemas Implements a framework for incremental data integration. A component within the existing AutoMed data integration framework. Introduces a new “pay-as-you-go” technique of Intersection Schemas. This allows the integrator to incrementally identify intersections between schemas, and integrate them into the Global Schema.

4 AutoMed Architecture

5 Data Integration via Union-compatible Schemas

6 Intersection Schema

7 Integrated Intersection and Extensional Schemas

8 Global schema derived from Intersection and Extensional Schemas

9 Case Study iSpider: Proteomics data from three different data sources. Mappings defined by domain experts. Mappings constitute the domain knowledge.

10 Illustrative Use Case Based on iSpider Datasets. Three data sources: gpmDB, Pedro, Pepseeker.

11 Illustrative Use Case GUI

12 Workflow
1. Identify the extensional schemas representing the set of data sources that are to be integrated.
2. Initially a federated schema is created from the schemas identified in Step 1.
3. Inspect the schemas identified in Step 1 and select two of them from which to derive an intersection schema.
4. Identify mappings between these two schemas and create an intersection schema.
5. A new Global Schema is created automatically from the Intersection Schema and the extensional schemas by our tool. The user may optionally elect for any redundant objects in the new Global Schema to be dropped.
6. The user may test the Intersection Schema or Global Schema at this stage by running queries on it.
7. Repeat Steps 3 to 6 for each integration iteration.
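One iteration of the workflow (Steps 1 to 5) can be sketched as follows. This is a toy model, not the authors' tool: schemas are sets of object names, the federated schema is the union of source objects prefixed with their source name, and "redundant" objects are taken to be the source objects covered by a mapping. All function names and object names are hypothetical; only the three source names come from the slides.

```python
from typing import Dict, List, Set, Tuple

# (object in first schema, object in second schema, shared name); hypothetical shape
Mapping = Tuple[str, str, str]


def federated_schema(sources: Dict[str, Set[str]]) -> Set[str]:
    """Step 2: union of all source objects, qualified by source name."""
    return {f"{name}.{obj}" for name, objs in sources.items() for obj in objs}


def intersection_schema(sources: Dict[str, Set[str]], first: str, second: str,
                        mappings: List[Mapping]) -> Set[str]:
    """Step 4: one shared object per mapping between the two chosen schemas."""
    for obj_a, obj_b, _ in mappings:
        if obj_a not in sources[first] or obj_b not in sources[second]:
            raise KeyError(f"mapping refers to an unknown object: {obj_a}, {obj_b}")
    return {shared for _, _, shared in mappings}


def global_schema(sources: Dict[str, Set[str]], first: str, second: str,
                  mappings: List[Mapping], drop_redundant: bool = False) -> Set[str]:
    """Step 5: intersection objects plus extensional objects, optionally pruned."""
    schema = federated_schema(sources) | intersection_schema(
        sources, first, second, mappings)
    if drop_redundant:
        # Assumption: an object is redundant once a mapping covers it.
        for obj_a, obj_b, _ in mappings:
            schema.discard(f"{first}.{obj_a}")
            schema.discard(f"{second}.{obj_b}")
    return schema


# Step 1: extensional schemas. Object names are invented for illustration.
sources = {
    "gpmDB": {"proseq", "peptide"},
    "Pedro": {"protein", "peptidehit"},
    "Pepseeker": {"proteinhit"},
}
# Steps 3 and 4: pick two schemas and state one mapping between them.
mappings = [("proseq", "protein", "Protein")]

pruned = global_schema(sources, "gpmDB", "Pedro", mappings, drop_redundant=True)
unpruned = global_schema(sources, "gpmDB", "Pedro", mappings)
```

Steps 6 and 7 would correspond to querying the result and calling the same functions again with a different pair of schemas and a new set of mappings.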

13 Evaluation Comparison of the Intersection Schema methodology versus a “classical” ladder-based integration methodology. For ladder-based integration: 95 manually defined transformations. For Intersection Schema-based integration: 26 manually defined transformations.
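The size of the reduction follows directly from the two counts reported above. A short calculation, with variable names chosen here for illustration:

```python
# Manually defined transformations reported in the evaluation.
ladder_transformations = 95
intersection_transformations = 26

# Absolute and relative saving of the Intersection Schema approach.
saved = ladder_transformations - intersection_transformations
reduction = saved / ladder_transformations

print(f"{saved} fewer manually defined transformations ({reduction:.1%} reduction)")
# prints "69 fewer manually defined transformations (72.6% reduction)"
```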

14 Conclusions We have demonstrated the technique on a real-world data integration scenario and have seen that the number of user-defined steps required to perform the integration is significantly reduced compared to the original data integration methodology used by the domain experts on that project. We have shown how the AutoMed toolkit and bidirectional schema transformations can be used to underpin a new lightweight data integration technique within an incremental pay-as-you-go data integration process.

15 Future Work Extending the methodology so that intersections can be created between any number of source schemas at each iteration of the process, rather than just two as at present. Detailed user evaluations.

16 Any Questions?

17 Appendix Example iSpider transformations from the original project.


