Interactive Generation of Integrated Schemas Laura Chiticariu et al. Presented by: Meher Talat Shaikh.
Objectives: Create a unified schema from a set of existing source schemas. The unified schema provides a standard representation of data drawn from heterogeneous sources. Applications: A unified schema provides a single access point against which queries are posed. It also supports consolidating the databases of merged organizations.
Overview: Convert each schema into a graph of concepts with Has-A relationships. Identify matching concepts between different graphs. For every pair of matching concepts, either merge the concepts or keep them separate. Allow the user to specify constraints on the merging process. The result is an adaptive and interactive enumeration method.
Correspondences: Signify “semantically equivalent” elements in two schemas. They are bidirectional. They can be specified by the user or discovered by schema matching techniques. The approach considers correspondences between attributes.
Schema Integration: The integrated target schema T must be capable of representing all the attributes in the source schemas. Every attribute in T must represent some attribute of the input source schemas. The source schema data is transformed into T data via a mapping M. In short, all basic relationships in the source schemas are preserved in T.
Graph of Concepts: Each schema is converted into a logical view (a graph of concepts). Each concept is a relation name with an associated set of attributes. Concepts in a schema may have references to other concepts, and these references are captured by Has-A edges. A concept graph is a pair (V, HasA), where V is the set of concepts and HasA is the set of Has-A edges between them.
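The pair (V, HasA) can be sketched as a small data structure. This is only an illustrative sketch: the class names `Concept` and `ConceptGraph`, the method names, and the example concepts `emp` and `dept` with their attributes are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Concept:
    """A relation name with its associated set of attributes."""
    name: str
    attributes: frozenset


@dataclass
class ConceptGraph:
    """A pair (V, HasA): concepts plus directed Has-A edges between them."""
    concepts: dict = field(default_factory=dict)  # V, keyed by concept name
    has_a: set = field(default_factory=set)       # HasA, as (source, target) pairs

    def add_concept(self, name, attributes):
        self.concepts[name] = Concept(name, frozenset(attributes))

    def add_has_a(self, source, target):
        # A Has-A edge may only connect concepts that are already in V.
        if source not in self.concepts or target not in self.concepts:
            raise KeyError("both endpoints must be concepts in the graph")
        self.has_a.add((source, target))

    def references(self, name):
        """Names of the concepts that `name` refers to through Has-A edges."""
        return sorted(t for s, t in self.has_a if s == name)


# Hypothetical example: an employee concept that references a department.
graph = ConceptGraph()
graph.add_concept("emp", ["ename", "salary"])
graph.add_concept("dept", ["dname", "budget"])
graph.add_has_a("emp", "dept")
```

Keeping the Has-A edges as a plain set of name pairs makes it easy to carry them over later, when concepts from different graphs are merged.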
Matching of Concepts: Use the correspondences to match the concepts. Formal definition: Let S1 and S2 be two source schemas and let C be a set of correspondences between attributes of S1 and S2. Let A be a concept of S1 and B be a concept of S2. We say that A and B match if there is at least one attribute a in A and one attribute b in B such that there is a correspondence at the schema level between attribute a and attribute b. The result is a matching graph.
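The definition above translates almost directly into code. The sketch below assumes that each schema is given as a dictionary from concept name to attribute set and that attribute names are unique within a schema; the function name `matching_graph` and the example schemas are hypothetical.

```python
def matching_graph(concepts1, concepts2, correspondences):
    """Return the matching edges between two sets of concepts.

    concepts1, concepts2: dict mapping concept name -> set of attribute names.
    correspondences: iterable of (a, b) pairs, a from schema 1, b from schema 2.
    Concepts A and B match if at least one correspondence (a, b) has
    a in A and b in B. Each edge records the correspondences that justify it.
    """
    correspondences = list(correspondences)
    edges = {}
    for name_a, attrs_a in concepts1.items():
        for name_b, attrs_b in concepts2.items():
            shared = [(a, b) for a, b in correspondences
                      if a in attrs_a and b in attrs_b]
            if shared:
                edges[(name_a, name_b)] = sorted(shared)
    return edges


# Hypothetical schemas and correspondences, for illustration only.
s1 = {"dept": {"dno", "dname"}, "emp": {"eno", "ename"}}
s2 = {"org": {"oname"}, "employee": {"eid", "name"}}
corr = [("dname", "oname"), ("eno", "eid"), ("ename", "name")]

edges = matching_graph(s1, s2, corr)
# "dept" matches "org", "emp" matches "employee"; no other pair matches.
```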
Merging of Concepts: The merging step maintains an integration function f_X for the given assignment X. The function specifies how each individual concept, attribute and HasA edge in a source concept graph relates to the integrated concept graph.
Mapping: Input: the integrated concept graph G1, the matching graph G and the integration function f_X. Output: a mapping M between source and integrated concepts. For every source concept C, a mapping M_C is created in M, which specifies how an instance of C, together with all its relationships, is to be transformed into an instance of an integrated concept C1.
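One way to picture the concept-level part of the integration function is the sketch below. It assumes that an assignment gives each matching edge a Boolean (merge or keep separate) and that merging is transitive, so merged concepts form connected groups. These assumptions, the union-find approach, the function name `integrate_concepts`, and the naming of integrated concepts by joining source names with "+" are all illustrative choices, not the paper's algorithm. Attributes and HasA edges are left out.

```python
def integrate_concepts(concepts, assignment):
    """Map every source concept to the integrated concept that contains it.

    concepts: iterable of source concept names (unique across both schemas).
    assignment: dict mapping a matching edge (a, b) -> True (merge) or
                False (keep separate).
    Returns a dict: source concept name -> integrated concept name.
    """
    parent = {c: c for c in concepts}

    def find(c):
        # Follow parent links to the group representative, with path halving.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for (a, b), merge in assignment.items():
        if merge:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[max(root_a, root_b)] = min(root_a, root_b)

    groups = {}
    for c in parent:
        groups.setdefault(find(c), []).append(c)

    f = {}
    for members in groups.values():
        label = "+".join(sorted(members))  # hypothetical naming scheme
        for c in members:
            f[c] = label
    return f


# Hypothetical assignment: merge dept with org, keep emp and employee apart.
f = integrate_concepts(
    ["dept", "emp", "org", "employee"],
    {("dept", "org"): True, ("emp", "employee"): False},
)
```

Here `f` sends both `dept` and `org` to the single integrated concept `dept+org`, while `emp` and `employee` each map to themselves.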
Adaptive Enumeration: The method does not enumerate all integrated schemas. Target schemas are output one by one and the user is allowed to browse through the schemas. Enumeration constraints: The user can express additional constraints on how concepts should be merged. Constraints take the forms Apply(x), NOT Apply(x), Merge(A1, …, An) and NOT Merge(A1, …, An), e.g. (NOT) Merge(org, location, emp, phone, fund).
Strengths and Weaknesses: Strengths: The model is simple: it includes HasA edges as the basic form of relationships, and a simpler form of Contains. Graphs of concepts can express most of the essential features that appear in schemas or in conceptual models. The input is just a set of atomic correspondences. Weaknesses: The approach does not resolve type and representation conflicts. No weights or probabilities are involved in the matching between concepts.
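The one-by-one output with user constraints can be sketched as a lazy generator. This sketch assumes that x names a matching edge, that Apply(x) forces the edge to be merged and NOT Apply(x) forces it to stay separate; it leaves out Merge and NOT Merge constraints and does not detect different assignments that lead to the same integrated schema. The function name `enumerate_assignments` and the edge names are hypothetical.

```python
from itertools import product


def enumerate_assignments(edges, apply=(), not_apply=()):
    """Lazily yield Boolean assignments over matching edges.

    edges: list of matching-edge identifiers.
    apply: edges the user forces to True (merge).
    not_apply: edges the user forces to False (keep separate).
    Nothing is computed until the caller asks for the next assignment.
    """
    apply, not_apply = set(apply), set(not_apply)
    if apply & not_apply:
        raise ValueError("an edge cannot be both applied and not applied")

    # Only unconstrained edges are enumerated; constrained ones are fixed.
    free = [e for e in edges if e not in apply and e not in not_apply]
    for bits in product([True, False], repeat=len(free)):
        assignment = {e: True for e in edges if e in apply}
        assignment.update({e: False for e in edges if e in not_apply})
        assignment.update(zip(free, bits))
        yield assignment


# Hypothetical edges x1..x3 with two user constraints.
candidates = enumerate_assignments(["x1", "x2", "x3"],
                                   apply=["x1"], not_apply=["x3"])
first = next(candidates)  # the user inspects one candidate at a time
```

Each constraint removes one edge from the free list, which halves the number of candidates the user still has to browse.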