Published by Dominique McKinney. Modified about 1 year ago.

1
SECTION 21.5 Eilbroun Benjamin CS 257 – Dr. TY Lin INFORMATION INTEGRATION

2
Presentation Outline
- 21.5 Optimizing Mediator Queries
- Simplified Adornment Notation
- Obtaining Answers for Subgoals
- The Chain Algorithm
- Incorporating Union Views at the Mediator

3
21.5 Optimizing Mediator Queries
Chain algorithm: a greedy algorithm that finds a way to answer the query by sending a sequence of requests to its sources.
- It always finds a solution, provided at least one solution exists.
- The solution it finds may not be optimal.

4
Simplified Adornment Notation
A subgoal of a query at the mediator is limited to b (bound) and f (free) adornments. We use the following convention for describing adornments: name^adornments(attributes), where:
- name is the name of the relation
- the number of adornments equals the number of attributes
For example, R^bf(1,a) says that the first argument of R is bound and the second is free.

5
Obtaining Answers for Subgoals
Rules for subgoals and sources: suppose we have the subgoal R^{x_1 x_2 … x_n}(a_1, a_2, …, a_n), and the source adornment for R is y_1 y_2 … y_n. The adornment on the subgoal matches the adornment at the source if, for each i:
- If y_i is b or c[S], then x_i = b.
- If x_i = f, then y_i is not output restricted (not primed).
Note that if y_i is f, u, or o[S], then x_i can be either b or f.
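
The matching rules can be sketched in a few lines of Python. This is only an illustration: the string encoding of source adornments (for example "c'[2,3,5]", with an apostrophe for the prime) and the names `parse` and `matches` are assumptions made here, not part of the slides.

```python
import re

def parse(component):
    """Split one source adornment component, e.g. "c'[2,3,5]", into
    (kind, primed, values). The string encoding is an assumption."""
    m = re.fullmatch(r"([bfuco])('?)(?:\[(.*)\])?", component)
    if m is None:
        raise ValueError(f"bad adornment component: {component!r}")
    kind, prime, body = m.groups()
    values = set(body.split(",")) if body else set()
    return kind, prime == "'", values

def matches(subgoal, source):
    """subgoal: string over {b, f}; source: list of component strings."""
    if len(subgoal) != len(source):
        return False
    for x, y in zip(subgoal, source):
        kind, primed, _ = parse(y)
        if kind in ("b", "c") and x != "b":
            return False      # the source requires a binding here
        if x == "f" and primed:
            return False      # a free argument must be part of the output
    return True

print(matches("ff", ["c'[2,3,5]", "f"]))   # first position is not bound
print(matches("bf", ["c'[2,3,5]", "f"]))
```

Positions adorned f, u, or o[S] at the source never cause a mismatch on their own, which is why only the two checks above are needed.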

6
The Chain Algorithm
Maintains 2 types of information:
- An adornment for each subgoal.
- A relation X that is the join of the relations for all the subgoals that have been resolved.
Initially, a position in a subgoal's adornment is b iff the mediator query provides a constant binding for the corresponding argument of that subgoal; every other position is f. Initially, X is a relation over no attributes, containing just an empty tuple.
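
As a small illustration of the initial state, the sketch below derives the starting adornment of a subgoal from its arguments. It assumes, purely for the example, that variables are written as Python strings and constants as any other value; `initial_adornment` is a hypothetical helper name.

```python
def initial_adornment(args):
    """Return 'b' for each constant argument and 'f' for each variable.

    Assumption: variables are Python strings, constants are anything else.
    """
    return "".join("f" if isinstance(a, str) else "b" for a in args)

# X starts as a relation over no attributes holding exactly one empty tuple.
X = [{}]

print(initial_adornment((1, "a")))     # subgoal R(1, a)
print(initial_adornment(("a", "b")))   # subgoal S(a, b)
```

Representing X as a list containing one empty dictionary makes the first join behave correctly: joining any relation with it returns that relation unchanged.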

7
The Chain Algorithm (cont'd)
First, initialize the adornments of the subgoals and X. Then, repeatedly select a subgoal that can be resolved. Let R^α(a_1, a_2, …, a_n) be the subgoal:
1. Wherever α has a b, the corresponding argument of R is either a constant or a variable in the schema of X. Project X onto those of its variables that appear in R.

8
The Chain Algorithm (cont'd)
2. For each tuple t in the projection of X, issue a query to the source as follows (β is the source adornment):
- If a component of β is b, then the corresponding component of α is b, and we can use the corresponding component of t (or the constant in the subgoal) for the source query.
- If a component of β is c[S], and the corresponding component of t is in S, then the corresponding component of α is b, and we can use the corresponding component of t for the source query. If the value is not in S, no source query is issued for t.
- If a component of β is f, and the corresponding component of α is b, provide the constant value for this component in the source query.

9
The Chain Algorithm (cont'd)
- If a component of β is u, then provide no binding for this component in the source query.
- If a component of β is o[S], and the corresponding component of α is f, then treat it as if it were f.
- If a component of β is o[S], and the corresponding component of α is b, then treat it as if it were c[S].
3. Every variable among a_1, a_2, …, a_n is now bound. For each remaining unresolved subgoal, change its adornment so that any position holding one of these variables is b.
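
The per-tuple rules of step 2 can be sketched as a single function that decides which bindings to send to the source. The encoding is an assumption made for this sketch: β is a list of (kind, allowed) pairs, `None` stands for "no binding", output restriction (primes) is ignored, and `source_query` is a hypothetical name.

```python
def source_query(alpha, beta, values):
    """Return the bindings to send to the source, or None if no query
    can be issued for these values.

    alpha  : subgoal adornment, a string over {b, f}
    beta   : source adornment as (kind, allowed) pairs; kind is one of
             'b', 'f', 'u', 'c', 'o' and allowed is a set used by c and o
    values : per position, the constant or the value taken from tuple t
             (None where the position is free)
    """
    query = []
    for a, (kind, allowed), v in zip(alpha, beta, values):
        if kind == "o":                    # o[S] behaves like c[S] or f
            kind = "c" if a == "b" else "f"
        if kind == "b":
            if a != "b":
                return None                # required binding is missing
            query.append(v)
        elif kind == "c":
            if a != "b" or v not in allowed:
                return None                # value must come from S
            query.append(v)
        elif kind == "f":
            query.append(v if a == "b" else None)
        else:                              # 'u': never send a binding
            query.append(None)
    return query

S_beta = [("c", {2, 3, 5}), ("f", set())]
print(source_query("bf", S_beta, [2, None]))   # 2 is in S, query is sent
print(source_query("bf", S_beta, [4, None]))   # 4 is not in S, no query
```

When a position is u at the source but b in the subgoal, the mediator still knows the value and can filter the returned tuples itself; the sketch only covers what is sent to the source.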

10
The Chain Algorithm (cont'd)
4. Replace X with X ⋈ π_S(R), where R is the relation obtained for the subgoal and S is the set of all variables among a_1, a_2, …, a_n.
5. Project out of X all components that correspond to variables that do not appear in the head or in any unresolved subgoal.
If every subgoal is resolved, then X is the answer. If at some point an unresolved subgoal remains but none can be resolved, then the algorithm fails.
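
Steps 4 and 5 amount to a natural join followed by a projection. A minimal sketch, assuming relations are represented as lists of dictionaries that map variable names to values (a representation chosen here for illustration only):

```python
def natural_join(X, R):
    """Join two relations given as lists of dicts (variable -> value)."""
    out = []
    for t in X:
        for r in R:
            # Tuples combine only if they agree on every shared variable.
            if all(t[k] == r[k] for k in t.keys() & r.keys()):
                out.append({**t, **r})
    return out

def project(X, keep):
    """Project onto the variables in `keep`, removing duplicate tuples."""
    seen, out = set(), []
    for t in X:
        row = tuple(sorted((k, v) for k, v in t.items() if k in keep))
        if row not in seen:
            seen.add(row)
            out.append(dict(row))
    return out

X = [{}]                                   # initial X: one empty tuple
R = [{"a": 2}, {"a": 3}, {"a": 4}]         # relation obtained for a subgoal
X = natural_join(X, R)
print(project(X, {"a"}))
```

Because the initial X holds one empty tuple, the first join simply yields the relation of the first resolved subgoal.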

11
The Chain Algorithm Example
Mediator query:
Q: Answer(c) ← R^bf(1,a) AND S^ff(a,b) AND T^ff(b,c)
Example:
Relation          R      S             T
Data (columns)    w x    x y           y z
Adornment         bf     c'[2,3,5]f    bu

12
The Chain Algorithm Example (cont'd)
Initially, the adornments on the subgoals are the same as in Q, and X contains just an empty tuple. S and T cannot be resolved because each has the adornment ff, but their sources require a b or c in the first position. R(1,a) can be resolved because its adornment is matched by the source's adornment. Send R(w,x) with w=1 to the source to get the matching tuples of R from the table on the previous slide.

13
The Chain Algorithm Example (cont'd)
Project the subgoal's relation onto its second component, since only the second component of R(1,a) is a variable. This is joined with X, resulting in X equaling this relation. Change the adornment on S from ff to bf.
X:
a
2
3
4

14
The Chain Algorithm Example (cont'd)
Now we resolve S^bf(a,b): Project X onto a, resulting in X itself. Now, search S for tuples whose first attribute equals a value of a in X. Join this relation, which has columns a and b, with X, and remove a because it appears in neither the head nor any unresolved subgoal. Change the adornment on T from ff to bf.
X:
b
4
5

15
The Chain Algorithm Example (cont'd)
Now we resolve T^bf(b,c): querying T with each value of b in X gives a relation with columns b and c. Join this relation with X and project onto the c attribute to get the relation for the head. The solution is {(6), (7), (8)}.
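
The whole example can be replayed with a compact, simplified sketch of the chain algorithm. Important assumptions: the tuples stored at the three sources are hypothetical, chosen only so that the intermediate results agree with the values shown on the slides (a in {2, 3, 4}, b in {4, 5}, answer {(6), (7), (8)}); the sources are simulated in memory; and all names (`SOURCES`, `ask_source`, `chain`) are invented for this illustration.

```python
# beta entries are (kind, allowed set or None, primed).
SOURCES = {
    "R": {"beta": [("b", None, False), ("f", None, False)],
          "data": [(1, 2), (1, 3), (1, 4)]},             # hypothetical tuples
    "S": {"beta": [("c", {2, 3, 5}, True), ("f", None, False)],
          "data": [(2, 4), (3, 5)]},                     # hypothetical tuples
    "T": {"beta": [("b", None, False), ("u", None, False)],
          "data": [(4, 6), (5, 7), (5, 8)]},             # hypothetical tuples
}

def is_var(a):
    return isinstance(a, str)              # variables are strings here

def matches(alpha, beta):
    for x, (kind, _, primed) in zip(alpha, beta):
        if kind in ("b", "c") and x != "b":
            return False
        if x == "f" and primed:
            return False
    return True

def ask_source(name, args, t):
    """Simulate the source query for tuple t. Source-side and mediator-side
    filtering are collapsed into one pass for brevity."""
    src = SOURCES[name]
    given = [t.get(a) if is_var(a) else a for a in args]  # None where free
    for v, (kind, allowed, _) in zip(given, src["beta"]):
        if kind == "c" and v not in allowed:
            return []                      # value outside S: no query sent
    return [{a: r for a, r in zip(args, row) if is_var(a)}
            for row in src["data"]
            if all(v is None or v == r for v, r in zip(given, row))]

def chain(head, subgoals):
    X, bound, pending = [{}], set(), list(subgoals)
    while pending:
        for goal in pending:               # pick any resolvable subgoal
            name, args = goal
            alpha = "".join("b" if not is_var(a) or a in bound else "f"
                            for a in args)
            if matches(alpha, SOURCES[name]["beta"]):
                break
        else:
            return None                    # nothing can be resolved: fail
        pending.remove(goal)
        keys = [a for a in args if is_var(a) and a in bound]
        seen, rel = set(), []
        for t in X:                        # steps 1-2: query per projected tuple
            key = tuple(t[k] for k in keys)
            if key not in seen:
                seen.add(key)
                rel.extend(ask_source(name, args, t))
        bound |= {a for a in args if is_var(a)}                  # step 3
        X = [{**t, **r} for t in X for r in rel                  # step 4
             if all(t[k] == r[k] for k in t.keys() & r.keys())]
        needed = set(head) | {a for _, g in pending for a in g if is_var(a)}
        X = [dict(p) for p in {tuple(sorted((k, v) for k, v in t.items()
                                            if k in needed)) for t in X}]  # step 5
    return sorted(tuple(t[v] for v in head) for t in X)

QUERY = [("R", (1, "a")), ("S", ("a", "b")), ("T", ("b", "c"))]
print(chain(("c",), QUERY))
```

With the assumed data, the value a = 4 produces no query to S because 4 is not in the set [2,3,5] required by the c adornment, which is why only b = 4 and b = 5 survive.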

16
Incorporating Union Views at the Mediator
The Chain Algorithm as described does not consider that several sources can contribute tuples to a relation. When some sources hold tuples that other sources do not, resolving a subgoal becomes more complex. To handle this, we can either consult all sources or make a best effort to return all the answers.

17
Incorporating Union Views at the Mediator (cont'd)
Consulting All Sources
- We can only resolve a subgoal when each source for its relation has an adornment matched by the current adornment of the subgoal.
- Less practical, because it makes queries harder to answer, and impossible if any source is down.
Best Efforts
- We need only 1 source with a matching adornment to resolve a subgoal.
- The chain algorithm must be modified to revisit each subgoal whenever more of its arguments become bound, since additional sources may then match.
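
The difference between the two policies comes down to "every source matches" versus "at least one source matches". A minimal sketch, assuming each source adornment is a list of (kind, primed) pairs and using hypothetical names `matches` and `can_resolve`:

```python
def matches(alpha, beta):
    """alpha: string over {b, f}; beta: list of (kind, primed) pairs."""
    return all(
        not (kind in ("b", "c") and x != "b") and not (x == "f" and primed)
        for x, (kind, primed) in zip(alpha, beta)
    )

def can_resolve(alpha, sources, policy):
    """sources: adornments of every source contributing to the relation.
    policy: "all" (consult all sources) or "best" (best efforts)."""
    hits = [matches(alpha, beta) for beta in sources]
    return all(hits) if policy == "all" else any(hits)

# Two hypothetical sources for the same relation: one needs a bound first
# argument, the other accepts free arguments.
sources = [[("b", False), ("f", False)], [("f", False), ("f", False)]]
print(can_resolve("ff", sources, "all"))
print(can_resolve("ff", sources, "best"))
```

Under best efforts, the subgoal in this example would be resolved using only the second source, and it would need to be revisited once its first argument becomes bound so that the first source can contribute as well.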

18
Questions
