# SECTION 21.5 Eilbroun Benjamin CS 257 – Dr. TY Lin INFORMATION INTEGRATION.


Presentation Outline
- 21.5 Optimizing Mediator Queries
- 21.5.1 Simplified Adornment Notation
- 21.5.2 Obtaining Answers for Subgoals
- 21.5.3 The Chain Algorithm
- 21.5.4 Incorporating Union Views at the Mediator

21.5 Optimizing Mediator Queries
- Chain algorithm: a greedy algorithm that finds a way to answer the query by sending a sequence of requests to its sources.
- It will always find a solution, assuming at least one solution exists.
- The solution it finds may not be optimal.

21.5.1 Simplified Adornment Notation
- A query at the mediator is limited to b (bound) and f (free) adornments.
- We use the following convention for describing adornments: name^adornments(attributes), for example R^bf(1,a), where:
  - name is the name of the relation
  - the number of adornments equals the number of attributes

21.5.2 Obtaining Answers for Subgoals
- Rules for subgoals and sources:
  - Suppose we have the subgoal R^(x1 x2 … xn)(a1, a2, …, an), and the source adornment for R is y1 y2 … yn.
  - The adornment on the subgoal matches the adornment at the source if, for every position i:
    - If yi is b or c[S], then xi = b.
    - If xi = f, then yi is not output restricted (that is, not primed).
  - Note that if yi is f, u, or o[S], then xi can be either b or f.
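The matching rule above can be sketched as a small predicate. This is a minimal illustration, not code from the presentation: the `Src` class, its `kind`/`primed` fields, and the `matches` function are hypothetical names, and the allowed-value set S of c[S] and o[S] is left out because it does not affect whether the adornments match.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Src:
    """One component of a source adornment (hypothetical representation)."""
    kind: str             # one of "b", "f", "u", "c", "o"
    primed: bool = False  # True if the attribute is output restricted


def matches(subgoal: str, source: Sequence[Src]) -> bool:
    """Return True if the subgoal adornment (a string of b/f) matches the source."""
    if len(subgoal) != len(source):
        return False
    for x, y in zip(subgoal, source):
        # A source position marked b or c[S] requires a bound subgoal position.
        if y.kind in ("b", "c") and x != "b":
            return False
        # A free subgoal position must be returned by the source.
        if x == "f" and y.primed:
            return False
    return True


# Source adornments taken from the example later in the deck.
R_SRC = [Src("b"), Src("f")]               # bf
S_SRC = [Src("c", primed=True), Src("f")]  # c'[2,3,5]f
T_SRC = [Src("b"), Src("u")]               # bu
```

With these definitions, `matches("bf", R_SRC)` holds, while `matches("ff", S_SRC)` does not, because the first position of S must be bound.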

21.5.3 The Chain Algorithm
- Maintains 2 types of information:
  - An adornment for each subgoal.
  - A relation X that is the join of the relations for all the subgoals that have been resolved.
- Initially, the adornment for a subgoal is b iff the mediator query provides a constant binding for the corresponding argument of that subgoal.
- Initially, X is a relation over no attributes, containing just an empty tuple.
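The initial state can be sketched in a few lines. The representation is an assumption made for illustration only: variables are Python strings, constants are any other value, and a relation is a list of dicts mapping variable names to values. The function names are hypothetical.

```python
def initial_adornment(args):
    """Adornment before any subgoal is resolved: b for constants, f for variables.

    Assumption: variables are strings, constants are non-string values.
    """
    return "".join("f" if isinstance(a, str) else "b" for a in args)


def initial_X():
    """Relation over no attributes that holds exactly one empty tuple."""
    return [{}]
```

Starting from one empty tuple rather than an empty relation matters: joining any relation with `[{}]` gives that relation back, whereas joining with an empty relation would always give an empty result.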

21.5.3 The Chain Algorithm (con’t)
- First, initialize the adornments of the subgoals and X.
- Then, repeatedly select a subgoal that can be resolved. Let R^α(a1, a2, …, an) be the subgoal:

1. Wherever α has a b, the corresponding argument of R is either a constant or a variable in the schema of X. Project X onto those of its variables that appear in R.

21.5.3 The Chain Algorithm (con’t)

2. For each tuple t in the projection of X, issue a query to the source as follows (β is a source adornment):
   - If a component of β is b, then the corresponding component of α is b, and we can use the corresponding component of t for the source query.
   - If a component of β is c[S], and the corresponding component of t is in S, then the corresponding component of α is b, and we can use the corresponding component of t for the source query.
   - If a component of β is f, and the corresponding component of α is b, provide the constant value for the source query.

21.5.3 The Chain Algorithm (con’t)
   - If a component of β is u, then provide no binding for this component in the source query.
   - If a component of β is o[S], and the corresponding component of α is f, then treat it as if it were f.
   - If a component of β is o[S], and the corresponding component of α is b, then treat it as if it were c[S].

3. Every variable among a1, a2, …, an is now bound. For each remaining unresolved subgoal, change its adornment so that any position holding one of these variables is b.
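The per-component rules of step 2 can be sketched as a function that decides which values to send to the source for one tuple t. Everything here is a hypothetical representation for illustration: β is a list of `(kind, allowed)` pairs, α is a string of b/f, and `values` holds the known value for each position (`None` where α is f). The function returns `None` when no legal query exists for this tuple.

```python
def source_bindings(beta, alpha, values):
    """Return the list of values to send (None = no binding), or None if no query is legal."""
    sent = []
    for (kind, allowed), a, v in zip(beta, alpha, values):
        if kind == "o":
            # o[S] is treated as c[S] when the subgoal position is bound, else as f.
            kind = "c" if a == "b" else "f"
        if kind == "b":
            if a != "b":
                return None          # the source requires a binding we do not have
            sent.append(v)
        elif kind == "c":
            if a != "b" or v not in allowed:
                return None          # value missing or outside the allowed set S
            sent.append(v)
        elif kind == "f":
            sent.append(v if a == "b" else None)
        elif kind == "u":
            sent.append(None)        # never bind an unspecified position
        else:
            raise ValueError(f"unknown adornment kind: {kind!r}")
    return sent
```

For the source S in the example, with adornment c'[2,3,5]f, `source_bindings([("c", {2, 3, 5}), ("f", None)], "bf", [4, None])` gives `None`, since 4 is not in the allowed set. Note that the strict reading of the o[S] rule is followed here: a bound value outside S blocks the query rather than being dropped.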

21.5.3 The Chain Algorithm (con’t)

4. Replace X with X ⋈ π_S(R), where S is the set of all variables among a1, a2, …, an.
5. Project out of X all components that correspond to variables that do not appear in the head or in any unresolved subgoal.

- If every subgoal is resolved, then X is the answer.
- If at some point no remaining subgoal can be resolved, then the algorithm fails.
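Steps 4 and 5 amount to a natural join followed by a projection. The sketch below assumes, for illustration only, that relations are lists of dicts keyed by variable name; `natural_join` and `project` are hypothetical helper names, and duplicates are removed by a simple linear scan.

```python
def natural_join(X, R):
    """Join two relations on the variables they share."""
    out = []
    for x in X:
        for r in R:
            shared = x.keys() & r.keys()
            if all(x[k] == r[k] for k in shared):
                row = {**x, **r}
                if row not in out:
                    out.append(row)
    return out


def project(X, keep):
    """Keep only the variables in `keep`, dropping duplicate tuples."""
    out = []
    for x in X:
        row = {k: v for k, v in x.items() if k in keep}
        if row not in out:
            out.append(row)
    return out


# Usage with the data of the deck's example: X after R, then joined with S.
X = natural_join([{}], [{"a": 2}, {"a": 3}, {"a": 4}])
S_rel = [{"a": 2, "b": 4}, {"a": 3, "b": 5}]
# Only b (used by T) and c (the head) are still needed, so a is projected out.
X_after_S = project(natural_join(X, S_rel), {"b", "c"})
```

Here `X_after_S` holds the two tuples with b = 4 and b = 5, since a = 4 finds no partner in `S_rel`.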

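21.5.3 The Chain Algorithm Example
- Mediator query:
  - Q: Answer(c) ← R^bf(1,a) AND S^ff(a,b) AND T^ff(b,c)
- Example:

| Relation | Source adornment | Attributes | Data |
|---|---|---|---|
| R | bf | (w, x) | (1,2), (1,3), (1,4) |
| S | c'[2,3,5]f | (x, y) | (2,4), (3,5) |
| T | bu | (y, z) | (4,6), (5,7), (5,8) |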

21.5.3 The Chain Algorithm Example (con’t)
- Initially, the adornments on the subgoals are the same as in Q, and X contains an empty tuple.
- S and T cannot be resolved because they each have ff adornments, but their sources require the first argument to be bound (c for S, b for T).
- R(1,a) can be resolved because its adornment is matched by the source’s adornment.
- Send R(w,x) with w = 1 to the source; it returns the R tuples shown in the table on the previous slide.

21.5.3 The Chain Algorithm Example (con’t)
- Project the subgoal’s relation onto its second component, since only the second component of R(1,a) is a variable.
- This is joined with X, resulting in X equaling this relation: X = {(2), (3), (4)} over attribute a.
- Change the adornment on S from ff to bf.

21.5.3 The Chain Algorithm Example (con’t)
- Now we resolve S^bf(a,b):
  - Project X onto a, resulting in X itself.
  - Now, search S for tuples whose attribute a equals a value of a in X. This gives the relation {(2,4), (3,5)} over (a, b); the value a = 4 contributes nothing.
  - Join this relation with X, and remove a because it appears neither in the head nor in any unresolved subgoal: X = {(4), (5)} over attribute b.

21.5.3 The Chain Algorithm Example (con’t)
- Now we resolve T^bf(b,c):
  - Querying T with each value of b in X gives the relation {(4,6), (5,7), (5,8)} over (b, c).
  - Join this relation with X and project onto the c attribute to get the relation for the head.
  - Solution is {(6), (7), (8)}.
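The whole example can be replayed with a short simulation. This is a simplified sketch under several assumptions, not the presenter's implementation: sources are in-memory tuple lists, source adornments are `(kind, allowed)` pairs with primes omitted, variables are strings, no variable repeats within a subgoal, and queries are issued per tuple of X rather than per tuple of its projection. `SOURCES`, `ask`, and `chain` are hypothetical names.

```python
SOURCES = {
    "R": ([("b", None), ("f", None)], [(1, 2), (1, 3), (1, 4)]),
    "S": ([("c", {2, 3, 5}), ("f", None)], [(2, 4), (3, 5)]),
    "T": ([("b", None), ("u", None)], [(4, 6), (5, 7), (5, 8)]),
}


def is_var(a):
    return isinstance(a, str)


def resolvable(args, bound, beta):
    """Source positions marked b or c[S] need a constant or an already bound variable."""
    return all(kind not in ("b", "c") or not is_var(a) or a in bound
               for a, (kind, _) in zip(args, beta))


def ask(name, args, t):
    """Query the simulated source for one tuple t; return rows as {variable: value}."""
    beta, data = SOURCES[name]
    want = [t.get(a) if is_var(a) else a for a in args]  # known value or None
    for w, (kind, allowed) in zip(want, beta):
        if kind == "c" and w not in allowed:
            return []                                    # no legal query for this t
    sent = [None if kind == "u" else w for w, (kind, _) in zip(want, beta)]
    rows = [r for r in data if all(s is None or s == x for s, x in zip(sent, r))]
    # The mediator filters locally on values it was not allowed to send (u positions).
    rows = [r for r in rows if all(w is None or w == x for w, x in zip(want, r))]
    return [{a: x for a, x in zip(args, r) if is_var(a)} for r in rows]


def chain(head, subgoals):
    X, bound, todo = [{}], set(), list(subgoals)
    while todo:
        pick = next((sg for sg in todo
                     if resolvable(sg[1], bound, SOURCES[sg[0]][0])), None)
        if pick is None:
            return None                                  # the algorithm fails
        name, args = pick
        todo.remove(pick)
        bound |= {a for a in args if is_var(a)}
        needed = set(head) | {a for _, rest in todo for a in rest if is_var(a)}
        new_X = []
        for t in X:
            for r in ask(name, args, t):
                row = {k: v for k, v in {**t, **r}.items() if k in needed}
                if row not in new_X:
                    new_X.append(row)
        X = new_X
    return X


Q = [("R", (1, "a")), ("S", ("a", "b")), ("T", ("b", "c"))]
answer = chain(("c",), Q)
```

Under these assumptions `answer` contains the three tuples with c = 6, 7 and 8, matching the solution on the slide, and a query consisting only of S(a,b) returns `None` because its first argument can never become bound.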

21.5.4 Incorporating Union Views at the Mediator
- This implementation of the Chain Algorithm does not consider that several sources can contribute tuples to a relation.
- If specific sources have tuples to contribute that other sources may not have, it adds complexity.
- To resolve this, we can consult all sources, or make best efforts to return all the answers.

21.5.4 Incorporating Union Views at the Mediator (con’t)
- Consulting All Sources
  - We can only resolve a subgoal when each source for its relation has an adornment matched by the current adornment of the subgoal.
  - Less practical because it makes queries harder to answer, and impossible if any source is down.
- Best Efforts
  - We need only 1 source with a matching adornment to resolve a subgoal.
  - We need to modify the chain algorithm to revisit each subgoal when that subgoal has new bound requirements.
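The difference between the two policies comes down to "all" versus "any" over the sources of a relation. The sketch below is a deliberately reduced illustration: source adornments are plain strings over b, f and u only (no c[S], o[S] or primes), and `simple_match` and `can_resolve` are hypothetical names.

```python
def simple_match(subgoal, source):
    """Reduced matching rule: every source position marked b needs a bound subgoal position."""
    return len(subgoal) == len(source) and all(
        y != "b" or x == "b" for x, y in zip(subgoal, source)
    )


def can_resolve(subgoal, source_adornments, policy):
    """policy is "all" (consult all sources) or "best" (best efforts)."""
    if not source_adornments:
        return False  # no source at all: nothing to ask
    hits = [simple_match(subgoal, s) for s in source_adornments]
    return all(hits) if policy == "all" else any(hits)
```

For a relation served by two sources with adornments `"ff"` and `"bf"`, a subgoal adorned `"ff"` is resolvable under best efforts but not when consulting all sources; once its first argument becomes bound, both policies accept it.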

Questions
