Prototype of a query manager for handling global queries


1 Prototype of a query manager for handling global queries
MOMIS Query Manager
D. Beneventano, S. Bergamaschi, F. Mandreoli
Università degli Studi di Modena e Reggio Emilia
D2I: Integration, Warehousing and Mining of heterogeneous sources
Theme 1: Integration of data from heterogeneous sources
Rome, 11 October 2002

2 Example
Local classes (relational):
  L1(firstn, lastn, year, e_mail)
  L2(name, e_mail, dept_code, s_code)

INTEGRATION

Global Class G (mapping of local attributes to global attributes):

  G     Name               E_mail   Section   Year   Dept
  L1    firstn and lastn   e_mail   null      year   null
  L2    name               e_mail   s_code    null   dept_code

Global Class Schema: S(G) = (Name, E_mail, Year, Dept, Section)
Local Class Schemata w.r.t. Global Class:
  S(L1) = (Name, E_mail, Year)
  S(L2) = (Name, E_mail, Dept, Section)

3 Data cleaning and reconciliation
Integration at the extensional level: the data returned by the various sources need to be converted and reconciled, that is, the data provided by the sources must be interpreted and merged.
  Schema translation (example: firstn and lastn to Name)
  Data conversion (example: 'Rita' + 'Verde' to 'Rita Verde')

L1 (source data):
  firstn   lastn   e_mail   year
  Rita     Verde            2
  Ada      Rossi            1

L2 (source data):
  name        e_mail   dept_c   S_code
  Rossi_Ada            Dept1    413245
  Po_Ugo                        2314

L1 (w.r.t. the Global Class):
  Name         E_mail   Year
  Rita Verde            2
  Ada Rossi             1

L2 (w.r.t. the Global Class):
  Name        E_mail   Dept    Section
  Ada Rossi            Dept1   413245
  Ugo Po                       2314
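The two conversions above can be sketched in Python. This is only an illustration under stated assumptions: the function names `l1_to_global` and `l2_to_global` are hypothetical, the L2 `name` format is assumed to be `lastn_firstn` (as in 'Rossi_Ada'), and the e-mail values are left as `None` because they do not appear in the tables.

```python
def l1_to_global(row):
    """Translate an L1 tuple to S(L1) = (Name, E_mail, Year)."""
    return {
        # Data conversion: 'Rita' + 'Verde' -> 'Rita Verde'
        "Name": f"{row['firstn']} {row['lastn']}",
        "E_mail": row["e_mail"],
        "Year": row["year"],
    }


def l2_to_global(row):
    """Translate an L2 tuple to S(L2) = (Name, E_mail, Dept, Section)."""
    # Assumption: L2 stores names as 'lastn_firstn', e.g. 'Rossi_Ada'.
    lastn, firstn = row["name"].split("_", 1)
    return {
        "Name": f"{firstn} {lastn}",
        "E_mail": row["e_mail"],
        "Dept": row["dept_code"],
        "Section": row["s_code"],
    }


# Sample tuples from the slide; e-mail values are unknown here, so None is used.
rita = l1_to_global({"firstn": "Rita", "lastn": "Verde", "e_mail": None, "year": 2})
ada = l2_to_global({"name": "Rossi_Ada", "e_mail": None,
                    "dept_code": "Dept1", "s_code": 413245})
```

After the conversion both sources expose the same `Name` format, which is what makes a join on `Name` possible later.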

4 Redundancy and Reconciliation
Hypothesis: instances of the same object in different local classes must have the same value for a common attribute.

[Figure: the extensions of L1 and L2 overlap; O1 belongs only to L1, O belongs to both, O2 belongs only to L2]

L1:
  Object   Name         E_mail   Year
  O1       Rita Verde            2
  O        Ada Rossi             1

L2:
  Object   Name        E_mail   Dept    Section
  O        Ada Rossi            Dept1   413245
  O2       Ugo Po                       2314

5 Object fusion
To identify instances of the same object and fuse them: JoinMap, the join criteria among classes.

[Figure: the extensions of L1 and L2 overlap; O1 belongs only to L1, O belongs to both, O2 belongs only to L2]

L1:
  Object   Name         E_mail   Year
  O1       Rita Verde            2
  O        Ada Rossi             1

L2:
  Object   Name        E_mail   Dept    Section
  O        Ada Rossi            Dept1   413245
  O2       Ugo Po                       2314

JoinMap JM(L1,L2): L1.Name = L2.Name

  Name
  Ada Rossi

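6 Object fusion: indirect map

[Figure: the extensions of L1 and L2 overlap; O1 belongs only to L1, O2 belongs to both, O3 belongs only to L2]

L1:
  Object   Id    Name         E_mail   Year
  O1       123   Rita Verde            2
  O2       243   Ada Rossi             1

L2:
  Object   E_mail   Dept    SN
  O2                Dept1   XY413245
  O3                        XZ2314

JoinMap JM(CS.S, UNI.RS):
  Matr   SN
  243    XY413245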
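A minimal Python sketch of fusion through an indirect map, using the sample rows above. It assumes that `Matr` in the JoinMap table refers to `Id` in L1 (the value 243 matches Ada Rossi) and that `SN` refers to `SN` in L2; the function name `fuse_via_map` is hypothetical. E-mail columns are omitted because their values are not shown.

```python
l1 = [
    {"Id": 123, "Name": "Rita Verde", "Year": 2},
    {"Id": 243, "Name": "Ada Rossi", "Year": 1},
]
l2 = [
    {"Dept": "Dept1", "SN": "XY413245"},
    {"Dept": None, "SN": "XZ2314"},
]
# JoinMap table: pairs of keys that denote the same object.
join_map = [{"Matr": 243, "SN": "XY413245"}]


def fuse_via_map(l1, l2, join_map):
    """Fuse L1 and L2 tuples that the JoinMap table declares to be the same object."""
    sn_by_id = {m["Matr"]: m["SN"] for m in join_map}  # assumption: Matr == L1.Id
    l2_by_sn = {row["SN"]: row for row in l2}
    fused = []
    for r1 in l1:
        r2 = l2_by_sn.get(sn_by_id.get(r1["Id"]))
        if r2 is not None:
            fused.append({**r1, **r2})  # one tuple per matched object
    return fused


matched = fuse_via_map(l1, l2, join_map)
```

Only the matched object is returned here; unmatched tuples (Rita Verde in L1, XZ2314 in L2) would be kept by an outer join.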

7 Global Class Instance G
GAV with "Single database property" (Lenzerini, Data Integration: A Theoretical Perspective, PODS 2002).
The computation is based on "FULL DISJUNCTION" (Rajaraman, Ullman, Integrating Information by Outerjoins and Full Disjunctions, PODS 1996): "Computing the natural outerjoin of many relations in a way that preserves all possible connections among facts".

L1:
  Name         E_mail   Year
  Rita Verde            2
  Ada Rossi             1

L2:
  Name        E_mail   Dept    Section
  Ada Rossi            Dept1   413245
  Ugo Po                       2314

G: select S(G) from L1 outer join L2 on JM(L1,L2)

G:
  Name         E_mail   Year   Dept    Section
  Ada Rossi             1      Dept1   413245
  Rita Verde            2
  Ugo Po                               2314
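For two local classes the computation of G amounts to a full outer join on the JoinMap. The following Python sketch illustrates this on the sample data; it is not the MOMIS implementation. The function name `full_outer_join` is hypothetical, the join key is assumed to be `Name` and unique within each class, and e-mail values are `None` because they are not shown in the tables.

```python
SCHEMA_G = ("Name", "E_mail", "Year", "Dept", "Section")  # S(G)

l1 = [
    {"Name": "Rita Verde", "E_mail": None, "Year": 2},
    {"Name": "Ada Rossi", "E_mail": None, "Year": 1},
]
l2 = [
    {"Name": "Ada Rossi", "E_mail": None, "Dept": "Dept1", "Section": 413245},
    {"Name": "Ugo Po", "E_mail": None, "Dept": None, "Section": 2314},
]


def full_outer_join(left, right, key="Name"):
    """Fuse tuples that agree on `key`; keep unmatched tuples of both sides.

    Attributes missing on one side stay None (null). Under the hypothesis that
    the same object has the same value for a common attribute, it does not
    matter which side supplies a shared attribute.
    """
    by_key = {}
    for row in list(left) + list(right):
        fused = by_key.setdefault(row[key], dict.fromkeys(SCHEMA_G))
        for attr, value in row.items():
            if value is not None:
                fused[attr] = value
    return list(by_key.values())


G = full_outer_join(l1, l2)
```

The result holds one tuple per object: Ada Rossi carries Year, Dept and Section, while Rita Verde and Ugo Po keep nulls for the attributes their source does not provide.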

8 FULL DISJUNCTION COMPUTATION
Question: when can a full disjunction be computed by some sequence of natural outerjoins?
Answer: there is a natural outerjoin sequence producing the full disjunction if and only if the set of relation schemes forms a connected, γ-acyclic hypergraph (Fagin).

A Global class with n local classes, n > 2: γ-cyclic hypergraph.

[Figure: L1, L2 and L3 connected pairwise by JM(L1,L2), JM(L1,L3) and JM(L2,L3); labelled "New Method"]

Example, n = 3:
G: select S(G)
   from (L1 outer join L2 on JM(L1,L2))
        outer join
        (L1 outer join L3 on JM(L1,L3))
        on JM(L2,L3)

9 Query rewriting method
Global query (in DNF): Q1
Local query for the class L: Q1_L
  where-condition of Q1_L: all factors of the DNF which can be solved in L
  residual factors of Q1: factors not included in all local where-conditions
  select-list of Q1_L: attributes of the select-list of Q1 + attributes of the residual factors + attributes of the JoinMap
Global query reformulation: full disjunction based on the JoinMap + residual factors

10 Query rewriting example
Global query
  Q1: select E_mail from G
      where (E_mail like '*.it' and Dept='Dept1')
         or (E_mail like '*.it' and Year=2)

Local queries
  Q1_L1: select Name, Year, E_mail from L1
         where (E_mail like '*.it' or Year=2)
  Q1_L2: select Name, Dept, E_mail from L2
         where (E_mail like '*.it' or Dept='Dept1')

Global query reformulation
  Q1: select E_mail
      from Q1_L1 outer join Q1_L2 on JM
      where (Dept='Dept1' or Year=2)        (residual factors)
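The bookkeeping part of the rewriting (which factors each local class can solve, which factors are residual, and which attributes each local select-list needs) can be sketched in Python as follows. This is an illustration under assumptions, not the prototype's code: the function name `plan_local_queries` and the tuple encoding of factors are hypothetical, and the sketch does not decide how the pushed-down factors are combined into each local where-condition.

```python
def plan_local_queries(select_list, dnf, schemas, join_attrs):
    """Classify the factors of a DNF query and build the local select-lists.

    dnf        : list of conjunctions, each a list of (attribute, op, value) factors
    schemas    : local class name -> set of global attributes it provides
    join_attrs : attributes used by the JoinMap
    """
    # Distinct factors of the DNF, in order of first appearance.
    factors = []
    for conjunction in dnf:
        for factor in conjunction:
            if factor not in factors:
                factors.append(factor)

    # Factors that a local class can solve: its schema contains the attribute.
    local = {name: [f for f in factors if f[0] in schema]
             for name, schema in schemas.items()}

    # Residual factors: not included in all local where-conditions.
    residual = [f for f in factors if any(f not in fs for fs in local.values())]

    # Select-list: select-list of Q1 + residual-factor attributes + JoinMap
    # attributes, restricted to what the local class actually provides.
    wanted = list(select_list) + [f[0] for f in residual] + list(join_attrs)
    selects = {name: sorted({a for a in wanted if a in schema})
               for name, schema in schemas.items()}
    return local, residual, selects


email_it = ("E_mail", "like", "*.it")
dept1 = ("Dept", "=", "Dept1")
year2 = ("Year", "=", 2)

local, residual, selects = plan_local_queries(
    select_list=["E_mail"],
    dnf=[[email_it, dept1], [email_it, year2]],
    schemas={"L1": {"Name", "E_mail", "Year"},
             "L2": {"Name", "E_mail", "Dept", "Section"}},
    join_attrs=["Name"],
)
```

On the example query this yields the factors on E_mail and Year for L1, the factors on E_mail and Dept for L2, the residual factors Dept='Dept1' and Year=2, and the same select-list attributes as Q1_L1 and Q1_L2 above.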

