Published by Tristen Crosse. Modified over 9 years ago.
1
Relational Division (Dzielenie relacyjne)
Databases and Data Warehouses (Bazy i hurtownie danych), TWO1, 2010
https://ophelia.cs.put.poznan.pl/webdav/dbdw/students/dbdw-winter_2010-11/
2
References
1. V.M. Matos, R. Grasser, Assessing performance of the relational division operator, Data Base Management 2001, 22-20-30, 1-15.
2. V.M. Matos, R. Grasser, A Simpler (and Better) SQL Approach to Relational Division, Journal of Information Systems Education 2002, 13 (2), 85-88.
21/11/2010, Bazy i hurtownie danych
3
Relational Division
The basic operators of the relational algebra:
– Union (UNION)
– Difference (MINUS/EXCEPT)
– Cartesian product
– Projection & selection (SELECT... FROM...)
Additional operators added to the relational algebra:
– Join (the most popular in practice)
– Rename (AS)
– Intersection (INTERSECT)
– Division
4
Relational Division
The division operator is less common than select-project-join queries. However, it applies to many common queries:
– Find suppliers who supply all the engine parts
– Find students who have taken all the core courses
– Find customers who have ordered all items from a given line of products
The division operator can also be employed in data mining algorithms (e.g., generation of association rules).
5
Informal Definition
The division operator verifies whether a candidate subject is related to each of the values held in a base set.
The base set is called the divisor (or denominator, T2[B]), and the table holding the subjects' data is called the dividend (or numerator, T1[A, B]).
The expression T1[A, B]/T2[B] selects those A values from the dividend table T1[A, B] whose associated B values form a superset of the B values held in the divisor table T2[B].
6
Informal Definition
T3[A] = T1[A, B] / T2[B]
7
More Practical Example #1
Given the relations ORDERS[SID, PID, QTY] and PRODUCTS[PID, PRICE], find all the stores that ordered at least 10 units of every product priced over $15.

ORDERS
SID  PID  QTY
s2   p2   25
s2   p3   20
s2   p4   20
s1   p1   25
s1   p4   4
s1   p3   12
s1   p2   30
s3   p4   12
s3   p2   14

PRODUCTS
PID  PRICE
p1   10
p2   25
p3   18
p4   20

With ORDERS restricted to QTY ≥ 10 and PRODUCTS restricted to PRICE > 15:
ORDERS[SID, PID] / PRODUCTS[PID] = { s2 }
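The result of Example #1 can be checked by hand with a few lines of Python. This is only a sketch of the informal superset definition, not one of the SQL formulations from the deck; the variable names (`orders`, `prices`, `divisor`, `ordered`) are illustrative.

```python
# Data copied from the ORDERS and PRODUCTS tables of Example #1.
orders = [
    ("s2", "p2", 25), ("s2", "p3", 20), ("s2", "p4", 20),
    ("s1", "p1", 25), ("s1", "p4", 4), ("s1", "p3", 12), ("s1", "p2", 30),
    ("s3", "p4", 12), ("s3", "p2", 14),
]
prices = {"p1": 10, "p2": 25, "p3": 18, "p4": 20}

# Divisor: products priced over 15.
divisor = {pid for pid, price in prices.items() if price > 15}

# Dividend: (SID, PID) pairs with an ordered quantity of at least 10,
# grouped into one set of PIDs per store.
ordered = {}
for sid, pid, qty in orders:
    if qty >= 10:
        ordered.setdefault(sid, set()).add(pid)

# A store qualifies when its PID set is a superset of the divisor.
result = {sid for sid, pids in ordered.items() if pids >= divisor}
print(result)  # {'s2'}
```

Store s1 drops out because its order for p4 has a quantity of only 4, and s3 never ordered p3.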
8
More Practical Example #2
Given the relations ORDERS[SID, PID, PMNT] and PRODUCTS[PID, PRICE], we focus on ordered products priced over $15 and paid for either with cash or a credit card. What is the result of ORDERS[SID, PID, PMNT] / PRODUCTS[PID]?

ORDERS
SID  PID  PMNT
s1   p1   cash
s1   p2   bank
s1   p2   credit
s1   p3   cash
s1   p4   bank
s1   p3   credit
s2   p2   cash
s2   p2   bank
s2   p3   credit

PRODUCTS
PID  PRICE
p1   10
p2   25
p3   18
p4   14

ORDERS[SID, PID, PMNT] / PRODUCTS[PID] = { (s1, credit) }
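When the dividend has more than one non-divisor column, the quotient is formed over all of them, here the pair (SID, PMNT). A minimal Python sketch of that reading follows; the names `groups` and `divisor` are illustrative and not part of the slides.

```python
# Data copied from the ORDERS and PRODUCTS tables of Example #2.
orders = [
    ("s1", "p1", "cash"), ("s1", "p2", "bank"), ("s1", "p2", "credit"),
    ("s1", "p3", "cash"), ("s1", "p4", "bank"), ("s1", "p3", "credit"),
    ("s2", "p2", "cash"), ("s2", "p2", "bank"), ("s2", "p3", "credit"),
]
prices = {"p1": 10, "p2": 25, "p3": 18, "p4": 14}

# Divisor: products priced over 15.
divisor = {pid for pid, price in prices.items() if price > 15}

# Group PIDs by the remaining columns (SID, PMNT), keeping only
# cash and credit payments as stated in the example.
groups = {}
for sid, pid, pmnt in orders:
    if pmnt in ("cash", "credit"):
        groups.setdefault((sid, pmnt), set()).add(pid)

# A (SID, PMNT) pair qualifies when it covers every divisor PID.
result = {key for key, pids in groups.items() if divisor <= pids}
print(result)  # {('s1', 'credit')}
```

Only (s1, credit) is paired with both p2 and p3; every other (SID, PMNT) combination misses at least one of them.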
9
Formal Definition: Relational Algebra
Assume that the numerator table T1 always consists of two columns, A and B, and that the denominator table T2 has the single column B. Then the expression T1[A, B]/T2[B] is semantically equivalent to:
T1[A, B]/T2[B] = T1[A] – ((T1[A] × T2[B]) – T1[A, B])[A]
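The algebraic expression can be traced step by step on Python sets. The sketch below assumes T1 is a set of (A, B) tuples and T2 a set of plain B values; the function name `divide` and the sample rows are made up for illustration.

```python
from itertools import product


def divide(t1, t2):
    """Evaluate T1[A] - ((T1[A] x T2[B]) - T1[A, B])[A] on Python sets."""
    t1_a = {a for a, _ in t1}              # T1[A]
    all_pairs = set(product(t1_a, t2))     # T1[A] x T2[B]
    missing = all_pairs - t1               # pairs required but absent from T1
    disqualified = {a for a, _ in missing} # (...)[A]
    return t1_a - disqualified


# Hypothetical sample data.
t1 = {("a1", "b1"), ("a1", "b2"), ("a2", "b1"),
      ("a3", "b1"), ("a3", "b2"), ("a3", "b3")}
t2 = {"b1", "b2"}

quotient = divide(t1, t2)
print(sorted(quotient))  # ['a1', 'a3']
```

With an empty `t2` the Cartesian product is empty, nothing is disqualified, and the function returns every A value in `t1`.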
10
Formal Definition: Relational Algebra
11
Formal Definition: Tuple-Calculus
Using the relational tuple-calculus language, the division operator can be rephrased as follows:
T1[A, B]/T2[B] = { t1[A] | t1 ∈ T1 and for-all t2 (t2 ∈ T2 ⇒ exists t3 (t3 ∈ T1 and (t1[A] = t3[A]) and (t2[B] = t3[B]))) }
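The quantifiers map directly onto Python's `all` and `any`. The following is a literal, unoptimized transcription of the tuple-calculus expression, assuming T1 is a set of (A, B) tuples and T2 a set of B values; `divide_calculus` and the sample rows are illustrative.

```python
def divide_calculus(t1, t2):
    """Keep t1[A] when, for all t2, some t3 in T1 matches on A and B."""
    return {
        a1
        for (a1, _) in t1                  # t1 in T1
        if all(                            # for-all t2 in T2
            any(a3 == a1 and b3 == b2      # exists t3 in T1 with matching A, B
                for (a3, b3) in t1)
            for b2 in t2
        )
    }


# Hypothetical sample data.
t1 = {("a1", "b1"), ("a1", "b2"), ("a2", "b1"),
      ("a3", "b1"), ("a3", "b2"), ("a3", "b3")}
t2 = {"b1", "b2"}

quotient = divide_calculus(t1, t2)
print(sorted(quotient))  # ['a1', 'a3']
```

Because `all` over an empty sequence is true, an empty divisor makes every A value qualify.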
12
Formal Definition: NFNF Databases (Non First-Normal Form)
Assuming an NFNF format (fields holding sets of atomic values) yields a much simpler tuple-calculus definition of the division operator (attribute *B in T1 and T2 is defined as a set of atomic values):
T1[A, B]/T2[B] = { t1[A] | t1 ∈ T1 and t2 ∈ T2 and t2[*B] ⊆ t1[*B] }
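In the NFNF view each tuple carries its whole set of B values, so the division reduces to a subset test. A small sketch, assuming `frozenset` stands in for the set-valued attribute *B and that T2 holds a single tuple; the data are hypothetical.

```python
# Nested T1: one tuple per A value, with *B stored as a set.
t1 = {
    ("a1", frozenset({"b1", "b2"})),
    ("a2", frozenset({"b1"})),
    ("a3", frozenset({"b1", "b2", "b3"})),
}
# Nested T2: a single tuple whose *B field is the divisor set.
t2 = {frozenset({"b1", "b2"})}

# { t1[A] | t1 in T1 and t2 in T2 and t2[*B] is a subset of t1[*B] }
quotient = {a for (a, b_set) in t1 for divisor in t2 if divisor <= b_set}
print(sorted(quotient))  # ['a1', 'a3']
```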
13
Formal Definition: NFNF Databases
14
SQL Implementation: Q0
    SELECT A
    FROM T1
    WHERE B IN (SELECT B FROM T2)
    GROUP BY A
    HAVING COUNT(*) = (SELECT COUNT(*) FROM T2)
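Q0 can be tried on a small in-memory database. The sketch below uses Python's `sqlite3` module purely for convenience (the slides do not prescribe SQLite), and the sample rows are made up. Note that the counting approach assumes T1 and T2 contain no duplicate rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (A TEXT, B TEXT);
    CREATE TABLE T2 (B TEXT);
    INSERT INTO T1 VALUES ('a1','b1'), ('a1','b2'), ('a2','b1'),
                          ('a3','b1'), ('a3','b2'), ('a3','b3');
    INSERT INTO T2 VALUES ('b1'), ('b2');
""")

# Q0: count, per A, the rows whose B appears in the divisor and
# compare that count with the size of the divisor.
Q0 = """
    SELECT A
    FROM T1
    WHERE B IN (SELECT B FROM T2)
    GROUP BY A
    HAVING COUNT(*) = (SELECT COUNT(*) FROM T2)
"""
q0_result = sorted(row[0] for row in conn.execute(Q0))
print(q0_result)  # ['a1', 'a3']
```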
15
SQL Implementation: Q1 (Byzantine Method)
Based on the formal predicate calculus definition, modified to fit SQL:
– The universal quantifier for-all x (f(x)) is replaced by not exists x (not f(x))
– The implication X ⇒ Y is replaced by (not(X) or Y)
T1[A, B]/T2[B] = { t1[A] | t1 ∈ T1 and not exists t2 (not(not(t2 ∈ T2) or (exists t3 (t3 ∈ T1 and (t1[A] = t3[A]) and (t2[B] = t3[B]))))) }
16
SQL Implementation: Q1
By De Morgan's law, not (P or Q) = not P and not Q, the previous definition is equivalent to:
T1[A, B]/T2[B] = { t1[A] | t1 ∈ T1 and not exists t2 ((t2 ∈ T2) and (not exists t3 (t3 ∈ T1 and (t1[A] = t3[A]) and (t2[B] = t3[B])))) }
17
SQL Implementation: Q1
    SELECT DISTINCT x.A
    FROM T1 AS x
    WHERE NOT EXISTS
        (SELECT *
         FROM T2 AS y
         WHERE NOT EXISTS
             (SELECT *
              FROM T1 AS z
              WHERE (z.A = x.A) AND (z.B = y.B)))
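The double NOT EXISTS query reads as "there is no divisor value that this A is missing". A sketch of running it with Python's `sqlite3` module follows; SQLite and the sample rows are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (A TEXT, B TEXT);
    CREATE TABLE T2 (B TEXT);
    INSERT INTO T1 VALUES ('a1','b1'), ('a1','b2'), ('a2','b1'),
                          ('a3','b1'), ('a3','b2'), ('a3','b3');
    INSERT INTO T2 VALUES ('b1'), ('b2');
""")

# Q1: keep x.A when no divisor row y exists for which
# T1 lacks the pair (x.A, y.B).
Q1 = """
    SELECT DISTINCT x.A
    FROM T1 AS x
    WHERE NOT EXISTS
        (SELECT * FROM T2 AS y
         WHERE NOT EXISTS
             (SELECT * FROM T1 AS z
              WHERE z.A = x.A AND z.B = y.B))
"""
q1_result = sorted(row[0] for row in conn.execute(Q1))
print(q1_result)  # ['a1', 'a3']
```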
18
SQL Implementation: Q2
Based on the algebraic definition of the division operator, T1[A, B]/T2[B] = T1[A] – ((T1[A] × T2[B]) – T1[A, B])[A], and broken into two steps.
Step 1: store in T3 the (A, B) pairs that are missing from T1:
    SELECT DISTINCT y.A, z.B
    INTO T3
    FROM T1 AS y, T2 AS z
    WHERE NOT EXISTS
        (SELECT *
         FROM T1
         WHERE (T1.A = y.A) AND (T1.B = z.B))
Step 2: return the A values that have no missing pair in T3:
    SELECT DISTINCT A
    FROM T1
    WHERE NOT EXISTS
        (SELECT *
         FROM T3
         WHERE (T3.A = T1.A))
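The two steps of Q2 can be reproduced in SQLite from Python. One substitution is needed: SQLite has no `SELECT ... INTO`, so the sketch below uses `CREATE TABLE T3 AS SELECT ...` instead, which is an adaptation and not the statement shown on the slide. The sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (A TEXT, B TEXT);
    CREATE TABLE T2 (B TEXT);
    INSERT INTO T1 VALUES ('a1','b1'), ('a1','b2'), ('a2','b1'),
                          ('a3','b1'), ('a3','b2'), ('a3','b3');
    INSERT INTO T2 VALUES ('b1'), ('b2');
""")

# Step 1: materialize (T1[A] x T2[B]) - T1[A, B], the missing pairs.
conn.execute("""
    CREATE TABLE T3 AS
    SELECT DISTINCT y.A, z.B
    FROM T1 AS y, T2 AS z
    WHERE NOT EXISTS
        (SELECT * FROM T1 WHERE T1.A = y.A AND T1.B = z.B)
""")
missing = sorted(conn.execute("SELECT A, B FROM T3"))

# Step 2: keep the A values that have no missing pair.
q2_result = sorted(row[0] for row in conn.execute("""
    SELECT DISTINCT A
    FROM T1
    WHERE NOT EXISTS (SELECT * FROM T3 WHERE T3.A = T1.A)
"""))
print(missing)    # [('a2', 'b2')]
print(q2_result)  # ['a1', 'a3']
```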
19
SQL Implementation: Q3
Based on the definition for NFNF databases and the tuple calculus:
    SELECT DISTINCT x.A
    FROM T1 AS x
    WHERE (SELECT COUNT(*) FROM T2) =
          (SELECT COUNT(*)
           FROM T1, T2
           WHERE (T1.A = x.A) AND (T1.B = T2.B))
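Q3 compares the size of the divisor with the number of divisor values matched by each A. A sketch using Python's `sqlite3` module (an assumption, as are the sample rows); like Q0, the count comparison relies on T1 and T2 being free of duplicate rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (A TEXT, B TEXT);
    CREATE TABLE T2 (B TEXT);
    INSERT INTO T1 VALUES ('a1','b1'), ('a1','b2'), ('a2','b1'),
                          ('a3','b1'), ('a3','b2'), ('a3','b3');
    INSERT INTO T2 VALUES ('b1'), ('b2');
""")

# Q3: an A value qualifies when the join of its rows with T2
# has as many rows as T2 itself.
Q3 = """
    SELECT DISTINCT x.A
    FROM T1 AS x
    WHERE (SELECT COUNT(*) FROM T2) =
          (SELECT COUNT(*)
           FROM T1, T2
           WHERE T1.A = x.A AND T1.B = T2.B)
"""
q3_result = sorted(row[0] for row in conn.execute(Q3))
print(q3_result)  # ['a1', 'a3']
```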
20
Zero Division
The division operator is defined in such a way that T1[A, B]/T2[B] returns every A value in T1 whenever T2[B] is empty.
An empty set would be a more appropriate answer, and this is how Q0 behaves.
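The difference is easy to observe by running Q0 and Q1 against an empty divisor. The sketch below uses Python's `sqlite3` module and made-up rows, both of which are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (A TEXT, B TEXT);
    CREATE TABLE T2 (B TEXT);   -- deliberately left empty
    INSERT INTO T1 VALUES ('a1','b1'), ('a1','b2'), ('a2','b1'),
                          ('a3','b1'), ('a3','b2'), ('a3','b3');
""")

Q0 = """
    SELECT A FROM T1
    WHERE B IN (SELECT B FROM T2)
    GROUP BY A
    HAVING COUNT(*) = (SELECT COUNT(*) FROM T2)
"""
Q1 = """
    SELECT DISTINCT x.A FROM T1 AS x
    WHERE NOT EXISTS
        (SELECT * FROM T2 AS y
         WHERE NOT EXISTS
             (SELECT * FROM T1 AS z
              WHERE z.A = x.A AND z.B = y.B))
"""
q0_empty = sorted(row[0] for row in conn.execute(Q0))
q1_empty = sorted(row[0] for row in conn.execute(Q1))
print(q0_empty)  # []
print(q1_empty)  # ['a1', 'a2', 'a3']
```

Q0 filters out every row before grouping, so no group survives, while Q1 finds no divisor row to violate and therefore keeps every A value.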
21
Experimental Evaluation of Q0…Q3
Assume the basic table structure (T1[A, B], T2[B], integer or char).
Conduct an experiment with the following settings:
– number of A-values in T1 = 10 000,
– number of B-values in T1 = 100,
– number of B-values in T2 = 0, 20, 40, 60, 80, 100.
Use the provided scripts to generate sample tables and to run specific queries (Q0…Q3).
Observe the performance of each query (execution time – CPU time).
Collect the observations in tabular and graphical form.
https://ophelia.cs.put.poznan.pl/webdav/dbdw/students/dbdw-winter_2010-11/
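The provided course scripts are not reproduced here. As a stand-in, the sketch below shows one way such an experiment could be organized with Python and SQLite. Everything in it is an assumption: the helper names `build_tables` and `cpu_time`, the `fill` ratio that decides how many (A, B) pairs exist, the scaled-down table sizes, and the use of SQLite instead of the DBMS used in the course. Only Q0 and Q1 are timed to keep it short.

```python
import random
import sqlite3
import time

QUERIES = {
    "Q0": """SELECT A FROM T1 WHERE B IN (SELECT B FROM T2)
             GROUP BY A HAVING COUNT(*) = (SELECT COUNT(*) FROM T2)""",
    "Q1": """SELECT DISTINCT x.A FROM T1 AS x WHERE NOT EXISTS
             (SELECT * FROM T2 AS y WHERE NOT EXISTS
              (SELECT * FROM T1 AS z WHERE z.A = x.A AND z.B = y.B))""",
}


def build_tables(n_a, n_b, n_divisor, fill=0.9, seed=0):
    """Create T1 with roughly fill * n_a * n_b rows and T2 with n_divisor rows."""
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE T1 (A INTEGER, B INTEGER)")
    conn.execute("CREATE TABLE T2 (B INTEGER)")
    rows = [(a, b) for a in range(n_a) for b in range(n_b)
            if rng.random() < fill]
    conn.executemany("INSERT INTO T1 VALUES (?, ?)", rows)
    conn.executemany("INSERT INTO T2 VALUES (?)",
                     [(b,) for b in range(n_divisor)])
    return conn


def cpu_time(conn, sql):
    """Return (CPU seconds, sorted A values) for one query execution."""
    start = time.process_time()
    rows = conn.execute(sql).fetchall()
    return time.process_time() - start, sorted(r[0] for r in rows)


# Scaled-down settings (hypothetical); the slide uses 10 000 A-values,
# 100 B-values and divisor sizes 0, 20, ..., 100.
measurements = []
for n_divisor in (0, 2, 4, 6, 8, 10):
    conn = build_tables(n_a=50, n_b=10, n_divisor=n_divisor)
    for name, sql in QUERIES.items():
        seconds, answer = cpu_time(conn, sql)
        measurements.append((n_divisor, name, seconds, answer))
        print(f"|T2|={n_divisor:3d}  {name}: {seconds:.4f} s, {len(answer)} rows")
    conn.close()
```

The `measurements` list can then be written out as a table or plotted, with divisor size on one axis and CPU time on the other.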
22
Checking Execution Time
Turn SET STATISTICS TIME on (Tools → Options)