Presentation is loading. Please wait.

Presentation is loading. Please wait.

Introduction to Databases: Relational and XML Models and Languages Instructors: Bertram Ludaescher Kai Lin Instructors: Bertram Ludaescher Kai Lin.

Similar presentations


Presentation on theme: "Introduction to Databases: Relational and XML Models and Languages Instructors: Bertram Ludaescher Kai Lin Instructors: Bertram Ludaescher Kai Lin."— Presentation transcript:

1 Introduction to Databases: Relational and XML Models and Languages Instructors: Bertram Ludaescher Kai Lin Instructors: Bertram Ludaescher Kai Lin

2 Introduction to Databases, B. Ludaescher & K. Lin 2 Overview 09:15-10:20Relational Databases (1h05’) 10:20-10:30BREAK (10’) 10:30-11:50Relational Databases (1h20’) 11:50-13:15LUNCH (1h25’) 13:15-13:45 Demo & Hands-on (30’) 13:45-15:10 XML: Basics (1h25’) 15:10-15:30BREAK (20’) 15:30-16:30 XML: Querying (1h) 16:30-17:00 Demo & Hands-on (30’)

3 Introduction to Databases, B. Ludaescher & K. Lin 3 Scope Today: Introduction to Databases, in particular –Relational database model –Relational Operations and Queries –Constraints –XML “data” model –Querying and transforming XML –Some demos & simple hands-on exercise Tomorrow: –Introduction to Knowledge Representation and Ontologies But first: … déjà vu …

4 Introduction to Databases, B. Ludaescher & K. Lin 4

5 5 What is a Database System? Database (system) = –Database Instance (set of tables of rows) –Database Management System (DBMS) Origins in the commercial world: –to organize, query, and manipulate data more effectively, efficiently, and independently Scientific databases –often special features: spatial, temporal, spatiotemporal, GIS, units, uncertainty, raw & derived data, …

6 Introduction to Databases, B. Ludaescher & K. Lin 6 Why not just use files as “databases”? For some applications: yeah… why not? But in general: –scanning & ’grep’ing large files can be very inefficient –no language support for selecting desired data, joining them, etc. cannot express the kinds of questions/queries you’d like to ask ‘grep’ is no substitute for a query language –redundant and/or inconsistent storage of data –no transaction management and concurrency control among multiple users –no security –no recovery –no data independence (application  data) –no data modeling support –…–…

7 Introduction to Databases, B. Ludaescher & K. Lin 7 Features of a Database System A data model (relational, object-oriented, XML) prescribes how data can be organized: –as relations (tables) of tuples (rows) –as classes of (linked) objects –as XML trees A (database) schema (stored in the “data dictionary”) defines the structure of a specific database instance: –Relational schema –OO schema –XML Schema (or XML DTD)

8 Introduction to Databases, B. Ludaescher & K. Lin 8 Features of a Database System Data is treated uniformly and separately from the application Efficient data access Queries and views are expressed over the schema Integrity constraints (checking and enforcement) Transactions combine sets of operations into logical units (all-or-nothing) Synchronization of concurrent user transactions Recovery (after system crash) –not to be confused w/ backup –instead: guarantee consistency by “roll-back” of partially executed transactions (how? Hint: logging) …

9 Introduction to Databases, B. Ludaescher & K. Lin 9 DB features, e.g., Concurrency Control Concurrent execution of simultaneous requests –long before web servers where around... –transaction management guarantees consistency despite concurrent/interleaved execution Transaction (= sequence of read/write operations) –Atomicity: a transaction is executed completely or not at all –Consistency: a transaction creates a new consistent DB state, i.e., in which all integrity constraints are maintained –Isolation: to the user, a transaction seems to run in isolation –Durability: the effect of a successful (“committed”) transaction remains even after system failure

10 Introduction to Databases, B. Ludaescher & K. Lin 10 Levels of Abstraction: Architecture Overview DB instances Physical level Logical (“conceptual”) level View 1 View 2View n Conceptual … Level Index structures Tables Export schemas ER-Model (Entity-Relationship) OO Models (Classes…) part of DB design  conceptual design … often lost in the process… logical data independence physical data independence User

11 Introduction to Databases, B. Ludaescher & K. Lin 11 Database Design: Entity-Relationship (ER) Model Entities: Relationships: Attributes: ER Model: –initial, high-level DB design (conceptual model) –easy to map to a relational schema (database tables) –comes with more constraints (cardinalities, aggregation) and extensions: EER (is-a => class hierarchies) –related: UML (Unified Modeling Language) class diagrams EmployeeDepartment works-for NameSalaryManagerName since

12 Introduction to Databases, B. Ludaescher & K. Lin 12 The Relational Model Relation/Table Name: –employee, dept Attributes = Column Names: –Emp, Salary, DeptNo, Name, Mgr Relational Schema: –employee(Emp:string, Salary:integer, DeptNo:integer),... Tuple = Row of the table: –(“tom”, “60000”, “1”) Relation = Set of tuples: –{(...), (...),...} EmpSalaryDNo tom60k1 tim57k1 sally45k3 carol30k1 carol 35k 2 …. Employee DNo NameMgr 1Toyscarol 2Comp.carol 3Shoessam Department FK: foreign key, pointing to another key

13 Introduction to Databases, B. Ludaescher & K. Lin 13 Ex: Creating a Relational Database in SQL CREATE TABLE employee ( ssnCHAR(11), nameVARCHAR(30), deptNoINTEGER, PRIMARY KEY (ssn), FOREIGN KEY (deptNo) REFERENCES department ) CREATE TABLE department ( deptNoINTEGER, nameVARCHAR(20), managerCHAR(11), PRIMARY KEY (deptNo), FOREIGN KEY (manager) REFERENCES employee(ssn) )

14 Introduction to Databases, B. Ludaescher & K. Lin 14 What is a Query? Intuitively: –An “executable question” in terms of a database schema –Evaluating a query Q against a database instance D yields a set of answer objects: Relational tuples or XML elements Example: –Who are the employees in the ‘Toys’ dept.? –Who is (are) the manager(s) of ‘Tom’? –Show all pairs (Employee, Mgr) Technically: –A mapping from an input schema (the given table schemas) to a result schema (the new columns you are interested in) defined in some query language

15 Introduction to Databases, B. Ludaescher & K. Lin 15 Why (Declarative) Query Languages? Things we talk and think about in PLs and QLs … –Assembly languages: registers, memory locations, jumps,... –C and the likes: if-then-else, for, while, memory (de-)allocation, pointers,... –Object-oriented languages: C++: C plus objects, methods, classes,... Java: objects, methods, classes, references,... Smalltalk: objects, objects, objects,... OQL: object-query language,,Die Grenzen meiner Sprache bedeuten die Grenzen meiner Welt.” “The limits of my language mean the limits of my world.” Ludwig Wittgenstein, Tractatus Logico-Philosophicus “If you have a hammer, everything looks like a nail.”

16 Introduction to Databases, B. Ludaescher & K. Lin 16 Why (Declarative) Query Languages? Things we talk and think about in PLs and QLs … –Functional languages (Haskell, ML): (higher-order) functions, fold(l|r), recursion, patterns,... => Relational languages (SQL, Datalog) relations (tables), tuples (rows); conceptual level: ER relational operations: , , , ,..., , , , , ,..., , , |X| => Semistructured/XML (Tree) & Graph Query Languages trees, graphs, nodes, edges, children nodes, siblings, … XPath, XQuery, … Also: –Focus on what, and not how!

17 Introduction to Databases, B. Ludaescher & K. Lin 17 Example: Querying a Relational Database EmpSalaryDeptNo anne62k2 john60k1 Employee DeptNoMgr 1anne 2anne Department SELECT e.Emp, d.Mgr FROM Employee e, Department d WHERE e.DeptNo = d.DeptNo EmpMgr johnanneanne result join input tables SQL query (or view def.) answer (or view) we don’t say how to evaluate this expression

18 Introduction to Databases, B. Ludaescher & K. Lin 18 Example Query: SQL vs DATALOG “List all employees and their managers” In SQL: SELECT e.name, d.manager FROM Employee e, Department d WHERE e.deptNo = d.deptNo In DATALOG: q(E, M) :- employee(E, S, D), department(D, N, M). a “join” operation

19 Introduction to Databases, B. Ludaescher & K. Lin 19 Important Relational Operations select(R, Condition) –filter rows of a table wrt. a condition project(R, Attr) –remove unwanted columns; keep rest join(R1, A2, R2, A2, Condition) –find “matches” in a “related” table –e.g. match R1.foreign key = R2.primary key cartesian product(R1, R2) union (“OR”), intersection (“AND”) set-difference (“NOT IN”)

20 Introduction to Databases, B. Ludaescher & K. Lin 20 Relational Operations (in DATALOG) condition Y1=Y2  Y independent same multiple rules  union results (query) output :– (query) input

21 Introduction to Databases, B. Ludaescher & K. Lin 21 Demo Relational Queries in DATALOG

22 Introduction to Databases, B. Ludaescher & K. Lin 22 Queries, Views, Integrity Constraints … can all be seen as “special queries” Query q(…) :- … ad-hoc queries View v(…) :- … exported views; Integrity Constraints –ic (…) :- …. MgrSal < EmpSal … –say what shouldn’t happen –if it does: alert the user (or refuse an update, …)

23 Introduction to Databases, B. Ludaescher & K. Lin 23 Query Evaluation vs Reasoning Query evaluation –Given a database instance D and a query Q, run Q(D) –What databases do all the time Reasoning (aka “Semantic Query Optimization”) –Given a query Q and a constraint C, “optimize” Q&C (e.g., given C, Q might be unsatisfiable) –Given Q1 and Q2 decide whether Q1  Q2 –Given Q1,Q2, C decide whether Q1  Q2 | C –Note: we are NOT given a database instance D here; just the schema and the query/IC expressions

24 Introduction to Databases, B. Ludaescher & K. Lin 24 Summary QLs for Relational Databases Natural Hoin: same attribute name  add condition that values must match

25 Introduction to Databases, B. Ludaescher & K. Lin 25 Relational Algebra

26 Introduction to Databases, B. Ludaescher & K. Lin 26 Relational Algebra

27 Introduction to Databases, B. Ludaescher & K. Lin 27 Relational Algebra

28 Introduction to Databases, B. Ludaescher & K. Lin 28 Relational Algebra

29 Introduction to Databases, B. Ludaescher & K. Lin 29 Relational Algebra

30 Introduction to Databases, B. Ludaescher & K. Lin 30 Hands-on Part DBDesigner 4

31 Introduction to Databases, B. Ludaescher & K. Lin 31 DBDesigner 4 An open source (GPL) database design tool 1.Goto http://www.fabforce.net/dbdesigner4/http://www.fabforce.net/dbdesigner4/ 2.Download and install (5 min) 3.Open the example schema Order (File -> Open -> Order), and examine the relations between the tables forumtopic and forumpost. Find the foreign key used in the relation postHasTopic (5 min) 4.Connect to a sample MySQL database host: geon07.sdsc.edu port: 3306 database: summer_institute username: root password: [blank] (5 min) 5.Select all records in the table forumtopic (2 min) 6.Select all records in the table forumpost, and sort the result according to their idforumpost (3min) 7.Find all forum posts with the topic Cars (5 min) 8.Insert a record into the table forumpost (5 min)

32 Introduction to Databases, B. Ludaescher & K. Lin 32 Additional Material (not presented)

33 Introduction to Databases, B. Ludaescher & K. Lin 33 Non-Relational Data Models Relational model is “flat”: atomic data values –nesting is modeled via “pointers” (foreign keys) and “Skolem-ids” –extension: nested relational model (“tables within tables”, cf. nested HTML tables) –values can be nested lists {...}, tuples (...), sets [...] –ISO standard(s): SQL –identity is value based Object-oriented data model: –complex (structured) objects with object-identity (oid) –class and type hierarchies (sub-/superclass, sub-/supertype) –OODB schema may be very close to “world model” (no translation into tables) (+) queries fit your OO schema (-) (new) queries that don’t fit nicely –ODMG standard, OQL (Object Query Language)

34 Introduction to Databases, B. Ludaescher & K. Lin 34 Example: Object Query Language (OQL) Q: what does this OQL query compute? Note the use of path expressions like e.manager.children => Semistructured/Graph Databases SELECT DISTINCT STRUCT( E: e.name, C: e.manager.name, M: ( SELECT c.name FROM c IN e.children WHERE FOR ALL d IN e.manager.children: c.age > d.age ) ) FROM e IN Employees;

35 Introduction to Databases, B. Ludaescher & K. Lin 35 A Graph Database

36 Introduction to Databases, B. Ludaescher & K. Lin 36 Querying Graphs with OO-Path Expressions ?- dblp."Inf. Systems".L.P, substr("Volume",L), P : person.spouse[lives_in = P.lives_in]. ?- dblp."Inf. Systems".L."Michael E. Senko". Answer: L="Volume 1, 1975”; L="Volume 5, 1980".

37 Introduction to Databases, B. Ludaescher & K. Lin 37 Constructs for Querying Graphs Example: ?- dblp. any*. (if(  vldb )| if(  sigmod ))

38 Introduction to Databases, B. Ludaescher & K. Lin 38 Keys, Keys, and more Keys A key is a minimal set of attributes that: –uniquely identify a tuple –determine every other attribute value in a tuple There may be many keys for a relation; we designate one as the primary key The phrase candidate key is used in place of “key” where “the key” denotes the primary key A superkey is a superset of a key (i.e., not necessarily minimal)

39 Introduction to Databases, B. Ludaescher & K. Lin 39 Example of “good” design Employee(EName, SSN, BDate, Address, DNumber) Example of “bad” design (why is it bad?) Emp(EName, SSN, BDate, Address, DNum, DName, DMgrSSN) Normalization of Relations Department(DName, DNumber, DMgrSSN)

40 Introduction to Databases, B. Ludaescher & K. Lin 40 The description of the department (DName, DMgrSSN) is repeated for every employee that works in that department. Redundancy! The department is described redundantly. This leads to update anomalies! (… and wastes space) Digression (for experts only): redundancy can be used to increase performance; e.g. “materialized views”) What’s Wrong? Emp(EName, SSN, BDate, Address, DNum, DName, DMgrSSN)

41 Introduction to Databases, B. Ludaescher & K. Lin 41 Update Anomalies (caused by redundancy) Insertion Anomalies If you insert an employee Need to know which department he/she works Need to know the description information for that department If you want to insert a department, you can’t … until there is at least one employee Deletion Anomalies: if you delete an employee, is that dept. gone? was this the last employee in that dept? * Modification Anomalies: if you change DName, for example, it needs to be changed everywhere!

42 Introduction to Databases, B. Ludaescher & K. Lin 42 Null values also cause problems Null values might help in special cases, but are not a general solution to update anomalies For example, they may: –Waste space –Make it hard to specify and understand joins –Make it hard to aggregate (count, sum, etc.) –Have different meanings: Attribute does not apply to this tuple Attribute value is unkown Value is known but absent (not yet recorded) –Causes problems with queries (can’t interpret query answers)

43 Introduction to Databases, B. Ludaescher & K. Lin 43 Why worry about normalization? To … … reduce redundancy & update anomalies … reduce the need for null values Last not least: it’s a solved problem: all algorithms & proofs have been worked out Here: Normalization based on FDs (there’s more…)

44 Introduction to Databases, B. Ludaescher & K. Lin 44 Functional Dependencies Statement that if two tuples agree on attributes A they must agree on attributes B A  B If the value of the first attribute(s), A, is known, then the value of the second attribute(s), B, is known (read “A determines B”) We want to know if it is always true in the application

45 Introduction to Databases, B. Ludaescher & K. Lin 45 Functional Dependencies Examples of functional dependencies: social-security-number  employee-name course-number  course-title Examples that are NOT functional dependencies course-number -  book course-number -  car-color

46 Introduction to Databases, B. Ludaescher & K. Lin 46 Remember what it means to be a function: x f(x) x g(x) x h(x) 1 2 1 2 1 10 1 3 2 2 2 20 2 5 3 5 3 30 3 5 we are looking for functional relationships (that must occur in a relation) among attribute values What is a functional dependency? f is not a function for x=1, f(x) is not unique

47 Introduction to Databases, B. Ludaescher & K. Lin 47 EMP(ENAME, SSN, BDATE, ADDRESS, DNUM, DNAME, DMGRSSN) EMP_PROJ(SSN, PNUM, HOURS, ENAME, PNAME, PLOCATION) What are the FDs?

48 Introduction to Databases, B. Ludaescher & K. Lin 48 EMP(ENAME, SSN, BDATE, ADDRESS, DNUM, DNAME, DMGRSSN) EMP_PROJ(SSN, PNUM, HOURS, ENAME, PNAME, PLOCATION) 1 2 3 4 5 6 7 8 9 10 What are the FDs?

49 Introduction to Databases, B. Ludaescher & K. Lin 49 EMPLOYEE (SSN, NAME, SALARY, JOB_DESC) Why do we care? If all FDs are “implied by the key” it means that the DBMS only enforces keys (not FDs) and the DBMS is going to enforce the keys anyway otherwise, we have to perform expensive operations to maintain consistency (code or check statements)

50 Introduction to Databases, B. Ludaescher & K. Lin 50 Decomposition Main refinement technique: decomposition (replacing ABCD with, say, AB and BCD, or ACD and ABD) based on the projection operator. Decomposition should be used judiciously: –Is there reason to decompose a relation? (via Normal Forms) –What problems (if any) does the decomposition cause? (lost information or dependencies?)

51 Introduction to Databases, B. Ludaescher & K. Lin 51 EMP(ENAME, SSN, BDATE, ADDRESS, DNUM, DNAME, DMGRSSN) EMP_PROJ(SSN, PNUM, HOURS, ENAME, PNAME, PLOCATION) EMP1(SSN, ENAME, BDATE, ADDRESS, DNUM) X(DNUM, DNAME, DMGRSSN) EMP2(SSN, ENAME) X(PNUM, PNAME, PLOCATION) Y(SSN, PNUM, HOURS) 1 2 3 4 5 6 7 8 9 10 How can we decompose using the Project Operator?

52 Introduction to Databases, B. Ludaescher & K. Lin 52 Correct Decompositions How do we know if a decomposition is correct? That we haven’t lost anything? We have three goals: lossless-join decomposition (don’t throw any information away) (be able to reconstruct the original relation) dependency preservation all of the FDs end up in just one relation (not split across two or more relations) Boyce-Codd Normal Form (BCNF) - no redundancy

53 Introduction to Databases, B. Ludaescher & K. Lin 53 Lossless Decompositions What is a lossless decomposition? What is a lossy decomposition? When R is decomposed into R 1 and R 2 Check to see if (R 1 R 2 ) = R if it is a lossy decomposition, then R 1 R 2 gives you TOO MANY tuples. Note: we are doing a natural join

54 Introduction to Databases, B. Ludaescher & K. Lin 54 Example: a lossy decomposition Now join them using natural join: we get extra tuples!!! 1 smith p2 billing Original: Employee SSNNameProjectPtitle 1Smithp1accounting 2Jonesp1accounting 3Smithp2billing Decomposition: Employee SSNNameProjectPIDPtitleName 1Smithp1accountingSmith 2Jonesp1accountingJones 3Smithp2billingSmith

55 Introduction to Databases, B. Ludaescher & K. Lin 55 Testing for a Lossless Decomposition For decomposition into two relations Let R 1 and R 2 form a decomposition of R. R 1 and R 2 are both sets of attributes from R. For the decomposition to be lossless... The attributes in common must be a key for 1 of the relations! Note: You are joining a key and a foreign key

56 Introduction to Databases, B. Ludaescher & K. Lin 56 Example: test for a lossless decomposition Employee(SSN, name, project, p-title) decomposition: Employee (SSN, name) Project (project, p-title, name) Which attribute is in common? Employee.Name and Project.Name Is name a key for either of these two tables? NO! We have a problem.

57 Introduction to Databases, B. Ludaescher & K. Lin 57 Example: test for a lossless decomposition Employee(SSN, name, project, p-title) decomposition: Employee (SSN, name) Project (project, p-title) Which attribute is in common? None Is this decomposition lossless? NO! We have a problem.

58 Introduction to Databases, B. Ludaescher & K. Lin 58 Example: test for a lossless decomposition Employee(SSN, name, project, p-title) decomposition: Employee (SSN, name, project) Project (project, p-title) Which attribute is in common? Employee.project and Project.project Is project a key for either of these two tables? YES! We have a lossless decomposition.

59 Introduction to Databases, B. Ludaescher & K. Lin 59 Some Preliminary Normal Form Definitions Prime Attribute - an attribute A is prime if it is a member of a key, otherwise it is nonprime Partial Dependency - given an FD X  Y, Y is partially dependent on X if there is a proper subset X of X such that X  Y. Transitive Dependency - A is transitively dependent upon X if X  Y, Y-/  X, and Y  A and A  XY.

60 Introduction to Databases, B. Ludaescher & K. Lin 60 EMP(ENAME, SSN, BDATE, ADDRESS, DNUM, DNAME, DMGRSSN) EMP_PROJ(SSN, PNUM, HOURS, ENAME, PNAME, PLOCATION) 1 2 3 4 5 6 7 8 9 10 What are the FDs?

61 Introduction to Databases, B. Ludaescher & K. Lin 61 Normal Forms Based on FDs 1NF - all attribute values (domain values) are atomic (part of the definition of the relational model) 2NF - 1NF + no key partial dependencies (every nonprime attribute fully depends on every key of R) R(A B C D)B  C (not allowed) 3NF - 2NF + no nonprime attribute is transitively dependent upon any key R(A B C D)AB  C, C  D (not allowed) BCNF - 3NF + no attribute is transitively dependent upon any key (all FDs determined by a key) R(A B C D)AB  C, C  B (not allowed)

62 Introduction to Databases, B. Ludaescher & K. Lin 62 BCNF, Lossless, and Dependency-Preserving (first choice) 3NF, Lossless and Dependency-Preserving (second choice) because sometimes you can’t preserve all dependencies What’s the Goal?

63 Introduction to Databases, B. Ludaescher & K. Lin 63 Finding all of the FDs Armstrong’s Axioms Reflexivity: If X  Y, then X  Y Trivially, all attributes A  A, e.g., name  name and gender  gender Augmentation: If X  Y, then XZ  YZ for any Z Transitivity: If X  Y and Y  Z, then X  Z X, Y, and Z are sets of attributes in R Armstrong’s Axioms are a sound & complete set of inference rules Union: If X  Y and X  Z, then X  YZ Decomposition: If X  YZ, then X  Y and X  Z

64 Introduction to Databases, B. Ludaescher & K. Lin 64 Finding all FDs: the closure of a set of FDs The closure of a set of FDs: Let F be a set of FDs and F + the closure of F. F + is the set of all FDs implied (or derivable) from F using Armstrong’s Axioms F + is computed by applying the inference rules until no new FDs can be found

65 Introduction to Databases, B. Ludaescher & K. Lin 65 What is “Dependency Preserving” Suppose F is the original set of FDs and G is the set of FDs after decomposition If we compute F + and G +, and F + = G + then the decomposition is dependency preserving

66 Introduction to Databases, B. Ludaescher & K. Lin 66 Other Results (textbook) Algorithms To: –Compute F + –Compute the Minimal Cover for F –Find a dependency-preserving decomposition into 3NF –Find a lossless-join decomposition into BCNF –Find a lossless-join & dependency preserving decomposition into 3NF

67 Introduction to Databases, B. Ludaescher & K. Lin 67 addr(number, street, city, state, zip) number, street, city, state  zip zip  state If we decompose, this FD won’t occur within one relation. Thus, since the DBMS only enforces keys (and not FDs directly), this can’t be enforced. (If we leave it alone, it is in 3NF) Example: Not dependency preserving

68 Introduction to Databases, B. Ludaescher & K. Lin 68 Lossless join decomposition algorithm 1.Set D = {R}(the current set of relations) 2.While there is a relation in D that’s not in BCNF Choose a relation scheme Q that is not in BCNF Find a FD X  Y in Q that violates BCNF Replace Q in D by (Q – Y) and (X  Y) End While; 3.Identify dependences that are not preserved (X  A). 4.Add XA as a table to the set D

69 Introduction to Databases, B. Ludaescher & K. Lin 69 Tuning the Conceptual Schema The choice of conceptual schema should be guided by the workload, in addition to redundancy issues: –We may settle for a 3NF schema rather than BCNF. –E.g., the workload may influence our choice to decompose a relation into 3NF or BCNF. –We may further decompose a BCNF schema! –We might denormalize (i.e., undo a decomposition step), or add fields to a relation. We must take care to avoid the problems caused by the redundancy!

70 Introduction to Databases, B. Ludaescher & K. Lin 70 Decomposition of one table into several A relation is replaced by a collection of relations that are projections. Most important case. (Vertical partitioning) Sometimes, we might want to replace a relation by a collection of relations that are selections. (Horizontal partitioning) –Each new relation has the same schema as the original, but a subset of the rows. –Collectively, new relations contain all rows of the original. Typically, the new relations are disjoint. Why might we do this?

71 Introduction to Databases, B. Ludaescher & K. Lin 71 Tuning the Conceptual Schema Vertical decomposition using the project operator Course c#cnameinstructorroomdays Course2 c#cname Course1 c# roomdays Course3 c#instructor

72 Introduction to Databases, B. Ludaescher & K. Lin 72 Tuning the Conceptual Schema Horizontal partition/decomposition using the select operator Course c#cnameinstructorroomdays Undergraduate-Course c#cnameinstructorroomdays Graduate-Course c#cnameinstructorroomdays

73 Introduction to Databases, B. Ludaescher & K. Lin 73 Tuning the Conceptual Schema Denormalizing……always introduces redundancy! Course-Offering (offering#, quarter, c#, instructor, time, days) Course-Offering (offering#, quarter, c#, cname, instructor, room, time, days) This introduces redundancy - which must be managed. Course (c#, cname) Instructor-Room (instructor, room)

74 Introduction to Databases, B. Ludaescher & K. Lin 74 Denormalization interferes with dependency preservation Course-Offering (offering#, quarter, c#, cname, instructor, room, time, days) Note that having the FD instructor  room (and nothing else) in a single table allows the FD to be enforced (by enforcing the candidate key of instructor). Instructor-Room (instructor, room) Denormalization


Download ppt "Introduction to Databases: Relational and XML Models and Languages Instructors: Bertram Ludaescher Kai Lin Instructors: Bertram Ludaescher Kai Lin."

Similar presentations


Ads by Google