Presentation transcript: "Database Systems I, Week 2: The Entity-Relationship Model"
1 Database Systems I, Week 2: The Entity-Relationship Model
2 Overview of Database Development
Requirements Analysis / Ideas → High-Level Database Design → Conceptual Database Design / Relational Database Schema → Physical Database Design / Relational DBMS. Similar to software development.
3 Overview of Database Development
Requirements analysis: What data are to be stored in the enterprise? What are the required applications? What are the most important operations?
High-level database design: What are the entities and relationships in the enterprise? What information about these entities and relationships should we store in the database? What are the integrity constraints or business rules that hold? The ER model or UML is used to represent the high-level design.
4 Overview of Database Development
Conceptual database design: Which data model should the DBS implement (e.g., the relational data model)? Map the high-level design (e.g., an ER diagram) to a (conceptual) database schema of the chosen data model.
Physical database design: Which DBMS should be used? What are the typical workloads of the DBS? Build indexes to support efficient query processing. What redesign of the conceptual database schema is necessary from the point of view of efficient implementation?
5 Entity-Relationship Model
Short: ER model. It has many similarities with other modeling languages such as UML.
Concepts: entities / entity sets, attributes, relationships / relationship sets, and constraints.
It offers more modeling concepts than the relational data model (which offers only relations) and is closer to the way in which people think.
6 Entity-Relationship Diagrams
An Entity-Relationship diagram (ER diagram) is a graph with nodes representing entity sets, attributes and relationship sets. Entity sets are denoted by rectangles, attributes by ovals, and relationship sets by diamonds. Edges (lines) connect entity sets to their attributes and relationship sets to their entity sets.
[Figure: ER diagram with entity sets Employees (ssn, name, lot) and Departments (did, dname, budget), connected by the relationship set Works_In (since).]
7 Entities and Entity Sets
Entity: a real-world object distinguishable from other objects, e.g. employee Miller. An entity can be a physical or an abstract object. An entity is associated with the attributes describing its properties.
Attribute values are atomic, e.g. strings, integers or real numbers: each contains a single piece of information. Examples: first name; age or date-of-birth?
Entity set: a collection of similar entities, e.g., all employees.
8 Entities and Entity Sets
All entities in an entity set have the same set of attributes (at least for the moment!). Each entity set has a key, i.e. a minimal set of attributes that uniquely identifies an entity of this set. Key attributes are underlined. Each attribute has a domain, i.e. the set of all possible attribute values.
[Figure: entity set Employees with attributes ssn, name, age.]
9 Entities and Entity Sets
A key must be unique across all possible (not just the current) entities of its set. A key can consist of more than one attribute. There can be more than one key for a given entity set, but we choose one (the primary key) for the ER diagram.
[Figure: entity set Employees with attributes firstname, lastname, birthdate, salary.]
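Because a key must hold for all possible entities, inspecting the current instance can only refute a key claim, never prove it. The following is a minimal Python sketch of such a refutation check; the helper name `violates_key` and the sample rows (including the third employee) are assumptions made for illustration.

```python
from typing import Iterable, Mapping, Sequence


def violates_key(rows: Iterable[Mapping[str, object]], attrs: Sequence[str]) -> bool:
    """Return True if two rows agree on every attribute listed in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return True
        seen.add(value)
    return False


# Hypothetical instance of the Employees entity set.
employees = [
    {"ssn": "12345678", "name": "John Miller", "age": 30},
    {"ssn": "14789632", "name": "Paul Li", "age": 25},
    {"ssn": "56756322", "name": "John Miller", "age": 41},
]
```

Here `violates_key(employees, ["name"])` finds two employees with the same name, so name alone cannot be a key. That `["ssn"]` passes on this instance is only consistent with ssn being a key; it does not establish it.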
10 Relationships and Relationship Sets
Relationship: an association among two or more entities, e.g., Miller works in the Pharmacy department.
Relationship set: a collection of similar relationships among two or more entity sets.
[Figure: Employees (ssn, name, age) and Departments (did, dname, budget) connected by the relationship set Works_In.]
11 Relationships and Relationship Sets
An n-ary relationship set R relates n entity sets E1, ..., En. Each relationship in R involves entities e1 ∈ E1, ..., en ∈ En. Binary relationship sets are the most common. The same entity set can participate in different relationship sets, or in different “roles” in the same set.
[Figure: relationship set Reports_To relating Employees (ssn, name, age) to itself in the roles subordinate and supervisor.]
12 Relationships and Relationship Sets
Entity: an object that is distinguishable from other objects. Examples: your home address, CMPT 354.
Entity set: all home addresses; the collection of CMPT courses. Each entity set contains one to many entities, and each entity can belong to multiple entity sets.
Relationship: Joe lives at 45 Main St.; Mary lives at 89 Wood Ave.
Relationship set: Person lives at home address.
13 Relationships and Relationship Sets
Relationship sets can also have attributes. These are useful for properties that cannot reasonably be associated with one of the participating entity sets.
[Figure: Works_In with attribute since, connecting Employees (ssn, name, age) and Departments (did, dname, budget).]
14 Instances of an ER Diagram
An entity set contains a set of entities. Each entity has one value for each of its attributes. There are no duplicate instances.
Employees
ssn       name           age
12345678  “John Miller”  30
14789632  “Paul Li”      25
...
15 Instances of an ER Diagram
A relationship set contains a set (no duplicates!) of relationships, each relating a set of entities, one from each of the participating entity sets. Components are entities, not attribute values.
Works_In
Employee (ssn)  Department (did)
12345678        1
14789632        1
56756322        2
...
16 Relationships and Relationship Sets
Multiway relationship sets (n > 2) are used whenever binary relationships cannot capture the application semantics. They are infrequent.
[Figure: ternary relationship set Works_For connecting Employees (ssn, name, age), Projects (pid, pbudget) and Tasks (tid, description).]
17 Relationships and Relationship Sets
[Figure: ternary relationship set Works_For connecting Employees (ssn, name, age), Projects (pid, pbudget) and Tasks (tid, description).]
Works_For
Employee (ssn)  Tasks (tid)  Project (pid)
12345678        1000         101
12345678        1500         106
56756322        1500         106
...
18 Multiplicity of Relationships
An employee can work in many departments; a department can have many employees. Each department has at most one manager, who may manage several (many) departments.
[Figure: Works_In (since) and Manages (since), each connecting Employees (ssn, name, age) and Departments (did, dname, budget).]
19 Multiplicity of Relationships
The different types of (binary) relationships from a multiplicity point of view: one-to-one, one-to-many, many-to-one, and many-to-many.
[Figure: four panels illustrating many-to-many, one-to-one, one-to-many and many-to-one relationships.]
20 Key Constraints
A key constraint on a relationship set specifies that each entity of the marked entity set participates in at most one relationship of this relationship set. The entity set is marked with an arrow.
[Figure: Manages (since) connecting Employees (ssn, name, age) and Departments (did, dname, budget), with an arrow marking the key constraint.]
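Read on a concrete instance, a key constraint says that no entity of the marked entity set occurs in more than one relationship. A minimal Python sketch of that check follows; the function name `satisfies_key_constraint` and the sample Manages pairs are illustrative assumptions.

```python
from collections import Counter
from typing import Iterable, Tuple


def satisfies_key_constraint(relationships: Iterable[Tuple[str, int]], position: int) -> bool:
    """True if every entity at `position` appears in at most one relationship."""
    counts = Counter(rel[position] for rel in relationships)
    return all(n <= 1 for n in counts.values())


# Hypothetical Manages instance: (manager ssn, did) pairs.
manages = {("12345678", 1), ("12345678", 2), ("14789632", 3)}
```

With position 1 (Departments) the constraint holds, since each department has one manager. With position 0 (Employees) it fails, because one employee manages two departments, which is exactly the many-to-one situation the arrow permits.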
21 Participation Constraints
A participation constraint on a relationship set specifies that each entity of the marked entity set participates in at least one relationship of this relationship set. The entity set is marked with a bold line.
[Figure: Employees and Departments connected by Manages (since) and Works_In (since), with bold lines marking the participation constraints.]
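On a concrete instance, total participation means that every entity of the marked set shows up in at least one relationship. The sketch below expresses this as a subset test; the function name `satisfies_participation` and the sample data are assumptions for illustration.

```python
from typing import Iterable, Set, Tuple


def satisfies_participation(
    entities: Set[int], relationships: Iterable[Tuple[str, int]], position: int
) -> bool:
    """True if every entity occurs at `position` in at least one relationship."""
    participating = {rel[position] for rel in relationships}
    return entities <= participating


# Hypothetical data: department ids and a Manages instance of (ssn, did) pairs.
departments = {1, 2, 3}
manages = {("12345678", 1), ("12345678", 2)}
```

Department 3 has no manager in this instance, so the participation constraint on Departments is violated until a Manages relationship for it is added.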
22 Weak Entities
A weak entity exists only in the context of another (owner) entity. The weak entity can be identified uniquely only by considering the primary key of the owner together with its own partial key. The owner entity set and the weak entity set must participate in a one-to-many relationship set (one owner, many weak entities). The weak entity set must have total participation in this supporting relationship set. Example: if there is no employee, there cannot be a dependent.
[Figure: Employees (ssn, name, age) and the weak entity set Dependents (name, age) connected by Policy (cost).]
23 Subclasses
Sometimes an entity set contains some entities that share many, but not all, properties with the rest of the entity set; this leads to hierarchies. A ISA B: every A entity is also considered to be a B entity. A specializes B, B generalizes A. A is called the subclass, B the superclass. A subclass inherits the attributes of its superclass and may define additional attributes.
[Figure: Hourly_Emps and Contract_Emps connected to Employees by ISA.]
24 Subclasses
[Figure: Employees (ssn, name, age) with subclasses Hourly_Emps (hourly_wages, hours_worked) and Contract_Emps (contractid), connected by ISA.]
Hourly_Emps and Contract_Emps inherit the ssn (key!), name and age attributes from Employees. They define the additional attributes hourly_wages and hours_worked, and contractid, respectively.
25 Subclasses
Covering constraints: Does every Employees entity have to be either an Hourly_Emps or a Contract_Emps entity? No, unless we declare: Hourly_Emps AND Contract_Emps COVER Employees.
Overlap constraints: Can Joe be an Hourly_Emps as well as a Contract_Emps entity? Yes: Hourly_Emps OVERLAPS Contract_Emps.
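Both constraints can be read as simple set conditions on the current instances of the superclass and its subclasses. The following is a minimal sketch under that reading; the function names `covers` and `overlaps` and the sample ssn sets are illustrative assumptions.

```python
def covers(superclass: set, *subclasses: set) -> bool:
    """True if every superclass entity belongs to at least one subclass."""
    return superclass <= set().union(*subclasses)


def overlaps(a: set, b: set) -> bool:
    """True if some entity belongs to both subclasses."""
    return bool(a & b)


# Hypothetical instances, identified by ssn.
employees = {"12345678", "14789632", "56756322"}
hourly_emps = {"12345678", "14789632"}
contract_emps = {"14789632"}
```

In this instance the subclasses overlap (one employee is both hourly and contract) but do not cover Employees, because the third employee belongs to neither subclass.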
26 Subclasses
There are several good reasons for using ISA relationships and subclasses: we do not have to redefine all the attributes; we can add descriptive attributes specific to a subclass; and we can identify the entity sets that participate in a relationship set as precisely as possible. ISA relationships form a tree structure (taxonomy) with one entity set serving as the root.
27 Design Principles
Faithfulness: the design must be faithful to the specification / reality. Relevant aspects of reality must be represented in the model.
Avoiding redundancy: a redundant representation blows up the ER diagram and makes it harder to understand, wastes storage, and may lead to inconsistencies in the database.
28 Design Principles
Keep it simple: the simpler the ER diagrams, the easier they are to understand for an (external) reader. Avoid introducing more elements than necessary. If possible, prefer attributes over entity sets and relationship sets.
Formulate constraints as far as possible: a lot of data semantics can (and should) be captured, but some constraints cannot be captured in ER diagrams.
29 High-Level Design with ER Model
Major design choices: Should a concept be modeled as an entity, an attribute, or a relationship? Which relationships should be used: binary or ternary?
Example: Should address be an attribute of Employees or an entity (connected to Employees by a relationship)? That depends on the use we want to make of address information and on the semantics of the data: if we have several addresses per employee, address must be an entity (since attributes cannot be set-valued).
30 Entity vs. Attribute
Works_In2 does not allow an employee to work in the same department for two or more periods (why?). We want to record several values of the descriptive attributes for each instance of this relationship.
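One way to see the limitation: a relationship is identified by its participating entities, so a given employee-department pair can occur only once in the relationship set. The sketch below assumes the usual relational translation with PRIMARY KEY (ssn, did) and runs it through Python's built-in sqlite3 module. The descriptive attributes `from_date` and `to_date` are hypothetical names, since the slide does not list the attributes of Works_In2.

```python
import sqlite3


def second_period_rejected() -> bool:
    """Return True if a second period for the same (ssn, did) pair is rejected."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE Works_In2 (
            ssn       CHAR(11),
            did       INTEGER,
            from_date DATE,   -- assumed descriptive attribute
            to_date   DATE,   -- assumed descriptive attribute
            PRIMARY KEY (ssn, did)
        )
        """
    )
    conn.execute("INSERT INTO Works_In2 VALUES ('12345678', 1, '2001-01-01', '2003-12-31')")
    try:
        # Same employee, same department, a later period.
        conn.execute("INSERT INTO Works_In2 VALUES ('12345678', 1, '2006-01-01', '2008-12-31')")
    except sqlite3.IntegrityError:
        return True
    finally:
        conn.close()
    return False
```

The second insert fails on the primary key, which suggests modeling the period as an entity set of its own when several periods per employee and department must be recorded.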
31 Entity vs. Relationship
This ER diagram is fine if a manager gets a separate discretionary budget for each department. But what if a manager gets a discretionary budget that covers all managed departments? Then dbudget is stored redundantly for each department managed by the manager, and the diagram is misleading because it suggests that dbudget is tied to the managed department.
[Figure: Manages2 (since, dbudget) connecting Employees (ssn, name, lot) and Departments (did, dname, budget).]
32 Entity vs. Relationship
What about this diagram? Will employees who are not managers have dbudget = null? The following ER diagram is more appropriate and avoids the above problems: each manager now has a budget.
33 Binary vs. Ternary Relationships
The ER diagram says: an employee can own several policies; each policy can be owned by several employees; each dependent can be covered by several policies.
If each policy is owned by just one employee: a key constraint on Policies would mean that a policy can cover only 1 dependent (only 1 combination of Employees and Policies can be in Covers). Bad design!
[Figure: ternary relationship set Covers connecting Employees (ssn, name, lot), Policies (policyid, cost) and Dependents (pname, age).]
34 Binary vs. Ternary Relationships
This diagram is a better design. A policy can exist only for employees. Dependents exist only if they are covered by a policy.
[Figure: Purchaser connecting Employees (ssn, name, lot) and Policies (policyid, cost); Beneficiary connecting Policies and Dependents (pname, age).]
35 Binary vs. Ternary Relationships
The previous example illustrated a case in which two binary relationships were better than one ternary relationship. An example in the other direction: a ternary relationship set Contracts relates the entity sets Parts, Departments and Suppliers, and has the descriptive attribute qty. No combination of binary relationships is an adequate substitute: S “can-supply” P, D “needs” P, and D “deals-with” S do not imply that D has agreed to buy P from S. And how would we record qty?
36 Conceptual Design: ER to Relational
How do we represent entity sets, relationship sets, attributes, key and participation constraints, subclasses, weak entity sets, ...?
37 Entity Sets
Entity sets are translated to tables.
[Figure: entity set Employees with attributes ssn, name, lot.]
CREATE TABLE Employees (
    ssn CHAR(11),
    name CHAR(20),
    lot INTEGER,
    PRIMARY KEY (ssn));
38 Relationship Sets
Relationship sets are also translated to tables. The table contains the keys of each participating entity set (as foreign keys); the combination of these keys forms a superkey for the table. It also contains all descriptive attributes of the relationship set.
CREATE TABLE Works_In (
    ssn CHAR(11),
    did INTEGER,
    since DATE,
    PRIMARY KEY (ssn, did),
    FOREIGN KEY (ssn) REFERENCES Employees,
    FOREIGN KEY (did) REFERENCES Departments);
39 Key Constraints
Each department has at most one manager, according to the key constraint on Manages. How is this translated to the relational model?
[Figure: Manages (since) connecting Employees (ssn, name, lot) and Departments (did, dname, budget); panels illustrating many-to-many, one-to-one, one-to-many and many-to-one relationships.]
40 Key Constraints
Map the relationship set to a table, with separate tables for Employees and Departments. Note that did is the key now!
CREATE TABLE Manages (
    ssn CHAR(11),
    did INTEGER,
    since DATE,
    PRIMARY KEY (did),
    FOREIGN KEY (ssn) REFERENCES Employees,
    FOREIGN KEY (did) REFERENCES Departments);
Since each department has a unique manager, we could instead combine Manages and Departments.
CREATE TABLE Dept_Mgr (
    did INTEGER,
    dname CHAR(20),
    budget REAL,
    manager CHAR(11),
    since DATE,
    PRIMARY KEY (did),
    FOREIGN KEY (manager) REFERENCES Employees);
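To see what making did the primary key of Manages buys, the schema can be tried out in SQLite through Python's sqlite3 module. This is a sketch under assumptions: the sample rows (including the 'Toys' department, the lot values and the budgets) are invented, and the PRAGMA is needed only because SQLite does not enforce foreign keys by default.

```python
import sqlite3


def manages_demo() -> dict:
    """Try three inserts into Manages and report which ones are accepted."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable FK checks
    conn.executescript(
        """
        CREATE TABLE Employees (ssn CHAR(11), name CHAR(20), lot INTEGER, PRIMARY KEY (ssn));
        CREATE TABLE Departments (did INTEGER, dname CHAR(20), budget REAL, PRIMARY KEY (did));
        CREATE TABLE Manages (
            ssn CHAR(11), did INTEGER, since DATE,
            PRIMARY KEY (did),
            FOREIGN KEY (ssn) REFERENCES Employees,
            FOREIGN KEY (did) REFERENCES Departments
        );
        INSERT INTO Employees VALUES ('12345678', 'John Miller', 10), ('14789632', 'Paul Li', 11);
        INSERT INTO Departments VALUES (1, 'Pharmacy', 100000.0), (2, 'Toys', 50000.0);
        """
    )
    attempts = {
        "first manager of dept 1": "INSERT INTO Manages VALUES ('12345678', 1, '2020-01-01')",
        "same employee manages dept 2": "INSERT INTO Manages VALUES ('12345678', 2, '2021-01-01')",
        "second manager of dept 1": "INSERT INTO Manages VALUES ('14789632', 1, '2022-01-01')",
    }
    outcome = {}
    for label, sql in attempts.items():
        try:
            conn.execute(sql)
            outcome[label] = "accepted"
        except sqlite3.IntegrityError:
            outcome[label] = "rejected"
    conn.close()
    return outcome
```

An employee may manage several departments, but a second manager for the same department violates the primary key on did, which is how the table encodes the key constraint.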
41 Participation Constraints
We can capture participation constraints involving one entity set in a binary relationship using NOT NULL. In other cases, we need CHECK constraints.
CREATE TABLE Dept_Mgr (
    did INTEGER,
    dname CHAR(20),
    budget REAL,
    manager CHAR(11) NOT NULL,
    since DATE,
    PRIMARY KEY (did),
    FOREIGN KEY (manager) REFERENCES Employees
        ON DELETE NO ACTION);
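The effect of NOT NULL together with ON DELETE NO ACTION can be checked with a small experiment in SQLite via Python's sqlite3 module. The sample rows are invented for illustration, and the PRAGMA is an SQLite-specific requirement for enforcing foreign keys.

```python
import sqlite3


def dept_mgr_demo() -> dict:
    """Report whether two constraint-violating statements are rejected."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable FK checks
    conn.executescript(
        """
        CREATE TABLE Employees (ssn CHAR(11), name CHAR(20), lot INTEGER, PRIMARY KEY (ssn));
        CREATE TABLE Dept_Mgr (
            did INTEGER, dname CHAR(20), budget REAL,
            manager CHAR(11) NOT NULL, since DATE,
            PRIMARY KEY (did),
            FOREIGN KEY (manager) REFERENCES Employees ON DELETE NO ACTION
        );
        INSERT INTO Employees VALUES ('12345678', 'John Miller', 10);
        INSERT INTO Dept_Mgr VALUES (1, 'Pharmacy', 100000.0, '12345678', '2020-01-01');
        """
    )
    attempts = {
        "department without manager": "INSERT INTO Dept_Mgr VALUES (2, 'Toys', 50000.0, NULL, NULL)",
        "delete a managing employee": "DELETE FROM Employees WHERE ssn = '12345678'",
    }
    outcome = {}
    for label, sql in attempts.items():
        try:
            conn.execute(sql)
            outcome[label] = "accepted"
        except sqlite3.IntegrityError:
            outcome[label] = "rejected"
    conn.close()
    return outcome
```

Both statements are rejected: every department row must name a manager, and an employee cannot be deleted while a department still refers to them as manager.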
42 Weak Entity Sets
A weak entity set can be identified uniquely only by considering the primary key of another (owner) entity set. The owner entity set and the weak entity set must participate in a one-to-many relationship set (one owner, many weak entities). The weak entity set must have total participation in this identifying relationship set.
[Figure: Employees (ssn, name, lot) and the weak entity set Dependents (pname, age) connected by Policy (cost).]
43 Weak Entity Sets
The weak entity set and the identifying relationship set are translated into a single table. When the owner entity is deleted, all owned weak entities must also be deleted.
CREATE TABLE Dep_Policy (
    pname CHAR(20),
    age INTEGER,
    cost REAL,
    ssn CHAR(11) NOT NULL,
    PRIMARY KEY (pname, ssn),
    FOREIGN KEY (ssn) REFERENCES Employees
        ON DELETE CASCADE);
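The cascading delete can be observed directly in SQLite via Python's sqlite3 module. This is a sketch with invented sample rows (the dependent names, ages and costs are assumptions); the PRAGMA is required because SQLite leaves foreign key enforcement off by default.

```python
import sqlite3


def cascade_demo() -> tuple:
    """Return the number of Dep_Policy rows before and after deleting an owner."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable FK checks
    conn.executescript(
        """
        CREATE TABLE Employees (ssn CHAR(11), name CHAR(20), lot INTEGER, PRIMARY KEY (ssn));
        CREATE TABLE Dep_Policy (
            pname CHAR(20), age INTEGER, cost REAL, ssn CHAR(11) NOT NULL,
            PRIMARY KEY (pname, ssn),
            FOREIGN KEY (ssn) REFERENCES Employees ON DELETE CASCADE
        );
        INSERT INTO Employees VALUES ('12345678', 'John Miller', 10), ('14789632', 'Paul Li', 11);
        -- Two dependents named Anna are fine: pname is only a partial key.
        INSERT INTO Dep_Policy VALUES
            ('Anna', 7, 120.0, '12345678'),
            ('Ben', 4, 80.0, '12345678'),
            ('Anna', 9, 95.0, '14789632');
        """
    )
    count = "SELECT COUNT(*) FROM Dep_Policy"
    before = conn.execute(count).fetchone()[0]
    conn.execute("DELETE FROM Employees WHERE ssn = '12345678'")
    after = conn.execute(count).fetchone()[0]
    conn.close()
    return before, after
```

Deleting the owner removes both of that employee's dependents, leaving only the dependent owned by the other employee.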
44 Subclasses
If we declare A ISA B, every A entity is also considered to be a B entity. The attributes of B are inherited by A.
Overlap constraints: Can Joe be an Hourly_Emps as well as a Contract_Emps entity? (Allowed/disallowed)
Covering constraints: Does every Employees entity have to be either an Hourly_Emps or a Contract_Emps entity? (Yes/no)
[Figure: Employees (ssn, name, lot) with subclasses Hourly_Emps (hourly_wages, hours_worked) and Contract_Emps (contractid), connected by ISA.]
45 Subclasses
ER-style translation: one table for each of the entity sets (superclass and subclasses). The ISA relationship does not require an additional table. All tables have the same key, i.e. the key of the superclass.
Example: one table each for Employees, Hourly_Emps and Contract_Emps. General employee attributes are recorded in Employees. For hourly employees and contract employees, the extra information is recorded in the respective relations.
46 Subclasses
Queries involving all employees are easy; those involving just Hourly_Emps require a join to get their special attributes.
CREATE TABLE Employees (
    ssn CHAR(11),
    name CHAR(20),
    lot INTEGER,
    PRIMARY KEY (ssn));
CREATE TABLE Hourly_Emps (
    ssn CHAR(11),
    hourly_wages REAL,
    hours_worked INTEGER,
    PRIMARY KEY (ssn),
    FOREIGN KEY (ssn) REFERENCES Employees
        ON DELETE CASCADE);
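The join mentioned above can be sketched as follows, using SQLite through Python's sqlite3 module. The sample rows and the wage and hours values are assumptions made for illustration.

```python
import sqlite3


def hourly_emps_with_names() -> list:
    """Combine inherited and subclass-specific attributes of hourly employees."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable FK checks
    conn.executescript(
        """
        CREATE TABLE Employees (ssn CHAR(11), name CHAR(20), lot INTEGER, PRIMARY KEY (ssn));
        CREATE TABLE Hourly_Emps (
            ssn CHAR(11), hourly_wages REAL, hours_worked INTEGER,
            PRIMARY KEY (ssn),
            FOREIGN KEY (ssn) REFERENCES Employees ON DELETE CASCADE
        );
        INSERT INTO Employees VALUES ('12345678', 'John Miller', 10), ('14789632', 'Paul Li', 11);
        INSERT INTO Hourly_Emps VALUES ('14789632', 22.5, 40);
        """
    )
    rows = conn.execute(
        """
        SELECT E.ssn, E.name, H.hourly_wages, H.hours_worked
        FROM Employees E JOIN Hourly_Emps H ON E.ssn = H.ssn
        ORDER BY E.ssn
        """
    ).fetchall()
    conn.close()
    return rows
```

Only the employee who also appears in Hourly_Emps is returned; the name comes from Employees and the wage information from the subclass table.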
47 Subclasses
Alternative translation: create tables for the subclasses only. These tables have all attributes of the superclass(es) and the subclass. This approach is applicable only if the subclasses cover the superclass, i.e. only if Hourly_Emps AND Contract_Emps COVER Employees.
Example: Hourly_Emps: ssn, name, lot, hourly_wages, hours_worked. Contract_Emps: ssn, name, lot, contractid.
Queries involving all employees are difficult; those on Hourly_Emps or Contract_Emps alone are easy.
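Under this alternative translation, a query over all employees has to combine the subclass tables. A minimal sketch with SQLite via Python's sqlite3 module follows; the sample rows and the INTEGER type for contractid are assumptions, since the slides give neither.

```python
import sqlite3


def all_employees() -> list:
    """List every employee once, given subclass tables only."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Hourly_Emps (
            ssn CHAR(11), name CHAR(20), lot INTEGER,
            hourly_wages REAL, hours_worked INTEGER, PRIMARY KEY (ssn));
        CREATE TABLE Contract_Emps (
            ssn CHAR(11), name CHAR(20), lot INTEGER,
            contractid INTEGER, PRIMARY KEY (ssn));
        INSERT INTO Hourly_Emps VALUES ('12345678', 'John Miller', 10, 22.5, 40);
        INSERT INTO Contract_Emps VALUES ('14789632', 'Paul Li', 11, 7001);
        -- An employee in both subclasses (overlap) is stored twice.
        INSERT INTO Contract_Emps VALUES ('12345678', 'John Miller', 10, 7002);
        """
    )
    rows = conn.execute(
        """
        SELECT ssn, name, lot FROM Hourly_Emps
        UNION
        SELECT ssn, name, lot FROM Contract_Emps
        ORDER BY ssn
        """
    ).fetchall()
    conn.close()
    return rows
```

UNION removes the duplicate that arises when an employee belongs to both subclasses, but name and lot are then stored redundantly, and an employee in neither subclass cannot be stored at all.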
48 Binary vs. Ternary Relationships
The key constraints allow us to combine Purchaser with Policies and Beneficiary with Dependents. Participation constraints lead to NOT NULL constraints.
CREATE TABLE Policies (
    policyid INTEGER,
    cost REAL,
    ssn CHAR(11) NOT NULL,
    PRIMARY KEY (policyid),
    FOREIGN KEY (ssn) REFERENCES Employees
        ON DELETE CASCADE);
CREATE TABLE Dependents (
    pname CHAR(20),
    age INTEGER,
    policyid INTEGER NOT NULL,
    PRIMARY KEY (pname, policyid),
    FOREIGN KEY (policyid) REFERENCES Policies
        ON DELETE CASCADE);
49 Summary
High-level design follows requirements analysis and yields a high-level description of the data to be stored. The ER model is popular for high-level design: its constructs are expressive and close to the way people think about their applications.
Basic constructs: entities, relationships, and attributes (of entities and relationships). Some additional constructs: weak entities, subclasses, and constraints.
ER design is subjective. There are often many ways to model a given scenario! Analyzing alternatives can be tricky, especially for a large enterprise.
50 Summary
There are guidelines to translate ER diagrams to a relational database schema. However, there are often alternatives that need to be carefully considered. Entity sets and relationship sets are all represented by relations. Some constructs of the ER model cannot be easily translated, e.g. multiple participation constraints.