Presentation is loading. Please wait.

Presentation is loading. Please wait.

Practical tips for creating entity relationship diagrams (ERDs)

Similar presentations


Presentation on theme: "Practical tips for creating entity relationship diagrams (ERDs)"— Presentation transcript:

1 Practical tips for creating entity relationship diagrams (ERDs)
Chitu Okoli Associate Professor in Business Technology Management John Molson School of Business, Concordia University, Montréal

2 Entities

3 What is an ENTITY in a business case?
An entity is a noun (person, place, thing or event) that contains details (the details will be attributes) If it contains no details whatsoever (not even a name or description), then it might be an attribute of something else, not an entity in its own right Entities describe the contents of entire rows Strictly speaking, an entity refers to any specific row in a table, not the table as a whole Specifically, the name of the table is what the primary key of each row describes For example, each attribute in one row in the Employee table should refer to an individual employee Attributes are also nouns (persons, places, things or dates), but they are not entities if they describe other things E.g. a name or date is a noun, but it describes a thing, e.g. a person or sale; therefore name or date is the attribute, and the Person or Sale is the entity

4 Common false entities: The Organization that owns or operates the database
The organization that owns the database is NEVER modeled as an entity within the database Similarly, the database itself or the information system in which the database runs cannot be modeled as an entity However, its customers, suppliers, and other organizations for which it stores data will be modeled Multiple branches or locations within the organization might be modeled, but never the overall organization itself

5 Common false entities: Forms and reports
In general, a form or a report is NOT an entity A form is an input tool that receives data that needs to go into a database A report is an output tool that displays data taken from a database Reports include schedules, summaries, bills, receipts, queries, etc. Forms and reports are great sources for identifying entities and attributes, but they normally represent multiple entities at the same time They are a good point to start table normalization, but rarely survive as valid entities

6 Common false entities: Forms and reports: should be EVENTS
Rather than a form or report, the EVENT associated with a form or report is often an entity E.g. An order form is not an entity, but a Sale event is an entity E.g. A job application form is not an entity, but the JobApplication event is an entity Even when a form or report represents an event, multiple entities are usually involved If it is truly an event, then at least one important attribute must be a date or time It is perfectly legitimate, though, for the form or report number or ID to be a valid attribute for an event entity (e.g. InvoiceNo could be the PK for a Sale entity) Notable exception (which is not really an exception): An Invoice is often expressed as an entity, with InvoiceLine as a bridge entity (M:M) connecting the Invoice to Product Actually, the “Invoice” entity is just a Sale event (an invoice is just a report for a single sale), and “InvoiceLine” could just as well be called “Product_Sale” Events: Movie rentals: Customer_Rentals “Exception”: Invoice: INVOICE, INV_LINE

7 Common false entities: “Inventory”, “Stock”, “Warehouse”, etc.
These entity names violate the rule that “an entity refers to any specific row in a table, not the table as a whole” What does one row of an Inventory table describe? An individual inventory? That makes no sense. The entity should be named according to what each row describes Appropriate names: Product, Item, Part, etc. The key is: you should name the entire entity based on the contents of one individual row To handle the “inventory” or “stock” aspect, there should be an attribute called something like QuantityInStock This assumes that each row represents a specific product model, e.g. one product is one product model, and QuantityInStock = 5 means you have 5 of that product model in stock EXCEPTION: case of correct usage: Inventory or Stock can be correctly modeled as a M:M relationship between Product and Store/Warehouse entities Video rental: Movies: number_in_stock Invoice: PRODUCT: PROD_QOH EXCEPTION: StackOverflow Inventory

8 Relationships

9 What kind of RELATIONSHIP is it?
To accurately determine a relationship, you must always test each side in turn. For each side, you must ask: One instance of EntityA can have zero/one (minimum) or one/many (maximum) instance(s) of EntityB? One instance of EntityB can have zero/one (minimum) or one/many (maximum) instance(s) of EntityA? Examples: One department can employ zero or many employees; one employee can work for one and only one department 1:M: One ( II ) on the department side, many ( O< ) on the employee side One department can be managed by zero or one managers; one manager can manage zero or one departments 1:1: One ( IO ) on the department side, one ( OI ) on the manager side One project can employ zero or many employees; one employee can work for zero or many projects M:M: Many ( >O ) on the project side, many ( O< ) on the employee side Projects

10 Maximum and minimum cardinalities
1:1, 1:M and M:M refer to maximum cardinalities E.g. 1:M means maximum one of EntityA and maximum many of EntityB Minimum cardinalities are expressed with Crow’s foot notation: OI: Minimum zero, maximum one II: Minimum one, maximum one O<: Minimum zero, maximum many I< : Minimum one, maximum many Minimum zero (OI or O<) means that this side of the relationship is optional The other entity can exist in the database without this side Minimum one (II or I<) means that this side of the relationship is mandatory (required) The other entity cannot exist in the database without this side For a bridge entity (representing a M:M relationship), the one side is always II (minimum one, maximum one). If either main entity were absent, the record would not exist. Projects: manages and records

11 How COMMON are various kinds of relationship?
One-to-many (1:M) is the most common Many-to-many (M:M) is quite common, but in an ERD, it must always be decomposed into two 1:M relationships One-to-one (1:1) is actually rather rare Many times when you think it’s 1:1, you might have done something wrong Double- and triple-check to make sure it really is 1:1

12 Legitimate 1:1 relationships
IS-A or supertype/subtype: E.g. A person is a specific type of person: an Employee is an Accountant, a Doctor is a Specialist, a Student is an Undergraduate E.g. A product is a specific type of product: a Bicycle is a type of Vehicle, a Violin is a type of MusicalInstrument Boss, where the rule is that you can only have one boss, and a boss can be boss of only one organizational unit E.g. Employee manages a Department, Student is president of a Club Some kinds of scarce resource allocation E.g. one employee can be assigned only one company car Marriage: one Person can be married to only one other Person at a time Matching reservations E.g. an advance car Reservation is matched to one eventual car Rental NOTE: Most of these relationships cannot store historical information If you add dates to track historical changes, it becomes M:M In addition to these, there are other database performance scenarios where 1:1 relationships could make sense Projects

13 Naming conventions

14 REQUIREMENTS for database naming conventions
Many variations exist for naming conventions, but one rule is absolute: Whatever you choose, BE CONSISTENT! Entity and attribute names: Never uses spaces (“ ”) in instead, use either CamelCase or underscores_between_words Spaces can seriously mess up database programming in unexpected places Relationship names Relationship names are required in an ERD Even in drafts, you should name your relationships—unnamed relationships are one of the top reasons for errors in ERD design For relationship names in an ERD, spaces are perfectly OK, since they never appear in programming Movies, College, Inventory demonstrate different conventions that are all mostly good

15 Avoid reserved words Each database has a list of keywords that have special meaning; you can’t use these as entity or attribute names E.g. Date, Number, Order, Row, etc. Some DBMSs let you use them (maybe you have to use quotes or brackets), but trouble is almost guaranteed when you start programming Lists of reserved names for popular DBMSs Oracle SQL Reserved Words MySQL Keywords and Reserved Words Microsoft Access SQL Reserved Words Microsoft SQL Server Reserved Keywords You can’t memorize them all, but it’s a good idea to at least scan through the list once when you start using a new DBMS

16 SUGGESTIONS for database naming conventions
My preferred (optional) tips: Avoid abbreviating table or attribute names (e.g. don’t use CustID for CustomerID, or EmpName for EmployeeName) Requires a little more typing, but clarity in coding is priceless Avoid redundant table prefixes such as “tbl” in tblCustomer; just Customer is fine The context of use makes it clear that it’s a table Always use a surrogate primary key (PK) for every table and call it ID, and always call the foreign key (FK) TableID or Table_ID. For example, the PK for Customer table is ID (Customer.ID) and its corresponding FK in the Sale table is CustomerID (Sale.CustomerID) Detailed specifications (these are just suggestions): Jason Mauss’s detailed guide Justin Watts’ tips Inventory demonstrates ID usage

17 SUGGESTIONS for naming Entities and Attributes
Entity and attribute names are always a singular noun, not plural (e.g. Customer, not Customers) However, many people prefer the opposite For bridge/associative entities from M:M relationships, there are two choices: Be creative in forming a noun from the verb of the relationship. E.g. for Employee and Project, name the bridge entity “Assignment” Simply form a name by merging the two constituent entity names. E.g. for Employee and Project, name it Employee_Project or Project_Employee Movies: plural entity names Projects

18 SUGGESTIONS for naming Relationships
Never use “has” to name relationships—it is meaningless E.g. Don’t use “Customer has Sale” or “Sale has Product”. Instead, use “Customer purchases Sale” or “Sale offers Product” Anything is better than “has” Relationship names are always verbs, usually lower case; spaces are allowed Relationships are named from the one side to the many side (in a 1:M relationship) For 1:1 relationships, they are named from the side with the PK to the side with the FK For bridge entity relationships, name them as if the many side were directly connected to the original entity College: good examples, except for DEPARTMENT has STUDENT and M:M relationship names Inventory: Good M:M relationship names

19 Major stages for designing an ERD
Identify the entities, attributes and relationships from the business case description Verify the attributes, select primary keys, add foreign keys and normalize Finalize the ERD using modeling software

20 Identify relationships between entities
Designing an ERD: Stage 1 Identify entities, attributes and relationships Identify entities (main nouns) and attributes (nouns that are details of entities) Analyze the case top-down first (entities, then attributes) Analyze again, this time bottom-up (attributes, then entities) Each attribute must apply specifically and only to its own entity Leave out primary keys and foreign keys at this stage Identify relationships between entities It is never the case that all entities are related to all relationships. You must make sure that the relationship exists. For a relationship to exist, the case description must mention the entities involved together in the same sentence. Otherwise, there is no relationship. Use the two-step procedure for verifying 1:M, M:M or 1:1 (first verify one side of the relationship, and then the other side) Verify minimum and maximum cardinalities for each relationship (e.g. minimum 0, maximum many; minimum 1, maximum 1 Convert any M:M relationships into two 1:M relationships

21 Designing an ERD: Stage 2 Verify attributes, add keys, normalize
Populate all your entities with attributes Review and revise attributes In particular, verify if some of the attributes identified earlier might actually belong in the new bridge entities created for M:M Decide on the primary keys (PK, unique identifiers) for each entity Add any appropriate foreign keys (FK) to all entities Normalize each entity to fourth normal form (4NF) Normalization might require you to make major revisions to some entities If revisions are required, reiterate until you are satisfied with all the entities

22 In which entity does the foreign key go?
1:M relationships (including bridge entities): The PK from the one side must always be copied to the many side as a FK 1:1 relationships: Copy the FK to only one of the two sides, never both (1,1):(0,1): Foreign key must go on the (0,1) side (1,0):(0,1): You can choose either side, but never both Preferably, place the FK on whichever side will cause fewer nulls (1,1):(1,1): This relationship almost never exists in the real world. You probably made a mistake. Verify your cardinalities, or merge the two tables into one. Exception: this relationship is sometimes used deliberately for database performance optimization College

23 Designing an ERD: Stage 3 (modeling software) Draw the ERD
Draw the final ERD using modeling software Stages 1 and 2 are drafts (maybe even on paper); you don’t consider it final until you’ve got your drafts polished If you draw something in software at the beginning, it is much harder to see and correct your errors Verify various items: All relationships must be named By default, all relationships are non-identifying/weak (dashed lines) Identifying/strong relationships (solid lines) are used only when an FK is part of one entity’s PK Required attributes (including all PKs and some FKs) should be bolded College

24


Download ppt "Practical tips for creating entity relationship diagrams (ERDs)"

Similar presentations


Ads by Google