Dale Roberts 11/26/2015 1 Department of Computer and Information Science, School of Science, IUPUI Fall 2003 Dale Roberts, Lecturer Computer Science, IUPUI.


1 Dale Roberts 11/26/2015 1 Department of Computer and Information Science, School of Science, IUPUI Fall 2003 Dale Roberts, Lecturer Computer Science, IUPUI E-mail: droberts@cs.iupui.edu Data Modeling

2 Dale Roberts Normalization Defined
1st Definition: ... the process of taking a wide table with lots of columns but few rows and redesigning it as several narrow tables with fewer columns but more rows. A properly normalized design allows you to use storage space efficiently, eliminate redundant data, reduce or eliminate inconsistent data, and ease the data maintenance burden... (Jose’s Database Programming Corner, www.citilink.com/~jgarrick/vbasic/database)
2nd Definition: Procedure to ensure that a data model conforms to some useful standards.... to minimize the duplication of data, to provide the flexibility necessary to support different functional requirements, and to enable the data modeler to verify the business requirements. (Oracle 7.3 Developer’s Guide, p. 466)
3rd Definition: Normalization is the process of putting things right, making them normal......separating elements of data into affinity groups, and defining the normal, or “right,” relationships between them. (Oracle: The Complete Reference)

3 Dale Roberts Sample Table
StudentName | AdvisorName | CourseID1 | CourseDescription1 | CourseInstructorName1 | CourseID2 | CourseDescription2 | CourseInstructorName2
Al Gore | Bill Clinton | VB1 | Intro to Visual Basic | Bruce McKinney | DAO1 | Intro to DAO Programming | Joe Garrick
Dan Quayle | George Bush | DAO1 | Intro to DAO Programming | Joe Garrick | VBSQL1 | Client/Server Programming with VBSQL | William Vaughn
George Bush | Ronald Reagan | API1 | API Programming with VB | Dan Appleman | OOP1 | Object Oriented Programming in VB | Deborah Kurata
Walter Mondale | Jimmy Carter | VB1 | Intro to Visual Basic | Bruce McKinney | API1 | API Programming with VB | Dan Appleman

4 Dale Roberts Sample Table Problems
Problems, problems...
Repeating Groups: The course ID, description, and instructor are repeated for each class. If a student needs a third class, you need to go back and modify the table design in order to record it. While you could add CourseID3, CourseID4, CourseID5, etc., along with the associated description and instructor fields, no matter how far you take it there may one day be someone who wants one more class. Additionally, adding all those fields when most students would never use them is a waste of storage.
Update Anomalies: Let's say that after entering these rows, you discover that Bruce McKinney's course is actually titled "Intro to Advanced Visual Basic". In order to reflect this change, you would need to examine all the rows and change each individually. This introduces the potential for errors if one of the changes is omitted or done incorrectly.
Delete Anomalies: If you no longer wished to track Joe Garrick's Intro to DAO class, you would need to delete two students, two advisors, and one additional instructor in order to do it. If you remove the first two rows of the table, all of that data is deleted along with the reference to the course.
Insert Anomalies: Perhaps the department head wishes to add a new class - let's call it "Advanced DAO Programming" - but hasn't yet set up a schedule or even an instructor. What would you enter for the student, advisor, and instructor names?

5 Dale Roberts Forms of Normalization
First Normal Form (1NF): no repeating groups.
Second Normal Form (2NF): 1NF + no nonkey attributes depend on a portion of the primary key.
Third Normal Form (3NF): 2NF + no attributes depend on other nonkey attributes.
“Each column depends on the key, the whole key, and nothing but the key, so help me Codd.”

6 Dale Roberts 6 First Normal Form (1NF)
– Each column contains values about the same attribute, and each table cell value must be a single value.
– Each column has a distinct name; the order of columns is immaterial.
– Each row is distinct; rows cannot be duplicated for the same key.
– The sequence of rows is immaterial.
Second Normal Form (2NF)
– All non-key attributes must be fully dependent on the whole key.
Third Normal Form (3NF)
– Each nonkey attribute should be dependent only on the relation’s key, not on any other nonkey attribute.
Database Management: Normalization
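One way the sample table might be split to reach 3NF is sketched below with Python's built-in sqlite3 module. The table names (students, courses, enrollments), the choice of student_name as a key, and the course ID DAO2 are assumptions made for illustration; the slides do not prescribe a particular schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (
        student_name TEXT PRIMARY KEY,   -- assumes names are unique
        advisor_name TEXT NOT NULL
    );
    CREATE TABLE courses (
        course_id   TEXT PRIMARY KEY,
        description TEXT NOT NULL,
        instructor  TEXT                 -- NULL until someone is assigned
    );
    CREATE TABLE enrollments (           -- one row per (student, course)
        student_name TEXT REFERENCES students(student_name),
        course_id    TEXT REFERENCES courses(course_id),
        PRIMARY KEY (student_name, course_id)
    );
""")
conn.executemany("INSERT INTO students VALUES (?, ?)", [
    ("Al Gore", "Bill Clinton"),
    ("Dan Quayle", "George Bush"),
    ("George Bush", "Ronald Reagan"),
    ("Walter Mondale", "Jimmy Carter"),
])
conn.executemany("INSERT INTO courses VALUES (?, ?, ?)", [
    ("VB1", "Intro to Visual Basic", "Bruce McKinney"),
    ("DAO1", "Intro to DAO Programming", "Joe Garrick"),
    ("VBSQL1", "Client/Server Programming with VBSQL", "William Vaughn"),
    ("API1", "API Programming with VB", "Dan Appleman"),
    ("OOP1", "Object Oriented Programming in VB", "Deborah Kurata"),
])
conn.executemany("INSERT INTO enrollments VALUES (?, ?)", [
    ("Al Gore", "VB1"), ("Al Gore", "DAO1"),
    ("Dan Quayle", "DAO1"), ("Dan Quayle", "VBSQL1"),
    ("George Bush", "API1"), ("George Bush", "OOP1"),
    ("Walter Mondale", "VB1"), ("Walter Mondale", "API1"),
])

# Update anomaly: the course title now lives in exactly one row.
conn.execute("UPDATE courses SET description = ? WHERE course_id = ?",
             ("Intro to Advanced Visual Basic", "VB1"))

# Insert anomaly: a course can exist with no students and no instructor.
conn.execute("INSERT INTO courses VALUES (?, ?, NULL)",
             ("DAO2", "Advanced DAO Programming"))

# Delete anomaly: dropping DAO1 removes the course and its enrollments only.
conn.execute("DELETE FROM enrollments WHERE course_id = ?", ("DAO1",))
conn.execute("DELETE FROM courses WHERE course_id = ?", ("DAO1",))

student_count = conn.execute("SELECT COUNT(*) FROM students").fetchone()[0]
course_count = conn.execute("SELECT COUNT(*) FROM courses").fetchone()[0]
vb1_title = conn.execute(
    "SELECT description FROM courses WHERE course_id = 'VB1'").fetchone()[0]
```

Each of the three anomalies from the sample table becomes a single-table operation here: students and advisors survive the deletion of a course, and a third or fourth class per student is just another row in enrollments.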

7 Dale Roberts 7 How to reduce the confusion… Normalization 1NF: 2NF: 3NF: English names for tables and columns English code names The Dangers in a Relational Database

8 Dale Roberts 8 The Dangers in a Relational Database

9 Dale Roberts 9 Purpose of E/R Model The E/R model allows us to sketch database schema designs. Includes some constraints, but not operations. Designs are pictures called entity-relationship diagrams. Later: convert E/R designs to relational DB designs.

10 Dale Roberts 10 Framework for E/R Design is a serious business. The “boss” knows they want a database, but they don’t know what they want in it. Sketching the key components is an efficient way to develop a working database.

11 Dale Roberts 11 Entity Sets Entity = “thing” or object. Entity set = collection of similar entities. Similar to a class in object-oriented languages. Attribute = property of (the entities of) an entity set. Attributes are simple values, e.g. integers or character strings, not structs, sets, etc.

12 Dale Roberts 12 E/R Diagrams In an entity-relationship diagram: Entity set = rectangle. Attribute = oval, with a line to the rectangle representing its entity set. Warning! Graphical representations are inconsistent. Attributes using ovals shall be changed to more of a “Class Diagram” style later, to save space. The later style insists on singular nouns and uses lines for relationships.

13 Dale Roberts 13 Example: Entity
Entity set Beers has two attributes, name and manf (manufacturer). Each Beers entity has values for these two attributes, e.g. (Bud, Anheuser-Busch). Underlined attributes form the primary key.
Diagram: Beers with attributes name and manf.

14 Dale Roberts 14 Relationships This relationship has an attribute associated with it. A relationship connects two or more entity sets. It is represented by a diamond, with lines to each of the entity sets involved.

15 Dale Roberts 15 Example: Relationships
Entity sets: Drinkers (name, addr); Beers (name, manf); Bars (name, addr, license).
Note: license = beer, full, none.
Sells: Bars sell some beers.
Likes: Drinkers like some beers.
Frequents: Drinkers frequent some bars.

16 Dale Roberts 16 Relationship Set The current “value” of an entity set is the set of entities that belong to it. Example: the set of all bars in our database. The “value” of a relationship is a relationship set, a set of tuples with one component for each related entity set.

17 Dale Roberts 17 Example: Relationship Set
For the relationship Sells, we might have a relationship set like:
Bar | Beer
Joe’s Bar | Bud
Joe’s Bar | Miller
Sue’s Bar | Bud
Sue’s Bar | Pete’s Ale
Sue’s Bar | Bud Lite
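As an informal illustration, the Sells relationship set can be modeled directly as a Python set of (bar, beer) tuples. The helper name beers_sold_by is an assumption introduced here, not part of the slides.

```python
# Entity sets: the current "value" is simply the set of entities.
bars = {"Joe's Bar", "Sue's Bar"}
beers = {"Bud", "Miller", "Pete's Ale", "Bud Lite"}

# Relationship set for Sells: one (bar, beer) tuple per connection.
sells = {
    ("Joe's Bar", "Bud"),
    ("Joe's Bar", "Miller"),
    ("Sue's Bar", "Bud"),
    ("Sue's Bar", "Pete's Ale"),
    ("Sue's Bar", "Bud Lite"),
}


def beers_sold_by(bar):
    """Return the set of beers related to the given bar by Sells."""
    return {beer for (b, beer) in sells if b == bar}


# Every component of every tuple must come from the matching entity set.
components_valid = all(bar in bars and beer in beers for bar, beer in sells)
```

Because the relationship set is a set, the order of the tuples carries no meaning and a duplicate tuple cannot occur.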

18 Dale Roberts 18 Multiway Relationships Sometimes, we need a relationship that connects more than two entity sets. Suppose that drinkers will only drink certain beers at certain bars. Our three binary relationships Likes, Sells, and Frequents do not allow us to make this distinction. But a 3-way relationship would.

19 Dale Roberts 19 Example: 3-Way Relationship
Diagram: Bars (name, addr, license), Beers (name, manf), and Drinkers (name, addr) connected by the 3-way relationship Preferences.

20 Dale Roberts Example: 3-Way Relationship 11/26/2015 20

21 Dale Roberts 21 A Typical Relationship Set
Bar | Drinker | Beer
Joe’s Bar | Ann | Miller
Sue’s Bar | Ann | Bud
Sue’s Bar | Ann | Pete’s Ale
Joe’s Bar | Bob | Bud
Joe’s Bar | Bob | Miller
Joe’s Bar | Cal | Miller
Sue’s Bar | Cal | Bud Lite
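The sketch below uses the relationship set above to show why three binary relationships cannot stand in for the 3-way one. It projects the triples onto Frequents, Likes, and Sells, then recombines them; the variable names are illustrative assumptions.

```python
# The 3-way relationship set, as (bar, drinker, beer) tuples.
preferences = {
    ("Joe's Bar", "Ann", "Miller"),
    ("Sue's Bar", "Ann", "Bud"),
    ("Sue's Bar", "Ann", "Pete's Ale"),
    ("Joe's Bar", "Bob", "Bud"),
    ("Joe's Bar", "Bob", "Miller"),
    ("Joe's Bar", "Cal", "Miller"),
    ("Sue's Bar", "Cal", "Bud Lite"),
}

# Project the triples onto the three binary relationships.
frequents = {(drinker, bar) for bar, drinker, beer in preferences}
likes = {(drinker, beer) for bar, drinker, beer in preferences}
sells = {(bar, beer) for bar, drinker, beer in preferences}

# Recombine: every triple that is consistent with all three binary sets.
rebuilt = {
    (bar, drinker, beer)
    for (drinker, bar) in frequents
    for (other, beer) in likes
    if other == drinker and (bar, beer) in sells
}

# Triples the binary relationships imply but the original data never stated.
spurious = rebuilt - preferences
```

Here spurious contains ("Joe's Bar", "Ann", "Bud"): Ann frequents Joe's Bar, likes Bud, and Joe's Bar sells Bud, yet the 3-way set never says Ann drinks Bud at Joe's Bar.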

22 Dale Roberts 22 Many-Many Relationships Focus: binary relationships, such as Sells between Bars and Beers. In a many-many relationship, an entity of either set can be connected to many entities of the other set. E.g., a bar sells many beers; a beer is sold by many bars.

23 Dale Roberts 23 In Pictures: many-many

24 Dale Roberts 24 Many-One Relationships Some binary relationships are many-one from one entity set to another. Each entity of the first set is connected to at most one entity of the second set. But an entity of the second set can be connected to zero, one, or many entities of the first set.

25 Dale Roberts 25 In Pictures: many-one

26 Dale Roberts 26 Example: Many-One Relationship Favorite, from Drinkers to Beers is many-one. A drinker has at most one favorite beer. But a beer can be the favorite of any number of drinkers, including zero.

27 Dale Roberts 27 One-One Relationships In a one-one relationship, each entity of either entity set is related to at most one entity of the other set. Example: Relationship Best-seller between entity sets Manfs (manufacturer) and Beers. A beer cannot be made by more than one manufacturer, and no manufacturer can have more than one best-seller (assume no ties).

28 Dale Roberts 28 In Pictures: one-one

29 Dale Roberts 29 Representing “Multiplicity” Show a many-one relationship by an arrow entering the “one” side. Remember: Like a functional dependency. Show a one-one relationship by arrows entering both entity sets. Rounded arrow = “exactly one,” i.e., each entity of the first set is related to exactly one entity of the target set.
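The multiplicity definitions can be checked mechanically on a relationship set of pairs. The following is a minimal sketch; the function names and the sample drinkers, beers, and manufacturers are hypothetical values chosen for illustration.

```python
def is_many_one(pairs):
    """True if each first component is paired with at most one second component."""
    seen = {}
    for left, right in pairs:
        if seen.setdefault(left, right) != right:
            return False
    return True


def is_one_one(pairs):
    """True if the relationship is many-one in both directions."""
    pairs = list(pairs)
    return is_many_one(pairs) and is_many_one((r, l) for l, r in pairs)


# Hypothetical relationship sets.
favorite = {("Ann", "Bud"), ("Bob", "Bud"), ("Cal", "Miller")}
likes = {("Ann", "Bud"), ("Ann", "Miller"), ("Bob", "Bud")}
best_seller = {("Anheuser-Busch", "Bud"), ("Pete's", "Pete's Ale")}
```

Favorite passes the many-one test from Drinkers to Beers but not the one-one test, because Bud is the favorite of two drinkers. The check only inspects the current data; the multiplicity itself is a design constraint that must hold for every possible relationship set.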

30 Dale Roberts 30 Example: Many-One Relationship
Diagram: Drinkers and Beers connected by two relationships, Likes and Favorite.
Notice: two relationships connect the same entity sets, but are different.

31 Dale Roberts 31 Example: One-One Relationship Consider Best-seller between Manfs and Beers. Some beers are not the best-seller of any manufacturer, so a rounded arrow to Manfs would be inappropriate. But a beer manufacturer has to have a best-seller.

32 Dale Roberts 32 In the E/R Diagram
Diagram: Manfs and Beers connected by Best-seller.
A manufacturer has exactly one best-seller. A beer is the best-seller for 0 or 1 manufacturer.

33 Dale Roberts 33 Attributes on Relationships Sometimes it is useful to attach an attribute to a relationship. Think of this attribute as a property of tuples in the relationship set.

34 Dale Roberts 34 Example: Attribute on Relationship
Diagram: Bars and Beers connected by Sells, which carries the attribute price.
Price is a function of both the bar and the beer, not of one alone.
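A small sketch of why price belongs to the relationship: keyed by the (bar, beer) pair it is well defined, while keyed by the beer alone it is not. The prices below are made-up values.

```python
# price is attached to each tuple of the Sells relationship set.
price = {
    ("Joe's Bar", "Bud"): 2.50,
    ("Joe's Bar", "Miller"): 2.75,
    ("Sue's Bar", "Bud"): 3.00,
}

# The same beer can cost different amounts at different bars, so price
# cannot be stored as an attribute of Beers alone.
bud_prices = {p for (bar, beer), p in price.items() if beer == "Bud"}

# Likewise, one bar charges different amounts for different beers, so
# price cannot be an attribute of Bars alone.
joes_prices = {p for (bar, beer), p in price.items() if bar == "Joe's Bar"}
```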

35 Dale Roberts 35 Equivalent Diagrams Without Attributes on Relationships Create an entity set representing values of the attribute. Make that entity set participate in the relationship.

36 Dale Roberts 36 Example: Removing an Attribute from a Relationship
Diagram: Bars, Beers, and Prices (attribute price) connected by Sells.
Note convention: arrow from multiway relationship = “all other entity sets together determine a unique one of these.”

37 Dale Roberts 37 Roles Sometimes an entity set appears more than once in a relationship. Label the edges between the relationship and the entity set with names called roles.

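38 Dale Roberts 38 Example: Roles
Married: Drinkers appears twice, with roles husband and wife.
Relationship set (Husband, Wife): (Bob, Ann), (Joe, Sue), …
Buddies: Drinkers appears twice, with roles 1 and 2.
Relationship set (Buddy1, Buddy2): (Bob, Ann), (Joe, Sue), (Ann, Bob), (Joe, Moe), …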

39 Dale Roberts 39 Example: Roles

40 Dale Roberts 40 Subclasses Subclass = special case = fewer entities = more properties. Example: Ales are a kind of beer. Not every beer is an ale, but some are. Let us suppose that in addition to all the properties (attributes and relationships) of beers, ales also have the attribute color.

41 Dale Roberts 41 Subclasses in E/R Diagrams Assume subclasses form a tree. I.e., no multiple inheritance. Isa triangles indicate the subclass relationship. Point to the superclass.

42 Dale Roberts 42 Example: Subclasses
Diagram: Ales isa Beers. Beers has attributes name and manf; Ales adds the attribute color.

43 Dale Roberts 43 E/R Vs. Object-Oriented Subclasses In OO, objects are in one class only. Subclasses inherit from superclasses. In contrast, E/R entities have representatives in all subclasses to which they belong. Rule: if entity e is represented in a subclass, then e is represented in the superclass (and recursively up the tree).
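The E/R view of subclasses can be sketched with one collection per entity set, where an entity has a representative in every set it belongs to. The manufacturer and color values below are illustrative assumptions.

```python
# One mapping per entity set, keyed by the root key (name).
beers = {"Bud": {"manf": "Anheuser-Busch"}, "Pete's Ale": {"manf": "Pete's"}}
ales = {"Pete's Ale": {"color": "dark"}}  # only the extra subclass attribute


def representatives(name):
    """List the entity sets in which the named entity is represented."""
    found = []
    if name in beers:
        found.append("Beers")
    if name in ales:
        found.append("Ales")
    return found


def hierarchy_is_consistent():
    """Rule: anything represented in a subclass is also in the superclass."""
    return set(ales) <= set(beers)


def all_properties(name):
    """Gather the attributes of an entity from all of its representatives."""
    merged = dict(beers.get(name, {}))
    merged.update(ales.get(name, {}))
    return merged
```

In an object-oriented language Pete's Ale would be a single object of class Ales. Here it is two representatives that share a key, which is why the key must come from the root entity set.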

44 Dale Roberts 44 Example: Representatives of Entities
Diagram: Ales isa Beers (Beers: name, manf; Ales: color). Pete’s Ale has a representative in both Beers and Ales.

45 Dale Roberts 45 Keys A key is a set of attributes for one entity set such that no two entities in this set agree on all the attributes of the key. It is allowed for two entities to agree on some, but not all, of the key attributes. We must designate a key for every entity set.

46 Dale Roberts 46 Keys in E/R Diagrams Underline the key attribute(s). In an Isa hierarchy, only the root entity set has a key, and it must serve as the key for all entities in the hierarchy.

47 Dale Roberts 47 Example: name is Key for Beers
Diagram: Ales isa Beers (Beers: name, manf; Ales: color), with name underlined as the key.

48 Dale Roberts 48 Example: a Multi-attribute Key
Diagram: Courses with attributes dept, number, hours, and room; dept and number together form the key.
Note that hours and room could also serve as a key, but we must select only one key.
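The definition of a key can be tested against sample data with a short helper. This is only a sketch: is_key and the course rows are hypothetical, and passing the test on one data set shows only that the key is not violated there, not that it is a valid design choice.

```python
def is_key(entities, attrs):
    """True if no two entities agree on all of the given attributes."""
    seen = set()
    for entity in entities:
        value = tuple(entity[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


# Hypothetical Courses entities.
courses = [
    {"dept": "CS", "number": 101, "hours": "MW 9:00", "room": "SL 210"},
    {"dept": "CS", "number": 102, "hours": "MW 9:00", "room": "SL 251"},
    {"dept": "EE", "number": 101, "hours": "TR 1:00", "room": "SL 210"},
]
```

Two courses may share a dept or share a number, which is allowed, as long as no two agree on both.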

49 Dale Roberts 49 Weak Entity Sets Occasionally, entities of an entity set need “help” to identify them uniquely. Entity set E is said to be weak if in order to identify entities of E uniquely, we need to follow one or more many-one relationships from E and include the key of the related entities from the connected entity sets.

50 Dale Roberts 50 Example: Weak Entity Set name is almost a key for football players, but there might be two with the same name. number is certainly not a key, since players on two teams could have the same number. But number, together with the team name related to the player by Plays-on should be unique.

51 Dale Roberts 51 Weak Entity
Diagram: Players and Teams connected by Plays-on; the attributes shown are name and number.
Double diamond for supporting many-one relationship. Double rectangle for the weak entity set. Note: the arrow must be rounded because each player needs a team to help with the key.

52 Dale Roberts Weak Entity 11/26/2015 52

53 Dale Roberts 53 Weak Entity-Set Rules A weak entity set has one or more many-one relationships to other (supporting) entity sets. Not every many-one relationship from a weak entity set need be supporting. But supporting relationships must have a rounded arrow (entity at the “one” end is guaranteed). The key for a weak entity set is its own underlined attributes and the keys for the supporting entity sets. E.g., (player) number and (team) name is a key for Players in the previous example.

54 Dale Roberts 54 Design Techniques 1. Avoid redundancy. 2. Limit the use of weak entity sets. 3. Don’t use an entity set when an attribute will do.

55 Dale Roberts 55 Avoiding Redundancy Redundancy = saying the same thing in two (or more) different ways. Wastes space and (more importantly) encourages inconsistency. Two representations of the same fact become inconsistent if we change one and forget to change the other. Recall anomalies due to FD’s.

56 Dale Roberts 56 Example: Good
Diagram: Beers (name) connected by ManfBy to Manfs (name, addr).
This design gives the address of each manufacturer exactly once.

57 Dale Roberts 57 Example: Bad
Diagram: Beers (name, manf) connected by ManfBy to Manfs (name, addr).
This design states the manufacturer of a beer twice: as an attribute and as a related entity.

58 Dale Roberts 58 Example: Bad
Diagram: Beers (name, manf, manfAddr).
This design repeats the manufacturer’s address once for each beer and loses the address if there are temporarily no beers for a manufacturer.

59 Dale Roberts 59 Entity Sets Versus Attributes An entity set should satisfy at least one of the following conditions: It is more than the name of something; it has at least one nonkey attribute. or It is the “many” in a many-one or many-many relationship.

60 Dale Roberts 60 Example: Good
Diagram: Beers (name) connected by ManfBy to Manfs (name, addr).
Manfs deserves to be an entity set because of the nonkey attribute addr. Beers deserves to be an entity set because it is the “many” of the many-one relationship ManfBy.

61 Dale Roberts 61 Example: Good
Diagram: Beers (name, manf).
There is no need to make the manufacturer an entity set, because we record nothing about manufacturers besides their name.

62 Dale Roberts 62 Example: Bad
Diagram: Beers (name) connected by ManfBy to Manfs (name).
Since the manufacturer is nothing but a name, and is not at the “many” end of any relationship, it should not be an entity set.

63 Dale Roberts 63 Don’t Overuse Weak Entity Sets Beginning database designers often doubt that anything could be a key by itself. They make all entity sets weak, supported by all other entity sets to which they are linked. In reality, we usually create unique ID’s for entity sets. Examples include social-security numbers, automobile VIN’s etc.

64 Dale Roberts 64 When Do We Need Weak Entity Sets? The usual reason is that there is no global authority capable of creating unique ID’s. Example: it is unlikely that there could be an agreement to assign unique player numbers across all football teams in the world.

65 Dale Roberts 65 From E/R Diagrams to Relations Entity set -> relation. Attributes -> attributes. Relationships -> relations whose attributes are only: The keys of the connected entity sets. Attributes of the relationship itself.

66 Dale Roberts 66 Entity Set -> Relation
Diagram: Beers with attributes name and manf.
Relation: Beers(name, manf)

67 Dale Roberts 67 Relationship -> Relation
Diagram: Drinkers (name, addr) and Beers (name, manf).
Likes -> Likes(drinker, beer)
Favorite -> Favorite(drinker, beer)
Married (roles husband, wife) -> Married(husband, wife)
Buddies (roles 1, 2) -> Buddies(name1, name2)
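The conversion rules might look as follows in SQL, run here through Python's built-in sqlite3 module. The column types, primary-key choices, and sample rows (addresses, the Miller manufacturer name) are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Entity sets become relations with the same attributes.
    CREATE TABLE Beers    (name TEXT PRIMARY KEY, manf TEXT);
    CREATE TABLE Drinkers (name TEXT PRIMARY KEY, addr TEXT);

    -- A relationship becomes a relation holding the keys of the
    -- entity sets it connects (plus its own attributes, if any).
    CREATE TABLE Likes (
        drinker TEXT REFERENCES Drinkers(name),
        beer    TEXT REFERENCES Beers(name),
        PRIMARY KEY (drinker, beer)
    );

    -- Roles give the two Drinkers keys distinct attribute names.
    CREATE TABLE Married (
        husband TEXT REFERENCES Drinkers(name),
        wife    TEXT REFERENCES Drinkers(name),
        PRIMARY KEY (husband, wife)
    );
""")
conn.executemany("INSERT INTO Beers VALUES (?, ?)",
                 [("Bud", "Anheuser-Busch"), ("Miller", "Miller Brewing")])
conn.executemany("INSERT INTO Drinkers VALUES (?, ?)",
                 [("Ann", "1 Oak St"), ("Bob", "1 Oak St")])
conn.executemany("INSERT INTO Likes VALUES (?, ?)",
                 [("Ann", "Bud"), ("Ann", "Miller"), ("Bob", "Bud")])
conn.execute("INSERT INTO Married VALUES (?, ?)", ("Bob", "Ann"))

# Follow the Likes relation from a drinker to the beers' attributes.
ann_beers = conn.execute("""
    SELECT Beers.name, Beers.manf
    FROM Likes JOIN Beers ON Likes.beer = Beers.name
    WHERE Likes.drinker = ?
    ORDER BY Beers.name
""", ("Ann",)).fetchall()
```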

68 Dale Roberts 68 Combining Relations OK to combine into one relation: 1. The relation for an entity-set E 2. The relations for many-one relationships of which E is the “many.” Example: Drinkers(name, addr) and Favorite(drinker, beer) combine to make Drinker1(name, addr, favBeer).

69 Dale Roberts 69 Risk with Many-Many Relationships
Combining Drinkers with Likes would be a mistake. It leads to redundancy, as:
name | addr | beer
Sally | 123 Maple | Bud
Sally | 123 Maple | Miller
Redundancy: Sally’s address is repeated once for each beer she likes.
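The difference between the safe and the risky combination can be seen on a tiny data set. In this sketch Sally's favorite beer is a made-up value, and the variable names are illustrative.

```python
drinkers = {"Sally": "123 Maple"}                  # Drinkers(name, addr)
favorite = {"Sally": "Bud"}                        # many-one: one beer per drinker
likes = {("Sally", "Bud"), ("Sally", "Miller")}    # many-many

# Safe: merging a many-one relationship keeps one row per drinker.
drinker1 = [(name, addr, favorite.get(name))
            for name, addr in sorted(drinkers.items())]

# Risky: merging a many-many relationship repeats addr for every liked beer.
drinkers_with_likes = sorted((name, drinkers[name], beer)
                             for name, beer in likes)

# How many times Sally's address ends up stored in each design.
addr_copies_good = sum(1 for row in drinker1 if row[0] == "Sally")
addr_copies_bad = sum(1 for row in drinkers_with_likes if row[0] == "Sally")
```

With the many-many merge, changing Sally's address means updating every one of her rows, which is the update anomaly seen earlier in the sample table.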

70 Dale Roberts 70 Handling Weak Entity Sets Relation for a weak entity set must include attributes for its complete key (including those belonging to other entity sets), as well as its own, nonkey attributes. A supporting relationship is redundant and yields no relation (unless it has attributes).
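For the football example, a relation for the weak entity set Players might be declared as below, using Python's built-in sqlite3 module. The column names, team names, and player names are hypothetical. No separate Plays-on table is created, since the supporting relationship is already captured by teamName.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
    CREATE TABLE Teams (name TEXT PRIMARY KEY);

    -- Complete key = own attribute (number) + key of the supporting set.
    CREATE TABLE Players (
        number   INTEGER NOT NULL,
        teamName TEXT    NOT NULL REFERENCES Teams(name),
        name     TEXT,                       -- nonkey attribute
        PRIMARY KEY (number, teamName)
    );
""")
conn.executemany("INSERT INTO Teams VALUES (?)", [("Colts",), ("Bears",)])


def add_player(number, team, name):
    """Insert a player; return False if a key or team constraint rejects it."""
    try:
        conn.execute("INSERT INTO Players VALUES (?, ?, ?)",
                     (number, team, name))
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    add_player(12, "Colts", "Player A"),   # accepted
    add_player(12, "Bears", "Player B"),   # same number, different team: accepted
    add_player(12, "Colts", "Player C"),   # duplicate (number, team): rejected
    add_player(7, "Jets", "Player D"),     # no such team: rejected
]
```

The NOT NULL foreign key plays the part of the rounded arrow: every player must have exactly one existing team.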

71 Dale Roberts Enterprise E-R Diagrams Placing attributes in ovals and drawing every relationship as a diamond is okay in small models or in an academic exercise. Enterprise E-R diagrams contain hundreds of entities and thousands of attributes, so a more compact representation is necessary. Rectangles are entities, lines are relationships, and associative entities (used to resolve many-many relationships) are diamonds. Cardinality is shown on relationship lines (0, 1, many = “crow’s feet”).

72 Dale Roberts TicketMaster Example 72

73 Dale Roberts 73 Acknowledgements
McFadden and Hoffer. Database Management.
Loney, Kevin. Oracle Database 10g: The Complete Reference.
Ullman, Jeff. Database Systems: The Complete Book.

