Presentation is loading. Please wait.

Presentation is loading. Please wait.

Databases 67506 Winter 2011.

Similar presentations


Presentation on theme: "Databases 67506 Winter 2011."— Presentation transcript:

1 Databases 67506 Winter 2011

2 Administration

3 Administration Lecturers: TA: same as above
Dr. Sara Cohen Prof. Yehoshua Sagiv TA: same as above Course homepage:

4 Exercises Approximately 10 assignments 25% of the final grade
Exercises are done individually Theoretical and practical assignments, possibly student interviews (or similar activities) Each exercise has a bodek in charge. Any, and all questions about the exercise should be sent to this bodek ALL EXERCISES ARE MANDATORY EXERCISES CANNOT BE GIVEN IN LATE

5 Gradience System This year several of our exercises will be submitted through the Gradience System. You will now be given the secret token for registration. Write this down. It does not appear online. Registration instructions will be available online

6 Using Gradience Gradience is a system for automatic grading of exercises The question are mostly multiple choice You should complete all questions and then submit (there is no partial submission) You will then get feedback on the number of questions you got right/wrong Typically, you will have 3 chances to submit Each time you redo the exercise, you will have the same questions, but new answers.

7 Books H. Garcia-Molina, J. Ullman, J. Widom, Database Systems: The Complete Book R. Ramakrishnan and J. Gehrke, Database Management Systems, 3rd edition, 2002, McGraw Hill. A lot of online supporting material is available at J. Ullman, Principles of Database & Knowledge-Base Systems Vol. 1, Computer Science Press

8 Topics Modeling: ER Diagrams (1 week, Cohen)
Querying: Relational Algebra (1 week, Sagiv) Querying: SQL (2 weeks, Cohen) Programming: JDBC, PLSQL (1 week, Cohen) Design Theory (3 weeks, Cohen) Query Evaluation and Optimization (3 weeks, Sagiv) Concurrency Control (2 weeks, Sagiv) Recovery (1 week, Sagiv)

9 Teaching The lectures and tirgulim will be given by myself and Prof. Sagiv WARNING: If you miss class, you should make up the material before the tirgul

10 Modeling: Entity-Relationship Diagrams

11 Scenario wants to store information about movies and has chosen you to help them Four steps: Requirements Analysis: Discover what information needs to be stored, how the stored information will be used, etc. Taught in course on system analysis and design Conceptual Database Design: High level description of data to be stored (ER model) Logical Database Design: Translation of ER diagram to a relational database schema (description of tables) Physical Database Design: Done by the DB system

12 Requirements For actors and directors, we want to store their name, a unique identification number, address and birthday (why not age?) For actors, we also want to store a photograph For films, we want to store the title, year of production and type (thriller, comedy, etc.) We want to know who directed and who acted in each film. Every film has one director. We store the salary of each actor for each film Etc…

13 ER-Diagrams: General Information
ER-diagrams are a formalism to model real-world scenarios There are many versions of ER-diagrams that differ both in their appearance and in their meaning We will use the version appearing in the book Database Systems: The Complete Book ER-diagrams have a formal semantics (meaning) that must be thoroughly understood, in order to create correct diagrams Goal of modeling is to translate informal requirements to a precise diagram. This diagram can then be translated into to the desired data model, to allow data to be stored in a database

14 Basic Concepts: Entities, Attributes, Relationships

15 Entities, Entity Sets Entity (ישות): An object in the world that can be distinguished from other objects Examples of entities: Examples of things that are not entities: Entity set (קבוצת ישויות): A set of similar entities Examples of entity sets:  Entity sets are drawn as rectangles Examples of entities: Bill Gates, Database course in HUJI Examples of entity sets: People, Courses Actor

16 Attributes Attributes (תכונות): Used to describe entities
All entities in the set have the same attributes A minimal set of attributes that uniquely identify an entity is called a key An attribute contains a single piece of information (and not a list of data)

17 Attributes (2) Examples of attributes:
Examples of things that cannot be attributes:  Attributes are drawn using ovals  The names of the attributes which make up a key are underlined Attribute example: age, birthday, id number, name Not attribute: spouse (may be missing), children (can be multiple)

18 Example birthday id Actor name address

19 Another Option for a Key?
birthday id Actor name address

20 Another Option for a Key?
birthday id Actor name address

21 Relationships, Relationship Sets
Relationship (קשר): Association among two or more entities Examples of Relationships: Relationship Set (קבוצת קשרים): Set of similar relationships Examples of Relationship sets:  Relationship sets are drawn using diamonds Relationship example: Bill Gates is married to Melinda French, Shimon Levy is taking the HUJI DB course Relationship set example: Marriage, Studying

22 Example id Actor Film Produced title name

23 Recursive Relationships
An entity set can participate more than once in a relationship In this case, we add a description of the role to the ER-diagram phone number manager id Employee Manages worker name address

24 n-ary Relationship An n-ary relationship set R involves exactly n entity sets: E1, …, En. Each relationship in R involves exactly n entities: e1 in E1, …, en in En Formally, R E1x …x En Actor id name Produced Film title Director

25 Example Suppose that there are: 2 Actors 3 Directors 4 Film
How many triples can be in the relationship set “Produced”? How many pairs can be in the relationship set “Produced”? Actor id name Produced Film title Director No pairs! Produced contains triples At most 2*3*4 = 24 triples

26 Multiplicity of Relationships

27 Multiplicity of Relationships
A member of E may be connected by R to any number of members from F, and vice versa This is called a many-many relationship

28 Many-to-Many A film is directed by any number of directors
A director can direct any number of films id Director Film Directed title name Director Directed Film

29 Multiplicity of Relationships
By adding arrows to the diagram, we can indicate constraints on the relationship An arrow towards F indicates that: A member of E may be connected by R to at most one members from F (Still, a member of F may be connected by R to any number of members from E) This is called a many-one relationship

30 One-to-Many A film is directed by at most one director
A director can direct any number of films id Director Film Directed title name Director Directed Film

31 Multiplicity of Relationships
An arrow towards F and towards E indicates that: A member of E may be connected by R to at most one member from F A member of F may be connected by R to at most one member from E This is called a one-one relationship

32 One-to-One A film is directed by at most one director
A director can direct at most one film id Director Film Directed title name Director Directed Film

33 Another Example Where would you put the arrow? age father id Person
FatherOf child On the top line name

34 Multiplicities in Multiway Relationships
Man id name id Child Woman childOf id name name Each man and woman can have a single child together. (However, a man/woman can have many children, each with a different woman/man) Note that for many reasons, this is a bad modeling What does this mean?

35 Multiplicities in Multiway Relationships
Man id name id Child Woman childOf id name name Each man and woman can have a single child together. (However, a man/woman can have many children, each with a different woman/man) Note that for many reasons, this is a bad modeling Each pair a man and a woman can have at most one child

36 More Generally F1 E1 … R … En Fm For any 1<=i<=m
For any tuple of entities e1,…,en,f1,…fi-1,fi+1,...,fm there is at most one fi, such that e1,…,en,f1,…,fm are connected by R

37 Example C A R B D What is the maximum number of 4-tuples in R?
Suppose that there are a entities in A b entities in B c entities in C d entities in D Maximum number min(abc,abd) Minimum number = 0 What is the minimum number of 4-tuples in R?

38 Referential Integrity and Degree Constraints

39 Referential Integrity
So far, we can say that an entity participates at most one time, but cannot require it to participate at least one time The rounded arrow above indicates that each entity in E must participate exactly one time in an R-relationship with an entity in F

40 Degree Constraints StarsIn Actor Movie
<=10 Actor Movie We can attach a bounding number to edges to indicate limits on the number of entities that can be connected to a single entity via a relationship set In the example above, a move has at most 10 stars Note: a regular arrow is the constraint <=1 Note: a rounded arrow is the constraint =1

41 Relationship Sets with Attributes

42 Relationship Sets Can have Attributes
title birthday id Actor Film Acted In year name type address It is an attribute of the relationship. If we attached salary to actor, this would imply that actors have a set salary, regardless of the movie that they are playing in. If we attached salary to film, this would imply that all actors in a film have the same salary. salary Where does the salary attribute belong?

43 Are Relationship Attributes Needed?
An alternative modeling for a relationship attribute is to use an additional entity Actor Film Acted In salary Actor Film Acted In Salary value

44 Important Note The entities in a relationship set must identify the relationship Attributes of the relationship set cannot be used for identification! Suppose we wanted to store the role of an actor in a film. How should we store the role of the actor? How would we store information about a person who acted in one film in several roles? If the actor can have several roles, then we must use a third entity set! id Actor Film Acted In title name

45 Subclasses

46 ISA Hierarchies ISA Relationships: Defines a hierarchy between entity sets ISA is similar to inheritance  ISA relationships are drawn as a triangle with the word ISA inside it. The "super entity-set" is above the triangle and the "sub entity-sets" are below

47 Example What are the keys of: Movie Person Actor Director address id
birthday Movie Person name ISA The key of all of these is id picture Actor Director

48 Implications of an ISA Relationship
Every entity in B or in C belongs to A There may be entities in A that do not belong to B or to C There may be entities that belong to both B and C A isa B C

49 Example Is this good method of modeling data
parent woman ParentOf child Person man Married name id No! Same-sex marriages are allowed (outlawed by the rabanut, though). A person can have any number of parents. To fix: Create an entity set person, and use isa to define woman and man. Then, have married as a relationship between woman and man. Have a mother-of and father-of relationship from woman and man to person. Is this good method of modeling data for the רבנות database on marriage? How can you fix it?

50 Weak Entity Sets

51 Intuition Sometimes, entity sets fall into a hierarchy based on classifications unrelated to the ISA relationship Elements of E may be sub-units of elements of F An element of E is not unique without taking F into consideration! To model this correctly we need weak entity sets

52 Example A פלוגה has a letter (א, ב, ...) To uniquely identify the
פלוגה, one must know which גדוד it belongs to פלוגה is a weak entity set The relationship set שייכת ל is the supporting relationship for פלוגה גדוד מספר גדוד שייכת ל פלוגה אות פלוגה

53 Notation Weak entity set has a double line
Supporting relationships have double lines Rounded arrow pointing into the identifying entity sets גדוד מספר גדוד שייכת ל פלוגה אות פלוגה

54 Example (1) Keys: גדוד: מספר גדוד פלוגה: אות פלוגה, מספר גדוד
מחלקה: מספר מחלקה, אות פלוגה, מספר גדוד מספר גדוד שייכת ל פלוגה אות פלוגה שייכת ל מחלקה מספר מחלקה

55 Example (2) Same award can be given by several organizations (“Academy award for Best Actor 2007”), A year, award name and organization name uniquely define an award Weak entity set can participate in additional (non-supporting) relationships phone number name Organization Gives Won Award Actor year name

56 Example (3) Awards are now identified by organization and country
Same award can be given by same organization in different countries (“Academy award for Best Actor Israeli 2007”) Weak identity set has 2 supporting relationships What is identifying key for award? phone number name name Organization Country Gives Location Award Year, award name, country name, organization name year name

57 Design Principles

58 Faithfulness The design should be accurate to the specifications
This is ok only if each actor has a set salary, regardless of all movies Actor Film Acted In salary

59 Avoiding Redundancy The design should not model the same information in multiple ways Leads to fact repetition Leads to inconsistencies Movie Studio Owns studioName name

60 Simplicity Counts Avoid introducing elements that are not needed
If we never need to store information about people that are not movie people, don’t put it in the diagram address id birthday Person name isa Movie Person isa picture Actor Director

61 Picking the Right Kind of Element
Actor Studio Owned By name The bottom diagram is sufficient if Studio has no attributes other than name Actor sname

62 Summary Given a set of requirements, to translate the requirements into a diagram: Identify the entity sets Determine if there are hierarchies (ISA or weak relationships) among entity sets Identify the relationship sets Identify the attributes Determine constraints on relationship participation

63 The Relational Model

64 Data Models A data model is a notation for describing data
Conceptual structure of the data Operations on the data Constraints on the data In this course we focus on the relational data model

65 Conceptual Structure of the Data
The basic element of the relational model is a relation (which is similar to a table) A relation has a schema, consisting of a Name List of attributes, possibly with domains A relation may also have an instance, which is a set of tuples (i.e., rows) in the relation

66 Example Schema: Movies(title, year, length, genre)
Follow… 1985 90 children Who … 1987 mystery Example Schema: Movies(title, year, length, genre) Relation name: Movies Attributes: title, year, length, genre Possible tuple instance (“Follow that Bird”, 1985, 90, children) Scheme with domains: Movies(title: string, year: number, length: number, genre: string)

67 Operations on the Data Relational algebra
Selection, projection, union, minus, join Stay tuned… Discussed in detail next week

68 Constraints on the Data
We discuss complex constraints later on in the course For now, we introduce key constraints A set of attributes forms a key for a relation if there cannot be 2 different tuples with the same values for all attributes of the key Noted with underline Examples: Movies(title, year, length, genre) Actor(teudatZehut, name, address)

69 A Step Closer Once we have a set of relational schemas in the relational model, we are a step closer to storing data in a DBMS A DBMS has a data definition language (DDL), used to define tables in the database Once we have decided on the relational schemas, these can be directly translated into the database using the DDL

70 ER Diagrams to Relational Schemas

71 General Principles When converting ER diagrams to Relations, we should: Reduce duplicate information Constrain as tightly as possible Notes: Some scenarios can be represented in different ways. Sometimes we will not be able to fully represent constraints, or will be forced to represent information more than once.

72 Entity Set Translation
birthday id Actor name address Actor (id, name, birthday, address) General Rule: Create a relation with the name of the Entity. There is a column for each attribute The key in the diagram is the primary key of the relation

73 Relationship Sets (without constraints)
birthday title id Actor Film Acted In year name salary address type General Rule: Create a table with the name of the relationship set Relationship table attributes: its own attributes (salary) + all keys of the relating entities (title, id) What is the primary key of the table? Note: Do not define two attributes with the same name – instead rename one of them in the schema Id, title

74 Translating Recursive Relationship Sets (without constraints)
manager id Employee Manages worker name address What are all the relations created for this diagram? Employee(id,name, address) – id is the key Manages(mid,wid) – mid,wid is the key

75 Translating relationships (one-to-many): Option 1
id Director Film Directed title name salary year Option 1: Same as without key constraints (3 tables), except that the primary key of Directed is now title (why?)

76 Translating relationships (one-to-many): Option 2
id Director Film Directed title name salary year Film (title, year, salary, id) Director(id, name) Option 2: Do not create a table for the relationship Add information columns that would have been in the relationship's relation to the relation of the entity which does not have the incoming arrow

77 Translating Weak Entity Sets
phone number Key of relation for weak entity set includes its own keys and the keys of its supporting entity sets No relation is created for the supporting relationship sets Example: Translate diagram on the left to relations name Organization Gives Award Organization(name,phone) – name is key Award(year,name,oname) – all are key year name

78 Translating ISA: E/R Style Conversion
address Translating ISA: E/R Style Conversion id Movie Person name ISA picture Actor Director A relation for each entity set. An entity may appear in more than one relation MoviePerson(id, address, name) Actor(id, picture) Director(id)

79 Translating ISA: Object-Oriented Approach
address Translating ISA: Object-Oriented Approach id Movie Person name ISA picture Actor Director A relation for each possible combinations of how entities may appear in the entity sets. MoviePerson(id, address, name) MoviePersonActor(id, address, name, picture) MoviePersonDirector(id, address, name) MoviePersonActorDirector(id, address, name, picture)

80 Translating ISA: Null Value Approach
address Translating ISA: Null Value Approach id Movie Person name ISA picture Actor Director A single relation for containing all values Possible only if NULL values (i.e., missing values) are allowed – not in the pure relational model MoviePerson(id, address, name, picture)


Download ppt "Databases 67506 Winter 2011."

Similar presentations


Ads by Google