Presentation on theme: "Relational Database Design UNIT II 1. 2 Advantages of Using Database Systems Centralized control of a firm’s data Redundancy can be reduced (avoid keeping."— Presentation transcript:
2 Advantages of Using Database Systems Centralized control of a firm’s data Redundancy can be reduced (avoid keeping copies of the same data in many places) Inconsistency can be avoided to some extent (updating occurs at one place instead of many places)
3 The data can be shared Security restrictions can be applied (passwords) Integrity of data can be maintained (e.g., an instructor has to exist before being assigned to teach a class) Advantages of Using Database Systems (cont’d)
5Relation Definition: A relation is a named, two-dimensional table of dataDefinition: A relation is a named, two-dimensional table of data Table consists of rows (records) and columns (attribute or field)Table consists of rows (records) and columns (attribute or field) Requirements for a table to qualify as a relation:Requirements for a table to qualify as a relation: –It must have a unique name –Every attribute value must be atomic (not multivalued, not composite) –Every row must be unique (can’t have two rows with exactly the same values for all their fields) –Attributes (columns) in tables must have unique names –The order of the columns must be irrelevant –The order of the rows must be irrelevant
6 The table representation of an entity set An entity set is usually represented as a table in the database. The name of the table is the same name as the entity set. For example: SSNName Street City 586-41-4125Brown2404 MaileHonolulu 332-18-6752Yang809 Dole St.Honolulu 121-898-5569Gregson211 Main St.Chicago Customer
7 The table representation of an entity set In the table, columns are called attributes, rows are called tuples, observations, or records. Each row is an entity, and the table is an entity set. Customer SSNName Street City 586-41-4125Brown2404 MaileHonolulu 332-18-6752Yang809 Dole St.Honolulu 121-898-5569Gregson211 Main St.Chicago
8Relation Definition: A relation is a named, two-dimensional table of dataDefinition: A relation is a named, two-dimensional table of data Table consists of rows (records) and columns (attribute or field)Table consists of rows (records) and columns (attribute or field) Requirements for a table to qualify as a relation:Requirements for a table to qualify as a relation: –It must have a unique name –Every attribute value must be atomic (not multivalued, not composite) –Every row must be unique (can’t have two rows with exactly the same values for all their fields) –Attributes (columns) in tables must have unique names –The order of the columns must be irrelevant –The order of the rows must be irrelevant
9 Keys It is important that any entity in an entity set be uniquely identifiable. Practically, we use the values of certain attributes to uniquely identify an entity. Keys can be simple (a single field) or composite (more than one field)
10 Customer-name Customer-street customer SSN Customer-city In this particular case, the value of the combination of any attribute(s) with SSN can uniquely identify a particular customer.
11 Keys Meanwhile, it is also obvious that some of these combinations may have redundant attributes. Therefore, in theory, we systematically classify these combinations of attributes using three types of Keys: superkey, candidate key, and primary key.
12 Keys (1) Superkey: is a set of attributes whose value can uniquely identify an entity in the entity set. A superkey contains one or more attributes. Example: The combination of “SSN” and “Name” is a superkey of the following entity set customer.
13 Because: The value of attributes “SSN” and “Name”, such as 558-36-1234 and Susan, can uniquely identify that particular customer in customer entity set, which is the pool of all customers. Customer-name Customer-street customer SSN Customer-city
14 Notationally, we write, Superkey: (SSN, name) Other superkeys for the entity set customer: Superkey: (SSN) Superkey: (SSN, Street) Superkey: (SSN, City) Superkey: (SSN, Name, Street) Superkey: (SSN, Name, City) Superkey: (SSN, Street, City) Superkey: (SSN, Name, Street, City)
15 Superkey is the broadest definition of unique identifiers of an entity in an entity set. We are unsurprisingly very interested in the most economical combination(s) of attributes that can uniquely identify any particular entity. Therefore, we introduce Candidate Key next.
16 (2) Candidate key: is a set of one or more attributes whose value can uniquely identify an entity in the entity set, and any attribute in the candidate key cannot be omitted without destroying the uniqueness property of the candidate key. (It is minimal superkey). Example: (SSN, Name) is NOT a candidate key, because taking out “name” still leaves “SSN” which can uniquely identify an entity. “SSN” is a candidate key of customer. Candidate key could have more than one attributes.
17 Also, while most entity sets have only one candidate key, some entity sets could have more than one candidate key. Example: Both “SSN” and “License #” are candidate keys of Driver entity set. In building a database in a database software, the software will only allow to use one candidate key to be the unique identifier of an entity for an entity set.
18 (3) Primary Key: is the candidate key that is chosen by the database designer as the unique identifier of an entity. The database designer chooses only one candidate key as the primary key in building the system. Example: “SSN” and “License #” are both candidate keys of Driver entity set. The database designer can choose either one as the primary key.
Foreign Key ·A foreign key is an attribute or attribute combination of one relation R2 whose values are required to match those of the primary key of relation R1 where R1 and R2 are not necessarily distinct. Note that a foreign key and the corresponding primary key should be defined on the same domain(s). Emp# e1 e2 e3 ename red blue brown Dept d1 d2 d3 Worksfordept d1 d2 Dname Pay Tax Art Employee Dept Foreign key
20 Primary Key Foreign Key (implements 1:N relationship between customer and order) Combined, these are a composite primary key (uniquely identifies the order line)…individually they are foreign keys (implement M:N relationship between order and product) Figure 5-3 Schema for four relations (Pine Valley Furniture Company)
Domain For each attribute, a set of permitted values, is called a the domain 21
Relational Data Structure EMPLOYEE Emp# Name Emp Sex Mgr Emp# E1 JonesMaleE65 E6 SmithMaleE28 E28Jones Female- Relation Attribute Heading Body Female Male Gender Domain
The Domain Emp# E1 E2 E3 Name Red Brown Black Mgr# - E1 E1, E2, E3,E4 Red, Brown Black, Blue Attributes Domains Employee Person Name Emp#
25 Overall, superkey is the broadest unique identifier; candidate key is a subset of superkey; and primary key is a subset of candidate key. In practice, we would first look for superkeys. Then we look for candidate keys based on experience and common sense. If there is only one candidate key, it naturally will be designated as the primary key. If we find more than one candidate key, then we can designate any one of them as primary key.
Integrity Constraints It ensures that changes made to the database by authorized users do not result in a loss of data consistency. It guards against accidental damage to the database 26
Domain Constraints They define valid values for attributes They are the most elementary form of integrity constraint. They test values inserted in the database, and test queries to ensure that the comparisons make sense.
28 Integrity Constraints –NOT NULL: specification prohibits the insertion of a null value for the attribute. Any data base modification that would cause a null to be inserted in an attribute declared to be not null generates an error diagnostic –E.g. balance numeric(12,2) not null
29 Integrity Constraints –Unique Constraints: specification says that attributes A1,A2… Aj, form a candidate key; –no two tuples in the relation can be equal on all the primary key attributes.
30 Integrity Constraints –The Check Clause:ensures that attribute values satisfy specified conditions. –E.g. –Create table student (name char(15) not null, Stud_id char(10), Degree_level char(15), Primary key(stud_id),check (degree_level in(‘Bachelors’, ‘Masters’))) The check clause ensures that the degree_level must be one of ‘Bachelors’ or’Masters’
Referential Integrity Ensures that a value that appears in one relation for a given set of attributes also appears for a certain set of attribute in another relation. –If an account exists in the database with branch name “Perryridge”, then the branch “Perryridge” must actually exist in the database. Primary keys of respective relations Foreign key branch (branch-name, branch-city, asset ) Perryridge Brooklyn 500,000 account ( account-no, branch-name, balance ) A-123 Perryridge 5000 A set of attributes X in R is a foreign key if it is not a primary key of R but it is a primary key of some relation S.
Assertions An assertion is a predicate expressing a condition that we wish the database to always satisfy. Domain constraints and referential integrity constraints are special forms of assertions. But there are many constraints we cannot express by using only these special forms. EX1: Every loan has atleast one customer who maintains an account with a minimum balance of Rs 1000.00
Creating an Assertion When an assertion is created, the system will test it for validity. If the assertion is valid, then any future modification to the database is allowed only if it does not cause the assertion to be violated. But assertions can create a considerable amount of overhead, especially if complex assertions have been made.