Database Management Systems. What is a DBMS? Database management systems: Provide efficient (speed and space) and secure access to large amount of data.


1 Database Management Systems

2 What is a DBMS? Database management systems provide efficient (in speed and space) and secure access to large amounts of data. They address problems such as: how to store the data efficiently; how to query the data efficiently; how to update the data reliably and securely (by multiple users). Contrast this with using file systems for the same task.

3 Relational Databases Based on the relational model. Separates the logical view from the physical view of the data.

Student   Course     Term
Charles   SYSC3001   Fall, 2011
Dan       SYSC4602   Summer, 2010
…         …          …

4 Querying a Database Find all the students who have taken SYSC3001 in Fall 2011. S(tructured) Q(uery) L(anguage):

    SELECT E.name
    FROM Enroll E
    WHERE E.course = 'SYSC3001' AND E.term = 'Fall_2011'

The query processor figures out how to answer the query efficiently.
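
The query above can be tried against an in-memory SQLite database. This is a minimal sketch: the Enroll(name, course, term) layout, the two sample rows (taken from the Relational Databases slide) and the use of Python's sqlite3 module are assumptions made for illustration, not part of the original slides.

```python
import sqlite3

# In-memory database; Enroll(name, course, term) is an assumed layout.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Enroll (name TEXT, course TEXT, term TEXT)")
conn.executemany(
    "INSERT INTO Enroll VALUES (?, ?, ?)",
    [("Charles", "SYSC3001", "Fall_2011"),
     ("Dan", "SYSC4602", "Summer_2010")],
)

# Students who took SYSC3001 in Fall 2011. String literals use single quotes.
students = [row[0] for row in conn.execute(
    "SELECT E.name FROM Enroll E "
    "WHERE E.course = 'SYSC3001' AND E.term = 'Fall_2011'"
)]
print(students)
```

The application only states what it wants; how the rows are located is left to the query processor.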

5 Database Industry Relational databases are a great success of theoretical ideas. The “Big 3” DBMS companies are among the largest software companies in the world; IBM (with DB2) and Microsoft (SQL Server, Microsoft Access) are also important players. A $20B industry (the figure is several years old). Challenged by object-oriented DBMSs.

6 Why Use a DBMS? Data independence and efficient access Reduced application development time Data integrity and security Uniform data administration Concurrent access and recovery from crashes

7 Functionalities of a DBMS Storage management Abstract data model High level query and data manipulation language Efficient query processing Transaction (concurrency) processing Resiliency: recovery from crashes Interface with programming languages

8 The Study of DBMS Some aspects: Modeling and design of databases Database programming: querying and update operations Database implementation DBMS study cuts across many fields of Computer Science and Engineering: OS, languages, software engineering, AI, Logic, multimedia, theory,...

9 Database Modeling and Design Why do we need it? To agree on the structure of the database before deciding on a particular implementation. Consider issues such as: What entities should be modeled? How are the entities related? What constraints exist in the domain? How can a good design be achieved with respect to performance, memory space, reliability, and security?

10 Database Design Formalisms Entity/Relationship model (E/R): more relational in nature; conceptually similar to OO analysis and design; can be translated (semi-automatically) to relational schemas (with varying amounts of pain). Newcomers: UML and XML.

11 Entity / Relationship Diagrams Objects → entities. Classes → entity sets. Attributes are the names of roles played by some domain (a set of atomic values) in a relation (a table of values or file of records). Relationships are associations among entities. [Diagram legend: entity set Product, attribute address, relationship buys.]

12 [E/R diagram: entity sets Person (name, ssn, address), Company (name, stock price) and Product (name, category, price); relationships buys, makes and employs.]

13 Multi-way Relationships How do we model a purchase relationship between buyers, products and stores? [Diagram: relationship Purchase connecting the entity sets Product, Person and Store.]

14 Roles in Relationships What if we need an entity set twice in one relationship? [Diagram: relationship Purchase connecting Product, Store and Person, where Person takes part twice, in the roles salesperson and buyer.]

15 Attributes on Relationships [Diagram: relationship Purchase connecting Product, Person and Store, with the attribute date attached to the relationship itself.]

16 The Relational Data Model Database model (E/R, UML): diagrams. Relational schema: tables, where column names are attributes and rows are tuples. Physical storage: complex file organization and index structures.

17 Terminology Relation (table) name: Product. The column headers are the attribute names; the rows are tuples.

Name          Price     Category      Manufacturer
iPhone        $459.99   phone         Apple
Vista         $299.99   OS            MS
SingleTouch   $149.99   photography   Canon
MultiTouch    $203.99   household     Hitachi

18 More Terminology Every attribute has an atomic type. Relation Schema: relation name + attribute names + attribute types Relation instance: a set of tuples. Only one copy of any tuple Database Schema: a set of relation schemas. Database instance: a relation instance for every relation in the schema.

19 More on Tuples Formally, a tuple is a mapping from attribute names to values: name → iPhone, price → $449.99, category → phone, manufacturer → Apple. Sometimes we refer to a tuple by itself (note the order of attributes): (iPhone, $449.99, phone, Apple) or Product(iPhone, $449.99, phone, Apple).

20 Updates The database maintains a current database state. Updates to the data: 1) add a tuple 2) delete a tuple 3) modify an attribute in a tuple Updates to the data happen very frequently. Updates to the schema: relatively rare. Rather painful. Need good DB design Speed and space (security)

21 From E/R Diagrams to Relational Schema Relationships are already independent entities, and only atomic types exist in the E/R model. Entity sets → relations. Relationships → relations. Special care is needed for weak entity sets, whose existence depends on the existence of another entity. Example: Dependent of Employee.

22 [E/R diagram, repeated from slide 12: entity sets Person (name, ssn, address), Company (name, stock price) and Product (name, category, price); relationships buys, makes and employs.]

23 Entity Sets to Relations [Diagram: entity set Product with attributes name, category and price.] Relation Product:

Name     Category   Price
iPhone   phone      $450

24 Relationships to Relations [Diagram: relationship makes, with attribute Start Year, between the entity sets Company and Product.] Relation MAKES (watch out for attribute name conflicts):

Product-name   Product-Category   Company-name   Starting-year
iPhone         phone              Apple          2010

25 Mapping a UML Object Model to a Database UML object models can be mapped to relational databases. Some degradation occurs because all UML constructs must be mapped to a single relational database construct: the table. Mapping of classes and attributes: each class is mapped to a table; each attribute is mapped onto a column in the table; an instance of a class represents a row in the table; methods are not mapped.

26 Mapping a Class to a Table The class User, with attributes +firstName:String, +login:String, +email:String and +id:long, maps to the User table with columns id:long, firstName:text[25], login:text[8] and email:text[32].

27 Primary and Foreign Keys Any set of attributes that could be used to uniquely identify any data record in a relational table is called a candidate key. The candidate key that is actually used in the application to identify the records is called the primary key. The primary key of a table is a set of attributes whose values uniquely identify the data records in the table. A foreign key is an attribute (or a set of attributes) that references the primary key of another table.

28 Example for Primary and Foreign Keys User table (login and email are candidate keys; login is the primary key):

firstName   login     email
“alice”     “am384”   “am384@mail.org”
“john”      “js289”   “john@mail.de”
“bob”       “bd”      “bobd@mail.ch”

League table (login is a foreign key referencing the User table):

name                login
“tictactoeNovice”   “am384”
“tictactoeExpert”   “bd”
“chessNovice”       “js289”
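
The key definitions above can be written down as table constraints. The following is a sketch under assumptions: it uses SQLite through Python's sqlite3 module, picks login as the primary key of User, and treats name as the primary key of League, which the slides do not state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("""
    CREATE TABLE User (
        login     TEXT PRIMARY KEY,      -- the candidate key chosen as primary key
        email     TEXT NOT NULL UNIQUE,  -- the other candidate key
        firstName TEXT
    )""")
conn.execute("""
    CREATE TABLE League (
        name  TEXT PRIMARY KEY,                      -- assumed key, for illustration
        login TEXT NOT NULL REFERENCES User(login)   -- foreign key into User
    )""")
conn.execute("INSERT INTO User VALUES ('am384', 'am384@mail.org', 'alice')")
conn.execute("INSERT INTO League VALUES ('tictactoeNovice', 'am384')")

# A league whose login has no matching User row violates the foreign key.
try:
    conn.execute("INSERT INTO League VALUES ('chessNovice', 'nobody')")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
print(fk_rejected)
```

Declaring email as UNIQUE keeps it usable as an identifier even though it was not chosen as the primary key.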

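29 Buried Association Associations with multiplicity “one” can be implemented using a foreign key. For one-to-many associations we add the foreign key to the table representing the class on the “many” end. [Diagram: class LeagueOwner (multiplicity 1) associated with class League (multiplicity *) through the role owner. The LeagueOwner table has a column id:long; the League table has columns id:long and owner:long, where owner is the foreign key.]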

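30 Another Example for Buried Association [Diagram: class Portfolio (portfolioID, ...) associated with many (*) objects of class Transaction (transactionID). The Portfolio table has columns portfolioID, ...; the Transaction table has columns transactionID and portfolioID, where portfolioID is the foreign key.]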

31 Mapping Many-To-Many Associations [Diagram: class City (cityName) and class Airport (airportCode, airportName) linked by the many-to-many (* to *) association Serves.] In this case we need a separate table for the association.

City table:

cityName
Houston
Albany
Munich
Hamburg

Airport table:

airportCode   airportName
IAH           Intercontinental
HOU           Hobby
ALB           Albany County
MUC           Munich Airport
HAM           Hamburg Airport

Serves table (the separate table for the association “Serves”; both columns together form the primary key):

cityName   airportCode
Houston    IAH
Houston    HOU
Albany     ALB
Munich     MUC
Hamburg    HAM
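
A sketch of the separate association table, again using SQLite from Python as an assumed setting. Only part of the slide's data is loaded, and pairing Houston with both IAH and HOU is an assumption about the flattened Serves table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE City    (cityName TEXT PRIMARY KEY);
    CREATE TABLE Airport (airportCode TEXT PRIMARY KEY, airportName TEXT);
    -- The association gets its own table; its key combines both foreign keys.
    CREATE TABLE Serves (
        cityName    TEXT REFERENCES City(cityName),
        airportCode TEXT REFERENCES Airport(airportCode),
        PRIMARY KEY (cityName, airportCode)
    );
    INSERT INTO City VALUES ('Houston'), ('Albany');
    INSERT INTO Airport VALUES ('IAH', 'Intercontinental'), ('HOU', 'Hobby'),
                               ('ALB', 'Albany County');
    INSERT INTO Serves VALUES ('Houston', 'IAH'), ('Houston', 'HOU'),
                              ('Albany', 'ALB');
""")

# Navigate the association: which airports serve Houston?
airports = [r[0] for r in conn.execute(
    "SELECT A.airportName FROM Serves S JOIN Airport A "
    "ON S.airportCode = A.airportCode "
    "WHERE S.cityName = 'Houston' ORDER BY A.airportName"
)]
print(airports)
```

A foreign key buried in either City or Airport could hold only one partner per row, which is why the many-to-many case needs the third table.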

32 Another Many-to-Many Association Mapping [Diagram: class Player and class Tournament linked by a many-to-many (* to *) association.] We need the Tournament/Player association as a separate table.

Tournament table:

id   name     ...
23   novice
24   expert

TournamentPlayerAssociation table:

tournament   player
23           56
23           79

Player table:

id   name    ...
56   alice
79   john

33 Problems in Designing Schema

Title   ISBN           Publisher   Phone         Address
OS      1234-390-231   Wiley       312-1234567   87 1st Ave, NY, …
DB      3234-390-241   Wiley       312-1234567   87 1st Ave, NY, …
SE      5234-390-281   Wiley       312-1234567   87 1st Ave, NY, …
…

Problems: redundancy, update anomalies, deletion anomalies.

34 Relation Decomposition Break the relation into two relations, Book and Publisher.

Book:

Title   ISBN           Author   Publisher
OS      1234-390-231   xxx      Wiley
DB      3234-390-241   yyy      Wiley
SE      5234-390-281   aaa      Wiley
…

Publisher:

Name     Phone Number     Address
Wiley    (201) 555-1234   87 1st Ave, NY, …
McGraw   (320) 234-9876   87 1st Ave, NY, …

35 Anomalies The updated programs will not operate correctly. Examples: EMP_DEPT relation EName SIN BDate ADDR Dnumber Dname DMgrSIN Insertion anomalies: It is difficult to insert a new department that has no employees as yet in the EMP_DEPT relation. Deletion anomalies: If we delete from the EMP_DEPT an employee tuple that happens to represent the last employee working for a particular department, the information concerning that department is lost from the database. Update anomalies: In EMP_DEPT relation, if we want to change the value of one of the attributes of a particular department, say the manager of department 5, we must update the tuples of all employees who work in that department; otherwise, the database will become inconsistent.

36 Decompositions in General Let $R$ be a relation with attributes $A_1, A_2, \ldots, A_n$. Create two relations $R_1$ and $R_2$ with attributes $B_1, B_2, \ldots, B_m$ and $C_1, C_2, \ldots, C_l$ such that $\{B_1, B_2, \ldots, B_m\} \cup \{C_1, C_2, \ldots, C_l\} = \{A_1, A_2, \ldots, A_n\}$, and $R_1$ is the projection of $R$ on $B_1, B_2, \ldots, B_m$, while $R_2$ is the projection of $R$ on $C_1, C_2, \ldots, C_l$.

37 Boyce-Codd Normal Form A simple condition for removing anomalies from relations: a relation $R$ is in BCNF if and only if, whenever there is a nontrivial dependency $A_1, A_2, \ldots, A_n \rightarrow B$ for $R$, it is the case that $\{A_1, A_2, \ldots, A_n\}$ is a super-key for $R$. In English (though a bit vague): whenever a set of attributes of $R$ determines another attribute, it should determine all the attributes of $R$.
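
The BCNF condition can be checked mechanically by computing attribute closures. The sketch below is illustrative only: the helper names closure and bcnf_violations, the relation and its functional dependencies are all hypothetical, chosen to resemble a book/publisher schema.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its left side is already covered.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(relation, fds):
    """List the nontrivial listed FDs whose left side is not a super-key."""
    return [
        (lhs, rhs) for lhs, rhs in fds
        if not set(rhs) <= set(lhs)              # nontrivial dependency
        and closure(lhs, fds) != set(relation)   # left side is not a super-key
    ]


# Hypothetical relation and dependencies (assumptions, not from the slides).
R = {"Title", "ISBN", "Author", "Publisher", "Phone", "Addr"}
F = [
    (("ISBN",), ("Title", "Author", "Publisher")),
    (("Publisher",), ("Phone", "Addr")),
]
violations = bcnf_violations(R, F)
print(violations)
```

Under these assumed dependencies ISBN determines every attribute, so it is a super-key, while Publisher determines only Phone and Addr; that second dependency is what a decomposition into separate book and publisher relations removes.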

38 Example

Title   ISBN            Publisher   Author   Phone            Addr
OS      0-471-20284-3   Wiley       xxx      (201) 555-1234   1234 1st
DB      0-471-20282-3   Wiley       yyy      (206) 572-4312   1234 1st
SE      0-471-20267-8   Wiley       aaa      (201) 555-1234   1234 1st
Netw.   0-471-20267-8   Wiley       bbb      (201) 555-1234   1234 1st

What are the dependencies? What are the keys? Is it in BCNF?

39 And Now?

Title   ISBN            Publisher   Author
OS      0-471-20284-3   Wiley       xxx
DB      0-471-20282-3   Wiley       yyy
SE      0-471-20267-8   Wiley       aaa
Netw.   0-471-20267-8   Wiley       bbb

Publisher   Phone      Addr
Wiley       555-1234   1234 1st St.
McGraw      234-9876   9876 5th Ave.
…

40 More Examples EMP_DEPT(ENAME, SIN, BDATE, ADDR, DNUM, DNAME, DMGRSIN). What’s wrong? How to decompose? Use the functional dependencies. Decompose EMP_DEPT into EMP(ENAME, SIN, BDATE, ADDR, DNUM) and DEPT(DNUM, DNAME, DMGRSIN).

41 More Examples (cont’d) Example: EMP_PROJ(SIN, PNUMBER, HOURS, ENAME, PNAME, PLOCATION) can be decomposed into EP1(SIN, PNUMBER, HOURS), EP2(SIN, ENAME) and EP3(PNUMBER, PNAME, PLOCATION).

42 More Examples (cont’d) EMP:

ENAME   Proj_NAME   Dep_NAME
Smith   x           john
Smith   y           anna
Smith   x           anna
Smith   y           john
Brown   w           jim
Brown   x           jim
Brown   y           jim
Brown   z           jim
Brown   w           joan
Brown   x           joan
Brown   y           joan
Brown   z           joan
Brown   w           bob
Brown   x           bob
Brown   y           bob
Brown   z           bob

Decompose EMP into:

43 More Examples (cont’d) EMP_PROJECTS:

ENAME   Proj_NAME
Smith   x
Smith   y
Brown   w
Brown   x
Brown   y
Brown   z

EMP_DEPENDENTS:

ENAME   Dep_NAME
Smith   anna
Smith   john
Brown   jim
Brown   joan
Brown   bob
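
One way to see that this decomposition loses nothing is to join the two smaller relations back together. The sketch below loads the EMP_PROJECTS and EMP_DEPENDENTS rows from this slide into SQLite (the choice of SQLite and Python is an assumption) and joins them on ENAME.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMP_PROJECTS   (ENAME TEXT, Proj_NAME TEXT);
    CREATE TABLE EMP_DEPENDENTS (ENAME TEXT, Dep_NAME TEXT);
    INSERT INTO EMP_PROJECTS VALUES
        ('Smith', 'x'), ('Smith', 'y'),
        ('Brown', 'w'), ('Brown', 'x'), ('Brown', 'y'), ('Brown', 'z');
    INSERT INTO EMP_DEPENDENTS VALUES
        ('Smith', 'anna'), ('Smith', 'john'),
        ('Brown', 'jim'), ('Brown', 'joan'), ('Brown', 'bob');
""")

# Joining on ENAME pairs every project of an employee with every dependent.
emp = conn.execute("""
    SELECT P.ENAME, P.Proj_NAME, D.Dep_NAME
    FROM EMP_PROJECTS P JOIN EMP_DEPENDENTS D ON P.ENAME = D.ENAME
""").fetchall()
print(len(emp))
```

The join yields 2 × 2 rows for Smith and 4 × 3 rows for Brown, the same 16 combinations that the single EMP relation stored, while the two small tables hold only 11 rows between them.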

44 SQL Introduction Structured Query Language: the standard language for querying and manipulating data. Many standards out there: SQL92, SQL2, SQL3. Vendors support various subsets of these, but all of what we’ll be talking about. Basic form (many, many more bells and whistles in addition):

    SELECT attributes
    FROM relations (possibly multiple, joined)
    WHERE conditions (selections)

45 SQL Examples Employee (FNAME, LNAME, SSN, BDATE, ADDR, SALARY, SUPERSSN, DNO) Department (DNAME, DNUMBER, MGRSSN, MGRSTARTDATE) Research, 5, 333445555, 22-May-78 Administration, 4, 987654321, 1-Jan-85 Headquarters, 1, 888665555, 19-Jun-71 Q1: Find John Smith’s birthday and address: Q2: Find the salary of all employees: Q3: Find all the attributes of all employees who work for department 5 Q4: Find all employees who work for the Research department Q5: For each employee, retrieve the employee’s first and last name, and the first and last name of all employees who work in the same department. Q6: For each employee, retrieve the employee’s first and last name, and the first name and last name of his/her supervisor.

46 SQL Examples - 1 Employee (FNAME, LNAME, SSN, BDATE, ADDR, SALARY, SUPERSSN, DNO)

Q1: Find John Smith’s birthday and address:

    SELECT BDATE, ADDR
    FROM EMPLOYEE
    WHERE FNAME = 'John' AND LNAME = 'Smith'

Q2: Find the salary of all employees:

    SELECT SALARY
    FROM EMPLOYEE

Q3: Find all the attributes of all employees who work for department 5:

    SELECT *
    FROM EMPLOYEE
    WHERE DNO = 5

47 SQL Examples - 2 Employee (FNAME, LNAME, SSN, BDATE, ADDR, SALARY, SUPERSSN, DNO) Department (DNAME, DNUMBER, MGRSSN, MGRSTARTDATE)

DNAME            DNUMBER   MGRSSN      MGRSTARTDATE
Research         5         333445555   22-May-78
Administration   4         987654321   1-Jan-85
Headquarters     1         888665555   19-Jun-71

Q4: Find all employees who work for the Research department:

    SELECT FNAME, LNAME, ADDR
    FROM EMPLOYEE, DEPARTMENT
    WHERE DNAME = 'Research' AND DNUMBER = DNO

48 SQL Examples - 3 Employee (FNAME, LNAME, SSN, BDATE, ADDR, SALARY, SUPERSSN, DNO) Department (DNAME, DNUMBER, MGRSSN, MGRSTARTDATE)

Q5: For each employee, retrieve the employee’s first and last name, and the first and last name of all employees who work in the same department:

    SELECT E.FNAME, E.LNAME, S.FNAME, S.LNAME
    FROM EMPLOYEE AS E, EMPLOYEE AS S
    WHERE E.DNO = S.DNO

Q6: For each employee, retrieve the employee’s first and last name, and the first and last name of his/her supervisor:

    SELECT E.FNAME, E.LNAME, S.FNAME, S.LNAME
    FROM EMPLOYEE AS E, EMPLOYEE AS S
    WHERE E.SUPERSSN = S.SSN
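
Q6 joins EMPLOYEE with itself, using the aliases E (the employee) and S (the supervisor). The sketch below runs it in SQLite on a trimmed-down EMPLOYEE table; the three sample rows and the reduced column list are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Trimmed schema: only the columns the self-join needs (an assumption).
conn.execute(
    "CREATE TABLE EMPLOYEE "
    "(FNAME TEXT, LNAME TEXT, SSN TEXT, SUPERSSN TEXT, DNO INTEGER)"
)
conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?, ?)", [
    ("John", "Smith", "123456789", "333445555", 5),      # hypothetical rows
    ("Franklin", "Wong", "333445555", "888665555", 5),
    ("James", "Borg", "888665555", None, 1),             # no supervisor
])

# E ranges over employees, S over their supervisors.
pairs = conn.execute("""
    SELECT E.FNAME, E.LNAME, S.FNAME, S.LNAME
    FROM EMPLOYEE AS E, EMPLOYEE AS S
    WHERE E.SUPERSSN = S.SSN
    ORDER BY E.LNAME
""").fetchall()
print(pairs)
```

An employee whose SUPERSSN is NULL matches no row of S and therefore does not appear in the result.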

49 Selections

    SELECT *
    FROM Company
    WHERE country = 'USA' AND stockPrice > 50

You can use: attribute names of the relation(s) used in the FROM clause; comparison operators (=, <>, <, >, <=, >=); arithmetic operations (stockPrice * 2); operations on strings (e.g., “||” for concatenation); lexicographic order on strings; pattern matching (s LIKE p); special operations for comparing dates and times.

50 Projections Select only a subset of the attributes:

    SELECT name, stockPrice
    FROM Company
    WHERE country = 'USA' AND stockPrice > 50

Rename the attributes in the resulting table:

    SELECT name AS company, stockPrice AS price
    FROM Company
    WHERE country = 'USA' AND stockPrice > 50

51 Ordering the Results

    SELECT name, stockPrice
    FROM Company
    WHERE country = 'USA' AND stockPrice > 50
    ORDER BY country, name

Ordering is ascending, unless you specify the DESC keyword. Ties are broken by the second attribute on the ORDER BY list, etc.

52 Joins Product (name, price, category, maker) Purchase (buyer, seller, store, product) Company (name, stock price, country) Person (name, phone number, city)

    SELECT name, store
    FROM Person, Purchase
    WHERE name = buyer AND city = 'Ottawa' AND product = 'iPhone'

53 Disambiguating Attributes Product (name, price, category, maker) Purchase (buyer, seller, store, product) Person (name, phone number, city) Find names of people buying telephony products:

    SELECT Person.name
    FROM Person, Purchase, Product
    WHERE Person.name = buyer
      AND product = Product.name
      AND Product.category = 'telephony'

54 Tuple Variables Product (name, price, category, maker) Find pairs of companies making products in the same category:

    SELECT product1.maker, product2.maker
    FROM Product AS product1, Product AS product2
    WHERE product1.category = product2.category
      AND product1.maker <> product2.maker

55 Union, Intersection, Difference

    (SELECT name FROM Person WHERE city = 'Seattle')
    UNION
    (SELECT name FROM Person, Purchase
     WHERE buyer = name AND store = 'The Bon')

Similarly, you can use INTERSECT and EXCEPT. Both operands must have the same attribute names (otherwise: rename).

56 Subqueries

    SELECT Purchase.product
    FROM Purchase
    WHERE buyer = (SELECT name
                   FROM Person
                   WHERE social_security_number = '123-45-6789');

In this case, the subquery returns one value. If it returns more, it’s a run-time error.

57 Subqueries Returning Relations Find companies who manufacture products bought by Joe Blow:

    SELECT Company.name
    FROM Company, Product
    WHERE Company.name = maker
      AND Product.name IN (SELECT product
                           FROM Purchase
                           WHERE buyer = 'Joe Blow');

You can also use: s > ALL R, s > ANY R, EXISTS R.

58 Conditions on Tuples

    SELECT Company.name
    FROM Company, Product
    WHERE Company.name = maker
      AND (Product.name, price) IN (SELECT product, price
                                    FROM Purchase
                                    WHERE buyer = 'Joe Blow');

59 Correlated Queries Movie (title, year, director, length) Movie titles are not unique (titles may reappear in a later year). Find movies whose title appears more than once:

    SELECT title
    FROM Movie AS Old
    WHERE year < ANY (SELECT year
                      FROM Movie
                      WHERE title = Old.title);

Note the scope of variables: inside the subquery, unqualified names refer to the inner Movie, while Old.title refers to the row of the outer query.
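
The correlated query can be exercised in SQLite with one caveat: SQLite does not implement the ANY quantifier, so the sketch below swaps in an equivalent EXISTS formulation. The alias New and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Movie (title TEXT, year INTEGER, director TEXT, length INTEGER)"
)
conn.executemany("INSERT INTO Movie VALUES (?, ?, NULL, NULL)", [
    ("Remade Film", 1950),   # placeholder titles, not real data
    ("Remade Film", 2005),
    ("One-Off Film", 1979),
])

# "year < ANY (subquery)" rewritten as EXISTS: the subquery is re-evaluated
# for each outer row Old and looks for a later movie with the same title.
titles = [r[0] for r in conn.execute("""
    SELECT Old.title
    FROM Movie AS Old
    WHERE EXISTS (SELECT 1
                  FROM Movie AS New
                  WHERE New.title = Old.title AND Old.year < New.year)
""")]
print(titles)
```

Only the earlier of the two same-titled rows qualifies, so a title that was reused once is reported once.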

60 Removing Duplicates

    SELECT DISTINCT Company.name
    FROM Company, Product
    WHERE Company.name = maker
      AND (Product.name, price) IN (SELECT product, price
                                    FROM Purchase
                                    WHERE buyer = 'Joe Blow');

61 Conserving Duplicates

    (SELECT name FROM Person WHERE city = 'Seattle')
    UNION ALL
    (SELECT name FROM Person, Purchase
     WHERE buyer = name AND store = 'The Bon')

By default, the UNION, INTERSECT and EXCEPT operators operate on sets, not bags, so duplicates are eliminated. Adding ALL keeps them.
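
The difference between set and bag semantics shows up as soon as one name qualifies on both sides. The following sketch uses SQLite with made-up rows; the parentheses around the operands are dropped because SQLite does not accept them there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person   (name TEXT, city TEXT);
    CREATE TABLE Purchase (buyer TEXT, store TEXT);
    INSERT INTO Person VALUES ('Ann', 'Seattle'), ('Bob', 'Ottawa');
    INSERT INTO Purchase VALUES ('Ann', 'The Bon'), ('Bob', 'The Bon');
""")

# The same two operands, combined first with UNION and then with UNION ALL.
query = """
    SELECT name FROM Person WHERE city = 'Seattle'
    {op}
    SELECT name FROM Person, Purchase WHERE buyer = name AND store = 'The Bon'
"""
as_set = sorted(r[0] for r in conn.execute(query.format(op="UNION")))
as_bag = sorted(r[0] for r in conn.execute(query.format(op="UNION ALL")))
print(as_set, as_bag)
```

Ann lives in Seattle and also bought at The Bon, so she appears once under UNION and twice under UNION ALL.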

62 Aggregation

    SELECT SUM(price)
    FROM Product
    WHERE manufacturer = 'Toyota'

SQL supports several aggregation operations: SUM, MIN, MAX, AVG, COUNT. Except for COUNT, all aggregations apply to a single attribute.

    SELECT COUNT(*)
    FROM Purchase

63 Grouping and Aggregation Usually, we want aggregations on certain parts of the relation. Find how much we sold of every product:

    SELECT product, SUM(price)
    FROM Product, Purchase
    WHERE Product.name = Purchase.product
    GROUP BY Product.name

1. Compute the relation (i.e., the FROM and WHERE). 2. Group by the attributes in the GROUP BY. 3. Select one tuple for every group (and apply aggregation). SELECT can have (1) grouped attributes or (2) aggregates.

64 HAVING Clause

    SELECT product, SUM(price)
    FROM Product, Purchase
    WHERE Product.name = Purchase.product
    GROUP BY Product.name
    HAVING COUNT(buyer) > 100

Same query, except that we consider only products that had more than 100 buyers. The HAVING clause contains conditions on aggregates.
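
The three evaluation steps (compute the join, group, then filter groups) can be followed on a tiny data set. This is a sketch in SQLite with hypothetical rows, and the threshold is lowered from 100 to 1 so that three purchases are enough to show the effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product  (name TEXT, price INTEGER);
    CREATE TABLE Purchase (buyer TEXT, product TEXT);
    INSERT INTO Product VALUES ('iPhone', 450), ('Vista', 300);
    INSERT INTO Purchase VALUES
        ('Ann', 'iPhone'), ('Bob', 'iPhone'), ('Cat', 'Vista');
""")

# WHERE filters joined rows, GROUP BY forms one group per product,
# HAVING then keeps only the groups with more than one purchase.
totals = conn.execute("""
    SELECT product, SUM(price)
    FROM Product, Purchase
    WHERE Product.name = Purchase.product
    GROUP BY product
    HAVING COUNT(buyer) > 1
""").fetchall()
print(totals)
```

Vista forms a group too, but with a single purchase it is dropped by HAVING after grouping, which a WHERE condition could not express.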

65 Modifying the Database We have 3 kinds of modifications: insertion, deletion, update. Insertion, general form:

    INSERT INTO R(A1, …, An) VALUES (v1, …, vn)

Insert a new purchase into the database:

    INSERT INTO Purchase(buyer, seller, product, store)
    VALUES ('Joe', 'Fred', 'wakeup-clock-espresso-machine', 'The Sharper Image')

If we don’t provide all the attributes of R, the missing ones will be filled with NULL. We can drop the attribute names if we’re providing all of them in order.

66 More Interesting Insertions

    INSERT INTO Product(name)
    SELECT DISTINCT product
    FROM Purchase
    WHERE product NOT IN (SELECT name FROM Product)

The query replaces the VALUES keyword. Note the order of querying and inserting.
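
A sketch of this insertion in SQLite, with made-up rows, shows both points from the last two slides: the subquery supplies the rows, and attributes that are not listed are filled with NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product  (name TEXT, price INTEGER);
    CREATE TABLE Purchase (buyer TEXT, product TEXT);
    INSERT INTO Product (name, price) VALUES ('iPhone', 450);
    INSERT INTO Purchase VALUES
        ('Ann', 'iPhone'), ('Bob', 'Vista'), ('Cat', 'Vista');

    -- Add every purchased product that Product does not know yet.
    INSERT INTO Product (name)
        SELECT DISTINCT product FROM Purchase
        WHERE product NOT IN (SELECT name FROM Product);
""")

rows = conn.execute("SELECT name, price FROM Product ORDER BY name").fetchall()
print(rows)
```

Vista is inserted once despite two purchases because of DISTINCT, and its price is NULL (None in Python) since only name was provided.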

67 Deletions

    DELETE FROM Purchase
    WHERE seller = 'Joe' AND product = 'Brooklyn Bridge'

Factoid about SQL: there is no way to delete only a single occurrence of a tuple that appears twice in a relation.

68 Updates UPDATE PRODUCT SET price = price/2 WHERE Product.name IN (SELECT product FROM Sales WHERE Date = today);

69 Defining Views Views are relations, except that they are not physically stored. They are used mostly to simplify complex queries and to define conceptually different views of the database for different classes of users. View: purchases of telephony products:

    CREATE VIEW telephony_purchases AS
    SELECT product, buyer, seller, store
    FROM Purchase, Product
    WHERE Purchase.product = Product.name
      AND Product.category = 'telephony'

70 A Different View

    CREATE VIEW Seattle_view AS
    SELECT buyer, seller, product, store
    FROM Person, Purchase
    WHERE Person.city = 'Seattle' AND Person.name = Purchase.buyer

We can later use the view:

    SELECT name, store
    FROM Seattle_view, Product
    WHERE Seattle_view.product = Product.name
      AND Product.category = 'shoes'

What’s really happening when we query a view?
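
The view and the query over it can be run as written once the view name avoids the hyphen, which SQL would read as a minus sign. The sketch below uses SQLite with hypothetical rows and the name Seattle_view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person   (name TEXT, city TEXT);
    CREATE TABLE Purchase (buyer TEXT, seller TEXT, product TEXT, store TEXT);
    CREATE TABLE Product  (name TEXT, category TEXT);
    INSERT INTO Person VALUES ('Ann', 'Seattle'), ('Bob', 'Ottawa');
    INSERT INTO Product VALUES ('Sneaker', 'shoes'), ('iPhone', 'telephony');
    INSERT INTO Purchase VALUES
        ('Ann', 'Fred', 'Sneaker', 'The Bon'),
        ('Ann', 'Fred', 'iPhone',  'The Bon'),
        ('Bob', 'Fred', 'Sneaker', 'The Bon');

    -- The view stores only its definition, not any rows.
    CREATE VIEW Seattle_view AS
        SELECT buyer, seller, product, store
        FROM Person, Purchase
        WHERE Person.city = 'Seattle' AND Person.name = Purchase.buyer;
""")

# The view is used like a table; "name" can only mean Product.name here.
result = conn.execute("""
    SELECT name, store
    FROM Seattle_view, Product
    WHERE Seattle_view.product = Product.name AND Product.category = 'shoes'
""").fetchall()
print(result)
```

Because the view is not materialized, rows added to Purchase later are visible through it the next time it is queried.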

71 What is a Transaction? Any action that reads from and/or writes to a database. It may consist of: a simple SELECT statement to generate a list of table contents; a series of related UPDATE statements to change the values of attributes in various tables; a series of INSERT statements to add rows to one or more tables; a combination of SELECT, UPDATE, and INSERT statements.

72 What is a Transaction? (cont’d) A logical unit of work that must be either entirely completed or aborted. A successful transaction changes the database from one consistent state to another; a consistent state is one in which all data integrity constraints are satisfied. Most real-world database transactions are formed by two or more database requests, where a database request is the equivalent of a single SQL statement in an application program or transaction.

73 Evaluating Transaction Results Not all transactions update the database SQL code represents a transaction because database was accessed Improper or incomplete transactions can have a devastating effect on database integrity Some DBMSs provide means by which user can define enforceable constraints based on business rules Other integrity rules are enforced automatically by the DBMS when table structures are properly defined, thereby letting the DBMS validate some transactions

74 Transaction Properties Atomicity: requires that all operations (SQL requests) of a transaction be completed; if not, the transaction is aborted. Durability: indicates the permanence of the database’s consistent state.

75 Transaction Properties (continued) Serializability: ensures that the concurrent execution of several transactions yields consistent results. Isolation: data used during the execution of a transaction cannot be used by a second transaction until the first one is completed.

76 Transaction Management with SQL ANSI has defined standards that govern SQL database transactions. Transaction support is provided by two SQL statements. COMMIT: makes the changes to the DB permanent. ROLLBACK: undoes the changes to the DB back to the last COMMIT point. ANSI standards require that, when a transaction sequence is initiated by a user or an application program, it must continue through all succeeding SQL statements until one of four events occurs.
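
COMMIT and ROLLBACK can be seen at work in a small transfer between two rows. This is a sketch only: the Account table, the transfer helper and the CHECK constraint are hypothetical, and SQLite through Python's sqlite3 module stands in for a full DBMS.

```python
import sqlite3

# isolation_level=None leaves transaction control to the explicit
# BEGIN / COMMIT / ROLLBACK statements issued below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE Account "
    "(owner TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO Account VALUES ('alice', 100), ('bob', 0)")


def transfer(amount):
    """Move amount from alice to bob as one all-or-nothing unit of work."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "UPDATE Account SET balance = balance + ? WHERE owner = 'bob'",
            (amount,))
        conn.execute(
            "UPDATE Account SET balance = balance - ? WHERE owner = 'alice'",
            (amount,))
        conn.execute("COMMIT")      # both updates become permanent together
        return True
    except sqlite3.IntegrityError:
        conn.execute("ROLLBACK")    # undo the credit that already happened
        return False


ok = transfer(60)        # alice has enough money
failed = transfer(60)    # would make alice's balance negative
balances = dict(conn.execute("SELECT owner, balance FROM Account"))
print(ok, failed, balances)
```

In the second call the credit to bob succeeds before the debit violates the constraint; ROLLBACK removes that partial work, so the database returns to the consistent state left by the first COMMIT.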

77 The Transaction Log Stores: a record for the beginning of the transaction; for each transaction component (SQL statement), the type of operation being performed (update, delete, insert), the names of the objects affected by the transaction (the name of the table), the “before” and “after” values for updated fields, and pointers to the previous and next transaction log entries for the same transaction; the ending (COMMIT) of the transaction.

