1 COP 4710 Databases Fall, 2000 Today’s Topic Review for Final Exam David A. Gaitros November 6, 2000 Department of Computer Science Copyright by Dr.

2 1 COP 4710 Databases Fall, 2000 Today’s Topic Review for Final Exam David A. Gaitros November 6, 2000 Department of Computer Science Copyright by Dr. Greg Riccardi

3 2 Outline of Course n Study of principles and techniques of databases n Grades assigned as in information sheet n Examples of use of databases n Programming projects in database design and implementation –Programming in Microsoft Access –Programming in Java with a Unix database –Development of a web site with database support n Course notes in http://www.cs.fsu.edu/cop4710/lectures n Next class, Chapter 2

4 3 Representation of Information n Data is collections of bits –physical database n Information is data with meaning –logical database n Representation of meta-data –database system is self-describing n Database Management System (DBMS) –define information content –construct database –manipulate by queries, reports and updates –data plus software

5 4 Vocabulary n Glossary of terms n Define the terms as used in this subject –Database literature is filled with terms n Example of terms –Data, bits –Information, bits with meaning (type) –Entity –Schema

6 5 Data Modeling n A data model is a specification of the information content of a system –conceptual data model describes information in terms the users will understand –logical data model describes information in a way that can be used to build a database –physical data model describes information in terms of its representation in physical storage

7 6 Schemas and Instances n Schema is the structure of a database –intension or meaning of the data –data models are schemas –table definitions are schemas –class definitions are schemas n Instances are the contents of a database –extension or values of the data –objects are instances –objects in a database are typically rows in a table

8 7 Levels of database schemas n Different schemas are presented to different users

9 8 Database Languages n DDL, data definition language, conceptual schema –describe conceptual schemas n SDL, storage definition language, internal schema –describe file structures, indexes n VDL, view definition language, external schema n DML, data manipulation language –High-level or non-procedural (e.g. SQL): select lastName from Roster where section = 2 –Low-level or procedural: for r in Roster loop if r.section = 2 then result.Add(r.lastName); end if; end loop;
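The contrast between the two DML styles can be sketched in Java, the language used for the course projects. This is only an illustration: the Student class and the sample rows are assumptions, not part of the course database. The loop spells out how to find the rows, where the SQL statement only says which rows are wanted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RosterQuery {
    // Hypothetical row type standing in for the Roster table on the slide.
    static class Student {
        final String lastName;
        final int section;

        Student(String lastName, int section) {
            this.lastName = lastName;
            this.section = section;
        }
    }

    // Procedural equivalent of: select lastName from Roster where section = 2
    static List<String> lastNamesInSection(List<Student> roster, int section) {
        List<String> result = new ArrayList<>();
        for (Student r : roster) {
            if (r.section == section) {
                result.add(r.lastName);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Student> roster = Arrays.asList(
            new Student("Adams", 1), new Student("Baker", 2), new Student("Chen", 2));
        System.out.println(lastNamesInSection(roster, 2)); // prints [Baker, Chen]
    }
}
```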

10 9 Principles of ER Modeling n Entities and classes –Entity, a thing in the real world –Entity Class, the structure of a collection of similar entities n Attributes –Attribute, a property of an entity –Each entity has a value for each of its attributes n Types of attributes –simple vs. composite, single-valued vs. multi-valued, stored vs. derived –domains of attributes

11 10 Relationships Between Entities n Relationship type defines a set of associations among given entity types. n Relationship instances are particular relationships among objects. n Examples of relationship types in company database –Manages: 1:1 between employee and department –Works-for: 1:N between department and employee –Controls: 1:N between department and project

12 11 Find the Entities, Attributes and Relationships

13 12 ER schema diagram for BigHit Video

14 13 Chapter 4 The Relational Data Model n A Relation is a two-dimensional table –Fixed list of columns –One object per row n An attribute represents a single column of a table and has a name and a type n A relation schema is the name and the list of attributes of a relation –Grade (studentId, assignmentId, points, dateSubmitted) n A tuple is a row of a table, one value for each attribute –(123, 14, 27, 5/28/98)

15 14 Characteristics of Relational Model n Relation is a set of tuples –No ordering of tuples –No duplicate tuples (no two rows have all the same values) n Each attribute value is atomic –hence no multiple-valued or composite attributes –called first normal form n Each relation is a set of assertions –Each represents a fact –Some facts are about relationships n That’s it! –no other data structures –no explicit representation of relationships

16 15 Representing E-R Model as Relations n Entity class → Relation schema n Entity → row of table –set of all entities of class → table n Attribute → column definition (attribute) –attribute value → table element n Relationship type → one of –relation schema –attribute(s) of relation schema

17 16 Rules for Relationship Types n One-to-many –For each one-to-many relationship type R between subject class S and target class T, add the key attributes of class S to class T as foreign keys. Name the attributes using the role that S plays in relationship type R. –Add the attributes of the relationship type R to class T. n One-to-one –choose one side and use above rule n Examples in class

18 17 Many-to-many relationship types n Create a relation schema for the relationship type –foreign key attributes for the key of the related schema –add attributes of the relationship type n Examples in class!

19 18 Representing relationships as attributes n One-to-many –For each one-to-many relationship type R between subject class S (one side) and target class T (many side), –add the key attributes of S to the schema of T as foreign keys. –Name the foreign key attributes using the role that S plays in relationship type R. –Add the attributes of the relationship type R to the schema for T. n One-to-one –choose one side and use above rule

20 19 Representing Weak Entity Classes n Create a relation schema –Add foreign key for each defining relationship type –Key is partial key plus defining foreign keys n Consider Fig. 2.5, weak class Rental n Schema: Rental (videoId,dateDue, dateRented, cost) –key videoId (foreign key)

21 20 Representing specialization hierarchies n Three possibilities –1. Create a table for the superclass with its attributes and a table for each subclass with its attributes –2. Create a table for the superclass with all of the subclass attributes –3. Create a table for each subclass that includes both subclass and superclass attributes

22 21 Functional Dependencies and Normalization n Begin by discussing good and bad relation schemas n Informal measures of the quality of relation schema design –Semantics of the attributes –Reducing the redundant values in tuples –Reducing the null values in tuples –Disallowing spurious tuples n Define Normal Forms as formal measures of the quality of schemas –restrictions on the form of relation schemas

23 22 Update Anomalies n Insertion Anomalies –When inserting a new owner, we must correctly insert the Manuf field, or we will create inconsistencies –Cannot create a car without an owner –Cannot create a make without a car and an owner n Deletion Anomalies –Deletion of owner of a car also deletes make and manufacturer of car –Deletion of owner of the last Plymouth deletes relationship between Plymouth and Chrysler n Modification Anomalies –Changing the make of a car requires consistency check –Cannot change so that a Plymouth is made by Ford n Guideline 2: no insertion, deletion, or modification anomalies allowed!

24 23 Some definitions n superkey: a set of attributes of a relation whose values are unique within the relation. n key, a superkey in which removal of any attribute makes it not a superkey. If there is more than one key, they are called candidate keys. n primary key, arbitrarily designated candidate key, all other candidate keys are secondary keys. n prime attribute, one which is a member of any key. n nonprime attribute, one which is not prime.

25 24 Definition of Functional Dependency n A functional dependency is a constraint between 2 sets of attributes from the database –For each value of the first set there is a unique value of the second set n X-->Y restricts the tuples that can be instances of R n if t1 and t2 are instances of R –if t1(X) = t2(X) then t1(Y) = t2(Y) n For example, –{DLNum} --> {Oname} –{CarId} --> {Make, Manuf} –{Make} --> {Manuf} n Candidate keys are left hand sides of functional dependencies
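The definition above can be turned into a small check over a table instance. The sketch below is an illustration only: tuples are modeled as maps from attribute name to value, and the class, the helper names and the sample cars are assumptions. Note that a table instance can only refute a functional dependency; the dependency itself is a constraint stated by the schema designer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FdCheck {
    // Values of the given attributes in one tuple, i.e. t(X).
    static List<String> project(Map<String, String> tuple, List<String> attributes) {
        List<String> values = new ArrayList<>();
        for (String a : attributes) {
            values.add(tuple.get(a));
        }
        return values;
    }

    // True if every pair of tuples with t1(X) = t2(X) also has t1(Y) = t2(Y).
    static boolean holds(List<Map<String, String>> tuples, List<String> x, List<String> y) {
        Map<List<String>, List<String>> seen = new HashMap<>();
        for (Map<String, String> t : tuples) {
            List<String> previous = seen.putIfAbsent(project(t, x), project(t, y));
            if (previous != null && !previous.equals(project(t, y))) {
                return false; // same X value, different Y value
            }
        }
        return true;
    }

    // Hypothetical helper that builds one row of the car registration example.
    static Map<String, String> car(String carId, String make, String manuf) {
        Map<String, String> t = new HashMap<>();
        t.put("CarId", carId);
        t.put("Make", make);
        t.put("Manuf", manuf);
        return t;
    }

    public static void main(String[] args) {
        List<Map<String, String>> cars = Arrays.asList(
            car("1", "Plymouth", "Chrysler"), car("2", "Plymouth", "Ford"));
        // {Make} --> {Manuf} is violated: a Plymouth appears with two manufacturers.
        System.out.println(holds(cars, Arrays.asList("Make"), Arrays.asList("Manuf"))); // prints false
    }
}
```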

26 25 Second Normal Form (2NF) n X-->Y is a full functional dependency if the removal of any attribute A from X removes the dependency –not X-{A} --> Y n X-->Y is a partial dependency if some attribute A may be removed without removing the dependency –X-{A} --> Y n A relation schema R is in 2NF if every nonprime attribute is fully functionally dependent on the primary key of R

27 26 Putting the CarReg Schema into 2NF n Consider the Owner relation schema –{DLNum} is the primary key –Hence Owner is in 2NF n Consider the Car relation schema –{CarId, DLNum} is primary key (multiple owners) –{CarId} --> {Make, Model,...} –Hence Car is not 2NF n Create new relations –CarOwner = {CarId, Owner, PurchDate, TagNum, RegisDate} –Car = {CarId, Make, Model, Manuf, Year, Color} n Is it 2NF?

28 27 Rules for Functional Dependencies n Given a particular set of functional dependencies, we can find others using inference rules –Splitting/combining rules A -> B1 B2 <=> A -> B1 and A -> B2 –Trivial rules A B -> B, for all A, B –Transitive rule A -> B and B -> C => A -> C n We are interested in the closure of the set of functional dependencies under these (and other) rules

29 28 Inference Rules for Functional Dependency n There are semantically obvious functional dependencies, usually specified by schema designer n Other functional dependencies can be inferred from those n Inference rules –Reflexive, if X includes Y then X-->Y –Augmentation, X-->Y then XZ-->YZ –Transitive, X-->Y and Y-->Z then X-->Z –Decomposition, X-->YZ then X-->Y –Union, X-->Y and X-->Z then X-->YZ –Pseudotransitive, X-->Y and WY-->Z then WX-->Z

30 29 Definition of Key n A set of one or more attributes {A1,...Ak} is a key for a relation R –Those attributes functionally determine all other attributes of R no 2 distinct tuples can agree on the key –no proper subset of {A1,... Ak} is a key of R a key must be minimal n There can be more than one key in a relation –Department (DeptName, DeptNo,...) since both are unique, both are keys n A superkey (superset of a key) is a set of attributes that functionally determine all other attributes of the relation.
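The inference rules and the definition of key can be combined in the usual attribute-closure computation: start from a set of attributes and keep adding the right-hand side of any dependency whose left-hand side is already covered. The sketch below is an illustration, not code from the course; the class and method names are assumptions, and the dependencies are the ones from the car registration example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class FdClosure {
    // One functional dependency lhs --> rhs.
    static class Fd {
        final Set<String> lhs;
        final Set<String> rhs;

        Fd(String lhs, String rhs) {
            this.lhs = attrs(lhs);
            this.rhs = attrs(rhs);
        }
    }

    // Parses "A,B,C" into a set of attribute names.
    static Set<String> attrs(String commaSeparated) {
        return new TreeSet<>(Arrays.asList(commaSeparated.split(",")));
    }

    // All attributes functionally determined by the starting set.
    static Set<String> closure(Set<String> start, List<Fd> fds) {
        Set<String> result = new TreeSet<>(start);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Fd fd : fds) {
                // addAll returns true only if it added something new.
                if (result.containsAll(fd.lhs) && result.addAll(fd.rhs)) {
                    changed = true;
                }
            }
        }
        return result;
    }

    static boolean isSuperkey(Set<String> candidate, Set<String> all, List<Fd> fds) {
        return closure(candidate, fds).containsAll(all);
    }

    // A key is a superkey that stops being one when any attribute is removed.
    static boolean isKey(Set<String> candidate, Set<String> all, List<Fd> fds) {
        if (!isSuperkey(candidate, all, fds)) {
            return false;
        }
        for (String a : candidate) {
            Set<String> smaller = new TreeSet<>(candidate);
            smaller.remove(a);
            if (isSuperkey(smaller, all, fds)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Fd> fds = Arrays.asList(
            new Fd("DLNum", "Oname"), new Fd("CarId", "Make,Manuf"), new Fd("Make", "Manuf"));
        Set<String> all = attrs("CarId,DLNum,Make,Manuf,Oname");
        System.out.println(closure(attrs("CarId"), fds));        // prints [CarId, Make, Manuf]
        System.out.println(isKey(attrs("CarId,DLNum"), all, fds)); // prints true
    }
}
```

Testing only the removal of single attributes is enough for minimality, because any superset of a superkey is itself a superkey.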

31 30 Third Normal Form (3NF) n Based on transitive dependency, or non-key dependency n A functional dependency X-->Y is a transitive dependency if there is a set Z which is not a subset of any key, and for which X-->Z and Z-->Y n A relation schema is in 3NF if there is no nonprime attribute which is functionally dependent on a non-key set of attributes. n Example: {Make}-->{Manuf} violates 3NF since Make is not a key.

32 31 Section 6.1 Relational Algebra n Look at the formal basis for operations on the relational data model n An “algebra” is a collection of operations on some domain n Relational Algebra is a collection of operators –operands and results are relations –operators: projection and selection remove parts of a relation; set operators are union, intersection and difference; joins and products combine the tuples of two relations –other operators follow

33 32 Join Operations n Natural join is based on the Cartesian product –With a restriction on the tuples and attributes: each common attribute appears once in the result; tuples are included only where the common attributes have the same values –R join S on A has those tuples of R × S where R.A = S.A –Each tuple from R is joined to all tuples of S that have the same value for attribute A n Example –Every combination of Customer and Rental where the accountId fields match
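The join described above can be sketched as a filtered Cartesian product. In this illustration tuples are maps from attribute name to value, the row helper and the sample data are assumptions, and both relations are assumed to contain the join attribute. Merging the two maps is what makes the common attribute appear only once in the result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

class NaturalJoin {
    // R join S on attr: pairs from R x S that agree on attr, merged into one tuple.
    static List<Map<String, Object>> join(
            List<Map<String, Object>> r, List<Map<String, Object>> s, String attr) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> rt : r) {
            for (Map<String, Object> st : s) {
                if (Objects.equals(rt.get(attr), st.get(attr))) {
                    Map<String, Object> joined = new LinkedHashMap<>(rt);
                    joined.putAll(st); // the common attribute is stored once
                    result.add(joined);
                }
            }
        }
        return result;
    }

    // Hypothetical helper: row("accountId", 113, "lastName", "Doe").
    static Map<String, Object> row(Object... namesAndValues) {
        Map<String, Object> t = new LinkedHashMap<>();
        for (int i = 0; i < namesAndValues.length; i += 2) {
            t.put((String) namesAndValues[i], namesAndValues[i + 1]);
        }
        return t;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> customer = Arrays.asList(
            row("accountId", 101, "lastName", "Block"),
            row("accountId", 113, "lastName", "Doe"));
        List<Map<String, Object>> rental = Arrays.asList(
            row("accountId", 113, "videoId", 90987),
            row("accountId", 113, "videoId", 99787));
        System.out.println(join(customer, rental, "accountId").size()); // prints 2
    }
}
```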

34 33 Combining Operations to Form Queries n Can put all operations together –Names and grades of students who took quiz 1 n We’ll see how this works in Access n In class, time permitting –Demonstration of Queries in Access

35 34 Relational Expressions Select account 113, project videoId and dateDue –π_{videoId, dateDue}(σ_{accountId=113}(Rental)) VideoId, title and date due for account 113 –π_{videoId, title, dateDue}((σ_{accountId=113}(Rental)) ⋈_{videoId} Videotape ⋈_{movieId} Movie) –π_{videoId, title, dateDue}(σ_{accountId=113}(Rental ⋈_{videoId} Videotape ⋈_{movieId} Movie)) What is the order of evaluation?

36 35 Chapter 7: SQL n Structured Query Language –ANSI and ISO standard –SQL2 or SQL-92 is current standard n SQL is a data manipulation language (DML) and a data definition language (DDL) and a programming language n We can use SQL for –Logical database specification (database schema definitions) –Physical database specifications (indexes, etc.) –Querying database contents –Modifying database contents

37 36 Relational Operations in SQL n Select statement –select from where n Projection in SQL using select clause –Select title from Movies n Selection in SQL using where clause –select * from Customer where lastName = 'Doe' –select distinct lastName, firstName from Customer (no duplicates with distinct)

38 37 Products and Joins in SQL n Cartesian product in SQL using from clause –Select * from Employee, TimeCard n Join using from and where clauses –Select * from Employee, TimeCard where Employee.ssn = TimeCard.ssn n Join using join and on (SQL-92 syntax, not supported by every system) –Select * from Employee join TimeCard on Employee.ssn = TimeCard.ssn

39 38 Nested Queries n Nested select query –Select videoId, dateAcquired from Videotape where videoId in ( select videoId from Rental where dateRented='1/1/99') n compare with –Select v.videoId, dateAcquired from Videotape v, Rental r where v.videoId = r.videoId and dateRented='1/1/99' n Same result?

40 39 Select Using Group by and Having n Group by forms groups of rows with the same column values n What is the average hourly rate by store? –select storeId, avg(hourlyRate) from HourlyEmployee e, WorksAt w where e.ssn = w.ssn group by storeId n How many employees work at each store? –select s.storeId, name, count (*) from Store s, WorksAt w where s.storeId = w.storeId group by s.storeId, name n Having filters the groups –having count (*)>2

41 40 Substrings, arithmetic and order n Find a movie with 'Lion' in the title –select title from Movie where title like '%Lion%' n List the monthly salaries of salaried employees who work in store 3 –select salary/12 from Employees e, WorksAt w where e.ssn=w.ssn and storeId=3 n Give the list of employees in store 3, ordered by salary –select firstName, lastName from Employees e, WorksAt w where e.ssn=w.ssn and storeId=3 order by salary

42 41 Modifying Content with SQL n Insert queries –insert into Customer values (555, 'Yu', 'Jia','540 Magnolia Hall','Tallahassee', 'FL', '32306') –insert into Customer (firstName, lastName, accountId) values ('Jia', 'Yu', 555) n Update queries –update TimeCard set paid = true where paid = false –update HourlyEmployee set hourlyRate = hourlyRate *1.1 where ssn = '145-09-0967' n Samples in Access

43 42 Creating Pay Statements with SQL n Find the number of hours worked for each employee entry –select TimeCard.ssn, sum((endTime-startTime)*24) as hoursWorked from TimeCard where paid=false group by ssn n Create the Pay Statement entries for each Employee –select ssn, hourlyRate, hoursWorked, hoursWorked * hourlyRate as amountPaid, today from … n Insert into the PayStatement table –Insert into PayStatement select … n Look at the Access example in BigHit.mdb

44 43 Create Table Statement n create table Customer ( accountId int, lastName varchar(32), firstName varchar(32), street varchar(100), city varchar(32), state char(2), zipcode varchar(9) ) n Note that SQL has specific types

45 44 Key Constraints in SQL n Key declarations are part of create table –create table Store ( storeId int primary key, –create table Movie ( movieId varchar(10) primary key, –create table Rental ( accountId int, videoId varchar(10), primary key (accountId, videoId)

46 45 Java Objects and variables n Objects are dynamically allocated –Figures A.1 and A.2 show String variables Assignment (=) and equality (==)

47 46 Java DB Connectivity (JDBC) n Figure 8.4 Strategies for implementing JDBC packages

48 47 Executing Insert and Update Statements n Create new customer, using String concatenation (+) –int rowcount = stmt.executeUpdate( "insert into Customer " +"(accountId,lastName,firstName) " +"values (1239,'Brown','Mary')"); if (rowcount == 0) // insert failed n Update –String updateSQL = "update TimeCard set " +"TimeCard.paid = 'yes' where " +"paid<>'yes'"; int count = stmt.executeUpdate(updateSQL); // count is number of rows affected

49 48 Chapter 13 Query Processing n Strategies for processing queries n Query optimization n First: How to represent relational DB? –Each table is a file Record structure to store tuples File is a random access collection of records –Query is executed by reading records from files Read record, create object in memory Process object Write result as a file of records or keep in memory

50 49 Processing a range query n Figure 13.3 Illustration of query processing for query –select * from Customer where accountId >= 101 and accountId < 300

51 50 Using hashing to eliminate duplicates n A hash function partitions values so that –All values that are the same are in the same partition –Values that are different are often in different partitions n We can find duplicates by hashing –For each tuple in the table Mash all attribute values in the tuple into a single value Apply hash function –For each partition Compare all pairs of tuples Eliminate duplicates –Why does this work?
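The steps above can be sketched in Java. This is an illustration under assumptions: tuples are lists of strings, List.hashCode() plays the role of mashing all attribute values into one value, and the number of partitions is a parameter. Equal tuples always hash to the same partition, so comparisons are needed only inside a partition.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class HashDedup {
    // Returns the distinct tuples; the output is ordered by partition, not by input position.
    static List<List<String>> distinct(List<List<String>> tuples, int partitions) {
        List<List<List<String>>> buckets = new ArrayList<>();
        for (int i = 0; i < partitions; i++) {
            buckets.add(new ArrayList<>());
        }
        for (List<String> t : tuples) {
            // Combine all attribute values into one hash value, then pick a partition.
            int p = Math.floorMod(t.hashCode(), partitions);
            List<List<String>> bucket = buckets.get(p);
            // Compare only against tuples already in the same partition.
            if (!bucket.contains(t)) {
                bucket.add(t);
            }
        }
        List<List<String>> result = new ArrayList<>();
        for (List<List<String>> bucket : buckets) {
            result.addAll(bucket);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> names = Arrays.asList(
            Arrays.asList("Doe", "Jane"), Arrays.asList("Yu", "Jia"), Arrays.asList("Doe", "Jane"));
        System.out.println(distinct(names, 4).size()); // prints 2
    }
}
```

Different tuples may still land in the same partition, which is why the comparison inside each partition cannot be skipped.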

52 51 Processing join queries with indexes n Indexed nested loop join –while (!customer.eof()) { Customer c = customer.read(); Rental[] r = rental.readByAcctId(c.accountId); for (int i=0; i<r.length; i++) { result.write(c, r[i]); } } n Cost is B_c + R_r instead of B_c + R_c × B_r without index n Reduce cost by processing a block at a time?
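An in-memory version of the indexed join can be sketched as follows. The HashMap stands in for the index on Rental.accountId, and the Customer and Rental classes are simplified assumptions, not the file-based record classes used on the slide. Each customer triggers one index lookup instead of a scan of all rentals.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class IndexedJoin {
    static class Customer {
        final int accountId;
        final String lastName;

        Customer(int accountId, String lastName) {
            this.accountId = accountId;
            this.lastName = lastName;
        }
    }

    static class Rental {
        final int accountId;
        final int videoId;

        Rental(int accountId, int videoId) {
            this.accountId = accountId;
            this.videoId = videoId;
        }
    }

    // Builds the index: accountId -> rentals for that account.
    static Map<Integer, List<Rental>> buildIndex(List<Rental> rentals) {
        Map<Integer, List<Rental>> index = new HashMap<>();
        for (Rental r : rentals) {
            index.computeIfAbsent(r.accountId, k -> new ArrayList<>()).add(r);
        }
        return index;
    }

    // One pass over customers, one index lookup per customer.
    static List<String> join(List<Customer> customers, Map<Integer, List<Rental>> index) {
        List<String> result = new ArrayList<>();
        for (Customer c : customers) {
            for (Rental r : index.getOrDefault(c.accountId, Collections.emptyList())) {
                result.add(c.lastName + ":" + r.videoId);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Customer> customers = Arrays.asList(new Customer(101, "Block"), new Customer(113, "Doe"));
        List<Rental> rentals = Arrays.asList(new Rental(113, 90987), new Rental(113, 99787));
        System.out.println(join(customers, buildIndex(rentals))); // prints [Doe:90987, Doe:99787]
    }
}
```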

53 52 ACID Transactions n Atomicity: the property of a transaction that all of the updates are successful, or there is no update at all. n Consistency: each transaction should leave the database in a consistent state. Properties such as referential integrity must be preserved. n Isolation: each transaction when executed concurrently with other transactions should have the same effect as if it had been executed by itself. n Durability: once a transaction has completed successfully, its changes to the database should be permanent. Even serious failures should not affect the permanence of a transaction.

54 53 Example of transaction open transaction videoId video1 = select id of a copy of "Star Wars" if (video1 == null) rollback transaction insert row into Reservation for video1 videoId video2 = select id of a copy of "Return of the Jedi" if (video2 == null) rollback transaction insert row into Reservation for video2 videoId video3 = select id of a copy of "The Empire Strikes Back" if (video3 == null) rollback transaction insert row into Reservation for video3 commit transaction

55 54 Transaction isolation n Consider these transactions –Actions of T1 A: balance1 = (select balance from Customer where accountId = 101); balance1 += 5.00; B: update Customer set balance = ?balance1 where accountId = 101; –Actions of T2 A: balance2 = (select balance from Customer where accountId = 101); balance2 += 10.00; B: update Customer set balance = ?balance2 where accountId = 101; n Problems –Lost update: T1.A, T2.A, T1.B, T2.B –Dirty read: T1.A, T1.B, T2.A, T1.rollback, T2.B, and T2 commit –Incorrect Summary: example in class
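The lost update schedule can be traced with plain variables. This sketch simulates the interleaving in a single thread; the local variable balance stands in for the balance column of account 101, and the class and method names are assumptions.

```java
class LostUpdate {
    // Schedule T1.A, T2.A, T1.B, T2.B: both transactions read before either writes.
    static double interleaved(double initial) {
        double balance = initial;     // Customer.balance for accountId 101
        double balance1 = balance;    // T1.A: read
        balance1 += 5.00;
        double balance2 = balance;    // T2.A: read the same old value
        balance2 += 10.00;
        balance = balance1;           // T1.B: write
        balance = balance2;           // T2.B: overwrites T1's write
        return balance;
    }

    // Serial schedule T1.A, T1.B, T2.A, T2.B.
    static double serial(double initial) {
        double balance = initial;
        double balance1 = balance;    // T1.A
        balance1 += 5.00;
        balance = balance1;           // T1.B
        double balance2 = balance;    // T2.A sees T1's update
        balance2 += 10.00;
        balance = balance2;           // T2.B
        return balance;
    }

    public static void main(String[] args) {
        System.out.println(interleaved(100.00)); // prints 110.0, the 5.00 credit is lost
        System.out.println(serial(100.00));      // prints 115.0
    }
}
```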

56 55 Locking database objects n Allow transaction operations to lock objects –Read (shared) locks –Write (exclusive) locks n Lock granularity –What size object to lock? –Table, row, field, column n Effect on concurrency –T1: Select sum(balance) from Customers –T2: Update Customers set firstName='Joe' where accountId=101 n Effect on size and cost –Smaller objects = more locks

57 56 Two phase locking (2PL) n Locks granted and released in two phases –Growing phase Request and upgrade locks –Request read on X –Request write on X –Shrinking phase Release and downgrade locks –Request read on X (downgrade from write) –Release read on X n 2PL guarantees serializability –Any conflicting operation is blocked
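The two-phase rule itself is easy to state as a check on the sequence of lock operations issued by one transaction: once anything has been released or downgraded, nothing may be acquired or upgraded. The sketch below is an illustration only; the Op enum and the method name are assumptions, and it checks the order of operations for a single transaction, not conflicts between transactions.

```java
import java.util.Arrays;
import java.util.List;

class TwoPhaseCheck {
    // Lock operations a transaction may issue.
    enum Op { READ_LOCK, WRITE_LOCK, UPGRADE, DOWNGRADE, UNLOCK }

    // True if all acquisitions and upgrades come before the first release or downgrade.
    static boolean isTwoPhase(List<Op> ops) {
        boolean shrinking = false;
        for (Op op : ops) {
            boolean releases = (op == Op.UNLOCK || op == Op.DOWNGRADE);
            if (releases) {
                shrinking = true;   // the shrinking phase has started
            } else if (shrinking) {
                return false;       // growing again after shrinking violates 2PL
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isTwoPhase(Arrays.asList(
            Op.READ_LOCK, Op.UPGRADE, Op.DOWNGRADE, Op.UNLOCK)));  // prints true
        System.out.println(isTwoPhase(Arrays.asList(
            Op.READ_LOCK, Op.UNLOCK, Op.WRITE_LOCK, Op.UNLOCK)));  // prints false
    }
}
```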

58 57 Transaction problems n Lost update –Two transactions update, last one persists n Dirty read –One transaction reads a value written by a transaction that subsequently rolls back n Incorrect summary –One transaction calculates an aggregate while another is updating n Unrepeatable read –One transaction reads the same object twice and receives two different values n Phantom read –A transaction repeats a query and finds rows that another transaction inserted in the meantime n Deadlock –Two transactions each hold a lock that the other requests

59 58 Transactions in SQL n Transaction management statements –set transaction read only; –set transaction read write; –set transaction isolation level serializable; –commit transaction; –rollback transaction; n Executing SQL statement without opening transaction –autocommit mode

60 59 Causes of Failure, Possibilities of Recovery n Database server –computer crashes –server program crashes –disk drive corruption n Client failure –computer crashes –client program crashes n Network failure –connection fails, often temporary n Transaction failure –executes rollback (voluntary) –executes illegal operation (server created) –deadlock –introduces errors into the database

61 60 Recovery from failure n Primary technique, restart from consistent backup/checkpoint n Reprocessing –ask all committed transactions to execute again n Roll Forward –Back to consistent backup state –Apply redo transaction log n Roll Back –Remove the effect of each transaction with undo log –Can be used to cancel the effects of rogue transactions

62 61 Security in Relational Database Systems n Account security for validation of users –Database accounts –Operating system accounts n SQL statements for security –create user –alter user –create profile –create role –grant privileges to users, roles

63 62 Stored Procedures n Define numberRented function –create function numberRented (accId int) return int as select count(*) from Rental where Rental.accountId = accId; n Define checkIn procedure –create procedure checkIn (vidId int, cost double) as begin insert into PreviousRental … n Grant privileges to procedures –grant update on PreviousRental to checkIn –grant checkIn to clerk –revoke update on PreviousRental from public n User in the clerk role can update the table, no one else can

64 63 Distributed Database Systems [Figure: three sites, each with its own database, a DDBMS, application programs (AP 1 and AP 2; AP 1 and AP 2; AP 2 and AP 3), OS net and OS dm] n OS net = Network Communications portion of Operating System n OS dm = Data management portion of Operating System n DDBMS = Distributed Database System

65 64 Distributed Databases n Single schema with multiple servers –Not one application connecting to multiple servers –An application connects to a single server n Fragmentation of tables –Horizontal, rows in different servers –Vertical, columns in different servers –Replicated, some rows or columns in multiple servers n Distributed Transactions –Two phase commit –Discussion in class

