Database Systems Marcus Kaiser School of Computing Science Newcastle University.

1 Database Systems Marcus Kaiser School of Computing Science Newcastle University

2 Recap: Data Inconsistency when a Computer Fails
A bank wishes to move £500 from account 12 to account 17.
The sequence of actions is:
  1. Reduce the Balance of account 12 by £500
  2. Increase the Balance of account 17 by £500
What if the computer crashes after step 1 and before step 2?

3 Transactions
A Transaction is a logical unit of work.
A Transaction can consist of a sequence of database operations, e.g.
  1. Reduce the Balance of account 12 by £500
  2. Increase the Balance of account 17 by £500
A Transaction either executes in its entirety or is totally cancelled.

4 Transactions (examples)
A Transaction either executes in its entirety or is totally cancelled, e.g.
  Start Transaction
  1. Reduce the Balance of account 12 by £500
  2. Increase the Balance of account 17 by £500
  End Transaction
Case A. Both steps of the transaction complete successfully: the database has been changed.
Case B. The computer crashes after Step 1: the DBMS restores the database to the state it was in before the Transaction began.
How does it do this?
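The two cases above can be tried out with a small sketch. It uses Python's built-in sqlite3 module rather than any DBMS named in the slides, and the table name `account`, the starting balances (1000 and 2000) and the `crash_after_step_1` flag are assumptions made for illustration. A raised exception only stands in for a crash; a real crash is handled by the DBMS on restart.

```python
import sqlite3

def transfer(conn, src, dst, amount, crash_after_step_1=False):
    """Move `amount` from `src` to `dst` as one transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
            if crash_after_step_1:
                raise RuntimeError("simulated crash between step 1 and step 2")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
    except RuntimeError:
        pass  # the transaction was cancelled, so step 1 has been undone

def balances(conn):
    """Return {account id: balance} for every account."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))

# Hypothetical starting state: account 12 holds 1000, account 17 holds 2000.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(12, 1000), (17, 2000)])
conn.commit()
```

Calling `transfer(conn, 12, 17, 500, crash_after_step_1=True)` leaves both balances untouched (Case B), while `transfer(conn, 12, 17, 500)` changes both (Case A).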

5 Logging
The DBMS keeps a log (on disk) in which it records:
  transaction starts
  database updates (old and new values)
  transaction ends
e.g.
  Start Transaction
  1. Reduce the Balance of account 12 by £500
  2. Increase the Balance of account 17 by £500
  End Transaction
If all goes well, the log entries are:
  Start Transaction
  Update (Account: 12, 1000 -> 12, 500)
  Update (Account: 17, 2000 -> 17, 2500)
  End Transaction

6 Logging
e.g.
  Start Transaction
  1. Reduce the Balance of account 12 by £500
  2. Increase the Balance of account 17 by £500
  End Transaction
If the computer crashes after Step 1, the log entries are:
  Start Transaction
  Update (Account: 12, 1000 -> 12, 500)
When the computer comes up again, the DBMS can undo all updates made by incomplete transactions by restoring the old values recorded in the log.
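The undo step can be sketched in a few lines of Python. This is a toy model, not how any particular DBMS stores its log: the tuple layout `(kind, transaction, account, old, new)`, the transaction name `T1` and the function name `undo_incomplete` are all assumptions for illustration.

```python
def undo_incomplete(db, log):
    """Restore old values for updates whose transaction never logged an end."""
    finished = {rec[1] for rec in log if rec[0] == "end"}
    for rec in reversed(log):  # undo the newest update first
        if rec[0] == "update" and rec[1] not in finished:
            _kind, _txn, account, old, _new = rec
            db[account] = old
    return db

# Log left behind by a crash after Step 1: no "end" record for T1.
crash_log = [
    ("start", "T1"),
    ("update", "T1", 12, 1000, 500),  # (kind, txn, account, old, new)
]
db_after_crash = {12: 500, 17: 2000}
```

Running `undo_incomplete(db_after_crash, crash_log)` puts account 12 back to its old value, so the database again looks as it did before the transaction began.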

7 Logging Example
A personnel department keeps a database with 2 tables:

Employee:
Id  Title  Initial  Surname  Job
12  Mr     A        Smith    Designer 1
75  Ms     K        Smith    Designer 2
34  Mrs    M        Brown    Implementer

Payroll:
Id  Wage

Mrs Brown is promoted to Manager, and her salary is increased.
The transaction is:
  Start Transaction
  1. Update the Job of Employee Id 34 to Manager
  2. Update the Wage of Id 34 to the new salary
  End Transaction

8 Avoiding Data Loss when a Disk Fails
The Log can also allow us to recover from disk failure.
The database is regularly copied onto tape (archiving); nightly is common.
The log is stored on a different disk from the database.

9 Actions on Disk Failure
If a Database Disk fails:
  1. Replace the disk
  2. Copy the last database archive back onto the disk
  3. Process all log entries made after the last archive:
     if the log entry is an update for a completed transaction, then apply it again
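Step 3 can be sketched as a replay loop. As before this is a toy model under stated assumptions: the database is a Python dict, log records are tuples of the form `(kind, transaction, account, old, new)`, and the names `recover_from_archive`, `T1` and `T2` are hypothetical.

```python
def recover_from_archive(archive, log_since_archive):
    """Rebuild the database from the last archive plus the log."""
    db = dict(archive)  # step 2: copy the archive back
    completed = {rec[1] for rec in log_since_archive if rec[0] == "end"}
    for rec in log_since_archive:  # step 3: replay oldest first
        if rec[0] == "update" and rec[1] in completed:
            _kind, _txn, account, _old, new = rec
            db[account] = new
    return db

archive = {12: 1000, 17: 2000}
log_since_archive = [
    ("start", "T1"),
    ("update", "T1", 12, 1000, 500),
    ("update", "T1", 17, 2000, 2500),
    ("end", "T1"),
    ("start", "T2"),
    ("update", "T2", 12, 500, 400),  # T2 never finished, so it is skipped
]
```

Only the updates of the completed transaction `T1` are applied, so the recovered state is the archive plus the finished transfer.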

10 Recap: Simultaneous Access to the Data
Sometimes problems can occur when a file is being updated if there is more than one user.
e.g. Sue and Jim have a joint bank account. They go shopping separately and both run out of money at the same time, so they both head for the nearest ATM.

Newcastle University ATM       Metro Centre ATM
Check Balance                  Check Balance
ATM says there’s £200          ATM says there’s £200
Ask for £200                   Ask for £200
ATM finds £200 in account      ATM finds £200 in account
ATM gives £200                 ATM gives £200
ATM stores £0 in Balance       ATM stores £0 in Balance

11 Atomic Transactions
Transactions can be made atomic.
An atomic action has exclusive access to the data.
Changing the balance on an account is atomic:
  either Sue or Jim will get exclusive access to the balance change;
  the other will need to wait until the atomic transaction is finished,
  at which stage they will see the new balance.
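The idea of exclusive access can be illustrated with a lock. This is only an in-process analogy, assuming two threads stand in for the two ATMs. A DBMS uses its own concurrency control, and the class `Account` and function `run_two_atms` are invented for this sketch.

```python
import threading

class Account:
    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def withdraw(self, amount):
        """Check and update the balance as one atomic step."""
        with self._lock:  # only one ATM at a time gets past this line
            if self.balance >= amount:
                self.balance -= amount
                return amount  # cash paid out
            return 0  # the other ATM already emptied the account

def run_two_atms(start=200, request=200):
    """Let two ATMs ask for `request` at the same time."""
    account = Account(start)
    paid = []
    threads = [
        threading.Thread(target=lambda: paid.append(account.withdraw(request)))
        for _ in range(2)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return account.balance, sorted(paid)
```

Because the check and the update happen under the same lock, exactly one of the two requests is paid and the other sees the new balance of £0.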

12 Keys Primary Keys, Foreign Keys and Candidate Keys

13 Keys
A key is a subset of the fields of a table that uniquely identifies a record.
Primary Key: the key chosen to identify records within a table
Foreign Key: a field that refers to the primary key of another table
Candidate Key: one of the possible options for the primary key

Staff Id  Name     Postcode  Phone
001       Scott T  NE1 7RU
(Staff Id is the Primary Key)

Staff Id  Job Title  Salary
001       Chef       £10,
          Manager    £11,000
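A short sketch of how the two kinds of key behave in practice, using Python's built-in sqlite3 module. The table names `staff` and `job`, the snake_case column names and the helper `try_insert` are assumptions; phone and salary are left as NULL because their values are not given in full on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
conn.execute(
    "CREATE TABLE staff (staff_id TEXT PRIMARY KEY, name TEXT, postcode TEXT, phone TEXT)")
conn.execute(
    "CREATE TABLE job (staff_id TEXT PRIMARY KEY REFERENCES staff(staff_id), "
    "job_title TEXT, salary INTEGER)")
conn.execute("INSERT INTO staff VALUES ('001', 'Scott T', 'NE1 7RU', NULL)")
conn.execute("INSERT INTO job VALUES ('001', 'Chef', NULL)")

def try_insert(sql):
    """Run an INSERT and report whether a key constraint rejected it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"
```

A second staff row with Staff Id 001 is rejected by the primary key, and a job row for a Staff Id that does not exist in `staff` is rejected by the foreign key.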

14 Database Normalization Helping to identify good designs

15 Database Normal Forms
Normal forms are a set of requirements on a database.
They won’t tell you that you’ve got a good design, just that your design isn’t bad.
There are about 7 normal forms, each progressively more restrictive, though the first three are the ones used most often.
When you’ve designed your tables, you can use them to check that you’ve not made a bad design.

16 First Normal Form (1NF)
A table is in 1NF if:
  1. There's no top-to-bottom ordering to the rows.
  2. There's no left-to-right ordering to the columns.
  3. There are no duplicate rows.
  4. Every row-and-column intersection contains exactly one value from the applicable domain (and nothing else).
  5. All columns are regular (i.e. rows have no hidden components such as row IDs, object IDs, or hidden timestamps).

17 Examples of NOT 1NF

Name    ID  Module
Scott   1   M1
            M2
Simons  2   M1
            M2
            M3
Trevor  3   M3

Name    ID  Mod  Mod2  Mod3
Scott   1   M1   M2
Simons  2   M1   M2    M3
Trevor  3   M3

Name    ID  Mod
Scott   1   M1
Simons  2   M1
Trevor  3   M3

Name    ID  Mod
Scott   1   M1,M2
Simons  2   M1,M2,M3
Trevor  3   M3

M2 M3

18 Fixing 1NF problems
In general, splitting a table into separate tables can fix 1NF problems.

Name    ID
Scott   1
Simons  2
Trevor  3

StudentID  ModuleID
1          M1
1          M2
2          M1
2          M2
2          M3
3          M3
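The split can be sketched as a small function. It assumes the starting point is the variant of the table that packs several modules into one comma-separated cell; the function name `split_modules` and the tuple layout are choices made for this illustration.

```python
def split_modules(rows):
    """Split (name, id, 'M1,M2') rows into a student table and an enrolment table."""
    students = [(name, sid) for name, sid, _mods in rows]
    enrolments = [
        (sid, module.strip())
        for _name, sid, mods in rows
        for module in mods.split(",")
    ]
    return students, enrolments

not_1nf = [
    ("Scott", 1, "M1,M2"),
    ("Simons", 2, "M1,M2,M3"),
    ("Trevor", 3, "M3"),
]
students, enrolments = split_modules(not_1nf)
```

Each cell of the two resulting tables now holds exactly one value, which is what rule 4 of 1NF asks for.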

19 Second Normal Form (2NF)
A table is in 2NF if:
  it is in 1NF, and
  every non-key attribute in the table depends on the whole of the candidate key and not just part of it.

20 Examples of NOT 2NF

Student ID  A-level Subject  Degree
001         Maths            Comp-Sci
001         Physics          Comp-Sci
001         Biology          Comp-Sci
002         Maths            Maths
002         Physics          Maths
002         Business         Maths

Name  Sport      Address
Tim   Football   11, The Acres
Tim   Tennis     11, The Acres
Mary  Football   44, Beech St
Mary  Badminton  44, Beech St

CK (candidate key): Student ID + A-level Subject in the first table, Name + Sport in the second.

21 Fixing 2NF problems
Again, in general, splitting the table up will solve 2NF problems.

Student ID  A-level Subject
001         Maths
001         Physics
001         Biology
002         Maths
002         Physics
002         Business

Student ID  Degree
001         Comp-Sci
002         Maths

Name  Sport
Tim   Football
Tim   Tennis
Mary  Football
Mary  Badminton

Name  Address
Tim   11, The Acres
Mary  44, Beech St
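One way to convince yourself that the split loses nothing is to rebuild the original rows from the two smaller tables. The sketch below does this for the Name/Sport/Address example; the function names `decompose` and `rejoin` are invented for illustration.

```python
hobbies = [
    ("Tim", "Football", "11, The Acres"),
    ("Tim", "Tennis", "11, The Acres"),
    ("Mary", "Football", "44, Beech St"),
    ("Mary", "Badminton", "44, Beech St"),
]

def decompose(rows):
    """Split (name, sport, address) rows into Name/Sport and Name/Address tables."""
    name_sport = sorted({(name, sport) for name, sport, _addr in rows})
    name_address = sorted({(name, addr) for name, _sport, addr in rows})
    return name_sport, name_address

def rejoin(name_sport, name_address):
    """Join the two tables back together on Name."""
    address_of = dict(name_address)  # works because Name alone determines Address
    return [(name, sport, address_of[name]) for name, sport in name_sport]

name_sport, name_address = decompose(hobbies)
```

Each address is now stored once instead of once per sport, and `rejoin` still recovers every original row.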

22 Third Normal Form (3NF)
A table is in 3NF if:
  it is in 2NF, and
  every non-key attribute depends only on the candidate key and not on any other non-key attribute.

23 Examples of NOT 3NF

Module  Module Year  Module Leader ID  Module Leader Name
CSC                  0001              Jenny Palmer
CSC                  0004              Paul Watson
CSC                  0002              Marcus Kaiser
CSC                  0002              Marcus Kaiser
CSC                  0003              Pete Lee

CK: candidate key

24 Fixing 3NF problems
Again, splitting into separate tables can fix 3NF problems.

Module  Module Year  Module Leader ID
CSC                  0001
CSC                  0004
CSC                  0002
CSC                  0002
CSC                  0003

Module Leader ID  Module Leader Name
0001              Jenny Palmer
0002              Marcus Kaiser
0003              Pete Lee
0004              Paul Watson

25 Mnemonic for normal forms
Data should depend on
  the key (1NF): no duplicate entries,
  the whole key (2NF): not only part of the candidate key,
  and nothing but the key (3NF): no dependency on other attributes.

26 More SQL

27 Running Example
A company keeps records for boat hires.
Each boat is crewed by a sailor.
Boats can be reserved.
Three tables: Sailors, Boats and Reservations

Sailors:       Sid  Sname  Rating  Age
Boats:         Bid  Bname  Color
Reservations:  Sid  Bid    Day

28 Rough Matching
If you don’t know exactly what you’re looking for in a string, you can use LIKE:
  _  matches exactly one unknown character
  %  matches 0 or more unknown characters

  SELECT Age FROM Sailors WHERE Sailors.Sname LIKE 'Ne_o'
would match 'Nemo', 'Neto', 'Nebo', ...

  SELECT Age FROM Sailors WHERE Sailors.Sname LIKE 'Ne%o'
would match 'Neo', 'Nemo', 'Nemo von bo', ...
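The two wildcards can be compared side by side with Python's built-in sqlite3 module. The sample rows (including the name 'Ned', the ratings and the ages) are made up for this sketch, the table is called Sailors as in the list of three tables on the running-example slide, and `names_like` is a hypothetical helper. Note that SQLite's LIKE ignores case for ASCII letters by default, which other databases may not do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (Sid INTEGER PRIMARY KEY, Sname TEXT, Rating INTEGER, Age INTEGER)")
conn.executemany(
    "INSERT INTO Sailors VALUES (?, ?, ?, ?)",
    [(1, "Nemo", 8, 40), (2, "Neo", 5, 30), (3, "Nemo von bo", 7, 50), (4, "Ned", 3, 20)],
)

def names_like(pattern):
    """Return the sailor names matching a LIKE pattern, in Sid order."""
    rows = conn.execute(
        "SELECT Sname FROM Sailors WHERE Sname LIKE ? ORDER BY Sid", (pattern,))
    return [name for (name,) in rows]
```

With this data, `names_like("Ne_o")` finds only 'Nemo', because `_` needs exactly one character, while `names_like("Ne%o")` also finds 'Neo' and 'Nemo von bo'.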

29 Mathematical Operations
You can use mathematical operations within the SELECT statement.
You’ve already seen functions such as MIN, MAX, AVG.
You can also use +, -, *, /, %
What is the sailor’s rating per year of age?
  SELECT Rating / Age FROM Sailors
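One detail worth checking when dividing is the column type. The sketch below, using Python's built-in sqlite3 module, assumes Rating and Age are stored as integers; the sample row and the helper `one_value` are made up. In SQLite, dividing one integer by another gives an integer, so a common workaround is to multiply by 1.0 first. Other databases may behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (Sid INTEGER PRIMARY KEY, Sname TEXT, Rating INTEGER, Age INTEGER)")
conn.execute("INSERT INTO Sailors VALUES (1, 'Nemo', 8, 40)")

def one_value(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]

integer_result = one_value("SELECT Rating / Age FROM Sailors")     # integer division
real_result = one_value("SELECT Rating * 1.0 / Age FROM Sailors")  # real division
remainder = one_value("SELECT Age % Rating FROM Sailors")          # modulo
```

For a rating of 8 and an age of 40, the integer form yields 0 while the real form yields 0.2.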

30 Union, Intersect, Except
SQL provides set-manipulation constructs:
  UNION (∪)
  INTERSECT (∩)
  EXCEPT (−)
By default, duplicates are eliminated in results.
To retain duplicates, use UNION ALL, INTERSECT ALL, EXCEPT ALL.

31 Find those who’ve hired a red OR green boat

  SELECT S.sid
  FROM Sailors AS S, Boats AS B, Reservations AS R
  WHERE S.sid = R.sid AND R.bid = B.bid
    AND (B.color = 'red' OR B.color = 'green')

OR

  SELECT S.sid
  FROM Sailors AS S, Boats AS B, Reservations AS R
  WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'red'
  UNION
  SELECT S.sid
  FROM Sailors AS S, Boats AS B, Reservations AS R
  WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'green'

Why?

32 Find those who’ve hired a red AND green boat

A first attempt:

  SELECT S.sid
  FROM Sailors AS S, Boats AS B, Reservations AS R
  WHERE S.sid = R.sid AND R.bid = B.bid
    AND (B.color = 'red' AND B.color = 'green')

This returns no rows, because the condition is tested against one boat at a time and no single boat is both red and green. Use INTERSECT instead:

  SELECT S.sid
  FROM Sailors AS S, Boats AS B, Reservations AS R
  WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'red'
  INTERSECT
  SELECT S.sid
  FROM Sailors AS S, Boats AS B, Reservations AS R
  WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'green'

33 Find those who’ve hired a red boat but not a green boat

A first attempt:

  SELECT S.sid
  FROM Sailors AS S, Boats AS B, Reservations AS R
  WHERE S.sid = R.sid AND R.bid = B.bid
    AND (B.color = 'red' AND B.color != 'green')

This returns everyone who has hired a red boat, including those who have also hired a green one, because the condition is tested against one boat at a time. Use EXCEPT instead:

  SELECT S.sid
  FROM Sailors AS S, Boats AS B, Reservations AS R
  WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'red'
  EXCEPT
  SELECT S.sid
  FROM Sailors AS S, Boats AS B, Reservations AS R
  WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'green'
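The three set operators can be compared on one small data set with Python's built-in sqlite3 module. Everything about the data is made up for this sketch: sailor 1 hired only a red boat, sailor 2 hired both colours, and sailor 3 hired only a green boat. The colour column is named `color` here, following the Color column of the Boat table in the running example, and `JOIN`, `hired` and `sids` are hypothetical helpers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age INTEGER);
    CREATE TABLE Boats (bid INTEGER PRIMARY KEY, bname TEXT, color TEXT);
    CREATE TABLE Reservations (sid INTEGER, bid INTEGER, day TEXT);
    INSERT INTO Sailors VALUES (1, 'Ann', 7, 30), (2, 'Bob', 8, 40), (3, 'Cat', 5, 25);
    INSERT INTO Boats VALUES (101, 'Ruby', 'red'), (102, 'Fern', 'green');
    INSERT INTO Reservations VALUES
        (1, 101, 'Mon'), (2, 101, 'Tue'), (2, 102, 'Wed'), (3, 102, 'Thu');
""")

# The three-table join shared by every query on these slides.
JOIN = ("SELECT S.sid FROM Sailors AS S, Boats AS B, Reservations AS R "
        "WHERE S.sid = R.sid AND R.bid = B.bid AND ")

def hired(color):
    """SQL text for: sailors who have hired a boat of this colour."""
    return JOIN + f"B.color = '{color}'"

def sids(sql):
    """Run a query and return the sailor ids it produces, sorted."""
    return sorted(sid for (sid,) in conn.execute(sql))

red_or_green = sids(hired("red") + " UNION " + hired("green"))
red_and_green = sids(hired("red") + " INTERSECT " + hired("green"))
red_not_green = sids(hired("red") + " EXCEPT " + hired("green"))
```

On this data the results are `[1, 2, 3]`, `[2]` and `[1]`. By contrast, putting `B.color = 'red' AND B.color != 'green'` into a single WHERE clause returns sailors 1 and 2, since sailor 2's red boat passes the test on its own.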

34 GROUP BY
So far, we’ve applied aggregate operators to all (qualifying) records.
Sometimes, we want to apply them to each of several groups of records.
Consider: Find the age of the youngest sailor for each rating level.
  How many rating levels are there? What are the rating values for these levels? In general, we don’t know!
  Suppose we know that rating values go from 1 to 10; we can write 10 queries that look like this (!):
  For i = 1, 2, ..., 10:
    SELECT MIN(S.age) FROM Sailors S WHERE S.rating = i

35 GROUP BY
To write such queries, we need the GROUP BY clause, a major extension to the basic SQL query form.
E.g. "Find the age of the youngest sailor for each rating level" can be expressed as follows:

  SELECT S.rating, MIN(S.age)
  FROM Sailors AS S
  GROUP BY S.rating

S.rating after GROUP BY is called a grouping-list.
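The grouped query can be run against a few made-up rows using Python's built-in sqlite3 module. The sailors, ratings and ages below are invented for this sketch, and an ORDER BY is added only so that the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age INTEGER)")
conn.executemany(
    "INSERT INTO Sailors VALUES (?, ?, ?, ?)",
    [(1, "Ann", 7, 45), (2, "Bob", 7, 30), (3, "Cat", 10, 35), (4, "Dan", 1, 17)],
)

# One output row per rating level, whatever levels happen to exist.
youngest_per_rating = conn.execute(
    "SELECT S.rating, MIN(S.age) FROM Sailors AS S "
    "GROUP BY S.rating ORDER BY S.rating"
).fetchall()
```

The result has one row for each rating that actually occurs (1, 7 and 10 here), so there is no need to know the rating levels in advance or to write one query per level.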

36 HAVING
What if we’re only interested in some of the groups?
We can restrict this with HAVING.
For sailors over 18, what is the highest rating for each age?

  SELECT S.age, MAX(S.rating)
  FROM Sailors AS S
  GROUP BY S.age
  HAVING S.age > 18
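A small sketch of HAVING in action, again with Python's built-in sqlite3 module and invented rows. The second query is an addition for illustration, not something from the slides: it filters on an aggregate (`COUNT(*)`), which is the kind of condition that cannot be written in a WHERE clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age INTEGER)")
conn.executemany(
    "INSERT INTO Sailors VALUES (?, ?, ?, ?)",
    [(1, "Ann", 9, 17), (2, "Bob", 7, 30), (3, "Cat", 10, 30), (4, "Dan", 3, 45)],
)

# Highest rating per age, keeping only the groups for ages over 18.
best_per_age = conn.execute(
    "SELECT S.age, MAX(S.rating) FROM Sailors AS S "
    "GROUP BY S.age HAVING S.age > 18 ORDER BY S.age"
).fetchall()

# Keep only ages shared by more than one sailor (a condition on the group itself).
shared_ages = conn.execute(
    "SELECT S.age, MAX(S.rating) FROM Sailors AS S "
    "GROUP BY S.age HAVING COUNT(*) > 1 ORDER BY S.age"
).fetchall()
```

The age-17 group is dropped from the first result, and only the age-30 group survives the second.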

37 Summary
Databases are the primary way in which information is managed in organisations.
They offer a range of valuable functions: querying, security, transactions, ...
Database design is important.
Normal forms (1NF, 2NF, 3NF): data should depend on the key, the whole key, and nothing but the key.
You should now be able to:
  explain the main functions of databases
  identify opportunities to exploit them to meet business needs
  design databases
  perform queries against an SQL database

