Failsafe systems Fail by Failing to be Failsafe. Or, to put it simply: “Don’t worry, nothing can go wrong (click) go wrong (click) go wrong (click).”


1 Failsafe systems Fail by Failing to be Failsafe. Or, to put it simply: “Don’t worry, nothing can go wrong (click) go wrong (click) go wrong (click).”

2 Data Integrity If you receive a bill (yes, snail mail still exists) dated 30 Feb 2013, what would your reaction be? Would you pay it? Even though only one thing was wrong, you would not trust the WHOLE THING! That kind of trust has a technical name: integrity. A database has the same problem. If anything we get out of it is inconsistent with the other data, or with what we expect, it loses its integrity; it no longer holds together. Data integrity means we can trust all the data to be meaningful and reliable.

3 Row integrity So basic that it is often overlooked in books. Row integrity means that the values in a row belong together and stay together. Many people use spreadsheets to hold tables. This brings many benefits and makes the data really accessible. But for serious data storage a spreadsheet has a fundamental weakness: it is easy to lose the row integrity. Suppose one column is sorted on its own while the other columns are left as they were. Every row is now scrambled. Even worse, there is no way to tell that this accident has happened. This is why we should use a relational database for the primary storage of important row data.
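The spreadsheet accident is easy to reproduce in a few lines. This is only a sketch with made-up names and balances: two Python lists stand in for two spreadsheet columns, and tuples stand in for database rows.

```python
# Hypothetical data: two spreadsheet "columns" held as separate lists.
names = ["Carol", "Alice", "Bob"]
balances = [300, 100, 200]

# The accident: one column is sorted on its own, the other is left alone.
# Alice now appears to own Carol's 300, and nothing flags the mistake.
scrambled = list(zip(sorted(names), balances))

# Keeping each row as one unit means a sort moves whole rows together.
rows = list(zip(names, balances))
intact = sorted(rows)

print(scrambled)  # [('Alice', 300), ('Bob', 100), ('Carol', 200)]
print(intact)     # [('Alice', 100), ('Bob', 200), ('Carol', 300)]
```

A relational database always moves rows as whole units, so this kind of scrambling cannot happen by accident.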

4 Three more kinds of integrity The bad date error is an example of domain integrity. A domain is the set of allowed values for a column. You already know about primary key integrity: every value must be unique, and none may be missing. Referential integrity is relationship integrity. As we all know, you must be faithful in a relationship. Well, so must databases. If a foreign key references a primary key, then that primary key value must exist, and it must be there at all times. (Just like being there for your partner.)
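Here is a small sketch of how a database can enforce these three kinds of integrity by itself. It uses Python's built-in sqlite3 module; the customer and bill tables, their columns, and the helper functions are all invented for illustration. SQLite has no true date type, so the 30 Feb check is done with Python's datetime module instead.

```python
import sqlite3
from datetime import date

def is_valid_date(year, month, day):
    """Domain check for dates: 30 Feb is outside the domain of real dates."""
    try:
        date(year, month, day)
        return True
    except ValueError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE bill (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        amount_cents INTEGER NOT NULL CHECK (amount_cents >= 0)
    )
""")
conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Bob')")

def rejected(sql):
    """Return True if the database refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

bad_date_accepted = is_valid_date(2013, 2, 30)
# Primary key integrity: customer 1 already exists.
duplicate_key_rejected = rejected("INSERT INTO customer (id, name) VALUES (1, 'Alice')")
# Referential integrity: there is no customer 99 to bill.
orphan_bill_rejected = rejected("INSERT INTO bill (id, customer_id, amount_cents) VALUES (1, 99, 500)")
# Domain integrity: a negative amount is outside the allowed range.
negative_amount_rejected = rejected("INSERT INTO bill (id, customer_id, amount_cents) VALUES (2, 1, -5)")
```

In each case the bad data never gets in, so nobody has to remember to check for it later.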

5 Transactions… How do you transfer $100 from account A to account B? What would a script look like? How about:
Read account A balance
Subtract $100 from A balance
Write new balance to account A
Read account B balance
Add $100 to account B
Write new balance to account B
You could change the order of some of these lines and it would still work, right?

6 But there is a problem… Anywhere during your procedure, pull out the power cable from the wall.
Read account A balance
Subtract $100 from A balance
Write new balance to account A
Read account B balance
---- CRASH ----
Add $100 to account B
Write new balance to account B
Now what are you going to do?

7 OK, so we start the computer up and just run the procedure again? No, that makes things even worse! Account A would be debited a second time. It seems that there are some data changes that involve more than one row. They must be run on an ALL OR NOTHING basis. This way of processing is called a transaction. A transaction is a unit of work that must either be completed or be sent back to the start. It cannot be left partially completed.
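To see why a simple rerun makes things worse, here is a sketch of the naive script in Python. The starting balances, the Crash exception and the crash_before_credit switch are all made up; the exception merely stands in for the power cable being pulled.

```python
class Crash(Exception):
    """Stands in for the power cable being pulled out."""

def transfer_script(accounts, amount, crash_before_credit=False):
    a = accounts["A"]          # Read account A balance
    a -= amount                # Subtract the amount from A balance
    accounts["A"] = a          # Write new balance to account A
    b = accounts["B"]          # Read account B balance
    if crash_before_credit:
        raise Crash()          # ---- CRASH ----
    b += amount                # Add the amount to account B
    accounts["B"] = b          # Write new balance to account B

accounts = {"A": 500, "B": 100}
try:
    transfer_script(accounts, 100, crash_before_credit=True)
except Crash:
    pass
after_crash = dict(accounts)   # A has paid, B has received nothing

transfer_script(accounts, 100) # "just run the procedure again"
after_rerun = dict(accounts)   # A has now paid twice, B has received once

print(after_crash)  # {'A': 400, 'B': 100}
print(after_rerun)  # {'A': 300, 'B': 200}
```

The two accounts started with $600 between them and end with $500, so $100 has simply disappeared.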

8 Transaction script
BEGIN TRANSACTION
Read account A balance
Subtract $100 from A balance
Write new balance to account A
Read account B balance
Add $100 to account B
Write new balance to account B
COMMIT TRANSACTION
If a problem is detected during the transaction, or the system is restarted partway through it, then the transaction must be wound back (rolled back). But once it is committed, it cannot be undone.
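The transaction script can be sketched with Python's built-in sqlite3 module. The account table, the starting balances and the crash_midway switch are assumptions made for the demonstration. An exception is only a stand-in for a real power failure, where the database's own recovery would do the winding back at restart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 500), ("B", 100)])
conn.commit()

def transfer(conn, src, dst, amount, crash_midway=False):
    # "with conn" commits if the block finishes and rolls back if an exception escapes.
    with conn:
        conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src))
        if crash_midway:
            raise RuntimeError("simulated crash between the two writes")
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst))

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

try:
    transfer(conn, "A", "B", 100, crash_midway=True)
except RuntimeError:
    pass
after_crash = balances(conn)   # the debit from A was wound back

transfer(conn, "A", "B", 100)  # now it is safe to run it again
after_retry = balances(conn)

print(after_crash)  # {'A': 500, 'B': 100}
print(after_retry)  # {'A': 400, 'B': 200}
```

Because the failed attempt left no trace, running the transfer again is exactly what we want this time.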

9 Are there other interesting problems like this one? Yes, there are a few interesting problems to be aware of. Knowing about them is one step along the path to becoming an expert. Suppose we have a numbering system for the sales order numbers that we want to generate. The database itself could generate these and give them out. Bob gets a new number, 1026, but he goes to lunch and forgets to complete his order. His terminal logs off. After lunch he restarts and gets 1058. Order 1026 is never used, so there will be a gap in the numbering system.

10 A solution? The auditor complains that numbers are missing with no reason given. So the developers fix the system so that the database only gives out a number when the sale is submitted. The new sale is not given an order number on Bob’s screen; he only gets the number when he has finished the sale.
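One way to picture this fix is sketched below using sqlite3. The sales_order table and the submit_order function are invented names. The draft lives only on Bob's screen as a plain dictionary, and the number is assigned in the same transaction that stores the sale.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sales_order (
        order_no INTEGER PRIMARY KEY,   -- assigned by the database on insert
        customer TEXT NOT NULL,
        total_cents INTEGER NOT NULL
    )
""")

def submit_order(conn, draft):
    """Store a finished sale and return the order number it was given."""
    with conn:  # the number and the sale are committed together, or not at all
        cur = conn.execute(
            "INSERT INTO sales_order (customer, total_cents) VALUES (?, ?)",
            (draft["customer"], draft["total_cents"]),
        )
        return cur.lastrowid

# Bob starts a draft, goes to lunch and never submits it: no number is used up.
abandoned_draft = {"customer": "Acme", "total_cents": 1250}

first = submit_order(conn, {"customer": "Bolt Co", "total_cents": 900})
second = submit_order(conn, {"customer": "Crate Ltd", "total_cents": 4100})
print(first, second)  # 1 2
```

The abandoned draft never touched the database, so the numbers that were given out have no gap between them.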

11 Numbering systems again… Suppose Bob, Alice and the database are all in different locations. What happens when the link goes down? The management point out that it is essential that Bob and Alice continue to make sales even when the central computer cannot be reached. They just don’t want to wait for a number. What would you do now?

12 A solution We can put a computer in each branch, each with its own number range, like this: Bob, located at Newport, can use NP2354, NP2355… Alice, at Southbridge, can use SB4432, SB4433… That way they do not have to rely on the link to a central computer being always up. This is an example of the need to understand the business issues as well as the database technical issues.
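A minimal sketch of branch numbering follows. The BranchNumberer class is a made-up name, and a real branch computer would also need to save its counter to disk so that it survives a restart.

```python
from itertools import count

class BranchNumberer:
    """Hands out order numbers locally; the branch prefix keeps them unique overall."""

    def __init__(self, prefix, start):
        self.prefix = prefix
        self._counter = count(start)

    def next_number(self):
        return f"{self.prefix}{next(self._counter)}"

newport = BranchNumberer("NP", 2354)
southbridge = BranchNumberer("SB", 4432)

bob_orders = [newport.next_number(), newport.next_number()]
alice_orders = [southbridge.next_number(), southbridge.next_number()]
print(bob_orders)    # ['NP2354', 'NP2355']
print(alice_orders)  # ['SB4432', 'SB4433']
```

Neither branch needs to talk to the other, or to head office, to be sure its numbers will never clash.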

13 Ready for another problem? This is a tricky one. Suppose that Bob and Alice have a joint bank account. Let’s imagine it is a row in a database. Alice wants to buy some shoes, but the account has a balance of only $50. It is Alice’s birthday and Bob has promised to deposit $200 in the account to cover the purchase. Alice is drinking coffee ($4) whilst Bob is across town doing his banking.
10:00am Bob’s program reads the row and sees $50.
10:05am Alice’s program reads the row and sees $50.
10:06am Bob’s program changes the balance to $250.
10:07am Alice’s program changes the balance to $46 … what?! It should be $246, but Alice’s program worked from the old $50 and overwrote Bob’s deposit.

14 Meantime… Mary the bank clerk reads the row at 10:00am. She updates the address details on the account. She saves the whole row back again at 10:20am, old balance included. The account now reads $50 … OMG! Mary gets free coffee.
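The whole timeline can be replayed step by step in plain Python. A dictionary stands in for the database row, and the address values are invented. Each program works on its own private copy and then writes back without looking again.

```python
# A dictionary stands in for the account row; the addresses are invented.
row = {"balance": 50, "address": "1 Old Street"}

bob_copy = row["balance"]            # 10:00 Bob's program reads $50
mary_copy = dict(row)                # 10:00 Mary reads the whole row
alice_copy = row["balance"]          # 10:05 Alice's program reads $50

row["balance"] = bob_copy + 200      # 10:06 Bob writes $250
balance_after_bob = row["balance"]

row["balance"] = alice_copy - 4      # 10:07 Alice writes $46: Bob's deposit is lost
balance_after_alice = row["balance"]

mary_copy["address"] = "2 New Road"  # Mary edits only the address...
row.update(mary_copy)                # 10:20 ...but saves the whole stale row
balance_after_mary = row["balance"]

correct_balance = 50 + 200 - 4
print(balance_after_bob, balance_after_alice, balance_after_mary, correct_balance)
# 250 46 50 246
```

Nobody did anything wrong individually. The damage comes from each program trusting a copy that went out of date while it was working.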

15 A solution There are different ways to solve this, but the most common way is called locking. A row can be opened with a LOCK on it. This lock prevents other users or programs from changing the row until the lock is released. They have to get a fresh lock themselves, and read the row again, before they can change it.
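Here is a sketch of the idea using a lock from Python's threading module in place of a database row lock. The Account class is made up for illustration. Many SQL databases provide row locks of their own, for example through SELECT ... FOR UPDATE, but the principle is the same: read and write while holding the lock.

```python
import threading

class Account:
    def __init__(self, balance):
        self.balance = balance
        self.lock = threading.Lock()

    def change(self, delta):
        # Read and write both happen while the lock is held, so no other
        # program can slip in between them with an out-of-date copy.
        with self.lock:
            current = self.balance
            self.balance = current + delta

account = Account(50)
bob = threading.Thread(target=account.change, args=(200,))   # Bob's deposit
alice = threading.Thread(target=account.change, args=(-4,))  # Alice's coffee
bob.start()
alice.start()
bob.join()
alice.join()

print(account.balance)  # 246, whichever of them got the lock first
```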

16 Locking issues Sometimes a solution can lead to more problems! Eventually, though, the pyramid of problems tops out. The next problem is caused by locking. Joe is updating the customer contact details and is also taking an order. Sue is doing the same thing, but she started on the order first, and is now trying to update the customer data. Joe has a lock on customer and is waiting for the order lock to be released. Sue has a lock on order and is waiting for the customer lock to be released. They are both wondering why it is taking such a long time…

17 A solution Obtain all the resources needed for a transaction (including locks) BEFORE beginning the transaction. The transaction will fail at the start if it can’t get the locks it needs, so it can’t get “deadlocked”. Another idea is for everyone to get their locks in the same order, say alphabetical order, so Sue would have to compete with Joe for the first lock. If she won that, then Joe would be out of the race until she had finished and released her locks.
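The alphabetical-order idea can be sketched with two threading locks standing in for the customer and order locks. The do_work function and the completed list are invented for the demonstration. Joe and Sue ask for their locks in different orders, but the locks are always taken in sorted order.

```python
import threading

# Two in-process locks stand in for the database locks on customer and order.
locks = {"customer": threading.Lock(), "order": threading.Lock()}
completed = []

def do_work(user, needed):
    # Whatever order the user asked in, take the locks alphabetically.
    ordered = sorted(needed)
    for name in ordered:
        locks[name].acquire()
    try:
        completed.append(user)  # the real updates would happen here
    finally:
        for name in reversed(ordered):
            locks[name].release()

joe = threading.Thread(target=do_work, args=("Joe", ["customer", "order"]))
sue = threading.Thread(target=do_work, args=("Sue", ["order", "customer"]))
joe.start()
sue.start()
joe.join(timeout=5)
sue.join(timeout=5)

print(sorted(completed))  # ['Joe', 'Sue']
```

Both of them compete for the customer lock first, so whoever loses that race is holding nothing that the winner needs, and the winner can always finish.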

18 Update across multiple computers It gets complicated when you want to update across more than one computer. This is an advanced topic, so I give it only a brief mention here. These days we want the computers to handle that themselves, so that we see only one logical computer or database. We call that “cloud computing”. In relational databases the technical term is “replication”. Multiple copies of the data must appear to be a single copy.

