1. Information Retrieval and Use: De-normalisation and Distributed database systems
Geoff Leese, September 2008, revised October 2009



2. Mapping the logical model onto physical design
- Entities become tables
  - More often than not!
- Attributes become fields (columns)
- Unique identifiers become primary keys
- Relationships implemented by foreign key columns
- Resolve M:N relationships by inserting intersection table
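The mapping rules above can be sketched with Python's built-in sqlite3 module. The student, module and enrolment tables are hypothetical examples, not taken from the lecture; enrolment is the intersection table that resolves the M:N relationship between the other two.

```python
import sqlite3

# In-memory database; all table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript("""
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,   -- unique identifier becomes the primary key
    name       TEXT NOT NULL          -- attribute becomes a column
);
CREATE TABLE module (
    module_id  INTEGER PRIMARY KEY,
    title      TEXT NOT NULL
);
-- Intersection table: one row per (student, module) pair.
CREATE TABLE enrolment (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    module_id  INTEGER NOT NULL REFERENCES module(module_id),
    PRIMARY KEY (student_id, module_id)
);
""")

conn.executemany("INSERT INTO student VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
conn.executemany("INSERT INTO module VALUES (?, ?)", [(10, "Databases"), (20, "Networks")])
conn.executemany("INSERT INTO enrolment VALUES (?, ?)", [(1, 10), (1, 20), (2, 10)])

# Following the relationship in either direction goes through the intersection table.
rows = conn.execute("""
    SELECT s.name, m.title
    FROM student s
    JOIN enrolment e ON e.student_id = s.student_id
    JOIN module m    ON m.module_id  = e.module_id
    ORDER BY s.name, m.title
""").fetchall()

for name, title in rows:
    print(name, title)
```

The composite primary key on enrolment stops the same student being enrolled on the same module twice, and the two foreign keys implement the two 1:N relationships that replace the original M:N one.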

3. Mapping considerations
- Independence
- Privacy
- Efficiency of queries

4. Denormalisation
- Joins take time!
- Split or merge normalised entities based on frequent associated use
  - Remove redundant relationships
  - Merge entities with 1:1 relationships
  - Use summary fields
  - Use summary tables and views

5. Using summary field (1)
- Consider running a query “give the total value of all orders for customer X”
- How many joins?

6. Using summary field (2)
- Note summary field in Orders table
- How many joins now?
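The diagrams for the two summary-field slides are not in the transcript, so the following is only a sketch under an assumed schema: orders, order_line and product tables, with a hypothetical order_total column on orders acting as the summary field. It compares the query for "total value of all orders for customer X" with and without that column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema, not the one from the slides. Prices are whole pence to avoid rounding.
conn.executescript("""
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    unit_price INTEGER NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL,
    order_total INTEGER NOT NULL DEFAULT 0   -- the summary field
);
CREATE TABLE order_line (
    order_id   INTEGER NOT NULL REFERENCES orders(order_id),
    product_id INTEGER NOT NULL REFERENCES product(product_id),
    quantity   INTEGER NOT NULL,
    PRIMARY KEY (order_id, product_id)
);
""")
conn.executemany("INSERT INTO product VALUES (?, ?)", [(1, 250), (2, 1000)])
conn.executemany("INSERT INTO orders (order_id, customer_id) VALUES (?, ?)",
                 [(100, 7), (101, 7), (102, 8)])
conn.executemany("INSERT INTO order_line VALUES (?, ?, ?)",
                 [(100, 1, 2), (100, 2, 1), (101, 2, 3), (102, 1, 1)])

customer_x = 7

# Normalised design: two joins are needed to reach the prices.
normalised_total = conn.execute("""
    SELECT SUM(ol.quantity * p.unit_price)
    FROM orders o
    JOIN order_line ol ON ol.order_id  = o.order_id
    JOIN product p     ON p.product_id = ol.product_id
    WHERE o.customer_id = ?
""", (customer_x,)).fetchone()[0]

# Denormalised design: fill the summary field once ...
conn.execute("""
    UPDATE orders
    SET order_total = (
        SELECT COALESCE(SUM(ol.quantity * p.unit_price), 0)
        FROM order_line ol
        JOIN product p ON p.product_id = ol.product_id
        WHERE ol.order_id = orders.order_id
    )
""")

# ... and the same question needs no joins at all.
summary_total = conn.execute(
    "SELECT SUM(order_total) FROM orders WHERE customer_id = ?", (customer_x,)
).fetchone()[0]

print(normalised_total, summary_total)
```

The price of the faster read is that order_total is redundant data: every change to an order's lines must also update the summary field, or the two answers drift apart.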

7. Distributed database systems
- Special rules apply!

8. The traditional model
- One centralised database
- Terminals at remote locations
- Disadvantages
  - Networks are slow (especially WANs!)
  - Central machine does all processing
  - If central machine fails, database is down
(Integrity, redundancy and disaster recovery considered in later lectures!)

9. The Client/Server model
- Client – application – “front end”
- Server – DBMS – “back end”
- Still dependent on central database

10. Client responsibilities
- Manages user interface
- Accepts user data
- Has local processing capability within the application
- Generates database requests and transmits them via network to server
- Receives results from server and formats them as required by application

11. Server responsibilities
- Accepts database requests from client
- Processes database requests
  - Handles security issues
  - Deals with concurrency issues
  - Optimizes queries
  - Handles recovery/rollback issues
- Returns results to client

12. Distributed database architecture
- A collection of logically related “sites”, connected together so that the user’s view is that of a single database at a single location.
- Each site is a database in its own right
- Not necessarily physically or geographically separated, but often are – and are logically separated.

13. Advantages
- Organisations are distributed, why shouldn’t their data be?
- Improved efficiency
  - Store data close to where it’s used

14. Types of DDS
- Homogeneous – same type of RDBMS at each site (easy!)
- Heterogeneous – different types of DBMS at each site (not so easy!)

15. Implementation methods (1)
- Fragmentation – splitting data between sites
  - Horizontal – row based – e.g. store all employee records for a location at that location
  - Vertical – column based – e.g. store all payroll columns in payroll department, all other employee data in HR
- Either way, fragments must be able to be put back together!
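A minimal sketch of both kinds of fragmentation, using the employee example from the slide. To keep it runnable, every fragment is a table in a single in-memory SQLite database; in a real distributed system each fragment would sit at a different site. The table names, column names and sample rows are assumptions made for illustration.

```python
import sqlite3

# The complete (unfragmented) relation: emp_id, name, location, salary.
employees = [
    (1, "Ann", "London", 30000),
    (2, "Bob", "Leeds", 28000),
    (3, "Cy", "London", 35000),
]

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Horizontal fragments: same columns, rows split by location.
CREATE TABLE emp_london (emp_id INTEGER PRIMARY KEY, name TEXT, location TEXT, salary INTEGER);
CREATE TABLE emp_leeds  (emp_id INTEGER PRIMARY KEY, name TEXT, location TEXT, salary INTEGER);
-- Vertical fragments: columns split, primary key repeated in each fragment.
CREATE TABLE emp_hr      (emp_id INTEGER PRIMARY KEY, name TEXT, location TEXT);
CREATE TABLE emp_payroll (emp_id INTEGER PRIMARY KEY, salary INTEGER);
""")

conn.executemany("INSERT INTO emp_london VALUES (?, ?, ?, ?)",
                 [e for e in employees if e[2] == "London"])
conn.executemany("INSERT INTO emp_leeds VALUES (?, ?, ?, ?)",
                 [e for e in employees if e[2] == "Leeds"])
conn.executemany("INSERT INTO emp_hr VALUES (?, ?, ?)", [e[:3] for e in employees])
conn.executemany("INSERT INTO emp_payroll VALUES (?, ?)", [(e[0], e[3]) for e in employees])

# Horizontal fragments are put back together with a union.
horizontal = conn.execute("""
    SELECT emp_id, name, location, salary FROM emp_london
    UNION ALL
    SELECT emp_id, name, location, salary FROM emp_leeds
    ORDER BY emp_id
""").fetchall()

# Vertical fragments are put back together with a join on the shared key.
vertical = conn.execute("""
    SELECT h.emp_id, h.name, h.location, p.salary
    FROM emp_hr h
    JOIN emp_payroll p ON p.emp_id = h.emp_id
    ORDER BY h.emp_id
""").fetchall()

print(horizontal == employees, vertical == employees)
```

Vertical fragments can only be rejoined because each one carries the primary key; horizontal fragments can only be unioned cleanly because every row belongs to exactly one fragment.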

16. Implementation methods (2)
- Replication
  - Controlled duplication of data at more than one site
- Update propagation?

17. Objectives (1)
- Local autonomy
  - Local data locally owned and managed – minimal data requirements from remote sites.
- No reliance on central site
- Continuous operation
  - Reliability
  - Availability

18. Objectives (2)
- Location independence
  - From user’s view, all data is at their site.
- Fragmentation independence
  - Needs joins and unions to put fragments back together
- Replication independence

19. Objectives (3)
- Distributed query processing
- Distributed transaction management
  - Transactions carried out by “agents” at distributed sites
  - Two-phase commit
  - Locking issues (later lecture)
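Two-phase commit can be sketched in a few lines of plain Python. This is a teaching sketch under simplifying assumptions, not a production protocol: the Participant class and its can_commit flag are hypothetical stand-ins for the agents at each site, and timeouts, logging and coordinator failure are all left out.

```python
class Participant:
    """One agent at one site (hypothetical stand-in for a local DBMS)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # whether this site is able to commit its part
        self.state = "idle"

    def prepare(self):
        # Phase 1: the site votes yes (and promises it can commit) or votes no.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        # Phase 2, all voted yes: make the local changes permanent.
        self.state = "committed"

    def rollback(self):
        # Phase 2, someone voted no: undo the local changes.
        self.state = "aborted"


def two_phase_commit(participants):
    """Coordinator: commit everywhere or nowhere. Returns True on commit."""
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare() for p in participants]

    # Phase 2: a single "no" vote aborts the whole transaction.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


healthy = [Participant("London"), Participant("Leeds")]
print(two_phase_commit(healthy), [p.state for p in healthy])

one_failing = [Participant("London"), Participant("Leeds", can_commit=False)]
print(two_phase_commit(one_failing), [p.state for p in one_failing])
```

The point of the two phases is that no site makes its changes permanent until every site has promised that it can, so the distributed transaction stays atomic.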

20. Objectives (4)
- Hardware independence
- Operating system independence
- Network independence
- DBMS independence

21. DDS issues
- Query processing
  - Optimisation even more important
- Catalogue (data dictionary) management
  - Centralised?
  - Fully replicated?
  - Partitioned?
  - Combination of first and third?

22. DDS issues
- Update propagation
  - An issue where replication is used.
  - “Primary copy” system
- Recovery
  - Two-phase commit
  - Locking strategies
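One common reading of the “primary copy” idea is that each replicated item has one designated primary site, updates are applied there first, and the primary then pushes them out to the other copies. The sketch below assumes that reading; the Site and PrimaryCopy classes are hypothetical names, and the in-memory dictionaries stand in for real databases.

```python
class Site:
    """A site holding one copy of the replicated data."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class PrimaryCopy:
    """Routes every update to the primary, then propagates it to the replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = replicas
        self.pending = []  # updates applied at the primary but not yet propagated

    def write(self, key, value):
        # The update counts as done once the primary copy has it.
        self.primary.data[key] = value
        self.pending.append((key, value))

    def propagate(self):
        # Push queued updates to every replica, in the order they were made.
        for key, value in self.pending:
            for replica in self.replicas:
                replica.data[key] = value
        self.pending.clear()


london = Site("London")  # primary copy
leeds = Site("Leeds")    # replica
scheme = PrimaryCopy(london, [leeds])

scheme.write("stock:widget", 40)
stale_read = leeds.data.get("stock:widget")  # replica has not caught up yet

scheme.propagate()
fresh_read = leeds.data.get("stock:widget")  # replica now matches the primary

print(stale_read, fresh_read)
```

The window between write and propagate is where update propagation becomes an issue: a read at the replica can return out-of-date data until the primary has pushed the change.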

23. Summary
- Mapping the logical model
- Denormalisation
- Traditional database architecture
- Client/server model
- Distributed Database systems
  - Advantages
  - Objectives
  - Implementation methods
  - Issues

24. Further reading
- Rolland chapter 10
- Hoffer chapters 12
- Denormalisation - click to follow the link!

