Presentation is loading. Please wait.

Presentation is loading. Please wait.

Chapter 9: Database Systems

Similar presentations


Presentation on theme: "Chapter 9: Database Systems"— Presentation transcript:

1 Chapter 9: Database Systems
9.1 Database Fundamentals 9.2 The Relational Model 9.3 Object-Oriented Databases 9.4 Maintaining Database Integrity 9.5 Traditional File Structures 9.6 Data Mining 9.7 Social Impact of Database Technology

2 Definition of a Database
A database (DB) is collection of data with internal links that allow users to access the data from a variety of perspectives. Contrast: a flat file, which can only be accessed in one way.

3 Database Database is multidimensional, since internal links between its entries make the information accessible from a variety of perspectives Flat File is the traditional one-dimensional file storage system that presents its information from a single point of view

4 Example of a Flat File Panel 2.4.06 £3400 205 North Street Honeywell
Case 5.5.06 £2000 15 Retail Road Univac 43 Plate 1.7.06 £1000 1 The Lane Sperry 37 Product Due date Price Customer address Customer name Order no.

5 Relational Database Order file Customer file Address Name Customer no.
Product Due date Price Customer no. Order no. 205 North Street Honeywell 54 15 Retail Road Univac 103 1 The Lane Sperry 102 Panel 2.4.06 £3400 54 20 Case 5.5.06 £2000 103 43 Plate 1.7.06 £1000 102 37 Order file Customer file

6 Relational Database linkage Order file Customer file Address Name
Customer no. Product Due date Price Customer no. Order no. 205 North Street Honeywell 54 15 Retail Road Univac 103 1 The Lane Sperry 102 Panel 2.4.06 £3400 54 20 Case 5.5.06 £2000 103 43 Plate 1.7.06 £1000 102 37 Order file Customer file

7 Figure 9.1 A file versus a database organization

8 Databases emerged as a means of integrating the information stored and maintained by a particular organisation Different departments/users can use the same data. Benefits: Reduce (or remove) duplication of data being stored in several places. Ensure all departments use the same versions of the data Easier to make changes to data -- change in one place only, not in several files

9 Advantages of a Single Central Database
Easier to control access to information. Easier to update information. Easier (overall) to maintain and upgrade as technology improves.

10 Potential problems of data integration
Security and Privacy issues Sensitive data (personal data, salary records ...) may be accessed by unauthorised personnel e.g. Need to restrict payroll user's access to exclude other financial data. The ability to control access to information in the database is as important as the ability to share it

11 Schemas To provide different users with different information within a database, database systems often rely on schemas and subschemas A schema is a description of the entire database, i.e. the contents of records and the links between them.

12 Example of a Schema Customer(Customer ID, Name, Address)
Order(Order No. , Customer ID, Invoice No. , Status) Invoice(Invoice No. , Customer ID, Order No. , Status) Product(Product Code, Description) Bold underline: primary key Underline: foreign key

13 Schemas and subschemas
Schema = a description of the structure of an entire database, used by database software to maintain the database Subschema = a description of only that portion of the database pertinent to a particular user’s needs, used to prevent sensitive data from being accessed by unauthorised personnel

14 Database management systems (DBMS)
Database Management System = a software layer that maintains a database and manipulates it in response to requests from applications Distributed Database = a database stored on multiple machines DBMS will mask this organisational detail from its users Data independence = the ability to change the organisation of a database without changing the application software which uses it

15 Figure 9.2 The conceptual layers of a database implementation
Two software layers: application software (user interface) DBMS (database manipulation)

16 Advantages of the Conceptual Layers
Application software is simplified as it is insulated from database management details. Easier to control user access through a single DBMS. Data independence: database organisation can be changed without changing the application software. to make a change in the DB for one user, need to change only the overall schema and the subschemas of users involved in the change.

17 Database models Database model = conceptual view of a database
Relational database model Object-oriented database model Abstraction actual storage details are hidden from the user

18 9.2 Relational database model
Name of relation Customers Address Name Customer no. Table heading 205 North Street Honeywell 54 15 Retail Road Univac 103 1 The Lane Sperry 102 Table All data is stored in rectangular tables called relations. Each row in a relation is a tuple cardinality = no.of tuples Each column is an attribute degree = no. of attributes

19 Advantages of this Relational Model
Information is preserved. e.g. the name and address of a customer are recorded even if the customer has no orders. Information is not duplicated: e.g. the name and address of a customer with two orders are not entered twice.

20 Figure 9.3 A relation containing employee information

21 A relation does not specify the order of the attributes or the order of the tuples.
Customers Customers 205 North Street Honeywell 54 15 Retail Road Univac 103 1 The Lane Sperry 102 Univac 15 Retail Road 103 Honeywell 205 North Street 54 Sperry 1 The Lane 102 is the same as

22 Repeated Tuples A relation cannot have repeated tuples
205 North Street Honeywell 54 15 Retail Road Univac 103 1 The Lane Sperry 102 Not a relation in a relational database.

23 Attributes in Different Relations
An attribute of a given relation is not an attribute of any other relation. E.g. “Orders.customer no” is a different attribute from “Customers.customer no”.

24 Keys Primary key: an attribute whose value uniquely identifies a tuple. Composite key: a minimal set of attributes whose values together uniquely identify a tuple Foreign key: set of attributes pointing to a primary key or a composite key in another table.

25 Example of a Composite Key
Post Code Town Street House no. Composite key: House no., Post Code. CH3 8PN Chester High St. 23 DB21 0TQ Bury Acacia Ave. 1

26 Example of a Foreign Key
Product Due date Price Customer no. Order no. Address Name Customer no. 205 North Street Honeywell 54 15 Retail Road Univac 103 1 The Lane Sperry 102 Panel 2.4.06 £3400 54 20 Case 5.5.06 £2000 103 43 Plate 1.7.06 £1000 102 37 Order file Customer file

27 Evaluating a relational design
Avoid multiple concepts within one relation Can lead to redundant data Deleting a tuple could also delete necessary but unrelated information

28 Table Structure Each table should correspond to a single concept or task. Each row of a table should be uniquely identified by a key. Avoid multiple copies of information.

29 Improving a relational design
Decomposition = dividing the columns of a relation into two or more relations, duplicating those columns necessary to maintain relationships Lossless or nonloss decomposition = a “correct” decomposition that does not lose any information Aim: to produce a better table structure

30 Example of a Lossless Decomposition
Address Name Product Due date Price Customer no. Order no. 205 North Street Honeywell Panel 2.4.06 £3400 54 20 25 Retail Road Univac Case 5.5.06 £2000 103 43 1 The Lane Sperry Plate 1.7.06 £1000 102 37 Original table

31 Lossless Decomposition
Address Name Customer no. Product Due date Price Customer no. Order no. 205 North Street Honeywell 54 15 Retail Road Univac 103 1 The Lane Sperry 102 Panel 2.4.06 £3400 54 20 Case 5.5.06 £2000 103 43 Plate 1.7.06 £1000 102 37 Order file Customer file

32 Example of a Lossy Decomposition
Dept Job Title Empl Id Planning Manager 18 Finance 17 Assistant 203

33 Lossy Decomposition Dept Job Title Job Title Empl Id Planning Manager Finance Assistant Manager 18 17 Assistant 203 After decomposition we can no longer tell in which department employees 17 and 18 worked

34 Figure 9.7 A relation and a proposed decomposition

35 Figure 9.4 A relation containing redundancy
Problem: more than one concept has been combined in a single relation. Redesign the system with 3 relations...

36 Figure 9.5 An employee database consisting of three relations

37 Figure 9.6 Finding the departments in which employee 23Y34 has worked

38 Rules for database normalisation

39 Relational operations
Extracting information from a database Can the answers to database queries be found in a systematic way? In the relational model the answers are always relations.

40 Example Database queries
Get List of Order no and Product. List of tuples of orders for Plate. List of Order no and Name.

41 Contents of a Relational Database
Customers Orders Address Name Customer no Product Due date Price Customer no Order no 205 North Street Honeywell 54 15 Retail Road Univac 103 1 The Lane Sperry 102 Panel 2.4.06 £3400 54 20 Case 5.5.06 £2000 103 43 Plate 1.7.06 £1000 102 37 All data in a relational database are stored in tables

42 Query 1: List of Order Nos. and Products
Project: remove the attributes Customer no., Price and Due date to leave... Orders Product Due date Price Customer no Order no Panel 2.4.06 £3400 54 20 Case 5.5.06 £2000 103 43 Plate 1.7.06 £1000 102 37 Product Order no Ans1: Panel 20 Case 43 Plate 37

43 Query 2: List of Tuples of Orders for Plate
Select: take all tuples for which the product is Plate, to give... Product Due date Price Customer no Order no Panel 2.4.06 £3400 54 20 Case 5.5.06 £2000 103 43 Plate 1.7.06 £1000 102 37 Ans2: Product Due date Price Customer no Order no Plate 1.7.06 £1000 102 37

44 Query 3: List of Order No. and Name
Customers Orders Address Name Customer no Product Due date Price Customer no Order no 205 North Street Honeywell 54 15 Retail Road Univac 103 1 The Lane Sperry 102 Panel 2.4.06 £3400 54 20 Case 5.5.06 £2000 103 43 Plate 1.7.06 £1000 102 37 Join: combine tuples in Orders and Customers, then use Project to retain the attributes Order no. and Name.

45 Query 3: List of Order No. and Name (cont'd)
Join and then Project Temp <- Join Orders and Customers where O.Cno=C.Cno C.add C.Nm C.Cno O.Pr O.Due O.pr O.Cno O.no 205 North Street Honeywell 54 Panel 2.4.06 £3400 20 15 Retail Road Univac 103 Case 5.5.06 £2000 43 1 The Lane Sperry 102 Plate 1.7.06 £1000 37 Ans3 C.Nm O.no Honeywell 20 Univac 43 Sperry 37 Names of attributes are abbreviated

46 Project The original relation is R1, the new relation created by Project is R2 and the attributes projected to R2 are att1,…, attn. R2 <- Project att1,…attn from R1 Example: Ans1 <- Project Orders.Order no, Orders.Product from Orders

47 Figure 9.9 The PROJECT operation

48 Select The original relation is R1, the new relation created by Select is R2 and the conditions on the attribute values in R2 are c1,…cn. R2 <- Select from R1 where c1,…, cn. Example: Ans2 <- Select from Orders where Orders.Product=‘Plate’

49 Figure 9.8 The SELECT operation

50 Join The original relations are R1, R2, the new relation created by Join is R3 and the conditions on the attribute values in R3 are c1,…cn. R3 <- Join R1 and R2 where c1,…, cn Example: Temp <- Join Orders, Customers where O.Cno=C.Cno

51 Figure 9.10 The JOIN operation

52 Figure 9.11 Another example of the JOIN operation

53 Another Example for Join
R3  Join Orders and Customers where Orders.Price>£1500 and O.Cno=C.Cno R3 C.add C.Nm C.Cno O.Pr O.Due O.pr O.Cno O.no 205 North Street Honeywell 54 Panel 2.4.06 £3400 20 15 Retail Road Univac 103 Case 5.5.06 £2000 43

54 Join and Keys A B C  Join A and B where A.foreign key=B.key C data
22 DB1 16 D2 22 28 D1 16 34 C  Join A and B where A.foreign key=B.key B.data B.key A.data A.Foreign key A.key C DB2 22 D2 28 DB1 16 D1 34

55 Structured Query Language (SQL)
operations to manipulate tuples insert update delete select

56 Structured Query Language
SQL is used to retrieve data and manipulate data. SQL is declarative. It describes the required information. It does not specify how the information is obtained. Adopted by ANSI (1986) and ISO (1987). First commercial release: Oracle Corporation (1979), then IBM (1979)

57 SQL Data Retrieval 1 List of Order no and Product.
select Order no, Product from Orders SQL command The attributes Order no, Product are projected from Orders Result

58 SQL Data Retrieval 2 List of tuples of orders for Plate. select *
from Orders where Orders.Product=‘Plate’ SQL command * Means all attributes. The resulting relation contains all tuples in Orders for which Orders.Product=‘Plate’. Result

59 SQL Data Retrieval 3 List of Order no and Name.
select Orders.Order no, Customers.name from Orders, Customers where Orders.Customer no = Customers.Customer no SQL command Join Orders and Customers. Project the the attributes Orders.Order no, Customers.name Result

60 SQL Command for Data Retrieval
select <list of attributes> from <list of relations> where <list of conditions> select * from Orders, Customers where Orders.Price > 1500 and Orders.Customer no=Customers.Customer no Example

61 SQL Commands for Data Manipulation
insert into Orders values (56, 54, ‘£1000’, ‘1.8.06’, ‘Plate’) delete from Orders where Orders.Customer no = 54 update Customers set Customers.Address = ‘16 High Street’ where Customers.Customer no = 102

62 Advantages of SQL High level language with short but powerful commands. Simple, natural syntax. Details of information retrieval left to the DBMS, allowing quick applications development. Multipurpose: querying, modifying, controlling access to data, defining the database

63 Problems with SQL Large, complicated standard.
Standard does not always fully specify the effects of a command. There are many commercial extensions of and variations on the standard, thus SQL code is usually not easily portable.

64 SQL and Relational Databases
SQL allows many deviations from the relational model, including: Duplicate tuples Unnamed attributes Two or more attributes with the same name Specification of the order of attributes

65 9.3 Object-oriented databases
Object-oriented database = a database constructed by applying the object-oriented paradigm Each data entity stored as a persistent object Relationships indicated by links between objects DBMS maintains inter-object links

66 Figure 9.13 The associations between objects in an object-oriented database

67 Advantages of object-oriented databases
Matches design paradigm of object-oriented applications Intelligence can be built into attribute handlers Example: names of people Can handle exotic data types Example: multimedia Can store intelligent entities

68 9.4 Maintaining database integrity
Transaction = a sequence of operations that must all happen together Example: transferring money between bank accounts Transaction log = non-volatile record of each transaction’s activities, built before the transaction is allowed to happen Commit point = point at which transaction has been recorded in log Roll-back = procedure to undo a failed, partially completed transaction Problem of cascading rollback.

69 Maintaining database integrity (continued)
Simultaneous access problems Incorrect summary problem Lost update problem Locking = preventing others from accessing data being used by a transaction Shared lock: used when reading data Exclusive lock: used when altering data

70 9.5 Traditional File Structures
Sequential Files Indexed Files Hash Files

71 Sequential files Sequential file = file whose contents can only be read in order Reader must be able to detect end-of-file (EOF) Data can be stored in logical records, sorted by a key field Greatly increases the speed of batch updates

72 Figure 9.14 A procedure for merging two sequential files

73 Figure Applying the merge algorithm (Letters are used to represent entire records.  The particular letter indicates the value of the record’s key field.)

74 Figure 9.16 The structure of a simple employee file implemented as a text file

75 Indexed files Index = list of (key, location) pairs
Sorted by key values location = where the record is stored

76 Figure 9.17 Opening an indexed file

77 Hashing Each record has a key The master file is divided into buckets
A hash function computes a bucket number for each key value Each record is stored in the bucket corresponding to the hash of its key

78 Figure 9.18 Hashing the key field value 25X3Z to one of 41 buckets

79 Figure 9.19 The rudiments of a hashing system

80 Collisions in Hashing Collision = when two keys hash to the same bucket Major problem when table is over 75% full Solution: increase number of buckets and rehash all data

81 9.6 Data mining Data mining = a set of techniques for discovering patterns in collections of data Relies heavily on statistical analyses Data warehouse = static data collection to be mined Data cube = data presented from many perspectives to enable mining Raises significant ethical issues when it involves personal information

82 Data mining strategies
Class description Class discrimination Cluster analysis Association analysis Outlier analysis Sequential pattern analysis

83 Data mining strategies
Class description identifying properties that characterise a given group of data items what characterises people who buy small economical cars, or expensive mobile phones Class discrimination identifying properties that divide two groups what distinguishes customers who buy used cars from those who buy new ones; or what distinguishes a reader of Vatan from a reader of Cumhuriyet

84 Data mining strategies
Cluster analysis trying to discover classes find properties of data items that identify groupings Association analysis looking for links between data groups do people who buy beer also buy potato chips?

85 Data mining strategies
Outlier analysis identify data entries that are out of the ordinary e.g. identifying use of stolen credit cards from altered spending patterns Sequential pattern analysis try to identify patterns of behaviour over time e.g. economic trends, climate conditions...

86 Social impact of database technology
Problems Massive amounts of personal data are being collected Often without knowledge or meaningful consent of affected people Data merging produces new, more invasive information Errors are widely disseminated and hard to correct Remedies Existing legal remedies largely ineffective Negative publicity may be more effective


Download ppt "Chapter 9: Database Systems"

Similar presentations


Ads by Google