Presentation is loading. Please wait.

Presentation is loading. Please wait.

Databases and DBMS Paul Wong bld39 rm6 Tues 12:30-2:30 Thurs 8:30-1030.

Similar presentations


Presentation on theme: "Databases and DBMS Paul Wong bld39 rm6 Tues 12:30-2:30 Thurs 8:30-1030."— Presentation transcript:

1 Databases and DBMS Paul Wong bld39 rm6 Tues 12:30-2:30 Thurs 8:30-1030

2 A Brief Recap Data: individual facts Data: individual facts Information: organisation of one or more pieces of data to answer a complex question Information: organisation of one or more pieces of data to answer a complex question Knowledge: inferred and applied information to establish useful patterns Knowledge: inferred and applied information to establish useful patterns

3 Database System Databases store data that has a common structure Databases store data that has a common structure For example For example  the customers in a bank,  the car parts in an inventory system,  the books in a library  the students enrolled in a University For every customer we need more or less the same data: Name, address, phone, account number etc. For every customer we need more or less the same data: Name, address, phone, account number etc.

4 Database Concepts Each of the things we want to know about a customer ( car part, book or student) is known as an attribute and is stored in a separate “field” Each of the things we want to know about a customer ( car part, book or student) is known as an attribute and is stored in a separate “field”  Name and phone numbers are fields  Name could be divided into three fields  Title, Family Name and Given Name All of the fields that describe a customer are arranged to form a “record” All of the fields that describe a customer are arranged to form a “record”  G. Smith, 3 Square St, Perth, 5554321, C50716 A set of records is known as a “file”. A “database” may be a single file but usually it is several files A set of records is known as a “file”. A “database” may be a single file but usually it is several files

5 Databases, flat files and tables When the data in an individual file is displayed it looks like this: When the data in an individual file is displayed it looks like this: Such files only have two dimensions and are called flat files or tables. Such files only have two dimensions and are called flat files or tables. Some applications only need a single flat file Some applications only need a single flat file

6 Field attributes Each field in the record must have: Each field in the record must have: a unique name (not a unique identifier, see later) a unique name (not a unique identifier, see later)  Part_name, Customer_name, PhoneWork, PhoneHome a field type - number, character, date, logical etc a field type - number, character, date, logical etc  numbers may be defined as integers or decimals for most business databases, but floating-point may be required for scientific data  logical is just Yes or No (or True and False)  useful for indicating status: Payment received - No a field width - the number of spaces required a field width - the number of spaces required Each record usually has a unique number (identifier) Each record usually has a unique number (identifier)

7 Transaction systems If you were building a system that stored data about which parts a customer had purchased, the system would be a Transaction Processing System (TPS) If you were building a system that stored data about which parts a customer had purchased, the system would be a Transaction Processing System (TPS) You could do this with a single file but it would be VERY redundant. This is called data redundancy problem. Redundancies can be wasteful and degrade performance of the system. BUT, it may lead to more serious problems. You could do this with a single file but it would be VERY redundant. This is called data redundancy problem. Redundancies can be wasteful and degrade performance of the system. BUT, it may lead to more serious problems.

8 Every time we make a sale, we have to enter all the customer’s details and all the details for every part. If a customer changes his address, it is now wrong on every previous sale - which address do we use? Hence redundanciesmay lead to data inconsistencies. Every time we make a sale, we have to enter all the customer’s details and all the details for every part. If a customer changes his address, it is now wrong on every previous sale - which address do we use? Hence redundancies may lead to data inconsistencies.

9 Database organisation We can solve all these problems we need to rethink about how we structure our data to avoid these problems. We can solve all these problems we need to rethink about how we structure our data to avoid these problems. One possibility is to use multiple files, one for each entity that we are interested in. One possibility is to use multiple files, one for each entity that we are interested in. In our previous example, we need a customer file, a parts file and a sales file. In our previous example, we need a customer file, a parts file and a sales file. These files form a database that can be These files form a database that can be  relational - based on tables structure  hierarchical - based on trees structure  network - based on graphs structure Relational are the most common so we’ll stick to those but the other two are also useful. The underlying structures are all based on precisely defined concepts. So they have special meaning. Relational are the most common so we’ll stick to those but the other two are also useful. The underlying structures are all based on precisely defined concepts. So they have special meaning.

10 Relational database terminology Relational databases use slightly different terms Relational databases use slightly different terms The “files” in a relational database are known as tables and each table describes a collection of real world entity, e.g. the collection of customers, the collection of parts etc. The “files” in a relational database are known as tables and each table describes a collection of real world entity, e.g. the collection of customers, the collection of parts etc. Each table consists of a set of records or tuples Each table consists of a set of records or tuples Each tuple is made up of fields or attributes Each tuple is made up of fields or attributes The name relational refers to the fact that special relationships are defined between the tables or entities in the database, e.g. sales file. These relationships are known as The name relational refers to the fact that special relationships are defined between the tables or entities in the database, e.g. sales file. These relationships are known as “one to one”, “one to many” and “many to many” “one to one”, “one to many” and “many to many”

11 How Does it Work? G. Smith, 1 Bruce Rd nut, 35mm nail, 15mm customers parts sales

12 File organisation Unlike text files, DB files have their own internal structure - the record. Records in a DB file are often stored in sequential order e.g. part number order. This is useful for reporting on all parts etc. Unlike text files, DB files have their own internal structure - the record. Records in a DB file are often stored in sequential order e.g. part number order. This is useful for reporting on all parts etc. When searching a DB file, we often want a single record or group of records with a common property. In this case, it is better to store the file for direct access - the address of every record is known in advance. If we know we want part #512, we can go directly to the record for that part. When searching a DB file, we often want a single record or group of records with a common property. In this case, it is better to store the file for direct access - the address of every record is known in advance. If we know we want part #512, we can go directly to the record for that part. Direct access files use relative or absolute addresses Direct access files use relative or absolute addresses

13 Indexing Obviously, we cannot actually put a file in order of two different fields e.g. part_num & part_name Obviously, we cannot actually put a file in order of two different fields e.g. part_num & part_name If we need to search on both these fields we can build an index file for each one. If we need to search on both these fields we can build an index file for each one. An index file could have all part_nums in order and beside each, the corresponding record number An index file could have all part_nums in order and beside each, the corresponding record number An index on part name would have every part name in alphabetical order with a corresponding record number An index on part name would have every part name in alphabetical order with a corresponding record number

14 Deletion and indexing When we delete a record in a database, we only mark it as deleted - it still exists and it still has a valid record number. When we delete a record in a database, we only mark it as deleted - it still exists and it still has a valid record number. When we purge deleted records, the record that used to be number 23, could become 21. When we purge deleted records, the record that used to be number 23, could become 21. We should regenerate the index files immediately after any purge procedure We should regenerate the index files immediately after any purge procedure

15 DBMS Organisations often have a number of applications that use overlapping data Organisations often have a number of applications that use overlapping data Our customer file may be used by both the sales department and by the marketing department. Our customer file may be used by both the sales department and by the marketing department. When the Sales TPS was written, it may have used the field name: Customer_ID When the Sales TPS was written, it may have used the field name: Customer_ID While the Marketing application uses: CustId While the Marketing application uses: CustId The names must coincide exactly with the field names in the database. But this makes application development more difficult. The names must coincide exactly with the field names in the database. But this makes application development more difficult.

16 Database Management Systems - DBMS Make it possible for different applications to use different names for the same fields and entities Make it possible for different applications to use different names for the same fields and entities Alternative names are called aliases and are stored in a data dictionary. Alternative names are called aliases and are stored in a data dictionary. The data dictionary also contains all the definitions of tables and fields, these are called metadata – data about the structure of data The data dictionary also contains all the definitions of tables and fields, these are called metadata – data about the structure of data Application developers do not need to inspect each table or file to find out the type or length of a field, they simply use the data dictionary Application developers do not need to inspect each table or file to find out the type or length of a field, they simply use the data dictionary Data dictionaries also contain access rights Data dictionaries also contain access rights

17 Goals of DBMS Data efficiency Data efficiency  reduces redundancy of data compared to paper ??? Access flexibility Access flexibility  provides simultaneous access to many users Data integrity Data integrity  the quality of the data is more reliable, up to date Data independence Data independence  developers interact with the DBMS, not the data Data security Data security  granting of authorization to modify data, access etc.

18 Database Administrator (DBA) Documents all databases and database applications (as they affect the DBMS) Documents all databases and database applications (as they affect the DBMS) Set up new databases and integrate new database applications Set up new databases and integrate new database applications Modifying database structures Modifying database structures Notifying staff of database alterations Notifying staff of database alterations Monitoring and tuning the databases Monitoring and tuning the databases Carries out routine functions like purging and re- indexing Carries out routine functions like purging and re- indexing Ensures correct access for authorised users Ensures correct access for authorised users Selects DBMS and develops policies Selects DBMS and develops policies

19 User and developer views Many DB packages (engines), particularly for the PC provide the developer with the ability to view all the records or step through the records one at a time. The records can be edited interactively.This is handy BUT dangerous Many DB packages (engines), particularly for the PC provide the developer with the ability to view all the records or step through the records one at a time. The records can be edited interactively.This is handy BUT dangerous  e.g an adventurous user could delete all index files or delete the primary key from a table, because he or she “never uses them” Users do NOT usually access the database engine itself. Instead, a DB application provides each user with the required data and functions Users do NOT usually access the database engine itself. Instead, a DB application provides each user with the required data and functions

20 DB procedural languages & SQL Many DB packages (DB4, Oracle etc) provide their own procedural languages to access the DB Many DB packages (DB4, Oracle etc) provide their own procedural languages to access the DB Many different languages made it difficult. IBM developed the Structured Query Language which is now widely used by DB systems Many different languages made it difficult. IBM developed the Structured Query Language which is now widely used by DB systems SELECT ITEM, PRICE FROM SALES_FILE WHERE NUMBER.3 ORDER BY ITEM This produces a list of the items and prices for all sales where more than 3 items were purchased. The list is in alphabetical order on item name This produces a list of the items and prices for all sales where more than 3 items were purchased. The list is in alphabetical order on item name

21 Uses of SQL If a user typed in the code we just saw, we would say it was interactive SQL use If a user typed in the code we just saw, we would say it was interactive SQL use SQL commands can also be embedded in other procedural languages, like Cobol - embedded SQL SQL commands can also be embedded in other procedural languages, like Cobol - embedded SQL Although intended to be simple, SQL was still too hard for most managers to use Although intended to be simple, SQL was still too hard for most managers to use Complex queries need to “join” relational tables to get the required result. This is complex and slow Complex queries need to “join” relational tables to get the required result. This is complex and slow

22 Access control and Data sharing DBs, generally, and SQL, specifically, provide mechanisms to control who gets access to data. DBs, generally, and SQL, specifically, provide mechanisms to control who gets access to data. Remember that DBs allow multiple users to share the data simultaneously. This has some problems. Remember that DBs allow multiple users to share the data simultaneously. This has some problems. If one user accesses a client RECORD to change a the phone number while another user is changing the address, they BOTH have copies of the record open. The user who saves the record first will lose the changes made when the second user saves his or her record. If one user accesses a client RECORD to change a the phone number while another user is changing the address, they BOTH have copies of the record open. The user who saves the record first will lose the changes made when the second user saves his or her record. Smith, 5554321, 3 Big St Perth Smith, 5554321, 3 Big St Perth

23 Concurrency When two or more users access a record simultaneously it is called concurrency. When two or more users access a record simultaneously it is called concurrency. DBMS provide file locking and record locking to overcome concurrency problems. DBMS provide file locking and record locking to overcome concurrency problems. Several users can view the record without problems Several users can view the record without problems The first person to use a system function to EDIT or DELETE a record also places a lock on that record. No other user can then start to edit or delete that same record until the first user has finished The first person to use a system function to EDIT or DELETE a record also places a lock on that record. No other user can then start to edit or delete that same record until the first user has finished This causes delays and doesn’t solve all problems This causes delays and doesn’t solve all problems

24 Object oriented DBs Object Oriented (OO) DBs store special data called objects. An object contains not only the data but the procedures associated with that data. Object Oriented (OO) DBs store special data called objects. An object contains not only the data but the procedures associated with that data. They are more flexible than traditional databases and can handle more complex queries than SQL They are more flexible than traditional databases and can handle more complex queries than SQL

25 Graphics/video - MDDB Most early databases were text based - all of the entities described in all the fields were either characters or numbers Most early databases were text based - all of the entities described in all the fields were either characters or numbers More modern databases often include graphics - e.g. mug shots, house photos etc. They can also include video clips and sound files More modern databases often include graphics - e.g. mug shots, house photos etc. They can also include video clips and sound files One recent development is the MultiDimensional database (MDDB) which allows users to rapidly analyse statistics about a company’s performance One recent development is the MultiDimensional database (MDDB) which allows users to rapidly analyse statistics about a company’s performance e.g. no. of fridges sold in each state last month e.g. no. of fridges sold in each state last month

26 Client Server Many of the DBs we have seen use a powerful search engine to access data. This engine may be located on a remote computer whose task is to serve up the requested data - a DB server Many of the DBs we have seen use a powerful search engine to access data. This engine may be located on a remote computer whose task is to serve up the requested data - a DB server As PCs have become more powerful, they have taken on some of the processing task from DB servers. They do this by running their own “client” software which integrates with the DB server As PCs have become more powerful, they have taken on some of the processing task from DB servers. They do this by running their own “client” software which integrates with the DB server The client asks for a set of records and then it orders them and filters out unwanted data The client asks for a set of records and then it orders them and filters out unwanted data The DB engine/server is often called the “back end” and the client software is called the “front end” The DB engine/server is often called the “back end” and the client software is called the “front end”

27 Data Warehouses (DW) & the Web Many organisations have a lot of DBs and DB applications. Frequently, the data in these DBs has grown in an ad hoc fashion and cannot be easily integrated. To overcome this, may organisations are now developing data warehouses, where snapshots of the active DBs can be stored Many organisations have a lot of DBs and DB applications. Frequently, the data in these DBs has grown in an ad hoc fashion and cannot be easily integrated. To overcome this, may organisations are now developing data warehouses, where snapshots of the active DBs can be stored Many organisations prefer to provide access to DWs through browser based applications. This allows people all over the organisation to access the data through a common interface Many organisations prefer to provide access to DWs through browser based applications. This allows people all over the organisation to access the data through a common interface


Download ppt "Databases and DBMS Paul Wong bld39 rm6 Tues 12:30-2:30 Thurs 8:30-1030."

Similar presentations


Ads by Google