Relational Databases
Prepared for SYS364 Systems Design by Joyce Walton
Relational Databases
- In the beginning: Hierarchical and Network DBMS were awkward to design, and hopelessly unreliable
- e.g., IBM’s DBMS called IMS (Information Management System) was in use at U of T for student records … two identical inquiries would get the same answer only 14% of the time!
DB: Dinosaur Base
- File Processing
  - Program centric: the code knows how to make information from data
  - Collections of data organized sequentially, direct (random) with hashing, and by index
- Hierarchical
  - Structure centric: programs must know how to navigate data
  - Tree structure, parent-child relationships
- Network
  - Files with records and fields contain data
  - Sets of records provide one-to-many relationships (similar to an RDBMS Join)
  - High performance, but changes mean recompiling
Relational Databases, continued
- Meanwhile, at U of T and other grad schools of Computer Science around the world, people were using a new way of talking about data: the Relational Calculus
- Great for discussion, but no one thought that a commercially viable DBMS could be built to use the Relational Calculus!
- Then a student did it, not for a big mainframe but for a PC! (dBASE II)
The Relational Calculus
- A scary name for some very reasonable stuff!
- But unfortunately, it has a lot of fancy words we need to memorize!
- A relation is a named table of data, arranged in rows (called tuples) and columns (called attributes)
The Relational Calculus, continued
- It may help to think of file equivalents to learn this vocabulary:
  - think of a Relation as being a file
  - and a Tuple as being a record
  - and an Attribute as being a field
- Theoretically, there are no duplicate tuples in a Relation. However, most real-life applications use them, and most Relational DBMS allow them!
The Relational Calculus, continued
- Each Attribute (column, like field) has a “Domain”, the range or list of values it may contain. This is quite different from the “Data Type” idea we use with files, because it is more restrictive.
- Now let’s talk about some of the operations we can perform on Relations...
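To make the Domain idea concrete, here is a minimal sketch of the difference between a data type and a domain. The grade attribute and its list of allowed values are assumptions made up for this illustration; they do not come from the slides.

```python
# Hypothetical attribute: a letter grade stored as a string.
# The allowed values below are an assumption for illustration only.
GRADE_DOMAIN = {"A", "B", "C", "D", "F"}


def in_data_type(value):
    """Data-type check: any string at all passes."""
    return isinstance(value, str)


def in_domain(value):
    """Domain check: only values from the listed domain pass."""
    return value in GRADE_DOMAIN


candidates = ["A", "F", "Z", "hello"]

# Every candidate is a string, so the data type accepts all of them.
type_ok = [v for v in candidates if in_data_type(v)]

# The domain is more restrictive: only real grades survive.
domain_ok = [v for v in candidates if in_domain(v)]
```

The data-type check lets all four strings through, while the domain check keeps only the two that are legal grades.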
The Relational Calculus, continued
- The Selection operation consists of using one Relation to create another which has only some of the Tuples of the original
- These Tuples in the new Relation are “selected” from the original on the basis of some criterion, really just a “condition” like those we use in the Control Constructs in programming languages, based on the “Relational Operators”: == != < <= > >=
- And just like conditions, criteria can be combined into more complex criteria using the “Logical Operators”: AND, OR, NOT
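Selection can be sketched in a few lines of Python. The STUDENTS rows and the helper name `select` are invented for this example; a Relation is modelled, as an assumption, as a list of dictionaries.

```python
# Invented relation: each dictionary is one Tuple, each key one Attribute.
students = [
    {"name": "Ann", "program": "CPA", "mark": 82},
    {"name": "Bob", "program": "CPD", "mark": 67},
    {"name": "Cy", "program": "CPA", "mark": 74},
]


def select(relation, criterion):
    """Return a new relation holding only the tuples that meet the criterion."""
    return [t for t in relation if criterion(t)]


# A criterion built from a Relational Operator (>=) ...
honours = select(students, lambda t: t["mark"] >= 80)

# ... and one combining criteria with Logical Operators (and, not).
cpa_not_honours = select(
    students,
    lambda t: t["program"] == "CPA" and not t["mark"] >= 80,
)
```

Both results keep every Attribute of the original; only the number of Tuples shrinks.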
The Relational Calculus, continued
- Projection is another operation we can perform on one Relation to create a second
- The new “projected” relation will contain only some of the attributes (columns) of the original, and these are selected by name
- I usually declare a student contest to come up with a “Memory Crutch” to help you remember that you Select Rows, but Project Columns
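A matching sketch for Projection, again with invented STUDENTS rows and a hypothetical helper called `project`. This version also drops duplicate tuples from the result, following the theoretical rule; a real DBMS may keep them.

```python
# Invented relation: each dictionary is one Tuple.
students = [
    {"name": "Ann", "program": "CPA", "mark": 82},
    {"name": "Bob", "program": "CPD", "mark": 67},
    {"name": "Cy", "program": "CPA", "mark": 74},
]


def project(relation, attributes):
    """Keep only the named attributes, dropping duplicate tuples."""
    seen = set()
    result = []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in seen:  # theoretical relations hold no duplicates
            seen.add(row)
            result.append(dict(zip(attributes, row)))
    return result


programs = project(students, ["program"])
names_and_marks = project(students, ["name", "mark"])
```

Projecting on `program` alone leaves two tuples, because two of the three students share a program.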
The Relational Calculus, continued
- In addition to the operations you can perform on a single Relation, there are several you can do to two Relations to create a third.
- The simplest of these is the Union, and it defines a new term: Union-Compatible or UC
- The Union of two Relations which have the same attributes with the same domains (UC) is formed by appending all the tuples of the second after all the tuples of the first (in theory, any duplicate tuples are then dropped)
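Here is a rough sketch of a Union with a Union-Compatible check. The two relations and the helper `union` are made up for illustration. The check compares attribute names only, which is a simplification: a full check would compare domains too. Because the tuples are held in a Python set, a tuple present in both relations appears once in the result.

```python
# Invented relations: a tuple of attribute names plus a set of tuples.
day_students = {
    "attributes": ("name", "program"),
    "tuples": {("Ann", "CPA"), ("Bob", "CPD")},
}
night_students = {
    "attributes": ("name", "program"),
    "tuples": {("Bob", "CPD"), ("Cy", "CPA")},
}
courses = {
    "attributes": ("code", "title", "hours"),
    "tuples": {("SYS364", "Systems Design", 4)},
}


def union(r, s):
    """Union of two UC relations (attribute names only are compared here)."""
    if r["attributes"] != s["attributes"]:
        raise ValueError("relations are not Union-Compatible")
    return {"attributes": r["attributes"], "tuples": r["tuples"] | s["tuples"]}


all_students = union(day_students, night_students)

# Trying to unite relations that are not UC is refused.
try:
    union(day_students, courses)
    uc_error = False
except ValueError:
    uc_error = True
```

Bob is enrolled both day and night, so the Union holds three tuples rather than four.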
The Relational Calculus, continued
- The Intersection of two UC Relations is a third which contains only those tuples in the second which are identical to some in the first. (Compare the Intersection to the AND operation on Venn diagrams!)
- The Difference of two UC Relations is a third which contains only the tuples which are present in the first but not in the second. (Just like subtraction!)
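Intersection and Difference map directly onto Python's set operators. The two small UC relations below are invented, and each is modelled as a set of tuples.

```python
# Two invented Union-Compatible relations with attributes (name, program).
first = {("Ann", "CPA"), ("Bob", "CPD"), ("Cy", "CPA")}
second = {("Bob", "CPD"), ("Dee", "CPA")}

# Intersection: tuples present in both relations.
intersection = first & second

# Difference: tuples in the first that are not in the second.
difference = first - second

# Difference is not symmetric, just like subtraction.
reverse_difference = second - first
```

Swapping the operands of the Difference gives a different answer, which is a handy way to remember that order matters.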
The Relational Calculus, continued
- And now we come to a truly ugly operation! But we’ll see later that it can be useful…
- The Product of two Relations (not necessarily UC) is a third relation which has every tuple of the second appended to each tuple of the first.
- So if you had one Relation which was 3 tuples (rows) by 4 attributes (columns) and another which was 2 by 2, their product would have 6 tuples (3 times 2) each having 6 attributes (4+2), truly big and ugly!
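The 3-by-4 and 2-by-2 arithmetic can be checked with a short sketch. The values in the two relations are placeholders invented for the example; only their shapes matter.

```python
from itertools import product

# Placeholder relation: 3 tuples by 4 attributes.
r = [
    (1, "a", "x", True),
    (2, "b", "y", False),
    (3, "c", "z", True),
]

# Placeholder relation: 2 tuples by 2 attributes.
s = [
    ("p", 10),
    ("q", 20),
]

# Product: every tuple of the second appended to each tuple of the first.
prod = [r_tuple + s_tuple for r_tuple, s_tuple in product(r, s)]

tuple_count = len(prod)          # 3 times 2
attribute_count = len(prod[0])   # 4 plus 2
```

Tuple counts multiply and attribute counts add, which is why a Product grows so quickly.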
Relational Calculus Examples
- Here’s a relation called DOGS:
Relational Calculus Examples
- And another relation called CATS:
Relational Calculus Examples
- As you can see, the DOGS and CATS relations are Union-Compatible (UC) because they have the same Attributes with the same Domains
- Since DOGS has 5 Attributes with 3 Tuples and CATS has the same 5 Attributes with 4 Tuples, their Union would have the same 5 Attributes with 7 Tuples
Relational Calculus Examples
- Now let’s look at another Relation, called FOLK:
- FOLK is not UC with either DOGS or CATS
- However, we can make a Product of DOGS and FOLK
- It will be huge! And useless. 7 Attributes by 12 Tuples!
Relational Calculus Examples
- However ugly that Product is, we can make it useful! All we have to do is Select the Tuples for which the OWNER in DOGS is identical to the NAME in FOLK, and then Project only the Attributes of Name and Number! This will let us determine the Number (address on Street) for all the dogs!
- What we’ve just done is called a Join, a Product which has been Selected and Projected
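The Product, Select, Project recipe for a Join can be sketched as follows. The DOGS and FOLK tables from the slides are not reproduced in this transcript, so every row and most attribute names below are invented; only the shapes (3 dogs by 5 attributes, 4 folk by 2 attributes) and the OWNER, NAME and NUMBER attributes follow the slides.

```python
from itertools import product

# Invented DOGS rows: 3 tuples by 5 attributes.
dogs = [
    {"dog": "Rex", "breed": "Collie", "age": 3, "colour": "tan", "owner": "Ann"},
    {"dog": "Fido", "breed": "Beagle", "age": 7, "colour": "brown", "owner": "Bob"},
    {"dog": "Spot", "breed": "Terrier", "age": 9, "colour": "white", "owner": "Ann"},
]

# Invented FOLK rows: 4 tuples by 2 attributes.
folk = [
    {"name": "Ann", "number": 12},
    {"name": "Bob", "number": 7},
    {"name": "Cy", "number": 31},
    {"name": "Dee", "number": 4},
]

# Step 1, Product: 12 tuples, each with 7 attributes.
product_rows = [{**d, **f} for d, f in product(dogs, folk)]

# Step 2, Select: keep tuples where the dog's OWNER matches the person's NAME.
selected = [row for row in product_rows if row["owner"] == row["name"]]

# Step 3, Project: keep only the dog's name and the street Number.
joined = [{"dog": row["dog"], "number": row["number"]} for row in selected]
```

Of the 12 tuples in the Product, only 3 survive the Select, one per dog, and the Project trims each of them down to the two attributes we actually wanted.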
Relational Calculus, continued
- The Join operation is what makes the whole Relational idea useful
- It allows us to keep data in separate tables where they can easily be maintained, and then combine them whenever we need their data united
Files vs. Relations:
- Records in files can have “repeating groups”, i.e., several fields which are repeated, and in some records, some groups are omitted
- In Relations, there are no repeating groups; every tuple is the same size, and each tuple carries enough data to recombine them, so that there are no variable-sized records, or blank records
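As a small illustration of removing a repeating group, the sketch below turns variable-sized file records into fixed-size tuples. The student names and the second course code are invented; the list-of-courses layout is just one assumed way a file record might repeat a group.

```python
# Invented file records: the "courses" group repeats, and may be empty.
file_records = [
    {"student": "Ann", "courses": ["SYS364", "DBS201"]},
    {"student": "Bob", "courses": []},
    {"student": "Cy", "courses": ["SYS364"]},
]

# One relation for the students themselves: one tuple per student.
students = [(rec["student"],) for rec in file_records]

# One relation for the repeating group: one fixed-size tuple per course
# taken. Each tuple repeats the student name, which is the data needed
# to recombine it with its student later.
enrolment = [
    (rec["student"], course)
    for rec in file_records
    for course in rec["courses"]
]
```

Bob takes no courses, so he simply has no tuples in the second relation; nothing has to be left blank.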
Files vs. Relations, continued
- Files can be “chained” in much the same way as Relations can be Joined
- However, with files, you can only do “Equijoin” operations, that is, put together records from the two files which have matching fields (as we did with DOGS and FOLK)
- In Relations, however, you can Select a Product to create a Join using any criterion (condition)
- For example, you could Join a Payroll file with a TaxTable to link the appropriate Tax percentage for a Salary between two values in the TaxTable
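The Payroll and TaxTable idea can be sketched as a Join whose criterion is a range test rather than an equality. All salaries, bracket limits and rates below are invented, and the attribute names `low`, `high` and `rate` are assumptions about how such a table might be laid out.

```python
from itertools import product

# Invented Payroll relation.
payroll = [
    {"employee": "Ann", "salary": 28000},
    {"employee": "Bob", "salary": 52000},
    {"employee": "Cy", "salary": 95000},
]

# Invented TaxTable: each tuple covers salaries from low up to (not
# including) high. The rates are illustrative, not real tax rates.
tax_table = [
    {"low": 0, "high": 30000, "rate": 0.15},
    {"low": 30000, "high": 60000, "rate": 0.25},
    {"low": 60000, "high": 1000000, "rate": 0.35},
]

# Product, then Select on a "between" criterion instead of equality,
# then Project the attributes we want.
with_rates = [
    {"employee": p["employee"], "salary": p["salary"], "rate": t["rate"]}
    for p, t in product(payroll, tax_table)
    if t["low"] <= p["salary"] < t["high"]
]
```

No field in Payroll equals any field in the TaxTable, so an Equijoin could not link them; the range criterion can.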
Files vs. Relations, continued
- This often means that you can keep “Dictionary” data (which is relatively static and unchanging) separate from “Master” data (which is very volatile, constantly changing)
- With Transaction records, you can keep the “Header and Footer” data in one Table, and all the Detail data in another Table, which can then be indexed several ways, and re-united with its Header and Footer data any time it’s needed
File Media and Storage
- PCs: floppy disk, CD-ROM, cartridges, hard drives
- Large systems:
  - Archival storage
  - RAID 1 and 5
  - Magnetic tape
- Sizing: a few megabytes… who cares? Giga/Tera/Petabytes… you care.
DB transaction features
- Important in client/server architecture, i.e., e-commerce on the web
- Journaling: DB changes recorded
- Rollback: use the Journal to “undo” a change
- Commitment control (2 phase commit)
- Multiple record locking
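Journaling and Rollback can be illustrated with a toy in-memory table. This is only a sketch of the idea under simple assumptions (one user, no locking, no 2 phase commit); the class `JournaledTable` and its methods are hypothetical names, not the API of any real DBMS.

```python
class JournaledTable:
    """Toy table that journals each change so it can be undone."""

    def __init__(self):
        self.rows = {}
        # Each journal entry: (key, key_existed_before, old_value).
        self.journal = []

    def put(self, key, value):
        """Record the old state in the journal, then apply the change."""
        existed = key in self.rows
        self.journal.append((key, existed, self.rows.get(key)))
        self.rows[key] = value

    def commit(self):
        """Make the changes permanent by discarding the undo information."""
        self.journal.clear()

    def rollback(self):
        """Undo uncommitted changes, most recent first."""
        while self.journal:
            key, existed, old_value = self.journal.pop()
            if existed:
                self.rows[key] = old_value
            else:
                del self.rows[key]


accounts = JournaledTable()
accounts.put("A", 100)
accounts.commit()            # the opening balance is now permanent

accounts.put("A", 40)        # start of a transfer ...
accounts.put("B", 60)
accounts.rollback()          # ... which is abandoned and undone
```

After the Rollback the table is back to its last committed state: account A holds 100 again and account B no longer exists.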
Next Steps:
- Continue to read more about DBMS, especially Relational ones in Chapter 8
- Also, read about Normalization and Normal Forms
- With the new efficient Relational DBMS approach, an entirely different approach to System Design is feasible:
  - start with an Entity Relationship Chart
  - build a Relational DBMS (3NF)
  - experiment to create necessary reports with simple DBMS queries, and input dialogues using available DBMS tools!