2 INTRODUCTION
Relational databases underlie most modern integrated AISs. They are the most popular type of database used for transaction processing.
3 What Is a Database?
A database efficiently and centrally coordinates information for a related group of files.
- A file is a related group of records.
- A record is a related group of fields.
- A field is a specific attribute of interest for the entity (record).
A spreadsheet is a useful analogy. An Excel worksheet can be thought of as a file (e.g., customer) in which you capture as much information about your customers as you need. These “attributes” of your customers may include the customer number, customer name, address, and so on.
Each customer occupies one row of the spreadsheet, which is called a record in the database. Each column of the spreadsheet is an attribute of the customer. The cell where a row and a column meet is specific to that customer record and is considered a field.
For example, customer AAA Motor’s record would show 123 in the field for the customer number attribute, whereas customer ABC Motor’s record may show 130 in that field.
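The file, record, and field hierarchy can be sketched in a few lines of Python. This is only an illustration of the vocabulary, not how a DBMS stores data. The customer numbers come from the example above; the dictionary keys and variable names are assumptions made for this sketch.

```python
# Illustrative model of the file / record / field hierarchy.
# Key names ("customer_number", "customer_name") are hypothetical.

# A file is a related group of records (here: a list).
customer_file = [
    # A record is a related group of fields (here: a dict, one per customer).
    {"customer_number": 123, "customer_name": "AAA Motor"},
    {"customer_number": 130, "customer_name": "ABC Motor"},
]

# A field is the value of one attribute for one record,
# like a single cell in the spreadsheet analogy.
first_record = customer_file[0]
print(first_record["customer_number"])  # 123

# The attributes correspond to the spreadsheet's column headings.
attributes = list(first_record.keys())
print(attributes)  # ['customer_number', 'customer_name']
```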
4 FILE VS. DATABASES
Database systems were developed to address the problems associated with the proliferation of master files. For years, each time a new information need arose, companies created new files and programs. The result: a significant increase in the number of master files.
5 FILE VS. DATABASES
This proliferation of master files created problems:
- Often the same information was stored in multiple master files.
- It was more difficult to effectively integrate data and obtain an organization-wide view of the data.
- The same information may not have been consistent between files. If a customer changed their phone number, it may have been updated in one master file but not another.
[Figure: each program has its own master file. Master File 1 (Facts A, B, C, D) serves the Sales Program; Master File 2 (Facts A, C, E, F) serves the Shipping Program; Master File 3 (Facts A, D, E, G) serves the Billing Program.]
6 FILE VS. DATABASES
A database is a set of inter-related, centrally coordinated files.
[Figure: a single database holding Facts A through G, accessed through the Database Management System by the Sales, Shipping, and Billing Programs.]
7 FILE VS. DATABASES
The database approach treats data as an organizational resource that should be used by and managed for the entire organization, not just a particular department.
A database management system (DBMS) serves as the interface between the database and the various application programs.
8 FILE VS. DATABASES
The combination of the database, the DBMS, and the application programs that access the database is referred to as the database system.
9 FILE VS. DATABASES
The person responsible for the database is the database administrator.
As technology improves, many large companies are developing very large databases called data warehouses.
Data mining: analysis to identify relationships in the data, new knowledge about business processes, etc. Example: credit card issuers’ efforts to detect fraud.
10 FILE VS. DATABASES
[Figure: the file approach (Master Files 1 to 3, each tied to its own Sales, Shipping, or Billing Program) shown beside the database approach (one database holding Facts A through G, reached by all three programs through the Database Management System).]
11 Advantages of Database Systems
- Data integration: files are logically combined and made accessible to various systems.
- Data sharing: with data in one place, it is more easily accessed by authorized users.
- Reporting flexibility: reports can be revised easily and generated as needed, and the database can be easily browsed to research a problem or obtain the detailed information underlying a summary report.
12 Advantages of Database Systems
- Minimizing data redundancy and data inconsistency: eliminates the same data being stored in multiple files, thus reducing inconsistency in multiple versions of the same data.
- Data independence: data is separate from the programs that access it. Changes can be made to the data without necessitating a change in the programs, and vice versa.
13 Advantages of Database Systems
- Central management of data: data management is more efficient because a database administrator is responsible for coordinating, controlling, and managing data.
- Cross-functional analysis: data from various organizational departments can be more easily combined to reveal relationships. Example: the association between sales and promotional campaigns.
14 Advantages of Database Systems
- One-time data entry and storage: in the database approach to data management, data are input into the database once, stored in a particular location, and available for use by multiple applications and users.
15 IMPORTANCE AND ADVANTAGES OF DATABASE SYSTEMS
The importance of good data. Bad data leads to:
- Bad decisions
- Embarrassment
- Angry users
16 Database Users and Designers
Different users of the database information are at an external level of the database. These users have logical views of the data.
At an internal level of the database is the physical view of the data, which is how the data is actually physically stored in the system.
Designers of a database need to understand users’ needs and the conceptual level of the entire database as well as the physical view.
Users of the database who need information to make decisions will always have a logical view. Database administrators are generally concerned with the physical view. A user with a logical view does not need to be concerned with how the data is stored in the system.
17 Physical View
Physical views of data
In file-oriented systems, programmers must know the physical location and layout of records used by a program. They must reference the location, length, and format of every field they utilize. When data is used from several files, this process becomes more complex.
20 Schemas
Schemas describe the logical structure of a database.
- Conceptual level: organization-wide view of the data.
- External level: individual users’ views of the data. Each view is a subschema.
- Internal level: describes how data are stored and accessed. Includes descriptions of records, definitions, addresses, and indexes.
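The three schema levels can be sketched with Python's built-in sqlite3 module. This is a minimal illustration under stated assumptions: the table, view, and index names, the columns, and the single customer row are all hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual level: the organization-wide definition of the data.
    -- (Hypothetical table and columns.)
    CREATE TABLE customer (
        customer_no  INTEGER PRIMARY KEY,
        name         TEXT NOT NULL,
        state        TEXT,
        credit_limit REAL
    );
    INSERT INTO customer VALUES (1, 'Example Customer', 'AZ', 5000.0);

    -- External level: a subschema that exposes only what one group of
    -- users needs (here the credit limit is hidden).
    CREATE VIEW sales_rep_view AS
        SELECT customer_no, name, state FROM customer;

    -- Internal level: an index affects how data are stored and accessed,
    -- but it does not change what either view above looks like.
    CREATE INDEX idx_customer_name ON customer (name);
""")

row = conn.execute("SELECT * FROM sales_rep_view").fetchone()
print(row)  # (1, 'Example Customer', 'AZ')
```

A user querying sales_rep_view never needs to know that the index exists, which is the sense in which a logical view is independent of physical storage.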
21 Database Design
To design a database, you need to have a conceptual view of the entire database. The conceptual view illustrates the different files and the relationships between the files.
The data dictionary is a “blueprint” of the structure of the database and includes data elements, field types, programs that use the data element, outputs, and so on.
Table 4-1 provides a good example of a data dictionary. Think about how you would build a house: you wouldn’t just start pouring concrete and hammering nails into wood; you would first have a plan. The same concept applies to building a database. You first need a conceptualization of what information you want to have and what types of decisions you expect to make. From this conceptual view you begin to see how the data you want to capture needs to be laid out as a blueprint. These are the details to include in your data dictionary.
23 DBMS Languages
- Data Definition Language (DDL): builds the data dictionary, creates the database, describes logical views for each user, and specifies record or field security constraints.
- Data Manipulation Language (DML): changes the content in the database, including updates, insertions, and deletions.
- Data Query Language (DQL): enables users to retrieve, sort, and display specific data from the database.
- Report writer: simplifies report creation.
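In SQL-based systems the three language roles map roughly onto different statement types. The sketch below uses Python's built-in sqlite3 module; the customer table and its contents are hypothetical and chosen only to show one statement of each kind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: defines the structure of the database.
conn.execute(
    "CREATE TABLE customer (customer_no INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)

# DML: changes the content (insertions, updates, deletions).
conn.execute("INSERT INTO customer VALUES (?, ?)", (123, "AAA Motor"))
conn.execute("INSERT INTO customer VALUES (?, ?)", (130, "ABC Motor"))
conn.execute("UPDATE customer SET name = ? WHERE customer_no = ?", ("ABC Motor Co.", 130))
conn.execute("DELETE FROM customer WHERE customer_no = ?", (123,))

# DQL: retrieves, sorts, and displays data without changing it.
rows = conn.execute("SELECT customer_no, name FROM customer ORDER BY name").fetchall()
print(rows)  # [(130, 'ABC Motor Co.')]
```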
24 Relational Database
A relational database represents the conceptual and external schema as if that “data view” were truly stored in one table.
Although it appears to the user that the information is in one big table, it really is a set of tables that relate to one another.
25 Conceptual View Example
Customer Name | Sales Invoice # | Invoice Total
D. Ainge      | 101             | $1,447
G. Kite       | 102             | $4,394
              | 103             | $
              | 104             | $
F. Roberts    | 105             | $3,994
This slide provides an example of a conceptual view that shows customer sales information. It may seem that this information is held in one table; however, the data is really stored in a set of four related tables.
26 Relational Data Tables
[Figure: four relational tables. Invoice # is the PK of the Sales table, Customer # is the PK of the Customer table, Item # is the PK of the Inventory table, and the combination of Invoice # and Item # forms the PK of the Sales-Inventory table.]
The conceptual view from the previous slide is taken from these four relational tables. If all of this information resided in one table, it would be a VERY LARGE table with redundant data.
In the upper left of the slide we can see that invoice #101 (in the Sales table) is related to two items (in the Sales-Inventory table) that are sold to customer #151: a quantity of 2 of item #10 and a quantity of 1 of item #50. We then need to go to the Inventory table to see how much these items sell for, so that we can calculate the invoice total as the sum, over the items on the invoice, of [Quantity (from the Sales-Inventory table) * Unit Price (from the Inventory table)]. To find the name of the customer, we need to get that information from the Customer table.
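The lookup described above can be sketched as a single join using Python's built-in sqlite3 module. The invoice number, customer number, item numbers, and quantities come from the slide notes. Everything else is an assumption for illustration: the table and column names, the unit prices (placeholders picked so that the result agrees with the $1,447 shown for invoice 101 in the conceptual view), and the pairing of customer #151 with the name D. Ainge.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer  (customer_no INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE inventory (item_no INTEGER PRIMARY KEY, unit_price INTEGER);
    CREATE TABLE sales (
        invoice_no  INTEGER PRIMARY KEY,
        customer_no INTEGER REFERENCES customer (customer_no)
    );
    CREATE TABLE sales_inventory (
        invoice_no INTEGER REFERENCES sales (invoice_no),
        item_no    INTEGER REFERENCES inventory (item_no),
        quantity   INTEGER,
        PRIMARY KEY (invoice_no, item_no)  -- composite primary key
    );

    INSERT INTO customer VALUES (151, 'D. Ainge');        -- name pairing assumed
    INSERT INTO inventory VALUES (10, 499), (50, 449);    -- placeholder prices
    INSERT INTO sales VALUES (101, 151);
    INSERT INTO sales_inventory VALUES (101, 10, 2), (101, 50, 1);
""")

# Follow the keys: Sales -> Customer for the name,
# Sales -> Sales-Inventory -> Inventory for quantity * unit price.
row = conn.execute("""
    SELECT c.name, s.invoice_no, SUM(si.quantity * i.unit_price) AS invoice_total
    FROM sales AS s
    JOIN customer        AS c  ON c.customer_no = s.customer_no
    JOIN sales_inventory AS si ON si.invoice_no = s.invoice_no
    JOIN inventory       AS i  ON i.item_no     = si.item_no
    WHERE s.invoice_no = 101
    GROUP BY s.invoice_no, c.name
""").fetchone()
print(row)  # ('D. Ainge', 101, 1447)
```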
27 Types of Attributes
- Primary key: the attribute, or combination of attributes, that uniquely identifies a specific row (record) in a table.
- Foreign key: an attribute in a table that is a primary key in another table. Foreign keys are used to link tables.
- Other non-key attributes: store other important data about the entity.
28 Relational Data Tables
[Figure callouts: Primary Keys; Foreign Key.] Customer # is a foreign key in the Sales table because it is a primary key that uniquely identifies customers in the Customer table. Because of this, the Sales table can relate to the Customer table (see red arrow above).
For data to relate from one table to another, the tables must have a specific structure. Each table uses a primary key that uniquely identifies each row of information in that table: the primary key for the Sales table is Sales Invoice # and the primary key for the Customer table is Customer #.
How can we get these two tables to share information? Logically, sales and customers go together, and it is inefficient to have to look up who customer #151 is every time we retrieve our sales information.
For a user to have a certain conceptual view of the data (remember slide 4-9 conceptual view), a foreign key must reside in our transaction file (the Sales table). That foreign key is Customer #, the common attribute that connects the two tables. A user who wants the conceptual view showing the customer name for a sales invoice can have that information because the Sales table is now connected (related) to the Customer table.
It is important to understand this concept, as it will also help you write database queries that create a specific conceptual view of the database.
29 Why Have a Set of Related Tables?
Data stored in one large table can be redundant and inefficient, causing the following problems:
- Update anomaly: changes to existing data are not correctly recorded, because multiple records hold the same data attributes.
- Insert anomaly: unable to add a record to the database.
- Delete anomaly: removing a record also removes unintended data from the database.
30 Database Design Errors
Alternatives for storing data
One possible alternate approach would be to store all data in one uniform table. For example, instead of separate tables for students and classes at a university, we could store all data in one table and have a separate line for each student x class combination.
31 [Table: one row per student x class combination, with columns Student ID, Last Name, First Name, Phone No., Course No., Section, Day, and Time. Alice Simpson has rows for ACCT-3603, FIN-3213, and MGMT-3021; Ned Sanders has three rows, including ACCT-3433 and ANSI-1422; Artie Moore is also listed.]
Using the suggested approach, a student taking three classes would need three rows in the table.
In the above, simplified example, a number of problems arise.
32 Suppose Alice Simpson changes her phone number. You need to make the change in three places. If you fail to change it in all three places or change it incorrectly in one place, then the records for Alice will be inconsistent.
This problem is referred to as an update anomaly.
33 What happens if you have a new student to add, but he hasn’t signed up for any courses yet? Or what if there is a new class to add, but there are no students enrolled in it yet? In either case, the record will be partially blank.
This problem is referred to as an insert anomaly.
34 If Ned withdraws from all his classes and you eliminate all three of his rows from the table, then you will no longer have a record of Ned. If Ned is planning to take classes next semester, then you probably didn’t really want to delete all records of him.
This problem is referred to as a delete anomaly.
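The three anomalies can be reproduced with a small single-table model in plain Python. This is a sketch only: the student names and course numbers follow the slides, while the phone numbers, the reduced set of columns, and the variable names are made up for illustration.

```python
# One flat "table": one row per student x class combination.
# Phone numbers are hypothetical; only three columns are kept for brevity.
flat = [
    {"student": "Alice Simpson", "phone": "555-0100", "course": "ACCT-3603"},
    {"student": "Alice Simpson", "phone": "555-0100", "course": "FIN-3213"},
    {"student": "Alice Simpson", "phone": "555-0100", "course": "MGMT-3021"},
    {"student": "Ned Sanders",   "phone": "555-0101", "course": "ACCT-3433"},
]

# Update anomaly: the new phone number is recorded in only one of three rows,
# so Alice now has two different phone numbers on file.
flat[0]["phone"] = "555-0199"
alice_phones = {r["phone"] for r in flat if r["student"] == "Alice Simpson"}
print(len(alice_phones))  # 2

# Insert anomaly: a student with no classes can only be stored
# as a partially blank row.
flat.append({"student": "Artie Moore", "phone": "555-0102", "course": None})
blank_rows = [r for r in flat if r["course"] is None]
print(len(blank_rows))  # 1

# Delete anomaly: removing Ned's class rows removes every trace of Ned.
flat = [r for r in flat if r["student"] != "Ned Sanders"]
ned_known = any(r["student"] == "Ned Sanders" for r in flat)
print(ned_known)  # False
```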
35 Database Design Errors
Alternatives for storing data
Another possible approach would be to store each student in one row of the table and create multiple columns to accommodate each class that he is taking.
36 [Table: one row per student, with columns Student ID, Last Name, First Name, Phone No., Class 1, Class 2, Class 3, and Class 4. Alice Simpson: ACCT-3603, FIN-3213, MGMT-3021. Ned Sanders: ACCT-3433, ANSI-1422. Artie Moore: no classes.]
This approach is also fraught with problems:
- How many classes should you allow in building the table?
- The above table is quite simplified. In reality, you might need to allow for 20 or more classes (assuming a student could take many 1-hour classes). Also, more information than just the course number would be stored for each class. There would be a great deal of wasted space for all the students taking fewer than the maximum possible number of classes.
- If you wanted a list of every student taking MGMT-3021, notice that you would have to search multiple attributes.
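Both problems, wasted space and searching multiple attributes, show up in a short plain-Python model of the one-row-per-student layout. The names and course numbers follow the slide; the column keys and the assignment of Ned's courses to particular class columns are assumptions.

```python
# One row per student, with a fixed number of class columns.
class_columns = ["class_1", "class_2", "class_3", "class_4"]
wide = [
    {"student": "Alice Simpson", "class_1": "ACCT-3603", "class_2": "FIN-3213",
     "class_3": "MGMT-3021", "class_4": None},
    {"student": "Ned Sanders", "class_1": "ACCT-3433", "class_2": "ANSI-1422",
     "class_3": None, "class_4": None},
    {"student": "Artie Moore", "class_1": None, "class_2": None,
     "class_3": None, "class_4": None},
]

# Finding everyone in MGMT-3021 means checking every class column in turn.
takers = [
    r["student"]
    for r in wide
    if any(r[col] == "MGMT-3021" for col in class_columns)
]
print(takers)  # ['Alice Simpson']

# Every unused class slot is wasted space.
empty_cells = sum(r[col] is None for r in wide for col in class_columns)
print(empty_cells)  # 7
```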
37 [Figure: a Student x Course table with columns Student ID and Course ID.]
The solution to the preceding problems is to use a set of tables in a relational database. Each entity is stored in a separate table, and separate tables or foreign keys can be used to link the entities together.
38 Relational Database Design Rules
1. Every column in a row must be single valued.
2. The primary key cannot be null (empty). This is also known as entity integrity.
3. If a foreign key is not null, it must have a value that corresponds to the value of a primary key in another table (referential integrity).
4. All other attributes in the table must describe characteristics of the object identified by the primary key.
Following these rules allows databases to be normalized and solves the update, insert, and delete anomalies.
Let’s look at how these design rules work with the same two tables (Sales and Customer).
Rule 1: Every column in a row must be single valued. Looking at both tables, there is no more than one piece of information in any of the fields. If, for example, the Customer table record for customer #151 had two states in the cell (AZ, CA), that would be two values in one field, which violates this rule.
Rule 2: The primary key cannot be empty. In both tables, every row has a value for the primary key attribute.
Rule 3: If a foreign key is not null, it must have a value that corresponds to the value of a primary key in another table. As shown in the slide, this is true because the foreign key in the Sales table (Customer #) has referential integrity with the Customer table. If, for example, the Sales table had customer #399 but there were no customer #399 in the Customer table, this would violate referential integrity between the two tables.
Rule 4: All other attributes in the table must describe characteristics of the object identified by the primary key. In the Sales table, all of the non-key attributes (Date and Salesperson) do indeed describe information associated with the sales invoice #. The same holds for the Customer table, as all the non-key attributes (customer name, street, city, and state) describe information associated with the customer #. If, for example, vendor information were found in the Customer table, this rule would be violated.
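Rule 3 is something a DBMS can enforce on its own. The sketch below uses Python's built-in sqlite3 module to show the customer #399 case being rejected. The table and column names and the customer name are assumptions; note that SQLite checks foreign keys only after the PRAGMA shown is turned on, and other DBMSs may behave differently by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customer (
        customer_no INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE sales (
        invoice_no  INTEGER PRIMARY KEY,
        customer_no INTEGER REFERENCES customer (customer_no)  -- foreign key
    );
    INSERT INTO customer VALUES (151, 'Example Customer');  -- hypothetical name
    INSERT INTO sales VALUES (101, 151);   -- matches an existing customer
    INSERT INTO sales VALUES (102, NULL);  -- a null foreign key is permitted
""")

# Customer #399 does not exist, so referential integrity rejects the row.
try:
    conn.execute("INSERT INTO sales VALUES (103, 399)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True

invoice_count = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
print(invoice_count)  # 2
```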
39 Basic requirements of a relational database
Every column (field) in a row must be single valued. In other words, every cell can have one and only one value.
[Figure: a Student table (Student#, Name, Address: 1 Tony, Cleveland; 2 Emily, New York; 3 Leigh, Birmingham) and a Course table (Course#, Name: Acg4401 AIS; Acg3101 FAR 1). The starred columns hold several values in one cell, Course# "Acg4401, Acg3101" and Student# "1, 2, 3".]
40 Note that within each table, there are no duplicate primary keys and no null primary keys. This is consistent with the entity integrity rule.
[Figure: a Student table (Student#, Name, Address: 1 Tony, Cleveland; 2 Emily, New York; 3 Leigh, Birmingham), a Course table (Course#, Name: Acg4401 AIS; Acg3101 FAR 1), and a relationship table with columns Student# and Course#.]
41 FK example
[Figure callout: a foreign key value that is not in the Salesperson table.] Referential integrity would prevent this from happening.
42 RELATIONAL DATABASES
An important feature is that data about various things of interest (entities) are stored in separate tables.
- Makes it easier to add new data to the system. You add a new student by adding a row to the student table. You add a new course by adding a row to the course table.
- Means you can add a student even if he hasn’t signed up for any courses, and you can add a class even if no students are yet enrolled in it.
- Makes it easy to avoid the insert anomaly.
Space is also used more efficiently than in the other schemes. There should be no blank rows or attributes.
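43 [Figure: callouts on the student table, the course table, and the Student x Course table (columns Student ID, Course ID).]
- Student table: add a student here. Leaves no blank spaces.
- Course table: add a course here. Leaves no blank spaces.
- Student x Course table: when a particular student enrolls for a particular course, add that info here.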
44 RELATIONAL DATABASES
Deletion of a class for a student would cause the elimination of one record in the student x class table.
- The student still exists in the student table.
- The class still exists in the class table.
- Avoids the delete anomaly.
45 [Figure: callouts on the student table, the course table, and the Student x Course table (columns Student ID, Course ID).]
- If Ned Sanders drops ACCT-3603, remove Ned’s class from the Student x Course table.
- Ned still exists in the student table.
- Even if Ned was the only student in the class, ACCT-3603 still exists in the course table.
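The add and drop cases on the last few slides can be walked through with Python's built-in sqlite3 module. This is a sketch under assumptions: the table and column names and the student IDs are hypothetical, while the student names and course numbers follow the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE course  (course_id TEXT PRIMARY KEY);
    CREATE TABLE student_course (
        student_id INTEGER REFERENCES student (student_id),
        course_id  TEXT    REFERENCES course (course_id),
        PRIMARY KEY (student_id, course_id)
    );

    -- Ned is enrolled in ACCT-3603 (student IDs are hypothetical).
    INSERT INTO student VALUES (1, 'Ned Sanders');
    INSERT INTO course VALUES ('ACCT-3603');
    INSERT INTO student_course VALUES (1, 'ACCT-3603');

    -- No insert anomaly: a student with no classes and a class with no
    -- students are each one complete row in their own table.
    INSERT INTO student VALUES (2, 'Artie Moore');
    INSERT INTO course VALUES ('FIN-3213');

    -- No delete anomaly: Ned dropping ACCT-3603 removes only the enrollment.
    DELETE FROM student_course WHERE student_id = 1 AND course_id = 'ACCT-3603';
""")

students = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
courses = conn.execute("SELECT COUNT(*) FROM course").fetchone()[0]
enrollments = conn.execute("SELECT COUNT(*) FROM student_course").fetchone()[0]
print(students, courses, enrollments)  # 2 2 0
```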
46 Database Design
There are two basic ways to design well-structured relational databases:
- Normalization
- Semantic data modeling (chapter 17)
47 Normalizing Relational Databases
Initially, one table is used for all the data in a database. Following the normalization rules, that table is decomposed into multiple tables related by primary key–foreign key integration. The decomposed set of tables is in third normal form (3NF).
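The idea of decomposing one table into several can be illustrated in plain Python. This is not the formal normalization procedure, only a sketch of its result for the student and class example. The phone numbers and generated student IDs are hypothetical, and the sketch assumes student names are unique.

```python
# Starting point: one flat table of (student name, phone, course) rows.
flat = [
    ("Alice Simpson", "555-0100", "ACCT-3603"),
    ("Alice Simpson", "555-0100", "FIN-3213"),
    ("Ned Sanders",   "555-0101", "ACCT-3433"),
]

# Student table: one row per student, keyed by a generated student ID.
students = {}
for name, phone, _course in flat:
    if name not in students:
        students[name] = {"student_id": len(students) + 1, "phone": phone}

# Course table: one row per distinct course.
courses = sorted({course for _name, _phone, course in flat})

# Student x Course table: only the two keys, linking the other tables.
enrollments = [(students[name]["student_id"], course) for name, _phone, course in flat]

print(len(students))  # 2
print(courses)        # ['ACCT-3433', 'ACCT-3603', 'FIN-3213']
print(enrollments)    # [(1, 'ACCT-3603'), (1, 'FIN-3213'), (2, 'ACCT-3433')]
```

Each phone number is now stored once, so a change of number touches a single row.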