Presentation on theme: "Storing Organizational Information—Databases"— Presentation transcript:
1 Storing Organizational Information—Databases
CHAPTER 7
CLASSROOM OPENER
GREAT BUSINESS DECISIONS – Edgar Codd’s Relational Database Theory
Edgar Frank Codd was born in Portland, Dorset, England. He studied mathematics and chemistry at Exeter College, Oxford, before serving as a pilot in the Royal Air Force during the Second World War. In 1948, he moved to New York to work for IBM as a mathematical programmer. In 1953 Codd moved to Ottawa, Canada. A decade later he returned to the USA and received his doctorate in computer science from the University of Michigan in Ann Arbor. Two years later he moved to San Jose, California to work at IBM's Almaden Research Center.
In the 1960s and 1970s he worked out his theories of data arrangement, issuing his paper "A Relational Model of Data for Large Shared Data Banks" in 1970, after an internal IBM paper one year earlier. To his disappointment, IBM proved slow to exploit his suggestions until commercial rivals started implementing them.
Initially, IBM refused to implement the relational model in order to preserve revenue from IMS/DB. Codd then showed IBM customers the potential of an implementation of his model, and they in turn pressured IBM. IBM then included a System R subproject in its Future Systems project, but it put developers in charge who were not thoroughly familiar with Codd's ideas, and it isolated the team from Codd. As a result, they did not use Codd's own Alpha language but created a non-relational one, SEQUEL. Even so, SEQUEL was so superior to pre-relational systems that it was copied, based on pre-launch papers presented at conferences, by Larry Ellison in his Oracle DBMS, which actually reached the market before SQL/DS. By then SEQUEL had been renamed SQL, because the original name was already proprietary.
Codd continued to develop and extend his relational model, sometimes in collaboration with Chris Date. One of the normal forms, the Boyce-Codd Normal Form, is named after Codd. Codd also coined the term OLAP and wrote the twelve laws of online analytical processing, although these were never truly accepted after it came out that his white paper on the subject was paid for by a software vendor. Edgar F. Codd died of heart failure at his home in Williams Island, Florida at the age of 79 on Friday, April 18, 2003.
2 LEARNING OUTCOMES
7.1 Define the fundamental concepts of the relational database model
The relational database model stores information in the form of logically related two-dimensional tables. Entities, attributes, primary keys, and foreign keys are all fundamental concepts of the relational database model.
7.2 Evaluate the advantages of the relational database model
Database advantages from a business perspective include:
Increased flexibility
Increased scalability and performance
Reduced information redundancy
Increased information integrity (quality)
Increased information security
7.3 Compare relational integrity constraints and business-critical integrity constraints
Relational integrity constraints are rules that enforce basic and fundamental information-based constraints. Business-critical integrity constraints are rules that enforce business rules vital to an organization’s success and often require more insight and knowledge than relational integrity constraints.
3 LEARNING OUTCOMES
7.4 Describe the benefits of a data-driven website
A data-driven website is an interactive website kept constantly updated and relevant to the needs of its customers through the use of a database. Data-driven websites are especially useful when the site offers a great deal of information, products, or services. Website visitors are frequently angered if they are buried under an avalanche of information when searching a website. A data-driven website invites visitors to select and view what they are interested in by entering a query; the website analyzes the query and custom-builds a webpage in real time that satisfies it.
7.5 Describe the two primary methods for integrating information across multiple databases
Forward integration – takes information entered into a given system and sends it automatically to all downstream systems and processes.
Backward integration – takes information entered into a given system and sends it automatically to all upstream systems and processes.
4 RELATIONAL DATABASE FUNDAMENTALS
Information is everywhere in an organization
Information is stored in databases
Database – maintains information about various types of objects (inventory), events (transactions), people (employees), and places (warehouses)
How many of you are familiar with databases?
What kinds of databases can be found around your college?
Student registration
Course evaluation
Payroll
Parking services
Explain to your students that almost every business decision is based on information. The information required to make these decisions is typically stored in databases.
5 RELATIONAL DATABASE FUNDAMENTALS
Database models include:
Hierarchical database model – information is organized into a tree-like structure using parent/child relationships, which limits the relationships it can represent
Network database model – a flexible way of representing objects and their relationships
Relational database model – stores information in the form of logically related two-dimensional tables
Most organizations use the relational database model
This text focuses on the relational database model
Discuss the Coca-Cola Bottling Company of Egypt example in the text
6 Entities and Attributes
Entity – a person, place, thing, transaction, or event about which information is stored
The rows in each table contain the entities
In Figure 7.1, CUSTOMER includes the Dave’s Sub Shop and Pizza Palace entities
Attributes (fields, columns) – characteristics or properties of an entity class
The columns in each table contain the attributes
In Figure 7.1, attributes for CUSTOMER include Customer ID, Customer Name, and Contact Name
Review Figure 7.1
What kinds of additional entity classes might be found in this database?
INVENTORY, MARKETING CAMPAIGN, SALES QUOTE, INVOICE, PAYMENT
What kinds of additional entities might be found in the CUSTOMER table?
Could include any additional customer: Joe’s Mexican Restaurant, Fitness Forever, and Summer’s Flower Shop (these are all fictitious)
What kinds of additional attributes might be found in the CUSTOMER table for Dave’s Sub Shop?
Could include any additional customer information:
Address
Fax
Cell phone
7 Keys and Relationships
Primary keys and foreign keys identify the various entity classes (tables) in the database
Primary key – a field (or group of fields) that uniquely identifies a given entity in a table
Foreign key – a primary key of one table that appears as an attribute in another table and acts to provide a logical relationship between the two tables
Review Figure 7.1
Explain to your students that the logic that correlates the tables is implemented through the primary keys
For example: Hawkins Shipping in the DISTRIBUTOR table has a primary key called Distributor ID, with the value DEN8001
Notice that Hawkins Shipping (Distributor ID DEN8001) is responsible for delivering orders
Therefore, Distributor ID in the ORDER table creates a logical relationship (who shipped what order) between ORDER and DISTRIBUTOR
8 Keys and Relationships
Potential relational database for Coca-Cola
Walk your students through the relational database model in Figure 7.1
To ensure your students are grasping the concepts, ask them to answer the following:
How many orders have been placed for T’s Fun Zone?
Ans: 1 (Order 34563)
How many orders have been placed for Pizza Palace?
Ans: None
How many items are included in Dave’s Sub Shop’s two orders?
Ans: One order has 3 items and the other has 1 item, for a total of 4 items in both orders
Who is responsible for distributing Dave’s Sub Shop’s orders?
Ans: Hawkins Shipping
Which products are included in Order 34562?
Ans: 300 Vanilla Coke
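The classroom questions above are, in effect, queries that follow foreign keys from one table to another. The following is a minimal sketch using Python's built-in sqlite3 module. The table layouts, column names, numeric IDs, and the second distributor ("Example Freight") are assumptions made for illustration; only the customer names, Hawkins Shipping, and DEN8001 come from the notes, and the rows are not a copy of Figure 7.1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE distributor (
    distributor_id   TEXT PRIMARY KEY,
    distributor_name TEXT NOT NULL
);
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL
);
CREATE TABLE orders (
    order_id       INTEGER PRIMARY KEY,
    customer_id    INTEGER NOT NULL REFERENCES customer(customer_id),       -- foreign key
    distributor_id TEXT    NOT NULL REFERENCES distributor(distributor_id)  -- foreign key
);
""")

# Hypothetical rows: IDs and the second distributor are invented for the example.
conn.executemany("INSERT INTO distributor VALUES (?, ?)", [
    ("DEN8001", "Hawkins Shipping"),
    ("XYZ0001", "Example Freight"),
])
conn.executemany("INSERT INTO customer VALUES (?, ?)", [
    (1, "Dave's Sub Shop"), (2, "Pizza Palace"), (3, "T's Fun Zone"),
])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    (1, 1, "DEN8001"), (2, 1, "DEN8001"), (3, 3, "XYZ0001"),
])

# "How many orders have been placed for each customer?"
# LEFT JOIN keeps customers that have no orders at all.
orders_per_customer = dict(conn.execute("""
    SELECT c.customer_name, COUNT(o.order_id)
    FROM customer AS c
    LEFT JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id
    ORDER BY c.customer_id
""").fetchall())

# "Who is responsible for distributing Dave's Sub Shop's orders?"
distributors = [row[0] for row in conn.execute("""
    SELECT DISTINCT d.distributor_name
    FROM orders AS o
    JOIN customer    AS c ON c.customer_id    = o.customer_id
    JOIN distributor AS d ON d.distributor_id = o.distributor_id
    WHERE c.customer_name = ?
""", ("Dave's Sub Shop",))]

print(orders_per_customer)
print(distributors)
```

Each join condition pairs a foreign key in orders with the primary key of the table it refers to, which is the "logical relationship" the slide describes.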
9 RELATIONAL DATABASE ADVANTAGES
Database advantages from a business perspective include:
Increased flexibility
Increased scalability and performance
Reduced information redundancy
Increased information integrity (quality)
Increased information security
A good way to explain databases is to compare them to spreadsheets
What are the limitations when using a spreadsheet?
Limited number of rows and columns (Excel: 65,536 rows by 256 columns). Once you use more than 65,536 rows you have outgrown your spreadsheet
Only one user can access the spreadsheet at a time
Users can view all information in the spreadsheet
Users can change all information in the spreadsheet
All of the disadvantages associated with a spreadsheet are addressed when using a database
These advantages are discussed in detail over the next several slides
10 Increased Flexibility
A well-designed database should:
Handle changes quickly and easily
Provide users with different views
Have only one physical view
Physical view – deals with the physical storage of information on a storage device
Have multiple logical views
Logical view – focuses on how users logically access information
The separation between logical and physical views is what allows each user to access database information differently
What would happen if a new database called “RealData” hit the market and allowed only one logical view?
The “RealData” database simply would never sell. With only one logical view, every person in an entire organization would have the same view
Define two database views for your school’s student database (one for students, and one for instructors)
What does the student view display when a student accesses the school’s student database?
Courses enrolled
Grades
Tuition
Credits for graduation
What does the instructor view display when an instructor accesses the school’s student database?
Courses teaching
Students in each course
Payment information
Vacation time
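The student and instructor views in this exercise can be sketched as two SQL views over a single stored table. This is a minimal illustration using Python's built-in sqlite3 module; the table, its columns, and every row are hypothetical, and a real student system would hold far more (tuition, payment information, and so on).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One physical table: how the information is actually stored.
CREATE TABLE enrollment (
    student    TEXT NOT NULL,
    course     TEXT NOT NULL,
    instructor TEXT NOT NULL,
    grade      TEXT
);
INSERT INTO enrollment VALUES
    ('Ana', 'MIS 101', 'Dr. Lee', 'A'),
    ('Ben', 'MIS 101', 'Dr. Lee', 'B'),
    ('Ana', 'ACC 200', 'Dr. Roy', 'B');

-- Two logical views: how different users see that same information.
CREATE VIEW student_view AS
    SELECT student, course, grade
    FROM enrollment;

CREATE VIEW instructor_view AS
    SELECT instructor, course, COUNT(*) AS students
    FROM enrollment
    GROUP BY instructor, course;
""")

# A student sees their own courses and grades.
student_rows = conn.execute(
    "SELECT course, grade FROM student_view WHERE student = ? ORDER BY course",
    ("Ana",),
).fetchall()

# An instructor sees the courses they teach and the class size.
instructor_rows = conn.execute(
    "SELECT course, students FROM instructor_view WHERE instructor = ?",
    ("Dr. Lee",),
).fetchall()

print(student_rows)     # [('ACC 200', 'B'), ('MIS 101', 'A')]
print(instructor_rows)  # [('MIS 101', 2)]
```

Neither view stores any data of its own, so changing how enrollment is stored would not require changing what each user sees.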
11 Increased Scalability and Performance
A database must scale to meet increased demand, while maintaining acceptable performance levels
Scalability – refers to how well a system can adapt to increased demands
Performance – measures how quickly a system performs a certain process or transaction
What happens to a business if it suddenly experiences a 60 percent growth in sales and its IT systems fail with all of the increased activity?
Remind your students that a big part of developing successful IT systems is being able to anticipate future growth
CLASSROOM EXERCISE
Building an ER Diagram
Break your students into groups and ask them to create an entity relationship diagram similar to the one in Figure 7.1 for a company or product of their choice. If the students are uncomfortable with databases, you should recommend that they stick to a company similar to TCCBCE (the Coca-Cola Bottling Company of Egypt), perhaps a snack food producer, mountain bike equipment producer, or even a footwear producer. If your students are more comfortable with databases, ask them to choose a company that would challenge them, such as a fast food restaurant, online bookseller, or even a university’s course registration system.
The important part of this exercise is for your students to begin to understand how the tables in a database relate. Be sure their ER diagrams include primary keys and foreign keys. Have your students present their ER diagrams to the class and ask the students to find any potential errors in the diagrams.
12 Reduced Information Redundancy
Databases reduce information redundancy
Redundancy – the duplication of information or storing the same information in multiple places
Inconsistency is one of the primary problems with redundant information
One of the primary goals of a database is to eliminate information redundancy by recording each piece of information in only one place
This is a good time to tie the discussion back to the material on low-quality information in the previous chapter
Recall what happens when a single customer is stored twice with different phone numbers, addresses, or order information in a single database
13 Increased Information Integrity (Quality)
Information integrity – measures the quality of information
Integrity constraint – rules that help ensure the quality of information
Relational integrity constraint – rule that enforces basic and fundamental information-based constraints
Business-critical integrity constraint – rule that enforces business rules vital to an organization’s success and often requires more insight and knowledge than relational integrity constraints
Can you define two relational integrity constraints for an ordering system?
Users cannot create an order for a nonexistent customer
An order cannot be shipped without an address
Can you define two business-critical integrity constraints for an ordering system?
Product returns are not accepted for fresh product 15 days after purchase
A discount maximum of 20 percent
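Two of the constraints listed above can be expressed directly in a table definition, so that the database itself rejects bad rows. The sketch below uses Python's built-in sqlite3 module: a foreign key stands in for "no order for a nonexistent customer" and a CHECK clause for "a discount maximum of 20 percent". The table names, column names, and the try_insert helper are hypothetical, chosen only for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id     INTEGER PRIMARY KEY,
    -- Relational integrity: every order must point at an existing customer.
    customer_id  INTEGER NOT NULL REFERENCES customer(customer_id),
    -- Business-critical integrity: discounts are capped at 20 percent.
    discount_pct REAL NOT NULL DEFAULT 0 CHECK (discount_pct BETWEEN 0 AND 20)
);
INSERT INTO customer VALUES (1, 'Dave''s Sub Shop');
""")


def try_insert(customer_id, discount_pct):
    """Attempt to store an order; return True if the database accepted it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO orders (customer_id, discount_pct) VALUES (?, ?)",
                (customer_id, discount_pct),
            )
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert(1, 10))   # existing customer, allowed discount
print(try_insert(99, 10))  # customer 99 does not exist
print(try_insert(1, 35))   # discount above the 20 percent cap
```

The first call is accepted and the other two are rejected. The 20 percent cap is a business decision rather than a structural rule, which is why the slide classes it as business-critical even though the mechanism that enforces it is equally simple.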
14 Increased Information Security
Information is an organizational asset and must be protected
Databases offer several security features including:
Password – provides authentication of the user
Access level – determines who has access to the different types of information
Access control – determines types of user access, such as read-only access
Why would you want to define access level security?
Access levels will typically mimic the hierarchical structure of the organization and protect organizational information from being viewed and manipulated by individuals who should not have access to sensitive or confidential information
Low-level employees typically have the lowest levels of access
High-level employees typically have access to all types of database information
For example, you would not want analysts viewing all salary information for the entire company. In general:
Analysts can usually only view their own salary
Managers have higher access and can view the salaries of all their team members, but cannot view other managers’ salaries
Directors can view all of their managers’ and analysts’ salaries, but not other directors’ salaries
The CFO and CEO can view every employee’s salary
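The salary example above amounts to a rule over the reporting hierarchy: a user may see their own salary and the salaries of anyone who reports to them, directly or indirectly, while the CEO and CFO may see everyone's. The following is a minimal Python sketch of that rule. The employee names, the STAFF table, and the function names are hypothetical, and a real DBMS would enforce this through its own access-control features rather than application code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Employee:
    name: str
    role: str
    manager: Optional[str] = None  # name of the direct superior, if any


# Hypothetical organization chart used only for this example.
STAFF = {e.name: e for e in [
    Employee("Chris", "CEO"),
    Employee("Dana", "Director", manager="Chris"),
    Employee("Mo", "Manager", manager="Dana"),
    Employee("Max", "Manager", manager="Dana"),
    Employee("Al", "Analyst", manager="Mo"),
    Employee("Ann", "Analyst", manager="Max"),
]}


def reports_to(employee, boss):
    """Return True if `boss` appears anywhere above `employee` in the hierarchy."""
    current = STAFF[employee].manager
    while current is not None:
        if current == boss:
            return True
        current = STAFF[current].manager
    return False


def can_view_salary(viewer, target):
    """Apply the access-level rule described in the notes."""
    if viewer == target:
        return True                      # everyone can see their own salary
    if STAFF[viewer].role in {"CEO", "CFO"}:
        return True                      # top executives can see every salary
    return reports_to(target, viewer)    # otherwise only people below the viewer


print(can_view_salary("Al", "Ann"))    # False: analysts see only their own salary
print(can_view_salary("Mo", "Al"))     # True: Al is on Mo's team
print(can_view_salary("Mo", "Max"))    # False: Max is another manager
print(can_view_salary("Dana", "Ann"))  # True: Ann reports to Dana through Max
```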
15 DATABASE MANAGEMENT SYSTEMS
Database management systems (DBMS) – software through which users and application programs interact with a database
Discuss the two primary forms of user interaction with a database
Direct interaction
The user interacts directly with the DBMS
The DBMS obtains the information from the database
Indirect interaction
The user interacts with an application (e.g., payroll application, manufacturing application, sales application)
The application interacts with the DBMS
16 Data-Driven Websites
Data-driven website – an interactive website kept constantly updated and relevant to the needs of its customers through the use of a database
Data-driven websites are especially useful when the site offers a great deal of information, products, or services. Website visitors are frequently angered if they are buried under an avalanche of information when searching a website. A data-driven website invites visitors to select and view what they are interested in by entering a query; the website analyzes the query and custom-builds a webpage in real time that satisfies it. The figure displays a Wikipedia user querying business intelligence and the database sending back the appropriate webpage that satisfies the user’s request
Ask your students what would happen to a website that is not data-driven
The website’s owners would need to continually update the website data manually as the business data is updated. This would be a redundant effort that would most likely result in errors, and the website could quickly become out of sync with the business data
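The query-then-build cycle described above can be sketched in a few lines. The example below uses Python's built-in sqlite3 and html modules; the article table, its placeholder rows, and the build_page function are hypothetical and stand in for whatever database and templating layer a real site would use.

```python
import html
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE article (
    title   TEXT PRIMARY KEY,
    summary TEXT NOT NULL
);
INSERT INTO article VALUES
    ('Business intelligence', 'Placeholder summary text for the example.'),
    ('Database', 'Another placeholder summary.');
""")


def build_page(query):
    """Build an HTML page on demand from whatever the database holds right now."""
    rows = conn.execute(
        "SELECT title, summary FROM article WHERE title LIKE ? ORDER BY title",
        (f"%{query}%",),
    ).fetchall()
    items = "".join(
        f"<li><b>{html.escape(title)}</b>: {html.escape(summary)}</li>"
        for title, summary in rows
    ) or "<li>No results</li>"
    return f"<h1>Results for {html.escape(query)}</h1><ul>{items}</ul>"


print(build_page("business"))

# Updating the data changes the page; the page-building code is untouched.
conn.execute(
    "UPDATE article SET summary = ? WHERE title = ?",
    ("Revised placeholder summary.", "Business intelligence"),
)
print(build_page("business"))
```

The second call returns the revised summary without anyone editing a page by hand, which is the contrast with a static site drawn in the notes.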
17 Data-Driven Website Business Advantages
Data-driven website advantages:
Development: Allows the website owner to make changes at any time, without having to rely on a developer or know HTML programming. A well-structured, data-driven website enables updating with little or no training.
Content management: A static website requires a programmer to make updates. This adds an unnecessary layer between the business and its web content, which can lead to misunderstandings and slow turnarounds for desired changes.
Future expandability: Having a data-driven website enables the site to grow faster than would be possible with a static site. Changing the layout, displays, and functionality of the site (adding more features and sections) is easier with a data-driven solution.
Minimizing human error: Even the most competent programmer charged with the task of maintaining many pages will overlook things and make mistakes. This will lead to bugs and inconsistencies that can be time consuming and expensive to track down and fix. Unfortunately, users who come across these bugs will likely become irritated and may leave the site. A well-designed, data-driven website will have “error trapping” mechanisms to ensure that required information is filled out correctly and that content is entered and displayed in its correct format.
Cutting production and update costs: A data-driven website can be updated and “published” by any competent data entry or administrative person. In addition to being convenient and more affordable, changes and updates will take a fraction of the time that they would with a static site. While training a competent programmer can take months or even years, training a data entry person can be done in 30 to 60 minutes.
More efficient: By their very nature, computers are excellent at keeping volumes of information intact. With a data-driven solution, the system keeps track of the templates, so users do not have to. Global changes to layout, navigation, or site structure need to be programmed only once, in one place, and the site itself will take care of propagating those changes to the appropriate pages and areas. A data-driven infrastructure will improve the reliability and stability of a website, while greatly reducing the chance of “breaking” some part of the site when adding new areas.
Improved stability: Any programmer who has to update a website from “static” templates must be very organized to keep track of all the source files. If a programmer leaves unexpectedly, existing work may have to be re-created if those source files cannot be found. Plus, if there were any changes to the templates, the new programmer must be careful to use only the latest version. With a data-driven website, there is peace of mind in knowing the content is never lost, even if your programmer is.
18 Data-Driven Business Intelligence
BI in a data-driven website
Companies can gain business intelligence by viewing the data accessed and analyzed from their website. The figure displays how running queries or using analytical tools, such as a Pivot Table, on the database that is attached to the website can offer insight into the business, such as items browsed, frequent requests, and items bought together.
19 Integrating Information among Multiple Databases
Integration – allows separate systems to communicate directly with each other
Forward integration – takes information entered into a given system and sends it automatically to all downstream systems and processes
Backward integration – takes information entered into a given system and sends it automatically to all upstream systems and processes
One of the biggest benefits of integration is that organizations only have to enter information into the systems once, and it is automatically sent to all of the other systems throughout the organization
This feature alone creates huge advantages for organizations because it reduces information redundancy and ensures accuracy and completeness
Without integrations, an organization would have to enter information into every single system that requires it, from marketing and sales to billing and customer service
For example, customer information would have to be manually entered into the marketing, sales, ordering, inventory, billing, and shipping databases. (Each of these systems is separate and would have its own database if the company does not have a complete ERP installed.)
Entering the same customer information into multiple systems is redundant, and the chance of making a mistake in one of the systems is high
Integrations offer many advantages, but for the most part, the automated flow of information among separate systems is the biggest benefit
20 Integrating Information among Multiple Databases
Forward integration and backward integration
Identify the arrows along the top of the figure when explaining forward integrations
Basically, all information flows forward along the business process
Sales enters the information when it is negotiating the sale (looking for opportunities)
The information is then passed to the order entry system when the order is actually placed
The order fulfillment system picks the products from the warehouse, packs the products, labels boxes, etc.
Once the order is filled and shipped, the customer is billed
What would happen if users could enter order information directly into the billing system?
The systems would quickly become out of sync. There might be bills for nonexistent orders, or orders that do not have any bills (if someone deleted a bill)
For this reason organizations typically place a business-critical integrity constraint on integrated systems: with a forward integration, the information must be entered in the sales system and cannot be entered directly into the billing system
Integrations are expensive to build and maintain
Integrations are difficult to implement
For these reasons many organizations only build forward integrations and use business-critical integrity constraints to ensure all information is always entered only at the start of the integration (one source of record)
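The forward flow and the "enter only at the start" constraint described above can be sketched as a chain of systems, each passing what it receives to the next one downstream. This is a minimal Python illustration; the System class, the enter function, and the sample record are hypothetical, and a real integration would move data between separate databases rather than in-memory dictionaries.

```python
class System:
    """One system in the business process, with an optional downstream neighbor."""

    def __init__(self, name, downstream=None):
        self.name = name
        self.downstream = downstream
        self.records = {}

    def receive(self, record_id, data):
        # Store a copy locally, then push the record forward automatically.
        self.records[record_id] = dict(data)
        if self.downstream is not None:
            self.downstream.receive(record_id, data)


# Build the chain from the end backwards: sales -> order entry -> fulfillment -> billing.
billing = System("billing")
fulfillment = System("order fulfillment", downstream=billing)
order_entry = System("order entry", downstream=fulfillment)
sales = System("sales", downstream=order_entry)


def enter(system, record_id, data):
    """Business-critical constraint: users may enter information only in sales."""
    if system is not sales:
        raise PermissionError(
            f"enter information in the sales system, not in {system.name}"
        )
    system.receive(record_id, data)


enter(sales, 1, {"customer": "Dave's Sub Shop"})
print(billing.records)  # the record arrived without being retyped

try:
    enter(billing, 2, {"customer": "Pizza Palace"})
except PermissionError as exc:
    print("rejected:", exc)
```

Because the only entry point is the first system, there is a single source of record and no way to create a bill that has no matching order.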
21 Integrating Information among Multiple Databases
Building a central repository specifically for integrated information
The figure displays an example of customer information integrated using this method
Users can create, read, update, and delete in the main customer repository, and the changes are automatically sent to all of the other databases
This method does not follow the business process when building the integrations
Business-critical integrity constraints still need to be built to ensure information is only ever entered into the customer repository; otherwise the information will become out of sync
22 OPENING CASE STUDY QUESTIONS
It Takes a Village to Write an Encyclopedia
1. Identify the different types of entity classes that might be stored in Wikipedia’s database.
Entity classes could include: SUBJECT AREA, SEARCH TERM, WEB PAGE, RESOURCE, EDITOR
2. Explain why database technology is so important to Wikipedia’s business model.
Without databases, Wikipedia simply would not exist, for two primary reasons. First, vast amounts of information are at the heart of Wikipedia, and without databases it would be impossible to store and retrieve the information. This is the information that Wikipedia’s customers are editing and researching. Second, Wikipedia uses databases to store its indexes and to find and retrieve the information that its customers are looking for. Again, without databases Wikipedia simply would not exist; its business operates entirely on databases.
3. Explain the difference between logical and physical views and why logical views are important to Wikipedia’s customers.
A well-designed database should handle changes quickly and easily, and provide users with different views. Physical views deal with the physical storage of information on a storage device such as a hard disk. Logical views focus on how users logically access information to meet particular business needs. A database has only one physical view and multiple logical views. The separation between logical and physical views is what allows each user to access database information differently. If Wikipedia’s customers had to access physical views of information, they would be confused and find the site difficult to use and understand. The site provides a logical view for each customer’s queries.
23 CHAPTER SEVEN CASE
Keeper of the Keys
Almost 90 million people had their personal information stolen or lost by organizations
Bank of America: 1.2 million customers
CardSystems: 40 million customers
Citigroup: 3.9 million customers
DSW Shoe Warehouse: 1.4 million customers
TJX Companies: 45.6 million customers
Wachovia: 676,000 customers
24 CHAPTER SEVEN CASE QUESTIONS
1. How many organizations have your personal information, including your Social Security number, bank account numbers, and credit card numbers?
This number will vary by student. Potential holders could include: Banks, Colleges, Credit Card Companies, Stores that Issue Credit, Insurance Companies, Auto Dealerships, Professors (if Social Security Number is used as Student ID), Government Agencies, Loan Applications, Hospitals, Doctors’ Offices, Dentists’ Offices
2. What information is stored at your college? Is there any chance your information could be hacked and stolen from your college?
All of your personal information is stored at your college, from date of birth to Social Security number. Absolutely, information can be stolen from any organization. Colleges have numerous college students working at different locations across campuses who could easily access personal information. This is one reason many colleges no longer use Social Security numbers as student identification numbers.
3. What can you do to protect yourself from identity theft?
Continuously checking your credit report and perhaps purchasing identity theft protection services is the best way to ensure you are safe from identity theft. Be careful not to give your information to any individual who does not need it, especially via email or telephone calls. Be aware of phishing scams and other ways people might try to steal your information, and buy a shredder for your documents.
25 CHAPTER SEVEN CASE QUESTIONS
4. Do you agree or disagree with changing laws to hold the company where the data theft occurred accountable? Why or why not?
Student answers to this question will vary. The important part of their answer is the justification as to why or why not the company should be held accountable. One question to get your students thinking: should a bank be held liable if a gunman robs the bank? Is this the same type of theft and situation?
5. What impact would holding the company liable where the data theft occurred have on large organizations?
Companies would take greater actions to ensure the safety of customer information.
6. What impact would holding the company liable where the data theft occurred have on small businesses?
Small businesses would have to spend more money ensuring the safety of customer data, and it might drain resources that are fundamental to keeping the business running.