Database and Data Warehouse

Slides:



Advertisements
Similar presentations
C6 Databases.
Advertisements

Lecture-7/ T. Nouf Almujally
By: Mr Hashem Alaidaros MIS 211 Lecture 4 Title: Data Base Management System.
Management Information Systems, Sixth Edition
Accessing Organizational Information—Data Warehouse
McGraw-Hill/Irwin © 2006 The McGraw-Hill Companies, Inc. All rights reserved. 8-1 BUSINESS DRIVEN TECHNOLOGY Chapter Eight: Viewing and Protecting Organizational.
MIS DATABASE SYSTEMS, DATA WAREHOUSES, AND DATA MARTS MBNA
Managing Data Resources
Chapter 3 Database Management
McGraw-Hill/Irwin ©2008 The McGraw-Hill Companies, All Rights Reserved DATABASES AND DATA WAREHOUSES Opening Case Searching for Revenue - Google DATABASES.
Ch1: File Systems and Databases Hachim Haddouti
McGraw-Hill/Irwin © 2008 The McGraw-Hill Companies, All Rights Reserved Chapter 7 Storing Organizational Information - Databases.
Chapter 4: Database Management. Databases Before the Use of Computers Data kept in books, ledgers, card files, folders, and file cabinets Long response.
MIS DATABASE SYSTEMS, DATA WAREHOUSES, AND DATA MARTS CHAPTER 3
Business Driven Technology Unit 2
BUSINESS DRIVEN TECHNOLOGY
SESSION 7 MANAGING DATA DATARESOURCES. File Organization Terms and Concepts Field: Group of words or a complete number Record: Group of related fields.
Managing Data Resources. File Organization Terms and Concepts Bit: Smallest unit of data; binary digit (0,1) Byte: Group of bits that represents a single.
CHAPTER 08 Accessing Organizational Information – Data Warehouse
XP Information Information is everywhere in an organization Employees must be able to obtain and analyze the many different levels, formats, and granularities.
IST Databases and DBMSs Todd S. Bacastow January 2005.
MIS DATABASE SYSTEMS, DATA WAREHOUSES, AND DATA MARTS MBNA ebay
1 DATABASE TECHNOLOGIES BUS Abdou Illia, Fall 2007 (Week 3, Tuesday 9/4/2007)
5.1 © 2007 by Prentice Hall 5 Chapter Foundations of Business Intelligence: Databases and Information Management.
Intro to MIS – MGS351 Databases and Data Warehouses Chapter 3.
Chapter 5 Lecture 2. Principles of Information Systems2 Objectives Understand Data definition language (DDL) and data dictionary Learn about popular DBMSs.
1 DATABASE TECHNOLOGIES BUS Abdou Illia, Fall 2012 (September 5, 2012)
Essentials of Management Information Systems, 6e Chapter 7 Managing Data Resources 7.1 © 2005 by Prentice Hall Managing Data Resources Chapter 7.
6-1 DATABASE FUNDAMENTALS Information is everywhere in an organization Information is stored in databases –Database – maintains information about various.
The McGraw-Hill Companies, Inc Information Technology & Management Thompson Cats-Baril Chapter 3 Content Management.
CHAPTER SIX DATA Business Intelligence
STORING ORGANIZATIONAL INFORMATION— DATABASES CIS 429—Chapter 7.
Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D.
Management Information Systems By Effy Oz & Andy Jones
Organizing Data and Information AD660 – Databases, Security, and Web Technologies Marcus Goncalves Spring 2013.
CHAPTER 8: MANAGING DATA RESOURCES. File Organization Terms Field: group of characters that represent something Record: group of related fields File:
7.1 Managing Data Resources Chapter 7 Essentials of Management Information Systems, 6e Chapter 7 Managing Data Resources © 2005 by Prentice Hall.
Copyright © 2012 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin CHAPTER SIX DATA: BUSINESS INTELLIGENCE.
I Information Systems Technology Ross Malaga 4 "Part I Understanding Information Systems Technology" Copyright © 2005 Prentice Hall, Inc. 4-1 DATABASE.
Lecturer: Gareth Jones. How does a relational database organise data? What are the principles of a database management system? What are the principal.
BUS1MIS Management Information Systems Semester 1, 2012 Week 6 Lecture 1.
I Information Systems Technology Ross Malaga 4 "Part I Understanding Information Systems Technology" Copyright © 2005 Prentice Hall, Inc. 4-1 DATABASE.
MIS DATABASE SYSTEMS, DATA WAREHOUSES, AND DATA MARTS CHAPTER 3
1.file. 2.database. 3.entity. 4.record. 5.attribute. When working with a database, a group of related fields comprises a(n)…
Storing Organizational Information - Databases
C6 Databases. 2 Traditional file environment Data Redundancy and Inconsistency: –Data redundancy: The presence of duplicate data in multiple data files.
5 - 1 Copyright © 2006, The McGraw-Hill Companies, Inc. All rights reserved.
McGraw-Hill/Irwin © 2008 The McGraw-Hill Companies, All Rights Reserved Chapter 7 Storing Organizational Information - Databases.
McGraw-Hill/Irwin © 2008 The McGraw-Hill Companies, All Rights Reserved Chapter 7 Storing Organizational Information - Databases.
6.1 © 2010 by Prentice Hall 6 Chapter Foundations of Business Intelligence: Databases and Information Management.
MANAGING DATA RESOURCES ~ pertemuan 7 ~ Oleh: Ir. Abdul Hayat, MTI.
Database and Data Warehouse
Data resource management
Chapter 3 Databases and Data Warehouses: Building Business Intelligence Copyright © 2010 by the McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin.
McGraw-Hill/Irwin ©2009 The McGraw-Hill Companies, All Rights Reserved CHAPTER 6 DATABASES AND DATA WAREHOUSES CHAPTER 6 DATABASES AND DATA WAREHOUSES.
Managing Data Resources. File Organization Terms and Concepts Bit: Smallest unit of data; binary digit (0,1) Byte: Group of bits that represents a single.
DATA RESOURCE MANAGEMENT
Foundations of Business Intelligence: Databases and Information Management.
Data: Business Intelligence Chapter 6 McGraw-Hill/Irwin Copyright © 2013 by The McGraw-Hill Companies, Inc. All rights reserved.
3/6: Data Management, pt. 2 Refresh your memory Relational Data Model
McGraw-Hill/Irwin © 2008 The McGraw-Hill Companies, All Rights Reserved Chapter 7 Storing Organizational Information - Databases.
McGraw-Hill/Irwin © 2008 The McGraw-Hill Companies, All Rights Reserved Chapter 7 Storing Organizational Information - Databases.
McGraw-Hill/Irwin ©2008,The McGraw-Hill Companies, All Rights Reserved Chapter 5 Data Resource Management.
Managing Data Resources File Organization and databases for business information systems.
Management Information Systems by Prof. Park Kyung-Hye Chapter 7 (8th Week) Databases and Data Warehouses 07.
CHAPTER SIX DATA Business Intelligence
Basic Concepts in Data Management
MANAGING DATA RESOURCES
CHAPTER SIX OVERVIEW SECTION 6.1 – DATABASE FUNDAMENTALS
DATABASE TECHNOLOGIES
Presentation transcript:

Database and Data Warehouse June 27, 2012

LEARNING GOALS Explain basic concepts of data management. Describe traditional file systems and identify their problems. Define database management systems and describe their various functions. Explain how the relational database model works. Explain Object-Oriented databases. Explain Data Warehouse, Data Mart 2

Access and Management tools What is a database? Collection of related files containing records on people, places, or things. Databases make data easy to access and manage. Customers Info Accounts Info Employees Info Access and Management tools 3

Basic Concepts of Data Management Table 1 Table 2 Table 3 Report Form 1 Acc #:_______ Name:_______ Database: Collection of data organized in different containers 4

Basic Concepts of Database systems Accounts table AccountID Customer Type Balance 660001 John Smith Checking $120.00 660002 Linda Martin Saving $9450.00 660003 Paul Graham $3400.00 Each table has: Fields Records 1 Primary key Table Two-dimensional structure composed of rows and columns Field Like a column in a spreadsheet Field name Like a column name in a spreadsheet Examples: AccountID, Customer, Type, Balance Field values Actual data for the field Record Set of fields that describe an entity (a person, an account, etc.) Primary key A field, or group of fields, that uniquely identifies a record 5

Basic Concepts in Data Management A Primary key could be a single field like in these tables Primary key AccountID Customer Type Balance 660001 John Smith Checking $120.00 660002 Linda Martin Saving $9450.00 660003 Paul Graham $3400.00 A Primary key could be a composite key, i.e. multiple fields 6

Traditional File Systems System of files that store groups of records used by a particular software application Simple but with a cost Inability to share data Inadequate security Difficulties in maintenance and expansion Allows data duplication (e.g. redundancy) Application 1 Program 1 File 1 File 2 File 3 Program 2 Application 2 Program 1 File 1 File 2 File 3 Program 2 7

Traditional File System Anomalies Insertion anomaly Data needs to be entered more than once if located in multiple file systems Modification anomaly Redundant data in separate file systems Inconsistent data in your system Deletion anomaly Failure to simultaneously delete all copies of redundant data Deletion of critical data 8

Database Advantages Database advantages from a business perspective include Increased flexibility Handling changes quickly and easily Increased scalability and performance Scalability: how the DB can adapt to increased demand Reduced information redundancy & inconsistency Increased information integrity (quality) Increased information security All of the above are discussed in the following slides: A good way to explain databases is to compare them to spreadsheets What are the limitations when using a spreadsheet? Limited number of rows and columns (Excel 2003 - 65,536 rows by 256 columns) Once you use more than 65,536 rows you have outgrown your spreadsheet Only one users can access the spreadsheet Users can view all information in the spreadsheet Users can change all information in the spreadsheet All of the disadvantages associated with a spreadsheet are fixed when using a database These advantages are discussed in detail over the next several slides 9

Database Management System (DBMS) Combination of software and data for Collecting, storing and managing data in a database environment. A DBMS includes: Database Database engine (for accessing and modifying the DB content) Data Manipulation Language Application 1 Program-1 Program-2 Application 2 Program-1 Program-2 DBMS 10

Database Management System (DBMS) Software through which users and application programs interact with a database Discuss the two primary forms of user interaction with a database Direct interaction The user interacts directly with the DBMS The DBMS obtains the information from the database Indirect interaction User interacts with an application (i.e., payroll application, manufacturing application, sales application) The application interacts with the DBMS 11

ID Name Amt 01 John 23.00 02 Linda 3.00 03 Paul 53.00 DBMS Functions Store data (in tables) on secondary storage Transform data into information (reports, ..) Provide user with different logical views of actual database content Provide security: password authentication, access control DBMSs control who can add, view, change, or delete data in the database Logical views Physical view ID Name 02 Linda ID Name Amt 01 John 23.00 02 Linda 3.00 ID Name Amt 01 John 23.00 02 Linda 3.00 03 Paul 53.00 Name Amt Paul 53.00 12

DBMS Functions (cont.) Allow multi-user access Control concurrency of access to data Prevent one user from accessing data that has not been completely updated When selling tickets online, Ticketmaster allows you to hold a ticket for only 2 minutes to make your purchase decision, then the ticket is released to sell to someone else – that is concurrency control 13

Types of DBMSs Desktop Designed to run on desktop computers Server / Enterprise Handheld Desktop Designed to run on desktop computers Used by individuals or small businesses Requires little or no formal training Does not have all the capabilities of larger DBMSs Examples: Microsoft Access, FileMaker, Paradox 14

Types of DBMSs (Cont.) Server / Enterprise Designed for managing larger and complex databases by large organizations Typically operate in a client/server setup Either centralized or distributed Centralized – all data on one server Easy to maintain Prone to run slowly when many simultaneous users No access if the one server goes down Distributed – each location has part of the database Very complex database administration Usually faster than centralized If one server crashes, others can still continue to operate. Examples: Oracle Enterprise, DB2, Microsoft SQL Server 15

Types of DBMSs (Cont.) Handheld Designed to run on handheld devices Less complex and have less capabilities than Desktop or Server DBMSs Example: Oracle Database Lite, IBM’s DB2 Everywhere. 16

Database Models Database model = a representation of the relationship between structures (e.g. tables) in a database Common database models Flat file model Relational model (this one is the most common) Object-oriented database model Hierarchical model Network model 17

Flat File Database Stores data in basic table structures No relationship between tables Used on PDAs for address book 18

Relational Model Multiple tables related by common fields Uses controlled redundancy to create fields that provide linkage relationships between tables in the database These fields are called foreign keys – the secret to a relational database A foreign key is a field, or group of fields, in one table that is the primary key of another table 19

Object-Oriented DBMS Needed for multimedia applications that manage images, voice, videos, graphics, etc. in addition to numbers and characters. Popular in Web applications Slower compared to relational DBMS for processing large number of transactions Hybrid object-relational DBMS are emerging 20

Hierarchical Database Model Data is organized into a tree-like structure using parent-child relationships. Created in the 1960s by IBM Limited to storing data in one-to-many relationships One parent segment to many child segments Not very flexible

Network Model Developed in 1969 Many-to-many relationships between entities Any record may be linked to any other record Highly flexible but also highly complex Rarely used

Data Warehouse Many organizations need internal, external, current, and historical data Data Warehouse are designed to, typically, store and manage data from operational transaction systems, Web site transactions, etc. 23

Data Warehouse Fundamentals Data warehouse – a logical collection of information – gathered from many different operational databases – that supports business analysis activities and decision-making tasks The primary purpose of a data warehouse is to aggregate information throughout an organization into a single repository for decision-making purposes What is the primary difference between a database and data warehouse? The primary difference between a database and a data warehouse is that a database stores information for a single application, whereas a data warehouse stores information from multiple databases, or multiple applications, and external information such as industry information This enables cross-functional analysis, industry analysis, market analysis, etc., all from a single repository Data warehouses support only analytical processing (OLAP) 24 24

Data Warehouse Fundamentals The data warehouse modeled in the above figure compiles information from internal databases or transactional/operational databases and external databases through ETL It then send subsets of information to the data marts through the ETL process Ask your students to distinguish between a data warehouse and a data mart? Ans: A data warehouse has an enterprisewide organizational focus, while a data mart focuses on a subset of information for a given business unit such as finance Extraction, transformation, and loading (ETL) – a process that extracts information from internal and external databases, transforms the information using a common set of enterprise definitions, and loads the information into a data warehouse. 25

Data Mart Subset of data warehouses that is highly focused and isolated for a specific population of users Example: Marketing data mart, Sales data mart, etc. 26

Database vs. Data Warehouse Databases contain information in a series of two-dimensional tables In a Data Warehouse and data mart, information is multidimensional, it contains layers of columns and rows Each layer in a data warehouse or data mart represents information according to an additional dimension Dimensions could include such things as: Products Promotions Stores Category Region Stock price Date Time Weather Why is the ability to look at information based on different dimensions critical to a businesses success? Ans: The ability to look at information from different dimensions can add tremendous business insight By slicing-and-dicing the information a business can uncover great unexpected insights 27 27

Multidimensional Analysis Data mining – the process of analyzing data to extract information not offered by the raw data alone Data-mining tool – uses a variety of techniques to find patterns and relationships in large volumes of information and infers rules that predict future behavior and guide decision making Data-mining tools include: query tools, statistical tools, intelligent agents, etc. Data mining can begin at a summary information level (coarse granularity) and progress through increasing levels of detail (drilling down), or the reverse (drilling up) Data-mining tools include query tools, reporting tools, multidimensional analysis tools, statistical tools, and intelligent agents Ask your students to provide an example of what an accountant might discover through the use of data-mining tools Ans: An accountant could drill down into the details of all of the expense and revenue finding great business intelligence including which employees are spending the most amount of money on long-distance phone calls to which customers are returning the most products Could the data warehousing team at Enron have discovered the accounting inaccuracies that caused the company to go bankrupt? If the did spot them, what should the team have done? 28 28

Information Cleansing or Scrubbing An organization must maintain high-quality data in the data warehouse Information cleansing or scrubbing – a process that weeds out and fixes or discards inconsistent, incorrect, or incomplete information Information cleansing or scrubbing, first, occurs during ETL. Then, when the data is in the Data Warehouse using Information cleansing or scrubbing tools. This is a an excellent time to return to the information learned in Chapter 6 on high-quality and low-quality information What would happen if the information contained in the data warehouse was only about 70 percent accurate? Would you use this information to make business decisions? Is it realistic to assume that an organization could get to a 100% accuracy level on information contained in its data warehouse? No, it is too expensive 29 29

Summary Questions Notes What is a database, a table, a field, a record, a primary key, a composite key? 2) What are the problems with traditional file systems? 3) What are the major functions of a DBMS? (a) Name some Desktop DBMSs. (b) Name some Enterprise DBMSs. (c) Handheld DBMSs Describe hierarchical database model, network model What are the differences between Flat File, Relational, and Object-oriented database models? What is Data warehouse? Data Mart? What is Extraction, transformation, and loading (ETL)? What is data-mining? What is Information cleansing or scrubbing? 30