CISB594 – Business Intelligence

Presentation transcript:

CISB594 – Business Intelligence Data Warehousing Part II

Reference Materials used in this presentation are extracted mainly from the following texts, unless stated otherwise.

Objectives
At the end of this lecture, you should be able to:
- Describe the processes used in developing and managing data warehouses
- Explain data integration and the extraction, transformation, and load (ETL) processes
- Describe real-time (active) data warehousing
- Understand data warehouse administration and security issues

Data Integration: The Extraction, Transformation, and Load (ETL) Process Data integration is a term that covers three processes which combine to move data from multiple sources into a data warehouse:

Data Integration: The Extraction, Transformation, and Load (ETL) Process
Extraction, transformation, and load (ETL) technologies
- Fundamentally, a DW could not exist without ETL.
- The ETL process consists of:
  - extraction: reading data from one or more databases
  - transformation: converting the extracted data from its previous form into the form it needs to be in so that it can be placed into a data warehouse
  - load: putting the data into the data warehouse
- The ETL process also contributes to the quality of the data in a DW; its purpose is to load the data warehouse with integrated and cleansed data.
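The three ETL steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular ETL tool works: the source table `sales`, the warehouse table `fact_sales`, the column names, and the cleansing rules are all hypothetical, and in-memory SQLite databases stand in for the operational source and the warehouse.

```python
import sqlite3


def extract(source_conn):
    """Extraction: read raw rows from a source database."""
    return source_conn.execute(
        "SELECT customer, amount, sale_date FROM sales"
    ).fetchall()


def transform(rows):
    """Transformation: cleanse rows and convert them to the warehouse's form."""
    cleaned = []
    for customer, amount, sale_date in rows:
        if customer is None or amount is None:
            continue  # cleansing rule (assumed): drop incomplete records
        cleaned.append(
            (customer.strip().upper(), round(float(amount), 2), sale_date)
        )
    return cleaned


def load(warehouse_conn, rows):
    """Load: put the transformed rows into the data warehouse."""
    warehouse_conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_sales "
        "(customer TEXT, amount REAL, sale_date TEXT)"
    )
    warehouse_conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)", rows)
    warehouse_conn.commit()


def run_etl(source_conn, warehouse_conn):
    load(warehouse_conn, transform(extract(source_conn)))


def demo():
    # Hypothetical operational source with messy data.
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE sales (customer TEXT, amount TEXT, sale_date TEXT)"
    )
    source.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            ("  alice ", "19.989", "2024-01-05"),
            (None, "5.00", "2024-01-06"),
            ("Bob", "7.5", "2024-01-06"),
        ],
    )
    warehouse = sqlite3.connect(":memory:")
    run_etl(source, warehouse)
    return warehouse.execute(
        "SELECT customer, amount, sale_date FROM fact_sales ORDER BY customer"
    ).fetchall()
```

Calling `demo()` loads two cleansed rows and skips the incomplete one. A real ETL process would read from several heterogeneous sources, which is where most of the complexity comes from.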

Data Transformation Tools – To purchase or to build?
An organization can set up ETL in one of three ways: purchase data transformation tools and set up ETL itself, purchase the tools together with the expertise to set up ETL, or develop the tools in-house using a programming language. Programmers can set up ETL processes using almost any programming language, but building such processes from scratch is very complex, because the process has to talk to many different sources.

Data Transformation Tools – To purchase or to build?
Increasingly, companies are buying ETL tools to help create ETL processes (Wikipedia).
Examples of ETL tools:
- IBM InfoSphere DataStage
- Microsoft SQL Server Integration Services (SSIS)
- Oracle Data Integrator (ODI)

Data Transformation Tools – To purchase or to build? A consideration
Should an organization purchase data transformation tools or build the transformation process itself?
- Data transformation tools are expensive.
- Data transformation tools may have a long learning curve.
- It is difficult to measure how the IT organization is doing until it has learned to use the data transformation tools.
It is not an easy decision. However, many believe that purchasing the tools can kick-start the project faster and simplify maintenance of the data warehouse.

Data Warehouse Development: Choosing the vendors
Six guidelines to consider when choosing a vendor:
- Financial strength
- Qualified consultants
- Market share
- Industry experience
- Established partnerships
These indicate that a vendor is likely to be in business for the long term, to have the support capabilities its customers need, and to provide products that interoperate with other products the potential user has or may obtain.

Data Warehouse Development
A data warehousing project is a major undertaking. It is more complicated because it comprises and influences many departments and many input and output interfaces, and it can be part of the business strategy.
Data warehouse development approaches:
- Inmon Model: EDW approach
- Kimball Model: Data mart approach
Which model is best?
- There is no one-size-fits-all strategy for data warehousing.
- It depends on the needs and the capacity of the organization.
- For many organizations, the data mart approach is a convenient first step in implementing a DW.

Data Warehouse Development
Kimball Model
Pros:
- Easy to build organizationally
- Easy to build technologically
Cons:
- Enterprise-wide view unavailable
- Redundant data costs
- High ETL costs
- High DBA costs
Inmon Model
Pros:
- Business enterprise view
- Design consistency
- Data reusability
Cons:
- Requires corporate leadership and vision

Eventually it can be this …

Data Warehouse Development
Describe the major similarities and differences between the Inmon and Kimball data warehouse development approaches.
Similarities: Both methods can produce an enterprise data warehouse and subset data marts.
Differences: Inmon’s approach starts with an enterprise data warehouse, creating data marts as subsets of that EDW if appropriate. The focus is on proven, traditional methods and technologies. Kimball’s approach starts with data marts, consolidating them into an EDW later if appropriate. It focuses on creating a useful end-user capability quickly.

Data Warehouse Development
Effort | Data Mart (Kimball Model) | EDW (Inmon Model)
Scope | One subject area | Several subject areas
Development time | Months | Years
Development cost | $10K - $100K+ | $1,000,000+
Development difficulty | Low to medium | High
Sources | Only some operational and external systems | Many operational and external systems
Size | Megabytes to several gigabytes | Gigabytes to petabytes
Hardware | Workstations and departmental servers | Enterprise servers and mainframe
OS | Windows and Linux | Unix, S/390
Number of simultaneous users | 10s | 100s to 1000s

Real-Time Data Warehousing
- Traditionally, a data warehouse is not business critical, and its data are commonly updated on a weekly basis, which does not allow for responding to transactions in near real time.
- Today, organizations are facing the need for real-time data warehousing, as decision support has become operational.
- This has led to the emergence of real-time data warehousing (RDW), or active data warehousing (ADW): the process of loading and providing data via a data warehouse as they become available.

Real-Time Data Warehousing: The need for real-time data
- A business often cannot afford to wait a whole day for its operational data to load into the data warehouse for analysis.
- Real-time data collection can reduce or eliminate the nightly batch processes.
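One simple way to move from a nightly batch towards loading data "as they become available" is to load only the rows that have arrived since the previous load, and to run that load frequently. The sketch below assumes this high-water-mark approach; the slides do not prescribe a mechanism, and the table names `orders` and `fact_orders` are hypothetical. In-memory SQLite databases stand in for the operational system and the warehouse.

```python
import sqlite3


def setup_warehouse(warehouse):
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders "
        "(order_id INTEGER PRIMARY KEY, amount REAL)"
    )


def load_new_rows(source, warehouse):
    """Copy only rows newer than the warehouse's high-water mark.

    Returns the number of rows loaded in this run.
    """
    # High-water mark: the largest order_id already in the warehouse.
    last_id = warehouse.execute(
        "SELECT COALESCE(MAX(order_id), 0) FROM fact_orders"
    ).fetchone()[0]
    new_rows = source.execute(
        "SELECT order_id, amount FROM orders WHERE order_id > ? ORDER BY order_id",
        (last_id,),
    ).fetchall()
    warehouse.executemany("INSERT INTO fact_orders VALUES (?, ?)", new_rows)
    warehouse.commit()
    return len(new_rows)


def demo():
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL)"
    )
    warehouse = sqlite3.connect(":memory:")
    setup_warehouse(warehouse)

    source.executemany(
        "INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 20.0)]
    )
    first = load_new_rows(source, warehouse)   # picks up orders 1 and 2

    source.execute("INSERT INTO orders VALUES (?, ?)", (3, 30.0))
    second = load_new_rows(source, warehouse)  # picks up only order 3

    third = load_new_rows(source, warehouse)   # nothing new to load
    return first, second, third
```

Running `load_new_rows` every few seconds or minutes, instead of once a night, keeps the warehouse close to the operational data. The trade-off is more frequent load activity on the source systems.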

Mistakes
- Starting with the wrong sponsorship chain: you need the right champion, somebody with influence over the sources and direction.
- Setting unrealistic expectations: remember that a BI initiative relies on management’s good decision making as much as it relies on the technology used.
- Engaging in politically naive behaviour: do not imply that managers have been making bad decisions without BI.
- Loading the DW with information just because it is available.
- Believing that DW design is the same as transactional DB design.
- Choosing a DW administrator who is only technology oriented.

Data Warehouse Administration and Security Issues
- Due to its huge size and complicated nature, a DW requires strong monitoring and administration.
- It needs more than a DBA.
- It needs a data warehouse administrator (DWA): a person responsible for the administration and management of a data warehouse.

Data Warehouse Administration and Security Issues
What skills should a DWA possess? Why?
- Familiarity with high-performance hardware, software, and networking technologies, since a data warehouse is based on those.
- Solid business insight, to understand the purpose of the DW and its business justification.
- Familiarity with business decision-making processes, to understand how the DW will be used for strategic business purposes.
- Excellent communication skills, to communicate with the rest of the organization.

Data Warehouse Administration and Security Issues
Effective security in a data warehouse should focus on four main areas:
1. Establishing effective corporate and security policies and procedures. An effective security policy should start at the top and be communicated to everyone in the organization.
2. Implementing logical security procedures and techniques to restrict access. This includes user authentication, access controls, and encryption.
3. Limiting physical access to the data center environment.
4. Establishing an effective internal control review process for security and privacy.
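Two of the logical security techniques in area 2, user authentication and access control, can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the role names, the permission sets, and the in-memory user records are hypothetical, and encryption of the stored data is not covered. A production warehouse would normally rely on the security features of its database platform.

```python
import hashlib
import hmac
import os

# Hypothetical roles and the actions each role may perform on the warehouse.
ROLE_PERMISSIONS = {
    "analyst": {"read"},
    "dwa": {"read", "write", "administer"},
}


def hash_password(password, salt):
    """Derive a salted hash so that plain-text passwords are never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def make_user(role, password):
    salt = os.urandom(16)
    return {
        "role": role,
        "salt": salt,
        "password_hash": hash_password(password, salt),
    }


def authenticate(user, password):
    """User authentication: constant-time comparison of password hashes."""
    candidate = hash_password(password, user["salt"])
    return hmac.compare_digest(user["password_hash"], candidate)


def is_allowed(user, password, action):
    """Access control: the user must authenticate AND hold the permission."""
    if not authenticate(user, password):
        return False
    return action in ROLE_PERMISSIONS.get(user["role"], set())
```

For example, a user created with `make_user("analyst", "s3cret")` passes `is_allowed(user, "s3cret", "read")` but fails the same check for `"write"`, and fails every check when the password is wrong.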

Assignment question
- Discuss the architecture and data warehouse model approach. Include a diagram and justify the choice.
- Discuss whether your team will recommend that the organization purchase tools from vendors or build the system itself. Justify your suggestion and include a cost estimate.

Now ask if you are able to:
- Describe the processes used in developing and managing data warehouses
- Explain data integration and the extraction, transformation, and load (ETL) processes
- Describe real-time (active) data warehousing
- Understand data warehouse administration and security issues