Data Warehousing.

Definition
- Data Warehouse: A subject-oriented, integrated, time-variant, non-updatable collection of data used in support of management decision-making processes
  - Subject-oriented: e.g. customers, patients, students, products
  - Integrated: consistent naming conventions, formats, and encoding structures; drawn from multiple data sources
  - Time-variant: contains a time dimension so that it can be used to study trends and changes
  - Non-updatable: read-only, periodically refreshed
- Data Mart: A data warehouse that is limited in scope

Need for Data Warehousing
- Integrated, company-wide view of high-quality information (from disparate databases)
- Separation of operational and informational (decision support) systems and data (for improved performance)

Data Warehouse Architectures
- Generic Two-Level Architecture
- Independent Data Mart
- All involve some form of extraction, transformation and loading (ETL)

Figure 11-2: Generic two-level data warehousing architecture
- One company-wide warehouse
- Periodic extraction → data is not completely current in warehouse

Figure 11-3: Independent data mart data warehousing architecture
- Data marts: mini-warehouses, limited in scope
- Separate ETL for each independent data mart
- Data access complexity due to multiple data marts

The ETL Process
- ETL = Extract, transform, and load
- Capture/Extract
- Scrub or data cleansing
- Transform: convert data from the format of the source to the format of the data warehouse
- Load and Index
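The four steps above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the source rows, the `customer_dim` table, and the cleansing rules (drop duplicates and rows with missing values, normalize names, cities, and dates) are hypothetical and do not come from the slides.

```python
import sqlite3

# Hypothetical source rows: (customer id, name, city, signup date as MM/DD/YYYY)
SOURCE_ROWS = [
    (1, "  alice ", "boston", "01/15/2006"),
    (2, "Bob", None, "02/03/2006"),           # missing city
    (1, "  alice ", "boston", "01/15/2006"),  # exact duplicate
]

def extract(rows):
    """Capture/extract: take a snapshot of the chosen source data."""
    return list(rows)

def scrub(rows):
    """Scrub: drop exact duplicates and rows with missing values (assumed rules)."""
    seen, clean = set(), []
    for row in rows:
        if row in seen or any(value is None for value in row):
            continue
        seen.add(row)
        clean.append(row)
    return clean

def transform(rows):
    """Transform: convert source formats to the (assumed) warehouse formats."""
    out = []
    for cid, name, city, date in rows:
        month, day, year = date.split("/")
        out.append((cid, name.strip().title(), city.upper(), f"{year}-{month}-{day}"))
    return out

def load_and_index(conn, rows):
    """Load and index: place transformed rows in the warehouse and build an index."""
    conn.execute(
        "CREATE TABLE customer_dim (cid INTEGER, name TEXT, city TEXT, signup_date TEXT)"
    )
    conn.executemany("INSERT INTO customer_dim VALUES (?, ?, ?, ?)", rows)
    conn.execute("CREATE INDEX idx_customer_city ON customer_dim (city)")
    conn.commit()

conn = sqlite3.connect(":memory:")
load_and_index(conn, transform(scrub(extract(SOURCE_ROWS))))
warehouse_rows = conn.execute("SELECT * FROM customer_dim").fetchall()
print(warehouse_rows)
```

Keeping each step as its own function mirrors the slide's ordering and makes it easy to swap in different cleansing or conversion rules.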

Figure 11-10: Steps in data reconciliation
- Load/Index = place transformed data into the warehouse and create indexes
- Refresh mode: bulk rewriting of target data at periodic intervals
- Update mode: only changes in source data are written to data warehouse
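The difference between the two load modes can be sketched with plain dictionaries standing in for the source and target tables. The data and the function names `refresh` and `update` are hypothetical, and this sketch assumes changes are inserts or modifications only (deletions are not handled).

```python
def refresh(source):
    """Refresh mode: discard the old target and bulk-rewrite it from the source."""
    return dict(source)

def update(target, changes):
    """Update mode: write only the changed source rows into the existing target."""
    merged = dict(target)
    merged.update(changes)
    return merged

# Hypothetical source table: customer id -> city
source = {1: "Boston", 2: "Denver", 3: "Austin"}
warehouse = refresh(source)  # initial load

# Later, row 2 changes and row 4 is added in the source.
changes = {2: "Dallas", 4: "Miami"}
source.update(changes)

warehouse_incremental = update(warehouse, changes)  # writes 2 rows
warehouse_rewritten = refresh(source)               # rewrites all 4 rows
print(warehouse_incremental == warehouse_rewritten)
```

Both modes reach the same target state here; update mode simply writes fewer rows, at the cost of having to know which source rows changed.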

Index
- Bitmap index
- Join index

Figure 6-8: Bitmap index organization
- Bitmap saves on space requirements
- Rows: possible values of the attribute
- Columns: table rows
- Each bit indicates whether the attribute of a table row has the given value
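The organization described above can be sketched as one list of bits per attribute value. The `ratings` column and both function names are hypothetical; a real DBMS stores the bits in compressed form rather than as Python lists.

```python
def build_bitmap_index(values):
    """Map each distinct attribute value to a bitmap with one bit per table row."""
    index = {}
    for row_number, value in enumerate(values):
        bits = index.setdefault(value, [0] * len(values))
        bits[row_number] = 1
    return index

def rows_matching(index, *wanted):
    """OR together the bitmaps of the wanted values and return matching row numbers."""
    n_rows = len(next(iter(index.values()), []))
    combined = [0] * n_rows
    for value in wanted:
        for i, bit in enumerate(index.get(value, [0] * n_rows)):
            combined[i] |= bit
    return [i for i, bit in enumerate(combined) if bit]

# Hypothetical attribute column: one rating per table row
ratings = ["A", "B", "A", "C", "B", "A"]
index = build_bitmap_index(ratings)

print(index["A"])                      # bitmap for rating A
print(rows_matching(index, "A", "C"))  # rows whose rating is A or C
```

A query such as "rating is A or C" becomes a bitwise OR over two bitmaps, which is why this kind of index suits attributes with few distinct values.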

Figure 6-9: Join indexes speed up join operations
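One way to picture a join index is as a precomputed list of matching row-number pairs, so that the join itself becomes a lookup. The sketch below assumes two hypothetical tables, `stores` and `customers`, joined on a shared city column; it is not taken from the figure.

```python
from collections import defaultdict

# Hypothetical tables; the list position serves as the row id.
stores = [("S1", "Boston"), ("S2", "Denver")]
customers = [("C1", "Denver"), ("C2", "Boston"), ("C3", "Boston")]

def build_join_index(left, right, left_col, right_col):
    """Precompute (left row id, right row id) pairs whose join columns match."""
    right_rows = defaultdict(list)
    for j, row in enumerate(right):
        right_rows[row[right_col]].append(j)
    return [
        (i, j)
        for i, row in enumerate(left)
        for j in right_rows.get(row[left_col], [])
    ]

join_index = build_join_index(stores, customers, left_col=1, right_col=1)

# Answering the join now only needs the stored row-id pairs.
joined = [(stores[i][0], customers[j][0]) for i, j in join_index]
print(join_index)
print(joined)
```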

Star Schema for Data Warehouse
- Objectives
  - Ease of use for decision support applications
  - Fast response to predefined user queries
  - Customized data for particular target audiences
- Also called "dimensional model"
- Dimension: any category used in analyzing data, such as time, geography, and product line

Figure 11-13: Components of a star schema
- Fact tables contain factual or quantitative data
- Dimension tables contain descriptions about the subjects of the business
- 1:N relationship between dimension tables and fact tables
- Dimension tables are denormalized to maximize performance
- Excellent for ad-hoc queries, but bad for online transaction processing

Figure 11-14: Star schema example
- The fact table provides statistics for sales broken down by the product, period, and store dimensions
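A star schema along these lines can be sketched with an in-memory SQLite database. The table names, column names, and sample rows below are hypothetical and are not copied from the figure; only the shape (one sales fact table with foreign keys to product, period, and store dimension tables) follows the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive attributes, one row per member
CREATE TABLE product (product_key INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE period  (period_key  INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);
CREATE TABLE store   (store_key   INTEGER PRIMARY KEY, city TEXT);

-- Fact table: quantitative measures, one foreign key per dimension (1:N)
CREATE TABLE sales (
    product_key  INTEGER REFERENCES product(product_key),
    period_key   INTEGER REFERENCES period(period_key),
    store_key    INTEGER REFERENCES store(store_key),
    units_sold   INTEGER,
    dollars_sold REAL
);

INSERT INTO product VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO period  VALUES (1, 2006, 1), (2, 2006, 2);
INSERT INTO store   VALUES (1, 'Boston'), (2, 'Denver');
INSERT INTO sales   VALUES
    (1, 1, 1, 10, 100.0),
    (1, 2, 1,  5,  50.0),
    (2, 1, 2,  3,  90.0),
    (2, 2, 2,  4, 120.0);
""")

# Ad-hoc query: sales totals broken down by product and store
query = """
SELECT p.description, s.city, SUM(f.units_sold), SUM(f.dollars_sold)
FROM sales f
JOIN product p ON p.product_key = f.product_key
JOIN store   s ON s.store_key   = f.store_key
GROUP BY p.description, s.city
ORDER BY p.description, s.city
"""
result = conn.execute(query).fetchall()
print(result)
```

Every analytical query follows the same pattern: join the fact table to the dimensions of interest, then group by dimension attributes.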

Figure 11-15: Star schema with sample data

On-Line Analytical Processing (OLAP) Tools
- The use of a set of graphical tools that provides users with multidimensional views of their data and allows them to analyze the data using simple windowing techniques
- Relational OLAP (ROLAP): traditional relational representation
- Multidimensional OLAP (MOLAP): cube structure
- OLAP operations
  - Cube slicing: come up with a 2-D view of the data
  - Drill-down: going from summary to more detailed views

Figure 11-23: Slicing a data cube

Figure 11-24: Example of drill-down
- Summary report
- Starting with summary data, users can obtain details for particular cells
- Drill-down with color added
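Slicing and drill-down can be sketched on a tiny cube held in a dictionary. The cube contents, the dimension names, and the three helper functions are hypothetical; OLAP tools perform the same operations through a graphical interface.

```python
from collections import defaultdict

DIMENSIONS = ("product", "period", "store")

# Hypothetical cube: (product, period, store) -> units sold
cube = {
    ("Widget", "Q1", "Boston"): 10,
    ("Widget", "Q1", "Denver"): 4,
    ("Widget", "Q2", "Boston"): 5,
    ("Gadget", "Q1", "Denver"): 3,
    ("Gadget", "Q2", "Denver"): 4,
}

def slice_cube(cube, dimension, value):
    """Fix one dimension to a value; return a 2-D view over the other two."""
    axis = DIMENSIONS.index(dimension)
    return {
        key[:axis] + key[axis + 1:]: measure
        for key, measure in cube.items()
        if key[axis] == value
    }

def summarize(cube, dimension):
    """Summary report: total the measure for each member of one dimension."""
    axis = DIMENSIONS.index(dimension)
    totals = defaultdict(int)
    for key, measure in cube.items():
        totals[key[axis]] += measure
    return dict(totals)

def drill_down(cube, dimension, value, detail_dimension):
    """Break one summary cell down by a more detailed dimension."""
    axis = DIMENSIONS.index(dimension)
    detail_axis = DIMENSIONS.index(detail_dimension)
    totals = defaultdict(int)
    for key, measure in cube.items():
        if key[axis] == value:
            totals[key[detail_axis]] += measure
    return dict(totals)

q1_slice = slice_cube(cube, "period", "Q1")
summary = summarize(cube, "product")
widget_detail = drill_down(cube, "product", "Widget", "store")
print(q1_slice, summary, widget_detail)
```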

Data Mining and Visualization
- Knowledge discovery using a blend of statistical, AI, and computer graphics techniques
- Goals
  - Explain observed events or conditions
  - Confirm hypotheses
  - Explore data for new or unexpected relationships
- Techniques
  - Statistical regression
  - Decision tree induction
  - Clustering and signal processing
  - Affinity
  - Sequence association
  - Case-based reasoning
  - Rule discovery
  - Neural nets
  - Fractals
- Data visualization: representing data in graphical/multimedia formats for analysis

Pivot Table
- Excel: Drill Down, Roll Up
- Access: CrossTab query

SQL GROUPING SETS
SELECT CITY, RATING, COUNT(CID)
FROM HCUSTOMERS
GROUP BY GROUPING SETS (CITY, RATING, (CITY, RATING), ())
ORDER BY CITY;
Note: () indicates that an overall total is desired.
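To see what rows such a query returns, the grouping logic can be emulated in Python. The `HCUSTOMERS` rows below are hypothetical sample data, and `None` plays the role of the NULL that SQL puts in a column that is not part of the current grouping set.

```python
from collections import Counter

# Hypothetical HCUSTOMERS rows: (CID, CITY, RATING)
HCUSTOMERS = [
    (1, "Boston", "A"),
    (2, "Boston", "B"),
    (3, "Denver", "A"),
    (4, "Denver", "A"),
]
COLUMNS = {"CITY": 1, "RATING": 2}

def count_by_grouping_sets(rows, grouping_sets):
    """Return (CITY, RATING, COUNT) rows, one group of rows per grouping set."""
    result = []
    for grouping_set in grouping_sets:
        counts = Counter(
            tuple(
                row[COLUMNS[col]] if col in grouping_set else None
                for col in ("CITY", "RATING")
            )
            for row in rows
        )
        result.extend((city, rating, n) for (city, rating), n in counts.items())
    return result

# Same sets as GROUPING SETS (CITY, RATING, (CITY, RATING), ())
report = count_by_grouping_sets(
    HCUSTOMERS, [("CITY",), ("RATING",), ("CITY", "RATING"), ()]
)
for line in report:
    print(line)
```

Each grouping set contributes its own block of rows to the result, and the empty set contributes the single overall total.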

SQL CUBE
Perform aggregations for all possible combinations of columns indicated.
SELECT CITY, RATING, COUNT(CID)
FROM HCUSTOMERS
GROUP BY CUBE (CITY, RATING)
ORDER BY CITY, RATING;
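"All possible combinations" means every subset of the listed columns, including the empty set for the grand total. A minimal sketch of that expansion follows; the function name `cube_grouping_sets` is hypothetical.

```python
from itertools import combinations

def cube_grouping_sets(columns):
    """List every subset of the columns, from the full set down to the empty set."""
    return [
        combo
        for size in range(len(columns), -1, -1)
        for combo in combinations(columns, size)
    ]

cube_sets = cube_grouping_sets(("CITY", "RATING"))
print(cube_sets)
```

With n columns CUBE produces 2^n grouping sets, so the size of the result grows quickly as columns are added.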

SQL ROLLUP
The ROLLUP extension calculates cumulative subtotals for the columns indicated. If multiple columns are indicated, subtotals are produced level by level from left to right, dropping the far-right column first: ROLLUP(CITY, RATING) groups by (CITY, RATING), then by (CITY), then by () for the grand total.
SELECT CITY, RATING, COUNT(CID)
FROM HCUSTOMERS
GROUP BY ROLLUP (CITY, RATING)
ORDER BY CITY, RATING;
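The grouping sets that ROLLUP generates are the prefixes of the column list, which a short sketch makes explicit. The function name `rollup_grouping_sets` is hypothetical.

```python
def rollup_grouping_sets(columns):
    """List each prefix of the column list, longest first, ending with the empty set."""
    return [tuple(columns[:n]) for n in range(len(columns), -1, -1)]

rollup_sets = rollup_grouping_sets(("CITY", "RATING"))
print(rollup_sets)
```

Unlike CUBE, ROLLUP never groups by RATING alone, so column order matters: n columns yield n + 1 grouping sets.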