Data Warehousing.


On-Line Analytical Processing (OLAP) Tools
The use of a set of graphical tools that provides users with multidimensional views of their data and allows them to analyze the data using simple windowing techniques.
Relational OLAP (ROLAP): traditional relational representation
Multidimensional OLAP (MOLAP): cube structure
OLAP operations:
- Cube slicing: come up with a 2-D view of the data
- Drill-down: going from summary to more detailed views
- Roll-up: the opposite direction of drill-down
- Reaggregation: rearrange the order of dimensions
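
The OLAP operations listed above can be sketched on a tiny in-memory cube. This is only an illustration: the cube contents, the dimension names, and the helper names `slice_cube` and `roll_up` are all made up here and do not come from any OLAP tool.

```python
from collections import defaultdict

# Hypothetical sales cube: (product, period, store) -> units sold.
DIMS = ("product", "period", "store")
CUBE = {
    ("Shoes", "Q1", "Store A"): 10,
    ("Shoes", "Q2", "Store A"): 15,
    ("Shoes", "Q1", "Store B"): 7,
    ("Hats", "Q1", "Store A"): 3,
    ("Hats", "Q2", "Store B"): 5,
}

def slice_cube(cube, dim, value):
    """Fix one dimension to a single value, leaving a lower-dimensional view."""
    i = DIMS.index(dim)
    return {k[:i] + k[i + 1:]: v for k, v in cube.items() if k[i] == value}

def roll_up(cube, keep):
    """Aggregate away every dimension that is not listed in `keep`."""
    idx = [DIMS.index(d) for d in keep]
    totals = defaultdict(int)
    for key, units in cube.items():
        totals[tuple(key[i] for i in idx)] += units
    return dict(totals)

# Slicing: a 2-D (period x store) view for one product.
shoes_view = slice_cube(CUBE, "product", "Shoes")
# Roll-up: summary totals per product.
by_product = roll_up(CUBE, ["product"])
# Drill-down: the same totals, one level more detailed.
by_product_period = roll_up(CUBE, ["product", "period"])
```

Drill-down is expressed here as a roll-up that keeps one more dimension, which reflects the "opposite direction" relationship between the two operations.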

Slicing a data cube

Example of drill-down
Summary report
Starting with summary data, users can obtain details for particular cells.
Drill-down with color added

Excel’s Pivot Table
Data/Pivot Table
Drill-down, roll-up, reaggregation

Access Pivot Form Drill Down

Data Warehouse
A subject-oriented, integrated, time-variant, non-updatable collection of data used in support of management decision-making processes.
Subject-oriented: e.g. customers, employees, locations, products, time periods, etc. These are the dimensions for data analysis.
Integrated: consistent naming conventions, formats, and encoding structures; data comes from multiple sources
Time-variant: can study trends and changes
Non-updatable: read-only, periodically refreshed
Data mart: a data warehouse that is limited in scope

Need for Data Warehousing
Integrated, company-wide view of high-quality information (from disparate databases)
Separation of operational and informational systems and data (for improved performance)

Generic two-level data warehousing architecture
One company-wide warehouse
Periodic extraction, so data is not completely current in the warehouse

The ETL Process
ETL = extract, transform, and load
- Capture/Extract
- Scrub (data cleansing)
- Transform
- Load and Index

Capture/Extract: obtaining a snapshot of a chosen subset of the source data for loading into the data warehouse
Static extract = capturing a snapshot of the source data at a point in time
Incremental extract = capturing changes that have occurred since the last static extract
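
The difference between the two kinds of extract can be sketched as follows. The source rows and the `updated` last-modified column are assumptions made for this example; a real source system may track changes differently (for instance with a log).

```python
from datetime import datetime

# Hypothetical source rows; `updated` is an assumed last-modified timestamp.
SOURCE = [
    {"oid": 1, "qty": 5, "updated": datetime(2024, 1, 10)},
    {"oid": 2, "qty": 2, "updated": datetime(2024, 2, 3)},
    {"oid": 3, "qty": 9, "updated": datetime(2024, 2, 20)},
]

def static_extract(rows):
    """Snapshot of the chosen source data at a point in time (copies every row)."""
    return [dict(r) for r in rows]

def incremental_extract(rows, last_extract):
    """Only the rows that changed after the previous extract was taken."""
    return [dict(r) for r in rows if r["updated"] > last_extract]

snapshot = static_extract(SOURCE)
changes = incremental_extract(SOURCE, datetime(2024, 2, 1))
```

With the sample data, the static extract copies all three rows, while the incremental extract returns only the two rows modified after 1 February 2024.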

Data Warehouse Design: Star Schema
Also called the “dimensional model”
Fact tables contain detailed business data.
Dimension tables contain descriptions about the subjects of the business, such as customers, employees, locations, products, and time periods.
A dimension is any category used in analyzing data, such as time, geography, and product line.

Star schema example
Fact table provides statistics for sales broken down by product, period, and store dimensions.
Dimension tables contain descriptions about the subjects of the business.

Star schema with sample data

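Example: Order Processing System (ER diagram)
Entities: Customer (CID, Cname, City, Rating); Order (OID, ODate, SalesPerson); Product (PID, Pname, Price)
Relationships: Customer Has Order (1:M); Order Has Product (M:M, with attribute Qty)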

Star Schema
FactTable: LocationCode, PeriodCode, Rating, PID, Qty, Amount
Location Dimension: LocationCode, State, City (can group by State, City)
Period Dimension: PeriodCode, Year, Quarter
CustomerRating Dimension: Rating, Description
Product Dimension: PID, Pname, CategoryID
Product Category: CategoryID, Description (snowflake model)
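
One way to try this schema out is with Python's built-in sqlite3 module. The table and column names follow the slide; the column types, the sample rows, and the two queries are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Location (LocationCode INTEGER PRIMARY KEY, State TEXT, City TEXT);
CREATE TABLE Period (PeriodCode INTEGER PRIMARY KEY, Year INTEGER, Quarter INTEGER);
CREATE TABLE CustomerRating (Rating TEXT PRIMARY KEY, Description TEXT);
CREATE TABLE Category (CategoryID INTEGER PRIMARY KEY, Description TEXT);
CREATE TABLE Product (PID INTEGER PRIMARY KEY, Pname TEXT,
                      CategoryID INTEGER REFERENCES Category(CategoryID));
CREATE TABLE FactTable (
    LocationCode INTEGER REFERENCES Location(LocationCode),
    PeriodCode   INTEGER REFERENCES Period(PeriodCode),
    Rating       TEXT    REFERENCES CustomerRating(Rating),
    PID          INTEGER REFERENCES Product(PID),
    Qty INTEGER, Amount REAL);

-- Made-up sample rows, for illustration only.
INSERT INTO Location VALUES (1,'CA','San Francisco'), (2,'CA','Los Angeles'), (3,'IL','Chicago');
INSERT INTO Period VALUES (1,2024,1), (2,2024,2);
INSERT INTO CustomerRating VALUES ('A','Excellent'), ('B','Good'), ('C','Fair');
INSERT INTO Category VALUES (1,'Footwear'), (2,'Headwear');
INSERT INTO Product VALUES (1,'Shoes',1), (2,'Hats',2);
INSERT INTO FactTable VALUES
    (1,1,'A',1,2,200.0), (1,2,'B',2,1,50.0), (2,1,'A',1,3,300.0), (3,1,'C',2,4,200.0);
""")

# Star join: the fact table joins directly to the Location dimension.
by_location = conn.execute("""
    SELECT l.State, l.City, SUM(f.Qty), SUM(f.Amount)
    FROM FactTable f JOIN Location l ON l.LocationCode = f.LocationCode
    GROUP BY l.State, l.City
    ORDER BY l.State, l.City
""").fetchall()

# Snowflake join: Category is reached only through the Product dimension.
by_category = conn.execute("""
    SELECT c.Description, SUM(f.Qty)
    FROM FactTable f
    JOIN Product p ON p.PID = f.PID
    JOIN Category c ON c.CategoryID = p.CategoryID
    GROUP BY c.Description
    ORDER BY c.Description
""").fetchall()
```

The second query shows what makes the Product branch a snowflake: the category description needs two joins, while the location attributes need one.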

From SalesDB to MyDataWarehouse
Extract data from SalesDB:
- Create a query to get the data
- Download to MyDataWarehouse (File/Import/Save as Table)
Data scrubbing/cleansing and transform:
- Transform City to Location
- Transform ODate to Period
Load data to FactTable
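
The scrub and transform steps might look roughly like this in code. Everything specific here is an assumption for the sketch: the lookup table contents, the YYYYQ form of PeriodCode, the whitespace and capitalization clean-up rule, and the derivation of Amount as Qty times Price.

```python
from datetime import date

# Hypothetical lookup built from the Location dimension: city name -> LocationCode.
LOCATION_CODES = {"San Francisco": 1, "Los Angeles": 2}

def scrub_city(raw):
    """Data scrubbing: collapse stray whitespace and normalize capitalization."""
    return " ".join(raw.split()).title()

def period_code(odate):
    """Transform ODate to an assumed PeriodCode of the form YYYYQ, e.g. 20241."""
    quarter = (odate.month - 1) // 3 + 1
    return odate.year * 10 + quarter

def transform(order):
    """Turn one extracted order line into a FactTable row."""
    return {
        "LocationCode": LOCATION_CODES[scrub_city(order["City"])],
        "PeriodCode": period_code(order["ODate"]),
        "Rating": order["Rating"],
        "PID": order["PID"],
        "Qty": order["Qty"],
        "Amount": order["Qty"] * order["Price"],  # assumed derivation of Amount
    }

# Made-up rows standing in for the result of the extract query.
extracted = [
    {"City": " san francisco ", "ODate": date(2024, 2, 14),
     "Rating": "A", "PID": 1, "Qty": 2, "Price": 100},
    {"City": "LOS ANGELES", "ODate": date(2024, 8, 1),
     "Rating": "B", "PID": 2, "Qty": 1, "Price": 50},
]
fact_rows = [transform(o) for o in extracted]
```

Loading would then insert `fact_rows` into FactTable; that step is left out because it depends on the target database.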

Figure 6-8: Bitmap index organization
Bitmap saves on space requirements
Rows: possible values of the attribute
Columns: table rows
Each bit indicates whether the attribute of a row has that value
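
A minimal sketch of the idea, using a Python integer as each bitmap. The sample column values and the helper names `build_bitmap_index` and `rows_of` are made up for this example; real database engines store and compress bitmaps differently.

```python
def build_bitmap_index(values):
    """Return {attribute value: bitmap}, where bit i is set if row i has that value."""
    index = {}
    for row_no, value in enumerate(values):
        index[value] = index.get(value, 0) | (1 << row_no)
    return index

def rows_of(bitmap):
    """Decode a bitmap back into an ascending list of row numbers."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]

# Hypothetical columns of a six-row customer table.
ratings = ["A", "B", "A", "C", "B", "A"]
cities = ["SF", "LA", "LA", "SF", "SF", "SF"]

rating_idx = build_bitmap_index(ratings)
city_idx = build_bitmap_index(cities)

# A condition on two attributes becomes a bitwise AND of two bitmaps.
a_in_sf = rows_of(rating_idx["A"] & city_idx["SF"])
```

Each distinct value costs one bit per table row, which is why this organization suits attributes with few possible values.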

Figure 6-9: Join indexes speed up join operations

Data Mining and Visualization
Knowledge discovery using a blend of statistical, AI, and computer graphics techniques
Goals:
- Explain observed events or conditions
- Confirm hypotheses
- Explore data for new or unexpected relationships
Techniques:
- Statistical regression
- Decision tree induction
- Clustering and signal processing
- Affinity
- Sequence association
- Case-based reasoning
- Rule discovery
- Neural nets
- Fractals
Data visualization: representing data in graphical/multimedia formats for analysis

SQL GROUPING SETS
SELECT CITY, RATING, COUNT(CID)
FROM HCUSTOMERS
GROUP BY GROUPING SETS(CITY, RATING, (CITY, RATING), ())
ORDER BY CITY;
Note: () indicates that an overall total is desired.
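
The query above produces the same rows as a UNION ALL of one plain GROUP BY per grouping set. The sketch below shows that equivalence with Python's sqlite3 module; SQLite is used only because it ships with Python, and since it does not implement GROUPING SETS, the emulated form is the one that runs. The sample customers are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE HCUSTOMERS (CID INTEGER PRIMARY KEY, CITY TEXT, RATING TEXT)")
# Made-up sample rows, for illustration only.
conn.executemany(
    "INSERT INTO HCUSTOMERS VALUES (?, ?, ?)",
    [(1, "SF", "A"), (2, "SF", "B"), (3, "LA", "A"), (4, "LA", "A")],
)

# One SELECT per grouping set; columns outside the set are returned as NULL.
EMULATED = """
SELECT CITY, NULL AS RATING, COUNT(CID) AS N FROM HCUSTOMERS GROUP BY CITY
UNION ALL
SELECT NULL, RATING, COUNT(CID) FROM HCUSTOMERS GROUP BY RATING
UNION ALL
SELECT CITY, RATING, COUNT(CID) FROM HCUSTOMERS GROUP BY CITY, RATING
UNION ALL
SELECT NULL, NULL, COUNT(CID) FROM HCUSTOMERS
"""
rows = conn.execute(EMULATED).fetchall()
```

The last branch has no GROUP BY at all, which corresponds to the empty grouping set () and yields the overall total.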

SQL CUBE
Perform aggregations for all possible combinations of the columns indicated.
SELECT CITY, RATING, COUNT(CID)
FROM HCUSTOMERS
GROUP BY CUBE(CITY, RATING)
ORDER BY CITY, RATING;
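
"All possible combinations" means every subset of the listed columns, so CUBE over n columns yields 2^n grouping sets. A small pure-Python sketch of that expansion follows; the sample rows and helper names are hypothetical, and None stands in for SQL NULL in the output keys.

```python
from collections import Counter
from itertools import combinations

COLUMNS = ("CITY", "RATING")
# Made-up sample rows, for illustration only.
ROWS = [
    {"CID": 1, "CITY": "SF", "RATING": "A"},
    {"CID": 2, "CITY": "SF", "RATING": "B"},
    {"CID": 3, "CITY": "LA", "RATING": "A"},
    {"CID": 4, "CITY": "LA", "RATING": "A"},
]

def cube_grouping_sets(columns):
    """Every subset of `columns`, largest first, ending with () for the grand total."""
    n = len(columns)
    return [c for size in range(n, -1, -1) for c in combinations(columns, size)]

def count_by(rows, grouping_set):
    """Row count per group; columns outside the grouping set are reported as None."""
    return Counter(
        tuple(r[c] if c in grouping_set else None for c in COLUMNS) for r in rows
    )

cube_counts = Counter()
for grouping_set in cube_grouping_sets(COLUMNS):
    cube_counts.update(count_by(ROWS, grouping_set))
```

For two columns the expansion is (CITY, RATING), (CITY), (RATING), and (), the same four grouping sets written out by hand in the GROUPING SETS query.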

SQL ROLLUP
The ROLLUP extension causes cumulative subtotals to be calculated for the columns indicated. If multiple columns are indicated, subtotals are performed for each of the columns except the far-right column.
SELECT CITY, RATING, COUNT(CID)
FROM HCUSTOMERS
GROUP BY ROLLUP(CITY, RATING)
ORDER BY CITY, RATING;
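
ROLLUP can be read as grouping by each leading prefix of the column list, which is why the far-right column never gets its own subtotal. The following pure-Python sketch illustrates this; the sample rows and helper names are hypothetical, and None stands in for SQL NULL in the output keys.

```python
from collections import Counter

COLUMNS = ("CITY", "RATING")
# Made-up sample rows, for illustration only.
ROWS = [
    {"CID": 1, "CITY": "SF", "RATING": "A"},
    {"CID": 2, "CITY": "SF", "RATING": "B"},
    {"CID": 3, "CITY": "LA", "RATING": "A"},
    {"CID": 4, "CITY": "LA", "RATING": "A"},
]

def rollup_grouping_sets(columns):
    """Each leading prefix of `columns`, longest first, ending with ()."""
    return [tuple(columns[:k]) for k in range(len(columns), -1, -1)]

def count_by(rows, grouping_set):
    """Row count per group; columns outside the grouping set are reported as None."""
    return Counter(
        tuple(r[c] if c in grouping_set else None for c in COLUMNS) for r in rows
    )

rollup_counts = Counter()
for grouping_set in rollup_grouping_sets(COLUMNS):
    rollup_counts.update(count_by(ROWS, grouping_set))
```

Compared with CUBE on the same columns, the result lacks the per-RATING subtotals: there is a subtotal for each CITY and a grand total, but no row grouped by RATING alone.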