Presentation is loading. Please wait.

Presentation is loading. Please wait.

Decision Support and Date Warehouse Jingyi Lu. Outline Decision Support System OLAP vs. OLTP What is Date Warehouse? Dimensional Modeling Extract, Transform,

Similar presentations


Presentation on theme: "Decision Support and Date Warehouse Jingyi Lu. Outline Decision Support System OLAP vs. OLTP What is Date Warehouse? Dimensional Modeling Extract, Transform,"— Presentation transcript:

1 Decision Support and Date Warehouse Jingyi Lu

2 Outline Decision Support System OLAP vs. OLTP What is Date Warehouse? Dimensional Modeling Extract, Transform, and Load (ETL)

3 Decision Support System Information technology to help the knowledge worker(executive, manager, analyst) make faster and better decisions. – What were the sales volumes by region and product category for the last year? – Which orders should we fill to maximize revenues? – Will a 10% discount increase sales volume sufficiently?

4 Decision Support Systems Created to facilitate the decision making process So much information that it is difficult to extract it all from a traditional database Need for a more comprehensive data storage facility  Data Warehouse

5 Decision Support Systems Extract Information from data to use as the basis for decision making Used at all levels of the Organization Tailored to specific business areas Ad Hoc queries to retrieve and display information Combines historical operation data with business activities

6 Decision Support Systems

7 OLAP vs. OLTP OLTP (On-line Transaction Processing): is characterized by a large number of short on-line transactions.-----> Operational database OLAP (On-line Analytical Processing):is characterized by relatively low volume of transactions. Queries are often very complex and involve aggregations.------> Data Warehouse

8 OLAP vs. OLTP

9 What is a Data Warehouse The repository for the DSS is the DATA WAREHOUSE Definition: Integrated, Subject-Oriented, Time-Variant, Nonvolatile database that provides support for decision making.

10 Integrated The data warehouse is a centralized, consolidated database that integrated data derived from the entire organization  Multiple Sources  Diverse Sources  Diverse Formats

11 Subject-Oriented Data is arranged and optimized to provide answer to questions from diverse functional areas Data is organized and summarized by topic  Sales / Marketing / Finance / Distribution / Etc.

12 Time-Variant The Data Warehouse represents the flow of data through time Can contain projected data from statistical models Data is periodically uploaded then time- dependent data is recomputed

13 Nonvolatile Once data is entered it is NEVER removed Represents the company’s entire history  Near term history is continually added to it  Always growing  Must support terabyte databases and multiprocessors Read-Only database for data analysis and query processing

14 Dimensional Modeling Dimension  dimension is a data element that categorizes each item in a data set into non-overlapping regions Facts  a value or measurement, which represents a fact about the managed entity or system.  typically numeric values that can be aggregated

15 Dimensional Modeling Database is a set of facts (points) in a multidimensional space Fact tables  contains business facts or measures and foreign keys which refer to primary keys in the dimension tables Dimension tables  Each dimension table has a set of attributes e.g., Day, Month, Year of Date  Attributes of a dimension may be related by partial order Hierarchy: e.g., Day > Month > Year

16 Example of Star Schema

17 Example of Snowflake Schema

18 ETL

19 Extraction Transformation Loading – ETL To get data out of the source and load it into the data warehouse – simply a process of copying data from one database to other Data is extracted from an OLTP database, transformed to match the data warehouse schema and loaded into the data warehouse database Many data warehouses also incorporate data from non- OLTP systems such as text files, legacy systems, and spreadsheets; such data also requires extraction, transformation, and loading When defining ETL for a data warehouse, it is important to think of ETL as a process, not a physical implementation

20 ETL ETL is often a complex combination of process and technology that consumes a significant portion of the data warehouse development efforts and requires the skills of business analysts, database designers, and application developers It is not a one time event as new data is added to the Data Warehouse periodically – monthly, daily, hourly Because ETL is an integral, ongoing, and recurring part of a data warehouse  Automated  Well documented  Easily changeable

21 ETL Staging Database ETL operations should be performed on a relational database server separate from the source databases and the data warehouse database Creates a logical and physical separation between the source systems and the data warehouse Minimizes the impact of the intense periodic ETL activity on source and data warehouse databases


Download ppt "Decision Support and Date Warehouse Jingyi Lu. Outline Decision Support System OLAP vs. OLTP What is Date Warehouse? Dimensional Modeling Extract, Transform,"

Similar presentations


Ads by Google