INCREMENTAL AGGREGATION After you create a session that includes an Aggregator transformation, you can enable the session option, Incremental Aggregation.


INCREMENTAL AGGREGATION After you create a session that includes an Aggregator transformation, you can enable the session option Incremental Aggregation. The first time you run an incremental aggregation session, the Server processes the entire source. At the end of the session, the Server stores the aggregate data from that run in two files, the index file and the data file. On each subsequent run with incremental aggregation enabled, the session uses only the incremental source changes.

INCREMENTAL AGGREGATION For each input record, the Server checks the historical information in the index file for a corresponding group. If it finds a corresponding group, the Server performs the aggregate operation incrementally, using the aggregate data for that group, and saves the incremental change. If it does not find a corresponding group, the Server creates a new group and saves the record data. When writing to the target, the Server applies the changes to the existing target. It saves the modified aggregate data in the index and data files, to be used as historical data the next time you run the session.
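
The per-record logic above can be sketched in Python. This is only an illustration of the idea, not how the Server is implemented: a plain dictionary stands in for the index and data files, the aggregate is assumed to be a simple sum, and the names `apply_increment`, `cache`, and `rows` are hypothetical.

```python
def apply_increment(cache, rows):
    """Fold new rows into per-group running totals and return the changed groups."""
    changed = {}
    for key, qty in rows:
        if key in cache:
            cache[key] += qty   # corresponding group found: update it incrementally
        else:
            cache[key] = qty    # no corresponding group: create a new one
        changed[key] = cache[key]
    return changed


cache = {}  # stands in for the historical index and data files
first = apply_increment(cache, [("A", 10), ("B", 5), ("A", 2)])   # first run: entire source
second = apply_increment(cache, [("A", 3), ("C", 7)])             # later run: new rows only
```

Only the groups returned in `changed` would need to be applied to the target; group "B" is left alone on the second run.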

INCREMENTAL AGGREGATION Circumstances for using incremental aggregation
- You can capture new source data. Use incremental aggregation when you can capture new source data each time you run the session.
- Incremental changes do not significantly change the target. Use incremental aggregation when the changes do not significantly change the target. If processing the incrementally changed source alters more than half the existing target, the session may not benefit from using incremental aggregation. In this case, drop the table and re-create the target with complete source data.

EXAMPLE You might have a session using a source that receives new data every day. You can capture those incremental changes because you have added a mapping variable to the mapping, and the filter that uses it removes pre-existing data from the data flow. You then enable incremental aggregation. When the session runs with incremental aggregation enabled for the first time on March 1, you use the entire source. This allows the Server to read and store the necessary aggregate data. On March 2, when you run the session again, you filter out all the records except those time-stamped March 2. The Server then processes only the new data and updates the target accordingly.
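
A rough Python sketch of the March 1 / March 2 scenario follows. It assumes each record carries a timestamp field `ts` and that the latest loaded date is remembered between runs; the year 2024, the field names, and the helper `rows_for_run` are hypothetical and chosen only for illustration.

```python
from datetime import date


def rows_for_run(source, last_loaded):
    """Return all rows on the first run, otherwise only rows newer than last_loaded."""
    if last_loaded is None:
        return list(source)
    return [r for r in source if r["ts"] > last_loaded]


source = [{"ts": date(2024, 3, 1), "qty": 4}]

# March 1: first run, the entire source is read.
run1 = rows_for_run(source, None)
last_loaded = max(r["ts"] for r in run1)

# March 2: new data arrives and only the new rows pass the filter.
source.append({"ts": date(2024, 3, 2), "qty": 6})
run2 = rows_for_run(source, last_loaded)
```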

REINITIALIZING THE AGGREGATE CACHE The session properties include an option for reinitializing the aggregate cache. When you enable this option, the Server overwrites the aggregate cache each time you run the session. When you disable it, the Server combines the new aggregate data from the source with the existing historical cache data. For example, you can reinitialize the aggregate cache if the source for a session changes incrementally every day and completely changes once a month. When you receive the new monthly source, you might configure the session to reinitialize the aggregate cache, truncate the existing target, and use the new source table during the session.
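
The effect of the option can be pictured with a small Python sketch. It assumes the cache is a dictionary of running sums; `run_session` and its `reinitialize` flag are hypothetical names standing in for a session run and the session property.

```python
def run_session(cache, rows, reinitialize=False):
    """Aggregate rows into cache, discarding the historical data first if requested."""
    if reinitialize:
        cache.clear()  # overwrite: start again from an empty cache
    for key, amount in rows:
        cache[key] = cache.get(key, 0) + amount
    return cache


cache = {}
run_session(cache, [("jan", 100)])
run_session(cache, [("jan", 20)])            # option disabled: history is kept
after_daily = dict(cache)

run_session(cache, [("feb", 50)], reinitialize=True)   # new monthly source
after_reinit = dict(cache)
```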

SALES STAR SCHEMA [Diagram: Sales Fact with the Customer Dimension, Period Dimension, and Product Dimension]

Description of sales schema
- Customer Dimension: SCD Type 1
- Product Dimension: SCD Type 2
- Period Dimension
- Sales Fact: monthly level granularity
In the sales fact, the combination of cust_id, prod_id, and month_id is the primary key. We receive daily sales transaction details every day and have to populate the warehouse daily.

Mapping (without using incremental aggregation in session properties)

Source Qualifier Properties The $$MAP_VAR mapping variable is used in the Source Filter to perform an incremental extract.

Expression properties Converts the sales date to month and year, and assigns the maximum value of sales_id to the mapping variable using the SETMAXVARIABLE() function.
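
The following Python sketch mimics what this expression step does, under the assumption that `$$MAP_VAR` holds the highest sales_id already loaded. It is not PowerCenter expression syntax; `process_rows`, `map_var`, and the sample rows are hypothetical.

```python
from datetime import date


def process_rows(rows, map_var):
    """Add month and year to each row and track the largest sales_id seen so far."""
    out = []
    for r in rows:
        d = r["sales_date"]
        out.append({**r, "month": d.month, "year": d.year})
        map_var = max(map_var, r["sales_id"])   # same idea as keeping a running maximum
    return out, map_var


rows = [
    {"sales_id": 101, "sales_date": date(2024, 3, 1)},
    {"sales_id": 103, "sales_date": date(2024, 3, 2)},
]
out, map_var = process_rows(rows, 100)

# On the next run, a source filter of the form sales_id > map_var skips these rows.
already_loaded = [r for r in rows if r["sales_id"] > map_var]
```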

Product dimension Lookup Looks up the product dimension to get the corresponding surrogate ID.

Period Dimension Lookup Looking up the Period Dimension to get the corresponding month_id

Aggregator properties Aggregates the quantity and total price, grouped by cust_id, prod_surr_id, and month_id, because this combination is the primary key of the fact table.
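
As a rough illustration, the grouping done by the Aggregator can be written in Python as below. The column names qty and totprice and the sample values are assumptions made for the sketch.

```python
from collections import defaultdict


def aggregate(rows):
    """Sum qty and totprice per (cust_id, prod_surr_id, month_id) group."""
    totals = defaultdict(lambda: {"qty": 0, "totprice": 0.0})
    for r in rows:
        key = (r["cust_id"], r["prod_surr_id"], r["month_id"])
        totals[key]["qty"] += r["qty"]
        totals[key]["totprice"] += r["totprice"]
    return dict(totals)


rows = [
    {"cust_id": 1, "prod_surr_id": 10, "month_id": 202403, "qty": 2, "totprice": 20.0},
    {"cust_id": 1, "prod_surr_id": 10, "month_id": 202403, "qty": 3, "totprice": 30.0},
    {"cust_id": 2, "prod_surr_id": 10, "month_id": 202403, "qty": 1, "totprice": 10.0},
]
result = aggregate(rows)
```

Each key in `result` corresponds to one candidate row of the sales fact.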

Sales fact Lookup Looks up the sales fact to determine whether the combination of cust_id, prod_surr_id, and month_id already exists. [Screenshot: lookup condition]

Expression to flag new rows and return Qty and Totprice
- A flag identifies new records.
- The Qty expression returns the sum of the Qty from the source and the Qty already in the fact if the record exists in the fact; otherwise it returns the Qty from the source.
- The Totprice expression returns the sum of the total price from the source and the total price already in the fact if the record exists in the fact; otherwise it returns the total price from the source.
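
A minimal Python sketch of this manual calculation, assuming the lookup returns `None` when the key is not yet in the fact. The function name `merge_with_fact` and the field names are hypothetical.

```python
def merge_with_fact(src, fact_row):
    """Flag new rows and add the existing fact values when the row already exists."""
    if fact_row is None:
        # Not found in the fact: new record, keep the source values as they are.
        return {"is_new": True, "qty": src["qty"], "totprice": src["totprice"]}
    # Found in the fact: add the source values to the stored ones.
    return {
        "is_new": False,
        "qty": src["qty"] + fact_row["qty"],
        "totprice": src["totprice"] + fact_row["totprice"],
    }


new_row = merge_with_fact({"qty": 2, "totprice": 20.0}, None)
existing_row = merge_with_fact({"qty": 2, "totprice": 20.0}, {"qty": 5, "totprice": 50.0})
```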

Router Transformation The group filter condition selects new records. Existing records fall into the default group.

Update strategy Transformation Existing records will be updated in the sales fact

Target Fact Sales fact in which new rows are inserted and existing rows are updated
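
The routing and update steps that feed the target can be sketched together in Python. This assumes each incoming row already carries an `is_new` flag and merged totals; the dictionary `fact` stands in for the sales fact table, and `load_fact` is a hypothetical helper.

```python
def load_fact(fact, rows):
    """Insert rows flagged as new and update the rest; return (inserted, updated)."""
    inserted = updated = 0
    for r in rows:
        key = (r["cust_id"], r["prod_surr_id"], r["month_id"])
        values = {"qty": r["qty"], "totprice": r["totprice"]}
        if r["is_new"]:
            fact[key] = values          # new-record group: insert
            inserted += 1
        else:
            fact[key].update(values)    # default group: update the existing row
            updated += 1
    return inserted, updated


fact = {(1, 10, 202403): {"qty": 5, "totprice": 50.0}}
rows = [
    {"cust_id": 1, "prod_surr_id": 10, "month_id": 202403, "qty": 7, "totprice": 70.0, "is_new": False},
    {"cust_id": 2, "prod_surr_id": 10, "month_id": 202403, "qty": 1, "totprice": 10.0, "is_new": True},
]
counts = load_fact(fact, rows)
```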

With incremental aggregation in session properties

Incremental aggregation option The Incremental Aggregation option is enabled in the session properties.

Conclusion The two mappings, one without and one with the incremental aggregation option in session properties, differ only in their Expression transformations. In the first mapping (without incremental aggregation), we manually perform the calculation in an Expression transformation to do the incremental aggregation. In the second mapping (with the incremental aggregation option), the PowerCenter Server itself does the incremental aggregation.