Agenda ODI Performance ODI Scheduling ODI Deployment/Release.

Presentation transcript:

Agenda: ODI Performance; ODI Scheduling; ODI Deployment/Release.

Uli Bethke: Dublin based; blog www.bi-q.ie; ODI 2007; reviewer of two ODI books; ODI articles on OTN; deputy chair of the OUG BI SIG (next event 11th June); ODI advanced trainer.

ODI performance: ODI is a metadata-driven (SQL) code generator that uses code templates (knowledge modules). It uses a Java agent to communicate and send data between the source and target systems and the repository over the network.
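As a rough illustration of what "metadata-driven code generation from templates" means, the sketch below renders a set-based SQL statement from a small metadata dictionary. It is not ODI's actual knowledge module syntax; the template, the `generate_sql` helper, and all table and column names are hypothetical.

```python
from string import Template

# Hypothetical, heavily simplified stand-in for a knowledge module:
# a SQL template whose placeholders are filled from metadata.
KM_INSERT_SELECT = Template(
    "INSERT INTO $target ($columns)\n"
    "SELECT $expressions\n"
    "FROM $source"
)


def generate_sql(mapping: dict) -> str:
    """Render one set-based INSERT ... SELECT from mapping metadata."""
    columns = ", ".join(mapping["columns"].keys())
    expressions = ", ".join(mapping["columns"].values())
    return KM_INSERT_SELECT.substitute(
        target=mapping["target"],
        source=mapping["source"],
        columns=columns,
        expressions=expressions,
    )


# Hypothetical metadata: target column -> source expression.
mapping = {
    "source": "STG.CUSTOMER_SRC",
    "target": "DW.DIM_CUSTOMER",
    "columns": {"CUSTOMER_ID": "CUST_ID", "CUSTOMER_NAME": "UPPER(CUST_NAME)"},
}

sql = generate_sql(mapping)
print(sql)
```

The point of the design is that the generated statement is plain SQL executed by the database, which is why SQL skills dominate tuning work.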

SQL: more than 80% of ODI performance issues are SQL issues, so SQL is the main ODI skill. Perfect your SQL: advanced SQL, analytic functions. Know your database(s) inside out, in particular the target. Understand, write, and modify Knowledge Modules.

Agent: lightweight Java-based application. Tied to host OS. Generates code based on ODI metadata. Communicates with source, target, and repository. JDBC data transport. XML. Jetty. Interpreters: Jython, JBS, JavaScript, Groovy. HSQLDB in-memory database. Scheduler. Sizing.

Agent location. Target: least amount of roundtrips; network (JDBC, XML); one target database server only (DW). Another server: ODBC drivers; JEE agent on WebLogic; no support for target OS; resources on target; DBA.

Interfaces. No!! to KMs using row-by-row processing. Use ODI functions rather than DB functions. Don't overuse the CKM (especially for large data volumes). Temp indexes (I$). Gather statistics (C$, I$, TGT when applicable). Rule of thumb: use loader KMs or DB link KMs rather than JDBC KMs.
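To make the warning against row-by-row processing concrete, here is a minimal sketch that uses SQLite as a stand-in database (an assumption; the slides target a data warehouse database). The table names are hypothetical. Both loads produce the same rows, but the second is a single set-based statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src (id INTEGER, amount REAL)")
conn.execute("CREATE TABLE tgt_row (id INTEGER, amount REAL)")
conn.execute("CREATE TABLE tgt_set (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO src VALUES (?, ?)", [(i, i * 1.5) for i in range(1000)]
)

# Row by row: one INSERT statement per source row (the pattern to avoid).
for row_id, amount in conn.execute("SELECT id, amount FROM src").fetchall():
    conn.execute("INSERT INTO tgt_row VALUES (?, ?)", (row_id, amount * 2))

# Set based: a single INSERT ... SELECT lets the database do the work.
conn.execute("INSERT INTO tgt_set SELECT id, amount * 2 FROM src")

row_count = conn.execute("SELECT COUNT(*) FROM tgt_row").fetchone()[0]
set_count = conn.execute("SELECT COUNT(*) FROM tgt_set").fetchone()[0]
row_sum = conn.execute("SELECT SUM(amount) FROM tgt_row").fetchone()[0]
set_sum = conn.execute("SELECT SUM(amount) FROM tgt_set").fetchone()[0]
print(row_count, set_count)
```

With a remote database each row-level statement also costs a network roundtrip, which is where the difference becomes significant.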

Source/target. Schemas on the same database server: physical schema and not data server. Have sources physically close to the target. Minimize impact on the source. Chunking.
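One possible reading of "chunking" is to extract a large source table in bounded key ranges rather than in one long-running query. The helper below is a hypothetical sketch of that idea, not an ODI feature; each yielded range could drive a filter such as `WHERE id BETWEEN low AND high`.

```python
def key_range_chunks(min_key, max_key, chunk_size):
    """Yield inclusive (low, high) key ranges covering min_key..max_key."""
    low = min_key
    while low <= max_key:
        high = min(low + chunk_size - 1, max_key)
        yield low, high
        low = high + 1


chunks = list(key_range_chunks(1, 25, 10))
print(chunks)
```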

CRITICAL PATH
Network paths and path durations:
B > E > H: 6 + 2 + 11 = 19
B > D > F: 6 + 4 + 14 = 24
B > D > G: 6 + 4 + 10 = 20
A > C > G: 9 + 8 + 10 = 27 (critical path)
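The path durations above can be recomputed with a short sketch. The node durations and edges are read off the four listed paths; the function and variable names are illustrative only.

```python
# Durations per node, as implied by the four paths listed on the slide.
durations = {"A": 9, "B": 6, "C": 8, "D": 4, "E": 2, "F": 14, "G": 10, "H": 11}
# Directed edges, also taken from the listed paths.
successors = {"A": ["C"], "B": ["D", "E"], "C": ["G"], "D": ["F", "G"], "E": ["H"]}


def all_paths(node, prefix=()):
    """Yield every path from node to a node without successors."""
    path = prefix + (node,)
    next_nodes = successors.get(node, [])
    if not next_nodes:
        yield path
    for next_node in next_nodes:
        yield from all_paths(next_node, path)


# Start nodes are those that never appear as a successor.
start_nodes = [
    n for n in durations if all(n not in s for s in successors.values())
]
paths = {
    " > ".join(p): sum(durations[n] for n in p)
    for start in start_nodes
    for p in all_paths(start)
}
critical_path = max(paths, key=paths.get)
print(critical_path, paths[critical_path])
```

The critical path is simply the longest path through the network, so it sets the minimum elapsed time of the whole load.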

Micro Tuning. JDBC drivers. JVM. Type 4 or 5 JDBC drivers (Data Direct). Array fetch size. DB packet size. Network packet size.

Performance Monitoring: ODI log data mart (facts, dimensions, metrics); frontend.

DBMS_SQLTUNE_UTIL0: dbms_sqltune_util0.sqltext_to_sqlid; link to data dictionary tables.

Maciej Kocon: Dublin based; ODI 2005 (Sunopsis); reviewer of two ODI books; blog www.bi-q.ie; maciek@bi-q.ie

ORCHESTRATING DWH PROCESSES. Orchestration of data process flow. Standard DWH process flow orchestration: Packages in Oracle Data Integrator 10g; Load Plans in Oracle Data Integrator 11g. Process flow use cases: efficiency analysis. Alternative scheduling benefits.

TYPICAL DATA FLOW in DWH. Step 1, STAGE (E-LT data extract): loads data from sources. Step 2, DIMs (label): provides structured labeling information. Step 3, FACTS: consists of measurements, metrics or facts. The steps are data transport & transform units; orchestration: ODI 10g Packages, ODI 11 Load Plans.

ORCHESTRATION: ODI PACKAGES. Using objects directly: PKG_ABC (INT_A, PRC_B, INT_C); PKG_DE (INT_D, INT_E). Using scenarios (compiled code), SYNCHRONOUS: PKG_ABCDE (INT_A, PRC_B, INT_C, PKG_DE). Using scenarios (compiled code), ASYNCHRONOUS: PKG_ABCDE (INT_A, PRC_B, INT_C, PKG_DE).

ODI 10g vs. ODI 11: same effect!
ODI 10g Packages: PKG_DM calls PKG_ABC (INT_A, PRC_B, INT_C), PKG_DE (INT_C, PRC_D), PKG_FG (INT_F, PRC_G).
ODI 11 Load Plans: STAGE (A, B, C), DIMs (D, E), FACTS (F, G).

PROCESS FLOW EFFICIENCY ANALYSIS. Standard flow orchestration: Stage, (stop), DIMs, (stop), Facts. Processes A to G (for example A = 30, B = 10) run in parallel within each stage, and the stages run sequentially: 30 + 30 + 10 = 70. DOWNSIDES: possible inefficiencies (idle resources).

PROCESS FLOW EFFICIENCY ANALYSIS, OPTIMIZATION ATTEMPT. Regrouping processes A to G so that the longest chain is 10 + 30 + 10 = 50 instead of 30 + 30 + 10 = 70: 70 / 50 = 1.4 times quicker! UPSIDE: efficiency improved.

ADVANCED Data Flow example

Enterprise DWH Data Flow example

PROCESS FLOW EFFICIENCY ANALYSIS, OPTIMIZATION ATTEMPT (10 + 30 + 10 = 50; 70 / 50 = 1.4 times quicker). UPSIDE: efficiency improved. DOWNSIDES: timings knowledge required; overall dependency knowledge required.

PROCESS FLOW EFFICIENCY ANALYSIS, OPTIMIZATION ATTEMPT. Stage-by-stage flow: 30 + 30 + 10 = 70. DOWNSIDE: inefficiency exists but can't be resolved; consumer waiting & impact.

Traditional Scheduling (SCHEDULER), limitations: possible inefficiencies (idle resources); timings knowledge required; overall dependency knowledge required; inefficiency exists but can't be resolved; consumer waiting & impact.

DEPENDENCY DRIVEN Scheduling: PACKAGES & LOAD PLANS [diagram of processes A, B, C, D, E]

PROCESS FLOW EFFICIENCY ANALYSIS. Standard orchestration of processes A to G: 30 + 30 + 10 = 70. Dependency driven: 30. 70 / 30 = 2.3 times faster!

Dependency Driven Scheduling. Simplifies orchestrating the flow: only the immediate upstream definition is required; execution timings are not relevant; self-adapts in the most effective way. Improves overall E-LT performance: fewer idle resources, better utilization. Independency unveils its full potential in complex enterprise-class DWHs (Inmon).
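The difference between stage-by-stage orchestration and dependency driven scheduling can be sketched as a small calculation. Everything here is an assumption for illustration: the per-process durations are chosen so that the stage maxima match 30 + 30 + 10, the upstream dependencies are hypothetical (the dependency graph on the slides is not recoverable from the transcript), and unlimited parallel capacity is assumed. The result is therefore illustrative and is not meant to reproduce the slide's figure of 30.

```python
# Assumed durations; stage maxima match the slide's 30 + 30 + 10.
durations = {"A": 30, "B": 10, "C": 10, "D": 10, "E": 30, "F": 10, "G": 10}
stages = [["A", "B", "C"], ["D", "E"], ["F", "G"]]  # STAGE, DIMs, FACTS
# Hypothetical immediate upstream dependencies of each process.
upstream = {
    "A": [], "B": [], "C": [],
    "D": ["A"], "E": ["B", "C"],
    "F": ["D"], "G": ["E"],
}


def staged_makespan(stages, durations):
    """Each stage waits for the slowest process of the previous stage."""
    return sum(max(durations[p] for p in stage) for stage in stages)


def dependency_driven_makespan(upstream, durations):
    """Each process starts as soon as its own upstream processes finish."""
    finish = {}

    def finish_time(process):
        if process not in finish:
            start = max((finish_time(u) for u in upstream[process]), default=0)
            finish[process] = start + durations[process]
        return finish[process]

    return max(finish_time(p) for p in upstream)


staged = staged_makespan(stages, durations)
dependency_driven = dependency_driven_makespan(upstream, durations)
print(staged, dependency_driven)
```

Note that only immediate upstream processes are declared per process; no timings or global ordering need to be known in advance, which is the simplification the slide describes.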

Dependency Driven Scheduling. Notifications: errors (+ auto-restartability), finish summary, logging. Multiple/overlapping E-LT streams: load with different frequencies. Parameterization: improved system stress control, process prioritization.

FIRST RUN: 10 processes. TODAY: 584 processes, 1389 dependencies, 132 231 scenarios run. Run time: 12h43m with Load Plans, 4h21m with the scheduler: 2.9 times faster.
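The "2.9 times faster" figure follows from the two run times quoted on the slide. A small check, with a hypothetical helper for parsing the `12h43m` notation:

```python
def to_minutes(hhmm: str) -> int:
    """Convert a string such as '12h43m' to a number of minutes."""
    hours, minutes = hhmm.rstrip("m").split("h")
    return int(hours) * 60 + int(minutes)


load_plans = to_minutes("12h43m")  # run time with Load Plans
scheduler = to_minutes("4h21m")    # run time with the scheduler
speedup = load_plans / scheduler
print(round(speedup, 1))
```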

Enterprise DWH Data Flow

Release 1.0

Release 2.0 TST

TESTING Release 2.0

Deploy Release 2.0 PRD

The Hot fix SITUATION

Release frequently

CI environment

The build master

AUTOMATE Stuff

ODI vs. Source control

ODI structure

Beyond intra build Dependencies