Agenda ODI Performance ODI Scheduling ODI Deployment/Release.

1 Agenda: ODI Performance, ODI Scheduling, ODI Deployment/Release

2 Uli Bethke
Dublin based. Blog: www.bi-q.ie. ODI since 2007. Reviewer of two ODI books. ODI articles on OTN. Deputy chair of the OUG BI SIG (next event: 11th June). ODI advanced trainer.

3 ODI performance
ODI is a metadata-driven (SQL) code generator that uses code templates (knowledge modules). It uses a Java agent to communicate and send data between source and target systems and the repository over the network.

4 SQL
More than 80% of ODI performance issues are SQL issues, so SQL is the main ODI skill. Perfect your SQL: advanced SQL, analytic functions. Know your database(s) inside out, in particular the target. Understand, write, and modify Knowledge Modules.

5 Agent
Lightweight Java-based application. Tied to host OS. Generates code based on ODI metadata. Communicates with source, target, and repository. JDBC data transport. XML. Jetty. Interpreters: Jython, JBS, JavaScript, Groovy. HSQLDB in-memory database. Scheduler. Sizing.

6 Agent
Target: least amount of roundtrips. Network (JDBC, XML). One target database server only (DW).
Another server: ODBC drivers. JEE agent on Weblogic. No support for target OS. Resources on target. DBA.

7 Interfaces
No!! KM using row by row processing. Use ODI functions rather than DB functions. Don’t overuse CKM (especially for large data volumes). Temp indexes (I$). Gather statistics (C$, I$, TGT when applicable). Rule of thumb: use loader KMs or db link KMs rather than JDBC KMs.

8 Source/target schemas on same database server. Physical schema and not data server. Have sources physically close to target. Minimize impact on source. Chunking.
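Chunking is only named on the slide. As a minimal sketch, assuming it means splitting a source extract into bounded key ranges so that each query against the source stays small, the ranges could be generated as follows. The function name `chunk_ranges` and the use of a numeric key are hypothetical and not taken from the presentation.

```python
def chunk_ranges(low, high, size):
    """Yield inclusive (start, end) key ranges covering low..high.

    Hypothetical helper: each range could drive one bounded extract
    query against the source instead of a single large one.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    start = low
    while start <= high:
        end = min(start + size - 1, high)
        yield start, end
        start = end + 1


# Example: keys 1..10 extracted in chunks of at most 4 keys.
ranges = list(chunk_ranges(1, 10, 4))
print(ranges)
```

Each range would then be used as a filter on the source query, one chunk at a time.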

9 CRITICAL PATH
Network paths and path durations:
B > E > H = 19
B > D > F = 24
B > D > G = 20
A > C > G = 27 (critical path)
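The critical path is the path with the longest total duration. A small sketch using the four path durations listed on the slide (the variable names are illustrative only):

```python
# Path durations as listed on the slide.
path_durations = {
    ("B", "E", "H"): 19,
    ("B", "D", "F"): 24,
    ("B", "D", "G"): 20,
    ("A", "C", "G"): 27,
}

# The critical path is the path with the largest total duration.
critical_path = max(path_durations, key=path_durations.get)
critical_duration = path_durations[critical_path]

print(" > ".join(critical_path), "=", critical_duration)
```

Shortening any path other than the critical one does not reduce the overall run time, which is why tuning effort is best spent on the longest path first.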

10 Micro Tuning
JDBC drivers. JVM. Type 4 or 5 JDBC drivers (Data Direct). Array fetch size. DB packet size. Network packet size.

11 Performance Monitoring
ODI log. Data mart. Facts. Dimensions. Metrics. Frontend.

12 dbms_sqltune_util0
dbms_sqltune_util0.sqltext_to_sqlid. Link to data dictionary tables.

13 Maciej Kocon
Dublin based. ODI since 2005 (Sunopsis). Reviewer of two ODI books. Blog.

14 ORCHESTRATING DWH PROCESSES
Orchestration of data process flow. Standard DWH process flow orchestration: packages in Oracle Data Integrator 10g, load plans in Oracle Data Integrator 11g. Process flow use cases: efficiency analysis. Alternative scheduling benefits.

15-19 TYPICAL DATA FLOW in DWH
Step 1, STAGE: E-LT data extract. Loads data from sources.
Step 2, DIMs: label. Provides structured labeling information.
Step 3, FACTS: consists of measurements, metrics or facts.
Data transport & transform units. Orchestration: ODI 10g Packages, ODI 11 Load Plans.

20-22 ORCHESTRATION: ODI PACKAGES
[Diagram labels]
Using objects directly: PKG_ABC (INT_A, PRC_B, INT_C) and PKG_DE (INT_D, INT_E).
Using scenarios (compiled code), SYNCHRONOUS: PKG_ABCDE (INT_A, PRC_B, INT_C, PKG_DE).
Using scenarios (compiled code), ASYNCHRONOUS: PKG_ABCDE (INT_A, PRC_B, INT_C, PKG_DE).

23-25 ODI 10g vs. ODI 11
[Diagram: the STAGE, DIMs and FACTS flow built two ways. ODI 10g Packages: PKG_DM, PKG_ABC, PKG_DE, PKG_FG, with steps labelled INT_A, INT_C, INT_F, PRC_B, PRC_D, PRC_G, INT_C. ODI 11 Load plans: steps A, B, C, D, E, F, G. Same effect!]

26-27 PROCESS FLOW EFFICIENCY ANALYSIS
Standard flow orchestration: Stage, (stop), DIMs, (stop), Facts.
[Diagram: processes A to G, each with a duration of 30 or 10, sequential between stages and parallel within a stage. Total = 70.]
DOWNSIDES: POSSIBLE INEFFICIENCIES (IDLE RESOURCES)

28-29 PROCESS FLOW EFFICIENCY ANALYSIS
OPTIMIZATION ATTEMPT
[Diagram: processes A to G, each with a duration of 30 or 10, rearranged into parallel chains. Total = 50.]
70 → 50 = 1.4 times quicker!
UPSIDE: EFFICIENCY IMPROVED
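The 70 and 50 totals can be reproduced with a short calculation. The per-process durations and the chain layout below are assumptions read off the flattened diagram (A = 30, E = 30, all others 10; chains A > D > F and B, C > E > G); only the totals 70 and 50 and the factor 1.4 are stated on the slides.

```python
# Assumed durations, inferred from the garbled diagram (not confirmed).
durations = {"A": 30, "B": 10, "C": 10, "D": 10, "E": 30, "F": 10, "G": 10}

# Standard flow: stages run one after another, processes within a
# stage run in parallel, so each stage costs its slowest process.
stages = [["A", "B", "C"], ["D", "E"], ["F", "G"]]
staged_total = sum(max(durations[p] for p in stage) for stage in stages)

# Optimization attempt: independent chains run side by side, so the
# total is the duration of the longest chain (assumed chain layout).
chains = [["A", "D", "F"], ["B", "E", "G"], ["C", "E", "G"]]
optimised_total = max(sum(durations[p] for p in chain) for chain in chains)

speedup = staged_total / optimised_total
print(staged_total, optimised_total, round(speedup, 1))
```

Under these assumptions the staged flow wastes time because D cannot start until the whole stage has finished, even though it only needs A.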

30 ADVANCED Data Flow example

31 Enterprise DWH Data Flow example

32 Enterprise DWH Data Flow example

33 PROCESS FLOW EFFICIENCY ANALYSIS
OPTIMIZATION ATTEMPT
[Diagram: processes A to G, each with a duration of 30 or 10, rearranged into parallel chains. Total = 50.]
70 → 50 = 1.4 times quicker!
UPSIDE: EFFICIENCY IMPROVED
DOWNSIDES: TIMINGS KNOWLEDGE REQUIRED. OVERALL DEPENDENCY KNOWLEDGE REQUIRED.

34 PROCESS FLOW EFFICIENCY ANALYSIS
OPTIMIZATION ATTEMPT
[Diagram: processes A to G, each with a duration of 30 or 10, in the standard sequential/parallel flow. Total = 70.]
DOWNSIDE: INEFFICIENCY EXISTS BUT CAN’T BE RESOLVED. CONSUMER WAITING & IMPACT.

35-36 Traditional Scheduling: limitations (SCHEDULER)
Possible inefficiencies (idle resources). Timings knowledge required. Overall dependency knowledge required. Inefficiency exists but can’t be resolved. Consumer waiting & impact.

37-38 DEPENDENCY DRIVEN Scheduling
[Diagram: processes A, B, C, D and E shown in several different execution orders. Label: PACKAGES & LOAD PLANS.]

39-40 PROCESS FLOW EFFICIENCY ANALYSIS
[Diagram: processes A to G, each with a duration of 30 or 10. Standard sequential/parallel flow total = 70, compared with a second arrangement with total = 30.]
70 → 30 = 2.3 times faster!

41 Dependency Driven Scheduling
Simplifies orchestrating the flow: only immediate upstream definition required, execution timings not relevant, self-adapts in the most effective way.
Improves overall E-LT performance: less idle resources, better utilization.
Independency unveils its full potential in complex Enterprise class DWHs (Inmon).
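A minimal sketch of the idea, not the presenters' scheduler: each process declares only its immediate upstream processes and starts as soon as all of them have finished. The process names, durations and dependencies are hypothetical, and unlimited parallel capacity is assumed.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def finish_times(durations, upstream):
    """Return the earliest finish time of every process.

    `upstream` maps each process to its immediate predecessors only;
    no stage boundaries or expected timings are declared anywhere.
    """
    finish = {}
    # static_order() yields predecessors before their dependants.
    for name in TopologicalSorter(upstream).static_order():
        start = max((finish[u] for u in upstream.get(name, ())), default=0)
        finish[name] = start + durations[name]
    return finish


# Hypothetical example data.
durations = {"STG_A": 30, "STG_B": 10, "DIM_C": 10, "DIM_D": 30, "FACT_E": 10}
upstream = {
    "STG_A": [],
    "STG_B": [],
    "DIM_C": ["STG_A"],
    "DIM_D": ["STG_B"],
    "FACT_E": ["DIM_C", "DIM_D"],
}

finish = finish_times(durations, upstream)
makespan = max(finish.values())
print(finish, makespan)
```

In this example DIM_D starts at time 10, as soon as STG_B is done, instead of waiting for the slower STG_A to close out the stage. If a duration changes, the start times adapt without any change to the definitions.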

42 Dependency Driven Scheduling
Notifications: errors (+auto-restartability), finish summary, logging.
Multiple/overlapping E-LT streams: load with different frequencies.
Parameterization: improved system stress control, process prioritization.

43-47 FIRST RUN: 10 processes.
TODAY: 584 processes, 1389 dependencies, 132 231 scenarios run.
TIME: load plans 12h43m, scheduler 4h21m. 2.9 times faster.
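The 2.9 factor follows directly from the two run times quoted on the slide, as this short check shows (variable names are illustrative):

```python
from datetime import timedelta

# Run times as quoted on the slide.
load_plans = timedelta(hours=12, minutes=43)
scheduler = timedelta(hours=4, minutes=21)

# Dividing one timedelta by another gives a plain float ratio.
speedup = load_plans / scheduler
print(round(speedup, 1))
```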

48 Enterprise DWH Data Flow

49

50 Release 1.0

51 Release 2.0 TST

52 TESTING Release 2.0

53 Deploy Release 2.0 PRD

54 The Hot fix SITUATION

55 Release frequently

56 CI environment

57 CI environment

58 The build master

59 AUTOMATE Stuff

60 ODI vs. Source control

61 ODI structure

62 Beyond intra build Dependencies

