AGENDA ODI Performance ODI Scheduling ODI Deployment/Release.

Presentation on theme: "AGENDA ODI Performance ODI Scheduling ODI Deployment/Release."— Presentation transcript:

1 AGENDA ODI Performance ODI Scheduling ODI Deployment/Release

2 ULI BETHKE: Dublin based; Blog; ODI 2007; Reviewer two ODI books; ODI articles OTN; Deputy chair OUG BI SIG (next event 11th June); ODI advanced trainer

3 ODI PERFORMANCE ODI is a metadata driven (SQL) code generator using code templates (knowledge modules). It uses a Java agent to communicate and send data between source and target systems and the repository over the network.

4 SQL -> 80%: ODI performance issues = SQL issues => SQL main ODI skill
- Perfect your SQL. Advanced SQL. Analytic Functions
- Know your database(s) inside out. In particular the target
- Understand, write, and modify Knowledge Modules

5 AGENT
- Lightweight Java based application
- Tied to host OS
- Generates code based on ODI metadata
- Communicates with source, target, repository
- JDBC data transport
- XML
- Jetty
- Interpreters: Jython, JBS, JavaScript, Groovy
- HSQLDB in-memory database
- Scheduler
- Sizing

6 AGENT
Target
- Least amount of roundtrips. Network (JDBC, XML)
- One target database server only (DW)
Another Server
- ODBC drivers
- JEE agent on WebLogic
- No support for target OS
- Resources on target
- DBA

7 INTERFACES
- No!! KM using row by row processing
- Use ODI functions rather than DB functions
- Don't overuse CKM (especially for large data volumes)
- Temp indexes (I$)
- Gather statistics (C$, I$, TGT when applicable)
- Rule of thumb: Use loader KMs or db link KMs rather than JDBC KMs
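The warning against row-by-row processing can be illustrated outside ODI. The following is a minimal sketch, not a Knowledge Module: it assumes SQLite as a stand-in for the target database, and the tables `src`, `tgt_rbr` and `tgt_set` are hypothetical. It loads the same data once with one INSERT per row and once with a single set-based INSERT ... SELECT, and counts the statements each approach issues.

```python
import sqlite3


def load_row_by_row(conn):
    """Fetch every source row into the client, then insert it individually."""
    cur = conn.cursor()
    rows = cur.execute("SELECT id, amount FROM src").fetchall()
    for row_id, amount in rows:
        cur.execute(
            "INSERT INTO tgt_rbr (id, amount_x2) VALUES (?, ?)",
            (row_id, amount * 2),
        )
    conn.commit()
    return len(rows)  # number of INSERT statements issued


def load_set_based(conn):
    """One INSERT ... SELECT: the database transforms and loads in one statement."""
    conn.execute(
        "INSERT INTO tgt_set (id, amount_x2) SELECT id, amount * 2 FROM src"
    )
    conn.commit()
    return 1  # a single statement, regardless of row count


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE src (id INTEGER PRIMARY KEY, amount INTEGER);
        CREATE TABLE tgt_rbr (id INTEGER, amount_x2 INTEGER);
        CREATE TABLE tgt_set (id INTEGER, amount_x2 INTEGER);
        """
    )
    conn.executemany(
        "INSERT INTO src (id, amount) VALUES (?, ?)",
        [(i, i * 10) for i in range(1, 1001)],
    )
    statements_rbr = load_row_by_row(conn)
    statements_set = load_set_based(conn)
    # Both targets must end up with identical content.
    same = (
        conn.execute("SELECT id, amount_x2 FROM tgt_rbr ORDER BY id").fetchall()
        == conn.execute("SELECT id, amount_x2 FROM tgt_set ORDER BY id").fetchall()
    )
    conn.close()
    return statements_rbr, statements_set, same
```

Both loads produce the same target rows; the difference is that the row-by-row variant issues one statement per source row, which on a networked database also means one roundtrip per row.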

8 SOURCE/TARGET
- Schemas on same database server. Physical schema and not data server.
- Have sources physically close to target
- Minimize impact on source
- Chunking

9 CRITICAL PATH
Network paths and path durations:
B>E>H: 6+2+11=19
B>D>F: 6+4+14=24
B>D>G: 6+4+10=20
A>C>G: 9+8+10=27 (critical path)
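The critical path is the longest-duration path through the network. A minimal sketch of that calculation follows. The per-process durations are read off the sums on the slide (for example B=6, D=4, G=10), and the edges are assumed to be exactly those implied by the four listed paths; the function names are illustrative.

```python
from functools import lru_cache

# Durations per process, taken from the path sums on the slide.
DURATION = {"A": 9, "B": 6, "C": 8, "D": 4, "E": 2, "F": 14, "G": 10, "H": 11}

# Assumed edges: only those implied by the four listed paths.
SUCCESSORS = {
    "A": ["C"],
    "B": ["D", "E"],
    "C": ["G"],
    "D": ["F", "G"],
    "E": ["H"],
    "F": [],
    "G": [],
    "H": [],
}


@lru_cache(maxsize=None)
def longest_from(node):
    """Return (total duration, path) of the longest path starting at node."""
    best = (0, ())
    for nxt in SUCCESSORS[node]:
        best = max(best, longest_from(nxt))
    return (DURATION[node] + best[0], (node,) + best[1])


def critical_path():
    """Longest path over all processes that have no predecessor."""
    has_predecessor = {n for succ in SUCCESSORS.values() for n in succ}
    starts = set(DURATION) - has_predecessor
    return max(longest_from(s) for s in starts)
```

Calling `critical_path()` on this network returns the path A, C, G with a total duration of 27, matching the slide.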

10 MICRO TUNING JDBC drivers JVM Type 4 or 5 JDBC drivers (Data Direct) Array fetch size. DB packet size. Network packet size.

11 PERFORMANCE MONITORING ODI Log Data Mart Facts Dimensions Metrics Frontend

12 DBMS_SQLTUNE_UTIL0 dbms_sqltune_util0.sqltext_to_sqlid Link to Data Dictionary Tables

13 MACIEJ KOCON: Dublin based; ODI 2005 (Sunopsis); Reviewer two ODI books; Blog

14 ORCHESTRATING DWH PROCESSES
Orchestration of Data Process Flow
– Standard DWH Process flow orchestration
– Packages in Oracle Data Integrator 10g
– Load Plans in Oracle Data Integrator 11g
Process Flow use cases - efficiency analysis
Alternative scheduling – benefits

15-19 TYPICAL DATA FLOW in DWH (E-LT)
Step 1, STAGE (DATA EXTRACT): loads data from sources
Step 2, DIMs (LABEL): provides structured labeling information
Step 3, FACTS: consists of measurements, metrics or facts
Data transport & transform units
Orchestration: ODI 10g Packages, ODI 11 Load Plans

20-22 ORCHESTRATION – ODI PACKAGES
Using object directly
Using scenarios – compiled code: SYNCHRONOUS, ASYNCHRONOUS
Diagram objects: PKG_ABC, PKG_DE, PKG_ABCDE, INT_A, PRC_B, INT_C, INT_D, INT_E

23 ODI 10g vs. ODI 11 STAGE DIMs FACTS INT_C PRC_B INT_A PKG_ABC PRC_D INT_C PKG_DE PRC_G INT_F PKG_FG PKG_DM A B C D E F G ODI 10g Packages

24 ODI 10g vs. ODI 11 STAGE DIMs FACTS INT_C PRC_B INT_A PKG_ABC PRC_D INT_C PKG_DE PRC_G INT_F PKG_FG PKG_DM ODI 11 Load plans ODI 10g Packages

25 ODI 10g vs. ODI 11 STAGE DIMs FACTS INT_C PRC_B INT_A PKG_ABC PRC_D INT_C PKG_DE PRC_G INT_F PKG_FG PKG_DM ODI 10g Packages ODI 11 Load plans A B C D E F G SAME EFFECT !

26 PROCESS FLOW EFFICIENCY ANALYSIS ABCD E F G sequential parallel = 70 A 30 B 10 C D E 30 F 10 G Standard Flow Orchestration: Stage-(stop) DIMs-(stop) Facts

27 PROCESS FLOW EFFICIENCY ANALYSIS ABCD E F G sequential parallel = 70 A 30 B 10 C D E 30 F 10 G DOWNSIDES: POSSIBLE INEFFICIENCIES (IDLE RESOURCES)

28 PROCESS FLOW EFFICIENCY ANALYSIS A 30 B 10 C D E 30 F 10 G OPTIMIZATION ATTEMPT

29 PROCESS FLOW EFFICIENCY ANALYSIS A 30 B 10 C D E 30 F 10 G = 50 B C A D E F G sequential parallel OPTIMIZATION ATTEMPT = 1.4 times quicker! UPSIDE: EFFICIENCY IMPROVED

30 ADVANCED DATA FLOW EXAMPLE

31 ENTERPRISE DWH DATA FLOW EXAMPLE

32

33 PROCESS FLOW EFFICIENCY ANALYSIS A 30 B 10 C D E 30 F 10 G = 50 B C A D E F G sequential parallel OPTIMIZATION ATTEMPT = 1.4 times quicker! UPSIDE: EFFICIENCY IMPROVED DOWNSIDES: TIMINGS KNOWLEDGE REQUIRED, OVERALL DEPENDENCY KNOWLEDGE REQUIRED

34 PROCESS FLOW EFFICIENCY ANALYSIS ABCD E F G sequential parallel = 70 A 30 B 10 C D E 30 F 10 G OPTIMIZATION ATTEMPT DOWNSIDE: INEFFICIENCY EXISTS BUT CAN'T BE RESOLVED, CONSUMER WAITING & IMPACT 70

35 TRADITIONAL SCHEDULING - LIMITATIONS
- Possible inefficiencies (idle resources)
- Timings knowledge required
- Overall dependency knowledge required
- Inefficiency exists but can't be resolved
- Consumer waiting & impact

36 TRADITIONAL SCHEDULING - LIMITATIONS
- Possible inefficiencies (idle resources)
- Timings knowledge required
- Overall dependency knowledge required
- Inefficiency exists but can't be resolved
- Consumer waiting & impact
SCHEDULER

37 DEPENDENCY DRIVEN SCHEDULING (diagram of processes A, B, C, D, E)

38 PACKAGES & LOAD PLANS (diagram of processes A, B, C, D, E)

39 PROCESS FLOW EFFICIENCY ANALYSIS ABCD E F G sequential parallel = 70 A 30 B 10 C D E 30 F 10 G A 30 B 10 C D E 30 F 10 G

40 PROCESS FLOW EFFICIENCY ANALYSIS ABCD E F G sequential parallel = 70 A 30 B 10 C D E 30 F 10 G A 30 B 10 C D E 30 F 10 G = 2.3 times faster!

41 DEPENDENCY DRIVEN SCHEDULING
Simplifies orchestrating the flow
– only immediate upstream definition required
– execution timings not relevant
– self-adapts in the most effective way
Improves overall E-LT performance
– Less idle resources
– better utilization
– Independency
– unveils its full potential in complex Enterprise class DWHs (Inmon)
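The difference between stage-by-stage orchestration (stop after each stage) and dependency driven scheduling can be sketched with a small simulation. Everything in this example is an assumption made for illustration: the process names, durations and upstream links are hypothetical and are not the figures from the slides, and unlimited parallel sessions are assumed. Each process declares only its immediate upstream processes.

```python
# Hypothetical processes: name -> (stage, duration in minutes, immediate upstream)
PROCESSES = {
    "A": ("STAGE", 30, []),
    "B": ("STAGE", 10, []),
    "C": ("DIMS", 30, ["B"]),
    "D": ("DIMS", 10, ["A"]),
    "E": ("FACTS", 10, ["C"]),
    "F": ("FACTS", 30, ["D"]),
}
STAGE_ORDER = ["STAGE", "DIMS", "FACTS"]


def stage_barrier_makespan(processes):
    """Each stage starts only after every process of the previous stage ends."""
    total = 0
    for stage in STAGE_ORDER:
        total += max(d for s, d, _ in processes.values() if s == stage)
    return total


def dependency_driven_makespan(processes):
    """Each process starts as soon as its immediate upstream processes end."""
    finish = {}

    def finish_time(name):
        if name not in finish:
            _, duration, upstream = processes[name]
            start = max((finish_time(u) for u in upstream), default=0)
            finish[name] = start + duration
        return finish[name]

    return max(finish_time(name) for name in processes)
```

With these made-up numbers the stage-by-stage run takes 90 minutes and the dependency driven run takes 70, because C no longer waits for A and F no longer waits for C. No timing knowledge is encoded anywhere: the schedule follows from the upstream definitions alone.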

42 DEPENDENCY DRIVEN SCHEDULING
Notifications
– errors (+auto-restartability)
– finish summary
– logging
Multiple/overlapping E-LT streams
– load with different frequencies
Parameterization
– improved system stress control
– process prioritization

43 FIRST RUN 10 processes

44 FIRST RUN 10 processes TODAY 584 processes 1389 DEPENDENCIES

45 FIRST RUN 10 processes TODAY 584 processes SCENARIOS RUN 1389 DEPENDENCIES

46 FIRST RUN 10 processes TODAY 584 processes SCENARIOS RUN 1389 DEPENDENCIES 12h43m LOAD PLANS TIME

47 FIRST RUN 10 processes TODAY 584 processes SCENARIOS RUN 1389 DEPENDENCIES 12h43m LOAD PLANS 4h21m SCHEDULER TIME 2.9 TIMES FASTER
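The "2.9 times faster" figure can be checked from the two run times on the slide, reading 12h43m as the Load Plans run and 4h21m as the scheduler run. A short sketch of the arithmetic, with an illustrative helper name:

```python
def to_minutes(hours, minutes):
    """Convert an h/m run time to total minutes."""
    return hours * 60 + minutes


load_plans_minutes = to_minutes(12, 43)  # 763 minutes
scheduler_minutes = to_minutes(4, 21)    # 261 minutes
speedup = load_plans_minutes / scheduler_minutes
```

Rounded to one decimal place, `speedup` is 2.9, which agrees with the slide.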

48 ENTERPRISE DWH DATA FLOW

49

50 RELEASE 1.0

51 RELEASE 2.0 TST

52 TESTING RELEASE 2.0

53 DEPLOY RELEASE 2.0 PRD

54 THE HOT FIX SITUATION

55 RELEASE FREQUENTLY

56 CI ENVIRONMENT

57

58 THE BUILD MASTER

59 AUTOMATE STUFF

60 ODI VS. SOURCE CONTROL

61 ODI STRUCTURE

62 BEYOND INTRA BUILD DEPENDENCIES

