Performance Tuning ETL Process

Presentation on theme: "Performance Tuning ETL Process"— Presentation transcript:

1 Performance Tuning ETL Process
Mark McNeely

2 Test yourself: “Matching Game”

3 Component Matching Answers

4 Source Systems
Source systems: E-Business Suite R12, PeopleSoft Enterprise, Siebel CRM, JD Edwards
Stages: Extract, Staging, Transformation, Delivery, End-User

5 DAC ETL Scheduler

6 Source System Stats
What: system statistics capture information such as read times for single-block and multiblock reads, CPU speed, and other system throughput figures.
Why: before a query is executed, the optimizer calculates the cost of the query. Without stats, full-table scans and index scans are evaluated as equivalent. Remember to gather stats while the system is busy so that they reflect the real workload.

7 SQL Trace files
SQL trace files record: parse, execute, and fetch counts; CPU and elapsed times; physical reads and logical reads; number of rows processed; misses on the library cache; the username under which each parse occurred; each commit and rollback.

8 TKPROF You can run the TKPROF program to format the contents of the trace file and place the output into a readable output file.

9 Explain Plan
Explain Plan shows the sequence of operations performed in a SQL query. It tells you how tables are joined and which indexes are used.

10 SDE vs. SIL tasks

11 DAC Details

12 Informatica Workflow Manager

13 ETL Run

14

15 Informatica Workflow Monitor

16 Informatica Session Log

17 Session Log usage
Busy % = (Total Run Time - Total Idle Time) / Total Run Time × 100
If Busy % is high (> 70-80%) for the READER thread, review the Source Qualifier.
If Busy % is high (> 60-70%) for the TRANSF thread, review the transformation.
If Busy % is high for the WRITER thread, review the Bulk Mode setting.
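
The Busy % rule of thumb above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not part of Informatica: the function names are hypothetical, the thresholds use the lower bound of each range on the slide, and the WRITER threshold of 70 is an assumption because the slide only says "high".

```python
from typing import Optional

# Lower bounds of the ranges quoted on the slide. The WRITER value is an
# assumption: the slide only says "high" for that thread.
THRESHOLDS = {"READER": 70.0, "TRANSF": 60.0, "WRITER": 70.0}

ADVICE = {
    "READER": "review the Source Qualifier",
    "TRANSF": "review the transformation",
    "WRITER": "review the Bulk Mode setting",
}


def busy_percent(total_run_time: float, total_idle_time: float) -> float:
    """Busy % = (Total Run Time - Total Idle Time) / Total Run Time * 100."""
    if total_run_time <= 0:
        raise ValueError("total_run_time must be positive")
    return (total_run_time - total_idle_time) / total_run_time * 100.0


def review_hint(thread: str, busy: float) -> Optional[str]:
    """Return the slide's advice if the thread is busier than its threshold."""
    if busy > THRESHOLDS[thread]:
        return ADVICE[thread]
    return None


# Example: a reader thread that ran 100 s and was idle for 25 s.
reader_busy = busy_percent(100.0, 25.0)
print(reader_busy, review_hint("READER", reader_busy))
```

The run and idle times would be read by hand from the thread statistics at the end of the session log; the sketch does not parse the log itself.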

18 Hash Joins vs. Nested Loops
The optimizer often chooses nested loops because it estimates a lower cost for them. Nested loops do bring the initial rows back quicker, but for large volumes (over 10 million rows) use a USE_HASH hint to make the optimizer use a hash join instead. I’ve shaved a couple of hours off a poor performer this way.
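
To see why the two join methods behave differently at volume, here is a conceptual Python sketch of both. It is a simplification written for illustration, not how the database implements either join (a real nested loop join would normally probe an index on the inner table), and the function and field names are hypothetical.

```python
from collections import defaultdict


def nested_loop_join(outer, inner, key):
    """For each outer row, scan the inner rows for matches.

    The first matches come back immediately, but the total work grows
    with len(outer) * len(inner).
    """
    for outer_row in outer:
        for inner_row in inner:
            if outer_row[key] == inner_row[key]:
                yield {**outer_row, **inner_row}


def hash_join(build, probe, key):
    """Build a hash table on one input, then probe it once per row of the other.

    Nothing is returned until the build phase finishes, but the total
    work grows with len(build) + len(probe).
    """
    table = defaultdict(list)
    for build_row in build:
        table[build_row[key]].append(build_row)
    for probe_row in probe:
        for build_row in table.get(probe_row[key], ()):
            yield {**build_row, **probe_row}


customers = [{"cust_id": 1, "name": "A"}, {"cust_id": 2, "name": "B"}]
orders = [
    {"cust_id": 1, "order_id": 10},
    {"cust_id": 2, "order_id": 11},
    {"cust_id": 1, "order_id": 12},
]

# Both methods produce the same joined rows; only the amount of work differs.
for row in hash_join(customers, orders, "cust_id"):
    print(row)
```

With a handful of rows the difference is invisible, which is the point of the slide: the cost gap only matters once the inputs reach millions of rows.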

19 Partitioning Guidelines for large tables
Consider partitioning tables with more than 20 million rows. Choose a reasonable partition key, for example year. Two advantages: improved query performance and quicker ETL loads.
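
The query-performance benefit comes from reading only the partitions a query needs. The Python sketch below is an in-memory analogy of partitioning by year, under the assumption that each row carries a date field; the function names and the `order_date` field are hypothetical, and a real database manages partitions in storage rather than in a dictionary.

```python
from collections import defaultdict
from datetime import date


def partition_by_year(rows, date_field):
    """Group rows into one bucket per calendar year of date_field."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[date_field].year].append(row)
    return dict(partitions)


def rows_for_year(partitions, year):
    """Read only the partition for the requested year; others are never touched."""
    return partitions.get(year, [])


facts = [
    {"order_id": 1, "order_date": date(2009, 3, 14)},
    {"order_id": 2, "order_date": date(2010, 7, 1)},
    {"order_id": 3, "order_date": date(2010, 11, 30)},
]

partitions = partition_by_year(facts, "order_date")
print(len(rows_for_year(partitions, 2010)))
```

The same idea helps ETL loads: a load that only affects the current year can work against one partition instead of the whole table.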

20 Source System, Extract, Staging, Transformation, Delivery, End-User

