Performance Tuning ETL Process

Mark McNeely

Test yourself: “Matching Game”

Component Matching Answers

Diagram: Source Systems (E-Business Suite R12, PeopleSoft Enterprise, Siebel CRM, JD Edwards) -> Extract -> Staging -> Transformation -> Delivery -> End-User

DAC ETL Scheduler

Source System Stats
What: gathers important information such as read times for single-block and multiblock reads, CPU speed, and other system throughput figures.
Why: before a query is executed, the optimizer calculates the cost of the query. Without system stats, full-table scans and index scans are evaluated as equivalent.
Remember to gather stats while the system is busy, so that the figures reflect a realistic workload.

SQL Trace files
SQL Trace files record: parse, execute, and fetch counts; CPU and elapsed times; physical reads and logical reads; number of rows processed; misses on the library cache; the username under which each parse occurred; each commit and rollback.

TKPROF
Run the TKPROF program to format the contents of a trace file and write the result to a readable output file.

Explain Plan
Explain Plan shows the sequence of operations performed to execute a SQL query. It tells you how tables are joined and which indexes are used.
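To make the idea of reading a plan concrete, here is a minimal sketch that uses SQLite's EXPLAIN QUERY PLAN from Python as a stand-in. It is not Oracle's EXPLAIN PLAN and the output format differs; the table `orders`, the index `idx_orders_customer`, and the helper `plan_details` are hypothetical names chosen for illustration.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for a real warehouse.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
# Hypothetical index on the column we filter by.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")


def plan_details(sql, params=()):
    """Return the human-readable 'detail' text of each plan step."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The fourth column of each plan row holds the description of the step.
    return [row[3] for row in rows]


# Filter on the indexed column: the plan should mention the index.
indexed = plan_details("SELECT * FROM orders WHERE customer_id = ?", (42,))
# Filter on a non-indexed column: the plan falls back to scanning the table.
full_scan = plan_details("SELECT * FROM orders WHERE amount > ?", (100.0,))

for step in indexed + full_scan:
    print(step)
```

The same habit carries over to Oracle: check whether each step of the plan is an index access or a full scan before deciding what to tune.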

SDE vs. SIL tasks

DAC Details

Informatica Workflow Manager

ETL Run

Informatica Workflow Monitor

Informatica Session Log

Session Log usage
Busy % = (Total Run Time - Total Idle Time) / Total Run Time × 100
If Busy % is high (above 70–80%) for the READER thread, review the Source Qualifier.
If Busy % is high (above 60–70%) for the TRANSF thread, review the transformation.
If Busy % is high for the WRITER thread, review the Bulk Mode setting.
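The Busy % rule of thumb can be sketched as a small helper. This is a minimal illustration, assuming you have already read the run and idle times for each thread from the session log. The function names are hypothetical, the READER and TRANSF cut-offs use the lower end of the ranges above, and the WRITER cut-off of 70% is an assumption because the slide only says "high".

```python
# Thresholds in percent. READER and TRANSF use the lower bound of the
# ranges given on the slide; the WRITER value is an assumed placeholder.
THRESHOLDS = {"READER": 70.0, "TRANSF": 60.0, "WRITER": 70.0}

ADVICE = {
    "READER": "review the Source Qualifier",
    "TRANSF": "review the transformation",
    "WRITER": "review the Bulk Mode setting",
}


def busy_percent(total_run_time, total_idle_time):
    """Busy % = (Total Run Time - Total Idle Time) / Total Run Time * 100."""
    if total_run_time <= 0:
        raise ValueError("total_run_time must be positive")
    return (total_run_time - total_idle_time) / total_run_time * 100.0


def review_hint(thread, busy):
    """Return the suggested review step, or None if the thread is not busy."""
    if busy > THRESHOLDS[thread]:
        return ADVICE[thread]
    return None


# Example: a reader thread that ran 100 s and sat idle for 25 s.
reader_busy = busy_percent(100.0, 25.0)
print(reader_busy, review_hint("READER", reader_busy))
```

Whichever thread shows the highest Busy % is usually the bottleneck to look at first.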

Hash Joins vs. Nested Loops
The optimizer often chooses nested loops because it estimates a lower cost for them. Nested loops do bring the first rows back more quickly, but for large volumes (over 10 million rows) add a USE_HASH hint so that the optimizer uses a hash join instead. I’ve shaved a couple of hours off a poor performer this way.
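Why a hash join wins on large volumes can be seen in a toy version of the two algorithms. This is only a conceptual sketch in Python, not how the Oracle optimizer implements either join; the function names and sample rows are hypothetical.

```python
from collections import defaultdict


def nested_loop_join(outer, inner, key):
    """For every outer row, scan the inner rows: about len(outer) * len(inner) comparisons."""
    for o in outer:
        for i in inner:
            if o[key] == i[key]:
                yield {**o, **i}


def hash_join(build, probe, key):
    """Build a hash table on one input once, then probe it once per row of the other."""
    table = defaultdict(list)
    for b in build:
        table[b[key]].append(b)
    for p in probe:
        for b in table.get(p[key], []):
            yield {**b, **p}


customers = [{"cust_id": 1, "name": "A"}, {"cust_id": 2, "name": "B"}]
orders = [
    {"order_id": 10, "cust_id": 1},
    {"order_id": 11, "cust_id": 2},
    {"order_id": 12, "cust_id": 1},
    {"order_id": 13, "cust_id": 3},  # no matching customer, so it is dropped
]

nl_rows = sorted(nested_loop_join(orders, customers, "cust_id"),
                 key=lambda r: r["order_id"])
hj_rows = sorted(hash_join(customers, orders, "cust_id"),
                 key=lambda r: r["order_id"])
print(nl_rows == hj_rows, len(hj_rows))
```

Both functions return the same rows. The nested loop can emit its first match immediately, while the hash join pays for building the table up front and then reads each input only once, which is what matters at tens of millions of rows. In Oracle SQL the hint is written as a comment of the form `/*+ USE_HASH(alias) */` directly after the SELECT keyword.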

Partitioning Guidelines for large tables
Consider partitioning tables with more than 20 million rows. Find a reasonable partition key, for example year. Two advantages: improved query performance and quicker ETL loads.
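The benefit of partitioning by year can be illustrated with a toy model. This is a conceptual sketch in Python, not database DDL; the function names, the `sale_date` column, and the sample rows are hypothetical.

```python
from collections import defaultdict
from datetime import date


def partition_by_year(rows, date_key):
    """Group rows into one bucket per calendar year (a stand-in for range partitions)."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[date_key].year].append(row)
    return parts


def rows_for_year(parts, year):
    """Read only the partition for the requested year instead of every row."""
    return parts.get(year, [])


sales = [
    {"sale_id": 1, "sale_date": date(2011, 3, 1)},
    {"sale_id": 2, "sale_date": date(2012, 7, 15)},
    {"sale_id": 3, "sale_date": date(2012, 11, 30)},
]

partitions = partition_by_year(sales, "sale_date")
print(sorted(partitions), len(rows_for_year(partitions, 2012)))
```

A query restricted to one year touches a single bucket, and an incremental ETL load for the current year leaves the older buckets alone. Those are the two advantages named above.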

Diagram: Source System -> Extract -> Staging -> Transformation -> Delivery -> End-User
