Download presentation
Presentation is loading. Please wait.
Published bySilvester Damon Barnett Modified over 5 years ago
1
What Makes SSIS Tick? A Look at Internals and Performance
By Ravi Kumar @SQLRavi
2
My Presentation Style Fun Engaging Talk about off topic things
Make sure to Put positives and Negatives in Review.
3
Why Present at a users group?
Good networking Free knowledge Geek out
4
Why am I doing this presentation?
Because Chris told me to…
5
Agenda Why Internals? Control Flow Data Flow Performance Tips
Tips to make SSIS life easy (If time is left). Two new features in 2016 (if time is left).
6
Why internals? Package is taking 30mins to run. Why?
7
What is a package? Basic XML file
8
Two Main parts of SSIS Control Flow Data Flow
9
Use the Data Flow Luke!!
10
What is a Data Flow? The apples example… Need 5 volunteers
11
Data Flow Data Flow Engine Layout Management Buffer Manager
Sources Destinations Transformation Layout Management Buffer Manager Execution Control
12
Data Flow Engine: Layout Management
Relationship between data flow items/components It’s stored in XML <DTS:ObjectData> subnode
13
Data Flow engine: Buffers!!
Types of buffers Physical Virtual Private Flat
14
Data Flow engine: Buffers!!
Physical Buffers: Not same as SQL Server Engine buffers Physical Memory with Rows and column dynamically size Looks like real table Class exercise: draw it out.
15
Data Flow engine: Buffers!!
Virtual Buffers: Subset of physical buffers Class Drawing (Draw it). Examples: Derived column and conditional split.
16
Data Flow engine: Buffers!!
Private Buffers: Physical structure but owned by a component in Data Flow Example: Sort component.
17
Data Flow engine: Buffers!!
Flat Buffers: Physical memory but without any structure. Example: Lookup transformation.
18
Data Flow engine: Buffers!!
During ETL, less data is more performance Use only columns needed Use small/narrow data types Char instead of varchar Varchar instead of nvarchar if possible. Try to do data conversion at the source, not add columns RunInOptimizedMode – not available in SSDT.
19
Data Flow engine: Buffers!!
Buffer Sizes How are buffer sized DefautBufferSize: Default 10MB, Max 100MB DefaultBufferMaxRows Calculation RowSize = Data*1024/NoOfRows BufferSize= DefaultBufferSize(Bytes)/RowSize Turn on BufferSizeTuning Logging Draw out how they are calculated
20
Data Flow engine: Buffers!!
Blob Data – The love-hate relationship Varchar(max)/Nvarchar(max)/Varbinary(max) DT_TEXT, DT_NTEXT, DT_IMAGE Share buffers in half Beware of data writing to disk due to memory constraints Maybe because SQL Server is running on same box.
21
Data Flow Engine: Execution Control
Controls how and when sources, transformation and destinations are executed. Execution Trees – Start and end with buffers Thread allotment Synchronous transformation start and end new execution trees. Buffers are reused when no longer used. Max 5 buffers per execution tree More than one thread can be assigned to a execution tree.
22
Data Flow Engine: Execution Control
Backpressure Enhanced backpressure (introduced in 2012)
23
Data Flow: Transformations
Blocking Semi-Blocking Non-Blocking Class exercise (apple time again) Note: buffers don’t move.
24
Data Flow: Transformations
Non-Blocking transformations Semi-blocking transformations Blocking transformations Audit Data Mining Query Aggregate Character Map Merge Fuzzy Grouping Conditional Split Merge Join Fuzzy Lookup Copy Column Pivot Row Sampling Data Conversion Unpivot Sort Derived Column Term Lookup Term Extraction Lookup Union All Multicast Percent Sampling Row Count Script Component Export Column Import Column Slowly Changing Dimension OLE DB Command
25
Data Flow: Output Types
Synchronous – buffers are pushed downstream quickly. Non blocking transformation. All input types (destinations). Asynchronous – buffers are held for processing. Semi-Blocking and blocking transformations. Different buffers between input and output.
26
Data Flow: Class exercise
Apple time again Note: buffers don’t move.
27
Two Main parts of SSIS Control Flow Data Flow
28
Control Flow Load Apply Parameters Validate Execute
29
Control Flow Load Apply Parameters Validate Execute Read XML Decrypt
Check Version Load Objects Apply configuration and expression
30
Control Flow Load Apply Parameters Validate Execute
Used if Project Deployment model is used
31
Control Flow Load Apply Parameters Validate Execute
Validate the package in a tree like structure Root > Contrainers > Sibling Nodes > Task Every container and task validates itself Delay validation comes in play here
32
Control Flow Load Apply Parameters Validate Execute Action time.
Scheduling Ready queue Pending queue Completed queue Class Drawing (Explain Queue). Constrains come into play MaxConcurrentExecutables: Default = -1, cores + 2 How many tasks can run at one time
33
Performance Tips Try not doing Merge and Sorts
Keep events handlers to a minimum Leave the logging to SSIS Catalog Try Performance logging in SSIS Catalog rather than normal. Don’t do Select *
34
Tips to make SSIS life easy
Buy standard extended package for SSIS Implement Standards Centralize ETL Packages Plan them out on whiteboard.
Similar presentations
© 2025 SlidePlayer.com Inc.
All rights reserved.