1 Real-World SSIS A Survival Guide Tim Mitchell

2 Real World SSIS: A Survival Guide (Tim Mitchell). What we’ll cover today: Lessons I’ve learned the hard way; Methodologies to solve real problems in SSIS; Tools to help out; Solutions for SQL 2012 as well as earlier versions; Demos

3 What we won’t cover: No intro to SSIS; Books Online

4 Housekeeping: Presentation materials; Lunch / breaks

5 Housekeeping: Let’s keep it informal; Ask questions

6 PSA: Community. Survival is easier in groups; Local user groups; Events (SQL Saturday, SQL Bits, PASS Summit); Online communities; Twitter (#sqlhelp)

7 About me: Business intelligence consultant; Group Principal, Linchpin People; SQL Server MVP; TimMitchell.net

9 Texas Dictionary. Whole mess: Bountiful amounts of something, usually referring to excess. More than one way to skin a cat: A pet-unfriendly phrase to indicate that there are usually multiple ways to solve the same problem. Y’all: A subgroup of the current group. All y’all: The whole of the current group.

10 Texas Dictionary. I tell you what: A statement of strong belief in the preceding statement. May also be used to agree with someone else’s statement. Bless [his/her/your] heart: A polite way to call someone an imbecile. Dadgummit: A (generally) socially acceptable replacement for other, less socially acceptable words.

11 Texas Dictionary. Plumb: In a complete state of something. See also: flat out. Yonder: Not here.

12 survival (noun): The state or fact of continuing to live or exist, typically in spite of an accident, ordeal, or difficult circumstances. Reference: Dictionary.com (http://dictionary.reference.com/browse/survival)

13 survival (noun): Survival is simply the state of existing. It’s just a small step above being dead. -- Me. Photo credit: Elvis Ripley (http://www.flickr.com/photos/elvisripley/ /). Used under Creative Commons license.

14 Elements of Survival. The dangers: The elements; Predators; Foolishness of fellow survivors; The unexpected

15 Elements of Survival. The dangers: Dirty data; Complex or poorly defined ETL requirements; Unexpected metadata changes; Unstable sources/destinations; Project managers

16 Elements of Survival. Means of survival: Common sense of self-preservation; Tools; Leaning on others; Learning from others’ mistakes

17 Elements of Survival. Means of survival: Best practices; Consistency; Document; Tools (buy/build); Community

18 Survival Tip #1: Plan to Fail

19 Planning to Fail

20 Planning to Fail. Data failures: Missing or offline sources; Changed metadata; Partial loads; Validation issues; Unexpected domain values

21 Not “If it happens…” but “When it happens…”

22 Planning to Fail. Planning for failure in the wild: Build your shelter before it rains; Layers; Leaves; Bread crumbs

23 Planning to Fail. Planning for failure, the ETL way: Be a pessimist! Fail gracefully; Capture error/warning data on failure; Build for restartability (where appropriate)

24 Failing Gracefully

25 Planning to Fail. Why graceful failure? Avoid leaving affected systems in an inconsistent state; Avoid repeating wholesale operations; Timely notifications to allow proper response from dev/admin staff

26 Planning to Fail. Graceful failures in SSIS. Control flow: Event handlers; Precedence constraints. Data flow: Error row redirection; Lookup failure redirection; Conditional split
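The idea behind error row redirection can be sketched outside SSIS. The Python below is only an illustration of the pattern, not SSIS code: the function names (`split_rows`, `to_order`) and the sample columns are hypothetical. Rows that fail conversion are diverted to a separate error output, with the reason attached, instead of failing the whole load.

```python
def split_rows(rows, convert):
    """Send each row to the good output or, on failure, to the error output."""
    good, errors = [], []
    for row in rows:
        try:
            good.append(convert(row))
        except (KeyError, ValueError) as exc:
            # Keep the original row and record why it was rejected.
            errors.append({**row, "error": f"{type(exc).__name__}: {exc}"})
    return good, errors


def to_order(row):
    """Hypothetical conversion step: raises on missing or non-numeric values."""
    return {"order_id": int(row["order_id"]), "amount": float(row["amount"])}


source = [
    {"order_id": "1", "amount": "19.99"},
    {"order_id": "x2", "amount": "5.00"},  # non-numeric key
    {"order_id": "3"},                      # missing column
]
loaded, triage = split_rows(source, to_order)
```

Here `loaded` holds the one clean row and `triage` holds the two rejected rows, which could then be written to a triage table for review.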

27 Planning to Fail. Graceful failures in SSIS. Restartability: SSIS Checkpoints; SSIS transactions; Both methods have shortcomings; Custom restartability can be an option
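One way to picture custom restartability is a small checkpoint file that records which steps have finished. This is a hedged Python sketch of that idea, not how SSIS checkpoints are implemented: `run_steps`, the step names, and the JSON file layout are all assumptions made for illustration.

```python
import json
import os
import tempfile


def run_steps(steps, checkpoint_path):
    """Run named steps in order, skipping any recorded as done by a prior run."""
    done = []
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as fh:
            done = json.load(fh)
    executed = []
    for name, action in steps:
        if name in done:
            continue
        action()  # an exception here leaves the checkpoint file in place
        done.append(name)
        executed.append(name)
        with open(checkpoint_path, "w") as fh:
            json.dump(done, fh)
    os.remove(checkpoint_path)  # clean finish: the next run starts from scratch
    return executed


attempts = {"load": 0}


def extract():
    pass


def load():
    attempts["load"] += 1
    if attempts["load"] == 1:
        raise RuntimeError("destination offline")  # simulated first-run failure


def publish():
    pass


steps = [("extract", extract), ("load", load), ("publish", publish)]
checkpoint = os.path.join(tempfile.mkdtemp(), "etl_checkpoint.json")

try:
    run_steps(steps, checkpoint)
except RuntimeError:
    pass  # first run stops at "load"; "extract" is already recorded

resumed = run_steps(steps, checkpoint)  # second run skips "extract"
```

The second call runs only `load` and `publish`. Whether skipping completed steps is safe depends on the steps themselves, which is why the slides qualify restartability with "where appropriate".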

28 Planning to Fail. Natural failures: Simply stop processing on error; Default behavior; In some cases, can be the right pattern

29 Demo: Designing for failure

30 Survival Tip #2: Take Notes

31 Take Notes. What to note? Trails, paths, and shortcuts; Water sources; Hazards; Enemy positions; Weather and wildlife patterns; Sunrise/sunset time

32 Take Notes. What to note? Success and failure of operations; Row counts; Run times; Validation information; Warnings

33 Take Notes. Why? Know what to expect; Plan for growth; Cover your assets

34 Take Notes. It’s all about the log: SSIS logging; SQL Server log; Custom logging

35 Take Notes. SSIS Package Logging: It’s already there; Easy to start; Flexible events and destinations; Can be unwieldy

36 Take Notes. SSIS Catalog Logging: Version 2012 only; Easiest to configure (design time or runtime); Least flexible

37 Take Notes. Custom Logging: Roll your own; Most difficult to set up; Infinitely flexible
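A roll-your-own log can be as small as one table that captures the items listed earlier (success or failure, row counts, run times, messages). The sketch below uses Python with the standard-library `sqlite3` module purely as a stand-in. The table name `etl_run_log`, its columns, and the helper names are hypothetical, and a real SSIS implementation would more likely use Execute SQL tasks or event handlers writing to SQL Server.

```python
import sqlite3
import time


def open_log(path=":memory:"):
    """Create the (hypothetical) run log table if it does not exist."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS etl_run_log ("
        " run_id INTEGER PRIMARY KEY, package TEXT, status TEXT,"
        " row_count INTEGER, seconds REAL, message TEXT)"
    )
    return conn


def logged_run(conn, package, work):
    """Run work(), which returns a row count, and record the outcome."""
    started = time.perf_counter()
    status, rows, message = "Succeeded", None, None
    try:
        rows = work()
    except Exception as exc:
        # This sketch records the failure and carries on; a real package
        # would probably also fail or notify someone at this point.
        status, message = "Failed", str(exc)
    seconds = time.perf_counter() - started
    conn.execute(
        "INSERT INTO etl_run_log (package, status, row_count, seconds, message)"
        " VALUES (?, ?, ?, ?, ?)",
        (package, status, rows, seconds, message),
    )
    conn.commit()
    return status


def failing_load():
    raise ValueError("unexpected domain value")


log = open_log()
logged_run(log, "LoadCustomers", lambda: 1250)
logged_run(log, "LoadOrders", failing_load)
history = log.execute(
    "SELECT package, status, row_count, message FROM etl_run_log ORDER BY run_id"
).fetchall()
```

Querying `history` over time gives the baseline the slides argue for: what a normal run looks like, how volumes grow, and evidence of what actually happened.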

38 Take Notes. Server/engine logging: SQL Engine error log; DMVs; Third party tools; Windows log; PerfMon

39 Demo: Take Notes

40 Survival Tip #3: Perform at your best

41 Perform at your Best

42 Perform at your Best. Soldier up! Recognize and avoid quicksand; React appropriately when you’re stuck; Know your environment

43 Perform at your Best. Soldier up! Isolate and eliminate the things that slow you down; Recognize design patterns that are detrimental to performance; Look *outside* SSIS (gasp!)

44 Perform at your Best. It’s not just SSIS: The majority of SSIS performance problems have nothing to do with SSIS; Limitations on sources and destinations

45 Perform at your Best. It’s not just SSIS: Don’t just ‘pass the buck’, but do consider other factors: SQL engine configuration, disk configuration, network speed/latency, physical machine capabilities

46 Perform at your Best. It’s not just SSIS: Proper query techniques for relational sources; Effective indexing for sources and destinations; Using OPTION (FAST )

47 Perform at your Best. Streamline your data flows: Transformations matter! Know the blocking properties of transformations

48 Perform at your Best. Streamline your data flows: Nonblocking transforms do not hold buffers (Derived Column, Conditional Split, Row Count)

49 Perform at your Best. Streamline your data flows: Partially blocking transforms will queue up buffers as needed (Merge Join, Lookup, Union All)

50 Perform at your Best. Streamline your data flows: Fully blocking transforms will not pass any data through until all of the data has been buffered at that transformation (Sort, Aggregate)
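The difference between nonblocking and fully blocking transforms shows up clearly with Python generators. This is an analogy only: SSIS moves buffers of rows rather than single values, and the function names here are made up for the sketch. A streaming transform hands each row on as soon as it arrives, while a sort has to read everything before it can emit anything.

```python
def source(rows, trace):
    """Yield rows one at a time, noting each read in the trace."""
    for row in rows:
        trace.append(f"read {row}")
        yield row


def derived_column(rows):
    """Nonblocking: each row is transformed and passed on immediately."""
    for row in rows:
        yield row * 10


def sort_rows(rows):
    """Fully blocking: sorted() must consume every row before yielding one."""
    yield from sorted(rows)


def destination(rows, trace):
    """Consume rows, noting each write in the trace."""
    for row in rows:
        trace.append(f"write {row}")


streaming_trace = []
destination(derived_column(source([3, 1, 2], streaming_trace)), streaming_trace)

blocking_trace = []
destination(sort_rows(source([3, 1, 2], blocking_trace)), blocking_trace)
```

In `streaming_trace` reads and writes alternate, while in `blocking_trace` all three reads happen before the first write. That gap is where memory pressure and stalled downstream components come from.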

51 Perform at your Best. Streamline your data flows: Be aware of memory use! LOB (large object) columns, such as [N]VARCHAR(MAX), will always spool to disk rather than staying in memory; Memory buffers may spill over to disk

52 Perform at your Best. Streamline your data flows: Manage your sources. Don’t use the table drop-down list; specify your query, including only the necessary columns. Be mindful of indexes when writing data retrieval queries

53 Perform at your Best. Streamline your data flows: Manage your destinations. Use FAST LOAD for SQL Server destinations; Index management (drop?)

54 Perform at your Best. Go Parallel! Parallel operations can yield faster data flows

55 Demo: Parallel data flow

56 Perform at your Best. Streamline your data flows: Using lookups. Pay attention to lookup cache mode: Full cache; Partial cache; No cache

57 Perform at your Best. Streamline your data flows: Using lookups. Two-phase lookup strategy: Commonly accessed data in full cache; Remaining data in a subsequent partial cache
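The two-phase lookup strategy can be modeled in a few lines of Python. This is a conceptual sketch under stated assumptions: in SSIS the two phases would typically be two Lookup transformations in sequence, whereas here a single hypothetical class, `TwoPhaseLookup`, holds a preloaded dictionary (the full cache) and an on-demand dictionary (the partial cache) in front of a slower reference source.

```python
class TwoPhaseLookup:
    """Full cache for common keys, then a partial cache for everything else."""

    def __init__(self, hot_rows, fetch_one):
        self.full_cache = dict(hot_rows)  # phase 1: loaded before rows flow
        self.partial_cache = {}           # phase 2: filled as misses occur
        self.fetch_one = fetch_one        # stand-in for a query to the source
        self.source_queries = 0

    def get(self, key):
        if key in self.full_cache:
            return self.full_cache[key]
        if key not in self.partial_cache:
            self.source_queries += 1
            self.partial_cache[key] = self.fetch_one(key)
        return self.partial_cache[key]


# Hypothetical reference data; only the two most common codes are preloaded.
reference_table = {"TX": "Texas", "OK": "Oklahoma", "AK": "Alaska", "HI": "Hawaii"}
lookup = TwoPhaseLookup({"TX": "Texas", "OK": "Oklahoma"}, reference_table.get)
names = [lookup.get(code) for code in ["TX", "AK", "TX", "AK", "OK", "ZZ", "ZZ"]]
```

Seven rows pass through, but the reference source is queried only twice (once for `AK`, once for the unmatched `ZZ`), because repeats are answered from one of the two caches.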

58 Perform at your Best. Streamline your data flows: Using lookups. Cache connection manager: Allows reuse of lookup information across data flows

59 Demo: Lookups

60 Survival Tip #4: Clean it up

61 Clean it up. The greatest danger is in the elements: Chances are that unsanitary conditions will kill you before a predator does (infection; spoiled food or water)

62 Clean it up. In ETL, the greatest dangers often lie in the small things; Like an infection, bad data can fester for a while until it’s too late; Caught early, problems with dirty data are more easily solved

63 What is dirty data? Types of dirty data: Data type mismatches; Domain violations; Semantic violations; Technical errors; Simple inaccuracies

64 What is dirty data? Data type mismatches: Non-numeric data in numeric fields; Decimal data in integer fields; Incorrect precision / rounding; Truncation

65 What is dirty data? Domain violations: Invalid dates; Incorrect addresses. Semantic violations: Data outside of a reasonable range (such as a person’s age in the thousands of years); Inconsistent use of NULL, blanks, and zeroes

66 What is dirty data? Technical errors: Improperly formatted dates; Out-of-alignment flat files; Too many/too few delimiters

67 What is dirty data? Simple inaccuracies: Misspellings; Duplications; Improper formatting (addresses, phone numbers); Case
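A few of these categories can be turned into simple per-row checks. The Python below is a minimal sketch, assuming string inputs and hypothetical column names (`name`, `age`, `birth_date`), an ISO date format, and an arbitrary age ceiling of 130. It is meant to show the shape of such checks, not a complete rule set.

```python
from datetime import datetime


def find_problems(row):
    """Return a list of dirty-data findings for one row of string values."""
    problems = []

    # Data type mismatch: non-numeric or decimal data in an integer field.
    try:
        age = int(row.get("age") or "")
    except (TypeError, ValueError):
        problems.append("type mismatch: age is not an integer")
        age = None

    # Semantic violation: a value outside any reasonable range.
    if age is not None and not 0 <= age <= 130:
        problems.append("semantic violation: age out of range")

    # Invalid or improperly formatted date (assumes YYYY-MM-DD).
    try:
        datetime.strptime(row.get("birth_date") or "", "%Y-%m-%d")
    except (TypeError, ValueError):
        problems.append("invalid date: birth_date")

    # Inconsistent blanks: treat empty and whitespace-only as missing.
    if not (row.get("name") or "").strip():
        problems.append("missing value: name")

    return problems


clean = find_problems({"name": "Ann", "age": "34", "birth_date": "1990-02-28"})
dirty = find_problems({"name": " ", "age": "3400", "birth_date": "1990-02-30"})
```

A row with an empty findings list moves on; anything else can be marked as suspect or written to triage along with its findings.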

68 What causes dirty data?

69 What is dirty data? Causes of dirty data. Internal: Unvalidated user input; Lack of proper database constraints and/or application logic. External: Import bad data from other systems; ETL errors

70 Now What?

71 Clean it up: Test your cleansing logic in stage/test/QA first; Cleanse directly in production; Don’t cleanse at all

72 Clean it up. What to do with unresolvable bad data? Delete; Update to NULL or unknown member; Mark as suspect; Write to triage; Stop the ETL

73 Data Cleansing in SSIS

74 Data Cleansing in SSIS. Tools of the trade: Native SSIS components; POTS (Plain Old Transact-SQL); SQL Server DQS

75 Data Cleansing in SSIS. SSIS Native Components: A versatile approach with more transformation options; A much better choice when data cleansing operations involve multiple and/or non-SQL Server data sources; Extensible through custom code; Third party add-ons

76 Data Cleansing in SSIS. SSIS Native Components: Precision tools include Lookup Transformation, Merge Join; Flexible/inexact cleansing through Conditional Split, Derived Column transformation, fuzzy tools

77 Data Cleansing in SSIS. Transact-SQL: Fast, simple, effective way to do some cleanup operations; Requires no additional software or configuration; Extensible through the use of UDFs or CLR functions

78 Data Cleansing in SSIS. Data Quality Services: A tool specifically designed for data cleansing; Has its own client interface, or can be used within SSIS for cleansing operations; Limited set of operations in SSIS

79 Demo: Data Cleansing

80 Survival Tip #5: The Swiss Army Knife

81 Swiss Army Knife: When unexpected situations arise, an all-purpose tool can literally be a lifesaver. Cut up small firewood; Can opener; Make a game trap

82 Swiss Army Knife. Scripting and coding tools: SSIS Expressions; Script task/script component; PowerShell

83 Swiss Army Knife. SSIS Expressions: Built into SSIS; Can be used in most any component or task; No extra moving parts required; Useful for declarative statements

84 Swiss Army Knife. Pros: Easy to get started (just start expressing yourself); Ubiquity; Relatively easy to use

85 Swiss Army Knife. Cons: Syntax is unique; Complex expressions are difficult; Troubleshooting

86 Swiss Army Knife. SSIS Scripting: .NET Framework; VB.NET or C#; Can use existing external assemblies

87 Swiss Army Knife. Pros: Swiss Army knife of SSIS; Works great for operations where native SSIS tasks/components can’t easily accomplish the goal; Does not require in-depth programming knowledge

88 Swiss Army Knife. Cons: Does require some familiarity with programming or scripting; Not as simple as native components; Performance (sometimes)

89 Swiss Army Knife. Script Task: Used in the Control Flow. Variety of uses: Interact with OS; Filesystem operations (archiving); Manipulate SSIS variables; Call external programs

90 Swiss Army Knife. Script Component: Data Flow pane; Data flow/manipulation. Used for: Data manipulation in the pipeline that can’t be accomplished otherwise; Advanced branching logic; Shred unconventional input files; Create custom output files

91 Swiss Army Knife. Script Component: Synchronous or asynchronous. Types: Source; Transformation; Destination

92 Swiss Army Knife. Semi-structured files: Nonlinear files; Multiple lines of text per output row; Varying number of columns; Dissimilar data types; “Record Type” format
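A "Record Type" file mixes several row layouts in one file, with a leading code saying which layout each line uses. The Python sketch below shows the kind of logic a script source would need. The pipe-delimited layout, the `H`/`D` record codes, and the column names are invented for the example, and in SSIS this logic would be written in C# or VB.NET inside a Script Component.

```python
def parse_record_type_file(lines):
    """Flatten header (H) and detail (D) records into one row per detail."""
    rows, header = [], None
    for line_no, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split("|")
        record_type = fields[0]
        if record_type == "H":
            # Header rows carry values shared by the detail rows that follow.
            header = {"order_id": fields[1], "customer": fields[2]}
        elif record_type == "D":
            if header is None:
                raise ValueError(f"line {line_no}: detail record before any header")
            rows.append({**header, "sku": fields[1], "qty": int(fields[2])})
        elif line.strip():
            raise ValueError(f"line {line_no}: unknown record type {record_type!r}")
    return rows


sample = ["H|1001|Acme", "D|WIDGET|2", "D|GADGET|1", "H|1002|Globex", "D|WIDGET|5"]
flat = parse_record_type_file(sample)
```

Five input lines become three output rows, each detail row carrying the order and customer from the header above it.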

93 Other Scripting Uses: Wait for a file or connection to be available; Set and enforce thresholds for maximum execution time; Custom logging; Custom notifications; Cross-package variable sharing; ?????
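Waiting for a file while enforcing a maximum wait is a simple polling loop. Here is a hedged Python sketch of the logic; the function name `wait_for_file` and its parameters are hypothetical, and an SSIS Script Task would express the same loop in C# or VB.NET.

```python
import os
import tempfile
import time


def wait_for_file(path, timeout_seconds, poll_seconds=0.05):
    """Poll until path exists; return False if the timeout passes first."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)


folder = tempfile.mkdtemp()
present = os.path.join(folder, "ready.csv")
open(present, "w").close()  # simulate a file that has already arrived

found = wait_for_file(present, timeout_seconds=1)
missing = wait_for_file(os.path.join(folder, "late.csv"), timeout_seconds=0.2)
```

Returning a flag rather than raising lets the caller decide whether a missing file should fail the package or just skip the load. Note that existence alone does not prove the file has finished being written.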

94 Demo: Expressions and scripting

95 Survival Tip #6: Know what’s coming next

96 Know what’s coming next: Survivors keep an eye on what to expect in the days/months/years ahead. Weather forecasts; Changing of seasons; Wildlife patterns

97 Know what’s coming next. Know the technical/business landscape: New versions of software; Emerging design patterns

98 What’s new in SSIS for SQL Server 2012

99 Logging Changes. Back in the day… Logging configured at the package level; Inconsistent; Difficult to add logging afterward

100 Logging Changes. … and now: Logging is configured at the server level (SSIS catalog); Can be added, changed, or removed at runtime

101 Logging Changes. … and now: Logging levels (Basic, Performance, Verbose, None); Native row count logging (and everyone said “Amen”); Logs to table in SSISDB

102 Logging Changes. … and now: Built-in reports; Included with SSMS; Detail and aggregate data

103 Undo/Redo. When I was your age… Package changes are immediate; Undo = close without saving

104 Undo/Redo. … and now: Full support of Undo and Redo in the designer

105 Package Parameters. Prior versions: Sharing of values between packages required the inheritance of parent package variables; Parent packages had no knowledge of expected variables in child packages

106 Package Parameters. Prior versions: There was no practical way to configure variables as required (other than failing the package)

107 Package Parameters. SQL Server 2012: Package parameters! Required or optional; Accessible through the Execute Package Task in parent packages

108 Package Parameters

109 DQS and SSIS. Then: Data quality routines were everywhere, but also completely manual; No standard means of implementation

110 DQS and SSIS. Now: SSIS has a transformation to leverage DQS (also new) for data cleansing operations; Consumes reusable knowledge base data for reliable, consistent cleansing

111 DQS and SSIS

112 Flat File Improvements. Old school: Irregularly shaped flat files could not be natively processed in SSIS; Scripting was usually required to process

113 Flat File Improvements. New school: New flat file connection allows native processing of files with missing columns

114 Flat File Improvements

115 Shared Data Sources. In days of yore: “Shared” connections meant configuring a connection in each package, and using package configs for the connection string; Still requires setting up and maintaining connections at the package level

116 Shared Data Sources. Here and now: Native shared connections allow SSIS projects to use connections common to the entire project; Package-level connections still supported

117 Shared Data Sources

118 Script Component Debugging. Remember when: MessageBox.Show()

119 Script Component Debugging. … and now: Integrated debugging in the script component; Step through code line by line to find issues and test

120 Demo: Script Component Debugging

121 Name-based metadata mapping. Then: Changing upstream components often causes runtime errors in downstream components; The longest 4-letter word in the English language: VS_NEEDSNEWMETADATA

122 Name-based metadata mapping. Now: Metadata mapping is based on name; Easier to remap upstream components

123 Demo: Name-based metadata mapping

124 CDC in SSIS. The old: CDC (Change Data Capture) was present in the DB engine, but required manual T-SQL coding to implement

125 CDC in SSIS. The new: SSIS now has a new task and new components to handle CDC processing. CDC Task: metadata (start/end initial load, etc.); CDC Source: retrieve CDC data; CDC Splitter: break apart results

126 Environments: Environments replace configurations; Collections of related values (ex: Production connection strings, Dev connection strings, etc.); Multiple environments can be associated with each project or package; Specify for automated job, or easily choose at runtime

127 Designer Improvements. Package annotations: In prior versions, annotations were difficult; SSIS 2012 improvements

128 Designer Improvements. Sort packages by name: Sometimes it’s the little things that matter

129 Designer Improvements: Simplified data viewer

130 Designer Improvements: Universal status indicators

131 Designer Improvements. Variable management: Scope default. Expression management: Static values vs. expression; Expression indicator

132 Survival Tip #7: Have a bag of tricks

133 Have a bag of tricks. Be lazy! Code once, reuse many; Create a portable system for reusing familiar patterns (Database? Documentation?)

134 Have a bag of tricks. Be lazy! ETL Framework: Managed execution for multipackage ETL processes; Restartability, consolidated error handling and logging

135 Have a bag of tricks. Be lazy! Custom SSIS components: Create custom components for commonly used design patterns; Parameterized script packages may substitute in SQL 2012

136 Have a bag of tricks. Be lazy! Third party tools: BIDS Helper; SSIS Reporting Pack; SQL Sentry Plan Explorer; Brent Ozar’s SQLBlitz

137 Have a bag of tricks. Be lazy! Biml: Business Intelligence Markup Language; Package generation tool; Included with BIDS Helper (free)

138 Demo: Biml package generation

139 Questions? Comments? Standing ovation?

140 Thanks!

