Presentation is loading. Please wait.

Presentation is loading. Please wait.

SSIS for Absolute Beginners

Similar presentations


Presentation on theme: "SSIS for Absolute Beginners"— Presentation transcript:

1 SSIS for Absolute Beginners
August 20, 2015 SSIS for Absolute Beginners by George SQUILLACE “SQL” is in the name!

2 Speaker Background Husband, Dad MCT MCSE SQL Server 2012 MCSA
22 years of service at New Horizons 96 testing events MCT since 1997 SQLSaturday speaker Microsoft Virtual Academy presenter Husband, Dad MCT MCSE SQL Server 2012 MCSA MCSE: Data Platform MCSE: Business Intelligence Certified in every version of SQL Server from > current Certified in every version of Windows Server from NT 4.0 -> Server 2008 Almost 30 years in I.T.! For fun: (no particular order) Reading Bicycling (road /city / trail) Travel Music / Acoustic Guitar Computers Would love to learn how to fly Would love to learn how to surf

3 Agenda What is “ETL” ? Why Is ETL Needed? What is “SSIS”?
Tour of the Development Environment Control Flow Basics Data Flow Basics Debugging & Error Handling Basics Deployment Basics

4 What is “ETL”, and Why Is It Needed?

5 What is “ETL”? Extract, Transform, & Load
Connect to data sources for Extraction Connect to data destinations for Loading Optional: Transform/process data as required ETL is very frequently required within an organization Oh, and sometimes an E-L-T process is used instead 

6 Why is “ETL” Needed? Import and export data
Working with trading partners Combine data from multiple sources Homogeneous sources (all SQL Servers) Heterogeneous sources Make data available to applications other than the source Make data available for people Combining data from various sources for analysis Create a data warehouse Cleanse data Standardize data A DW is a foundation for a variety of analyses, such as…. interactive analysis with OLAP cubes, Tabular and other data models Reporting What is “The Big Lie”? What is “The Big Lie”? Answer: Believing your data is “clean” and doesn’t need much processing.

7 What is SQL Server Integration Services, or “SSIS”?
Predecessor to SSIS: DTS (Data Transformation Services) Describe why I placed the Mouse Trap picture; note YouTube video link.

8 What is SSIS? ETL Execution Platform ETL Development Environment
Control Flow Engine Data Flow Engine Unit of Execution: “Package” ETL Development Environment Extensible Includes: Wizards Import/Export Data Wizard Package Migration Wizard Package Configuration Wizard Package Deployment Wizard Command Line Executables DTExec DTExecUI DTUtil How does SSIS relate to SQL Server? How does SSIS relate to SQL Server? SSIS is an installable shared feature.

9 Why SSIS? There are competitors…
Commercial Pretty much every database engine platform has a “data integration” feature Oracle IBM SAP Informatica Pentaho Open source Talend CloverETL SSIS is included in the SQL Server product (depending on the Edition) Mature feature set (originally introduced in SQL Server 2005) Enterprise class performance “We loaded 1TB in 30 Minutes with SSIS, and So Can You” Huge support community Websites and help forums Blogs Books BI-specific PASS chapters Software tools Why SSIS? Included in the product (with the correct edition) Mature technology, initially introduced in SQL 2005 Huge support community (BIDN, MSSQLTips, SQLServerCentral, zillion blogs), BI-specific PASS chapters Free software tools Many reference books

10 Development Environment

11 Visual Studio (SSDT for BI) Basics
Key Terms: Solution Project Template

12 What is the SSIS Development Environment?
Visual Studio Shell + Business Intelligence Development Templates Named: SQL Server Data Tools for BI (SSDT-BI) Installation method Licensing SQL Server 2012 comes with the Visual Studio 2010 Shell Can upgrade to SSDT for BI – VS 2012 or VS 2013 Ironically, there is no SSDT (yet) that ships with SQL Server 2014 Can download SSDT for BI SSDT was formerly nicknamed “BIDS”, derived from “Business Intelligence Development Studio”. General assumption: SSIS is a rich environment and can get the job done.

13 Tour of the “Dev” Environment
Note: My demos use Visual Studio 2012 Demo: Launch SSDT Demo: Create a new SSIS Project Demo: Tour the various window panes

14 Control Flow Basics When you think of “Control Flow” think of the terms “workflow” (what, and in what order), and, “general contractor”.

15 Control Flow Elements Tasks Containers Precedence Constraints

16 Frequently Used Control Flow Tasks
Data Profiling Task “hunt for treasure and landmines” Execute SQL Task File System Task Execute Process Task Send Mail Task Execute Package Task Script Task Data Flow Task The “star of the opera” Demo each of these Tasks. Talk about Connection Managers, Project v. Package. Talk about Task naming. Talk about annotations (“Leave a trail”). We’ll look at the Data Flow Task later. Script Task usage: “If I can’t get the job done any other way”.)

17 Control Flow Containers
Why Containers? Containing / organizing Executable unit within the package Can be enabled/disabled Two of the three containers provide Looping Transaction protection context Think “all, or nothing” Checkpoint context Restart point Three Kinds Sequence Container For Loop Container For Each Loop Container Demo: Place ExecSQL, FST, and ExecProcess Tasks in a Sequence Container. Demo: Hide/show container contents Demo: Disable/enable container. Demo: Execute only the container.

18 Precedence Constraints
Precedence Constraints define: Workflow order Workflow conditions Downstream Task & Container Execution can be based on: Constraint evaluation Success Failure Completion Expression Expression and Constraint Expression or Constraint Multiple Constraint evaluation Logical AND Logical OR

19 Data Flow Basics

20 Data Flow Basics Various ways to accomplish ETL in SSIS
BULK INSERT Task Execute SQL Task bcp.exe Command line utility for importing and exporting text files Most typically used ETL tool in SSIS is the Data Flow Task The “DFT” defines a “pipeline” At least one source At least one destination Optional: one or more transformations Demo: importing SuperBowl Excel file into the SQLSat_SSIS_DemoDB.dbo.SuperBowlData (already exists).

21 Data Flow Sources Database Engines File sources
OLE-DB Source ADO .NET Source ODBC File sources Flat file Excel file Raw file XML file Others (pretty much anything)

22 Data Flow Destinations
Database Engines OLE-DB destination ADO .NET destination File destinations Flat file Excel file Raw file Other destinations ODBC destination Recordset destination

23 Data Flow Transformations
Row transformations Character map Copy column Derived column Import column / Export column Rowset transformations Aggregate Sort Pivot and Unpivot Row sampling and Precentage sampling Split & join transformations Conditional Split Lookup Merge Merge Join Multicast Union All Auditing transformations Audit Row count BI transformations Slowly changing dimension Fuzzy grouping Fuzzy lookup Data mining query Custom transformations Script component Demo: Conditional Split Transform, ScoreDiff column > 14.0 (“BigWin”; default output “NotABigWin”) Demo: Derived Column Transform, WinMargin column in place replacement with text string “Big win!” Connect to a Union All Transform and run that Task alone. {Nothing will happen to the data, but the Task will succeed}.

24 Control Flow & Data Flow Demonstration
Show no workflow, linear workflow, branching constraints, LogicalAnd and LogicalOr, and expression driven constraints.

25 General Coding & Development Principles
“Always accept pain as early in the development cycle as possible.” ~ Dave Rodabaugh, BI Developer “If I have to do something once I’m probably going to have to do it again.” Abstraction / Encapsulation “Write once; run many!” Modularity / “Think reusable”

26 What are values that might be different from Package execution to execution?
Server / Instance name Database name Connection String Table name File System Path File Name File Extension Query Parameter Values such as Comparison Dates Query result set

27 Q: Where Can Dynamic / Programmatic Actions Be Implemented in SSIS
Q: Where Can Dynamic / Programmatic Actions Be Implemented in SSIS? A: In many places! Control Flow Precedence Constraints ExecuteSQL Tasks (Reading and / or Writing Variables and the use of Parameterized Queries) Parameterized queries Store the whole query string in a variable Store the output of a query for subsequent processing For Loop Container ForEach Loop Container Script Task code Data Flow Connection Manager Expressions Derived Column Transformation Expressions Conditional Split Transformation Expressions Script Component code Variable & Parameter Values Package Configurations Almost Any Property of Almost Every Task and Component

28 Variables – A Building Block of Dynamic Packages
What are Variables? / Why have Variables? Two Variable Classifications in SSIS System Variables User Variables Creation – HowTo Variable Scope and “(un)changeability” Available Data Types

29 Parameters – Another Building Block of Dynamic Packages
What are Parameters? / Why have Parameters? Applies only to the Project Deployment Model of development Two Parameter classifications in SSIS Project Package Details: Creation – HowTo Variable Scope and “(un)changeability” Available Data Types

30 Expressions – Creating Derived Property Values
Expressions can be used to derive almost any property value and make the value dynamic Can be derived from the combination of: Column Values System & User Variable Values Operators Variety of functions Mathematical String NULL functions Date & Time Type casts

31 Debugging & Error Handling Basics

32 Debugging & Error Handling
Logging Control Flow debugging features Breakpoints Conditional Also within Script code Status Windows Locals Watch Event Handler “Scoped, Conditional, Control Flow” Data Flow features Data Viewers Ignoring or redirecting failed rows Demo: Move Send Mail Task to a Package-wide, OnError Event Handler.

33 Deployment Basics

34 Deployment: Dev -> Prod
Starting with SQL Server 2012 there are now two different deployment models Project Deployment Model Requires a special database created, the “SSIS Catalog” Package Deployment Model MSDB File system Package configurations Each deployment model has very different deployment requirements and methods The new deployment method will be discussed and demonstrated in my next session (end of the day).

35 Using SSIS Programmatically
Building Packages Programmatically Samples for creating SSIS packages programmatically EzAPI – Alternative package creation API Running and Managing Packages Programmatically

36 Biml (Business Intelligence Markup Language)
Stairway to Biml Level 1: What is Biml? Level 2, Biml Basics Level 3, Building an Incremental Load Package Level 4, Using Biml as an SSIS Design Patterns Engine Level 5, Biml Language Elements Biml is better even for simple packages Generate multiple SSIS packages using BIML and metadata 8 Practical BIML Tips BimlScript

37 Extensibility: Free & Commercial Tools
Codeplex SSIS Community Tasks and Components (free & commercial) Tools i.e., SSIS Expression Editor & Tester Connection Managers Log Providers Control Flow Tasks ForEach Enumerators Script Task examples Script Component examples Data Flow Sources, Transforms, & Destinations Commercial Pragmatic Works – Task Factory Abilineage, SSIS Pipeline components Eldos - BizCrypto, like SFTP functionality for SSIS CozyRoc - SSIS+, Control Flow & Data Flow components Keelio - Dynamics GP SSIS Toolkit and XML Toolkit Konesans - a number of components, some of which are free Melissa Data

38 References & Resources
Microsoft course 20463C, available through your local New Horizons Computer Learning Center Implementing a Data Warehouse with SQL Server (5 days) (linked to my BI homepage) Books Implementing a Data Warehouse with Microsoft SQL Server 2012 Training Kit (Exam ) SQL Server 2008 Integration Services Problem-Design-Solution (Wrox) Web & other links: Integration Services Expression Reference SSIS Expression Examples Advanced Integration Services Expressions Integration Services Data Types CAST & CONVERT SSIS Data Types A VB .NET Toolkit for Writing SSIS Script Tasks SSIS Expression Cheat Sheet at PragmaticWorks PragmaticWorks free one hour webinars xxx

39 Thank you!


Download ppt "SSIS for Absolute Beginners"

Similar presentations


Ads by Google