Creating Dynamic, Reusable SSIS Packages
Elizabeth Hunt Creating Dynamic, Reusable SSIS Packages Good morning, I’m Elizabeth Hunt, and we will be talking about creating dynamic, reusable SSIS packages. Thank you for attending my very first presentation for PASS. I have uploaded my slides to the SQLSaturday697 site, and pretty much everything I’m saying today is included in the notes section of the slides. Please make sure to fill out the session evaluation on the SQLSaturday Spokane site. I would love to hear your feedback about what does and doesn’t work, so that I can make my next presentation better.
We are here thanks to our sponsors!
Thank you to our sponsors today; without them, we would not be able to provide this training for FREE.
Elizabeth Hunt Member of PASS and Spokane SQL User Group
Programmer for 12 years: VBA, VB, C# (12 years); SQL (8 years); SSIS/DTS (7 years total, majority in the last 5 years); SSRS (5 years). Bachelor’s in Programming and Database Management. Connect to learn more: @data_liz. In 2006, I found out I loved programming in the last 10 minutes of an Advanced Excel class. The teacher of the class was hired to show us formulas we had requested in advance, but she said that we probably really wanted to learn VBA due to our complex needs. So, she covered VBA for the last 10 minutes of class… and she changed my life. I went to a bookstore right after work, bought two books on VBA, and started reading them that night. The reason I’m telling my origin story is that although this is an intermediate class, I am going to spend the last few minutes talking about advanced scripting designs in SSIS and provide some resources for learning about them.
Best Practices Before we get started, let’s talk about a few best practices.
Not just for object oriented code…
Loose coupling: as few external dependencies as possible. Tight cohesion: the program is focused on accomplishing one thing. Following these two software best practices will make your SSIS packages easier to create, maintain, and reuse. Loose coupling refers to the idea that we should try to create code that has as few dependencies on external objects as possible. Cohesion refers to how well the pieces of code inside one module (in this case, an SSIS package) relate to each other. Tight cohesion means that all of the code is highly related. The example that we are going to work through today has no external dependencies, and all of the tasks are highly related, so we have loose coupling and tight cohesion. (Cody, 2011)
What are some of your favorite tools and techniques?
“…I write code so it’s dynamic – meaning it knows which datacenter it’s in, which environment it’s in – code that’s written this way may, for example, not have backups running in the test environment, but does have them running in production. In essence, [it’s] got a GPS on it. The code is identical on each server, the variables are dynamically generated values for each individual server.” – Tom Roush. As some of you may know, we recently lost Tom Roush. This quote is from the last interview that he did before he passed. As I read through Tom’s interview, I realized that for years, Tom had been doing the same thing that we are learning about today; he considered it one of his favorite techniques, and he coined the perfect term for it. So, thanks to Tom, we are officially going to give our code GPS today. (diligentdba, 2018), (Roush, n.d.)
Overview
What are we trying to achieve?
Create SSIS packages that:
Dynamically know what environment they are in (give our code “GPS”)
Are easy to reuse
Make changes easy
Reduce maintenance time
Include all logic in source control
Are self-documenting

So that brings us to our goal for today. We are going to create an SSIS package that dynamically calculates values based on its GPS, and is completely reusable. I’d like to point out that although I made this package completely dynamic, it may not be worth doing that in most cases. Dynamic connections are probably the minimum for a package, and then everything beyond that is icing on the cake. As I mentioned earlier, we will also briefly discuss scripting design patterns at the very end of the session. (diligentdba, 2018)
Package Overview
Demo Control Flow Annotation Sequences Connections
Here is the Control Flow for our demo today. It’s important to have some context for the demo package before we start talking about making it dynamic. You can see that we have three main sections in our Control Flow: Annotation, Connections, and Sequences. Our sequences are split into a very common design of Pre Execute Tasks, Main Tasks, and Post Execute Tasks.
Pre Execute Tasks In the Pre Execute sequence, we are using a stored procedure to get a date that will be assigned to a variable and used by the following tasks. For this design we are using the same data warehouse date to also create the date stamp for the file name. This could be any date logic that is common in your environment though. For example, you may need to get a particular date for the data, and then add a time stamp to the file name using GETDATE().
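As a rough Python sketch of that file-naming idea (the yyyyMMdd stamp, the base name, and the .txt extension are assumptions for illustration, not taken from the demo package):

```python
from datetime import datetime

def build_file_name(base_name, data_date):
    # Append a date stamp to the file name, the way the package reuses the
    # warehouse date returned by the Pre Execute stored procedure.
    return f"{base_name}_{data_date.strftime('%Y%m%d')}.txt"

# For example, a data date of March 8, 2018:
print(build_file_name("SalesExtract", datetime(2018, 3, 8)))
# SalesExtract_20180308.txt
```

If you needed a run-time stamp instead, you would swap the warehouse date for the current date, the same way you would swap the stored procedure date for GETDATE().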
Main Tasks The Main sequence looks like it has two tasks, but there are really three. The Execute SQL Task runs a stored procedure that loads an archive table with data we want to output to a file. The Data Flow Task first uses a stored procedure to query the file data from the newly loaded table, and then it writes the data to a file.
Post Execute Tasks The Post Execute sequence is used to delete old history that should be purged from the archive table we just loaded. In this case, the business has decided to keep 3 months of history, so anything in the table over 3 months old is deleted.
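The retention cutoff can be sketched in Python (the real deletion happens in the stored procedure; the 3-month default and the month arithmetic here are illustrative):

```python
from datetime import date

def purge_cutoff(run_date, retention_months=3):
    # Rows older than the returned date fall outside the retention window
    # and would be deleted by the purge stored procedure.
    month_index = run_date.year * 12 + (run_date.month - 1) - retention_months
    year, month = divmod(month_index, 12)
    # Clamp the day so short months never produce an invalid date.
    return date(year, month + 1, min(run_date.day, 28))

print(purge_cutoff(date(2018, 3, 8)))  # 2017-12-08
```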
Dynamic Connections Note the fx before the connection names. That means that an expression is being used to determine the value for each connection. We have a dynamic database connection that is used to execute the stored procedures, and a dynamic flat file connection used to output the data.
Annotation Annotations are the comments for your SSIS package. In addition to the normal code header you would include for any code you write, this is a great place to add instructions about how to use, or to reuse your code.
Parameters and Variables
Project Parameters Package Parameters Variables
The way that we create dynamic packages is by using parameters and variables. Since SQL Server 2012, we have had Project Deployment Mode in addition to Package Deployment Mode. You will only be able to use parameters if you are developing in Project Deployment Mode. Parameters do not allow you to use expressions (which are similar to Excel formulas), but they do allow you to add descriptive text. Project parameters are accessible in the Solution Explorer window. They have project-level scope, which means that any of the packages in the project can access the same project-level values. Package parameters are accessible in the tabbed navigation in your package. They only have package-level scope, so none of the other packages in the project will be able to reference these values. Variables have package-level scope and are available in both Project and Package Deployment Mode. They allow you to use expressions, but they don’t have built-in documentation.
Create a User Interface
To get the most out of your SSIS packages, I suggest that you create a user interface. Any information that needs to be manually typed into your package can be set up as a parameter, so that those values are the only things that ever need to be maintained. The first parameter is the Database Name, which is the same in each environment for this particular process. The second parameter is the Deployment Environment Override, which allows the environment to be explicitly set. This can be very helpful in an emergency, but it is also important for testing your expressions. By overriding the environment to production, you can verify that all of the expressions are calculating the expected values by checking the variables. The third and fourth parameters are the UNC paths for the Development and Production files. The fifth is the schema name in the database where the stored procedures and the archive table reside. The sixth, seventh, and eighth parameters are the relevant server names. The production database and SSIS servers are needed for the dynamic connections to work. This is because the machine name that the package is deployed on is used to detect what environment the package is running in. We only have a production and development server in this example, but there isn’t a limit to the number of environments that this will work for. All of the remaining parameters are the names of stored procedures that will be executed by the package.
Provide Instructions Document how to use the interface
Parameter descriptions
Annotations: must use for variables, and in Package Deployment Mode

One thing that I love about parameters is that they allow us to add documentation about what each parameter is and how to use it. You will need to use annotations to document your variables, though, since there isn’t a nice description column for them. Documentation is what will help ensure that your code is reused easily and correctly.
Under the Hood: Variables and Expressions
Variables and expressions are where the real magic happens. You may notice that I have similar names for my variables and the parameters we were just looking at. That’s because the variables are actually what all of the dynamic objects use to calculate their value. So, we enter the hardcoded values into our interface, and then the variables turn them into the values for the current deployment environment using expressions. You can see in this screenshot that all of the variables are pointing to the development environment values. Additionally, you may notice that I have a dummy value for the Data_And_File_Date variable. I usually enter a dummy value like this to test my expressions. The real value is assigned at run time by the Pre Execute Tasks.
Variable Expressions Let’s dive into some of the variable expressions that are fueling this package.
Expression Builder Variable expressions are created in the Expression Builder. It lists system variables, including machine name, in the GUI, and you can drag them right into the expression window. You can also see the list of functions, casts, and operators available in the right window. There is even a brief description when you select an item.
User::Deployment_Environment
Here you can see the same window, but I’ve scrolled down to see the package parameters, which have a prefix of $Package::, and variables, which have a prefix of User::. If we had used project-level parameters, you would also see options in the list that start with $Project::.
User::Deployment_Environment
The deployment environment variable is the foundation for all of the other expressions. This variable is what gives our package GPS. You can see in this example that the evaluated value is currently DEV.
Anatomy of a Conditional Expression
Statement is true ? Value if true : Value if false

@[$Package::Deployment_Environment_Override] != "" ?
  @[$Package::Deployment_Environment_Override] :
  (UPPER(@[System::MachineName]) == UPPER(@[$Package::Production_Database_Server_Name]) ||
   UPPER(@[System::MachineName]) == UPPER(@[$Package::Production_SSIS_Server_Name]) ?
    "PROD" : "DEV")

If you haven’t used the conditional operator in SSIS, then it may take a little getting used to. It may be helpful to start by thinking about how it’s very similar to an Excel IF formula. The syntax is a statement that you want to evaluate, then a question mark, the value if true, a colon, and then the value if false. You can nest conditional expressions using the false argument as well. In this example, the logic is asking: “Is there a deployment environment override value not equal to blank?” If it’s true, then return the override value. If it’s false, then perform a second check: “Is the machine name the same as the Production database server or Production SSIS server name?” If that’s true, then return “PROD”; if that’s false, return “DEV”. You also may want to note that I’m using the UPPER function on all of the server names to ensure that the case-sensitive text comparison behaves correctly. Are there any questions on this before we move on? Now let’s look at the rest of the variable expressions.
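The same nested conditional can be sketched in Python (the parameter names are illustrative, not necessarily the package’s exact names):

```python
def deployment_environment(override, machine_name, prod_db_server, prod_ssis_server):
    # First branch: an explicit override wins, just like the first
    # condition in the SSIS expression.
    if override != "":
        return override
    # Nested branch: compare the machine name against both production
    # server names, upper-casing everything because the SSIS comparison
    # is case sensitive.
    machine = machine_name.upper()
    if machine == prod_db_server.upper() or machine == prod_ssis_server.upper():
        return "PROD"
    return "DEV"

print(deployment_environment("", "prodsql01", "PRODSQL01", "PRODSSIS01"))  # PROD
print(deployment_environment("", "devbox", "PRODSQL01", "PRODSSIS01"))     # DEV
```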
User::Export_Directory
This variable uses the value of the deployment environment variable we just worked with to pull the Export Directory for the current environment. This is where we are finally using our GPS to assign dynamic values. To translate the expression: if the environment is production, assign the production value; otherwise, assign the development value.
User::Export_File_Path
The last variable calculated the directory, and this variable takes that one step further and adds the entire file path using the + operator to concatenate the pieces of text together.
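A minimal Python sketch of that concatenation (the underscore-and-.txt naming pattern is an assumption, not the demo’s actual expression):

```python
def export_file_path(export_directory, file_base, date_stamp):
    # Mirror the + operator in the variable's expression: directory,
    # backslash, base name, date stamp, and extension joined into one path.
    return export_directory + "\\" + file_base + "_" + date_stamp + ".txt"

print(export_file_path("\\\\devserver\\exports", "SalesExtract", "20180308"))
# \\devserver\exports\SalesExtract_20180308.txt
```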
User::Server_Name The server name variable is calculated the same as the last one. If the deployment environment is production, it assigns the production database server name, otherwise, it assigns the development server name. If you have more environments, you could nest your logic in all of these expressions to handle that as well.
User::Stored_Procedure_To_Get_Data_And_File_Date
In this expression, we are concatenating text to create a SQL query that will be executed by one of the execute SQL tasks in our package. Using this method, we can change the schema and the stored procedure name in our interface, and the new information will be executed without having to update the tasks.
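The query-building step looks something like this in Python (the schema and procedure names are hypothetical examples):

```python
def exec_statement(schema_name, procedure_name):
    # Concatenate the schema and procedure names from the interface into
    # the EXECUTE statement the Execute SQL Task will run. Bracketing the
    # names guards against spaces or reserved words.
    return f"EXECUTE [{schema_name}].[{procedure_name}]"

print(exec_statement("dbo", "usp_Get_Data_And_File_Date"))
# EXECUTE [dbo].[usp_Get_Data_And_File_Date]
```

Change the schema or procedure name in the interface and the new statement is built automatically, without touching the task.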
User::Stored_Procedure_To_Load_File_Data
This variable is doing the same thing, just with a different stored procedure that loads the data.
User::Stored_Procedure_To_Output_File_Data
This variable is also doing the same thing for the stored procedure that outputs the data.
User::Stored_Procedure_To_Purge_Archive
Last, but not least, this variable does the same thing for the Purge stored procedure.
GPS Ready! That completes our deep dive into all of our variable expressions.
Deployment Environment Override
Before we move on, I’d like to show you the variables after I changed the deployment environment override parameter to Prod. As you can see, all of the variables now point to the production server and file paths. I always test my expressions like this to double check that everything is working correctly.
Dynamic Connections aka GPS
Dynamic connections are an amazing tool when you are deploying to multiple environments like Dev, QA, and Production. I think that dynamic connections can also make maintenance easier if you only have one server, because eventually that server name will change, and the package will need to be updated. So, just keep in mind that even if you don’t have multiple environments, it might still be worth using this design pattern.
Setting up initial database connection
When you first set up your database connection, you will go through the regular steps to set it up. As soon as this connection is created though, you can rename it to something that doesn’t have a hardcoded server name in it, and then start making it dynamic.
Dynamic Database Connection
Expand Click Ellipsis To set up the dynamic database connection SQLSatSpokaneDB, you’ll need to select the database connection in the Connection Managers window, and then either press F4, or right-click and choose Properties to open the Properties window. You can expand the Expressions section to see the current expressions that are set up for the connection. In this example, I already have the ConnectionString and ServerName properties set up with expressions. To add or edit expressions, you will need to click the ellipsis button to open the Property Expression Editor.
Property Expression Editor: DB Connection
When you click that ellipsis, this is the window that will open up for you. You can select the property that you want to add from a drop-down list, and then click the ellipsis again to actually add or edit an expression. I am going to open up the connection string property first.
Dynamic database connection string
We are now three levels deep in the GUI (connection properties, the Property Expression Editor, and then the actual Expression Builder). Note that we don’t have to do a bunch of calculations about which environment we are in to assign the server name. All of that is done in the variable, so we can simply assign the variable’s value here. We are using a combination of a variable and a package parameter for this calculation, because in this design the database has the same name in all environments. Of course, that could easily be changed for another use case. The database connection is now going to automatically update to the correct server name for the environment the package is deployed in, and the database name will be set to whatever is entered in our interface. If we ever used this package for a new process, or moved this process to a new server or database, we could just update our interface, and it would automatically update this connection string.
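Conceptually, the expression glues the server variable and database parameter into one string. A Python sketch (the provider and security settings here are typical OLE DB defaults, not values taken from the demo package):

```python
def connection_string(server_name, database_name):
    # Combine the dynamic server-name variable with the database-name
    # parameter from the interface, the way the ConnectionString
    # expression concatenates them.
    return (f"Data Source={server_name};Initial Catalog={database_name};"
            "Provider=SQLNCLI11.1;Integrated Security=SSPI;")

print(connection_string("PRODSQL01", "SQLSatSpokaneDB"))
```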
Dynamic server name for DB connection
I’ve clicked on the ellipsis next to server name in the Property Expression Editor, and you can see how easy it is to set a dynamic server name here. Again, since we are handling any complex logic in the variable itself, we only need to list the variable name anywhere it is needed.
Automagical Database Connection!
We now have a completely dynamic database connection.
Dynamic File Connection
For our dynamic file connections, we are going to start just like we did with the database connection and set up a regular file connection. Note that this file only has one column, named FileData.

The file format for this package uses what some people call a fixed width file, but it is really a delimited file. Any time you are asked to make a fixed width file and you add a row delimiter (usually a carriage return line feed), it is now a delimited file. A true fixed width file does not have each record on a separate row.

The great news here is that this package can be reused for any delimited flat file (or a fixed width file that has a row delimiter), as long as you combine the output data for the file into one column. So, if you add the delimiters when you query out the data, or if you pad the data with zeros or spaces to make each row the same length, you can use this exact same file connection.

Before we set up the dynamic connection, though, make sure to edit the data type in the connection manager to TEXT. Choosing the TEXT data type is very important here, so that when this package is reused, the width of the file doesn’t have to be manually updated. I couldn’t find a way to make the width dynamic, so making it as large as possible was the next best thing.

Keep in mind that if you want to list all of your columns in your file connection manager, go for it; you can still make the file path dynamic, even if the file columns aren’t. You would just have to update the columns if anything changes, or if you reuse the package.
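The padding idea can be sketched in Python (the column layout and widths are invented for illustration; in the demo this work happens in the output stored procedure):

```python
def fixed_width_rows(records, widths):
    # Pad every field to its fixed width and join each record into a
    # single string, so one 'FileData' column can feed the flat file
    # connection regardless of how many logical columns the file has.
    return ["".join(str(field).ljust(width)
                    for field, width in zip(record, widths))
            for record in records]

rows = fixed_width_rows([("1001", "WIDGET", "25")], [6, 10, 4])
print(rows[0])  # '1001  WIDGET    25  '
```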
Automagical File Connection!
Ok, now that we have created the file connection, we can make the file connection string dynamic by clicking on the ellipsis and creating an expression that uses… one variable. It’s that easy! All of the logic to calculate the Export_File_Path was taken care of in the expression for the variable. If you only make the connection string dynamic, and not the actual file columns like the last slide, this will still be a huge win.
Dynamic Tasks In order to show you all of the different options you have for making your code dynamic and reusable, I also made all of the tasks in this package dynamic.
Pre Execute Task: Get File and Data Date
Our first task is the Pre Execute task that executes a stored procedure to get a date. You can see that SQLSourceType is Variable, and unfortunately in this screenshot I have the wrong value selected. I chose the package parameter instead of the variable by accident. So, I should have listed the variable that holds the EXECUTE query for this stored procedure here. If I had picked the right variable, then this task would run whatever stored procedure was listed in the interface to get this date.
Pre Execute Task: Get File and Data Date
This task also assigns the value returned from the query to the date variable that previously held the dummy value.
Main Task: Execute Sproc to Load Data
This is just like the last example, except that I did assign the correct variable to the task this time. When this runs it will load the data into the archive table.
Main Task: Execute Sproc to Read Data and Output File
This is the internal data flow for the second task in our main sequence.
Data Flow Task: Query File Data
The first task in the data flow executes another stored procedure based on a variable.
Data Flow Task: Create Flat File
The create flat file task is using the dynamic file connection, although you can’t see that it’s dynamic in the Flat File Destination Editor. The column name FileData is important here. If you reuse this package, query the data using the same column name so you don’t have to update this mapping. This is just the column name assigned by the previous query, so there’s no impact to any actual database column names by using this pattern. Also, remember that there’s only one column in this data, since the file is really fixed width with a row delimiter.
Post Execute Task: Delete Archive Data Older Than Retention Period
Once again, in our post execute task, we are using a query stored in a variable to execute a stored procedure. Unfortunately, I made the same mistake when I made this screenshot, so you see the parameter in the screenshot instead of the correct variable. If I had chosen the correct variable, then this task would purge all history in the archive table that was older than the retention period.
We now have a dynamic, reusable package
Dynamic connections
Dynamic stored procedures
Easy to make changes using the interface
Reusable for similar processes
Everything resides within source control

We now have a completely dynamic and reusable SSIS package. Are there any questions before we move on to scripting?
Scripting Design Patterns
This is an intermediate level class, but I still wanted to briefly cover a couple of scripting design patterns in SSIS, since that is where you will really get the most reuse of your code. I’ve included links to learn more about these options if you think they will be useful in your environment.
Reusable Design Using a Script Task
Script that runs within an SSIS package
Database tasks will run as an ADO connection; single threaded
Free training at:
Reusable Design Using BIML
Business Intelligence Markup Language
Script the creation of SSIS packages
Database tasks will run in parallel
Free training at
“Do not try to reinvent every wheel you need.” – Tom Roush
What did we learn?
SSIS does allow for dynamic, reusable design
Expressions using variables and parameters
Script tasks within SSIS
Scripting entire SSIS packages with BIML
(diligentdba, 2018)
Thank you! Please fill out the session evaluation. Connect to learn more: @data_liz. Questions?
References
diligentdba. (2018, February 1). DBA best practices… from the DBA from heaven. Curious.. about data. Retrieved March 8, 2018.
Cody. (2011, October 5). [Image of Single Responsibility Principle]. Blesta 3.0: Designing a Modular System. Blesta. Retrieved March 9, 2018.
Roush, T. (n.d.). [Picture of Tom Roush]. Lessons, Tips, and Tricks from a SQL Database GEEQ. Retrieved March 8, 2018.