Presentation is loading. Please wait.

Presentation is loading. Please wait.

Copying, Managing, and Transforming Data With DTS.

Similar presentations


Presentation on theme: "Copying, Managing, and Transforming Data With DTS."— Presentation transcript:

1 Copying, Managing, and Transforming Data With DTS

2 Defining Bulk Insert Task Functionality Quickly Loads Data from a File into SQL Server Encapsulates the Transact-SQL Bulk Insert Statement Supports Table or View Destinations in SQL Server Loads Data with No Applied Transformations Supports Format Files to Specify File Layout Requires Sysadmin or Bulkadmin Fixed Server Roles Membership The Bulk Insert Task is One of Three Ways to Run SQL Server Bulk Copy Operations

3 Sidebar: SQL Server Bulk Copy Operations 1)Bcp Utility 2)Bulk Insert Task or T-SQL Bulk Insert Statement 3)Bulk Copy APIs for OLE DB, ODBC, DB-Library Applications What Do Bulk Copy Operations Offer? Allow Fast Loading of Data into SQL Server Configure Data Load Batches Allow You to Control Logging Operations Ways to Access Bulk Copy Operations

4 Defining the Sales_stage Table Load DTS PolarisTab Delimited File Using the Bulk Insert Task to Load Tab-delimited File Data into Sales_stage Loading Sales_stage with Data Bound for Sales_fact

5 Defining Execute SQL Task Functionality Executing SQL Statements Source database must understand SQL syntax SQL statement determines task performance Task supports single or multiple SQL statements You can create queries in the DTS Query Designer Running Parameterized Queries Input parameters Output parameters

6 Using Parameterized Queries Understanding Global Variable Basics User-defined storage locations Information is shared across package steps Using Parameters with Global Variables Assign global variable values to query input parameters Store query results to a global variable with output parameters

7 Creating Dynamic Queries SELECT * FROM product_dim WHERE product_name = ? AND category_name = ? Parameter 1 Parameter 2 Parameter Parameter 1 The Parameter’s Position in the Query Determines Its Name ProductName CategoryName Global Variables Global Variables Provide Data to Input Parameters ? Question Marks Represent Query Parameters

8 Storing Query Results SELECT begin_date, end_date FROM financial_period WHERE quarter = 1 SELECT * FROM product Global Variables BeginDate EndDate Product Global Variable Output Parameter begin_date end_date Storing Row Values Storing Entire Rowsets Store Query Results in Global Variables Output Parameter Entire Rowset

9 Time_dim_build Stored Procedure DTS Input Parameters - @p_start_date - @p_end_date Defining the Time_dim Data Load

10 Defining the DTS Data Pump DTS Mechanism for Moving and Transforming Data Allows for High-speed Batch Copying of Data Contains Supplied Data Transformations Can Also Define ActiveX Script Transformations Provides An Extendable COM-based Architecture That Allows for Custom Transformations (C++) Permits the Application of Transformation Logic to Specific Phases of a Data Pump Operation Multi Phase Data Pump

11 Understanding How the Data Pump Processes Data OLE DB ODBC Source OLE DB ODBC Destination DTS Data Pump InOut ActiveX Script Copy Trim String …Custom X Forms 1. Connects to the source and destination 2. Reads OLE DB metadata about source and destination columns 3. Gathers data transformation definitions 4. Implements the transformation 5. Writes completed record to the destination

12 Defining the Tasks That Transform Data The Transform Data Task Inserts The Data Driven Query Task Inserts Updates Deletes The ParallelDataPumpTask Processes hierarchical rowsets The Transform Data Task Inserts

13 Defining the Transform Data Task Data Movement and Transformation Functionality Copying data between heterogeneous data sources Applying optional column level transformations Extended Data Transfer Functionality Supporting batch processing of data Providing error-handling capabilities Containing optimization settings for SQL Server destinations

14 Selecting Transformation Types TransformationTransformationDescriptionDescription ActiveX Script Invokes user-defined ActiveX scripts. Copy Column Copies data from source to destination. DateTime String Converts a date to a new destination format. Lowercase String Converts a string to lowercase characters. Uppercase String Converts a string to uppercase characters. Middle of String Extracts a sub string of source data. Trim String Removes white space from a source string. Read File Copies contents of a file to a destination column. File path is specified by a source column. Copies contents of a file to a destination column. File path is specified by a source column. Write File Copies contents of a source column to a file. File path is specified by a second source column. Copies contents of a source column to a file. File path is specified by a second source column.

15 Defining Column Mappings One-to-One Mappings Symmetric Many-to- Many Mappings Asymmetric Mappings

16 Creating Efficient Column Mappings Minimizing the Number of Column Mappings Using Many-to-Many Mappings When Possible Grouping Common Transformations Together

17 Loading Customer_dim Northwind OLTP SQL Server Database

18 Performance Settings Enabling Fast Load Using high-speed bulk copy processing Accepting batches of transformed data Only applies to SQL Server destinations Using a Table Lock Configuring Batch Size

19 Assembling Records into Groups DTS commits records to database as a group Insert batch size sets the number of records in the group Understanding Default Behavior Insert batch size is 0 DTS assigns one batch for all records Setting the Insert Batch Size Value between 0-9999 Setting value can improve performance

20 Defining SQL Solutions You Can Use the Source Query of the Transform Data Task to Implement Data Transformations The Source SQL Statement Must Be Understood by the Source Database The Performance of the Source Query Depends on the SQL Statement You Can Use Parameters in the Source Query to Create Dynamic Source SQL Statements If You Use the Source Query to Manipulate Data, You Can Use the Copy Column Transformation to Load Data into the Destination

21 Applying SQL Solutions to Load Fact Tables Using the Source Query to Join Staging Table Data to Dimension Tables Retrieving Primary Key Values to Store as Foreign Keys on the Fact Table Using a Copy Column Transformation in the Transform Data Task Configuring Fast Load for SQL Server Destinations

22 Loading the Fact Table Dimension Tables customer_dimcustomer_dim 201 ALFI Alfreds product_dimproduct_dim 25 123 Chai Source Data customer id ALFI 1231/1/2000400 134 1/1/2000 time_dimtime_dim product id order date quantity_sales amount_sales 10,7891231/1/200040010,789 cust_key 1231/1/2000400 prod_key time_key quantity_sales amount_sales 2513440010,789201 Sales Fact Data Identifying Dimension Application Key Values in the Fact Table Source Data Retrieving Primary Keys from Each Dimension Table to Assign Foreign Keys

23 Loading Sales_fact DTS Extracting Data from the Sales_stage Table Assigning Foreign Keys by Retrieving Primary Keys from the Product_dim, Customer_dim, and Time_dim Dimensions

24 Best Practices - Performing Inserts Bulk Insert Task Accessing data in files Loading data into SQL Server destinations Copying data with no transformations Transform Data Task Accessing any source Loading to any destination Creating data transformations Using input parameters in the source query Applying custom logic to phases of the data pump

25 Best Practices - Performance Settings Tuning the Transform Data Task Fast load for SQL Server destinations Batch size Table lock Tuning the Bulk Insert Task Sort order for clustered indexes Batch size Table lock

26 Best Practices - Executing Flexible Queries The Data Driven Query Task Execute flexible queries on a row-by-row basis Meet flexibility needs that outweigh performance needs Perform non-insert queries The Execute SQL Task Execute SQL statements and extended SQL statements Perform parameterized queries Assign query outputs to global variables

27 Best Practices - Using Custom Tasks Creating Reusable Functions and Utilities Adding Functionality to DTS Package Designer Implementing a Faster Alternative to ActiveX Script Tasks

28 Best Practices - Creating Efficient Column Mappings Minimizing the Number of Column Mappings Using Many-to-Many Mappings When Possible Grouping Common Transformations Together

29 Best Practices - The Right Transformation Type Using Supplied Transformations When Possible Minimizing ActiveX Script TransformationsWhen Performance Outweighs Flexibility Using SQL Solutions with Copy Column Transformations Developing Custom Transformations as a Faster Alternative to ActiveX Script Transformations


Download ppt "Copying, Managing, and Transforming Data With DTS."

Similar presentations


Ads by Google