Best Practices in Loading Large Datasets Asanka Padmakumara (BSc,MCTS) SQL Server Sri Lanka User Group Meeting Oct 2013.

Agenda
- Definition of "large dataset"
- Database and table structures for large data sets
- Loading data using T-SQL
- Loading data using SSIS
- Q & A

A Large Dataset
- A collection of data sets that is large, complex, and difficult to process. ~ Wikipedia
- Whether a data set is large for you depends on your hardware configuration.

A Large Table
- A large table is one that does not perform as desired, or one where the maintenance costs have gone beyond predefined maintenance periods.
- A table can also be considered large if one user's activities significantly affect another's, or if maintenance operations affect other users' abilities. In effect, this even limits availability.
~ Microsoft white paper (Partitioned Tables and Indexes)

Database and table structures
- Try to have multiple filegroups spread across multiple disks.
  - You can back up a single filegroup on its own.
  - For performance, put data on one I/O path and indexes on a separate I/O path.
- Partition the tables and indexes.
  - Partitioning breaks a large table into parts.
  - Easy to insert: switch a partition in to the table.
  - Easy to delete: switch a partition out of the table.
  - You can rebuild the index for a single partition only.
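The partition-switching idea above can be sketched in T-SQL as follows. This is only an illustration under stated assumptions: the names pfOrderYear, psOrderYear, dbo.Orders and dbo.OrdersStaging are hypothetical, every partition is mapped to the PRIMARY filegroup for brevity, and the target partition is assumed to be empty before the switch.

```sql
-- Hypothetical names throughout; not taken from the presentation.
-- RANGE RIGHT with two boundaries gives three partitions:
--   1: before 2012-01-01, 2: calendar year 2012, 3: 2013-01-01 and later.
CREATE PARTITION FUNCTION pfOrderYear (date)
AS RANGE RIGHT FOR VALUES ('2012-01-01', '2013-01-01');

-- All partitions on one filegroup for brevity; each partition could
-- instead be mapped to its own filegroup on a separate disk.
CREATE PARTITION SCHEME psOrderYear
AS PARTITION pfOrderYear ALL TO ([PRIMARY]);

CREATE TABLE dbo.Orders
(
    OrderID   int   NOT NULL,
    OrderDate date  NOT NULL,
    Amount    money NOT NULL,
    CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (OrderDate, OrderID)
) ON psOrderYear (OrderDate);

-- Staging table: same columns and clustered index, same filegroup as the
-- target partition, plus a CHECK constraint that matches partition 3.
CREATE TABLE dbo.OrdersStaging
(
    OrderID   int   NOT NULL,
    OrderDate date  NOT NULL,
    Amount    money NOT NULL,
    CONSTRAINT PK_OrdersStaging PRIMARY KEY CLUSTERED (OrderDate, OrderID),
    CONSTRAINT CK_OrdersStaging_2013 CHECK (OrderDate >= '2013-01-01')
) ON [PRIMARY];

-- ... bulk load dbo.OrdersStaging here ...

-- Metadata-only operation: the staged rows become partition 3 of dbo.Orders.
ALTER TABLE dbo.OrdersStaging SWITCH TO dbo.Orders PARTITION 3;

-- Rebuild the index for that one partition only.
ALTER INDEX PK_Orders ON dbo.Orders REBUILD PARTITION = 3;
```

Because the switch only changes metadata, the time it takes does not depend on the number of rows staged.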

Database and table structures
- Create the clustered index on the most frequently used column.
- Create nonclustered indexes on the other frequently used columns.
- Choose the correct data types. The difference between int and smallint matters when the number of rows is high.
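A small sketch of these points, using a hypothetical table dbo.SalesFact (the table, columns and index names are assumptions for illustration only). The byte sizes in the comments are the fixed storage sizes of the SQL Server types.

```sql
-- Hypothetical fact table; narrow types keep each row small.
CREATE TABLE dbo.SalesFact
(
    SaleID   int      NOT NULL,  -- 4 bytes (bigint would use 8)
    StoreID  smallint NOT NULL,  -- 2 bytes (int would use 4)
    SaleDate date     NOT NULL,  -- 3 bytes (datetime would use 8)
    Quantity smallint NOT NULL,  -- 2 bytes
    Amount   money    NOT NULL   -- 8 bytes
);

-- Clustered index on the column assumed to be used most often.
CREATE CLUSTERED INDEX CIX_SalesFact_SaleDate
    ON dbo.SalesFact (SaleDate);

-- Nonclustered index on another frequently used column.
CREATE NONCLUSTERED INDEX IX_SalesFact_StoreID
    ON dbo.SalesFact (StoreID);
```

Saving 2 bytes per row by using smallint instead of int amounts to roughly 2 GB over one billion rows, before counting the copies of that column held in indexes.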

Loading Data with T-SQL
- T-SQL commands and utilities
  - BCP
  - BULK INSERT
  - OPENROWSET

Loading Data with T-SQL
- BCP
  - bcp (bulk copy program) is a command-line utility: bcp.exe.
  - It copies data between Microsoft SQL Server and a data file in a user-specified format.
  - It can generate a format file for the data.
  - Import performance improves if the data file is sorted according to the clustered index on the table.
  - in = import from a file into a table; out = export from a table to a file.
  - Syntax: bcp AdventureWorks2012.Sales.CurrencyRate out F:\DemoData\Currency.dat -c -U sa -S SLLAPTOP266\SQL2012

Loading Data with T-SQL
- BULK INSERT
  - A Transact-SQL statement; typically faster than the bcp utility.
  - Unlike bcp, it cannot export data to files. It only inserts.
  - With a format file you can specify up to 1024 fields only. If the data file has more than that, use bcp.
  - Syntax: BULK INSERT AdventureWorks2012.Sales.SalesOrderDetail FROM 'f:\orders\lineitem.tbl'
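A fuller version of the BULK INSERT statement above might look like the sketch below. The terminators, batch size and sort order are assumptions about the file, not facts from the presentation: it assumes a pipe-delimited file that is already sorted by the table's clustered key.

```sql
-- Assumed file layout: pipe-delimited fields, one row per line,
-- pre-sorted by (SalesOrderID, SalesOrderDetailID).
BULK INSERT AdventureWorks2012.Sales.SalesOrderDetail
FROM 'f:\orders\lineitem.tbl'
WITH
(
    FIELDTERMINATOR = '|',
    ROWTERMINATOR   = '\n',
    BATCHSIZE       = 100000,  -- each batch of 100,000 rows is its own transaction
    TABLOCK,                   -- table-level lock for the duration of the load
    ORDER (SalesOrderID ASC, SalesOrderDetailID ASC)  -- declares the file's sort order
);
```

The ORDER option is the BULK INSERT counterpart of the advice to sort the data file by the clustered index: it tells SQL Server the file is already in that order so the sort can be skipped.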

Loading Data with T-SQL
- OPENROWSET
  - A one-time, ad hoc alternative to accessing tables through a linked server.
    SELECT a.* FROM OPENROWSET('SQLNCLI', 'Server=Seattle1;Trusted_Connection=yes;', 'SELECT Name FROM Department') AS a;
  - Use the BULK keyword to make OPENROWSET read a data file for bulk loading.
    SELECT a.* FROM OPENROWSET(BULK 'c:\testvalues.txt', FORMATFILE = 'c:\test\values.fmt') AS a;
  - Table hints available on the INSERT when loading from OPENROWSET(BULK ...): IGNORE_CONSTRAINTS, IGNORE_TRIGGERS
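To show where the IGNORE_CONSTRAINTS and IGNORE_TRIGGERS hints go, here is a minimal sketch of an INSERT that loads from OPENROWSET(BULK ...). The target table dbo.TestValues is hypothetical, and the format file is assumed to describe columns that match it; the file paths are the ones from the slide.

```sql
-- dbo.TestValues is a hypothetical target table whose columns are
-- assumed to match the fields described in the format file.
INSERT INTO dbo.TestValues
    WITH (TABLOCK, IGNORE_CONSTRAINTS, IGNORE_TRIGGERS)
SELECT a.*
FROM OPENROWSET(
         BULK 'c:\testvalues.txt',
         FORMATFILE = 'c:\test\values.fmt'
     ) AS a;
```

Constraints skipped this way are marked as not trusted, so they should be re-validated after the load.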

Loading Data with T-SQL
- Considerations
  - Disable nonclustered indexes before the load: ALTER INDEX [IXYourIndex] ON YourTable DISABLE
  - Disable constraints before the load.
  - Do not disable the clustered index. The table becomes inaccessible until that index is rebuilt.
  - A disabled index has to be rebuilt to enable it again: ALTER INDEX [IXYourIndex] ON YourTable REBUILD
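The disable, load, rebuild sequence can be sketched as follows, reusing the placeholder names from the slide. The data file path is a hypothetical placeholder, and BULK INSERT stands in for whichever load method is used.

```sql
-- 1. Disable the nonclustered index and stop checking
--    foreign key and CHECK constraints on the table.
ALTER INDEX [IXYourIndex] ON YourTable DISABLE;
ALTER TABLE YourTable NOCHECK CONSTRAINT ALL;

-- 2. Load the data (placeholder path, not from the presentation).
BULK INSERT YourTable
FROM 'f:\data\yourfile.dat'
WITH (TABLOCK);

-- 3. Rebuild the index, which also enables it again.
ALTER INDEX [IXYourIndex] ON YourTable REBUILD;

-- 4. Re-enable the constraints and validate the existing rows,
--    so that the constraints are trusted again.
ALTER TABLE YourTable WITH CHECK CHECK CONSTRAINT ALL;
```

Step 4 fails if the loaded rows violate a constraint, which is usually the desired outcome: bad data is caught once, after the load, rather than row by row during it.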

Deleting Data with T-SQL
- Use TRUNCATE instead of DELETE where possible.
- If that is not possible, switch the partition out to a new table, then truncate that table. The new table must be on the same filegroup.
- Delete batch by batch to avoid growing the log file. A single DELETE statement runs as one auto-commit transaction, so all of its changes stay in the active log until it finishes.
- Disable triggers.
- Use the simple recovery model to minimize log growth.
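Both approaches can be sketched in T-SQL. The table names dbo.Orders and dbo.OrdersOld, the OrderDate column, the cutoff date and the batch size are all hypothetical; dbo.Orders is assumed to be partitioned on OrderDate, and dbo.OrdersOld is assumed to be empty, identical in structure, and on the same filegroup as partition 1.

```sql
-- Option A: remove a whole partition.
-- Switching out is a metadata-only operation; the truncate then
-- deallocates the pages without logging each row.
ALTER TABLE dbo.Orders SWITCH PARTITION 1 TO dbo.OrdersOld;
TRUNCATE TABLE dbo.OrdersOld;

-- Option B: delete in batches when the rows do not line up
-- with a partition. Each DELETE is its own small transaction.
DECLARE @BatchSize int = 10000;
DECLARE @Rows int = 1;

WHILE @Rows > 0
BEGIN
    DELETE TOP (@BatchSize)
    FROM dbo.Orders
    WHERE OrderDate < '2012-01-01';

    -- Stop once a pass deletes nothing.
    SET @Rows = @@ROWCOUNT;
END;
```

Under the simple recovery model the log space used by each finished batch can be reused after a checkpoint; under the full recovery model it is reused only after a log backup.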

Loading data using SSIS
- Buffer size
  - You can increase the buffer size of the Data Flow task (the DefaultBufferSize property).
  - Default size = 10 MB; it can be set up to 100 MB.
- Cache lookup
  - If the same table is looked up more than once inside an SSIS package, cache that lookup.
- Vendor-specific providers usually give better performance, e.g. the Oracle provider instead of the Microsoft OLE DB provider.

Loading data using SSIS
- Use a raw file instead of a temporary table when staging data.
- Avoid the Slowly Changing Dimension (SCD) transformation. Use a MERGE statement instead.
- Use parallel flows inside the Data Flow task to get multithreading.
- Minimize the use of blocking and partially blocking transformations, e.g. Aggregate, Sort, Merge, Merge Join, and Union All.
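As an illustration of replacing the SCD transformation with MERGE, the sketch below applies a simple overwrite-in-place (Type 1) dimension update. The tables dbo.DimCustomer and staging.Customer and their columns are hypothetical, the compared columns are assumed to be NOT NULL, and the statement would typically run from an Execute SQL task after the data flow has filled the staging table.

```sql
-- Hypothetical target (dbo.DimCustomer) and source (staging.Customer).
-- Name and City are assumed NOT NULL, so <> is a safe change test.
MERGE dbo.DimCustomer AS tgt
USING staging.Customer AS src
    ON tgt.CustomerKey = src.CustomerKey
WHEN MATCHED AND (tgt.Name <> src.Name OR tgt.City <> src.City) THEN
    -- Existing customer with changed attributes: overwrite in place.
    UPDATE SET tgt.Name = src.Name,
               tgt.City = src.City
WHEN NOT MATCHED BY TARGET THEN
    -- New customer: insert a new dimension row.
    INSERT (CustomerKey, Name, City)
    VALUES (src.CustomerKey, src.Name, src.City);
```

This handles all rows in one set-based statement, whereas the SCD transformation issues its updates row by row.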

Demo
- Demo #1: Delete data from a partitioned table.
- Demo #2: Import/export data using BCP.

Q & A