Know your data source well. Who am I? Nik – Shahriar Nikkhah Microsoft MVP 2010 – SQL Server MCITP SQL 2008 MCTS SQL 2008 and 2005 s:


1 Know your data source well

2 Who am I?
Nik – Shahriar Nikkhah
Microsoft MVP 2010 – SQL Server
MCITP SQL 2008
MCTS SQL 2008 and 2005
Emails: SNikkhah@Live.ca, SNikkhah@Yahoo.com
msdn.microsoft.com (SSIS forum)
One chapter on SSIS in MVP Deep dive 2 (Sep 2011)

3 OVERVIEW
Know your data source well / Data cleansing
  1. Chronological file order
  2. Data cleansing
  3. Check a few sample packages
Error handling / Email notification
  1. Capture error in a text file
  2. Email error file as notification
  3. One package sample
A package with the combination of the above.

4 Know your data source well
Analyze your data source from 2 different angles:
1- Data point of view
  Data relations, field mapping, data values
  PK, FK, index, metadata, dictionary (mapping) tables
  Good records and bad records (redirecting)
2- Data source behavior
  Behavior changes (table / file renaming and header name changes)
  Delivery process: how the source gets made, provided and loaded (for example, a CSV that has been opened in Excel and saved)
  Who is providing it
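One of the behavior changes listed on this slide, header names changing, can be caught before a load starts. The following is a minimal Python sketch of that check, written outside SSIS purely for illustration. The column names in EXPECTED_HEADER and the function name check_header are hypothetical and do not come from the presentation.

```python
import csv
import io

# Hypothetical expected layout; a real package would take this from its
# own column mapping.
EXPECTED_HEADER = ["CustomerID", "OrderDate", "Amount"]


def check_header(csv_text, expected=EXPECTED_HEADER):
    """Return (missing, unexpected) column names found in the first row."""
    reader = csv.reader(io.StringIO(csv_text))
    actual = [name.strip() for name in next(reader, [])]
    missing = [name for name in expected if name not in actual]
    unexpected = [name for name in actual if name not in expected]
    return missing, unexpected


if __name__ == "__main__":
    # "Order Date" (with a space) stands in for a renamed header.
    sample = "CustomerID,Order Date,Amount\n1,2011-09-01,10.50\n"
    print(check_header(sample))
```

A non-empty result on either side is a signal to stop and investigate the source rather than load it.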

5 Scenario on data behavior Data Point of view

6 Scenario on data behavior Data Point of view

7 Scenario on data behavior Data source behavior

8 Scenario on data behavior Data source behavior

9 Scenario on data behavior Data source behavior

10 Scenario on data behavior
Data source behavior: files renamed and moved to different folders.
Who is providing the data source

11 Daily file load statistics (perfect world)
  Working days: 21
  No. of packages: 100
  CSV / Excel, load & reload: 1 - 1
  Excel sheets: 1 - 3
  Records per sheet: 1K, 10K
  Total no. of records (million): 2.1 - 63
  Million records per day: 0.1 - 3

12 Daily file load statistics (real world)
  Working days: 21
  No. of packages: 100
  CSV / Excel, load & reload: 3 - 5
  Excel sheets: 1 - 3
  Records per sheet: 1K, 10K
  Total no. of records (million): 6.3 - 315
  Million records per day: 0.3 - 15
Loads and reloads
  Files loaded per month: 6,300 – 10,500 files / month
  Monthly extra reload (population reload): 2 – 3 reloads a month = 12.6 – 31.5 files / month
Forecast
  Packages for the next year: extra 200 (sum of 300 per customer)
  New customers: 2 – 3 per year
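The totals on the two statistics slides appear to be the product of the other columns. The short Python sketch below reproduces that arithmetic for the real-world figures. The function and variable names are illustrative only, and the reading of the columns as factors is an assumption based on the numbers matching.

```python
def monthly_records(working_days, packages, loads_per_package, sheets, records_per_sheet):
    """Records loaded in a month: every factor on the slide multiplied together."""
    return working_days * packages * loads_per_package * sheets * records_per_sheet


def monthly_files(working_days, packages, loads_per_package):
    """Files loaded in a month, before counting sheets or rows."""
    return working_days * packages * loads_per_package


WORKING_DAYS = 21
PACKAGES = 100

# Real-world range: 3 to 5 loads per package, 1 to 3 sheets, 1K to 10K rows per sheet.
real_low = monthly_records(WORKING_DAYS, PACKAGES, 3, 1, 1_000)
real_high = monthly_records(WORKING_DAYS, PACKAGES, 5, 3, 10_000)

if __name__ == "__main__":
    print(real_low / 1e6, real_high / 1e6)                # millions per month
    print(real_low / WORKING_DAYS / 1e6,
          real_high / WORKING_DAYS / 1e6)                 # millions per day
    print(monthly_files(WORKING_DAYS, PACKAGES, 3),
          monthly_files(WORKING_DAYS, PACKAGES, 5))       # files per month
```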

13 Chronological file load Over 99% of the ETLs that have a file as a source don’t use chronological file load in the SSIS package.

14 Chronological file load Package overview.

15 Chronological file load Script that provides the files properties and information
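The script itself is shown only as a slide image, so its code is not in this transcript. As a rough illustration of what "providing the files' properties" can mean, here is a Python sketch that lists each file in a folder with its name, size and last-modified time. The function name file_properties and the choice of last-modified time as the ordering key are assumptions; the original package may use a date taken from the file name or another property.

```python
import os
from datetime import datetime


def file_properties(folder, extension=".csv"):
    """Return one dict per matching file: name, path, size and last-modified time."""
    rows = []
    with os.scandir(folder) as entries:
        for entry in entries:
            # Skip sub-folders and files with other extensions.
            if entry.is_file() and entry.name.lower().endswith(extension):
                info = entry.stat()
                rows.append({
                    "name": entry.name,
                    "path": entry.path,
                    "size_bytes": info.st_size,
                    "modified": datetime.fromtimestamp(info.st_mtime),
                })
    return rows
```

The returned list is unordered; it is the input a later sort step would work on.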

16 Chronological file load Inside the DFT

17 Chronological file load Sort object

18 Chronological file load Set flag

19 Chronological file load Second Foreach Loop: display script
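Taken together, the slides above describe sorting the file list, setting a flag, and then looping over the files a second time. The Python sketch below shows that sequence in miniature. The slides do not say what the flag records, so this sketch assumes it marks a file as loaded; the names chronological_order and load_all are hypothetical.

```python
from datetime import datetime


def chronological_order(files):
    """Sort file records oldest first and number them in load order."""
    ordered = sorted(files, key=lambda f: (f["modified"], f["name"]))
    return [dict(f, load_order=i, loaded=False)
            for i, f in enumerate(ordered, start=1)]


def load_all(files, load_one):
    """Call load_one(name) for each file, oldest first, flagging each on success."""
    queue = chronological_order(files)
    for record in queue:
        load_one(record["name"])
        record["loaded"] = True  # assumed meaning of the flag
    return queue


if __name__ == "__main__":
    demo = [
        {"name": "sales_3.csv", "modified": datetime(2011, 9, 3)},
        {"name": "sales_1.csv", "modified": datetime(2011, 9, 1)},
        {"name": "sales_2.csv", "modified": datetime(2011, 9, 2)},
    ]
    load_all(demo, print)
```

Sorting on (modified, name) keeps the order stable when two files share a timestamp.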

20 Data cleansing
Data cleansing and transformation
Data flow transformation includes a series of data cleansing tools, such as:
  Joins
  Fuzzy Lookups
  Character mapping
  Data type conversion
  Derived columns
  A set of Boolean functions for data comparison and replacement
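To make a few of these transformations concrete, here is a small Python sketch of the same ideas applied to rows held as dictionaries: character mapping (trim and upper-case), data type conversion, a derived column, and redirection of rows that fail conversion. It is an analogy only, not SSIS code. The column names, the date format and the 1000 threshold are made up for the example.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation


def cleanse(rows):
    """Split rows into (good, bad); bad rows keep their original values plus an error."""
    good, bad = [], []
    for row in rows:
        try:
            amount = Decimal(row["amount"].strip())            # data type conversion
            cleaned = {
                "customer": row["customer"].strip().upper(),   # character mapping
                "order_date": datetime.strptime(
                    row["order_date"].strip(), "%Y-%m-%d").date(),
                "amount": amount,
                "is_large": amount >= Decimal("1000"),         # derived column
            }
        except (KeyError, ValueError, InvalidOperation) as exc:
            # Redirect the row instead of failing the whole load.
            bad.append(dict(row, error=f"{type(exc).__name__}: {exc}"))
            continue
        good.append(cleaned)
    return good, bad
```

Keeping the rejected rows with an error description makes it possible to report back to whoever provides the source.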

21 Data cleansing

22

23

24

25 Error handling / Email notification
Keep track of your packages when an error occurs:
  Organize your error files
  Back up in the right folder
  Display the right error message
  Send a notification message to the right person
  The subject of the email must be clear

26 Capture errors in a text file
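The package that does this is shown only as an image, so the following is a hedged Python sketch of the general idea rather than the presenter's implementation: append each error to a text file, and keep the files organized by package and date. The folder layout, the file naming and the function name write_error_file are all assumptions.

```python
import os
from datetime import datetime


def write_error_file(error_root, package_name, message, when=None):
    """Append one error line to <error_root>/<package>/<date>/<package>_<time>.txt."""
    when = when or datetime.now()
    folder = os.path.join(error_root, package_name, when.strftime("%Y-%m-%d"))
    os.makedirs(folder, exist_ok=True)  # one folder per package per day
    path = os.path.join(folder, f"{package_name}_{when.strftime('%H%M%S')}.txt")
    with open(path, "a", encoding="utf-8") as handle:
        stamp = when.isoformat(timespec="seconds")
        handle.write(f"{stamp}\t{package_name}\t{message}\n")
    return path
```

Returning the path lets a later step attach the same file to a notification.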

27

28 SEE ATTACHED SAMPLE

29 Email notification Use SSIS Variables to set your SMTP object SEE ATTACHED SAMPLE
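The attached sample is not part of this transcript. As an illustration of driving the mail settings from variables rather than hard-coding them, here is a Python sketch using the standard smtplib and email.message modules. The settings dictionary stands in for SSIS variables, and its keys, the subject format and the function names are hypothetical.

```python
import os
import smtplib
from email.message import EmailMessage


def build_notification(settings, package_name, error_path, error_text):
    """Build an email with a clear subject and the error file attached."""
    msg = EmailMessage()
    msg["From"] = settings["from_addr"]
    msg["To"] = settings["to_addr"]
    # A clear subject: what failed, in a form that is easy to filter on.
    msg["Subject"] = f"[SSIS ERROR] {package_name} failed"
    msg.set_content(
        f"Package {package_name} reported an error.\n"
        f"Error file: {error_path}\n"
    )
    msg.add_attachment(error_text, subtype="plain",
                       filename=os.path.basename(error_path))
    return msg


def send_notification(settings, msg):
    """Send the message through the SMTP server named in the settings."""
    with smtplib.SMTP(settings["smtp_host"], settings["smtp_port"]) as server:
        server.send_message(msg)
```

Separating message building from sending means the subject and recipients can be checked without a mail server.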

30

