CSV Files and ETL The Good, Bad, and Ugly


1 CSV Files and ETL The Good, Bad, and Ugly
Eric Freeman

2 Comma-Separated Values: Overview

3 CSV Overview
CSV: comma-separated values
Plain text
Delimited text file
Each line is a new record
Not fully standardized!

4 CSV Evolution
1972: IBM Fortran compiler under OS/360
Input lists: commas or spaces only

5 CSV Evolution
1983: Osborne Executive computer with SuperCalc spreadsheet
Added quoted field containers

6 CSV Evolution
2005: RFC 4180 (standardization initiative)
Common Format and MIME Type for Comma-Separated Values (CSV) Files

7 RFC 4180
RFC 4180 Standardization Initiative
Each record is delimited by a line break (CRLF)
The last record may or may not end with a line break
A header line is optional; it has the same number of fields as the records
Double quotes may enclose fields: "abc","def","ghi" or abc,def,ghi
A double quote inside a quoted field is escaped by doubling it: "abc","de""f","ghi"
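
The quoting rules on this slide can be tried out with a short sketch. It uses Python's standard csv module rather than the PowerShell and SQL Server tooling demonstrated in the talk, and the helper names parse_csv and write_csv are made up for illustration. The module's default dialect is assumed to be close enough to RFC 4180 for these cases (doubled quotes as the escape, CRLF line endings on output).

```python
import csv
import io


def parse_csv(text):
    """Parse CSV text into a list of rows (each row is a list of strings)."""
    # newline="" keeps line breaks inside quoted fields untouched.
    return list(csv.reader(io.StringIO(text, newline="")))


def write_csv(rows):
    """Serialize rows with every field quoted and CRLF line endings."""
    buf = io.StringIO(newline="")
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\r\n")
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    # Quoted and unquoted records from the slide, including the escaped quote.
    sample = '"abc","de""f","ghi"\r\nabc,def,ghi\r\n'
    for row in parse_csv(sample):
        print(row)

    # A quoted field may also contain a comma or a line break.
    print(parse_csv('"a,b","line1\r\nline2"\r\n'))

    # Writing doubles the embedded quote again.
    print(repr(write_csv([["abc", 'de"f', "ghi"]])))
```

Reading and then writing the first sample record round-trips it unchanged, which is a quick way to check that a parser and a writer agree on the escaping rule.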

8 CSV Overview
Basic concept: clear
Line breaks
Commas
Quotes
Escape character

9 PowerShell CSV Functions
Export-Csv -InputObject <PSObject> [[-Path] <String>] [-LiteralPath <String>] [-Force] [-NoClobber] [-Encoding <String>] [-Append] [[-Delimiter] <Char>] [-IncludeTypeInformation] [-NoTypeInformation] [-WhatIf] [-Confirm]

10 Demo

11 PowerShell CSV Functions
Import-Csv [[-Delimiter] <Char>] [[-Path] <String[]>] [-LiteralPath <String[]>] [-Header <String[]>] [-Encoding <String>]

12 Demo

13 The Good
Simple file, comma delimiters only
BULK INSERT

14 Demo

15 The Bad
Huge CSV file with a consistent format
BULK INSERT with format file

16 Demo

17 The Ugly
Huge CSV file with changing format
Embedded quotes
May contain duplicate column names
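
Two of the problems on this slide, embedded quotes and duplicate column names, can be illustrated with a minimal sketch. It is written in Python with the standard csv module and is not the approach shown in the demo; dedupe_headers, read_records, and the "_2" suffix scheme are hypothetical choices made for this example. It assumes the file still follows RFC 4180 quoting and does not try to handle a column layout that changes partway through the file.

```python
import csv
import io


def dedupe_headers(headers):
    """Make column names unique by appending _2, _3, ... to repeats."""
    used = set()
    result = []
    for name in headers:
        candidate = name
        suffix = 2
        # Keep trying suffixes until the name is not already taken.
        while candidate in used:
            candidate = f"{name}_{suffix}"
            suffix += 1
        used.add(candidate)
        result.append(candidate)
    return result


def read_records(text):
    """Read CSV text into dicts keyed by de-duplicated header names."""
    reader = csv.reader(io.StringIO(text, newline=""))
    headers = dedupe_headers(next(reader))
    return [dict(zip(headers, row)) for row in reader]


if __name__ == "__main__":
    # Hypothetical sample: "id" appears twice, and one field holds
    # both a comma and escaped (doubled) quotes.
    sample = 'id,name,id\r\n1,"Smith, ""Bob""",7\r\n'
    for record in read_records(sample):
        print(record)
```

Renaming duplicates up front keeps the second "id" column from silently overwriting the first when rows are turned into keyed records.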

18 Demo

