Corporate Data Vault: Data Warehousing Workshop, Sept. 23 2015

1 Corporate Data Vault
Data Warehousing Workshop, Sept. 23 2015

2 Background to CDV Project
- Feb 2012 – Review of Corporate Data Model published
- Apr 2012 – Technical group set up
- Dec 2012 – Proposal for CDV sent to SMC

3 The Proposal
Option 1: File Store
- Data stored in the same format as lodged by the data custodian.
- Data retrieved only through the front-end application and copied to local work space.
Option 2: File Store with Direct Access
- Data stored in the same format as lodged by the data custodian.
- Data can be accessed directly by third party products (e.g. SAS).
Option 3: Database tables
- Data converted and stored in database tables with a similar structure to the source.
- Database tables can be accessed directly by third party products (e.g. SAS).
Option 4: Data warehouse
- Data converted and stored in standardised relational database tables.
- Database tables can be accessed directly by third party products (e.g. SAS).

4 Pros and Cons
Option 1: File Store
Pros:
- Simplest concept
- Lowest development effort
Cons:
- No direct access with 3rd party products
- Possible proliferation of copies of files in local work areas
- Long term usability of data more difficult to manage
Option 2: File Store Direct Access
Pros:
- Simple concept
- Provides direct access to data
Cons:
- Security more difficult to manage than for database options
- Long term usability of data more difficult to manage
Option 3: Database tables
Pros:
- Provides direct access to data
- Data stored in single platform
- Easier to manage long term usability issues
Cons:
- Data transformed from original format; transformed data may need validation
Option 4: Data warehouse
Pros:
- Provides direct access to data
- Standardised data in relational databases
- Enables easier linkages between data
- Opportunities to build other applications on the warehouse
Cons:
- Data transformed from original format; transformed data may need validation
- Difficult to design and build
- Business effort high as data standardisation required

5 Project Stage 1: Two Prototypes
- Early 2013: the SMC requested that working prototypes of both Option 2 and Option 3 be developed.
- Prototypes were designed, built and tested between June and Oct 2013.
- A recommendation on the optimal solution was submitted to the SMC in Nov 2013.

6 Design, Build and Assessment
Focus of system development
- In scope: Produce a working system
- Out of scope: Final screen designs
Functions of the system
- In scope: (1) Lodging data & metadata; (2) Storing data & metadata; (3) Viewing of catalogue
- Out of scope: (1) Security; (2) Reports
Testing of system
- In scope: Testing to focus primarily on the “happy path”. Only major bugs and issues to be addressed.
- Out of scope: Robust testing of the system
File types
- In scope: SAS files only, as (1) high risk; (2) benefit of variable metadata available within the file; (3) structured nature provided a suitable test for both prototypes
- Out of scope: All other file types

7 Issues with Database Prototype
- Issue: Unable to distinguish between a date and a date/time variable in a SAS dataset.
  Impact: The SAS dataset is rejected because the date/time column is created as a date, and a date/time variable cannot be loaded into a date column.
- Issue: Maximum length of a character variable is 16384.
  Impact: Character variables longer than 16384 will be truncated.
- Issue: Maximum number of columns currently allowed is 254.
  Impact: The SAS dataset is rejected if the number of variables exceeds 254.
- Issue: There are 995 different formats available in SAS.
  Impact: Data integrity may be compromised, or the dataset may be rejected, if an unknown format is encountered. Each format would need to be coded for individually in the conversion program.
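The four issues above can be read as pre-load checks on a dataset's variable metadata. The following is a minimal sketch of such a check, not the prototype's actual conversion program. The `Variable` record, the `check_dataset` function and the small format tables are hypothetical names introduced for illustration; only the limits 254 and 16384 come from the slide. It assumes that the format attached to a numeric variable is what tells a date from a date/time, which is how SAS itself distinguishes them.

```python
from dataclasses import dataclass
from typing import List, Optional

# Limits quoted on the slide for the database prototype.
MAX_COLUMNS = 254
MAX_CHAR_LENGTH = 16384

# Illustrative subset of SAS format names (assumption: a real converter
# would need an entry for every format it intends to support).
DATE_FORMATS = {"DATE9", "DDMMYY10", "MMDDYY10"}
DATETIME_FORMATS = {"DATETIME19", "DATETIME20"}
PLAIN_NUMERIC_FORMATS = {"BEST12", "COMMA12"}


@dataclass
class Variable:
    """Hypothetical description of one variable in a lodged SAS dataset."""
    name: str
    sas_type: str                      # "num" or "char"
    length: int
    sas_format: Optional[str] = None   # e.g. "DATE9." or None


def check_dataset(variables: List[Variable]) -> List[str]:
    """Return one finding per issue; an empty list means no issue was found."""
    findings = []
    if len(variables) > MAX_COLUMNS:
        findings.append(
            f"reject: {len(variables)} variables exceeds the limit of {MAX_COLUMNS}"
        )
    for v in variables:
        if v.sas_type == "char" and v.length > MAX_CHAR_LENGTH:
            findings.append(
                f"truncate: {v.name} has length {v.length} > {MAX_CHAR_LENGTH}"
            )
        if v.sas_type == "num" and v.sas_format:
            # Normalise "date9." to "DATE9" before looking it up.
            fmt = v.sas_format.upper().rstrip(".")
            if fmt in DATETIME_FORMATS:
                findings.append(
                    f"datetime: {v.name} needs a date/time column, not a date column"
                )
            elif fmt not in DATE_FORMATS | PLAIN_NUMERIC_FORMATS:
                findings.append(f"unknown-format: {v.name} uses {v.sas_format}")
    return findings
```

A caller could run `check_dataset` before conversion and reject or flag the dataset based on the prefixes of the returned findings. The format tables are deliberately tiny, which illustrates the slide's point: with 995 formats, every one not listed ends up in the unknown-format branch.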

8 Project Stage 2: CDV v1 Build & Design
- The second stage of the project involved the further design, build and testing of the file store solution.
- It also included information sessions for users and the initial “Go Live” of the CDV.
- This second stage ran from Jan 2014 until Dec 2014.

9 Project Stage 3: CDV v1 Implementation
- The third stage of the project has been ongoing since Jan 2015.
- Roll-out of the system across the office.
- Requirements gathering and specifications for CDV v2.

10 About the CDV
- Independent of production processes
- Data stored in the same format as lodged
- Access data through a third party product
- CDV v1 accepts SAS datasets only

11 Technical Specs
Three tier application:
- Client tier: Java
- Business Logic tier: WebLogic
- Data tier: Sybase database

12 Functionality
- Lodge Data and Metadata
- Browse/Search the Catalogue
- Reports
- Security

14 Lodge Data and Metadata: Step 1

15 Lodge Data and Metadata: Step 2

16 Variable Details Screen
- Link Classification from CARS

17 Metadata Stored
File Level: Survey Name, Periodicity, Time Period, Version No., Linked Themes, Micro/Macro Data, Reference Documentation, Description, Reason for Version, Date Lodged, Lodged By
Variable Level: Name, Description, Primary Key, Unit Type, Length, Data Type, Linked Classification Details
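The two metadata levels listed on this slide could be modelled as a pair of record types, with one file-level record holding its variable-level records. This is a sketch only: the class names, field types, optional fields and the `primary_key_names` helper are assumptions, and the way the flattened slide text is split into field names is an interpretation, not something the slide confirms.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VariableMetadata:
    """Variable-level metadata (field split is an interpretation of the slide)."""
    name: str
    description: str
    data_type: str
    length: int
    primary_key: bool = False
    unit_type: Optional[str] = None
    linked_classification: Optional[str] = None


@dataclass
class FileMetadata:
    """File-level metadata, holding the variables of one lodged dataset."""
    survey_name: str
    periodicity: str
    time_period: str
    version_no: int
    micro_or_macro: str
    description: str
    date_lodged: str
    lodged_by: str
    reason_for_version: Optional[str] = None
    linked_themes: List[str] = field(default_factory=list)
    reference_documentation: List[str] = field(default_factory=list)
    variables: List[VariableMetadata] = field(default_factory=list)

    def primary_key_names(self) -> List[str]:
        """Names of the variables flagged as part of the primary key."""
        return [v.name for v in self.variables if v.primary_key]


# Made-up example values, purely for illustration.
example = FileMetadata(
    survey_name="Example Survey",
    periodicity="Annual",
    time_period="2014",
    version_no=1,
    micro_or_macro="Micro",
    description="Illustrative lodgement",
    date_lodged="2015-01-15",
    lodged_by="example.user",
    variables=[
        VariableMetadata("resp_id", "Respondent identifier", "num", 8, primary_key=True),
        VariableMetadata("region", "Region code", "char", 2),
    ],
)
```

Nesting the variable records inside the file record keeps one lodgement's metadata together, which matches the slide's split into file level and variable level. A relational layout with two tables would be an equally plausible reading.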

26 Lodgement Summary

27 Access To Data

28 The End
Any Questions?

