1 Building and Implementing Integrated Data Models
Nancy Wills, Director, Access, Query and Data Mgmt
Ralph Hollinshead, Manager, Solutions Data Integration
Copyright © 2004, SAS Institute Inc. All rights reserved.

2 Overview
- Part One: Building an Integrated Data Model
- Part Two: Deploying and Scaling the Data Architecture

3 SAS® Banking Intelligence Solutions Framework
[Diagram] Solution labels: Customer Retention, X Sell / Up sell, Marketing Automation, Credit Scoring, Credit Risk, Strategic Performance Management, New Solutions. Foundation label: Banking Intelligence Architecture.
Callouts: Integrated, Extendable Architecture, Focused on Business Issues, Based on Experience

4 Independent Solutions
[Diagram] Layers:
- Solutions: SAS® Cross-Sell and Up-Sell for Banking, SAS® Customer Retention for Banking, SAS® Credit Scoring for Banking, SAS® Credit Risk Management
- Solution Data Marts
- Extract and Cleanse Files
- Enterprise Source Systems

5 Integrated Data Model: Not All Customers are the Same
- Customer A:
  - No Data Warehouse
  - Interested in Multiple SAS Solutions
- Customer B:
  - With Data Warehouse
  - Averse to Data Replication Issues
- Customer C:
  - With Data Warehouse
  - No Data Marts Allowed (Active Data Warehousing Approach)

6 Customer A: Full SAS Data Architecture
[Diagram] Layers:
- Solutions: SAS® Cross-Sell and Up-Sell for Banking, SAS® Customer Retention for Banking, SAS® Credit Scoring for Banking, SAS® Credit Risk Management
- Solution Data Marts
- SAS Banking Detail Data Store
- Extract and Cleanse Files
- Enterprise Source Systems
Flexible Options to Meet Customer Needs!

7 Customer B: Partial SAS Data Architecture
[Diagram] Layers:
- Solutions: SAS® Cross-Sell and Up-Sell for Banking, SAS® Customer Retention for Banking, SAS® Credit Scoring for Banking, SAS® Credit Risk Management
- Solution Data Marts
- Customer Enterprise Data Warehouse
- Extract and Cleanse Files
- Enterprise Source Systems
Flexible Options to Meet Customer Needs!

8 Customer C: Customer Data Architecture
[Diagram] Layers:
- Solutions: SAS® Marketing Automation
- Customer Enterprise Data Warehouse
- Extract and Cleanse Files
- Enterprise Source Systems

9 Scorecard for Data Architecture Approach

| Data Management Issue | Score |
|---|---|
| Sensitivity to Data Replication | minus 0 to 5 |
| Sensitivity to H/W processor and storage budget | minus 0 to 5 |
| Existing warehouse quality | minus 0 to 5 |
| Implementation time constraints | minus 0 to 5 |
| Intentions to implement >1 SAS solution | plus 0 to 5 |
| Historical data requirements | plus 0 to 5 |

| Score | Decision |
|---|---|
| -25 | No DDS. Marts only if absolutely necessary. Information maps may be appropriate. |
| 0 | Use DDS to persist current extract from source systems. Marts hold multiple extracts up to full history. |
| +25 | Implement full warehouse, persist history in DDS and as much as wanted in the marts. |
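
The scorecard arithmetic can be sketched in a few lines of Python. This is only an illustration: the factor names are hypothetical shorthand for the six issues on the slide, and because the slide gives decisions only at -25, 0, and +25, reading any other total by its sign is an assumption made here, not something the slide states.

```python
# Hypothetical factor names standing in for the six issues on the slide.
NEGATIVE_FACTORS = (
    "replication_sensitivity",
    "hardware_budget_sensitivity",
    "existing_warehouse_quality",
    "time_constraints",
)
POSITIVE_FACTORS = (
    "multiple_sas_solutions",
    "historical_data_requirements",
)


def total_score(ratings):
    """Add the positive ratings and subtract the negative ones (each 0 to 5)."""
    for name in NEGATIVE_FACTORS + POSITIVE_FACTORS:
        if not 0 <= ratings[name] <= 5:
            raise ValueError(f"{name} must be between 0 and 5")
    plus = sum(ratings[name] for name in POSITIVE_FACTORS)
    minus = sum(ratings[name] for name in NEGATIVE_FACTORS)
    return plus - minus


def leaning(score):
    """Assumption: read totals between the slide's anchor points by sign."""
    if score < 0:
        return "toward no DDS, marts only if absolutely necessary"
    if score > 0:
        return "toward a full warehouse with history persisted in the DDS"
    return "DDS persists the current extract, marts hold the history"


ratings = {
    "replication_sensitivity": 4,
    "hardware_budget_sensitivity": 3,
    "existing_warehouse_quality": 5,
    "time_constraints": 2,
    "multiple_sas_solutions": 1,
    "historical_data_requirements": 2,
}
print(total_score(ratings), leaning(total_score(ratings)))
```

A site with a strong existing warehouse and tight budgets ends up with a negative total, which points away from building a DDS.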

10 Techniques for Data Model Integration
- Detail Data Store
  - Varying Industries
  - General Standards
  - Warehousing Techniques
- Data Marts
  - Approach Compared to DDS

11 Integrating Models at the Industry Level

12 Detail Data Store Standards Needed for Integration
- Data Types / Lengths / Classifier Codes
- Naming Conventions
- Standards for Data Structures
  - Hierarchies
  - Subtypes
  - Reference Data

13 Data Administration Standards

| Domain | Data Type | Width | Applicable Class Codes | Comment/Example |
|---|---|---|---|---|
| Identifier | Varchar | 32 | ID | Typically the identifier from the source system. |
| Small Code | Varchar | 3 | CD | Short length codes such as ADDRESS_TYPE_CD |
| Medium Code | Varchar | 10 | CD | Medium length codes such as EXCHANGE_SYMBOL_CD |
| Large Code | Varchar | 20 | CD | Long length codes such as POSTAL_CD |
| Standard Count Code | Numeric | 6 | CNT | Standard counts such as AUTHORIZED_USERS_CNT |
| Name | Varchar | 40 | NM | Proper name. For example, LAST_NM, FIRST_NM, etc. |
| Short Length Text | Varchar | 20 | TXT | Short freeform text. |
| Medium Length Text | Varchar | 100 | TXT, DESC | Longer freeform text and descriptions associated with code tables. |
| Indicator Field | Character | 1 | FLG | Binary indicator flag (Y or N). |
| Surrogate Key | Numeric | 10 | RK, SK | Generated surrogate keys. |
| Currency Amount | Numeric | 18,5 | AMT | Standard currency amount. |
| Rates and Percentages | Numeric | 9,4 | PCT, RT | For example, exchange rates. |
| DateTime | Date | | DT, DTTM | Accommodate dates as well as date/time. |
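
One way to put a standards table like this to work is to check column names against it. The sketch below is an assumption about how such a check might look, not a tool from the presentation: it takes the text after the last underscore as the class code and lists the domains that allow it.

```python
# Domains and class codes transcribed from the standards table above.
DOMAINS = [
    ("Identifier", ("ID",)),
    ("Small Code", ("CD",)),
    ("Medium Code", ("CD",)),
    ("Large Code", ("CD",)),
    ("Standard Count Code", ("CNT",)),
    ("Name", ("NM",)),
    ("Short Length Text", ("TXT",)),
    ("Medium Length Text", ("TXT", "DESC")),
    ("Indicator Field", ("FLG",)),
    ("Surrogate Key", ("RK", "SK")),
    ("Currency Amount", ("AMT",)),
    ("Rates and Percentages", ("PCT", "RT")),
    ("DateTime", ("DT", "DTTM")),
]


def domains_for(column_name):
    """Return the domains whose class codes match the column's suffix."""
    suffix = column_name.rsplit("_", 1)[-1].upper()
    return [domain for domain, codes in DOMAINS if suffix in codes]


for column in ("ADDRESS_TYPE_CD", "BALANCE_AMT", "CUSTOMER_RK", "BALANCE"):
    matches = domains_for(column)
    print(column, "->", matches if matches else "no class code recognized")
```

A column whose suffix matches no class code, such as the hypothetical BALANCE, is flagged for review. A suffix like CD matches several domains, so the width still has to be chosen by hand.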

14 Detail Data Store: Data Warehousing Standards
Surrogate Keys, Point-in-Time, and Rapidly Changing Data

CUSTOMER

| CUSTOMER_RK | VALID_FROM_DT | VALID_TO_DT | ACCOUNT_RK | MARITAL_STATUS_CD | FIRST_NM | LAST_NM |
|---|---|---|---|---|---|---|
| 100 | 01JAN1999 | 29FEB2000 | 201 | S | John | Smith |
| 100 | 01MAR2000 | 31DEC4747 | 201 | M | John | Smith |

FINANCIAL_ACCOUNT

| ACCOUNT_RK | VALID_FROM_DT | VALID_TO_DT | CUSTOMER_RK | FINANCIAL_ACCOUNT_TYPE_CD | OPEN_DT |
|---|---|---|---|---|---|
| 201 | 01JAN1999 | 31DEC4747 | 100 | SAVINGS | 01JAN2000 |

FINANCIAL_ACCOUNT_CHNG

| ACCOUNT_RK | VALID_FROM_DT | VALID_TO_DT | BALANCE_AMT | CURRENCY_CD |
|---|---|---|---|---|
| 201 | 01JAN1999 | 31JAN1999 | 2500.75 | USD |
| 201 | 01FEB1999 | 28FEB1999 | 4300.25 | USD |
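
The point-in-time idea behind the VALID_FROM_DT and VALID_TO_DT columns can be sketched as a small Python lookup. The rows are the CUSTOMER example from this slide (trimmed to the columns used); the function name `as_of` and the inclusive date comparison are assumptions made for illustration.

```python
from datetime import date

# CUSTOMER rows from the slide; 31DEC4747 marks the currently valid version.
CUSTOMER = [
    {"CUSTOMER_RK": 100, "VALID_FROM_DT": date(1999, 1, 1),
     "VALID_TO_DT": date(2000, 2, 29), "MARITAL_STATUS_CD": "S"},
    {"CUSTOMER_RK": 100, "VALID_FROM_DT": date(2000, 3, 1),
     "VALID_TO_DT": date(4747, 12, 31), "MARITAL_STATUS_CD": "M"},
]


def as_of(rows, key_column, key, when):
    """Return the version of a row that was valid on the given date, if any."""
    for row in rows:
        if (row[key_column] == key
                and row["VALID_FROM_DT"] <= when <= row["VALID_TO_DT"]):
            return row
    return None


before = as_of(CUSTOMER, "CUSTOMER_RK", 100, date(1999, 6, 15))
after = as_of(CUSTOMER, "CUSTOMER_RK", 100, date(2001, 1, 1))
print(before["MARITAL_STATUS_CD"], after["MARITAL_STATUS_CD"])
```

The surrogate key stays the same across versions, so the same CUSTOMER_RK can answer both "what is true now" and "what was true then".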

15 Conformed Dimensions

16 Tools: Extending Models
[Diagram] Entities: CUSTOMER, EXTERNAL_ORG, SUPPLIER, INTERNAL_ORG, INTERNAL_ORG_ASSOC, INTERNAL_ORG_ASSOC_TYPE, COMPETITORS

17 Change Analysis Tool

18 Deploying the Integrated Data Architecture

19 Option A: Full SAS Data Architecture
[Diagram] Layers:
- Solutions: SAS® Cross-Sell and Up-Sell for Banking, SAS® Customer Retention for Banking, SAS® Credit Scoring for Banking, SAS® Credit Risk Management
- Solution Data Marts
- SAS Banking Detail Data Store
- Extract and Cleanse Files
- Enterprise Source Systems
Flexible Options to Meet Customer Needs!

20 Populate DDS and Data Mart
- Source Data: Excel, SAS, SAP, Oracle, PeopleSoft
- Step 1: Extract, cleanse, and transform from source data into flat file (Flat File)
- Step 2: ETL processing to load data warehouse (Data Warehouse DDS)
  - data validation
  - key creation
  - slowly changing dimensions
- Step 3: Transform into data mart model (Banking Data Mart)
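
The slowly changing dimension handling named in Step 2 can be sketched as a Type 2 update: close the current version of a row and append a new one. This is a minimal Python illustration under stated assumptions, not the SAS ETL Studio load process itself. The function name `apply_change` is hypothetical, and the open-ended date 31DEC4747 and the column names follow the example rows shown earlier in the deck.

```python
from datetime import date, timedelta

OPEN_END = date(4747, 12, 31)  # open-ended VALID_TO_DT, as in the deck's example rows


def apply_change(rows, key_column, key, new_values, effective):
    """Close the current version of a row and append the new version.

    Returns a new list; the input rows are not modified.
    """
    result = []
    current = None
    for row in rows:
        if row[key_column] == key and row["VALID_TO_DT"] == OPEN_END:
            current = row
            # The old version stops being valid the day before the change.
            result.append({**row, "VALID_TO_DT": effective - timedelta(days=1)})
        else:
            result.append(dict(row))
    base = current if current is not None else {key_column: key}
    result.append({**base, **new_values,
                   "VALID_FROM_DT": effective, "VALID_TO_DT": OPEN_END})
    return result


customers = [{"CUSTOMER_RK": 100, "MARITAL_STATUS_CD": "S",
              "VALID_FROM_DT": date(1999, 1, 1), "VALID_TO_DT": OPEN_END}]
customers = apply_change(customers, "CUSTOMER_RK", 100,
                         {"MARITAL_STATUS_CD": "M"}, date(2000, 3, 1))
for row in customers:
    print(row)
```

A production load would also compare incoming values with the current version and skip rows that have not changed; that check is left out here to keep the sketch short.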

21 Deployment Focus: Scalability and Performance
- ETL flows
- Physical data model

22 Deployment: What Did We Do?
- Create and Generate Data
- Deploy Hardware and Software
- Populate DDS
- Populate Data Mart
- Analyze ETL Flows
- Analyze DDS Model
- Change Management

23 It All Starts with Data
- Bought and Built Data Generators
- Built Simulated Data
- Applied Business Rules
- Scaled: 5 GB -> 50 GB -> 500 GB -> 1 TB
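
A simulated-data generator with a business rule applied can be sketched as below. Everything specific in it is hypothetical: the CHECKING account type, the balance range, and the minimum-balance rule are invented for illustration and do not come from the presentation.

```python
import random


def generate_accounts(count, seed=0):
    """Generate simulated account rows, reproducible for a given seed."""
    rng = random.Random(seed)
    rows = []
    for i in range(count):
        account_type = rng.choice(["SAVINGS", "CHECKING"])
        balance = round(rng.uniform(0, 10000), 2)
        # Hypothetical business rule: savings accounts hold at least 100.00.
        if account_type == "SAVINGS" and balance < 100:
            balance = 100.00
        rows.append({
            "ACCOUNT_RK": 201 + i,
            "FINANCIAL_ACCOUNT_TYPE_CD": account_type,
            "BALANCE_AMT": balance,
            "CURRENCY_CD": "USD",
        })
    return rows


sample = generate_accounts(5, seed=42)
for row in sample:
    print(row)
```

Fixing the seed makes each test run repeatable, and raising `count` is one simple way to scale the same rules to a larger volume.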

24 Deploy Hardware and Software
- Choose Software Components
  - SAS for the DDS or Data Warehouse
  - Databases for the DDS or Data Warehouse
  - SAS for the Data Marts
- Install and Configure SAS Software
- Configure Hardware
- Design for Progressively Larger Deployment Growth

25 Windows Server
- Dell PowerEdge 1600SC
- Windows 2003
- Dual hyper-threaded 2.8 GHz processors
- 4 GB RAM
- 4 internal IDE drives
- 60 GB C drive
- 275 GB D drive
- Single I/O channel
- 5 GB -> 50 GB of data

26 AIX UNIX Servers
- IBM P630 eServer
  - AIX 5.3
  - 4 processors
  - 4 I/O channels
  - 8 GB RAM
  - 4x72 GB disks
  - 14-drive SCSI storage array
  - 50 GB -> 500 GB of data
- IBM P670 eServer
  - AIX 5.3
  - 16 processors
  - 8 x 1-gig fiber I/O channels
  - Dynamic logical partitioning
  - 2 TB disks
  - 500 GB -> 1 TB of data

27 Populate DDS and Data Mart
- Ran ETL Flows
  - Registered in SAS Metadata Repository
  - Loaded Data into Tables
  - Use Slowly Changing Dimension Load Process
- Analyze ETL Flows

28 Example of SAS ETL Studio Flow Analysis

29 Change Management
- Loaded New Release of DDS in TST Repository
- Compared PRD Repository to TST Repository
- Ran Batch Reports to Examine Differences
- Ran Impact Analysis on Column and Table
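
The PRD-to-TST comparison can be pictured as a diff over table and column definitions. The Python sketch below is an assumption about the general shape of such a comparison, not the SAS change analysis tool. It represents each repository as a hypothetical dictionary of table name to column name to type, and the sample columns are invented for the example.

```python
def compare_models(prd, tst):
    """List differences between two models given as {table: {column: type}}."""
    differences = []
    for table in sorted(set(prd) | set(tst)):
        if table not in tst:
            differences.append(("table removed", table, None))
            continue
        if table not in prd:
            differences.append(("table added", table, None))
            continue
        for column in sorted(set(prd[table]) | set(tst[table])):
            old = prd[table].get(column)
            new = tst[table].get(column)
            if old is None:
                differences.append(("column added", table, column))
            elif new is None:
                differences.append(("column removed", table, column))
            elif old != new:
                differences.append(("column changed", table, column))
    return differences


prd = {"CUSTOMER": {"CUSTOMER_RK": "Numeric(10)", "LAST_NM": "Varchar(40)"}}
tst = {
    "CUSTOMER": {"CUSTOMER_RK": "Numeric(10)", "LAST_NM": "Varchar(60)",
                 "BIRTH_DT": "Date"},
    "SUPPLIER": {"SUPPLIER_RK": "Numeric(10)"},
}
for difference in compare_models(prd, tst):
    print(difference)
```

Each reported difference is a starting point for impact analysis: a changed column or table has to be traced to the ETL flows and marts that read it.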

30 What Did We Find?
- Specific Techniques that Work Best
- Recommendations
Tremendous Performance Gains!

31 Specific Techniques Examples: ETL Flows
- Parallel ETL flows
- SAS coding techniques to use
- Use hash table instead of look up
- Make sure the I/O buffer size is tuned
- Drop constraints
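
The hash-table advice is about resolving keys from an in-memory structure instead of a repeated search or join per row. In SAS this is typically done with the DATA step hash object; the Python sketch below uses a dictionary as an analogy only. ACCOUNT_ID is a hypothetical natural key from a source system, and both function names are invented for the example.

```python
def build_lookup(rows, key_column, value_column):
    """Load the lookup table into memory once, keyed for constant-time access."""
    return {row[key_column]: row[value_column] for row in rows}


def add_surrogate_keys(fact_rows, lookup, natural_key, surrogate_key):
    """Attach the surrogate key to each row; unmatched rows get None."""
    return [{**row, surrogate_key: lookup.get(row[natural_key])}
            for row in fact_rows]


accounts = [
    {"ACCOUNT_ID": "A-1", "ACCOUNT_RK": 201},
    {"ACCOUNT_ID": "A-2", "ACCOUNT_RK": 202},
]
balances = [
    {"ACCOUNT_ID": "A-2", "BALANCE_AMT": 4300.25},
    {"ACCOUNT_ID": "A-9", "BALANCE_AMT": 10.00},
]
lookup = build_lookup(accounts, "ACCOUNT_ID", "ACCOUNT_RK")
for row in add_surrogate_keys(balances, lookup, "ACCOUNT_ID", "ACCOUNT_RK"):
    print(row)
```

The lookup table is read once, and no sort of the larger table is needed. The trade-off is that the lookup has to fit in memory.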

32 Specific Techniques Examples: DDS Model
- Indexes: when and when not to add
- Denormalized some tables
- Separate tables for data with high volume changes
- Partition data by usage (date ranges)
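
Partitioning by date range can be illustrated with a small Python grouping step. The monthly granularity is an assumption, since the slide says only "date ranges"; the rows reuse the FINANCIAL_ACCOUNT_CHNG example values shown earlier in the deck.

```python
from collections import defaultdict
from datetime import date


def partition_by_month(rows, date_column):
    """Group rows into partitions keyed by (year, month) of the given date."""
    partitions = defaultdict(list)
    for row in rows:
        value = row[date_column]
        partitions[(value.year, value.month)].append(row)
    return dict(partitions)


changes = [
    {"ACCOUNT_RK": 201, "VALID_FROM_DT": date(1999, 1, 1), "BALANCE_AMT": 2500.75},
    {"ACCOUNT_RK": 201, "VALID_FROM_DT": date(1999, 2, 1), "BALANCE_AMT": 4300.25},
]
for key, rows in sorted(partition_by_month(changes, "VALID_FROM_DT").items()):
    print(key, len(rows))
```

A query for one month then reads only that partition, which matters most for the high-volume change tables.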

33 Recommendations
- Debugging techniques
- Sorting and memory usage
- Joins
- Understand disk requirements
- I/O optimization
- Compression and performance

34 Above All
- Write ETL
- Test, Tune
- Test, Tune!!!!

35 Summary and Conclusions
- Data integration is key
- Different approaches for customers
- Change management is vital
- Performance tuning is vital
- Technology evolving

36 Questions?

