An Agile Approach to Building & Managing Data Warehouses A Briefing by WhereScape Mary Edie Meredith, Sr. Technical Analyst -

2 An Agile Approach to Building & Managing Data Warehouses A Briefing by WhereScape Mary Edie Meredith, Sr. Technical Analyst - maryedie@wherescape.com

3 2 Why do Data Warehouse Projects struggle?
Gartner notes that over 50% of data warehouse projects fail or go wildly over budget.
1. Inaccurate business requirements - #1 problem (IDC)
2. Poor development productivity
3. Slow development cycles
4. High cost of resources
5. High TCO
6. Poor documentation – usually the last thing that is considered & never up to date
7. Poor data quality
8. HIGH RISK

4 3 Where did they go wrong? – one real problem is the “Big Bang” project approach
“Incremental Data Warehouse Development – The Only Way to Fly”, Bill Inmon, Jan 8, 2009 (BeyeNetwork)
–“There are many reasons the ‘Big Bang’ approach doesn’t work … but at the heart is inability of the development analyst to gather requirements in the manner prescribed by the SDLC”
–“End users of analytical systems need to know what the possibilities are before they can articulate the requirements.”
The goal is NOT to build a Data Warehouse, but rather…
–Deliver real value
–Create a solution that is adaptable because responding quickly to change brings competitive advantage
–Create a process to develop and maintain the solution that is trustworthy and sustainable

5 4 How would agile proponents approach the problem?
From the agile manifesto: //agile
Early, frequent, and continuous test and delivery of valuable working software (every 2 weeks to 2 months).
Welcome changing requirements, even late in development.
Business people and developers work together daily throughout the project.
Build projects around motivated individuals. Give them the environment and support they need, and trust them to get the job done.
The most efficient, effective method of conveying information to and within a development team is face-to-face conversation.
Continuous attention to technical excellence and good design enhances agility.
Simplicity--the art of maximizing the amount of work not done--is essential.
At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly.

6 5 What is uncomfortable about this approach?
The further out in time, the less a project team can say about what will be accomplished.
An agile approach can break the rules.
–Agile implementers sometimes wrongly assume you can break ANY rule.
–Shortcuts do not equal Quality Pragmatism
Classic trade-offs for project managers - Schedule/ Scope/ Resources/ Quality – agile leaves little wiggle room.
Does not lend itself to outsourcing, distributed teams.
Having a close working relationship with business users does not solve the difficulty determining requirements.
And ….

7 6 If I could deliver something meaningful in weeks DON’T YOU THINK I WOULD HAVE, ALREADY.

8 7 Agile Approach Versus Traditional Approach Docs?

9 8 What really works using agile “The WhereScape Way”
A Governance structure
–Strategy, Architecture, Roadmap, Standards
–Goals, sponsors, infrastructure, data governance ….
New Development Paradigm for delivering data - RED
–ETL tools are great for moving data, but RED can do the DW part better.
–Integrated Development using one metadata-driven tool.
–Do the data delivery in the database.
–Incorporate Business Rules into the data delivery process
Iterative workshops with business users
–Use REAL DATA for fleshing out requirements (RED enables this)
–Track all issues discovered, especially data quality

10 9 Agile in Operation
Integrate analysis, design, creation, data delivery, deployment, iteration
Useful even if you just need to provide the presentation layer
Feedback from business users on live data is part of the development process
Live Data Workshop
Business User Sessions

11 10 Speeding up the development by leveraging metadata, embedding best practice methods dim_customer_key dss_update_time
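
The idea of generating warehouse objects from metadata can be sketched as follows. This is only an illustration and not RED's actual metadata format or generator: the column list and the build_dimension_ddl helper are hypothetical, and SQLite stands in for the target database. It shows housekeeping columns such as dim_customer_key and dss_update_time being added by the generator rather than typed by hand.

```python
import sqlite3

# Hypothetical column metadata for a customer dimension (not RED's real format).
CUSTOMER_COLUMNS = [
    {"name": "code", "type": "TEXT"},  # business key from the source system
    {"name": "name", "type": "TEXT"},
]


def build_dimension_ddl(table, columns):
    """Render CREATE TABLE text from metadata, adding standard housekeeping columns."""
    lines = [f"{table}_key INTEGER PRIMARY KEY"]  # surrogate key, e.g. dim_customer_key
    lines += [f"{col['name']} {col['type']}" for col in columns]
    lines.append("dss_update_time TEXT")  # audit column: when the row last changed
    return f"CREATE TABLE {table} (\n  " + ",\n  ".join(lines) + "\n)"


ddl = build_dimension_ddl("dim_customer", CUSTOMER_COLUMNS)

# Run the generated statement against an in-memory database to check it is valid SQL.
conn = sqlite3.connect(":memory:")
conn.execute(ddl)
created = [row[1] for row in conn.execute("PRAGMA table_info(dim_customer)")]
print(created)
```

Because every dimension goes through the same helper, the surrogate key and audit column are named consistently across tables, which is the kind of embedded best practice the slide refers to.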

12 11 Data Warehouse Scenario – Build a Sales Fact

13 12 Star schema creation scenario – start with load table Source Warehouse     Oracle, SQL/Server, Teradata, DB2 Native RDBMS, ODBC accessible, Files

14 13 RED Browser Mode Metadata Results Actions Drag and Drop Target Area Browsing Connections Choose connection and filtering

15 14 For the Teradata shop -

16 15 Star schema creation scenario – start with load table Source Warehouse     Oracle, SQL/Server, Teradata, DB2 Native RDBMS, ODBC accessible, Files

17 16 Drag and Drop Example: load source data

18 17 Drag and Drop Example: load table properties

19 18 Drag and Drop Example: load table storage mapping

20 19 Drag and Drop Example: load table “create and load” metadata

21 20 Drag and Drop Example: load table results create generated load script execution

22 21 Drag and Drop Example: load table results create generated load script execution Display Data

23 22 Stage table creation scenario – the stage table Source Warehouse     Foreign dimension Keys, lookups Source table join

24 23 Stage table: start with load_order_header (Drag and Drop)

25 24 Add columns from load_order_line (Drag and Drop) Load_order_header Column metadata

26 25 Add columns from load_order_line (Drag and Drop) prevents duplicate column names Load_order_header Column metadata

27 26 Add FK cols to Stage Table – Drag and Drop dim_* Drag and drop Dimension table keys

28 27 Column Metadata easily altered

29 28 Column Transformations – Business Rules, Computed Fields, String Manipulation, Type Conversion, Null handling,…

30 29 Create the Stage Table (right click object)

31 30 Create the update procedure (object Properties)

32 31 …then select Procedure Type

33 32 … then specify the Join statement add appropriate clauses Numerous joins supported

34 33 …indicate the business key to identify SK in Dimension Prompts if column names match

35 34 …indicate the join column if names are different

36 35 Procedure is created, compiled. Execute Procedure.
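
A rough sketch of what a stage-table update of this kind does, written against SQLite rather than as the stored procedure RED generates. The table names load_order_header, load_order_line and dim_customer come from the slides; the stage_sales table, every column name and the sample rows are assumptions made for illustration. The header and line tables are joined, the customer's business key is looked up in the dimension, and unmatched customers fall back to key 0, following the "zero row" convention the deck describes for unknowns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table layouts and sample rows; only the table names come from the deck.
conn.executescript("""
CREATE TABLE load_order_header (order_id INTEGER, customer_code TEXT);
CREATE TABLE load_order_line (order_id INTEGER, line_no INTEGER, quantity INTEGER);
CREATE TABLE dim_customer (dim_customer_key INTEGER PRIMARY KEY, code TEXT);
CREATE TABLE stage_sales (order_id INTEGER, line_no INTEGER,
                          dim_customer_key INTEGER, quantity INTEGER);

INSERT INTO load_order_header VALUES (1, 'C1'), (2, 'C9');
INSERT INTO load_order_line VALUES (1, 1, 5), (1, 2, 3), (2, 1, 7);
INSERT INTO dim_customer VALUES (0, 'UNKNOWN'), (1, 'C1');
""")

# Join the two load tables, then resolve the business key to a surrogate key.
# The LEFT JOIN plus COALESCE sends customers missing from the dimension to key 0.
conn.execute("""
INSERT INTO stage_sales (order_id, line_no, dim_customer_key, quantity)
SELECT h.order_id, l.line_no, COALESCE(c.dim_customer_key, 0), l.quantity
FROM load_order_header AS h
JOIN load_order_line AS l ON l.order_id = h.order_id
LEFT JOIN dim_customer AS c ON c.code = h.customer_code
""")

stage_rows = conn.execute(
    "SELECT order_id, line_no, dim_customer_key, quantity "
    "FROM stage_sales ORDER BY order_id, line_no"
).fetchall()
print(stage_rows)
```

Order 2 references customer 'C9', which is not in the dimension, so its stage row carries key 0 instead of being dropped by the join.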

37 36 Display Data

38 37 Fact table creation scenario – Sales Fact table Source Warehouse    

39 38 Create the Fact Table from the Stage table

40 39 Metadata leveraged to create the code Dimension tables are created with “zero” row for unknowns Join metadata Transformation for quantity column

41 40 Auto generated stored procedure code …
Keeps all the data movement in the database
Provides consistent variable naming, coding best practices
Utilizes custom parameters you can embed in metadata
Includes error checking and rollbacks
Preserves the metadata for easy modification
Can augment with custom procedures
Includes best-practice features for various object types
–Can handle slowly changing dimensions (all three types)
–Procedure provided to populate and update time dimension
–Handles code for surrogate keys, update and life-span dates
–Creates Unknown Row for each dimension table
–Accounts for missing dimension key matches in source data
Lets advanced developers skip the mundane
Allows less experienced developers to be productive
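
To make the slowly changing dimension and life-span date items concrete, here is a minimal Type 2 sketch in Python with SQLite. It is not the code RED generates: the apply_scd2 function, the start_date, end_date and current_flag columns, and the open-ended date value are all assumptions chosen for illustration. When a tracked attribute changes, the current row is closed off and a new row with a fresh surrogate key is inserted.

```python
import sqlite3

OPEN_END = "9999-12-31"  # assumed marker for "still current"

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE dim_customer (
    dim_customer_key INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
    code TEXT,            -- business key
    name TEXT,            -- tracked attribute
    start_date TEXT,
    end_date TEXT,
    current_flag INTEGER
)
""")


def apply_scd2(conn, code, name, as_of):
    """Apply one source record to the dimension using Type 2 history."""
    row = conn.execute(
        "SELECT dim_customer_key, name FROM dim_customer "
        "WHERE code = ? AND current_flag = 1",
        (code,),
    ).fetchone()
    if row is not None and row[1] == name:
        return  # attribute unchanged: keep the current version
    if row is not None:
        # Close the life span of the version being replaced.
        conn.execute(
            "UPDATE dim_customer SET end_date = ?, current_flag = 0 "
            "WHERE dim_customer_key = ?",
            (as_of, row[0]),
        )
    conn.execute(
        "INSERT INTO dim_customer (code, name, start_date, end_date, current_flag) "
        "VALUES (?, ?, ?, ?, 1)",
        (code, name, as_of, OPEN_END),
    )


apply_scd2(conn, "C1", "Acme Ltd", "2024-01-01")   # first sighting: insert
apply_scd2(conn, "C1", "Acme Ltd", "2024-02-01")   # no change: nothing happens
apply_scd2(conn, "C1", "Acme Corp", "2024-03-01")  # change: expire and insert
history = conn.execute(
    "SELECT name, start_date, end_date, current_flag "
    "FROM dim_customer ORDER BY dim_customer_key"
).fetchall()
print(history)
```

Fact rows loaded before the change keep pointing at the expired row's surrogate key, which is what preserves history in a Type 2 dimension.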

42 41 Generated Procedures with version compares

43 42 Next Step – Business User review
Easy vehicles to show this to Business users:
–Output table data to Excel
–Stress test with SSAS cube

44 43 Create a SSAS Cube for Business User Eval Drag and Drop Fact to OLAP Cube target Creates OLAP dimensions Creates OLAP measure group

45 44 Create a SSAS Cube for Business User Eval Slice and Dice in Analysis Services

46 45 Capturing Metadata - Lineage information

47 46 Leveraging Metadata: Reports

48 47 Ready to Deploy

49 48 Scheduler to manage objects and data flow Run in parallel

50 49 Scheduler to manage objects and data flow Run in parallel
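
One way to picture a scheduler that runs independent objects in parallel while respecting data flow is sketched below. This is an assumption-laden illustration, not RED's scheduler: the dependency map, the job names beyond those in the slides, and the run_job stand-in are hypothetical. Jobs whose prerequisites are complete are run together as a wave, and the next wave starts once they finish.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical job graph: each job lists the jobs that must finish first.
DEPENDENCIES = {
    "load_order_header": [],
    "load_order_line": [],
    "stage_sales": ["load_order_header", "load_order_line"],
    "fact_sales": ["stage_sales"],
}


def run_job(name):
    """Stand-in for executing a load or update procedure; returns the job name."""
    return name


def run_schedule(dependencies, max_workers=4):
    """Run jobs in waves; jobs within a wave execute in parallel threads."""
    done, waves = set(), []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while len(done) < len(dependencies):
            ready = sorted(
                job for job, deps in dependencies.items()
                if job not in done and all(dep in done for dep in deps)
            )
            if not ready:
                raise ValueError("cyclic or unsatisfiable dependencies")
            finished = list(pool.map(run_job, ready))  # blocks until the wave ends
            done.update(finished)
            waves.append(finished)
    return waves


waves = run_schedule(DEPENDENCIES)
print(waves)
```

The two load jobs have no prerequisites, so they share the first wave; the stage and fact jobs follow in order because each depends on the one before it.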

51 50 Diagrammatical View Example: Update Job

52 51 Application Files to promote to QA and Production

53 52 Leveraging Metadata: Auto Producing Documentation

54 53 User Documentation

55 Where RED fits

56 55 "WhereScape promised a lot and the product has delivered. We are very happy with the amount of time it is saving us in development, as well as the documentation it is producing and the built-in scheduler. I am very happy with the purchase." "We estimate the development lifecycle is 20-25% of what it was previously when we were hand-coding." Dan Mosher, Director of Enterprise Data Warehousing

57 “WhereScape RED offers IPC a sophisticated Lifecycle Methodology that guides us through the process of building our data warehouse. RED creates integrated database objects such as tables, indexes, procedures, etc.; produces standard yet customizable T-SQL code and auto-generated user and technical documentation.” Maylee Sanchez, Sr. Database Administrator

58 57 Some WhereScape Customers

59 58 Conclusion
Build Your Data Warehouse Solution
–Way Faster
–Way Cheaper
–Ready for Change
Get Full Documentation
–For Users
–For Techies
And DO IT THE AGILE WAY

60 59 BACKUP SLIDES

61 60 Tools and Reports

62 61 Additional CUBE Features
Can add MDX calculations to the cube metadata for calculated members
–Specify font, foreground/background colors, boldness, display format, non-empty behavior, order number, client visibility
Canned MDX calculations
–Month/Year to date, Moving Qtr/Year, same month previous year, previous year to date.
Can specify Post Create or Post Update XML/A Scripts
–Allows features built outside of RED to be added to the scheduled cube processing (e.g. security roles, perspectives, translations)
Cube properties include
–Processing modes for Cubes (Regular, Lazy Aggregation) and priority
–OLAP dimension processing (together or separately)
–Cube visibility to client applications
–Default Measure and estimated rows
Can optionally drop Dimensions, Measure Groups, Cubes, and Cube databases from within RED.
Can manage KPIs, partitioning, and processing for measure groups

