Data Warehouse Development Methodology

Presentation on theme: "Data Warehouse Development Methodology"— Presentation transcript:

1 Data Warehouse Development Methodology

2 Waterfall Methodology

3 Waterfall Methodology
With infrastructure setup and project management

4 Iterative Methodology
Incremental approach:
- Top-down incremental approach
- Bottom-up incremental approach

Warehouse Development Approaches
The most challenging aspect of data warehousing lies not in its technical difficulty, but in choosing the best approach to data warehousing for your company’s structure and culture, and dealing with the organizational and political issues that will inevitably arise during implementation. Among the different approaches to developing a data warehouse are:
- “Big bang” approach
- Incremental approach
  - Top-down incremental approach
  - Bottom-up incremental approach

5 Top-Down Approach
- Analyze requirements at the enterprise level
- Develop conceptual information model
- Identify and prioritize subject areas
- Complete a model of selected subject area
- Map to available data
- Perform a source system analysis
- Implement base technical architecture
- Establish metadata, extraction, and load processes for the initial subject area
- Create and populate the initial subject area data mart within the overall warehouse framework

Top-Down Incremental Approach

Advantages
This approach has the following advantages:
- Provides a relatively quick implementation and payback. Typically, the scoping, definition study, and initial implementation are scaled down so that they can be completed in six to seven months.
- Offers significantly lower risk because it is less analysis heavy than the “big bang” approach
- Emphasizes high-level business needs
- Achieves synergy among subject areas. Maximum information leverage is achieved because cross-functional reporting and a single version of the truth are made possible.

Disadvantages
This approach has the following disadvantages:
- Requires higher up-front costs before the business sees any return on its investment
- Makes the boundaries of the scoping exercise difficult to define if the business is global
- May not be suitable unless the client needs cross-functional reporting

6 Bottom-Up Approach
- Define the scope and coverage of the data warehouse and analyze the source systems within this scope
- Define the initial increment based on political pressure, assumed business benefit, and data volume
- Implement base technical architecture and establish metadata, extraction, and load processes as required by the increment
- Create and populate the initial subject areas within the overall warehouse framework

Bottom-Up Incremental Approach
This approach is similar to the top-down approach, but the emphasis is on the data rather than the business benefit. Here, IT is in charge of the project, either because IT wants to be in charge or because the business has deferred the project to IT.

Advantages
This approach has the following advantages:
- This is a “proof of concept” type of approach, so it is often appealing to IT.
- It is easier to get IT buy-in for this approach because it is focused on IT.

Disadvantages
This approach has the following disadvantages:
- Because the solution model is typically developed from source systems, and these source systems encapsulate the current business processes, the overall extensibility of the model is compromised.
- IT staff are often the last to know about business changes, so IT could be designing something that will be out of date before it is delivered.
- Because the framework of definition in this approach tends to be much narrower, a significant amount of reengineering work is often required for each increment.

7 Incremental Approach to Warehouse Development
- Multiple iterations
- Shorter implementations
- Validation of each phase

[Diagram: Increment 1 passes through Strategy, Definition, Analysis, Design, Build, and Production; labeled “Iterative”]

Incremental Approach
The incremental approach manages the growth of the data warehouse by developing incremental solutions that comply with the full-scale data warehouse architecture. Rather than building an entire enterprisewide data warehouse as a first deliverable, start with just one or two subject areas, implement them as scalable data marts, and roll them out to your end users. Then, after observing how users actually use the warehouse, add the next subject area or the next increment of functionality to the system. This is also an iterative process, and it is this iteration that keeps the data warehouse in line with the needs of the organization.

Benefits
- Delivers a strategic data warehouse solution through incremental development efforts
- Provides extensible, scalable architecture
- Supports the information needs of the enterprise organization
- Quickly provides business benefit and ensures a much earlier return on investment
- Allows a data warehouse to be built one subject or application area at a time
- Allows the construction of an integrated data mart environment

8 Methodology
- Ensures a successful data warehouse
- Encourages incremental development
- Provides a staged approach to an enterprisewide warehouse:
  - Safe
  - Manageable
  - Proven
  - Recommended

Methodology
A methodology is a set of detailed steps or procedures to accomplish a defined goal. Employing a methodology for the development of any system is always important, and in a warehouse environment even more so. The warehouse is such a big investment, in every resource you can think of, that its success is essential. To avoid failure of the warehouse implementation, you must employ a methodology and keep to it.

Failure generally has one of two causes. The first is that the warehouse is not delivered on time, and the second is that the warehouse fails to deliver what the business users need. A good method helps to manage expectations by identifying clear deliverables.

On the other hand, don’t become a slave to the steps of a methodology. Practice methodology with a focus on results, not on activities. This achieves consistency of deliverables while recognizing differences in individual working styles.

9 Architecture
“Provides the planning, structure, and standardization needed to ensure integration of multiple components, projects, and processes across time.”
“Establishes the framework, standards, and procedures for the data warehouse at an enterprise level.”
Source: The Data Warehousing Institute

Architecture
From a business and technology view, an architecture defines a collection of components and specifies their relationships. The goal of the architecture activities is a single, integrated data warehouse meeting business information needs. Some of the components of a data warehousing architecture are:
- Data sources
- Data acquisition
- Data management
- Data distribution
- Information directory
- Data access tools

10 Extraction, Transformation, and Load (ETL)
“Effective data extract, transform and load (ETL) processes represent the number one success factor for your data warehouse project and can absorb up to 70 percent of the time spent on a typical data warehousing project.”

[Diagram: Source, Staging Area, Target]

Extraction, Transformation, and Loading (ETL)
These processes are fundamental to the creation of quality information in the data warehouse. You take data from source systems; clean, verify, validate, and convert it into a consistent state; then move it into the warehouse.
- Extraction: The process of selecting specific operational attributes from the various operational systems.
- Transformation: The process of integrating, verifying, validating, cleaning, and time stamping the selected data into a consistent and uniform format for the target databases. Rejected data is returned to the data owner for correction and reprocessing.
- Loading: The process of moving data from an intermediate storage area into the target warehouse database.

ETL Tools
Specialized tools make these tasks comparatively easy to set up, maintain, and manage. However, specialized tools can be an expensive option, which motivates many warehouse teams to employ customized ETL programs written in COBOL, C++, PL/SQL, or other programming languages or application development tools. Oracle Warehouse Builder (OWB) is Oracle’s ETL tool.
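The three ETL steps described above can be illustrated with a small hand-written program. This is only a minimal sketch, not how any particular ETL tool works: the column names (customer_id, amount), the in-memory lists standing in for the source system, staging area, and target, and all function names are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class EtlResult:
    loaded: list = field(default_factory=list)    # rows moved into the target
    rejected: list = field(default_factory=list)  # rows to return to the data owner


def extract(source_rows, attributes):
    """Extraction: select only the specified operational attributes."""
    return [{name: row.get(name) for name in attributes} for row in source_rows]


def transform(rows, load_time):
    """Transformation: validate, clean, and time stamp rows into one format."""
    staged, rejected = [], []
    for row in rows:
        try:
            clean = {
                "customer_id": int(row["customer_id"]),
                "amount": round(float(row["amount"]), 2),
                "loaded_at": load_time.isoformat(),
            }
        except (TypeError, ValueError):
            # Invalid rows are set aside for correction and reprocessing.
            rejected.append(row)
            continue
        staged.append(clean)
    return staged, rejected


def load(staged_rows, target):
    """Loading: move rows from the staging area into the target."""
    target.extend(staged_rows)


def run_etl(source_rows, target, load_time=None):
    load_time = load_time or datetime.now(timezone.utc)
    extracted = extract(source_rows, ["customer_id", "amount"])
    staged, rejected = transform(extracted, load_time)
    load(staged, target)
    return EtlResult(loaded=staged, rejected=rejected)


# Hypothetical source data: one valid row and two that fail validation.
source = [
    {"customer_id": "17", "amount": "19.990", "clerk": "ab"},
    {"customer_id": "x9", "amount": "5.00", "clerk": "cd"},
    {"customer_id": "42", "amount": None, "clerk": "ef"},
]
warehouse = []
result = run_etl(source, warehouse, datetime(2024, 1, 1, tzinfo=timezone.utc))
print(len(result.loaded), len(result.rejected))
```

Keeping rejected rows separate from staged rows mirrors the note that rejected data goes back to the data owner rather than being silently dropped.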

11 Data Warehouse Architecture Ex., Incremental Implementation
Implementation deliverables:
- Analysis: Confirm and refine requirements
- Design: Gather specifications and prepare the blueprint for the data warehouse or data mart
- Construction: Put in place and test the data warehouse or data mart and all required support tools
- Deployment: Data warehouse or data mart is accepted for use in the business

[Diagram label: Increment n]

12 Operation and Support
- Data access and reporting
- Refreshing warehouse data
- Monitoring
- Responding to change

Operation
- Present warehouse data to the end user in a meaningful and business-specific manner, and select query tools that are tailored to the users’ requirements for information
- Periodically refresh the warehouse data
- Respond to changing data sources, requirements, and technology
- Monitor, manage, and tune

13 Phases of the Incremental Approach
[Diagram: Increment 1 moves through the phases Strategy, Definition, Analysis, Design, Build, and Production]

Phases of the Incremental Approach
Effective and efficient data warehouse project management involves the use of project phases. Project phases identify the tasks to be completed, the resources required, the directing and reporting efforts, and the quality assurance required before moving on to the next phase. Project phasing is a management technique used to focus project teams toward a short-term goal and to communicate progress to senior management.

Strategy
- Define the business objectives and purpose of the data warehouse
- Define the data warehouse team and executive sponsor
- Define success measurements

Definition
- Define the scope and objectives for the incremental development effort
- Identify the technical and data warehouse architecture
- Outline data access methods

14 Strategy Phase Deliverables
- Business goals and objectives
- Data warehouse purpose, objectives, and scope
- Enterprise data warehouse logical model
- Incremental milestones
- Source system data flows
- Subject area gap analysis

Identifying Warehouse Strategy Phase Deliverables
For each of the data warehouse project phases there are deliverables. The deliverables for the strategy phase focus on defining the business objectives and purpose of the data warehouse solution. The purpose and objectives for the total data warehouse solution are essential to setting and managing expectations. The strategy phase also clearly defines the data warehouse team and the executive sponsor.
- Business goals and objectives: Documents the strategic business goals and objectives
- Data warehouse purpose, objectives, and scope: Documents the purpose and objectives of the enterprise data warehouse, its scope, and how it is intended to be used
- Enterprise data warehouse logical model: High-level, logical information model that diagrams the major entities and relationships for the enterprise
- Incremental milestones: Documents a realistic scope of the data warehouse, acceptable delivery milestones for each increment, and source data availability

15 Strategy Phase Deliverables
- Data acquisition strategy
- Data quality strategy
- Metadata strategy
- Data access environment
- Training strategy

Identifying Warehouse Strategy Phase Deliverables (continued)
- Source system data flows: Outlines source system data, where it originates, the flow of data between business functions and source systems, degree of reliability, and data volatility
- Subject area gap analysis: Documents the variance between the information requirements and the ability of the data sources to provide that information
- Data acquisition strategy: Documents the approach to extracting, transforming, and loading data from the source systems to the target environments for the initial load and subsequent refreshes
- Data quality strategy: Outlines the approach for data management, error and exception handling, data cleansing, and the audit and control of the data
- Metadata strategy: Documents the strategy of capturing, integrating, and accessing metadata for all components of the warehouse environment
- Data access environment: Documents the identification, selection, and design of tools that support end-user access to the warehouse data
- Training strategy: Outlines the development and end-user training requirements, identifies the technical and business personnel requiring training, and establishes time frames for executing the training plans
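The subject area gap analysis described above compares what the business requires with what the source systems can supply. One way to sketch that comparison is as a set difference. The subject areas, attribute names, and source system names below are hypothetical, and a real gap analysis would also weigh data quality and reliability, which this sketch ignores.

```python
def gap_analysis(requirements, sources):
    """Return, per subject area, the required attributes that no source provides."""
    # Union of every attribute offered by any source system.
    available = set().union(*sources.values())
    gaps = {}
    for area, needed in requirements.items():
        missing = sorted(set(needed) - available)
        if missing:
            gaps[area] = missing
    return gaps


# Hypothetical information requirements, grouped by subject area.
requirements = {
    "sales": ["order_id", "order_date", "net_amount", "sales_region"],
    "customer": ["customer_id", "segment"],
}

# Hypothetical source systems and the attributes each can supply.
sources = {
    "order_entry": {"order_id", "order_date", "net_amount", "customer_id"},
    "crm": {"customer_id", "sales_region"},
}

gaps = gap_analysis(requirements, sources)
print(gaps)
```

Subject areas with no missing attributes are left out of the result, so an empty dictionary would mean the sources cover every stated requirement.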

