AN OVERVIEW OF DATA WAREHOUSING


1 AN OVERVIEW OF DATA WAREHOUSING

2 Definition of DATA WAREHOUSING
A data warehouse is a subject-oriented, integrated, time-variant, non-volatile collection of data in support of management's decision-making process. In simple terms, data warehousing is the process of combining all of the enterprise's data into one central repository that can be accessed by anyone from any department at any time.

3 WHAT YOU NEED TO KNOW ABOUT DATA WAREHOUSING TERMINOLOGY
Data Warehouse: An enterprise-wide data store comprising data from all company information systems as well as data from external sources.
Data Mart: A subset of a data warehouse that includes information from one department.
Data Mining: A process used for decision support and for discovering information from deep within a database.
Decision Support System (DSS): A set of programs based on statistical models or trend analysis that assists in management decisions.

4
Executive Information System (EIS): An EIS provides a structured interface to predefined reports containing highly summarized, top-level information about the business.
On-Line Analytical Processing (OLAP): Database software tools that make use of an intermediary database to store summary information and predefined calculations.
On-Line Transactional Processing (OLTP): A system that keeps track of an establishment's daily transactions; the warehouse is updated from it at periodic intervals.
Data Cleansing: Data cleansing ensures that the data extracted from the operational database contains valid information. You can evaluate the data on either a logical or a technical level.

5 THE NEED FOR DATA WAREHOUSING
- Company information cannot be adequately analyzed in the form in which it is currently stored. Data is spread over multiple tables, too detailed, difficult to find, not documented, not accessible, not in a proper format, or all of the above.
- The operational database needs to be optimized for data entry, since operational data is generated continuously.
- Simultaneous transaction processing and query processing can greatly slow down the system and give inconsistent results.
- Operational data processing differs significantly from decision support data processing.

6 WHAT IS DATA WAREHOUSING?
The main purpose of a data warehouse is to turn operational data into information. This leads to the basic functions of a data warehouse: extracting data from existing internal or external operational databases; cleaning, transforming, and auditing the data; and organizing it in an optimized format to allow fast, easy access and analysis of your company's performance.
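
The functions listed above can be sketched as a small pipeline. This is only an illustration in plain Python, not a real warehouse tool: the function names, the `region` and `amount` fields, and the rejection rules are all assumptions made for the example.

```python
from collections import defaultdict


def extract(operational_rows):
    """Copy rows out of a (hypothetical) operational source."""
    return [dict(row) for row in operational_rows]


def clean(rows):
    """Drop rows with a missing region or a non-numeric amount (assumed rules)."""
    cleaned = []
    for row in rows:
        region = (row.get("region") or "").strip()
        try:
            amount = float(row.get("amount"))
        except (TypeError, ValueError):
            continue  # reject rows whose amount cannot be read as a number
        if region:
            cleaned.append({"region": region, "amount": amount})
    return cleaned


def transform(rows):
    """Standardize region codes so that all sources agree on one spelling."""
    return [{"region": r["region"].upper(), "amount": r["amount"]} for r in rows]


def audit(extracted, loaded):
    """Record how many rows were extracted, kept, and rejected."""
    return {
        "extracted": len(extracted),
        "loaded": len(loaded),
        "rejected": len(extracted) - len(loaded),
    }


def organize(rows):
    """Store a summary (total amount per region) that is cheap to query."""
    totals = defaultdict(float)
    for r in rows:
        totals[r["region"]] += r["amount"]
    return dict(totals)


def run_pipeline(operational_rows):
    extracted = extract(operational_rows)
    prepared = transform(clean(extracted))
    return organize(prepared), audit(extracted, prepared)
```

Calling `run_pipeline` on a list of dictionaries returns the per-region summary together with the audit counts, so rejected rows are counted rather than silently lost.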

7 THE DATA WAREHOUSING PIECES
- Software to extract the information from the operational database and load it into the data warehouse.
- Software for removing inconsistencies from the data coming from various sources (Data Cleansing).
- Hardware and database software for the Decision Support Database.
- Software for managing the metadata (Stored Database Definition).
- Business Intelligence software that presents the data in the warehouse to the users.
- Data mining software that looks for hidden patterns in the information.

8 COMPONENTS OF DATA WAREHOUSE
- Data Extract
- Data Cleansing
- Data Load
- Data Storage
- Warehouse Management
- Data Mining
- Data Visualize
- EIS

9 DIFFERENT SYSTEMS: DSS
Decision support systems assist decision makers as they access the information they need to improve their decision quality. A decision support system generally relies on databases that are different from an OLTP database:
- The number of transactions in a data warehouse database is usually one per day.
- This transaction (the load) stores all new, updated, and changed data in the data warehouse.
- If the load process fails, the entire database is restored rather than rolled back record by record.
- The data warehouse database represents a static picture between two load processes.
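
The load behavior described above can be illustrated with Python's standard `sqlite3` module. This is a sketch under assumptions: the `sales` table and the functions `new_warehouse` and `run_load` are hypothetical, and an in-memory snapshot taken with `Connection.backup` stands in for whatever backup mechanism a real warehouse would use.

```python
import sqlite3


def new_warehouse() -> sqlite3.Connection:
    # isolation_level=None selects autocommit mode: every INSERT is committed
    # at once, so a failed load cannot simply be rolled back.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, amount REAL NOT NULL)"
    )
    return conn


def run_load(warehouse: sqlite3.Connection, batch) -> sqlite3.Connection:
    """Apply one load; return the connection that holds the valid warehouse."""
    # Snapshot the static picture that existed before this load.
    snapshot = sqlite3.connect(":memory:", isolation_level=None)
    warehouse.backup(snapshot)
    try:
        for sale_id, amount in batch:
            warehouse.execute(
                "INSERT INTO sales VALUES (?, ?)", (sale_id, amount)
            )
    except sqlite3.Error:
        # The load failed part-way: discard the damaged database and
        # restore the whole pre-load snapshot instead of undoing rows.
        warehouse.close()
        return snapshot
    snapshot.close()
    return warehouse


def row_count(warehouse: sqlite3.Connection) -> int:
    return warehouse.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
```

A caller reassigns the result, for example `wh = run_load(wh, batch)`, so that after a failed load it keeps working with the restored copy.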

10 DIFFERENT SYSTEMS: OLTP
By OLTP systems, we mean all those applications that are used in the day-to-day business, such as:
- order entry,
- inventory management,
- payroll,
- or production tracking.
Another term has been established for those applications: OSS (Operational Support Systems). These systems handle millions of transactions per day and ensure data consistency between all of the related tables in the database. Each update of related data is committed to the database. Depending on the purpose of your OLTP system, the contents of the database tables change continuously during the business day.

11 DIFFERENT SYSTEMS: OLAP
Online Analytical Processing (OLAP) is a totally different way to look at data compared to the OLTP method. To analyze a company's performance, the view of the data has to be more
- global,
- summarized,
- and not focused on detailed pieces of information.
This way of looking at data is called OLAP. Users with OLAP requirements generally cannot be satisfied with a database architecture optimized for fast single-record retrieval. In their case, the database architecture has to be more open and optimized to create summarized data quickly and support the user's analytical requirements.
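
The contrast between single-record retrieval and a summarized view can be shown with two queries against the same data. This sketch uses Python's standard `sqlite3` module; the `orders` table, its columns, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "North", 120.0), (2, "South", 80.0), (3, "North", 30.0), (4, "South", 20.0)],
)


def lookup_order(conn, order_id):
    """OLTP-style access: fetch one detailed record by its key."""
    return conn.execute(
        "SELECT order_id, region, amount FROM orders WHERE order_id = ?",
        (order_id,),
    ).fetchone()


def sales_by_region(conn):
    """OLAP-style access: a global, summarized view over all records."""
    return conn.execute(
        "SELECT region, SUM(amount), COUNT(*) FROM orders "
        "GROUP BY region ORDER BY region"
    ).fetchall()
```

The first query touches a single row through the primary key, while the second has to read every row to build the summary, which is why the two workloads favor differently optimized databases.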

12 DATA WAREHOUSE ARCHITECTURE
There are three key components of the data warehouse architecture:
1. The enterprise data store
2. Host data sources
3. Client access
“A typical enterprise data warehouse solution architecture is not just a collection of products, but rather an integrated whole consisting of products and an articulated set of interfaces between them, targeted at solving a particular set of problems.”

13

14 DATA WAREHOUSE PRODUCTS: SYBASE
Sybase has a complete end-to-end, data-mart-driven solution for enterprise, distributed decision support. The Sybase Data Warehouse consists of components including design and modeling, transformation and movement, as well as data storage and access tools optimized to extend enterprise decision support systems to the Internet and beyond. WarehouseArchitect supports all three DW layers in terms of data modeling, metadata, and data import, and interfaces with the various third-party tools that have an active part in the DW environment. WarehouseArchitect allows you to:
- Import source information from OLTP databases.
- Design DW and DM models.
- Generate and maintain DW and DM databases.
- Enable extraction command scripts to automate data transfer.
- Export/import multidimensional information to/from OLAP engines.
- Generate reports on design activities.

15 DATA WAREHOUSE PRODUCTS: ORACLE
Oracle Applications Data Warehouse is a powerful decision-support application that allows business managers to perform complex analysis of data extracted from Oracle's new client/server applications. Oracle provides a fully assembled decision-support product combining the strength of data management software with on-line analytical processing. Oracle Applications Data Warehouse provides analysis, query, and reporting tools, and data-mapping capabilities, that allow users to access data quickly and easily, with the power to "drill down" for more detail. The tools also enable users to perform sophisticated multi-dimensional analysis and modeling. Some key features of Oracle Applications Data Warehouse:
- Integrating data from other sources.
- Drilling down and slicing data into user-defined views.
- Analyzing data using dimensions, hierarchies, or views.
- Additional options include ranking, sorting, exception filtering, and color coding of data.

16 DATA WAREHOUSE PRODUCTS: IBM
IBM Visual Warehouse is the low-cost, end-to-end solution for data warehousing. With Visual Warehouse, you can extract a wide variety of heterogeneous data, transform the data using SQL, and store the data for use in decision support. IBM's datamart offering is the Visual Warehouse solution. Like datamarts, Visual Warehouse lets you:
- replicate data from a variety of operational data sources; aggregate, summarize, cleanse, or enhance the data; and make it available to end users through their favorite decision support tools.
- create business views, which model the structure and format of your data using an interface that people can understand and work with.
- experience IBM expertise and know-how. Our experienced consultants can help you with everything from the design and planning stages of your datamart to the implementation and maintenance of a production system.

17 DATA MODELING & ITS IMPORTANCE
Data modeling is the process of analyzing and representing the "things" (entities) about which an enterprise must know. Just as a complex building requires a guiding architectural plan, building a complex data infrastructure requires a guiding data model. When performed properly, data modeling is valuable and absolutely essential for effective management of the enterprise's information resources. It is required for both effective shared databases with integrated information systems, and for effective data warehouses with comprehensive decision support systems.

18 DATA MODELS: OPERATIONAL vs DATA WAREHOUSE
The data warehouse model has distinct differences from its operational counterpart:
Operational Data Model | Data Warehouse Data Model
Data supports operational processes | Data supports tactical and strategic processes (decision support and executive information support)
Fully normalized for effective integrity management | De-normalized for efficient retrieval
Current data values | Historical data values
Minimal derived data | High degree of summarized data
Contains all operational data | Contains only data that has value over time

19 DIMENSIONAL MODEL
Dimensional modeling is a technique that helps the database designer to build information structures that satisfy the way a business user asks for information. Its most important advantage is that it matches end users’ needs for simplicity. The purpose of dimensional modeling is to provide the data warehouse and managed-query tools with a database definition that lends itself to subject-oriented information processing. In this way, information can be re-arranged to be presented to the end user in multiple views, from different perspectives. Another name for the dimensional model is the STAR JOIN SCHEMA. There is one dominant table (fact table) in the center of the schema with multiple joins joining it to other tables (dimension tables).
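
A star join schema can be sketched in a few lines with Python's standard `sqlite3` module. The table names (`fact_sales`, `dim_product`, `dim_store`, `dim_date`), their columns, and the sample rows are hypothetical; they only illustrate one central fact table joined to several dimension tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE dim_date    (date_id    INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);
-- The fact table sits in the center and references every dimension.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    store_id   INTEGER REFERENCES dim_store(store_id),
    date_id    INTEGER REFERENCES dim_date(date_id),
    units      INTEGER,
    revenue    REAL
);
INSERT INTO dim_product VALUES (1, 'Books'), (2, 'Music');
INSERT INTO dim_store   VALUES (1, 'Boston'), (2, 'Denver');
INSERT INTO dim_date    VALUES (1, 1998, 1), (2, 1998, 2);
INSERT INTO fact_sales  VALUES
    (1, 1, 1, 3, 30.0), (2, 1, 1, 1, 15.0),
    (1, 2, 2, 2, 20.0), (1, 1, 2, 4, 40.0);
""")


def revenue_by(conn, dimension_column):
    """Total revenue grouped by one dimension attribute (a star join)."""
    allowed = {"category", "city", "quarter"}
    if dimension_column not in allowed:
        raise ValueError(f"unknown dimension attribute: {dimension_column}")
    sql = f"""
        SELECT {dimension_column}, SUM(f.revenue)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        JOIN dim_store   AS s ON s.store_id   = f.store_id
        JOIN dim_date    AS d ON d.date_id    = f.date_id
        GROUP BY {dimension_column}
        ORDER BY {dimension_column}
    """
    return conn.execute(sql).fetchall()
```

The same fact rows can be re-arranged by product category, by city, or by quarter simply by changing the grouping attribute, which is the "multiple views, from different perspectives" property the slide describes.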

20 DIMENSIONAL MODEL: STAR SCHEMA DIAGRAM

21 ENTITY-RELATION MODEL
The ER model describes data as entities, relationships, and attributes. This model seeks to drive all the redundancy out of the data. This means that a transaction that changes any data only needs to touch the database in one place.
+ Its advantage is that data exists in a highly normalized form.
- In comparison to the star schema model, the ER model does not show which entity is “strong”.
- For queries that span many records or many tables, ER diagrams are too complex for users to understand and for DBMS software to navigate successfully.
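
The "one place" property of a normalized design can be checked by counting the rows an update touches. This sketch uses Python's standard `sqlite3` module; the `customer`, `orders`, and `orders_flat` tables are hypothetical, with `orders_flat` standing in for a de-normalized layout that repeats the customer's city on every order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized: the city is stored once, on the customer.
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customer(customer_id),
    amount      REAL
);
-- De-normalized: the city is repeated on every order row.
CREATE TABLE orders_flat (
    order_id      INTEGER PRIMARY KEY,
    customer_id   INTEGER,
    customer_city TEXT,
    amount        REAL
);
INSERT INTO customer VALUES (1, 'Austin');
INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 1, 5.0);
INSERT INTO orders_flat VALUES
    (10, 1, 'Austin', 25.0), (11, 1, 'Austin', 40.0), (12, 1, 'Austin', 5.0);
""")


def move_customer_normalized(conn, customer_id, new_city):
    """Change the city in the normalized model; returns rows touched."""
    cur = conn.execute(
        "UPDATE customer SET city = ? WHERE customer_id = ?",
        (new_city, customer_id),
    )
    return cur.rowcount


def move_customer_denormalized(conn, customer_id, new_city):
    """Change the city in the flat model; returns rows touched."""
    cur = conn.execute(
        "UPDATE orders_flat SET customer_city = ? WHERE customer_id = ?",
        (new_city, customer_id),
    )
    return cur.rowcount
```

With the sample data, the normalized update changes a single row, while the flat table needs one change per order, which is the redundancy the ER model is designed to remove.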

22 ENTITY-RELATION DIAGRAM

23 TRENDS AND ISSUES IN WAREHOUSING
- Move towards Web-based data warehouses:
  - Lack of security
  - Business value
- Need to assess the costs vs. benefits of implementing a data warehouse.
- Increasing need for customized models.

