Online Analytical Processing (OLAP) Hweichao Lu CS157B-02 Spring 2007.


1 Online Analytical Processing (OLAP) Hweichao Lu CS157B-02 Spring 2007

2 What is OLAP  Basic idea: converting data into the information that decision makers need  Concept: analyzing data along multiple dimensions in a structure called a data cube

3 History  In 1993, E. F. Codd coined the term online analytical processing (OLAP) and proposed 12 criteria to define an OLAP database  The term OLAP seems well suited to describe databases designed to facilitate decision making (analysis) in an organization

4 Purpose of OLAP  To derive summarized information from large-volume databases  To generate automated reports for human readers

5 Why We Need OLAP over a Relational Database I  Consistently fast response  OLAP obtains a consistently fast response by pre-storing calculated values
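The idea of pre-storing calculated values can be illustrated with a minimal sketch. The sales records and the names `SALES`, `build_summary`, and `total_for_region` are assumptions made up for this example; a real OLAP server does far more than this.

```python
from collections import defaultdict

# Hypothetical sales records: (region, product, amount)
SALES = [
    ("West", "Shirt", 120),
    ("West", "Pants", 80),
    ("East", "Shirt", 200),
    ("East", "Pants", 50),
]

def build_summary(rows):
    """Precompute the total amount per region once, at load time."""
    totals = defaultdict(int)
    for region, _product, amount in rows:
        totals[region] += amount
    return dict(totals)

# Computed once when the data is loaded...
REGION_TOTALS = build_summary(SALES)

def total_for_region(region):
    """...so each query is a dictionary lookup rather than a scan of SALES."""
    return REGION_TOTALS.get(region, 0)

print(total_for_region("West"))  # 200
```

The lookup cost no longer depends on how many detail rows exist, which is the sense in which the response is "consistently" fast.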

6 Why We Need OLAP over a Relational Database II  Metadata-based queries  Provide analysis functions that are difficult or impossible to express in SQL  SQL was developed primarily for transaction systems, not for reporting applications

7 Why We Need OLAP over a Relational Database III  Spreadsheet-style formulas  The data structure is designed with users in mind  Spreadsheets are key components of business management because they are intuitive to create

8 Step I 1. Identify multidimensional data  Measure attribute: measures some value and can be aggregated upon  Dimension attribute: defines the dimension along which the measure attribute is viewed and summarized

9 (Cont.)  Each dimension is typically expressed as a "hierarchy"  Hierarchy: an analyst may be interested in different levels of detail of a dimension
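As a rough sketch of Step I, the rows below separate dimension attributes from a measure attribute and attach a hierarchy to one dimension. The data, the city/state/country hierarchy, and all names are hypothetical choices for illustration.

```python
# Hypothetical fact rows: item, color, and city are dimension attributes;
# quantity is the measure attribute.
ROWS = [
    {"item": "skirt", "color": "dark", "city": "San Jose", "quantity": 8},
    {"item": "skirt", "color": "white", "city": "Portland", "quantity": 5},
    {"item": "dress", "color": "dark", "city": "San Jose", "quantity": 20},
]

DIMENSIONS = ("item", "color", "city")
MEASURE = "quantity"

# An assumed hierarchy on the location dimension: city -> state -> country.
CITY_TO_STATE = {"San Jose": "California", "Portland": "Oregon"}
STATE_TO_COUNTRY = {"California": "USA", "Oregon": "USA"}

def location_at_level(city, level):
    """Return the value of the location dimension at the requested level."""
    if level == "city":
        return city
    state = CITY_TO_STATE[city]
    if level == "state":
        return state
    if level == "country":
        return STATE_TO_COUNTRY[state]
    raise ValueError(f"unknown level: {level}")

def total_by_location(rows, level):
    """Sum the measure for each value of the location dimension at `level`."""
    totals = {}
    for row in rows:
        key = location_at_level(row["city"], level)
        totals[key] = totals.get(key, 0) + row[MEASURE]
    return totals

print(total_by_location(ROWS, "state"))
```

The same measure is summed at whichever level of detail the analyst asks for; only the grouping key changes.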

10 Step II 2. Analyze multidimensional data in a cross-tabulation  Row header: value of one attribute  Column header: value of another attribute  Individual cell: aggregated value
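A cross-tabulation of this kind might be built as follows. This is a sketch only: the rows and the function names `cross_tab` and `render` are invented for the example, and the aggregate used for each cell is assumed to be a sum.

```python
from collections import defaultdict

# Hypothetical rows: (item, color, quantity)
ROWS = [
    ("skirt", "dark", 8),
    ("skirt", "white", 10),
    ("dress", "dark", 20),
    ("dress", "white", 5),
    ("skirt", "dark", 2),
]

def cross_tab(rows):
    """Aggregate quantity into cells keyed by (row header, column header)."""
    cells = defaultdict(int)
    for item, color, quantity in rows:
        cells[(item, color)] += quantity
    return dict(cells)

def render(cells):
    """Return text lines: items down the side, colors across the top."""
    items = sorted({item for item, _ in cells})
    colors = sorted({color for _, color in cells})
    lines = ["\t".join(["item"] + colors)]
    for item in items:
        values = [str(cells.get((item, color), 0)) for color in colors]
        lines.append("\t".join([item] + values))
    return lines

for line in render(cross_tab(ROWS)):
    print(line)
```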

11 Step III 3. Visualize the n-dimensional cube (data cube)  The word "cube" describes what in the relational world would be the integration of the fact table with dimension tables
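One common way to picture a data cube is as every combination of dimension values, with a placeholder standing for "all values" of a dimension. The sketch below follows that convention; the `ALL` marker, the sample rows, and `build_cube` are assumptions for illustration, not something taken from the slides.

```python
from collections import defaultdict
from itertools import combinations

ALL = "all"  # assumed placeholder meaning "every value of this dimension"

# Hypothetical rows: (item, color, size, quantity)
ROWS = [
    ("skirt", "dark", "small", 2),
    ("skirt", "dark", "large", 6),
    ("skirt", "white", "small", 10),
    ("dress", "dark", "small", 20),
    ("dress", "white", "large", 5),
]

def build_cube(rows, n_dims):
    """Sum the measure for every subset of dimensions (2**n_dims groupings)."""
    cube = defaultdict(int)
    for row in rows:
        dims, measure = row[:n_dims], row[n_dims]
        for k in range(n_dims + 1):
            for kept in combinations(range(n_dims), k):
                # Keep the chosen dimensions and collapse the rest to ALL.
                key = tuple(dims[i] if i in kept else ALL for i in range(n_dims))
                cube[key] += measure
    return dict(cube)

cube = build_cube(ROWS, 3)
print(cube[("skirt", ALL, ALL)])  # 18
print(cube[(ALL, ALL, ALL)])      # 43
```

Each cell of the cube is then a single lookup, at the cost of storing many more aggregates than there are detail rows.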

12 Step IV  After you design the cube, you will use the cube's structure to build a relational database (known as a star schema) to house the data for the cube

13 Step V  Once you load data into the relational database, and then into the cube, you'll be able to see how attributes, dimensions, measures, and measure groups fit together within a cube to create a powerful analytical tool.

14 Star Schema  Cubes are easily stored in relational databases, using a denormalized data structure called the star schema, popularized by Ralph Kimball  A star schema starts with a central fact table  Each row in the central fact table contains some combination of keys that makes it unique. These keys refer to the dimensions.
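A star schema along these lines could look like the sketch below, which uses Python's built-in `sqlite3` module. The table names (`fact_sales`, `dim_product`, `dim_store`), columns, and data are hypothetical; they only illustrate a central fact table whose key columns point at dimension tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables describe the values of each dimension.
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, city TEXT, state TEXT);

-- Central fact table: the combination of keys makes each row unique,
-- and the remaining column (units) is the measure.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    store_id   INTEGER REFERENCES dim_store(store_id),
    sale_date  TEXT,
    units      INTEGER,
    PRIMARY KEY (product_id, store_id, sale_date)
);

INSERT INTO dim_product VALUES (1, 'Shirt', 'Clothing'), (2, 'Mug', 'Kitchen');
INSERT INTO dim_store   VALUES (1, 'San Jose', 'CA'), (2, 'Portland', 'OR');
INSERT INTO fact_sales  VALUES
    (1, 1, '2007-03-01', 10),
    (1, 2, '2007-03-01', 4),
    (2, 1, '2007-03-02', 7);
""")

def units_by_category_and_state(connection):
    """Join the fact table to its dimensions and summarize the measure."""
    query = """
        SELECT p.category, s.state, SUM(f.units)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        JOIN dim_store   AS s ON s.store_id   = f.store_id
        GROUP BY p.category, s.state
        ORDER BY p.category, s.state
    """
    return connection.execute(query).fetchall()

print(units_by_category_and_state(conn))
```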

15 Slicing & Dicing  Additional functionality that can be thought of as viewing a slice of the data cube, particularly when values for multiple dimensions are fixed  Slicing/dicing simply consists of selecting specific values for these attributes, which are then displayed on top of the cross-tab
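Slicing can be sketched as filtering on fixed dimension values before building the cross-tab over the remaining dimensions. The rows and the helpers `slice_rows` and `cross_tab` are hypothetical names for this example.

```python
# Hypothetical rows with three dimensions (item, color, size) and one measure.
ROWS = [
    {"item": "skirt", "color": "dark", "size": "small", "quantity": 2},
    {"item": "skirt", "color": "dark", "size": "large", "quantity": 6},
    {"item": "skirt", "color": "white", "size": "small", "quantity": 10},
    {"item": "dress", "color": "dark", "size": "small", "quantity": 20},
    {"item": "dress", "color": "white", "size": "large", "quantity": 5},
]

def slice_rows(rows, **fixed):
    """Keep only the rows whose dimensions match every fixed value."""
    return [r for r in rows if all(r[dim] == value for dim, value in fixed.items())]

def cross_tab(rows, row_dim, col_dim, measure="quantity"):
    """Sum the measure into cells keyed by (row_dim value, col_dim value)."""
    cells = {}
    for r in rows:
        key = (r[row_dim], r[col_dim])
        cells[key] = cells.get(key, 0) + r[measure]
    return cells

# Fix size = "small", then view item by color for that slice.
small_only = slice_rows(ROWS, size="small")
print(cross_tab(small_only, "item", "color"))
```

Fixing more than one dimension, for example `slice_rows(ROWS, size="small", color="dark")`, narrows the view further in the same way.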

17 Rollup & Drill-down  OLAP permits users to view data at any desired level of granularity  Rollup: moving from finer-granularity data to coarser-granularity data  Drill-down: the opposite of rollup, moving from coarser-granularity data to finer-granularity data
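Rollup and drill-down can be sketched as re-aggregating the same data at different levels of a hierarchy. The example assumes a day/month/year hierarchy on a date dimension; the data and the names `LEVELS` and `aggregate` are invented for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical detail rows: (sale date, amount)
SALES = [
    (date(2006, 12, 31), 30),
    (date(2007, 1, 15), 100),
    (date(2007, 1, 20), 50),
    (date(2007, 2, 3), 70),
]

# Each level of the assumed date hierarchy maps a date to a grouping key.
LEVELS = {
    "day": lambda d: d.isoformat(),
    "month": lambda d: d.strftime("%Y-%m"),
    "year": lambda d: str(d.year),
}

def aggregate(rows, level):
    """Total the amounts at the requested level of granularity."""
    to_key = LEVELS[level]
    totals = defaultdict(int)
    for day, amount in rows:
        totals[to_key(day)] += amount
    return dict(totals)

print(aggregate(SALES, "month"))  # drill-down view: finer granularity
print(aggregate(SALES, "year"))   # rollup view: coarser granularity
```

Going from `"month"` to `"year"` is a rollup; going back from `"year"` to `"month"` or `"day"` is a drill-down, which requires that the finer-grained data is still available.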

18 OLAP Implementation  Multidimensional OLAP (MOLAP)  Relational OLAP (ROLAP)  Hybrid OLAP (HOLAP)

19 MOLAP  The database is stored in a special, usually proprietary, structure that is optimized for multidimensional analysis.  +: very fast query response time because data is mostly pre-calculated  -: practical limit on the size because of the time taken to calculate the database and the space required to hold the pre-calculated values

20 ROLAP  The database is a standard relational database and the database model is a multidimensional model, often referred to as a star or snowflake model or schema.  +: more scalable solution  -: performance of the queries will be largely governed by the complexity of the SQL and the number and size of the tables being joined in the query

21 HOLAP  A hybrid of ROLAP and MOLAP  Can be thought of as a virtual database in which the higher levels of the database are implemented as MOLAP and the lower levels as ROLAP
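The hybrid idea can be sketched very loosely as follows: coarse totals are pre-calculated and held in memory (the MOLAP-like part), while detailed questions go to a relational table through SQL (the ROLAP-like part). Everything here, including the `sales` table, `YEAR_TOTALS`, and `total`, is a hypothetical illustration and not how any particular product works.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (year INTEGER, month INTEGER, amount INTEGER);
INSERT INTO sales VALUES (2006, 12, 30), (2007, 1, 100), (2007, 1, 50), (2007, 2, 70);
""")

# MOLAP-like part: the higher (yearly) level is pre-calculated once.
YEAR_TOTALS = dict(
    conn.execute("SELECT year, SUM(amount) FROM sales GROUP BY year")
)

def total(year, month=None):
    """Answer yearly questions from the pre-calculated totals and
    monthly questions from the relational detail table."""
    if month is None:
        return YEAR_TOTALS.get(year, 0)
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM sales WHERE year = ? AND month = ?",
        (year, month),
    ).fetchone()
    return row[0]

print(total(2007))     # served from the pre-calculated level
print(total(2007, 1))  # served by a SQL query on the detail rows
```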

22 DOLAP  The previous terms refer to server-based OLAP technologies  DOLAP (Desktop OLAP)  DOLAP enables users to quickly pull together small cubes that run on their desktops or laptops

23 Conclusion  OLAP is a significant improvement over query systems  OLAP is an interactive system to show different summaries of multidimensional data by interactively selecting the attributes in a multidimensional data cube


