Klaus Majenz SAP – Product Line BI

Presentation on theme: "Klaus Majenz SAP – Product Line BI"— Presentation transcript:

1 BW Basic Architecture
Klaus Majenz, SAP – Product Line BI

2 Overview
complete DW & BI product, comprising ...
  ETL tools (extractors, transformation, monitoring, scheduling, ...)
  OLAP engine
  data mining engine
  repository
  analytical front-end (web- or Excel-based, agents, GIS, ...)
  prepacked models, built by SAP application departments
client-server architecture
  SAP web application servers
  database server: 7 commercial RDBMS platforms supported (Oracle, MS, 4 × IBM, SAP)
part of SAP Netweaver™
  SAP's open integration and application platform
more details:

3 Overview

4 Scenario (1) Characteristics Key Figures Infoobjects

5 Scenario (2)
Dimension Time: Day, Month, Year
Dimension Region: City, Region, Country
Dimension Sales Org: Sales Person, Division, Distribution Channel, Sales Organization
Dimension Product: Product, Product Group
Key Figures: Quantity (in pieces), Profit (in US$)

6 An adequate BW Infocube IUSALES
Dimension IUSALEST: 0CALDAY, 0CALMONTH, 0CALYEAR
Dimension IUSALES1: IUCITY, IUREGION, IUCOUNTRY
Dimension IUSALES2: IUSALPER, IUDIV, IUDCHAN, IUSALORG
Dimension IUSALES3: IUPROD, IUPRODGRP
Key Figures: IUQUAN, IUPROFIT

7 Data Flow in BW Aggregate Infocube: E fact table
Initial Fill, Roll-Up Infocube: E fact table Cube Query Compression Infocube: F fact table BW Query Infocube Upload (from ODS) Infocube Upload (from PSA) Operational Data Store (ODS) ODS Query ODS Upload ODS Activate Persistent Staging Area (PSA) V.P. Query Extraction Source System (e.g. R/3, other DB, File, ...) V.P. Query

8 Data Flow in BW – what we will look at
Aggregate Initial Fill, Roll-Up Infocube: E fact table Infocube Cube Query Compression Infocube: F fact table BW Query Infocube Upload (from PSA) Operational Data Store (ODS) ODS ODS Upload ODS Activate Persistent Staging Area (PSA) PSA Extraction Source System (e.g. R/3, other DB, File, ...)

9 PSA

10 PSA table
columns shown: request, package (within request), partition no., record no. (within package)
huge number of individual INSERTs
no UPDATE
SELECT * FROM … WHERE "REQUEST" = …
mass deletion: DELETE … WHERE "PARTNO" = … / DROP PARTITION …
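The PSA access pattern on this slide (insert-only load, read by request, mass deletion by partition number) can be sketched with an in-memory SQLite database. This is only an illustration under assumptions: the table name `psa`, the `payload` column and all function names are hypothetical, and SQLite stands in for the real RDBMS, which could use DROP PARTITION instead of DELETE.

```python
import sqlite3


def make_psa(conn):
    # Hypothetical PSA table; the four key columns follow the slide.
    conn.execute(
        "CREATE TABLE psa ("
        " request TEXT, datapakid INTEGER, partno INTEGER, record INTEGER,"
        " payload TEXT,"
        " PRIMARY KEY (request, datapakid, partno, record))"
    )


def load_request(conn, request, partno, packages):
    # Load path: INSERT only, never UPDATE.
    rows = [
        (request, pkg_no, partno, rec_no, payload)
        for pkg_no, records in enumerate(packages, start=1)
        for rec_no, payload in enumerate(records, start=1)
    ]
    conn.executemany("INSERT INTO psa VALUES (?, ?, ?, ?, ?)", rows)
    return len(rows)


def read_request(conn, request):
    # Read path: all records of one request.
    return conn.execute(
        "SELECT * FROM psa WHERE request = ? ORDER BY datapakid, record",
        (request,),
    ).fetchall()


def delete_partition(conn, partno):
    # Mass deletion by partition number (stand-in for DROP PARTITION).
    cur = conn.execute("DELETE FROM psa WHERE partno = ?", (partno,))
    return cur.rowcount
```

Keeping all rows of a request in one partition is what makes the mass deletion cheap: the whole partition can be removed without touching individual records.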

11 ODS

12 ODS object = 3 tables
active data: /BIC/AOIUSALES00
modified data ("activation queue"): /BIC/AOIUSALES40
delta data ("change log"): /BIC/B (PSA)
ODS upload:
  INSERT INTO "/BIC/AOIUSALES40"
ODS data activation:
  UPSERT "/BIC/AOIUSALES00"
  delta records: INSERT INTO "/BIC/B "
  (mass) DELETE FROM "/BIC/AOIUSALES40"
infocube delta upload from ODS:
  SELECT * FROM "/BIC/B "

13 ODS tables: /BIC/AOIUSALES00, /BIC/AOIUSALES40
active data same as in PSA table modified data

14 ODS Object (BW 3.0)
Diagram labels: Staging Engine; Req1, Req2, Req3; Upload to Activation queue; Activation queue; Activation; Active data; Change log

Activation queue (Req.ID | Pack.ID | Rec.No | Doc.No | Value)
REQU1 | P 1 | Rec.1 | 4711 | 10
REQU2 | P 1 | Rec.1 | 4711 | 30

Active data (Doc.No | Value), before and after the second activation
4711 | 10
4711 | 30

Change log (Req.ID | Pack.ID | Rec.No | Doc.No | Value), before- and after image
ODSRx | P 1 | Rec.1 | 4711 | 10
ODSRy | P 1 | Rec.1 | 4711 | -10
ODSRy | P 1 | Rec.2 | 4711 | +30

Upload to Activation queue: Data from different requests are uploaded in parallel to the activation queue.
Activation: During activation the data is sorted by the key fields of the active data plus the key fields of the activation queue. This guarantees the correct sequence of the records and allows inserts instead of table locks.
The request IDs in the activation queue and in the change log differ from each other. After the update, the data in the activation queue is deleted.
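The activation step with its before- and after images can be sketched in a few lines of Python. This is a simplified illustration, assuming overwrite semantics for the value and a single key field (the document number); the function name `activate` and the tuple layouts are hypothetical.

```python
def activate(active, queue, change_log, log_request):
    """Activate queued records and write before/after images to the change log.

    active:     dict mapping document number -> current value
    queue:      list of (req_id, pack_id, rec_no, doc_no, value) tuples
    change_log: list receiving (log_request, doc_no, value) tuples
    """
    # Sort by the active-data key first, then by the activation-queue key,
    # so that records for the same document are applied in load order.
    ordered = sorted(queue, key=lambda r: (r[3], r[0], r[1], r[2]))
    for _req, _pack, _rec, doc_no, value in ordered:
        if doc_no in active:
            # Before image: cancels the previously active value.
            change_log.append((log_request, doc_no, -active[doc_no]))
        # After image: carries the new value.
        change_log.append((log_request, doc_no, value))
        active[doc_no] = value
    # After the update, the activation queue is emptied.
    queue.clear()
```

With the slide's numbers, activating value 10 and then value 30 for document 4711 yields the change log entries 10, -10 and +30, so a consumer that simply sums the change log arrives at the active value 30.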

15 Infocube

16 InfoCube: Star Schema F, E D (1) Fact Table (2) Dimension
(3) time-independent-SID time-dependent-SID master SID Char (4) SID Attr Y F, E X S D S
This slide illustrates the star schema of an infocube. An infocube is a multi-dimensional reporting scenario built out of characteristics, such as "month", "product", "city", "sales organization" etc., and key figures, such as "costs", "profit", "sales" etc. Physically, an infocube is a set of DB tables that are related to one another in a star schema. This is a very common technique in data warehousing, and many database vendors have tuned their DBMS products to recognize star schema scenarios by providing appropriate techniques, such as star joins and bitmap indexing schemes, for the processing of queries on star schemas.
The boxes in this slide represent various tables; the lines show the foreign key – key relationships between those tables. In the center of a star schema lies the fact table, which contains a huge amount of data; in particular, it holds all the information on the key figures. All the other tables are relatively small in comparison to the fact table. The dimension tables group related characteristics, such as "city", "region" and "country". Finally, the master data tables hold information on the characteristics, for example, attributes like the "color" or the "price" of a product.
In this example, there are dimensions on time, region, product and sales organization. Thus the fact table holds information on sales figures, profit, costs etc. per day, product, city and sales organization. Each query on such a star schema uses/materializes a certain subset of those relationships.

17 Infocube IUSALES Facttable Dimension 1 S (Population) X (City)
Here we see a subset of the tables of the star schema of infocube IUSALES. The red arrows connect each foreign key column (end of the arrow) with the corresponding key column (head of the arrow).
The facttable contains one foreign key column per infocube dimension and one column per key figure of the infocube. The dimension table consists of a dimension id (DIMID) column, which constitutes the primary key of the dimension, plus one column per characteristic in that dimension. Those columns hold the SID (surrogate id) values of the corresponding characteristic.
In the 3rd layer there are the SID tables of the characteristics. Such a table can be a standard S-table (contains only the relationship between SID and characteristic key), an X-table (SID-key relationship plus SID columns per time-independent navigational attribute) or a Y-table (SID-key relationship, timestamp, SID columns per time-dependent navigational attribute). In the 4th layer there are standard S-tables for the navigational attributes.

18 Infocube Indexing (1) – Oracle
line item dimension Facttable Dimension 1 S (Population) X (City) Bitmap Index B-Tree (unique) B-Tree (non-unique)
This slide shows the indexing scheme for a typical infocube: In the standard case, the facttable has single-column bitmap indexes on the dimension columns. The exception is a column of a line-item dimension, which has a (non-unique) B-tree rather than a bitmap index because of its assumed high cardinality. A dimension table has a primary index (i.e. a unique B-tree) on its DIMID column and a concatenated non-unique B-tree index over the SID columns of the characteristics. The order of those SID columns in the index is determined by the order in which the characteristics were assigned to the dimension when the infocube was defined.
This concatenated index is typically used during data load, when the insert program has to determine whether a given combination of characteristic values (or rather their SIDs) already exists in the infocube. If it does, that combination already has a DIMID, which can be read from the dimension table (this access is supported by the concatenated index); otherwise a new DIMID has to be drawn and the new combination of characteristic SIDs is inserted into the dimension table under the new DIMID. The concatenated index is typically not used in BW queries.
X- and Y-tables only have a primary key. S-tables have a primary key index – the primary key comprises the characteristic key column(s) – and a unique index on the SID column. The uniqueness of the latter serves as an additional check on the uniqueness of the SIDs.
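The DIMID lookup that the notes describe for data load can be sketched as a get-or-create operation. This is a minimal sketch using SQLite, assuming a simplified region dimension; the table name `dim1`, the index name and the function names are hypothetical and not the BW load program itself.

```python
import sqlite3


def make_dimension(conn):
    # Hypothetical, simplified stand-in for a dimension table.
    conn.execute(
        "CREATE TABLE dim1 (dimid INTEGER PRIMARY KEY,"
        " sid_city INTEGER, sid_region INTEGER, sid_country INTEGER)"
    )
    # Concatenated non-unique index over the SID columns, in the order in
    # which the characteristics were assigned to the dimension.
    conn.execute(
        "CREATE INDEX dim1_sids ON dim1 (sid_city, sid_region, sid_country)"
    )


def get_or_create_dimid(conn, sid_city, sid_region, sid_country):
    sids = (sid_city, sid_region, sid_country)
    # This lookup is the access that the concatenated index supports.
    row = conn.execute(
        "SELECT dimid FROM dim1"
        " WHERE sid_city = ? AND sid_region = ? AND sid_country = ?",
        sids,
    ).fetchone()
    if row is not None:
        return row[0]  # the combination already has a DIMID
    # Otherwise draw a new DIMID and store the new SID combination under it.
    cur = conn.execute(
        "INSERT INTO dim1 (sid_city, sid_region, sid_country)"
        " VALUES (?, ?, ?)",
        sids,
    )
    return cur.lastrowid
```

Calling `get_or_create_dimid` twice with the same SIDs returns the same DIMID, so the dimension table holds each combination only once.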

19 Infocube Indexing (2) – MS SQL Server
line item dimension Facttable Dimension 1 S (Population) X (City) B-tree Index (nonunique, nonclustered) B-Tree (unique, clustered)
This slide shows the indexing scheme for a typical infocube: In the standard case, the facttable has single-column bitmap indexes on the dimension columns. The exception is a column of a line-item dimension, which has a (non-unique) B-tree rather than a bitmap index because of its assumed high cardinality. A dimension table has a primary index (i.e. a unique B-tree) on its DIMID column and a concatenated non-unique B-tree index over the SID columns of the characteristics. The order of those SID columns in the index is determined by the order in which the characteristics were assigned to the dimension when the infocube was defined.
This concatenated index is typically used during data load, when the insert program has to determine whether a given combination of characteristic values (or rather their SIDs) already exists in the infocube. If it does, that combination already has a DIMID, which can be read from the dimension table (this access is supported by the concatenated index); otherwise a new DIMID has to be drawn and the new combination of characteristic SIDs is inserted into the dimension table under the new DIMID. The concatenated index is typically not used in BW queries.
X- and Y-tables only have a primary key. S-tables have a primary key index – the primary key comprises the characteristic key column(s) – and a unique index on the SID column. The uniqueness of the latter serves as an additional check on the uniqueness of the SIDs.

20 Infocube Indexing (3) – Oracle
F Facttable; partitioning column (for E facttable); E Facttable; "P-index"
Legend: Bitmap Index; B-Tree (unique); B-Tree (non-unique); single column indexes support queries; P-index: compress; additional bitmap index on part. column
This slide shows the indexing scheme of the facttables of a partitioned infocube. The significant issue here is an additional bitmap index on the column of the F facttable that corresponds to the partitioning column of the E facttable (see red box). This index's name is "900", i.e. "/BIC/F...~900" on the database. The purpose of that index is to support restrictions that are likely to exist on this column. In the case of a partitioned E facttable, the BW SQL generator translates time restrictions into restrictions on the partitioning column whenever possible. This allows the Oracle query optimizer to prune the query to the relevant partitions of the E facttable. The "900" index allows the F facttable to benefit from those additional (and redundant) restrictions too.
There is no difference between standard and transactional infocubes. However, in the case of transactional infocubes it is assumed that this is the only bitmap index on the F facttable. Otherwise, transactional write accesses would result in deadlock situations. Please refer to the discussion of deadlocking on the transactional infocube slide.

21 Infocube Indexing (4) – MS SQL Server
F Facttable; Does not exist on MS-SQL; E Facttable; "P-index"
Legend: B-tree Index (nonunique, nonclustered); B-Tree (unique, nonclustered); single column indexes support queries; P-index: compress
This slide shows the indexing scheme of the facttables of a partitioned infocube. The significant issue here is an additional bitmap index on the column of the F facttable that corresponds to the partitioning column of the E facttable (see red box). This index's name is "900", i.e. "/BIC/F...~900" on the database. The purpose of that index is to support restrictions that are likely to exist on this column. In the case of a partitioned E facttable, the BW SQL generator translates time restrictions into restrictions on the partitioning column whenever possible. This allows the Oracle query optimizer to prune the query to the relevant partitions of the E facttable. The "900" index allows the F facttable to benefit from those additional (and redundant) restrictions too.
There is no difference between standard and transactional infocubes. However, in the case of transactional infocubes it is assumed that this is the only bitmap index on the F facttable. Otherwise, transactional write accesses would result in deadlock situations. Please refer to the discussion of deadlocking on the transactional infocube slide.

22 Infocube Operations (1)
INSERT: only F facttable
  array INSERT
  if array INSERT fails: UPSERT logic
DELETE request (mass deletion): only F facttable
  DELETE FROM "/BIC/FIUSALES" WHERE KEY_IUSALESP = …
  alternatively: DROP PARTITION
DELETE specified data
  DELETE FROM … WHERE …
UPSERT: only E facttable
  infocube compression (separate slide)
SELECT
  separate slide

23 Infocube Compression (ex.: request 3)
UPDATE INSERT before after

24 Infocube Compression (2)
Oracle (via stored procedure; on DB server)
  loop over rows for request REQ in F facttable
    attempt UPDATE of E facttable
    if UPDATE fails then INSERT rowid into temporary table INS
  do mass INSERT INTO E facttable using INS
  DROP PARTITION corresponding to REQ in F facttable
MS SQL-Server (via ABAP; via application server)
  loop over rows for request REQ in F facttable
    attempt UPDATE of E facttable
    if UPDATE fails then attempt INSERT
  DELETE FROM F facttable WHERE requestid = REQ
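The UPDATE-else-INSERT logic of compression can be sketched as follows. This follows the simpler row-by-row variant (attempt UPDATE, INSERT on failure, then delete the request from the F facttable) and runs against SQLite; the table names `f_fact` and `e_fact`, the reduced column set and the function names are assumptions for illustration only.

```python
import sqlite3


def make_facttables(conn):
    # Hypothetical, reduced fact tables: key_p is the request (package) key,
    # which exists only in the F table; quan is the single key figure.
    conn.execute(
        "CREATE TABLE f_fact"
        " (key_p INTEGER, key_1 INTEGER, key_t INTEGER, quan INTEGER)"
    )
    conn.execute(
        "CREATE TABLE e_fact (key_1 INTEGER, key_t INTEGER, quan INTEGER,"
        " PRIMARY KEY (key_1, key_t))"
    )


def compress(conn, request_key):
    rows = conn.execute(
        "SELECT key_1, key_t, quan FROM f_fact WHERE key_p = ?",
        (request_key,),
    ).fetchall()
    for key_1, key_t, quan in rows:
        # Attempt to add the key figure to an existing E row.
        cur = conn.execute(
            "UPDATE e_fact SET quan = quan + ? WHERE key_1 = ? AND key_t = ?",
            (quan, key_1, key_t),
        )
        if cur.rowcount == 0:
            # No matching E row yet: insert a new one.
            conn.execute(
                "INSERT INTO e_fact VALUES (?, ?, ?)", (key_1, key_t, quan)
            )
    # Finally remove the compressed request from the F table.
    conn.execute("DELETE FROM f_fact WHERE key_p = ?", (request_key,))
```

Because the E table no longer carries the request key, rows from different requests with identical dimension keys collapse into one row, which is where the reduction in size comes from.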

25 Aggregate Fill
INSERT INTO [/BIC/E100010]
SELECT [D1].[SID_IUCITY] AS [KEY_ ],
       [D2].[SID_IUSALPER] AS [KEY_ ],
       0 AS [KEY_100010P],
       SUM ([F].[/BIC/IUPROFIT]),
       SUM ([F].[/BIC/IUQUAN]),
       COUNT(*) AS [FACTCOUNT]
FROM [/BIC/FIUSALES] [F],
     [/BIC/DIUSALES1] [D1],
     [/BIC/DIUSALES2] [D2],
     [/BIC/DIUSALESP] [DP]
WHERE [F].[KEY_IUSALES1] = [D1].[DIMID]
  AND [F].[KEY_IUSALES2] = [D2].[DIMID]
  AND [F].[KEY_IUSALESP] = [DP].[DIMID]
  AND [DP].[SID_0CHNGID] = 0
  AND ( [F].[KEY_IUSALESP] = 0 OR [F].[KEY_IUSALESP] = 2 )
  AND [DP].[SID_0REQUID] BETWEEN 0 AND 40
GROUP BY [D1].[SID_IUCITY], [D2].[SID_IUSALPER]

26 Aggregate Roll-Up
INSERT INTO [/BIC/F100011]
SELECT [D1].[SID_IUCITY] AS [KEY_ ],
       [D3].[SID_IUPROD] AS [KEY_ ],
       7 AS [KEY_100011P],
       SUM ([F].[/BIC/IUPROFIT]),
       SUM ([F].[/BIC/IUQUAN]),
       COUNT(*) AS [FACTCOUNT]
FROM [/BIC/FIUSALES] [F],
     [/BIC/DIUSALES1] [D1],
     [/BIC/DIUSALES3] [D3],
     [/BIC/DIUSALESP] [DP]
WHERE [F].[KEY_IUSALES1] = [D1].[DIMID]
  AND [F].[KEY_IUSALES3] = [D3].[DIMID]
  AND [F].[KEY_IUSALESP] = [DP].[DIMID]
  AND [DP].[SID_0CHNGID] = 0
  AND [F].[KEY_IUSALESP] = 5
  AND [DP].[SID_0REQUID] = 498
GROUP BY [D1].[SID_IUCITY], [D3].[SID_IUPROD]

27 Infocube Query Example: Infocube IUSALES
city region country (1) Fact Table (2) Dimensions (3) Characteristics (simplified) sales person day division month distribution channel year sales organization product product group

28 Query Example & Processing (under Oracle)
region country = 'US' (1) Fact Table (2) Dimensions (3) Characteristics (simplified) month year = [98-99] product group

29 Step 1: Restrictions Master Data → Dimensions
region country = 'US' (1) Fact Table (2) Dimensions (3) Characteristics (simplified) month year = [98-99] product group Typical Query Processing

30 Step 2: Restrictions Dimensions → Fact Table
(3) Characteristics (simplified) bitmap index bitmap index product group Typical Query Processing

31 Step 3: Assemble Result (1) Fact Table (2) Dimensions
region country = 'US' (1) Fact Table (2) Dimensions (3) Characteristics (simplified) small subset of facttable month year = [98-99] product group Typical Query Processing
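The three processing steps (restrictions from master data to dimensions, from dimensions to the fact table via bitmap indexes, then assembling the result from a small subset of the facttable) can be imitated in plain Python. This is a toy sketch: Python integers stand in for bitmaps, the index is built on the fly rather than maintained by the database, and all names and data layouts are hypothetical.

```python
def build_bitmap_index(fact_rows, column):
    """Map each dimension key to an int whose set bits mark matching rows."""
    index = {}
    for pos, row in enumerate(fact_rows):
        index[row[column]] = index.get(row[column], 0) | (1 << pos)
    return index


def matching_rows(index, dimids):
    """OR together the bitmaps of all qualifying dimension keys."""
    bitmap = 0
    for dimid in dimids:
        bitmap |= index.get(dimid, 0)
    return bitmap


def run_query(fact_rows, dims, restrictions):
    """Sum the profit of all fact rows that satisfy every restriction.

    dims:         fact column -> {dimid: attribute dict}
    restrictions: fact column -> predicate over the attribute dict
    """
    result = (1 << len(fact_rows)) - 1  # start with all rows selected
    for column, predicate in restrictions.items():
        # Step 1: evaluate the restriction on the (small) dimension.
        dimids = [d for d, attrs in dims[column].items() if predicate(attrs)]
        # Step 2: carry it to the fact table and AND the bitmaps.
        index = build_bitmap_index(fact_rows, column)
        result &= matching_rows(index, dimids)
    # Step 3: read only the selected fact rows and aggregate.
    return sum(
        row["profit"]
        for pos, row in enumerate(fact_rows)
        if (result >> pos) & 1
    )
```

The point of the bitmap representation is that combining restrictions from several dimensions is a cheap bitwise AND, and the large fact table is read only for the rows that survive it.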

32 Query Example (1) – simple
SELECT "DT"."SID_0CALMONTH" AS "S____081",
       "DT"."SID_0CALYEAR" AS "S____083",
       "D1"."SID_IUCOUNTRY" AS "S____520",
       "D3"."SID_IUPRODGRP" AS "S____524",
       COUNT( * ) AS "1ROWCOUNT",
       SUM ( "F"."/BIC/IUPROFIT" ) AS "IUPROFIT",
       SUM ( "F"."/BIC/IUQUAN" ) AS "IUQUAN"
FROM "/BIC/FIUSALES" "F",
     "/BIC/DIUSALEST" "DT",
     "/BIC/DIUSALES1" "D1",
     "/BIC/DIUSALES3" "D3",
     "/BIC/DIUSALESP" "DP"
WHERE "F"."KEY_IUSALEST" = "DT"."DIMID"
  AND "F"."KEY_IUSALES1" = "D1"."DIMID"
  AND "F"."KEY_IUSALES3" = "D3"."DIMID"
  AND "F"."KEY_IUSALESP" = "DP"."DIMID"
  AND ( "DT"."SID_0CALMONTH" =
    AND "DT"."SID_0CALYEAR" =
    AND "DP"."SID_0REQUID" <= 745 )
GROUP BY "DT"."SID_0CALMONTH", "DT"."SID_0CALYEAR", "D1"."SID_IUCOUNTRY", "D3"."SID_IUPRODGRP"

33 Query Example (2) – navigational attribute
SELECT "DT"."SID_0CALMONTH" AS "S____081",
       "DT"."SID_0CALYEAR" AS "S____083",
       "D1"."SID_IUCOUNTRY" AS "S____520",
       "X1"."S__IUCOLOR" AS "S____530",
       COUNT( * ) AS "1ROWCOUNT",
       SUM ( "F"."/BIC/IUPROFIT" ) AS "IUPROFIT",
       SUM ( "F"."/BIC/IUQUAN" ) AS "IUQUAN"
FROM "/BIC/FIUSALES" "F",
     "/BIC/DIUSALEST" "DT",
     "/BIC/DIUSALES1" "D1",
     "/BIC/DIUSALES3" "D3",
     "/BIC/XIUPROD" "X1",
     "/BIC/DIUSALESP" "DP"
WHERE "F"."KEY_IUSALEST" = "DT"."DIMID"
  AND "F"."KEY_IUSALES1" = "D1"."DIMID"
  AND "F"."KEY_IUSALES3" = "D3"."DIMID"
  AND "D3"."SID_IUPROD" = "X1"."SID"
  AND "F"."KEY_IUSALESP" = "DP"."DIMID"
  AND ( "DT"."SID_0CALMONTH" =
    AND "DT"."SID_0CALYEAR" =
    AND "DP"."SID_0REQUID" <=
    AND "X1"."OBJVERS" = 'A' )
GROUP BY "DT"."SID_0CALMONTH", "DT"."SID_0CALYEAR", "D1"."SID_IUCOUNTRY", "X1"."S__IUCOLOR"

34 Query Example (3) – external hierarchy
SELECT "DT"."SID_0CALYEAR" AS "S____083",
       "DT"."SID_0CALMONTH" AS "S____081",
       "D1"."SID_IUCOUNTRY" AS "S____520",
       "H1"."PRED" AS "S____524",
       COUNT( * ) AS "1ROWCOUNT",
       SUM ( "F"."/BIC/IUPROFIT" ) AS "IUPROFIT",
       SUM ( "F"."/BIC/IUQUAN" ) AS "IUQUAN"
FROM "/BIC/FIUSALES" "F",
     "/BIC/DIUSALES3" "D3",
     "/BIC/DIUSALEST" "DT",
     "/BIC/DIUSALES1" "D1",
     "/BIC/DIUSALESP" "DP",
     "/BI0/ " "H1" /* This is a (UNION) view! */
WHERE "F"."KEY_IUSALES3" = "D3"."DIMID"
  AND "F"."KEY_IUSALEST" = "DT"."DIMID"
  AND "F"."KEY_IUSALES1" = "D1"."DIMID"
  AND "F"."KEY_IUSALESP" = "DP"."DIMID"
  AND "D3"."SID_IUPRODGRP" = "H1"."SUCC"
  AND ( "DT"."SID_0CALYEAR" = 2000
    AND "DP"."SID_0REQUID" <= 745
    AND "H1"."SUCC" <> )
GROUP BY "H1"."PRED", "DT"."SID_0CALYEAR", "DT"."SID_0CALMONTH", "D1"."SID_IUCOUNTRY"

35 Examples of Conceptual Modeling in SAP BW

36 Examples
Reveal why pure RDBMS technology ...
  sometimes requires an additional conceptual layer on top,
  is not sufficient in some cases,
  has no chance in some situations because it has to be more general than necessary.
Examples
  example 1: infoproviders in SAP BW
    uniform view on differing physical layouts
  example 2: non-cumulative key figures in SAP BW
    semantic relationship between table columns
  example 3: aggregates in SAP BW
    could be implemented by using materialized views (or equivalent), but those have proved to be inferior

37 Example 1: Infoprovider (1)
An infoprovider in SAP BW ...
  comprises a reporting scenario,
  is the entity on which a query is defined,
  combines (aggregated or non-aggregated) operational data with master data (e.g. product, customer, ... data), or
  constitutes a master data entity

38 Example 1: Infoprovider (2) – Examples
Example A: a cube is an infoprovider
  fact table holds operational data on certain granularity
  dimensions hold master data
Example B: customer master data can be an infoprovider
  same UI as for other infoproviders
  selections, projections, summaries using attributes (e.g. address, customer category, region, ...)

39 Example 1: Infoprovider (3) -- Overview (SAP BW 3.x)
Infocube: multi-dim., analytical (sales cube)
ODS-Object: flat, operational (POS data)
Characteristic (master data): flat, master data (product data)
Multiprovider: UNION
Infoset: JOIN
Virtual Infoprovider: API, e.g. remote access, real time

40 Example 2: Non-Cumulative Key Figures (1)
also: "semi-additive measures"
example: account balance
conceptually (columns: account, day, balance, delta):
A 29-Sep 100 € 10 € B 500 € 30-Sep 110 € - 100 € 1-Oct 400 € 2-Oct 3-Oct - 60 € 4-Oct 50 €
physically (columns: account, day, ref point, delta):
A 29-Sep no 10 € B 30-Sep - 100 € 3-Oct - 60 € 4-Oct yes 50 € 400 €

41 Example 2: Non-Cumulative Key Figures (2)
non-cumulative key figures / semi-additive measures
  balance can be reconstructed for any moment in the past
  → that information does not have to be physically stored
advantages
  significantly reduced data volumes
  better performance
  more flexibility
however: algorithms are required for
  reconstruction → read
  insertion → load
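The reconstruction of a balance from a reference point and stored deltas can be written down in a few lines. This is a sketch under stated assumptions: the reference point holds the current balance, each delta is dated, and a delta counts towards the balance of its own day. The function name `balance_on`, this convention and the year in the usage example are illustrative choices, not taken from the slides.

```python
from datetime import date


def balance_on(reference_balance, deltas, day):
    """Return the balance at the end of `day`.

    reference_balance: current balance, valid after all deltas are applied
    deltas:            iterable of (date, amount) pairs
    """
    # Undo every change that happened after the requested day.
    later_changes = sum(amount for d, amount in deltas if d > day)
    return reference_balance - later_changes
```

For example, with a reference balance of 50 and deltas of +10 on 29 September and -60 on 3 October, `balance_on(50, deltas, date(2003, 9, 30))` gives 110. Only the two delta rows and one reference row are stored, instead of one balance row per account and day.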

42 Example 3: Aggregates in SAP BW
SAP BW constraints:
  only SUM, MIN, MAX aggregations are materialized
  uploaded data (in an infocube) can still be identified → delta roll-ups are simple
Materialized or Indexed Views / Automatic Summary Tables could be used in theory
however: maintenance is considerably slower ...
  due to expensive tracking and logging mechanisms that are necessary if the general case has to be covered
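Why delta roll-ups are simple under these constraints can be illustrated with a SUM aggregate: since every fact row still carries its request, a new request can be added to the aggregate without rebuilding it. The sketch below assumes a single key figure `profit` and dictionary-shaped rows; all names are hypothetical.

```python
from collections import defaultdict


def new_aggregate():
    # One entry per combination of grouping values.
    return defaultdict(lambda: {"profit": 0, "factcount": 0})


def roll_up(aggregate, fact_rows, request_id, group_by):
    """Add the rows of one request to an existing SUM aggregate."""
    for row in fact_rows:
        if row["request"] != request_id:
            continue  # only the newly uploaded request is read
        key = tuple(row[column] for column in group_by)
        aggregate[key]["profit"] += row["profit"]
        aggregate[key]["factcount"] += 1
    return aggregate
```

Rolling up request after request yields the same totals as a full rebuild, because SUM (like MIN and MAX) can be updated from the new rows alone as long as data is only added; no change tracking or logging on the base table is needed.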

43 Summary

44 Summary
brief introduction to SAP BW
three examples:
  an additional conceptual layer on top of the relational one
  a semantic pattern that is frequently used in business
  an object that might suffer from the generic approach
Do the examples reveal shortcomings of RDBMS or are they application domain specific?

