Tim Gorman Evergreen Database Technologies, Inc.


1 Tim Gorman Evergreen Database Technologies, Inc.
North Carolina Oracle Users Group
Scaling to Infinity: Partitioning Data Warehouses in Oracle

2 Speaker Qualifications
Tim Gorman (chief, cook, and bottle-washer)
Director of RMOUG “Training Days 2007” conference
Co-author (with Gary Dodge)
  “Oracle8 Data Warehousing”, 1998, John Wiley & Sons
  “Essential Oracle8i Data Warehousing”, 2000, John Wiley & Sons
Co-author (with Oak Table Network)
  “Oracle Insights: Tales of the Oak Table”, 2004, Apress
Oracle data warehousing DBA since 1994
  Technical manager at Oracle Consulting
  Independent consultant since 1998

3 I’ve seen the good, the bad, and the ugly…
Agenda
I’ve seen the good, the bad, and the ugly…
  In the end, successful data warehouses are a combination of all three
But in general, I see three major errors that result in doom…
  Ignore basic requirements for DW and design what is familiar
  Fail to portray data changes over time
  Fail to utilize partitioning from the beginning

4 Ignoring the requirements
Repeat after me -- reporting and analysis applications do not enforce business rules!
Reporting and analysis applications are responsible for presenting data in the format that works best for end-users and their query/analysis tools
  Very often, what end-users seem to want is a simple spreadsheet, hundreds of columns wide
  GIVE THEM WHAT THEY WANT!
  Conceal from them what it takes to provide what they want
Do NOT build a data model to enforce referential integrity and/or business rules

5 Ignoring the requirements
Third-normal form:
  Eliminate repeating groups
    Every attribute is atomic and scalar
  Eliminate functional dependencies on composite key components
    Every attribute is functionally dependent on the whole key
  Eliminate functional dependencies on non-key components
    Every fact/attribute in the entity should rely on nothing but the key
4th, 5th, and 6th normal forms have been defined
  But most entities that are in 3NF are also in 4th, 5th, and 6th NF
Intended for use in process-oriented operational systems
  Enforce data integrity according to business rules
  Using referential-integrity constraint mechanisms in application code as well as databases

6 Ignoring the requirements
Data presented in a simplistic dimensional model
  Versus the 3rd-normal-form (3NF) entity-relationship model used by most operational systems
Ralph Kimball discusses in The Data Warehouse Toolkit
  John Wiley & Sons - ISBN #
Provide immediate, on-demand, and high-performance access to corporate or organizational subject data
Comprised of fact tables containing varying levels of summarized data and dimension tables representing important subject areas
Very simple representation of data
  It is a spreadsheet with one degree of normalization for flexibility
Also known as a star schema because diagrams generally represent the fact table as a hub and dimensions as spokes

7 Ignoring the requirements
[Diagram: transactional/operational entity-relational modeling (Customers, Suppliers, Products, Orders, Order Lines) versus dimensional modeling (Order Facts surrounded by Customers Dim, Suppliers Dim, Products Dim, Time Dim)]

8 Ignoring the requirements
Fact tables
  More volatile
  Contain columns for:
    Dimension keys
    Measures
  In a spreadsheet or tabular report
    Dimension keys don’t appear at all
    Measures appear in the “cells” of the report
Dimension tables
  Usually more static
    Although the dimension for people is usually quite volatile
  Attributes
    Attributes appear as “column headers” or “row headers”

9 Time-variant data, who cares?
Two major types of queries from business intelligence applications to data warehouse databases
Point in time
  What is the present situation? What do the numbers look like now?
  “Situational awareness” applications, also known as “dashboards” or “executive information systems”
  Usually uses the present point in time, but could also use any specific point in time in the past
Trend analysis
  How do things look now versus 3 months ago? A year ago?
  How have things changed day-by-day over the past quarter?
  Week-by-week over the past year?
  Month-by-month over the past 7 years?

10 Time-variant data, who cares?
Consider this…
Dimension tables are usually designed to be point-in-time or type-1
  People, items, products, etc.
  Locations, time, etc.
Fact tables are almost always designed to be time-variant or type-2
  Transactions
What happens when you join transactions from years ago with dimensional attributes from the present?
  For example, when analyzing purchases by location, does it make sense to summarize all transactions by a person’s present location?
  Or should it reflect the person’s location at the time of the transaction?

11 Time-variant data, who cares?
Every data warehouse has at least one slowly-changing dimension (SCD)
  Usually involving “people” (i.e. accounts, customers, employees, parties, etc)
Static dimensions do not need to be time-variant
  Identifying “static” dimensions: if a change is made to the dimension, should it be reflected across all time?
SCDs should be represented as “type-2”
  “type-1” views of SCDs can be created as needed
  “type-1” views of fact tables can also be created, if necessary, to support point-in-time tactical reporting

12 Time-variant data, who cares?
[Diagram: two tables side by side]
Type-2 Dimension (time-variant): PERSON_DIM
  PK: Person_key, Eff_dt
  Last_name, First_name, Address_1, Address_2, City
Type-1 Dimension (point-in-time view): CURR_PERSON_DIM
  PK: Person_key
  Eff_dt, Last_name, First_name, Address_1, Address_2, City
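
As a sketch of how the type-1 view on this slide might be derived from the type-2 table, the statements below keep only the latest row per Person_key. Table and column names come from the diagram; the view body and the bind variable :as_of_dt are illustrative assumptions, not the author's DDL.

```sql
-- Current image: the most recent version of each person.
CREATE OR REPLACE VIEW curr_person_dim AS
SELECT person_key, eff_dt, last_name, first_name,
       address_1, address_2, city
FROM  (SELECT d.*,
              ROW_NUMBER() OVER (PARTITION BY d.person_key
                                 ORDER BY d.eff_dt DESC) AS rn
       FROM   person_dim d)
WHERE  rn = 1;

-- Same idea for an arbitrary point in time:
-- ignore versions that became effective after :as_of_dt (hypothetical bind variable).
SELECT person_key, eff_dt, last_name, first_name,
       address_1, address_2, city
FROM  (SELECT d.*,
              ROW_NUMBER() OVER (PARTITION BY d.person_key
                                 ORDER BY d.eff_dt DESC) AS rn
       FROM   person_dim d
       WHERE  d.eff_dt <= :as_of_dt)
WHERE  rn = 1;
```

Because the type-2 table is only ever inserted into, either result can be rebuilt at any time without touching the history.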

13 Time-variant data, who cares?
[Diagram: TXN_FACT (…, Person_key, Person_eff_dt) is the “many” side of two relationships: to PERSON_DIM on (Person_key, Eff_dt) and to CURR_PERSON_DIM on Person_key]
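
The two relationships in this diagram correspond to two different joins. The queries below are a minimal sketch using the table and key names from the slide; the measure column TXN_AMT is a hypothetical name introduced for illustration.

```sql
-- Attributes as they were at the time of each transaction (type-2 join).
SELECT p.city, SUM(f.txn_amt) AS total_amt
FROM   txn_fact f
JOIN   person_dim p
       ON  p.person_key = f.person_key
       AND p.eff_dt     = f.person_eff_dt
GROUP  BY p.city;

-- Attributes as they are now (type-1 join): every transaction is
-- attributed to the person's present city, whenever it occurred.
SELECT c.city, SUM(f.txn_amt) AS total_amt
FROM   txn_fact f
JOIN   curr_person_dim c
       ON c.person_key = f.person_key
GROUP  BY c.city;
```

The two totals by city differ for anyone who has moved, which is the question raised on slide 10.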

14 Time-variant data, who cares?
Slowly-changing dimensions should always be “type-2”
  With “type-1” views constructed using the just-loaded “type-2” data
So, with this in mind…
  Why do people so often treat time-variant tables as an after-thought?
  Why do “extraction-transformation-loading” (ETL) processes so often focus on “MERGE” logic (“if row doesn’t exist then INSERT else UPDATE”) on the current point-in-time tables, and then insert change data as an after-thought
    a.k.a. “type-1” or “point-in-time” data
  Instead of…
    inserting change data into the time-variant “type-2” table
    from which point-in-time “type-1” views (as materialized views?) can be built for any point-in-time?
Think about it…
  If users should be using “type-2” data for SCDs, who usually utilizes the “type-1” views of the SCDs?
  What are they good for?

15 Four characteristics of a DW
Non-volatile, time-variant, subject-oriented, integrated
  Bill Inmon “Building the Data Warehouse” 3rd Ed 2002 (Wiley)
Think about what these mean?
  Consider the converse of these characteristics?
    Volatile? Static-image? Process-oriented? Application-specific?
Time-variant, non-volatile database implies:
  Insert, index, and analyze each row of data only once
  From an implementation perspective, this is vital to remember!
    And often ignored completely!!!
Consider an extreme situation?
  Analytical database for quantum research in physics
  50 Tbytes of data to load every day

16 Insert-only processing enables…
The Virtuous Cycle
Insert-only processing enables…
  Tables and indexes partitioned by time
    Optionally sub-partitioned by other key values
Partitioned tables/indexes enable…
  Partition pruning during queries
  Direct-path loads using EXCHANGE PARTITION
  Time-variant tables/indexes and tablespaces
  Purging using DROP or TRUNCATE partition instead of DELETE
Partition pruning enables…
  Infinite scalability for queries, regardless of how large the database becomes
Direct-path (a.k.a. append) loads enable…
  Ability to load more data, faster, more efficiently
  Table compression
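
Two of these benefits can be sketched in a few statements. The example assumes the TXN table used later in the deck (range-partitioned on TXN_DATE); the measure column TXN_AMT and the partition names P20040222 and P20040223 are hypothetical.

```sql
-- Partition pruning: the predicate on the range-partition key lets the
-- optimizer scan only the partition(s) covering 24-Feb-2004, however
-- many other partitions the table has accumulated.
SELECT acct_key, SUM(txn_amt) AS total_amt
FROM   txn
WHERE  txn_date >= TO_DATE('2004-02-24', 'YYYY-MM-DD')
AND    txn_date <  TO_DATE('2004-02-25', 'YYYY-MM-DD')
GROUP  BY acct_key;

-- Purging by partition instead of DELETE: either remove the partition
-- entirely...
ALTER TABLE txn DROP PARTITION p20040222 UPDATE GLOBAL INDEXES;

-- ...or keep the partition definition and discard only its rows.
ALTER TABLE txn TRUNCATE PARTITION p20040223 UPDATE GLOBAL INDEXES;
```

Both purge statements are DDL, so they avoid the row-by-row undo and redo that an equivalent DELETE would generate.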

17 Time-variant partitioned tables/indexes enable…
The Virtuous Cycle
Time-variant partitioned tables/indexes enable…
  Time-variant tablespaces
Time-variant tablespaces enable…
  READ ONLY tablespaces for older, less-volatile data
READ ONLY tablespaces enable…
  Near-line storage (i.e. NAS, SAMFS/HFS, etc)
  “Right-sizing” of storage to the need, classified by IOPS
  Backup efficiencies
    READ WRITE tablespaces scheduled for backup every day or week
    READ ONLY tablespaces scheduled for backup every quarter or year
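
A minimal sketch of a time-variant tablespace scheme follows. The tablespace names TXN_2003Q4 and TXN_2004Q1 and the partition name are hypothetical; the idea is only that each period's partitions live in their own tablespace, which is frozen once the period is no longer loaded.

```sql
-- New partitions for the current quarter go into the current tablespace.
ALTER TABLE txn ADD PARTITION p20040226
  VALUES LESS THAN (TO_DATE('2004-02-27', 'YYYY-MM-DD'))
  TABLESPACE txn_2004q1;

-- Once a quarter is closed, freeze its tablespace.
ALTER TABLESPACE txn_2003q4 READ ONLY;

-- Check which tablespaces are READ ONLY versus ONLINE (read-write).
SELECT tablespace_name, status
FROM   dba_tablespaces
ORDER  BY tablespace_name;
```

A READ ONLY tablespace cannot change, so one backup taken after the switch remains valid until the tablespace is made read-write again.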

18 Using EXCHANGE PARTITION for loads enables…
The Virtuous Cycle
Using EXCHANGE PARTITION for loads enables…
  Elimination of ETL “load window” and 24x7 availability for queries
  Direct-path loads
  Bitmap indexes and bitmap-join indexes
Bitmap indexes enable…
  Star transformations on “star” (dimensional) schemas
Star transformations enable…
  Bitmap-join indexes
  SUCCESS! Optimal query-execution plan for dimensional data models!

19 Conventional-path operations are trouble with:
The Death Spiral
Volatile data presented in a static-image according to process-oriented concepts leads to…
  ETL using “conventional-path” INSERT, UPDATE, and DELETE operations (including MERGE and multi-table INSERT)
Conventional-path operations are trouble with:
  Bitmap indexes and bitmap-join indexes
    Forcing frequent complete rebuilds until they get too big
  Contention in Shared Pool, Buffer Cache, global structures
  Mixing of queries and loads simultaneously on table and indexes
  Periodic rebuilds/reorgs of tables if deletions occur
  Full redo logging and undo transaction tracking
ETL will dominate the workload in the database
  Queries will consist mainly of “dumps” or extracts to downstream systems
  Query performance will be abysmal and worsening…

20 Everything just gets harder and harder to do…
The Death Spiral
Without partitioning
  Query performance worsens as tables/indexes grow larger
  Loads must be performed into “live” tables
    Users must be locked out during “load cycle”
    In-flight queries must be killed during “load cycle”
    Bitmap indexes must be dropped/rebuilt during “load cycle”
    Entire tables must be re-analyzed during “load cycle”
  Entire database must be backed up frequently
  Data cannot be “right-sized” to storage options according to IOPS
Everything just gets harder and harder to do…
  …and that stupid Oracle software is to blame…
  BRING ON TERADATA OR <insert-flavor-of-the-month>

21 Exchange Partition
The basic technique of bulk-loading new data into a temporary “load table”, which is then indexed, analyzed, and then “published” all at once to end-users using the EXCHANGE PARTITION operation, should be the default load technique for all large tables in a data warehouse
  fact tables
  slowly-changing or quickly-changing dimensions
Assumptions for this example:
  Composite partitioned fact table named TXN
    Range partitioned on DATE column TXN_DATE
    Hash partitioned on NUMBER column ACCT_KEY
  Data to be loaded into partition P on TXN
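
For concreteness, a table matching these assumptions might be declared as below. Only the names TXN, TXN_DATE, and ACCT_KEY come from the slide; the TXN_AMT column, the partition names, the daily partition granularity, and the count of four hash subpartitions are assumptions made for illustration.

```sql
CREATE TABLE txn (
  txn_date  DATE          NOT NULL,   -- range-partition key
  acct_key  NUMBER        NOT NULL,   -- hash-subpartition key
  txn_amt   NUMBER(12,2)              -- hypothetical measure column
)
PARTITION BY RANGE (txn_date)
SUBPARTITION BY HASH (acct_key) SUBPARTITIONS 4
(
  PARTITION p20040222 VALUES LESS THAN (TO_DATE('2004-02-23', 'YYYY-MM-DD')),
  PARTITION p20040223 VALUES LESS THAN (TO_DATE('2004-02-24', 'YYYY-MM-DD')),
  PARTITION p20040224 VALUES LESS THAN (TO_DATE('2004-02-25', 'YYYY-MM-DD')),
  -- Empty partition waiting for the next load cycle.
  PARTITION p20040225 VALUES LESS THAN (TO_DATE('2004-02-26', 'YYYY-MM-DD'))
);
```

Each range partition of such a table has the same shape as a standalone hash-partitioned table with four partitions, which is what makes the exchange in the following slides possible.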

22 Exchange Partition
[Diagram: composite-partitioned table TXN (hash-subpartitioned) with range partitions for 22-Feb 2004 through 25-Feb 2004, the newest one empty, alongside hash-partitioned table TXN_TEMP. Steps shown: 1. Create Temp Table; 2. Bulk Loads; 3. Index Creates; 4. Table & Col Stats; 5. EXCHANGE PARTITION]

23 Exchange Partition
1. Create temporary table TXN_TEMP as a hash-partitioned table
2. Perform parallel, direct-path load of new data into TXN_TEMP
3. Create indexes on the temporary hash-partitioned table TXN_TEMP corresponding to the local indexes on TXN
   using PARALLEL, NOLOGGING, and COMPUTE STATISTICS options
4. Gather CBO statistics on table TXN_TEMP
   Only table and column stats -- leave computed index stats!
5. alter table TXN exchange partition P with table TXN_TEMP
     including indexes without validation update global indexes;
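
The five steps can be sketched end to end as follows. This is an illustration under stated assumptions, not the author's exchpart package: the staging table STAGE_TXN, the index name, the partition name P20040225, the parallel degree, and the choice of four hash partitions (which must match the subpartition count of TXN) are all hypothetical.

```sql
-- 1. Create an empty hash-partitioned load table shaped like TXN.
CREATE TABLE txn_temp
  NOLOGGING
  PARTITION BY HASH (acct_key) PARTITIONS 4
AS SELECT * FROM txn WHERE 1 = 0;

-- 2. Parallel, direct-path load of the new day's data.
ALTER SESSION ENABLE PARALLEL DML;

INSERT /*+ APPEND PARALLEL(t, 4) */ INTO txn_temp t
SELECT /*+ FULL(s) PARALLEL(s, 4) */ *
FROM   stage_txn s
WHERE  s.txn_date >= TO_DATE('2004-02-25', 'YYYY-MM-DD')
AND    s.txn_date <  TO_DATE('2004-02-26', 'YYYY-MM-DD');

COMMIT;

-- 3. Build an index matching each local index on TXN (one shown).
--    COMPUTE STATISTICS follows the slide; later Oracle releases
--    gather index statistics during creation without it.
CREATE INDEX txn_temp_acct_ix ON txn_temp (acct_key)
  LOCAL PARALLEL 4 NOLOGGING COMPUTE STATISTICS;

-- 4. Gather table and column statistics only; CASCADE => FALSE
--    leaves the index statistics computed in step 3 untouched.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => USER,
    tabname    => 'TXN_TEMP',
    method_opt => 'FOR ALL COLUMNS SIZE AUTO',
    cascade    => FALSE);
END;
/

-- 5. Publish: swap the loaded segments into the empty partition.
ALTER TABLE txn EXCHANGE PARTITION p20040225 WITH TABLE txn_temp
  INCLUDING INDEXES WITHOUT VALIDATION UPDATE GLOBAL INDEXES;
```

Steps 1 through 4 touch only TXN_TEMP, so queries against TXN continue undisturbed until the exchange in step 5. WITHOUT VALIDATION skips the check that every row belongs in the target partition, so the WHERE clause in step 2 has to guarantee it.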

24 Exchange Partition
It is a good idea to encapsulate this logic inside PL/SQL packaged or stored procedures:
SQL> execute exchpart.prepare('TXN_FACT','TMP_', -
       '25-FEB-2004');
SQL> alter session enable parallel dml;
SQL> insert /*+ append parallel(n,4) */ into tmp_txn_fact n
     select /*+ full(x) parallel(x,4) */ *
     from stage_txn_fact x
     where load_date >= to_date('25-FEB-2004','DD-MON-YYYY')
     and load_date < to_date('28-FEB-2004','DD-MON-YYYY');
SQL> commit;
SQL> execute exchpart.finish('TXN_FACT','TMP_');
DDL for “exchpart.sql” posted at

25 Exchange Partition
Loading time-variant fact and dimension tables is not the only load activity in most data warehouses
Often, some tables contain current or point-in-time data
  Example: type-1 dimension “snowflaked” from type-2 dimension
  This is often an excellent situation for materialized views
    But, as is often the case, the refresh mechanisms built in with materialized views might not be the most efficient
With each load cycle, the current images need to be updated
  Instead of performing transactional MERGE (i.e. update or insert) logic directly on the table
  Rebuild the table into a temporary table, then “swap” it in using EXCHANGE PARTITION

26 Exchange Partition
[Diagram: composite-partitioned table ACCOUNT_DIM and hash-partitioned table CURR_ACCOUNT_DIM]
After the main composite-partitioned dimension table of ACCOUNT_HISTORY_DIM has been loaded (as documented on the previous slide), it is now necessary to reload or update the current-image “view” represented by the table ACCOUNT_DIM.
Since the main type-2 slowly-changing dimensions as well as the main fact tables represent the true image of time-variant data in the data warehouse, current-image dimensions and facts should be considered generically as “views” of these tables. These “views” are subsets of rows, with the subsets comprised of the latest versions of each data granule along the TIME dimension. Examples of this in CDCI include ACCOUNT_DIM (a.k.a. current-image view of ACCOUNT_HISTORY_DIM) and ACCOUNT_FINANCIAL (a.k.a. current-image view of several fact tables).
There are at least five main options for refreshing these current-image “views”:
1. Use ETL utilities (i.e. Ab Initio) to perform INSERT/UPDATE (a.k.a. “up-sert”) or “merge” logic on the current-image view/table.
2. Truncate and completely refresh the current-image view/table by extracting the entire set of data from the source (main) tables. For example, a full query of the source table which extracts only the most-recently updated rows would get the job done.
3. Capture changed rows in the source (main) table and incrementally update the current-image view/table.
4. Refresh a complete new copy of the current-image view/table, using a query that finds the most-recent record for each account solely from the entire ACCOUNT_HISTORY_DIM table, then use EXCHANGE PARTITION to publish the new data for end-users.
5. Refresh a complete new copy of the current-image view/table, by merging the existing ACCOUNT_DIM table with the newly-inserted rows in ACCOUNT_HISTORY_DIM, then use EXCHANGE PARTITION to publish the new data for end-users.
Our recommendation is option #5.
It holds to the basic ETL design principle of “only INSERTs and SELECTs and no UPDATEs or DELETEs” (which disqualifies options #1 and #3). This simple-yet-powerful design principle guarantees performance scalability as data volumes increase, and it is the basis of options #2, #4, and #5. Bulk INSERT operations are orders of magnitude faster than “conventional” SQL operations. Introducing any UPDATE or DELETE operations tends to inhibit scalable performance, and they should be used only as a last resort.
Option #5 is the best approach because it avoids a “complete” refresh from the large ACCOUNT_HISTORY_DIM table (i.e. option #2). It accomplishes this by utilizing the range-partitioning scheme on the ACCOUNT_HISTORY_DIM table as a “change-data capture” mechanism. That is, we know that the rows just inserted reside only in the latest partition. Thus, we can easily query only those recently-changed rows using Oracle’s “partition-pruning” mechanism and then merge that data against all of the existing rows in the ACCOUNT_DIM table.
Thus, we create a new copy of ACCOUNT_DIM in the ACCOUNT_DIM_TEMP table, which is almost a mirror-image of ACCOUNT_DIM. This is illustrated in the slide above.
Please note that ACCOUNT_DIM itself is a composite-partitioned table, with only a single range-partition on EFFECTIVE_DATE with values less than MAXVALUE. From a functional perspective, ACCOUNT_DIM is really just a hash-partitioned table, but physically it should be range-partitioned and hash-subpartitioned.
The purpose of this seemingly-useless range-partition is to simplify the EXCHANGE PARTITION logic to be seen on the next slide…
The following SQL statement can be utilized to perform the actual merge/build:
  INSERT /*+ append parallel(t, 16) */ INTO ACCOUNT_DIM_TEMP T
  SELECT /*+ ordered full(x) parallel(x, 16) */ DISTINCT
    LAST_VALUE(x.col1) OVER (PARTITION BY ACCOUNT_KEY ORDER BY EFFECTIVE_DATE
      ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) col1,
    LAST_VALUE(x.col2) OVER (PARTITION BY ACCOUNT_KEY ORDER BY EFFECTIVE_DATE
      ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) col2,
    LAST_VALUE(x.colN) OVER (PARTITION BY ACCOUNT_KEY ORDER BY EFFECTIVE_DATE
      ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) colN
  FROM (SELECT col1, col2, …, colN
        FROM ACCOUNT_DIM
        UNION
        SELECT col1, col2, …, colN
        FROM ACCOUNT_HISTORY_DIM
        WHERE EFFECTIVE_DATE BETWEEN TO_DATE('25-FEB-2004 00:00:00','DD-MON-YYYY HH24:MI:SS')
                                 AND TO_DATE('25-FEB-2004 23:59:59','DD-MON-YYYY HH24:MI:SS')
        ORDER BY ACCOUNT_KEY, EFFECTIVE_DATE) x;
The general idea is that the in-line view is comprised of the UNION of two SELECT statements. One of the SELECT statements retrieves the entire contents of the current ACCOUNT_DIM table. The other SELECT statement retrieves only the recently-changed rows in the ACCOUNT_HISTORY_DIM table.
This code example assumes that only the most recent partition (i.e. 25-Feb data) has been recently loaded. This may or may not be the case. If data was also loaded into earlier partitions, then it would make sense to query those as well. For example, if data was loaded as early as 21-Feb 2004, then the query might instead need to look like:
  BETWEEN TO_DATE('21-FEB-2004 00:00:00','DD-MON-YYYY HH24:MI:SS') AND … CDCI_LOAD_DATE …
This allows partition-pruning to limit the number of partitions scanned by the EFFECTIVE_DATE range-partition key column while limiting the rows retrieved by the date value in the CDCI_LOAD_DATE column to only those rows loaded on 25-Feb.
Please turn to the next slide for the remaining steps…
[Diagram: merge/build operation over partitions 23-Feb 2004, 24-Feb 2004, 25-Feb 2004]

27 Exchange Partition
[Diagram: EXCHANGE PARTITION between the previous cycle’s current-image data in composite-partitioned table CURR_ACCOUNT_DIM, with single partition named PZERO, and the new current-image data in hash-partitioned table CURR_ACCT_DIM_TEMP]
Continuing from the previous slide…
Now that the new current-image has been constructed in the temporary ACCOUNT_DIM_TEMP table, our next task is to publish the data to the ACCOUNT_DIM table, which is of course visible to end-users, without interrupting the availability of the table. To do this, we once again use the ubiquitous EXCHANGE PARTITION operation. Since ACCOUNT_DIM_TEMP is a hash-partitioned table and ACCOUNT_DIM is a composite range-partitioned and hash-subpartitioned table, a single EXCHANGE PARTITION command should suffice.
The EXCHANGE PARTITION mechanism has the advantage of supporting in-flight queries without interruption. Oracle’s read-consistency mechanism will allow all queries that are already in progress to reference rows that have been exchanged to the individual standalone tables. New queries initiated after the EXCHANGE PARTITION operation will, of course, reference the newly-exchanged and newly-visible partition.
Another nice side-effect of this technique is that the previous day’s image of ACCOUNT_DIM is still stored in the standalone hash-partitioned ACCOUNT_DIM_TEMP table. Restoring this previous day’s image can be done quite swiftly, if necessary.
The upshot is zero downtime for users of the ACCOUNT_DIM table. And it is much faster than any of the other options. In computing, it is not often that high-availability and high-performance go together. All too often, one must be traded for the other. But not in this case… :-)

28 Exchange Partition
INSERT /*+ append parallel(t, 8) */ INTO TMP_CURR_ACCOUNT_DIM T
SELECT /*+ full(x) parallel(x, 8) */
       0 partkey, acctkey, effdt, …(and so on for all columns)…
FROM (SELECT acctkey, effdt, …(and so on for all columns)…,
             row_number() over (partition by acctkey order by effdt desc) ranking
      FROM (SELECT acctkey, effdt, …(and so on for all columns)…
            FROM CURR_ACCOUNT_DIM
            UNION ALL
            SELECT acctkey, effdt, …(and so on for all columns)…
            FROM ACCOUNT_DIM partition (P )
           )
     ) x
WHERE ranking = 1;

29 Exchange Partition
alter table CURR_ACCOUNT_DIM
  exchange partition PZERO with table TMP_ACCOUNT_DIM
  including indexes [ with | without ] validation
  update global indexes;

30 Choosing partition keys
The most important decision when partitioning is…
  Choosing the partition key columns
  All benefits of partitioning hinge upon this choice!!!
Which columns to partition upon?
  If the table contains time-variant data
    Choose the RANGE partition key DATE column to optimize:
      ETL according to load cycles
      End-user access through partition pruning
    Choose the HASH or LIST sub-partition key column to optimize:
  If the table does NOT contain time-variant data
    Choose the RANGE, HASH, or LIST partition key column to optimize:

31 Choosing partition keys
When choosing columns to optimize ETL
  Choose a column which distinguishes different load cycles
  Should be a DATE column
When choosing columns to optimize end-user access
  Gather hard facts about usage – don’t guess!
    Oracle STATSPACK and Oracle10g AWR
    Data dictionary table SYS.COL_USAGE$
      Populated automatically by cost-based optimizer in Oracle9i and above
      DDL script “dba_column_usage.sql” can be downloaded from
    Ambeo Usage Tracker
    Teleran iSight
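
As a rough sketch of the kind of query SYS.COL_USAGE$ supports (this is not the author's dba_column_usage.sql script), the statement below lists how often each column of one table has appeared in predicates. It assumes a session with access to the SYS dictionary tables; the schema name DW is hypothetical, and TXN_FACT is borrowed from the earlier example.

```sql
SELECT u.name               AS owner,
       o.name               AS table_name,
       c.name               AS column_name,
       cu.equality_preds,              -- col = value
       cu.equijoin_preds,              -- col = other_table.col
       cu.nonequijoin_preds,
       cu.range_preds,                 -- <, >, BETWEEN
       cu.like_preds,
       cu.null_preds,
       cu.timestamp         AS last_used
FROM   sys.col_usage$ cu
JOIN   sys.obj$  o ON o.obj#  = cu.obj#
JOIN   sys.col$  c ON c.obj#  = cu.obj#
                  AND c.intcol# = cu.intcol#
JOIN   sys.user$ u ON u.user# = o.owner#
WHERE  u.name = 'DW'
AND    o.name = 'TXN_FACT'
ORDER  BY cu.range_preds + cu.equality_preds DESC;
```

Columns with high range and equality counts are natural candidates for partition and sub-partition keys, since those are the predicates that partition pruning can exploit.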

32 Choosing partition keys
Example: fact table for credit-card processing
  Fact table is time-variant
    Use range partitioning on DATE datatype to optimize ETL and queries
    Use hash- or list-subpartitioning to optimize queries
  Fact table has four DATE columns
    TXN_DT (date on which transaction occurred)
    POST_DT (date on which transaction was posted by merchant)
    PAID_DT (date on which transaction was paid to merchant)
    LOAD_DT (date on which transaction was loaded to DW)

33 Choosing partition keys
Which should be chosen? And why?
LOAD_DT
  Optimizes ETL perfectly, but does not benefit queries in any way…
    Data is loaded by LOAD_DT
    End-users don’t query on LOAD_DT
TXN_DT, POST_DT, and PAID_DT
  Each benefits a different set of end-user queries
  Presents some problems for ETL processing
    Each date loads mostly into the latest partition, then a little into each partition for the previous 2-4 days
    This situation can be handled by iterating through the five steps of the basic EXCHANGE PARTITION algorithm
      Where each iteration processes a different LOAD_DT value

34 Summary recommendations
Use dimensional data models for the “presentation” to end-users
  Don’t “free lance” and confuse the end-users
  Understand the purpose of facts and dimensions
Base the database design on time-variant data structures
  Don’t join “type-2” fact data to “type-1” dimension data
  Load “type-2” data first, then rebuild “type-1” data from that
Use partitioning
  Enable the “virtuous cycle” of Oracle features that cascade from using partitioning intelligently…

35 Questions?

36 Thank You!
Rocky Mountain Oracle Users Group
  “Training Days 2008”, Denver CO
  Tue-Thu Feb 2008
    Tues 12-Feb: 4-hour “university sessions”
    Wed-Thu Feb: main conference
  Thu-Sun Feb 2008
    Informal ad-hoc ski weekend for attendees who wish to partake!!!
Tim’s contact info:
  Web:
