Presentation on theme: "Go-Faster Consultancy Ltd.1 Single Table Clusters, an alternative to partitioning? David Kurtz Go-Faster Consultancy Ltd."— Presentation transcript:
Go-Faster Consultancy Ltd.1 Single Table Clusters, an alternative to partitioning? David Kurtz Go-Faster Consultancy Ltd. firstname.lastname@example.org www.go-faster.co.uk
Go-Faster Consultancy Ltd.2 Who am I? DBA –Independent Consultant –Performance Tuning
Go-Faster Consultancy Ltd.3 What is this all about? A single table cluster Why it reduced contention? Why it worked on the test system? Why it failed on the production system?
Go-Faster Consultancy Ltd.4 What was our environment? Swiss Payroll System –Retrospective (not like P.A.Y.E.) Oracle 8.0.5 64-bit –Rule Based Optimiser –Now 188.8.131.52.0 HP-UX 11.00 35,000 employees
Go-Faster Consultancy Ltd.5 How many values are calculated? –100 values / employee / month retained Writes 3.5M values to balance table –78 million rows on balance table –(after 20 months)
Go-Faster Consultancy Ltd.6 How is payroll calculated? There are 2 ways to do payroll –Process employees sequentially Calculate each rule for each employee –Set processing Rule A+B=C INSERT INTO C(EMPNO, VALUE) SELECT E.EMPNO, A.VALUE+B.VALUE,... FROM tableA, tableB, elig E WHERE A.EMPNO = E.EMPNO AND B.EMPNO = E.EMPNO
Go-Faster Consultancy Ltd.7 Payroll Process Design Each employee allocated to pay group pay group = MOD(emplid,14)+1 All employees in each pay group calculated simultaneously using set processing Each pay group allocated to a payroll process Process 14 pay groups in parallel Process do not lock each other out
Go-Faster Consultancy Ltd.8 Payroll Process Design Single balance table For each month for each retro date –Deletes all pay element rows for eligible employees –Recalculate –Insert freshly calculated rows
Go-Faster Consultancy Ltd.9 Effects of streaming 14 processes concurrently deleting from and inserting in the same balance table. 14 updates to the same block 14 versions of the block in the rollback segment Recalculation process reads rows from balance table from previous month
Diagram from Oracle Concepts Manual10 ORA-1555 Snapshot Too Old Start a long running query Update a data block used by that query Commit the update Other updates spin the rollback segments
Go-Faster Consultancy Ltd.11 ORA-1555 Snapshot Too Old Long running query on balance table as part of calculation Deletes in other streams Huge rollback segments (to avoid ORA- 1555) –15 x 2100Mb segments SET TRANSACTION USE ROLLBACK SEGMENT
Diagram From Oracle Concepts Manual12 Rollback segment contention Read Consistency Navigate the rollback segment block header chain to the version of the data block that was current when the query started.
Go-Faster Consultancy Ltd.13 Rollback segment contention Navigation of segment block header chain CPU intensive Re-read rollback segment blocks from disk that have been aged out of the block buffer cache.
Go-Faster Consultancy Ltd.14 How to avoid inter-process contention? Avoid updating a data block in one process while it referencing/updating it in another. Make sure that data for different streams is in different data blocks How about partitioning the data?
Go-Faster Consultancy Ltd.15 Partitioning Range Partition (Oracle 8.0) Hash Partition (Oracle 8.1) –(Hash within a range) We considered changing the mod(,14) function to hash(,14) BUT, partitioning invokes the Cost Based Optimiser.
Diagram From Oracle Concepts Manual16 Clustering
Go-Faster Consultancy Ltd.17 Clustering Physically arranges the data within a table. Define a cluster key All rows that have the same cluster key values are kept together Old technology - dates back to Oracle 6 can still used Rule Based Optimiser
Go-Faster Consultancy Ltd.18 Clustering All rows that have the same cluster key values are kept together –All the rows in any one block share the same cluster key Rows with different cluster key values must exist in different blocks.
Go-Faster Consultancy Ltd.19 Clustered Balance Table by Employee ID Inter-process contention disappeared –Each block contained rows for one and only one employee ID –Although one emplid may actually be stored in many blocks. Rollback HWMs dropped
Go-Faster Consultancy Ltd.21 When are clusters faster? For a single threaded process, inserting/deleting a cluster was slower than similar heap table. Clusters faster when 3 or more processes in parallel updating a single clustered table.
Go-Faster Consultancy Ltd.22 Clustered Balance Table by Employee ID) Unique index not used (unless hinted) Considered dropping it –Application design permitted it So after successful testing we planned to put it in production
Go-Faster Consultancy Ltd.23 And when we moved it to production... Disaster 50% INCREASE in execution time big jump in I/O Why?
Go-Faster Consultancy Ltd.24 A simple example create table heap_t1 (key integer, counter integer, value varchar2(50) ); create cluster clus_1 (key integer); create index clus_1 on cluster clus_1; create table clus_t1 (key integer, counter integer, value varchar2(50) ) cluster clus_1 (key);
Go-Faster Consultancy Ltd.25 cluster key values 0-9 - 200 each declare l_counter integer := 0; l_string varchar2(50) := 'zero'; begin loop l_string := RPAD(l_string,50,'.'); insert into heap_t1 (key,counter,value) values (MOD(l_counter,10), l_counter, l_string); insert into clus_t1 (key,counter,value) values (MOD(l_counter,10), l_counter, l_string); insert into heap_t2 (key,counter,value) values (TRUNC(l_counter/200), l_counter, l_string); insert into clus_t2 (key,counter,value) values (TRUNC(l_counter/200), l_counter, l_string); l_counter := l_counter + 1; exit when l_counter >= 2000; l_string := TO_CHAR(TO_DATE(l_counter,'j'),'jsp'); end loop; commit; end; /
Go-Faster Consultancy Ltd.26 Which key values are in which block? selectblk_no, key, count(*) from(selectkey,counter,dbms_rowid.rowid_relative_fno(rowid) file_no,dbms_rowid.rowid_block_number(rowid) blk_no,dbms_rowid.rowid_row_number(rowid) row_no fromheap_t1 ) group by blk_no, key ;
Go-Faster Consultancy Ltd.29 Why did the behaviour change? Test System – 7 blocks / emplid Production (at time of move) –10 blocks / emplid mutliblock_read_count = 8 2 I/Os to read all blocks for an emplid indicated by cluster key
Go-Faster Consultancy Ltd.30 Solution Cluster key on two columns –emplid, –curr_pay_end_dt If blocks/key << 1 –space wastage –increase I/O
Go-Faster Consultancy Ltd.32 Moral of the story Make sure you test environment accurately mirrors production –Now –AND FOR THE FUTURE
Go-Faster Consultancy Ltd.33 What actually happened next? Upgraded to Oracle 8.1.6 Global Temporary Tables Developed incremental processing Retro-date changed to 1.1.2000 –thus reducing the processing It has now changed again to 1.1.2001 New payroll system being implemented
Go-Faster Consultancy Ltd.34 Related Reading Causes of ORA-1555 - Oracle Scene Jonathan Lewis, Practical Oracle 8i Dan Hotka’s Geeky Block Internals presentations –UKOUG 1999 & 2000