Configuring PeopleSoft Global Payroll for Optimal Performance

1 Configuring PeopleSoft Global Payroll for Optimal Performance
Session 507

2 Who are we?
David Kurtz: Independent Consultant working for UBS Swiss GP project. Systems Performance Tuning, Oracle Databases, Unix, PeopleSoft Applications.
Gene Pirogovsky: Independent Consultant working for UBS Swiss GP project. Global Payroll, Interfaces, Customizations.

3 Configuring Global Payroll
Physical Database Considerations: Oracle specific; reducing I/O; CPU overhead.
GP Changes: reduce CPU consumption of the rules engine; data migration.
This presentation will discuss the physical implications of using an Oracle database. The principle of read consistency is fundamental to a great deal of what Oracle does behind the scenes. Read consistency means that the data returned by a query is constant during the life of that query. If data being returned by a long running query is updated AND committed after that query starts, but before it can be fetched by the query, then the value returned will be the value as at the point in time when the query began. Physically, the before-update value is reconstructed from the rollback segment. All this is done without locking the entire table, or the entire block of data. This reconstruction process is slow, and CPU as well as disk intensive. It is to be avoided.
GP can process very significant quantities of data. It is important that the SQL in the identify stage executes efficiently. On the GP side it is important that the rules are written efficiently in order to minimise accesses to the PIN manager.

4 Initial Impressions
Payroll is calculated by a Cobol program, GPPDPRUN. It is a single non-threaded process.
Four stages: Cancel, Identify, (Re-)Calculate, Finalise.
The payroll calculation is performed by a Cobol process. It is a single process, so it can only execute on one CPU at any one time. If you have 10 CPUs, only one will ever be consumed by a single Cobol process. The Oracle shadow process will not be active at the same time as the Cobol. To utilise more than one CPU you need to run more than one Cobol process in parallel. This is termed ‘streaming’.
The cancel phase essentially deletes rows from the result tables that were inserted by previous calculations. The identify phase determines which employees have to be calculated. This populates two result tables, GP_PYE_SEG_STAT and GP_PYE_PRC_STAT. GP_PYE_SEG_STAT has one row per employee per period per process type (calc, absence). The calculate phase is the CPU intensive part, when the rule engine performs the calculation. The finalise phase closes off a pay calendar.

5 Two stages with different behaviours
Identify: populating temporary work tables; opening cursors; database intensive; ~20 minutes.
Calculation: evaluation of rules (Cobol only); batch insert of results into database; Cobol (CPU) intensive; ~5000 segments / hour / stream (was 1200).
The identify stage populates some temporary tables and opens a number of cursors that feed data into the calculate phase. The calculate phase takes that information, caching some of it in memory as it goes. The results are inserted into the result tables in batches, by default every 500 rows (but this is configurable). The identify phase is basically a series of SQL insert/update statements and is very database intensive. The Oracle shadow processes ...
Calculation initially ran at 1200 segments/hour/stream; tuning raised it to 5000.

6 What is Streaming?
Employees are split into groups defined by ranges of employee ID. Each group/range can be processed by a different instance/stream of GPPDPRUN. The streams can then be run in parallel. Streaming is vanilla functionality delivered by PeopleSoft.

7 Why is Streaming Necessary?
GPPDPRUN is a standard Cobol program. It is a single threaded process, so one Cobol process can only run on one CPU at any one time.
33000 employees at 5000 employees/hour/stream take 6.6 hours if run in one stream (27.5 hours at 1200/hour).
On a multi-processor server, streaming enables consumption of the extra CPUs.
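The arithmetic on this slide can be sketched as follows. This is an idealised estimate only: it assumes the work divides evenly across streams and covers the calculate phase alone, and the function name `elapsed_hours` is hypothetical.

```python
def elapsed_hours(employees, rate_per_hour_per_stream, streams=1):
    """Idealised elapsed time, assuming work splits evenly across streams."""
    return employees / (rate_per_hour_per_stream * streams)


# Figures quoted on the slide: 33000 employees in a single stream.
print(elapsed_hours(33000, 5000))  # 6.6 hours after tuning
print(elapsed_hours(33000, 1200))  # 27.5 hours before tuning

# With more streams the ideal elapsed time falls in proportion.
print(elapsed_hours(33000, 5000, streams=30))
```

In practice the slowest stream determines the elapsed time, so the even-split figure is a lower bound rather than a prediction.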

8 Calculation of Stream Definitions
The objective is roughly equal processing time for all streams.
PS_GP_PYE_PRC_STAT indicates the work to be done by payroll, so calculate ranges containing roughly equal numbers of rows in this table.
A script using Oracle’s analytic functions directly populates PS_GP_STRM.
This does NOT lead to equally sized GP_RSLT* tables.
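The script itself is not shown in the presentation. As a rough illustration of the bucketing idea only, the following Python sketch splits a list of EMPLIDs (one entry per row of work) into contiguous ranges with roughly equal row counts. The function name `stream_ranges` and the sample data are hypothetical, and the sketch does not touch PS_GP_STRM or any database.

```python
def stream_ranges(emplids, n_streams):
    """Split EMPLIDs into contiguous ranges with roughly equal row counts.

    emplids: one entry per row of work (an EMPLID may repeat).
    Returns a list of (first_emplid, last_emplid) tuples, one per stream.
    """
    rows = sorted(emplids)
    ranges = []
    start = 0
    for stream in range(1, n_streams + 1):
        # Ideal end position if the rows were divided exactly evenly.
        end = round(stream * len(rows) / n_streams)
        # Never split one employee across two streams:
        # extend the range to the last row for that EMPLID.
        while 0 < end < len(rows) and rows[end] == rows[end - 1]:
            end += 1
        if end > start:
            ranges.append((rows[start], rows[end - 1]))
            start = end
    return ranges


# Hypothetical data: 100 employees with one row each, split into 4 streams.
sample = ["%04d" % i for i in range(1, 101)]
for first, last in stream_ranges(sample, 4):
    print(first, last)
```

Because an employee with many rows is kept whole, the function can return fewer ranges than requested when the data is heavily skewed.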

9 Partition Boundary Creep
As new employees are hired, their EMPLIDs are allocated into the same stream, and that stream starts to run longer. The effective execution time is the maximum execution time of all streams. Stream ranges therefore need to be recalculated periodically, and this needs to be reflected in physical changes.
There are a number of implications of using streams.

10 Database Contention
Rollback contention: Snapshot Too Old.
Insert contention.
I/O volume: datafile I/O; redo/archive log activity.
It is not only possible, but highly likely, that ... Read consistency means ...

11 Rollback Contention
Working storage tables are shared by all streams, and rows are inserted/deleted during the run.
Different streams never create locks that block each other, but they do update different rows in the same block during processing, so many blocks have 1 interested transaction per stream.
There is an additional rollback overhead of 16 bytes per row if two rows are in the same block versus different blocks, on updates of ~<100 bytes / row.

12 Read Consistency
Oracle guarantees that data is consistent throughout the life of a query. If a block has been updated by another transaction since a long running query started, it must be possible to reconstruct the state of that block at the time the query started using the rollback segment. If that information cannot be found in the rollback segment, the long running query fails with ORA-1555.

13 ORA-1555 Snapshot Too Old
Rollback segments are not extended for read consistency. Additional rollback overhead can cause rollback segments to spin and wrap. The error message also describes the rollback segment as ‘too small’. In this case, simply extending the segments is the wrong response. There is also a CPU overhead to navigate the rollback segment header chain.

14 Insert Contention
During the calculation phase results are written to the result tables. A number of streams can simultaneously insert into the same result tables. This increases the chance that one block will contain rows relating to more than one stream, which in turn causes rollback problems during the cancel in the next calculation.

15 Another cause of ORA-1555
If a calendar is not being processed for the first time, the previous results are cancelled: rows are deleted from the result tables, with a monolithic delete from each table. If streams start together, they tend to delete from the same table at the same time in each stream.
A long running delete is also a query for the purposes of read consistency. It is necessary to reconstruct a block as at the time the long running delete started in order to delete a row from it. Reconstruction occurs during ‘consistent read’. The deletes are by primary key columns, so Oracle tends to look each row up by index. Thus the index reads are also ‘consistent’.

16 Datafile and Log Physical I/O Activity
During the identify phase data is shuffled from table to table. This generates datafile and redo log I/O. Rollback activity is also written to disk, and undo information is also written to the redo log. All the data placed in the temporary working tables by a stream is of no use to any other instance of the calculation process. It will be deleted by a future process.

17 High Water Marks
The working storage tables tend to be used to drive processing, so the SQL tends to use full table scans. In Oracle, the High Water Mark is the highest block that has ever contained data, and full scans read the table up to the high water mark. The temporary tables contain data for ALL streams, so each stream may have to scan the data for all streams.

18 How to avoid inter-stream contention?
Keep rows from different streams in different blocks: each block should contain rows for one and only one stream. Two Oracle features help: Partitioning and Global Temporary Tables.

19 What is Partitioning?
Logically, a partitioned table is still a single table.
Physically, each partition is a separate table. In a range partitioned table, the partition in which a row is placed is determined by the value of one or more columns.
A local index is partitioned on the same logical basis as the table.
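How a range partitioned table places a row can be sketched in a few lines of Python. The boundary values below are invented for illustration, EMPLIDs are treated as equal-length strings (an assumption), and `partition_for` is a hypothetical name. Each boundary is an exclusive upper bound, and keys beyond the last boundary fall into a final catch-all partition.

```python
from bisect import bisect_right

# Hypothetical exclusive upper bounds for each range partition on EMPLID.
BOUNDS = ["250000", "500000", "750000"]


def partition_for(emplid):
    """Return the 0-based partition index for an EMPLID.

    Partition i holds keys below BOUNDS[i] and at or above BOUNDS[i - 1];
    keys at or above the last bound land in a final catch-all partition.
    """
    return bisect_right(BOUNDS, emplid)


for emplid in ["100000", "250000", "499999", "900000"]:
    print(emplid, "-> partition", partition_for(emplid))
```

If the stream ranges use the same boundaries, every row processed by one stream lands in one partition, which is the property the following slides rely on.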

20 How should Partitioning be used in GP?
The largest result tables are each range partitioned on EMPLID to match GP streaming, 1 stream : 1 partition. Thus each stream references one partition in each result table, and there is only 1 interested transaction per block. Indexes are ‘locally’ partitioned.
Partitioning was really designed for DSS systems and is only efficient for large tables.
Tables: GP_RSLT_ACUM, GP_RSLT_ERN_DED, GP_RSLT_PIN, GP_RSLT_PI_DATA, GP_PYE_PRC_STAT, GP_PYE_SEG_STAT.

21 Global Temporary Tables
Definition is permanently defined in database catalogue. Physically created on demand by database in temporary tablespace for duration of session/transaction. Then dropped. Each session has its own copy of each referenced GT table. Each physical instance of each GT table only contains data for one stream. Working Storage Tables PS_GP_%_WRK converted to GT tables.

22 Global Temporary Tables
Advantages:
Not recoverable, therefore no redo/archive logging (some undo information), giving improved performance and reduced rollback.
No High Water Mark problems: a smaller object to scan.
No permanent tablespace overhead.
Disadvantages:
Does consume temporary tablespace, but only during payroll.
No CBO statistics.
Can hamper debugging.
New in Oracle 8.1, some bugs.

23 How many streams should be run?
Cobol run on the database server: either Cobol is active or the database is active, so run no more than one stream per CPU, perhaps CPUs - 1. Be careful not to starve the database of CPU; run the process scheduler at a lower OS priority.
Cobol and database on different servers: Cobol is active for 2/3 of the execution time, so up to 1.5 streams per CPU on the Cobol server and up to 3 streams per CPU on the database server.
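The ratios for separate servers follow from the 2/3 Cobol share. A minimal sketch of that arithmetic, using exact fractions; the function names are hypothetical and the model assumes each stream keeps a CPU busy for a fixed share of its run time.

```python
from fractions import Fraction


def streams_per_cpu(busy_fraction):
    """Streams needed to keep one CPU fully busy."""
    return 1 / busy_fraction


def busy_cpus(streams, busy_fraction):
    """Average number of CPUs kept busy by a number of streams."""
    return streams * busy_fraction


COBOL_SHARE = Fraction(2, 3)   # Cobol active for 2/3 of execution time
DB_SHARE = 1 - COBOL_SHARE     # database active for the remaining 1/3

print(streams_per_cpu(COBOL_SHARE))  # 3/2 streams per CPU on the Cobol server
print(streams_per_cpu(DB_SHARE))     # 3 streams per CPU on the database server

# 30 streams is the figure quoted for the production configuration.
print(busy_cpus(30, COBOL_SHARE))    # 20
print(busy_cpus(30, DB_SHARE))       # 10
```

These are upper limits: running at exactly these ratios leaves no spare CPU, which is why the slides recommend leaving headroom for the database.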

24 UBS Production Payroll Configuration
2 nodes, 20 CPUs each: a Database Node and an Application Server/Process Scheduler Node.
30 streams.
2/3 of 30 is 20, so all 20 application server node CPUs are active during the calculate phase; ‘nice’ the Cobol processes.
1/3 of 30 is 10, so 10 of 16 CPUs are active. It is important to leave some free CPU for the database, otherwise spins are escalated to sleeps, generating latch contention.

25 QA Payroll Configuration
2 nodes, 10 CPUs each: a Database Node and an Application Server/Process Scheduler Node.
15 streams.
Full production volume payroll in < 1 hour.

26 Goals
How to create and test efficient rules that work without adversely affecting performance.
How best to identify problems, particularly in the area of system setup/data versus a problem in a rule or underlying program.
How to use GP payroll debugging tools.

27 Efficient Rules
The process involves detailed functional and technical analysis of the definition of the payroll rules. While the rules are responsible for two thirds of the execution time, and so could produce the greatest saving, tuning them will also require the greatest effort.
The tuning of rules can be as simple as using literals instead of variable elements and as complex as redesigning them. The process ideally starts during the design stage, when various implementation schemes are analysed, intermediate tests are performed and the most efficient scheme is chosen. Likewise, all aspects of Global Payroll must be considered, since creating rules to simplify calculation can adversely affect reporting or other online and batch areas, and vice versa.
This is an on-going process that does not stop with the rule’s implementation, since changes in the size of the employee population, the number of records in the underlying tables, etc. can produce an unexpectedly substandard result for an initially efficient rule.

28 Efficient Rules
The process ideally starts during the design stage, when various implementation schemes are analysed, intermediate tests are performed and the most efficient scheme is chosen. All aspects of Global Payroll must be considered, since creating rules to simplify calculation can adversely affect reporting or other online and batch areas, and vice versa.
The rules can be broken down into two groups: PeopleSoft delivered rules, and customer developments. So far, most of the tuning effort has focused on the rules delivered by PeopleSoft. The choice of rules to be examined has been determined by running payroll for a small subset of employees with auditing enabled. From this we can determine how much time has been spent in each rule. Then we examine the rules that take the greatest time.
In principle it should be perfectly possible for PeopleSoft to tune the rules that they have delivered. However, different sets of data will exercise the rules to different degrees. Thus, if PeopleSoft use their own data set they may choose to tune different rules.

29 Efficient Rules
Arrays; Re-calculate?; Store / Don’t store; Formulas; Proration and Count; Historical rules; Generation control versus conditional section.
Re-calculation = Yes or No? Be careful when using this functionality. Each time you use an element with “re-calc” = Yes, the process will call the program to resolve it. Set this switch to 'No' unless you are sure that you want a recalculation.
Store / Do not store? You only need to store elements if you want to use them in a historical rule, or if you need them for retro, reporting or auditing purposes. If you need certain supporting elements for reporting or audit, it might be better to create a Write Array that writes a row with all of the necessary results.
Store if zero? If you have decided to store an element, do you want to store it if its value is zero or blank? Definitely do not store accumulators if they are zero.
Formulas:
1. Use literals like 'Y' or 'N' instead of variables. For 56 employees and 10 formulas, the difference in processing with variables vs. literals was close to 700%.
2. Use Exit in nested IF.
3. When you have multiple conditions, put the most 'popular' at the top, followed by the second most 'popular', etc.
4. Use Min/Max.
Arrays: The most important thing is to reduce the number of times you call the lookup formula.
Proration and Count: When you need multiple proration rules, such as calendar days, workdays and work hours, for the same slice periods, it’s better to have one count element to “resolve” all proration rules. The goal is to minimize the “reading” of the work schedule.
Generation control versus conditional section: If a conditional formula resolves to 0, all elements in that section are skipped. That means that some Positive Input records and adjustments may remain unprocessed. However, it’s much better for performance to use a conditional section.
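The advice to test the most 'popular' condition first can be quantified with a small sketch. It assumes a chain of conditions that stops at the first match, and the branch frequencies below are invented for illustration; `expected_tests` is a hypothetical name, not a Global Payroll function.

```python
def expected_tests(probabilities):
    """Average number of conditions evaluated before a chain finds its match.

    probabilities: chance that each branch, taken in order, is the one that
    matches. The values are expected to sum to 1.
    """
    return sum(p * position for position, p in enumerate(probabilities, start=1))


# Hypothetical frequencies: one branch matches 70% of payees.
popular_first = expected_tests([0.7, 0.2, 0.1])
popular_last = expected_tests([0.1, 0.2, 0.7])
print(popular_first, popular_last)
```

With these made-up frequencies the popular-first ordering evaluates roughly half as many conditions per payee, and the saving is repeated for every payee and every formula resolution.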

30 Efficient Rules
Keyed by Employee: 1 select, multiple fetches, small result set to search.
User Defined: 1 select, multiple fetches, all searches in memory.
User Defined with the Reload Option: multiple selects, multiple fetches, small result set to search.

40 Migration/Customization
PI (Positive Input) v. Array
PI can be used during identification.
PI has special considerations during eligibility checking.
PI allows easy override of components on element definition such as Unit, Rate, Percent or Base.
The Array cannot handle multiple instances of earning/deduction.
PI vs. Array approach. Using Arrays to drive payroll calculations is a very complicated process. Some functionality that is available to PI cannot be duplicated any other way, including by using arrays. The following are some of the major advantages of using PI (in no particular order):
1) PI can be used during identification. Since arrays are available in the calculation step only, the customer will have to come up with a User Exit to add appropriate payees to the process. The User Exit must be smart enough to do Cancel logic as well (or another User Exit must be created) in order to cancel employees out of Calculation if data is removed from the table. PeopleSoft does not encourage the use of user exits. They should be considered the last resort.
2) PI has special considerations during eligibility checking. It supersedes any Payee Override information. If there is a PI for an element that is not in the eligibility group but is in the process list, it will be processed if the PI override switch on the Pay Entity is on. This functionality is completely outside of Array abilities.
3) PI allows easy override of components on element definition such as Unit, Rate, Percent or Base. In the absence of PI, the default process ensues. In other words, a unit can be defined as a formula, amount, bracket, etc. If there is no PI, the process will resolve that formula, bracket, etc. If there is a PI, it overrides the calculation for that payee and calendar. The arrays can only populate variables, so either an earning/deduction component that is fed by the array must be defined as a variable or the whole logic must be duplicated in some formulae. Most likely such formulae will have to be created for each earning/deduction component. So let's see… that is the number of elements * the number of components.
4) The Array cannot handle multiple instances of earning/deduction. The table that is read by the Array must have the sum of all instances for a payee/element/period.
5) There is no way to override the prorate option on an earning/deduction definition using Arrays.
6) PI also overrides Generation control. So if generation control says not to process an earning or deduction but PI exists, the earning or deduction will be processed if the PI override checkbox on the Paying Entity is on. This cannot be done with an Array.
7) PI is automatically directed to the proper segment/slice based upon the begin / end PI dates. A special non-trivial process (?) must be devised to enable use of Arrays during a segmentation/slicing event. I have to spend some time thinking about this process. At this time, I can't imagine what it might look like.
8) Using an Array precludes the customer from using an element on multiple calendars without an additional non-GP procedure (SQR?) to mark processed instances.
9) The RATE AS OF DATE will have to be somehow controlled for every Rate Code element, since it may be different for various earnings and deductions. PI provides an easy way of doing so for each instance of an element.
10) PI allows the customer to override rate code, rounding rule, currency, etc. for each instance of earnings/deductions. There is no easy way to duplicate this using Arrays.
11) Provision must be made to resolve conflict between a PI instance generated from Absence calculation and data read by an Array for the same element. Which one wins? It can't be both. This is not a problem when using the PI approach.
12) GP automatically keeps track of all the changes to PI over time. This not only allows for a proper calculation but makes researching the changes a snap. The customer will have to create a process to duplicate this functionality.
13) Using PI allows overriding a GL cost center for a specific instance of an element. During GL calculation, the process looks through PI tables to get these overrides. This functionality cannot be duplicated by the Array approach. The customer will have to create a process to duplicate this functionality.
14) This same concept also applies to User Keys on Accumulators. If PI SOVRs are used, each instance of PI will update the appropriate Accumulator Instance (based on User Keys).
15) The issue of Retro, Segmentation or Iterative triggers is the same for either approach, but for PI it can be solved with the use of a Component Interface.

41 Debugging Tools
Audit Trace: Trace All; Trace Errors.
Large number of records, potential rollback segment size problems.
View on-line.
Query with SQL.

44 Debugging Tools
select *
from sysadm.ps_gp_audit_tbl
where emplid = '884324'
and cal_run_id = 'ErrMigr'
and pin_num = 40811
order by audit_sort_key, audit_seq_num;

... and audit_sort_key = 229
order by audit_seq_num;

45 Debugging Tools
select emplid, audit_sort_key as key
,audit_seq_num as seq, pin_chain_rslt_num as rslt_num
,b.pin_nm, a.pin_num
,pin_status_ind as status, c.pin_nm
,a.pin_parent_num as parent, a.fld_fmt as fmt
,calc_rslt_val as num, date_pin_val as dateval
,chr_pin_val as chr, pin_val_num as pin
from ps_gp_audit_tbl a, ps_gp_pin b, ps_gp_pin c
where cal_run_id = 'U_22_CI0101'
and (emplid, audit_sort_key) in
(select emplid, audit_sort_key
from ps_gp_audit_tbl
where pin_num = (select pin_num from ps_gp_pin where pin_nm = 'CH_EP_CHK_1002FF'))
and a.pin_num = b.pin_num
and a.pin_parent_num = c.pin_num
order by emplid, audit_sort_key, audit_seq_num


48 Conclusion
Use of Partitioning and Global Temporary Tables reduces (almost eliminates) inter-stream contention. This permits use of streaming to utilise all available CPUs.
GP will always be a CPU bound process. Rule tuning will reduce CPU overhead. It is an on-going process.

49 And there’s more
This has been a very concentrated session.
Round Table Discussion, session 626: discuss some areas in more detail.
However, we do have time for some questions now...
