Presentation on theme: "Configuring PeopleSoft Global Payroll for Optimal Performance"— Presentation transcript:
1 Configuring PeopleSoft Global Payroll for Optimal Performance Session 507
2 Who are we? David Kurtz: Independent Consultant working for UBS (Swiss GP project); Systems Performance Tuning, Oracle Databases, Unix, PeopleSoft Applications. Gene Pirogovsky: Independent Consultant working for UBS (Swiss GP project); Global Payroll, Interfaces, Customizations.
3 Configuring Global Payroll. Physical Database Considerations: Oracle specific; reducing I/O; CPU overhead. GP Changes: reduce CPU consumption of the rules engine. Data Migration.
This presentation will discuss the physical implications of using an Oracle database. The principle of read consistency is fundamental to a great deal of what Oracle does behind the scenes. Read consistency means that the data returned by a query is constant during the life of that query. If data to be returned by a long-running query is updated AND committed after that query starts, but before the query fetches it, then the value returned will be the value as at the point in time when the query began. Physically, the before-update value is reconstructed from the rollback segment. All this is done without locking the entire table, or even the entire block of data. This reconstruction process is slow, and CPU as well as disk intensive. It is to be avoided. GP can process very significant quantities of data. It is important that the SQL in the identify stage executes efficiently. On the GP side it is important that the rules are written efficiently in order to minimise accesses to the PIN manager.
4 Initial Impressions. Payroll is calculated by a Cobol program, GPPDPRUN. Single non-threaded process. Four stages: Cancel, Identify, (Re-)Calculate, Finalise.
The payroll calculation is performed by a Cobol process. It is a single process, so it can only execute on one CPU at any one time. If you have 10 CPUs, only one will ever be consumed by a single Cobol process. The Oracle shadow process will not be active at the same time as the Cobol. To utilise more than one CPU you need to run more than one Cobol process in parallel. This is termed ‘streaming’. The cancel phase essentially deletes rows from the result tables that were inserted by previous calculations. The identify phase determines which employees have to be calculated. This populates two result tables, GP_PYE_SEG_STAT and GP_PYE_PRC_STAT. GP_PYE_SEG_STAT has one row per employee per period per process type (calc, absence). The calculate phase is the CPU-intensive part, when the rule engine performs the calculation. The finalise phase closes off a pay calendar.
5 Two stages with different behaviours. Identify: populating temporary work tables; opening cursors; database intensive; ~20 minutes. Calculation: evaluation of rules (Cobol only); batch insert of results into database; Cobol (CPU) intensive; ~5000 segments / hour / stream (was 1200).
The identify stage populates some temporary tables and opens a number of cursors that feed data into the calculate phase. The calculate phase takes that information, caching some of it in memory as it goes. The results are inserted into the result tables in batches, by default every 500 rows (but this is configurable). The identify phase is basically a series of SQL insert/update statements and is very database intensive; the work is done by the Oracle shadow processes. Calculation initially ran at 1200 segments/hour/stream; tuning brought it up to 5000.
6 What is Streaming? Employees are split into groups defined by ranges of employee ID. Each group/range can be processed by a different instance/stream of GPPDPRUN. The streams can then be run in parallel. Vanilla PeopleSoft functionality.
Streaming is vanilla functionality delivered by PeopleSoft. Streams are defined as ranges of employee IDs.
7 Why is Streaming Necessary? GPPDPRUN is a standard Cobol program. It is a single-threaded process. One Cobol process can only run on one CPU at any one time. 33000 employees at 5000 employees/hour/stream: 6.6 hours if run in one stream (27.5 hours at 1200/hour). On a multi-processor server, streaming enables consumption of extra CPU.
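The arithmetic on this slide can be reproduced with a minimal sketch. It assumes the workload divides evenly across streams and that throughput per stream does not change as streams are added; the function name `elapsed_hours` is hypothetical and not part of PeopleSoft.

```python
def elapsed_hours(employees, rate_per_stream_per_hour, streams=1):
    """Elapsed hours if the workload is split evenly over `streams` processes.

    Assumption: every stream sustains the same rate, with no contention
    between streams.
    """
    if streams < 1:
        raise ValueError("at least one stream is required")
    return employees / (rate_per_stream_per_hour * streams)


if __name__ == "__main__":
    # Figures from the slide: 33000 employees, 1200 or 5000 per hour per stream.
    for rate in (1200, 5000):
        for streams in (1, 10, 30):
            hours = elapsed_hours(33000, rate, streams)
            print(f"{rate}/hour/stream, {streams:2d} streams: {hours:5.2f} hours")
```

With one stream this gives the 6.6 and 27.5 hours quoted on the slide; the multi-stream figures are a best case, since they ignore the contention discussed later.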
8 Calculation of Stream Definitions. Objective is roughly equal processing time for all streams. PS_GP_PYE_PRC_STAT indicates the work to be done by payroll. Calculate ranges of roughly equal numbers of rows for this table. A script using Oracle’s analytic functions directly populates PS_GP_STRM. This does NOT lead to equally sized GP_RSLT* tables.
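The idea of cutting the employee ID range into streams with roughly equal row counts can be sketched in Python. This is not the authors' script, which uses Oracle analytic functions against PS_GP_PYE_PRC_STAT; it is an illustration that assumes the EMPLID of every work row has already been fetched into a list. The function name `stream_ranges` is hypothetical.

```python
from collections import Counter


def stream_ranges(emplids, n_streams):
    """Split EMPLIDs into contiguous ranges holding roughly equal row counts.

    `emplids` holds one entry per work row, so an employee with several
    rows appears several times. Returns a list of (first, last) tuples,
    one per stream, in EMPLID order.
    """
    rows_per_emplid = sorted(Counter(emplids).items())
    total_rows = sum(rows for _, rows in rows_per_emplid)
    ranges, first, running = [], None, 0
    for emplid, rows in rows_per_emplid:
        if first is None:
            first = emplid
        running += rows
        stream_no = len(ranges) + 1
        # Close this range once the cumulative row count reaches the
        # stream's share of the total; the last stream takes the rest.
        if stream_no < n_streams and running * n_streams >= total_rows * stream_no:
            ranges.append((first, emplid))
            first = None
    if first is not None:
        ranges.append((first, rows_per_emplid[-1][0]))
    return ranges


if __name__ == "__main__":
    ids = ["E%03d" % i for i in range(1, 13)]
    print(stream_ranges(ids, 3))
```

An employee is never split across two ranges, so the row counts are only approximately equal when some employees have many more rows than others.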
9 Partition Boundary Creep. As new employees are hired, their EMPLIDs are allocated into the same stream. That stream starts to run longer. Effective execution time is the maximum execution time over all streams. Need to periodically recalculate stream ranges. Need to reflect this in physical changes. There are a number of implications of using streams.
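A small sketch shows why one growing stream sets the elapsed time for the whole payroll. It assumes all streams start together and process at the 5000 segments/hour/stream rate quoted earlier; the segment counts and the function name `payroll_elapsed_hours` are hypothetical.

```python
def payroll_elapsed_hours(segments_per_stream, rate_per_hour=5000):
    """Elapsed time of a streamed payroll: the slowest stream decides.

    Assumption: streams run fully in parallel at the same rate.
    """
    return max(segments_per_stream) / rate_per_hour


if __name__ == "__main__":
    balanced = [1100, 1100, 1100]
    # New hires receive the highest EMPLIDs, so they all land in the last range.
    after_hiring = [1100, 1100, 1400]
    print(payroll_elapsed_hours(balanced))
    print(payroll_elapsed_hours(after_hiring))
```

The total work grew by about 9% in this example, but the elapsed time grew by about 27%, which is why the ranges need to be recalculated periodically.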
10 Database Contention. Rollback contention. Snapshot too old. Insert contention. I/O volume: datafile I/O; redo/archive log activity.
It is not only possible, but highly likely, that ... Read consistency means ...
11 Rollback Contention. Working storage tables: shared by all streams; rows inserted/deleted during the run. Different streams never create locks that block each other, but they do update different rows in the same block during processing. 1 interested transaction per stream in many blocks. There is an additional rollback overhead of 16 bytes per row if two rows are in the same block versus different blocks. Updates of ~<100 bytes / row.
12 Read Consistency. Oracle guarantees that data is consistent throughout the life of a query. If a block has been updated by another transaction since a long-running query started, it must be possible to reconstruct the state of that block at the time the query started using the rollback segment. If that information cannot be found in the rollback segment, the long-running query fails with ORA-1555.
13 ORA-1555 Snapshot Too Old. Rollback segments are not extended for read consistency. Additional rollback overhead can cause rollback segments to spin and wrap. The error message also describes the rollback segment as ‘too small’. In this case, simply extending the segments is the wrong response. CPU overhead to navigate the rollback segment header chain.
14 Insert Contention. During the calculation phase results are written to the result tables. A number of streams can simultaneously insert into the same result tables. This increases the chance that one block will contain rows relating to more than one stream. This in turn causes rollback problems during the cancel in the next calculation.
15 Another cause of ORA-1555. If not processing a calendar for the first time, previous results are cancelled. Rows in the result tables are deleted. Monolithic deletes from each table. If streams start together, they tend to delete from the same table at the same time in each stream. A long-running delete is also a query for the purposes of read consistency. It is necessary to reconstruct a block as at the time the long-running delete started in order to delete a row from it. Reconstruction occurs during ‘consistent read’. Deletes are by primary key columns, thus Oracle tends to look each row up by index. Thus index reads are also ‘consistent’.
16 Datafile and Log Physical I/O Activity. During the identify phase data is shuffled from table to table. This generates datafile and redo log I/O. Rollback activity is also written to disk, and undo information is also written to the redo log. All the data placed in the temporary working tables by a stream is of no use to any other instance of the calculation process. It will be deleted by a future process.
17 High Water Marks. The working storage tables tend to be used to drive processing. Thus, the SQL tends to use full table scans. In Oracle, the High Water Mark is the highest block that has ever contained data. Full scans read the table up to the high water mark. Temporary tables contain data for ALL streams. Every stream can therefore have to scan the data for all streams.
18 How to avoid inter-stream contention? Keep rows from different streams in different blocks. Each block should contain rows for one and only one stream. Two Oracle features: Partitioning and Global Temporary Tables.
19 What is Partitioning? Logically, a partitioned table is still a single table. Physically, each partition is a separate table. In a range-partitioned table, the partition in which a row is placed is determined by the value of one or more columns. A local index is partitioned on the same logical basis as the table.
20 How should Partitioning be used in GP? Largest result tables each range partitioned on EMPLID to match GP streaming. 1 stream : 1 partition. Thus each stream references one partition in each result table. Only 1 interested transaction per block. Indexes ‘locally’ partitioned. Partitioning is really designed for DSS systems. Only efficient for large tables. Tables: GP_RSLT_ACUM, GP_RSLT_ERN_DED, GP_RSLT_PIN, GP_RSLT_PI_DATA, GP_PYE_PRC_STAT, GP_PYE_SEG_STAT.
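How a range-partitioned table routes a row by EMPLID can be illustrated with a short sketch. It models the Oracle convention that each partition holds values strictly less than its upper bound and assumes the last partition is unbounded; the boundary values and the function name `partition_index` are hypothetical.

```python
from bisect import bisect_right


def partition_index(emplid, upper_bounds):
    """Return the zero-based partition that would hold `emplid`.

    `upper_bounds` is the sorted list of exclusive upper bounds of every
    partition except the last, which is assumed to be unbounded.
    """
    return bisect_right(upper_bounds, emplid)


if __name__ == "__main__":
    bounds = ["E100", "E200"]  # three partitions, one per stream
    for emplid in ("E050", "E100", "E150", "E999"):
        print(emplid, "-> partition", partition_index(emplid, bounds))
```

If the stream ranges use the same boundaries, every row a stream touches falls in a single partition, which is the 1 stream : 1 partition arrangement described on the slide.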
21 Global Temporary Tables. Definition is permanently defined in the database catalogue. Physically created on demand by the database in the temporary tablespace for the duration of the session/transaction, then dropped. Each session has its own copy of each referenced GT table. Each physical instance of each GT table only contains data for one stream. Working storage tables PS_GP_%_WRK converted to GT tables.
22 Global Temporary Tables. Advantages: not recoverable, therefore no redo/archive logging (some undo information); improved performance; reduced rollback; no high water mark problems; smaller object to scan; no permanent tablespace overhead. Disadvantages: does consume temporary tablespace, but only during payroll; no CBO statistics; can hamper debugging; new in Oracle 8.1, some bugs.
23 How many streams should be run? Cobol run on the database server: either Cobol is active or the database is active; no more than one stream per CPU; perhaps CPUs - 1; be careful not to starve the database of CPU; run the process scheduler at lower OS priority. Cobol and database on different servers: Cobol active for 2/3 of execution time; up to 1.5 streams per CPU on the Cobol server; up to 3 streams per CPU on the database server.
24 UBS Production Payroll Configuration. 2 nodes: Database Node; Application Server/Process Scheduler Node. 20 CPUs each. 30 streams. 2/3 of 30 is 20, so all 20 application server node CPUs are active during the calculate phase. ‘nice’ the Cobol processes. 1/3 of 30 is 10, so 10 of 16 CPUs active. Important to leave some free CPU for the database, else spins are escalated to sleeps, generating latch contention.
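The sizing rule behind these figures can be written out as a short sketch. It assumes, as the slides state, that each stream spends 2/3 of its time in Cobol and the remaining 1/3 in the database, and that streams are spread evenly; the function names are hypothetical.

```python
from fractions import Fraction

# Share of a stream's elapsed time spent in the Cobol process (from the slides).
COBOL_FRACTION = Fraction(2, 3)


def expected_busy_cpus(streams, cobol_fraction=COBOL_FRACTION):
    """Average CPUs kept busy on the Cobol node and on the database node."""
    return streams * cobol_fraction, streams * (1 - cobol_fraction)


def max_streams(cpus, busy_fraction):
    """Streams that saturate `cpus` when each is busy `busy_fraction` of the time."""
    return cpus / busy_fraction


if __name__ == "__main__":
    cobol_busy, db_busy = expected_busy_cpus(30)
    print("Cobol node CPUs busy:", cobol_busy)
    print("Database node CPUs busy:", db_busy)
    print("Streams per CPU, Cobol node:", max_streams(1, COBOL_FRACTION))
    print("Streams per CPU, database node:", max_streams(1, 1 - COBOL_FRACTION))
```

These are averages, so in practice some headroom is needed, in line with the advice to leave free CPU for the database.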
26 Goals. How to create and test efficient rules that work without adversely affecting performance. How best to identify problems, particularly in the area of system setup/data versus a problem in a rule or underlying program. How to use GP payroll debugging tools.
27 Efficient Rules. Responsible for two thirds of the execution time, so could produce the greatest saving, but will also require the greatest effort. Detailed functional and technical analysis of the definition of the payroll rules.
The process involves detailed functional and technical analysis of the definition of the payroll rules. While the rules are responsible for two thirds of the execution time, and so could produce the greatest saving, tuning them will also require the greatest effort. The tuning of rules can be as simple as using literals instead of variable elements and as complex as redesigning them. The process ideally starts during the design stage, when various implementation schemes are analysed, intermediate tests are performed and the most efficient scheme is chosen. Likewise, all aspects of Global Payroll must be considered, since creating rules to simplify calculation can adversely affect reporting or other online and batch areas, and vice versa. This is an on-going process that does not stop with the rule’s implementation, since changes in the size of the employee population, the number of records on the underlying tables, etc. can produce an unexpectedly substandard result for an initially efficient rule.
28 Efficient Rules. The process ideally starts during the design stage, when various implementation schemes are analysed, intermediate tests are performed and the most efficient scheme is chosen. All aspects of Global Payroll must be considered, since creating rules to simplify calculation can adversely affect reporting or other online and batch areas, and vice versa.
The rules can be broken down into two groups: PeopleSoft delivered rules, and customer developments. So far, most of the tuning effort has focused on the rules delivered by PeopleSoft. The choice of rules to be examined has been determined by running payroll for a small subset of employees with auditing enabled. From this we can determine how much time has been spent in each rule. Then we examine the rules that take the greatest time. In principle it should be perfectly possible for PeopleSoft to tune the rules that they have delivered. However, different sets of data will exercise the rules to different degrees. Thus, if PeopleSoft use their own data set they may choose to tune different rules.
29 Efficient Rules. Arrays. Re-calculate? Store / Don’t store. Formulas. Proration and Count. Historical rules. Generation control versus conditional section.
Re-calculation = Yes or No? Be careful when using this functionality. Each time you use an element with “re-calc” = Yes, the process will call the program to resolve it. Set this switch to 'No' unless you are sure that you want a recalculation.
Store / Do not store? You only need to store elements if you want to use them in a historical rule, or if you need them for retro, reporting or auditing purposes. If you need certain supporting elements for reporting or audit, it might be better to create a Write Array that writes a row with all of the necessary results.
Store if zero? If you decide to store an element, do you want to store it if its value is zero or blank? Definitely do not store accumulators if they are zero.
Formulas:
1. Use literals like 'Y' or 'N' instead of variables. For 56 employees and 10 formulas, the difference in processing with variables vs. literals was close to 700%.
2. Use Exit in nested IF.
3. When you have multiple conditions, put the most 'popular' at the top, followed by the second most 'popular', etc.
4. Use Min/Max.
Arrays: The most important thing is to reduce the number of times you call the lookup formula.
Proration and Count: When you need multiple proration rules, such as calendar days, workdays and work hours, for the same slice periods, it is better to have one count element to “resolve” all proration rules. The goal is to minimize the “reading” of the work schedule.
Generation control versus conditional section: If a conditional formula resolves to 0, all elements in that section are skipped. That means that some Positive Input records and adjustments may remain unprocessed. However, it is much better for performance to use a conditional section.
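The advice to test the most 'popular' condition first and to exit as soon as one matches can be illustrated outside GP with a small sketch. It only counts condition tests and does not model the GP formula engine; the earning codes, their frequencies and the function name `condition_tests` are hypothetical.

```python
def condition_tests(conditions_in_order, observed_values):
    """Count condition tests when evaluation stops at the first match."""
    total = 0
    for value in observed_values:
        for position, candidate in enumerate(conditions_in_order, start=1):
            if value == candidate:
                total += position  # exit early: later conditions are skipped
                break
        else:
            total += len(conditions_in_order)  # nothing matched: all were tested
    return total


if __name__ == "__main__":
    # Hypothetical mix: 80 regular, 15 overtime and 5 bonus rows.
    values = ["REG"] * 80 + ["OVT"] * 15 + ["BON"] * 5
    print("rarest first:     ", condition_tests(["BON", "OVT", "REG"], values))
    print("most common first:", condition_tests(["REG", "OVT", "BON"], values))
```

For this mix, putting the most common case first needs fewer than half the tests, and the saving repeats for every payee and every resolution of the formula.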
30 Efficient Rules. Keyed by Employee: 1 select, multiple fetches, small result set to search. User Defined: 1 select, multiple fetches, all searches in memory. User Defined with the Reload Option: multiple selects, multiple fetches, small result set to search.
40 Migration/Customization: PI v. Array. PI can be used during identification. PI has special considerations during eligibility checking. PI allows easy override of components on element definition such as Unit, Rate, Percent or Base. The Array cannot handle multiple instances of earning/deduction.
PI vs. Array approach. Using Arrays to drive payroll calculations is a very complicated process. Some functionality that is available to PI cannot be duplicated any other way, including by using arrays. The following are some of the major advantages of using PI (in no particular order):
1) PI can be used during identification. Since arrays are available in the calculation step only, the customer will have to come up with a User Exit to add appropriate payees to the process. The User Exit must be smart enough to do Cancel logic as well (or another User Exit must be created) in order to cancel employees out of Calculation if data is removed from the table. PeopleSoft does not encourage the use of user exits. They should be considered the last resort.
2) PI has special considerations during eligibility checking. It supersedes any Payee Override information. If there is a PI for an element that is not in the eligibility group but is in the process list, it will be processed if the PI override switch on the Pay Entity is on. This functionality is completely outside of Array abilities.
3) PI allows easy override of components on element definition such as Unit, Rate, Percent or Base. In the absence of PI, the default process ensues. In other words, a unit can be defined as a formula, amount, bracket, etc. If there is no PI, the process will resolve that formula, bracket, etc. If there is a PI, it overrides the calculation for that payee and calendar. The arrays can only populate variables, so either an earning/deduction component that is fed by the array must be defined as a variable, or the whole logic must be duplicated in some formulae. Most likely such formulae will have to be created for each earning/deduction component. So let's see… the number of elements * the number of components.
4) The Array cannot handle multiple instances of earning/deduction. The table that is read by the Array must have the sum of all instances for a payee/element/period.
5) There is no way to override the prorate option on an earning/deduction definition using Arrays.
6) PI also overrides Generation control. So if generation control says not to process an earning or deduction but PI exists, the earning or deduction will be processed if the PI override checkbox on the Pay Entity is on. This cannot be done with an Array.
7) PI is automatically directed to the proper segment/slice based upon the begin / end PI dates. A special non-trivial process (?) must be devised to enable use of Arrays during a segmentation/slicing event. I have to spend some time thinking about this process. At this time, I can't imagine what it might look like.
8) Using an Array precludes the customer from using an element on multiple calendars without an additional non-GP procedure (SQR?) to mark processed instances.
9) The RATE AS OF DATE will have to be somehow controlled for every Rate Code element, since it may be different for various earnings and deductions. PI provides an easy way of doing so for each instance of an element.
10) PI allows the customer to override rate code, rounding rule, currency, etc. for each instance of earnings/deductions. There is no easy way to duplicate this using Arrays.
11) Provision must be made to resolve a conflict between a PI instance generated from Absence calculation and data read by an Array for the same element. Which one wins? It can't be both. This is not a problem when using the PI approach.
12) GP automatically keeps track of all the changes to PI over time. This not only allows for a proper calculation but makes researching the changes a snap. With Arrays, the customer will have to create a process to duplicate this functionality.
13) Using PI allows overriding a GL cost center for a specific instance of an element. During GL calculation, the process looks through the PI tables to get these overrides. This functionality cannot be duplicated by the Array approach. The customer will have to create a process to duplicate this functionality.
14) This same concept also applies to User Keys on Accumulators. If PI SOVRs are used, each instance of PI will update the appropriate Accumulator Instance (based on User Keys).
15) The issue of Retro, Segmentation or Iterative triggers is the same for either approach, but for PI it can be solved with the use of a Component Interface.
41 Debugging Tools. Audit Trace: Trace All; Trace Errors. Large number of records, potential rollback segment size problems. View on-line. Query with SQL.
48 Conclusion. Use of Partitioning and Global Temporary Tables reduces (almost eliminates) inter-stream contention. This permits use of streaming to utilise all available CPUs. GP will always be a CPU-bound process. Rule tuning will reduce CPU overhead. It is an on-going process.
49 And there’s more. This has been a very concentrated session. Round Table Discussion, session 626: discuss some areas in more detail. However, we do have time for some questions now...
50 Configuring PeopleSoft Global Payroll for Optimal Performance Session 507