HR7.5 Department Security Tree Tuning

David Kurtz
Go-Faster Consultancy Ltd.

Sometimes customers ask me to tune their databases, but that is not what this session is about. Usually the problem is not so much the database or the network; the problem is in the application. Performance tuning is 10% database tuning and 90% application tuning.

I cannot claim the credit for developing the techniques described in this presentation, but I have gathered them from a variety of sources, and this is an opportunity to share them back.

Why Trees?
What I am going to talk about is application tuning, and in particular one aspect of it that keeps cropping up when tuning PeopleSoft applications: getting better performance from the SQL queries that reference trees. The principles in this session are fairly generic, but I need to cover some Oracle issues. Aerobic session: how many of you are on which platforms? (BTW: the picture is The Great Pine by Paul Cézanne.)

This is a technical presentation
Complex SQL; Optimisers; Query Execution Plans; Indexes; Replication

Before I go any further, I think it only fair to warn you that this is going to be an unashamedly technical presentation. I am going to show you some of the fairly complicated-looking SQL that is used. We are going to discuss how databases use query optimisers to decide how to execute those queries, how indexes are used and why they are not always beneficial, and the importance of indexing the tables that underlie a view. Most of my work is done on Oracle databases, so I will tend to describe how things work on Oracle and then point out any differences.

Rule of engagement: please feel free to ask questions at any time; you don't have to save them up for the end. I'll take on-subject questions during the presentation, and if you want to discuss anything else I am happy to use any spare time at the end.

PeopleSoft Applications
HRMS; Department Security Tree; Financials; Roll-Up Reporting; nVision; Summary Ledgers

PeopleSoft's applications are diverse and complex. They cover just about every aspect of the day-to-day business of doing business. There is a lot of functionality, and it is executed in a lot of different places: the panel processor, PeopleCode behind the panel, reporting (Query, Crystal or SQR), batch processes (SQR or Cobol) and nVision.

Something that we come across very frequently in business processes is hierarchical sets of data. One piece of data is the parent of another, which is in turn the parent of other pieces. That is what we call a tree. In Financials, trees are used for roll-up reporting of accounts. In HRMS, just about everything that you do requires that you reference the department security tree. So I want to take a closer look at the HR system.

Department Security Tree
An operator has access to those employees who have, or who will have, jobs in or below (as defined by the department security tree in force as at a given date) those departments to which the operator has been given access.

I have tried to come up with a sentence that describes the function of the security within the HR application, which is based on the department security tree. And here it is. Throughout PeopleTools, the tree structure is used to hold the definition of "below".

Panel Search Dialogue
You give an operator, or a class of operators, access to usually one, but sometimes many, nodes on the department security table, and then they can access all employees in all of the departments in or below those nodes. The department security tree provides the definition of what is below.

When you go into a panel you are presented with the search dialogue. This queries the panel search record for the panel group. In PS/Query you join the record you are querying to the query security record by the primary key columns they have in common. In HR7.5 there are separate panel search and query security views, but they are effectively identical queries with different columns. In previous versions the same record was used as both the panel search record and the query security record. After all, if you can see an employee in the panels you should be able to query them.

Panel Search Record Query
    SELECT DISTINCT EMPLID, NAME, LAST_NAME_SRCH, ...
    FROM PS_PERS_SRCH_GBL
    WHERE EMPLID LIKE '8%'
    AND OPRCLASS='ALLPANLS'
    ORDER BY EMPLID

When you press the OK or search button on the search panel, you query the panel search record to determine which rows you can see in the panel. This is the SQL query created by PeopleTools. Note that the line AND OPRCLASS='ALLPANLS' is added by PeopleTools. That is what ties the view to the current user.

In the HR system it is this record, PERS_SRCH, which keeps being used. There are different versions of it for different countries, which select different columns. There is also a view called EMPLMT_SRCH, which is identical except that the order of the second and third columns is reversed.

Note: from version 7 we are using LIKE, not BETWEEN. If your search key is shorter than the column you get LIKE; if they are the same length PeopleTools will generate an =. The = is more efficient. If you have 6-character EMPLIDs you will never get =. You could consider changing the length of the EMPLID field, but that is a fundamental change to make.
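The LIKE-versus-equals rule in the note above can be sketched in a few lines. This is only an illustration of the rule as the notes state it, not PeopleTools code: the function name `search_predicate` and its parameters are hypothetical, and it does no quoting or escaping.

```python
def search_predicate(column: str, value: str, field_length: int) -> str:
    """Build the predicate the notes describe for one search-key value.

    Hypothetical helper: a value shorter than the field definition yields
    LIKE 'value%', a value that fills the field yields an equality.
    """
    if len(value) < field_length:
        return f"{column} LIKE '{value}%'"
    return f"{column} = '{value}'"


# A 6-character EMPLID never fills an 11-character field, so it always gets LIKE.
short_key = search_predicate("EMPLID", "123456", 11)
full_key = search_predicate("EMPLID", "12345678901", 11)
print(short_key)
print(full_key)
```

The field length of 11 used here is the default EMPLID length mentioned later in the notes.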

PERS_SRCH / EMPLMT_SRCH
    CREATE OR REPLACE VIEW PS_PERS_SRCH_GBL (...) AS
    SELECT …
    FROM PS_PERSONAL_DATA A, PS_JOB B, PS_PERS_NID ND, PS_NID_TYPE_TBL NDT, PS_SCRTY_TBL_DEPT SEC
    WHERE A.EMPLID=B.EMPLID
    AND A.EMPLID=ND.EMPLID
    AND ND.COUNTRY=NDT.COUNTRY
    AND ND.NATIONAL_ID_TYPE=NDT.NATIONAL_ID_TYPE
    AND ( B.EFFDT>=%CURRENTDATEIN
       OR ( B.EFFDT=( SELECT MAX(B2.EFFDT) FROM PS_JOB B2
                      WHERE B2.EMPLID=B.EMPLID AND B2.EMPL_RCD#=B.EMPL_RCD#
                      AND B2.EFFDT<=%CURRENTDATEIN)
        AND B.EFFSEQ=( SELECT MAX(B3.EFFSEQ) FROM PS_JOB B3
                       WHERE B3.EMPLID=B.EMPLID AND B3.EMPL_RCD#=B.EMPL_RCD#
                       AND B3.EFFDT=B.EFFDT ) ) )
    AND SEC.ACCESS_CD='Y'
    AND EXISTS ( SELECT 'X' FROM PSTREENODE SEC3
       WHERE SEC3.SETID = SEC.SETID
       AND SEC3.SETID = B.SETID_DEPT
       AND SEC3.TREE_NAME='DEPT_SECURITY'
       AND SEC3.EFFDT= SEC.TREE_EFFDT
       AND SEC3.TREE_NODE=B.DEPTID
       AND SEC3.TREE_NODE_NUM BETWEEN SEC.TREE_NODE_NUM AND SEC.TREE_NODE_NUM_END
       AND NOT EXISTS ( SELECT 'X' FROM PS_SCRTY_TBL_DEPT SEC2
          WHERE SEC.OPRID = SEC2.OPRID
          AND SEC.SETID = SEC2.SETID
          AND SEC.TREE_NODE_NUM <> SEC2.TREE_NODE_NUM
          AND SEC3.TREE_NODE_NUM BETWEEN SEC2.TREE_NODE_NUM AND SEC2.TREE_NODE_NUM_END
          AND SEC2.TREE_NODE_NUM BETWEEN SEC.TREE_NODE_NUM AND SEC.TREE_NODE_NUM_END))

You are not intended to be able to read the text on this slide. It is in fact the SQL behind the view, and it looks horrible. Don't worry about the actual SQL on this page, but take a look at the shape. All these search views have the same basic construction. They select mostly the same data from the same tables, and they make the same joins between these tables: query; sub-query (two, actually); another sub-query; sub-query of sub-query. Let's start to pull this view apart.

Tree-Reading Security View
This is how the view actually hangs together:

    PERSONAL_DATA - employee name
    JOB - effective dated job records
    PERS_NID - national ID number
    NID_TYPE_TBL - national ID type, for the description of the national ID
    PSTREENODE - where the tree is actually stored
    PS_SCRTY_TBL_DEPT - the tree nodes to which you have been granted access

Personal data is joined to job by the employee ID. The department security table determines which nodes on the tree the operator has access to, and job is joined to PSTREENODE. That is the area where you can get into trouble. For every node on the department security tree to which an operator has access, you search for that department on the JOB table. The more departments you have access to, the longer it takes.

Options for Optimisation
Simplification: Flattening; Optimiser (Oracle only): Cost -v- Rule; Pre-process: Generated tables, Replication; Data Latency -v- Performance

So, what can you do about this view? It is a complex view, but it can be simplified by flattening out the sub-queries without altering the functionality. We have observed that Oracle's cost based optimiser performs better on flatter queries, so having flattened this view you can consider using the cost based optimiser. All other platforms ONLY have a cost based optimiser.

Other options include generating tables that effectively perform some of the joins in advance, rather than repeating the operations every time the view is executed. The drawback with generated tables, which must be considered, is that you may have to wait until the table is regenerated before you see new data. But we will talk about that.

Optimisers
Rule; Cost; Old (Stable); Inflexible; Predictable; Influence; New; Hints (Oracle); Statistics; Distributions; Maintenance

At this point it might be useful to divert slightly and talk about database query optimisers. When any SQL query is submitted to the database, the database has to determine how best to execute it: which table should drive the query, and which indexes, if any, should be used to locate the data.

Oracle still has, and used only to have, a rule based optimiser (RBO), which analyses the SQL statement and the indexes available on the tables and then follows a set of hard-coded rules to choose how to execute the query. Oracle are threatening to remove the rule based optimiser in a later release, but it is still there in Oracle 8.1. All the other databases that PeopleSoft supports have only cost based optimisers.

The cost based optimiser (CBO) also uses statistical information about the volume of data within the tables, and the cardinality, or uniqueness, of each of the indexes, in deciding how to execute a query. These statistics have to be calculated by running a process, which effectively means the DBA has to write and execute a script. From v7.3 Oracle can also take the distribution of the data into account. Oracle calls these distributions histograms.

The statistics can be calculated either by analysing all the data in the table and index, or by making an estimate based upon a sample of the data. Ideally the statistics for all of the tables should be fully calculated, but this takes longer than the estimate. Sometimes estimation is sufficient: all that is required is enough information about the table to persuade the optimiser to take a better path.

WHERE EXISTS(sub-query)
    CREATE OR REPLACE VIEW PS_PERS_SRCH_GBL (...) AS
    SELECT …
    FROM PS_PERSONAL_DATA A, PS_JOB B, PS_PERS_NID ND, PS_NID_TYPE_TBL NDT, PS_SCRTY_TBL_DEPT SEC
    WHERE A.EMPLID=B.EMPLID
    AND A.EMPLID=ND.EMPLID
    AND ND.COUNTRY=NDT.COUNTRY
    AND ND.NATIONAL_ID_TYPE=NDT.NATIONAL_ID_TYPE
    AND ( B.EFFDT>=%CURRENTDATEIN
       OR ( B.EFFDT=( SELECT MAX(B2.EFFDT) FROM PS_JOB B2
                      WHERE B2.EMPLID=B.EMPLID AND B2.EMPL_RCD#=B.EMPL_RCD#
                      AND B2.EFFDT<=%CURRENTDATEIN)
        AND B.EFFSEQ=( SELECT MAX(B3.EFFSEQ) FROM PS_JOB B3
                       WHERE B3.EMPLID=B.EMPLID AND B3.EMPL_RCD#=B.EMPL_RCD#
                       AND B3.EFFDT=B.EFFDT ) ) )
    AND SEC.ACCESS_CD='Y'
    AND EXISTS ( SELECT 'X' FROM PSTREENODE SEC3
       WHERE SEC3.SETID = SEC.SETID
       AND SEC3.SETID = B.SETID_DEPT
       AND SEC3.TREE_NAME='DEPT_SECURITY'
       AND SEC3.EFFDT= SEC.TREE_EFFDT
       AND SEC3.TREE_NODE=B.DEPTID
       AND SEC3.TREE_NODE_NUM BETWEEN SEC.TREE_NODE_NUM AND SEC.TREE_NODE_NUM_END
       AND NOT EXISTS ( SELECT 'X' FROM PS_SCRTY_TBL_DEPT SEC2
          WHERE SEC.OPRID = SEC2.OPRID
          AND SEC.SETID = SEC2.SETID
          AND SEC.TREE_NODE_NUM <> SEC2.TREE_NODE_NUM
          AND SEC3.TREE_NODE_NUM BETWEEN SEC2.TREE_NODE_NUM AND SEC2.TREE_NODE_NUM_END
          AND SEC2.TREE_NODE_NUM BETWEEN SEC.TREE_NODE_NUM AND SEC.TREE_NODE_NUM_END))

Here is the vanilla PERS_SRCH again. I'm going to spend the next few slides talking about the things you can do to this view to improve its overall performance. You may wonder why we ship it like this. If you are querying a single employee, particularly if you are using the whole employee ID, then this is very efficient, because the JOB and PERSONAL_DATA tables are joined on EMPLID. The problem comes when you search by a different attribute of the person, such as their name.

I want to look at the WHERE EXISTS sub-query in PERS_SRCH. I've marked it up in bold and italic so that you can see the area of the query I am looking at.

WHERE EXISTS(sub-query)
    ...
    AND EXISTS ( SELECT 'X' FROM PSTREENODE SEC3
       WHERE SEC3.SETID = SEC.SETID
       AND SEC3.SETID = B.SETID_DEPT
       AND SEC3.TREE_NAME='DEPT_SECURITY'
       AND SEC3.EFFDT= SEC.TREE_EFFDT
       AND SEC3.TREE_NODE=B.DEPTID
       AND SEC3.TREE_NODE_NUM BETWEEN SEC.TREE_NODE_NUM AND SEC.TREE_NODE_NUM_END
    ...

And here it is close up. The PSTREENODE table contains one row per node per tree per effective date. That's not quite the primary key of the PSTREENODE table, but the underlined conditions do uniquely describe one row of the department security tree. Later I will explain why I can prove that assertion. If the sub-query only ever returns a single row, then I can merge it with the parent query without changing the behaviour of the view. By doing so I flatten one level of sub-query out of my view.
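The equivalence claimed above can be checked on toy data. The following is a minimal sketch, assuming Python's built-in sqlite3 as a stand-in for Oracle; the table names `job`, `treenode` and `scrty`, their cut-down column lists and the sample rows are all invented for illustration. The UNIQUE constraint encodes the assumption that the sub-query can match at most one tree node.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE job (emplid TEXT, setid_dept TEXT, deptid TEXT);
-- Cut-down stand-in for PSTREENODE; the UNIQUE constraint is the assumption
-- that makes the EXISTS sub-query return at most one row.
CREATE TABLE treenode (setid TEXT, tree_name TEXT, effdt TEXT,
                       tree_node TEXT, tree_node_num INTEGER,
                       UNIQUE (setid, tree_name, effdt, tree_node));
-- Cut-down stand-in for PS_SCRTY_TBL_DEPT.
CREATE TABLE scrty (oprid TEXT, setid TEXT, tree_effdt TEXT,
                    tree_node_num INTEGER, tree_node_num_end INTEGER);
INSERT INTO treenode VALUES
  ('SHARE', 'DEPT_SECURITY', '1999-01-01', 'ROOT',     1),
  ('SHARE', 'DEPT_SECURITY', '1999-01-01', 'SALES',    2),
  ('SHARE', 'DEPT_SECURITY', '1999-01-01', 'SALES_UK', 3),
  ('SHARE', 'DEPT_SECURITY', '1999-01-01', 'HR',       4);
-- OP1 is granted the SALES node, whose range covers SALES and SALES_UK.
INSERT INTO scrty VALUES ('OP1', 'SHARE', '1999-01-01', 2, 3);
INSERT INTO job VALUES ('E1', 'SHARE', 'SALES_UK'),
                       ('E2', 'SHARE', 'HR'),
                       ('E3', 'SHARE', 'SALES');
""")

# Vanilla shape: the tree is read inside an EXISTS sub-query.
exists_sql = """
SELECT b.emplid FROM job b, scrty sec
WHERE sec.oprid = 'OP1'
AND EXISTS (SELECT 'X' FROM treenode sec3
            WHERE sec3.setid = sec.setid
            AND sec3.setid = b.setid_dept
            AND sec3.tree_name = 'DEPT_SECURITY'
            AND sec3.effdt = sec.tree_effdt
            AND sec3.tree_node = b.deptid
            AND sec3.tree_node_num BETWEEN sec.tree_node_num
                                       AND sec.tree_node_num_end)
ORDER BY b.emplid
"""

# Flattened shape: the same conditions, with the tree joined in the main query.
flat_sql = """
SELECT b.emplid FROM job b, scrty sec, treenode sec3
WHERE sec.oprid = 'OP1'
AND sec3.setid = sec.setid
AND sec3.setid = b.setid_dept
AND sec3.tree_name = 'DEPT_SECURITY'
AND sec3.effdt = sec.tree_effdt
AND sec3.tree_node = b.deptid
AND sec3.tree_node_num BETWEEN sec.tree_node_num AND sec.tree_node_num_end
ORDER BY b.emplid
"""

exists_rows = con.execute(exists_sql).fetchall()
flat_rows = con.execute(flat_sql).fetchall()
print(exists_rows, flat_rows)
```

Both queries return the employees in SALES and SALES_UK and omit the one in HR. If the sub-query could match more than one row per job, the flattened form would return duplicates, which is why the uniqueness assertion matters.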

WHERE NOT EXISTS(sub-query)
    CREATE OR REPLACE VIEW PS_PERS_SRCH_GBL (...) AS
    SELECT …
    FROM PS_PERSONAL_DATA A, PS_JOB B, PS_PERS_NID ND, PS_NID_TYPE_TBL NDT, PS_SCRTY_TBL_DEPT SEC, PSTREENODE SEC3
    WHERE A.EMPLID=B.EMPLID
    AND A.EMPLID=ND.EMPLID
    AND B.EMPLID=ND.EMPLID
    AND ND.COUNTRY=NDT.COUNTRY
    AND ND.NATIONAL_ID_TYPE=NDT.NATIONAL_ID_TYPE
    AND ( B.EFFDT>=%CURRENTDATEIN
       OR ( B.EFFDT=( SELECT MAX(B2.EFFDT) FROM PS_JOB B2
                      WHERE B.EMPLID=B2.EMPLID AND B.EMPL_RCD#=B2.EMPL_RCD#
                      AND B2.EFFDT<=%CURRENTDATEIN )
        AND B.EFFSEQ=( SELECT MAX(B3.EFFSEQ) FROM PS_JOB B3
                       WHERE B.EMPLID=B3.EMPLID AND B.EMPL_RCD#=B3.EMPL_RCD#
                       AND B.EFFDT=B3.EFFDT)))
    AND SEC.ACCESS_CD='Y'
    AND SEC3.SETID = SEC.SETID
    AND SEC3.SETID = B.SETID_DEPT
    AND SEC3.TREE_NAME='DEPT_SECURITY'
    AND SEC3.EFFDT= SEC.TREE_EFFDT
    AND SEC3.TREE_NODE=B.DEPTID
    AND SEC3.TREE_NODE_NUM BETWEEN SEC.TREE_NODE_NUM AND SEC.TREE_NODE_NUM_END
    AND NOT EXISTS ( SELECT 'X' FROM PS_SCRTY_TBL_DEPT SEC2
       WHERE SEC.OPRID = SEC2.OPRID
       AND SEC.SETID = SEC2.SETID
       AND SEC.TREE_NODE_NUM <> SEC2.TREE_NODE_NUM
       AND SEC3.TREE_NODE_NUM BETWEEN SEC2.TREE_NODE_NUM AND SEC2.TREE_NODE_NUM_END
       AND SEC2.TREE_NODE_NUM BETWEEN SEC.TREE_NODE_NUM AND SEC.TREE_NODE_NUM_END)

The second stage that I experimented with was to remove the WHERE NOT EXISTS sub-query down here at the end.

    …
    AND NOT EXISTS ( SELECT 'X' FROM PS_SCRTY_TBL_DEPT SEC2
       WHERE SEC.OPRID = SEC2.OPRID
       AND SEC.SETID = SEC2.SETID
       AND SEC.TREE_NODE_NUM <> SEC2.TREE_NODE_NUM
       AND SEC3.TREE_NODE_NUM BETWEEN SEC2.TREE_NODE_NUM AND SEC2.TREE_NODE_NUM_END
       AND SEC2.TREE_NODE_NUM BETWEEN SEC.TREE_NODE_NUM AND SEC.TREE_NODE_NUM_END)

If we merged this sub-query into the parent query and simply joined this table, the query would return no rows in exactly the cases we want, where no matching rows exist in the sub-query. If instead we outer join this table to the parent query, the query returns a result whether or not matching rows exist. Where a row in the outer joined table does not exist, a dummy row of NULL values is notionally added to that table and returned by the query. These dummy rows can be found by adding an IS NULL condition on an otherwise NOT NULL column in that table. This is equivalent to identifying the rows that do not exist in that table. Hence it is possible to flatten out a WHERE NOT EXISTS sub-query.

There is a catch. The table in the sub-query is joined to two other tables, and you cannot outer join a table to more than one table in Oracle SQL.
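The outer-join-plus-IS-NULL idea can be seen in miniature below. This is a sketch only, assuming sqlite3 and ANSI LEFT JOIN syntax rather than the Oracle (+) notation used in the slides; the tables `node` and `override` and their rows are invented, and the example joins to a single table, so it sidesteps the two-table restriction just described.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- Hypothetical tables: tree nodes, and ranges of nodes claimed by another grant.
CREATE TABLE node (tree_node_num INTEGER NOT NULL);
CREATE TABLE override (tree_node_num INTEGER NOT NULL,
                       tree_node_num_end INTEGER NOT NULL);
INSERT INTO node VALUES (1), (2), (3), (4), (5);
INSERT INTO override VALUES (3, 4);
""")

# Sub-query form: keep nodes that no override range covers.
not_exists_sql = """
SELECT n.tree_node_num FROM node n
WHERE NOT EXISTS (SELECT 'X' FROM override o
                  WHERE n.tree_node_num BETWEEN o.tree_node_num
                                            AND o.tree_node_num_end)
ORDER BY n.tree_node_num
"""

# Flattened form: outer join, then keep only the NULL-extended rows.
# override.tree_node_num is declared NOT NULL, so NULL can only mean "no match".
outer_join_sql = """
SELECT n.tree_node_num FROM node n
LEFT JOIN override o
  ON n.tree_node_num BETWEEN o.tree_node_num AND o.tree_node_num_end
WHERE o.tree_node_num IS NULL
ORDER BY n.tree_node_num
"""

not_exists_rows = con.execute(not_exists_sql).fetchall()
outer_join_rows = con.execute(outer_join_sql).fetchall()
print(not_exists_rows, outer_join_rows)
```

Both forms return nodes 1, 2 and 5. The IS NULL test is only safe on a column that can never be NULL in a real row, which is the point the notes make about choosing an otherwise NOT NULL column.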

Workaround to Outer-join to 2 tables
    CREATE OR REPLACE VIEW PS_FUDGE_VW (...) AS
    SELECT ...
    FROM PSTREENODE E, PS_SCRTY_TBL_DEPT C
    WHERE C.ACCESS_CD='Y'
    AND E.SETID=C.SETID
    AND E.TREE_NAME='DEPT_SECURITY'
    AND E.EFFDT=C.TREE_EFFDT
    AND E.TREE_NODE_NUM BETWEEN C.TREE_NODE_NUM AND C.TREE_NODE_NUM_END

So what you do is join together the two tables involved in the outer join and produce a simple view. This view now contains all of the nodes in the tree, identified by the department security table, to which you have access. The view has no functional significance. It is there purely to evade a SQL coding restriction specific to Oracle.

Don't try this at home

    AND FDG.SETID = B.SETID_DEPT
    AND FDG.TREE_NODE=B.DEPTID
    AND SEC2.OPRID IS NULL
    AND FDG.OPRID = SEC2.OPRID(+)
    AND FDG.SETID = SEC2.SETID(+)
    AND FDG.T_TREE_NODE_NUM >= SEC2.TREE_NODE_NUM(+)
    AND FDG.T_TREE_NODE_NUM <= SEC2.TREE_NODE_NUM_END(+)
    AND FDG.S_TREE_NODE_NUM <> SEC2.TREE_NODE_NUM(+)
    AND FDG.S_TREE_NODE_NUM <= SEC2.TREE_NODE_NUM(+)
    AND FDG.S_TREE_NODE_NUM_END >= SEC2.TREE_NODE_NUM(+)

This is what the end of the query looks like with the outer join. The IS NULL condition restricts it to the outer joined rows. The BETWEENs have been re-coded as pairs of inequalities. The (+) is the Oracle outer join syntax. The key point is that it is functionally identical.

Fully Flattened View Don’t try this at home
    CREATE OR REPLACE VIEW PS_PERS_SRCH_GBL (...) AS
    SELECT ...
    FROM PS_PERSONAL_DATA A, PS_JOB B, PS_PERS_NID ND, PS_NID_TYPE_TBL NDT, PS_SCRTY_TBL_DEPT SEC2, PS_FUDGE_VW FDG
    WHERE A.EMPLID=B.EMPLID
    AND A.EMPLID=ND.EMPLID
    AND B.EMPLID=ND.EMPLID
    AND ND.COUNTRY=NDT.COUNTRY
    AND ND.NATIONAL_ID_TYPE=NDT.NATIONAL_ID_TYPE
    AND ( B.EFFDT>= %CURRENTDATEIN
       OR ( B.EFFDT=( SELECT MAX(B2.EFFDT) FROM PS_JOB B2
                      WHERE B.EMPLID=B2.EMPLID AND B.EMPL_RCD#=B2.EMPL_RCD#
                      AND B2.EFFDT<= %CURRENTDATEIN)
        AND B.EFFSEQ=( SELECT MAX(B3.EFFSEQ) FROM PS_JOB B3
                       WHERE B.EMPLID=B3.EMPLID AND B.EMPL_RCD#=B3.EMPL_RCD#
                       AND B.EFFDT=B3.EFFDT)))
    AND FDG.SETID = B.SETID_DEPT
    AND FDG.TREE_NODE=B.DEPTID
    AND SEC2.OPRID IS NULL
    AND FDG.OPRID = SEC2.OPRID(+)
    AND FDG.SETID = SEC2.SETID(+)
    AND FDG.T_TREE_NODE_NUM >= SEC2.TREE_NODE_NUM(+)
    AND FDG.T_TREE_NODE_NUM <= SEC2.TREE_NODE_NUM_END(+)
    AND FDG.S_TREE_NODE_NUM <> SEC2.TREE_NODE_NUM(+)
    AND FDG.S_TREE_NODE_NUM <= SEC2.TREE_NODE_NUM(+)
    AND FDG.S_TREE_NODE_NUM_END >= SEC2.TREE_NODE_NUM(+)

Here is the final flattened view. The only remaining sub-query is the current effective date for job. There is a warning here: don't try this at home. I have included the last few slides to show that I examined this approach. It was not beneficial.

So what is the benefit of flattening?
Depends upon the conditions

    WHERE EMPLID = '1234'       slightly worse
    WHERE EMPLID LIKE '1234%'   no difference
    WHERE NAME = 'SMITH'        better
    WHERE NAME LIKE 'SMI%'      much better

So what is the benefit of flattening these views? Removing the WHERE EXISTS improves performance; removing the WHERE NOT EXISTS degrades it. We have also found that the benefit actually depends upon the conditions used to query the data.

If you are searching for a particular employee ID, then the vanilla version of the view is the fastest; the partially flattened view with the cost based optimiser is fractionally slower but requires less I/O. If you are searching by an attribute of the employee such as name or national ID, then either the partially flattened view under RBO or the fully flattened view under cost based optimisation is best.

A note: if the string you are searching for is shorter than the definition of the field, PeopleTools will generate the LIKE keyword. Only if the search string is the same length as the field will the condition be generated as an equality. By default EMPLID is 11 characters.

So what is the benefit of flattening?
This graph shows the average I/O overhead for a range of typical queries and conditions on each of the three views, for rule and cost optimisation.

Key points: The cost based optimiser doesn't do very well on the vanilla version of the view. Partially flattening the view dramatically improves the performance under CBO. The fully flattened view, although still better under CBO than anything under RBO, is not as good as the partially flattened view. You can enable CBO for a particular statement by use of an optimiser hint, so you could put the hint in the panel search view.

Flattening usually helps the rule based optimiser. The partially flattened view was significantly better for some queries but not for others. On average, the fully flattened view performed slightly worse under the RBO.

These numbers came from tests on Oracle. Last year, when I gave this presentation at a conference in Paris, the numbers came from Oracle 7.3 on HR7.0 views. The extra join has improved the partially flattened view such that it now performs better than the fully flattened view.

And there's more! Let's go back to the diagram of PERS_SRCH
The tables below the horizontal line are PSTREENODE and PS_SCRTY_TBL_DEPT. They represent the tree and the security access profile of the operator. These tables are relatively static: changes to the tree and to the access profile of operator classes are going to be relatively rare. You could consider joining these two tables together, placing the results in another table, and then using that table in PERS_SRCH. The tables above the line are PERSONAL_DATA, JOB, PERS_NID and NID_TYPE_TBL. They are very dynamic, because they contain the actual data.

Pre-generated tables
Pre-join the data: once when generated, not every time in the view; Extra indexes; Latency: frequency of regeneration

If you need even better performance, then you may have to arrange your data in such a way that it is more readily queryable. With generated tables you join fewer tables in the security view. You can also build additional indexes on the generated tables, because you don't have the cost of maintaining them. The catch is that you introduce latency in the security table: data changes don't appear in the generated table until it is regenerated.

Security Table

    CREATE TABLE PS_SECURITY AS
    SELECT E.TREE_NODE, C.OPRID, C.SETID
    FROM PS_SCRTY_TBL_DEPT C, PSTREENODE E
    WHERE C.ACCESS_CD='Y'
    AND E.SETID=C.SETID
    AND E.TREE_NAME='DEPT_SECURITY'
    AND E.EFFDT=C.TREE_EFFDT
    AND E.TREE_NODE_NUM BETWEEN C.TREE_NODE_NUM AND C.TREE_NODE_NUM_END
    AND NOT EXISTS( SELECT 'X' FROM PS_SCRTY_TBL_DEPT G
       WHERE C.OPRID=G.OPRID
       AND C.TREE_NODE_NUM<>G.TREE_NODE_NUM
       AND E.TREE_NODE_NUM BETWEEN G.TREE_NODE_NUM AND G.TREE_NODE_NUM_END
       AND G.TREE_NODE_NUM BETWEEN C.TREE_NODE_NUM AND C.TREE_NODE_NUM_END)

Here is that table. You can see the same construction that you saw in the vanilla version of PERS_SRCH. You can build a primary key index on the columns of this table, and the fact that you can do so proves the assumption made earlier about what uniquely describes a single row of the department security tree on the PSTREENODE table.
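To see what such a generated table ends up containing, here is a small sketch, again using sqlite3 in place of Oracle. The table names `treenode`, `scrty` and `security`, the cut-down columns and the sample tree are assumptions for illustration. The operator is granted the whole tree but denied the HR branch, so the NOT EXISTS removes the nodes covered by the nested HR entry.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE treenode (setid TEXT, tree_name TEXT, effdt TEXT,
                       tree_node TEXT, tree_node_num INTEGER);
CREATE TABLE scrty (oprid TEXT, setid TEXT, access_cd TEXT, tree_effdt TEXT,
                    tree_node_num INTEGER, tree_node_num_end INTEGER);
INSERT INTO treenode VALUES
  ('SHARE', 'DEPT_SECURITY', '1999-01-01', 'ROOT',     1),
  ('SHARE', 'DEPT_SECURITY', '1999-01-01', 'SALES',    2),
  ('SHARE', 'DEPT_SECURITY', '1999-01-01', 'SALES_UK', 3),
  ('SHARE', 'DEPT_SECURITY', '1999-01-01', 'HR',       4),
  ('SHARE', 'DEPT_SECURITY', '1999-01-01', 'PAYROLL',  5);
-- OP1 may see everything below ROOT (nodes 1-5) except the HR branch (4-5).
INSERT INTO scrty VALUES
  ('OP1', 'SHARE', 'Y', '1999-01-01', 1, 5),
  ('OP1', 'SHARE', 'N', '1999-01-01', 4, 5);

-- Same construction as the slide: one row per operator per accessible node.
CREATE TABLE security AS
SELECT e.tree_node AS tree_node, c.oprid AS oprid, c.setid AS setid
FROM scrty c, treenode e
WHERE c.access_cd = 'Y'
AND e.setid = c.setid
AND e.tree_name = 'DEPT_SECURITY'
AND e.effdt = c.tree_effdt
AND e.tree_node_num BETWEEN c.tree_node_num AND c.tree_node_num_end
AND NOT EXISTS (SELECT 'X' FROM scrty g
                WHERE c.oprid = g.oprid
                AND c.tree_node_num <> g.tree_node_num
                AND e.tree_node_num BETWEEN g.tree_node_num
                                        AND g.tree_node_num_end
                AND g.tree_node_num BETWEEN c.tree_node_num
                                        AND c.tree_node_num_end);

-- Building a unique index succeeds only if no (operator, setid, node) repeats.
CREATE UNIQUE INDEX security_pk ON security (oprid, setid, tree_node);
""")

security_rows = con.execute(
    "SELECT oprid, tree_node FROM security ORDER BY tree_node").fetchall()
print(security_rows)
```

On this data the generated table holds ROOT, SALES and SALES_UK for OP1 and nothing from the HR branch. The unique index would fail to build if the same node appeared twice for an operator, which is the kind of check the notes describe.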

PERS_SRCH

    CREATE OR REPLACE VIEW PS_PERS_SRCH_GBL (...) AS
    SELECT ...
    FROM PS_PERSONAL_DATA A, PS_JOB B, PS_PERS_NID ND, PS_NID_TYPE_TBL NDT, PS_SECURITY SEC
    WHERE A.EMPLID=B.EMPLID
    AND A.EMPLID=ND.EMPLID
    AND B.EMPLID=ND.EMPLID
    AND ND.COUNTRY=NDT.COUNTRY
    AND ND.NATIONAL_ID_TYPE=NDT.NATIONAL_ID_TYPE
    AND ( B.EFFDT>= %CURRENTDATEIN
       OR ( B.EFFDT=( SELECT MAX(B2.EFFDT) FROM PS_JOB B2
                      WHERE B.EMPLID=B2.EMPLID AND B.EMPL_RCD#=B2.EMPL_RCD#
                      AND B2.EFFDT<= %CURRENTDATEIN)
        AND B.EFFSEQ=( SELECT MAX(B3.EFFSEQ) FROM PS_JOB B3
                       WHERE B.EMPLID=B3.EMPLID AND B.EMPL_RCD#=B3.EMPL_RCD#
                       AND B.EFFDT=B3.EFFDT)))
    AND SEC.SETID = B.SETID_DEPT
    AND SEC.TREE_NODE=B.DEPTID

PERS_SRCH can now be changed to reference the security table. The whole of the tree-reading sub-query has been replaced by the generated security table and the last two lines of the WHERE clause, shown in italics.

PERSONAL_DATA, JOB & NID
    CREATE TABLE PS_GEN_JOB_TBL(...) AS
    SELECT DISTINCT ...
    FROM PS_PERSONAL_DATA A, PS_JOB B, PS_PERS_NID ND, PS_NID_TYPE_TBL NDT
    WHERE A.EMPLID=B.EMPLID
    AND A.EMPLID=ND.EMPLID
    AND B.EMPLID=ND.EMPLID
    AND ND.NATIONAL_ID_TYPE=NDT.NATIONAL_ID_TYPE
    AND ( B.EFFDT>=%CURRENTDATEIN
       OR ( B.EFFDT=( SELECT MAX(B2.EFFDT) FROM PS_JOB B2
                      WHERE B2.EMPLID=B.EMPLID AND B2.EMPL_RCD#=B.EMPL_RCD#
                      AND B2.EFFDT<=%CURRENTDATEIN)
        AND B.EFFSEQ=( SELECT MAX(B3.EFFSEQ) FROM PS_JOB B3
                       WHERE B3.EMPLID=B.EMPLID AND B3.EMPL_RCD#=B.EMPL_RCD#
                       AND B3.EFFDT=B.EFFDT)))

You can also join the personal data, job and national insurance ID records. However, this becomes a large table to maintain and index, because it contains all the alternate search key and list columns on the panel search view. You also have the problem of when to maintain the generated table: every time the JOB table changes, every time the PERSONAL_DATA table changes and every time a national insurance table changes. The part that degrades the performance is the effective date and sequence sub-queries on PS_JOB. So we have found that it is reasonable to have a generated job table only.

Current and future JOB

    CREATE TABLE PS_GEN_JOB_TBL(...) AS
    SELECT DISTINCT B.EMPLID, B.EMPL_RCD#, B.DEPTID, B.SETID_DEPT
    FROM PS_JOB B
    WHERE ( B.EFFDT>=%CURRENTDATEIN
       OR ( B.EFFDT=( SELECT MAX(B2.EFFDT) FROM PS_JOB B2
                      WHERE B2.EMPLID=B.EMPLID AND B2.EMPL_RCD#=B.EMPL_RCD#
                      AND B2.EFFDT<=%CURRENTDATEIN)
        AND B.EFFSEQ=( SELECT MAX(B3.EFFSEQ) FROM PS_JOB B3
                       WHERE B3.EMPLID=B.EMPLID AND B3.EMPL_RCD#=B.EMPL_RCD#
                       AND B3.EFFDT=B.EFFDT)))

You will notice the DISTINCT keyword. An employee can have many future effective dated job records that do not change his or her department, so you don't need to replicate them into the generated table. The DISTINCT will eliminate them.

This same table can be used within the query security record. But you could have a department change where the current and future departments are both visible to the same operator, in which case there would be a duplicate row in the query for this employee. To resolve this you need a DISTINCT keyword within the query security view, and I will show you that in a few slides.
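The effect of the effective-date logic and the DISTINCT can be traced on one employee's job history. This is a sketch under stated assumptions: sqlite3 instead of Oracle, a bind variable `:today` standing in for %CURRENTDATEIN, a column `empl_rcd` standing in for EMPL_RCD#, and invented rows.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE job (emplid TEXT, empl_rcd INTEGER, effdt TEXT, effseq INTEGER,
                  deptid TEXT, setid_dept TEXT);
INSERT INTO job VALUES
  ('E1', 0, '1999-01-01', 0, 'A', 'SHARE'),  -- history
  ('E1', 0, '2000-01-01', 0, 'A', 'SHARE'),  -- superseded on the same date
  ('E1', 0, '2000-01-01', 1, 'B', 'SHARE'),  -- current row
  ('E1', 0, '2000-09-01', 0, 'B', 'SHARE'),  -- future row, same department
  ('E1', 0, '2001-01-01', 0, 'C', 'SHARE');  -- future row, new department
""")

# ISO date strings compare correctly as text; :today plays %CURRENTDATEIN.
current_and_future = """
SELECT {distinct} b.emplid, b.empl_rcd, b.deptid, b.setid_dept
FROM job b
WHERE ( b.effdt >= :today
   OR ( b.effdt = (SELECT MAX(b2.effdt) FROM job b2
                   WHERE b2.emplid = b.emplid AND b2.empl_rcd = b.empl_rcd
                   AND b2.effdt <= :today)
    AND b.effseq = (SELECT MAX(b3.effseq) FROM job b3
                    WHERE b3.emplid = b.emplid AND b3.empl_rcd = b.empl_rcd
                    AND b3.effdt = b.effdt)))
ORDER BY b.deptid
"""
params = {"today": "2000-06-01"}

all_rows = con.execute(current_and_future.format(distinct=""), params).fetchall()
distinct_rows = con.execute(
    current_and_future.format(distinct="DISTINCT"), params).fetchall()
print(len(all_rows), distinct_rows)
```

Without DISTINCT the query returns three rows, because department B appears as both the current row and a future row; with DISTINCT it returns one row each for departments B and C.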

Maintain via PeopleCode
JOB.DEPTID.SavePostChg

    /* maintain GEN_JOB_TBL whenever an update to PS_JOB is made */
    SQLExec("delete from PS_GEN_JOB_TBL where EMPLID = :1 and EMPL_RCD# = :2", EMPLID, EMPL_RCD#);
    SQLExec("insert into PS_GEN_JOB_TBL select * from PS_GEN_JOB_VW where EMPLID = :1 and EMPL_RCD# = :2", EMPLID, EMPL_RCD#);

If a current or future effective dated row is inserted into the JOB table via a panel, you can maintain the generated job table via PeopleCode. You must delete and reinsert all rows for an employee, because you don't know how many future rows there may be. It is faster to delete and reinsert than to decide whether an update is necessary and then maybe do the update.

If you don't maintain the table, then when you hire a person you will not be able to see them until the table is regenerated, and the same is true if an employee is transferred from a department that you cannot see to a department that you can see. You must regenerate the table daily, just after midnight. The continuous deleting from and inserting into the table can degrade the performance of its indexes, so it is as well to rebuild the indexes when the table is rebuilt.

Current or first JOB, but no future
    CREATE TABLE PS_GEN_JOB_TBL(...) AS
    SELECT B.EMPLID, B.EMPL_RCD#, B.DEPTID, B.SETID_DEPT
    FROM PS_JOB B
    WHERE ( B.EFFDT=( SELECT MAX(D.EFFDT) FROM PS_JOB D
                      WHERE B.EMPLID=D.EMPLID AND B.EMPL_RCD#=D.EMPL_RCD#
                      AND D.EFFDT<=%CURRENTDATEIN)
         OR B.EFFDT=( SELECT MIN(E.EFFDT) FROM PS_JOB E
                      WHERE B.EMPLID=E.EMPLID AND B.EMPL_RCD#=E.EMPL_RCD#
                      HAVING MIN(E.EFFDT)>%CURRENTDATEIN))
    AND B.EFFSEQ=( SELECT MAX(B3.EFFSEQ) FROM PS_JOB B3
                   WHERE B.EMPLID=B3.EMPLID AND B.EMPL_RCD#=B3.EMPL_RCD#
                   AND B.EFFDT=B3.EFFDT)

If you do not require future effective security, or never enter future effective rows, then consider this. You get either the current effective dated job record or, if there is no current job record, as in the case of a future hire, the first effective dated record. In either case you get the maximum sequenced record for that effective date: one and only one row from JOB.

This query returns only one row per employee and employee record number, so you no longer need the DISTINCT keyword, either in this query or in the query security view. Because the query security view no longer needs a DISTINCT to remove duplicates from queries, you get even better performance.

Maintain via PeopleCode
JOB.DEPTID.SavePostChg

    /* maintain GEN_JOB_TBL whenever an update to PS_JOB is made */
    &TMP = 0;
    SQLExec("select 1 from PS_GEN_JOB_TBL where EMPLID = :1 and EMPL_RCD# = :2", EMPLID, EMPL_RCD#, &TMP);
    If %SqlRows > 0 Then
       SQLExec("update PS_GEN_JOB_TBL set (DEPTID, SETID_DEPT) = (select DEPTID, SETID_DEPT from PS_GEN_JOB_VW where EMPLID = :1 and EMPL_RCD# = :2) where EMPLID = :1 and EMPL_RCD# = :2", EMPLID, EMPL_RCD#);
    Else
       SQLExec("insert into PS_GEN_JOB_TBL select * from PS_GEN_JOB_VW where EMPLID = :1 and EMPL_RCD# = :2", EMPLID, EMPL_RCD#);
    End-If;

If you know that there is only one row per employee/EMPL_RCD# in the generated job table, then the PeopleCode to maintain the table can be changed so that the values are updated rather than deleted and reinserted. This looks more complicated, but it is more efficient because it updates only a single row, and because rows are not being deleted from and reinserted into the table it is both faster and does less damage to the indexes. You must still regenerate the table daily, just after midnight.

Panel Search Record

    CREATE OR REPLACE VIEW PS_PERS_SRCH_GBL(...) AS
    SELECT /*+ALL_ROWS*/ ...
    FROM PS_PERSONAL_DATA A, PS_GEN_JOB_TBL B, PS_PERS_NID ND, PS_NID_TYPE_TBL NDT, PS_SECURITY SEC
    WHERE A.EMPLID=B.EMPLID
    AND A.EMPLID=ND.EMPLID
    AND B.EMPLID=ND.EMPLID
    AND ND.COUNTRY=NDT.COUNTRY
    AND ND.NATIONAL_ID_TYPE=NDT.NATIONAL_ID_TYPE
    AND SEC.SETID = B.SETID_DEPT
    AND SEC.TREE_NODE=B.DEPTID

And then PERS_SRCH is simply the result of joining those two generated tables to the remaining tables. The only reason why I haven't suggested using just a single generated table is that the table could get very big and very costly to regenerate. If you had 100 operators or operator classes, each with access to 1000 employees, then the single table would contain 100,000 rows. So a single table could take time to build, take up a lot of space and require a lot of administration.

Query Security Record

    CREATE OR REPLACE VIEW PS_EMPLMT_SRCH_QRY (EMPLID, EMPL_RCD#, OPRCLASS) AS
    SELECT DISTINCT A.EMPLID, A.EMPL_RCD#, S.OPRID
    FROM PS_SECURITY S, PS_GEN_JOB_TBL A
    WHERE S.TREE_NODE=A.DEPTID
    AND S.SETID=A.SETID_DEPT

The query security record doesn't need the extra tables, only the security table and the job table. You still need the DISTINCT if you have future effective dated security. In PERS_SRCH, which does not have EMPL_RCD#, you also need it if you use concurrent jobs. If you don't use concurrent job processing you can use PERS_SRCH_QRY throughout the system. That helps queries that would otherwise use both EMPLMT_SRCH_QRY and PERS_SRCH_QRY.

Customers who have implemented this two generated table approach are managing to generate both tables in times measured in minutes rather than hours.

Current & Current or future JOB
    CREATE TABLE PS_GEN_JOB_TBL(...) AS
    SELECT B.EMPLID, B.EMPL_RCD#, B.DEPTID, B.SETID_DEPT, MIN(B.EFFDT)
    FROM PS_JOB B
    WHERE ( B.EFFDT>=( SELECT NVL(MAX(D.EFFDT),%CURRENTDATEIN) FROM PS_JOB D
                       WHERE B.EMPLID=D.EMPLID AND B.EMPL_RCD#=D.EMPL_RCD#
                       AND D.EFFDT<=%CURRENTDATEIN)
      AND B.EFFSEQ=( SELECT MAX(B3.EFFSEQ) FROM PS_JOB B3
                     WHERE B.EMPLID=B3.EMPLID AND B.EMPL_RCD#=B3.EMPL_RCD#
                     AND B.EFFDT=B3.EFFDT))
    GROUP BY B.EMPLID, B.EMPL_RCD#, B.DEPTID, B.SETID_DEPT

I went to one customer recently where they wanted a mixture of current-only and current-and-future effective security. So an effective date was added to the generated job table. It is the earliest effective date for each employee/department combination. The query is grouped by the other four fields, so there is no need for DISTINCT.

Read only & Read/Write Security
    CREATE TABLE PS_SECURITY AS
    SELECT E.TREE_NODE, C.OPRID, C.SETID, C.ACCESS_CD
    FROM PS_SCRTY_TBL_DEPT C, PSTREENODE E
    WHERE C.ACCESS_CD != 'N'
    AND E.SETID=C.SETID
    AND E.TREE_NAME='DEPT_SECURITY'
    AND E.EFFDT=C.TREE_EFFDT
    AND E.TREE_NODE_NUM BETWEEN C.TREE_NODE_NUM AND C.TREE_NODE_NUM_END
    AND NOT EXISTS( SELECT 'X' FROM PS_SCRTY_TBL_DEPT G
       WHERE C.OPRID=G.OPRID
       AND C.TREE_NODE_NUM<>G.TREE_NODE_NUM
       AND E.TREE_NODE_NUM BETWEEN G.TREE_NODE_NUM AND G.TREE_NODE_NUM_END
       AND G.TREE_NODE_NUM BETWEEN C.TREE_NODE_NUM AND C.TREE_NODE_NUM_END)

The same customer also wanted a mixture of read-only and read/write panels. The read-only panel has a read-only panel search record. So we had to generate rows with ACCESS_CD values of both Y and R into the generated security table.

Current Only/Read Write Security

    CREATE OR REPLACE VIEW PS_EMPLMT_SRCH_QRY (EMPLID, EMPL_RCD#, OPRCLASS) AS
    SELECT DISTINCT A.EMPLID, A.EMPL_RCD#, S.OPRID
    FROM PS_SECURITY S, PS_GEN_JOB_TBL A
    WHERE S.TREE_NODE=A.DEPTID
    AND S.SETID=A.SETID_DEPT
    AND A.EFFDT <= %CURRENTDATEIN
    AND S.ACCESS_CD = 'Y'

The query security record doesn't need the extra tables, only the security table and the job table. You still need the DISTINCT if you have future effective dated security, and in PERS_SRCH, which does not have EMPL_RCD#, you need it if you use concurrent jobs. If you don't use concurrent job processing you can use PERS_SRCH_QRY throughout the system. That helps queries that would otherwise use both EMPLMT_SRCH_QRY and PERS_SRCH_QRY.

The benefit of generated tables?
This graph illustrates the I/O for a typical query using different options for the query view. These tests were run on Oracle using both the rule based and the cost based (blue) optimiser. The y-axis shows I/O.

Conclusions: Flattening helps, especially under cost based optimisation, but don't try flattening outer joins. Pre-generating the security table further improves performance. Two generated tables are better still. Honestly, there is something in the last column.

The benefit of generated tables?
Honestly, there is something in the last column. Here is the same graph with a logarithmic y-axis, so each division is a factor of 10. Two generated tables are two divisions better: 10 squared, or 100x.

Tree Reading Query Performance
Security Views; Flatten; Cost Based Optimiser; Pre-Generated tables

To summarise: I described a couple of generic database tuning techniques applied to the security views. We first flattened them, then I considered using the CBO, and finally we introduced pre-generated tables.

Implementation recommendations
Panel Search Records: Two generated tables with PeopleCode
Query Security Records: Two generated tables; Remove duplicates: Distinct, or Current Security Only

I have no hesitation in recommending the two generated table approach. Detailed tests at a number of different customer sites have consistently shown this to be the best option. The PeopleCode to deal with the latency problem is not a significant overhead. If any batch or interface process writes to the JOB record, then you must deal with the generated job table yourself.

For HR7.5, if you put in future effective dated personal data changes, the PERSONAL_DATA table has to be regenerated every night from PERS_EFFDT_TBL. The same is now true of the generated JOB table, so you have an overnight batch anyway. You can have current effective dated security, or you can suppress the duplicate rows in Query and Crystal reports by use of DISTINCT in the query security views.
