GDT Tips and Tricks
Doug Evans
GDT 2006 International User Conference: Evolving the Legacy – Revolutions
June 25 - 28, Palm Springs, California


File Handling
- The Indexed File Structure
- The Binary Tree
- What happens during a Read and Write operation?
  - What happens to the Index?
  - How do we obtain data by the Index?
- The impact of compressing your keys
- How is data file integrity maintained?
- Causes of, and response to, file corruption
- Enhancing performance!

Question to Ponder
- What level of data integrity do you require in your files?
  - Maybe you want to flush every write operation to disk immediately?
  - Maybe you want a reasonable level of integrity, where you let Micro Focus write the data as soon as it can, to protect against the application being killed, etc.?
  - Maybe you are comfortable with, and have the luxury of, just re-running your applications to recover from an untimely event?

Indexed File Structure
- The basics
  - Your indexed file will have a primary key and possibly alternate keys that interface to your data.
  - The data will include live records as well as deleted records.

[Diagram: Index Structure]

[Diagram: The Binary Tree (Read a Key)]

[Diagram: The “Binary Chop”]

The Binary Tree (Read a Key)
- What if you do another Random Read of another key?
  - It starts at the beginning node and works its way back down the chain, UNLESS
    - the nodes from the previous READ are still in cache, in which case it can read them from cache.
  - ALTERNATIVELY, if you are doing a sequential READ NEXT, it knows via cache the previously read node and starts from there (much quicker).

Key Compression
- Key Compression: to save space
- Types of Key Compression
  - Duplicate Key Compression
    - May be used when you have many keys that are the same
    - Shows the first instance of the key, while all other occurrences have a pointer to the node it should point to
  - Leading Character Compression
    - If one record key contains AAAAA and the second record key contains AAAAB, then the second record key will only show “B”; the A's are compressed. The key does, however, contain the information required so that it can be decompressed.
  - Trailing Space Compression
    - Spaces at the end of the key are compressed. Again, information is maintained for decompression.
  - Trailing Null Compression
    - Nulls at the end of the key are compressed. Again, information is maintained for decompression.
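As a rough illustration of leading character and trailing space compression, the sketch below stores each key as a (shared prefix length, remaining characters, stripped space count) tuple. The tuple layout and the function names are assumptions made for this example; they are not the Micro Focus on-disk format.

```python
def compress_keys(keys):
    """Compress a sorted list of keys.

    Each entry is (prefix_len, suffix, pad): prefix_len characters are shared
    with the previous key, and pad trailing spaces were stripped.
    Hypothetical layout, for illustration only.
    """
    out, prev = [], ""
    for key in keys:
        body = key.rstrip(" ")
        pad = len(key) - len(body)            # trailing space compression
        n = 0
        while n < min(len(body), len(prev)) and body[n] == prev[n]:
            n += 1                            # leading character compression
        out.append((n, body[n:], pad))
        prev = body
    return out


def decompress_keys(entries):
    """Rebuild the original keys; each key depends on the one before it."""
    keys, prev = [], ""
    for n, suffix, pad in entries:
        body = prev[:n] + suffix
        keys.append(body + " " * pad)
        prev = body
    return keys


keys = ["AAAAA   ", "AAAAB   ", "AAB     "]
packed = compress_keys(keys)
print(packed)
assert decompress_keys(packed) == keys
```

Note that the second entry keeps only "B" plus the bookkeeping needed to restore it, which is the AAAAA / AAAAB case described above.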

Key Compression
- What happens when you try to read with Key Compression?
  - Keys are not fixed length (some are compressed more than others)
    - So the keys need to be decompressed before they can be read and compared to the key being looked for
    - The “Binary Chop” cannot happen
    - MUST SEQUENTIALLY WALK THROUGH EVERY NODE!
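A minimal sketch of why compression defeats the binary chop, assuming a node is held as one byte string. With fixed-length keys the middle entry sits at a computable offset; with compressed entries each key can only be rebuilt from the one before it, so the search has to walk from the start. `KEY_LEN`, the (prefix length, suffix) entry layout, and both function names are assumptions for illustration.

```python
KEY_LEN = 5  # assumed fixed key length, in bytes


def binary_chop(node, target):
    """Search a node of sorted fixed-length keys; returns (slot, probes)."""
    lo, hi, probes = 0, len(node) // KEY_LEN - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        # Fixed-length keys: the middle entry sits at a computable offset.
        key = node[mid * KEY_LEN:(mid + 1) * KEY_LEN]
        probes += 1
        if key == target:
            return mid, probes
        if key < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, probes


def sequential_walk(entries, target):
    """Search compressed (prefix_len, suffix) entries; returns (slot, probes)."""
    prev = b""
    for slot, (prefix_len, suffix) in enumerate(entries):
        # Each key can only be rebuilt from the key before it.
        prev = prev[:prefix_len] + suffix
        if prev == target:
            return slot, slot + 1
    return -1, len(entries)


keys = [b"AAAA" + bytes([c]) for c in b"ABCDEFGH"]
node = b"".join(keys)
entries = [(0, keys[0])] + [(4, k[4:]) for k in keys[1:]]

print(binary_chop(node, b"AAAAG"))         # (6, 3)
print(sequential_walk(entries, b"AAAAG"))  # (6, 7)
```

Even in an eight-key node the walk needs more than twice as many comparisons, and the gap widens as nodes get larger.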

Indexed Files (Writing Records)
- What is happening?
  - Every index in the file needs to be updated (Primary and Alternate Keys)
  - The basic process:
    - The Header is updated, just to say we are in mid-update
    - The Record is added to the Data file
    - Indexes are updated: first the Primary, then the Alternate keys
    - The Header is updated to say that the action is completed

[Diagram: Indexed Files (Writing Records)]

Indexed Files (Writing Records): “NODE SPLITTING”
- Done to make room available to add the entry to the node.
- Must look at the preceding node to verify that it also has room available to add the entry.

[Diagram: Indexed Files (Writing Records), “NODE SPLITTING”]
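The idea of node splitting can be sketched with a generic B-tree-style split. This is not the Micro Focus algorithm: the capacity of 4 keys, the two-level layout (one parent over a list of leaf nodes), and all names below are assumptions chosen to keep the example short.

```python
import bisect

NODE_CAPACITY = 4  # assumed maximum number of entries per node


def insert_key(node, key, capacity=NODE_CAPACITY):
    """Insert key into a sorted node; split the node in two if it overflows.

    Returns (left, right); right is None when no split was needed.
    """
    keys = list(node)
    bisect.insort(keys, key)
    if len(keys) <= capacity:
        return keys, None
    mid = len(keys) // 2
    return keys[:mid], keys[mid:]


def insert_into_tree(children, key, capacity=NODE_CAPACITY):
    """Insert key into the correct leaf.

    Returns True when the parent now holds more entries than it has room
    for, i.e. the parent would have to split as well.
    """
    lows = [child[0] for child in children]        # lowest key of each leaf
    idx = max(bisect.bisect_right(lows, key) - 1, 0)
    left, right = insert_key(children[idx], key, capacity)
    children[idx:idx + 1] = [left] if right is None else [left, right]
    return len(children) > capacity


leaves = [[10, 20, 30, 40], [50, 60]]
parent_must_split = insert_into_tree(leaves, 25)
print(leaves, parent_must_split)
```

The first leaf was full, so it splits into two and the parent gains an entry; the parent only has to split once it runs out of room itself.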

How File Integrity is Maintained
- The File Header
  - Static Information
    - File Attributes
    - Number of Keys
    - Format and Organization of a file
  - Dynamic Information
    - Integrity indicator
    - Modification Counter
    - Logical EOF marker

[Diagram: How File Integrity is Maintained, The File Header]

How File Integrity is Maintained: The File Header
- Integrity Flag
  - The File Handler uses this flag to maintain integrity
  - 2-byte field
  - Value depends on the update being performed (type of operation)
  - A non-zero value when the header is read indicates to the File Handler that an operation has not fully completed.

How File Integrity is Maintained: The File Header
- Modified Value Field
  - Position
  - 4-byte field
  - Used as an aid to performance
  - If a process detects that this value has changed since it last read the header, the nodes it has cached are invalid and it must read new nodes from the physically stored index structure.
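A minimal sketch of how a process might use these two header fields. The field sizes (2 and 4 bytes) come from the slides; the offsets, the byte order, and the function name are assumptions, since the real header positions are not given here.

```python
import struct

# Hypothetical layout: 2-byte integrity flag followed by a 4-byte
# modification counter, big-endian. Real offsets are not stated in the talk.
HEADER_FMT = ">HI"


def check_header(raw, last_counter):
    """Return (update_in_progress, cache_valid, counter) for a header image."""
    flag, counter = struct.unpack_from(HEADER_FMT, raw)
    update_in_progress = flag != 0            # non-zero: an operation did not complete
    cache_valid = counter == last_counter     # changed: cached nodes are stale
    return update_in_progress, cache_valid, counter


clean = struct.pack(HEADER_FMT, 0, 41)
busy = struct.pack(HEADER_FMT, 1, 42)
print(check_header(clean, 41))
print(check_header(busy, 41))
```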

Understanding the Write Operation
- The File Handler obtains a “Write Semaphore”
  - To allow only one process to update at a time
  - Control-Break is disabled
- The File Header is read
- The Integrity Flag is updated
  - To ensure that another process has not left the file in a corrupt state; the Modified Value field is also checked
- The File Header is updated and written, basically stating that the process is performing a write operation
- Write DATA
  - The created record is written to disk
- Write INDEX
  - The index is created and written to disk
- The INTEGRITY FLAG is reset and written back to disk so that another process can update the file
- The FILE HEADER is written
- The SEMAPHORE is RELEASED
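The sequence above can be sketched as a small in-memory simulation. Everything here is illustrative: the class, the dictionary standing in for the file header, the lock standing in for the write semaphore, and the choice to bump the modification counter on each completed write are assumptions, not the file handler's implementation.

```python
import threading


class IndexedFile:
    """Toy model of the write protocol; nothing here touches a real file."""

    def __init__(self):
        self.header = {"integrity": 0, "modified": 0}
        self.data = []      # data records
        self.index = {}     # key -> record position
        self._write_sema = threading.Lock()   # stands in for the write semaphore

    def write(self, key, record, crash_before_index=False):
        with self._write_sema:                       # one updater at a time
            if self.header["integrity"] != 0:
                raise RuntimeError("previous update did not complete")
            self.header["integrity"] = 1             # header says: mid-update
            self.data.append(record)                 # write DATA
            if crash_before_index:
                return                               # simulate the process dying here
            self.index[key] = len(self.data) - 1     # write INDEX
            self.header["modified"] += 1             # other processes drop stale caches
            self.header["integrity"] = 0             # header says: update complete


f = IndexedFile()
f.write("K1", "rec1")
f.write("K2", "rec2", crash_before_index=True)
print(f.header)   # the integrity flag is left non-zero
```

A later writer sees the non-zero flag and refuses to continue, which is the point at which a real file would need repairing.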

Understanding the Write Operation
- Special Note
  - When using a WRITE / OPEN EXCLUSIVE on a file, the indexes are CACHED until either the CACHE limit is reached or a CLOSE operation is done.

Possible Causes of Corruption of Indexed Files
- KILL -9 is used on Unix
  - Need to use KILL -15
    - The RTS invokes the Micro Focus exit procedure, flushing the cached index nodes back to disk
- Copying open files on Unix
  - Unix allows copying of files opened Exclusive; at the time of copying, the cached index nodes may not have been flushed
- Network problems
- Machine rebooted or powered off while indexed files are open
- Actual error in the system itself
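The difference between KILL -15 and KILL -9 can be shown by analogy in Python: signal 15 (SIGTERM) can be caught, giving the process a chance to flush, while signal 9 (SIGKILL) cannot be caught at all. The cache contents and function names below are hypothetical; this is not how the Micro Focus RTS is written.

```python
import signal
import sys

cached_nodes = ["node-1", "node-7"]   # hypothetical unflushed index nodes
flushed = []                          # stands in for the disk


def flush_cache():
    """Write every cached node back to 'disk'."""
    while cached_nodes:
        flushed.append(cached_nodes.pop(0))


def on_terminate(signum, frame):
    """Exit procedure: flush first, then leave."""
    flush_cache()
    sys.exit(0)


def install_handlers():
    # kill -15 (SIGTERM) is delivered to the handler, so the cache is flushed.
    signal.signal(signal.SIGTERM, on_terminate)
    # kill -9 (SIGKILL) cannot be caught, blocked, or ignored: no handler
    # runs, and whatever is still cached is lost.
```

A program would call `install_handlers()` once at start-up, from its main thread.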

Fixing Corrupted Files
- REBUILD UTILITY
  - Takes the attributes of the input file to produce the output file
  - Requires Exclusive use of the file
  - TO REORGANIZE A FILE
    - USES THE INDEX TO READ THE DATA RECORDS
    - REBUILD INFILE,OUTFILE
  - TO FIX CORRUPTED INDEXES
    - IGNORES INDEXES AND DIRECTLY READS RECORDS FROM THE DATA FILE
    - NEW DATA STRUCTURE AND DATA FILE CREATED
    - REBUILD INFILE,OUTFILE /d
  - TO RUN REBUILD ON LIVE DATA
    - REBUILD INFILE

Summary on File Integrity
- The MOST Integrity?
  - WRITETHRU directive
    - Can be used at compile time or as a tunable to the File Handler
    - When OPEN is done on the file, specifies to the Operating System that any WRITES to the file are flushed immediately to disk.
    - PERFORMANCE takes a NOSE DIVE immediately!
- Default Level of Integrity?
  - Reasonable level of integrity
  - Micro Focus will write data as soon as possible
  - Protects against the application being killed
  - A couple of directives to look at
    - IDXDATBUF
      - The IDXDATBUF option determines the size of the buffer used when accessing the data portion of a file with organization INDEXED.
      - DEFINE NBBUF & BPB from JCL will override this setting if given.
    - LOADONTOHEAP
      - The LOADONTOHEAP option specifies whether the File Handler loads the file into memory before executing any I/O operations.
- Able to RERUN your applications
  - May be recovery enough

File Handling Performance
- Getting better performance from your application by getting better performance from your file handler
- Advances in technology, CPU speed, the amount of memory addressable by a process, and code generator optimization have made it easier to push back thoughts of trying to improve performance
- Accessing your disk is still the slowest thing you can do on your machines today

File Handling Performance
- You can make certain aspects of the file handler perform better, but you need to be careful about how this can affect another application accessing the file in a different manner.
- Micro Focus provides an “All Round” solution to performance
  - Giving you the ability to tune the file handler to what you need for your application's performance!

File Handling Performance
- The BIG question: what should you use?
- Understanding what to use is based on your understanding of the Binary Tree.
  - One for every index of your data file

Tuning Your Files
- Access Permission
  - Examine your data files on an individual, per-file basis
  - Not every file needs to give complete access to everyone
  - When opening files:
    - Use Exclusive access where possible
    - Otherwise allow only Readers
  - Only when absolutely necessary, give all others complete access to the file
  - Micro Focus timing (8 million records written on IDX 8 format file)
    - Exclusive access: 7 min 13 sec
    - Allow Readers: 25 min
    - Allow all Others: 25 min

Tuning Your Files
- Based on the findings below, you might simply choose Allow all Others when the choice is between that and Allow Readers, but the two timings match only because a single user was used in the test.
  - Micro Focus timing (8 million records written on IDX 8 format file)
    - Exclusive access: 7 min 13 sec
    - Allow Readers: 25 min
    - Allow all Others: 25 min

Tuning Your Files
- Write Allowing Readers
  - When an update is done, it goes to disk to allow others to read
- Write Allowing Others
  - Updates by other processes may be done at the same time
- When Writing
  - Nodes are cached in memory
  - With Exclusive use, it only has to check whether nodes have been changed. Quick.
  - When allowing others, it keeps reading nodes off disk as they are changed. A lot slower.

Tuning Your Files
- Micro Focus timing (8 million records read on IDX 8 format file)
  - Exclusive access: 2 min 32 sec
  - Allow Readers: 2 min 32 sec
  - Allow all Others: 6 min 28 sec
    - This allows applications to change the file, so there are many more checks to do each time to see whether the file has been changed.

File Handler Configuration Settings
- READSEMA
  - Specifies whether or not the system attempts to gain a semaphore for shared files when operations are performed that do not modify the file (READ, START, etc.)
  - You need to ensure that it is set to OFF (default)
  - You might think that this can cause dirty reads?
    - No. When you read a record, it checks whether the record has been changed; if so, it takes out a semaphore on that record.
  - When set to ON, it can degrade performance by 15%

File Handler Configuration Settings
- IGNORELOCK
  - For when you are not concerned about a “dirty” read.
  - For when you are not bothered that someone comes in and changes a record you have just read.
  - Can improve performance by 15%
    - Be aware that this is handled internally by GDT through the READLOCK=STAT directive.

KEYS
- The shorter the keys are in size, the better.
  - Fit more in a node
  - Quicker to traverse the Binary Tree
- Remove redundant keys
  - Each key has a tree
  - Needs to be updated for each insert and delete
- Micro Focus timing (8 million records read)
  - 2 alternate keys: 6 min 28 sec
  - 3 alternate keys: 8 min 40 sec
  - 4 alternate keys: 53 min 08 sec! (we will talk about this in a couple of minutes)

Compression
- Data Compression
  - Minimal performance hit
    - When reading a record, it traverses the tree; every time it gets a record, it has to decompress the record before writing.
- Index Compression
  - If you can get away with it, do not use it!
  - Always a hit in performance, sometimes severe!
  - The file handler cannot Binary Chop the node when searching for the key.
    - All keys are a different size
    - Cannot tell where the middle of the node is

Compression
- Micro Focus timing (8 million records)
  - Sequential Write, no data/key compression: 6 min 40 sec
  - Random Read, no data/key compression: 6 min 28 sec
  - Sequential Read, no data/key compression: 6 min 12 sec
  - Sequential Write with data compression: 8 min 14 sec
  - Random Read with data compression: 7 min 59 sec
  - Sequential Read with data compression: 7 min 53 sec
  - Sequential Write with data/key compression: 15 min 18 sec
  - Random Read with data/key compression: 15 min 36 sec
  - Sequential Read with data/key compression: 7 min 50 sec
- SEQUENTIAL READ: consistent. When doing a READ NEXT, it always knows where the previous key is. Very different from random reads.

FINE TUNING
- Setting File Handler Configuration Options
  - Set on a per-file basis
  - There is no magic formula.
  - You need to adjust to suit each application
  - Can have both a positive and a negative impact on your application
  - SET EXTFH=C:\EXTFH.CFG
    - [XFH-DEFAULT]
    - NODESIZE=4096
    - [FILE1.DAT]
    - NODESIZE=1024
    - [FILE2.DAT]
    - INDEXCOUNT=32
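To make the per-file layering concrete, here is a small sketch that reads the configuration shown above with Python's `configparser` and works out the settings that would apply to one file. It assumes that a per-file section overrides `[XFH-DEFAULT]` and inherits everything else from it; `load_settings` and `effective_settings` are names invented for this example.

```python
import configparser

EXTFH_CFG = """\
[XFH-DEFAULT]
NODESIZE=4096
[FILE1.DAT]
NODESIZE=1024
[FILE2.DAT]
INDEXCOUNT=32
"""


def load_settings(text):
    parser = configparser.ConfigParser()
    parser.optionxform = str          # keep option names in upper case
    parser.read_string(text)
    return parser


def effective_settings(parser, filename):
    """Per-file options layered over [XFH-DEFAULT] (assumed precedence)."""
    settings = {}
    if parser.has_section("XFH-DEFAULT"):
        settings.update(parser["XFH-DEFAULT"])
    if parser.has_section(filename):
        settings.update(parser[filename])
    return settings


cfg = load_settings(EXTFH_CFG)
for name in ("FILE1.DAT", "FILE2.DAT", "FILE3.DAT"):
    print(name, effective_settings(cfg, name))
```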

INDEXCOUNT
- Specifies the number of index nodes to be cached for an indexed file, per process
- Default cache size is 16 nodes
  - [XFH-DEFAULT]
  - INDEXCOUNT=32
    - Be aware that this is handled internally by GDT through the NBBUF & BPB directives.
  - [FILE1.DAT]
  - INDEXCOUNT=16

[Diagram: INDEXCOUNT IN ACTION, INDEXCOUNT = 4]

INDEXCOUNT IN ACTION
- Now we need to read 4B

[Diagram: INDEXCOUNT IN ACTION, INDEXCOUNT = 4]

INDEXCOUNT IN ACTION
- Prioritization of cache
  - Works out which nodes are needed more often
  - Tries to keep nodes higher up in the tree in cache the longest
- INDEXCOUNT with multiple indexes
  - 2 KEYS
  - BETTER TO HAVE INDEXCOUNT GREATER THAN 4, OR YOU WILL HAVE LOADS OF DISK I/O SWAPPING THE NODES IN CACHE
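The effect of an undersized cache with two keys can be sketched with a simple simulation. It uses plain least-recently-used eviction, which is an assumption: the slides say the real file handler favours nodes higher in the tree. The node names, the tree depth of 3, and the workload are also made up for illustration.

```python
from collections import OrderedDict


def disk_reads(paths, indexcount):
    """Count node reads from disk for a list of root-to-leaf paths."""
    cache, reads = OrderedDict(), 0
    for path in paths:
        for node in path:
            if node in cache:
                cache.move_to_end(node)          # cache hit: mark as recently used
            else:
                reads += 1                       # cache miss: read node from disk
                cache[node] = True
                if len(cache) > indexcount:
                    cache.popitem(last=False)    # evict least recently used node
    return reads


# Two keys, each with its own tree of depth 3; every write touches both trees.
primary = ["P-root", "P-branch", "P-leaf"]
alternate = ["A-root", "A-branch", "A-leaf"]
workload = [primary, alternate] * 100

print(disk_reads(workload, indexcount=4))   # 600
print(disk_reads(workload, indexcount=8))   # 6
```

With only 4 slots for 6 live nodes, every access in this workload misses; once the cache can hold both paths, only the first touch of each node goes to disk.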

INDEXCOUNT
- When to use
  - For writing where there are multiple keys
  - For reading files randomly
  - Reading files sequentially may degrade performance, as it keeps track of the nodes in cache
  - Set INDEXCOUNT to 32 for most files

NODESIZE
- Specifies the size of the index nodes to use for an indexed file when it is created
- NODESIZE={512|1024|4096|16384}
- Default is set by the file handler at creation, based on the key size
- NODESIZE can also be given in the BLF statement under GDTBATCH.
- If you only read files sequentially, you may want to think about increasing the node size.
  - Why?
    - On sequential read access, it always knows where the next record is; once it is at the end of the node, it goes back to the top of the tree and traverses the tree again. On random reads, it does not know where the next record is, so it always goes to the top of the tree and binary chops its way down the tree. In this case NODESIZE has no effect.

NODESIZE
- When to use NODESIZE
  - Reading records sequentially
  - Avoid using with a HIGH INDEXCOUNT
    - Creates high memory usage
  - If in doubt, let the file handler set the NODESIZE

LOADONTOHEAP
- Forces the file handler to load the entire file onto the memory heap
  - All operations execute in memory, only writing back to disk when the file is closed
  - Use with caution!
    - You could lose your data
  - [XFH-DEFAULT]
  - [FILE1.DAT]
  - LOADONTOHEAP=ON
  - When to use
    - Only in Exclusive mode
    - Only on small files
    - Where you are batch processing and the data is backed up
    - Look for SPEED versus INTEGRITY

IDXDATBUF
- Determines the size of the buffer used when accessing the data portion of a file with organization INDEXED
- Default is 0
- Set to the disk / page allocation size
- More suited to batch data
- Not applicable to single file format IDX 8
- May be worth a look if the file is too large for LOADONTOHEAP
- Sequential file version: SEQDATBUF
- Relative file version: RELDATBUF

Tuning Summary
- Consider tuning files on a file-by-file basis
- Think about permissions: exclusive use
- Avoid key compression
- Unless reading sequential data, set INDEXCOUNT=32 or calculate the optimum figure
- Sequential access: look at NODESIZE
- For unloading or loading data on larger files, look at LOADONTOHEAP, IDXDATBUF or SEQDATBUF

Using the REBUILD utility for better performance
- To remove compression
- To remove unused keys
- To determine the number of index nodes to cache
- To change the index node size of your data

How do you know you have Compression on your file?
- REBUILD /N filename or GDTFI filename

Removing Key Compression
- So we can Binary Chop the index nodes
  - rebuild infile,outfile /c:i0
    - c = compression
    - i0 = remove key compression
- So we don't have to decompress/compress
  - rebuild infile,outfile /c:d0i0
    - c = compression
    - d0 = remove data compression
    - i0 = remove key compression

Removing / Adding Keys
- The fewer keys you have, the faster the updates will be!
  - You cannot remove the primary key!
  - rebuild infile /k:r:22+10
    - k = tells rebuild that we are dealing with keys
    - r = remove
    - 22 = start of key
    - 10 = length of key

Removing / Adding Keys
- Removing a key that is defined in multiple areas of the file
  - rebuild /k:r:22+10,58+2,100+8
- Adding a key with duplicates
  - rebuild infile /k:a:1+130d
    - a = adding a key
    - d = key has duplicates
- Be careful when adding and removing keys
  - You must change your SELECT statements and RECOMPILE, or you will end up with a MISMATCH between the attributes on your file and the attributes in your SELECT statements

How to determine the number of index nodes to cache?
- (Max tree depth x no. of keys) + no. of keys = INDEXCOUNT
  - i.e. (3 x 4) + 4 = 16
- To improve performance when writing records and for random reads!
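The rule of thumb is easy to wrap in a helper. The function name is invented for this sketch, and the sample values (a tree depth of 3 with 4 keys) are assumed because they are consistent with the total of 16 shown on the slide.

```python
def recommended_indexcount(max_tree_depth, num_keys):
    """Max tree depth times the number of keys, plus the number of keys."""
    return max_tree_depth * num_keys + num_keys


print(recommended_indexcount(3, 4))   # 16
print(recommended_indexcount(3, 2))   # 8
```

The second call fits the earlier advice that a file with 2 keys wants an INDEXCOUNT greater than 4.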

How to determine the number of index nodes to cache?
- Remember
  - 16 is the default INDEXCOUNT setting unless you specify differently in your EXTFH.CFG file

How to change your Index Node Size for your files
- To improve sequential access: get more keys in a node
- NODESIZE is set automatically by the file handler via the following controls
  - Key length in bytes < 51, then nodesize = 512
  - Key length 51 to 100, then nodesize = 1024 (default)
  - Key length 101 to 512, then nodesize = 4096
  - Key length 513 to 4080, then nodesize =
- Can make an entry in EXTFH.CFG
  - NODESIZE=4096
  - Then rebuild infile,outfile will pick up the EXTFH.CFG setting, BUT if you choose a lower value than you should have, the file handler will pick the setting based on the controls above.
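A sketch of the selection rules above. The value for the largest key range is missing from the slide; 16384 is used below purely as an assumption, since it is the only size in the talk's NODESIZE list not already taken by a smaller range. The function names, and modelling the "too low" case by falling back to the key-length default, are likewise assumptions.

```python
def default_nodesize(key_length):
    """Node size the file handler would pick for a key length in bytes."""
    if key_length < 51:
        return 512
    if key_length <= 100:
        return 1024
    if key_length <= 512:
        return 4096
    if key_length <= 4080:
        # ASSUMPTION: the slide leaves this value blank; 16384 is the only
        # remaining size in the NODESIZE list given elsewhere in the talk.
        return 16384
    raise ValueError("key length outside the ranges listed on the slide")


def effective_nodesize(requested, key_length):
    """A requested size below the key-length default is ignored."""
    return max(requested, default_nodesize(key_length))


print(effective_nodesize(4096, 20))    # request honoured
print(effective_nodesize(512, 200))    # too low: falls back to the default
```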

Operating System Tuning
- You may fully understand the “Binary Tree” and the operations of your application, and have applied all the necessary tunables.
- BUT the application(s) run no better and may be even worse. You are disappointed!
- Look at performance at a different level.
- Operating system
  - Server operating system
    - Prioritization is more evenly distributed throughout all processes
  - Windows operating system
    - Prioritization is given to the application running in the foreground

Operating System Tuning
- Server operating system
  - The amount of CACHE is based on the amount of physical memory available on the system
    - With 6 to 12 GB of memory, this means you would have a large amount of cache
- Windows operating system
  - The amount of CACHE is normally around 10 MB. Once this is filled up, the operating system has to remove items from cache and insert newer ones. Big difference in performance.
- Windows operating system
  - Changing the settings for CACHE in the System properties can improve performance dramatically.
  - Be careful, this may create some side effects!