VSE/VSAM – Under the covers
John Mycroft, Product Development Manager, CSI International
WAVV 2008, May 2008


Acknowledgement
- With grateful thanks to Dan Janda, The Swami of VSAM, from whom much of this presentation was stolen
- To CSI for providing me with Data-Miner, CSI-Sort and a machine to create the examples
- To my fellow developers at CSI who put up with my hogging the machine for hours on end

Abstract
Overview of VSAM & its components. We take a look at what a VSAM file really looks like and how to soup up its performance. We also look at some common mistakes and how to avoid them.
This presentation and its materials are copyrighted and developed by John Mycroft from a presentation originally copyrighted by Dan Janda. Permission is granted for WAVV to reproduce this presentation for distribution to its members at no charge.
Trademarks: IBM, VSE, VSE/ESA, zVSE, CICS & DL/I are trademarks or registered trademarks of the IBM Corporation. The Swami of VSAM is a trademark of Dan Janda.

VSE/VSAM Overview
- Virtual Storage Access Method
- For disk files
- Sequential – “Entry-Sequenced Data Set” or ESDS
  - Begin at the beginning, go on till you get to the end and then stop
- Indexed – “Key-Sequenced Data Set” or KSDS
  - Process by key or sequentially or a mixture
- Direct – “Relative Record Data Set” or RRDS (fixed) or VRDS (variable)
  - Calculate a record’s location in the file to access it
- Alternate index (AIX) – gives an alternative route to a KSDS
  - Allows unique & non-unique keys

VSE/VSAM Functional areas
- Catalog
  - Volume & file information
  - Usage statistics
- Disk space management
  - Space allocation including secondary allocations
  - VSAM and VSAM/SAM files
  - System files
  - Libraries

VSE/VSAM Functional areas
- Integrity
- Performance
  - Data transfer size
  - Buffering
- Backup / restore
- File sharing between jobs and systems

Processing a VSAM file
- Sequentially (ESDS)
  - Forward or backward
- Keyed access (KSDS)
  - Direct by full or partial (generic) key
  - Sequentially, forward or backward
  - Skip sequential, forward or backward
- Addressed access (RRDS, VRDS)
  - Direct, by record address
  - Sequential & skip sequential
- Alternate Index Access
  - Same as keyed access
  - Also direct access by non-unique key

How VSAM stores data
We’re going to look at
- How VSAM stores records logically on disk
  - Performance considerations
- How VSAM physically stores data on disk
  - Disk space usage calculations
  - Optimizing disk capacity
  - Performance considerations
- VSAM jargon
  - Control Interval
  - Control Area
  - CI & CA splits
  - Freespace
  - RDF, CIDF

VSAM Jargon
Control Interval (CI)
- “Smallest unit of data transfer between main & disk storage”
- In other words, when you read a record, VSAM reads the whole CI that contains that record
- Think of it as the same as a block of records in a sequential file if you like (though it’s laid out differently)
- A CI can initially contain 1 or more records
  - More can be inserted
  - Some or all can be deleted
- When you try to add a new record to a CI with no room, a “CI split” takes place – more about that later

Layout of a control interval
ALL VSAM FILES ARE VARIABLE LENGTH, even if all the records are the same size
Rec 1 | Rec 2 | Rec 3 | Rec … | Freespace | RDFs | CIDF
- Rec 1 – Rec n: 1 to n logical records of any length
- Freespace: unused space in CI for inserting records or making existing records longer
- RDFs: 3-byte Record Descriptor Field
  - ESDS/KSDS: 1 per LRECL, 1 for all consecutive records of same length
  - RRDS: one per numbered record slot
- CIDF: 4-byte Control Interval Descriptor Field
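The layout above can be turned into a rough space budget for one CI. The sketch below is illustrative only: the 4-byte CIDF and 3-byte RDF sizes come from the slide, but the rule that a run of consecutive equal-length records shares a pair of RDFs is an assumption made here to read the "1 per LRECL, 1 for all consecutive records of same length" line, and the function names are hypothetical.

```python
from itertools import groupby

CIDF_BYTES = 4  # one Control Interval Descriptor Field per CI (from the slide)
RDF_BYTES = 3   # size of each Record Descriptor Field (from the slide)


def rdf_count(record_lengths):
    """Count the RDFs needed for the records in a KSDS/ESDS CI.

    Assumption: a run of two or more consecutive equal-length records
    shares a pair of RDFs (one for the length, one for the run count);
    a record that stands alone gets a single RDF.
    """
    count = 0
    for _, run in groupby(record_lengths):
        count += 1 if len(list(run)) == 1 else 2
    return count


def ci_usage(ci_size, record_lengths):
    """Return (data_bytes, control_bytes, free_bytes) for one CI."""
    data = sum(record_lengths)
    control = CIDF_BYTES + RDF_BYTES * rdf_count(record_lengths)
    free = ci_size - data - control
    if free < 0:
        raise ValueError("records do not fit in this CI")
    return data, control, free


fixed = ci_usage(2048, [120] * 10)      # ten equal-length records
mixed = ci_usage(2048, [100, 120, 80])  # three different lengths
print(fixed, mixed)
```

Under these assumptions the all-equal case costs 10 bytes of control information (one CIDF plus two RDFs), while three different lengths cost 13 bytes.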

Control Area (CA)
- CA size is the smallest of:
  - One cylinder
  - The size of the primary allocation
  - The size of the secondary allocation
- The number of CIs per CA depends on the device and the CI and CA sizes
- It is generally a good idea to go for the biggest CA possible
- A CA is a group of CIs. In a KSDS, all the data CIs in a CA are indexed by one index CI
CI 0  CI 1  CI 2  CI 3  CI 4  CI 5  CI 6  CI 7  CI 8  CI 9
CI10  CI11  CI12  CI13  CI14  CI15  CI16  CI17  CI18  CI19
CI20  CI21  CI22  CI23  CI24  CI25  CI26  CI27  CI28  CI29

Index Control Interval (Index CI)
Index CI
CI 0  CI 1  CI 2  CI 3  CI 4  CI 5  CI 6  CI 7  CI 8  CI 9
CI10  CI11  CI12  CI13  CI14  CI15  CI16  CI17  CI18  CI19
CI20  CI21  CI22  CI23  CI24  CI25  CI26  CI27  CI28  CI29
- A CI in an index containing pointers to
  - The next level in the index, or
  - The Data CI in the CA – this is referred to as a Sequence Set CI

Index and data structure
- Balanced tree
- Sparse index
- Always just 1 high-level index CI
- There can be 0 to many intermediate level index CIs
- There can be one or more low-level (sequence set) index CIs
  - If there is only 1 sequence set CI, it is also the high-level index CI
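To make the "sparse index" idea concrete, here is a minimal sketch of a keyed lookup through a two-level index. It is not how VSAM encodes its index CIs; the data, the tuple layout and the function names are hypothetical. The only idea carried over is that each index entry records the highest key reachable through a pointer, so one entry covers a whole data CI rather than a single record.

```python
from bisect import bisect_left

# Hypothetical two-level index over six data CIs.
# Each sequence-set entry is (highest key in the data CI, data CI number).
sequence_set = [
    [(100, 0), (200, 1), (300, 2)],  # sequence-set CI for the first CA
    [(400, 3), (500, 4), (600, 5)],  # sequence-set CI for the second CA
]
# The single high-level index CI: (highest key below, sequence-set CI number).
high_level = [(300, 0), (600, 1)]


def first_at_or_above(entries, key):
    """Position of the first entry whose high key is >= key, or None."""
    pos = bisect_left([high for high, _ in entries], key)
    return pos if pos < len(entries) else None


def find_data_ci(key):
    """Walk from the high-level index CI down to a data CI number."""
    top = first_at_or_above(high_level, key)
    if top is None:
        return None  # key is beyond the highest key in the file
    seq_ci = sequence_set[high_level[top][1]]
    low = first_at_or_above(seq_ci, key)
    return seq_ci[low][1]


print(find_data_ci(250), find_data_ci(301), find_data_ci(601))
```

The lookup only says which data CI could hold the key; the record itself is then found by scanning that CI.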

And now the bit you’ve all been waiting for……

Performance rules of thumb
- Use largest data CI possible, especially for sequential work
- Use as small an index CI as you can (but not too small!)
- Use large data CA – allocate primary and secondary as at least 1 cylinder
- Avoid too many extents / allocations

Allocation calculations
- CI freespace = CI size * freespace %
- Number of records per CI
  - “Fixed” length: (CI size - 10 - freespace) / LRECL
  - Variable length: (CI size - 7 - freespace) / (average LRECL + 3)
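The two formulas on this slide translate directly into code. This is a sketch under two assumptions that the slide does not state: freespace is rounded down to whole bytes, and the record count is rounded down to whole records. The function names and the sample figures (4096-byte CI, 10% freespace, 200-byte records) are made up for illustration.

```python
def ci_freespace_bytes(ci_size, freespace_pct):
    """CI freespace = CI size * freespace %, rounded down (assumption)."""
    return ci_size * freespace_pct // 100


def records_per_ci_fixed(ci_size, freespace_pct, lrecl):
    """'Fixed' length: (CI size - 10 - freespace) / LRECL."""
    usable = ci_size - 10 - ci_freespace_bytes(ci_size, freespace_pct)
    return max(usable // lrecl, 0)


def records_per_ci_variable(ci_size, freespace_pct, avg_lrecl):
    """Variable length: (CI size - 7 - freespace) / (average LRECL + 3)."""
    usable = ci_size - 7 - ci_freespace_bytes(ci_size, freespace_pct)
    return max(usable // (avg_lrecl + 3), 0)


# Hypothetical file: 4096-byte CI, 10% CI freespace, 200-byte records.
print(ci_freespace_bytes(4096, 10))            # bytes held back per CI
print(records_per_ci_fixed(4096, 10, 200))     # records loaded per CI
print(records_per_ci_variable(4096, 10, 200))  # same, variable-length formula
```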

What’s in a CI?
Data and control info (end of CI)

CI control information
At the end of each data CI

Data records

The CIDF
Note – (back 2 slides) free space has data in it from earlier CI split

The Index

Allocation calculations
Calculate freespace in each CA
- Get number of CIs per CA from LISTCAT or device characteristics (3390: 12 x 4K CIs/track, 180/cyl)
- CA freespace = number of CIs per CA * CA freespace %, rounded up
- Number of CIs loaded per CA = CIs per CA - CA freespace
- Number of records loaded per CA = loaded CIs in CA * number of records per CI
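The CA-level steps can be sketched the same way. The function name and the 18-records-per-CI figure are hypothetical; the 180 CIs per cylinder figure is the 3390 example from the slide, and the rounding up of CA freespace follows the slide.

```python
def ca_load_plan(cis_per_ca, ca_freespace_pct, records_per_ci):
    """Return (free CIs, loaded CIs, records loaded) for one CA."""
    # CA freespace = CIs per CA * CA freespace %, rounded up
    # (ceiling division done with integers to avoid float rounding).
    free_cis = -(-cis_per_ca * ca_freespace_pct // 100)
    loaded_cis = cis_per_ca - free_cis
    return free_cis, loaded_cis, loaded_cis * records_per_ci


# 3390 with 4K CIs: 180 CIs per one-cylinder CA, 5% CA freespace,
# and a hypothetical 18 records loaded per CI.
print(ca_load_plan(180, 5, 18))
```

With these inputs 9 CIs are left empty in each CA and 171 are loaded.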

VSAM Catalogs
- Exactly one master catalog
  - Assigned at IPL with DEF CAT or DEFINE MCAT IDCAMS command
- User catalogs – 0 to many
  - No more than 1 per volume
  - Catalog can own multiple spaces on a volume
  - Many catalogs can own space on a volume

VSAM Catalogs
Catalog contains:
- Self-describing records
- User catalog pointers
- Volume definitions
- Space definitions
- Cluster (file) definitions
- Component (data, index) definitions
- AIX & Path definitions

Catalog recommendations
- Use naming conventions
  - Name Cluster, Data and Index components explicitly
  - Use partition / system independent names where applicable
- Separate
  - Files seldom defined or deleted
  - Files often defined or deleted
  - Online critical files
  - Batch files
- Multiple baskets – all the eggs won’t get broken

More recommendations
- Don’t use recoverable catalogs
  - Hangover from 2314 / 3330
  - Backup is vastly better
  - IDCAMS, Faver, Maxback, Dr D, user-written …

CI & CA splits and freespace
- You try to insert a record in a CI or extend a record already there
- If there is enough free space in the CI, everyone moves up, record is inserted and CI rewritten
- BUT what if there isn’t enough free space????

CI & CA splits
CI split – 4 physical IOs
- Set “Split in progress”, write CI
- Move half of records to new CI & write it
- Update sequence set, write index CI
- Erase moved records from old CI, turn off “Split in progress”, write old CI
BUT…..
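The record movement in a CI split can be sketched in a few lines. This models only which records end up where, not the four physical writes or the "split in progress" flag. As a simplifying assumption a CI is a sorted list of keys with a capacity counted in records (at least 1) rather than bytes, and all names are hypothetical.

```python
from bisect import insort


def insert_with_split(ci, key, capacity):
    """Insert key into a CI, splitting it if it is already full.

    Returns (old_ci, new_ci); new_ci is None when no split was needed.
    """
    if len(ci) < capacity:
        insort(ci, key)  # enough room: everyone moves up
        return ci, None
    # No room: move the upper half of the records to a new CI.
    half = len(ci) // 2
    old_ci, new_ci = ci[:half], ci[half:]
    # The new record goes into whichever CI now covers its key range.
    target = old_ci if key < new_ci[0] else new_ci
    insort(target, key)
    return old_ci, new_ci


def sequence_set_entries(cis):
    """Rebuild (highest key, CI position) entries after a split."""
    return [(ci[-1], n) for n, ci in enumerate(cis)]


old, new = insert_with_split([10, 20, 30, 40], 25, capacity=4)
print(old, new, sequence_set_entries([old, new]))
```

After the split both CIs are roughly half empty, which is why a split leaves room for further inserts in the same key range.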

Failure in CI split
- System failure
  - Corrected next time CI is updated
- No free CI in the CA
  - CA split is needed
- Remember – 1 physical IO = 30,000 to 40,000 CPU instructions…

CA Split
MANY physical reads and writes
- Set “Split in progress”, write sequence set CI
- Maybe get new extent
- Format new CA at HURBA position
- Read / write half of CIs to new CA
- Write new sequence set CI for new CA
- Update higher level index CIs
- Erase moved CIs from old CA, write empty CIs
- Write updated original sequence set CI

Recommendations
- Don’t worry about CI splits
- Avoid excessive CA splits by defining CA freespace
- Don’t do a reorg just because you have done n CI / CA splits

To reorg or not to reorg?
- “We’ve done 1000 CA splits – better reorg!”
- Inserts tend to be clustered
- CI / CA split creates freespace where it is needed, allows faster inserts
- Reorg gets rid of freespace, causing more CI / CA splits

My house
- Buy a 3 bedroom house
- Have 2 kids
- Ma-in-law moves in – add a room
- Ma-in-law moves out – demolish room
- Have another kid – add a bedroom
- Oldest kid goes to college – demolish bedroom
- Oldest kid brings home girlfriend……

My KSDS
- Get some space
- Insert records causing CI splits
- REORG!!
- Delete some records, freeing space
- REORG!!!
- Add records, causing CA splits
- REORG!!!!

Recommendations
- Avoid frequent reorgs
- Once a split has occurred, the processing cost has been paid
- Don’t reorg to compress out free space

Reorgs
Understand your application
- 1 “hot spot”
  - Little distributed freespace – let it split
- Many hot spots
  - Little distributed freespace – let it split
- Even distribution – no hot spots
  - Use distributed freespace

Freespace
- 3% of each CI is empty
  - 3% of 2048 = 61 bytes = 0 records (or, at most, 1)
- 5% of CIs in each CA are empty
  - 5% of 315 CIs per CA = 16 CIs

Freespace
- 3% CI freespace where CISZ=2048 and average LRECL=120
- No room in this CI for an average length record
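The point of this slide can be checked with a small calculation: how many average-length records fit in the freespace, and what is the smallest whole percentage that leaves room for at least one? The sketch assumes each inserted record also needs a 3-byte RDF, in line with the variable-length formula earlier, and that freespace rounds down to whole bytes; the function names are hypothetical.

```python
def records_that_fit(ci_size, freespace_pct, avg_lrecl, rdf_bytes=3):
    """How many average-length records fit in the CI freespace."""
    free_bytes = ci_size * freespace_pct // 100
    return free_bytes // (avg_lrecl + rdf_bytes)


def min_freespace_pct(ci_size, avg_lrecl, records=1, rdf_bytes=3):
    """Smallest whole CI freespace % that leaves room for `records`."""
    for pct in range(0, 101):
        if records_that_fit(ci_size, pct, avg_lrecl, rdf_bytes) >= records:
            return pct
    return None  # the records do not fit in a CI of this size


print(records_that_fit(2048, 3, 120))  # the slide's case: CISZ=2048, 3%
print(min_freespace_pct(2048, 120))    # smallest % with room for one record
```

Under these assumptions a percentage that looks reasonable can still reserve less than one record's worth of space, so it is worth checking the freespace in bytes against the average record length.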

Altering freespace
- Initial freespace set via DEFINE, e.g. 10% of CI and 5% of CA
- If inserts are clustered, consider
  - DEFINE with 0% freespace, then
  - Load the “fixed” part of the file, then
  - ALTER freespace to non-zero
  - Load the “variable” part of the file

Freespace ain’t free space
- Freespace is empty, not used
- You still have to pay IBM for it

Strings
- VSAM allows multiple concurrent processing
  - e.g. CICS transactions
  - Browsing
  - Updating
- Placeholders (“strings”) hold file location info

Shared / non-shared resources
- Non-shared resources (NSR)
  - Each string has its own buffers
  - Multiple copies of a CI may be in memory
  - Works well for batch
- Local Shared Resources (LSR)
  - Many strings share a pool of buffers
  - Only 1 copy of a CI in the pool
  - Ideal for online

Recommendations - NSR
Non-shared resources
- Each string must have enough index buffers
  - Bad – 1 buffer (old default)
  - OK – 1 buffer per index level (new default)
  - Good – enough buffers for all high level indexes + 1 more
  - Best – enough buffers to hold entire index

Recommendations - LSR
Local Shared Resource buffers
- Same index buffer needs as NSR (buffers are per pool, not per string)
- Monitor VSAM LSR stats to make sure BUFNI keeps up with index growth
- Monitor data buffers for high hit rates

IO with NSR
- VSAM uses chained IO to read ahead and write behind
  - Better to read many CIs in one IO
- Block big
  - Large CI sizes
  - Be aware that VSAM will split CIs into smaller blocks to save space
  - E.g. 3390 with 32K CI gets written as 2 x 16K blocks, giving 1.5 CIs = 48K/track
- Buffer big
  - ½ to 1 cyl of BUFND to minimize IO

IO with LSR
- VSAM reads 1 CI at a time, even for sequential processing

Monitor your stats
- LISTCAT before and after critical job
  - Data & Index EXCPs – the fewer the better. Index EXCPs should be close to number of index CIs.
- Job Accounting data
  - IO count by device
  - Overall CPU & IO activity
- CICS stats
  - Shows logical / physical IO counts by file
  - LSR pool hits and misses
- VSAM buffer stats – in VSE/ESA examples doc
- LSR is in 31 bit – use LOTS but don’t page

Sharing VSAM datasets
- VSAM can share files among partitions
- And among VSE systems
- BUT
  - TANSTAAFL (Robert Heinlein)
  - Sharing is not a performance option (Dan Janda)
  - It’s your gun and your foot (Steve Huggins)

Sharing VSAM datasets
- Sharing is based on
  - The type of sharing you ask for (SHAREOPTIONS)
  - VSE Lock Table within a single VSE system
  - VSE Lock File when sharing across VSE systems
- VSE sharing mechanism is not compatible with zOS or zVM

Sharing VSAM datasets
- Sharing at OPEN / CLOSE time
  - Entries checked and placed in / removed from lock table
  - If DASD volume is added as shared (ADD cuu,SHR), it is added to lock file
- VSE & VSAM allow concurrent processing while protecting against concurrent updates messing up the file

Sharing VSAM datasets
- Integrity classes – your choice
  - NO INTEGRITY – VSE & VSAM provide no data protection: it’s all up to you. Your data can be messed up.
  - WRITE INTEGRITY – VSE & VSAM protect against concurrent updates
  - READ INTEGRITY – VSE & VSAM make sure your programs always see the latest version of a record
- The price
  - Higher levels & broader scopes of integrity lead to more CPU and IO activity

SHAREOPTIONS
Ready – Fire – Aim
- Set in DEFINE CLUSTER
- Get it wrong & be prepared to suffer
- If a disk drive isn’t shared between VSEs, don’t ADD it with SHR as this causes lock file IO

SHAREOPTIONS & Locking
- SHR(1)
  - 1 output OR many input
  - External lock at OPEN, unlock at CLOSE
- SHR(2)
  - 1 output AND many input
  - External lock at OPEN, unlock at CLOSE
- SHR(3)
  - No checking or locking
  - Prepare for garbage data
- SHR(4)
  - Many output in one VSE & many input OPENs across all VSEs
  - External lock at OPEN, unlock at CLOSE
  - External lock at access, unlock at release
- SHR(4 4)
  - Many output OPENs across all VSEs + many input OPENs
  - Locks same as SHR(4)
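The OPEN rules in this table can be read as a small decision function. The sketch below is a simplified model of the slide's summary only, not of the actual VSE lock table or lock file logic, and it ignores the record-level locking that SHR(4) adds. The function name, the mode strings and the system names are hypothetical.

```python
def can_open(shareoption, mode, system, current_opens):
    """Decide whether a new OPEN would be allowed under the slide's rules.

    shareoption:   "1", "2", "3", "4" or "4 4"
    mode:          "input" or "output"
    system:        name of the VSE system issuing the OPEN
    current_opens: list of (system, mode) tuples already open
    """
    output_owners = [owner for owner, m in current_opens if m == "output"]
    if shareoption == "1":
        # 1 output OR many input
        if mode == "output":
            return not current_opens
        return not output_owners
    if shareoption == "2":
        # 1 output AND many input
        return mode == "input" or not output_owners
    if shareoption == "3":
        return True  # no checking or locking: integrity is up to you
    if shareoption == "4":
        # many output OPENs, but only within one VSE system
        return mode == "input" or all(o == system for o in output_owners)
    if shareoption == "4 4":
        return True  # many output OPENs across all VSE systems
    raise ValueError(f"unknown SHAREOPTIONS value: {shareoption!r}")


print(can_open("2", "output", "VSE1", [("VSE2", "input")]))
print(can_open("4", "output", "VSE2", [("VSE1", "output")]))
```

A model like this is only a reading aid for the table; the IBM documentation remains the authority on what each option actually permits.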

Alternate indexes (AIX)
- An AIX is a VSAM KSDS, acting as a “pointer file” for another file
- Target file (“Base Cluster”) can be
  - KSDS – pointers are KSDS key values
  - ESDS – pointers are Relative Byte Addrs
- Great for multiple or non-unique keys
- BUT processing via an AIX needs IO to both the AIX and to the base cluster

Setting up an AIX
- DEFINE CLUSTER for base cluster
- DEFINE AIX for the alternate index
  - Give base cluster’s name & alternate key
  - Data & Index CI sizes
- DEFINE PATH
  - Allows specifying of NOUPGRADE paths
- BLDINDEX
  - Reads primary & alternate key info from base cluster
  - Sorts into alternate key sequence
  - Loads alternate index
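The three BLDINDEX steps (read the key pairs, sort by alternate key, load the index) can be mimicked in memory. This is only an analogy for a KSDS base cluster: the records are plain dicts, the field names are invented, and the result is a sorted list in which each alternate key points at one or more primary keys, which is how a non-unique alternate key is supported.

```python
from itertools import groupby
from operator import itemgetter


def build_alternate_index(base_records, primary_key, alternate_key):
    """Return [(alternate key, [primary keys])] in alternate key order."""
    # 1. Read primary & alternate key info from the base cluster.
    pairs = [(rec[alternate_key], rec[primary_key]) for rec in base_records]
    # 2. Sort into alternate key sequence.
    pairs.sort()
    # 3. Load the index: one entry per alternate key value.
    return [
        (alt, [pk for _, pk in group])
        for alt, group in groupby(pairs, key=itemgetter(0))
    ]


# Hypothetical base cluster keyed on "id", with "city" as alternate key.
base = [
    {"id": 3, "city": "Leeds"},
    {"id": 1, "city": "York"},
    {"id": 2, "city": "Leeds"},
]
aix = build_alternate_index(base, "id", "city")
print(aix)
```

Reading by alternate key then takes two steps, one lookup in the index and one in the base cluster for each primary key found, which mirrors the extra IO the previous slide warns about.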

AIX recommendations
- To process the base cluster in AIX order, it is better to sort it and use the SORTOUT file
- Remember that VSAM accesses the base cluster directly, based on AIX values
- Base cluster will need lots of index buffers for batch processing
- Give base cluster large BUFFERSPACE on DEFINE or ALTER

AIX and CICS
- “SPHERE” – a base cluster and all the AIXs related to it
- Requirements
  - Each sphere must be wholly within one LSR pool
  - Use Dataset Name Sharing
  - In CICS 2.3, add BASE= to FCT entry for
    - Base cluster file entry
    - Each related path file entry
  - This is automatic in CICS TS
- SHR(2) is usually best
- Make sure your CICS and VSAM service is current!

MYTH 1 - RECOVERY is a good option for a dataset
- Oh yeah?
- RECOVERY makes it possible for you to write a recovery routine to restart loading.
- COPY 50,000 record KSDS
  - SPEED = 6 secs, 1512 I/Os
  - RECOVERY = 10 secs, 1925 I/Os
- BUSTED!!!

MYTH 2 – No need to sort before loading KSDS
- Load 100,000 record KSDS with Data-Miner COPY
  - Elapsed = 7:11, CPU = 51”, EXCP = , CIsplit = 2011, CAsplit = 63
- Sort to KSDS with CSI-Sort
  - Elapsed = 0:27, CPU = 6”, EXCP = 4314, CIsplit = 0, CAsplit = 0
- BUSTED!!!

And now the most burning question of the day……
How do you delete an unwanted slide from a Power Point presentation?

Contacting the presenter
You can contact me by at
And, if you want to find me this evening…

You’ll find me here