VSE/VSAM: Under the covers. John Mycroft, Product Development Manager, CSI International (WAVV 2008, May 2008)


1 VSE/VSAM: Under the covers
  John Mycroft, Product Development Manager, CSI International
  WAVV 2008, May 2008
  www.csi-international.com
  johnm@csi-international.com

2 Acknowledgement
  With grateful thanks to Dan Janda, The Swami of VSAM, from whom much of this presentation was stolen
  To CSI for providing me with Data-Miner, CSI-Sort and a machine to create the examples
  To my fellow developers at CSI who put up with my hogging the machine for hours on end

3 Abstract
  Overview of VSAM & its components. We take a look at what a VSAM file really looks like and how to soup up its performance. We also look at some common mistakes and how to avoid them.
  This presentation and its materials are copyrighted and developed by John Mycroft from a presentation originally copyrighted by Dan Janda. Permission is granted for WAVV to reproduce this presentation for distribution to its members at no charge.
  Trademarks: IBM, VSE, VSE/ESA, z/VSE, CICS & DL/I are trademarks or registered trademarks of the IBM Corporation. The Swami of VSAM is a trademark of Dan Janda.

4 VSE/VSAM Overview
  Virtual Storage Access Method
  For disk files
  Sequential: “Entry-Sequenced Data Set” or ESDS
    Begin at the beginning, go on till you get to the end and then stop
  Indexed: “Key-Sequenced Data Set” or KSDS
    Process by key or sequentially or a mixture
  Direct: “Relative Record Data Set” or RRDS (fixed) or VRDS (variable)
    Calculate a record’s location in the file to access it
  Alternate index (AIX): gives an alternative route to a KSDS
    Allows unique & non-unique keys

5 VSE/VSAM Functional areas
  Catalog
    Volume & file information
    Usage statistics
  Disk space management
    Space allocation including secondary allocations
    VSAM and VSAM/SAM files
    System files
    Libraries

6 VSE/VSAM Functional areas
  Integrity
  Performance
    Data transfer size
    Buffering
  Backup / restore
  File sharing between jobs and systems

7 Processing a VSAM file
  Sequentially (ESDS)
    Forward or backward
  Keyed access (KSDS)
    Direct by full or partial (generic) key
    Sequentially, forward or backward
    Skip sequential, forward or backward
  Addressed access (RRDS, VRDS)
    Direct, by record address
    Sequential & skip sequential
  Alternate Index Access
    Same as keyed access
    Also direct access by non-unique key

8 How VSAM stores data
  We’re going to look at
    How VSAM stores records logically on disk
      Performance considerations
    How VSAM physically stores data on disk
      Disk space usage calculations
      Optimizing disk capacity
      Performance considerations
    VSAM jargon
      Control Interval
      Control Area
      CI & CA splits
      Freespace
      RDF, CIDF

9 VSAM Jargon
  Control Interval (CI)
    “Smallest unit of data transfer between main & disk storage”
    In other words, when you read a record, VSAM reads the whole CI that contains that record
    Think of it as the same as a block of records in a sequential file if you like (though it’s laid out differently)
  A CI can initially contain 1 or more records
    More can be inserted
    Some or all can be deleted
  When you try to add a new record to a CI with no room, a “CI split” takes place (more about that later)

10 Layout of a control interval
  ALL VSAM FILES ARE VARIABLE LENGTH
    Even if all the records are the same size
  Rec 1 - Rec n    1 to n logical records of any length
  Freespace        Unused space in CI for inserting records or making existing records longer
  RDFs             3-byte Record Descriptor Fields
    ESDS/KSDS      1 per LRECL, 1 for all consecutive records of same length
    RRDS           One per numbered record slot
  CIDF             4-byte Control Interval Descriptor Field
  CI layout: | Rec 1 | Rec 2 | Rec 3 | Rec … | Freespace | RDFs | CIDF |
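The control fields in the layout above can be tallied with a short Python sketch. It assumes (this is a reading of the slide, not something it states outright) that a single record costs one RDF, while a run of two or more consecutive records of the same length costs two RDFs: one for the length and one for the count. That reading is consistent with the 10-byte and 7-byte overheads used in the allocation calculations on slide 16. The function name is hypothetical.

```python
from itertools import groupby

RDF_LEN = 3   # bytes per Record Descriptor Field (from the slide)
CIDF_LEN = 4  # bytes per Control Interval Descriptor Field (from the slide)

def control_info_bytes(record_lengths):
    """Bytes of control information at the end of a data CI.

    Assumption: a lone record needs 1 RDF; a run of 2+ consecutive
    records of equal length needs 2 RDFs (length + count).
    """
    rdfs = 0
    for _, run in groupby(record_lengths):
        run_size = sum(1 for _ in run)
        rdfs += 1 if run_size == 1 else 2
    return rdfs * RDF_LEN + CIDF_LEN

print(control_info_bytes([100] * 30))     # 30 equal-length records -> 10
print(control_info_bytes([80, 120, 95]))  # 3 records, all different -> 13
```

Under this model a CI full of same-length records always carries 10 bytes of control information, however many records it holds.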

11 Control Area (CA)
  A CA is a group of CIs. In a KSDS, all the data CIs in a CA are indexed by one index CI
  CA size is the smallest of:
    One cylinder
    The size of the primary allocation
    The size of the secondary allocation
  The number of CIs per CA depends on the device and the CI and CA sizes
  It is generally a good idea to go for the biggest CA possible
  [Diagram: a CA shown as a grid of data CIs, CI 0 to CI 29]

12 Index Control Interval (Index CI)
  A CI in an index containing pointers to
    The next level in the index, or
    The data CIs in the CA (this is referred to as a Sequence Set CI)
  [Diagram: an index CI pointing at the data CIs of a CA, CI 0 to CI 29]

13 Index and data structure
  Balanced tree
  Sparse index
  Always just 1 high-level index CI
  There can be 0 to many intermediate-level index CIs
  There can be one or more low-level (sequence set) index CIs
    If there is only 1 sequence set CI, it is also the high-level index CI

14 And now the bit you’ve all been waiting for…

15 Performance rules of thumb
  Use largest data CI possible, especially for sequential work
  Use as small an index CI as you can (but not too small!)
  Use large data CA: allocate primary and secondary as at least 1 cylinder
  Avoid too many extents / allocations

16 Allocation calculations
  CI freespace = CI size * Freespace %
  Number of records per CI
    “Fixed” length: (CI size - 10 - Freespace) / LRECL
    Variable length: (CI size - 7 - Freespace) / (Average LRECL + 3)
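The two formulas above translate directly into a few lines of Python. This is a minimal sketch: the function name is hypothetical, and truncating to whole bytes and whole records is an assumption (it matches the "3% of 2048 = 61 bytes" figure on slide 37).

```python
def records_per_ci(ci_size, freespace_pct, lrecl, fixed=True):
    """Estimate how many records are loaded into one data CI.

    ci_size       -- CI size in bytes
    freespace_pct -- CI freespace percentage from DEFINE (0-100)
    lrecl         -- record length, or average length when fixed=False
    """
    # Bytes set aside as CI freespace (truncated to whole bytes: assumption).
    freespace = ci_size * freespace_pct // 100
    if fixed:
        # 10 bytes of control information for same-length records.
        usable = ci_size - 10 - freespace
        return max(usable // lrecl, 0)
    # 7 bytes of control information plus 3 bytes per record.
    usable = ci_size - 7 - freespace
    return max(usable // (lrecl + 3), 0)

print(records_per_ci(4096, 10, 100))               # -> 36
print(records_per_ci(4096, 10, 100, fixed=False))  # -> 35
```

With a 4K CI, 10% freespace and 100-byte records, variable-length loading loses one record per CI to the extra per-record descriptors.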

17 What’s in a CI?
  Data and control info (end of CI)

18 CI control information
  At the end of each data CI

19 Data records

20 The CIDF
  Note (back 2 slides): the free space has data in it from an earlier CI split

21 The Index

22 Allocation calculations
  Calculate freespace in each CA
    Get number of CIs per CA from LISTCAT or device characteristics (3390: 12 x 4K CIs/track, 180/cyl)
    CA freespace = No. of CIs per CA * CA Freespace %, rounded up
    Number of CIs loaded per CA = CIs per CA - CA freespace
    Number of records loaded per CA = Loaded CIs in CA * No. of recs in CI
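The CA-level arithmetic above can be sketched in Python as follows. The function name is hypothetical; the 180 CIs per cylinder and the round-up rule come from the slide, while the 36 records per CI is just an illustrative input.

```python
def ca_load(cis_per_ca, ca_freespace_pct, recs_per_ci):
    """Return (free CIs, loaded CIs, records loaded) for one CA."""
    # CA freespace in whole CIs, rounded up (integer ceiling division).
    free_cis = -(-cis_per_ca * ca_freespace_pct // 100)
    loaded_cis = cis_per_ca - free_cis
    return free_cis, loaded_cis, loaded_cis * recs_per_ci

# 3390, 4K CIs, one-cylinder CA (180 CIs), 5% CA freespace, 36 records per CI.
print(ca_load(180, 5, 36))  # -> (9, 171, 6156)
```

The same function reproduces the "5% of 315 CIs per CA = 16 CIs" figure quoted on slide 37, since 15.75 rounds up to 16.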

23 VSAM Catalogs
  Exactly one master catalog
    Assigned at IPL with DEF CAT or DEFINE MCAT IDCAMS command
  User catalogs: 0 to many
    No more than 1 per volume
    Catalog can own multiple spaces on a volume
    Many catalogs can own space on a volume

24 VSAM Catalogs
  Catalog contains:
    Self-describing records
    User catalog pointers
    Volume definitions
    Space definitions
    Cluster (file) definitions
    Component (data, index) definitions
    AIX & Path definitions

25 Catalog recommendations
  Use naming conventions
    Name Cluster, Data and Index components explicitly
    Use partition / system independent names where applicable
  Separate
    Files seldom defined or deleted
    Files often defined or deleted
    Online critical files
    Batch files
  Multiple baskets: all the eggs won’t get broken

26 More recommendations
  Don’t use recoverable catalogs
    Hangover from 2314 / 3330
  Backup is vastly better
    IDCAMS, Faver, Maxback, Dr D, user-written …

27 CI & CA splits and freespace
  You try to insert a record in a CI or extend a record already there
  If there is enough free space in the CI, everyone moves up, the record is inserted and the CI is rewritten
  BUT what if there isn’t enough free space?

28 CI & CA splits
  CI split: 4 physical IOs
    Set “Split in progress”, write CI
    Move half of records to new CI & write it
    Update sequence set, write index CI
    Erase moved records from old CI, turn off “Split in progress”, write old CI
  BUT…
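A toy Python model of the CI split steps above may help. It is a sketch under simplifying assumptions, not VSAM's actual algorithm: a CI is just a sorted list of keys, capacity is counted in records rather than bytes, and the I/O counts (1 rewrite for a simple insert, 4 writes for a split) are taken from the slides. The function name is hypothetical.

```python
import bisect

def insert_with_split(ci, key, capacity):
    """Insert key into a CI modelled as a sorted list of keys.

    Returns (old_ci, new_ci_or_None, physical_ios).
    """
    if len(ci) < capacity:
        # Enough free space: records move up, the CI is rewritten once.
        bisect.insort(ci, key)
        return ci, None, 1
    # CI split: the upper half of the records moves to a free CI.
    mid = len(ci) // 2
    old_ci, new_ci = ci[:mid], ci[mid:]
    # The new record goes into whichever half covers its key.
    bisect.insort(new_ci if key >= new_ci[0] else old_ci, key)
    # Writes: flagged old CI, new CI, sequence set CI, cleaned-up old CI.
    return old_ci, new_ci, 4

print(insert_with_split([10, 20], 15, capacity=4))
# -> ([10, 15, 20], None, 1)
print(insert_with_split([10, 20, 30, 40], 25, capacity=4))
# -> ([10, 20, 25], [30, 40], 4)
```

After the split both CIs have room again, which is why later inserts near the same key are cheap.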

29 Failure in CI split
  System failure
    Corrected next time CI is updated
  No free CI in the CA
    CA split is needed
  Remember: 1 physical IO = 30,000 to 40,000 CPU instructions…

30 CA Split
  MANY physical reads and writes
    Set “Split in progress”, write sequence set CI
    Maybe get new extent
    Format new CA at HURBA position
    Read / write half of CIs to new CA
    Write new sequence set CI for new CA
    Update higher level index CIs
    Erase moved CIs from old CA, write empty CIs
    Write updated original sequence set CI

31 Recommendations
  Don’t worry about CI splits
  Avoid excessive CA splits by defining CA freespace
  Don’t do a reorg just because you have done n CI / CA splits

32 To reorg or not to reorg?
  “We’ve done 1000 CA splits: better reorg!”
  Inserts tend to be clustered
  A CI / CA split creates freespace where it is needed and allows faster inserts
  A reorg gets rid of that freespace, causing more CI / CA splits

33 My house
  Buy a 3 bedroom house
  Have 2 kids
  Ma-in-law moves in: add a room
  Ma-in-law moves out: demolish room
  Have another kid: add a bedroom
  Oldest kid goes to college: demolish bedroom
  Oldest kid brings home girlfriend…

34 My KSDS
  Get some space
  Insert records, causing CI splits
    REORG!!
  Delete some records, freeing space
    REORG!!!
  Add records, causing CA splits
    REORG!!!!

35 Recommendations
  Avoid frequent reorgs
  Once a split has occurred, the processing cost has been paid
  Don’t reorg to compress out free space

36 Reorgs
  Understand your application
  1 “hot spot”
    Little distributed freespace: let it split
  Many hot spots
    Little distributed freespace: let it split
  Even distribution, no hot spots
    Use distributed freespace

37 Freespace
  3% of each CI is empty
  5% of CIs in each CA are empty
  3% of 2048 = 61 bytes = 0 records (or, at most, 1)
  5% of 315 CIs per CA = 16 CIs

38 Freespace
  3% CI freespace where CISZ=2048 and average LRECL=120
  No room in this CI for an average length record
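The point of this example is easy to check numerically. The Python sketch below uses hypothetical function names, truncates to whole bytes (which matches the 61-byte figure on the previous slide), and ignores the extra descriptor bytes a variable-length insert would need; all of those are assumptions.

```python
def freespace_fit(ci_size, freespace_pct, avg_lrecl):
    """Return (free bytes, average-length records that fit in them)."""
    free_bytes = ci_size * freespace_pct // 100
    return free_bytes, free_bytes // avg_lrecl

def min_freespace_pct(ci_size, avg_lrecl, inserts=1):
    """Smallest whole CI freespace % that leaves room for `inserts` records."""
    needed = avg_lrecl * inserts
    return -(-needed * 100 // ci_size)  # integer ceiling division

print(freespace_fit(2048, 3, 120))   # -> (61, 0)
print(min_freespace_pct(2048, 120))  # -> 6
```

So with CISZ=2048 and 120-byte records, anything under about 6% CI freespace reserves space that cannot hold even one inserted record.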

39 Altering freespace
  Initial freespace set via DEFINE
    e.g. 10% of CI and 5% of CA
  If inserts are clustered, consider
    DEFINE with 0% freespace, then
    Load the “fixed” part of the file, then
    ALTER freespace to non-zero
    Load the “variable” part of the file

40 Freespace ain’t free space
  Freespace is empty, not used
  You still have to pay IBM for it

41 Strings
  VSAM allows multiple concurrent processing
    e.g. CICS transactions
    Browsing
    Updating
  Placeholders (“strings”) hold file location info

42 Shared / non-shared resources
  Non-shared resources (NSR)
    Each string has its own buffers
    Multiple copies of a CI may be in memory
    Works well for batch
  Local Shared Resources (LSR)
    Many strings share a pool of buffers
    Only 1 copy of a CI in the pool
    Ideal for online

43 Recommendations: NSR
  Non-shared resources
  Each string must have enough index buffers
    Bad: 1 buffer (old default)
    OK: 1 buffer per index level (new default)
    Good: enough buffers for all high level indexes + 1 more
    Best: enough buffers to hold entire index

44 Recommendations: LSR
  Local Shared Resource buffers
  Same index buffer needs as NSR (buffers are per pool, not per string)
  Monitor VSAM LSR stats to make sure BUFNI keeps up with index growth
  Monitor data buffers for high hit rates

45 IO with NSR
  VSAM uses chained IO to read ahead and write behind
    Better to read many CIs in one IO
  Block big
    Large CI sizes
    Be aware that VSAM will split CIs into smaller blocks to save space
      E.g. 3390 with 32K CI gets written as 2 x 16K blocks, giving 1.5 CIs = 48K/track
  Buffer big
    ½ to 1 cyl of BUFND to minimize IO
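The 32K example above is simple track arithmetic, sketched here in Python. The function name is hypothetical; the blocks-per-track figures (3 x 16K and 12 x 4K on a 3390) are taken from the slides rather than computed from device geometry.

```python
from fractions import Fraction

K = 1024

def cis_per_track(ci_size, block_size, blocks_per_track):
    """CIs per track when each CI is written as ci_size/block_size blocks."""
    if ci_size % block_size:
        raise ValueError("CI size must be a multiple of the block size")
    blocks_per_ci = ci_size // block_size
    return Fraction(blocks_per_track, blocks_per_ci)

# 32K CI written as 2 x 16K blocks, 3 such blocks per 3390 track.
print(cis_per_track(32 * K, 16 * K, 3))  # -> 3/2
# 4K CI written as one 4K block, 12 per track.
print(cis_per_track(4 * K, 4 * K, 12))   # -> 12
```

Both cases put 48K of data on a track; the difference is how many CIs, and therefore how many records, each chained IO brings in.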

46 IO with LSR
  VSAM reads 1 CI at a time, even for sequential processing

47 Monitor your stats
  LISTCAT before and after critical job
    Data & Index EXCPs: the fewer the better
    Index EXCPs should be close to number of index CIs
  Job Accounting data
    IO count by device
    Overall CPU & IO activity
  CICS stats
    Shows logical / physical IO counts by file
    LSR pool hits and misses
  VSAM buffer stats: in VSE/ESA examples doc
  LSR is in 31 bit: use LOTS but don’t page

48 Sharing VSAM datasets
  VSAM can share files among partitions
  And among VSE systems
  BUT
    TANSTAAFL (Robert Heinlein)
    Sharing is not a performance option (Dan Janda)
    It’s your gun and your foot (Steve Huggins)

49 Sharing VSAM datasets
  Sharing is based on
    The type of sharing you ask for (SHAREOPTIONS)
    VSE Lock Table within a single VSE system
    VSE Lock File when sharing across VSE systems
  VSE sharing mechanism is not compatible with z/OS or z/VM

50 Sharing VSAM datasets
  Sharing at OPEN / CLOSE time
    Entries checked and placed in / removed from lock table
    If DASD volume is added as shared (ADD cuu,SHR), it is added to lock file
  VSE & VSAM allow concurrent processing while protecting against concurrent updates messing up the file

51 Sharing VSAM datasets
  Integrity classes: your choice
    NO INTEGRITY: VSE & VSAM provide no data protection. It’s all up to you. Your data can be messed up.
    WRITE INTEGRITY: VSE & VSAM protect against concurrent updates
    READ INTEGRITY: VSE & VSAM make sure your programs always see the latest version of a record
  The price
    Higher levels & broader scopes of integrity lead to more CPU and IO activity

52 SHAREOPTIONS
  Ready – Fire – Aim
  Set in DEFINE CLUSTER
  Get it wrong & be prepared to suffer
  If a disk drive isn’t shared between VSEs, don’t ADD it with SHR as this causes lock file IO

53 SHAREOPTIONS & Locking
  SHR(1)
    1 output OR many input
    External lock at OPEN, unlock at CLOSE
  SHR(2)
    1 output AND many input
    External lock at OPEN, unlock at CLOSE
  SHR(3)
    No checking or locking
    Prepare for garbage data
  SHR(4)
    Many output in one VSE & many input OPENs across all VSEs
    External lock at OPEN, unlock at CLOSE
    External lock at access, unlock at release
  SHR(4 4)
    Many output OPENs across all VSEs + many input OPENs
    Locks same as SHR(4)
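The OPEN rules in this table can be written down as a small Python decision function. This is a simplified model built only from the slide wording, not VSAM's actual lock-table logic: it covers a single VSE system, ignores the cross-system differences between SHR(4) and SHR(4 4), and uses hypothetical names throughout.

```python
def open_allowed(shareoption, mode, open_inputs, open_outputs):
    """Would a new OPEN be accepted, given the OPENs already in place?

    shareoption  -- 1, 2, 3 or 4
    mode         -- "input" or "output"
    open_inputs  -- number of existing input OPENs
    open_outputs -- number of existing output OPENs
    """
    if mode not in ("input", "output"):
        raise ValueError("mode must be 'input' or 'output'")
    if shareoption == 1:
        # 1 output OR many input: output needs the file to itself.
        if mode == "output":
            return open_inputs == 0 and open_outputs == 0
        return open_outputs == 0
    if shareoption == 2:
        # 1 output AND many input: only a second writer is refused.
        return mode == "input" or open_outputs == 0
    if shareoption in (3, 4):
        # SHR(3): no checking at all. SHR(4): many writers are allowed,
        # with locking done at record access time instead.
        return True
    raise ValueError("unknown SHAREOPTIONS value")

print(open_allowed(1, "output", open_inputs=2, open_outputs=0))  # -> False
print(open_allowed(2, "input", open_inputs=5, open_outputs=1))   # -> True
print(open_allowed(2, "output", open_inputs=5, open_outputs=1))  # -> False
```

Note that SHR(3) and SHR(4) both accept every OPEN in this model; the difference between them is whether anything protects the data afterwards.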

54 Alternate indexes (AIX)
  An AIX is a VSAM KSDS, acting as a “pointer file” for another file
  Target file (“Base Cluster”) can be
    KSDS: pointers are KSDS key values
    ESDS: pointers are Relative Byte Addresses
  Great for multiple or non-unique keys
  BUT
    Processing via an AIX needs IO to both the AIX and the base cluster

55 Setting up an AIX
  DEFINE CLUSTER for base cluster
  DEFINE AIX for the alternate index
    Give base cluster’s name & alternate key
    Data & Index CI sizes
  DEFINE PATH
    Allows specifying of NOUPGRADE paths
  BLDINDEX
    Reads primary & alternate key info from base cluster
    Sorts into alternate key sequence
    Loads alternate index

56 AIX recommendations
  To process the base cluster in AIX order, it is better to sort it and use the SORTOUT file
  Remember VSAM processes base clusters directly based on AIX values
    Base cluster will need lots of index buffers for batch processing
    Give base cluster large BUFFERSPACE on DEFINE or ALTER

57 AIX and CICS
  “SPHERE”: a base cluster and all the AIXs related to it
  Requirements
    Each sphere must be wholly within one LSR pool
    Use Dataset Name Sharing
      In CICS 2.3, add BASE= to the FCT entry for
        The base cluster file entry
        Each related path file entry
      This is automatic in CICS TS
    SHR(2) is usually best
  Make sure your CICS and VSAM service is current!

58 MYTH 1: RECOVERY is a good option for a dataset
  Oh yeah?
  RECOVERY makes it possible for you to write a recovery routine to restart loading
  COPY of a 50,000-record KSDS
    SPEED = 6 secs, 1512 I/Os
    RECOVERY = 10 secs, 1925 I/Os
  BUSTED!!!

59 MYTH 2: No need to sort before loading a KSDS
  Load 100,000-record KSDS with Data-Miner COPY
    Elapsed = 7:11, CPU = 51 secs, EXCP = 294412, CI splits = 2011, CA splits = 63
  Sort to KSDS with CSI-Sort
    Elapsed = 0:27, CPU = 6 secs, EXCP = 4314, CI splits = 0, CA splits = 0
  BUSTED!!!

60 And now the most burning question of the day…
  How do you delete an unwanted slide from a PowerPoint presentation?

61 Contacting the presenter
  You can contact me by email at johnm@csi-international.com
  And, if you want to find me this evening…

62 You’ll find me here

