How to be a Great z/VM System Programmer

Presentation on theme: "How to be a Great z/VM System Programmer"— Presentation transcript:

1 How to be a Great z/VM System Programmer
June 26, 2015 Version 1.1 Tim Reynolds, z/VM CP Level 2 Support P4
Version 1.2 – Tim Reynolds for VM Workshop – INTRODUCE YOURSELF AND EXPLAIN YOUR ROLE, WELCOME QUESTIONS, ETC. Also, “DON’T BE *THAT* SYSTEM PROGRAMMER!”
Version 1.1 Changes – Clean up deck with comments from Tracy Dean and others after Edge in Las Vegas and Tech Conf in Dublin. More descriptions in comments and links.
Version 1.0 – New for Edge in Las Vegas and Tech Conf in Dublin -- S. Wilkins & D. Griffith

2 Trademarks
The following are trademarks of the International Business Machines Corporation in the United States and/or other countries.
BladeCenter* DB2* DS6000* DS8000* ECKD FICON* GDPS* HiperSockets HyperSwap IBM z13* OMEGAMON* Performance Toolkit for VM Power* PowerVM PR/SM RACF* Storwize* System Storage* System x* System z* System z9* System z10* Tivoli* zEnterprise* z/OS* zSecure z/VM* z Systems*
* Registered trademarks of IBM Corporation
The following are trademarks or registered trademarks of other companies.
Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries.
Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefrom.
Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
IT Infrastructure Library is a registered trademark of the Central Computer and Telecommunications Agency which is now part of the Office of Government Commerce.
ITIL is a registered trademark, and a registered community trademark of the Office of Government Commerce, and is registered in the U.S. Patent and Trademark Office.
Java and all Java based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates.
Linear Tape-Open, LTO, the LTO Logo, Ultrium, and the Ultrium logo are trademarks of HP, IBM Corp. and Quantum in the U.S. and other countries.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
OpenStack is a trademark of OpenStack LLC.
The OpenStack trademark policy is available on the OpenStack website. TEALEAF is a registered trademark of Tealeaf, an IBM Company. Windows Server and the Windows logo are trademarks of the Microsoft group of countries. Worklight is a trademark or registered trademark of Worklight, an IBM Company. UNIX is a registered trademark of The Open Group in the United States and other countries. * Other product and service names might be trademarks of IBM or other companies. Notes: Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput improvements equivalent to the performance ratios stated here. IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply. All customer examples cited or described in this presentation are presented as illustrations of the manner in which some customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics will vary depending on individual customer configurations and conditions. This publication was produced in the United States. IBM may not offer the products, services or features discussed in this document in other countries, and the information may be subject to change without notice. Consult your local IBM business contact for information on the product or services available in your area. All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. 
Information about non-IBM products is obtained from the manufacturers of those products or their published announcements. IBM has not tested those products and cannot confirm the performance, compatibility, or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. Prices subject to change without notice. Contact your IBM representative or Business Partner for the most current pricing in your geography. This information provides only general descriptions of the types and portions of workloads that are eligible for execution on Specialty Engines (e.g., zIIPs, zAAPs, and IFLs) ("SEs"). IBM authorizes customers to use IBM SE only to execute the processing of Eligible Workloads of specific Programs expressly authorized by IBM as specified in the “Authorized Use Table for IBM Machines” provided at (“AUT”). No other workload processing is authorized for execution on an SE. IBM offers SE at a lower price than General Processors/Central Processors because customers are authorized to use SEs only to process certain types and/or amounts of workloads as specified by IBM in the AUT.

4 Disclaimer The information contained in this document has not been submitted to any formal IBM test and is distributed on an "AS IS" basis without any warranty either express or implied. The use of this information or the implementation of any of these techniques is a customer responsibility and depends on the customer's ability to evaluate and integrate them into the operational environment. While each item may have been reviewed by IBM for accuracy in a specific situation, there is no guarantee that the same or similar results will be obtained elsewhere. Customers attempting to adapt these techniques to their own environments do so at their own risk. In this document, any references made to an IBM licensed program are not intended to state or imply that only IBM's licensed program may be used; any functionally equivalent program may be used instead. Any performance data contained in this document was determined in a controlled environment and, therefore, the results which may be obtained in other operating environments may vary significantly. Users of this document should verify the applicable data for their specific environments. It is possible that this material may contain reference to, or information about, IBM products (machines and programs), programming, or services that are not announced in your country. Such references or information must not be construed to mean that IBM intends to announce such IBM products, programming or services in your country. 20 September 2018

5 Agenda
Why – A self-serving explanation from the Development point of view
  And it is good for you too
Being a systems programmer is sort of like many things
“Top 10” Things You Need to Know and Do
  Initial System Setup or Planning for Changes
  Everyday Items
  Occasional Items
  Prepared Responses to Catastrophic Events
What is next
** The URLs are not referenced in each section of this presentation. They are all together in a reference page at the end.

6 “Top 10” things you need to know and do A self serving idea that will help us all
Some history of how we got to this presentation
Smart people thrust into uncomfortable waters
Some situations we have seen
  Unknown outages (outages without any documentation or known cause)
  Configuration
  Performance
The real start of this was a number of customer situations where bad things have happened. This is an attempt to get some easy items documented with real solutions. Dan’s personal goal is for customers to be able to spend a “reasonable” amount of time on a regular basis doing care and feeding of their z/VM systems. “Reasonable” means different things to different customers, and that is the reason for this format.

7 “Top 10” things you need to know and do Good Better Best
Good – You have more than thought about the topic, but the response is mostly manual and ad hoc.
Better – You have a plan and are taking the steps necessary to carry it out.
Best – You have a well thought out methodology. Many of the responses are automated and documented so they are repeatable. You learn from the past.

8 “Top 10” things you need to know and do Like your car
The day you bring it home
Daily needs
Weekly needs
Occasional needs

9 “Top 10” things you need to know and do Like your pets
The day you bring it home
Daily needs
Weekly needs
Occasional needs

10 “Top 10” things you need to know and do Like your Personal Relationships
The day you bring “it” home
Daily needs
Weekly needs
Occasional needs

11 “Top 10” things you need to know and do What are the “Top 10”?
Security policy
Configuration – Storage, Network, etc.
Backups
Paging and Spooling space
Performance
Virtual Machine management
Logging
Social factors
Maintenance (applying service)
Handling outages – Disaster Recovery, abends

12 “Top 10” things you need to know and do
Initial System Setup or Planning for Changes
Everyday Items
Occasional Items
Prepared Responses to Catastrophic Events

13 “Top 10” things you need to know and do Change Installation Passwords
z/VM and many products are delivered with common default passwords
  These absolutely need to be changed
  The same applies to passwords for other installed products
Before moving your system into production, you should ensure all passwords conform to your corporate security policies.
  It can be embarrassing or worse
Other Material
  CP Planning and Administration manual **
FIELD EXAMPLE – During a hot problem the customer didn’t know the password; the IBM rep suggested a common password, which worked.

14 “Top 10” things you need to know and do System Security Policy
System security is very broad and means different things to different enterprises
  Passwords, Rules, Access Control, Granularity …
External Security Manager (e.g. RACF/VM)
  NOTE: Adding an ESM to an existing SSI cluster is difficult. It is possible to do this after implementing SSI, but it is better to do it before implementing SSI.
  ESMs provide password encryption
Common Criteria Certification by z/VM
  A fully defined system
  It may be too much for you but it gives good ideas
Other Material
  z/VM: Secure Configuration Guide Manual **
  VM V6.3 Achieves Common Criteria Certification
Reference - Brian

15 “Top 10” things you need to know and do Storage Configuration (FICON DASD and FCP SCSI)
Have a plan or work toward a plan for your storage configuration
  Current needs and growth
  Types of Storage
Storage Allocation and Maintenance
  Allocation (Standardization on Size and Device numbers) across LPARs
  Settings and Error reporting
  Duplicate VOLSER issues
  Cylinder zero is special sometimes (1-END Minidisk to protect the VOLSER and allocation)
Advanced Configurations (e.g. GDPS)
Deflect questions to more knowledgeable people – Steve W. Also, it would be good to eventually have a Good/Better/Best chart after this chart.

16 “Top 10” things you need to know and do Storage Configuration (FICON DASD and FCP SCSI)
Other Material
  CP Planning and Administration Manual **
  EREP in System Operations Manual **
  GDPS References and description page –

17 “Top 10” things you need to know and do A Planned Network Configuration
Your physical and logical network for z/VM is key to nearly everything
  Server and Application Connectivity
  Transaction time and Perceptions
  Robustness – Built-in failover
VSwitch and VSwitch Link aggregation is preferred
  Lower CPU costs
  Operates in Ethernet (Layer 2) or IP (Layer 3) modes
  Supports port isolation
  Supports link aggregation
Involve your network team!! This is really a must
Other Material
  z/VM Connectivity Manual **
  z/VM: Getting Started with Linux on System z **
  Linux on System z Tuning Hints and Tips for Networking –
Deflect questions to more knowledgeable people - Alan

18 “Top 10” things you need to know and do BACKUP of z/VM and Server data
Sometime and for some reason you will need to restore data on your system. Plan on this from the beginning
  Storage Failures
  Application failures
  ...
How
  Backups of key data – File level backup (including to ) or Device level
  Don’t back up unnecessary things (paging volumes, redundant SSI data, etc.)
  Being able to rebuild data
  Where you back up the data to is your choice
  Duplicate copies of data (Flash Copy or otherwise)
  Consistent USABLE data (with I/O quiesced)
  TEST YOUR BACKUPS!!!!!
CP DATA
  SPXTAPE for Spool and System Data files
  DDR for CP Volumes (allocation maps etc.)

19 “Top 10” things you need to know and do BACKUP of z/VM and Server data
Other Material
  z/VM: Getting Started with Linux on System z **
  Backup and Restore Manager for z/VM **
  Tape Manager for z/VM **
  SPXTAPE and DDR in the CP Commands Manual **
  DFSMS/VM publications in the VM Library for Tape Handling **
  Tivoli Storage Manager (now: IBM Spectrum Protect) **

20 “Top 10” things you need to know and do
Initial System Setup or Planning for Changes
Everyday Items
Occasional Items
Prepared Responses to Catastrophic Events

21 “Top 10” things you need to know and do Manage Paging space on the system
Paging space is user and system pages on External Storage
  These pages can be a copy of what is in storage or the only copy
  Usage has changed in z/VM 6.3.0
Planning your paging allocation is important
  Paging space is not optional
  Running out of paging will cause a System Outage (PGT004 abend)
  Messages are issued by CP at 90% and 100% of paging space in use, and also at 90% of spooling space in use, which CP uses for paging as a last effort
  Often the messages come too late for most systems
  Monitoring over time will give you a good indication
Commands and Tooling to watch and monitor PAGE
  QUERY ALLOC PAGE
PGT004 ABENDs.. Nothing more to be said here…..

22 “Top 10” things you need to know and do Manage Paging space on the system
Good – Periodic QUERY ALLOC PAGE to see where the system is regarding PAGE usage
Better – Queries but also maintaining the history of usage so you can see trends, and EXECs to monitor specifics
  OMEGAMON can track history of Page Space Utilization
Best – An automated solution like Operations Manager that will both visually provide the state and notify you if some threshold has been exceeded.
  Catch the problem as it is changing in real time.
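
To make the "history and trends" idea concrete, here is a minimal sketch in Python. It assumes you already record the percent-used figure from periodic QUERY ALLOC PAGE output by some means of your own; the function names, the sample history, and the 70% early-warning level are hypothetical choices for illustration. Only the 90% level, where CP itself starts issuing messages, comes from the charts.

```python
from datetime import datetime, timedelta

CP_WARNING_PCT = 90.0     # CP issues its own messages at 90% (from the charts)
EARLY_WARNING_PCT = 70.0  # hypothetical site threshold; choose your own


def classify(pct_used):
    """Return a status string for one paging-space sample."""
    if pct_used >= CP_WARNING_PCT:
        return "CRITICAL"
    if pct_used >= EARLY_WARNING_PCT:
        return "WARNING"
    return "OK"


def days_until(history, target_pct):
    """Project linearly when usage reaches target_pct.

    history is a list of (datetime, percent_used) samples, oldest first.
    Returns the projected number of days after the last sample, or None
    when usage is flat or falling between the first and last sample.
    """
    (t0, p0), (t1, p1) = history[0], history[-1]
    elapsed_days = (t1 - t0).total_seconds() / 86400
    if elapsed_days <= 0 or p1 <= p0:
        return None
    rate = (p1 - p0) / elapsed_days  # percentage points per day
    return max(0.0, (target_pct - p1) / rate)


# Made-up samples: 40% used on day 0, 50% used ten days later.
start = datetime(2015, 6, 1)
history = [(start, 40.0), (start + timedelta(days=10), 50.0)]
print(classify(history[-1][1]), days_until(history, CP_WARNING_PCT))
```

A straight-line projection is deliberately crude. Its only job is to turn a pile of samples into a number of days you can act on well before CP's own messages appear.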

23 “Top 10” things you need to know and do Manage Paging space on the system
Other Material
  CP Planning and Administration manual – Estimation **
  Operations Manager for z/VM -

24 “Top 10” things you need to know and do Keep track of System & Server Performance
System performance is about a number of things
  Monitoring System Performance
  Monitoring Guest Performance
  Making Improvements and knowing what those changes did
    Best to have performance data before and after changes are made
    Make one (or a very few) changes and measure
  Predict what is next
Things you can do
  Commands (e.g. INDICATE) may help, but overall they do not give a good holistic picture of performance
  Collect Monitor Records – Even if you have a Performance Product you should save the data with MONWRITE
  Use analysis products
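
The "measure before and after" advice boils down to simple arithmetic. The sketch below is an illustration only: the metric (CPU percent), the sample values, and the function name are hypothetical, and real analysis would come from your monitor data and performance product.

```python
from statistics import mean


def percent_change(before, after):
    """Percent change of the average of a metric after a change.

    before and after are lists of samples of the same metric, taken
    over comparable periods (same workload, same time of day).
    """
    baseline = mean(before)
    if baseline == 0:
        raise ValueError("baseline average is zero; percent change is undefined")
    return (mean(after) - baseline) / baseline * 100.0


# Made-up CPU utilization samples (percent) around a single change.
cpu_before = [62.0, 58.0, 60.0]
cpu_after = [51.0, 49.0, 50.0]
print(f"average CPU changed by {percent_change(cpu_before, cpu_after):.1f}%")
```

Changing one thing at a time is what makes this number meaningful. With several changes in the same window there is no way to say which one moved the average.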

25 “Top 10” things you need to know and do Keep track of System & Server Performance
Good – Periodic commands to keep a pulse on the state of the running system. Maintain some history.
Better – Queries and a performance product like Performance Toolkit and Tivoli OMEGAMON
Best – A holistic approach
  Automation via Performance product
  Collection of MONWRITE data at least Periodically and Before / After significant change
  Catch the problem as it is changing in real time.
We could potentially expand this chart with more examples.

26 “Top 10” things you need to know and do Keep track of System & Server Performance
Other Material
  CP Planning and Administration manual – Estimation **
  Tivoli OMEGAMON XE on z/VM and Linux
  Collecting MONWRITE Data: About as Simple as I Can Make It, by Brian Wade
  MONCLEAN on the z/VM Download page **
  Redbook - The Virtualization Cookbook for IBM z/VM 6.3, RHEL 6.4, and SLES 11 SP3 – Section
  z/VM: Performance Toolkit Guide Manual **

27 “Top 10” things you need to know and do Manage Spooling space on the system
Spooling space in a z/VM system is required for normal operation
  Shared Segments - NSS / DCSS – Usually are not a problem
  Console Files – Logs from z/VM service machines
  Files being moved from one user to another and diagnostic data
Spool has a tendency to grow
  Aged unprocessed files
  Occasional Data -- DUMPS, Trace files, etc.
Commands and Tooling to watch and monitor SPOOL
  QUERY ALLOC SPOOL
  CP Messages come really late for most installations
  Page Space Overflow (Last effort paging space)
  You need to have a System Operator for notifications

28 “Top 10” things you need to know and do Manage Spooling space on the system
Good – Periodic QUERY ALLOC SPOOL to see where the system is regarding SPOOL usage
  Allocate Dedicated DUMP space
  QUERY DUMP
Better – Queries but also maintaining the history of usage so you can see trends
  Run tools like SFPURGER & SPOOLPIG to determine more information
  OMEGAMON will keep spool History
Best – An automated solution like Operations Manager that will both visually provide the state and notify you if some threshold has been exceeded
  Operations Manager can also run SFPURGER on a schedule or when thresholds have been reached
  Catch the problem as it is changing in real time.
FIELD EXAMPLE – RSCS was not working, which prevented FTP jobs to z/OS from running. It turned out that spool space had run out, breaking RSCS.
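
When spool is filling up, the first question is usually who owns the space. The sketch below shows one way to total spool pages per owner in Python. It is not how SPOOLPIG or SFPURGER work internally; the (owner, pages) records, the user IDs, and the function name are hypothetical stand-ins for whatever listing you collect yourself.

```python
from collections import Counter


def top_spool_owners(spool_files, limit=3):
    """Total spool pages per owner and return the largest consumers.

    spool_files is a list of (owner, pages) tuples, one per spool file.
    Returns a list of (owner, total_pages), largest first.
    """
    pages_by_owner = Counter()
    for owner, pages in spool_files:
        pages_by_owner[owner] += pages
    return pages_by_owner.most_common(limit)


# Made-up spool file records for illustration.
files = [
    ("OPERATNS", 5000),
    ("RSCS", 200),
    ("OPERATNS", 3000),
    ("LINUX01", 900),
]
for owner, total in top_spool_owners(files, limit=2):
    print(owner, total)
```

A large total under one owner often points at aged dumps or console files that nobody has processed, which is the growth pattern described on the previous chart.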

29 “Top 10” things you need to know and do Manage Spooling space on the system
Other Material
  CP Planning and Administration manual **
  SFPURGER – CMS Commands and Utilities manual **
  Operations Manager for z/VM -
  SPOOL PIG and others – z/VM Download Page **

30 “Top 10” things you need to know and do Monitor your Virtual Machines
Monitoring Virtual Machines (Servers) is often needed if anyone else is working with the system
Two Types of servers (at least)
  Service Virtual Machines like those shipped with z/VM tend to be relatively stable but even they can change
  Linux Servers (Production or Test) tend to have transparent changes (e.g. Workload, Configuration, z/VM definition, etc.)
  Servers tend to grow over time
  Knowing what is running will allow you to anticipate
Things that you can monitor
  QUERY NAMES – About the simplest
  INDICATE
    Working Set Sizes
    Resident Pages
    CPU Consumed
Remember – some changes may not take effect until the server is restarted, which might not occur for a long time
[Diagram: Linux guests virtualized in LPARs by z/VM + IBM Wave for z/VM, on physical resources (memory, IFLs, I/O and network)]

31 “Top 10” things you need to know and do Monitor your Virtual Machines
Other Material
  CP Planning and Administration manual **
  VIR2REAL – Compute the Virtual to Real storage (memory) ratio of running users in a z/VM LPAR – z/VM Download Page **
  Tivoli OMEGAMON XE on z/VM and Linux **
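
The virtual-to-real ratio mentioned above is, read literally, a quotient: the virtual storage defined to the running users divided by the real storage of the LPAR. The sketch below illustrates that arithmetic in Python. It is not the VIR2REAL tool or its implementation; the guest names, sizes, and function name are hypothetical.

```python
def virtual_to_real_ratio(guest_storage_mb, real_storage_mb):
    """Return total virtual storage of running guests divided by real storage.

    guest_storage_mb maps a user ID to its virtual storage size in MB.
    real_storage_mb is the real storage available to the z/VM LPAR in MB.
    """
    if real_storage_mb <= 0:
        raise ValueError("real storage must be greater than zero")
    return sum(guest_storage_mb.values()) / real_storage_mb


# Made-up guest sizes for illustration.
guests = {"LINUX01": 4096, "LINUX02": 8192, "LINUX03": 4096}
print(f"V:R = {virtual_to_real_ratio(guests, real_storage_mb=8192):.2f}:1")
```

Tracking this figure over time is a simple way to notice the growth described on the previous chart: servers get bigger or more numerous while the real storage stays the same.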

32 “Top 10” things you need to know and do Capture Important Console logs
z/VM servers have valuable information in their console logs. However, you need to capture it.
  Startup data is often captured in logs
  Logs contain a significant amount of mundane information
  Critical error information
  Only guests that are logged on to the system (they could be disconnected) have logs.
  Should get at least a couple of days & initialization (more is better)
Gathering the logs can be done by various methods
  Spooled Console Logs
  Automated logs to Disk
  Automation Products that do much of this for you
TERM TIMESTAMP ON – Adds a time to each entry
Note that “z/VM servers” mostly refers to Service Virtual Machines and Linux Servers.

33 “Top 10” things you need to know and do Capture Important Console logs
Good – Ensure that Spooling of logs is enabled on all servers.
  Spooling – Set up with COMMAND statement in the user’s Directory Entry
  Logging in a profile or server start-up
Better – Monitor Spooling of logs on a periodic basis.
  Close/Purge oldest and open new Console Spool or log, keeping newest.
  EXECs that may use FOR command to remotely do this
Best – An automated solution like Operations Manager will automatically save and manage server machine consoles and logs, and optionally notify you of critical events
  Operations Manager VIEWCON tool allows for real time viewing of events that may also make management easier
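
The "close/purge oldest, keep newest" rule in the Better level is a plain retention policy. The sketch below shows the selection logic only, in Python. The log names, timestamps, and function name are hypothetical, and the actual closing and purging would be done with your own EXECs or automation product.

```python
from datetime import datetime


def split_by_retention(console_logs, keep_newest):
    """Split console logs into (keep, purge) lists.

    console_logs is a list of (log_name, closed_at) tuples; the names and
    timestamps are placeholders for whatever your own tooling records.
    The keep_newest most recent logs are kept, the rest are purge candidates.
    """
    if keep_newest < 0:
        raise ValueError("keep_newest must not be negative")
    newest_first = sorted(console_logs, key=lambda log: log[1], reverse=True)
    return newest_first[:keep_newest], newest_first[keep_newest:]


# Made-up closed console logs for one server.
logs = [
    ("LINUX01 console, day 1", datetime(2015, 6, 22)),
    ("LINUX01 console, day 2", datetime(2015, 6, 23)),
    ("LINUX01 console, day 3", datetime(2015, 6, 24)),
]
keep, purge = split_by_retention(logs, keep_newest=2)
print("keep:", [name for name, _ in keep])
print("purge:", [name for name, _ in purge])
```

Whatever retention count you choose, the earlier chart's advice still applies: keep at least a couple of days plus the initialization log.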

34 “Top 10” things you need to know and do Capture Important Console logs
Other Material
  The basics on gathering a Console log in the z/VM Diagnosis guide at:
  Operations Manager for z/VM **

35 “Top 10” things you need to know and do Mainframe Social -
Be Social – This is not a full time task but it really can help
  Watch what is being done by others
  Contribute your own thoughts and ideas
  Ask Questions
  Walking around – Virtually or Physically
There are a lot of avenues for material
  List Servers, Web groups (IBMVM and LINUX-390 LISTSERVs for questions, advice, lessons learned, answers, banter, etc.)
    Available 24 / 7 / 365
    Relatively low traffic, low spam, little bad advice
    Friendly, helpful, potential for lasting contacts
Other Material
  VM Community

36 “Top 10” things you need to know and do
Initial System Setup
Everyday Items
Occasional Items
Prepared Responses to Catastrophic Events

37 “Top 10” things you need to know and do Changing SYSTEM CONFIG – As safely as possible
SYSTEM CONFIG
  Defines your system configuration
  On PMAINT CF0 in 6.3.0
Develop a process for changes and stick to it. Suggested steps:
  Make a backup copy before changing anything. This backup can be used in an emergency from the SAPL panel.
  Save backup in a place you can access in an emergency
  Have a peer review your changes
  Without fail, run CPSYNTAX !!!
Changes not effective until next IPL (errors may not be discovered for months!)
Reference – Tim G. pitch

38 “Top 10” things you need to know and do Changing SYSTEM CONFIG – As safely as possible
CPSYNTAX
  Utility verifies that the SYSTEM CONFIG file is at least syntactically correct. A system programmer’s best friend!
  Available on the MAINT 193 minidisk
  An easy way to avoid embarrassing mistakes at IPL or worse
  Easy to run – Catches incorrect and unrecognized statements
  Run it even after comment-only changes
Corrupted SYSTEM CONFIG file problems can be very ugly and can take a long time to fix
Reference – Tim G.

39 “Top 10” things you need to know and do Changing SYSTEM CONFIG – As safely as possible
Other Material
  CPSYNTAX Described in the CP Commands Manual **
  CP Planning and Administration manual **

40 “Top 10” things you need to know and do Review Startup Logs for errors
Start-up console logs may reveal errors or problems
  Even if a server or application starts successfully there can be issues.
  Error messages, Warnings & overrides should be reviewed
Critical times for reviewing logs
  New Releases
  Maintenance of server or application
Common Error messages that could be missed
  DASD Problems
    Duplicate VOLID or Offline
  Spool Problems (e.g. NSS/DCSS …)
  CONFIG ERRORs
Other Material
  CP Messages and Codes Manual **
  Other Server manuals

41 “Top 10” things you need to know and do Maintenance is not something that can wait forever
Have a strategy to apply z/VM maintenance at least twice a year
  Maybe when the clocks change for Daylight Saving Time
  RSUs for z/VM are the items that we recommend ordering / applying
Keep up to date no matter what your future plans are
  Just because you are ready does not mean that activating the maintenance is always required.
  Prepare through testing 2nd level
Urgent Service – Red Alerts and Hiper APARs
  Be ready for backlash – no set schedule!

42 “Top 10” things you need to know and do Maintenance is not something that can wait forever
Apply Recommended Service Upgrade (RSU):
  Released periodically (6 months give or take)
  Contains cumulative service including all pre and co-requisites in a pre-built format
  Includes service for all integrated components and pre-installed program products
  Available on 3590 tape, DVD, or electronically (servlink envelope)
  Includes service required by most customer installations
RSUs are proven, tested, and selective
Easy to install:
  SERVICE
  PUT2PROD
Easy to remove or back out
  SAPL – IPL from CPLOLD MODULE
  VMSES/E - VMFREM

43 “Top 10” things you need to know and do Maintenance is not something that can wait forever
Other Material
  RSU Page – as needed. See:
  Alert Page -- A great place to watch for the most important items.
    To Subscribe:
  News -- RSU Buckets and other maintenance is still Important

44 “Top 10” things you need to know and do What is the future – Longer range
Systems will change over time
  Server changes (maintenance or workload changes)
  Data Changes
  END of Service -- End of currency UPDATE
Environmental changes
  CEC changes
  What else is being utilized on CEC
  Storage changes
  Network changes
Growth Changes
  Additional Consolidation is a good thing but there are limits
Ask what release they are running.

45 “Top 10” things you need to know and do
Initial System Setup
Everyday Items
Occasional Items
Prepared Responses to Catastrophic Events

46 “Top 10” things you need to know and do Disaster Recovery
Review/Develop your Disaster Recovery (DR) strategy
  DR is important in ALL environments
  DR procedures must be adjusted for SSI members
    DR site and Home site need to be the same.
    A multi-member Home needs multi-member DR or use REPAIR MODE.
    VM65358 (still Open) will make this better by providing the new CLEARPDR IPL parameter on the SAPL panel.
Some planning now will help later
  Disaster is not well defined but I am sure you will know when you experience one
  TEST Your DR Plans
GDPS anyone? If so, we can talk DR offline.

47 “Top 10” things you need to know and do Prepare for z/VM Failures
CP does not often fail but that does not mean that you should not be ready for an event
  CP Abends – When CP discovers an unrecoverable error and Dumps
  Incorrect Output – detected by user, etc.
  CP HANG – user is stuck in CP (hung). It is Very Rare
The goal is to Get the System Running and gather as much data as possible
IMPORTANT: Reserve dedicated dump space
  DUMP option on CP_Owned statement

48 “Top 10” things you need to know and do Prepare for z/VM Failures
Dump types, methods, and behavior
  CP abend – Dump taken and system IPLs automatically
  SNAPDUMP command – Dump taken, system stays up. However, the system is quiesced, which may adversely affect servers.
  PSW RESTART from the HMC – Dump taken and system IPLs automatically. Resultant dump appears as an SVC002 abend dump.
  VMDUMP command – Dump of a virtual machine’s storage. Usually only requested by server support groups, but could be requested by us if it is for a second level VM system.
WARNING: Straight “LOAD” of CP from the HMC will just IPL CP and NOT produce a dump! For all hangs, use PSW RESTART!!
CP dumps written to OPERATNS reader (default)

49 “Top 10” things you need to know and do Prepare for z/VM Failures
Learn how to process a CP dump
  DUMPLOAD or DUMPLD2 utility
  DUMPLD2 enables you to create a multi-file dump, which is easier to transfer to IBM
Collect the OPERATOR’s console from the time of failure
Practice moving files to and from z/VM (even copy/paste to )
Nearly every problem diagnosis starts with the same questions:
  Description?
  Release and service level?
  What Changed (Workload, Service, HW, …) ???
“WHAT CHANGED?” Field examples:
  Nothing changed – except RSU applied last weekend
  Nothing changed – further inquiries uncovered another abend, hardware change, and falling ceiling tiles on the HW in question

50 “Top 10” things you need to know and do Prepare for z/VM Failures
Other Material
  z/VM: Diagnosis Guide **
  Software Support Handbook
  z/VM Service Resources

51 “Top 10” things you need to know and do Your own education and that of your peers
Education is an ongoing process
  THANK YOU for your interest in attending this presentation and furthering your education
  Please continue this ongoing effort
  Assisting a peer will make vacations much more enjoyable
Types of education
  Workshops and Tech Conferences
  Formal and informal classes
  OJT and/or reading manuals or Redbooks
Other Material
  CP Planning and Administration manual **
  Running guest Operating System manual **
  Getting Started with Linux Manual **
  The z/VM Education Page
  Redbooks

52 More Information

53 Common References **
z/VM Manuals in the Knowledge Center for z/VM V6.3
VM V6.3 Achieves Common Criteria Certification
VM Download Page
RSU Page – as needed
Alert Page -- A great place to watch for the most important items
  To Subscribe:
z/VM: Performance Toolkit Guide Manual
VM Service News -- RSU Buckets and other maintenance is still Important
GDPS Reference page
Operations Manager for z/VM
Tivoli OMEGAMON XE on z/VM and Linux
Tivoli Storage Manager (now: IBM Spectrum Protect)
z/VM Glossary
z/VM Migration Guide

54 For More Information …
Web sites:
  -- z/VM on the Web
  -- the online z/VM Library
  -- presentations, classes and information
Via mailing lists:
Contact Information: Tim Reynolds, z/VM CP Level 2 Support
Acknowledgement: Thanks to Steve Wilkins, Bill Bitner, and Dan Griffith from IBM z/VM Development for helping put this presentation together.

55 Thank You
Dank u Dutch Спасибо Russian Merci French Gracias Spanish شكراً Arabic 감사합니다 Korean Tack så mycket Swedish धन्यवाद Hindi תודה רבה Hebrew 谢谢 Chinese Obrigado Brazilian Portuguese Dankon Esperanto ありがとうございます Japanese Tak Danish Trugarez Breton Danke German Grazie Italian நன்றி Tamil ขอบคุณ Thai děkuji Czech go raibh maith agat Gaelic

