From the Trenches OHVCMG May 13, 2010 Richard S. Ralston Antarctica.

SMF Buffers
- Interesting word – "Assume"
- Several occurrences of lost SMF data in 2009
- Guess who got to fix the problem?
- Initially had a small SMF buffer space and a small CISIZE

SMF Buffers – Do You Have Enough?
- Watch and track data from your type 23 records. 2 types of records:
  - SMF dataset switch
  - 15-minute SMF status records
- **** Make sure you have current SMF maintenance
- Use either a 1/3- or ½-track CISIZE
- For high-activity LPARs, use a 1 gig buffer space
- For other LPARs, the type 23s will provide clues to determine buffer space
- osperfinstr/Home

Console Messages
- *IEE986E SMF HAS USED 50% OF AVAILABLE BUFFER SPACE
- Messages start at 50% buffer usage
- 1 message for every 1% increase
- Messages stop when usage decreases back to 50%
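The message behavior above can be sketched as a tiny state machine. This is a hedged Python sketch, not IBM's implementation: the message text and thresholds come from the slide, the function name and the "one message per new high-water percent" simplification are mine.

```python
def smf_buffer_messages(usage_percents):
    """Simulate IEE986E-style console messages per the slide's rules:
    messages start at 50% buffer usage, one per percent increase, and
    resume from scratch once usage falls back to 50%."""
    msgs = []
    last_reported = None  # highest percent already reported this episode
    for pct in usage_percents:
        if pct >= 50 and (last_reported is None or pct > last_reported):
            msgs.append(f"*IEE986E SMF HAS USED {pct}% OF AVAILABLE BUFFER SPACE")
            last_reported = pct
        elif pct <= 50:
            last_reported = None  # usage dropped back to 50%: messages stop
    return msgs
```

For example, a usage series of 49, 50, 51, 51, 49, 52 would produce messages at 50%, 51%, and (after the dip resets the episode) 52%.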

Type 23 Status Record
- 15-minute interval record
- Contains, amongst other things:
  - SMF23BFA – Number of buffer allocation requests
  - SMF23BFH – High-water mark of storage allocation
  - SMF23BFW – Number of SMF buffers written
  - SMF23BFT – Total buffer storage allocated
  - SMF23SUS – Number of SMF buffer suspensions
  - SMF23BFL – Percent buffer usage warning level reqs
  - SMF23RCW – SMF records written
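Trending the high-water mark (SMF23BFH) across these interval records is the basis for sizing buffer space. A minimal Python sketch, assuming the type 23 records have already been parsed into dicts keyed by the field names above; the function name, the 1.5x headroom factor, and the record shape are my assumptions, while the 1 GB cap comes from the slides:

```python
def recommend_buffer_space(type23_records, headroom=1.5, cap=1 << 30):
    """Size SMF buffer space from the observed type 23 high-water mark
    (SMF23BFH), with some headroom, capped at the 1 GiB maximum."""
    peak = max(r["SMF23BFH"] for r in type23_records)
    return min(int(peak * headroom), cap)
```

Feeding in a few intervals whose peak allocation was 64 MiB would recommend 96 MiB; a pathological peak would be capped at 1 GiB.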

Issue
- SMF gets bent out of shape if a big bunch (>2000?) of big records hits it within 1 second, followed by continued high SMF activity.
- Normally (98+% of the time) SMF handles all records in 8 meg of buffers.
- When "bent," SMF gets buffers in 8 meg increments but may use only 1 buffer from each increment.
- Possible to "use up" all buffer space in a matter of seconds.
- Handful of APARs on this; all improve the situation, but it is not completely eliminated.
- Use SMF23BFH to determine how big to make the SMF buffer space; max is 1 gig.
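The back-of-envelope arithmetic shows why the exhaustion is so fast. A sketch (the constant names are mine; the 8 MiB increment and 1 GiB maximum come from the slides):

```python
# How quickly can SMF "use up" all buffer space in the pathological case?
BUFFER_SPACE_MAX = 1 << 30   # 1 GiB, the maximum SMF buffer space
INCREMENT = 8 << 20          # SMF acquires buffers in 8 MiB increments

increments_available = BUFFER_SPACE_MAX // INCREMENT
# If only 1 buffer from each increment is actually used, the rest of
# each 8 MiB acquisition is dead space, so a burst of a few thousand
# records can walk through all the increments in seconds.
print(increments_available)  # -> 128
```

So even with the maximum 1 gig configured, there are only 128 increments to burn through before the space is gone.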

Hit 92% of Available Buffer Space

Buffers vs. SMF Records

Buffer Turnover

SMF/Buffer

Bytes Written

Buffer Memory Used – Tracking SMF23BFH and SMF23BFT

Ultimate Cause?
- 3521 SMF records written at 9:00
- Event duration: 13:30 – long
- 50% buffer full: 9:07:49
- 92% buffer full: 9:13
- Total records:
  - Type 14: 9509
  - Type 30: 2369
  - Type 42: 4930
  - Type 101: 12,856
  - Type 110: 12,793
  - Type 115: 5029
  - Type 254: 12,714 (IDMS)
  - 60,200 records, 84.9% of total
- Plenty of CPU; CPC: 40%, LPAR: 20%
- Cause: still unknown – I/O issues?
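The slide's figures are internally consistent: the seven record types listed sum to exactly 60,200, and at 84.9% of the interval total that back-solves to roughly 70,900 records overall. A quick check (variable names are mine):

```python
# Record counts by SMF type from the slide
counts = {14: 9509, 30: 2369, 42: 4930, 101: 12856,
          110: 12793, 115: 5029, 254: 12714}

subtotal = sum(counts.values())
print(subtotal)                  # -> 60200, matching the slide
print(round(subtotal / 0.849))   # ~70,900 records in the interval
```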

Different Time, Different LPAR
- December, :06:51 – hit 82% buffer full, with a 1 gig buffer space
- 10 test DB2 regions shut down at the same time
- Solution: stagger DB2 shutdowns by 1 minute

Buffer Memory Used

CICS Record Compression
- Compresses 110 records as they are written
- Does not reduce the number of records
- Does reduce the number of SMF buffers written
- Can be turned on dynamically

CICS Record Compression – Turned On

DB2 ACCUMAC
- DB2 accounting data for DDF and RRSAF threads is accumulated by end user
- Default: ACCUMAC=10
- Was set to NO on ASG's recommendation
- Currently testing

SMF Logger
- Intended to fix SMF data loss problems
- Gets better with each z/OS release
- Move to it when you think it's ready

HiperDispatch on z10s
- Intended to reduce internal overheads: cache refreshes, fetches, the MP effect
- Dynamically changes CPU assignments to LPARs
- Provides CPU affinity to CICS, DB2, IDMS, etc. address spaces
- Recommendation: use it for all LPARs with 3 or more CPs
- Note: I'm using it on all LPARs anyway (Cathy Walsh's recommendation)

What It Does
- Parks (pseudo varies offline) CPs, zIIPs, and zAAPs not used in an LPAR
  - Always keeps 2 of each type unparked
  - Good thing: reduces the MP effect
  - Only Low Share polarity engines can be parked
- Dynamically assigns processors to 3 polarity categories
  - High Share polarity – dedicated, 100% used by the LPAR
  - Medium Share polarity – approximately 50% per LPAR
  - Low Share polarity – minimal use; these may be parked
  - Good thing: minimizes cache refreshes
  - Changes dynamically based upon the dynamic LPAR weighting factor
- Provides CP affinity for high-CPU workloads (CICS, DB2, etc.)
- Reduces Level 1 and 1.5 cache refreshes

Parking
- To determine the amount of time parked, look at IPUPTM in the type 70s for each processor.
- Sum them by processor type (CP, zIIP, zAAP) to get total parked time per type for the 15-minute interval.
  - MICS HARCPU: CPUTOPTM, CPUSUPTM, CPUZPPTM
  - MXG TYPE70: CPUPATxx, CPUPATTM, SMF70PAT
- Divide by the interval duration to determine how many processors were parked during the interval.
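The sum-then-divide steps above can be sketched in a few lines. This is a hedged Python illustration, not MICS or MXG code: the function name and the per-processor record shape (a dict with the engine type and its parked seconds, as already extracted from IPUPTM) are my assumptions.

```python
def parked_engines_by_type(records, interval_seconds):
    """Average number of parked engines per type over an interval:
    sum parked time (from the type 70 IPUPTM field) by processor type,
    then divide by the interval duration."""
    totals = {}
    for r in records:
        totals[r["type"]] = totals.get(r["type"], 0.0) + r["parked_sec"]
    return {t: sec / interval_seconds for t, sec in totals.items()}
```

For a 15-minute (900-second) interval, two CPs parked the whole time and one zIIP parked half the time come out as 2.0 parked CPs and 0.5 parked zIIPs.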

Parking

In TMONMVS, the CPU Activity Display will show which logical processors are parked. They will show 100% CPU utilization and 0% dispatch utilization. This applies to CPs, zIIPs, and zAAPs.

Parking
- Different LPAR. Notice the parked zIIP is the middle zIIP.
- The zAAP is not parked because there must be at least 3 processors (zAAPs in this case) available in order for parking to be used.

Processor Polarity
- This one's nastier to report/track.
- RMF/PM and the RMF Data Portal provide this information in a nice, usable form.
- Based on bit values in SMF70POF.
- MICS converts this to data in HARIPU: a count of intervals of high, medium, and low share for each processor type. Also in HARCPU:
  - CPs: CPUAVTHP, CPUAVTMP, CPUAVTLP
  - zIIPs: CPUAVSHP, CPUAVSMP, CPUAVSLP
  - zAAPs: CPUAVZHP, CPUAVZMP, CPUAVZLP
- In MXG, TYPE70PR has a new field called POLARITY and a format, MG070PO, for that field.
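The interval-counting that MICS does in HARIPU can be mimicked once the SMF70POF bits have been decoded into a polarity label. A hedged Python sketch: the tuple-per-interval input shape and the 'HIGH'/'MEDIUM'/'LOW' labels are my assumptions, standing in for whatever the decode step produces; the three share categories themselves come from the HiperDispatch slides above.

```python
from collections import Counter

def polarity_interval_counts(intervals):
    """Count intervals observed at each share polarity per processor
    type, like the MICS HARIPU interval counts. Each element of
    `intervals` is a (proc_type, polarity) pair, with polarity already
    decoded from SMF70POF into 'HIGH', 'MEDIUM', or 'LOW'."""
    counts = Counter()
    for proc_type, polarity in intervals:
        counts[(proc_type, polarity)] += 1
    return counts
```

Summing these counts over a day shows, per engine type, how much of the time each logical processor spent at each polarity.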

Processor Polarity

Processor Polarity - RMF PM RMF PM has data elements defined to display the high share, medium share, low share CPs. These windows are updated every minute.

CP Polarity – Normal

CP Polarity – Group Capped

Data Visualization
- A picture is worth 1000 words.
- 3 great books:
  - Show Me the Numbers: Designing Tables and Graphs to Enlighten, Stephen Few, Analytics Press, 2004, ISBN:
  - Information Dashboard Design: The Effective Visual Communication of Data, Stephen Few, O'Reilly Media, 2006, ISBN:
  - Now You See It: Simple Visualization Techniques for Quantitative Analysis, Stephen Few, Analytics Press, 2009, ISBN:

2010
- Convert my SAS/Graph stuff to v9.2 – 50% completed
- Use SAS/Graph v9.2 to create a KPI dashboard
- z/VM and z/Linux performance reporting and capacity planning
- CPU Measurement Facility (type 113 records)

Future Work/Desire – Fun Stuff
We are using SAS on AIX for BI work. I want to take specific type 70, 71, and 72 data, extract it, and place it on the AIX SAS server, then use the SAS data miner to separate the chalk from the cheese: how does the performance data relate to each other? I did a brief, similar experiment at Whirlpool in 2002 using the IBM Data Miner.

Contact me: Facebook & LinkedIn