z/VM Performance with SMT

1 z/VM Performance with SMT
June 27, 2015, z/VM Workshop
Xenia Tkatschow, z/VM Performance Analysis
© 2013, 2015 IBM Corporation

2 Trademarks
The following are trademarks of the International Business Machines Corporation in the United States and/or other countries. BladeCenter* DB2* DS6000* DS8000* ECKD FICON* GDPS* HiperSockets HyperSwap IBM z13* OMEGAMON* Performance Toolkit for VM Power* PowerVM PR/SM RACF* Storwize* System Storage* System x* System z* System z9* System z10* Tivoli* zEnterprise* z/OS* zSecure z/VM* z Systems* * Registered trademarks of IBM Corporation The following are trademarks or registered trademarks of other companies. Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries. Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefrom. Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. IT Infrastructure Library is a registered trademark of the Central Computer and Telecommunications Agency which is now part of the Office of Government Commerce. ITIL is a registered trademark, and a registered community trademark of the Office of Government Commerce, and is registered in the U.S. Patent and Trademark Office. Java and all Java based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates. Linear Tape-Open, LTO, the LTO Logo, Ultrium, and the Ultrium logo are trademarks of HP, IBM Corp. and Quantum in the U.S. and Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both. Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. OpenStack is a trademark of OpenStack LLC. 
The OpenStack trademark policy is available on the OpenStack website. TEALEAF is a registered trademark of Tealeaf, an IBM Company. Windows Server and the Windows logo are trademarks of the Microsoft group of countries. Worklight is a trademark or registered trademark of Worklight, an IBM Company. UNIX is a registered trademark of The Open Group in the United States and other countries. * Other product and service names might be trademarks of IBM or other companies. Notes: Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput improvements equivalent to the performance ratios stated here. IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply. All customer examples cited or described in this presentation are presented as illustrations of the manner in which some customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics will vary depending on individual customer configurations and conditions. This publication was produced in the United States. IBM may not offer the products, services or features discussed in this document in other countries, and the information may be subject to change without notice. Consult your local IBM business contact for information on the product or services available in your area. All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. 
Information about non-IBM products is obtained from the manufacturers of those products or their published announcements. IBM has not tested those products and cannot confirm the performance, compatibility, or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. Prices subject to change without notice. Contact your IBM representative or Business Partner for the most current pricing in your geography. This information provides only general descriptions of the types and portions of workloads that are eligible for execution on Specialty Engines (e.g., zIIPs, zAAPs, and IFLs) ("SEs"). IBM authorizes customers to use IBM SE only to execute the processing of Eligible Workloads of specific Programs expressly authorized by IBM as specified in the “Authorized Use Table for IBM Machines” provided at (“AUT”). No other workload processing is authorized for execution on an SE. IBM offers SE at a lower price than General Processors/Central Processors because customers are authorized to use SEs only to process certain types and/or amounts of workloads as specified by IBM in the AUT.

3 Agenda
Overview of Architecture Changes
SMTMET Tool (new, available July 1)
CPUMF Tool
Closer Look at Performance Results
Monitor and PERFKIT Changes

4 Overview of Architecture Changes

5 z13 Notable Characteristics
Compared to the zEC12, the z13 offers larger caches:
  L1 I-cache is 50% larger
  L1 D-cache is 33% larger
  L2 cache is 100% larger
  L3 cache is 33% larger
  L4 cache is 25% larger
CPU cores are multithreaded
Clock speed is slower than zEC12
Various other changes and improvements (e.g., branch prediction)
z/VM exploits multithreading only on IFL cores

6 Review of Performance Tools CPUMF and SMTMET

7 CPUMF Display Tool
An EXEC that extracts the CPU-measurement facility (CPUMF) counters, in which the z Systems CPU records its internal performance experience: metrics such as instructions completed, clock cycles used, and cache misses.
The process of reducing the CPUMF counters:
  Start with a MONWRITE file that contains Domain 5 Record 13 records.
  Create the interim file:
    Command syntax: EXEC CPUMFINT filename MONDATA filemode
    Resultant file: filename CPUMFINT filemode (interim file)
  Use the CPUMFLOG tool to process the interim file:
    Command syntax: EXEC CPUMFLOG filename CPUMFINT filemode
    Resultant file: filename $CPUMFLG filemode (final report file)
The CPUMFLOG tool applies the Burg formulas, does the appropriate calculations, and writes a report. The report file will have CMS filetype $CPUMFLG.

8 Sample $CPUMFLG Output
Report columns: _IntEnd_, LPU, Typ, L1MP, L2P, L3P, L4LP (numeric values were not preserved in this transcript).
Report rows: >>Mean>> for LPUs 0 through 5 (all IFL), then >>MofM>> and >>AllP>>, then one row per IFL for the interval ending at 00:46.
L1MP: percentage of instructions that incur an L1 miss.
L2P: percentage of L1 misses sourced from L2.
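The two column definitions on this slide can be sketched as simple ratios. This is a minimal illustration of the definitions only, not the CPUMFLOG implementation; the function and parameter names are hypothetical, and the raw counts are assumed to be already extracted from the CPUMF counters.

```python
def l1_miss_pct(instructions, l1_misses):
    """L1MP: percentage of instructions that incur an L1 miss."""
    if instructions == 0:
        return 0.0  # no instructions completed in the interval
    return 100.0 * l1_misses / instructions


def l2_sourced_pct(l1_misses, sourced_from_l2):
    """L2P: percentage of L1 misses that were sourced from L2."""
    if l1_misses == 0:
        return 0.0  # nothing missed L1, so nothing was sourced from L2
    return 100.0 * sourced_from_l2 / l1_misses


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(l1_miss_pct(instructions=1000, l1_misses=50))
    print(l2_sourced_pct(l1_misses=50, sourced_from_l2=40))
```

A rising L1MP together with a falling L2P would suggest that more misses are being satisfied from further out in the cache hierarchy.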

9 Let's Consider a Workload’s Cache Footprint
Without SMT, a workload that:
  Stayed well within the zEC12’s cache might see only modest improvement on z13, because it gets no help from the increased z13 cache sizes
  Grossly overflows cache on both machines might see no benefit from z13
  Didn’t fit well into zEC12 cache but does fit well into the increased caches on z13 might experience the most improvement
Now enable SMT:
  You have twice as many logical processors competing for the same amount of cache
  L1s and L2s are larger on the z13, but with multithreading the two threads of a core share the L1 and the L2
  This may change the performance of the L1 and L2 and is very much a function of the workload
Again, CPUMF will come in handy to observe your workload’s behavior with respect to cache.
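A rough worked example of the sharing argument, using the cache growth figures quoted earlier in this deck (L1 I-cache 50% larger, L1 D-cache 33% larger, L2 100% larger). It assumes the two threads split a cache evenly, which is a simplification made here for illustration; real sharing depends on the workload. The function name is hypothetical.

```python
# z13 cache growth relative to zEC12, as quoted in this presentation.
Z13_GROWTH_PCT = {"L1 I-cache": 50, "L1 D-cache": 33, "L2 cache": 100}


def per_thread_share(growth_pct, threads_per_core=2):
    """Per-thread cache share relative to one zEC12 core's cache (1.0 = same size).

    Assumes an even split between the threads of a core (illustrative only).
    """
    return (1 + growth_pct / 100) / threads_per_core


if __name__ == "__main__":
    for cache, growth in Z13_GROWTH_PCT.items():
        print(cache, per_thread_share(growth))
```

Under the even-split assumption, each thread's share of the L2 equals the full zEC12 L2 size (1.0), while its share of the L1 I-cache is 0.75 of the zEC12 size. That is why a workload that fit comfortably in L1 before may behave differently with SMT enabled.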

10 SMTMET Display Tool
An EXEC that extracts MT metrics from Domain 0 Record 2. Available on our download library.
The resultant file includes two reports:
  per-core-type report
  per-core report
SMTMET Documentation: (Available July 1)
The process of reducing the MT metrics:
  Start with a MONWRITE file that contains D0 R2 records.
  Command syntax from CMS prompt: SMTMET filename MONDATA filemode
  Resultant file: filename $SMTMET filemode

11 What are all those other numbers on Indicate MT?
Busy time: how often the core was executing instructions during the interval.
Thread density: how often the core was able to run both threads at once while the core was in use at all.
Productivity: how often the core was completely busy on both threads while the core was in use.
MT utilization: how much of the maximum core capacity was used.
Capacity factor: the amount of work the multithreaded core was able to accomplish compared to the amount of work a single-threaded core could accomplish.
Maximum capacity factor: how much work could have been accomplished at the current rate if the core had been kept busier.
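To make the first two definitions concrete, here is a minimal sketch that derives busy time and thread density from a time breakdown of one core. It assumes thread density is the average number of active threads while the core is in use, which is one reading of the description above and not a statement of how z/VM computes it. The inputs (seconds with one thread active, seconds with both active) and all names are hypothetical.

```python
def core_busy_pct(one_thread_secs, two_thread_secs, interval_secs):
    """Percent of the interval in which the core ran at least one thread."""
    return 100.0 * (one_thread_secs + two_thread_secs) / interval_secs


def avg_thread_density(one_thread_secs, two_thread_secs):
    """Average number of active threads while the core was in use (1.0 to 2.0)."""
    busy_secs = one_thread_secs + two_thread_secs
    if busy_secs == 0:
        return 0.0  # core was idle for the whole interval
    return (one_thread_secs + 2 * two_thread_secs) / busy_secs


if __name__ == "__main__":
    # Hypothetical 30-second interval: 10 s with one thread, 5 s with both.
    print(core_busy_pct(10, 5, 30))
    print(avg_thread_density(10, 5))
```

Productivity, capacity factor, and maximum capacity factor depend on how much work each thread completes, so they cannot be derived from a time breakdown alone and are left out of this sketch.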

12 D0R2 Per-Core-type Report for file: IDLESYS MONDATA
Report header:
  Interval Core Sampled Pct Core Pct Cap Pct Max Pct MT Average
  __Ended_ Type ___Secs___ ___Cores__ Prodctvity __Factor__ _Cap Fct__ Utilztion_ Thread Den
Report rows: >>Mean>> for IFL, then one IFL row per interval ended 13:51:42 through 13:56:12 (numeric values were not preserved in this transcript).

13 D0R2 Per-Core Report for file: IDLESYS MONDATA
Report header:
  Interval Core Core Pct Core Pct MT Average Pct Core
  __Ended_ _ID_ Type ___Secs___ Prodctvity Utilztion_ Thread Den ___Busy___
Report rows: >>Mean>> for cores 00 through 03 (IFL), then one row per IFL core for intervals ended 13:52 through 13:54 (numeric values were not preserved in this transcript).

14 Performance Results

15 Performance Measurements
SMT2 Ideal Application
Maximum Storage Configuration
Maximum Logical Processor Configuration
Linux-only mode with Single Processor Serialization Application
  Mitigation 1: Increasing virtual processors
  Mitigation 2: Increasing servers in workload
Linux-only mode with Master Processor Serialization Application
z/VM-mode with Master Processor Serialization Application
CPU Pooling Workload
Live Guest Relocation (LGR) Workload
For more details about performance results see:

16 Performance Measurements: SMT2 Ideal Application
Table columns: Run ID AMPDGLD0 (multithreading disabled), Run ID AMPDGLD (multithreading enabled), Delta, Pct.
Table rows: Multithreading (p), Logical Processors (p), ETR (c), ITR (p), Total Util/Proc (p), AWM avgRT (a), Client Util (p), Server Util (p), Virtual CPUs, Avg Thread Density (shown as na for one run). Numeric values were not preserved in this transcript.
Highly parallel activity with no single point of serialization
Total of 16 virtual processors to drive the 4 logical processors in the non-SMT case and 8 logical processors with SMT
This configuration demonstrates the value that can be obtained for a workload that has ideal SMT characteristics
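The comparison tables in this section report a Delta and a Pct column for each metric. A minimal sketch of that arithmetic follows, assuming Delta is the comparison run minus the base run and Pct is Delta relative to the base run; the function name and the sample values are hypothetical, since the measured values are not available here.

```python
def delta_and_pct(base, comparison):
    """Return (delta, percent change) of a metric between two runs."""
    delta = comparison - base
    pct = 100.0 * delta / base
    return delta, pct


if __name__ == "__main__":
    # Hypothetical ETR values, for illustration only.
    print(delta_and_pct(base=200.0, comparison=130.0))
```

For metrics such as ETR and ITR a positive Pct is an improvement, while for response time (AWM avgRT) a negative Pct is the improvement.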

17 D0R2 Per-Core-type Report for file: AMPDGLD1 MONDATA
Report header:
  Interval Core Sampled Pct Core Pct Cap Pct Max Pct MT Average
  __Ended_ Type ___Secs___ ___Cores__ Prodctvity __Factor__ _Cap Fct__ Utilztion_ Thread Den
Report rows: >>Mean>> for IFL, then one IFL row per interval ended 21:32:02 through 21:38:02 (numeric values were not preserved in this transcript).

18 D0R2 Per-Core Report for file: AMPDGLD1 MONDATA
Report header:
  Interval Core Core Pct Core Pct MT Average Pct Core
  __Ended_ _ID_ Type ___Secs___ Prodctvity Utilztion_ Thread Den ___Busy___
Report rows: >>Mean>> for cores 00 through 03 (IFL), then one row per IFL core for intervals ended 21:32 through 21:33 (numeric values were not preserved in this transcript).

19 Processor Activity, by Time
FCX304 PRCLOG report, run 2015/03/04 15:16, covering 2015/02/14 16:31:32 to 16:42:02.
Columns shown: Interval End Time, CPU, Type, PPD, Ent., DVID, Pct Park Time, Percent Busy (Total, User, Syst, Emul), and the first rate column (Inst Siml).
Report rows: >>Mean>> for CPUs 0 through 7 (IFL, VhD), then >>Total> for 8 IFL (MIX). Numeric values were not preserved in this transcript.

20 Performance Measurements: Single Processor Serialization Application
Table columns: Run ID APNDGLD0 (multithreading disabled), Run ID APNDGLD (multithreading enabled), Delta, Pct.
Table rows: Multithreading (p), Logical Processors (p), ETR (c), ITR (p), Total Util/Proc (p), AWM avgRT (a), Client Util (p), Client Virt CPUs (p), Server Util (p), Thread Density (shown as na for one run). The ETR and AWM avgRT rows are marked on the slide. Numeric values were not preserved in this transcript.
The client is a one-way and becomes the single point of serialization, because it is now running on a thread and sharing the core with another thread.
A one-way guest that uses more than 50% of a core might be expected to become processor constrained in the SMT2 environment.
In this case, the client is driving the workload, so when the client becomes processor constrained, the whole workload slows down.
External Transaction Rate (ETR) decreased by 35%.
Not all workloads show benefit when SMT2 is enabled.
We adjusted this workload in the SMT2 environment to help overcome the performance impact.

21 D0R2 Per-Core-type Report for file: APNDGLD1 MONDATA
Report header:
  Interval Core Sampled Pct Core Pct Cap Pct Max Pct MT Average
  __Ended_ Type ___Secs___ ___Cores__ Prodctvity __Factor__ _Cap Fct__ Utilztion_ Thread Den
Report rows: >>Mean>> for IFL, then one IFL row per interval ended 19:19:53 through 19:24:23 (numeric values were not preserved in this transcript).
Nothing in this data indicates that the workload actually incurred an ETR degradation.

22 D0R2 Per-Core Report for file: APNDGLD1 MONDATA
Report header:
  Interval Core Core Pct Core Pct MT Average Pct Core
  __Ended_ _ID_ Type ___Secs___ Prodctvity Utilztion_ Thread Den ___Busy___
Report rows: >>Mean>> for cores 00 through 02 (IFL), then one row per IFL core for intervals ended 19:19 through 19:21 (numeric values were not preserved in this transcript).

23 Mitigation 1: adding more virtual processors
Table columns: Run ID APNDGLD1, Run ID APNDGLDF, Delta, Pct (multithreading enabled in both runs).
Table rows: Multithreading (p), Logical Processors (p), ETR (c) (throughput), ITR (p), Total Util/Proc (p), AWM avgRT (a) (response time), Client Util (p), Client Virt CPUs (p). Numeric values were not preserved in this transcript.
Each client was given an extra virtual processor, so each became a 2-way.
Client utilization jumped from 100% to 143%.
Total workload utilization increased and the bottleneck was removed.

24 D0R2 Per-Core-type Report for file: APNDGLDF MONDATA
Report header:
  Interval Core Sampled Pct Core Pct Cap Pct Max Pct MT Average
  __Ended_ Type ___Secs___ ___Cores__ Prodctvity __Factor__ _Cap Fct__ Utilztion_ Thread Den
Report rows: >>Mean>> for IFL, then one IFL row per interval ended 21:28:23 through 21:33:23 (numeric values were not preserved in this transcript).

25 D0R2 Per-Core Report for file: APNDGLDF MONDATA
Report header:
  Interval Core Core Pct Core Pct MT Average Pct Core
  __Ended_ _ID_ Type ___Secs___ Prodctvity Utilztion_ Thread Den ___Busy___
Report rows: >>Mean>> for cores 00 through 02 (IFL), then one row per IFL core for intervals ended 21:28 through 21:29 (numeric values were not preserved in this transcript).

26 Mitigation 2: adding more client guests
Table columns: Run ID APNDGLD1, Run ID APNDGLDD, Delta, Pct (multithreading enabled in both runs).
Table rows: Multithreading (p), Logical Processors (p), ETR (c) (throughput), ITR (p), Total Util/Proc (p), AWM avgRT (a) (response time), Client Users (p), Client Virt CPUs (p), Client Util (p). Numeric values were not preserved in this transcript.
Another adjustment was to add more clients to drive the workload.
Each client remained a one-way, and we doubled the number of clients from 3 to 6 (this is another way to add parallelism to the workload).
The ETR increased by 100%.
Workload adjustment should be considered when possible.

27 D0R2 Per-Core-type Report for file: APNDGLDD MONDATA
Report header:
  Interval Core Sampled Pct Core Pct Cap Pct Max Pct MT Average
  __Ended_ Type ___Secs___ ___Cores__ Prodctvity __Factor__ _Cap Fct__ Utilztion_ Thread Den
Report rows: >>Mean>> for IFL, then one IFL row per interval ended 00:46:02 through 00:50:02 (numeric values were not preserved in this transcript).

28 D0R2 Per-Core Report for file: APNDGLDD MONDATA
Report header:
  Interval Core Core Pct Core Pct MT Average Pct Core
  __Ended_ _ID_ Type ___Secs___ Prodctvity Utilztion_ Thread Den ___Busy___
Report rows: >>Mean>> for cores 00 through 02 (IFL), then one row per IFL core for intervals ended 00:46 through 00:47 (numeric values were not preserved in this transcript).

29 Performance Measurements: Live Guest Relocation
25 Linux guests relocated while running three workloads:
  PING, to simulate network traffic
  BLAST, to simulate I/O
  PFAULT, to simulate referencing storage
Relocation was done synchronously using the SYNC option of the VMRELOCATE command.
Results:
  Relocation time increased by 10%
  Quiesce time increased by 26%
  PFAULT (71%) and BLAST (34%) completions increased
  Total number of pages relocated during quiesce increased by 51%
Conclusion: With SMT2, the BLAST and PFAULT workloads were changing pages more frequently, thus causing more pages to be moved during quiesce time.

30 Performance Measurements: Conclusion
Results in measured workloads varied widely.
Best results were observed for applications having highly parallel activity and no single point of serialization.
No improvements were observed for applications having a single point of serialization.
To overcome serialization, workload adjustment should be done where possible.
Workloads that have a heavy dependency on the z/VM master processor are not good candidates for SMT-2.
In z/VM Performance Toolkit, the master processor can be identified from FCX100 CPU and FCX180 SYSCONF.

31 Performance Measurements: Conclusion
The multithreading metrics (provided by the SMTMET tool in its $SMTMET report) provide information about how well the cores perform when SMT is enabled.
There is no direct relationship with ETR or with transaction response time.
Measuring workload throughput and response time is the best way to know whether SMT is providing value to the workload.

32 Performance Monitor and Performance Toolkit Changes

33 Monitor Changes
New monitor record:
  Domain 5 Record 20: MT CPUMF counters
Changed monitor records:
  Domain 0 Record 2: Processor data (per processor)
  Domain 0 Record 15: Logical CPU utilization (global)
  Domain 0 Record 16: CPU utilization in a logical partition
  Domain 0 Record 17: Physical CPU utilization data for LPAR management
  Domain 0 Record 19: System data (global)
  Domain 0 Record 23: Formal spin lock data (global)
  Domain 1 Record 4: System configuration data
  Domain 1 Record 5: Processor configuration data (per processor)
  Domain 1 Record 16: Scheduler settings
  Domain 1 Record 18: CPU capability change
  Domain 2 Record 4: Add user to dispatch list
  Domain 2 Record 5: Drop user from dispatch list
  Domain 2 Record 7: Set SRM changes
  Domain 2 Record 13: Add VMDBK to limit list
  Domain 2 Record 14: Drop VMDBK from limit list
  Domain 4 Record 2: User logoff data
  Domain 4 Record 3: User activity data
  Domain 4 Record 9: User activity data at transaction end
  Domain 5 Record 1: Vary on processor
  Domain 5 Record 2: Vary off processor
  Domain 5 Record 11: Instruction counts per processor
  Domain 5 Record 13: CPU-measurement facility counters
  Domain 5 Record 16: Park/unpark decision
  Domain 5 Record 17: Real CPU data
  Domain 5 Record 19: CPU pool utilization

34 SMT Ideal Application: Perfkit Screen PRCLOG (FCX304)
Both screens are FCX304 PRCLOG, Processor Activity, by Time, run 2015/02/15 08:52, CPU SN 12F17 (numeric values were not preserved in this transcript).
Report columns: Interval End Time, CPU, Type, PPD, Ent., DVID, Pct Park Time, Percent Busy (Total, User, Syst, Emul), Rates per Sec. (Inst Siml, DIAG, SIGP, SSCH), Paging (<2GB /s, PGIN /s, Fast Path %, Page Read /s), Comm (Msgs /s), Diag (X'9C' /s), Core/Thread.
SMT disabled, from 2015/02/14 16:04 to 16:14:
  Rows: >>Mean>> for CPUs 0 through 3 (IFL, VhD), each shown on thread 0, then >>Total> for 4 IFL (MIX)
  Slide annotation on the Core/Thread column: core added, thread 0 only
SMT enabled, from 2015/02/14 16:31 to 16:42:
  Rows: >>Mean>> for CPUs 0 through 7 (IFL, VhD), alternating between thread 0 and thread 1, then >>Total> for 8 IFL (MIX)
  Slide annotation on the Core/Thread column: core added, thread added

35 SMT Ideal Application: Perfkit Screen SYSCONF (FCX180)
SMT enabled: FCX180 SYSCONF, System Configuration, Initial and Changed, run 2015/02/15 08:52, from 2015/02/14 16:31:32 to 16:42, z/VM V.6.3.0. Some data was removed from the screen to save space, and numeric values were not preserved in this transcript.
Multithreading: Enabled (the z/VM system is enabled for SMT)
Server Time Protocol (STP) facility configuration, all shown as No:
  XRC_TEST enabled
  XRC_OPTIONAL enabled
  STP H/W feature installed
  STP H/W feature enabled
  STP Timestamping enabled
  STP Timezone usage enabled
  STP is active
  STP is suspended
  STP susp. message issued
Initial Status on 2015/02/14 at 16:31: processor counts (Total, Conf, Stby, Resvd, Ded, Shrd) for Real Proc, Sec. Proc, and Log. IFL.
Processor list (Num, Serial-Nr, Type, Status, Core/Thread): eight IFL processors. The Master is on thread 0 of the first core (shown as /0), followed by Alternates on 00/1, 01/0, 01/1, 02/0, 02/1, 03/0, and 03/1.
Slide annotation: a total of 4 cores, and each core has a thread 0 and a thread 1 associated with it.
Processor Configuration Mode: LINUX
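The Core/Thread values in this sample follow a regular pattern: consecutive logical processors land on thread 0 and thread 1 of the same core. A minimal sketch of that pattern follows, assuming logical processors are numbered from 0 in listing order and formatting the core ID as two decimal digits. Both points are assumptions drawn from this one sample, not a documented z/VM rule, and the function name is hypothetical.

```python
def core_thread_label(logical_cpu, threads_per_core=2):
    """Map a logical processor number to a 'core/thread' label as in the sample."""
    core, thread = divmod(logical_cpu, threads_per_core)
    return f"{core:02d}/{thread}"


if __name__ == "__main__":
    # Eight logical processors on four SMT-2 cores, as in the sample screen.
    for cpu in range(8):
        print(cpu, core_thread_label(cpu))
```

On a real system the authoritative mapping is whatever the SYSCONF and PRCLOG screens report, so the sketch is only useful for reading samples like this one.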

36 SMT Ideal Application: Perfkit Screen SYSSET (FCX154)
SMT enabled: FCX154 SYSSET, System Scheduler Settings, Initial and Changed, run 2015/02/15 08:52, from 2015/02/14 16:31 to 16:42, z/VM V.6.3.0. Numeric values were not preserved in this transcript.
Initial Scheduler Settings, 2015/02/14 at 16:31:32:
  DSPSLICE (minor), Hotshot T-slice, IABIAS Intensity, IABIAS Duration
  DSPBUF, STORBUF, and LDUBUF settings for Q1, Q2, and Q3
  Max. working set, Loading user, Loading capacity
  LIMITHARD algorithm: Consumption
  DSPWD method: Reshuffle (the z/VM dispatch workload algorithm must be set to Reshuffle for SMT to be enabled)
  Polarization: Vertical (HiperDispatch polarization must be Vertical for SMT to be enabled)
  Global Perf. Data: ON
  EXCESSUSE and CPUPAD settings for CP, ZAAP, IFL, ICF, and ZIIP
Multithreading: Enabled
Threads table, with header H/W Requested System Activated and rows Max Threads, CP core, IFL core, ICF core, ZIIP core:
  Max Threads is the maximum number of threads activated on this z/VM system
  Activated column = MIN(H/W, System)
Changed Scheduler Settings: No changes processed
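The two prerequisites and the Activated rule called out on this screen can be expressed as a small check. This is a sketch of the stated rules only; the function names are hypothetical, and the setting values are assumed to be passed in as the strings and integers shown on the SYSSET screen.

```python
def smt_prerequisites_met(dspwd_method, polarization):
    """True when both settings this slide requires for SMT are in place."""
    return (
        dspwd_method.strip().lower() == "reshuffle"
        and polarization.strip().lower() == "vertical"
    )


def activated_threads(hw_threads, system_threads):
    """Activated column = MIN(H/W, System), as annotated on the slide."""
    return min(hw_threads, system_threads)


if __name__ == "__main__":
    print(smt_prerequisites_met("Reshuffle", "Vertical"))
    # Hypothetical thread counts, for illustration only.
    print(activated_threads(hw_threads=2, system_threads=2))
```

A check like this is handy when comparing SYSSET output across systems to see why multithreading shows as enabled on one and not on another.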

