Two Threads Are Better Than One


1 Two Threads Are Better Than One
Craig Hodgins zSeries Performance Engineer Royal Bank of Canada

2 What is SMT2?
SMT2 is Simultaneous Multithreading x2
A CPU is now called a core
An instruction stream is now called a thread
Allows 2 threads to execute on one zIIP core

3 Why SMT2?
Processor speeds are approaching their physical limits
SMT is an attempt to use parallelism to increase capacity

4 Single thread: faster execution but lower throughput
SMT2: slower execution but higher throughput

5 SMT2 Requirements
Enabled
Turned ON

6 Rollout Methodology
New system measurement metrics may affect performance tools, capacity planning, and chargeback reporting (for example RMF, MXG, TDS)
Desirable to detect and assess any measurement impacts as early as possible on test systems before rolling out to production [sysprog/dev/test/prod]
APAR Identifier: OA47662    Last Changed: /08/07
PROBLEM DESCRIPTION: RMF Monitor III PROC and PROCU reports: Lost of precision for APPL% and EAPPL% fields when running in PROCVIEW CORE mode and MT_1 mode only.

7 Rollout Methodology
Enabling at least one LPAR per production sysplex with different characteristics and workload mix would be useful
In other words, don’t do the whole sysplex at one time
I created a spreadsheet to track the project

8 SMT2 Verification
Review messages after SET OPT=xx
Review SDSF
Review RMF

9 Messages After SET OPT=xx
00:27:05 E SET OPT=MH
00:27:05 E IEE252I MEMBER IEAOPTMH FOUND IN SYS1.PARMLIB
00:27:05 E IEE536I OPT VALUE MH NOW IN EFFECT
00:27:06 E IWM066I MT MODE CHANGED FOR PROCESSOR CLASS zIIP. THE MT MODE WAS CHANGED FROM 1 TO 2.
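
The message check on this slide could be automated when many LPARs are converted. The following is a minimal Python sketch, not a tool used in the presentation: the function name `mt_mode_change` is hypothetical, and the pattern is built only from the IWM066I text shown in the console output on this slide, so other variants of the message may not match.

```python
import re

# Pattern derived from the IWM066I wording shown on the slide (assumption:
# the message always has this shape).
IWM066I = re.compile(
    r"IWM066I MT MODE CHANGED FOR PROCESSOR CLASS (?P<cls>\w+)\. "
    r"THE MT MODE WAS CHANGED FROM (?P<old>\d+) TO (?P<new>\d+)\."
)

def mt_mode_change(log_text):
    """Return (processor_class, old_mode, new_mode), or None if no IWM066I is found."""
    # Collapse runs of whitespace so a message split by a plain line wrap still matches.
    flat = " ".join(log_text.split())
    match = IWM066I.search(flat)
    if match is None:
        return None
    return match.group("cls"), int(match.group("old")), int(match.group("new"))

console_log = """
00:27:05 E SET OPT=MH
00:27:05 E IEE252I MEMBER IEAOPTMH FOUND IN SYS1.PARMLIB
00:27:05 E IEE536I OPT VALUE MH NOW IN EFFECT
00:27:06 E IWM066I MT MODE CHANGED FOR PROCESSOR CLASS zIIP. THE MT MODE WAS CHANGED FROM 1 TO 2.
"""

print(mt_mode_change(console_log))
```

A result of `None` would mean the mode change message was not found and the SET OPT output should be reviewed by hand.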

10 SDSF D M=CPU

11 RMF CPC Report

12 New Metrics
MT-2 MAX CF (Maximum Capacity Factor) is the ratio of the maximum amount of work that can be accomplished using 2 threads to the amount of work that would have been accomplished with 1 thread
MT-2 MAX CF is workload dependent (the maximum value is 2, and IBM expects average values of about 1.4)
MT-2 CF (Capacity Factor) is the ratio of the amount of work that has actually been accomplished using 1 or 2 threads to the amount of work that would have been accomplished with multithreading turned off
The Average Thread Density shows the average number of threads that have been simultaneously active in the measured interval
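
To make the ratio definitions on this slide concrete, here is a small Python sketch. It only illustrates the arithmetic: the function names, the abstract "work units", and the sample values are hypothetical, the thread density is computed as a plain mean over equally spaced samples (an assumption), and how RMF derives the real figures is not covered in the slides.

```python
def capacity_factor(work_with_mt, work_without_mt):
    """Ratio of work accomplished with multithreading to the MT-off equivalent."""
    if work_without_mt <= 0:
        raise ValueError("work_without_mt must be positive")
    return work_with_mt / work_without_mt

def average_thread_density(active_thread_samples):
    """Mean number of simultaneously active threads over equally spaced samples."""
    if not active_thread_samples:
        raise ValueError("at least one sample is required")
    return sum(active_thread_samples) / len(active_thread_samples)

# Hypothetical interval: 140 units of work done where one thread would have done 100.
print(capacity_factor(140, 100))

# Hypothetical samples of how many threads were active on a core (1 or 2).
print(average_thread_density([2, 1, 2, 2, 1]))
```

With these made-up inputs the capacity factor matches the roughly 1.4 average that the slide attributes to IBM, and a thread density between 1 and 2 indicates that both threads were busy for only part of the interval.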

13 SMT2 Benefits
• SMT delivers more throughput per core, therefore more capacity
• Less power and cooling required per unit of capacity
• But an individual SMT2 thread is slower than a single thread would be (we’ll see why in a minute)
• If an SMT2 core provides 140% of the capacity of a single thread, then two threads will (on average) each run at 70% of the single-thread speed when both threads are active
• Increased sharing of low-level resources by threads makes the amount of work that a thread can do dependent on what else the core is doing
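
The 140% / 70% figure on this slide is a simple division of core capacity across the active threads. A minimal Python sketch of that arithmetic follows; the function name `per_thread_speed` is hypothetical, and the even split between threads is the slide's "on average" simplification, not a guarantee for any particular workload.

```python
def per_thread_speed(core_capacity, active_threads=2):
    """Average speed of each active thread, relative to single-thread speed (1.0)."""
    if active_threads < 1:
        raise ValueError("active_threads must be at least 1")
    return core_capacity / active_threads

# Capacity values are relative to one thread running alone on the core.
for capacity in (1.0, 1.4, 2.0):
    speed = per_thread_speed(capacity)
    print(f"core capacity {capacity:.0%} -> each thread at {speed:.0%}")
```

The loop shows the range: at 100% capacity each thread runs at half speed, at the 140% figure each runs at 70%, and only at the theoretical maximum of 200% does each thread keep full single-thread speed.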

14 What Causes the Slowdown?
• A major cause is the sharing of processor cache
• On recent System z processors, there are two levels of cache that are private to each core (L1 and L2)
• If a core has more than one thread, these caches will be shared across both threads
• Each thread is forced to get by with a smaller footprint in these caches and so incurs more L1 and L2 misses than if the caches were not shared
• Other resources must also be shared:
  • The execution pipes
  • The translation lookaside buffer (TLB)
  • Physical General Purpose Registers
  • Store Buffers and other resources on the core

15 What to Expect
• Actual throughput for SMT2 can range from less than 100% to close to 200%, depending upon the usage of the shared resources
• If programs running on the same core utilize the same resources (competing), they will run slower than before
• If programs use different resources (complementary), they can run close to the ideal maximum speed
• Running the same application multiple times shows less repeatable CPU usage because it may run in differing environments

16 What Did RBC See? Using 3 LPARs as a sample…
There was no noticeable response time or task delay impact with “slower” SMT2 zIIP threads
There was no zIIP CPU consumption or chargeback volume change
We realized approximately 10% reduction in relative physical zIIP utilization on a large LPAR, but only 3% reduction on smaller LPARs
The overall weighted zIIP capacity utilization benefit from SMT2 across all large and small LPARs was about 8% (compare to IBM’s claim of expected 25%-40% zIIP capacity benefit from SMT2)
No major issues (23 LPARs converted with 17 left to go)
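
The "overall weighted" benefit on this slide is a weighted average of the per-LPAR reductions. The Python sketch below shows that calculation only. The 10% and 3% reductions come from the slide; the LPAR names and the weights are hypothetical placeholders (the slides do not say how RBC weighted its LPARs), chosen so the arithmetic is easy to follow.

```python
def weighted_reduction(lpars):
    """Weighted average of utilization reductions.

    lpars: iterable of (weight, reduction_percent) pairs, where weight is
    whatever measure of LPAR size is used for weighting.
    """
    lpars = list(lpars)
    total_weight = sum(weight for weight, _ in lpars)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    return sum(weight * reduction for weight, reduction in lpars) / total_weight

# Hypothetical weights: one large LPAR five times the size of each small one.
sample = [
    (5, 10.0),  # large LPAR, 10% reduction (from the slide)
    (1, 3.0),   # small LPAR, 3% reduction (from the slide)
    (1, 3.0),   # small LPAR, 3% reduction (from the slide)
]

print(f"weighted reduction: {weighted_reduction(sample):.1f}%")
```

With these illustrative weights the result lands at 8.0%, which shows how a single large LPAR can dominate the overall figure even when most LPARs see a much smaller benefit.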

17 What Does the Future Hold?
Other platforms have had SMTx for years
IBM currently only supports SMT2 on a zIIP
IBM future support?

18 Considerations
Vendors need to catch up with SMT2
IBM (RMF PTF)
MXG
May have to make reporting changes internally

19 Recommendations / Summary
SMT2 should be explored in order to exploit capacity and throughput improvements on a z13
Enable SMT2 in a formal and controlled manner
Compare before/after metrics carefully
Workload drives results/benefits
Your mileage will vary

20 References
There are various CMG and SHARE papers available on the Internet
IBM marketing/technical material
EPV white papers
Google “SMT2 z13”

21 Q&A and Discussion
Are you on z13 boxes?
Has your company implemented SMT2?
If not, why not?
If so, what did you see?

