1
Linux on IBM Z Workload Performance on IBM z14
Workload performance details for z14 M0x models are provided on charts p4–p50; details for the z14 ZR1 model are provided on charts p51 onwards.
2
Trademarks
The following are trademarks of the International Business Machines Corporation in the United States and/or other countries. DataStage* Db2* DS8000* FICON* FlashSystem* HiperSockets IBM* IBM (logo)* Ibm.com IBM Z* InfoSphere* System Storage* WebSphere* z13* Z13s* z14 z/VM* Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries. IT Infrastructure Library is a Registered Trade Mark of AXELOS Limited. ITIL is a Registered Trade Mark of AXELOS Limited. Linear Tape-Open, LTO, the LTO Logo, Ultrium, and the Ultrium logo are trademarks of HP, IBM Corp. and Quantum in the U.S. and other countries. Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both. Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. Java and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates. Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefrom. UNIX is a registered trademark of The Open Group in the United States and other countries. VMware, the VMware logo, VMware Cloud Foundation, VMware Cloud Foundation Service, VMware vCenter Server, and VMware vSphere are registered trademarks or trademarks of VMware, Inc. or its subsidiaries in the United States and/or other jurisdictions. Other product and service names might be trademarks of IBM or other companies. * Registered trademarks of IBM Corporation Notes: Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput improvements equivalent to the performance ratios stated here. IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply. All customer examples cited or described in this presentation are presented as illustrations of the manner in which some customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics will vary depending on individual customer configurations and conditions. This publication was produced in the United States. IBM may not offer the products, services or features discussed in this document in other countries, and the information may be subject to change without notice. Consult your local IBM business contact for information on the product or services available in your area. All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. 
Information about non-IBM products is obtained from the manufacturers of those products or their published announcements. IBM has not tested those products and cannot confirm the performance, compatibility, or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. Prices subject to change without notice. Contact your IBM representative or Business Partner for the most current pricing in your geography. This information provides only general descriptions of the types and portions of workloads that are eligible for execution on Specialty Engines (e.g., zIIPs, zAAPs, and IFLs) ("SEs"). IBM authorizes customers to use IBM SEs only to execute the processing of Eligible Workloads of specific Programs expressly authorized by IBM as specified in the "Authorized Use Table for IBM Machines" ("AUT"). No other workload processing is authorized for execution on an SE. IBM offers SEs at a lower price than General Processors/Central Processors because customers are authorized to use SEs only to process certain types and/or amounts of workloads as specified by IBM in the AUT.
3
Linux Workload Performance on IBM z14 M0x Models
Scale-out (vs. IBM z13®), Scale-up (vs. z13)
Database Backup & Restore (vs. x86)
FICON Express16S+ (vs. z13)
Pervasive Encryption (z14 vs. z13, z14 vs. x86)
Competitive Workload Performance (vs. x86)
Java™ (IBM z14™ vs. z13, z14 vs. x86)
Microservices (z14 vs. x86)
4
Scale-out (vs. z13) Scale-up (vs. z13)
Linux Workload Performance on IBM z14 Model M0x
Scale-out (vs. z13): Run 25% more MongoDB guests with the same throughput under z/VM® 6.4 on z14 compared to z13. Use up to 170 cores on z14 to scale out MongoDB databases under z/VM 6.4, each with constant throughput and no more than 10µs latency increase per additional MongoDB instance.
Scale-up (vs. z13): Run MongoDB under z/VM 6.4 on z14 and get 4.8x better performance leveraging the additional memory available per z/VM instance compared to z13. Scale up a single MongoDB instance to 17 TB in a single system without database sharding and get 2.4x more throughput and 2.3x lower latency on z14 leveraging the additional memory available compared to z13.
The following proof points and detailed measurements are provided on the next slides.
5
MongoDB Consolidation under z/VM on z14 versus z13
Run 25% more MongoDB guests with the same throughput under z/VM 6.4 on z14 compared to z13.
Setup: z14 LPAR (32 IFL, 1 TB memory) running z/VM 6.4 with 200 SLES guests (2 vCPU, 4 GB each) versus z13 LPAR (32 IFL, 1 TB memory) running z/VM 6.4 with 160 SLES guests (2 vCPU, 4 GB each); every guest runs a MongoDB instance with a 2 GB database driven by MongoBench.
Disclaimer: Performance result based on IBM internal tests comparing MongoDB performance under z/VM 6.4 with the PTF for APAR VM65942 on z14 versus z13, driven locally by MongoBench issuing 90% read and 10% write operations. Results may vary. z14 configuration: LPAR with 32 dedicated IFLs and 1 TB memory running a z/VM 6.4 (with the PTF for APAR VM65942) instance in SMT mode with 200 guests. Each guest was configured with 2 vCPUs and 4 GB memory and ran a MongoDB Enterprise Server instance (no sharding, no replication) with a 2 GB database. The databases were located on an FCP-attached DS8700 LUN with multi-pathing enabled. z13 configuration: LPAR with 32 dedicated IFLs and 1 TB memory running a z/VM 6.4 (with the PTF for APAR VM65942) instance in SMT mode with 160 guests. Each guest was configured with 2 vCPUs and 4 GB memory and ran a MongoDB Enterprise Server instance (no sharding, no replication) with a 2 GB database. The databases were located on an FCP-attached DS8700 LUN with multi-pathing enabled.
6
Scale-out MongoDB instances under z/VM 6.4 on z14 with minimal SLA impact: use up to 170 IFLs on z14 to scale out MongoDB databases under z/VM 6.4, each with constant throughput and no more than 10µs latency increase per additional MongoDB instance.
Setup: z14 with four LPARs (each 42 IFL, 1.3 TB memory), each running z/VM 6.4 with 336 SLES guests (2 vCPU, 4 GB each); every guest runs a MongoDB instance with a 2 GB database driven by MongoBench.
Disclaimer: Performance result is extrapolated from IBM internal tests running, in a z14 LPAR with 32 dedicated IFLs and 1 TB memory, a z/VM 6.4 (with the PTF for APAR VM65942) instance in SMT mode with up to 256 guests. Each guest was configured with 2 vCPUs and 4 GB memory and ran a MongoDB Enterprise Server instance (no sharding, no replication) with a 2 GB database. The guest image and the databases were located on an FCP-attached DS8700 with multi-pathing enabled. The MongoDB instances were driven locally by a MongoBench instance which issued 90% read and 10% write operations with 8 threads against each MongoDB instance. Results may vary.
7
Scale-out with Docker under z/VM on IBM z14
Scale out to 2 million Docker containers in a single z14 system, with no application server farms necessary.
Setup: 2000 z/VM guests (2 vCPU, 16 GB each) spread across 16 LPARs (each 10 IFL, 2 TB memory) under z/VM 6.4, with each guest running 1000 BusyBox containers with Apache HTTP, for 2 million containers in total.
Disclaimer: Performance result is extrapolated from IBM internal tests running 1000 BusyBox Docker containers with Apache HTTP in a z14 LPAR with 10 dedicated IFLs and 16 GB memory. Results may vary. The operating system was SLES 12 SP2 (SMT mode); Docker 1.12 was used.
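As an illustration of how such a fleet of containers could be started inside one guest, a minimal shell sketch is shown below; the image, container names, and port are assumptions for illustration, not the configuration IBM used.

  # Start 1000 BusyBox containers, each serving HTTP via BusyBox's built-in httpd.
  for i in $(seq 1 1000); do
    docker run -d --name web$i busybox httpd -f -p 80
  done
  # Quick sanity check: count the running containers.
  docker ps -q | wc -l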
8
MongoDB Consolidation under z/VM on IBM z14
Each MongoDB instance processed 280 million transactions per day. Run 1344 concurrent databases executing a total of 377 billion database transactions per day on a single z14 server.
Setup: z14 with four LPARs (each 42 IFL, 1.3 TB memory), each running z/VM 6.4 with 336 SLES guests (2 vCPU, 4 GB each); every guest runs a MongoDB instance with a 2 GB database driven by a MongoBench workload driver (90% read, 10% write, 8 threads) included in each instance.
Disclaimer: Performance result is extrapolated from IBM internal tests running, in a z14 LPAR with 32 dedicated IFLs and 1 TB memory, a z/VM 6.4 (with the PTF for APAR VM65942) instance in SMT mode with 256 guests. Each guest was configured with 2 vCPUs and 4 GB memory and ran a MongoDB Enterprise Server instance (no sharding, no replication) with a 2 GB database. The guest image and the databases were located on an FCP-attached DS8700 with multi-pathing enabled. The MongoDB instances were driven locally by a MongoBench instance which issued 90% read and 10% write operations with 8 threads against each MongoDB instance. Results may vary.
9
AcmeAir Consolidation in native LPAR on z14
Each of the 20 AcmeAir instances processed over 2.1 billion web transactions per day. Run 20 concurrent AcmeAir benchmark instances on 260 Node.js 6.11 instances executing in total over 43 billion web transactions per day on a single z14 server.
Setup: z14 with 170 IFLs across four drawers (two drawers with 42 IFLs, two with 43 IFLs); per drawer, 4 LPARs (9 IFL, 64 GB memory) each run Apache HTTP, 14 Node.js (AcmeAir) instances and MongoDB, plus one further LPAR per drawer (6 IFL, 64 GB memory with 8 Node.js instances on the 42-IFL drawers; 7 IFL, 64 GB memory with 10 Node.js instances on the 43-IFL drawers); 20 JMeter instances drive the AcmeAir workload.
Disclaimer: Performance result is extrapolated from IBM internal tests running 4 LPARs on a single-drawer z14 system, where each LPAR had 9 IFLs and 64 GB memory and was running the AcmeAir benchmark with 10,000 customers on 14 Node.js 6.11 instances against a MongoDB Enterprise database driven remotely by 250 JMeter 2.13 threads. In each LPAR an Apache HTTP server instance was running as load balancer to forward the JMeter requests to the Node.js instances. The 14 Node.js instances in each LPAR were pinned to 7 IFLs and the MongoDB instance was pinned to 1 IFL. SLES 12 SP2 was used as operating system and ran in SMT mode. Application logs and the MongoDB databases were located on a RAM disk. Results may vary.
10
MongoDB Consolidation under z/VM on IBM z14 leveraging additional memory available
Run MongoDB under z/VM 6.4 on z14 and get 4.8x better performance leveraging the additional memory available per z/VM instance compared to z13.
Setup: z14 LPAR (32 IFL, 2 TB memory) running z/VM 6.4 with 4 SLES guests (8 vCPU, 510 GB each), each with a 256 GB MongoDB database driven by YCSB (100% read); the databases fit in memory, yielding 193k transactions/sec in total. z13 LPAR (32 IFL, 1 TB memory) running z/VM 6.3 with 4 SLES guests (8 vCPU, 250 GB each); the databases do not fit in memory, yielding 40k transactions/sec in total.
Disclaimer: Performance result based on IBM internal tests comparing MongoDB performance under z/VM 6.4 with the PTF for APAR VM65942 on z14 with MongoDB performance under z/VM 6.3 on z13, driven remotely by YCSB (100% read operations). Results may vary. z14 configuration: LPAR with 32 dedicated IFLs and 2 TB memory running a z/VM 6.4 instance in SMT mode with 4 guests. Each guest was configured with 8 vCPUs and 510 GB memory and ran a MongoDB Enterprise Server instance (no sharding, no replication) with a 256 GB database. The databases were located on an FCP-attached DS8700 LUN with multi-pathing enabled; 1 FCP path per z/VM guest. z13 configuration: LPAR with 32 dedicated IFLs and 1 TB memory running a z/VM 6.3 instance in SMT mode with 4 guests. Each guest was configured with 8 vCPUs and 250 GB memory and ran a MongoDB Enterprise Server instance (no sharding, no replication) with a 256 GB database. The databases were located on an FCP-attached DS8700 LUN with multi-pathing enabled; 1 FCP path per z/VM guest.
11
MongoDB scale-up on z14 leveraging the additional memory available compared to z13
Scale up a single MongoDB instance to 17 TB in a single system without database sharding and get 2.4x more throughput and 2.3x lower latency on z14 leveraging the additional memory available compared to z13. Chart values relative to z13: 2.4x (throughput), 0.44x and 0.42x (latency).
Disclaimer: Performance result based on IBM internal tests comparing MongoDB performance in native LPAR on z14 using additional memory versus z13, driven by YCSB (write-heavy, read-only). Results may vary. z14 configuration: LPAR with 12 dedicated IFLs and 20 TB memory running, on SLES 12 SP2 (SMT mode), a MongoDB Enterprise Release instance (no sharding, no replication) with a 17 TB database. The database was located on an 18 TB LUN on an IBM FlashSystem 900. z13 configuration: LPAR with 12 dedicated IFLs and 10 TB memory running, on SLES 12 SP2 (SMT mode), a MongoDB Enterprise Release instance (no sharding, no replication) with a 17 TB database. The database was located on an 18 TB LUN on an IBM FlashSystem 900.
12
MongoDB scale-up on z14 leveraging the additional memory available compared to z13 – Benchmark Setup
z13: LPAR with 12 IFLs and 10 TB memory running SLES 12 SP2; 18 TB FlashSystem™ 900 storage (18 TB LUN); 17 TB MongoDB database, no sharding; YCSB benchmark (write-heavy, read-only); the database does not fit in memory on z13.
z14: LPAR with 12 IFLs and 20 TB memory running SLES 12 SP2; 18 TB FlashSystem 900 storage (18 TB LUN); 17 TB MongoDB database, no sharding; YCSB benchmark (write-heavy, read-only); the database does fit in memory on z14.
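For a working set this large to be served from memory, the WiredTiger cache usually has to be sized explicitly. A minimal sketch of such a mongod start, with paths and cache size as illustrative assumptions rather than the values IBM used:

  # Start a single MongoDB instance with an explicitly sized WiredTiger cache
  # so the multi-TB working set stays memory-resident (values are illustrative).
  mongod --dbpath /data/mongodb \
         --wiredTigerCacheSizeGB 16384 \
         --logpath /var/log/mongodb/mongod.log --fork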
13
Database Backup & Restore (vs. x86)
Linux Workload Performance on IBM z14 Model M0x
Database Backup & Restore (vs. x86)
Operators can perform database backup up to 11.9x faster and database restore up to 2.4x faster for Db2® LUW on a z14 LPAR using zEDC Express versus a compared x86 platform using software compression.
Operators can perform database backup with up to 75% lower CPU utilization and database restore with up to 71% lower CPU utilization for Db2 LUW on a z14 LPAR using zEDC Express versus a compared x86 platform using software compression.
Operators can perform database dump up to 3.5x faster and database restore up to 1.1x faster for MongoDB Enterprise Edition on a z14 LPAR using zEDC Express versus a compared x86 platform using software compression.
Operators can perform database dump with up to 84% lower CPU utilization and database restore with up to 9% lower CPU utilization for MongoDB Enterprise Edition on a z14 LPAR using zEDC Express versus a compared x86 platform using software compression.
The following proof points and detailed measurements are provided on the next slides.
14
Db2 LUW Backup and Restore Performance on z14 vs x86 Broadwell
Operators can perform database backup up to 11.9x faster and database restore up to 2.4x faster for Db2 LUW on a z14 LPAR using zEDC Express versus a compared x86 platform using software compression.
Disclaimer: Performance results based on IBM internal tests running database backup and restore with compression on Db2 v11.1.1 fp1a on a database of size 385 GB, using the built-in software compression mechanism on x86 and genwqe-user for zEDC Express on z14. Results may vary. z14 configuration: LPAR with 32 dedicated IFLs, 1-8 IFLs enabled for Db2, 1.5 TB memory, RHEL 7.3 in SMT mode, database and backup located on IBM DS8000. x86 configuration: 36 Intel E v4 2.30GHz Broadwell cores, 1-8 cores enabled for Db2, 1.5 TB memory, RHEL 7.3, database and backup located on IBM DS8000.
15
Db2 LUW Backup and Restore Performance on z14 vs x86 Broadwell
Operators can perform database backup with up to 75% lower CPU utilization and database restore with up to 71% lower CPU utilization for Db2 LUW on a z14 LPAR using zEDC Express versus a compared x86 platform using software compression.
Disclaimer: Performance results based on IBM internal tests running database backup and restore with compression on Db2 v11.1.1 fp1a on a database of size 385 GB, using the built-in software compression mechanism on x86 and genwqe-user for zEDC Express on z14. Results may vary. z14 configuration: LPAR with 32 dedicated IFLs, 1-8 IFLs enabled for Db2, 1.5 TB memory, RHEL 7.3 in SMT mode, database and backup located on IBM DS8000. x86 configuration: 36 Intel E v4 2.30GHz Broadwell cores, 1-8 cores enabled for Db2, 1.5 TB memory, RHEL 7.3, database and backup located on IBM DS8000.
16
Db2 LUW Backup and Restore Performance on z14 vs x86 Broadwell – Benchmark Configuration
Benchmark Setup
Ran Db2 backup and restore with built-in software compression on x86 and with zEDC Express (GenWQE connected via a named pipe) on z14; see the sketch below.
Parameter 'parallelism' was set to twice the number of cores/IFLs.
Db2 database size 385 GB; database and backup located on IBM DS8000®.
System Stack
x86 (software-based compression): 36 Intel E v4 2.30GHz cores with Hyperthreading turned on and 1.5 TB memory running RHEL 7.3; 2 TB IBM DS8000 storage; Db2 LUW 11.1.1 fp1a with Db2 built-in compression; IBM System Storage® DS8000.
z14 (zEDC Express based compression): LPAR with 32 dedicated IFLs and 1.5 TB memory running RHEL 7.3 with SMT enabled; attached zEDC Express; genwqe-user; Db2 LUW 11.1.1 fp1a; IBM System Storage DS8000.
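A rough sketch of the named-pipe approach on z14 follows; the paths, database name, and the genwqe_gzip invocation are illustrative assumptions, as the exact GenWQE tooling used in the test is not documented here.

  # Create a named pipe and let the zEDC-backed gzip from genwqe-user
  # compress whatever Db2 writes into it (hypothetical paths and names).
  mkfifo /backup/db2.pipe
  genwqe_gzip < /backup/db2.pipe > /backup/MYDB.bak.gz &

  # Back up the database through the pipe; 'parallelism' was set to
  # twice the number of cores/IFLs (example: 8 IFLs -> 16).
  db2 backup database MYDB to /backup/db2.pipe parallelism 16 without prompting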
17
MongoDB Dump and Restore Performance on z14 vs x86 Broadwell
Operators can perform database dump up to 3.5x faster and database restore up to 1.1x faster for MongoDB Enterprise Edition on a z14 LPAR using zEDC Express versus a compared x86 platform using software compression.
Disclaimer: Performance results based on IBM internal tests running database dump and restore with compression on MongoDB Enterprise Edition on a database of size 83 GB, using pigz v2.3.3 and genwqe-user for zEDC Express on z14. Results may vary. z14 configuration: LPAR with 32 dedicated IFLs, 1-8 IFLs enabled, 1.5 TB memory, 40 GB DASD storage, attached zEDC Express, running RHEL 7.3 in SMT mode, database and backup located on IBM DS8000. x86 configuration: 36 Intel E v4 2.30GHz cores with Hyperthreading, 1-8 cores enabled, 1.5 TB memory, HDD storage, running RHEL 7.3, database and backup located on IBM DS8000.
18
MongoDB Dump and Restore Performance on z14 vs x86 Broadwell
Operators can perform database dump with up to 84% lower CPU utilization and database restore with up to 9% lower CPU utilization for MongoDB Enterprise Edition on a z14 LPAR using zEDC Express versus a compared x86 platform using software compression.
Disclaimer: Performance results based on IBM internal tests running database dump and restore with compression on MongoDB Enterprise Edition on a database of size 83 GB, using pigz v2.3.3 and genwqe-user for zEDC Express on z14. Results may vary. z14 configuration: LPAR with 32 dedicated IFLs, 1-8 IFLs enabled, 1.5 TB memory, 40 GB DASD storage, attached zEDC Express, running RHEL 7.3 in SMT mode, database and backup located on IBM DS8000. x86 configuration: 36 Intel E v4 2.30GHz cores with Hyperthreading, 1-8 cores enabled, 1.5 TB memory, HDD storage, running RHEL 7.3, database and backup located on IBM DS8000.
19
MongoDB Dump and Restore Performance on z14 vs x86 Broadwell – Benchmark Configuration
Benchmark Setup
Ran mongodump and mongorestore with software compression (pigz) on x86 and with zEDC (genwqe-user library) on z14; see the sketch below.
MongoDB database size 83 GB; database and dump located on IBM DS8000.
System Stack
x86 (software-based compression): 36 Intel E v4 2.30GHz cores with Hyperthreading turned on and 1.5 TB memory running RHEL 7.3; 2 TB IBM DS8000 storage; MongoDB Enterprise Edition 3.4.6; pigz 2.3.3; IBM System Storage DS8000.
z14 (zEDC Express based compression): LPAR with 32 dedicated IFLs and 1.5 TB memory running RHEL 7.3 with SMT enabled; attached zEDC Express; pigz 2.3.3, genwqe-user; IBM System Storage DS8000.
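The software-compression path can be reproduced with a simple pipeline like the one below; the archive names and thread count are illustrative, and on z14 the compression step would instead be backed by zEDC Express through the genwqe-user library.

  # Dump the database as an archive stream and compress it with pigz.
  mongodump --archive | pigz -p 8 > /backup/mongo.archive.gz

  # Restore: decompress and feed the archive stream back into mongorestore.
  pigz -dc /backup/mongo.archive.gz | mongorestore --archive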
20
Linux Workload Performance on IBM z14 Model M0x
FICON Express16S+ (vs. z13)
Run the BDI benchmark on Db2 LUW with up to 20% more throughput using FICON Express16S+ cards on z14 compared to using FICON Express16S cards on z13.
Run the pgBench benchmark on PostgreSQL with up to 45% more throughput using FICON Express16S+ cards on z14 compared to using FICON Express16S cards on z13.
Run the sysbench benchmark on MariaDB with up to 19% more throughput using FICON Express16S+ cards on z14 compared to using FICON Express16S cards on z13.
The following proof points and detailed measurements are provided on the next slides.
21
Db2 LUW 11.1.1 Performance with FICON Express16S+ Cards
Run the BDI benchmark on Db2 LUW with up to 20% more throughput using FICON Express16S+ cards on z14 compared to using FICON Express16S cards on z13.
Disclaimer: Performance results based on IBM internal tests running the BDI benchmark, which is based on TPC-DS, on Db2 LUW with BLU Acceleration. The BDI benchmark was configured to run a fixed sequence of queries. Db2 database size was 500 GB. Results may vary. z13 configuration: LPAR with 8 dedicated IFLs, 64 GB memory, and 11 TB LUN on IBM FlashSystem 900 attached via FICON Express16S cards, RHEL 7.3 (SMT mode) running Db2 LUW 11.1.1, IBM Java 1.8, and BDI. z14 configuration: LPAR with 8 dedicated IFLs, 64 GB memory, and 11 TB LUN on IBM FlashSystem 900 attached via FICON Express16S+ cards, RHEL 7.3 (SMT mode) running Db2 LUW 11.1.1, IBM Java 1.8, and BDI.
22
Db2 LUW 11.1.1 Performance with FICON Express16S+ Cards – Benchmark Configuration
Benchmark Setup
BDI workload driver based on TPC-DS; 8 parallel users performing predefined SQL queries.
Db2 database size 500 GB.
System Stack
z13: LPAR with 8 dedicated IFLs and 64 GB memory running RHEL 7.3 with SMT enabled; 11 TB LUN on IBM FlashSystem 900 attached via FICON Express16S card; Db2 LUW 11.1.1, IBM Java 1.8.
z14: LPAR with 8 dedicated IFLs and 64 GB memory running RHEL 7.3 with SMT enabled; 11 TB LUN on IBM FlashSystem 900 attached via FICON Express16S+ card; Db2 LUW 11.1.1, IBM Java 1.8.
Diagram: BDI workload driver and Db2 LUW database on RHEL 7.3, with the database on IBM FlashSystem 900.
23
PostgreSQL 9.6.1 Performance with FICON Express16S+ Cards
Run the pgBench benchmark on PostgreSQL with up to 45% more throughput using FICON Express16S+ cards on z14 compared to using FICON Express16S cards on z13.
Disclaimer: Performance results based on IBM internal tests running the pgBench 9.6 benchmark (64 threads, 1000 clients) remotely against PostgreSQL 9.6.1. PostgreSQL database size was 300 GB. Results may vary. z13 configuration: LPAR with 8 dedicated IFLs, 64 GB memory, and 400 GB Flash 9840 LUN attached via FICON Express16S cards, SLES 12 SP2 (SMT mode) running PostgreSQL 9.6.1. z14 configuration: LPAR with 8 dedicated IFLs, 64 GB memory, and 400 GB Flash 9840 LUN attached via FICON Express16S+ cards, SLES 12 SP2 (SMT mode) running PostgreSQL 9.6.1.
24
PostgreSQL 9.6.1 Performance with FICON Express16S+ Cards – Benchmark Configuration
Benchmark Setup
Ran the pgBench 9.6 workload driver remotely on an x86 blade server with 64 threads and 1000 clients; read-only (100% read) and write-only (100% write) runs; see the sketch below.
PostgreSQL database size 300 GB.
System Stack
z13: LPAR with 8 dedicated IFLs and 64 GB memory running SLES 12 SP2 with SMT enabled; 400 GB LUN on IBM FlashSystem 900 attached via FICON Express16S card; PostgreSQL 9.6.1.
z14: LPAR with 8 dedicated IFLs and 64 GB memory running SLES 12 SP2 with SMT enabled; 400 GB LUN on IBM FlashSystem 900 attached via FICON Express16S+ card; PostgreSQL 9.6.1.
Diagram: pgBench on an x86 blade server (Linux) drives the 300 GB PostgreSQL 9.6.1 database remotely.
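For reference, a pgBench invocation matching the stated thread and client counts might look as follows; the host name, scale factor, run length, and the custom write script are illustrative assumptions.

  # Initialize a test database (scale factor chosen here only as an example).
  pgbench -i -s 20000 -h dbhost -U postgres pgbench

  # Read-only run: 1000 clients, 64 worker threads, built-in select-only script.
  pgbench -h dbhost -U postgres -c 1000 -j 64 -T 600 -S pgbench

  # Write-only run: same client/thread counts, driving a custom 100%-write script.
  pgbench -h dbhost -U postgres -c 1000 -j 64 -T 600 -f write_only.sql pgbench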
25
MariaDB 10.1.24 Performance with FICON Express16S+ Cards
Run the sysbench benchmark on MariaDB with up to 19% more throughput using FICON Express16S+ cards on z14 compared to using FICON Express16S cards on z13.
Disclaimer: Performance results based on IBM internal tests running the sysbench 0.5 benchmark on MariaDB 10.1.24. MariaDB database size was 100 GB. Results may vary. z13 configuration: LPAR with 16 dedicated IFLs, 32 GB memory, and 150 GB LUN on IBM FlashSystem 900 attached via FICON Express16S cards, SLES 12 SP2 (SMT mode) running MariaDB 10.1.24. z14 configuration: LPAR with 16 dedicated IFLs, 32 GB memory, and 150 GB LUN on IBM FlashSystem 900 attached via FICON Express16S+ cards, SLES 12 SP2 (SMT mode) running MariaDB 10.1.24.
26
MariaDB 10.1.24 Performance with FICON Express16S+ Cards – Benchmark Configuration
Benchmark Setup
sysbench 0.5 benchmark; read-only (100% read) and read-write runs; see the sketch below.
MariaDB database size 100 GB.
System Stack
z13: LPAR with up to 16 dedicated IFLs and 32 GB memory running SLES 12 SP2 with SMT enabled; 150 GB LUN on IBM FlashSystem 900 attached via FICON Express16S card; MariaDB 10.1.24.
z14: LPAR with up to 16 dedicated IFLs and 32 GB memory running SLES 12 SP2 with SMT enabled; 150 GB LUN on IBM FlashSystem 900 attached via FICON Express16S+ card; MariaDB 10.1.24.
Diagram: sysbench on SLES 12 SP2 drives the MariaDB database located on IBM FlashSystem 900.
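sysbench 0.5 drives such OLTP runs through its bundled Lua scripts; a sketch of the two modes is shown below, where the script path, connection details, and thread count are illustrative assumptions.

  OLTP=/usr/share/doc/sysbench/tests/db/oltp.lua   # path varies by distribution

  # Prepare the test tables once, then run the two modes.
  sysbench --test=$OLTP --mysql-host=dbhost --mysql-user=sbtest prepare

  # Read-only OLTP run against the MariaDB instance.
  sysbench --test=$OLTP --mysql-host=dbhost --mysql-user=sbtest \
           --oltp-read-only=on --num-threads=64 --max-time=600 --max-requests=0 run

  # Read-write OLTP run: same script with writes enabled.
  sysbench --test=$OLTP --mysql-host=dbhost --mysql-user=sbtest \
           --oltp-read-only=off --num-threads=64 --max-time=600 --max-requests=0 run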
27
Pervasive Encryption z14 vs. z13 z14 vs. x86
Linux Workload Performance on IBM z14 Model M0x
Pervasive Encryption, z14 vs. z13:
OpenSSL 1.0.2j provides up to 16x more throughput per core on a z14 LPAR compared to a z13 LPAR.
Run the read-mostly workload of the YCSB benchmark on MongoDB Enterprise Edition with only 6% CPU overhead on average when enabling pervasive encryption on a z14 LPAR.
Pervasive Encryption, z14 vs. x86:
Run the DayTrader benchmark on WebSphere® Application Server with pervasive encryption enabled and achieve up to 2.3x more throughput on a z14 LPAR versus a compared x86 platform.
OpenSSL 1.0.2j provides up to 7x better performance per core on a z14 LPAR versus a compared x86 platform.
The following proof points and detailed measurements are provided on the next slides.
28
OpenSSL Performance on z14 vs z13
OpenSSL 1.0.2j provides up to 16x more throughput per core on a z14 LPAR compared to a z13 LPAR. Chart values for the individual ciphers: 3.4x, 2.1x, 16x, 9.4x, 7.4x.
Disclaimer: Performance result based on IBM internal tests comparing OpenSSL 1.0.2j speed benchmark performance for different ciphers in native LPAR on z14 versus z13. OpenSSL was invoked with the options: speed -elapsed -multi 1 -evp <cipher>. Results may vary. z14 configuration: LPAR with 8 dedicated IFLs, 128 GB memory, 40 GB DASD storage, SLES12 SP2 (SMT mode), libica and openssl-ibmca exploiting CPACF enhancements. z13 configuration: LPAR with 8 dedicated IFLs, 128 GB memory, 40 GB DASD storage, SLES12 SP2 (SMT mode).
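The quoted invocation can be reproduced directly; a sketch with a concrete cipher substituted for <cipher> is shown below, where the cipher choice and the ibmca engine check are illustrative assumptions.

  # Verify that the ibmca engine (openssl-ibmca on top of libica) is available,
  # so CPACF is actually exploited.
  openssl engine -c ibmca

  # Single-process throughput measurement for one cipher, as quoted on the slide.
  openssl speed -elapsed -multi 1 -evp aes-128-gcm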
29
CPU Overhead with Pervasive Encryption for MongoDB on z14
Run the read-mostly workload of the YCSB benchmark on MongoDB Enterprise Edition with only 6% CPU overhead on average when enabling pervasive encryption on a z14 LPAR.
Disclaimer: Performance result is extrapolated from IBM internal tests running MongoDB Enterprise Edition with and without SSL and database encryption, driven remotely by 512 total threads of the Yahoo! Cloud Serving Benchmark (YCSB) using the read-mostly workload (95% read, 5% update) and a record size of 5 KB. Two external x86 blade servers, each with 4 independent YCSB instances, stressed the MongoDB database simultaneously. YCSB was configured to generate constant throughput rates. An RSA 4096-bit key was used for the SSL configuration of MongoDB; GCM-based ciphers were used for SSL. The database was stored via dm-crypt using aes-xts-plain64. CPU utilization for the pervasive encryption case was projected by scaling the achieved throughput with pervasive encryption to the throughput achieved without encryption. Results may vary. z14 configuration: LPAR with 8 dedicated IFLs, 256 GB memory, 40 GB DASD storage, RHEL 7.3 (SMT mode), OpenSSL 1.0.1e-fips, 50 GB database on IBM DS8000 storage.
30
CPU Overhead with Pervasive Encryption for MongoDB on z14 – Benchmark Setup
Setup diagram: LPAR with 8 IFL, 256 GB memory, RHEL 7.3 in SMT mode running MongoDB with a 50 GB database on DS8K storage protected with dm-crypt; two x86 blades (client emulators), each running 4 YCSB instances (YCSB-0 to YCSB-3), connected over a 10 Gbps network with TLS v1.2 SSL.
Benchmark details: YCSB Workload B (read-mostly, 95% read, 5% update); record size 5 KB; YCSB with target transaction rate specified; MongoDB v3.4.1 with SSL disabled or enabled (GCM ciphers); OpenSSL 1.0.1; memory primed before tests; no encryption, or dm-crypt with aes-xts-plain64 (see the sketch below).
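Setting up the dm-crypt layer referenced above typically follows the usual cryptsetup flow; a minimal sketch, assuming a hypothetical backing device and mount point:

  # Format the backing device with LUKS using the same cipher as in the test.
  cryptsetup luksFormat --cipher aes-xts-plain64 --key-size 512 /dev/mapper/mpatha

  # Open the encrypted device and put a file system with the MongoDB data on it.
  cryptsetup luksOpen /dev/mapper/mpatha mongo_crypt
  mkfs.xfs /dev/mapper/mongo_crypt
  mount /dev/mapper/mongo_crypt /data/mongodb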
31
Pervasive Encryption Performance for WebSphere Application Server on z14 vs x86 Broadwell
Run the DayTrader benchmark on WebSphere Application Server with pervasive encryption enabled and achieve up to 2.3x more throughput on a z14 LPAR versus a compared x86 platform.
Disclaimer: Performance result is extrapolated from IBM internal tests running the DayTrader 3 benchmark with pervasive encryption on WebSphere Application Server (WAS) with IBM Java 8 SR5, using Db2 LUW to persist application data. The workload was driven remotely by Apache JMeter to trade stocks among users. The SSL encryption protocol TLS v1.2 with cipher suite SSL_RSA_WITH_AES_256_GCM_SHA384 and a 4096-bit key size was used to encrypt the communication between JMeter, WAS, and Db2; the Db2 database and log files were encrypted using dm-crypt with aes-xts-plain64. Results may vary. z14 configuration: WAS running in an LPAR with 4 dedicated IFLs, 64 GB memory, 80 GB DASD storage, HyperPAV=16, SLES 12 SP2 (SMT mode). Db2 LUW running in an LPAR with 4 dedicated IFLs, 64 GB memory, 80 GB DASD storage, HyperPAV=16, SLES 12 SP2 (SMT mode). WAS and Db2 were communicating via HiperSockets. x86 configuration: WAS running bare metal on an x86 server with 4 enabled Intel(R) Xeon(R) CPU E GHz cores with Hyperthreading and 64 GB memory. Db2 was running bare metal on an x86 server with 4 enabled Intel(R) Xeon(R) CPU E GHz cores with Hyperthreading, 64 GB memory, and local SSD storage. The two servers were connected by 10 Gb Ethernet.
32
Pervasive Encryption Performance for WebSphere Application Server on z14 vs x86 Broadwell – Benchmark Configuration
Benchmark Setup
DayTrader benchmark (15,000 users, stocks) (ftp://public.dhe.ibm.com/software/webservers/appserv/was/DayTrader3Install.zip).
Two driving x86 servers, each trading for 7,500 users; 2-6 driver threads (channels) per WAS compute thread; see the sketch below.
SSL encryption protocol TLS v1.2 with cipher suite SSL_RSA_WITH_AES_256_GCM_SHA384 and 4096-bit key size used between JMeter, WAS, and Db2.
Db2 DayTrader database and log files encrypted with dm-crypt using aes-xts-plain64.
System Stack z14
2 LPARs, each with 4 IFL and 64 GB memory running SLES12 SP2 with SMT enabled, DS8000 DASD storage.
WAS with IBM Java 8 SR5 in one LPAR; Db2 in the second LPAR; LPARs connected via HiperSockets™.
System Stack x86
x86 server with 4 enabled Intel Xeon CPU E GHz cores with Hyperthreading and 64 GB memory running WAS with IBM Java 8 SR5 on SLES 12 SP2.
x86 server with 4 enabled Intel Xeon CPU E GHz cores with Hyperthreading and 64 GB memory running Db2 LUW on SLES12 SP2.
x86 servers connected via 10 Gbps Ethernet.
Diagram: JMeter (using IBM Java) on 2 x86 servers (24 cores each, RHEL 7.3) drives WAS (DayTrader) and Db2 LUW, each on an LPAR / x86 server (4 cores, 64 GB memory, SLES 12 SP2); database and logs on IBM DS8000 DASD storage or local SSDs; connectivity via HiperSockets / 10 Gbit and FICON® / SCSI.
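A remote JMeter driver of this kind is normally run headless; a sketch of such an invocation is shown below, where the test-plan file name and property names are illustrative assumptions rather than the actual IBM test plan.

  # Non-GUI JMeter run driving DayTrader over HTTPS; thread count and host
  # are passed in as properties referenced inside the (hypothetical) test plan.
  jmeter -n -t daytrader3.jmx -Jhost=was-lpar -Jport=9443 -Jthreads=48 \
         -l results.jtl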
33
OpenSSL Performance on z14 vs x86 Broadwell
OpenSSL 1.0.2j provides up to 7x better performance per core on a z14 LPAR versus a compared x86 platform.
Disclaimer: Performance result based on IBM internal tests comparing OpenSSL 1.0.2j speed benchmark performance for different ciphers in native LPAR on z14 versus x86 bare metal. OpenSSL was invoked with the options: speed -elapsed -multi 1 -evp <cipher>. Results may vary. x86 configuration: 8 Intel E v4 2.30GHz cores with Hyperthreading turned on, 1.5 TB memory, 500 GB local RAID-5 HDD storage, SLES12 SP2. z14 configuration: LPAR with 8 dedicated IFLs, 128 GB memory, 40 GB DASD storage, SLES12 SP2 (SMT mode), libica and openssl-ibmca exploiting CPACF enhancements.
34
Pervasive Encryption Performance for Node.js on z14 vs x86 Broadwell
Run the Acme Air benchmark on Node.js 6.10 with pervasive encryption enabled and achieve up to 1.9x more throughput on a z14 LPAR versus a compared x86 platform.
Disclaimer: Performance results based on IBM internal tests running the Acme Air benchmark with 10,000 customers on Node.js v6.10 against MongoDB Enterprise Edition v3.4.2, driven remotely by 250 JMeter v2.13 threads. An Apache HTTP server was used as load balancer. TLS v1.2 with SSL cipher suite ECDHE-RSA-AES128-GCM-SHA256 was used between JMeter and Apache HTTP, the ECDHE-RSA-AES128-GCM-SHA256 cipher was used between Apache HTTP and Node.js, an RSA 4096-bit key was used for the SSL configuration of MongoDB, and the database was encrypted via dm-crypt using aes-xts-plain64. The number of Node.js instances equals twice the number of cores assigned to Node.js, each instance pinned to a single CPU. Results may vary. z14 configuration: Node.js and Apache HTTP server running in an LPAR with 20 dedicated IFLs, 1.5 TB memory, 40 GB DASD storage, SLES 12 SP2 (SMT mode), OpenSSL 1.0.2j-fips, Apache HTTP server pinned to 2 IFLs, Node.js pinned to 1-8 IFLs. MongoDB running in an LPAR with 2 dedicated IFLs, 64 GB memory, 40 GB DASD storage, SLES12 SP2 (SMT mode), OpenSSL 1.0.2j-fips, application logs and database on a RAM disk. x86 configuration: Node.js and Apache HTTP server running on a server with 36 Intel® Xeon® CPU E GHz cores with Hyperthreading, 1.5 TB RAM, SLES12 SP2, OpenSSL 1.0.2j-fips, Apache HTTP server pinned to 2 cores, Node.js pinned to 1-8 cores. MongoDB running on a server with 36 Intel® Xeon® CPU E GHz cores with Hyperthreading, 768 GB RAM, only 2 cores active, SLES12 SP2, OpenSSL 1.0.2, application logs and database on a RAM disk.
35
Pervasive Encryption Performance for Node.js on z14 vs x86 Broadwell – Benchmark Setup on z14
Topology: JMeter (client emulator) on an x86 blade drives, over a 10 Gbps network, Apache HTTP (load balancer) and Acme Air (Node.js app) with encryption libraries in the application server LPAR, which accesses MongoDB in the database LPAR.
Configuration:
System: z14, 2 LPARs
Cores: 20 dedicated IFL with SMT for the app server LPAR; 2 dedicated IFL with SMT for the database LPAR
Core assignment: Apache: 2 (pinned); Node.js: 1, 2, 3, 4, 8 (pinned); MongoDB: 2 IFL active
Memory: 1.5 TB app server LPAR, 64 GB database LPAR
OS: SLES12 SP2
Database: MongoDB Enterprise, database on RAM disk
End-to-end encryption: JMeter ↔ (SSL) ↔ Apache; Apache ↔ (SSL) ↔ Acme Air; Acme Air ↔ (SSL) ↔ MongoDB; MongoDB ↔ (dm-crypt) ↔ disk
36
Pervasive Encryption Performance for Node.js on z14 vs x86 Broadwell – Benchmark Setup on x86
Topology: JMeter (client emulator) on an x86 blade drives, over a 10 Gbps network, Apache HTTP (load balancer) and Acme Air (Node.js app) with encryption libraries on the x86 application server, which accesses MongoDB on the x86 database server (Broadwell and Haswell systems).
Configuration:
System: Broadwell and Haswell
Cores: 36 cores with HT for the app server; 36 cores with HT for the database server
Core assignment: Apache: 2 (pinned); Node.js: 1, 2, 3, 4, 8 (pinned); MongoDB: 2 cores active
Memory: 1.5 TB app server, 768 GB database server
OS: SLES12 SP2
Database: MongoDB Enterprise, database on RAM disk
End-to-end encryption: JMeter ↔ (SSL) ↔ Apache; Apache ↔ (SSL) ↔ Acme Air; Acme Air ↔ (SSL) ↔ MongoDB; MongoDB ↔ (dm-crypt) ↔ disk
37
Competitive Workload Performance (vs x86)
Linux Workload Performance on IBM z14 Model M0x
Competitive Workload Performance (vs. x86)
Run the Acme Air benchmark on Node.js 6.10 with up to 2.5x more throughput per core on a z14 LPAR versus a compared x86 platform.
Run the DayTrader benchmark on WebSphere Application Server with up to 1.9x more throughput per core on a z14 LPAR versus a compared x86 platform.
Run the DayTrader benchmark on Apache TomEE with up to 2.3x more throughput per core on a z14 LPAR versus a compared x86 platform.
Run the MicroBM_CPU benchmark on InfoSphere® DataStage® 11.5 with up to 2.8x more throughput per core on a z14 LPAR versus a compared x86 platform.
Run the pgBench benchmark on PostgreSQL with up to 2x more throughput per core on a z14 LPAR versus a compared x86 platform.
Run the YCSB benchmark on MongoDB with up to 2.6x more throughput per core on a z14 LPAR versus a compared x86 platform.
Run the sysbench benchmark on MariaDB with up to 3.7x more throughput on z14 with IBM FlashSystem 900 storage versus a compared x86 platform with local SSD storage.
Run the apacheBench benchmark on WordPress with up to 1.9x more throughput per core on a z14 LPAR versus a compared x86 platform.
The following proof points and detailed measurements are provided on the next slides.
38
Node.js Performance on z14 vs x86 Broadwell
Run the Acme Air benchmark on Node.js 6.10 with up to 2.5x more throughput per core on a z14 LPAR versus a compared x86 platform. (Chart x-axis: Node.js cores / IFLs.)
Disclaimer: Performance results based on IBM internal tests running Acme Air with 10,000 customers on Node.js v6.10 against MongoDB Enterprise, driven remotely by 250 JMeter 2.13 threads. An Apache HTTP server was used as load balancer. Results may vary. x86 configuration: 36 Intel E v4 2.30GHz cores, Apache HTTP server pinned to 1 core, Node.js pinned to 1-16 cores, MongoDB pinned to 2-4 cores, 768 GB memory, SLES12 SP2 with Hyperthreading, application logs and database on a RAM disk. z14 configuration: LPAR with 32 dedicated IFLs, Apache HTTP server pinned to 1 IFL, Node.js pinned to 1-16 IFLs, MongoDB pinned to 2-4 IFLs, 768 GB memory, 40 GB DASD storage, SLES12 SP2 with SMT, application logs and database on a RAM disk.
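Core pinning of the kind described in the disclaimer is commonly done with taskset; a sketch follows, assuming hypothetical core numbers and application file names.

  # Pin the load balancer to one core and each Node.js worker to its own core.
  taskset -c 0 httpd -k start
  taskset -c 1 node acmeair-app.js &
  taskset -c 2 node acmeair-app.js &

  # Verify the affinity of a running process.
  taskset -cp $(pidof -s node)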
39
Node.js Performance on z14 vs x86 Broadwell – Benchmark Setup
Topology (z14 LPAR or x86 Broadwell server):
JMeter (client emulator) on an x86 blade server (16 cores, 256 GB RAM): generates requests and emulates concurrent clients.
Apache HTTP (load balancer): directs requests from JMeter and stresses all Node.js instances.
Acme Air Node.js instances: main processing units, providing the HTTP interface (dashboard).
MongoDB database: resides on a RAM disk (xfs formatted) and stores customer information.
All components are connected over a 10 Gbps network.
40
WebSphere Application Server Performance on z14 vs x86 Broadwell
Run the DayTrader benchmark on WebSphere Application Server with up to 1.9x more throughput per core on a z14 LPAR versus a compared x86 platform.
Disclaimer: Performance results based on IBM internal tests running the DayTrader 3 web application benchmark on WebSphere Application Server (WAS) with IBM Java 8 (SR3). A Db2 LUW database located on the same system was used to persist application data. Half of the compute cores for each system variation under test were bound to Db2, the other half to WAS. The workload was driven remotely by Apache JMeter to trade stocks among users. The utilization of the workload was adjusted by the number of driver threads. Results may vary. x86 configuration: 2-16 Intel(R) Xeon(R) CPU E GHz cores, 1.5 TB fast TruDDR4 2400MHz memory, and 400 GB local HDD storage, SLES12 SP2 with Hyperthreading enabled. z14 configuration: LPAR with 2-16 IFLs running under SLES12 SP2 (SMT mode), 64 GB memory, 80 GB DASD storage, HyperPAV=8.
41
WebSphere Application Server Performance on z14 vs x86 Broadwell – Benchmark Configuration
Diagram: JMeter (using IBM Java) on 2 x86 servers (24 cores each, RHEL 7.3) drives the workload under test, DayTrader 3 on WAS with Db2, in a z14 LPAR (2-16 IFLs with SMT, 64 GB memory, SLES 12.2); database and log storage on an IBM DS8K server attached via FICON; 10 Gbit network.
Benchmark Setup
DayTrader benchmark (15,000 users, stocks) (ftp://public.dhe.ibm.com/software/webservers/appserv/was/DayTrader3Install.zip); ibm-java-x86_64-sdk.
Two driving x86 servers, each trading for 7,500 users; 2-6 driver threads (channels) per WAS compute thread.
System Stack
z14: LPAR with 2-16 IFL and 64 GB memory running SLES12 SP2 with SMT enabled, DS8K DASD storage; WAS with Java pinned to half of the IFLs; Db2 pinned to half of the IFLs.
x86: 2-16 Intel® Xeon CPU E GHz cores and 1.5 TB memory running SLES 12 SP2 with Hyperthreading enabled, local HDD storage; WAS with Java pinned to half of the cores; Db2 pinned to half of the cores.
42
Apache TomEE Performance on z14 vs x86 Broadwell
Run the DayTrader benchmark on Apache TomEE with up to 2.3x more throughput per core on a z14 LPAR versus a compared x86 platform.
Disclaimer: Performance result is extrapolated from IBM internal tests running the DayTrader 3.0 benchmark on Apache TomEE 1.7.1. Results may vary. x86 configuration: 2-16 Intel E v4 2.30GHz cores with Hyperthreading turned on, 256 GB memory, and 500 GB local RAID-5 HDD storage, SLES 12 SP2. z14 configuration: LPAR with 2-16 dedicated IFLs, 256 GB memory, and 420 GB DS8K storage, SLES 12 SP2 (SMT mode).
43
Apache TomEE Performance on z14 vs x86 Broadwell – Benchmark Configuration
Benchmark Setup
DayTrader 3.0 benchmark; 15,000 users, 10,000 stocks.
Driver: Apache JMeter against Apache TomEE 1.7.1 with a MariaDB database on Linux.
System Stack
z14: LPAR with 2-16 dedicated IFLs and 256 GB memory running SLES 12 SP2 with SMT enabled, 420 GB DS8K storage; Apache TomEE 1.7.1, MariaDB, JMeter 2.13, DayTrader 3.0, IBM Java 1.8.
x86: 2-16 Intel E v4 2.30GHz cores with Hyperthreading turned on, 256 GB memory, and 500 GB local RAID-5 HDD storage, SLES 12 SP2.
44
InfoSphere DataStage Performance on IBM z14 vs x86 Broadwell
Run the MicroBM_CPU benchmark on InfoSphere DataStage 11.5 with up to 2.8x more throughput per core on a z14 LPAR versus a compared x86 platform.
Disclaimer: Performance result is extrapolated from IBM internal tests running the MicroBM_CPU 1.0 benchmark on InfoSphere DataStage 11.5. MicroBM_CPU is a simplified version of a DimTrade_History_Load job in the TPC-ETL benchmark. Results may vary. x86 configuration: Intel(R) Xeon(R) CPU E GHz, 1.5 TB fast TruDDR4 2400MHz memory, and 400 GB local HDD storage, RHEL 7.3 with Hyperthreading enabled. z14 configuration: LPAR with 2-32 dedicated IFLs with SMT, 256 GB memory, 200 GB DASD storage (HyperPAV=8), RHEL 7.3.
45
PostgreSQL Performance on z14 vs x86 Broadwell
Run the pgBench benchmark on PostgreSQL with up to 2x more throughput per core on a z14 LPAR versus a compared x86 platform. Chart values: 1.8x, 1.9x, 1.5x.
Disclaimer: Performance result is extrapolated from IBM internal tests running the pgBench 9.6 benchmark on PostgreSQL 9.6.1 (20 GB database on a RAM disk). Results may vary. x86 configuration: 2-16 Intel E v4 2.30GHz cores with Hyperthreading turned on, 64 GB memory, and 500 GB local RAID-5 HDD storage, SLES12 SP2. z14 configuration: LPAR with 2-16 dedicated IFLs, 64 GB memory, and 40 GB DASD storage, SLES12 SP2 (SMT mode).
46
PostgreSQL Performance on z14 vs x86 Broadwell – Benchmark Configuration
Benchmark Setup
Ran the pgBench workload driver locally with 32 concurrent threads/clients; read-only (100% read) and write-only (100% write) runs.
Database size 20 GB, located on a RAM disk.
System Stack
z14: LPAR with 2-16 dedicated IFL, 64 GB memory, and 40 GB DASD storage running SLES 12 SP2 with SMT enabled; PostgreSQL 9.6.1, pgBench 9.6.
x86: 2-16 Intel E v4 2.30GHz cores with Hyperthreading turned on, 64 GB memory, and 500 GB local RAID-5 HDD storage running SLES 12 SP2.
47
MongoDB Performance on z14 vs x86 Broadwell
Run the YCSB benchmark on MongoDB with up to 2.6x more throughput per core on a z14 LPAR versus a compared x86 platform. Chart values: 2.4x, 2.6x, 2.5x.
Disclaimer: Performance results based on IBM internal tests running YCSB (write-heavy, read-only) against a local MongoDB Enterprise Release instance (database size 5 GB). Results may vary. x86 configuration: 36 Intel E v4 2.30GHz cores with Hyperthreading turned on (2-8 cores dedicated to MongoDB, 20 or 28 cores dedicated to YCSB), 64 GB memory, and 480 GB local RAID-5 HDD storage, SLES12 SP2. z14 configuration: LPAR with 36 dedicated IFLs (2-8 IFLs dedicated to MongoDB, 20 or 28 IFLs dedicated to YCSB), 64 GB memory, and 120 GB DASD storage, SLES12 SP2 (SMT mode).
48
MongoDB Performance on z14 vs x86 Broadwell – Benchmark Configuration
Benchmark Setup
Ran the YCSB workload driver locally; read-only (100% read) and write-heavy (50% write) runs; see the sketch below.
Database size 5 GB.
System Stack
z14: LPAR with 36 dedicated IFLs (2-8 IFLs dedicated to MongoDB, 20 or 28 IFLs dedicated to YCSB), 64 GB memory, and 120 GB DASD storage running SLES 12 SP2 with SMT enabled; MongoDB 3.4.1, YCSB.
x86: 36 Intel E v4 2.30GHz cores with Hyperthreading turned on (2-8 cores dedicated to MongoDB, 20 or 28 cores dedicated to YCSB), 64 GB memory, and 480 GB local RAID-5 HDD storage running SLES 12 SP2; MongoDB 3.4.1, YCSB.
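YCSB runs of this kind consist of a load phase and a run phase; a sketch against a local MongoDB is shown below, where record count, thread count, and the choice of workload files are illustrative assumptions.

  # Load phase: populate the collection used by the benchmark.
  bin/ycsb load mongodb -s -P workloads/workloada \
      -p mongodb.url=mongodb://localhost:27017/ycsb -p recordcount=5000000

  # Run phase, read-only: workloadc is YCSB's 100%-read workload.
  bin/ycsb run mongodb -s -P workloads/workloadc \
      -p mongodb.url=mongodb://localhost:27017/ycsb -threads 64

  # Run phase, write-heavy: workloada is YCSB's 50% read / 50% update mix.
  bin/ycsb run mongodb -s -P workloads/workloada \
      -p mongodb.url=mongodb://localhost:27017/ycsb -threads 64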
49
Wordpress Performance on z14 vs x86 Broadwell
Run the apacheBench benchmark on WordPress with up to 1.9x more throughput per core on a z14 LPAR versus a compared x86 platform.
Disclaimer: Performance results based on IBM internal tests running the apacheBench 2.3 benchmark against WordPress from two remote hosts, each simulating 30 concurrent users issuing 600 requests against 10 webpages on a WordPress site. MariaDB, located on the same system as WordPress, was used to persist application data. Results may vary. x86 configuration: 2-16 Intel E v4 2.30GHz cores with Hyperthreading turned on, 64 GB memory, 500 GB local RAID-5 HDD storage, SLES12 SP2, WordPress 4.7.5, MariaDB, ApacheHTTP, PHP 5.6.8. z14 configuration: LPAR with 2-16 dedicated IFLs, 64 GB memory, and 40 GB DASD storage, SLES12 SP2 (SMT mode), WordPress 4.7.5, MariaDB, ApacheHTTP, PHP 5.6.8.
50
Wordpress Performance on z14 vs x86 Broadwell – Benchmark Configuration
Benchmark Setup
ApacheBench workload driver; read-only (100% read); 30 users with 600 page hits against 10 web pages; see the sketch below.
System Stack
z14: LPAR with 2-16 dedicated IFL, 64 GB memory, and 40 GB DASD storage running SLES 12 SP2 with SMT enabled; MariaDB, ApacheHTTP, WordPress 4.7.5, PHP 5.6.8.
x86: 2-16 Intel E v4 2.30GHz cores with Hyperthreading turned on, 64 GB memory, and 500 GB local RAID-5 HDD storage running SLES 12 SP2; MariaDB, ApacheHTTP, WordPress 4.7.5, PHP 5.6.8.
Topology: apacheBench on a remote master and a remote slave drives the HTTP daemons (Apache HTTP webserver with PHP) serving the WordPress website; WordPress data is stored in the MariaDB database.
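An apacheBench run matching the stated concurrency can be expressed directly on the command line; a sketch, with the host name and page URL as illustrative assumptions:

  # 600 requests at a concurrency of 30 against one WordPress page;
  # repeated for each of the 10 pages and from both remote hosts.
  ab -n 600 -c 30 http://wordpress-server/?page_id=2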
51
MariaDB Performance on z14 with IBM FlashSystem 900 vs x86 Broadwell with local SSDs
Run the sysbench benchmark on MariaDB with up to 3.7x more throughput on z14 with IBM FlashSystem 900 storage versus a compared x86 platform with local SSD storage. Chart values: 3.7x, 2.8x.
Disclaimer: Performance results based on IBM internal tests running the sysbench 0.5 benchmark on MariaDB. MariaDB database size was 100 GB. Results may vary. x86 configuration: 2x 36 Intel E v4 2.30GHz cores with Hyperthreading turned on, 32 GB memory, and 2 TB local RAID-5 SSD storage, SLES12 SP2, MariaDB. z14 configuration: LPAR with 8-16 dedicated IFLs, 32 GB memory, 150 GB LUN on IBM FlashSystem 900 attached via FICON Express16S+ cards, SLES12 SP2 (SMT mode), MariaDB.
52
MariaDB Performance on z14 with IBM FlashSystem 900 vs x86 Broadwell with local SSDs – Benchmark Configuration
Benchmark Setup
sysbench 0.5 benchmark; read-only (100% read) and read-write runs.
MariaDB database size 100 GB.
System Stack
x86: 2x 36 Intel E v4 2.30GHz cores with Hyperthreading turned on and 32 GB memory running SLES 12 SP2; 2 TB local RAID-5 SSD storage; MariaDB.
z14: LPAR with 8-16 dedicated IFLs and 32 GB memory running SLES 12 SP2 with SMT enabled; 150 GB LUN on IBM FlashSystem 900 attached via FICON Express16S+ card; MariaDB.
z14 configuration diagram: sysBench driving MariaDB in an LPAR with SLES 12 SP2, data on IBM FlashSystem 900. x86 configuration diagram: sysBench driving MariaDB on an x86 server with SLES 12 SP2, data on local SSDs.
53
IBM z14 – Optimized for Java
IBM SDK, Java Technology Edition, Version 8 (IBM Java 8 SR5)
z14 hardware features:
Pause-less garbage collection: Guarded Storage Facility.
Cryptographic Function (CPACF): improved performance of crypto co-processors; GCM and SHA-3 hardware acceleration; True Random Number Generator.
Single Instruction Multiple Data (SIMD): improved performance; 32-bit floating point enhancements; packed decimal support.
New instructions: hot cache line hints; arithmetic half-word operations.
IBM z14 exploitation (50 new instructions exploited!):
Pause-less garbage collection: up to 10x reduction in GC pause times (see the sketch below).
Improved crypto performance for IBMJCE: AES-GCM block ciphering on z14; higher-quality True Random Number Generator to seed SecureRandom; performance improvements in ECC, AES, SHA-1, SHA-2.
New z14 instruction exploitation: improved PackedDecimal API performance in the Data Access Accelerator; auto-SIMD acceleration for 32-bit binary floating point.
Improved application ramp-up: up to 50% less CPU to ramp up to steady state.
Detailed Java measurements are available on the next slides.
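As a usage hint, concurrent scavenge (pause-less) garbage collection in IBM Java 8 SR5 on z14 is enabled via a JVM option; a minimal sketch, with heap size and the application jar name as illustrative assumptions:

  # Enable concurrent scavenge (pause-less GC backed by the z14 Guarded Storage
  # Facility) for an IBM Java 8 SR5 workload; the jar name is hypothetical.
  java -Xgc:concurrentScavenge -Xmx4g -jar myapp.jar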
54
Liberty DayTrader 3 – Linux on Z – AES-GCM z13 vs z14
IBM z14 Model M0x + Java 8 SR5: AES-GCM cryptography with up to 5.1x better throughput over z13 + Java 8 SR3.
Setup: IBM Liberty with IBM Java 8 SR3 / SR5; IBM z13 and IBM z14; SLES 12 SP1; 4 IFLs, SMT-2; DayTrader 3. (Controlled measurement environment, results may vary.)
55
Liberty DayTrader 3 – Linux on Z – AES-GCM IBM z14 vs Intel Broadwell
IBM z14 Model M0x + Java 8 SR5: AES-GCM cryptography with up to 2.6x better throughput over Intel Broadwell.
Setup: IBM Liberty with DayTrader 3; IBM z14 with IBM Java 8 SR5, SLES 12 SP1, 4 IFLs SMT-2; Intel Xeon E v4 with Oracle Hotspot 8_131, RHEL 7.2, 4 cores HT. (Controlled measurement environment, results may vary.)
56
Liberty DayTrader 3 – Linux on Z – NoSSL IBM z14 vs Intel Broadwell
IBM z14 Model M0x with Java 8 SR5 delivers up to 1.6x better throughput over Intel Broadwell. IBM Liberty with DayTrader3 with IBM Java 8 SR5 IBM z14 – SLES 12 SP1 – 4 IFLs SMT-2 Intel Xeon E v4 – RHEL 7.2 – 4 cores HT (Controlled measurement environment, results may vary) IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
57
Business Rules Processing – Linux on z14
IBM z14 Model M0x with Java 8 SR5 delivers up to 27% better throughput over z13. IBM ODM with IBM Java 8 SR3, SR5 IBM z13 + IBM z14 – SLES 12 SP1 – 8 IFLs SMT-2 5 Ruleset (Controlled measurement environment, results may vary) IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
58
Business Rules Processing – IBM z14 vs Intel Broadwell
IBM z14 Model M0x delivers up to 1.65x more transactions / core over Intel Broadwell (Controlled measurement environment, results may vary) IBM ODM with IBM Java 8 SR5 IBM z14 Model M0x – SLES 12 SP1 – 8 IFLs SMT-2 Intel Xeon E v4 – RHEL 7.2 – 8 cores HT IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
59
Microservices performance cross z14 LPARs vs cross x86 servers
Microservices show a better performance cross z14 LPARs than cross compared x86 servers providing up to 2.4x more throughput per core for the AcmeAir benchmark Disclaimer: Performance results based on IBM internal tests running Apache jmeter remotely against AcmeAir microservice ( on WebSphere Liberty. Results may vary. x86 configuration: Apache jmeter 2.13 running on a x86 server with 18 Intel E v4 2.30GHz, 768GB memory, 400 GB local RAID-5 volume on 15k 12Gbps SAS drives, SLES12 SP2, Docker , Kubernetes 1.3.3, etcd 2.1.3, and Calico 1.1/Flannel /none virtual network. AcmeAir flight / booking service and AcmeAir customer / authentication service running on two separate, but identically configured x86 servers with 18 Intel E v3 2.30GHz, 768GB memory, 400 GB local RAID-5 volume on 15k 12Gbps SAS drives, SLES12 SP2, Docker , Kubernetes 1.3.3, Nginx , WebSphere Liberty , MongoDB 3.5.6, and Calico 1.1/Flannel /none virtual network. z14 configuration: Apache jmeter 2.13 running on a LPAR with 18 dedicated IFLs, 768GB memory, 80 GB DASD storage, SLES12 SP2, Docker , Kubernetes 1.3.3, etcd 2.1.3, and Calico 1.1/Flannel /none virtual network. AcmeAir flight / booking service and AcmeAir customer / authentication service running on two separate, but identically configured LPARs with 18 dedicated IFLs, 768GB memory, 80 GB DASD storage, SLES12 SP2, Docker , Kubernetes 1.3.3, Nginx , WebSphere Liberty , MongoDB 3.5.6, and Calico 1.1/Flannel /none virtual network. IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
60
Microservices performance cross z14 LPARs vs cross x86 servers – Benchmark Setup z14 Setup JMeter Service endpoint (Kubernetes) Nginx CS AS FS BS Mongo z14 LPAR 1 – 18 IFLs LPAR 2 – 18 IFLs LPAR 3 – 18 IFLs Pair of a service and a DB placed in the same LPAR / x86 server Placed an authentication service (AS) and a flight service (FS) in different LPARs / x86 servers Placed an authentication service (AS) and a customer service (CS) in the same LPAR / x86 server x86 Setup JMeter Service endpoint (Kubernetes) Nginx CS AS FS BS Mongo Server 1 – 18 cores x86 Broadwell Server 2 – 18 cores x86 Haswell Server 3 – 18 cores IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
61
Linux Workload Performance on IBM z14 ZR1 Model
Scale-out (vs. IBM z13s®), Scale-up (vs. z13s) Database Backup & Restore (vs. x86) FICON Express16S+ (vs. z13s) Pervasive Encryption (z14 vs. z13s, z14 vs. x86) Competitive Workload Performance (vs x86) Microservices (z14 vs. x86) IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
62
Scale-out (vs. z13s) Scale-up (vs. z13s)
Linux Workload Performance on IBM z14 Model ZR1 Run per core 25% more MongoDB guests with the same throughput under z/VM 6.4 on z14 Model ZR1 compared to z13s Run 87% more MongoDB guests with the same throughput under z/VM 6.4 on z14 Model ZR1 leveraging the additional cores available compared to z13s Use up to 30 IFLs on z14 ZR1 to scale-out MongoDB databases under z/VM 6.4, each with a constant throughput and not more than 10us latency increase per additional MongoDB instance Scale-up (vs. z13s) Scale-up a single MongoDB instance to 6.8 TB in a single system without database sharding and get 2.4x more throughput and 2.3x lower latency on z14 Model ZR1 leveraging the additional memory available compared to z13s Run MongoDB under z/VM 6.4 on z14 Model ZR1 and get 4.6x better performance leveraging additional memory and IFLs available per z/VM instance compared to z13s Following proof points and the detailed measurements are available on the next slides IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
63
MongoDB Consolidation under z/VM on z14 ZR1 versus z13s
Run per core 25% more MongoDB guests with the same throughput under z/VM 6.4 on z14 Model ZR1 compared to z13s Run 87% more MongoDB guests with the same throughput under z/VM 6.4 on z14 Model ZR1 leveraging the additional cores available compared to z13s z13s zHypervisor LPAR (20 IFL, 1 TB memory) z/VM 6.4 MongoDB 2 GB DB MongoBench (SLES guest 100, 2 vCPU, 4 GB) ... (SLES guest 1, 2 vCPU, 4 GB) (SLES guest 2, 2 vCPU, 4 GB) z14 Model ZR1 zHypervisor LPAR (20 IFL, 1 TB memory) z/VM 6.4 MongoDB 2 GB DB MongoBench (SLES guest 125, 2 vCPU, 4 GB) ... MongoDB 2 GB DB MongoBench (SLES guest 1, 2 vCPU, 4 GB) (SLES guest 2, 2 vCPU, 4 GB) Disclaimer: Performance result is extrapolated from IBM internal tests comparing MongoDB performance under z/VM 6.4 with the PTF for APAR VM65942 on z14 versus z13 driven locally by MongoBench ( issuing 90% read and 10% write operations and a target TPS rate of 4000 per guest. Results may vary. z14 configuration: LPAR with 32 dedicated IFLs and 1 TB memory running a z/VM 6.4 instance with the PTF for APAR VM65942 in SMT mode with 200 guests. Each guest was configured with 2 vCPUs and 4 GB memory and ran a MongoDB Enterprise Server instance (no sharding, no replication) with a 2 GB database. The databases were located on a FCP-attached DS8700 LUN with multi-pathing enabled. z13 configuration: LPAR with 32 dedicated IFLs and 1 TB memory running a z/VM 6.4 instance in SMT mode with 160 guests. Each guest was configured with 2 vCPUs and 4 GB memory and ran a MongoDB Enterprise Server instance (no sharding, no replication) with a 2 GB database. The databases were located on a FCP-attached DS8700 LUN with multi-pathing enabled. IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
64
Scale-out MongoDB instances under z/VM 6.4 on z14 ZR1 with minimal SLA impact Use up to 30 IFLs on z14 ZR1 to scale-out MongoDB databases under z/VM 6.4, each with a constant throughput and not more than 10us latency increase per additional MongoDB instance z14 Model ZR1 zHypervisor LPAR (30 IFL, 1 TB memory) z/VM 6.4 MongoDB w/ 2 GB DB MongoBench (SLES guest 1, 2 vCPU, 4 GB) (SLES guest 2, 2 vCPU, 4 GB) MongoDB w/ 2 GB DB Mongobench (SLES guest 240, 2 vCPU, 4 GB) . . . Disclaimer: Performance result is extrapolated from IBM internal tests running in a z14 LPAR with 32 dedicated IFLs and 1 TB memory a z/VM 6.4 with the PTF for APAR VM65942 instance in SMT mode with up to 256 guests. Each guest was configured with 2 vCPUs and 4 GB memory and ran a MongoDB Enterprise Server instance (no sharding, no replication) with a 2 GB database. The guest image and the databases were located on a FCP-attached DS8700 with multi-pathing enabled. The MongoDB instances were driven locally by a MongoBench ( instance which issued 90% read and 10% write operations with 8 threads against each MongoDB instance. Results may vary. IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
65
Scale-out with Docker under z/VM on z14 ZR1
z14 Model ZR1 zHypervisor LPAR 1 (10 IFL, 2 TB memory) z/VM 6.4 865 BusyBox Container with ApacheHTTP z/VM guest 1 (2 vCPU, 16GB) . . . 865 BusyBox Container with ApacheHTTP z/VM guest 128 (2 vCPU, 16GB) LPAR 3 (10 IFL, 2 TB memory) 865 BusyBox Container with ApacheHTTP z/VM guest 257 (2 vCPU, 16GB) 865 BusyBox Container with ApacheHTTP z/VM guest 384 (2 vCPU, 16GB) Scale-out to 330,000 Docker containers in a single z14 ZR1 system, no application server farms necessary Disclaimer: Performance result is extrapolated from IBM internal tests running 1000 BusyBox Docker containers with ApacheHTTP in a z14 LPAR with 10 dedicated IFLs and 16 GB memory. Results may vary. Operating system was SLES12 SP2 (SMT mode). Docker 1.12 was used. IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
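A minimal sketch of how such a container scale-out can be scripted inside one guest; the image, the use of BusyBox's built-in httpd, and the container count are assumptions for illustration rather than the exact setup measured.

# Start lightweight BusyBox web-server containers (865 per guest in the measured setup;
# a smaller count is shown here). Each serves HTTP on container port 8080.
for i in $(seq 1 100); do
  docker run -d --name web$i busybox httpd -f -p 8080
done
docker ps --filter name=web --format '{{.Names}}' | wc -l   # verify the running count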
66
MongoDB Consolidation under z/VM on z14 ZR1
z14 Model ZR1 zHypervisor LPAR (30 IFL, 1 TB memory) z/VM 6.4 MongoDB w/ 2 GB DB MongoBench (SLES guest 1, 2 vCPU, 4 GB) (SLES guest 2, 2 vCPU, 4 GB) MongoDB w/ 2 GB DB Mongobench (SLES guest 240, 2 vCPU, 4 GB) . . . MongoBench Workload Driver (90% read, 10% write, 8 threads) Included in each instance Each MongoDB instance processed 243 million transactions per day Run 240 concurrent databases executing a total of 58 billion database transactions per day on a single z14 ZR1 server Disclaimer: Performance result is extrapolated from IBM internal tests running in a z14 LPAR with 32 dedicated IFLs and 1 TB memory a z/VM 6.4 with the PTF for APAR VM65942 in SMT mode with 256 guests. Each guest was configured with 2 vCPUs and 4 GB memory and ran a MongoDB Enterprise Server instance (no sharding, no replication) with a 2 GB database. The guest image and the databases were located on a FCP-attached DS8700 with multi-pathing enabled. The MongoDB instances were driven locally by a MongoBench ( instance which issued 90% read and 10% write operations with 8 threads against each MongoDB instance. Results may vary. IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
67
MongoDB scale-up on z14 ZR1 leveraging the additional memory available compared to z13s
2.4x Scale-up a single MongoDB instance to 6.8 TB in a single system without database sharding and get 2.4x more throughput and 2.3x lower latency on z14 Model ZR1 leveraging the additional memory available compared to z13s 0.44x 0.43x Disclaimer: Performance result is extrapolated from IBM internal tests comparing MongoDB performance in native LPAR on z14 using additional memory versus z13 driven by YCSB (write-heavy, read-only). Results may vary. z14 configuration: LPAR with 12 dedicated IFLs and 20 TB memory running on SLES 12 SP2 (SMT mode) a MongoDB Enterprise Release instance (no sharding, no replication) with a 17 TB database. The database was located on an 18 TB LUN on an IBM FlashSystem 900. z13 configuration: LPAR with 12 dedicated IFLs and 10 TB memory running on SLES 12 SP2 (SMT mode) a MongoDB Enterprise Release instance (no sharding, no replication) with a 17 TB database. The database was located on an 18 TB LUN on an IBM FlashSystem 900. IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
68
MongoDB scale-up on z14 ZR1 leveraging the additional memory available compared to z13s – Benchmark Configuration z13s zHypervisor LPAR (12 IFL, 4 TB memory) MongoDB w/ 6.8 TB database IBM FlashSystem 900 (8 TB LUN) Database does not fit in memory on z13s z14 Model ZR1 zHypervisor LPAR (12 IFL, 8 TB memory) MongoDB w/ 6.8 TB database IBM FlashSystem 900 (8 TB LUN) Database does fit in memory on z14 ZR1 z13s LPAR w/ 12 IFLs and 4 TB memory running SLES 12 SP2, 8 TB FlashSystem 900 storage 6.8 TB MongoDB database, no sharding YCSB Benchmark (write-heavy, read-only) z14 Model ZR1 LPAR w/ 12 IFLs and 8 TB memory running SLES 12 SP2, 8 TB FlashSystem 900 storage YCSB benchmark (write-heavy, read-only) IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
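For reference, YCSB drives MongoDB in a load phase followed by a run phase, roughly as sketched below; the record and operation counts, thread count, and connection URL are illustrative assumptions and are far smaller than the multi-terabyte database described above.

# Load phase: populate the collection
./bin/ycsb load mongodb -s -P workloads/workloada \
  -p mongodb.url="mongodb://localhost:27017/ycsb" -p recordcount=1000000

# Run phase: write-heavy profile (workloada, 50% reads / 50% updates)
./bin/ycsb run mongodb -s -P workloads/workloada \
  -p mongodb.url="mongodb://localhost:27017/ycsb" \
  -p operationcount=1000000 -threads 64

# Run phase: read-only profile (workloadc, 100% reads)
./bin/ycsb run mongodb -s -P workloads/workloadc \
  -p mongodb.url="mongodb://localhost:27017/ycsb" \
  -p operationcount=1000000 -threads 64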
69
MongoDB Consolidation under z/VM on z14 ZR1 leveraging additional memory and IFLs available
z14 Model ZR1 zHypervisor LPAR (30 IFL, 2 TB memory) z/VM 6.4 MongoDB 256 GB database (SLES guest 1, 8 vCPU, 510 GB) YCSB (100% read) Databases fit in memory (SLES guest 2, 8 vCPU, 510 GB) (SLES guest 3, 8 vCPU, 510 GB) (SLES guest 4, 8 vCPU, 510 GB) 184k transactions/sec in total z13s zHypervisor LPAR (20 IFL, 1 TB memory) z/VM 6.3 MongoDB 256 GB database (SLES guest 1, 5 vCPU, 250 GB) YCSB (100% read) (SLES guest 2, 5 vCPU, 250 GB) (SLES guest 3, 5 vCPU, 250 GB) (SLES guest 4, 5 vCPU, 250 GB) Databases do not fit in memory 40k transactions/sec in total Run MongoDB under z/VM 6.4 on z14 Model ZR1 and get 4.6x better performance leveraging additional memory and IFLs available per z/VM instance compared to z13s versus Disclaimer: Performance result is extrapolated from IBM internal tests comparing MongoDB performance under z/VM 6.4 with the PTF for APAR VM65942 on z14 Model ZR1 with MongoDB performance under z/VM 6.3 on z13 driven remotely by YCSB (100% read operations). Results may vary. z14 Model ZR1 configuration: LPAR with 30 dedicated IFLs and 2 TB memory running a z/VM 6.4 instance in SMT mode with 4 guests. Each guest was configured with 8 vCPUs and 510 GB memory and ran a MongoDB Enterprise Server instance (no sharding, no replication) with a 256 GB database. The databases were located on a FCP-attached DS8700 LUN with multi-pathing enabled. 1 FCP path per z/VM guest. z13 configuration: LPAR with 32 dedicated IFLs and 1 TB memory running a z/VM 6.3 instance in SMT mode with 4 guests. Each guest was configured with 8 vCPUs and 250 GB memory and ran a MongoDB Enterprise Server instance (no sharding, no replication) with a 256 GB database. The databases were located on a FCP-attached DS8700 LUN with multi-pathing enabled. 1 FCP path per z/VM guest. IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
70
Database Backup & Restore (vs. x86)
Linux Workload Performance on IBM z14 Model ZR1 Database Backup & Restore (vs. x86) Operators can perform database backup up to 8.8x faster and database restore up to 1.6x faster for Db2 LUW on 1 core on a z14 Model ZR1 LPAR using zEDC Express versus a compared x86 platform using software compression Operators can perform database backup with up to 76% lower CPU utilization and database restore with up to 50% lower CPU utilization for Db2 LUW on 4 cores on a z14 Model ZR1 LPAR using zEDC Express versus a compared x86 platform using software compression Operators can perform database dump up to 3.2x faster for MongoDB Enterprise Edition on 1 core on a z14 Model ZR1 LPAR using zEDC Express versus a compared x86 platform using software compression Operators can perform database dump with up to 67% lower CPU utilization for MongoDB Enterprise Edition on 8 cores on a z14 Model ZR1 LPAR using zEDC Express versus a compared x86 platform using software compression The following proof points and detailed measurements are available on the next slides IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
71
DB2 LUW Backup and Restore Performance on z14 ZR1 vs x86 Broadwell
Operators can perform database backup up to 8.8x faster and database restore up to 1.6x faster for DB2 LUW on 1 core on a z14 Model ZR1 LPAR using zEDC Express versus a compared x86 platform using software compression Disclaimer: Performance results based on IBM internal tests running database backup and restore with compression on DB2 LUW v11.1.1fp1a on a database of size 385 GB using the built-in software compression mechanism on x86 and genwqe-user for zEDC Express on z14 Model ZR1. Results may vary. z14 Model ZR1 configuration: LPAR with 16 dedicated IFLs, 1 IFL enabled for DB2, 1.5 TB memory, RHEL 7.4 in SMT mode, database and backup located on IBM DS8000. For backup, the input buffer of zEDC was 4 MB and the output buffer was 2 MB. For restore, the input buffer of zEDC was 2 MB and the output buffer was 4 MB. x86 configuration: 36 Intel E v4 2.30GHz, 1 core enabled for DB2, 1.5 TB memory, RHEL 7.3, database and backup located on IBM DS8000. IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
72
DB2 LUW Backup and Restore Performance on z14 ZR1 vs x86 Broadwell
Operators can perform database backup with up to 76% lower CPU utilization and database restore with up to 50% lower CPU utilization for DB2 LUW on 4 cores on a z14 Model ZR1 LPAR using zEDC Express versus a compared x86 platform using software compression Disclaimer: Performance results based on IBM internal tests running database backup and restore with compression on DB2 LUW v11.1.1fp1a on a database of size 385 GB using the built-in software compression mechanism on x86 and genwqe-user for zEDC Express on z14 Model ZR1. Results may vary. z14 Model ZR1 configuration: LPAR with 16 dedicated IFLs, 4 IFL enabled for DB2, 1.5 TB memory, RHEL 7.4 in SMT mode, database and backup located on IBM DS8000. For backup, the input buffer of zEDC was 4 MB and the output buffer was 2 MB. For restore, the input buffer of zEDC was 2 MB and the output buffer was 4 MB. x86 configuration: 36 Intel E v4 2.30GHz, 4 cores enabled for DB2, 1.5 TB memory, RHEL 7.3, database and backup located on IBM DS8000. IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
73
DB2 LUW Backup and Restore Performance on z14 ZR1 vs x86 Broadwell – Benchmark Configuration
Benchmark Setup Ran db2 backup and db2 restore with built-in software compression on x86 and with zEDC (GenWQE connected via a named pipe) on z14 ZR1 Parameter ‘parallelism’ was set to twice the number of cores/IFLs Db2 database size 385 GB Database and backup located on IBM DS8000 System Stack x86 36 Intel E v4 2.30GHz w/ Hyperthreading turned on and 1.5 TB memory running RHEL 7.3 2 TB IBM DS8000 storage Db2 LUW fp1a 1-8 cores enabled for DB2 z14 Model ZR1 LPAR with 16 dedicated IFLs and 1.5 TB memory running RHEL 7.4 with SMT enabled, attached zEDC Express genwqe-user 1-8 IFL enabled for Db2 Db2 IBM System Storage DS8000 DB2 built-in compression DB2 zEDC Express With software based compression on x86 With zEDC Express based compression on z14 Model ZR1 IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
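A rough sketch of the two compression paths described above, assuming a database named SAMPLE, illustrative paths, and the GenWQE-accelerated gzip tool shipped with genwqe-user; the exact commands, parallelism, and buffer settings of the measured runs may differ.

# x86 path: Db2 built-in software compression
db2 "BACKUP DATABASE sample TO /backup COMPRESS PARALLELISM 8"

# z14 ZR1 path: stream the backup through a named pipe into zEDC-accelerated gzip
mkfifo /backup/sample.pipe
genwqe_gzip < /backup/sample.pipe > /backup/sample.bak.gz &
db2 "BACKUP DATABASE sample TO /backup/sample.pipe PARALLELISM 8"

# Restore works the same way in reverse: genwqe_gunzip feeds a pipe that the
# db2 "RESTORE DATABASE ..." command reads from.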
74
MongoDB Dump Performance on z14 ZR1 vs x86 Broadwell
Operators can perform database dump up to 3.2x faster for MongoDB Enterprise Edition on 1 core on a z14 Model ZR1 LPAR using zEDC Express versus a compared x86 platform using software compression Disclaimer: Performance results based on IBM internal tests running database dump with compression on MongoDB Enterprise Edition on a database of size 355 GB created using 4 identical collections of “wikidata”. Software compression on the x86 platform uses the pigz software. zEDC based compression uses genwqe-user for zEDC Express on z14 Model ZR1. Results may vary. z14 Model ZR1 configuration: LPAR with 16 dedicated IFLs, 1 IFL enabled, 1.5 TB memory, RHEL 7.4 in SMT mode, database and dump located on IBM DS8000. x86 configuration: 36 Intel E v4 2.30GHz, 1 core enabled, 1.5 TB memory, RHEL 7.4, database and dump located on IBM DS8000. IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
75
MongoDB Dump Performance on z14 ZR1 vs x86 Broadwell
Operators can perform database dump with up to 67% lower CPU utilization for MongoDB Enterprise Edition on 8 cores on a z14 Model ZR1 LPAR using zEDC Express versus a compared x86 platform using software compression Disclaimer: Performance results based on IBM internal tests running database dump with compression on MongoDB Enterprise Edition on a database of size 355 GB created using 4 identical collections of “wikidata”. Software compression on the x86 platform uses the pigz software. zEDC based compression uses genwqe-user for zEDC Express on z14 Model ZR1. Results may vary. z14 Model ZR1 configuration: LPAR with 16 dedicated IFLs, 8 IFL enabled, 1.5 TB memory, RHEL 7.4 in SMT mode, database and dump located on IBM DS8000. x86 configuration: 36 Intel E v4 2.30GHz, 8 cores enabled, 1.5 TB memory, RHEL 7.4, database and dump located on IBM DS8000. IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
76
MongoDB Dump Performance on z14 ZR1 vs x86 Broadwell – Benchmark Configuration
Benchmark Setup Ran mongodump with pigz software compression (connected via a pipe) on x86 and with zEDC (GenWQE connected via a pipe) on z14 ZR1 pigz parallelism was set to twice the number of cores MongoDB database size 355 GB Database and dump located on IBM DS8000 System Stack x86 36 Intel E v4 2.30GHz w/ Hyperthreading turned on and 1.5 TB memory running RHEL 7.4 2 TB IBM DS8000 storage MongoDB Enterprise Edition 3.4.6, pigz 2.3.4 1-8 cores enabled z14 Model ZR1 LPAR with 16 dedicated IFLs and 1.5 TB memory running RHEL 7.4 with SMT enabled, attached zEDC Express MongoDB Enterprise Edition 3.4.6 genwqe-user 1-8 IFL enabled MongoDB IBM System Storage DS8000 pigz zEDC Express With software based compression on x86 With zEDC Express based compression on z14 Model ZR1 IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
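A minimal sketch of the two dump paths, assuming an archive-style dump with placeholder database and output names; the measured runs may have used different mongodump options.

# x86 path: pipe the dump through parallel software compression (pigz)
mongodump --db=wikidata --archive | pigz -p 16 > /dumps/wikidata.archive.gz

# z14 ZR1 path: pipe the same dump through the zEDC-accelerated gzip from genwqe-user
mongodump --db=wikidata --archive | genwqe_gzip > /dumps/wikidata.archive.gz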
77
FICON Express16S+ (vs. z13s)
Linux Workload Performance on IBM z14 Model ZR1 FICON Express16S+ (vs. z13s) Run the BDI benchmark on Db2 LUW with up to 20% more throughput using FICON Express16S+ cards on z14 Model ZR1 compared to using FICON Express16S cards on z13s Run the pgBench benchmark on PostgreSQL with up to 45% more throughput using FICON Express16S+ cards on z14 Model ZR1 compared to using FICON Express16S cards on z13s Run the sysbench benchmark on MariaDB with up to 19% more throughput using FICON Express16S+ cards on z14 Model ZR1 compared to using FICON Express16S cards on z13s Following proof points and the detailed measurements are available on the next slides IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
78
DB2 LUW Performance with FICON Express16S+ Cards
Run the BDI benchmark on DB2 LUW with up to 20% more throughput using FICON Express16S+ cards on z14 Model ZR1 compared to using FICON Express16S cards on z13s Disclaimer: Performance result is extrapolated from IBM internal tests running the BDI benchmark, which is based on TPC-DS, on DB2 LUW with BLU Acceleration on z14 native LPAR vs z13 native LPAR. The BDI benchmark was configured to run a fixed sequence of queries. DB2 database size was 500 GB. Results may vary. z13 configuration: LPAR with 8 dedicated IFLs, 64GB memory, and 11 TB LUN on IBM FlashSystem 900 attached via FICON Express16S cards, RHEL 7.3 (SMT mode) running DB2 LUW , IBM Java 1.8, and BDI. z14 configuration: LPAR with 8 dedicated IFLs, 64GB memory, and 11 TB LUN on IBM FlashSystem 900 attached via FICON Express16S+ cards, RHEL 7.3 (SMT mode) running DB2 LUW , IBM Java 1.8, and BDI. IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
79
Db2 LUW Performance with FICON Express16S+ Cards – Benchmark Configuration
Benchmark Setup BDI workload driver based on TPC-DS 8 parallel users performing predefined SQL Queries Db2 database size 500 GB System Stack z13s LPAR with 8 dedicated IFLs and 64 GB memory running RHEL 7.3 with SMT enabled 11 TB LUN on IBM FlashSystem 900 attached via FICON Express16S card Db2 LUW , IBM Java 1.8 z14 Model ZR1 11 TB LUN on IBM FlashSystem 900 attached via FICON Express16S+ card IBM FlashSystem 900 BDI Workload Driver RHEL7.3 Db2 Database Db2 LUW IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
80
PostgreSQL Performance with FICON Express16S+ Cards
Run the pgBench benchmark on PostgreSQL with up to 45% more throughput using FICON Express16S+ cards on z14 Model ZR1 compared to using FICON Express16S cards on z13s Disclaimer: Performance result is extrapolated from IBM internal tests running the pgBench 9.6 (read-only) benchmark (64 threads, 1000 clients) remotely against PostgreSQL on z14 native LPAR vs z13 native LPAR. PostgreSQL database size was 300 GB. Results may vary. z13 configuration: LPAR with 8 dedicated IFLs, 64 GB memory, and 400 GB Flash 9840 LUN attached via FICON Express16S cards, SLES 12 SP2 (SMT mode) running PostgreSQL z14 configuration: LPAR with 8 dedicated IFLs, 64 GB memory, and 400 GB Flash 9840 LUN attached via FICON Express16S+ cards, SLES 12 SP2 (SMT mode) running PostgreSQL IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
81
PostgreSQL Performance with FICON Express16S+ Cards – Benchmark Configuration
Benchmark Setup Ran pgBench 9.6 workload driver remotely from x86 blade server with 64 threads, 1000 clients read-only (100% read) write-only (100% write) PostgreSQL database size 300 GB System Stack z13s LPAR with 8 dedicated IFLs and 64 GB memory running SLES 12 SP2 with SMT enabled 400 GB LUN on IBM FlashSystem 900 attached via FICON Express16S card PostgreSQL 9.6.1 z14 Model ZR1 400 GB LUN on IBM FlashSystem 900 attached via FICON Express16S+ card pgBench PostgreSQL 9.6.1 300 GB Database X86 Blade Server Linux IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
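A pgBench invocation matching the thread and client counts above would look roughly as follows; the host name, database name, scale factor, and duration are illustrative assumptions (a strictly write-only mix would additionally need a custom transaction script supplied via -f).

# Initialize the pgbench tables (scale factor is an example only)
pgbench -i -s 2000 -h dbhost -U postgres benchdb

# Read-only run: 1000 client connections, 64 worker threads, 10 minutes
pgbench -c 1000 -j 64 -S -T 600 -h dbhost -U postgres benchdb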
82
MariaDB Performance with FICON Express16S+ Cards
Run the sysbench benchmark on MariaDB with up to 19% more throughput using FICON Express16S+ cards on z14 Model ZR1 compared to using FICON Express16S cards on z13s Disclaimer: Performance result is extrapolated from IBM internal tests running the sysbench 0.5 (read-only) benchmark on MariaDB on z14 native LPAR vs z13 native LPAR. MariaDB database size was 100 GB. Results may vary. z13 configuration: LPAR with 16 dedicated IFLs, 32 GB memory, and 150 GB LUN on IBM FlashSystem 900 attached via FICON Express16S cards, SLES 12 SP2 (SMT mode) running MariaDB z14 configuration: LPAR with 16 dedicated IFLs, 32 GB memory, and 150 GB LUN on IBM FlashSystem 900 attached via FICON Express16S+ cards, SLES 12 SP2 (SMT mode) running MariaDB IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
83
MariaDB Performance with FICON Express16S+ Cards – Benchmark Configuration
Benchmark Setup sysbench 0.5 benchmark read-only (100% read) read-write MariaDB database size 100 GB System Stack z13s LPAR with 16 dedicated IFLs and 32 GB memory running SLES 12 SP2 with SMT enabled 150 GB LUN on IBM FlashSystem 900 attached via FICON Express16S card MariaDB z14 Model ZR1 150 GB LUN on IBM FlashSystem 900 attached via FICON Express16S+ card IBM FlashSystem 900 sysbench SLES12 SP2 Database MariaDB IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
84
Pervasive Encryption z14 vs. z13s z14 vs. x86
Linux Workload Performance on IBM z14 Model ZR1 Pervasive Encryption z14 vs. z13s OpenSSL 1.0.2j provides up to 16.1x more throughput per core on a z14 ZR1 LPAR compared to a z13s LPAR Run the read-mostly workload of the YCSB benchmark on MongoDB Enterprise Edition with only 6% CPU overhead on average when enabling pervasive encryption on a z14 ZR1 LPAR z14 vs. x86 Run the DayTrader 3 benchmark on WebSphere Application Server with pervasive encryption enabled and achieve up to 2x more throughput on 1 core on a z14 Model ZR1 LPAR versus a compared x86 platform Run the Acme Air benchmark on Node.js 6.10 with pervasive encryption enabled and achieve up to 1.6x more throughput on a z14 Model ZR1 LPAR versus a compared x86 platform OpenSSL 1.0.2j provides up to 6.2x better performance per core for cipher aes-256-cbc on a z14 Model ZR1 LPAR versus a compared x86 platform Following proof points and the detailed measurements are available on the next slides IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
85
OpenSSL Performance on z14 ZR1 vs z13s
OpenSSL 1.0.2j provides up to 16.1x more throughput per core on a z14 ZR1 LPAR compared to a z13s LPAR (chart data labels per cipher: 3.4x, 2.1x, 16.1x, 9.4x, 9.5x, 7.4x) Disclaimer: Performance result is extrapolated from IBM internal tests comparing OpenSSL 1.0.2j speed benchmark performance for different ciphers in native LPAR on z14 versus z13. OpenSSL was invoked with the options: speed –elapsed –multi 1 –evp <cipher>. Results may vary. z14 configuration: LPAR with 8 dedicated IFLs, 128 GB memory, 40 GB DASD storage, SLES12 SP2 (SMT mode), libica ( and openssl-ibmca ( exploiting CPACF enhancements. z13 configuration: LPAR with 8 dedicated IFLs, 128 GB memory, 40 GB DASD storage, SLES12 SP2 (SMT mode). IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
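The disclaimer quotes the exact speed options used; as plain commands they look like the sketch below, where the particular ciphers listed are only examples of those charted, and explicit engine selection is only needed if openssl-ibmca is not already configured as the default engine.

# Single-process EVP throughput measurement (options as quoted in the disclaimer)
openssl speed -elapsed -multi 1 -evp aes-128-cbc
openssl speed -elapsed -multi 1 -evp aes-256-gcm
openssl speed -elapsed -multi 1 -evp sha256

# If openssl-ibmca is not the default engine, it can be selected explicitly
openssl speed -elapsed -multi 1 -engine ibmca -evp aes-256-cbc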
86
CPU Overhead with Pervasive Encryption for MongoDB on z14 ZR1
Run the read-mostly workload of the YCSB benchmark on MongoDB Enterprise Edition with only 6% CPU overhead on average when enabling pervasive encryption on a z14 ZR1 LPAR Disclaimer: Performance result is extrapolated from IBM internal tests running MongoDB Enterprise Edition with and without SSL and database encryption driven remotely by 512 total threads of Yahoo! Cloud Serving Benchmark (YCSB) using the workload read-mostly (95% read, 5% update) and a record size of 5KB. Two external x86 blade-servers, each with 4 independent YCSB instances stressed the MongoDB database simultaneously. YCSB was configured to generate constant throughput rates. RSA 4096 bit key for SSL configuration of MongoDB. GCM based ciphers were used for SSL. Database stored via dm-crypt using aes-xts-plain64. CPU utilization for the pervasive encryption case was projected by scaling the achieved throughput with pervasive encryption to the throughput achieved without encryption. Results may vary. z14 configuration: LPAR with 8 dedicated IFLs, 256 GB memory, 40 GB DASD storage, RHEL 7.3 (SMT mode), OpenSSL 1.0.1e-fips, 50GB database on IBM DS8000 storage. IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
87
CPU Overhead with Pervasive Encryption for MongoDB on z14 ZR1 – Benchmark Configuration DS8K YCSB – 0 MongoDB 50 GB Mongo Database TLS v1.2 SSL 10 Gbps Network LPAR with 8 IFL, 256GB memory, RHEL 7.3 in SMT mode dm-crypt x86 Blade – 0 (Client-emulator) YCSB – 1 YCSB – 2 YCSB – 3 x86 Blade – 1 (Client-emulator) YCSB v Workload B (Read-mostly, 95% read, 5% update) Record size: 5KB YCSB with target transaction rate specified MongoDB v3.4.1 with SSL disabled or enabled (GCM ciphers) OpenSSL 1.0.1 Memory primed before tests No encryption or dm-crypt with aes-xts-plain64 IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
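A condensed sketch of the two encryption layers in this setup, data at rest via dm-crypt and data in flight via TLS in MongoDB 3.4; the device name, mount point, and certificate paths are placeholders, and the exact cryptsetup and mongod options of the measured configuration may differ.

# Data at rest: dm-crypt volume with aes-xts-plain64 for the database files
cryptsetup luksFormat --cipher aes-xts-plain64 --key-size 512 /dev/dasdc1
cryptsetup luksOpen /dev/dasdc1 mongodata
mkfs.xfs /dev/mapper/mongodata
mount /dev/mapper/mongodata /var/lib/mongo

# Data in flight: require TLS (MongoDB 3.4 option names; GCM ciphers negotiated via TLS v1.2)
mongod --dbpath /var/lib/mongo --sslMode requireSSL \
  --sslPEMKeyFile /etc/ssl/mongodb.pem --sslCAFile /etc/ssl/ca.pem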
88
OpenSSL Performance on z14 ZR1 vs x86 Broadwell
OpenSSL 1.0.2j provides up to 6.2x better performance per core for cipher aes-256-cbc on a z14 Model ZR1 LPAR versus a compared x86 platform Disclaimer: Performance result is extrapolated based on a clock ratio of from IBM internal tests comparing OpenSSL 1.0.2j speed benchmark performance for cipher aes-256-cbc in native LPAR on z14 versus x86 bare metal. OpenSSL was invoked with the options: speed –elapsed –multi 8 –evp aes-256-cbc. Results may vary. x86 configuration: 8 Intel E v4 2.30GHz w/ Hyperthreading turned on, 1.5 TB memory, 500 GB local RAID-5 HDD storage, SLES12 SP2. z14 configuration: LPAR with 8 dedicated IFLs, 128GB memory, 40 GB DASD storage, SLES12 SP2 (SMT mode), libica ( and openssl-ibmca ( exploiting CPACF enhancements. IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
89
Pervasive Encryption Performance for Node.js on z14 ZR1 vs x86 Broadwell Run the Acme Air benchmark on Node.js 6.10 with pervasive encryption enabled and achieve up to 1.6x more throughput on a z14 Model ZR1 LPAR versus a compared x86 platform Disclaimer: Performance result is extrapolated based on a clock ratio of from IBM internal tests running the AcmeAir benchmark with 10,000 customers on Node.js v against MongoDB Enterprise Edition v3.4.2 on z14 native LPAR vs x86 bare metal, driven remotely by 250 JMeter v2.13 threads. Apache HTTP server v was used as load balancer. TLS v1.2, with SSL cipher suite ECDHE-RSA-AES128-GCM-SHA256 was used between JMeter and Apache HTTP, ECDHE-RSA-AES128-GCM-SHA256 cipher was used between Apache HTTP and Node.js, RSA 4096 bit key for SSL configuration of MongoDB, database encrypted via dm-crypt using aes-xts-plain64. Number of Node.js instances equal twice the number of cores assigned to Node.js, each instance pinned to a single CPU. Results may vary. z14 configuration: Node.js and Apache HTTP server running in LPAR with 10 dedicated IFLs, 1.5 TB memory, 40 GB DASD storage, SLES 12 SP2 (SMT mode), OpenSSL 1.0.2j-fips, Apache HTTP server pinned to 2 IFL, Node.js pinned to 8 IFL. MongoDB running in LPAR with 2 dedicated IFL, 64 GB memory, 40 GB DASD storage, SLES 12 SP2 (SMT mode), OpenSSL 1.0.2j-fips, application logs and database on RAM disk. x86 configuration: Node.js and Apache HTTP server running on server with 36 Intel® Xeon® CPU E GHz cores with Hyperthreading, 1.5 TB RAM, SLES12 SP2, OpenSSL 1.0.2j-fips, Apache HTTP server pinned to 2 cores, Node.js pinned to 8 cores. MongoDB running on server with 36 Intel® Xeon® CPU E GHz cores with Hyperthreading, 768 GB RAM, only 2 cores active, SLES12 SP2, OpenSSL 1.0.2, application logs and database on RAM disk. IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
90
Pervasive Encryption Performance for Node.js on z14 ZR1 vs x86 Broadwell – Benchmark Configuration on z14 ZR1 Database LPAR App. Server LPAR Apache HTTP (Load Balancer) Acme Air (Node.js App) Database 10 Gbps Network Encryption Libraries MongoDB JMeter (Client Emulator) x86 Blade Configuration System z14 Model ZR1 – 2 LPARs Cores 10 dedicated IFL with SMT to App. Server LPAR 2 dedicated IFL with SMT to Database LPAR Core Assignment Apache: 2 (pinned) Node.js: 1, 2, 3, 4, 8 (pinned) MongoDB: 2 cores active Memory 1.5 TB App. Server LPAR, 64 GB Database LPAR OS SLES 12 SP2 Database MongoDB Enterprise, Database on RAM disk End-to-End Encryption: - JMeter ↔ (SSL) ↔ Apache - Apache ↔ (SSL) ↔ Acme Air - Acme Air ↔ (SSL) ↔ MongoDB - MongoDB ↔ (dm-crypt) ↔ Disk IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
91
Pervasive Encryption Performance for Node.js on z14 ZR1 vs x86 Broadwell – Benchmark Configuration on x86 x86 Haswell x86 Broadwell Database Server App. Server Apache HTTP (Load Balancer) Acme Air (Node.js App) Database 10 Gbps Network Encryption Libraries MongoDB JMeter (Client Emulator) x86 Blade Configuration System Broadwell and Haswell Cores 36 cores with HT to App. Server 36 cores with HT to Database Server Cores Assignment Apache: 2 (pinned) Node.js: 1, 2, 3, 4, 8 (pinned) MongoDB: 2 cores active Memory 1.5 TB App. Server, 768 GB Database Server OS SLES 12 SP2 Database MongoDB Enterprise, Database on RAM disk End-to-End Encryption: - JMeter ↔ (SSL) ↔ Apache - Apache ↔ (SSL) ↔ Acme Air - Acme Air ↔ (SSL) ↔ MongoDB - MongoDB ↔ (dm-crypt) ↔ Disk IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
92
Pervasive Encryption Performance for WebSphere Application Server on z14 ZR1 vs x86 Broadwell
Run the DayTrader 3 benchmark on WebSphere Application Server with pervasive encryption enabled and achieve up to 2x more throughput on 1 core on a z14 Model ZR1 LPAR versus a compared x86 platform Disclaimer: Performance result is extrapolated based on a clock ratio of from IBM internal tests running the DayTrader 3 benchmark with pervasive encryption on WebSphere Application Server (WAS) with IBM Java SR5 using DB2 LUW to persist application data in native LPAR on z14 versus x86 bare metal. The workload was driven remotely by Apache JMeter to trade stocks among users. SSL encryption protocol TLS v1.2 with cipher suite SSL_RSA_WITH_AES_256_GCM_SHA384, 4096 bit key size was used to encrypt the communication between JMeter, WAS, DB2. The DB2 database and log files were encrypted using dm-crypt with aes-xts-plain64. Results may vary. z14 configuration: WAS running in a LPAR with 1 dedicated IFL, 64 GB memory, 80 GB DASD storage, HyperPAV=16, SLES 12 SP2 (SMT mode). DB2 LUW running in a LPAR with 4 dedicated IFL, 64 GB memory, 80 GB DASD storage, HyperPAV=16, SLES 12 SP2 (SMT mode). WAS and DB2 were communicating via Hipersockets. x86 configuration: WAS running on a x86 server with 1 enabled Intel(R) Xeon(R) CPU E GHz core with Hyperthreading, 64GB memory, local SSD storage, SLES 12 SP2. DB2 was running on a x86 server with 4 enabled Intel(R) Xeon(R) CPU E GHz cores with Hyperthreading, 64GB memory, local SSD storage, SLES 12 SP2. The two servers were connected by 10Gb Ethernet. IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
93
Pervasive Encryption Performance for WebSphere Application Server on z14 ZR1 vs x86 Broadwell – Benchmark Configuration Benchmark Setup DayTrader Benchmark (15000 Users, Stocks) (ftp://public.dhe.ibm.com/software/webservers/appserv/was/DayTrader3Install.zip) Two driving x86 server, each trading for 7500 users 2-6 driver threads (channels) per WAS compute thread SSL encryption protocol TLS v1.2 with cipher suite SSL_RSA_WITH_AES_256_GCM_SHA384, bit key size used between JMeter, WAS, and Db2 Db2 DayTrader database and log files encrypted with dm-crypt using aes-xts-plain64 System Stack z14 Model ZR1 2 LPARs, each with 64 GB memory running SLES 12 SP2 with SMT enabled, DS8000 DASD storage WAS with IBM Java SR5 in one LPAR with 1-4 IFL Db2 LUW in second LPAR with 4 IFL LPARs connected via HiperSockets™ System Stack x86 x86 server with 1-4 enabled Intel Xeon CPU E GHz cores w/ Hyperthreading, 64 GB memory, running WAS with IBM Java SR5 on SLES 12 SP2, local SSDs x86 server with 4 enabled Intel Xeon CPU E GHz cores w/ Hyperthreading, 64 GB memory, running Db2 LUW on SLES 12 SP2, local SSDs x86 server connected via 10Gbps Ethernet 10Gbit YCSB DASD Storage on IBM DS8000 Server / local SSDs HammerDB Workload Driver RHEL 7.3 2 x86 server (24 cores each) Jmeter using IBM Java Database SLES 12 SP2 WAS (DayTrader) Logs LPAR / x86 server (1-4 cores, 64 GB memory) DB2 LUW (4 cores, FICON / SCSI Hipersockets / 10Gbit IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
94
Competitive Workload Performance (vs x86)
Linux Workload Performance on IBM z14 Model ZR1 Competitive Workload Performance (vs x86) Run the Acme Air benchmark on node.js 6.10 with up to 2.2x more throughput on 1 core on a z14 Model ZR1 LPAR versus a compared x86 platform Run the DayTrader 3 benchmark on WebSphere Application Server with up to 1.6x more throughput on 4 cores on a z14 Model ZR1 LPAR versus a compared x86 platform Run the DayTrader 3.0 benchmark on Apache TomEE with up to 2x more throughput on 8 cores on a z14 Model ZR1 LPAR versus a compared x86 platform Run the MicroBM_CPU 1.0 benchmark on InfoSphere DataStage 11.5 with up to 2.5x more throughput on 30 cores on a z14 Model ZR1 LPAR versus a compared x86 platform Run the pgBench 9.6 benchmark on PostgreSQL with up to 1.7x more throughput on 8 cores on a z14 Model ZR1 LPAR versus a compared x86 platform Run the YCSB (write-heavy) benchmark on MongoDB with up to 2.2x more throughput per core on a z14 Model ZR1 LPAR versus a compared x86 platform Run the sysbench 0.5 read-only benchmark on MariaDB with up to 3.3x more throughput on 16 cores on z14 Model ZR1 with IBM FlashSystem 900 storage versus a compared x86 platform with local SSD storage Run the apacheBench 2.3 benchmark on Wordpress with up to 1.6x more throughput on 8 cores on a z14 Model ZR1 LPAR versus a compared x86 platform Following proof points and the detailed measurements are available on the next slides IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
98
PostgreSQL Performance on z14 Model ZR1 vs x86 Broadwell
Run the pgBench 9.6 benchmark on PostgreSQL with up to 1.7x more throughput on 8 cores on a z14 Model ZR1 LPAR versus a compared x86 platform (chart data labels: 1.5x, 1.6x, 1.7x, 1.3x) Disclaimer: Performance result is extrapolated based on a clock ratio of from IBM internal tests running pgBench 9.6 benchmark on PostgreSQL (20 GB database in RAM disk) on z14 native LPAR vs x86 bare metal. Results may vary. x86 configuration: 8 Intel E v4 2.30GHz with Hyperthreading turned on, 64 GB memory, and 500 GB local RAID-5 HDD storage, SLES 12 SP2. z14 configuration: LPAR with 8 dedicated IFL, 64 GB memory, and 40 GB DASD storage, SLES 12 SP2 (SMT mode). IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
99
PostgreSQL Performance on z14 Model ZR1 vs x86 Broadwell – Benchmark Configuration
Benchmark Setup Ran pgBench workload driver locally with 32 concurrent threads read-only (100% read) write-only (100% write) Database size 20 GB System Stack z14 Model ZR1 LPAR with 2-16 dedicated IFL, 64 GB memory, and 40 GB DASD storage running SLES 12 SP2 with SMT enabled PostgreSQL 9.6.1, pgBench 9.6 x86 2-16 Intel E v4 2.30GHz w/ Hyperthreading turned on, 64 GB of memory, and 500 GB local RAID-5 HDD storage running SLES 12 SP2 pgBench PostgreSQL 9.6.1 32 Concurrent Threads/Clients 20 GB Database on RAM disk Linux IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
100
MongoDB Performance on z14 Model ZR1 vs x86 Broadwell
Run the YCSB (write-heavy) benchmark on MongoDB with up to 2.2x more throughput per core on a z14 Model ZR1 LPAR versus a compared x86 platform (chart data labels: 2.1x, 2.2x) Disclaimer: Performance result is extrapolated based on a clock ratio of from IBM internal tests running YCSB (write-heavy) on local MongoDB Enterprise Release (Database size 5 GB) on z14 native LPAR vs x86 bare metal. Results may vary. x86 configuration: 30 Intel E v4 2.30GHz with Hyperthreading turned on (8 cores pinned to MongoDB, 22 cores pinned to YCSB), 64 GB memory, 480 GB local RAID-5 HDD storage, SLES 12 SP2. z14 configuration: LPAR with 30 dedicated IFL (8 IFL pinned to MongoDB, 22 IFL pinned to YCSB), 64 GB memory, 120 GB DASD storage, SLES 12 SP2 (SMT mode). IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
101
MongoDB Performance on z14 Model ZR1 vs x86 Broadwell – Benchmark Configuration
Benchmark Setup Ran YCSB workload driver locally read-only (100% read) write-heavy (50% write) Database size 5 GB System Stack z14 Model ZR1 LPAR with 30 dedicated IFLs (2-8 IFLs pinned to MongoDB, 22 IFLs pinned to YCSB), 64 GB memory, and 120 GB DASD storage running SLES 12 SP2 with SMT enabled MongoDB 3.4.1, YCSB x86 30 Intel E v4 2.30GHz w/ Hyperthreading turned on (2-8 cores pinned to MongoDB, 22 cores pinned to YCSB), 64 GB of memory, and 480 GB local RAID-5 HDD storage running SLES 12 SP2 YCSB MongoDB 3.4.1 5 GB Database Linux IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
102
WebSphere Application Server Performance on z14 Model ZR1 vs x86 Broadwell
Run the DayTrader 3 benchmark on WebSphere Application Server with up to 1.6x more throughput on 4 cores on a z14 Model ZR1 LPAR versus a compared x86 platform Disclaimer: Performance result is extrapolated based on a clock ratio of from IBM internal tests running DayTrader 3 web application benchmark on WebSphere Application Server (WAS) with IBM Java (SR3) on z14 native LPAR versus x86 bare metal. Database DB2 LUW located on the same system was used to persist application data. Half of the compute cores were bound to DB2, the other half to WAS. The workload was driven remotely by Apache JMeter to trade stocks among users. The utilization of the workload was adjusted by the number of driver threads. Results may vary. x86 configuration: 8 Intel(R) Xeon(R) CPU E GHz, 1.5TB fast TruDDR4 2400MHz Memory, and 400GB local HDD storage, SLES 12 SP2 with Hyperthreading enabled. z14 configuration: LPAR with 8 IFL, running SLES 12 SP2 (SMT mode), 64GB memory, 80GB DASD storage, HyperPAV=8. IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
103
WebSphere Application Server Performance on z14 ZR1 vs x86 Broadwell – Benchmark Configuration
10Gbit YCSB IBM DASD Storage on DS8K Server FICON HammerDB Workload Driver RHEL 7.3 2 x86 server (24 Cores each) Jmeter using IBM Java Storage DB SLES 12 SP2 Workload under Test: DayTrader 3 WAS DB Log Storage z14 ZR1 LPAR (2-16 IFL (SMT) 64 GB memory) Benchmark Setup Daytrader Benchmark (15000 Users, Stocks) (ftp://public.dhe.ibm.com/software/webservers/appserv/was/DayTrader3Install.zip) ibm-java-x86_64-sdk Two driving x86 server, each trading for 7500 users 2-6 driver threads (channels) per WAS compute thread System Stack z14 Model ZR1 LPAR with 2-16 IFL, 64 GB memory running SLES 12 SP2 with SMT enabled, DS8K DASD storage WAS with Java 1.8 SR3 pinned to half of the IFLs DB2 LUW pinned to half of the IFLs x86 2-16 Intel Xeon CPU E GHz, 1.5TB memory running SLES 12 SP2 with Hyperthreading enabled, local HDD storage WAS with Java 1.8 SR3 pinned to half of the cores DB2 LUW pinned to half of the cores IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
104
Apache TomEE Performance on z14 Model ZR1 vs x86 Broadwell
Run the DayTrader 3.0 benchmark on Apache TomEE with up to 2x more throughput on 8 cores on a z14 Model ZR1 LPAR versus a compared x86 platform Disclaimer: Performance result is extrapolated based on a clock ratio of from IBM internal tests running DayTrader 3.0 benchmark on Apache TomEE on z14 native LPAR vs x86 bare metal. Results may vary. x86 configuration: 8 Intel E v4 2.30GHz with Hyperthreading turned on, 256 GB memory, and 500 GB local RAID-5 HDD storage, SLES 12 SP2. z14 configuration: LPAR with 8 dedicated IFL, 256 GB memory, and 420 GB DS8K storage, SLES 12 SP2 (SMT mode). IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
105
Apache TomEE Performance on z14 ZR1 vs x86 Broadwell – Benchmark Configuration
Benchmark Setup Daytrader benchmark 15000 users 10000 stocks System Stack z14 Model ZR1 LPAR with 2-16 dedicated IFLs and 256 GB memory running SLES 12 SP2 with SMT enabled and 420 GB DS8K storage Apache TomEE 1.7.1, MariaDB Jmeter 2.13, Daytrader 3.0, IBM Java 1.8 x86 2-16 Intel E v4 2.30GHz w/ Hyperthreading turned on, 256 GB of memory, and 500 GB local RAID-5 HDD storage, SLES 12 SP2 Apache JMeter Apache TomEE 1.7.1 MariaDB Linux DayTrader 3.0 IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
106
Node.js Performance on z14 Model ZR1 vs x86 Broadwell
Run the Acme Air benchmark on node.js 6.10 with up to 2.2x more throughput on 1 core on a z14 Model ZR1 LPAR versus a compared x86 platform Disclaimer: Performance result is extrapolated based on a clock ratio of from IBM internal tests running the Acme Air benchmark with 10,000 customers on Node.js v against MongoDB Enterprise on z14 native LPAR vs x86 bare metal, driven remotely by 250 JMeter 2.13 threads. Apache HTTP server was used as load balancer. Results may vary. x86 configuration: 36 Intel E v4 2.30GHz, Apache HTTP server pinned to 1 core, Node.js pinned to 1 core, MongoDB pinned to 2 cores, 768GB memory, SLES 12 SP2 with Hyperthreading, application logs and database on the RAM disk. z14 configuration: LPAR with 4 dedicated IFLs, Apache HTTP server pinned to 1 IFL, Node.js pinned to 1 IFL, MongoDB pinned to 2 IFL, 768GB memory, 40 GB DASD storage, SLES 12 SP2 with SMT, application logs and database on the RAM disk. IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
107
Node.js Performance on z14 ZR1 vs x86 Broadwell – Benchmark Configuration
JMeter (Client Emulator) Apache HTTP (Load Balancer) Acme Air (Node.js App) MongoDB Database 10 Gbps Network z14 Model ZR1 LPAR or x86 Broadwell server x86 Blade Server 16 cores, 256 GB RAM Generates requests Emulates concurrent clients Load balancer Directs requests from JMeter Stresses all node.js instances Over Over Resides on RAM disk xfs formatted Stores customer information Main processing unit Provides http interface (Dashboard) IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
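The disclaimers note that Apache and each Node.js instance are pinned to specific cores; one simple way to express that pinning is sketched below, where the CPU numbers, script name, and instance count are assumptions for illustration.

# Pin the Apache load balancer to two CPUs
taskset -c 0,1 apachectl start

# Start one Node.js Acme Air instance per remaining CPU, each pinned to its own core
for cpu in 2 3 4 5; do
  taskset -c $cpu node acmeair-app.js &
done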
108
Wordpress Performance on z14 ZR1 vs x86 Broadwell
Run the apacheBench 2.3 benchmark on Wordpress with up to 1.6x more throughput on 8 cores on a z14 Model ZR1 LPAR versus a compared x86 platform Disclaimer: Performance result is extrapolated based on a clock ratio of from IBM internal tests running apacheBench 2.3 benchmark against Wordpress on z14 native LPAR vs x86 bare metal from two remote hosts each simulating 30 concurrent users issuing 600 requests against 10 webpages on a Wordpress site. MariaDB located on the same system as Wordpress was used to persist application data. Results may vary. x86 configuration: 8 Intel E v4 2.30GHz w/ Hyperthreading turned on, 64 GB memory, 500 GB local RAID-5 HDD storage, SLES 12 SP2, Wordpress 4.7.5, MariaDB , ApacheHTTP , PHP z14 configuration: LPAR with 8 dedicated IFL, 64 GB memory, and 40 GB DASD storage, SLES 12 SP2 (SMT mode), Wordpress 4.7.5, MariaDB , ApacheHTTP , PHP IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
109
Wordpress Performance on z14 Model ZR1 vs x86 Broadwell – Benchmark Configuration
Benchmark Setup apacheBench workload driver read-only (100% read) 30 users with 600 page hits and 10 web pages System Stack z14 Model ZR1 LPAR with 2-16 dedicated IFL, 64 GB memory, and 40 GB DASD storage running SLES 12 SP2 with SMT enabled MariaDB , ApacheHTTP , Wordpress 4.7.5, PHP 5.6.8 x86 2-16 Intel E v4 2.30GHz w/ Hyperthreading turned on, 64 GB of memory, and 500 GB local RAID-5 HDD storage running SLES 12 SP2 Apache HTTP Webserver MariaDB Database Wordpress Data apacheBench Remote Slave Remote Master HTTP Daemons Wordpress Website PHP IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
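An apacheBench invocation matching the setup above (30 concurrent users, 600 requests per page, 10 pages) would look roughly like this; the host name and page URLs are placeholders, and in the measured setup two remote drivers issue such runs in parallel.

# 600 requests at a concurrency of 30 against each of 10 WordPress pages
for page in $(seq 1 10); do
  ab -n 600 -c 30 "http://wordpress.example.com/?page_id=$page"
done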
110
InfoSphere DataStage Performance on z14 Model ZR1 vs x86 Broadwell
Run the MicroBM_CPU 1.0 benchmark on InfoSphere DataStage 11.5 with up to 2.5x more throughput on 30 cores on a z14 Model ZR1 LPAR versus a compared x86 platform Disclaimer: Performance result is extrapolated based on a clock ratio of from IBM internal tests running the MicroBM_CPU 1.0 benchmark on InfoSphere DataStage on z14 native LPAR vs x86 bare metal. MicroBM_CPU is a simplified version of a DimTrade_History_Load job in the TPC-ETL benchmark. Results may vary. x86 configuration: 16 or 32 Intel E v4 2.30GHz with Hyperthreading turned on, 1.5TB fast TruDDR4 2400MHz Memory, and 400 GB local HDD storage, RHEL 7.3. z14 configuration: LPAR with 16 or 32 dedicated IFLs, 256 GB memory, 200 GB DASD storage (HyperPAV=8), RHEL 7.3 (SMT mode). IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
111
MariaDB Performance on z14 Model ZR1 with IBM FlashSystem 900 vs x86 Broadwell with local SSDs
Run the sysbench 0.5 read-only benchmark on MariaDB with up to 3.3x more throughput on 16 cores on z14 Model ZR1 with IBM FlashSystem 900 storage versus a compared x86 platform with local SSD storage (chart data labels: 2.8x, 2.6x) Disclaimer: Performance results based on IBM internal tests running sysbench 0.5 read-only benchmark on MariaDB. MariaDB database size was 100 GB. Results may vary. x86 configuration: 16 Intel E v4 2.30GHz w/ Hyperthreading turned on, 32 GB memory, 2 TB local RAID-5 SSD storage, SLES 12 SP2, MariaDB z14 Model ZR1 configuration: LPAR with 16 dedicated IFL, 32 GB memory, 150 GB LUN on IBM FlashSystem 900 attached via FICON Express16S+ cards, SLES 12 SP2 (SMT mode), MariaDB IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
112
MariaDB Performance on z14 Model ZR1 with IBM FlashSystem 900 vs x86 Broadwell with local SSDs – Benchmark Configuration z14 Model ZR1 Configuration: Benchmark Setup Ran sysbench 0.5 benchmark read-only (100% read) read-write MariaDB database size 100 GB System Stack x86 8-16 Intel E v4 2.30GHz w/ Hyperthreading turned on and 32 GB memory running SLES 12 SP2 2 TB local RAID-5 SSD storage MariaDB z14 Model ZR1 LPAR with 8-16 dedicated IFLs and 32 GB memory running SLES 12 SP2 with SMT enabled 150 GB LUN on IBM FlashSystem 900 attached via FICON Express16S+ card z14 Model ZR1 LPAR w/ SLES 12 SP2 MariaDB IBM FlashSystem 900 sysBench x86 Configuration: x86 server w/ SLES 12 SP2 MariaDB Local SSDs sysBench IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
113
Microservices Performance cross z14 Model ZR1 LPARs vs cross x86 servers
Microservices show a better performance cross z14 Model ZR1 LPARs than cross compared x86 servers providing up to 2.1x more throughput per core for the AcmeAir benchmark Disclaimer: Performance result is extrapolated based on a clock ratio of from IBM internal tests running Apache JMeter remotely against AcmeAir microservice ( on WebSphere Liberty on z14 native LPAR vs x86 bare metal. Results may vary. x86 configuration: Apache jmeter 2.13 running on a x86 server with 18 Intel E v4 2.30GHz, 768 GB memory, 400 GB local RAID-5 volume on 15k 12Gbps SAS drives, SLES 12 SP2, Docker , Kubernetes 1.3.3, etcd 2.1.3, and Calico 1.1 virtual network. AcmeAir flight / booking service and AcmeAir customer / authentication service running on two separate, but identically configured x86 servers with 18 Intel E v3 2.30GHz, 768 GB memory, 400 GB local RAID-5 volume on 15k 12Gbps SAS drives, SLES 12 SP2, Docker , Kubernetes 1.3.3, Nginx , WebSphere Liberty , MongoDB 3.5.6, and Calico virtual network. z14 configuration: Apache jmeter 2.13 running on a LPAR with 18 dedicated IFLs, 768 GB memory, 80 GB DASD storage, SLES 12 SP2, Docker , Kubernetes 1.3.3, etcd 2.1.3, and Calico 1.1 virtual network. AcmeAir flight / booking service and AcmeAir customer / authentication service running on two separate, but identically configured LPARs with 18 dedicated IFLs, 768 GB memory, 80 GB DASD storage, SLES 12 SP2, Docker , Kubernetes 1.3.3, Nginx , WebSphere Liberty , MongoDB 3.5.6, and Calico 1.1 virtual network. IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation
114
Microservices performance cross z14 Model ZR1 LPARs vs cross x86 servers – Benchmark Configuration z14 Model ZR1 Setup JMeter Service endpoint (Kubernetes) Nginx CS AS FS BS Mongo z14 Model ZR1 LPAR 1 – 10 IFLs LPAR 2 – 10 IFLs LPAR 3 – 10 IFLs Pair of a service and a DB placed in the same LPAR / x86 server Placed an authentication service (AS) and a flight service (FS) in different LPARs / x86 servers Placed an authentication service (AS) and a customer service (CS) in the same LPAR / x86 server x86 Setup JMeter Service endpoint (Kubernetes) Nginx CS AS FS BS Mongo Server 1 – 10 cores x86 Broadwell Server 2 – 10 cores x86 Haswell Server 3 – 10 cores IBM Z / / Apr 10, 2018 / © 2018 IBM Corporation