WebSphere XD Compute Grid High Performance Architectures

1 WebSphere XD Compute Grid High Performance Architectures
Snehal S. Antani, WebSphere XD Technical Lead, SOA Technology Practice, IBM Software Services

2 Overview
Key Components of WebSphere XD Compute Grid:
- Job Scheduler [formerly named the Long Running Scheduler (LRS)]
- Parallel Job Manager (PJM)
- Grid Endpoints [formerly named the Long Running Execution Environment (LREE)]
High Performance Architectures and Considerations

3 XD Compute Grid Components
Job Scheduler (JS)
- The job entry point to XD Compute Grid
- Job life-cycle management (Submit, Stop, Cancel, etc.) and monitoring
- Dispatches workload to either the PJM or a Grid Endpoint (formerly the LREE)
- Hosts the Job Management Console (JMC)
Parallel Job Manager (PJM)
- Breaks large batch jobs into smaller partitions for parallel execution
- Provides job life-cycle management (Submit, Stop, Cancel, Restart) for the single logical job and each of its partitions
- Is *not* a required component in Compute Grid
Grid Endpoints (GEE)
- Executes the actual business logic of the batch job

4 XD Compute Grid Components
[Topology diagram: users submit jobs via EJB, Web Service, JMS, the command line, or the Job Console, through a load balancer to the Job Scheduler (JS); the JS dispatches work to the PJM and to GEEs, each hosted in a WAS server.]

5 Key Influencers for High Performance Compute Grids
- Proximity to the Data
  - Bring the business logic to the data: co-locate on the same platform
  - Bring the data to the business logic: in-memory databases, caching
- Affinity Routing
  - Partitioned data with intelligent routing of work
- Divide and Conquer
  - Highly parallel execution of workloads across the grid
- On-Demand Scalability

6 Proximity to the Data- Co-location of business logic with data
[Diagram: the Job Scheduler dispatches to two WAS z/OS servers on the same frame; each controller manages servant regions hosting GEEs, co-located with DB2 on z/OS.]

7 Proximity to the Data- Bring data to the business logic with caching
[Diagram: the Job Scheduler dispatches to GEEs on separate LPARs, each holding a Data Grid near-cache; Data Grid (DG) servers on dedicated LPARs front the database.]

8 Affinity Routing- Partitioned data with intelligent routing of work
[Diagram: the Job Scheduler routes records A-M and N-Z to two WAS z/OS servers; each controller's servant regions host GEEs assigned to key ranges (A-D, E-I, J-M and N-Q, R-T, W-Z), backed by DB2 data-sharing partitions holding records A-M and N-Z.]

9 Affinity Routing- Partitioned data with intelligent routing of work
[Diagram: the Job Scheduler routes records A-I, J-R, and S-Z to three GEEs, each with a Data Grid near-cache; DG servers partitioned into records A-M and N-Z front the database.]

10 Divide and Conquer- Highly Parallel Grid Jobs
[Diagram: the PJM splits a large grid job across three GEEs handling records A-I, J-R, and S-Z, each with a Data Grid near-cache; DG servers partitioned into records A-M and N-Z front the database.]

11 On-Demand Scalability- With WebSphere z/OS
[Diagram: the Job Scheduler dispatches to two WAS z/OS controllers; zWLM dynamically manages the servant regions hosting the GEEs, co-located with DB2 on z/OS.]

12 On-Demand Scalability- With XD Operations Optimization
[Diagram: the Job Scheduler and the On-Demand Router dispatch to GEEs with Data Grid near-caches spread across LPARs; DG servers on separate LPARs front the database.]

13 Backup

14 Data Access Time Model
Decision tree for a single data access:
- Near-Cache Hit: probability of a cache hit; time (ms) to retrieve data from the near-cache
- Near-Cache Miss: probability of a cache miss; time (ms) to retrieve data from other storage
  - OG Server Hit: probability that the data is in the cache server; time (ms) to retrieve data from the cache server
  - OG Server Miss (Access DB): probability that the data must be retrieved from the database; time (ms) to retrieve data from the database

Data access time (ms) = (Probability of near-cache hit) * (Time to retrieve data from near-cache) + (Probability of near-cache miss) * (Time to retrieve data from other storage)

Time to retrieve data from other storage (ms) = (Probability that data is in cache server) * (Time to retrieve data from cache server) + (Probability that data must be retrieved from database) * (Time to retrieve data from database)
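The two equations above can be sketched as a small Python helper (a minimal illustration; the function and parameter names are my own, not from the presentation):

```python
def expected_access_time(p_near_hit, t_near_cache,
                         p_server_hit, t_cache_server, t_database):
    """Expected data-access time (ms) for the two-level cache model.

    p_near_hit: probability the record is in the near-cache.
    p_server_hit: probability, given a near-cache miss, that the
    record is in the ObjectGrid (OG) cache server.
    """
    # Time to retrieve data from "other storage" on a near-cache miss
    t_other = p_server_hit * t_cache_server + (1 - p_server_hit) * t_database
    # Overall expected access time
    return p_near_hit * t_near_cache + (1 - p_near_hit) * t_other

# Numbers from the example calculation later in the deck: 30% near-cache
# hit at 1 ms, 70% cache-server hit at 10 ms, 200 ms database access.
print(round(expected_access_time(0.30, 1, 0.70, 10, 200), 1))  # 47.2
```

The miss probabilities are derived as complements (1 - hit probability), so only the hit probabilities need to be supplied.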

15 Data Access Time Model- Improving data access time
Symbols: P1/S1 = probability and service time (ms) of a near-cache hit; P2/S2 = probability and service time of a near-cache miss; P3/S3 = probability and service time of an OG server hit; P4/S4 = probability and service time of an OG server miss (database access).

Data Access = (Near-Cache Hit) + (Near-Cache Miss)
Near-Cache Hit = (P1)(S1)
Near-Cache Miss = (P2) * [ (P3)(S3) + (P4)(S4) ]

Improve data access time by:
- Increase P1: increase cache size (increase heap, etc.); establish request affinity
- Decrease S1: dynamically add more CPU
- Decrease S2:
  - Increase P3: increase the size of the cache server; establish query/data affinity
  - Decrease S3
  - Decrease S4: reduce network latency

16 Example calculation
Assume: P1 = 30% near-cache hit at S1 = 1 ms; P2 = 70% near-cache miss; P3 = 70% OG server hit at S3 = 10 ms; P4 = 30% OG server miss at S4 = 200 ms.

Data Access = (Near-Cache Hit) + (Near-Cache Miss)
Near-Cache Hit = (P1)(S1)
Near-Cache Miss = (P2) * [ (P3)(S3) + (P4)(S4) ]

Near-cache miss = (.7)(10) + (.3)(200) = 7 + 60 = 67 ms
Data Access = (.3)(1) + (.7)(67) = 0.3 + 46.9 = 47.2 ms

17 Example calculation- effects of increasing size of near-cache
Assume the near-cache hit probability P1 rises from 30% to 60% (S1 = 1 ms, P3 = 70%, S3 = 10 ms, S4 = 200 ms).

Near-cache miss = (.7)(10) + (.3)(200) = 7 + 60 = 67 ms
Data Access = (.6)(1) + (.4)(67) = 0.6 + 26.8 = 27.4 ms
(47.2 – 27.4) / 47.2 = 42% improvement in data access time

18 Example calculation- effects of adding more CPU and decreasing network latency to the DB
Assume the database access time S4 drops from 200 ms to 100 ms (P1 = 30%, S1 = 1 ms, P3 = 70%, S3 = 10 ms).

Near-cache miss = (.7)(10) + (.3)(100) = 7 + 30 = 37 ms
Data Access = (.3)(1) + (.7)(37) = 0.3 + 25.9 = 26.2 ms
(47.2 – 26.2) / 47.2 ≈ 44% improvement in data access time

19 WebSphere XD Compute Grid Infrastructure Topology Considerations

