Copyright © 2007, GemStone Systems Inc. All Rights Reserved. Optimize computations with Grid data caching OGF21 Jags Ramnarayan Chief Architect, GemFire.

Background on GemStone Systems
- Known for its object database technology since 1982
- Now specializes in memory-oriented distributed data management
- 12 pending patents
- Over 200 installed customers in the Global 2000
- Grid focus driven by: very high performance with predictable throughput, latency and availability
  - Capital markets: risk analytics, pricing, etc.
  - Large e-commerce portals: real-time fraud
  - Federal intelligence

Agenda
- Define memory-oriented data fabric (grid)
- Problem
- What is required from the solution?
- Design patterns
- How does the fabric enable task co-location with data?
- How does the fabric integrate with the compute Grid Scheduler?
- Data fabric deployment patterns

What is a memory-oriented data fabric?
- It is active in nature
  - Applications express interest in moving data
  - Complex continuous queries
  - Distributed event notifications are reliable and drive high-speed workflow
- Data source abstraction
  - Parallel access and synchronization of data with external sources
  - Abstracts the app away from data sources
- Pool memory (and disk) across cluster/Grid
  - Managed as a single unit
  - Replicate data for high concurrent load, HA
  - Distribute (partition) data for high data volume, scale
  - Gracefully expand capacity to meet scalability/performance goals
[Diagram labels: Distributed Data Space; Data warehouses; Relational databases; GRID Applications]

Where is the problem?
- Too many concurrent connections
  - Large database server bottlenecks on network
  - Query results are large, causing CPU bottlenecks
  - Even a parallel file system is throttled by disk speeds
- Too much data transfer
  - Between tasks, jobs
  - Between Grid and file systems, databases
- Data consistency issues
- A CPU-bound job turns into an IO-bound job
[Diagram labels: Grid Scheduler; Compute farm; Data warehouses; Relational databases; File system]

What are the key requirements?
- Scale near-linearly
  - Aggregate data throughput goes up as data fabric servers are added
  - Data capacity can increase with more memory, disk, network
  - Concurrent clients can increase with increasing servers
  - Scale and synchronize across data centers
- Increased fault tolerance
  - Add servers to increase data redundancy and protection against failures

What does the data fabric provide?
Before the jobs commence:
- Replicate most frequently used data to multiple grid nodes
- Partition large data sets in memory and disk on grid nodes
  - Allow the application to specify the logical partitioning scheme
  - Make multiple copies of partitions for HA and/or for highly parallel access to "hot" data
- Share data placement information with the scheduling engine
- Allow the application to hint what kind of data it is most sensitive to
Dynamically at run time:
- Co-locate in-process most frequently used data
- Meet throughput and latency SLA targets
  - Dynamically rebalance data across more nodes
  - Increase replicated copies
- Minimize movement of data between nodes
  - Job/task engines to clients, between job nodes
[Diagram labels: File system; Grid Scheduler; Data Fabric/Grid; Data warehouses; Relational databases; Compute Farm]

Design pattern: full replication
- Idea: only selective data sets are replicated
- Used when data volume is small, very frequently used and doesn't change often
- Ideal configuration
  - Use reliable multicast if the network can be configured
- Can avoid strict consistency model
  - Possible if updates don't conflict
- Product should support
  - Async
  - Sync
  - Sync with locks
  - Sync in context of transaction
[Diagram labels: Grid Scheduler; three nodes, each holding A1 B1 C1]
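
The difference between the "Async" and "Sync" distribution modes listed above can be illustrated with a small in-process sketch. This is not GemFire's API: the class `ReplicatedRegion`, its methods, and the use of plain dictionaries as replicas are all hypothetical stand-ins chosen for illustration. Locks and transactions are left out.

```python
from collections import deque


class ReplicatedRegion:
    """Toy model of a fully replicated data set (hypothetical, not a real API)."""

    def __init__(self, copies):
        # Each dict stands in for the copy held by one grid node.
        self.replicas = [dict() for _ in range(copies)]
        self._pending = deque()  # updates not yet applied to the other copies

    def put(self, key, value, sync=True):
        self.replicas[0][key] = value
        if sync:
            # Apply older queued updates first so they cannot overwrite
            # this newer value later, then update every copy before returning.
            self.flush()
            for replica in self.replicas[1:]:
                replica[key] = value
        else:
            # Async: return immediately, distribute later.
            self._pending.append((key, value))

    def flush(self):
        """Distribute queued async updates to the remaining copies, in order."""
        while self._pending:
            key, value = self._pending.popleft()
            for replica in self.replicas[1:]:
                replica[key] = value

    def get(self, key, replica=0):
        return self.replicas[replica].get(key)
```

With `sync=False` a reader on another copy can briefly see stale data, which is acceptable only under the slide's condition that updates do not conflict.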

Design pattern: data partitioning
- Idea: by keeping data spread across many nodes in memory, we can exploit the CPU and network capacity on each node simultaneously to provide linear scalability
- Data buckets distributed with redundancy
- Different partitioning policies dependent on application access patterns
- Policy 1: hash partitioning
  - Suitable for key-based access
  - Hash on key maps to bucket and to node
  - Single network hop at most
  - Extremely fast
[Diagram labels: Grid Scheduler; data buckets A1 B1 C1 D1 E1 F1 G1 H1 I1 spread across nodes]
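
A minimal sketch of the "hash on key maps to bucket and to node" step, assuming a fixed bucket count and a simple round-robin placement of buckets on nodes. The function names, the CRC32 hash, and the placement rule are assumptions made for illustration; the slides do not say how the product assigns buckets.

```python
import zlib


def bucket_for(key, num_buckets):
    """Hash a key to a bucket id. CRC32 is used because it is stable across runs."""
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def assign_buckets(nodes, num_buckets, redundancy=1):
    """Map each bucket to a primary node plus `redundancy` backup nodes."""
    if redundancy + 1 > len(nodes):
        raise ValueError("not enough nodes for the requested redundancy")
    table = {}
    for bucket in range(num_buckets):
        # Round-robin placement: primary first, then distinct backups.
        table[bucket] = [nodes[(bucket + i) % len(nodes)]
                         for i in range(redundancy + 1)]
    return table


def locate(key, table):
    """Return the primary node for a key: one lookup, so at most one network hop."""
    return table[bucket_for(key, len(table))][0]
```

Because every client can compute `locate` from the same table, a key-based read goes straight to the owning node instead of being broadcast.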

Design pattern: data partitioning
- Query execution for hash policy
  - Parallelize query to each relevant node
  - Each node executes query in parallel using local indexes on data subset
  - Query result is streamed to coordinating node
  - Individual results are unioned for final result set
  - This scatter-gather algorithm can waste CPU cycles
- Policy 2: grouped on attribute(s)
  - Example: manage all Sept trades in one data partition
  - Query predicate can be analyzed to target only node with Sept data
[Diagram steps: 1. select * from Trades where trade.month = August; 2. Parallel query execution; 3. Parallel streaming of results; 4. Results returned]
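
The two query paths above can be contrasted in a short sketch. Threads stand in for grid nodes, lists of dictionaries stand in for partitions, and both function names are hypothetical. Local indexes and result streaming are not modelled.

```python
from concurrent.futures import ThreadPoolExecutor


def scatter_gather(partitions, predicate):
    """Hash policy: run the predicate on every partition in parallel, then union.

    `partitions` maps a node name to the rows that node holds.
    """
    def scan(rows):
        return [row for row in rows if predicate(row)]

    with ThreadPoolExecutor(max_workers=max(1, len(partitions))) as pool:
        partials = list(pool.map(scan, partitions.values()))
    return [row for part in partials for row in part]


def pruned_query(partitions_by_month, month, predicate):
    """Grouped policy: only the partition holding the requested month is scanned."""
    rows = partitions_by_month.get(month, [])
    return [row for row in rows if predicate(row)]
```

In `scatter_gather` every node does work even when it holds no matching rows, which is the wasted CPU the slide mentions. `pruned_query` avoids that cost, but only for predicates on the grouping attribute.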

Design pattern: data partitioning
- Dealing with data growth
  - Data buckets are also re-partitioned when capacity is exceeded
- Dealing with data hot spots
  - What happens if data access is not uniform across the data range? For instance, this week's transactions/trades are more frequently accessed than last week's transactions, and so on.
  - Multiple strategies:
    - Configure a higher redundancy level for specific partitions and keep infrequently accessed (historical) data partitions on disk
    - "Sense and respond" by re-partitioning or creating redundant copies on the fly
- Co-locating related data
  - Data access patterns may reveal related data being fetched together
  - Optimize join processing through co-location
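
One way to picture the "sense and respond" strategy is a planner that reads an access log and proposes more copies for hot partitions. The function name and the thresholds (`hot_share`, `extra_copies`) are illustrative assumptions, not product settings.

```python
from collections import Counter


def plan_redundancy(access_log, base_copies=1, hot_share=0.5, extra_copies=2):
    """Propose a copy count per partition from observed accesses.

    A partition is treated as "hot" when it receives at least `hot_share`
    of all accesses in the log; hot partitions get `extra_copies` more copies.
    """
    counts = Counter(access_log)
    total = sum(counts.values())
    plan = {}
    for partition, hits in counts.items():
        is_hot = total > 0 and hits / total >= hot_share
        plan[partition] = base_copies + (extra_copies if is_hot else 0)
    return plan
```

A real fabric would also weigh memory cost and the data movement needed to create the copies; this sketch only covers the sensing half.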

Design pattern: co-locate task with data
- Principle: move the task to the computational resource with most of the relevant data before considering other nodes where data transfer becomes necessary
- Case (I): using the data fabric API to execute functions (like stored procedures)
  - Location driven by data qualification predicate(s)
  - Data fabric pays attention to load, throughput, latency
  - Re-balance function by re-partitioning data
[Diagram labels: f1, f2, … fn; FIFO Queue; Data fabric Resources; Exec functions; Sept Trades; Submit (f1) -> calcHighValueTrades(, where trades.month=Sept); Function (f1); Function (f2)]
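
The principle above can be sketched as picking the node that already holds most of the partitions a function needs and counting how much data would still have to move. `pick_node` and `run_colocated` are hypothetical helpers, not the data fabric API mentioned in Case (I), and load, throughput and latency are ignored.

```python
def pick_node(placement, wanted):
    """Return the node holding the most wanted partitions.

    `placement` maps node name -> set of partition names stored there.
    Ties are broken by node name so the choice is deterministic.
    """
    return max(sorted(placement), key=lambda node: len(placement[node] & wanted))


def run_colocated(placement, partitions, wanted, fn):
    """Run `fn` on the best node; report the node, result, and rows fetched remotely."""
    node = pick_node(placement, wanted)
    local = [row for name in sorted(wanted & placement[node])
             for row in partitions[name]]
    remote = [row for name in sorted(wanted - placement[node])
              for row in partitions[name]]
    return node, fn(local + remote), len(remote)
```

For example, a function that needs only the "Sept" partition runs on the node that stores it and moves zero rows across the network.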

Integration with compute Grid Scheduler
- Case (II): using a Grid scheduling engine
  - Scheduling engines specialize in managing the Grid application life cycle and optimizing resource utilization
- Strategy 1: scheduling engine aware of data placement
  - Data fabric provisions data once and feeds metadata to the scheduler
  - Submitted jobs provide "data req" hints
  - Scheduler matches these to stored metadata and makes a best effort to route the task to a data node
  - Continuous feedback to affect task routing
  - Job/task de-queuing based on data availability and data resource utilization
[Diagram steps: 1. Data location metadata; 2. Exec Job(, ); 3. Task routing based on data location. Labels: Grid Scheduler; Grid Engine; data buckets A1 B1 C1 D1 E1 F1 G1 H1 I1]
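
A rough sketch of Strategy 1's best-effort routing, assuming the scheduler keeps a map of which data sets each node holds and a count of running tasks per node. The function `route_task` and the `max_load` cap are illustrative assumptions; no real scheduler interface is implied.

```python
def route_task(data_hints, metadata, load, max_load=4):
    """Pick a node for a task using data-placement metadata.

    data_hints: set of data set names the job says it needs
    metadata:   node name -> set of data set names held on that node
    load:       node name -> number of tasks currently running there
    Returns a node name, or None when every node is at capacity
    (the task stays queued).
    """
    candidates = [n for n in sorted(metadata) if load.get(n, 0) < max_load]
    if not candidates:
        return None
    # Prefer the most data overlap; among equals prefer the least loaded node.
    return max(candidates,
               key=lambda n: (len(metadata[n] & data_hints), -load.get(n, 0)))
```

The `load` argument plays the role of the "continuous feedback" bullet: as it changes, the same hints can route to a different node.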

Integration with compute Grid Scheduler
- Strategy 2: data fabric aware of job/application profile, scheduling policies
  - (1) Application statically configures data requirements for the job
    - This a priori knowledge is used by the data grid to provision data before the batch job starts
  - (2) Job priorities can be used to optimize resource utilization
    - For instance, low-priority job data may get evicted more rapidly from the data grid or be managed on disk
  - (3) Job flow dependencies can be used to optimize resource utilization and provisioning
[Diagram labels: Grid Scheduler; Grid Engine; Job profile, Scheduling policies; Job metadata; data buckets A1 B1 C1 D1 E1 F1 G1 H1 I1]
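
Point (2) can be illustrated with a bounded cache that evicts the lowest-priority job's data first. `PriorityCache` is a hypothetical toy, and overflow to disk is not modelled; evicted keys are simply returned so a caller could spill them elsewhere.

```python
class PriorityCache:
    """Bounded in-memory store that evicts low-priority entries first."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._entries = {}  # key -> (priority, insertion order, value)
        self._seq = 0

    def put(self, key, value, priority):
        """Store a value and return the list of keys evicted to make room."""
        self._seq += 1
        self._entries[key] = (priority, self._seq, value)
        evicted = []
        while len(self._entries) > self.capacity:
            # Victim: lowest priority; among equals, the oldest entry.
            victim = min(self._entries, key=lambda k: self._entries[k][:2])
            del self._entries[victim]
            evicted.append(victim)
        return evicted

    def get(self, key):
        entry = self._entries.get(key)
        return None if entry is None else entry[2]
```

Note that a new entry with the lowest priority can be evicted immediately when the cache is full, which matches the intent that low-priority job data yields memory to higher-priority jobs.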

Common deployment topologies
- Data fabric and compute engines are co-located (peer-to-peer data network)
  - When: limited parallelism, highly iterative tasks that operate on the same data set over and over
- Data managed on fabric servers
  - When: many jobs with unpredictable data access requirements, large data volume, data life cycle is independent of compute job life cycle, data is changing constantly and data updates need to be synchronized to back-end databases, etc.
- Super peer architecture
- Loosely coupled distributed systems
  - Partial or full replication
  - Data sets are partitioned across data centers
[Diagram labels: Client; Server; Client; Peer; WAN; Distributed System; Distributed System]
