www.Grid.org.il Distributed Data Management for Compute Grid Presented by Michael Di Stefano Founder of Author of Meeting: Tuesday, September 13th, 2005.


1 www.Grid.org.il Distributed Data Management for Compute Grid Presented by Michael Di Stefano Founder of Author of Meeting: Tuesday, September 13th, 2005

2 www.Grid.org.il Slide - 2 - Agenda
- Data Management - The Next Grid Problem
- Evolution in Compute Topology
- Objectives of Data Management
- New Topology - New Data Management Techniques
- New Techniques, New Research, Emergence of Standards

3 www.Grid.org.il Slide - 3 - Two Components of The Grid
Compute Grid: The Grid Operating System - provides the core services for grid computing
  - Physical Resource Accounting
  - Process Task Queues
  - Management of Task/Resource Execution
Data Grid: Data Management System of Grid - manages all aspects
  - Enterprise Data
  - Data Scheduling
  - Replication
  - Availability
  - Legacy Access
[Diagram labels: Compute Grid, Data Grid]

4 www.Grid.org.il Slide - 4 - Compute Grids Roll your own Compute Grid Free Versions of Compute Grids Product and Supported Compute Grids

5 www.Grid.org.il Slide - 5 - Data Grids
Data Grid Engine - Movement of Bits and Bytes
  - FTP
  - Sockets
  - Middleware (messaging)
  - Caches
Applications Perspective
  - Multiple Data Characteristics
  - Quality of Service
  - Data Management not Bit/Byte Movement

6 www.Grid.org.il Slide - 6 - Evolution in Computing: Mainframe, Mini, Client/Server

7 www.Grid.org.il Slide - 7 - 15 Years of Distributed Computing Evolution Sockets CORBA Messaging Internet Application Servers Tight Bindings Loose Coupling Publish / Subscribe Grid Topology Emerging from the “Evolutionary Mist” Client/Server © Integrasoft, L.L.C. 2005

8 www.Grid.org.il Slide - 8 - Evolution Distributed Data Management for Grid Computing Copyright John Wiley and Sons 2005

9 www.Grid.org.il Slide - 9 - The Grid Topology Client / Server Compute Grid Physical Operational Operating System Physical CPU Peripherals Execution Threads Operating System Physical Nodes Resource/Node Management Inventory of Work/Tasks Resource Inventory Matching of Task to Resource Close Proximity (Mother Board) Diverse CPU Families Diverse Geography Diverse Network Bandwidth

10 www.Grid.org.il Slide - 10 - Application on the Grid
Multiple Data Sources and Destinations
  - Client Information
  - Portfolio Information
  - Market Data
Quality of Service Levels
  - Application in its entirety
  - Application components
  - Speed of Access
  - Query
  - Updates (Transactional, Optimistic)

11 www.Grid.org.il Slide - 11 - How QoS is Delivered Today
Relational Databases
  - SQL Query
  - Transactional Updates
  - Stored Procedures
Middleware Queuing
  - Various delivery modes
  - Publish and Subscribe
  - Easy Programmatic API
Other
  - Object Databases
  - Object Relational
Data flow and movement is optimized. Designed to meet Application QoS for Client/Server Topology.

12 www.Grid.org.il Slide - 12 - Application Today in Client/Server Threads RAM Connection Pools Tailored Middleware Business Application Server Machine

13 www.Grid.org.il Slide - 13 - What Happens in a Grid Business Application Server Machine Compute Grid

14 www.Grid.org.il Slide - 14 - The Data Access Funnel Distributed Data Management for Grid Computing Copyright John Wiley and Sons 2005

15 www.Grid.org.il Slide - 15 - Data Grid Eliminates the Funnel Distributed Data Management for Grid Computing Copyright John Wiley and Sons 2005

16 www.Grid.org.il Slide - 16 - Goals of Data Management in Grid
The Big 3 Goals of Data Management in Grid
- Optimize Data Affinity
  - Minimize Data Movement
  - Optimize the resource of the Network
- Maintain Business Application QoS for Data Management
- Integrate Legacy Systems into the Grid

17 www.Grid.org.il Slide - 17 - How to Achieve the Goals of the Data Grid
What the Architect/Developer must Address
- How many copies or “Replicas” of data are needed in the Data Grid?
- How fine is the granularity of my “Data Atoms” to be replicated?
- How best to “Distribute” Data Atoms across the Data Grid?
- What level of “Synchronization” is required?
- How to “logically group” data along business lines?
- How to “Integrate” and “Operate” legacy data sources?
- How to manage “Events” in the Data Grid?
- Synchronization of data sources external to the Data Grid?
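
The replication and distribution questions on this slide can be made concrete with a small sketch. The slides do not say how atoms are placed, so the code below swaps in a well-known general technique, rendezvous (highest-random-weight) hashing, purely as an illustration. The function name `place_atom`, the node names, and the key format are all hypothetical.

```python
from __future__ import annotations

import hashlib


def place_atom(key: str, nodes: list[str], replicas: int) -> list[str]:
    """Pick `replicas` distinct nodes to hold the data atom identified by `key`.

    Illustrative only: uses rendezvous hashing, which is an assumption of
    this sketch and not a method described in the presentation.
    """
    if not 1 <= replicas <= len(nodes):
        raise ValueError("replicas must be between 1 and the number of nodes")

    def score(node: str) -> str:
        # Each (node, key) pair gets a stable pseudo-random score.
        return hashlib.sha256(f"{node}:{key}".encode()).hexdigest()

    # The highest-scoring nodes hold the replicas; the result is deterministic.
    return sorted(nodes, key=score, reverse=True)[:replicas]


nodes = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical node names
print(place_atom("portfolio:42", nodes, replicas=2))
```

In this sketch the replica count answers "how many copies", and the choice of key (one key per portfolio versus one per position, say) is where the granularity of a data atom shows up.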

18 www.Grid.org.il Slide - 18 - Data Management in Grid
- Granularity of Data Atoms
- Replication
- Distribution
- Logical Data Groupings (Data Regions)
- Synchronization
  - InterRegion
  - IntraRegion
  - External Data Sources
- Events
- Integration with Legacy Systems
Nothing to do with mechanics of the bits and bytes. These are Data Management Issues.

19 www.Grid.org.il Slide - 19 - Data Management is NOT Caching
Distributed Data Management for Grid Computing Copyright John Wiley and Sons 2005
Moves the bits and bytes
  - Cache
  - Grid FTP
  - Others
Data Management to deliver Business Application’s QoS given the “compute topology”

20 www.Grid.org.il Slide - 20 - Engines of a Data Grid
Cache
  - Java based engines such as JCache, JavaSpaces, …
  - Various C++ Caches
  - Recycled Object Data Base Technology
FTP
  - Grid FTP
Meta Data Services
File Systems
  - NFS
Distributed File Systems

21 www.Grid.org.il Slide - 21 - Right Tool for the Job Business Applications have specific QoS levels from the Data Grid Complex Analysis of Large Data Sets Dependency of small fast moving data sets Large Static Data Sets …….

22 www.Grid.org.il Slide - 22 - Business Drivers Fueling Grid

23 www.Grid.org.il Slide - 23 - Business Drivers Fueling Grid Distributed Data Management for Grid Computing Copyright John Wiley and Sons 2005

24 www.Grid.org.il Slide - 24 - Limited Patience of Business

25 www.Grid.org.il Slide - 25 - No Data Management Tools Difficult Custom Code Long Time to Delivery No Reuse Business Perspective Increased Complexity Improved Performance Financial ROI Grid fails Wide Spread Acceptance

26 www.Grid.org.il Slide - 26 - Business Perspective Financial ROI With Data Management for Grid Easy to use/understand Reuse Effort on business Increased Complexity Improved Performance Fast Time to Market Ease of Migration to Grid Changes Data Centers

27 www.Grid.org.il Slide - 27 - Data Management in Grid Granularity of Data Atoms Replication Distribution Data Regions Synchronization Integration with Legacy Systems If Distributed Data Management is not addressed, wide acceptance of Grid will fail.

28 www.Grid.org.il Slide - 28 - Measuring QoS to Determine Data Grid Distributed Data Management for Grid Computing Copyright John Wiley and Sons 2005

29 www.Grid.org.il Slide - 29 - Measuring QoS to Determine Data Grid
Distributed Data Management for Grid Computing Copyright John Wiley and Sons 2005
Application QoS( Work(), Data(), Time(), Geography(), Query() )
Where:
  Work( batch/atomic, sync/async )
  Data( overall size, atomic size, transient, query )
  Time( RealTime, Non-RealTime, Near-RealTime )
  Geography( Topology, Bandwidth )
  Query( Basic, Complex )
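
One way to read the QoS expression on this slide is as a structured record an architect fills in per application. The sketch below encodes the five parameters as Python dataclasses. The field names, the units (bytes, Mbit/s), and the example values are assumptions made for illustration; the slide gives only the parameter names.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Timing(Enum):
    REAL_TIME = "RealTime"
    NEAR_REAL_TIME = "Near-RealTime"
    NON_REAL_TIME = "Non-RealTime"


class QueryKind(Enum):
    BASIC = "Basic"
    COMPLEX = "Complex"


@dataclass(frozen=True)
class Work:
    batch: bool         # True = batch, False = atomic
    synchronous: bool   # True = sync, False = async


@dataclass(frozen=True)
class Data:
    overall_size_bytes: int  # unit is an assumption of this sketch
    atomic_size_bytes: int   # size of one data atom (assumed unit)
    transient: bool
    queryable: bool


@dataclass(frozen=True)
class Geography:
    topology: str            # free-form label, e.g. "single site"
    bandwidth_mbps: float    # unit is an assumption of this sketch


@dataclass(frozen=True)
class ApplicationQoS:
    work: Work
    data: Data
    time: Timing
    geography: Geography
    query: QueryKind


# Hypothetical profile for a market-data style workload.
profile = ApplicationQoS(
    work=Work(batch=False, synchronous=True),
    data=Data(overall_size_bytes=10**9, atomic_size_bytes=512,
              transient=True, queryable=True),
    time=Timing.REAL_TIME,
    geography=Geography(topology="single site", bandwidth_mbps=1000.0),
    query=QueryKind.BASIC,
)
print(profile.time.value, profile.query.value)
```

Two applications with different profiles can then be compared field by field when deciding which data grid engine suits each.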

30 www.Grid.org.il Slide - 30 - Objective of Data Grid - Data Affinity Low cost of CPU Data size is determined by application Network bandwidth is limited Data and Work need to be co-located Virtual Centrally Managed Data Base Physically Distributed

31 www.Grid.org.il Slide - 31 - How to Achieve Data Affinity
Locate data and work close together to minimize data movement across the network
- Reactive: Data Grid distributes data in anticipation of where work will be assigned. Distributed Data Management policies of Regionalization, Replication, Distribution, Synchronization.
- Proactive: Routing of Task to Data. Compute Grid Task Scheduler queries Data Locality Information from Data Grid.
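
The "routing of task to data" idea can be sketched as a scheduler that asks where a task's data regions already live and sends the task there. Everything in the code below is hypothetical: the `locality` map stands in for whatever locality information a Data Grid would expose, and the load-based tie-break is an assumption of this sketch, not something the slides specify.

```python
from __future__ import annotations


def route_task(required_regions: list[str],
               locality: dict[str, set[str]],
               load: dict[str, int]) -> str:
    """Choose the node that already holds the most of the task's data regions.

    `locality` maps a data region name to the nodes holding a replica of it.
    `load` maps every node to its current number of queued tasks (non-empty).
    Ties are broken in favour of the less loaded node.
    """
    held: dict[str, int] = {}
    for region in required_regions:
        for node in locality.get(region, ()):
            held[node] = held.get(node, 0) + 1

    if not held:
        # No node holds any required region: fall back to the least loaded node.
        return min(load, key=lambda n: load[n])

    return max(held, key=lambda n: (held[n], -load.get(n, 0)))


# Hypothetical locality information and node loads.
locality = {"market": {"a", "b"}, "portfolio": {"b", "c"}}
load = {"a": 1, "b": 5, "c": 0}
print(route_task(["market", "portfolio"], locality, load))
```

In the example, node `b` is chosen despite its higher load because it holds both regions, so no data has to cross the network.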

32 www.Grid.org.il Slide - 32 - Distributed Data Management Data Regions Replication Distribution Synchronization Load and Store Event

33 www.Grid.org.il Slide - 33 - Distributed Data Management Policies Distributed Data Management for Grid Computing Copyright John Wiley and Sons 2005

34 www.Grid.org.il Slide - 34 - Advanced Topics in Distributed Data Management Natural Attraction Forces of Data Bodies Within a Data Grid To Describe Efficient Data Distribution Patterns ---------------White Paper ------------- Michael Di Stefano September 2004 Distributed Data Management for Grid Computing Copyright John Wiley and Sons 2005

35 www.Grid.org.il Slide - 35 - Advanced Topics in Distributed Data Management Natural Attraction Forces of Data Bodies Within a Data Grid To Describe Efficient Data Distribution Patterns ---------------White Paper ------------- Michael Di Stefano September 2004 Distributed Data Management for Grid Computing Copyright John Wiley and Sons 2005

36 www.Grid.org.il Slide - 36 - Purchasing Information Please Visit www.integrasoftware.com To Purchase your copy of “Distributed Data Management for Grid Computing” To receive a 15% discount.

