NetSolve Happenings: A Progress Report of the NetSolve Grid Computing System. Cluster and Computational Grids for Scientific Computing, September 24-27, 2000, Le Château de Faverges de la Tour, Lyon, France.
Outline The Grid. NetSolve Overview. The Key to Success: –Interoperability. –Applications. Interoperability, Applications and NetSolve.
Current Trends in HPC Highlights of TOP 500 computers (June 2000). –#1: 9632-processor Intel-based “ASCI Red” at Sandia National Laboratories Gflops. (74.2%) –#2 & #3: 2144 Gflops & 1608 Gflops. (55%, 52%) –Others in top 10: LLNL, LANL, Leibniz Rechenzentrum (Munich), University of Tokyo. –#10: Gflops procs, Cray T3E900 (68.4%) –#250: Gflops. 256 procs, Hitachi-based arch. (76.2%) –#500: Gflops. 64 procs, SunHPC (400 MHz) (85.6%)
Computational Grids Motivation –Regardless of the number and capacity of computational resources available, there will always be a need or desire for more computational power. –Innovations are needed to increase computational capacity not only through hardware but through software infrastructures as well. –Resources (data, storage facilities, computational servers, human users, etc.) are often distributed, even globally. –There is a need for technology that reliably manages large collections of distributed computational resources, efficiently scheduling and allocating their services to meet the needs of users while providing robustness, high availability and quality of service.
Computational Grids [Figure labels: application, user.]
Vision for the Grid Uniform, location-independent, and transient access to the resources of science and engineering to facilitate the solution of large-scale, complex, multi-institutional, multidisciplinary data- and computation-based problems. Resources can be: –Hardware (networks, CPU, storage, etc.) –Software (libraries, modules, source code, etc.) –Human collaborators
Attack of the Grid: NetSolve, AppLeS, NWS, IBP, Habanero, Cumulvs, Harness, WebOS, TeraWeb, PVM, Ninf, Globus, Condor, JINI, Legion, Electronic Notebook, UniCore, Ninja, NEOS, PUNCH, Everyware, NCSA Workbench, Webflow, Gateway, JiPANG, LoCI, IPG, NAG-NASA, SinRG
The NetSolve Grid Environment - Brief Overview of the NetSolve System.
NetSolve Overview More than just a “not very well-defined user-level protocol!” Problem Solving Environment Toolkit. Client/Agent/Server system. Remote access to hardware AND software. “Robust, fault-tolerant, flexible, heterogeneous environment that provides dynamic management and allocation policies for distributed computational resources.”
NetSolve - The Big Picture [Figure labels: Client, Query, Agent (Information Service, Scheduling), Computational Resources, Service, Results.] Speech bubbles: “Dude, I need more computer power. … AND my software selection totally sucks! What’s the name of that rocking system again?” “NetSolve!” “Is That Your Final Answer?”
NetSolve Credits: Sudesh Agrawal, Dorian Arnold, Dieter Bachmann, Susan Blackford, Henri Casanova, Jack Dongarra, Yuang Hang, Karine Heydemann, Michelle Miller, Keith Moore, Terry Moore, Ganapathy Raman, Keith Seymour, Sathish Vadhiyar, Tinghua Xu
Interoperability and the Grid
The Problem The goal of the grid: “enable and maintain the controlled sharing of distributed resources to solve multidisciplinary problems of common interest to different groups or organizations.” Hodgepodge of systems – each possessing its own unique perspective AND, UNFORTUNATELY, its own custom protocols and components.
Why The Problem? Sociological: –Of course, mine is bigger, better, … Even if not, I cannot admit that, dismiss my efforts and use yours. Technical: –Immaturity –Doesn’t exactly fit needs –Software problems Economical: –Reinvesting time and effort, throwing away existing code to incorporate someone else’s. –I’ve been funded for this, so …
The Problem (cont’d) No single system will emerge as the single Grid computing system of choice: –Each has unique characteristics that appeal to different classes of users: ease of install/administration/maintenance; stringent security; ease of integration; performance; interface; services provided; code robustness/system maturity; …
Q & A IF interoperability is indeed desirable, necessary or both for the success of the Grid, AND the consensus is an unwillingness to change existing custom protocols, objects, etc., THEN are we stuck?
Current Solutions Laborious integration efforts that only work between specific systems, typically under specialized circumstances: Globus + Condor (Condor-G); NetSolve + Globus (NetSolve Proxies); Condor + NetSolve (Condor-Servers); Ninf + NetSolve (Ninf Proxies).
Current Solutions (cont’d) Computing portals as front-ends to sweep the dirt of non-interoperable systems under the cover. [Figure labels: EveryWare (NetSolve, Globus, Legion, …); HotPage (Globus, PBS, NPACI Resources); JiPANG (Globus, Ninf, NetSolve, Legion); signs: STOP, CAUTION.]
A Better Solution? Representation standards for objects, protocols, services, etc. would be ideal. Is there a possibility of using _____ to allow us to keep our customizations while allowing other systems to translate/interpret them? XML?
NetSolve Interoperability XML PDFs (problem description files) –Use XML as the language to implement the description of software services. –Exploit the proliferation of XML tools and parsers. –Collaborate with the Ninf project to establish a standardized IDL. Investigate XML representation for “standard” Grid components – machines, storage, etc. Standard objects/languages allow systems to share information. There still need to be some commonly understood protocols to allow inter-system transactions.
NetSolve Interoperability Within the current NetSolve framework: –Publishing the client-proxy interface allows other metacomputing systems to easily leverage NetSolve resources via. –Implementing new proxies allows NetSolve client users to leverage other metacomputing systems.
Client Proxies Negotiates for metacomputing services on behalf of the client. Allows client to be more lightweight. Proxies provide a translation between “language” of the client and “language” of the underlying services, i.e. NetSolve, Globus, etc.
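The translation role of a proxy can be sketched in C as a small table of function pointers. This is only an illustration under stated assumptions: the types client_request_t and proxy_t, the function proxy_submit, and both output formats are hypothetical placeholders, not the NetSolve client-proxy interface or the wire format of any real system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical generic request, as a lightweight client might express it. */
typedef struct {
    const char *problem; /* name of the requested service */
    int n_args;          /* number of arguments supplied */
} client_request_t;

/* Hypothetical proxy: one translation routine per underlying system. */
typedef struct {
    const char *backend;
    int (*translate)(const client_request_t *req, char *buf, size_t cap);
} proxy_t;

/* Made-up "language" of backend A. */
static int translate_a(const client_request_t *req, char *buf, size_t cap)
{
    return snprintf(buf, cap, "A:%s:%d", req->problem, req->n_args);
}

/* Made-up "language" of backend B. */
static int translate_b(const client_request_t *req, char *buf, size_t cap)
{
    return snprintf(buf, cap, "run %s with %d args", req->problem, req->n_args);
}

static const proxy_t PROXY_A = { "backend A", translate_a };
static const proxy_t PROXY_B = { "backend B", translate_b };

/* Translate a client request through the chosen proxy.
 * Returns 0 on success, -1 if the message did not fit in buf. */
int proxy_submit(const proxy_t *p, const client_request_t *req,
                 char *buf, size_t cap)
{
    assert(p != NULL && req != NULL && buf != NULL);
    int len = p->translate(req, buf, cap);
    return (len >= 0 && (size_t)len < cap) ? 0 : -1;
}
```

The client code stays the same whichever proxy is selected; supporting another system means supplying another translate routine.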
Applications for the Grid Heterogeneous application types/classes –Independent parallelism and pipeline simulations may represent key classes of applications that can perform efficiently on a globally distributed computational infrastructure.
Data Persistence Chain together a sequence of requests. Analyze parameters to determine data dependencies. Essentially a DAG is created where nodes represent computational modules and arcs represent data flow. Transmit superset of all input/output parameters and make persistent near server(s) for duration of sequence execution. Schedule individual request modules for execution.
Request Sequencing Goals: – Transmit no unnecessary (redundant) data parameters. – Ensure all necessary data parameters are transmitted. –Execute modules simultaneously whenever possible.
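The third goal can be illustrated with a level assignment over the dependency graph: modules on the same level do not depend on one another and could run at the same time. A minimal sketch, assuming the graph is given as an adjacency matrix and that modules are numbered in sequence order; assign_levels and MAX_MODULES are hypothetical names, and this is not necessarily how NetSolve schedules modules.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MODULES 16

/* edge[i][j] != 0 means module j consumes an output of module i.
 * Assumption: modules are numbered in sequence order, so every edge
 * goes from a lower index to a higher one.
 * On return, level[j] is the earliest step at which module j can run.
 * Returns the number of steps needed for the whole sequence. */
size_t assign_levels(size_t n, int edge[MAX_MODULES][MAX_MODULES],
                     size_t level[MAX_MODULES])
{
    assert(n <= MAX_MODULES);
    size_t depth = 0;
    for (size_t j = 0; j < n; j++) {
        level[j] = 0;
        /* A module runs one step after its latest predecessor. */
        for (size_t i = 0; i < j; i++)
            if (edge[i][j] && level[i] + 1 > level[j])
                level[j] = level[i] + 1;
        if (level[j] + 1 > depth)
            depth = level[j] + 1;
    }
    return depth;
}
```

For a chain of three dependent modules every module gets its own level; if the third module is independent of the other two it shares level 0 with the first and the sequence needs only two steps.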
Request Sequencing Interface
Without sequencing:
    …
    netsl("command1", A, B, C);
    netsl("command2", A, C, D);
    netsl("command3", D, E, F);
    …
With sequencing:
    …
    netsl_begin_sequence( );
    netsl("command1", A, B, C);
    netsl("command2", A, C, D);
    netsl("command3", D, E, F);
    netsl_end_sequence(C, D);
    …
DAG Construction “C” Implementation. Analyze all input/output references in the request sequence. Two references are equal if they refer to the same memory address. Size parameters checked for “subset” objects. Only NetSolve “Matrices” and “Vectors” are checked. Constructed DAG scheduled for execution at NetSolve server.
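The address-equality rule can be sketched as follows. This is an illustration only: request_t and build_dag are hypothetical names rather than NetSolve internals, and the sketch compares exact addresses only, ignoring the size checks for “subset” objects mentioned on the slide.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ARGS 8
#define MAX_MODULES 16

/* Hypothetical record of one request in a sequence (not a NetSolve type). */
typedef struct {
    const char *name;              /* problem name, e.g. "command1" */
    const void *inputs[MAX_ARGS];  /* addresses of input parameters */
    size_t n_inputs;
    const void *outputs[MAX_ARGS]; /* addresses of output parameters */
    size_t n_outputs;
} request_t;

/* Return 1 if request r writes the object at address ptr. */
static int writes(const request_t *r, const void *ptr)
{
    for (size_t k = 0; k < r->n_outputs; k++)
        if (r->outputs[k] == ptr)
            return 1;
    return 0;
}

/* Set edge[i][j] = 1 when request j reads an address whose most recent
 * writer is the earlier request i. Two references are treated as the
 * same object only if their addresses are equal.
 * Returns the number of edges in the resulting DAG. */
size_t build_dag(const request_t *seq, size_t n,
                 int edge[MAX_MODULES][MAX_MODULES])
{
    assert(n <= MAX_MODULES);
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            edge[i][j] = 0;
    for (size_t j = 1; j < n; j++) {
        for (size_t a = 0; a < seq[j].n_inputs; a++) {
            /* Scan backwards to find the latest producer of this input. */
            for (size_t i = j; i-- > 0; ) {
                if (writes(&seq[i], seq[j].inputs[a])) {
                    if (!edge[i][j]) {
                        edge[i][j] = 1;
                        count++;
                    }
                    break;
                }
            }
        }
    }
    return count;
}
```

Applied to the three-command example sequence, this yields two arcs: command1 to command2 (through C) and command2 to command3 (through D).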
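DAG for Example Sequence
    …
    netsl_begin_sequence( );
    netsl("command1", A, B, C);
    netsl("command2", A, C, D);
    netsl("command3", D, E, F);
    netsl_end_sequence(C, D);
    …
[Figure: DAG with nodes command1, command2 and command3; inputs A, B and E; C flows from command1 to command2, D flows from command2 to command3; output F.]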
Data Persistence (cont’d)
Without sequencing, each request is a separate client-server round trip:
    netsl("command1", A, B, C);    Client to Server: command1(A, B); Server to Client: result C
    netsl("command2", A, C, D);    Client to Server: command2(A, C); Server to Client: result D
    netsl("command3", D, E, F);    Client to Server: command3(D, E); Server to Client: result F
With sequencing, there is a single round trip:
    netsl_begin_sequence( );
    netsl("command1", A, B, C);
    netsl("command2", A, C, D);
    netsl("command3", D, E, F);
    netsl_end_sequence(C, D);
    Client to Server: sequence(A, B, E); Server to Client: result F
    Held at the server: input A, intermediate output C, intermediate output D, input E
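Which parameters the client has to send for a whole sequence can be worked out from the request list: every input that no earlier request produces, counted once. A minimal sketch, assuming a hypothetical request_t record and helper names (none of these are NetSolve types or functions):

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ARGS 8

/* Hypothetical record of one request in a sequence (not a NetSolve type). */
typedef struct {
    const char *name;              /* problem name, e.g. "command1" */
    const void *inputs[MAX_ARGS];  /* addresses of input parameters */
    size_t n_inputs;
    const void *outputs[MAX_ARGS]; /* addresses of output parameters */
    size_t n_outputs;
} request_t;

/* Return 1 if ptr is an output of any of the first `upto` requests. */
static int produced_before(const request_t *seq, size_t upto, const void *ptr)
{
    for (size_t i = 0; i < upto; i++)
        for (size_t j = 0; j < seq[i].n_outputs; j++)
            if (seq[i].outputs[j] == ptr)
                return 1;
    return 0;
}

/* Collect, in order of first use, each distinct input that no earlier
 * request produces. These are the only parameters the client must
 * transmit. Returns how many were stored in out (at most cap). */
size_t external_inputs(const request_t *seq, size_t n,
                       const void **out, size_t cap)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        assert(seq[i].n_inputs <= MAX_ARGS && seq[i].n_outputs <= MAX_ARGS);
        for (size_t j = 0; j < seq[i].n_inputs; j++) {
            const void *p = seq[i].inputs[j];
            if (produced_before(seq, i, p))
                continue; /* intermediate result, already at the server */
            int seen = 0;
            for (size_t k = 0; k < count; k++)
                if (out[k] == p)
                    seen = 1; /* already scheduled for transmission */
            if (!seen) {
                assert(count < cap);
                out[count++] = p;
            }
        }
    }
    return count;
}
```

For the example sequence this selects A, B and E: A is sent once although two commands read it, and C and D are never sent because they are produced on the server side.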
Enhanced Sequencing Multiple NetSolve server sequencing. –Currently only a single NetSolve server can be used to service an entire sequence. –If no single server possesses all the required software, the requests cannot be executed as a sequence. –Truly parallel execution is possible only on SMPs like the SGI server used. Investigate whether graph scheduling heuristics and algorithms for parallel machines can apply to distributed resources as well.
Data Logistics and Distributed Storage Infrastructures Expand Data Persistence model to multiple servers using Distributed Storage Infrastructures to conveniently cache data parameters near all involved servers. Example DSIs: IBP, GASS, … Leveraging remote storage as request parameters, users can pre-allocate data to expedite services or use already remote data in NetSolve requests.
Multiple Server Sequencing and DSIs [Figure labels: client, sequence parameters, DSI data caches, servers, server cluster.]
Conclusion Small likelihood that any single system will emerge as the Grid system of choice. Therefore, the interoperability of systems and the standardization of protocols and object representations become highly desirable. The Grid community should continue to develop the concepts and technologies necessary to facilitate a seamless Grid environment that is easy to use, highly available and highly efficient. However, it should promote more cooperation and less competition in an effort to establish a global heterogeneous Grid computing fabric that makes supercomputing power available to the masses.