Grid Computing - an Introduction
Richard Troy, Chief Scientist
1345 Wicklow Lane, Ormond Beach, FL 32174
386-868-3846
ScienceTools.com
What is Grid Computing?
Think Electric Power Grid
–The term ‘Grid’ suggests an analogy between computing goals and an electric grid that unites power producers with consumers.
–In practice, ‘Grid Computing’ consists of a very specific set of technologies that unite specific computing partners in a “virtual organization” paradigm.
–Grid computing shouldn’t be confused with “Utility” computing, “High-Performance” computing or Distributed Processing, nor is it peer-to-peer.
Grid’s goal is to provide a software infrastructure to manage compute and storage resources between nodes hosted by different organizations.
What is Grid Computing? (continued)
Grids are always one-off, purpose-built instances.
When we talk about Grid Systems, we’re talking about the software used to create Grids, not the Grids themselves.
Management of CPU, storage and, to a lesser extent, network resources is assumed of all Grids.
Most “Grids” today fall far short of Grid’s goals, as understood by the Global Grid Forum (GGF).
What’s GGF?
The Global Grid Forum (GGF) is an organization that hopes to come up with Grid “middleware” standards:
–“working group” and “research group” forums for discussing ideas
–Interest areas include “scheduling”, “storage”, “security”, etc.
GGF hopes to develop standards that the whole community will adhere to.
Is Grid a Scientific Computing solution?
No! Not at all!
Grid addresses resource utilization among partners. You still have all the scientific computing challenges you’d have had if you kept your computing to your own system(s).
There are those who are trying to address these other needs, but this isn’t a part of “Grid Computing.”
“Virtual Organization?”
Right! Just as only paying customers can plug into the electric power grid, only Grid partners can use Grid resources.
Grid is intended for inter-organizational computing.
A “Specific Set of Technologies?”
Well, sort of: any set of technologies that gets you to a workable virtual organization that manages compute and storage resources is a “Grid solution.”
However, there is a very specific set of technologies and techniques that are the outgrowth of the GGF (Globus, GSI, etc.): these are the ones we’re focused on here.
NOTE: This presentation uses “GGF” to refer to the collection of technologies, and the dominant perspective, currently under discussion within the GGF.
What software technologies?
Grid is mostly a build-your-own-solution affair:
–Some kind of meta-data catalogue(s) to find things
–Planner
–Scheduler
–Executor
–Data mover
–Security
We’ll examine these in detail below...
“Mostly build-your-own?”
Yes, mostly you build a Grid yourself, component by component. There’s only one exception: Science Tools’ BigSur System ™ is the world’s only turn-key, ready-to-go-in-a-day Grid system. But it’s more than just Grid:
–Non-discipline-specific meta-data management for robust interoperability.
–Full, scientifically defensible tracking.
–The ability to configure in most any other Grid technology with no coding.
–Both inter- and intra-organizational focus.
–BigSur’s patented & patent-pending technologies use a different paradigm to provide complete, end-to-end scientific computing.
Meta-data catalogues
There’s no standard, but every Grid has them. In GGF, there are two kinds:
–one, tailored to the application, takes a “predicate” and returns a “logical list” of needed resources
–a “Replica Catalogue” takes that list and returns physical filenames
Note that there’s NO focus on grid-related meta-data, not in their paradigm!
BigSur provides rich meta-data, is easily extensible, and can provide GGF-style functionality.
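The two-catalogue lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Dataset` fields, the `lfn:` logical names, the `example.org` URLs and both function names are hypothetical and do not come from any GGF specification or from BigSur.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    logical_name: str
    instrument: str
    year: int


# Hypothetical application catalogue: descriptive meta-data per logical name.
APPLICATION_CATALOGUE = [
    Dataset("lfn:sst-1999", "avhrr", 1999),
    Dataset("lfn:sst-2000", "avhrr", 2000),
    Dataset("lfn:wind-2000", "quikscat", 2000),
]

# Hypothetical replica catalogue: logical name -> physical copies.
REPLICA_CATALOGUE = {
    "lfn:sst-1999": ["gsiftp://site-a.example.org/data/sst-1999.nc"],
    "lfn:sst-2000": [
        "gsiftp://site-a.example.org/data/sst-2000.nc",
        "gsiftp://site-b.example.org/cache/sst-2000.nc",
    ],
    "lfn:wind-2000": ["gsiftp://site-b.example.org/data/wind-2000.nc"],
}


def find_logical(predicate):
    """Step 1: the application catalogue turns a predicate into a logical list."""
    return [d.logical_name for d in APPLICATION_CATALOGUE if predicate(d)]


def resolve_replicas(logical_names):
    """Step 2: the replica catalogue turns logical names into physical filenames."""
    return {name: REPLICA_CATALOGUE.get(name, []) for name in logical_names}


logical = find_logical(lambda d: d.instrument == "avhrr" and d.year >= 2000)
physical = resolve_replicas(logical)
print(logical)
print(physical)
```

Note that neither catalogue in this sketch records anything about the Grid itself (hosts, processes, lineage), which is the gap the slide points out.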
Planner
The Planner is used to determine what processing to perform.
GGF provides no standard, but most implementations use a Directed Acyclic Graph (DAG).
With GGF, there is no automation: this is a human activity.
BigSur provides automation and permits cyclic processing.
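As a rough illustration of the DAG approach, the sketch below orders a small workflow so that every step runs after the steps it depends on. The step names are hypothetical, and the use of Python's standard `graphlib` module is a choice made for this example; it is not how any particular Grid planner is implemented. It covers only the acyclic case; cyclic processing of the kind BigSur permits is outside what a topological sort can express.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical workflow: each step maps to the set of steps it depends on.
workflow = {
    "calibrate": {"ingest"},
    "regrid": {"calibrate"},
    "statistics": {"calibrate"},
    "report": {"regrid", "statistics"},
}


def plan(dag):
    """Return one valid execution order; raises CycleError if the graph has a cycle."""
    return list(TopologicalSorter(dag).static_order())


order = plan(workflow)
print(order)

# A graph with a cycle cannot be planned this way.
try:
    plan({"a": {"b"}, "b": {"a"}})
    cycle_rejected = False
except CycleError:
    cycle_rejected = True
```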
Scheduler
With GGF, a Scheduler moves files to compute hosts and works with an Executor to run jobs.
–“Condor” (University of Wisconsin) is the most popular scheduler
–GRAM (Globus)
BigSur has a different paradigm:
–scheduling and executor functions are integrated
–transport of needed objects (files) is an integral part of running a process, not a discrete step prior to it
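The GGF-style flow, where staging files is a discrete step before execution, might look like the following minimal sketch. Everything here is an assumption made for illustration: a temporary directory stands in for the compute host, a plain file copy stands in for the data mover, and the function names are invented. It is not Condor's or GRAM's interface.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def stage_inputs(inputs, workdir):
    """Scheduler step: copy every input file to the 'compute host' (a directory here)."""
    staged = []
    for src in inputs:
        dst = Path(workdir) / Path(src).name
        shutil.copy(src, dst)
        staged.append(dst)
    return staged


def execute(command, workdir):
    """Executor step: run one job at the operating-system level and return its stdout."""
    done = subprocess.run(command, cwd=workdir, capture_output=True, text=True, check=True)
    return done.stdout


def run_job(inputs, build_command):
    """Stage first, then execute: two discrete steps."""
    with tempfile.TemporaryDirectory() as workdir:
        staged = stage_inputs(inputs, workdir)
        return execute(build_command(staged), workdir)


def demo():
    """Hypothetical job: count the lines of one staged input file."""
    counter = "import sys; print(sum(1 for _ in open(sys.argv[1])))"
    with tempfile.TemporaryDirectory() as origin:
        source = Path(origin) / "observations.txt"
        source.write_text("a\nb\nc\n")
        return run_job([source], lambda staged: [sys.executable, "-c", counter, str(staged[0])])
```

In an integrated design of the kind the slide attributes to BigSur, the equivalent of `stage_inputs` would happen on demand inside the run itself rather than as a separate call.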
Executor
Think ‘cron’: Executors provide a means to run a job at the operating system level. (There’s no implication of automation here.)
GGF has no standard, but several are available.
–“DAG Manager” reads DAGs
BigSur uses Daemons for this role:
–DaemonMasters ™ control individual Daemons
–DemandEngine ™ and EagerEngine ™ dispatch jobs based on demand or automation rules
Data Movers
Data movers move data between systems:
–GridFTP - Argonne National Laboratory
–“DataMover” - Lawrence Berkeley Labs
–Sabul - U. Illinois
–All the old favorites still work, like ftp and scp
There are many Data Movers out there. Low-level ones are often used by more sophisticated ones.
Data Movers (continued)
With GGF, you code it all yourself:
–there’s no automatic movement
–there’s no tracking
–no caches
–no standard paradigm for when to move what where
With BigSur, you configure “FileTransports”:
–May be any Data Mover - no coding required
–Integrated support for network-based file servers (e.g. NFS)
–Automatic fetching and caching of objects
–Full tracking of objects with fine-grained permissions
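To make “automatic fetching and caching” concrete, here is a minimal sketch of a cache that wraps an arbitrary data mover and records each request. It is a generic illustration, not BigSur's FileTransport mechanism: the `CachingFetcher` class, the `Mover` signature and `local_copy_mover` are all hypothetical names, and a local file copy stands in for GridFTP, scp or any other mover.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path
from typing import Callable

# Assumed mover signature: (source location, local target path) -> None.
Mover = Callable[[str, Path], None]


class CachingFetcher:
    """Fetch an object through any mover, keep a local copy, and log every request."""

    def __init__(self, cache_dir, mover: Mover):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.mover = mover
        self.log = []  # tracking: (source, "hit" or "miss") per request

    def fetch(self, source: str) -> Path:
        # The cache key is derived from the source location only.
        key = hashlib.sha256(source.encode("utf-8")).hexdigest()
        target = self.cache_dir / key
        if target.exists():
            self.log.append((source, "hit"))
        else:
            self.mover(source, target)
            self.log.append((source, "miss"))
        return target


def local_copy_mover(source: str, target: Path) -> None:
    """Stand-in mover: a plain local copy instead of a network transfer."""
    shutil.copy(source, target)
```

A typical use is to build one fetcher per site cache and call `fetch` wherever a process needs an input: the first request moves the data, later requests are served locally. A real cache would also need invalidation when the source changes, which this sketch does not attempt.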
Security
GGF only offers single-sign-on authentication:
–GSI-GridMap, a Kerberos-like, certificate-based authentication strategy, is most popular.
–GGF has nothing whatsoever to say on the subject of permissions, only authentication!
BigSur provides full security:
–Single-sign-on: ONE user identity per user regardless of access point
–May use any/all: Kerberos, GridMap, SSH keys, MD5 passwords
–Strong, multiple simultaneous confirmation of systems and users
–Full, intuitive permissions strategy, including individual object access permissions and project, installation and organizational policies
–Code-less additions to the permissions scheme
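The difference between authentication and permissions can be shown with a small sketch. The first half maps a certificate's distinguished name to a local account in the style of a grid-mapfile (a quoted name followed by an account); the second half is a separate, purely hypothetical permissions table of the kind the slide says GGF leaves undefined. Certificate validation itself is omitted, and the names, accounts and `PERMISSIONS` layout are assumptions for illustration only.

```python
import shlex

# Sample mapfile contents: quoted distinguished name, then a local account.
GRID_MAPFILE = '''
# distinguished name -> local account
"/O=Grid/OU=example.org/CN=Jane Doe" jdoe
"/O=Grid/OU=example.org/CN=Raj Patel" rpatel
'''


def parse_gridmap(text):
    """Parse mapfile lines into {distinguished name: account}, skipping comments."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        dn, account = shlex.split(line)  # shlex handles the quoted name
        mapping[dn] = account
    return mapping


GRIDMAP = parse_gridmap(GRID_MAPFILE)


def authenticate(dn, gridmap=GRIDMAP):
    """Authentication only: who is this? Returns the local account or None."""
    return gridmap.get(dn)


# Hypothetical per-object permissions, kept separate from authentication.
PERMISSIONS = {
    ("jdoe", "lfn:sst-2000"): {"read", "write"},
    ("rpatel", "lfn:wind-2000"): {"read"},
}


def is_permitted(account, obj, action):
    """Authorization: may this account perform this action on this object?"""
    return action in PERMISSIONS.get((account, obj), set())
```

Knowing that a request comes from `jdoe` says nothing about whether `jdoe` may read a given object; that second question needs its own policy layer.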
What’s missing?
GGF is missing all of the following:
–Coordination and Status - no consolidated view
–Management - no Grid-wide management tools
–Monitoring - no live tracking of processes/objects
–Error recovery and workspace cleanup - no policies, no standards
–Process restartability
–Process management (code, etc.) - no Grid-wide awareness of these
–Planning (workflow) - no automation, no Grid-wide awareness
–Saving of results (policy, standards), including automated transports
–Tracking - no recording of processing, object lineage or user activity
–Performance metrics - no process or network use metrics
–Data (disk) cache
–Permissions and Object Security
BigSur provides built-in solutions to all of these needs.