Introduction to the Grid Roy Williams, Caltech
Enzo Case Study Simulated dark matter density in the early universe.
Enzo Features
– N-body gravitational dynamics (particle-mesh method)
– Hydrodynamics with PPM and ZEUS finite-difference methods
– Up to 9 species of H and He
– Radiative cooling
– Uniform UV background (Haardt & Madau)
– Star formation and feedback
– Metallicity fields
Adaptive Mesh Refinement (AMR)
– Multilevel grid hierarchy: automatic, adaptive, recursive
– No limits on depth or complexity of grids
– C++/F77; Bryan & Norman (1998)
Source: J. Shalf
Distributed Computing Zoo
– Grid Computing: also called High-Performance Computing. Big clusters, big data, big pipes, big centers; a Globus backbone, which now includes Services and Gateways; decentralized control.
– Cluster Computing: local interconnect between identical CPUs.
– Peer-to-Peer (Napster, Kazaa): systems for sharing data without a central server.
– Internet Computing: screensaver cycle scavenging, e.g. ClimatePrediction.net.
– Access Grid: a videoconferencing system.
– Globus: a popular software package to federate resources into a grid.
– TeraGrid: a $150M award from NSF to the supercomputer centers (NCSA, SDSC, PSC, etc.).
What is the Grid? The World Wide Web provides seamless access to information stored in many millions of different geographical locations. In contrast, the Grid is an emerging infrastructure that provides seamless access to computing power and data storage capacity distributed over the globe.
What is the Grid? The term "Grid" was coined by Ian Foster and Carl Kesselman in The Grid: Blueprint for a New Computing Infrastructure. The analogy is with the electric power grid: plug in to computing power without worrying where it comes from, like a toaster. The idea has been around under other names for a while (distributed computing, metacomputing, …). The technology is now in place to realise the dream on a global scale.
How will it work? The Grid relies on advanced software, called middleware, which ensures seamless communication between different computers in different parts of the world. The Grid search engine will not only find the data the scientist needs, but also the data-processing techniques and the computing power to carry them out. It will distribute the computing task to wherever in the world there is spare capacity, and send the result to the scientist.
How will it work? The Grid middleware:
– finds convenient places for the scientist's job (computing task) to be run
– optimises use of the widely dispersed resources
– organises efficient access to scientific data
– deals with authentication to the different sites
– interfaces to local site authorisation / resource allocation
– runs the jobs
– monitors progress
– recovers from problems
– … and tells you when the work is complete and transfers the result back!
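The "finds convenient places for the job to be run" step above can be sketched as a toy broker. This is purely illustrative, not real Grid middleware: the site names and free-CPU figures are invented, and a real broker would also weigh data locality, queue wait times, and site policy.

```python
# Toy resource broker: pick the site with the most free CPUs.
# Site names and load figures are invented for illustration.

def pick_site(sites):
    """Return the name of the site with the most free CPUs."""
    return max(sites, key=lambda name: sites[name]["free_cpus"])

sites = {
    "ncsa": {"free_cpus": 128},
    "sdsc": {"free_cpus": 512},
    "psc":  {"free_cpus": 64},
}
print(pick_site(sites))  # sdsc
```

In practice this decision is made by a broker or matchmaker (e.g. Condor's ClassAd matching) rather than a single max() over a static table.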
Benefits for Science More effective and seamless collaboration of dispersed communities, both scientific and commercial. The ability to run large-scale applications comprising thousands of computers, for a wide range of applications. Transparent access to distributed resources from your desktop, or even your mobile phone. The term e-Science has been coined to express these benefits.
Five Big Ideas of the Grid
– Federated sharing: independent management
– Trust and security: access policy; authentication; authorization
– Load balancing and efficiency: Condor, queues, prediction, brokering
– Distance doesn't matter: 20 Mbyte/sec links, global certificates
– Open standards: NVO, FITS, MPI, Globus, SOAP
Grid as Federation The Grid is a federation:
– independent centers: flexibility
– unified interface: power and strength
– a large-state/small-state compromise
Grid projects in the world
– USA: NASA Information Power Grid, DOE Science Grid, NSF National Virtual Observatory, NSF GriPhyN, DOE Particle Physics Data Grid, NSF TeraGrid, DOE ASCI Grid, DOE Earth Systems Grid, DARPA CoABS Grid, NEESGrid, DOH BIRN, NSF iVDGL
– National: UK e-Science Grid; Netherlands: VLAM, PolderGrid; Germany: UNICORE, Grid proposal; France: Grid funding approved; Italy: INFN Grid; Eire: Grid proposals; Switzerland: Network/Grid proposal; Hungary: DemoGrid, Grid proposal; Norway, Sweden: NorduGrid
– EU: DataGrid (CERN, …), EuroGrid (Unicore), DataTag (CERN, …), Astrophysical Virtual Observatory, GRIP (Globus/Unicore), GRIA (industrial applications), GridLab (Cactus Toolkit), CrossGrid (infrastructure components), EGSO (solar physics)
The TeraGrid Vision Distributing the resources is better than putting them at one site. Recently awarded $150M by NSF.
– Build new, extensible, grid-based infrastructure to support grid-enabled scientific applications: new hardware, new networks, new software, new practices, new policies
– Expand centers to support cyberinfrastructure: a distributed, coordinated operations center; exploit unique partner expertise and resources to make the whole greater than the sum of its parts
– Leverage homogeneity to make distributed computing easier and simplify initial development and standardization: run a single job across the entire TeraGrid; move executables between sites
TeraGrid Allocations Policies Any US researcher can request an allocation.
– Policies/procedures posted at:
– Online proposal submission: https://pops-submit.paci.org/
NVO has an account on TeraGrid (just ask RW).
Wide Variety of Usage Scenarios
– Tightly coupled simulation jobs storing vast amounts of data, performing visualization remotely, as well as making data available through online collections (ENZO)
– Thousands of independent jobs using data from a distributed data collection (NVO)
– Science Gateways: "not a Unix prompt"! From a web browser with security; a SOAP client for scripting; from an application, e.g. IRAF, IDL
Cluster Supercomputer (labels from the slide diagram)
– 100s of nodes
– login node: job submission and queueing (Condor, PBS, …)
– purged /scratch: parallel file system, parallel I/O
– /home (backed up)
– user metadata node
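Job submission on such a cluster typically goes through a batch script handed to the queueing system. The following is an illustrative PBS script; the resource directives, queue names, and the `enzo` binary and parameter file are invented examples, and the exact flags vary by site and PBS version.

```shell
#!/bin/sh
#PBS -N enzo-run                 # job name
#PBS -l nodes=16:ppn=2           # request 16 nodes, 2 processors per node
#PBS -l walltime=01:00:00        # one-hour wall-clock limit

cd $PBS_O_WORKDIR                # start in the directory the job was submitted from
mpirun -np 32 ./enzo amr.param   # launch the parallel job on 32 processes
```

The script is submitted with `qsub script.pbs`; the scheduler then runs it on the compute nodes when resources become free.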
MPI parallel programming Each node runs the same program; it first finds its own number (rank) and the number of coordinating nodes (size). Laplace solver example. Algorithm: each value becomes the average of its neighbor values.
– Serial: for each point, compute the average; remember the boundary conditions.
– Parallel: run the same algorithm with ghost points; use messages to exchange ghost-point values between neighboring nodes (node 0, node 1, …).
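The serial version of the relaxation above can be sketched in a few lines. This is a plain-Python stand-in, not the MPI code: in the parallel version each node would own a slice of the grid and exchange its edge ("ghost") values with neighbors via messages each sweep.

```python
# Jacobi relaxation on a 1-D grid with fixed boundary values:
# each interior point becomes the average of its two neighbors.

def jacobi_step(u):
    """One sweep; endpoints are Dirichlet boundary values and stay fixed."""
    return [u[0]] + [(u[i - 1] + u[i + 1]) / 2 for i in range(1, len(u) - 1)] + [u[-1]]

u = [0.0, 0.0, 0.0, 0.0, 1.0]   # boundaries 0 and 1, interior initially 0
for _ in range(200):             # iterate toward the steady state
    u = jacobi_step(u)
print(u)  # approaches the linear profile [0, 0.25, 0.5, 0.75, 1]
```

The converged solution of the 1-D Laplace equation is just linear interpolation between the boundary values, which makes the sketch easy to check.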
Storage Resource Broker (SRB)
– Single logical namespace while accessing distributed archival storage resources
– Effectively infinite storage
– Data replication
– Parallel transfers
– Interfaces: command-line, API, SOAP, web/portal
Storage Resource Broker (SRB): Virtual Resources, Replication Clients (browser, SOAP client, command-line, …) see one logical namespace over many physical resources (e.g. casjobs at JHU, tape at SDSC, myDisk). Similar to the NVO VOStore concept. Access is with a certificate. A file may be replicated, and a file comes with metadata, which may be customized.
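The SRB command-line interface mentioned above is the "S-commands" suite. A session might look like the following sketch; the file and collection names are invented, and the available options vary with the SRB release.

```shell
Sinit                        # authenticate and open an SRB session
Sput mydata.fits survey/     # store a local file into a logical collection
Sls survey                   # list the logical collection
Sget survey/mydata.fits .    # retrieve the file (from whichever replica)
Sexit                        # close the session
```

The point of the logical namespace is that `survey/mydata.fits` names the data, not its physical location: it may live on tape at one site and disk at another.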
Globus
– Security: single sign-on, certificate handling, CAS, MyProxy
– Execution management: remote jobs with GRAM and Condor-G
– Data management: GridFTP, reliable file transfer, third-party file transfer
– Information services: aggregating information from federated grid resources
– Common runtime components: new web-service based
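The security and data-management pieces above combine in everyday use roughly as follows. The hostnames and paths here are invented for illustration, and the exact behavior depends on the Globus Toolkit version installed at the site.

```shell
grid-proxy-init      # single sign-on: create a short-lived proxy from your certificate

# GridFTP transfer of a remote file to local disk (hostname/paths invented)
globus-url-copy \
    gsiftp://gridftp.example.edu/scratch/run42/output.fits \
    file:///home/rw/output.fits
```

Because both endpoints can be remote `gsiftp://` URLs, the same command also performs the third-party transfers listed under data management.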
Public Grids for Astronomy
– Data pipelines: split into independent pieces, send to a scheduler (Condor, PBS, Condor-G, DAGMan, Pegasus)
– Big data storage: infinite tape, purged disk, scratch disk; no permanent TByte disk
– Services: VOStore, SIAP
– Science gateways: asynchronous, secure, web, scripted
Public Grids for Astronomy
– Databases: not really supported (note: ask audience if this is true); there is a VO effort for this (CasJobs, VOStore)
– Simulation: forward problems use 100s of synchronized nodes with MPI; inverse problems use independent trials, 1000s of jobs