Foundations for an LHC Data Grid. Stu Loken, Berkeley Lab
The Message: Large-scale distributed computing (known as Grids) is a major thrust of the U.S. computing community. Investment in Grid R&D and infrastructure is ~$100M per year. This investment can and should be leveraged to provide the regional computing model for LHC.
The Vision for the Grid: Persistent, Universal and Ubiquitous Access to Networked Resources; Common Tools and Infrastructure for Building 21st Century Applications; Integrating HPC, Data-Intensive Computing, Remote Visualization and Advanced Collaboration Technologies
The Grid from a Services View (layers, top to bottom): Applications (Chemistry, Biology, Cosmology, High Energy Physics, Environment). Application Toolkits (Distributed Computing, Data-Intensive, Collaborative, Remote Visualization, Problem Solving, Remote Instrumentation). Grid Services (Middleware): resource-independent and application-independent services, e.g., authentication, authorization, resource location, resource allocation, events, accounting, remote data access, information, policy, fault detection. Grid Fabric (Resources): resource-specific implementations of basic services, e.g., transport protocols, name servers, differentiated services, CPU schedulers, public key infrastructure, site accounting, directory service, OS bypass.
Grid-based Computing Projects: China Clipper; Particle Physics Data Grid; NASA Information Power Grid: Distributed Problem Solving; Access Grid: The Future of Distributed Collaboration
Clipper Project: ANL-SLAC-Berkeley; Push the limits of very high-speed data transmission; Builds on Globus Middleware and high-performance distributed storage; Demonstrated data rates up to 50 Mbytes/sec.
China Clipper Tasks: High-Speed Testbed – Computing and networking infrastructure; Differentiated Network Services – Traffic shaping on ESnet; Monitoring Architecture – Traffic analysis to support traffic shaping and CPU scheduling; Data Architecture – Transparent management of data; Application Demonstration – Standard Analysis Framework (STAF)
Monitoring: End-to-end monitoring of the assets in a computational grid is necessary both for resolving network throughput problems and for dynamically scheduling resources. China Clipper adds precision-timed event monitor agents to ATM switches, DPSS servers and testbed computational resources, produces trend analysis modules for the monitor agents, and makes the results available to applications.
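The slide does not say how the monitor agents or trend analysis modules are built. As a rough illustration only, a trend module could fit a straight line to timestamped throughput samples and report the slope, so that a scheduler can tell whether a path is degrading. The names `MonitorEvent` and `throughput_trend` are hypothetical and not taken from China Clipper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MonitorEvent:
    """One timestamped sample reported by a (hypothetical) monitor agent."""
    timestamp: float      # seconds since some common, synchronized epoch
    source: str           # e.g. a switch, storage server or compute node name
    mbytes_per_s: float   # throughput observed at that instant


def throughput_trend(events: List[MonitorEvent]) -> float:
    """Least-squares slope of throughput against time.

    Returns the change in Mbytes/sec per second; a negative value means
    throughput is falling over the sampled interval.
    """
    n = len(events)
    if n < 2:
        raise ValueError("need at least two events to estimate a trend")
    mean_t = sum(e.timestamp for e in events) / n
    mean_y = sum(e.mbytes_per_s for e in events) / n
    num = sum((e.timestamp - mean_t) * (e.mbytes_per_s - mean_y) for e in events)
    den = sum((e.timestamp - mean_t) ** 2 for e in events)
    if den == 0:
        raise ValueError("all events share one timestamp")
    return num / den


if __name__ == "__main__":
    samples = [
        MonitorEvent(0.0, "server-a", 50.0),
        MonitorEvent(1.0, "server-a", 48.0),
        MonitorEvent(2.0, "server-a", 46.0),
    ]
    print(throughput_trend(samples))
```

The fit only makes sense if the agents' clocks agree, which is one reason precision timing matters for this kind of monitoring.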
Particle Physics Data Grid: HENP Labs and Universities (Caltech-SLAC lead); Extend GRID concept to large-scale distributed data analysis; Uses NGI testbeds as well as production networks; Funded by DOE-NGI program
NGI: “Particle Physics Data Grid”. ANL (CS/HEP), BNL, Caltech, FNAL, JLAB, LBNL (CS/HEP), SDSC, SLAC, U. Wisconsin. High-Speed Site-to-Site File Replication Service. FIRST YEAR: SLAC-LBNL at least; Goal intentionally requires > OC12; Use existing hardware and networks (NTON); Explore “Diffserv”, instrumentation, reservation/allocation.
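For scale, OC-12 is a SONET line rate of 622.08 Mbit/s, or about 78 Mbytes/sec before any protocol overhead. The sketch below is a back-of-the-envelope calculation, not a measurement. It compares that raw rate with the 50 Mbytes/sec quoted for the Clipper demonstration, using the 1-10 GB file sizes targeted by the cached file access service. It assumes decimal units (1 GB = 1000 Mbytes) and ignores overhead, so real transfers would be slower.

```python
# Back-of-the-envelope transfer times; protocol overhead is ignored.
OC12_MBIT_PER_S = 622.08      # SONET OC-12 line rate
CLIPPER_MBYTES_PER_S = 50.0   # rate quoted for the Clipper demonstration


def mbit_to_mbytes_per_s(mbit_per_s: float) -> float:
    """Convert a rate in Mbit/s to Mbytes/sec (8 bits per byte)."""
    return mbit_per_s / 8.0


def transfer_seconds(size_gb: float, rate_mbytes_per_s: float) -> float:
    """Seconds needed to move size_gb gigabytes (assuming 1 GB = 1000 Mbytes)."""
    return size_gb * 1000.0 / rate_mbytes_per_s


if __name__ == "__main__":
    oc12 = mbit_to_mbytes_per_s(OC12_MBIT_PER_S)
    for size_gb in (1, 10):
        at_clipper = transfer_seconds(size_gb, CLIPPER_MBYTES_PER_S)
        at_oc12 = transfer_seconds(size_gb, oc12)
        print(f"{size_gb} GB: {at_clipper:.0f} s at 50 Mbytes/sec, "
              f"{at_oc12:.0f} s at the OC-12 line rate")
```

Under these assumptions a 10 GB file takes 200 seconds at 50 Mbytes/sec and roughly 130 seconds at the raw OC-12 rate.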
NGI: “Particle Physics Data Grid”. Deployment of Multi-Site Cached File Access. FIRST YEAR: Read access only; Optimized for 1-10 GB files; File-level interface to ODBMSs; Maximal use of Globus, MCAT, SAM, OOFS, Condor, Grand Challenge etc.; Focus on getting users.
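The slide gives no design for the cached file access service. The sketch below only illustrates the general idea of a read-only, file-level cache: a file is fetched from a remote site on first use, later reads are served locally, and the least recently used files are evicted when space runs out. `ReadOnlyFileCache` and `fetch_remote` are hypothetical names, and holding files in memory is a simplification. A real service for multi-gigabyte files would cache on disk and use the listed middleware for transfers.

```python
from collections import OrderedDict
from typing import Callable


class ReadOnlyFileCache:
    """Whole-file, read-only cache with least-recently-used eviction."""

    def __init__(self, fetch_remote: Callable[[str], bytes], capacity_bytes: int):
        self._fetch = fetch_remote           # hypothetical remote-site reader
        self._capacity = capacity_bytes
        self._files: "OrderedDict[str, bytes]" = OrderedDict()
        self._used = 0
        self.remote_fetches = 0              # counts cache misses

    def read(self, name: str) -> bytes:
        """Return the file contents, fetching from the remote site on a miss."""
        if name in self._files:
            self._files.move_to_end(name)    # mark as most recently used
            return self._files[name]
        data = self._fetch(name)
        self.remote_fetches += 1
        self._store(name, data)
        return data

    def _store(self, name: str, data: bytes) -> None:
        if len(data) > self._capacity:
            return                           # too large to cache; serve uncached
        while self._used + len(data) > self._capacity:
            _, evicted = self._files.popitem(last=False)   # drop the oldest file
            self._used -= len(evicted)
        self._files[name] = data
        self._used += len(data)


if __name__ == "__main__":
    cache = ReadOnlyFileCache(lambda name: name.encode() * 2, capacity_bytes=10)
    cache.read("run1")
    cache.read("run1")
    print(cache.remote_fetches)
```

Restricting the first year to read access sidesteps cache-consistency questions: with no writes, a cached copy never goes stale unless the source file is replaced.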
Information Power Grid: Distributed High-Performance Computing, Large-Scale Data Management, and Collaboration Environments for Science and Engineering. Building Problem-Solving Environments. William E. Johnston, Dennis Gannon, William Nitzberg
IPG Requirements: Multiple datasets; Complex workflow scenarios; Data-streams from instrument systems; Sub-component simulations coupled simultaneously; Appropriate levels of abstraction; Search, interpret and fuse multiple data archives; Share all aspects of work processes; Bursty resource availability and scheduling; Sufficient available resources; VR and immersive techniques; Software agents to assist in routine/repetitive tasks. All this will be supported by the Grid. Problem-solving environments (PSEs) are the primary scientific/engineering user interface to the Grid.
The Future of Distributed Collaboration Technology: The Access Grid. Ian Foster, Rick Stevens, Argonne National Laboratory
Beyond Teleconferencing: Physical spaces to support distributed groupwork; Virtual collaborative venues; Agenda-driven scenarios and work sessions; Integration with Integrated GRID services
Access Grid Project Goals: Enable Group-to-Group Interactions at a Distance; Provide a Sense of Presence; Use Quality but Affordable Digital IP-Based Audio/Video (Open Source); Enable Complex Multi-site Visual and Collaborative Experiences; Build on Integrated Grid Services Architecture
The Docking Concept for Access Grid: Private Workspaces - Docked into the Group Workspace
Access Grid Nodes (figure labels): Ambient mic (tabletop), Presenter mic, Presenter camera, Audience camera. Access Grid Nodes Under Development: Library, Workshop; ActiveMural Room; Office; Auditorium
Conclusion: A set of closely coordinated projects is laying the foundation for a high-performance distributed computing environment. There appear to be good prospects for a significant long-term investment to deploy the necessary infrastructure to support Particle Physics Data Analysis.