1
Distributed Data Access and Analysis
for Next Generation HENP Experiments
Harvey Newman, Caltech
CHEP 2000, Padova, February 10, 2000
2
LHC Computing: Different from Previous Experiment Generations
Geographical dispersion: of people and resources
Complexity: the detector and the LHC environment
Scale: Petabytes per year of data; ~5000 physicists, 250 institutes, ~50 countries
Major challenges associated with:
- Coordinated use of distributed computing resources
- Remote software development and physics analysis
- Communication and collaboration at a distance
R&D: a new form of distributed system: the Data Grid
3
Four Experiments: The Petabyte to Exabyte Challenge
ATLAS, CMS, ALICE, LHCb: Higgs and new particles; quark-gluon plasma; CP violation
Data written to “tape”: ~5 Petabytes/year and up (1 PB = 10^15 Bytes), total for the LHC experiments
Rising toward 0.1 to 1 Exabyte (1 EB = 10^18 Bytes) (~2010) (~2020?)
4
To Solve: the LHC “Data Problem”
While the proposed LHC computing and data handling facilities are large by present-day standards, they will not support FREE access, transport or processing for more than a minute part of the data.
Balance between proximity to large computational and data handling facilities, and proximity to end users and more local resources for frequently-accessed datasets.
Strategies must be studied and prototyped, to ensure both acceptable turnaround times and efficient resource utilisation.
Problems to be explored:
- How to meet the demands of hundreds of users who need transparent access to local and remote data, in disk caches and tape stores
- Prioritise hundreds of requests from local and remote communities, consistent with local and regional policies
- Ensure that the system is dimensioned/used/managed optimally, for the mixed workload
5
MONARC General Conclusions on LHC Computing
Following discussions of computing and network requirements, technology evolution and projected costs, support requirements, etc.:
- The scale of LHC “Computing” requires a worldwide effort to accumulate the necessary technical and financial resources
- A distributed hierarchy of computing centres will lead to better use of the financial and manpower resources of CERN, the Collaborations, and the nations involved, than a highly centralized model focused at CERN
- The distributed model also provides better use of physics opportunities at the LHC by physicists and students
- At the top of the hierarchy is the CERN Centre, with the ability to perform all analysis-related functions, but not the capacity to do them completely
- At the next step in the hierarchy is a collection of large, multi-service “Tier1 Regional Centres”, each with 10-20% of the CERN capacity devoted to one experiment
- There will be Tier2 or smaller special-purpose centres in many regions
6
Bandwidth Requirements Estimate (Mbps) [*] ICFA Network Task Force
Year                                                       1998             2000          2005
BW utilized per physicist (and peak BW used)               0.05-0.25 (...)  ... (2-10)    ... (...)
BW utilized by a university group                          ...              ...           ...
BW to a home laboratory or regional center                 ...              34-155        622-5000
BW to a central laboratory housing one or more
  major experiments                                        ...              155-622       2500-10000
BW on a transoceanic link                                  ...              34-155        ...

Circa 2000: predictions roughly on track:
- “Universal” BW growth by ~2X per year
- 622 Mbps on European and transatlantic links by ~2002-3
- Terabit/sec US backbones (e.g. ESNet) by ~2003-5
Caveats:
- Distinguish raw bandwidth from effective line capacity, and from the maximum end-to-end rate for individual data flows
- “QoS” / IP has a way to go
[D388, D402, D274]
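To make the caveat about raw bandwidth versus effective capacity concrete, here is a small illustrative calculation (not from the slide; the 50% efficiency factor is an assumed round number for protocol overheads and link sharing):

```latex
% Illustrative arithmetic; the 0.5 efficiency factor is an assumption.
622~\mathrm{Mbit/s} \;\approx\; \tfrac{622}{8}~\mathrm{MByte/s} \;\approx\; 78~\mathrm{MByte/s}\ \text{(raw)},
\qquad
78~\mathrm{MByte/s} \times 0.5 \;\approx\; 39~\mathrm{MByte/s}\ \text{(effective, single flow)}.
```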
7
CMS Analysis and Persistent Object Store
(Diagram) Online: common filters and pre-emptive object creation (L1, L2/L3, “L4”), with CMS slow control and detector monitoring feeding the Persistent Object Store (Object Database Management System); Offline: filtering, simulation, calibrations, group analyses, user analysis
Data organized in a(n object) “hierarchy”: Raw, Reconstructed (ESD), Analysis Objects (AOD), Tags
Data distribution:
- All raw, reconstructed and master parameter DBs at CERN
- All event TAGs and AODs, and selected reconstructed data sets, at each regional centre
- HOT data (frequently accessed) moved to RCs
- Goal of location and medium transparency
- On-demand object creation
[C121]
8
GIOD Summary
GIOD has:
- Constructed a Terabyte-scale set of fully simulated CMS events and used these to create a large OO database
- Learned how to create large database federations
- Completed the “100” (to 170) MByte/sec CMS milestone
- Developed prototype reconstruction and analysis codes, and Java 3D OO visualization demonstrators, that work seamlessly with persistent objects over networks
- Deployed facilities and database federations as useful testbeds for Computing Model studies
(Event display: hits, tracks, detector)
[C51, C226]
9
Data Grid Hierarchy (CMS Example)
1 TIPS = 25,000 SpecInt95
(Tiered diagram:)
Online system: bunch crossings every 25 nsec; 100 triggers per second; each event ~1 MByte; ~PBytes/sec off the detector, ~100 MBytes/sec to the offline farm (~20 TIPS)
Tier 0: CERN computer centre with HPSS, fed at ~100 MBytes/sec; linked to Tier 1 at ~622 Mbits/sec (or by air freight)
Tier 1: regional centres with HPSS, e.g. Fermilab (~4 TIPS), France, Germany, Italy; linked onward at ~2.4 Gbits/sec
Tier 2: Tier2 centres, ~1 TIPS each; linked to Tier 3 at ~622 Mbits/sec
Tier 3: institute servers (~0.25 TIPS) with physics data cache; physicists work on analysis “channels”; each institute has ~10 physicists working on one or more channels; data for these channels should be cached by the institute server
Tier 4: physicists’ workstations
[E277]
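A back-of-the-envelope check of the rates in this hierarchy (the ~10^7 live seconds per year is a conventional assumption, not a number on the slide):

```latex
% Assumes ~10^7 live seconds per year of data taking.
100~\mathrm{Hz} \times 1~\mathrm{MB/event} = 100~\mathrm{MB/s},
\qquad
100~\mathrm{MB/s} \times 10^{7}~\mathrm{s/yr} = 10^{9}~\mathrm{MB/yr} = 1~\mathrm{PB/yr}
```

of raw data per experiment, consistent with the earlier figure of ~5 PB/year written to tape by the LHC experiments once reconstructed and simulated data are included.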
10
LHC (and HEP) Challenges of Petabyte-Scale Data
Technical requirements:
- Optimize use of resources with next-generation middleware
- Co-locate and co-schedule resources and requests
- Enhance database systems to work seamlessly across networks: caching/replication/mirroring
- Balance proximity to centralized facilities, and to end users for frequently accessed data
Requirements of the worldwide collaborative nature of the experiments:
- Make appropriate use of data analysis resources in each world region, conforming to local and regional policies
- Involve scientists and students in each world region in front-line physics research, through an integrated collaborative environment
[E163, C74, C292]
11
Time-Scale: CMS Recent “Events”
A PHASE TRANSITION in our understanding of the role of CMS Software and Computing occurred in October-November 1999:
- “Strong coupling” of the S&C task, Trigger/DAQ, the Physics TDR, detector performance studies and other main milestones
- Integrated CMS Software and Trigger/DAQ planning for the next round: May 2000 milestone
- Large simulated samples are required: ~1 million events fully simulated, a few times during 2000, in ~1 month
- A smoothly rising curve of computing and data handling needs from now on
- Mock Data Challenges from 2000 (1% scale) to 2005
- Users want substantial parts of the functionality formerly planned for 2005, starting now
[A108]
12
Roles of Projects for HENP Distributed Analysis
RD45, GIOD: networked object databases
Clipper/GC: high-speed access to objects or file data
FNAL/SAM: for processing and analysis
SLAC/OOFS: distributed file system + Objectivity interface
NILE, Condor: fault-tolerant distributed computing with heterogeneous CPU resources
MONARC: LHC Computing Models: architecture, simulation, strategy, politics
PPDG: first distributed data services and Data Grid system prototype
ALDAP: database structures and access methods for astrophysics and HENP data
GriPhyN: production-scale Data Grid
APOGEE: simulation/modeling, application + network instrumentation, system optimization/evaluation
[A391, E277]
13
MONARC: Common Project
Models Of Networked Analysis At Regional Centers
Caltech, CERN, Columbia, FNAL, Heidelberg, Helsinki, INFN, IN2P3, KEK, Marseilles, MPI Munich, Orsay, Oxford, Tufts
PROJECT GOALS:
- Develop “Baseline Models”
- Specify the main parameters characterizing the Model’s performance: throughputs, latencies
- Verify resource requirement baselines (computing, data handling, networks)
TECHNICAL GOALS:
- Define the Analysis Process
- Define RC Architectures and Services
- Provide guidelines for the final Models
- Provide a simulation toolset for further Model studies
(Diagram, Model circa 2005: CERN 350k SI95, 350 TBytes disk, robot; FNAL/BNL 70k SI95, 70 TBytes disk, robot; Tier2 centre 20k SI95, 20 TB disk; universities; 622 Mbits/s links, N x 622 Mbits/s to CERN, optional air freight)
[F148]
14
MONARC Working Groups/Chairs
“Analysis Process Design”: P. Capiluppi (Bologna, CMS)
“Architectures”: Joel Butler (FNAL, CMS)
“Simulation”: Krzysztof Sliwa (Tufts, ATLAS)
“Testbeds”: Lamberto Luminari (Rome, ATLAS)
“Steering”: Laura Perini (Milan, ATLAS) and Harvey Newman (Caltech, CMS), & “Regional Centres Committee”
15
MONARC Architectures WG: Regional Centre Facilities & Services
Regional Centres should provide:
- All technical and data services required to do physics analysis
- All Physics Objects, Tags and Calibration data
- A significant fraction of raw data
- Caching or mirroring of calibration constants
- Excellent network connectivity to CERN and the region’s users
- Manpower to share in the development of common validation and production software
- A fair share of post- and re-reconstruction processing
- Manpower to share in ongoing work on common R&D projects
- Excellent support services for training, documentation, troubleshooting at the Centre or remote sites served by it
- Service to members of other regions
- A long-term commitment for staffing, hardware evolution and support for R&D, as part of the distributed data analysis architecture
16
MONARC and Regional Centres
MONARC RC FORUM: representative meetings quarterly
Regional Centre planning is well advanced, with optimistic outlook, in the US (FNAL for CMS; BNL for ATLAS), France (CCIN2P3), Italy, UK
- Proposals submitted late 1999 or early 2000
Active R&D and prototyping underway, especially in the US, Italy, Japan; and the UK (LHCb), Russia (MSU, ITEP), Finland (HIP)
Discussions in the national communities also underway in Japan, Finland, Russia, Germany
There is a near-term need to understand the level and sharing of support for LHC computing between CERN and the outside institutes, to enable the planning in several countries to advance
- MONARC uses the traditional 1/3 : 2/3 sharing assumption
17
Regional Center Architecture Example by I. Gaines (MONARC)
(Diagram) Tape mass storage & disk servers, database servers, tapes; network links from CERN, from Tier2 and simulation centres, and to Tier2 and local institutes; data import and data export
- Production reconstruction: Raw/Sim to ESD; scheduled, predictable; experiment/physics groups
- Production analysis: ESD to AOD, AOD to DPD; scheduled; physics groups
- Individual analysis: AOD to DPD and plots; chaotic; physicists
- CERN, tapes, desktops
- Support services: physics software development, R&D systems and testbeds, info servers, code servers, web servers, telepresence servers, training, consulting, help desk
[C169]
18
Data Grid Tier2 Layer: Create an Ensemble of (University-Based) Tier2 Data Analysis Centres
Site architectures complementary to the major Tier1 lab-based centers:
- Medium-scale Linux CPU farm, Sun data server, RAID disk array
- Less need for 24 x 7 operation; some lower component costs
- Less production-oriented, to respond to local and regional analysis priorities and needs
- Supportable by a small local team and physicists’ help
One Tier2 center in each region (e.g. of the US):
- Catalyze local and regional focus on particular sets of physics goals
- Encourage coordinated analysis developments emphasizing particular physics aspects or subdetectors. Example: CMS EMU in the Southwest US
Emphasis on training, and involvement of students at universities in front-line data analysis and physics results
Include a high-quality environment for desktop remote collaboration
[E277]
19
MONARC Analysis Process Example
(Diagram of the analysis process, from Slow Control/Calibration and DAQ/RAW data onward.)
20
MONARC Analysis Model Baseline: ATLAS or CMS “Typical” Tier1 RC
CPU power: ~100 KSI95
Disk space: ~100 TB
Tape capacity: 300 TB, 100 MB/sec
Link speed to Tier2: 10 MB/sec (1/2 of 155 Mbps)

  Data type         Fraction held    Volume
  Raw data          1%               ... TB/year
  ESD data          100%             ... TB/year
  Selected ESD      25%              5 TB/year [*]
  Revised ESD       25%              10 TB/year [*]
  AOD data          100%             2 TB/year [**]
  Revised AOD       100%             4 TB/year [**]
  TAG/DPD           100%             200 GB/year
  Simulated data    25%              25 TB/year (repository)

[*] Covering five analysis groups; each selecting ~1% of annual ESD or AOD data for a typical analysis
[**] Covering all analysis groups
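As an illustration of what the 10 MB/sec Tier2 link implies (simple arithmetic from the numbers above, ignoring protocol overheads and competing traffic):

```latex
\frac{2~\mathrm{TB\ (AOD/year)}}{10~\mathrm{MB/s}}
= \frac{2\times 10^{6}~\mathrm{MB}}{10~\mathrm{MB/s}}
= 2\times 10^{5}~\mathrm{s} \;\approx\; 2.3~\mathrm{days}
```

so even a full refresh of the annual AOD sample at a Tier2 is a multi-day operation at this link speed.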
21
MONARC Testbeds WG: Isolation of Key Parameters
Some parameters measured, installed in the MONARC simulation models, and used in the first round of validation of the Models:
- Objectivity AMS response time-function, and its dependence on object clustering, page size, data class-hierarchy and access pattern
- Mirroring and caching (e.g. with the Objectivity DRO option)
- Scalability of the system under “stress”: performance as a function of the number of jobs, relative to the single-job performance
- Performance and bottlenecks for a variety of data access patterns
- Tests over LANs and WANs
[D235, D127]
22
MONARC Testbeds WG: Test-bed configuration defined and widely deployed
“Use Case” applications using Objectivity: GIOD/JavaCMS, CMS test beams, ATLASFAST++, ATLAS 1 TB milestone
Both LAN and WAN tests
ORCA4 (CMS): first “production” application; realistic data access patterns; disk/HPSS
“Validation” milestone carried out, with the Simulation WG
[A108, C113]
23
MONARC Testbed Systems
24
Multitasking Processing Model
A Java 2-based, CPU- and code-efficient simulation for distributed systems has been developed: a process-oriented discrete event simulation [F148]
Concurrent running tasks share resources (CPU, memory, I/O)
“Interrupt”-driven scheme: for each new task, or when one task is finished, an interrupt is generated and all “processing times” are recomputed
It provides:
- An easy way to apply different load-balancing schemes
- An efficient mechanism to simulate multitask processing
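To make the interrupt-driven scheme concrete, here is a minimal sketch in plain Java (not the MONARC simulation toolset; the class names, the equal-CPU-share policy and the job parameters are illustrative assumptions). Whenever a task arrives or finishes, the per-task CPU share and the next completion time are recomputed.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of an "interrupt"-driven multitasking model: whenever a task
// arrives or finishes, the CPU share of every active task is recomputed.
// Names and the equal-share policy are illustrative, not the MONARC toolset.
public class MultitaskSim {
    static class Task {
        final String name;
        final double arrival;     // seconds
        double remainingWork;     // SI95 * seconds of CPU work still to do
        Task(String name, double arrival, double work) {
            this.name = name; this.arrival = arrival; this.remainingWork = work;
        }
    }

    public static void main(String[] args) {
        double cpuPower = 100.0;  // total CPU power in SI95 (illustrative)
        List<Task> pending = new ArrayList<>(List.of(
            new Task("job-A", 0.0, 500.0),
            new Task("job-B", 2.0, 300.0),
            new Task("job-C", 4.0, 300.0)));   // sorted by arrival time
        List<Task> active = new ArrayList<>();
        double now = 0.0;

        while (!pending.isEmpty() || !active.isEmpty()) {
            // Next "interrupt": either the next arrival or the next completion.
            double nextArrival = pending.isEmpty() ? Double.MAX_VALUE : pending.get(0).arrival;
            double share = active.isEmpty() ? 0.0 : cpuPower / active.size(); // equal sharing
            double nextCompletion = Double.MAX_VALUE;
            Task finishing = null;
            for (Task t : active) {
                double finish = now + t.remainingWork / share;
                if (finish < nextCompletion) { nextCompletion = finish; finishing = t; }
            }
            double next = Math.min(nextArrival, nextCompletion);
            // Advance time: all active tasks progress at the current share.
            for (Task t : active) t.remainingWork -= share * (next - now);
            now = next;
            if (nextArrival <= nextCompletion) {
                active.add(pending.remove(0));          // arrival interrupt
            } else {
                active.remove(finishing);               // completion interrupt
                System.out.printf("%s finished at t = %.1f s%n", finishing.name, now);
            }
        }
    }
}
```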
25
Role of Simulation for Distributed Systems
Simulations are widely recognized and used as essential tools for the design, performance evaluation and optimisation of complex distributed systems
- From battlefields to agriculture; from the factory floor to telecommunications systems
Discrete event simulations with an appropriate and high level of abstraction are just beginning to be part of the HEP culture
- Some experience in trigger, DAQ and tightly coupled computing systems: CERN CS2 models (event-oriented); MONARC (process-oriented; Java 2 threads + class library)
These simulations are very different from HEP “Monte Carlos”: “time” intervals and interrupts are the essentials
Simulation is a vital part of the study of site architectures, network behavior, and data access/processing/delivery strategies, for HENP Grid design and optimization
26
Example: Physics Analysis at Regional Centres
Similar data processing jobs are performed in each of several RCs
Each Centre has “TAG” and “AOD” databases replicated
The main Centre provides “ESD” and “RAW” data
Each job processes AOD data, and also a fraction of the ESD and RAW data
27
Example: Physics Analysis
28
Simple Validation Measurements: The AMS Data Access Case
(Plot: distribution of 32 jobs’ processing times, simulation vs. measurement; setup: client on a LAN, 4 CPUs, raw data DB.)
Simulation mean: 109.5; measurement mean: 114.3
[C113]
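The quoted means correspond to agreement at the few-percent level (simple arithmetic, not a statement from the slide):

```latex
\frac{114.3 - 109.5}{114.3} \;\approx\; 4.2\%
```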
29
MONARC Phase 3
Involving CMS, ATLAS, LHCb, ALICE
TIMELY and USEFUL IMPACT:
- Facilitate the efficient planning and design of mutually compatible site and network architectures, and services, among the experiments, the CERN Centre and the Regional Centres
- Provide modelling consultancy and service to the experiments and Centres
- Provide a core of advanced R&D activities, aimed at LHC computing system optimisation and production prototyping
- Take advantage of work on distributed data-intensive computing for HENP this year in other “next generation” projects [*]
[*] For example PPDG
30
Phase 3 System Design Elements
MONARC Phase 3 technical goal: system optimisation; maximise throughput and/or reduce long turnaround
Phase 3 system design elements:
- RESILIENCE, resulting from flexible management of each data transaction, especially over WANs
- SYSTEM STATE & PERFORMANCE TRACKING, to match and co-schedule requests and resources, detect or predict faults
- FAULT TOLERANCE, resulting from robust fall-back strategies to recover from bottlenecks, or abnormal conditions
Base developments on large-scale testbed prototypes at every stage: for example ORCA4
[*] See H. Newman,
31
MONARC Status
MONARC is well on its way to specifying baseline Models representing cost-effective solutions to LHC Computing
Discussions have shown that LHC computing has a new scale and level of complexity
A Regional Centre hierarchy of networked centres appears to be the most promising solution
A powerful simulation system has been developed, and is a very useful toolset for further Model studies
Synergy with other advanced R&D projects has been identified
Important information, and example Models, have been provided: timely for the Hoffmann Review and discussions of LHC Computing over the next months
MONARC Phase 3 has been proposed:
- Based on prototypes, with increasing detail and realism
- Coupled to Mock Data Challenges in 2000
32
The Particle Physics Data Grid (PPDG)
DoE/NGI Next Generation Internet project: ANL, BNL, Caltech, FNAL, JLAB, LBNL, SDSC, SLAC, U. Wisc/CS
Site-to-Site Data Replication Service (100 MBytes/sec): primary site (data acquisition, CPU, disk, tape robot) to secondary site (CPU, disk, tape robot); coordinated reservation/allocation techniques; integrated instrumentation; DiffServ
Multi-Site Cached File Access Service: university (CPU, disk, users), primary site (DAQ, tape, CPU, disk, robot), satellite sites (tape, CPU, disk, robot)
First-year goal: optimized cached read access to 1-10 GBytes, drawn from a total data set of up to one Petabyte
33
PPDG: Architecture for Reliable High Speed Data Delivery
(Layered architecture diagram:) Resource management; object-based and file-based application services; file replication index; matchmaking service; file access service; cost estimation; cache manager; file fetching service; mass storage manager; file movers; end-to-end network services; site boundary; security domain
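One way to picture how these services could fit together is the interface sketch below (plain Java; these are not the PPDG APIs, and every name and signature is invented for illustration): a file access service consults the replication index, ranks replicas by cost estimate, and delegates the transfer to a file mover, with the cache manager short-circuiting repeat requests.

```java
import java.util.List;

// Illustrative interface sketch of the layered services named on the PPDG
// architecture slide. These are NOT the real PPDG APIs; names and signatures
// are assumptions, shown only to make the layering concrete.
interface FileReplicationIndex {
    List<String> locateReplicas(String logicalFileName);         // logical -> physical locations
}

interface CostEstimator {
    double estimateSeconds(String physicalLocation, long bytes); // predicted delivery time
}

interface CacheManager {
    boolean isCached(String logicalFileName);
    void pin(String logicalFileName);                            // keep file resident during use
}

interface FileMover {
    void fetch(String physicalLocation, String localPath);       // end-to-end network transfer
}

// A file access service that ties the layers together: serve from cache if
// possible, otherwise pick the cheapest replica and fetch it.
class FileAccessService {
    private final FileReplicationIndex index;
    private final CostEstimator cost;
    private final CacheManager cache;
    private final FileMover mover;

    FileAccessService(FileReplicationIndex i, CostEstimator c, CacheManager cm, FileMover m) {
        this.index = i; this.cost = c; this.cache = cm; this.mover = m;
    }

    String open(String lfn, long expectedBytes) {
        if (cache.isCached(lfn)) { cache.pin(lfn); return "/cache/" + lfn; }
        String best = null;
        double bestCost = Double.MAX_VALUE;
        for (String replica : index.locateReplicas(lfn)) {        // matchmaking by cost estimate
            double c = cost.estimateSeconds(replica, expectedBytes);
            if (c < bestCost) { bestCost = c; best = replica; }
        }
        if (best == null) throw new IllegalStateException("no replica found for " + lfn);
        String localPath = "/cache/" + lfn;
        mover.fetch(best, localPath);                             // delegate to the file mover
        cache.pin(lfn);
        return localPath;
    }
}
```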
34
Distributed Data Delivery and LHC Software Architecture
Software architectural choices:
- Traditional, single-threaded applications: allow for data arrival and reassembly
OR
- Performance-oriented (complex): I/O requests up-front; multi-threaded; data-driven; respond to an ensemble of (changing) cost estimates; possible code movement as well as data movement; loosely coupled, dynamic
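A minimal sketch of the performance-oriented style, assuming a simple thread-pool design (not any specific LHC framework): all I/O requests are issued up front, and the analysis consumes blocks in whichever order they arrive.

```java
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch of the "performance-oriented" style described above: all I/O requests
// are issued up front on a thread pool, and the analysis is data-driven --
// blocks are processed in whatever order they arrive. fetchBlock() is a
// placeholder for a (possibly remote) object or file read.
public class DataDrivenAnalysis {
    static class Block {
        final String id; final byte[] data;
        Block(String id, byte[] data) { this.id = id; this.data = data; }
    }

    static byte[] fetchBlock(String blockId) throws InterruptedException {
        Thread.sleep((long) (Math.random() * 100));   // stand-in for network/disk latency
        return new byte[1024];
    }

    static void process(Block b) {
        System.out.println("processed " + b.id + " (" + b.data.length + " bytes)");
    }

    public static void main(String[] args) throws Exception {
        List<String> blockIds = List.of("block-001", "block-002", "block-003", "block-004");
        ExecutorService pool = Executors.newFixedThreadPool(4);
        CompletionService<Block> completed = new ExecutorCompletionService<>(pool);

        // Issue every I/O request immediately, instead of reading sequentially.
        for (String id : blockIds) {
            completed.submit(() -> new Block(id, fetchBlock(id)));
        }
        // Data-driven: consume blocks in completion order, not request order.
        for (int i = 0; i < blockIds.size(); i++) {
            process(completed.take().get());
        }
        pool.shutdown();
    }
}
```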
35
ALDAP (NSF/KDI) Project
ALDAP: Accessing Large Data Archives in Astronomy and Particle Physics; NSF Knowledge Discovery Initiative (KDI); Caltech, Johns Hopkins, FNAL (SDSS)
- Explore advanced adaptive database structures and physical data storage hierarchies for archival storage of next-generation astronomy and particle physics data
- Develop spatial indexes, novel data organizations, distribution and delivery strategies, for efficient and transparent access to data across networks
- Example: (Kohonen) maps for data “self-organization”
- Create prototype network-distributed data query execution systems using autonomous agent workers
- Explore commonalities and find effective common solutions for particle physics and astrophysics data
[C226]
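As an illustration of the “Kohonen map” idea mentioned above, here is a minimal one-dimensional self-organizing map in plain Java (the toy data and all parameters are assumptions, not ALDAP code): the map nodes gradually arrange themselves along the structure of the input data.

```java
import java.util.Random;

// Minimal sketch of a 1-D Kohonen self-organizing map of the kind mentioned on
// the slide as an example of data "self-organization". Parameters (map size,
// learning rate, neighbourhood width) and the 2-D toy data are illustrative.
public class KohonenSketch {
    public static void main(String[] args) {
        Random rng = new Random(42);
        int mapSize = 10, dim = 2, epochs = 2000;
        double[][] weights = new double[mapSize][dim];
        for (double[] w : weights)
            for (int d = 0; d < dim; d++) w[d] = rng.nextDouble();

        for (int t = 0; t < epochs; t++) {
            // Toy input: points on a noisy diagonal; stands in for catalogued objects.
            double x = rng.nextDouble();
            double[] input = { x, x + 0.05 * rng.nextGaussian() };

            // Find the best-matching unit (closest weight vector).
            int bmu = 0; double best = Double.MAX_VALUE;
            for (int i = 0; i < mapSize; i++) {
                double d2 = 0;
                for (int d = 0; d < dim; d++)
                    d2 += (input[d] - weights[i][d]) * (input[d] - weights[i][d]);
                if (d2 < best) { best = d2; bmu = i; }
            }

            // Decaying learning rate and neighbourhood width.
            double frac = (double) t / epochs;
            double lr = 0.5 * (1.0 - frac);
            double sigma = 1.0 + (mapSize / 2.0) * (1.0 - frac);

            // Pull the BMU and its map neighbours towards the input.
            for (int i = 0; i < mapSize; i++) {
                double h = Math.exp(-((i - bmu) * (i - bmu)) / (2 * sigma * sigma));
                for (int d = 0; d < dim; d++)
                    weights[i][d] += lr * h * (input[d] - weights[i][d]);
            }
        }

        for (int i = 0; i < mapSize; i++)
            System.out.printf("node %d -> (%.2f, %.2f)%n", i, weights[i][0], weights[i][1]);
    }
}
```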
36
Beyond Traditional Architectures: Mobile Agents (Java Aglets)
“Agents are objects with rules and legs” -- D. Taylor
(Diagram: agent, agent application, agent service)
Mobile agents are reactive, autonomous, goal-driven, adaptive:
- Execute asynchronously
- Reduce network load: local conversations
- Overcome network latency, and some outages
- Adaptive: robust, fault tolerant
- Naturally heterogeneous
- Extensible concept: agent hierarchies
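The plain-Java sketch below illustrates the “local conversations” point: the selection rule travels (conceptually) to the data site and only a small summary crosses the network. It is not the Aglets API; every class and method here is invented for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

// Plain-Java illustration of the mobile-agent idea on the slide: the query
// (the "rules") travels to where the data lives, converses with it locally,
// and only a small summary crosses the network. This is NOT the Aglets API;
// all classes and methods here are invented for illustration.
public class AgentSketch {
    // A remote data site holding event tags; in a real system the agent's
    // code would be shipped to this site and run there.
    static class DataSite {
        private final List<Double> eventTags;
        DataSite(List<Double> eventTags) { this.eventTags = eventTags; }
        List<Double> localScan() { return eventTags; }
    }

    // The agent carries its selection rule ("legs" = it can be dispatched to a site).
    static class SelectionAgent {
        private final double threshold;
        SelectionAgent(double threshold) { this.threshold = threshold; }

        // Runs at the data site: only the selected count and mean come back.
        String executeAt(DataSite site) {
            List<Double> selected = site.localScan().stream()
                    .filter(v -> v > threshold)
                    .collect(Collectors.toList());
            double mean = selected.stream().mapToDouble(Double::doubleValue).average().orElse(0.0);
            return String.format("selected=%d mean=%.2f", selected.size(), mean);
        }
    }

    public static void main(String[] args) {
        DataSite regionalCentre = new DataSite(List.of(0.3, 1.7, 2.4, 0.9, 3.1));
        SelectionAgent agent = new SelectionAgent(1.0);
        // Dispatch the agent; only the summary string travels back.
        System.out.println("agent report: " + agent.executeAt(regionalCentre));
    }
}
```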
37
[D9]
38
Grid Services Architecture [*]: Putting it all Together
Applications: HEP data-analysis related applications
Application toolkits: remote data toolkit; remote computation toolkit; remote visualization toolkit; remote collaboration toolkit; remote sensors toolkit; ...
Grid services: protocols, authentication, policy, resource management, instrumentation, data discovery, etc.
Grid fabric: archives, networks, computers, display devices, etc.; associated local services
[*] Adapted from Ian Foster
[E403]
39
Grid Hierarchy Goals: Better Resource Use and Faster Turnaround
Efficient resource use and improved responsiveness through:
- Treatment of the ensemble of site and network resources as an integrated (loosely coupled) system
- Resource discovery, query estimation (redirection), co-scheduling, prioritization, local and global allocations
- Network and site “instrumentation”: performance tracking, monitoring, forward-prediction, problem trapping and handling
- Exploiting superior (national, land-based) network infrastructures per unit cost for frequently accessed data: transoceanic links are relatively expensive; shorter links normally give higher throughput
- Ease of development, operation, management and security, through the use of layered, (de facto) standard services
[E163, E345]
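A small sketch of the “instrumentation + query estimation (redirection)” idea (illustrative Java, not an existing Grid service): per-site throughput is tracked with an exponentially weighted moving average, and a request is redirected to the site with the smallest predicted turnaround.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of "instrumentation + query estimation (redirection)": each site's
// recent throughput is tracked with an exponentially weighted moving average,
// and a request is redirected to the site with the smallest predicted
// turnaround. Site names and numbers are illustrative assumptions.
public class QueryRedirector {
    private final Map<String, Double> ewmaMBps = new HashMap<>();  // tracked state
    private final double alpha = 0.3;                              // smoothing factor

    // Called by the monitoring layer after each completed transfer.
    void recordTransfer(String site, double megabytes, double seconds) {
        double observed = megabytes / seconds;
        ewmaMBps.merge(site, observed, (old, obs) -> alpha * obs + (1 - alpha) * old);
    }

    // Forward-prediction: estimated turnaround for a request of a given size.
    double predictSeconds(String site, double megabytes) {
        double rate = ewmaMBps.getOrDefault(site, 1.0);            // pessimistic default
        return megabytes / rate;
    }

    // Redirection: choose the replica site with the best predicted turnaround.
    String chooseSite(Iterable<String> candidateSites, double megabytes) {
        String best = null; double bestTime = Double.MAX_VALUE;
        for (String site : candidateSites) {
            double t = predictSeconds(site, megabytes);
            if (t < bestTime) { bestTime = t; best = site; }
        }
        return best;
    }

    public static void main(String[] args) {
        QueryRedirector r = new QueryRedirector();
        r.recordTransfer("Tier1-FNAL", 500, 60);     // ~8.3 MB/s observed
        r.recordTransfer("Tier1-CERN", 500, 200);    // ~2.5 MB/s observed
        String site = r.chooseSite(java.util.List.of("Tier1-FNAL", "Tier1-CERN"), 2000);
        System.out.println("redirect 2 GB request to " + site);
    }
}
```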
40
Grid Hierarchy Concept: Broader Advantages
Greater flexibility to pursue different physics interests, priorities, and resource allocation strategies by region
Lower tiers of the hierarchy mean more local control
Partitioning of users into “proximate” communities for support, troubleshooting, mentoring
Partitioning of facility tasks, to manage and focus resources
“Grid” integration and common services are a principal means for effective worldwide resource coordination
An opportunity to maximize global funding resources and their effectiveness, while meeting the needs for analysis and physics
41
Grid Development Issues
Integration of applications with Grid middleware:
- Performance-oriented user application software architecture needed, to deal with the realities of data access and delivery
- Application frameworks must work with system state and policy information (“instructions”) from the Grid
- ODBMSs must be extended to work across networks: “invisible” (to the DBMS) data transport, and catalog update
Interfacility cooperation at a new level, across world regions:
- Agreement on the use of standard Grid components, services, security and authentication
- Match with heterogeneous resources, performance levels, and local operational requirements
- Consistent policies on use of local resources by remote communities
- Accounting and “exchange of value” software
43
Content Delivery Networks: a Web-enabled Pre- “Data Grid”
Worldwide integrated distributed systems for dynamic content delivery, circa 2000: Akamai, Adero, Sandpiper server networks
- 1200 to thousands of network-resident servers
- 25-60 ISP networks
- 25-30 countries
- 40+ corporate customers
- ~$25 B capitalization
Techniques:
- Resource discovery; build a “weathermap” of the server network (state tracking)
- Query estimation; matchmaking/optimization; request rerouting
- Virtual IP addressing
- Mirroring, caching (on the ~1200 servers)
- Autonomous-agent implementation
44
The Need for a “Grid”: the Basics
Computing for the LHC will never be “enough” to fully exploit the physics potential, or exhaust the scientific potential of the collaborations
The basic Grid elements are required to make the ensemble of computers, networks and storage management systems function as a self-consistent system, implementing consistent (and complex) resource usage policies
A basic “Grid” will be an information-gathering, workflow-guiding, monitoring and repair-initiating entity, designed to ward off resource wastage (or meltdown) in a complex, distributed and somewhat “open” system
Without such information, experience shows that effective global use of such a large, complex and diverse ensemble of resources is likely to fail, or at the very least be sub-optimal
The time to accept the charge to build a Grid, for sober and compelling reasons, is now
- Grid-like systems are starting to appear in industry and commerce
- But Data Grids on the LHC scale will not be in production until significantly after 2005
45
Summary: The HENP/LHC Data Analysis Problem
Petabyte-scale compact binary data, and computing resources, distributed worldwide
Development of an integrated, robust, networked data access, processing and analysis system is mission-critical:
- An aggressive R&D program is required to develop reliable, seamless systems that work across an ensemble of networks
An effective inter-field partnership is now developing through many R&D projects (PPDG, GriPhyN, ALDAP, ...)
HENP analysis is now one of the driving forces for the development of “Data Grids”
Solutions to this problem could be widely applicable in other scientific fields and industry, by LHC startup: national and multi-national “Enterprise Resource Planning”