
1 HENP Data Grids and STARTAP: Worldwide Analysis at Regional Centers
Harvey B. Newman (Caltech), HPIIS Review, San Diego, October 25, 2000
http://l3www.cern.ch/~newman/hpiis2000.ppt

2 Next Generation Experiments: Physics and Technical Goals
- The extraction of small or subtle new discovery signals from large and potentially overwhelming backgrounds; or precision analysis of large samples
- Providing rapid access to event samples and subsets from massive data stores: from ~300 Terabytes in 2001, to Petabytes by ~2003, ~10 Petabytes by 2006, and ~100 Petabytes by ~2010
- Providing analyzed results with rapid turnaround, by coordinating and managing the LIMITED computing, data handling and network resources effectively
- Enabling rapid access to the data and the collaboration, across an ensemble of networks of varying capability, using heterogeneous resources

3 The Large Hadron Collider (2005-)
- A next-generation particle collider: the largest superconducting installation in the world
- A bunch-bunch collision will take place every 25 nanoseconds, each generating ~20 interactions
  - But only one in a trillion may lead to a major physics discovery
- Real-time data filtering: Petabytes per second to Gigabytes per second
- Accumulated data of many Petabytes/year
- Large data samples explored and analyzed by thousands of geographically dispersed scientists, in hundreds of teams
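As a rough illustration (my own arithmetic, not on the slide), the 25 ns bunch spacing and ~20 interactions per crossing quoted above imply the following raw rates:

```python
# Back-of-the-envelope rates implied by the slide's numbers
# (illustrative only; the real LHC duty cycle and occupancy are more involved).
bunch_spacing_s = 25e-9                        # one bunch-bunch collision every 25 ns
crossings_per_s = 1 / bunch_spacing_s          # 40 MHz crossing rate
interactions_per_crossing = 20
interactions_per_s = crossings_per_s * interactions_per_crossing

print(f"{crossings_per_s:.0e} crossings/s")        # ~4e7 (40 MHz)
print(f"{interactions_per_s:.0e} interactions/s")  # ~8e8, i.e. of order 10^9 events/sec
```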

4 Computing Challenges: LHC Example
- Geographical dispersion: of people and resources
- Complexity: the detector and the LHC environment
- Scale: tens of Petabytes per year of data
(1800 physicists, 150 institutes, 34 countries)
Major challenges associated with:
- Communication and collaboration at a distance
- Network-distributed computing and data resources
- Remote software development and physics analysis
- R&D: new forms of distributed systems: Data Grids

5 Four LHC Experiments: The Petabyte-to-Exabyte Challenge
ATLAS, CMS, ALICE, LHCb: Higgs + new particles; quark-gluon plasma; CP violation
- Data written to tape: ~25 Petabytes/year and up; 0.25 Petaflops and up
- Total for the LHC experiments: 0.1 to 1 Exabyte (1 EB = 10^18 Bytes) by ~2010 (~2015?)

6 From Physics to Raw Data (LEP)
[Diagram: an e+ e- → Z0 → f f̄ event traced from the underlying physics to the recorded raw data]
- Basic physics → fragmentation, decay
- Interaction with detector material: multiple scattering, interactions
- Detector response: noise, pile-up, cross-talk, inefficiency, ambiguity, resolution, response function, alignment, temperature
- Raw data (bytes): read-out addresses, ADC and TDC values, bit patterns

7 The Compact Muon Solenoid (CMS)
- Trackers: silicon microstrips (230 m²); pixels (80M channels)
- Calorimeters: ECAL scintillating PbWO4 crystals; HCAL plastic scintillator/copper sandwich
- Muon barrel: drift tube chambers (DT), resistive plate chambers (RPC)
- Muon endcaps: cathode strip chambers (CSC), resistive plate chambers (RPC)
- Superconducting coil, iron yoke
Total weight: 12,500 t; overall diameter: 15 m; overall length: 21.6 m; magnetic field: 4 Tesla

8 From Raw Data to Physics (LEP)
[Diagram: the reverse chain, from the recorded raw data of an e+ e- → Z0 → f f̄ event back to the basic physics results, with simulation (Monte Carlo) alongside]
- Raw data: apply calibration, alignment (detector response)
- Pattern recognition, particle identification (interaction with detector material); convert to physics quantities: reconstruction
- Physics analysis of fragmentation and decay → basic physics results
- Analysis proceeds together with simulation (Monte Carlo)

9 Real-Time Filtering and Data Acquisition*
- Data fragments from on-detector digitizers pass through a switch to a computer farm over a high-speed network
- Input: 1-100 GB/s; filtering: 35K SI95
- Recording: 100-1000 MB/s to tape and disk servers
- Raw data: over 1 Petabyte/year; summary data: 1-200 TB/year
* Figures are for one experiment
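A quick sanity check (my own arithmetic, not from the slide), assuming an effective ~10^7 seconds of data taking per year, shows the recording rate and the yearly raw-data volume are consistent:

```python
# Illustrative check that the quoted recording rate matches the yearly volume.
# The ~1e7 s/year of effective data taking is an assumption, not from the slide.
recording_rate_MB_s = 100                 # lower end of the quoted 100-1000 MB/s
seconds_per_year = 1e7                    # typical "accelerator year" assumption
raw_volume_PB = recording_rate_MB_s * seconds_per_year / 1e9   # MB -> PB

print(f"~{raw_volume_PB:.1f} PB/year at 100 MB/s")  # ~1 PB/year, matching the slide
```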

10 Higgs Search LEPC September 2000

11 LHC: Higgs Decay into 4 Muons (tracker only); 1000X the LEP Data Rate
10^9 events/sec; selectivity: 1 in 10^13 (one person in a thousand world populations)

12 On-Line Filter System
- Large variety of triggers and thresholds: select physics à la carte
- Multi-level trigger: filter out less interesting events, keep highly selected events
- Online reduction: 10^7
- Result: Petabytes of binary compact data per year
Trigger chain:
- 40 MHz (equivalent to 1000 TB/sec) → Level 1: special hardware
- 75 kHz (75 GB/sec, fully digitised) → Level 2: processors
- 5 kHz (5 GB/sec) → Level 3: farm of commodity CPUs
- 100 Hz (100 MB/sec) → data recording & offline analysis
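The rejection factors implied by the rates above can be checked with simple arithmetic (mine, not from the slide); the quoted online reduction of 10^7 is recovered by comparing the recorded 100 Hz with the ~10^9 events/sec interaction rate quoted on the previous slide:

```python
# Illustrative arithmetic: per-level and overall rate reductions implied by the
# trigger-chain numbers quoted above and the 10^9 events/sec of slide 11.
interaction_rate_hz = 1e9                       # from slide 11
level_rates_hz = [40e6, 75e3, 5e3, 100]         # crossings -> L1 -> L2 -> L3 output

for before, after in zip(level_rates_hz, level_rates_hz[1:]):
    print(f"{before:.0e} Hz -> {after:.0e} Hz  (x{before / after:.0f} reduction)")

print(f"online reduction from the interaction rate: "
      f"x{interaction_rate_hz / level_rates_hz[-1]:.0e}")   # ~1e7, as on the slide
```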

13 LHC Vision: Data Grid Hierarchy
[Diagram: a five-level hierarchy of computing centres and the network links between them, per experiment]
- Online system → Tier 0 (+1): offline farm, CERN Computer Centre (> 20 TIPS), at ~100 MBytes/sec; the detector delivers ~PByte/sec upstream
- Tier 0 → Tier 1 national centres (France, FNAL, Italy, UK): ~2.5 Gbits/sec
- Tier 1 → Tier 2 regional centres: ~0.6-2.5 Gbits/sec
- Tier 2 → Tier 3 institutes (~0.25 TIPS): ~622 Mbits/sec
- Tier 3 → Tier 4 workstations: 100-1000 Mbits/sec
- Physicists work on analysis channels; each institute has ~10 physicists working on one or more channels, with a local physics data cache
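To give a feel for what the nominal link speeds in this hierarchy mean in practice, here is a small illustration (my own arithmetic; the bandwidth values are taken from the slide, the dataset sizes and the neglect of protocol overheads are assumptions):

```python
# Illustrative transfer times across the nominal tier-link bandwidths quoted on
# the slide, ignoring protocol overheads and competing traffic.
LINK_MBPS = {"CERN -> Tier 1": 2500, "Tier 2 -> institute": 622}  # from the slide

def transfer_hours(dataset_tb: float, link: str) -> float:
    """Hours to move dataset_tb terabytes over the named link at full rate."""
    bits = dataset_tb * 1e12 * 8
    return bits / (LINK_MBPS[link] * 1e6) / 3600

print(f"{transfer_hours(1, 'CERN -> Tier 1'):.1f} h for 1 TB at 2.5 Gbps")      # ~0.9 h
print(f"{transfer_hours(1, 'Tier 2 -> institute'):.1f} h for 1 TB at 622 Mbps")  # ~3.6 h
```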

14 Why Worldwide Computing? Regional Center Concept Advantages
- Managed, fair-shared access for physicists everywhere
- Maximize total funding resources while meeting the total computing and data handling needs
- Balance between proximity of datasets to appropriate resources, and to the users
  - Tier-N model
- Efficient use of the network: higher throughput
  - Per flow: local > regional > national > international
- Utilizing all intellectual resources, in several time zones
  - CERN, national labs, universities, remote sites
  - Involving physicists and students at their home institutions
- Greater flexibility to pursue different physics interests, priorities, and resource allocation strategies by region
  - And/or by common interests (physics topics, subdetectors, …)
- Manage the system's complexity
  - Partitioning facility tasks, to manage and focus resources

15 Grid Services Architecture [*]
- Applications: a rich set of HEP data-analysis related applications
- Application toolkits: remote visualization toolkit, remote computation toolkit, remote data toolkit, remote sensors toolkit, remote collaboration toolkit
- Grid services: protocols, authentication, policy, resource management, instrumentation, discovery, etc.
- Grid fabric: data stores, networks, computers, display devices, …; associated local services
[*] Adapted from Ian Foster

16 SDSS Data Grid (in GriPhyN): A Shared Vision
Three main functions:
- Raw data processing on a Grid (FNAL)
  - Rapid turnaround with TBs of data
  - Accessible storage of all image data
- Fast science analysis environment (JHU)
  - Combined data access + analysis of calibrated data
  - Distributed I/O layer and processing layer, shared by the whole collaboration
- Public data access
  - SDSS data browsing for astronomers and students
  - Complex query engine for the public

17 LIGO Data Grid Vision
Principal areas of GriPhyN applicability:
- Main data processing (Caltech/CACR)
  - Enable computationally limited searches for periodic sources
  - Access to the LIGO deep archive
  - Access to the observatories
- Science analysis environment for the LSC (LIGO Scientific Collaboration)
  - Tier 2 centers: shared LSC resource
  - Exploratory algorithm and astrophysics research with LIGO reduced data sets
  - Distributed I/O layer and processing layer builds on existing APIs
  - Data mining of LIGO (event) metadatabases
  - LIGO data browsing for LSC members, outreach
[Diagram: Hanford and Livingston observatories connected to Caltech (Tier 1) and MIT via Internet2/Abilene, with LSC Tier 2 sites, over OC3-OC48 links]

18 Computer Farm at CERN (2005)
- 0.5 M SPECint95; > 5K processors
- 0.6 PByte disk; > 5K disks
- Plus 2X more capacity outside CERN
- Thousands of CPU boxes, thousands of disks, hundreds of tape drives
- LAN-WAN routers; farm network; storage network; real-time detector data
[Diagram: farm, storage and LAN-WAN networks, with data rates (in Gbps) on the interconnecting links]

19 Tier 1 Regional Center Architecture (I. Gaines, FNAL)
- Network from CERN and from Tier 2 centers; tape mass storage & disk servers; database servers
- Production reconstruction (Raw/Sim → ESD): scheduled, predictable; experiment/physics groups
- Production analysis (ESD → AOD, AOD → DPD): scheduled; physics groups
- Individual analysis (AOD → DPD and plots): chaotic; physicists on desktops
- Support services: info servers, code servers, web servers, telepresence servers; training, consulting, help desk
- Physics software development; R&D systems and testbeds
- Data exchanged with Tier 2 centers, local institutes, and CERN (tapes)

20 Roles of Projects for HENP Distributed Analysis
- RD45, GIOD: networked object databases
- Clipper/GC: high-speed access to objects or file data; FNAL/SAM for processing and analysis
- SLAC/OOFS: distributed file system + Objectivity interface
- NILE, Condor: fault-tolerant distributed computing
- MONARC: LHC computing models: architecture, simulation, strategy, politics
- ALDAP: OO database structures & access methods for astrophysics and HENP data
- PPDG: first distributed data services and Data Grid system prototype
- GriPhyN: production-scale Data Grids
- EU DataGrid

21 CMS Analysis and Persistent Object Store
[Diagram: online L1-L4 filtering with common filters & pre-emptive object creation, on-demand object creation, CMS slow control and detector monitoring, all feeding a persistent object store used for simulation, calibrations, group analyses and user analysis]
Data organized in a(n object) hierarchy:
- Raw, Reconstructed (ESD), Analysis Objects (AOD), Tags
Data distribution:
- All raw, reconstructed and master parameter DBs at CERN
- All event tags and AODs at all regional centers
- HOT data moved automatically to RCs
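A minimal sketch of the distribution rule stated above (the data tiers and placement rules come from the slide; the function and its names are purely illustrative, not CMS software):

```python
# Illustrative sketch of the CMS data-distribution policy described above.
DATA_TIERS = ["RAW", "ESD", "AOD", "TAG"]   # Raw, Reconstructed, Analysis Objects, Tags

def placement(data_tier: str, is_hot: bool = False) -> list[str]:
    """Return where a dataset of the given tier is kept under the stated policy."""
    sites = ["CERN"]                           # raw, reconstructed, master DBs at CERN
    if data_tier in ("AOD", "TAG") or is_hot:  # tags/AODs everywhere; HOT data replicated
        sites.append("all regional centres")
    return sites

print(placement("ESD"))               # ['CERN']
print(placement("ESD", is_hot=True))  # ['CERN', 'all regional centres']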

22 GIOD: Globally Interconnected Object Databases
- Multi-TB OO database federation (hit, track and detector objects), used across LANs and WANs
- 170 MByte/sec CMS milestone
- Developed Java 3D OO reconstruction, analysis and visualization prototypes that work seamlessly over worldwide networks
- Deployed facilities and database federations as testbeds for computing model studies

23 The Particle Physics Data Grid (PPDG)
ANL, BNL, Caltech, FNAL, JLAB, LBNL, SDSC, SLAC, U.Wisc/CS
- First-round goal: optimized cached read access to 10-100 GBytes drawn from a total data set of 0.1 to ~1 Petabyte
- Site-to-site data replication service at 100 MBytes/sec, between a primary site (data acquisition, CPU, disk, tape robot) and a secondary site (CPU, disk, tape robot)
- Multi-site cached file access service: primary site (DAQ, tape, CPU, disk, robot), satellite sites (tape, CPU, disk, robot), and university users (CPU, disk)
- Matchmaking, co-scheduling: SRB, Condor, Globus services; HRM, NWS

24 PPDG WG1: Request Manager
[Diagram: the client sends a logical request for a logical set of files to the Request Manager; the Request Interpreter consults an event-file index and a replica catalog; the Request Planner (matchmaking) uses the Network Weather Service; the Request Executor issues physical file transfer requests over the Grid to Disk Resource Managers (DRMs) with disk caches, and to a Hierarchical Resource Manager (HRM) in front of the tape system]
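A schematic sketch of this request flow (component names follow the diagram; every interface and the control logic are assumptions for illustration, not PPDG's actual code):

```python
# Schematic sketch of the Request Manager flow shown above. The replica_catalog,
# nws and executor parameters stand in for the diagram's components; their
# methods are invented here purely for illustration.
def handle_logical_request(logical_files, replica_catalog, nws, executor):
    plan = []
    for lfn in logical_files:                     # Request Interpreter
        replicas = replica_catalog.lookup(lfn)    # logical file -> physical replicas
        # Request Planner (matchmaking): choose the replica site with the best
        # predicted network performance, as reported by the Network Weather Service.
        best_site = max(replicas, key=lambda site: nws.predicted_bandwidth(site))
        plan.append((lfn, best_site))
    for lfn, site in plan:                        # Request Executor
        executor.transfer(source=site, lfn=lfn, dest="local disk cache (DRM)")
    return plan
```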

25 Earth Grid System Prototype: Inter-Communication Diagram
[Diagram: a disk client and Request Manager at LLNL coordinating, via LDAP and CORBA, a replica catalog at ANL, a GIS with the Network Weather Service, an HRM in front of HPSS (disk on Clipper) at LBNL, and GSI-authenticated FTP services (GSI-wuftpd, GSI-pftpd, GSI-ncftp) with disk caches at ISI, SDSC, LBNL, ANL and NCAR]

26 Grid Data Management Prototype (GDMP)
Distributed job execution and data handling:
- Transparency, performance, security, fault tolerance, automation
Operation (see the sketch below):
- Jobs are submitted from one site and executed locally or remotely (e.g. from site A to sites B and C)
- Data is always written locally by the job
- Data is replicated to remote sites
GDMP V1.1: Caltech + EU DataGrid WP2
Tests by Caltech, CERN, FNAL and Pisa for CMS HLT production 10/2000; integration with ENSTORE, HPSS, Castor
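A minimal sketch of the "write locally, then replicate" pattern described above (illustrative only; this is not GDMP's actual API, and all parameter names are assumptions):

```python
# Illustrative sketch of GDMP-style job execution and data handling: the job
# writes its output at the site where it runs, and the output is then
# replicated to the remote sites and registered in their replica catalogs.
def run_job_with_replication(job, local_site, remote_sites, transfer):
    output = job.run()                      # job executes at local_site ...
    path = local_site.write(output)         # ... and always writes its data locally
    for site in remote_sites:               # replication to remote sites
        transfer(source=local_site, dest=site, path=path)
        site.replica_catalog.register(path, site)
    return path
```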

27 GriPhyN: Grid Physics Network — Data-Intensive Science
- A new form of integrated distributed system, meeting the scientific goals of LIGO, SDSS and the LHC experiments
- Focus on Tier 2 centers at universities, in a unified hierarchical Grid of five levels
- 18 centers, with four sub-implementations: 5 each in the US for LIGO, CMS and ATLAS; 3 for SDSS
- Near-term focus on LIGO and SDSS handling of real data; LHC data challenges with simulated data
- Cooperation with PPDG, MONARC and the EU DataGrid
http://www.phys.ufl.edu/~avery/GriPhyN/

28 GriPhyN: PetaScale Virtual Data Grids
[Diagram: production teams, individual investigators and workgroups use interactive user tools; virtual data tools, request planning & scheduling tools, and request execution & management tools sit on top of resource management services, security and policy services, and other Grid services; transforms act on raw data sources and distributed resources (code, storage, computers, and networks)]

29 EU DataGrid (http://www.cern.ch/grid)
- Organized by CERN
- HEP participants: Czech Republic, France, Germany, Hungary, Italy, Netherlands, Portugal, UK; (US)
- Industrial participation
- Grid Forum context
- 12 work packages (one coordinator each)
  - Middleware: work scheduling; data management; application monitoring; fabric management; storage management
  - Infrastructure: testbeds and demonstrators; advanced network services
  - Applications: HEP, Earth observation, biology
- Basic middleware framework: Globus

30 EU DataGrid Project Work Packages

31 Emerging Data Grid User Communities
- NSF Network for Earthquake Engineering Simulation (NEES): integrated instrumentation, collaboration, simulation
- Grid Physics Network (GriPhyN): ATLAS, CMS, LIGO, SDSS; world-wide distributed analysis of Petascale data
- Access Grid; VRVS: supporting group-based collaboration
And:
- Genomics, proteomics, …
- The Earth System Grid and EOSDIS
- Federating brain data
- Computed microtomography, …
- NVO, GVO

32 Grids in 2000: Summary
- Grids are changing the way we do science and engineering: from computation to data
- Key services and concepts have been identified, and development has started
- Major IT challenges remain: an opportunity and obligation for HEP/CS collaboration
- Transition of services and applications to production use is starting to occur
- In future, more sophisticated integrated services and toolsets (Inter- and IntraGrids+) could drive advances in many fields of science & engineering
- HENP, facing the need for Petascale virtual data, is both an early adopter and a leading developer of Data Grid technology

33 Bandwidth Requirements Projection (Mbps): ICFA-NTF

34 US-CERN BW Requirements Projection (PRELIMINARY)
[#] Includes ~1.5 Gbps each for ATLAS and CMS, plus BaBar, Run2 and other
[*] D0 and CDF at Run2: needs presumed to be comparable to BaBar

35 Daily, Weekly, Monthly and Yearly Statistics on the 45 Mbps US-CERN Link

36 HEP Network Requirements and STARTAP
Beyond the requirement of adequate bandwidth, physicists in HENP's major experiments depend on:
- Network and user software that work together to provide high throughput and to manage the bandwidth effectively
- A suite of videoconference and high-level tools for remote collaboration that make data analysis from the US (and from other world regions) effective
- An integrated set of local, regional, national and international networks that interoperate seamlessly, without bottlenecks

37 Configuration at Chicago with KPN/Qwest

38 HEP Network Requirements and STARTAP
- The STARTAP, a professionally managed international peering point with an open HP policy, has been and will continue to be vital for US involvement in the LHC, and thus for the progress of the LHC physics program.
- Our development of worldwide Data Grid systems, in collaboration with the European Union and other world regions, will depend on the STARTAP for joint prototyping, tests and developments using next-generation network, software and database technology.
- A scalable and cost-effective growth path for the STARTAP will be needed, as a central component of international networks for HENP and other fields.
  - An optical STARTAP handling OC-48 and OC-192 links, with favorable peering and transit arrangements across the US, would be well matched to our future plans.

39 Abilene map http://www.abilene.iu.edu/images/logical.pdf

40 US-CERN line connection to ESnet (to HENP labs) through STARTAP

41 TCP Throughput Performance: Caltech/CERN via STARTAP
[Plots: measured TCP throughput in both directions, from Caltech to CERN and from CERN to Caltech]
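For context (not on the slide): single-stream TCP throughput over a long round-trip path is commonly approximated with the Mathis et al. bound, throughput ≈ MSS / (RTT · √p). A quick illustration with assumed values for a Caltech-CERN path:

```python
# Illustrative estimate using the standard Mathis et al. TCP throughput bound:
#   throughput ~ MSS / (RTT * sqrt(loss_probability))
# The RTT, MSS and loss rate below are assumed values, not measurements from the slide.
import math

mss_bytes = 1460
rtt_s = 0.170            # ~170 ms round trip, Caltech <-> CERN (assumed)
loss = 1e-4              # 0.01% packet loss (assumed)

throughput_bps = 8 * mss_bytes / (rtt_s * math.sqrt(loss))
print(f"~{throughput_bps / 1e6:.1f} Mbps per TCP stream")   # a few Mbps per stream
```

This is one reason high-throughput transfers on such paths rely on careful TCP tuning and parallel streams rather than a single default connection.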

42 CA*net 4 Possible Architecture
[Map: a national footprint (Vancouver, Calgary, Regina, Winnipeg, Toronto, Ottawa, Montreal, Fredericton, Charlottetown, Halifax, St. John's) with links to Seattle, Chicago, New York, Los Angeles (Pasadena), Miami and Europe]
- Dedicated wavelength or SONET channel
- OBGP switches
- Optional Layer 3 aggregation service
- Large-channel WDM system

43 OBGP Traffic Engineering (Physical)
[Diagram: AS 1 through AS 5, with a Tier 1 ISP, a Tier 2 ISP and an intermediate ISP; a dual-connected router to AS 5; red marks the default wavelength; for simplicity only data forwarding paths in one direction are shown]
- The bulk of AS 1's traffic is to the Tier 1 ISP
- The router redirects networks with a heavy traffic load to the optical switch, but routing policy is still maintained by the ISP
- The optical switch looks like a BGP router, so AS 1 is effectively direct-connected to the Tier 1 ISP but still transits AS 5
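A toy sketch of the decision rule described above (purely illustrative; no real OBGP implementation or router configuration is implied, and the threshold and prefixes are invented): prefixes whose traffic exceeds a policy threshold are shifted to the optical bypass, while everything else stays on the default routed wavelength.

```python
# Toy sketch of OBGP-style traffic engineering: heavy-traffic destination
# prefixes are moved onto the optical bypass while the ISP's policy (the
# threshold and which prefixes qualify) remains in control. All names and
# numbers here are illustrative assumptions.
def assign_paths(traffic_mbps_by_prefix, bypass_threshold_mbps=500):
    paths = {}
    for prefix, load_mbps in traffic_mbps_by_prefix.items():
        if load_mbps >= bypass_threshold_mbps:
            paths[prefix] = "optical bypass to Tier 1 ISP"
        else:
            paths[prefix] = "default wavelength via AS 5"
    return paths

demo = {"10.1.0.0/16": 900, "10.2.0.0/16": 40}
print(assign_paths(demo))
```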

44 Worldwide Computing Issues
- Beyond Grid prototype components: integration of Grid prototypes for end-to-end data transport
  - Particle Physics Data Grid (PPDG) ReqM (Request Manager)
  - PPDG/EU DataGrid GDMP for CMS HLT productions
- Start building the Grid system(s): integration with experiment-specific software frameworks
- Derivation of strategies (MONARC simulation system)
  - Data caching, query estimation, co-scheduling
  - Load balancing and workload management amongst Tier 0/Tier 1/Tier 2 sites (SONN by Legrand)
  - Transaction robustness: simulate and verify
- Transparent interfaces for replica management
  - Deep versus shallow copies: thresholds; tracking, monitoring and control

45 VRVS Remote Collaboration System: Statistics
- 30 reflectors; 52 countries
- Mbone, H.323, MPEG2 streaming, VNC

46 VRVS: Mbone/H.323/QT Snapshot — Future Evolution/Integration (R&D)
- Wider deployment and support of VRVS
- High-quality video and audio (MPEG1, MPEG2, …)
- Shared virtual workspaces, applications, and environments
- Integration of the H.323 ITU standard
- Quality of Service (QoS) over the network
- Improved security, authentication and confidentiality
- Remote control of video cameras via a Java applet

47 VRVS R&D: Desktop Sharing
VNC technology integrated in the upcoming VRVS release

48 Demonstrations (HN, J. Bunn, P. Galvez): CMSOO and VRVS
CMSOO: Java 3D event display; iGrid2000, Yokohama, July 2000

49 STARTAP: Selected HENP Success Stories (1)
- Onset of large-scale optimized production file transfers, involving both HENP labs and universities
  - BaBar, CMS, ATLAS
  - Upcoming: D0 and CDF at FNAL/Run2; RHIC
- Seamless remote access to object databases
  - CMSOO demos: iGrid2000 (Yokohama)
  - Now starting on distributed CMS ORCA OO (TB to PB) DB access
- CMS User Analysis Environment (UAE)
  - Worldwide Grid-enabled view of the data, along with visualizations, data presentation and analysis
  - A user view across the Data Grid

50 STARTAP: Selected HENP Success Stories (2)
- A principal testbed to develop production Grid systems of worldwide scope
  - Grid Data Management Prototype (GDMP; US/EU)
  - GriPhyN: 18-20 university facilities serving CMS, ATLAS, LIGO and SDSS
  - Built on a strong foundation of Grid security and information infrastructure
  - Deploying a Grid Virtual Data Toolkit (VDT)
- VRVS: worldwide-extensible videoconferencing and shared virtual spaces
- Future: forward-looking view of mobile agent coordination architectures
  - Survivable loosely coupled systems with unprecedented scalability


Download ppt "HENP DATA GRIDS and STARTAP HENP DATA GRIDS and STARTAP Worldwide Analysis at Regional Centers Harvey B. Newman (Caltech) HPIIS Review San Diego, October."

Similar presentations


Ads by Google