
Scientific Computing for SLAC Science Bebo White Stanford Linear Accelerator Center October 2006.


1 Scientific Computing for SLAC Science Bebo White Stanford Linear Accelerator Center October 2006

2 Scientific Computing
The relationship between Science and the components of Scientific Computing

3 Drivers for SLAC Computing
Computing to enable today’s data-intensive science
  –Clusters, interconnects, networks, mass storage, etc.
Computing research to prepare for tomorrow’s challenges
  –Massive memory, low latency, petascale databases, detector simulation, etc.

4 SLAC Scientific Computing
BaBar Experiment (winds down 2009-2012)
  Science goals: Measure billions of collisions to understand matter-antimatter asymmetry (why matter exists today)
  Computing techniques: High-throughput data processing, trivially parallel computation, heavy use of disk and tape storage. Intercontinental distributed computing.
ATLAS Experiment and Experimental HEP
  Science goals: Analyze petabytes of data to understand the origin of mass
  Computing techniques: High-throughput data processing, trivially parallel computation, heavy use of disk and tape storage. Intercontinental distributed computing.
Accelerator Science
  Science goals: Simulate accelerator behavior before construction and during operation
  Computing techniques: Parallel computation, visual analysis of large data volumes
Particle Astrophysics (mainly simulation)
  Science goals: Star formation in the early universe, colliding black holes, …
  Computing techniques: Parallel computation (SMP and cluster), visual analysis of growing volumes of data
Particle Astrophysics Major Projects (GLAST, LSST …)
  Science goals: Analyze terabytes to petabytes of data to understand the dark matter and dark energy riddles
  Computing techniques: High-throughput data processing, very large databases, visualization
Photon Science
  Science goals: Femtosecond x-ray pulses, “ultrafast” science, structure of individual molecules …
  Computing techniques: High-throughput data analysis and large-scale simulation
New Architectures for SLAC Science
  Science goals: Radical new approaches to computing for Stanford-SLAC data-intensive science
  Computing techniques: Current focus: massive solid-state storage for high-throughput, low-latency data analysis
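
The "trivially parallel computation" listed for BaBar and ATLAS means that each collision event can be processed without reference to any other. A minimal Python sketch of that pattern follows; the event layout and the reconstruct function are hypothetical stand-ins for illustration, not BaBar or ATLAS code.

```python
from concurrent.futures import ThreadPoolExecutor


def reconstruct(event):
    """Hypothetical per-event work: total the energy deposits of one event."""
    return sum(event["deposits"])


def process_events(events, workers=4):
    # Every event is independent, so the work maps across workers with no
    # communication between tasks. Threads keep this sketch portable; a real
    # CPU-bound job would more likely use separate processes or batch jobs.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(reconstruct, events))  # map preserves input order


# Hypothetical toy events, each with three energy deposits.
events = [{"id": i, "deposits": [i, 2 * i, 3 * i]} for i in range(5)]
totals = process_events(events)
print(totals)
```

Because no task depends on another, throughput in this pattern grows roughly with the number of workers until storage or network bandwidth becomes the limit, which is why the slide pairs it with heavy disk and tape use.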

5 Data Challenge in High Energy Physics: 2006 example
SLAC Online System, Selection and Compression: ~10 TB/s
Raw data written to tape: 10 MB/s
Simulated and derived data: 20 MB/s
International network data flow to “Tier A Centers”: 50 MB/s (400 Mb/s)
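
The rates on this slide can be related with a little arithmetic. The sketch below assumes decimal units (1 TB = 10^12 bytes), reads the ~10 TB/s figure as the rate entering online selection, and assumes year-round continuous running for the yearly total; the last two are interpretations, not statements from the slide.

```python
TB = 10**12  # bytes, decimal units (assumption)
MB = 10**6

online_rate = 10 * TB    # ~10 TB/s, read here as the rate entering selection
tape_rate = 10 * MB      # raw data written to tape
network_rate = 50 * MB   # international flow to Tier A centers

# Factor by which selection and compression shrink the data stream.
reduction_factor = online_rate / tape_rate

# Bytes per second to megabits per second (8 bits per byte).
network_megabits = network_rate * 8 / 10**6

# Upper bound on raw data per year, assuming continuous running (assumption).
SECONDS_PER_YEAR = 365 * 24 * 3600
raw_tb_per_year = tape_rate * SECONDS_PER_YEAR / TB

print(f"reduction factor: {reduction_factor:.0e}")
print(f"network flow: {network_megabits:.0f} Mb/s")
print(f"raw data to tape: at most {raw_tb_per_year:.0f} TB/year")
```

The conversion reproduces the 400 Mb/s quoted on the slide, and the reduction factor of about a million shows how much of the work happens before anything reaches storage.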

6 Data Challenge in High Energy Physics: CERN / LHC High Energy Physics Data, 2008 onwards
[Diagram: tiered LHC computing model. Labels: CERN LHC CMS detector (12,500 tons, $700M); Online System; Event Reconstruction; Event Simulation; Analysis; Tier 0 +1; Tier 1; Tier 2; Tier 3; Tier 4; France; Germany; Italy; FermiLab, USA; Institute ~0.25 TIPS; Physics data cache. Link rates shown: ~PBps, ~100 MBps, 2.5-40 Gbps, ~0.6-2.5 Gbps, 100-1000 Mbps.]
2000 physicists in 31 countries are involved in this 20-year experiment in which DOE is a major player.
Grid infrastructure spread over the US and Europe coordinates the data analysis.

7 SLAC-BaBar Computing Fabric
Client: 1700 dual CPU Linux (over 3700 cores)
  HEP-specific ROOT software (Xrootd) + Objectivity/DB object database, some NFS
IP Network (Cisco)
Disk Server: 120 dual/quad CPU Sun/Solaris, ~700 TB Sun RAID arrays (FibreChannel + some SATA)
  HPSS + SLAC enhancements to ROOT and Objectivity server code
Tape Server: 25 dual CPU Sun/Solaris, 40 STK 9940B, 6 STK 9840A, 6 STK Powderhorn, over 1 PB of data

8 Used/Required Space

9 ESnet: Source and Destination of the Top 30 Flows, Feb. 2005
[Bar chart: Terabytes/Month, axis 0 to 12. Flow categories: DOE Lab-International R&E; Lab-U.S. R&E (domestic); Lab-Lab (domestic); Lab-Comm. (domestic).]
Flows shown:
  Fermilab (US) -> WestGrid (CA)
  SLAC (US) -> INFN CNAF (IT)
  SLAC (US) -> RAL (UK)
  Fermilab (US) -> MIT (US)
  SLAC (US) -> IN2P3 (FR)
  IN2P3 (FR) -> Fermilab (US)
  SLAC (US) -> Karlsruhe (DE)
  Fermilab (US) -> Johns Hopkins
  LIGO (US) -> Caltech (US)
  LLNL (US) -> NCAR (US)
  Fermilab (US) -> SDSC (US)
  Fermilab (US) -> Karlsruhe (DE)
  LBNL (US) -> U. Wisc. (US)
  Fermilab (US) -> U. Texas, Austin (US)
  BNL (US) -> LLNL (US)
  Fermilab (US) -> UC Davis (US)
  Qwest (US) -> ESnet (US)
  Fermilab (US) -> U. Toronto (CA)
  BNL (US) -> LLNL (US)
  CERN (CH) -> BNL (US)
  NERSC (US) -> LBNL (US)
  DOE/GTN (US) -> JLab (US)
  U. Toronto (CA) -> Fermilab (US)
  NERSC (US) -> LBNL (US)
  CERN (CH) -> Fermilab (US)

10 Growth and Diversification
Continue shared cluster growth as much as possible
Increasing MPI (parallel) capacity and support (astro, accelerator, and more)
Grid interfaces and support (ATLAS et al.)
Large SMPs (Astro)
Visualization

11 Research - PetaCache
The PetaCache architecture aims at revolutionizing the query and analysis of scientific databases with complex structure
  –Generally this applies to feature databases (terabytes-petabytes) rather than bulk data (petabytes-exabytes)
The original motivation comes from HEP
  –Sparse (~random) access to tens of terabytes today, petabytes tomorrow
  –Access by thousands of processors today, tens of thousands tomorrow
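
To see why sparse access motivates a memory-based cache, the sketch below compares the time for many small independent reads at disk-like and memory-like latencies. All numbers in it are illustrative assumptions (about 5 ms per random disk access, about 100 ns per random memory access, one billion reads, 1000 parallel streams); none come from the slides.

```python
def random_read_time(n_reads, latency_s, parallel_streams=1):
    """Wall-clock seconds for n_reads independent reads, each costing
    latency_s, spread evenly over parallel_streams devices or channels."""
    return n_reads * latency_s / parallel_streams


DISK_LATENCY_S = 5e-3    # assumed ~5 ms per random disk access
DRAM_LATENCY_S = 100e-9  # assumed ~100 ns per random memory access

n_reads = 10**9          # hypothetical: one billion sparse object reads
streams = 1000           # hypothetical: 1000 devices working in parallel

disk_seconds = random_read_time(n_reads, DISK_LATENCY_S, streams)
dram_seconds = random_read_time(n_reads, DRAM_LATENCY_S, streams)

print(f"disk: {disk_seconds / 3600:.2f} hours")
print(f"DRAM: {dram_seconds:.2f} seconds")
print(f"ratio: {disk_seconds / dram_seconds:.0f}x")
```

The model ignores transfer time and caching, so it only bounds the latency-dominated case, but that is the case the slide describes: when access is sparse, per-access latency rather than bandwidth sets the pace.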

12 Latency Ideal

13 Latency Current Reality

14 Latency Practical Goal

15 PetaCache Summary
Data-intensive science increasingly requires low-latency access to terabytes or petabytes
Memory is one key:
  –Commodity DRAM today (increasing total cost by ~2x)
  –Storage-class memory (whatever that will be) in the future
Revolutions in scientific data analysis will be another key
  –Current HEP approaches to data analysis assume that random access is prohibitively expensive
  –As a result, permitting random access brings much-less-than-revolutionary immediate benefit
Use the impressive motive force of a major HEP collaboration with huge data-analysis needs to drive the development of techniques for revolutionary exploitation of an above-threshold machine

16 Research – Very Large Databases
10-year, unique experience with VLDB
  –Designing, building, deploying, and managing peta-scale production datasets/database
  –BaBar – 1.4 PB
  –Assisting LSST (Large Synoptic Survey Telescope) in solving data-related challenges (effort started 4Q 2004)

17 LSST – Data Related Challenges (1/2)
Large volumes
  –7 PB/year (image and catalog data)
  –500 TB/year (database)
  –Today's VLDBs are in the ~10s of TB range
High availability
  –Petabytes -> 10s of 1000s of disks -> daily disk failures
Real time requirement
  –Transient alerts generated in < 60 sec
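
The chain "petabytes -> 10s of 1000s of disks -> daily disk failures" can be checked with rough numbers. In the sketch below only the 7 PB/year figure comes from the slide; the 10-year horizon, 1 TB disk size, 3% annual failure rate, and absence of replication are assumptions chosen for illustration.

```python
PB = 10**15  # bytes, decimal units
TB = 10**12
MB = 10**6


def disks_needed(total_bytes, disk_bytes):
    """Whole disks required to hold total_bytes (ceiling division)."""
    return -(-total_bytes // disk_bytes)


def expected_failures_per_day(n_disks, annual_failure_rate):
    """Average disk failures per day for a given annual failure rate."""
    return n_disks * annual_failure_rate / 365


volume = 10 * 7 * PB                     # assumed: 10 years at 7 PB/year
n_disks = disks_needed(volume, 1 * TB)   # assumed: 1 TB per disk, no replication
failures = expected_failures_per_day(n_disks, 0.03)  # assumed: 3% per year

# Sustained ingest rate implied by 7 PB/year.
ingest_mb_per_s = 7 * PB / (365 * 24 * 3600) / MB

print(f"{n_disks} disks, about {failures:.1f} failures/day")
print(f"average ingest: about {ingest_mb_per_s:.0f} MB/s")
```

Under these assumptions the archive needs tens of thousands of disks and loses several per day, which matches the slide's point that failure handling has to be routine rather than exceptional.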

18 LSST – Data Related Challenges (2/2)
Spatial and temporal aspects
  –Most surveys focus on a single dimension
All data made public with minimal delay
  –Wide range of users – professional and amateur astronomers, students, general public

19 VLDB Work by SCCS
Prototyping at SCCS
Close collaboration with key MySQL developers
Working closely with world-class database gurus

20 Research – Geant4
A toolkit simulating elementary particles passing through and interacting with matter, and modeling the detector apparatus measuring the passage of elementary particles and recording the energy and dose deposition
Geant4 is developed and maintained by an international collaboration
  –SLAC is the second largest center next to CERN
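
As a flavor of what "simulating particles passing through matter" involves, here is a toy Monte Carlo in Python. It is not Geant4 and does not use the Geant4 API; it assumes a single slab of material, a fixed mean free path, and particles that stop at their first interaction, which is far simpler than a real detector simulation.

```python
import math
import random


def transmitted_fraction(thickness_cm, mean_free_path_cm, n_particles, seed=0):
    """Estimate the fraction of particles crossing a slab without interacting.

    Toy model (assumption): the distance to the first interaction is
    exponentially distributed with the given mean free path.
    """
    rng = random.Random(seed)  # seeded so the run is repeatable
    passed = 0
    for _ in range(n_particles):
        depth = rng.expovariate(1.0 / mean_free_path_cm)
        if depth > thickness_cm:
            passed += 1
    return passed / n_particles


# Hypothetical slab: 2 cm thick, 4 cm mean free path.
estimate = transmitted_fraction(2.0, 4.0, 100_000)
expected = math.exp(-2.0 / 4.0)  # analytic result for this toy model
print(f"simulated {estimate:.4f}, analytic {expected:.4f}")
```

The simulated fraction should approach the analytic value exp(-thickness / mean free path) as the number of particles grows; a full toolkit replaces this single exponential with detailed geometry and physics processes.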

21 Acknowledgements
Richard Mount, Director, SCCS
Chuck Boeheim, SCCS
Randall Melen, SCCS

22 WWW 2008
April 21-25, 2008
Beijing, China

23 Host Institution and Partners
Beihang University
  –School of Computer Science
Tsinghua University, Peking University, Chinese Academy of Sciences, …
Microsoft Research Asia
City Government of Beijing (pending)

24 BICC: Beijing International Convention Center

25 Key Personnel
General Chairs:
  –Jinpeng Huai, Beihang University
  –Robin Chen, AT&T Labs
Conference Vice Chair:
  –Yunhao Liu, HKUST
Local volunteers
  –6-10 grad students led by Dr. Zongxia Du
  –In cooperation with John Miller (TBD)
IW3C2 Liaison: Ivan Herman
PCO: two candidates under consideration

26 Local Organizing Committee
Composition of Local Organizing Committee:
  –Vincent Shen, The HK University of Science and Technology
  –Zhongzhi Shi, Chinese Academy of Sciences
  –Hong Mei, Peking University
  –Dianfu Ma, Beihang University
  –Guangwen Yang, Tsinghua University
  –Hsiao-Wuen Hon, Microsoft Research Asia
  –Minglu Li, Shanghai Jiao Tong University
  –Hai Jin, Huazhong University of Science and Technology
  –… and Chinese Internet/Software/Telecom companies
