ECFA HL-LHC Workshop: ECFA High Luminosity LHC Experiments Workshop, for the TDOC preparatory group. Niko Neufeld, CERN/PH.

Presentation transcript:

ECFA HL-LHC Workshop: ECFA High Luminosity LHC Experiments Workshop. For the TDOC preparatory group. Niko Neufeld, CERN/PH

ALICE: Pierre Vande Vyvre, Thorsten Kollegger, Predrag Buncic
ATLAS: David Rousseau, Benedetto Gorini, Nikos Konstantinidis
CMS: Wesley Smith, Christoph Schwick, Ian Fisk, Peter Elmer
LHCb: Renaud Legac, Niko Neufeld

Relevant documents used as inputs by the preparatory group:
ALICE Upgrade LoI: LHCC
ATLAS Phase 2 Upgrade LoI: LHCC
CMS draft Phase 2 upgrade document, available over the summer 2013
LHCb Framework Upgrade TDR: LHCC
I. Bird et al., "Update of the Computing Models of the WLCG and the LHC Experiments", forthcoming
For technology forecasting: document by Bernd Panzer (CERN-IT), with updates in I. Bird et al.

Focus on Commercial Off-The-Shelf (COTS) technology relevant for Trigger, DAQ and Computing, in three broad topics:
Processing: FPGAs, co-processors (many-cores), CPUs (x86 and others)
Interconnects: links & serializers, networks (LAN & WAN)
Storage: tapes, disks
Not covered: ASICs and general semiconductor development (→ electronics talks), software (→ covered by David), architecture (→ covered by Wesley).
Not covered ≠ not interesting.

Technology talks are the more interesting the more "new stuff" they contain. In summarizing our ideas about the future we also rely on information obtained from industry, often under a non-disclosure agreement (NDA). Great care has been taken to present only information which has been cleared for public release. Should you find anything on these slides which you think is under NDA, please contact the author.


14 nm CPUs/FPGAs in 2014 (simpler NAND flash already at 10 nm!).
Intel is strongly committed to a new processor generation every 12 to 18 months; TSMC, Samsung and ST are also still in this game: expect 10 nm in 2015/16.
Moore’s law will hold at least until …
Source: Wikipedia

Moore’s law holds for FPGAs as well: the next-generation Altera Stratix 10 is in 14 nm.
Smaller feature size means higher speed and/or less power consumption; logic density is not a problem.
Challenges: programmability; long-term maintenance of designs against rapidly changing FPGA software → conserve the tools in virtual machines, but this will need continuous support.
Source: I. Bolsen (Xilinx), 2012

Key challenges are power and memory bandwidth.
System-on-a-Chip (SoC): memory controller and voltage regulators integrated (helps with fine-grained power management); integration of CPU and GPU (and DSPs), as these simpler elements tend to use less power for the same chip area than traditional cores; unified memory architecture, 3-D DRAM memory on chip → micro-servers based on SoCs.
Market focus on cost-effective components for smartphones, phablets, tablets, ultrabooks, notebooks.
Servers are ‘less’ innovative: more memory channels & bandwidth, larger caches, more cores, focus on vector units, HPC, integration of accelerator cards; the problem of “dark silicon” in processors.

Memory bandwidth per core is rather decreasing. (1)
DDR4 is expected to be the last parallel, pluggable memory as we know it.
This problem is obviously much more acute for many-core architectures.
Source of raw data: Wikipedia. The picture shows memory bandwidth per core for four recent Intel architectures, using the fastest possible DDR3 memory in the optimal configuration and the maximum number of cores available for dual-socket server processors.
(1) Latency and bandwidth are strongly coupled. Actual application performance more often depends on the latency of memory access, which in turn depends on the organization and size of the caches.
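To make the trend concrete, here is a minimal back-of-the-envelope sketch (Python) of theoretical peak DDR3 bandwidth per core for dual-socket Intel servers of that era. The channel counts, memory speeds and core counts are assumptions chosen for illustration; they are not the exact data behind the slide's plot.

```python
# Peak DDR3 bandwidth per socket and per core, assuming the fastest supported
# memory and the maximum core count of each (assumed) dual-socket generation.
configs = {
    # name: (memory channels per socket, DDR3 rate [MT/s], cores per socket)
    "Westmere-EP":     (3, 1333, 6),
    "Sandy Bridge-EP": (4, 1600, 8),
    "Ivy Bridge-EP":   (4, 1866, 12),
}

for name, (channels, mts, cores) in configs.items():
    peak_gb_s = channels * mts * 8 / 1000   # 8 bytes per transfer per channel
    print(f"{name:16s} {peak_gb_s:5.1f} GB/s per socket, "
          f"{peak_gb_s / cores:4.1f} GB/s per core")
```

Even with faster DDR3, the per-core figure stagnates or drops, because core counts grow faster than memory bandwidth.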

Moore’s law again.
Biggest challenges: programming & vectorization (David’s talk); memory bandwidth (stacked DRAM will come soon and should help with the memory bottleneck).
For scientific computing this domain is a match between two companies, Intel (Xeon Phi) and Nvidia (CUDA), but surprises (AMD) are always possible.
ATLAS study on triggering on displaced jets: up to 60x faster than the x86 version.

ARM: very strong in the fast-growing mobile market.
Interest in the HPC community ("Project Mont Blanc").
Very power-efficient.
Very "European".

Modern ARM SoCs are competitive micro-servers; HEP applications are being ported.
Current challenges: memory amount and bandwidth (scaling); cost/performance ratio (when disregarding power).
Moore’s law holds for ARM as well, and GPUs are being added to ARM SoCs too.
LHCb reconstruction on ARM.

In principle the technology roadmaps are very challenging, but not completely unrealistic.
Clear focus on lower-end consumer markets; the server market is at best stable. Very few companies can afford fabrication lines.
Price/performance improvements are achieved via technology and fabrication (smaller structure sizes), or via longer amortization periods of the fabs for the same chip generation. Disadvantage: since smaller structure sizes mean a better power/performance ratio, staying on an older generation increases TCO and electricity costs.
The total number of cores increases → I/O and the number of streams per disk spindle go up.
Better price/performance ratio in 2013 than expected. Assume now a 25% improvement per year, if and only if the efficiency with which our software uses the processors stays constant (not obvious! cf. David’s talk).
Reminder: a 5% difference leads to a factor 1.63 over 10 years.
(Picture: CERN Computer Centre)
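The "factor 1.63" reminder is plain compound growth: a 5% per-year difference compounded over ten years is 1.05^10 ≈ 1.63. A minimal sketch of the arithmetic, using the 25% and 20% annual rates assumed elsewhere in this talk:

```python
# Compound growth over 10 years: a 5% per-year difference compounds to ~1.63.
years = 10
print(f"1.05**{years} = {1.05 ** years:.2f}")            # ~1.63

# The scenario rates assumed in the talk, for illustration:
print(f"25%/year for {years} years: x{1.25 ** years:.1f} total improvement")
print(f"20%/year for {years} years: x{1.20 ** years:.1f} total improvement")
```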


Commercially driven by the data-center market.
Interfaces above 10 Gbit/s use 4 fibre pairs in parallel, e.g. 40/100 GigE, FDR/EDR InfiniBand.
Up to 150 m, multi-mode optics is still cheaper than single-mode, but distances get shorter and expensive high-quality fibres (OM4) are required.
The next years will see higher integration: the network adapter in the CPU, then the physical layer in the CPU (silicon photonics).

Speed [Gb/s] on one pair | Target technology | Year
10 | 10/40 Ethernet, FDR InfiniBand | …
… | … G Ethernet | ??
… | … G Ethernet | 20??
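As a quick illustration of the "4 pairs in parallel" point, the sketch below multiplies out nominal per-lane rates; the lane rates are standard rounded values supplied here for illustration, not numbers taken from the slide.

```python
# >10 Gbit/s interfaces are built from 4 parallel lanes / fibre pairs
# (nominal per-lane rates, rounded; supplied for illustration).
lanes = 4
for name, per_lane_gbps in [("40 GbE", 10), ("100 GbE", 25), ("FDR InfiniBand", 14)]:
    print(f"{name:15s} ~ {lanes} x {per_lane_gbps} Gb/s = {lanes * per_lane_gbps} Gb/s")
```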

[Timeline figure: NIC and host-interface milestones; release dates lost in the transcript] PCIe Gen4 expected; Chelsio T5 (40 GbE) and Intel 40 GbE expected; Mellanox FDR; Mellanox 40 GbE NIC; PCIe Gen3 available; EDR (100 Gb/s) HCA expected; 100 GbE NIC expected.

Brocade MLX: … GigE ports
Juniper QFabric: up to … GigE ports
Mellanox SX6536: 648 x 56 Gb/s (IB) / 40 GbE ports
Huawei CE12800: 288 x 40 GbE / 1152 x 10 GbE ports
Each with sufficient bandwidth to run the entire Run 2 DAQ of all LHC experiments together.

ATLAS transferred 171 PB over the wide-area network ("the internet") in 2012.
We are moving quickly into the 100 Gbps era, with 400 Gbps not far behind.
A transatlantic 100 Gbps transmission link has been demonstrated (ANA-100G).
LHCONE project for future Tier-1 to Tier-2 connectivity.
Technology is moving forward steadily, but funding is needed to keep up with hardware development.
Yet another "Moore’s law". Acknowledgements to A. Barczyk (Caltech) for help with this slide.
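For scale, 171 PB in one year corresponds to a sustained average of roughly 40 Gbit/s. A minimal sketch of the arithmetic (the averaging is purely illustrative; real traffic is bursty and peaks much higher):

```python
# Average rate implied by 171 PB transferred over one year.
bytes_total = 171e15                      # 171 PB
seconds_per_year = 365 * 24 * 3600
avg_gbit_s = bytes_total * 8 / seconds_per_year / 1e9
print(f"sustained average ~ {avg_gbit_s:.0f} Gbit/s")   # ~43 Gbit/s
```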

By the end of LS2, links with > 100 Gbit/s will be readily available.
Prices for networking in the local area are dropping steadily.
The entry of silicon photonics should reduce the price of optical links even more.
In non-radiation environments, all network and link needs of the LHC experiments will be satisfied by industry.


[HDD roadmap figure: today 3-4 TB per disk, later 20 TB, eventually 60 TB]
SSDs are not replacing HDDs any time soon; the same holds for non-volatile memory.
Storage is not a problem for online applications (80 GB/s for ALICE after LS2).
But: while capacity is increasing, parallel I/O capability is not → more applications sharing fewer disks → a big challenge for offline applications.
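The I/O squeeze can be illustrated with a simple throughput-per-terabyte estimate: as drives grow, sequential bandwidth grows far more slowly, so the bandwidth available per stored terabyte shrinks. The capacities and streaming rates below are rough assumptions for the sketch, not figures from the slide.

```python
# Illustrative MB/s available per TB stored, for a small and a large HDD
# (assumed capacities and sequential rates).
drives = {
    "4 TB HDD (today)":   (4,  150),   # (capacity [TB], sequential rate [MB/s])
    "20 TB HDD (future)": (20, 250),
}
for name, (tb, mb_s) in drives.items():
    print(f"{name:20s} {mb_s / tb:5.1f} MB/s per TB stored")
```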

Tape: the price continues to fall.
There is still a price advantage over disk, but the economic threshold is moving out, i.e. more and more data is needed to justify the cost of the tape infrastructure.
Overall a declining market.
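The "economic threshold" argument can be sketched as a simple break-even calculation: tape carries a fixed infrastructure cost (library, drives, robotics) plus a low per-GB media cost, while disk is essentially a per-GB cost. All prices below are made-up placeholders, purely to show the shape of the argument, not real quotes.

```python
# Break-even volume above which a tape infrastructure is cheaper than disk
# (all costs are hypothetical placeholders).
tape_fixed_chf = 500_000      # library + drives + robotics (assumed)
tape_per_tb_chf = 10          # media cost per TB (assumed)
disk_per_tb_chf = 50          # disk cost per TB (assumed)

break_even_tb = tape_fixed_chf / (disk_per_tb_chf - tape_per_tb_chf)
print(f"tape pays off above ~{break_even_tb:,.0f} TB")
```

As the per-TB price of disk falls, the denominator shrinks and the break-even volume moves out, which is exactly the trend described above.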

The price ratio has evolved as expected; assume now a 20% improvement per year. Reminder: a 5% difference leads to a factor 1.63 over 10 years.
The technology roadmap for HDD and tape is reasonable, but new HDD technologies are complex and very expensive. The market for tape is shrinking and the market for HDDs is under pressure.
Price/performance improvements are achieved via technology and fabrication (smaller structure sizes), or via longer amortization periods of production lines for the same HDD generation → higher power consumption per GB, disk size not increasing (actually a good point).
Assume a factor 3(?) lower cost for tape storage (CHF/GB). Merge tape and disk costs? Large site dependencies.
The current focus is on space and not on I/O performance. A problem?!

Cost structure for a site: CPU processing; disk storage; tape storage; networking (LAN and WAN); services (databases, home directories, backup, experiment services, etc.); electricity.
Boundary conditions: constant or decreasing budget; limited space; limited cooling capacity; limited electricity capacity; lifecycle of equipment (4-5 years).
The cost of electricity is increasing at 4-5% per year (euro-zone average) → about 60% in 10 years.
Job characteristics: MC and processing versus analysis I/O; 10 Gbit necessity; local worker-node SSDs; storage hierarchies, etc.
Power efficiency versus performance; disk-to-tape ratio; availability of tape technology.
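A quick check of the electricity figure, compounding 4-5% per year over ten years:

```python
# Electricity cost growth compounded over 10 years for a few assumed rates.
for rate in (0.04, 0.045, 0.05):
    print(f"{rate:.1%}/year over 10 years: +{(1 + rate) ** 10 - 1:.0%}")
```

The 4.5-5% end of the range lands close to the "60% in 10 years" quoted on the slide.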

Big data and cloud computing drive the market for Commercial Off-The-Shelf IT equipment → HEP profits from that, but depends on the economy at large.
We expect current performance rates (and price/performance improvements) to continue at historic rates until at least LS2:
25% performance improvement per year in computing at constant cost;
local-area network and link technology sufficient for all HL-LHC needs, also beyond LS2;
wide-area network growth sufficient for LHC needs, provided sufficient funding;
20% price drop at constant capacity expected for disk storage.
Far beyond LS2 the technical challenges for further evolution seem daunting. Nevertheless, the proven ingenuity and creativity of the IT industry justify cautious optimism.
The fruits of technology are there, but hard work is needed to make the best of them.


Based on a realistic model for the LHCb upgrade.