Status and prospects of computing at IHEP

Status and prospects of computing at IHEP
Xiaomei Zhang, IHEP Computing Center
April 4, 2018, CNAF

IHEP at a Glance
Institute of High Energy Physics, CAS, Beijing, China
The largest fundamental research center in China
Particle physics, astroparticle physics, accelerator and synchrotron radiation technology and applications
The major institute conducting high energy physics experiments in China

Experiments and challenges: BES / CEPC / DYB / JUNO / LHAASO

Experiments at IHEP
BESIII (Beijing Spectrometer III at BEPCII)
DYB (Daya Bay Reactor Neutrino Experiment)
JUNO (Jiangmen Underground Neutrino Observatory)
CEPC [layout diagram: Linac (240 m), LTB, BTC, Booster (50 km), Collider Ring (50 km), IP1/IP2]
LHAASO (Large High Altitude Air Shower Observatory)
YBJ (Tibet-ASgamma and ARGO-YBJ experiments)
HXMT (Hard X-ray Modulation Telescope)

BEPCII/BESIII
BEPC: Beijing Electron-Positron Collider
BES: Beijing Spectrometer, general-purpose detector located at BEPC
Upgrade: BEPCII/BESIII, operational in 2008
Energy 2.0 ~ 4.6 GeV, luminosity (3~10)×10^32 cm^-2 s^-1
Produces 100 TB/year, 5 PB in total over 5 years
5,000 cores for data processing and physics analysis

CEPC
Next-generation accelerator in China after BEPCII, which will complete its mission around 2021
Two phases:
CEPC (Circular Electron Positron Collider, e+e-, a Higgs/Z factory)
- Precision measurement of the Higgs/Z boson, about 12 years of running
- Beam energy ~120 GeV
- Estimated to produce 200 TB/year as a Higgs factory and >100 PB/year as a Z factory
SPPC (Super Proton-Proton Collider, pp, a discovery machine)
- Discovery of new physics
- Beam energy ~50 TeV
- Estimated to produce 100 PB/year

CEPC timetable
The CEPC collider is planned to be built as a 50/100 km ring at Qinhuangdao, close to Beijing
[Layout diagram: Linac (240 m), LTB, BTC, Booster (50 km), Collider Ring (50 km), IP1/IP2]
Pre-study, R&D and preparation work:
- Pre-study: 2013-2015
- R&D: 2016-2020
- Engineering design: 2015-2020
Construction: 2021-2027
Data taking: 2028-2035

Neutrino Experiments
Daya Bay Reactor Neutrino Experiment
- To measure the mixing angle θ13
- ~300 collaborators from 38 institutions
- Produces ~200 TB/year (2011-2018)
JUNO (Jiangmen Underground Neutrino Observatory)
- Construction started in 2014, operational in 2020
- 20 kt liquid scintillator detector, 2-3% energy resolution
- Rich physics opportunities
- Estimated to produce 2 PB of data per year for 10 years

LHAASO
Large High Altitude Air Shower Observatory, located on the border of Sichuan and Yunnan Provinces
Multipurpose project with a complex detector array for high energy gamma-ray and cosmic-ray detection
Expected to be operational in 2019
~1.2 PB of data per year for 10 years
On-site storage and computing resources; data will be filtered and compressed before being transferred back to IHEP

Computing environment at IHEP: Computing, Storage, Network

HTC computing resources
DIRAC-based distributed computing system for BESIII, JUNO, ...: ~3,500 CPU cores
IHEPCloud, based on OpenStack: ~700 CPU cores
Local clusters: 16,000 CPU cores; scheduler migrated from PBS/Maui to HTCondor
Grid site (WLCG): 1,200 CPU cores with CREAM CE, 1 PB of storage for CMS and ATLAS; going to support LHCb this year
(A sketch of job submission through the HTCondor Python bindings is shown after this list.)
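For illustration only, a minimal sketch of how a batch job could be submitted to an HTCondor pool through the official Python bindings; the executable, arguments and file names are hypothetical placeholders, not IHEP's actual job configuration.

```python
# Sketch: submit a simple job to an HTCondor pool with the Python bindings.
# Executable, arguments and file names are hypothetical placeholders.
import htcondor

job = htcondor.Submit({
    "executable": "/usr/bin/env",          # placeholder executable
    "arguments": "python3 analysis.py",    # placeholder payload
    "request_cpus": "1",
    "request_memory": "2GB",
    "output": "job.out",
    "error": "job.err",
    "log": "job.log",
})

schedd = htcondor.Schedd()                 # connect to the local scheduler
result = schedd.submit(job, count=1)       # queue one instance of the job
print("Submitted cluster", result.cluster())
```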

HPC farm
An HPC farm was set up last year at IHEP using SLURM, to support:
- BESIII partial wave analysis
- Geant4 detector simulation (CPU-time and memory consuming)
- Simulation and modelling for accelerator design
- Lattice QCD calculation
- Machine learning applications in IHEP experiments
(A sketch of submitting such a job to SLURM follows this list.)
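As an illustration only, a minimal sketch of handing a multi-core simulation job to a SLURM farm from Python; the partition name, resource numbers and payload script are assumptions, not the actual IHEP configuration.

```python
# Sketch: generate and submit a SLURM batch job from Python.
# Partition, resources and the payload command are hypothetical.
import subprocess
import textwrap

batch_script = textwrap.dedent("""\
    #!/bin/bash
    #SBATCH --job-name=g4-sim          # hypothetical job name
    #SBATCH --partition=hpc            # hypothetical partition
    #SBATCH --ntasks=1
    #SBATCH --cpus-per-task=8          # multi-core simulation job
    #SBATCH --mem=16G
    #SBATCH --time=12:00:00
    srun ./run_geant4_simulation.sh    # placeholder payload
""")

with open("sim_job.sh", "w") as f:
    f.write(batch_script)

# sbatch prints something like "Submitted batch job 12345"
out = subprocess.run(["sbatch", "sim_job.sh"],
                     capture_output=True, text=True, check=True)
print(out.stdout.strip())
```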

Storage
Lustre as main disk storage: 9 PB capacity with 60 disk servers
Gluster: 350 TB of storage provided for cosmic-ray experiments
DPM & dCache: 1 PB, with SRM interface
HSM with modified CASTOR: 2 tape libraries + 2 robots, 26 drives, 5 PB capacity

Storage (continued)
EOS: 797 TB, Beryl v0.3.195, 5 servers, deployed for LHAASO
Other storage services in production: dCache, DPM, CASTOR, CVMFS, Gluster, NFS, AFS

Local network
The largest campus network and bandwidth among all CAS institutes
10 Gbps backbone, IPv4/IPv6 dual stack
Wireless coverage
>3,000 end users

Data center network
160 Gbps (4×40 Gbps) for Layer 2 switches
2×10 Gbps for storage nodes

International and domestic links
Dedicated links to other IHEP experiment sites: Shenzhen (Daya Bay), Dongguan (CSNS), Tibet (YBJ/ARGO), Kaiping (JUNO), Chengdu (LHAASO)
Good international connections: IHEP-Europe 10 Gbps, IHEP-USA 10 Gbps
Joining LHCONE this year: a dedicated 20 Gbps link has been added, the Tier-2 is being switched to LHCONE, and non-LHC experiments are applying to join

Research activities: Distributed computing, Virtualization and cloud, Volunteer computing, Software-defined networking

DIRAC-based distributed computing
The distributed computing system has been developed with DIRAC in collaboration with JINR
Currently used by several experiments, including BESIII, JUNO and CEPC
15 sites from the USA, Italy, Russia, Turkey, Taiwan, Chinese universities, etc.
~3,500 cores and 500 TB of storage, connected to IHEP over 10 Gbps links

DIRAC-based distributed computing (continued)
Four resource types are supported: grid, cluster, cloud and volunteer computing
Cloud resources have been successfully integrated based on the pilot schema, implementing elastic scheduling according to job requirements
Seven cloud resources have joined so far
(A sketch of job submission through the DIRAC API is shown below.)
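For illustration, a minimal sketch of how a user job could be described and submitted with the DIRAC Python API; the job name, payload script and CPU-time value are hypothetical placeholders.

```python
# Sketch: describe and submit a simple job with the DIRAC API.
# Job name, payload script and CPU-time request are placeholders.
from DIRAC.Core.Base import Script
Script.parseCommandLine()                      # initialize the DIRAC environment

from DIRAC.Interfaces.API.Dirac import Dirac
from DIRAC.Interfaces.API.Job import Job

job = Job()
job.setName("bes3_simulation_demo")            # hypothetical job name
job.setExecutable("run_simulation.sh")         # placeholder payload script
job.setInputSandbox(["run_simulation.sh"])
job.setCPUTime(86400)                          # requested CPU time in seconds

dirac = Dirac()
result = dirac.submitJob(job)
if result["OK"]:
    print("Submitted job with ID", result["Value"])
else:
    print("Submission failed:", result["Message"])
```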

Multi-core support in distributed computing
To support JUNO's multi-core jobs, multi-core (Mcore) scheduling is being studied
(A) Single-core scheduling: a one-core pilot pulls a one-core job
Two ways of multi-core scheduling are considered:
(B) Mcore pilots added to pull Mcore jobs: M-core pilots pull M-core jobs; pilots can "starve" when matching against a mixture of n-core jobs (n = 1, 2, ...)
(C) Standard-size pilots with dynamic slots: standard sizes such as whole node, 4 cores, 8 cores, ...; pilots pull n-core jobs (n = 1, 2, ...) until all internal slots are used up
(The sketch after this list illustrates the dynamic-slot matching idea of option C.)
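The following is a toy Python sketch of the dynamic-slot idea in option (C): a fixed-size pilot keeps pulling jobs of various core counts until no remaining job fits its free slots. It illustrates the scheduling logic only and is not the actual DIRAC pilot code.

```python
# Toy illustration of a standard-size pilot with dynamic slots (option C):
# the pilot pulls n-core jobs until no queued job fits the remaining free cores.
from dataclasses import dataclass

@dataclass
class Job:
    job_id: int
    cores: int   # number of cores the job requests

def fill_pilot(pilot_cores: int, queue: list[Job]) -> list[Job]:
    """Greedily match queued jobs to a pilot with `pilot_cores` free slots."""
    free = pilot_cores
    matched = []
    for job in list(queue):
        if job.cores <= free:          # job fits into the remaining slots
            matched.append(job)
            queue.remove(job)
            free -= job.cores
        if free == 0:
            break
    return matched

# Example: an 8-core (whole-node) pilot and a mixed queue of 1/2/4/8-core jobs.
queue = [Job(1, 4), Job(2, 1), Job(3, 2), Job(4, 8), Job(5, 1)]
print([j.job_id for j in fill_pilot(8, queue)])   # -> [1, 2, 3, 5]
```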

Virtualization and Cloud (1)
IHEPCloud was built and launched in May 2014
Scale: 700 cores, >1,000 users
Three current use cases:
- Self-service virtual machine platform, mainly for testing and development
- Virtual computing cluster, automatically supplementing the physical queue when it is busy
- Distributed computing system, working as a cloud site

Virtualization and Cloud (2)
Core middleware: OpenStack
Configuration management tool: Puppet
Automatically built VM images for different applications
Authentication: SSO with the e-mail account
External storage: Ceph used to back Glance, Nova and Cinder
(A sketch of launching a VM through the OpenStack SDK follows.)
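For illustration, a minimal sketch of booting a virtual machine on an OpenStack cloud with the openstacksdk Python library; the cloud entry, image, flavor, network and key-pair names are hypothetical placeholders, not IHEPCloud's actual configuration.

```python
# Sketch: boot a VM on an OpenStack cloud with openstacksdk.
# Cloud entry, image/flavor/network names and key pair are placeholders.
import openstack

# Credentials are taken from clouds.yaml for the named cloud entry.
conn = openstack.connect(cloud="ihepcloud-demo")    # hypothetical cloud name

image = conn.compute.find_image("sl7-worker")       # hypothetical image
flavor = conn.compute.find_flavor("m1.medium")      # hypothetical flavor
network = conn.network.find_network("private")      # hypothetical network

server = conn.compute.create_server(
    name="worker-001",
    image_id=image.id,
    flavor_id=flavor.id,
    networks=[{"uuid": network.id}],
    key_name="demo-key",                            # hypothetical key pair
)
server = conn.compute.wait_for_server(server)       # block until ACTIVE
print("VM ready:", server.name, server.status)
```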

Volunteer Computing: CAS@home
CAS@home was launched and hosted in June 2010
First volunteer computing project in China
Supports scientific computing from the Chinese Academy of Sciences and other research institutes
Hosts multiple applications from various research fields, including nanotechnology, bioinformatics and physics
10K active users, of whom 1/3 are Chinese; 23K active hosts; 1.3 TFLOPS of real-time computing power

Volunteer Computing: ATLAS@home
IHEP was involved in the prototype of ATLAS@home and mainly contributed to solving three problems: host virtualization, platform integration and security
The first ATLAS@home server was deployed and maintained at IHEP in 2013, then moved to CERN in 2014
IHEP contributes manpower to maintaining and upgrading ATLAS@home:
- Use of virtualization to run ATLAS jobs on heterogeneous computers
- Event display in the BOINC GUI
- Support of multi-core jobs
- Using ATLAS@home to exploit extra CPU from grid sites and other clusters
- BOINC scheduling optimization

SDN (1)
Motivation:
- Massive data sharing and exchange among the collaborating sites of HEP experiments
- Virtualization and scaling-up of computing and storage resources require a more flexible and automated network
- Easier control over network facilities and infrastructure
- IPv4 performance differences among the collaborating sites
Goals:
- A private virtual network across the Chinese HEP experiment collaborating sites, based on an SDN architecture
- An intelligent route selection algorithm for data transfers, based on IPv6 among the Chinese collaborating sites but transparent to IPv4 applications

SDN (2)
Components: end-user network, backbone network (IPv6), L2VPN gateway, OpenFlow switch, control center
Participants: IHEP/SDU/SJTU/…, CSTNET/CERNET, Ruijie Networks
Measured throughput:
Network path     IPv4 (Mbps)  IPv6 (Mbps)  SDN (Mbps)
SDU-IHEP         3.15         12.60        -
SDU-SJTU         3.51         66.40        -
SJTU-IHEP        28.60        380.00       -
SDU-SJTU-IHEP    -            -            38.60
(A toy sketch of the route selection idea is given below the table.)
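As an illustration of the route selection idea only (not the actual algorithm), a toy Python sketch that picks the candidate path with the highest measured throughput; the path labels reuse the figures quoted in the table above and the selection rule is an assumption.

```python
# Toy sketch of throughput-based route selection between two sites:
# choose the candidate path with the highest measured bandwidth.
# The numbers reuse the measurements quoted in the table above (Mbps).
measured_mbps = {
    ("SDU", "IHEP", "direct IPv4"): 3.15,
    ("SDU", "IHEP", "direct IPv6"): 12.60,
    ("SDU", "IHEP", "via SJTU (SDN)"): 38.60,
}

def best_route(src: str, dst: str, measurements: dict) -> tuple[str, float]:
    """Return the path label with the highest measured throughput src -> dst."""
    candidates = {
        path: mbps
        for (s, d, path), mbps in measurements.items()
        if s == src and d == dst
    }
    path = max(candidates, key=candidates.get)
    return path, candidates[path]

path, mbps = best_route("SDU", "IHEP", measured_mbps)
print(f"Selected route SDU -> IHEP: {path} ({mbps} Mbps)")  # via SJTU (SDN)
```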