Distributed Computing in IHEP

1 Distributed Computing in IHEP
Xiaomei Zhang, on behalf of the IHEP distributed computing group
HEPiX Spring 2017, Budapest

2 Motivation
The number of experiments and the data volume at IHEP keep increasing, putting pressure on a single data center; extra resources are needed as a supplement.
Resources may be contributed through wide international cooperation within the experiments, and various heterogeneous opportunistic resources already exist.
Distributed computing is the way to integrate these distributed and heterogeneous resources as a whole.

3 BESIII (Beijing Spectrometer III at BEPCII)
A bit of history: Distributed Computing (DC) at IHEP was first built in 2012 to meet the peak needs of BESIII (~3 PB over 5 years) and was put into production in 2014.
Resources at the beginning: 80% batch, 20% grid. Cloud resources were integrated in 2015; resources now: 10% grid, 65% batch, 25% cloud.

4 A bit of history
In 2015, DC evolved into a general platform for multiple experiments.
More new experiments are coming at IHEP (JUNO, LHAASO, CEPC, ...), and more than one experiment has expressed interest in using or evaluating the platform.
A shared platform saves manpower and simplifies management.
[Figure: CEPC collider ring (50 km) with booster (50 km) and linac (240 m); LHAASO (Large High Altitude Air Shower Observatory); JUNO (Jiangmen Underground Neutrino Observatory)]

5 Computing model
IHEP as central site: raw data processing, bulk reconstruction, analysis.
Remote sites: MC production, analysis. Sites without an SE run MC only, with job output written directly to remote SEs.
Data flow: central storage at IHEP; IHEP -> sites, DST for analysis; sites -> IHEP, MC data for backup.
Compared with LCG, our size and manpower are small, so we keep everything simple to make site and central management as easy as possible for a small working group; most sites have no grid experience.

6 Resources
Sites: 15, from the USA, Italy, Russia, Turkey, Taiwan and Chinese universities (8).
Network: 10 Gb/s to the USA and Europe, 10 Gb/s to Taiwan and the Chinese mainland. Joining LHCONE is planned to further improve the network; the bottleneck is end-to-end, so perfSONAR monitoring is planned.
Resources: ~3000 CPU cores, ~500 TB storage; job input and output go directly to and from remote SEs.

7 DIRAC-based WMS
DIRAC (Distributed Infrastructure with Remote Agent Control) acts as the middle layer between jobs and heterogeneous resources.
GANGA and JSUB are used for massive job submission and management; JSUB is newly developed for flexible workflows and general-purpose use (see the sketch below).
CVMFS (CERN VM FileSystem) deploys experiment software to remote sites.
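To illustrate the DIRAC layer, a minimal job submitted through the standard DIRAC Python API might look like the sketch below; the executable, sandbox files and job name are hypothetical placeholders, not the actual BESIII workflow.

```python
# Minimal sketch of submitting a job through the DIRAC API
# (executable and file names are hypothetical placeholders).
from DIRAC.Core.Base.Script import Script
Script.parseCommandLine()  # initialize the DIRAC environment

from DIRAC.Interfaces.API.Dirac import Dirac
from DIRAC.Interfaces.API.Job import Job

job = Job()
job.setName("bes_simulation_demo")
job.setExecutable("run_simulation.sh", arguments="jobOptions.txt")
job.setInputSandbox(["run_simulation.sh", "jobOptions.txt"])
job.setOutputSandbox(["*.log"])
job.setCPUTime(86400)  # requested CPU time in seconds

result = Dirac().submitJob(job)
print(result)  # S_OK structure containing the job ID on success
```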

8 CVMFS set-up at IHEP
In 2013, the repository boss.cern.ch for BESIII was created at CERN.
In 2015, the IHEP CVMFS Stratum 0 (S0) was created, supporting other experiments including CEPC, JUNO and LHAASO: 3 repositories, ~600 GB.
In 2017, a new IHEP S0 with HA was created for both DC and the local batch system, and an IHEP Stratum 1 (S1) was created that serves both the IHEP S0 and the CERN (RAL) S0, speeding up LHC and non-LHC software access in Asia.
The plan is to add an S1 outside Asia to speed up access to IHEP software.

9 JSUB (Job submission tool)
A lightweight, general framework developed to take care of the life cycle of tasks (bunches of jobs).
An extensible architecture with plug-ins makes it easy for experiments to create their own extensions.
The modular design allows the job workflow to be customized; "Step" and "Module" components can be reused, as the sketch below illustrates.
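JSUB's actual interfaces are not shown on the slide, so the following is only a minimal sketch, assuming hypothetical `Module`, `Step` and `Task` classes, of how a reusable step/module workflow could be composed.

```python
# Hypothetical sketch of a step/module style workflow (not the real JSUB API).

class Module:
    """A reusable unit of work; experiments plug in their own subclasses."""
    def execute(self, context):
        raise NotImplementedError


class Simulation(Module):
    def execute(self, context):
        context["sim_output"] = "sim_%d.root" % context["job_index"]


class Reconstruction(Module):
    def execute(self, context):
        context["rec_output"] = context["sim_output"].replace("sim", "rec")


class Step:
    """A step chains modules; steps themselves can be reused across tasks."""
    def __init__(self, modules):
        self.modules = modules

    def run(self, context):
        for module in self.modules:
            module.execute(context)


class Task:
    """A task is a bunch of jobs sharing the same workflow."""
    def __init__(self, steps, n_jobs):
        self.steps, self.n_jobs = steps, n_jobs

    def run(self):
        for i in range(self.n_jobs):
            context = {"job_index": i}
            for step in self.steps:
                step.run(context)
            print("job %d produced %s" % (i, context["rec_output"]))


Task([Step([Simulation(), Reconstruction()])], n_jobs=3).run()
```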

10 FroNtier/squid for offline DB access
Static SQLite databases on CVMFS and a mirror database are currently in use for DC.
FroNtier/Squid, based on caching technology, is being considered to provide real-time and more stable database access in the DC environment.
Ongoing work: allow Frontier to accept MySQL as a backend, and bind Frontier with the experiment software to allow transparent access (a rough access sketch follows).
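For context, the current static approach amounts to reading a SQLite file published on CVMFS; a minimal sketch is shown below, with the repository path, table and column names purely hypothetical.

```python
# Sketch of reading a static conditions DB published on CVMFS
# (the repository path, table and column names are hypothetical examples).
import sqlite3

DB_PATH = "/cvmfs/bes3.ihep.ac.cn/offline_db/calibration.db"  # hypothetical

conn = sqlite3.connect("file:%s?mode=ro" % DB_PATH, uri=True)  # read-only open
for run, constant in conn.execute(
        "SELECT run_number, calib_constant FROM calibration LIMIT 5"):
    print(run, constant)
conn.close()
```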

11 Multi-VO control in DIRAC
VOMS is set up to group the experiments: cepc, juno, bes.
Resource scheduling and control: users are grouped by VO, and jobs inherit the VO information of their owners.
Pilots are separated by VO and pull only jobs of the same VO; resources are tagged with the VOs that own them.
The scheduler does the matching and priority control based on VO and role (see the matching sketch below).
[Figure: VO-based scheduling diagram, different colors for different VOs]
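The VO-based matching rule described above can be illustrated with a small sketch; the data structures are hypothetical simplifications, not DIRAC internals.

```python
# Simplified illustration of VO-based pilot/job matching
# (hypothetical structures, not the actual DIRAC matcher).

task_queue = [
    {"id": 1, "vo": "bes",  "priority": 5},
    {"id": 2, "vo": "juno", "priority": 8},
    {"id": 3, "vo": "bes",  "priority": 9},
]

def match_job(pilot_vo, queue):
    """A pilot pulls only same-VO jobs, highest priority first."""
    candidates = [j for j in queue if j["vo"] == pilot_vo]
    if not candidates:
        return None
    best = max(candidates, key=lambda j: j["priority"])
    queue.remove(best)
    return best

print(match_job("bes", task_queue))   # -> job 3 (highest bes priority)
print(match_job("cepc", task_queue))  # -> None, no cepc jobs queued
```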

12 Metadata and File catalogue
Built on the DFC (DIRAC File Catalogue): directory-like usage, tightly coupled with the DIRAC WMS.
It combines the Replica, Metadata and Dataset catalogues; performance is similar to LFC+AMGA, but it is more convenient (a query sketch follows).
Permission control goes through DIRAC user management, allowing separate usage and control for each experiment.
Currently registered data: ~300 TB.
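A typical metadata query against the DFC can go through DIRAC's FileCatalogClient; the sketch below is only illustrative, and the catalogue path and metadata fields are hypothetical examples.

```python
# Sketch of a metadata query through the DIRAC File Catalogue
# (paths and metadata field names are hypothetical examples).
from DIRAC.Core.Base.Script import Script
Script.parseCommandLine()

from DIRAC.Resources.Catalog.FileCatalogClient import FileCatalogClient

fc = FileCatalogClient()

# Attach metadata to a directory so the files below it can be selected by it.
fc.setMetadata("/bes/mc/rhopi/round05", {"energy": 3.097, "software": "6.6.4"})

# Find all files matching a metadata selection under a given path.
result = fc.findFilesByMetadata({"energy": 3.097, "software": "6.6.4"}, "/bes")
if result["OK"]:
    for lfn in result["Value"]:
        print(lfn)
```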

13 Central SE with StoRM
Started with dCache, then switched to StoRM in 2015 to simplify data exchange between local and grid storage.
One set-up serves multiple experiments: frontend (SRM, HTTP, xrootd) + backend (Lustre).
The frontend handles permission control and role matching with VOMS; the backend is the local file system holding the experiment data.
Performance is fine with the current load. Version: StoRM; capacity: ~2.5 PB (an access sketch via the SRM frontend follows).
A guide is provided to help small sites set up their own SE with StoRM.

Lustre      Capacity (TB)   Mode   Owner
/gridfs     66              RW     public
/bes3fs     1100            RO     bes
/juofs      502                    juno
/cefs       794                    cepc

14 Massive data transfer system
Developed as a DIRAC service to share data among sites and transfer files between site SEs.
Datasets are supported, with a multi-stream design for high speed.
Permission control goes through DIRAC user management.
Each year, ~100 TB of data is transferred between sites; a replication sketch using standard DIRAC calls is given below.
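The IHEP transfer service itself is custom, but the underlying replication of a single file between SEs can be illustrated with DIRAC's standard DataManager; the LFN and SE names below are hypothetical examples.

```python
# Sketch of replicating one file between site SEs with DIRAC's standard
# DataManager (not the IHEP transfer service itself); LFN and SE names
# are hypothetical examples.
from DIRAC.Core.Base.Script import Script
Script.parseCommandLine()

from DIRAC.DataManagementSystem.Client.DataManager import DataManager

dm = DataManager()
lfn = "/bes/mc/rhopi/round05/rhopi_0001.dst"   # hypothetical LFN
result = dm.replicateAndRegister(lfn, "USTC-USER", sourceSE="IHEP-STORM")
print(result)  # S_OK/S_ERROR structure reporting the replication outcome
```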

15 Elastic cloud integration
Cloud integration is implemented in an elastic way: the VMDIRAC extension with its VM scheduler was introduced.
VMs are booted with a "pilot" that pulls jobs from the TaskQueue; VM contextualization is done with cloud-init.
Cloud types supported: OpenStack, OpenNebula, AWS. Interfaces used: libcloud, rOCCI, boto; it is not easy to find one general layer that meets all requirements (a libcloud sketch follows).
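As one example of such an interface layer, booting a pilot VM on an OpenStack cloud through Apache libcloud with a cloud-init payload might look like the sketch below; the endpoint, credentials, image, flavour and bootstrap script are all hypothetical placeholders.

```python
# Sketch of booting a pilot VM on an OpenStack cloud through Apache libcloud,
# passing a cloud-init script as user data (endpoint, credentials, image and
# flavour names are hypothetical placeholders).
from libcloud.compute.types import Provider
from libcloud.compute.providers import get_driver

OpenStack = get_driver(Provider.OPENSTACK)
driver = OpenStack(
    "dirac", "secret",                                  # hypothetical credentials
    ex_force_auth_url="https://cloud.ihep.ac.cn:5000",  # hypothetical endpoint
    ex_force_auth_version="3.x_password",
    ex_tenant_name="dirac",
)

cloud_init = """#cloud-config
runcmd:
  - [/opt/dirac/start_pilot.sh]   # hypothetical pilot bootstrap script
"""

image = [i for i in driver.list_images() if i.name == "bes-pilot-image"][0]
size = [s for s in driver.list_sizes() if s.name == "m1.medium"][0]
node = driver.create_node(name="vm-pilot-001", image=image, size=size,
                          ex_userdata=cloud_init)
print(node.id, node.state)
```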

16 Elastic cloud integration
Clouds have become an important part of IHEP DC: 6 cloud sites are available from Italy, China and Russia.
More than 700K jobs have been processed in the past 2 years, with ~5% of failures related to VM performance issues (VMs getting stuck, which is not easy to track down).

17 Commercial cloud exploration
The AWS cloud was integrated in the same elastic way as the other clouds; a trial was done with the support of the Amazon AWS China region.
A BES image was created and uploaded to AWS, and VMDIRAC's elastic scheduling connects through the AWS API.
Tests were done and the price was evaluated: 400K BES rhopi events were simulated with a 100% success rate (simulation + reconstruction + analysis with little output, mainly to evaluate the CPU price).
A proper CPU type needs to be chosen for the best performance and price; the price is still a bit high (~10) compared with a self-maintained batch farm, using the Amazon market price in China (a spot-price check sketch follows).
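Checking the Amazon "market" (spot) price programmatically might look like the sketch below; it uses boto3 for illustration (the slide mentions boto), the instance types are examples, and valid AWS China credentials are assumed.

```python
# Sketch of checking EC2 spot ("market") prices in the AWS China region,
# using boto3 for illustration; instance types are examples and valid
# AWS China credentials are assumed.
import boto3

ec2 = boto3.client("ec2", region_name="cn-north-1")
history = ec2.describe_spot_price_history(
    InstanceTypes=["c4.xlarge", "m4.xlarge"],
    ProductDescriptions=["Linux/UNIX"],
    MaxResults=10,
)
for record in history["SpotPriceHistory"]:
    print(record["InstanceType"], record["AvailabilityZone"], record["SpotPrice"])
```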

18 Action-based Site Monitoring
Motivation: improve site stability, ease the life of admins, and provide a global site status view.
Components: information collection and display, decision and actions.
Both active and passive information is collected, and policies are defined so that automatic actions are taken in case of problems, such as sending warning messages or banning sites (see the policy sketch below).
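The policy idea can be illustrated with a small sketch; the metric names, thresholds and actions below are invented for illustration, not the actual monitoring policies.

```python
# Hypothetical sketch of action-based policies: evaluate collected site
# metrics and decide on an automatic action (names and thresholds invented).

def decide_action(site):
    """Map a site's collected status to an action."""
    if site["test_job_success_rate"] < 0.5:
        return "ban"   # repeated failures: remove the site from the mask
    if site["test_job_success_rate"] < 0.8 or not site["cvmfs_ok"]:
        return "warn"  # degraded: mail the site admin
    return "ok"

sites = [
    {"name": "SITE.A.cn", "test_job_success_rate": 0.95, "cvmfs_ok": True},
    {"name": "SITE.B.it", "test_job_success_rate": 0.70, "cvmfs_ok": True},
    {"name": "SITE.C.ru", "test_job_success_rate": 0.30, "cvmfs_ok": False},
]

for site in sites:
    print(site["name"], "->", decide_action(site))
```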

19 Production status
Total jobs: 728K in 2016, 665K in 2015, 340K in 2014.
The maximum number of concurrently running jobs can reach 2K (first season of 2015).
About 300 TB of data is exchanged directly by jobs each year.

20 Multi-core support
Multi-process/multi-thread applications are booming in HEP: they exploit multicore CPU architectures better and decrease the memory usage per core, so multi-core (Mcore) support is being considered.
Current situation: one-core pilots pull one-core jobs.
First option: add M-core pilots that pull M-core jobs. Easy to implement, but pilots can "starve" when matched against a mixture of n-core jobs (n = 1, 2, ...).

21 Multi-core support
Second option: standard-size pilots with dynamically partitionable slots (standard sizes: whole-node, 4-node, 8-node, ...).
Pilots pull n-core jobs (n = 1, 2, ...) until their internal slots are used up. This is more complicated to implement, and the scheduling efficiency question is harder than in the single-core case: how many pilots are needed, and what is the best match-making of jobs to pilots? A greedy slot-filling sketch is shown below.
[Figure: pilot with partitionable slots pulling jobs from the job pool]
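To make the second option concrete, the sketch below shows one possible (hypothetical) greedy way a standard-size pilot could fill its partitionable slots from a mixed job pool; it is a simplified illustration, not the actual scheduler.

```python
# Hypothetical sketch of the second option: a standard-size pilot with
# partitionable slots greedily pulls n-core jobs until its cores are used up
# (simplified illustration, not the actual scheduler).

def fill_pilot(total_cores, job_pool):
    """Greedily pack queued jobs into one pilot's free cores."""
    free = total_cores
    accepted = []
    for job in sorted(job_pool, key=lambda j: -j["cores"]):  # biggest first
        if job["cores"] <= free:
            accepted.append(job)
            free -= job["cores"]
    for job in accepted:
        job_pool.remove(job)
    return accepted, free

pool = [{"id": i, "cores": c} for i, c in enumerate([8, 4, 4, 2, 1, 1])]
jobs, idle = fill_pilot(total_cores=8, job_pool=pool)
print("pilot runs jobs", [j["id"] for j in jobs], "idle cores:", idle)
```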

22 Future plans
HPC federation with DIRAC has started, to build a "grid" of HPC computing resources: HPC resources are becoming more and more important in high energy physics data processing, and many HPC computing centers have been built up among HEP data centers in recent years.
Scaling and related performance studies are under consideration to meet the possible challenges of large experiments.
A data federation with caching is being considered, to speed up data access at sites and free small sites from maintaining storage.

23 Summary
Mature technologies are used as much as possible to keep things easy and simple for the sites and the central manager, with developments done to meet the specific needs of our experiments.
The system is small in scale and works fine with the current resources and load.
We keep up with advanced technologies to meet future challenges and further requirements from the experiments.

24 Thank you!

