
1 Use of multi-core and virtualization technologies
WLCG Overview Board, September 3rd 2010
P. Mato, P. Buncic

2 Introduction
• Two R&D projects started early 2008 in the PH Department (under White Paper Theme 3)
  ◦ WP8 - Parallelization of Software Frameworks to exploit Multi-core Processors
  ◦ WP9 - Portable Analysis Environment using Virtualization Technology
• Kick-off workshop on April 15, 2008
• 1st workshop on adapting applications and computing services to multi-core and virtualization took place in June 2009
  ◦ A number of follow-up actions were identified
• 2nd workshop on June 21-22, 2010
  ◦ The goals were to review progress on the follow-up actions, gather new feedback from the experiments, and set directions for the two R&D work packages

3 WP8 - Multicore R&D
• The aim of the Multicore R&D project is to investigate novel software solutions to efficiently exploit the new multi-core architecture of modern computers in our HEP environment
• Motivation: the industry trend in workstation and "medium range" computing
• Activity divided in four "tracks":
  ◦ Technology Tracking & Tools
  ◦ System and core-lib optimization
  ◦ Framework Parallelization
  ◦ Algorithm Optimization and Parallelization

4 Activities
• Code optimization
  ◦ Direct collaboration with Intel experts established to help analyze and improve the code
• Exploiting event parallelism
  ◦ Sharing data between processes to save memory (see the sketch after this slide)
  ◦ Simulating events in different threads using Geant4
  ◦ Parallel analysis using PROOF Lite and GaudiPython
• Algorithm parallelization
  ◦ Ongoing effort in collaboration with CERN openlab and the ROOT team to provide basic thread-safe/multi-threaded library components
    ▪ Random number generators, parallel minimization/fitting algorithms, parallel/vector linear algebra
• Deployment issues
  ◦ The current batch/grid infrastructure has to be configured to support multi-core/full-node allocation
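
The process-level memory sharing mentioned above typically relies on forking worker processes only after the expensive initialization is done, so that large read-only data is shared copy-on-write between workers. A minimal sketch of the idea in Python, not any experiment's actual framework; the names load_detector_geometry and the toy "reconstruction" are purely illustrative:

```python
import multiprocessing

def load_detector_geometry():
    # Stand-in for an expensive initialization step (geometry,
    # conditions data, field maps, ...): in a real framework this
    # can be hundreds of MB of read-mostly data.
    return [float(i) for i in range(1_000_000)]

def worker(worker_id, geometry, events):
    # With the 'fork' start method each worker inherits 'geometry'
    # from the parent; as long as the data is only read, the OS
    # shares the physical pages copy-on-write, so N workers cost
    # far less memory than N independent copies.  (In CPython,
    # reference counting eventually dirties some shared pages,
    # which is why real frameworks keep bulk data in read-only
    # buffers rather than ordinary Python objects.)
    while True:
        event = events.get()
        if event is None:                       # sentinel: no more work
            return
        result = sum(geometry[:100]) + event    # toy "reconstruction"
        print(f"worker {worker_id}: event {event} -> {result}")

if __name__ == "__main__":
    multiprocessing.set_start_method("fork")    # Linux/Unix only
    geometry = load_detector_geometry()         # initialize ONCE, before forking
    events = multiprocessing.Queue()
    workers = [multiprocessing.Process(target=worker, args=(i, geometry, events))
               for i in range(4)]
    for w in workers:
        w.start()
    for event in range(20):                     # dispatch events to the pool
        events.put(event)
    for _ in workers:
        events.put(None)                        # one sentinel per worker
    for w in workers:
        w.join()
```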

5 WP9 - Virtualization R&D
• Aims to provide a complete, portable and easy-to-configure user environment for developing and running LHC data analysis, locally and on the Grid, independent of the physical software and hardware platform (Linux, Windows, MacOS)
  ◦ Code check-out, editing, compilation, small local tests, debugging, ...
  ◦ Grid submission, data access, ...
  ◦ Event displays, interactive data analysis, ...
  ◦ Suspend, resume, ...
• Decouples the application lifecycle from the evolution of the system infrastructure
• Reduces the effort to install, maintain and keep up to date the experiment software (CernVM-FS)
• CernVM 1.x (SLC4) and CernVM 2.x (SLC5) released
  ◦ Small (200-350 MB) image
  ◦ Available for all popular hypervisors and on the Amazon Cloud (EC2) (see the sketch after this slide)
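
As an illustration of the EC2 availability, an instance can be launched programmatically with the boto Python library. A minimal sketch under stated assumptions: the AMI id is a placeholder (a published CernVM image id would go there), and the user-data payload is illustrative only, not the real CernVM contextualization format:

```python
import boto.ec2

# Connect to an EC2 region; boto reads AWS_ACCESS_KEY_ID and
# AWS_SECRET_ACCESS_KEY from the environment by default.
conn = boto.ec2.connect_to_region("us-east-1")

# Placeholder id: the id of a published CernVM AMI would go here.
CERNVM_AMI = "ami-00000000"

# User-data is handed to the instance at boot for contextualization;
# this payload is purely illustrative, not the actual CernVM syntax.
user_data = "[cernvm]\norganisation=atlas\n"

reservation = conn.run_instances(
    CERNVM_AMI,
    instance_type="m1.small",
    user_data=user_data,
)
print("started instance:", reservation.instances[0].id)
```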

6 CernVM Users
• ~2200 different IP addresses

7 Web scale using Web technology
[Diagram: CernVM clients served by a hierarchy of proxy servers backed by HTTP servers]
• Proxy and slave servers can be deployed at strategic locations to reduce latency and provide redundancy (a sketch of the client-side failover logic follows this slide)
• Collaboration with the CMS/Frontier deployment: reusing 70+ already-deployed Squid servers, with commercial SimpleCDN as backup
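
On the client side, such a hierarchy amounts to trying an ordered list of proxy groups and falling back to the next group, or to a direct connection, on failure. A minimal Python sketch of that failover logic; the proxy hostnames and URL are placeholders, and this is not the actual CernVM-FS client code:

```python
import urllib.request

# Ordered failover groups of proxies (placeholder hostnames): try
# every proxy in a group before falling back to the next group;
# None means a direct connection with no proxy at all.
PROXY_GROUPS = [
    ["http://squid1.example.org:3128", "http://squid2.example.org:3128"],
    ["http://regional-squid.example.org:3128"],
    [None],
]

def fetch(url):
    """Fetch a URL through the proxy hierarchy, failing over on error."""
    for group in PROXY_GROUPS:
        for proxy in group:
            proxies = {} if proxy is None else {"http": proxy}
            opener = urllib.request.build_opener(
                urllib.request.ProxyHandler(proxies))
            try:
                return opener.open(url, timeout=5).read()
            except OSError as err:
                print(f"{proxy or 'direct'} failed ({err}), trying next")
    raise RuntimeError("all proxies and the direct connection failed")

# Example: data = fetch("http://cernvm-server.example.org/some/file")
```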

8 From laptop to Cloud and Grid
• Ideally, the analysis activity should be a continuum in terms of tools, paradigms, software frameworks, models, ...
  ◦ Identical analysis applications should be able to run the same way on a desktop/laptop, a small cluster, a large cluster and the Grid
• CernVM is a convenient tool in the hands of our end users/physicists and lets them use experiment software frameworks on their laptops
  ◦ with little overhead
  ◦ without the need to continuously download, install and configure new versions of the experiment software
• ATLAS, CMS, LHCb, LCD, NA61, TH, ...
• Physicists seem to like CernVM on their laptops and on the Cloud
  ◦ Can we have it on the Grid, please?

9 Actions from the 1st Workshop
• Multicore
  ◦ Try submission of parallel jobs (multi-threaded, multi-process, MPI) with the existing LSF infrastructure (see the sketch after this slide)
  ◦ Deploy multi-core performance and monitoring tools
  ◦ Run multi-core jobs on the Grid
• Virtualization
  ◦ Transition CernVM beyond the R&D phase
  ◦ Use CernVM images in virtualized batch systems
  ◦ Prototype an 'lxcloud' solution for submitting user jobs using the EC2/Nimbus API
  ◦ Establish procedures for creating trusted images (e.g. CernVM) acceptable to Grid sites
  ◦ Investigate scenarios for reducing the need for public IP addresses on worker nodes (WNs)
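
For the first action, LSF can be asked to place all of a job's slots on a single host. A minimal sketch driving the submission from Python; the queue name and job script are placeholders, while -n, -R "span[hosts=1]" and -x are standard LSF options:

```python
import subprocess

def submit_whole_node_job(script, slots=8, queue="1nd"):
    """Submit a parallel job to LSF, pinning all slots to one host.

    The queue name is a placeholder.  '-n' requests the number of
    slots, '-R "span[hosts=1]"' asks LSF to place all of them on a
    single machine, and '-x' requests the host exclusively, so a
    multi-process job can use every core of the node.
    """
    cmd = ["bsub", "-q", queue, "-n", str(slots),
           "-R", "span[hosts=1]", "-x", script]
    subprocess.check_call(cmd)

# Example: submit_whole_node_job("run_multiprocess_reco.sh")
```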

10 Feedback from the 2nd Workshop (1)
• Experiments requested access to whole nodes
  ◦ This allows them to test the multi-threaded and multi-process applications being developed, as well as their pilot frameworks, which would manage the right mix of jobs to make the best use of the entire node's resources
  ◦ The experiments would take responsibility for ensuring that the node is fully utilized
• Issues
  ◦ The implementation would require end-to-end changes, from the end-user submission framework to the local batch configuration and the Grid middleware
  ◦ The accounting (memory and CPU) and monitoring tools would need to be adapted
  ◦ Better handling is needed for the large files that will eventually result from larger parallel jobs

11 Feedback from the 2nd Workshop (2)
• Service virtualization
  ◦ Virtualization of services is in general well accepted, and the experience so far is very positive
  ◦ In particular, virtualization of the VO Boxes is already planned (at CERN)
• Worker node virtualization
  ◦ The lxcloud prototype development, based on OpenNebula at CERN-IT, is encouraging
• Issues
  ◦ Questions were raised about the lifetime of the virtual machines and the need for an API to control them
  ◦ Experiments expressed some concern about the performance loss on virtual machines, in particular for I/O operations to local disk

12 Feedback from the 2nd Workshop (3)
• Generation of virtual machine images
  ◦ A HEPiX policy document has been prepared establishing the obligations of people providing virtual images
• There is general agreement that the experiment software should be treated independently from the base operating system
• The CernVM File System can be a game changer
  ◦ It could be deployed on standard worker nodes to solve the problem of software distribution
  ◦ Experiments requested support for the CernVM File System (CVMFS) as the content delivery mechanism for adding software to a VM image after it has been instantiated
• LHCb and ATLAS requested IT to host the CernVM infrastructure and provide 24/7 support for it

13 Progress since the Workshop
• Multicore
  ◦ ATLAS, CMS and LHCb have all released "production-grade" parallel multi-process applications
  ◦ Both ATLAS and LHCb are now testing submission of "full-node" jobs
• Virtualization
  ◦ Separate release cycles for CernVM-FS and CernVM
  ◦ Multi-VO support for CernVM-FS (via the automounter)
  ◦ CernVM as a job hosting environment
    ▪ Tested the ATLAS/PanDA pilot running in CernVM/CoPilot on the lxcloud prototype
    ▪ Theory group applications (MC generators) are now running on BOINC/CernVM
  ◦ Developed contextualization tools
    ▪ For the EC2 API and compatible infrastructure
    ▪ HEPiX compatible
  ◦ Benchmarking and performance evaluation of PROOF is under way (a sketch of a PROOF-Lite session follows this slide)
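
For context on the last point, PROOF-Lite runs the PROOF multi-process model on the cores of a single machine and can be driven from PyROOT. A minimal sketch, assuming a ROOT installation with PyROOT; the tree name "events", the file data.root and the selector MySelector.C are all placeholders:

```python
import ROOT

# Open a PROOF-Lite session: the "lite://" URL starts one worker
# process per core on the local machine instead of using a cluster.
proof = ROOT.TProof.Open("lite://")

# Placeholder input: a chain over a hypothetical tree "events"
# stored in data.root.
chain = ROOT.TChain("events")
chain.Add("data.root")

# Route the chain's processing through the PROOF-Lite workers;
# MySelector.C is a placeholder TSelector (the trailing "+" asks
# ROOT to compile it with ACLiC).
chain.SetProof()
chain.Process("MySelector.C+")
```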

14 Summary
• Both WP8 and WP9 are making very good progress in close cooperation with the experiments
• Applications being developed to exploit new hardware architectures and virtualization technology impose new requirements on the computing services provided by the local computer centres and by the Grids
  ◦ We need to be able to submit jobs that require a "whole node"
  ◦ The CernVM infrastructure services must be supported 24/7
• The experiments would like the opportunity to test these new developments on local batch clusters and on the Grid

