RALPP Site Report
HEP SysMan, 11th May 2012
Rob Harper

My talk will be...
– Where we're at now
– Our new stuff, including
  – GridPP purchases
  – DRI networking kit
– Benchmarking and hyperthreads
– Virtual machine infrastructure
– Managing configuration and stuff: cfEngine vs Puppet
– Future stuff

RALPP For Dummies
– Part of SouthGrid
– Staff:
  – Chris Brew (part)
  – Rob Harper (part)
– One cluster serving Tier 2 (85%) and Tier 3 (15%), managed by Torque/Maui
– dCache storage

RALPP CPU

Cluster is currently nominally:
– 2,872 job slots
– 26,409 HS06
Where available, hyperthreading is used to run job slots equal to 150% of physical cores (e.g. a dual-socket, six-core node runs 18 slots rather than 12).

RALPP Storage [chart: storage capacity in TB]

RALPP Storage
– 1,060 TB in production
– Soon to be 1,260 TB

New Stuff: GridPP Purchases
CPU:
– 9 * Viglen/Supermicro Twin² chassis, Intel E5645 based, 48 GB / node
– Using hyperthreads => 648 job slots, 6,208 HS06 (see the arithmetic below)
Disk:
– 5 * Viglen/Supermicro 24-bay storage nodes
– => 200 TB of disk pool
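
Those CPU figures hang together if you assume the usual Twin² layout of four nodes per 2U chassis and dual-socket, six-core E5645s – both assumptions, since the slide only gives the chassis count, the slot count and the HS06 total. A minimal sketch of the arithmetic:

    # Slot/HS06 arithmetic for the GridPP purchase. Only the figures marked
    # "from slide" come from the talk; the node/core layout is assumed.
    chassis = 9                # from slide
    nodes_per_chassis = 4      # assumption: Supermicro Twin^2 = 4 nodes per 2U
    cores_per_node = 2 * 6     # assumption: dual-socket, 6-core Intel E5645
    ht_factor = 1.5            # job slots = 150% of physical cores

    slots = int(chassis * nodes_per_chassis * cores_per_node * ht_factor)
    print(slots)               # 648 -- matches the slide

    hs06_total = 6208          # from slide
    print(round(hs06_total / slots, 2))   # ~9.58 HS06 per job slot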

New Stuff: Networking
DRI money bought us:
– 5 * Force10 S4810 switches
– A heap of 10Gb NICs for older disk pool nodes
– A heap of 10Gb cables
Coming soon: a much reconfigured network...

New Network Layout

Benchmarking & Hyperthreads
– We ran the HS06 benchmark on a heap of nodes with varying numbers of concurrent benchmark jobs
– Going past the number of physical cores did give us some gains

Benchmarking & Hyperthreads
– So we committed 1.5 * physical cores as job slots for some nodes and ran real jobs
– No significant drop in efficiency
– More work done (see the sketch below)
– Many details on the SouthGrid blog
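
The point of the trade-off is throughput rather than per-job speed: with hyperthreading each copy of the benchmark runs slower, but the total work done per node rises. A sketch with invented per-copy scores (the measured numbers are on the SouthGrid blog):

    # Illustrative only: the per-copy HS06 scores below are made up.
    # Total node throughput = concurrent copies * per-copy score.
    per_copy_score = {
        12: 10.0,   # one copy per physical core on a 12-core node
        18: 7.5,    # 1.5 * physical cores: each copy slower...
        24: 5.8,    # every hyperthread loaded
    }
    for copies in sorted(per_copy_score):
        total = copies * per_copy_score[copies]
        print(copies, total)   # ...but totals still climb: 120.0, 135.0, 139.2

On a curve like this, 1.5 * physical cores captures most of the throughput gain, which matches the choice described above.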

Virtual Machines
Current set-up:
– Xen VMs spread between a couple of servers
– Local storage, nothing clever
Currently in test:
– Cluster running Hyper-V (yes, we'll be running Linux VMs on Windows)
– EqualLogic storage over iSCSI – mirroring, etc.

Configuration Management
Already much discussed yesterday, but here's our perspective...
– We currently rely on cfEngine v2
– This is not supported natively on SL6 (or at all)
– Main options seem to be:
  – Crowbar in legacy cfEngine (force it onto SL6)
  – cfEngine v3 – will need configs rewritten
  – Switch to Puppet – will need configs rewritten

Puppet
– Puppet seems to be a strong choice, particularly as other Tier 2s are coming to the same decision
– Not got far yet:
  – We have a working Puppet master with some basic manifests set up
  – We have an SL6 client for test purposes
– Planning to use Puppet for SL6 hosts as we set them up, leaving SL5 kit on cfEngine

Puppet
– Our cfEngine config relies massively on EditFiles functionality
– Puppet does not have this:
  – Can run scripts to do edits (see the sketch below)
  – Can use modules (e.g. iptables) that do the work for you
– We need to learn to think in a different way to take advantage of Puppet
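
For the "run scripts to do edits" route, an idempotent line-edit script like the one below can stand in for a cfEngine EditFiles AppendIfNoSuchLine stanza. It is a minimal sketch – the target file and setting are hypothetical examples – and a module or a fully templated file is usually the more Puppet-like answer:

    #!/usr/bin/env python
    # Ensure a line is present in a file, roughly what cfEngine's
    # EditFiles AppendIfNoSuchLine does. Path and line here are examples.

    def ensure_line(path, line):
        with open(path) as f:
            existing = [l.rstrip("\n") for l in f]
        if line in existing:
            return False               # already present: leave file untouched
        with open(path, "a") as f:
            f.write(line + "\n")
        return True

    if __name__ == "__main__":
        if ensure_line("/etc/sysctl.conf", "net.ipv4.tcp_tw_reuse = 1"):
            print("line appended")
        else:
            print("already present, nothing to do")

Called from a Puppet exec resource (guarded with unless so it only fires when the line is missing), this reproduces the old edit-in-place behaviour while Puppet-native replacements are worked out.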

Things to come...
– Getting network configuration updated
– Start deploying VMs in Hyper-V
– Getting Puppet configuration management running properly
– Start using SL6 as a standard install for services where we have no reason not to
– Improved monitoring