Oxford Site Report
Kashif Mohammad, Vipul Davda

Since Last HepSysman: Grid
DPM head node upgrade to CentOS 7: the DPM head node was migrated to CentOS 7 on new hardware, fully Puppet managed. It went smoothly but required a lot of planning; details are in the wiki: https://www.gridpp.ac.uk/wiki/DPM_upgrade_at_Oxford
ARC CE upgrade: one ARC CE and a few worker nodes have been upgraded to CentOS 7, completely managed by Puppet. ATLAS is still not filling it up.
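As a rough illustration (not the actual Oxford procedure, which is documented in the wiki page above), these are the kind of post-migration sanity checks one might run on a newly Puppet-managed CentOS 7 DPM head node; the namespace path is a placeholder and service names vary with DPM version:

    # Illustrative post-migration checks on a CentOS 7 DPM head node.
    # Hostnames, paths and exact service names are assumptions.

    # Confirm the node still converges cleanly under Puppet
    puppet agent --test --noop

    # Check the core DPM services (names differ between DPM versions)
    systemctl status dpm dpnsdaemon dpm-gsiftp httpd

    # Verify the pool/filesystem configuration survived the move
    dpm-qryconf

    # Spot-check the namespace
    dpns-ls /dpm/physics.ox.ac.uk/home/atlas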

Since Last HepSysman: Local
The main cluster is still SL6, running Torque and Maui.
A parallel CentOS 7 HTCondor-based cluster is ready with a few worker nodes.
There is also a small Slurm cluster.
Restructuring various data partitions across servers.
The Gluster story (next slides).
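For reference, a minimal sketch of how the three batch systems mentioned above can be checked side by side; output and node counts are of course site-specific:

    # Quick status checks for the three local batch systems (illustrative only).

    # SL6 main cluster: Torque server and Maui scheduler
    qstat -B          # Torque server summary
    showq | head      # Maui queue overview

    # CentOS 7 HTCondor cluster
    condor_status -total   # slot summary across the migrated worker nodes

    # Small Slurm cluster
    sinfo                  # partition and node state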

Gluster Story
Our Lustre file system was hopelessly old, and the MDS and MDT servers were running on out-of-warranty hardware.
So the choice was between moving to a new version of Lustre or to something else.
Some of the issues with Lustre:
Requires separate metadata (MDS/MDT) servers.
Needs the kernel modules rebuilt for every kernel update.
At the time there was some confusion over whether Lustre would remain open source.

Gluster Story
Gluster is easy to install and does not require a separate metadata server.
I set up a test cluster and used tens of terabytes of data to test and benchmark it; the results were comparable to Lustre.
Set up the production cluster with an almost default configuration.
Happy, so rsynced /data/atlas to the new Gluster volume.
It still worked OK, so users were allowed in.
It sucked: an ls was taking 30 minutes!
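For context, a minimal sketch of the sort of "almost default" distributed GlusterFS volume setup described here; the server names, brick paths and volume name are purely illustrative, not the actual Oxford layout:

    # Illustrative GlusterFS setup (hostnames, bricks and volume name are made up).

    # Form the trusted pool from the storage servers
    gluster peer probe gluster02
    gluster peer probe gluster03
    gluster peer probe gluster04

    # Create a plain distributed volume across the bricks and start it
    gluster volume create atlasdata \
        gluster01:/bricks/b1 gluster02:/bricks/b1 \
        gluster03:/bricks/b1 gluster04:/bricks/b1
    gluster volume start atlasdata

    # Mount it on a client and copy the data across, as described above
    mount -t glusterfs gluster01:/atlasdata /data/atlas-new
    rsync -aH /data/atlas/ /data/atlas-new/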

Gluster Story
Sent an SOS to the Gluster mailing list and did some extensive googling.
Came up with a number of optimisations, and in the end it worked: performance improved dramatically.
Later added some more servers to the existing cluster, and the online rebalancing worked very well.
Last week we had another issue …
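The slides do not list the exact options that were applied, but as an illustration these are the kind of small-file and directory-listing tunings, plus the expand-and-rebalance steps, that Gluster provides; the values and host/volume names below are examples only, not the settings actually used at Oxford:

    # Example small-file / readdir tunings on an existing volume (values illustrative).
    gluster volume set atlasdata performance.cache-size 1GB
    gluster volume set atlasdata performance.md-cache-timeout 600
    gluster volume set atlasdata network.inode-lru-limit 200000
    gluster volume set atlasdata cluster.readdir-optimize on
    gluster volume set atlasdata performance.parallel-readdir on

    # Growing the cluster: add bricks from the new servers, then rebalance online
    gluster volume add-brick atlasdata gluster05:/bricks/b1 gluster06:/bricks/b1
    gluster volume rebalance atlasdata start
    gluster volume rebalance atlasdata status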

Gluster Story: Conclusion
It doesn't work very well with millions of small files, but I think the same is true of Lustre.
It is supported by Red Hat, and Red Hat developers actively participate on the mailing list.
Our ATLAS file system has 480 TB of storage and more than 100 million files; LHCb has 300 TB and far fewer files. I think the number of files is the issue, as I haven't seen any problems with LHCb.

OpenVAS (Open Vulnerability Assessment System): an open-source vulnerability scanner and manager

OpenVAS (screenshot slides)

Bro + ELK
Network monitoring setup: mirror ports on the Dell Force 10 switch feed a Bro server VM on one oVirt host; Filebeat ships the Bro logs to the ELK stack (Logstash, Elasticsearch, Kibana) running in VMs on a second oVirt host.
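A hedged sketch of the log-shipping leg of such a setup: deploying Bro with BroControl and pointing a minimal Filebeat configuration at Logstash on the ELK host. Hostnames, ports and paths are assumptions, not the actual Oxford configuration, and the Filebeat config keys shown are for the 5.x-era syntax:

    # Illustrative only: hostnames, ports and paths are placeholders.

    # Deploy the Bro configuration and check the workers are running
    broctl deploy
    broctl status

    # Minimal Filebeat config shipping Bro logs to Logstash on the ELK host
    cat > /etc/filebeat/filebeat.yml <<'EOF'
    filebeat.prospectors:
      - input_type: log
        paths:
          - /opt/bro/logs/current/*.log
    output.logstash:
      hosts: ["elk-host.example:5044"]
    EOF
    systemctl restart filebeat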

Bro + ELK (screenshot slide)