Computer Operations Group Data Centre Facilities Hitendra Patel June 2015.

Presentation transcript:

Computer Operations Group Data Centre Facilities Hitendra Patel June 2015

R89 Capacity
Computer rooms: HPD, LPD (dual DX CRACs), UPS (dual DX CRACs)
Rack capacity: total 340, in use 270
Power: 8 MW, in use 980 kW
UPS: 600 kVA
Cross-linked transformers
HVAC: 4 x 750 kW each (N+1), air cooled (glycol), with 26 CRACs including dual DX CRACs
In-row 12 each
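The capacity figures on this slide can be turned into utilisation percentages. The following is a minimal Python sketch that uses only the numbers quoted above; the function name `utilisation` is an illustrative choice, not something from the presentation.

```python
def utilisation(in_use: float, total: float) -> float:
    """Return in-use capacity as a percentage of total capacity."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * in_use / total


# Figures taken from the slide: 270 of 340 racks, 980 kW of 8 MW (8000 kW).
rack_pct = utilisation(270, 340)
power_pct = utilisation(980, 8000)

print(f"Racks in use: {rack_pct:.1f}%")
print(f"Power in use: {power_pct:.2f}%")
```

On these figures rack space is far closer to its limit than power is.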

Operations Tasks And Challenges
Staff Management Capacity Management Preventive Monitoring Documentation Security Management Visitors / VIP Project Management Repairs Energy Management CCTV / Access Control Vendor Management Reporting Drills Performing DCIM Mini Training Skills Risk Analysis Health And Safety Incident Response Change Management Reputation

June 2009 Water leak
Effect:
- Water leakage from AHU on the first floor identified
- Water leaked into the SL8500 tape robot
- Water leak missed a rack in the HPD computer room
Solution:
- Modified with overflow drainage
- Installed water detection system linked to automatic shut-off of the water on the 1st / 2nd floor
- Installed bund under the 1st floor kitchen

Water Leak

August 2009 Air Conditioning failure
Effect:
- Complete air conditioning auto-shutdown
- All racks in the HPD computer room shut down
- UPS and LPD computer rooms survived because of dual CRACs (DX and chilled-water)
- Cause was a faulty sensor
- HVAC system managed by BMS
Solution:
- BMS reconfigured to act as slave/notification only
- Complete temperature sensors installed and alert notification set up

Air Conditioning Failure

November 2009 Noise in the UPS power supply
Effect:
- Some EMC power supplies detecting auto-shutdown
- Critical Tier1 database racks affected
- UPS supply under-loaded (20% usage)
Solution:
- 100m power cable installed to reduce noise *DID NOT WORK*
- Purchased 4kVA isolating transformers for each supply to the rack *WORKED*

4kVA isolating transformer

June 2010 Dust contamination in the HPD Computer Room
Effect:
- Pipe lagging on the chiller water ring mains coming off
- Orange dust contamination in the room
- Health & Safety issues
Solution:
- Access limited to HPD. Masks must be worn
- Replaced ALL the lagging
- Underfloor/overhead cleaning implemented and CRAC filters replaced
- Routine checks on the lagging

HPD Dust Contamination

July 2010 Transformer TX2 tripped
Effect:
- Loss of power to racks and CRACs fed from E circuit
Solution:
- Transformers cross-linked (dual) so power switched to another transformer. *Manually switched over*
- Fault traced to Restricted Earth Leakage. Changed the error rate

December 2010 Distribution Board over-heated
Effect:
- Burning smell from Distribution Board (DB)
- Temperature 105 °C
Solution:
- Emergency shutdown of the DB
- Fault with Active Filter within the DB. Switched off Active Filter and switched DB back on
- Temperature sensors installed in all DBs. Notification via Pager/SMS
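The per-board temperature monitoring described above amounts to a threshold check on each sensor reading. The following is a rough Python sketch of that idea. The 60 °C threshold, the board names and the sample readings are assumptions made for illustration (only the 105 °C figure comes from the slide), and the actual pager/SMS delivery is left out.

```python
from typing import Dict, List

# Assumed alert threshold in degrees Celsius; the slides do not state one.
ALERT_THRESHOLD_C = 60.0


def boards_over_threshold(readings_c: Dict[str, float],
                          threshold_c: float = ALERT_THRESHOLD_C) -> List[str]:
    """Return the names of distribution boards at or above the threshold."""
    return sorted(name for name, temp in readings_c.items()
                  if temp >= threshold_c)


# Hypothetical readings; 105.0 echoes the temperature seen in the incident.
readings = {"DB-01": 34.5, "DB-02": 105.0, "DB-03": 41.0}
alerts = boards_over_threshold(readings)

for board in alerts:
    # A real deployment would hand this message to a pager/SMS gateway.
    print(f"ALERT: {board} at {readings[board]:.1f} C")
```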

Year 2011 Fault-free year

November 2012 RAL Power Cut
Effect:
- Site-wide power cut across RAL
- Loss of power to R89 Data Centre
- R89 Generator tripped
- UPS shutdown
Solution:
- Generator fault with Restricted mechanical board end
- UPS monitoring software deployed
- Console room and Operations room on UPS feed
- R89 Lights on UPS feed. Health & Safety issue

November 2012 Planned Essential board upgrade
Effect:
- Upgrade Essential board from 400 amp to 630 amp
- New DX dual CRACs installed in LPD Computer room
- Temporary supply to UPS feed installed to avoid downtime
- Move BMS panel to Essential Board
- Only UPS risk
At Fault:
- Engineers forgot to reconnect the neutral conductors
- Power surge to UPS feed
- Damage in excess of £250K

November 2012 Generator Test
Effect:
- Generator failed to start
- UPS Computer room on UPS feed with NO power!
- Bus-coupler failed to close when fuse put back in
Solution:
- Bus-coupler manually switched in and normal power restored
- Fault traced to a faulty battery in the transformer. Could have been a result of the power cut in early November
- All internal batteries removed from the transformers, which are now powered from a central source. All batteries now monitored
- Generator tested regularly: on-load quarterly and off-load monthly
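The testing cadence above (on-load quarterly, off-load monthly) can be laid out as a simple yearly plan. The following is a Python sketch under two assumptions that the slides do not confirm: the on-load tests fall in March, June, September and December, and an on-load test replaces the off-load test in those months.

```python
# Assumed quarter-end months; the slides only say "quarterly".
ON_LOAD_MONTHS = (3, 6, 9, 12)


def generator_test_plan(on_load_months=ON_LOAD_MONTHS):
    """Map each month number (1-12) to the generator test planned for it."""
    plan = {}
    for month in range(1, 13):
        # Assumption: the on-load test stands in for that month's off-load test.
        plan[month] = "on-load" if month in on_load_months else "off-load"
    return plan


plan = generator_test_plan()
for month, kind in plan.items():
    print(f"Month {month:2d}: {kind} test")
```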

Faulty Transformer batteries

November 2013 Planned Essential Board upgrade 2nd attempt
Effect:
- Upgrade Essential board from 400 amp to 630 amp
- New DX dual CRACs installed in LPD Computer room
- Move BMS panel to Essential Board
- Only UPS risk
- Included Electrical Testing of UPS circuits (Electricity at Work Regulations)
Solution:
- Change Management / committee to review the work flow/risks
- Permanent non-UPS supply installed to provide extra resilience
- Temporary power to CRACs in the UPS room
- Dual power supply servers switched to non-UPS to avoid downtime
- Everyone debriefed on the tasks
- Better risk management processes

Summary 2014
- Reviewed Preventive Maintenance Plans
- Developed schedules/cycles for the maintenance required
- Invoked quarterly testing of the Generator/UPS *on-load*
- Routine checks of the lagging and underfloor pipework implemented
- pCOWeb cards installed in CRACs/Chillers and monitored by Nagios and the BMS
- Better understanding of the HVAC system and cross training
- Working together: Estates department and Computer Operations
- Directors understand the core business need of R89 Data Centre

February 2015 Electrical Testing of HPD / LPD Computer Rooms
Effect:
- Electrical Testing of all non-UPS circuits (Electricity at Work Regulations)
- Testing of Distribution Boards (DBs) x 11 in the HPD/LPD
- Circuit testing under floor
Problems:
- Some circuits incorrectly labelled
- Rack PDUs overloaded
Solution:
- Change Management / committee to review the work flow/risks
- Detailed planning on which DBs should be tested to avoid downtime
- Everyone debriefed on the tasks
- Better risk management processes
- Reviewed circuit labelling and processes in place
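The overloaded rack PDUs found during testing suggest a routine check of measured load against each PDU's rating. The following is a minimal Python sketch of such a check. The 32 A rating, the 80% headroom rule, the rack names and the sample currents are all assumptions for illustration; none of them come from the presentation.

```python
from typing import Dict


def overloaded_pdus(loads_a: Dict[str, float],
                    rating_a: float = 32.0,
                    max_fraction: float = 0.8) -> Dict[str, float]:
    """Return PDUs whose measured current exceeds the allowed share of rating.

    rating_a and max_fraction are assumed values, not site policy.
    """
    limit_a = rating_a * max_fraction
    return {name: amps for name, amps in loads_a.items() if amps > limit_a}


# Hypothetical measured currents in amps.
measured = {"rack-01": 18.2, "rack-02": 27.9, "rack-03": 25.0}
overloaded = overloaded_pdus(measured)

for name, amps in overloaded.items():
    print(f"{name}: {amps:.1f} A exceeds the assumed limit")
```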

Question time?