Presented by Joe Soroka

Presentation transcript:

RAMPS©: Reliability, Availability, Maintainability, Predictability, Scalability. Presented by Joe Soroka. For additional information visit www.totalsitesolutions.com

While budgets may be tighter, the requirement for maximum uptime has not gone away. The design of your facility is only one piece of the pie that will affect your site’s uptime. It is important to be aware of how Reliability, Availability, Maintainability, Predictability and Scalability all affect your site’s uptime.

RELIABILITY Reliability is the ability of a system to perform and maintain its functions in routine circumstances, as well as hostile or unexpected circumstances

Reliability – What is reliability? Modeling: Weibull; Markov; reward modeling; IEEE Gold Book. Procedures: accurate, confirmed/tested. Equipment selection: generator; UPS systems; EPO systems; switchgear; monitoring systems.

Reliability: reliability modeling; equipment; commissioning; operations and maintenance.

Reliability – Bathtub curve of reliability. Infant mortality: burn-in/load testing; commissioning. Useful life: proper maintenance. End of life: identify and replace equipment prior to entering this period.
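The three regions of the bathtub curve are often illustrated with a Weibull hazard (failure-rate) function, which an earlier slide lists among the modeling tools. The sketch below is only an illustration: the shape and scale values are hypothetical and are not taken from the presentation. A shape below 1 gives a falling failure rate (infant mortality), a shape of 1 a constant rate (useful life), and a shape above 1 a rising rate (end of life).

```python
def weibull_hazard(t, shape, scale):
    """Weibull failure rate h(t) = (shape/scale) * (t/scale)**(shape - 1), for t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


# Hypothetical parameters, chosen only to show the three regions of the curve.
SCALE_HOURS = 1000.0
for label, shape in [("infant mortality", 0.5),
                     ("useful life", 1.0),
                     ("end of life", 3.0)]:
    early = weibull_hazard(100.0, shape, SCALE_HOURS)
    late = weibull_hazard(900.0, shape, SCALE_HOURS)
    if late < early:
        trend = "falling"
    elif late > early:
        trend = "rising"
    else:
        trend = "constant"
    print(f"{label}: h(100)={early:.6f}, h(900)={late:.6f} ({trend})")
```

Burn-in, load testing and commissioning aim to get equipment past the falling part of the curve before it carries critical load.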

Reliability – The reliability of a system is no greater than that of the weakest component in a series system. In a complex system you need to identify and quantify the importance of each component in the system. A reliability block diagram is a graphical representation of the components of the system and how they are related to reliability.
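A reliability block diagram can be evaluated numerically once each block has a reliability figure. The sketch below assumes independent failures, which real systems only approximate, and the component values are hypothetical. It shows that a series chain is never more reliable than its weakest block, while redundant (parallel) blocks are more reliable than either one alone.

```python
from math import prod


def series(reliabilities):
    """All blocks must work: multiply the reliabilities (independence assumed)."""
    return prod(reliabilities)


def parallel(reliabilities):
    """At least one block must work: 1 minus the product of the unreliabilities."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


# Hypothetical reliabilities for one power path: switchgear, ATS, UPS.
single_path = series([0.99, 0.995, 0.98])
# Two such paths feeding dual-cord equipment.
dual_path = parallel([single_path, single_path])

print(f"single path: {single_path:.4f}")  # below the weakest block (0.98)
print(f"dual path:   {dual_path:.4f}")
```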

Reliability – Many of the reliability design ideas share a common philosophy with those recommended for availability, because there is a very close relationship between the two. While reliability is about how long an application runs between failures, availability is the ability of a system to tolerate failures and how long it is accessible to the users. When a system’s components and services are highly reliable, they cause fewer failures from which to recover and thereby help increase availability.

Reliability – Equipment: major manufacturers; past experiences; local maintenance support; parts distribution centers; fine line between leading edge and bleeding edge; formal submittal review meetings.

Reliability – Equipment: generator isolation valves; ATS bypass; TVSS indicators and alarms; lightning protection; EPO systems (wiring, control relays, covers, diagrams, testing, day 2 changes).

Reliability – Equipment, Generators: redundant batteries; battery monitoring; fuel level monitoring; water jacket heater isolation valves; silicone heater hoses; coolant level pre-alarms, both cores; water separators (Racor filters) with alarms; engine diagnostic link.

Reliability – Equipment, UPS systems: dual input; maintenance bypass cabinet; advanced monitoring; battery monitoring; redundant battery strings for VRLAs; site-specific procedures.

Reliability – Equipment, Automatic Transfer Switches (ATS): maintenance bypass or wrap-around breakers; phase sync monitoring; pause neutral/dual solenoids; monitoring. Transient Voltage Surge Suppression (TVSS): indication of operation; surge counter.

Reliability – Equipment, EPO systems: Wiring should be in conduit and not in open plenum. Control relay coils should not be energized until activation. Secondary covers should be installed over the EPO buttons. Schematic diagrams should be detailed and accurate. The system should be designed so it can be tested. The system should be capable of accepting day 2 changes without risk. The EPO system should be part of an engineered drawing and not a cloud saying “by others”.

Reliability – Equipment, Thermal runaway: Increased heat density: reduces the time to thermal runaway; increases the need for a reliable HVAC system. Specialized HVAC systems: possibly switching from emergency to UPS power. Long UPS battery runtimes may be unclear. Rack layout and equipment airflow direction: cold/hot aisle; enclosed hot aisles. Rack type: doors; vents; fans.

Reliability – Equipment, Water storage: Chilled water: in the event of a power outage or temporary chiller failure, do you have the capability to ride through? Makeup water: how reliable is the city water supply? Do you have diverse sources: water storage tanks; well; other water sources?

Reliability – Commissioning: With each project being unique, there is a need to determine how much commissioning is appropriate for the project. Factors that influence this decision include: the building’s mission-criticality; the facility’s use or purpose; complexity of the building’s systems; building type and size; project type, whether existing building system or retrofit, or both; building tenant or occupant demographics; system reliability requirements; the owner’s objective in commissioning the building (IAQ, system reliability and/or energy efficiency); project budget.

Reliability – Operation and Maintenance: Use a pilot/copilot approach: commercial airplanes do not fly with just one pilot, so why would you? Standardize as much as possible: standard procedures; standard process. Use a Computerized Maintenance Management System (CMMS): timely reports and schedules; accurate information; archive of past performance; instant access to information.

AVAILABILITY Availability is the ability of a system to tolerate failures. It refers to the time that a system is available to its users. This means the process continues to be served through the failure and that, ideally, the failure is transparent to the user.

Availability: design; resources; procedures.

Availability – Availability is typically expressed by the number of nines.

Availability   # of nines   Downtime per year
90%            1 nine       36.5 days
99%            2 nines      3.65 days
99.9%          3 nines      8.76 hours
99.99%         4 nines      52 minutes
99.999%        5 nines      5 minutes
99.9999%       6 nines      31 seconds
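The downtime figures follow directly from the availability percentage: downtime per year is the unavailable fraction multiplied by the length of a year. A minimal sketch, assuming a 365-day year (the function name is illustrative):

```python
HOURS_PER_YEAR = 365 * 24  # assumes a 365-day year


def downtime_hours_per_year(availability_percent):
    """Hours per year a system is unavailable at the given availability."""
    return (1.0 - availability_percent / 100.0) * HOURS_PER_YEAR


for pct in (90, 99, 99.9, 99.99, 99.999, 99.9999):
    hours = downtime_hours_per_year(pct)
    print(f"{pct}% -> {hours:.4f} hours/year ({hours * 60:.2f} minutes)")
```

The slide's values for four or more nines are rounded down from the exact figures this calculation produces.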

Availability – Failures can be attributed to the following causes. Design failures: this class of failures takes place due to inherent design flaws in the system. In a well designed system, this class of failures should make a very small contribution to the total number of failures. Infant mortality: this class of failures causes newly manufactured hardware to fail. It can be attributed to manufacturing problems such as poor soldering or leaking capacitors. These failures should not be present in systems leaving the factory, as these faults will show up in proper factory system burn-in tests.

Availability – Random failures: random failures can occur during the entire life-cycle of a system. These failures can lead to system failures. Redundancy is provided to recover from this class of failure. Wear out: once a hardware module has reached the end of its useful life, degradation of component characteristics will cause hardware modules to fail. These types of faults can be weeded out by preventive maintenance and routing of hardware.

Availability – Design: designing systems with sufficient levels of redundancy; eliminating single points of failure. Availability design guidelines: consult your engineer; TIA Standard – TIA 942; Uptime Institute – Tier Definition.

Availability – Design: System design should have multiple paths, active or passive, depending upon the site reliability requirements. If redundant paths need to be value-engineered (VE’d) out to meet the project budget, consider adding the breaker or valve now and, later when the budget allows, adding the actual feed. By adding the breaker or valve up front you will be able to install temporary cable or piping when an emergency arises.

Availability – Design: When performing maintenance that decreases system redundancy, move the reduction of availability away from the critical load and toward the utility as much as possible. For example, if you have a system-plus-system design and you are going to take the UPS out of service for maintenance, do not just open the UPS system and let the downstream dual-cord devices and static transfer switches handle the loss of redundancy. Place the UPS in maintenance bypass to continually feed the second source with stable power. Better yet, place the UPS on generators or an alternate UPS supply to avoid sending unprotected utility power to the critical load.

Availability – Resources: Technical resources: operation staff; response staff; maintenance and repair staff. Parts: onsite spares; manufacturer spares; vendor spares; supply houses.

Availability – Resources, Operation staff: Whether you are using in-house or contracted staff, it is important to ensure they have the proper resources. Proper access to the facility: if using a key card system, what happens when the card readers lose power? Who has the keys? Do you have all of your operation staff’s phone numbers: cell numbers and home numbers; company and personal emails?

Availability – Resources, Response staff: Types of emergency responses: additional operation staff; electrical, mechanical and plumbing contractors; general construction; testing and repair firms; fire and security; hazardous material spill. List of suppliers and vendors: emergency contact information; alternate contact information; contracts in place to execute after-hours support. Meet them before an emergency arises; have them at the site for lunch.

Availability – Resources, Maintenance and repair staff: Do you have the necessary contracts in place? Is there maintenance your operation staff can perform in house? Do you have alternate contact numbers for your maintenance providers? Do they have proper access to the facility? Do you have a second string waiting on the sidelines in case of an emergency?

Availability – Resources, Parts and supplies: Define and assess critical parts. Stock critical parts onsite. Have an annual budget for spare parts that increases a little each year. Verify that your vendors and contractors have spare parts handy. Identify supply houses and suppliers that have the parts you need. Have after-hours phone number(s) to get parts from supply houses. Have contracts in place and make sure they are active.

Availability – Procedures: operation; maintenance; emergency; troubleshooting.

Availability – Operation procedures: Have detailed procedures that are developed specifically for your site. Procedures should be tested and verified. Procedures should be inventoried and updated regularly. Operating procedures should be placed at the point of use and not locked up in the building manager’s office.

Availability – Maintenance procedures: Have detailed procedures for maintenance. Ask your maintenance provider to furnish all of the required maintenance procedures prior to performing maintenance, so you can review and comment on them. Use detailed procedures during your maintenance activities. Review procedures after the maintenance has been completed.

Availability – Emergency procedures: In case of an emergency, where are your procedures? Can you access them? Are they at multiple locations? During an emergency is not the time to try to figure out how to restore a system. Perform dry runs on the procedures at least once a year. Update and change as required.

Availability – Troubleshooting procedures: Manuals: available; correct. Drawings: available and complete; as-builts. Develop troubleshooting flow diagrams.

MAINTAINABILITY Maintainability is defined as the probability of performing a successful repair action or preventative maintenance within a given time. In other words, maintainability measures the ease and speed with which a system can be restored to operational status.
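One common way to put a number on this definition is to assume repair times are exponentially distributed with a mean time to repair (MTTR), which gives M(t) = 1 − exp(−t / MTTR). That distribution, the function names and the figures below are assumptions made for illustration and are not part of the presentation. The second function shows how repair speed feeds back into availability through the usual steady-state ratio MTBF / (MTBF + MTTR).

```python
import math


def maintainability(t_hours, mttr_hours):
    """Probability a repair completes within t_hours (exponential repair times assumed)."""
    return 1.0 - math.exp(-t_hours / mttr_hours)


def inherent_availability(mtbf_hours, mttr_hours):
    """Steady-state availability from mean time between failures and mean time to repair."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical figures: a 4-hour MTTR and a 4-hour maintenance window.
print(f"M(4 h) with MTTR 4 h: {maintainability(4, 4):.3f}")
# Halving the repair time raises availability for the same failure rate.
print(f"A with MTTR 4 h: {inherent_availability(8756, 4):.5f}")
print(f"A with MTTR 2 h: {inherent_availability(8756, 2):.5f}")
```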

Maintainability: design; equipment; staff; location; maintenance program; training; coordination; maintenance windows.

Maintainability – Design, Goals of maintainability: Maximize efficiency and accuracy of on-line replacement of system components. Facilitate and minimize troubleshooting time at each level of maintenance activity. Allow test, checkout, troubleshooting and repair procedures to be unit-specific and structured to aid in identification of faulty units, then sub-units. Reduce downtime. Provide easy access to malfunctioning components. Allow for a high degree of standardization. Minimize time and cost of maintenance training. Simplify new equipment design and shorten design time by using previously developed, standard building blocks.

Maintainability – Design: equipment access; labeling. Minimize troubleshooting time: monitoring; procedures; standardization; test and service points.

Maintainability – Design, Equipment accessibility: Accessibility refers to the relative ease with which a system can be accessed: sufficient clearance to use the tools needed to complete the tasks; adequate space to permit convenient removal and replacement of components; adequate visual exposure to the task area; adequate safety and working clearances; adequate space for required rigging equipment; adequate hallway, corner and door clearances back to the loading dock.

Maintainability – Design, Ease of removal and replacement: Equipment rooms should be designed so that rapid, safe and easy removal and replacement of malfunctioning components can be accomplished by one technician, when possible. With space at a premium in a data center, the tendency is to design the equipment room to the minimum code requirements. This saves space in the design but in many cases increases the time required to maintain and repair a system. These minimum clearance spaces will cost more in the long run. For example: 1. Safety is hampered when working with minimum clearance: backing into another panel and tripping a breaker, or tripping on a housekeeping pad and landing on a rotating pump. 2. Downtime increases: either a part is not changed in time because it is difficult to replace, or during an outage the time to repair is increased due to space limitations.

Maintainability – Design, Labeling: Labeling should: identify a specific device; identify the purpose or function of a specific device; present critical information; present safety information; be legible; use contrasting colors. Ensure that your labeling is controlled to ensure its accuracy and standardization: periodic inspections and examinations; accuracy of identification required; time available for recognition; location and distance at which identification must be read; level and color of illumination; criticality of the function identified; label design and identifying information used within and between systems.

Maintainability – Design, Minimize troubleshooting time: comprehensive monitoring; procedures; standardization; test and service points.

Maintainability – Design, Monitoring capabilities: event notification; event reconstruction; event mitigation; determine maintenance frequencies; allow for accurate and efficient communication of events.

Maintainability – Design, Monitoring: What type of monitoring system do I need? No monitoring: not recommended for any mission-critical facility. Remote Alarm Status Panel (RASP): no trending or time stamping; gives visual and audible notification; usually for one device or system. Monitoring with dry contacts: limited number of points; limited time stamping; status is either on or off. Serial interfaces: comprehensive data; data points with values rather than on/off; flexible and expandable.

Maintainability – Design, Procedures: Emergency Operating Procedures (EOP): developed for failure modes; readily available for use, located at point-of-service; should be developed and tested during the commissioning phase; detailed to switch level; updated with any changes discovered. Method Operating Procedure (MOP): developed for all operations; have back-out procedures included; use with the pilot/copilot approach.

Maintainability – Design, Procedures: Troubleshooting procedures: troubleshooting flow charts; restoration procedures. Maintenance procedures: detailed procedures; include measurement points for future trending; used and completed during maintenance.

Maintainability – Design, Procedures: Common procedure error traps: in-field decisions; vague instructions; undefined or uncommon terms; burdensome or complex instructions; multiple actions; inconsistent statements or actions; misleading or missing critical information; interfacing with external procedures; lack of ownership; lack of quality assurance review.

Maintainability – Design, Standardization: Standardization ensures consistency and comparability of knowledge and parts. Acronyms: reduce confusion. Manufacturers: reduced spare part counts; familiarization with operations and maintenance. Layouts: increase ease-of-use. Labeling.

Maintainability – Design, Test and service points: Test points provide a means for conveniently and safely determining the operational status of equipment and isolating malfunctions. Test points, strategically placed, make signals available to the technician for checking, adjusting or troubleshooting. Service points provide means for lubricating, filling, draining, charging and similar functions.

Maintainability – Design, Test and service points: General principles for test and service points: avoiding the need for frequent testing and service; standardization; test and service point compatibility; labeling dangerous test and service compatibility; distinctively different connectors and fittings; location of test, service and adjustment points.

Maintainability – Equipment: Ordering the right accessories with your equipment can make a big difference when it comes to the maintainability of your equipment. When ordering equipment or reviewing design documents, solicit input from your operations and maintenance staff. It’s much cheaper to order it right the first time than to upgrade it later in the field.

Maintainability – Equipment, Generators: water separators for fuel; radiator water level; isolation valves on water jacket heaters; generator-mounted circuit breakers; battery cables; battery monitoring; fuel-level monitor.

Maintainability – Equipment, Switchgear: annual infrared thermal scanning; protective relays; breaker testing. PLC code: hard copy; up-loadable copy; beware of small UPS systems. Station batteries; internal cleaning; mimic bus.

Maintainability – Equipment, Automatic Transfer Switches: Maintenance bypass: order the ATS with a maintenance bypass, or design the system to have a manually operated breaker bypass to wrap around the ATS to both sources.

Maintainability – Equipment, UPS systems: AC filter capacitors: 3-5 years. DC filter capacitors. Transfer circuits: capture the transfer between UPS and bypass. Procedures: detailed PM procedures; capture before and after readings. Calibration/maintenance: capture details; don’t just do a “dust and clean” PM.

Maintainability – Equipment, Batteries: VLA (flooded): vented lead acid; quarterly maintenance. VRLA (sealed): valve-regulated lead acid; semi-annual maintenance. Proper maintenance: float voltage; room temperature; water as required; battery monitoring. Batteries are found in: UPS systems; generators; switchgear; PLCs and breakers; telecom equipment.

Maintainability – Equipment, PDUs: Shutdown alarms: identify and understand them. EPO circuits: if used, are they maintainable? Monitoring: main; sub-panels; branch circuit breakers. Snap-in vs. bolt-in breakers: use bolt-in breakers only. Transformers: K-rated.

Maintainability – Equipment, Load banks: Permanently installed load banks. Generator testing: annual load test; troubleshooting. UPS system testing. Paralleling gear: set-up and calibration.

Maintainability – Equipment, Water source: The alternate water source needs to be capable of supplying water so that the primary water source can be removed for maintenance. Usage metering should be on each water source. Types of alternate water source: city water; wells; storage tanks.

Maintainability – Equipment, Pumps: Alignment: will reduce wear and tear on shafts, bearings and seals; reduce vibration; decrease current draw. Bearings: accessible grease fittings; grease as required. Infrared thermal scanning: motor problems; alignment issues.

Maintainability – Equipment, CRAH/CRAC: Temperature and humidity set points: should be set the same. Humidifiers: have replacements for bulbs and canisters. Filters: use a pre-filter in dirty locations; make sure your dirty filter Differential Pressure (DP) switch is set correctly. Alignment: proper alignment will reduce wear on the shaft and bearings. Bearings: grease when required. Infrared thermal heat scan. Refrigerant leaks can activate fire alarms.

Maintainability – Staff: Dispatched service: verify your vendor’s qualifications as a company; request resumes of the people performing work at your site; review their technical aptitude; verify your vendor’s training programs. Onsite operation and maintenance staff: verify that they are managed correctly (in-house or contracted); verify your staff’s resumes and qualifications; verify training programs.

Maintainability – Location: Location of and access to valuable resources is important when situations arise. 3:00 am Sunday morning is not the time to try to locate the fuses required to get your site up and running. There are various resources you should consider before the need arises: equipment; technicians; parts; procedures; manuals; drawings.

Maintainability – Training: It is important that your operation and maintenance staff is adequately and regularly trained. When an emergency occurs they should have the confidence and experience to complete the task at hand. Available training methods: self-paced; classroom; web-based; manufacturer’s training; on-the-job training; procedure development; training module development; test beds; simulators.

Maintainability – Coordination: Work activities: it is important to closely coordinate maintenance activities to maintain a reliable, efficient and safe working environment. During outage windows we have the tendency to plan too many activities at once. Make sure you don’t have too many people working in the same space at once.

Maintainability – Coordination: Pay particular attention to the planning of your maintenance activities. CRAC units: refrigerant leaks will activate the fire systems; make sure you disable the fire system* prior to charging a system. Under-floor cleaning: can activate the fire alarm system; make sure you deactivate the fire alarm system* before you start to clean under the floor. There are other maintenance activities and tests that could mistakenly set off the fire alarm system. *When you disable a fire alarm system, make sure you follow the procedures required by OSHA, NFPA, local authorities, your company and your insurance underwriter. This could include, but is not limited to: additional fire extinguishers, posting a fire watch, notification, special procedures, and tagging.

Maintainability – Coordination, Maintenance activities: If you are planning to transfer your UPS to a generator maintenance bypass to perform maintenance on the UPS, PM the generator first. If you are planning to perform an open transfer to the building electrical system, inspect your UPS batteries first. Be aware of maintenance activities on building-wide systems that can affect the data center: chillers; pumps; electrical service.

Maintainability – Maintenance windows: Downtime vs. reduced reliability. Reduction in reliability: design the system to have various maintenance capabilities; move away from critical loads and towards the utility. “Make sure you plan your maintenance windows carefully between IT and Facilities.”

Maintainability Maintenance Windows IT maintenance windows are often loaded with IT tasks and therefore are not completely available for facilities tasks. Need to clearly define the true window for facility maintenance. Example: the maintenance window is midnight to 6 am; IT takes an hour to shut down and an hour to start up; the real outage is limited to 1 am to 5 am
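The window arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `facility_window` and the calendar date are hypothetical, and the one-hour shutdown and start-up times are the figures from the slide.

```python
from datetime import datetime, timedelta


def facility_window(start, end, it_shutdown, it_startup):
    """Return the (start, end) of the time actually left for facility work.

    The IT shutdown is taken off the front of the window and the IT
    start-up off the back, as in the slide's example.
    """
    real_start = start + it_shutdown
    real_end = end - it_startup
    if real_start >= real_end:
        raise ValueError("IT tasks consume the whole maintenance window")
    return real_start, real_end


# Midnight to 6 am window (the date itself is arbitrary).
window_start = datetime(2024, 1, 1, 0, 0)
window_end = datetime(2024, 1, 1, 6, 0)

real_start, real_end = facility_window(
    window_start, window_end, timedelta(hours=1), timedelta(hours=1)
)
usable_hours = (real_end - real_start) / timedelta(hours=1)
print(real_start.time(), real_end.time(), usable_hours)
```

Running the same calculation for each planned window makes it easier to see early when the facility tasks will not fit.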

PREDICTABILITY Predictability is the ability to detect the onset of a system failure before it happens Predictive analysis can be performed by: Reviewing PM data Conducting failure analysis Monitoring systems Trending Advanced diagnostics

Predictability Reviewing PM data PM should not only be a time to complete preventative maintenance tasks, but also be used as a diagnostic tool Use detailed PM guides and complete them so they can be reviewed later Review your PM task list and add additional items that can be used to perform predictive analysis Record before and after data. This is important to set baselines and conduct trending

Predictability Conducting failure analysis Event occurs Complete an incident report Incident report should only contain facts of what happened during the event Stabilize the system Repair the system Take accurate and specific notes Take before and after readings Document

Predictability Conduct root cause analysis It is not necessary to prevent the first, or root, cause from happening. It is merely necessary to break the chain of events at any point so that final failure cannot occur. Recommendations Make recommendations to prevent future failures Implement those changes in the failed system and other similar systems Where the fault leads back to an initial design problem, redesign is necessary Where the fault leads back to equipment failure, develop ways to improve the component wear, quality and life Where the fault leads back to a failure of procedures, it is necessary to either address the procedural weakness or to install a method to protect against the damage caused by the procedural failure

Predictability Monitoring systems Install a monitoring system Monitor as much as you can, as long as you do something with the points you select Know what you are monitoring and what affects the points Develop your point list to assist you in predictive analysis Comprehensive monitoring systems will provide you with the best information

Predictability Trending Once your monitoring system is installed, select key points to trend Use your trends to develop replacement and PM intervals Items you can trend: Temperatures Pressure Flow rates Usage Time Consumption Load
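As a rough illustration of how a trended point can feed a PM interval, the sketch below fits a straight line to a series of readings and estimates when the trend would reach an alarm limit. The function names, the bearing temperature readings and the 80 degree limit are all hypothetical, and a straight-line fit is an assumption; real points may not trend linearly.

```python
def linear_trend(times, values):
    """Least-squares straight line through (time, value) readings.

    Returns (slope, intercept) so that value ~= slope * time + intercept.
    """
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need at least two paired readings")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("readings must be taken at different times")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, mean_v - slope * mean_t


def time_to_limit(times, values, limit):
    """Estimate the time at which the trend line reaches `limit`.

    Returns None when the readings are flat or falling.
    """
    slope, intercept = linear_trend(times, values)
    if slope <= 0:
        return None
    return (limit - intercept) / slope


# Hypothetical monthly bearing temperatures recorded during PM visits.
months = [0, 1, 2, 3, 4]
temps = [60.0, 62.0, 64.0, 66.0, 68.0]
print(time_to_limit(months, temps, limit=80.0))
```

The estimate is only as good as the baseline data, which is why recording before and after readings at every PM matters.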

Predictability Advanced diagnostic techniques Infrared thermal imaging Oil analysis Coolant analysis Fuel analysis Ultrasonic analysis Power quality testing Battery impedance testing Vibration testing Motor analysis Eddy current analysis Laser alignment Balancing

Predictability Uses for an IR camera Belt tension Pump alignment Bearings Electrical connections Turbo chargers Roof leaks Poor insulation Room seals

Predictability Unless you are the Predator you will need to use an IR Camera Infrared thermography is the process of developing visual images that represent variations in the IR spectrum. Any object that is above absolute zero emits IR energy. IR spectrum is between 2.0 and 15 microns. IR spectrum falls outside the range of the human eye. IR cameras detect the temperature changes that can potentially mean the presence of conditions or stressors that act to decrease the life of the equipment. The IR camera can have many uses in a data center

Predictability Overloaded Breaker Fuse Connection Loose Cable Defective Breaker

Predictability Pump Alignment Water Under Roof Tank Level Missing Insulation

Predictability Oil analysis Oil analysis is used to define three basic machine conditions Condition of the oil: determines lubricant viscosity, acidity, etc. Lubrication system condition: Have physical boundaries been violated? e.g. fuel in oil Machine condition: determined by looking for wear particles

Predictability Oil analysis Oil condition is most easily determined by measuring the viscosity, acid number and base number. Additional tests can determine the presence and/or effectiveness of oil additives such as anti-wear additives, antioxidants, corrosion inhibitors, and anti-foam agents. Component wear can be determined by measuring the amount of wear metals such as iron, copper, chromium, aluminum, lead, tin and nickel, and can identify when a particular part is wearing. Contamination is determined by measuring water content, specific gravity, and the level of silicon. A change in specific gravity typically indicates the presence of other oil or fuel contamination

Predictability Wear metals and their likely sources in engines and gears
Iron. Engines: cylinder heads, rings, gears, crankshafts. Gears: gears, bearings
Chrome. Engines: rings, liners, exhaust valves. Gears: roller bearings
Aluminum. Engines: pistons, thrust bearings, turbo bearings, main bearings. Gears: pump, thrust washers
Nickel. Engines: valve plating, steel alloy from crankshaft, camshafts. Gears: steel alloy from roller bearings
Copper. Engines: lube coolers, main and rod bearings, bushings, turbo bearings. Gears: bushings, thrust plates
Lead. Engines: main and rod bearings, bushings, lead solder. Gears: bushings, grease contamination
Tin. Engines: piston flashing, bearing overlays, bronze alloy. Gears: bearing cage metal
Silver. Engines: wrist pin bushings, silver solder from lube coolers. Gears: silver solder from lube coolers
Titanium. Engines: gas turbine bearings, hubs, turbine blades. Gears: N/A

Predictability Coolant analysis Regular coolant testing and routine maintenance can help you achieve maximum system efficiency and save you time and money through less downtime. A cooling system is subject to pitting, corrosion, cavitation, erosion and electrolysis. Although coolants are formulated to help prevent these problems from occurring, coolant analysis will identify if they are present and determine if the coolant you're using is providing adequate protection

Predictability Fuel analysis Fuel analysis can point to solutions for filter plugging, loss of power or poor injector performance Testing bulk fuel storage tanks can verify compliance with required supplier specifications

Predictability Ultrasonic inspection Ultrasound refers to sound waves from about 20 kHz to 100 kHz, which cannot be heard by humans. Unlike IR, ultrasound travels a short distance from the source. Ultrasonic detectors can be used to detect component wear, fluid leaks, vacuum leaks and steam trap failures. Even though such a leak may not be audible to the human ear, the ultrasound will still be detectable with the appropriate tool

Predictability Pressure and vacuum leaks can occur in various locations Compressed air Heat exchangers Boilers Condensers Tanks Pipes Valves Steam traps Ultrasonic inspections can detect these small leaks

Predictability Mechanical systems suffer from wear through constant operation, and ultrasonic inspection can detect wear in these systems Mechanical applications Bearings Lack of lubrication Pumps Motors Gear/gearboxes Fans Compressors

Predictability Mechanical devices are not the only devices that emit ultrasonic sound. Electrical equipment will also generate ultrasonic waves if arcing, tracking or corona are present Electrical applications Arcing, tracking and corona Switchgear Transformer Insulators Circuit breakers

Predictability Power quality testing Hardware and software are frequently blamed for all types of problems that may actually originate from within your building’s electrical distribution system: poor power quality. In many cases, the number one indication that you have a power quality problem is intermittent, unexplained technology equipment or process failures. Responding service technicians may complete a work report with the words “no trouble found”

Predictability Impedance testing A substitute for performing a full load test. The internal resistance of a cell can be determined by how that cell responds to a momentary load. The instantaneous voltage drop and the load current applied are used to calculate the resistance. Most cell testers can check the impedance with the battery online or offline
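The calculation described on this slide is Ohm's law applied to the momentary load: resistance equals the instantaneous voltage drop divided by the load current. A minimal sketch follows; the function names and all cell readings are hypothetical, and comparing against a baseline reading is an assumption about how the result would be used, not something a specific tester is known to do.

```python
def internal_resistance_ohms(resting_volts, loaded_volts, load_amps):
    """Internal resistance from the instantaneous voltage drop under load."""
    if load_amps <= 0:
        raise ValueError("load current must be positive")
    return (resting_volts - loaded_volts) / load_amps


def percent_above_baseline(measured_ohms, baseline_ohms):
    """How far a reading has risen above the cell's baseline, in percent."""
    if baseline_ohms <= 0:
        raise ValueError("baseline resistance must be positive")
    return (measured_ohms - baseline_ohms) / baseline_ohms * 100.0


# Hypothetical cell: 2.25 V at rest, 2.20 V under a momentary 50 A load.
r = internal_resistance_ohms(2.25, 2.20, 50.0)
print(f"{r * 1000:.2f} milliohms")

# Hypothetical baseline of 0.8 milliohms recorded when the cell was new.
print(f"{percent_above_baseline(r, 0.0008):.0f}% above baseline")
```

Trending the per-cell result over successive PM visits ties this test back to the baseline and trending practices described earlier.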

Predictability Vibration analysis The level and frequency of the vibration of rotating machinery are not distinguishable to the human touch Can be used to discover and diagnose a wide range of problems related to rotating equipment

Predictability Vibration monitoring can detect: Unbalance Eccentric rotors Misalignment Mechanical looseness or weakness Types of systems that vibration analysis should be performed on: Generators Cooling tower fans Chillers Pumps CRAH/CRAC Air handlers

Predictability Tests used to perform motor analysis Infrared Vibration analysis Surge comparison Motor current signature comparison Motor faults or conditions can be detected Winding short circuits Open coils Improper torque settings As well as other mechanical problems

Predictability Types of motor analysis Surge comparison testing identifies insulation deterioration by applying a high frequency transient surge to equal parts of a winding and comparing the resulting voltage waveforms Motor Current Signature Analysis (MCSA) provides a non-intrusive method of detecting mechanical and electrical problems

Predictability Eddy current analysis Detects surface and subsurface defects Detects variations in alloy, heat treatments, hardness, structure and other physical metallurgical conditions Should be done on chillers each year when the tubes are being cleaned

Predictability Alignment inspection Shafts and pumps should have the proper alignment, which is best accomplished by using laser alignment. When machines are improperly aligned there are added loads on the bearings and couplings, which can result in early and unplanned failures

Predictability Balance Reduces wear and tear on bearings, shafts and motors Imbalance can be detected with the use of infrared cameras and vibration meters Requires balancing equipment to verify and correct balancing

SCALABILITY Scalability is a desirable property of a system which indicates its ability to either handle growing amounts of work in a graceful manner, or to be readily enlarged without impact to operations For example, it can refer to the capability of a system to increase total throughput under an increased load when resources (typically hardware) are added

Scalability What do we want… a flexible, scalable, reliable, highly performing, and highly available computer infrastructure that adapts to a wide range of continuously evolving and challenging demands

Scalability What does it take? Requirements analysis Basis of Design (BOD) Design Modular approach Avoid excessive equipment Pay as you go Expansion techniques

Scalability Good planning and decisions are the foundation of a highly scalable facility. At no point in the lifecycle of a mission-critical facility can you have greater impact on scalability than during the design phase. Start with a Requirements Analysis (RA) of your data center needs. Use the results of your RA to develop a Basis of Design (BOD). The RA and BOD are living documents and you need to update them as changes occur

Scalability Requirements Analysis Requirements analysis Growth modeling takes the hardware platform requirements and turns them into space, power and cooling requirements Considers both current and future technology impacts on space, power and cooling Typically done for 3+ year planning This leads to the critical infrastructure’s BOD

Scalability Basis of Design Roadmap to a reliable and quality-designed site More often than not, the BOD is lacking in detail Defines the requirements of the site Defines the reliability, availability, maintainability, scalability and operational parameters Should be updated regularly

Scalability Designing with scalability in mind Scalability Reduced initial cost Reduced time to install equipment Reduces the requirements of purchasing large systems Not an advantage for fast-growing facilities Modular design can be more precisely matched to reflect: Lower capital investment “Pay as you go” approach Budget/capital constraints Controlled growth Unanticipated growth

Scalability Equipment rooms When possible, design equipment rooms with space for expansion Design hallways, corridors and doors to allow access for new equipment Conserve wall space for future panels and equipment

Scalability Switchgear Expansion breakers Expansion cells Be aware of bussing configuration, use fully-rated bus throughout Use larger frame breakers with adjustable trips Have expansion in your Programmable Logic Controller (PLC) Have access to programming codes Have current backup

Scalability UPS systems Remember Size parallel cabinet and static switch for full build-out If modules are upgradeable, size feeders to full build-out If equipped with sync control cabinet, size for full build-out Remember When you start to add more than 3 modules in parallel, the redundancy begins to drop
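One way to see why redundancy falls as modules are added is a simple k-out-of-n probability model. The sketch below is an illustration under assumptions that are not in the slides: an N+1 configuration, independent module failures, and a hypothetical per-module availability of 0.99. It is not the presenter's calculation.

```python
from math import comb


def prob_at_least_k_of_n(k, n, p):
    """Probability that at least k of n independent modules are available,
    where each module is available with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def n_plus_one_availability(needed_modules, module_availability):
    """Availability of an N+1 system: N modules needed, N+1 installed."""
    return prob_at_least_k_of_n(
        needed_modules, needed_modules + 1, module_availability
    )


# Hypothetical per-module availability of 0.99; the single spare module
# has to cover a growing number of working modules.
for needed in (1, 2, 3, 4, 5):
    print(needed, round(n_plus_one_availability(needed, 0.99), 6))
```

Under these assumptions the system figure declines with every module added, because one spare is shared across more potential points of failure.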

Scalability Critical distribution Dual main input Spare breakers Allow for the possibility of a second source to supply load during cutover or expansion activities Could be used to connect temporary equipment for emergencies Load bank testing Spare breakers Allow for additional PDU and expected new load Up-frame the breaker so that larger loads may be added, e.g. use 400A frame breakers with 225A rating plugs to power PDUs

Scalability Power Distribution Units (PDUs) Typically you run out of circuits before capacity Install junction box below floor to allow for additional power whips. Bottom plates usually do not have enough knock-outs Order PDUs with additional 225A sub-fed breakers to support additional Remote Power Panels (RPPs) Consider in-row PDUs to save space

Scalability EPO systems Plan on the fact that the EPO system will have items added and removed from it EPO should be an engineered device and not a cloud stating "by others" System should be documented Should have an Active, Test and Off mode of operation Installed with isolation relays Centrally located in an EPO control cabinet with room for expansion

Scalability Chilled water systems When possible, up-size piping Have additional valves installed under the floor so you can add CRAH units as needed Have valves installed for additional pumps and chillers Have a valve connection that can be easily hooked-up to a temporary chiller

Scalability Monitoring systems Make sure that the system is expandable Some systems are not upgradeable, while others require adding another module to the communication trunk Make sure you will not be locked in with an uncooperative manufacturer Have access to the programming function and required passwords

Scalability Expansion techniques Implementation of new systems while the facility is in “production” is a business reality The need for hot cutover occurs more often. For safety reasons, hot cutover should be a last resort With proper upfront planning, the need for hot taps and cutovers can be reduced or eliminated

UPTIME Uptime (Ŷ) is a measure of the time a system has been "up", running and available. It came into use to describe the opposite of downtime, times when a system was not operational. ρ = Reliability, ά = Availability, ц = Maintainability, ∏ = Predictability, ∑ = Scalability

Reliability (ρ) is the ability of a system to perform and maintain its functions in routine circumstances, as well as hostile or unexpected circumstances

Availability (ά) is the ability of a system to tolerate failures. Refers to the time that a system is available to its users. This means the process continues to be served through the failure and that, ideally, the failure is transparent to the user

Maintainability (ц) is defined as the probability of performing a successful repair action or preventative maintenance within a given time. In other words, maintainability measures the ease and speed with which a system can be restored to operational status

Predictability (∏) is the ability to detect the onset of a system failure before it happens Predictive analysis can be performed by: Reviewing PM data Conducting failure analysis Monitoring systems Trending Advanced diagnostics

Scalability (∑) is a desirable property of a system which indicates its ability to either handle growing amounts of work in a graceful manner, or to be readily enlarged. For example, it can refer to the capability of a system to increase total throughput under an increased load when resources (typically hardware) are added

UPTIME ρ * ά * ц * ∏ * ∑ = Ŷ
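Read literally, the formula multiplies the five RAMPS factors together. The sketch below assumes each factor is scored as a fraction between 0 and 1; that scoring convention and the sample scores are hypothetical, since the slides do not say how the factors are measured.

```python
from math import prod


def uptime_score(reliability, availability, maintainability,
                 predictability, scalability):
    """Product of the five RAMPS factors, each assumed to lie in [0, 1]."""
    factors = (reliability, availability, maintainability,
               predictability, scalability)
    if any(not 0.0 <= f <= 1.0 for f in factors):
        raise ValueError("each factor must be between 0 and 1")
    return prod(factors)


# Hypothetical scores: a strong design let down by weak predictability.
score = uptime_score(0.99, 0.99, 0.95, 0.70, 0.95)
print(round(score, 3))
```

Because the factors multiply, the result can never exceed the weakest factor, which fits the closing advice to look at more than just the design of the facility.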

Be sure to look at more than just the design of your facility… don’t miss a step. Use RAMPS to achieve maximum uptime!

RAMPS© Reliability, Availability, Maintainability, Predictability, Scalability Presented by Joe Soroka For additional information visit www.totalsitesolutions.com