Welcome
Technical Services Virtual Boot Camp, Session 8
Technical Services India Team
© 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential
Technology
· Architecture Overview
  – UCS C-series
  – UCS B-series
· UCS Interoperability
  – Hardware
  – Software
· Troubleshooting
  – Case Study (Lab Demo)
Q&A
Technology
· Firmware Install and upgrade
  – UCS C-series
  – UCS B-series
· Troubleshooting
  – Case Study (Lab Demo)
  – Important logs
  – Part Identification and RMA
Q&A
Nirmal Sodani, Technical Support Manager
Mohit Mmangal, Manager, CSC
Avinash Shukla, TAC Escalation Engineer
Vinay Sharma, Lead, CSC
Teclus D'Souza, TAC Escalation Engineer
Chetan Badami, Technical Escalation Engineer
Technology – UCS
Avinash Shukla, Teclus D'Souza, Chetan Badami
Agenda
· UCS Upgrade Procedure
  – C-series
  – B-series
· UCS Troubleshooting
  – UCSM / FI / IOM / Blade
  – C-series
· UCS H/W and S/W Interoperability
© 2011 Cisco and/or its affiliates. All rights reserved. Cisco Public BRKCOM
UCS H/W and S/W Interoperability
Avinash Shukla, Cisco TAC
© 2010 Cisco Systems, Inc. All rights reserved. CAE Bootcamp
Operating System
· Check the support matrix before installing the OS on the blade
· Install the drivers (Eth / FC) and keep them updated as per the matrix
· A few important things to check:
  – Is the blade running a certified OS and OS version?
  – Are there any special needs for that OS? (E.g. VMware: OEM image)
  – Are the drivers at the OS level updated and current?
· Answer:
  – UCS S/W and H/W matrix
H/W and S/W Interop
What each matrix provides
Sample driver versions
UCS Upgrades
Agenda
· C series firmware upgrade
  – Pre-requisites
  – Firmware ISO location and downloading
  – Upgrade process
· B series firmware upgrade
  – Pre-requisites
  – Firmware bundles and downloading
  – Upgrade process
· Additions / Modifications from version 2.1
Pre-requisites C Series
Things to consider
· Release Notes cover gotchas and concerns in the upgrade process
· Upgrades from one version back will always work
· Check the release notes about prior versions
  – If the customer is really far behind, it might require two upgrades to get to current code
· Schedule a maintenance window
  – CIMC and server will reboot during upgrade
Firmware ISO
C Series Upgrade Downloading iso file
Upgrade process C Series
Map the iso on the KVM
Boot from Virtual Media
HUU Screen and options
After all component upgrade
Verify Upgrade
· To verify, check that all components are upgraded
Pre-requisites B Series
Things to consider
· Release Notes cover gotchas and concerns in the upgrade process
· Upgrades from one version back will always work
· Check the release notes about prior versions
  – If the customer is running a very old version, it might require two upgrades to get to current code
· Schedule a maintenance window
  – FI and IOM will reboot during upgrade
  – Make sure the network and storage fabrics are redundant
· Highly recommended to back up the UCSM configuration
Be patient
· The upgrade process is not quick
· Bugs found in the FCS release sometimes drive the first release after FCS
· Expect a maintenance release shortly after FCS
· Follow the upgrade procedure for each version
  – The procedure is not always the same from one version to another
Downgrading
· Sometimes there might be data loss
· Might have to erase the config to downgrade
  – Database changes in new versions cannot always be back-ported
Bundles
· Prior to 1.4 there was only one all-inclusive bundle
· Now there are multiple bundles
  – Infra-bundle – contains code for FI, IOM, and UCSM
  – B-series bundle – contains BIOS and blade-specific code
  – C-series bundle – contains BIOS and rack-server-specific code
· All firmware work is done from the Equipment tab in UCSM
· Packages can be viewed/deleted from the “Packages” tab
· Bundles are downloaded from the “Download Tasks” tab
· Downloads can be through the desktop or using ftp/scp/sftp/tftp
Cisco.com to download FCS bundles
· B-Series packages
· C-Series packages
· FI, IOM, and UCSM software
· Pre-1.4 bundles are single-download bundles
Upgrade process B Series
Upgrade Process
· Again, always consult the release notes
· Upgrade through the GUI is easiest
· General process:
  – Back up the UCS config (Full & All Config)
  – Download code
  – Update components
  – Activate components in this order (check the release notes, because the order can change):
    1. Interface cards – Set Startup Only
    2. CIMC
    3. IOM – Set Startup Only
    4. UCSM
    5. FI
Updating Components
· Update means copying the new code to the backup location of all UCSM components
  – It simply stages the new code
· Can update all components at once
· Time to update will vary based on the component
  – IOMs take a long time, up to 5 minutes
· If any component has issues, check the FSM for that component
· The update step does not apply to the FI
· Once everything is in the “Ready” state you can move to Activate
Activate Components
· In this process you activate the code that you copied
· Some code is activated but set to “activate on next reboot”
· Understand that in this stage you can create outages
· Activate the “leaves of the tree” first
  – The BU uses this term to mean that the order should be:
    Interface card = leaf
    CIMC = twig
    IOM = branch
    UCSM = trunk
    FI = root
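The “leaves of the tree” ordering can be sketched as a simple sort. This is only an illustration: the component names and numeric ranks below are an assumed encoding of the slide's leaf-to-root analogy, not a UCSM data structure, and the release notes for a given version may prescribe a different order.

```python
# Activation order from the slide: leaves first, root last.
# The keys and ranks are illustrative assumptions, not UCSM identifiers.
ACTIVATION_RANK = {
    "interface-card": 0,  # leaf
    "cimc": 1,            # twig
    "iom": 2,             # branch
    "ucsm": 3,            # trunk
    "fi": 4,              # root
}


def activation_order(components):
    """Return the component kinds sorted leaf-first.

    An unknown kind raises KeyError rather than being placed silently.
    """
    return sorted(components, key=lambda kind: ACTIVATION_RANK[kind])


if __name__ == "__main__":
    print(activation_order(["fi", "iom", "interface-card", "ucsm", "cimc"]))
```

Raising on unknown component kinds is a deliberate choice here: guessing where an unlisted component belongs in the order could cause an outage.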
Activate Blade Components
· Recommended method is to use policies
  – Host Firmware Policy to apply the latest BIOS, Board Controller, Adapters, etc.
Activate Interface cards
· Check “Set Startup Version Only”
· If you uncheck this box, activation will cause a blade reboot!
Activate CIMC
· CIMC can be activated without disruption to the OS on the blade
· The KVM session will be lost while activating
Activate IOM
· Same as Interface card: “Set Startup Version Only”
· IOM needs to be at the same version as the FI!
Activate UCSM
· Will cause UCSM to disconnect
· Takes a few minutes
Activate Fabric Interconnect
· Recommended to activate one FI at a time
  – A complete outage will not occur
  – Fail one fabric
  – Wait for all network and FC traffic to fail over to the second fabric
· Highly recommended to have an outage window
  – Biggest risk is SAN storage
· The FI will upgrade and reboot
  – Part of the process is to reboot the connected IOMs as well
  – Can take minutes for the FI and all IOMs to come back online
· If there is any failure during the first FI upgrade, STOP! Do not attempt to upgrade the second FI
Activate Fabric Interconnect
· Activate the FI from the Equipment tab
· Upgrade the subordinate first
· Choose the correct Kernel and System Version
· The FI will take a few minutes and then reboot
· IOMs will get updated as well
Verify Fabric Interconnect upgrade
· Make sure the IOMs and FI all match the correct running version
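A minimal sketch of the verification step, assuming the running versions have already been collected by hand from the GUI or CLI into a dictionary. The component names and version strings in the usage example are hypothetical.

```python
def mismatched_components(running_versions, target_version):
    """Return the names of components not running the target version.

    running_versions maps a component name to its running version string.
    An empty result means every listed component matches the target.
    """
    return sorted(
        name
        for name, version in running_versions.items()
        if version != target_version
    )


if __name__ == "__main__":
    # Hypothetical inventory, for illustration only.
    inventory = {"FI-B": "2.1(1a)", "IOM-1/2": "2.1(1a)", "IOM-2/2": "2.0(4b)"}
    print(mismatched_components(inventory, "2.1(1a)"))
```

An empty list is the condition to look for before moving on to the primary FI.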
Upgrade Primary Fabric Interconnect
· Upgrade the Primary FI now using the same process
· UCSM will fail over to the subordinate FI
· You will need to log back in to UCSM
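The “subordinate first, stop on any failure” rule from the preceding slides can be sketched as follows. The `activate` callback is a hypothetical stand-in for whatever the operator does to activate and verify one FI; it is not a UCSM API.

```python
def upgrade_fabric_interconnects(subordinate, primary, activate):
    """Activate the subordinate FI first, then the primary.

    activate is a caller-supplied function (an assumption of this sketch)
    that returns True only when the FI and its IOMs came back healthy.
    Returns (completed, failed): the FIs that were upgraded, and the FI
    that failed, or None if both succeeded.
    """
    completed = []
    for fi in (subordinate, primary):
        if not activate(fi):
            # STOP: do not attempt the other FI after a failure.
            return completed, fi
        completed.append(fi)
    return completed, None


if __name__ == "__main__":
    # Simulate a failure on the subordinate: the primary is never touched.
    print(upgrade_fabric_interconnects("FI-B", "FI-A", lambda fi: fi != "FI-B"))
```

Because the loop returns on the first failure, the primary FI keeps carrying traffic on the old code while the failed FI is recovered.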
Problems
· Biggest concern is a failed IOM upgrade
  – There is no way in the field to upgrade an IOM manually
  – RMA the failed IOM
  – Can attempt a physical reseat of the IOM
· A failed FI upgrade can be recovered
  – Similar to the N5K, this requires access to the console and a tftp server to boot from
  – Refer to the FI recovery method
Host Firmware
· Highly recommended that the blade BIOS match the running UCSM system
· Best way to upgrade the BIOS is through a Host Firmware Policy
  – Create the policy in UCSM
  – Apply the policy to the SP
  – This will reboot the blade, so an outage window is needed
Create Host Firmware Policy
· From Server Tab
Host Firmware Policy
· Note that a Firmware Policy can include
  – Adapters, BIOS, Board Controller, FC Adapters, HBA Option ROM and Storage Controller
· Note how Adapters and FC Adapters can be part of a policy
  – If adapters are part of a policy, they can only be changed as part of the firmware policy
· Recommended to upgrade BIOS and Storage Controller at a minimum
· Board Controller rarely changes and is specific to B230 and B440
Set BIOS versions
· Best to choose all hardware
· Set the BIOS to the latest in the pull-down for each blade/server
· The latest BIOS version will be different for some servers
Add the new Firmware Policy to a SP
· Select the Host Firmware policy
· The blade will reboot once you “Save Changes”
Additions / Modifications from version 2.1
· Firmware Auto Install
  – Install Infrastructure Firmware
  – Install Server Firmware
· We just made it simple to upgrade
Firmware Auto-Install
· Firmware Auto-Install implements package-version-based upgrades for both UCS Infrastructure components and Server components
· Firmware Auto-Install cannot be used to upgrade Management Extensions and the Capability Catalog. These are simple, occasional updates in UCSM and are therefore left under user control
· It is a two-step process: “Install Infrastructure Firmware” and “Install Server Firmware”
  – It is recommended to run “Install Infrastructure Firmware” first and then “Install Server Firmware”
· All existing firmware upgrade mechanisms are retained. Users who do not want to use Auto-Install can continue to use the existing documented way of doing firmware upgrades
Install Infrastructure Firmware (contd)
· This is the sequence followed by “Install Infrastructure Firmware”:
  1. Upgrade UCSM
  2. Update backup image of all IOMs
  3. Activate all IOMs with setstartup option
  4. Activate secondary Fabric Interconnect
  5. Wait for User Acknowledgement
  6. Activate primary Fabric Interconnect
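The sequence above, including the pause for user acknowledgement, can be modeled as a small sketch. It only mirrors the slide's step list; the function name and the boolean flag are assumptions for illustration and do not call any UCSM interface.

```python
# Step names copied from the slide; the gate logic is an illustrative model.
ACK_STEP = "Wait for User Acknowledgement"
INFRA_STEPS = [
    "Upgrade UCSM",
    "Update backup image of all IOMs",
    "Activate all IOMs with setstartup option",
    "Activate secondary Fabric Interconnect",
    ACK_STEP,
    "Activate primary Fabric Interconnect",
]


def steps_run(user_acknowledged):
    """Return the steps reached so far.

    Without acknowledgement the sequence stops at the acknowledgement gate,
    so the primary FI is never activated.
    """
    if user_acknowledged:
        return list(INFRA_STEPS)
    return INFRA_STEPS[: INFRA_STEPS.index(ACK_STEP) + 1]


if __name__ == "__main__":
    for step in steps_run(user_acknowledged=False):
        print(step)
```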
Install Infrastructure Firmware GUI
Install Infrastructure Firmware - Cancelling
· A scheduled “Install Infra” operation can be cancelled
· An “Install Infra” operation that is already “In Progress” cannot be cancelled
· Both GUI and CLI options are available for cancelling
Install Infrastructure Firmware – User Acknowledgement for primary FI
· “Install Infra” expects explicit permission from the user to start the firmware upgrade on the primary Fabric Interconnect. This is necessary to protect the data path for servers
· As part of “Install Infra”, the secondary FI’s firmware is upgraded first. The secondary FI reboots as part of firmware activation
· After the secondary FI comes online, users are expected to check whether the data path is ready for a reboot of the primary FI
· When users have ensured that the data path is ready, they can acknowledge the reboot of the primary FI
Acknowledge Primary FI reboot
Install Server Firmware
· Install-Server offers a way to update multiple host firmware packages using package versions
· It provides the list of Service Profiles that will be affected when a host firmware package is modified. Multiple SPs can use the same host firmware package
· It also provides a final summary of the physical servers that will be rebooted for the set of host firmware packages that are being modified
· Only the GUI is available for "Install Server Firmware". No CLI
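The impact summary described above (affected Service Profiles, then the physical servers that will reboot) can be sketched as a small lookup. The dictionary keys and sample names are hypothetical; UCSM computes this itself in the GUI wizard.

```python
def impact_of_package_change(changed_packages, service_profiles):
    """Summarize what a host firmware package change would touch.

    service_profiles is a list of dicts with the assumed keys
    'name', 'host_fw_package' and 'server' (None when unassociated).
    Returns (affected profile names, servers that would reboot).
    """
    affected = [
        sp for sp in service_profiles
        if sp["host_fw_package"] in changed_packages
    ]
    profiles = sorted(sp["name"] for sp in affected)
    servers = sorted({sp["server"] for sp in affected if sp["server"] is not None})
    return profiles, servers


if __name__ == "__main__":
    # Hypothetical service profiles, for illustration only.
    sps = [
        {"name": "esx-01", "host_fw_package": "hfp-prod", "server": "1/1"},
        {"name": "esx-02", "host_fw_package": "hfp-prod", "server": None},
        {"name": "lab-01", "host_fw_package": "hfp-lab", "server": "1/3"},
    ]
    print(impact_of_package_change({"hfp-prod"}, sps))
```

Unassociated profiles appear in the profile list but not in the reboot list, since there is no physical server behind them.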
Install Server Firmware – Screens 1 to 6
Troubleshooting the Cisco Unified Computing System
Chetan Badami, Cisco TAC
Agenda
Troubleshooting
· UCSM & Fabric Interconnect
  – Fault types
  – Clustering issues
  – Common issues
· Blade Servers
· IOM & Chassis
UCS System Components
· UCS manager
· UCS Fabric Interconnect (6xxx)
· UCS Fabric Extenders (2xxx)
· UCS 5100 Blade Chassis
· UCS B-series servers
· Nexus 2000 switch
· UCS C-series servers
· UCS Network adapters
UCS 6200 Fabric Interconnect (FI)
· Standalone or Clustered
· Primary / Subordinate
· Data Management Engine (DME)
[Diagram labels: FI-A#, FI-B#, IP #A, IP #B, Virtual IP, Management Network, Cluster links, DB]
UCSM
· UCSM GUI
· CLI
    UCS-A# scope server x/y
· NXOS
    UCS-A# connect nxos a
    UCS-A(nxos)# show…
· XML API
Fault Types
Type           Description
fsm            An FSM task has failed to complete successfully, or Cisco UCS Manager is retrying one of the stages of the FSM.
equipment      Cisco UCS Manager has detected that a physical component is inoperable or has another functional issue.
server         Cisco UCS Manager cannot complete a server task, such as associating a service profile with a server.
configuration  Cisco UCS Manager cannot successfully configure a component.
environment    Cisco UCS Manager has detected a power problem, thermal problem, voltage problem, or loss of CMOS settings.
management     Cisco UCS Manager has detected a serious management issue, such as critical services that could not be started.
connectivity   Cisco UCS Manager has detected a connectivity problem, such as an unreachable adapter.
network        Cisco UCS Manager has detected a network issue, such as a link down.
operational    Cisco UCS Manager has detected an operational problem, such as a log capacity issue or a failed server discovery.
Events per Component
    FarNorth-A# scope server ?
      WORD /
      dynamic-uuid  Dynamic UUID
    FarNorth-A# scope server 1/1
    FarNorth-A /chassis/server # show event
UCSM Faults - GUI
[Screenshot labels: Information Fault, Major Fault]
Finite State Machine (FSM)
· Workflow with many stages
· Data Management Engine (DME) … Application Gateway (AG) … End Point (EP)
FSM Details
· Operation (workflow)
· Stage Description
· Error Description for that stage
Contexts
· UCS has three CLI “Contexts”
  – UCSM (GUI equivalent, uses the “scope” command)
  – NXOS (not configurable – read only)
  – Local-Management (file management, tech support, reboot)
Scope
· Scoping: movement to different UCS configuration components
· Details on hardware components are obtained with the connect command
· You want to be on the Primary Fabric Interconnect
    UCS-B# scope ?
      adapter              Mezzanine Adapter
      chassis              Chassis
      eth-server           Ethernet Server Domain
      eth-storage          Ethernet Storage
      eth-traffic-mon      Ether Traffic Monitoring Domain
      eth-uplink           Ethernet Uplink
      fabric-interconnect  Fabric Interconnect
      fc-storage           FC Storage
      fc-traffic-mon       FC Traffic Monitoring Domain
      fc-uplink            FC Uplink
      fex                  FEX (fabric-extender) Module
      firmware             Firmware
      host-eth-if          Host Ethernet Interface
      host-fc-if           Host FC Interface
      license              License
      monitoring           Monitor the system
      org                  Organizations
      power-cap-mgmt       Power Cap Mgmt
      security             security mode
      server               Server
      service-profile      Service Profile
      system               Systems
      vhba                 vHBA
      vnic                 vNIC
Connect - Hardware Troubleshooting
· Connect attaches you to hardware and to read-only NXOS
    FarNorth-B# connect
      adapter     Mezzanine Adapter
      bmc         Baseboard Management Controller (CIMC)
      clp         Connect to DMTF CLP
      iom         IO Module
      local-mgmt  Connect to Local Management CLI
      nxos        Connect to NXOS CLI
    FarNorth-A# connect local-mgmt
      a  Fabric A
      b  Fabric B
      (Defaults to primary)
    FarNorth-A(local-mgmt)# ?
      cd                Change current directory
      clear             Reset functions
      cluster           Cluster mode
      connect           Connect to Another CLI
      copy              Copy a file
      cp                Copy a file
      delete            Delete managed objects
      dir               Show content of dir
      enable            Enable
      end               Go to exec mode
      erase             Erase
      erase-log-config  Erase the mgmt logging config file
      exit              Exit from command interpreter
      install-license   Install a license
      ls                Show content of dir
      mkdir             Create a directory
      move              Move a file
      mv                Move a file
      ping              Test network reachability
      pwd               Print current directory
      reboot            Reboots Fabric Interconnect
      rm                Remove a file
      rmdir             Remove a directory
      run-script        Run a script
      show              Show running system information
      ssh               SSH to another system
      tail-mgmt-log     Tail mgmt log file
      telnet            Telnet to another system
      terminal          Set terminal line parameters
      top               Go to the top mode
      traceroute        Traceroute to destination
· Most dangerous:
  – erase configuration
  – reboot
Connect NXOS
· Used to assist in troubleshooting
  – Very familiar to IOS and Nexus users: all the show commands
· Used to run debugs advised by TAC
· Commands:
  – Show the switch running config (non-server config)
  – Clear interface counters found on the FI
· Cannot be used to configure UCS (read only)
Connect to NXOS
    FarNorth-A# connect nxos ?
      a  Fabric A
      b  Fabric B
· Popular examples:
    show run
    show fex detail
    show interface
    show lacp
    show trunk
    show cdp
    debug
    show npv flogi-table
    show mac-address-table
    FarNorth-A(nxos)# ?
      clear         Reset functions [Only place to clear counters]
      cli           CLI commands
      debug         Debugging functions
      debug-filter  Enable filtering for debugging functions
      ethanalyzer   Configure cisco packet analyzer
      interface     A live capture will start on following interface
      no            Negate a command or set its defaults
      ntp           NTP configuration
      show          Show running system information
      system        System management commands
      terminal      Set terminal line parameters
      test          Test command
      undebug       Disable Debugging functions (See also debug)
      end           Go to exec mode
      exit          Exit from command interpreter
      pop           Pop mode from stack or restore from name
      push          Push current mode to stack or save it under name
      where         Shows the cli context you are in
UCSM – Common issues
· Clustering problems
  – Is the other FI up and operational?
  – Are the clustering links up?
  – Is there at least 1 chassis successfully discovered on both FIs?
· DME
· Commands:
    UCS-A# show cluster extended-state
    UCS-A# show pmon state
    UCS-A(local-mgmt)# cluster lead a
    UCS-A(local-mgmt)# cluster force primary
    UCS-A /monitoring/sysdebug # show cores
Sample – Cluster state
Sample – Process state (pmon)
Agenda
Troubleshooting
· UCSM & Fabric Interconnect
· Blade Servers
  – CIMC/BIOS
  – OBFL/SEL
· IOM & Chassis
Blade servers
Blade overview – Hardware & Software Components
· CPU & Heatsink
· Memory DIMMs
· Mezzanine Adapter
· CIMC
· HDD
Blade servers
Blade overview – CIMC and BIOS
· CIMC
  – Monitors temperature and power readings
  – KVM & vMedia
  – Blade control
· BIOS
  – Can be configured via F2 or via BIOS Policy
OBFL
· The Onboard Fault Log stores hardware logs on the different components, saved at the time of the issue
· Alternatively, the logs can be viewed by connecting to the internal component end device
· Show tech-support will capture the required logs for support
System Event Log (SEL) - Events Supported
· Server BIOS events
· 3 kinds of equipment end-points:
  – Memory Unit (DIMM): ECC errors, Address Parity, Memory Mismatch
  – Processor Unit: Memory Mirroring, Sparing, SMI Link errors
  – Motherboard: PCIe, QPI uncorrectable errors, Legacy PCI errors
· All these errors are modeled as stats properties. The ones for which thresholds are not defined are reported as statistics only
· BMC, BIOS, and OS log platform errors to the CIMC’s System Event Log (SEL) buffer
· POST and run-time errors
· Used as an effective health monitoring tool
System Event Logs
· Make sure that servers are discovered
· Make sure the backup destination path is valid
· Can be done via CLI also
· System Event Logs = Management Logs on earlier releases
[Screenshot labels: Chassis, Server]
CIMC Booting Problems - Blades
· Corrupt CIMC firmware
· POST failure
· Not completing boot
· Connect to CIMC in band manager to diagnose
· View logs, collect tech-support, monitor KVM output
· Manually reboot CIMC
· Fault codes:
Connecting to CIMC Debug Utility
· Used to verify the health of a blade when UCSM data is in question and you want to look at the lowest level of blade data points
· Used to determine blade component issues at the source
    UCS-A# connect cimc 1/1
    Trying
    Connected to
    Escape character is '^]'.
    CIMC Debug Firmware Utility Shell
    ____________________________________
    Debug Firmware Utility
    alarms cores exit help [COMMAND] images mctools memory messages network obfl post power sensors sel fru mezz1fru mezz2fru tasks top update users version
[Diagram labels: Chassis 1, Server 1, Motherboard, CIMC]
Blade servers – Common issues
CIMC issues
· Server discovery failed
  – Check minimum software version
  – Reseat blade
  – Minimum hardware satisfied?
· No KVM video
  – Does the CIMC have an IP?
· Is the BIOS corrupt?
  – Recover BIOS
  – Reset CMOS
· Commands:
    UCS-A# show version
    UCS-A /system # show capability
    UCS-A /chassis/server/cimc # show mgmt-if
    UCS-A /chassis/server # show post
    UCS-A /chassis/server # reset-kvm
    UCS-A /chassis/server # recover-bios
    UCS-A /chassis/server # reset-cmos
Blade servers – Common issues
Hardware issues
· Blade won’t boot
  – Did POST complete?
· Types of DIMM errors
  – Mapped out
  – Disabled
  – Inoperable
  – Degraded
· Commands:
    UCS-A# connect cimc x/y
    [ help ] # post
    [ post ] # obfl
    [ obfl ] # sel
    UCS-A /chassis/server # show memory [detail]
    UCS-A /chassis/server/memory-array/dimm # show stats memory-error-stats detail
Blade servers – Common issues
Unexpected reboot
· Service profile modifications
  – Firmware updates
  – Configuration changes
  – Use Maintenance policies to defer changes
· OS initiated
  – Check OS
· Hardware issue
· IOM / FI issues
· Commands:
    UCS-A /chassis/server # show fsm status
    UCS-A# connect cimc x/y
    [ help ] # post
    [ post ] # obfl
    [ obfl ] # sel
Blade servers – Top 5 commands
    UCS-A /chassis/server # show inventory expand detail
    UCS-A /chassis/server # show status detail
    UCS-A /chassis/server # show post
    UCS-A /chassis/server # show sel
    UCS-A /chassis/server # show fsm status
Agenda
Troubleshooting
· UCSM & Fabric Interconnect
· Blade Servers
· IOM & Chassis
  – Discovery issues
  – Fan/Thermal/PSU
  – Tech-support
IOM & Chassis
Overview
· CMC (Chassis Management Controller) responsibilities
  – Chassis Discovery
  – Local cluster management
  – Power & Thermal Management
[Diagram labels: Chassis Management Controller, FLASH, EEPROM, DRAM, Control IO, Chassis Signals, Switch, Fabric links, To Interconnect, To Blades, ASIC]
IOM & Chassis – Common issues
Chassis not discovering
· Check the chassis discovery policy
· Server ports defined correctly
· FI to IOM 1:1 relationship only
· Commands:
    UCS-A(nxos)# show run interface ethernet x/y
    UCS-A(nxos)# show interface fex-fabric
    UCS-A(nxos)# show fex detail
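The 1:1 rule can be checked against a hand-recorded cabling list. This sketch assumes the rule means that all uplinks from a given IOM must land on a single FI; the IOM and FI labels are hypothetical and the cabling data is not read from UCSM.

```python
from collections import defaultdict


def ioms_with_multiple_fis(links):
    """Return the IOMs whose uplinks land on more than one FI.

    links is an iterable of (iom, fi) pairs, one per cable, recorded
    manually for this sketch.
    """
    fis_by_iom = defaultdict(set)
    for iom, fi in links:
        fis_by_iom[iom].add(fi)
    return sorted(iom for iom, fis in fis_by_iom.items() if len(fis) > 1)


if __name__ == "__main__":
    cabling = [("iom-1", "A"), ("iom-1", "A"), ("iom-2", "A"), ("iom-2", "B")]
    print(ioms_with_multiple_fis(cabling))
```

Any IOM returned by the function is a candidate for a miscabled uplink and worth checking before looking deeper at discovery policy.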
IOM & Chassis – Common issues
Fan issues
· Spinning at 100%
  – Temperature
  – Any fans missing?
  – CMC access to thermal sensors
  – Component discovery
· Commands:
    UCS-A# connect iom 1
    fex-1# show platform software cmcctrl thermal status
    fex-1# show platform software cmcctrl fancontrol all
    fex-1# show platform software cmcctrl ohms all
Logs for troubleshooting
· General UCS issues
    UCS-A(local-mgmt)# show tech-support ucsm detail
    UCS-A(local-mgmt)# show tech-support chassis # all detail
· Networking issues
    Upstream_Switch# show tech-support details
· SAN issues
    UCS-A(nxos)# show tech-support npv
    MDS# show tech-support details
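The mapping from issue type to the logs to collect can be kept as a small lookup table. The command strings are copied from the slide; the dictionary keys and the function name are assumptions made for this sketch.

```python
# Commands copied from the slide; keys are illustrative labels.
TECH_SUPPORT_COMMANDS = {
    "general": [
        "UCS-A(local-mgmt)# show tech-support ucsm detail",
        "UCS-A(local-mgmt)# show tech-support chassis # all detail",
    ],
    "networking": [
        "Upstream_Switch# show tech-support details",
    ],
    "san": [
        "UCS-A(nxos)# show tech-support npv",
        "MDS# show tech-support details",
    ],
}


def commands_for(issue_type):
    """Return the tech-support commands listed for an issue type.

    The lookup is case-insensitive; an unknown type raises KeyError.
    """
    return TECH_SUPPORT_COMMANDS[issue_type.lower()]


if __name__ == "__main__":
    for command in commands_for("SAN"):
        print(command)
```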
UCSM and Chassis show tech from GUI
· Log in to the UCSM GUI
· Select the Admin tab -> Faults, Audit and Event Logs section -> Tech Support File
Where to find more information
· Hardware Installation & Service Guides Information
· Release Notes
· Software Upgrade & Installation Information
· UCS Troubleshooting Guide
· UCS Faults Reference
· Cisco Support Community
Troubleshooting UCS C-series
March “Month of Routing Protocol Technology”
· Session 9 – 11th Mar 2014
· Session 10 – 25th Mar 2014
April “Month of Wireless Technology”
And many more months and technologies
Thank you.