Cisco Confidential 1 © 2010 Cisco and/or its affiliates. All rights reserved. Welcome Technical Services Virtual Boot Camp Session 8 Technical Services.


1 Cisco Confidential 1 © 2010 Cisco and/or its affiliates. All rights reserved. Welcome Technical Services Virtual Boot Camp Session 8 Technical Services India Team

2 Technology
• Architecture Overview
  – UCS C-series
  – UCS B-series
• UCS Interoperability
  – Hardware
  – Software
• Troubleshooting
  – Case Study (Lab Demo)
• Q&A

3
https://supportforums.cisco.com/docs/DOC-37994...PPT
https://supportforums.cisco.com/videos/7517....Video
https://supportforums.cisco.com/docs/DOC-37851...Q&A

4 Technology
• Firmware Install and upgrade
  – UCS C-series
  – UCS B-series
• Troubleshooting
  – Case Study (Lab Demo)
  – Important logs
  – Part Identification and RMA
• Q&A

5
Nirmal Sodani, Technical Support Manager
Mohit Mmangal, Manager, CSC
Avinash Shukla, TAC Escalation Engineer
Vinay Sharma, Lead, CSC
Teclus D'Souza, TAC Escalation Engineer
Chetan Badami, Technical Escalation Engineer

6 Technology – UCS
Avinash Shukla, Teclus D'Souza, Chetan Badami

7 © 2011 Cisco and/or its affiliates. All rights reserved. Cisco Public BRKCOM-3001 7 Agenda
• UCS Upgrade Procedure
  – C-series
  – B-series
• UCS Troubleshooting
  – UCSM / FI / IOM / Blade
  – C-series
• UCS H/W and S/W Interoperability

8 © 2010 Cisco Systems, Inc. All rights reserved.CAE BootcampPresentation_ID 8 UCS H/W and S/W Interoperability Avinash Shukla Cisco TAC

9 Operating System
• Check the support matrix before installing the OS on the blade
• Install the drivers (Eth / FC) and keep them updated as per the matrix
• A few important things to check:
  – Is the blade running a certified OS and OS version?
  – Are there any special needs for that OS? E.g. VMware (OEM image)
  – Are the drivers at the OS level updated and current?
• Answer:
  – UCS S/W and H/W matrix
  – http://www.cisco.com/web/techdoc/ucs/interoperability/matrix/matrix.html
  – http://www.cisco.com/en/US/products/ps10477/prod_technical_reference_list.html

10 H/W and S/W Interop

12 What each matrix provides

13 Sample driver versions

14 UCS Upgrades

15 Agenda
• C series firmware upgrade
  – Pre-requisites
  – Firmware ISO location and downloading
  – Upgrade process
• B series firmware upgrade
  – Pre-requisites
  – Firmware bundles and downloading
  – Upgrade process
• Additions / Modifications from version 2.1

16 Pre-requisites C Series

17 Things to consider
• Release Notes cover gotchas and concerns in the upgrade process
• Upgrades from one version back will always work
• Check the release notes about prior versions
  – If the customer is really far behind, it might require two upgrades to get to current code
• Schedule a maintenance window
  – CIMC and server will reboot during the upgrade

18 Firmware ISO

19 C Series Upgrade
• Downloading the ISO file

20 Upgrade process C Series

21 Map the ISO on the KVM

22 Boot from Virtual Media

23 HUU Screen and options

24 HUU Screen and options

25 After all components are upgraded

26 Verify Upgrade
• To verify, check that all components are upgraded

27 Pre-requisites B Series

28 Things to consider
• Release Notes cover gotchas and concerns in the upgrade process
• Upgrades from one version back will always work
• Check the release notes about prior versions
  – If the customer is running a very old version, it might require two upgrades to get to current code
• Schedule a maintenance window
  – FI and IOM will reboot during the upgrade
  – Make sure network and storage fabric are redundant
• Highly recommended to back up the UCSM configuration

29 Be patient
• The upgrade process is not quick
• The first release after FCS sometimes contains bugs
• Expect a maintenance release shortly after FCS
• Follow the upgrade procedure for each version
  – The procedure is not always the same from one version to another

30 Downgrading
• Sometimes there might be data loss
• Might have to erase the config to downgrade
  – Database changes in new versions cannot always be backported

31 Bundles

32 Bundles
• Prior to 1.4 there was only one all-inclusive bundle
• Now there are multiple bundles
  – Infra bundle: contains code for FI, IOM, and UCSM
  – B-series bundle: contains BIOS and blade-specific code
  – C-series bundle: contains BIOS and rack-server-specific code

33 Bundles
• All firmware work is done from the Equipment tab in UCSM

34 Bundles
• Packages can be viewed/deleted from the “Packages” tab

35 Bundles
• Bundles are downloaded from the “Download Tasks” tab
• Downloads can be through the desktop or using ftp/scp/sftp/tftp

36 Cisco.com to download FCS bundles
Labels: B-Series packages; C-Series packages; FI, IOM, and UCSM software

37 Pre 1.4 bundles are single download
Label: 1.0-1.3 bundles

38 Upgrade process B Series

39 Upgrade Process
• Again, always consult the release notes
• Upgrade through the GUI is easiest
• General process is
  – Backup UCS Config (Full & All Config)
  – Download code
  – Update components
  – Activate components in this order (check the Release Notes, because the order can change)
      Interface cards – Set Startup Only
      CIMC
      IOM – Set Startup Only
      UCSM
      FI

40 Updating Components
• Update means copying new code to the backup location of all UCSM components
• It simply stages the new code
• Can update all components at once

41 Updating Components
• Time to update will vary based on component
• IOMs take a long time, up to 5 minutes
• If any component has issues, check the FSM for that component
• Update process does not work on FI
• Once everything is in “Ready” state you can move to Activate

42 Activate Components
• In this process you activate the code that you copied
• Some code is activated but set to “activate on next reboot”
• Understand that in this stage you can create outages
• Activate “leaves of the tree” first
  – BU uses this term to mean that the order should be
      Interface card = leaf
      CIMC = twig
      IOM = branch
      UCSM = trunk
      FI = root
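
The leaf-to-root ordering can be sketched as a simple sort. This is only an illustration: the kind labels ("interface-card", "cimc", and so on) and the sample inventory are made up for the example and are not UCSM identifiers, and the Release Notes remain the authority on the real order.

```python
# Activation rank from the "leaves of the tree" analogy: lower rank is activated first.
ACTIVATION_RANK = {
    "interface-card": 0,  # leaf
    "cimc": 1,            # twig
    "iom": 2,             # branch
    "ucsm": 3,            # trunk
    "fi": 4,              # root
}


def activation_order(components):
    """Sort (name, kind) pairs leaf-first; ties keep their input order."""
    unknown = sorted({kind for _, kind in components if kind not in ACTIVATION_RANK})
    if unknown:
        raise ValueError(f"unknown component kinds: {unknown}")
    return sorted(components, key=lambda item: ACTIVATION_RANK[item[1]])


# Hypothetical inventory, deliberately listed out of order.
inventory = [
    ("FI-A", "fi"),
    ("IOM 1/1", "iom"),
    ("UCSM", "ucsm"),
    ("adapter 1/1/1", "interface-card"),
    ("CIMC 1/1", "cimc"),
]
for name, kind in activation_order(inventory):
    print(f"{kind:15} {name}")
```

Keeping the rank in one table means a release whose notes call for a different order only needs the table changed, not the sorting logic.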

43 Activate Blade Components
• Recommended method is to use policies
  – Host Firmware Policy to apply latest BIOS, Board Controller, Adapters, etc.

44 Activate Interface cards
• Select “Set Startup Version Only”
• If you uncheck this box, it will cause a blade reboot!

45 Activate CIMC
• CIMC can be activated without disruption to the OS on the blade
• KVM session will be lost while activating

46 Activate IOM
• Same as Interface card: “Set Startup Version Only”
• IOM needs to be at the same version as the FI!

47 Activate UCSM
• Will cause UCSM to disconnect
• Takes a few minutes

48 Activate Fabric Interconnect
• Recommended to activate one FI at a time
• A complete outage will not occur
• Fail one fabric
• Wait for all Network and FC traffic to fail over to the second fabric
• Highly recommended to have an outage window
• Biggest risk is SAN storage
• FI will upgrade and reboot
• Part of the process is to reboot the connected IOM as well
• Can take up to 10-15 minutes for FI and all IOM to come back online
• If there is any failure during the first FI upgrade, STOP! Do not attempt to upgrade the second FI

49 Activate Fabric Interconnect
• Activate FI from the Equipment tab
• Upgrade the subordinate first

50 Activate Fabric Interconnect
• Choose the correct Kernel and System Version
• FI will take a few minutes and then reboot
• IOMs will get updated as well

51 Verify Fabric Interconnect upgrade
• Make sure IOM and FI all match the correct running version
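
The version check on this slide amounts to comparing every FI and IOM against the target release. A minimal sketch, assuming the running versions have already been read off the Equipment tab into a dictionary; the component names and version strings below are illustrative values, not output from a real system.

```python
def find_version_mismatches(expected, running):
    """Return {component: version} for each component not running the expected version."""
    return {name: version for name, version in running.items() if version != expected}


# Hypothetical running versions collected by hand after the FI activation.
running = {
    "FI-A": "2.1(1a)",
    "FI-B": "2.1(1a)",
    "IOM 1/1": "2.1(1a)",
    "IOM 1/2": "2.0(4b)",
}
mismatches = find_version_mismatches("2.1(1a)", running)
print(mismatches or "all components match")
```

An empty result is the condition to look for before moving on to the other fabric.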

52 Upgrade Primary Fabric Interconnect
• Upgrade the primary FI now using the same process
• UCSM will fail over to the subordinate FI
• Will need to log back in to UCSM
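
The one-FI-at-a-time rule, with its hard stop on the first failure, can be sketched as a small control loop. The `activate` and `data_path_ok` callables are hypothetical placeholders for whatever the operator actually does (GUI activation, checking that traffic has failed over); nothing here talks to a real system.

```python
def upgrade_fabric_interconnects(subordinate, primary, activate, data_path_ok):
    """Activate the subordinate FI, then the primary; stop at the first problem.

    activate(fi) and data_path_ok(fi) are caller-supplied callables returning
    True or False. Returns (list_of_upgraded_fis, error_message_or_None).
    """
    upgraded = []
    for fi in (subordinate, primary):
        if not activate(fi):
            return upgraded, f"activation failed on {fi}; stopping"
        if not data_path_ok(fi):
            return upgraded, f"data path not healthy after {fi}; stopping"
        upgraded.append(fi)
    return upgraded, None


# Happy path with stub callables that always succeed.
print(upgrade_fabric_interconnects("FI-B", "FI-A", lambda fi: True, lambda fi: True))
```

Because the loop returns as soon as the subordinate fails, the primary is never touched in that case, which mirrors the "STOP!" instruction.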

53 Problems
• Biggest concern is a failed IOM upgrade
  – There is no way in the field to upgrade an IOM manually
  – RMA the failed IOM
  – Can attempt a physical reseat of the IOM
• Failed FI upgrade can be recovered
  – Similar to N5K; will require access to the console and a tftp server to boot from
  – Refer to the FI recovery method

54 Host Firmware

55 Host firmware
• Highly recommended that the blade BIOS match the running UCSM system
• Best way to upgrade BIOS is through a Host Firmware Policy
• Create the policy in UCSM
• Apply the policy to the SP
• Will reboot the blade, so an outage window is needed

56 Create Host Firmware Policy
• From the Server tab

57 Host Firmware Policy
• Note that a Firmware Policy can include
  – Adapters, BIOS, Board Controller, FC Adapters, HBA Option ROM and Storage Controller
• Note how Adapters and FC adapters can be part of a policy
  – If adapters are part of a policy then they can only be changed as part of the firmware policy
• Recommended to upgrade BIOS and Storage Controller at a minimum
• Board Controller rarely changes and is specific to B230 and B440

58 Set BIOS versions
• Best to choose all hardware
• Set BIOS to the latest in the pull-down for each blade/server
• Latest BIOS version will be different for some servers

59 Add the new Firmware Policy to a SP
• Select the Host Firmware policy
• Blade will reboot once you “Save Changes”

60 Additions / Modifications from version 2.1

61 We just made it simple to upgrade
  – Firmware Auto Install
  – Install Infrastructure Firmware
  – Install Server Firmware

62 Firmware Auto-Install
• Firmware Auto-Install implements package-version-based upgrades for both UCS Infrastructure components and Server components
• Firmware Auto-Install cannot be used to upgrade Management Extensions and the Capability Catalog. These are simple, occasional updates in UCSM and are therefore left under user control.
• It is a two-step process: “Install Infrastructure Firmware” and “Install Server Firmware”
• It is recommended to run “Install Infrastructure Firmware” first and then “Install Server Firmware”
• All existing firmware upgrade mechanisms are retained. Users who do not want to use Auto-Install can continue to use the existing documented way of doing firmware upgrades.

63 Install Infrastructure Firmware (contd)
• This is the sequence followed by “Install Infrastructure Firmware”
  1. Upgrade UCSM
  2. Update backup image of all IOMs
  3. Activate all IOMs with setstartup option
  4. Activate secondary Fabric Interconnect
  5. Wait for User Acknowledgement***
  6. Activate primary Fabric Interconnect

64 Install Infrastructure Firmware GUI

65 Install Infrastructure Firmware - Cancelling
• A scheduled “Install Infra” operation can be cancelled
• But an “Install Infra” operation which is already “In Progress” cannot be cancelled
• Both GUI and CLI options are available for cancelling

66 Install Infrastructure Firmware – User Acknowledgement for primary FI
• “Install Infra” expects explicit permission from the user to start the firmware upgrade on the primary Fabric Interconnect. This is necessary to protect the data path for servers.
• As part of “Install Infra”, the secondary FI’s firmware is upgraded first. The secondary FI reboots as part of firmware activation.
• After the secondary FI comes online, users are expected to check whether the data path is ready for a reboot of the primary FI
• When users have ensured that the data path is ready, they can acknowledge the reboot of the primary FI
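
The six-step “Install Infrastructure Firmware” sequence and its acknowledgement gate can be modelled as an ordered list with one pause point. This is a sketch of the control flow only; the step strings are taken from the slide, while the function name and the status strings ("pending-user-ack", "complete") are invented for the example.

```python
ACK_GATE = "Wait for user acknowledgement"
INFRA_STEPS = [
    "Upgrade UCSM",
    "Update backup image of all IOMs",
    "Activate all IOMs with set-startup option",
    "Activate secondary Fabric Interconnect",
    ACK_GATE,
    "Activate primary Fabric Interconnect",
]


def run_install_infra(user_acknowledged):
    """Walk the steps in order; pause at the gate unless user_acknowledged() returns True."""
    executed = []
    for step in INFRA_STEPS:
        if step == ACK_GATE and not user_acknowledged():
            # The primary FI is never rebooted without explicit permission.
            return executed, "pending-user-ack"
        executed.append(step)
    return executed, "complete"


steps, status = run_install_infra(lambda: False)
print(status, "after", len(steps), "steps")
```

Without the acknowledgement the run stops after the secondary FI, which leaves time to confirm that the data path has recovered before the primary reboots.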

67 Acknowledge Primary FI reboot

68 Install Server Firmware
• Install-Server offers a way to update multiple host firmware packages using package versions
• It provides the list of Service Profiles that will be affected when a host firmware package is modified. Multiple SPs can use the same host firmware package.
• It also provides a final summary of physical servers that will be rebooted for the set of host firmware packages that are being modified
• Only the GUI is available for "Install Server Firmware"; there is no CLI
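
The impact summary that Install Server Firmware presents can be approximated as a lookup from host firmware packages to service profiles to servers. A minimal sketch, assuming each service profile is a plain dictionary; the field names, profile names, and package names are hypothetical and do not reflect the UCSM object model.

```python
def impact_of_package_change(modified_packages, service_profiles):
    """Return (affected profile names, servers that would reboot).

    service_profiles is a list of dicts with 'name', 'host_fw_package', and
    'server' (None when the profile is not associated with a blade).
    """
    modified = set(modified_packages)
    affected = [sp for sp in service_profiles if sp["host_fw_package"] in modified]
    profiles = [sp["name"] for sp in affected]
    servers = sorted({sp["server"] for sp in affected if sp["server"] is not None})
    return profiles, servers


# Hypothetical service profiles; two share the same host firmware package.
service_profiles = [
    {"name": "esx-01", "host_fw_package": "hfp-prod", "server": "1/1"},
    {"name": "esx-02", "host_fw_package": "hfp-prod", "server": "1/2"},
    {"name": "esx-spare", "host_fw_package": "hfp-prod", "server": None},
    {"name": "win-01", "host_fw_package": "hfp-test", "server": "1/3"},
]
profiles, servers = impact_of_package_change(["hfp-prod"], service_profiles)
print("profiles:", profiles)
print("servers to reboot:", servers)
```

An unassociated profile appears in the affected list but contributes no server to the reboot summary.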

69 Install Server Firmware – Screen 1

70 Install Server Firmware – Screen 2

71 Install Server Firmware – Screen 3

72 Install Server Firmware – Screen 4

73 Install Server Firmware – Screen 5

74 Install Server Firmware – Screen 6

75 Troubleshooting the Cisco Unified Computing System
Chetan Badami, Cisco TAC

76 Agenda Troubleshooting
• UCSM & Fabric Interconnect
  – Fault types
  – Clustering issues
  – Common issues
• Blade Servers
• IOM & Chassis

77 UCS System Components
• UCS manager
• UCS Fabric Interconnect (6xxx)
• UCS Fabric Extenders (2xxx)
• UCS 5100 Blade Chassis
• UCS B-series servers
• Nexus 2000 switch
• UCS C-series servers
• UCS Network adapters

78 UCS 6200 Fabric Interconnect (FI)
• Standalone or Clustered
Diagram labels: Primary / Subordinate, Data Management Engine (DME), FI-B#, FI-A#, Virtual IP, IP #B, IP #A, Management Network, Cluster links, DB

79 UCSM
• UCSM GUI
• CLI
    UCS-A# scope server x/y
• NXOS
    UCS-A# connect nxos a
    UCS-A(nxos)# show…
• XML API
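
Of the four access methods, the XML API is the one aimed at scripts. The sketch below only builds a login request and parses a reply, without any network traffic. It assumes, from memory of Cisco's XML API documentation rather than from this deck, that a session starts with an `aaaLogin` element carrying `inName` and `inPassword`, posted to the `/nuova` path on the cluster virtual IP, and that the reply carries an `outCookie` attribute or `errorCode`/`errorDescr` on failure. Verify these names against the UCS Manager XML API guide before relying on them; the sample reply string is fabricated for the example.

```python
import xml.etree.ElementTree as ET


def build_login_request(username, password):
    """Build the XML body of a login request (element and attribute names are assumptions)."""
    elem = ET.Element("aaaLogin", {"inName": username, "inPassword": password})
    return ET.tostring(elem, encoding="unicode")


def extract_cookie(response_xml):
    """Return the session cookie from a login reply, or raise if the reply reports an error."""
    root = ET.fromstring(response_xml)
    if root.get("errorCode"):
        raise RuntimeError(root.get("errorDescr", "login failed"))
    return root.get("outCookie")


body = build_login_request("admin", "example-password")
print(body)  # would be sent with HTTP POST to https://<virtual-ip>/nuova

# Fabricated reply, used only to exercise the parser.
sample_reply = '<aaaLogin response="yes" outCookie="1234567890/abcdef" />'
print(extract_cookie(sample_reply))
```

Later requests would carry the cookie, and a matching logout call releases the session.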

80 Fault Types
Type            Description
fsm             An FSM task has failed to complete successfully, or Cisco UCS Manager is retrying one of the stages of the FSM.
equipment       Cisco UCS Manager has detected that a physical component is inoperable or has another functional issue.
server          Cisco UCS Manager cannot complete a server task, such as associating a service profile with a server.
configuration   Cisco UCS Manager cannot successfully configure a component.
environment     Cisco UCS Manager has detected a power problem, thermal problem, voltage problem, or loss of CMOS settings.

81 Fault Types
Type            Description
management      Cisco UCS Manager has detected a serious management issue, such as critical services that could not be started or a primary fabric interconnect that could not be identified.
connectivity    Cisco UCS Manager has detected a connectivity problem, such as an unreachable adapter.
network         Cisco UCS Manager has detected a network issue, such as a link down.
operational     Cisco UCS Manager has detected an operational problem, such as a log capacity issue or a failed server discovery.

82 Events per Component
    FarNorth-A# scope server ?
      WORD / dynamic-uuid Dynamic UUID
    FarNorth-A# scope server 1/1
    FarNorth-A /chassis/server # show event

83 UCSM Faults - GUI

84 Information Fault, Major Fault

85 Finite State Machine (FSM)
• Workflow with many stages
• Data Management Engine (DME) … Application Gateway (AG) … End Point (EP)

86 FSM Details
Labels: Operation (workflow); Stage Description; Error Description for that stage

87 Contexts
• UCS has three CLI “Contexts”
  – UCSM (GUI equivalent, uses the “scope” command)
  – NXOS (not configurable – read only)
  – Management (file management, tech support, reboot)
Diagram labels: UCSM, Local-Management, NXOS

88 Scope
• Scoping – movement to different UCS configuration components
  – Details on hardware components are obtained with the connect command
  – You want to be on the Primary Fabric Interconnect
    UCS-B# scope ?
      adapter              Mezzanine Adapter
      chassis              Chassis
      eth-server           Ethernet Server Domain
      eth-storage          Ethernet Storage
      eth-traffic-mon      Ether Traffic Monitoring Domain
      eth-uplink           Ethernet Uplink
      fabric-interconnect  Fabric Interconnect
      fc-storage           FC Storage
      fc-traffic-mon       FC Traffic Monitoring Domain
      fc-uplink            FC Uplink
      fex                  FEX (fabric-extender) Module
      firmware             Firmware
      host-eth-if          Host Ethernet Interface
      host-fc-if           Host FC Interface
      license              License
      monitoring           Monitor the system
      org                  Organizations
      power-cap-mgmt       Power Cap Mgmt
      security             security mode
      server               Server
      service-profile      Service Profile
      system               Systems
      vhba                 vHBA
      vnic                 vNIC

89 Connect - Hardware Troubleshooting
• Connect – attaches you to hardware and to read-only NXOS
    FarNorth-B# connect
      adapter      Mezzanine Adapter
      bmc          Baseboard Management Controller (CIMC)
      clp          Connect to DMTF CLP
      iom          IO Module
      local-mgmt   Connect to Local Management CLI
      nxos         Connect to NXOS CLI
    FarNorth-A# connect local-mgmt
      a            Fabric A
      b            Fabric B
      (Defaults to primary)
    FarNorth-A(local-mgmt)# ?
      cd                Change current directory
      clear             Reset functions
      cluster           Cluster mode
      connect           Connect to Another CLI
      copy              Copy a file
      cp                Copy a file
      delete            Delete managed objects
      dir               Show content of dir
      enable            Enable
      end               Go to exec mode
      erase             Erase
      erase-log-config  Erase the mgmt logging config file
      exit              Exit from command interpreter
      install-license   Install a license
      ls                Show content of dir
      mkdir             Create a directory
      move              Move a file
      mv                Move a file
      ping              Test network reachability
      pwd               Print current directory
      reboot            Reboots Fabric Interconnect
      rm                Remove a file
      rmdir             Remove a directory
      run-script        Run a script
      show              Show running system information
      ssh               SSH to another system
      tail-mgmt-log     Tail mgmt log file
      telnet            Telnet to another system
      terminal          Set terminal line parameters
      top               Go to the top mode
      traceroute        Traceroute to destination
• Most dangerous: erase configuration, reboot

90 Connect NXOS
• Used to assist in troubleshooting; very similar to IOS and Nexus, with all the show commands
• Used to run debugs advised by TAC
• Commands:
  – Show switch running config (non-server config)
  – Clear interface counters found on the FI
• Cannot be used to configure UCS (read only)

91 Connect to NXOS
    FarNorth-A# connect nxos ?
      a   Fabric A
      b   Fabric B
Popular examples:
    show run
    show fex detail
    show interface
    show lacp
    show trunk
    show cdp
    debug
    show npv flogi-table
    show mac-address-table
    FarNorth-A(nxos)# ?
      clear         Reset functions [Only place to clear counters]
      cli           CLI commands
      debug         Debugging functions
      debug-filter  Enable filtering for debugging functions
      ethanalyzer   Configure cisco packet analyzer
      interface     A live capture will start on following interface
      no            Negate a command or set its defaults
      ntp           NTP configuration
      show          Show running system information
      system        System management commands
      terminal      Set terminal line parameters
      test          Test command
      undebug       Disable Debugging functions (See also debug)
      end           Go to exec mode
      exit          Exit from command interpreter
      pop           Pop mode from stack or restore from name
      push          Push current mode to stack or save it under name
      where         Shows the cli context you are in

92 UCSM – Common issues
Topics: DME; Clustering problems
• Is the other FI up and operational?
• Are clustering links up?
• Is there at least 1 chassis successfully discovered on both FIs?
    UCS-A# show cluster extended-state
    UCS-A# show pmon state
    UCS-A(local-mgmt)# cluster lead a
    UCS-A(local-mgmt)# cluster force primary
    UCS-A /monitoring/sysdebug # show cores

93 Sample – Cluster state

94 Sample – Process state (pmon)

95 Agenda Troubleshooting
• UCSM & Fabric Interconnect
• Blade Servers
  – CIMC/BIOS
  – OBFL/SEL
• IOM & Chassis

96 Blade servers: Blade overview – Hardware & Software Components
Diagram labels: CPU & Heatsink, Memory DIMMs, Mezzanine Adapter, CIMC, HDD

97 Blade servers: Blade overview – CIMC and BIOS
• CIMC
  – Monitors temperature and power readings
  – KVM & vMedia
  – Blade control
• BIOS
  – Can be configured via F2 or via BIOS Policy

98 OBFL
• Onboard Fault Log stores hardware logs on the different components, saved at the time of the issue
• Can alternatively be viewed by connecting to the internal component end device
• Show tech-support will capture the required logs for support

99 System Event Log (SEL) - Events Supported
• Server BIOS events
  – 3 kinds of equipment end-points:
      Memory Unit (DIMM): ECC errors, Address Parity, Memory Mismatch
      Processor Unit: Memory Mirroring, Sparing, SMI Link errors
      Motherboard: PCIe, QPI uncorrectable errors, Legacy PCI errors
  – All these errors are modeled as stats properties. The ones for which thresholds are not defined get reported as statistics only
• BMC, BIOS, and OS log platform errors to the CIMC’s System Event Log (SEL) buffer
• POST and run-time errors
• Used as an effective health monitoring tool

100 System Event Logs
• Make sure that servers are discovered
• Make sure the backup destination path is valid
• Can be done via CLI also
• System Event Logs = Management Logs on earlier releases
Screenshot labels: Chassis, Server

101 CIMC Booting Problems - Blades
• Corrupt CIMC firmware: POST failure, not completing boot
• Connect to CIMC in band manager to diagnose
• View logs, collect tech-support, monitor KVM output
• Manually reboot CIMC
Fault codes: http://www.cisco.com/en/US/partner/docs/unified_computing/ucs/ts/faults/reference/ErrMess.html

102 Connecting to CIMC Debug Utility
• To verify the health of a blade when questioning UCSM and wanting to look at the lowest level of blade data points
• Used to determine blade component issues at the source
    UCS-A# connect cimc 1/1
    Trying 127.5.1.1...
    Connected to 127.5.1.1.
    Escape character is '^]'.
    CIMC Debug Firmware Utility Shell
    ____________________________________
    Debug Firmware Utility
    alarms cores exit help [COMMAND] images mctools memory messages network obfl post power sensors sel fru mezz1fru mezz2fru tasks top update users version
Annotations: Chassis 1, Server 1, Motherboard CIMC

103 Blade servers – Common issues: CIMC issues
• Server discovery failed
  – Check minimum software version
  – Reseat blade
  – Minimum hardware satisfied?
• No KVM video
  – Does the CIMC have an IP?
• Is the BIOS corrupt?
  – Recover BIOS
  – Reset CMOS
    UCS-A# show version
    UCS-A /system # show capability
    UCS-A /chassis/server/cimc # show mgmt-if
    UCS-A /chassis/server # show post
    UCS-A /chassis/server # reset-kvm
    UCS-A /chassis/server # recover-bios
    UCS-A /chassis/server # reset-cmos

104 Blade servers – Common issues: Hardware issues
• Blade won’t boot
  – Did POST complete?
• Types of DIMM errors
  – Mapped out
  – Disabled
  – Inoperable
  – Degraded
    UCS-A# connect cimc x/y
    [ help ] # post
    [ post ] # obfl
    [ obfl ] # sel
    UCS-A /chassis/server # show memory [detail]
    UCS-A /chassis/server/memory-array/dimm # show stats memory-error-stats detail

105 Blade servers – Common issues: Unexpected reboot
• Service profile modifications
  – Firmware updates
  – Configuration changes
• OS initiated
• Hardware issue
• IOM / FI issues
• Use Maintenance policies to defer changes
• Check OS
    UCS-A /chassis/server # show fsm status
    UCS-A# connect cimc x/y
    [ help ] # post
    [ post ] # obfl
    [ obfl ] # sel

106 Blade servers – Top 5 commands
    UCS-A /chassis/server # show inventory expand detail
    UCS-A /chassis/server # show status detail
    UCS-A /chassis/server # show post
    UCS-A /chassis/server # show sel
    UCS-A /chassis/server # show fsm status

107 Agenda Troubleshooting
• UCSM & Fabric Interconnect
• Blade Servers
• IOM & Chassis
  – Discovery issues
  – Fan/Thermal/PSU
  – Tech-support

108 IOM & Chassis: Overview
• CMC responsibilities
  – Chassis Discovery
  – Local cluster management
  – Power & Thermal Management
Diagram labels: Chassis Management Controller, FLASH, EEPROM, DRAM, Control IO, Chassis Signals, Switch, 1 - 4 Fabric links, To Interconnect, To Blades, ASIC

109 IOM & Chassis – Common issues: Chassis not discovering
• Check chassis discovery policy
• Server ports defined correctly
• FI to IOM 1:1 relationship only
    UCS-A(nxos)# show run interface ethernet x/y
    UCS-A(nxos)# show interface fex-fabric
    UCS-A(nxos)# show fex detail
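
The "FI to IOM 1:1 relationship only" rule lends itself to a quick cabling check. A minimal sketch, assuming the fabric links have been written down by hand as (IOM, FI) pairs; the names are hypothetical and the function does not parse real `show fex detail` output.

```python
def ioms_cabled_to_multiple_fis(links):
    """Return {iom: [fis]} for every IOM whose fabric links reach more than one FI."""
    fis_per_iom = {}
    for iom, fi in links:
        fis_per_iom.setdefault(iom, set()).add(fi)
    return {iom: sorted(fis) for iom, fis in fis_per_iom.items() if len(fis) > 1}


# Hypothetical cabling record: one entry per fabric link.
links = [
    ("chassis-1/iom-1", "FI-A"),
    ("chassis-1/iom-1", "FI-A"),
    ("chassis-1/iom-2", "FI-B"),
    ("chassis-1/iom-2", "FI-A"),  # miscabled: this IOM now reaches both FIs
]
print(ioms_cabled_to_multiple_fis(links))
```

Several links from one IOM to the same FI are fine; only an IOM that reaches both FIs is reported.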

110 IOM & Chassis – Common issues: Fan issues
• Spinning at 100%
  – Temperature
  – Any fans missing?
  – CMC access to thermal sensors
  – Component discovery
    UCS-A# connect iom 1
    fex-1# show platform software cmcctrl thermal status
    fex-1# show platform software cmcctrl fancontrol all
    fex-1# show platform software cmcctrl ohms all

111 Logs for troubleshooting
• General UCS issues
    UCS-A(local-mgmt)# show tech-support ucsm detail
    UCS-A(local-mgmt)# show tech-support chassis # all detail
• Networking issues
    Upstream_Switch# show tech-support details
• SAN issues
    UCS-A(nxos)# show tech-support npv
    MDS# show tech-support details
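
The mapping from problem type to the logs to collect can be kept as a small lookup table. The commands and prompts below are copied from this slide; the category keys and the `commands_for` helper are invented for the example, and the `#` placeholder for the chassis number is turned into a `{chassis}` field.

```python
# (prompt, command) pairs per issue category, taken from the slide.
TECH_SUPPORT_COMMANDS = {
    "general": [
        ("UCS-A(local-mgmt)#", "show tech-support ucsm detail"),
        ("UCS-A(local-mgmt)#", "show tech-support chassis {chassis} all detail"),
    ],
    "networking": [
        ("Upstream_Switch#", "show tech-support details"),
    ],
    "san": [
        ("UCS-A(nxos)#", "show tech-support npv"),
        ("MDS#", "show tech-support details"),
    ],
}


def commands_for(issue, chassis=1):
    """Return 'prompt command' strings for an issue category, filling in the chassis number."""
    try:
        entries = TECH_SUPPORT_COMMANDS[issue]
    except KeyError:
        raise ValueError(f"unknown issue category: {issue!r}") from None
    return [f"{prompt} {command.format(chassis=chassis)}" for prompt, command in entries]


for line in commands_for("general", chassis=2):
    print(line)
```

The prompt is kept with each command because it shows where the command has to be run: local-mgmt, the NXOS shell, or an upstream device.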

112 UCSM and Chassis show tech from GUI
• Log into the UCSM GUI
• Select the Admin tab -> Faults, Audit and Event-logs section -> Tech Support File

113 Where to find more information
• Hardware Installation & Service Guides Information
  http://www.cisco.com/en/US/docs/unified_computing/ucs/overview/guide/UCS_roadmap.html#wp38892
• Release Notes
  http://www.cisco.com/en/US/products/ps10281/prod_release_notes_list.html
• Software Upgrade & Installation Information
  http://www.cisco.com/en/US/products/ps10281/prod_installation_guides_list.html
• UCS Troubleshooting Guide
  http://www.cisco.com/en/US/docs/unified_computing/ucs/ts/guide/UCSTroubleshooting.html
• UCS Faults Reference
  http://www.cisco.com/en/US/docs/unified_computing/ucs/ts/faults/reference/ErrMess.html
• Cisco Support Community
  https://supportforums.cisco.com/community/netpro/data-center/unified-computing

114 Troubleshooting UCS C-series

115 March “Month of Routing Protocol Technology”
• Session 9 – 11th Mar 2014
• Session 10 – 25th Mar 2014
April “Month of Wireless Technology”
And many more… Months and Technologies

116 Thank you.

