Digitalstroopwafel

Anything can be useful

Author: hans

Cisco 4500x – ROMMON upgrade steps

While upgrading IOS-XE on a few 4500x VSS clusters recently, I also had to upgrade the ROMMON version
to make them ISSU compatible with later IOS-XE releases.
The steps are described below:

 

First check the upgrade path here:

https://www.cisco.com/c/en/us/td/docs/switches/lan/catalyst4500/release/note/OL_24829.html#pgfId-178803

We follow the console upgrade path because the switch may shut down, in which case you need to manually pull the power cords and plug them back in.
It is recommended to upgrade the ROMMON on site in case of problems.

Step One

Copy the new ROMMON file to the bootflash (primary) and slavebootflash (standby).
Also check the current ROMMON version and make sure the files for it are available on the flash, so you can always go back in case of problems.
Be careful when downgrading the ROMMON: it is specific to the hardware version, and you can break the switch by going too low.

The specific upgrade path is described below:

Step Two
Attach your console cable to the standby node.

Step Three
SSH into the primary and execute the following command:
redundancy reload peer

Step Four
Press Ctrl-C on the console to interrupt the boot process on the standby node, so that it drops into ROMMON mode.

Step Five

Boot from the new rommon upgrade file:

boot bootflash:cat4500-e-ios-promupgrade-150-1r-SG11

Step Six
The switch now starts the ROMMON upgrade process. It can take approximately 20 minutes to finish.

Step Seven
The switch can shut down if you upgrade from SG10 to SG11 or SG14; see Cisco bug CSCut66603.
If this happens, pull the power cords (the power supply status LED will be flashing) and plug them back in.
The primary switch will now log that an incompatibility was detected:

Aug 6 10:35:29: %INSTALLER-3-ISSU_OP_ERR: 1 installer: Not in ISSU, service incompatibility detected, reloading the standby
Aug 6 10:35:29: %RF-5-RF_TERMINAL_STATE: 1 ha_mgr: Terminal state reached for (SSO)

The secondary switch will try to join the VSS, but because its ROMMON version now differs from that of the primary node, this fails and the switch reboots.

Step Eight

Repeat the above upgrade steps for the primary node. All network connections that were still up on the primary node will be disconnected.
After the primary switch reboots, the secondary node will probably become the active VSS node and part of the connectivity will be restored.

Step Nine

After the primary switch is upgraded and rebooted, it should rejoin the VSS cluster as the secondary node.
Verify on the primary node that the switch is back and redundancy is restored:
show issu state

You can also check the current ROMMON version once the switch has booted:

SWITCH22#show version
Cisco IOS Software, IOS-XE Software, Catalyst 4500 L3 Switch Software (cat4500e-UNIVERSALK9-M), Version 03.06.07.E RELEASE SOFTWARE (fc3)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2017 by Cisco Systems, Inc.
Compiled Wed 12-Jul-17 13:41 by prod_rel_team

Cisco IOS-XE software, Copyright (c) 2005-2015 by cisco Systems, Inc.
All rights reserved. Certain components of Cisco IOS-XE software are
licensed under the GNU General Public License ("GPL") Version 2.0. The
software code licensed under GPL Version 2.0 is free software that comes
with ABSOLUTELY NO WARRANTY. You can redistribute and/or modify such
GPL code under the terms of GPL Version 2.0.
(http://www.gnu.org/licenses/gpl-2.0.html) For more details, see the
documentation or "License Notice" file accompanying the IOS-XE software,
or the applicable URL provided on the flyer accompanying the IOS-XE
software.

ROM: 15.0(1r)SG11
SWITCH22 uptime is 23 hours, 57 minutes
Uptime for this control processor is 23 hours, 58 minutes
System returned to ROM by reload
System restarted at 11:20:07 CEST Mon Aug 6 2018
System image file is "bootflash:/cat4500e-universalk9.SPA.03.06.07.E.152-2.E7.bin"
Jawa Revision 2, Winter Revision 0x0.0x41
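
When several switches need checking, the ROM: line can be pulled out of the output instead of reading it by eye. A minimal Python sketch, assuming the "show version" output has already been captured as text (how it is collected is left out); the function name rommon_version is made up for this example.

```python
import re

def rommon_version(show_version_output):
    """Return the ROMMON version from captured 'show version' text, or None."""
    # The line of interest looks like: "ROM: 15.0(1r)SG11"
    match = re.search(r"^ROM:\s*(\S+)", show_version_output, re.MULTILINE)
    return match.group(1) if match else None

# Trimmed sample taken from the output above.
sample = """\
ROM: 15.0(1r)SG11
SWITCH22 uptime is 23 hours, 57 minutes
System returned to ROM by reload
"""
version = rommon_version(sample)
```

The pattern is anchored to the start of a line, so the "System returned to ROM by reload" line is not picked up by mistake.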

 

Cisco 4500x vss ISSU upgrade process and issues

Recently I had to upgrade a 4500x VSS pair to a newer IOS-XE release. While trying to start the ISSU process I ran into a few errors, and only after fixing those did I manage to start the ISSU process.
Below I have recorded some of the headaches I came across and how I fixed them, as well as the ISSU process itself.

I started with the usual checks on the health of the VSS pair (redundancy, ISSU state, license and software state).

Then I tried to start the issu process with:

issu loadversion 1 bootflash:<newimage> 11 slavebootflash:<newimage>

And instantly a nice error came up:

% Active Boot variable is invalid

If this happens, make sure the BOOTVAR points to the currently running IOS image and the config-register is set to 0x2102.
Also do not forget to do a write after changing any of the bootvar settings (after deleting an entry, do a wr; after adding one, do a wr; and so on).

In my case the BOOTVAR was set correctly but the error persisted. So I decided to do a forced failover from active to standby and back again, so that both units get reloaded
before trying the ISSU again.

redundancy force-failover

Now after both units were reloaded and the vss pair was back in its original state I tried the ISSU again.
The process started, the standby node finally rebooted with the NEW ios, and after a while gave an error on the console:

%C4K_REDUNDANCY-2-IOS_VERSION_CHECK_FAIL: IOS version mismatch. Active supervisor version is 15.2(2)E5 (cat4500e-UNIVERSALK9-M). Standby supervisor version is 15.2(2)E8 (cat4500e-UNIVERSALK9-M). Redundancy feature may not work as expected.
%C4K_REDUNDANCY-2-NON_SYMMETRICAL_REDUNDANT_SYSTEM: STANDBY:STANDBY supervisor will operate in fallback redundancy mode rpr.
%C4K_REDUNDANCY-3-COMMUNICATION: STANDBY:Communication with the peer Supervisor has been established 
%C4K_REDUNDANCY-2-VS_REBOOT_ON_RPR_FALLBACK: STANDBY:Supervisor in virtual-switch configuration cannot operate in redundancy mode RPR, will be reset 
%RF-5-RF_RELOAD: STANDBY:Self Reload. Reason: Virtual-switch fallback to RPR 
%SYS-5-RELOAD: STANDBY:Reload requested by Platform redundancy manager. Reload Reason: Virtual-switch fallback to RPR. 
Message from sysmgr: Reason Code:[3] Reset Reason:Reset/Reload requested by [console]. [Reload command]

After this, the standby node simply rebooted back into the old IOS.
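
The two versions involved can be extracted from the mismatch message, which is handy when looking them up in a compatibility table afterwards. A rough Python sketch, assuming the log text is available as a string; the names MISMATCH and version_mismatch are made up for this example.

```python
import re

# Matches the IOS_VERSION_CHECK_FAIL message shown above.
MISMATCH = re.compile(
    r"Active supervisor version is (?P<active>\S+) .*?"
    r"Standby supervisor version is (?P<standby>\S+) "
)

def version_mismatch(log_text):
    """Return (active, standby) IOS versions from the mismatch message, or None."""
    match = MISMATCH.search(log_text)
    if match is None:
        return None
    return match.group("active"), match.group("standby")

log_line = (
    "%C4K_REDUNDANCY-2-IOS_VERSION_CHECK_FAIL: IOS version mismatch. "
    "Active supervisor version is 15.2(2)E5 (cat4500e-UNIVERSALK9-M). "
    "Standby supervisor version is 15.2(2)E8 (cat4500e-UNIVERSALK9-M). "
    "Redundancy feature may not work as expected."
)
versions = version_mismatch(log_line)
```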

This one gave me some headaches at first, until someone pointed out to me that I should check the ISSU compatibility matrix here:
http://docwiki.cisco.com/wiki/Catalyst_4500/4900_series_ISSU_compatibility_matrix

For some IOS-XE versions, especially older ones, it is not possible to upgrade directly to the latest version in the same software train or in a different one.
After checking the compatibility matrix, it turned out I could not upgrade from 3.6.5 to 3.6.8 directly but had to go to 3.6.7 first.
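
Finding the intermediate hops is a shortest-path search over the compatibility table. Below is a small Python sketch of that idea. The COMPATIBLE table is a hypothetical excerpt: only the 3.6.5 to 3.6.7 to 3.6.8 hops come from my case, and the real entries have to be taken from the Cisco matrix.

```python
from collections import deque

# Hypothetical excerpt: each version maps to the versions it can reach via ISSU.
# Only these two hops are taken from the text; fill in the rest from the matrix.
COMPATIBLE = {
    "3.6.5": ["3.6.7"],
    "3.6.7": ["3.6.8"],
}

def issu_path(current, target, compatible=COMPATIBLE):
    """Breadth-first search for the shortest chain of ISSU-compatible hops."""
    queue = deque([[current]])
    seen = {current}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in compatible.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no chain of ISSU hops found in this table

path = issu_path("3.6.5", "3.6.8")
```

A result of None only means the table passed in contains no chain of hops; it says nothing about versions missing from the table.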

After uploading this image and restarting the ISSU process, I finally succeeded in using ISSU to update the IOS version.

Below are the ISSU steps normally used for Cisco 4500-x switches.

ISSU Upgrade steps:

There are two ways to perform an ISSU upgrade: manually, with four commands; or automatically, with one command.
To have more control, the four-command method is preferred.

Step one:

Check the following:
The active and the standby nodes must have the same (supervisor engine) hardware and software versions (ROMMON, license features).
Make sure to double-check the required ROMMON version and whether a ROMMON update is needed!

Make a backup of the config and the current IOS image  🙂

Step two:

Both the new and the old Cisco IOS software images must be present on the file systems of both the active and the standby supervisor engines
(bootflash of the active and slavebootflash of the standby) before you begin the ISSU process.

bootflash: <– active node
slavebootflash: <– standby node

Step three:

Check that the redundancy settings are OK.
The standby node should be in STANDBY HOT state.

show redundancy

swpg001#show redundancy
Redundant System Information :
------------------------------
       Available system uptime = 48 weeks, 1 hour, 48 minutes
Switchovers system experienced = 2
              Standby failures = 4
        Last switchover reason = user_forced
                 Hardware Mode = Duplex
    Configured Redundancy Mode = Stateful Switchover
     Operating Redundancy Mode = Stateful Switchover
              Maintenance Mode = Disabled
                Communications = Up

Current Processor Information :
------------------------------
               Active Location = slot 1/1
        Current Software state = ACTIVE
       Uptime in current state = 20 hours, 28 minutes
                 Image Version = Cisco IOS Software, IOS-XE Software, Catalyst 4500 L3 Switch  Software (cat4500e-UNIVERSALK9-M), Version 03.06.05.E RELEASE SOFTWARE (fc2)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2016 by Cisco Systems, Inc.
Compiled Thu 02-Jun-16 04:02 by prod
               BOOT = bootflash:/cat4500e-universalk9.SPA.03.06.05.E.152-2.E5.bin,1;
        Configuration register = 0x2102

Peer Processor Information :
------------------------------
              Standby Location = slot 2/1
        Current Software state = STANDBY HOT
       Uptime in current state = 16 hours, 41 minutes
                 Image Version = Cisco IOS Software, IOS-XE Software, Catalyst 4500 L3 Switch  Software (cat4500e-UNIVERSALK9-M), Version 03.06.05.E RELEASE SOFTWARE (fc2)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2016 by Cisco Systems, Inc.
Compiled Thu 02-Jun-16 04:02 by pr
               BOOT = bootflash:/cat4500e-universalk9.SPA.03.06.05.E.152-2.E5.bin,1;
        Configuration register = 0x2102
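
The same check can be scripted as a pre-flight test. A minimal Python sketch, assuming the "show redundancy" output is available as text; redundancy_ready is a made-up helper name, and the sample is trimmed down to the lines the check looks at.

```python
import re

def redundancy_ready(show_redundancy_output):
    """True if the captured text shows an SSO pair with a hot standby."""
    states = re.findall(r"Current Software state\s*=\s*(.+)", show_redundancy_output)
    operating = re.search(r"Operating Redundancy Mode\s*=\s*(.+)", show_redundancy_output)
    return (
        # First match is the current processor, second one the peer.
        [s.strip() for s in states] == ["ACTIVE", "STANDBY HOT"]
        and operating is not None
        and operating.group(1).strip() == "Stateful Switchover"
    )

sample = """\
     Operating Redundancy Mode = Stateful Switchover
        Current Software state = ACTIVE
        Current Software state = STANDBY HOT
"""
ready = redundancy_ready(sample)
```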

 

Step four:

Verify the ISSU state and write down the slot IDs of the active and standby nodes.
In this case the primary node has slot 1 and the secondary node slot 11.

swpg001# show issu state detail
Slot = 1
RP State = Active
ISSU State = Init
Operating Mode = Stateful Switchover
Current Image = bootflash:/cat4500e-universalk9.SPA.03.06.05.E.152-2.E5.bin
Pre-ISSU (Original) Image = N/A
Post-ISSU (Targeted) Image = N/A

Slot = 11
RP State = Standby
ISSU State = Init
Operating Mode = Stateful Switchover
Current Image = bootflash:/cat4500e-universalk9.SPA.03.06.05.E.152-2.E5.bin
Pre-ISSU (Original) Image = N/A
Post-ISSU (Targeted) Image = N/A
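
Since the slot IDs are needed in every following command, it can help to read them from the output rather than retype them. A small Python sketch, assuming "show issu state detail" has been captured as text with one "key = value" pair per line; parse_issu_state and slot_roles are made-up helper names.

```python
def parse_issu_state(output):
    """Split 'show issu state detail' text into one dict per slot."""
    slots = []
    for line in output.splitlines():
        if "=" not in line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if key == "Slot":
            slots.append({})  # a "Slot" line starts a new block
        if slots:
            slots[-1][key] = value
    return slots

def slot_roles(slots):
    """Map the RP State (Active/Standby) to its slot number."""
    return {s["RP State"]: int(s["Slot"]) for s in slots}

sample = """\
Slot = 1
RP State = Active
ISSU State = Init

Slot = 11
RP State = Standby
ISSU State = Init
"""
roles = slot_roles(parse_issu_state(sample))
```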

Step five:

Check the BOOTVAR and, if needed, set it to the currently running image.

The current running version should be set in the current BOOTVAR.
Also the config-register should be set to 0x2102

Do not forget to write the config if the BOOTVAR is changed!

swpg001#sh bootvar

BOOT variable = bootflash:/cat4500e-universalk9.SPA.03.06.05.E.152-2.E5.bin,1;

CONFIG_FILE variable does not exist
BOOTLDR variable does not exist
Configuration register is 0x2102

Standby BOOT variable = bootflash:/cat4500e-universalk9.SPA.03.06.05.E.152-2.E5.bin,1;
Standby CONFIG_FILE variable does not exist
Standby BOOTLDR variable does not exist
Standby Configuration register is 0x2102

*** EXTRA INFO: ***

It is possible that, even though the BOOTVAR is set to the right value, the ISSU process fails with the error:

“Active BOOT variable is invalid”

In this case, after setting the BOOTVAR again and doing a WR MEM, do a forced failover twice
so that both nodes are reloaded before attempting ISSU again.
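
The BOOTVAR and config-register checks from this step can be sketched in Python as follows. This assumes the "sh bootvar" output is captured as text with each item at the start of a line, as in the output above; check_bootvar is a made-up name. It only covers the two conditions described in this step and does not explain the case where the error persists anyway.

```python
import re

def check_bootvar(output, running_image):
    """Return a list of problems found in captured 'show bootvar' text."""
    problems = []
    for prefix in ("", "Standby "):
        label = prefix.strip() or "Active"
        boot = re.search(rf"^{prefix}BOOT variable = (\S+)", output, re.MULTILINE)
        reg = re.search(rf"^{prefix}Configuration register is (\S+)", output, re.MULTILINE)
        # The BOOT value looks like "bootflash:/image.bin,1;" - take the first entry.
        first = boot.group(1).split(",")[0].rstrip(";") if boot else None
        if first != running_image:
            problems.append(f"{label}: BOOT is {first}, expected {running_image}")
        if reg is None or reg.group(1).lower() != "0x2102":
            problems.append(f"{label}: config register is not 0x2102")
    return problems

IMAGE = "bootflash:/cat4500e-universalk9.SPA.03.06.05.E.152-2.E5.bin"
sample = f"""\
BOOT variable = {IMAGE},1;
Configuration register is 0x2102
Standby BOOT variable = {IMAGE},1;
Standby Configuration register is 0x2102
"""
problems = check_bootvar(sample, IMAGE)
```

An empty list means both nodes point at the running image with the register at 0x2102.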

 

Step six:

First load the new IOS version on the current STANDBY (slot id 11 in this case) node

issu loadversion 1 bootflash:new_image 11 slavebootflash:new_image

Note: with the first command the STANDBY node will reboot with the NEW ios image.

It may take several seconds after entering the issu loadversion command for Cisco IOS software to load onto the standby supervisor engine and the standby supervisor engine to transition to SSO mode.
If you enter the show issu state command too quickly, you may not see the information you need.

Step seven:

After the standby node is reloaded, we can switch over to this node as the active node:

issu runversion 11 slavebootflash:new_image
This command will reload the Active unit. Proceed ? [confirm]

A switchover occurs at this point: the standby node becomes the active node, while the former active node reboots (still with the old image at this point; it gets the new image in the last step) and becomes the standby node.
If everything looks OK, the rollback timer needs to be stopped.
If this is not done, the system automatically aborts the ISSU process and reverts to the original Cisco IOS software version. By default the rollback timer is 45 minutes.

To stop the timer and accept the new running version continue to the next step.
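
To keep an eye on the clock during the change window, the moment of the automatic rollback can be worked out up front. A trivial Python sketch, assuming the timer starts when issu runversion is entered and still has its default of 45 minutes (check the configured value on the switch); rollback_deadline is a made-up name.

```python
from datetime import datetime, timedelta

ROLLBACK_MINUTES = 45  # default mentioned above; the configured value may differ

def rollback_deadline(runversion_time, minutes=ROLLBACK_MINUTES):
    """Latest moment to accept the new version before the automatic rollback."""
    return runversion_time + timedelta(minutes=minutes)

# Example with an arbitrary start time.
deadline = rollback_deadline(datetime(2018, 8, 6, 10, 0))
```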

Step eight:

After the previous command and the switchover, the former standby node is the new active node.
Do not forget that the slot IDs are tied to the physical nodes and do not change with the switchover. So in the new situation the active node has slot ID 11 and the standby node has slot ID 1.

We want to accept the new image running on the now Active node (slot id 11 in this case)

issu acceptversion 11 bootflash:new_image

This stops the rollback timer and accepts the new image as the new default image.

Step nine:

Now for the last step we need to load the standby supervisor with the new image

issu commitversion 1 slavebootflash:new_image

This will reboot the standby node with the new image.
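
Because it is easy to mix up the slot IDs and file systems across the four commands, the sequence can be generated from the values noted earlier. A small Python sketch that only builds the command strings in the form used in the steps above, and does not send anything to a switch; issu_commands is a made-up name.

```python
def issu_commands(active_slot, standby_slot, image):
    """Build the four manual ISSU commands in the order used in the steps above.

    active_slot and standby_slot are the roles BEFORE the upgrade starts.
    """
    return [
        f"issu loadversion {active_slot} bootflash:{image} "
        f"{standby_slot} slavebootflash:{image}",
        f"issu runversion {standby_slot} slavebootflash:{image}",
        # After the switchover the roles swap, but the slot IDs stay with the nodes.
        f"issu acceptversion {standby_slot} bootflash:{image}",
        f"issu commitversion {active_slot} slavebootflash:{image}",
    ]

commands = issu_commands(1, 11, "new_image")
```

Printing the list before the change window gives a checklist to paste from, one command at a time, with the health checks in between.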

After the switch is back ‘online’, check that all states are OK:

show redundancy states

show redundancy

show issu state detail

 

OPTIONAL:  Last step

Do a redundancy force-failover so that the original primary node, which is now standby, becomes the active node again.

 

© 2018 Digitalstroopwafel
