Fabric failover is a unique feature of the Cisco Unified Computing System that provides a “teaming-like” function in hardware. This function is entirely transparent to the operating system running on the server and does not require any configuration inside the OS. I think this feature is quite useful, because it creates resiliency for UCS blade or rack servers without depending on any drivers or configuration inside the operating system.
I have often encountered servers that were supposed to be redundantly connected to the network, as they were physically connected to two different switches. However, due to missing or misconfigured teaming, these servers would still lose their connectivity if the primary link failed. Therefore, I find a feature that offers resiliency against path failures for Ethernet traffic, without any need for teaming configuration inside the operating system, very interesting. This is especially true for bare-metal Windows or Linux servers on UCS blades or rack servers.
In this post I do not intend to cover the basics of fabric failover, as this has already been done excellently by other bloggers. So if you need a quick primer or refresher on this feature, I recommend that you read Brad Hedlund’s classic post “Cisco UCS Fabric Failover: Slam Dunk? or So What?”.
Instead of rehashing the basic principles of fabric failover, I intend to dive a bit deeper into the UCSM GUI, UCSM CLI and NX-OS CLI to examine and illustrate the operation of this feature inside UCS. This serves a dual purpose: gaining more insight into the actual implementation of the fabric failover feature and getting more familiar with some essential UCS screens and commands.
So to start, I created a service profile for a VMware ESXi host that has two separate vNICs, named eth0 and eth1. vNIC eth0 has fabric A as the primary fabric with failover to B and vNIC eth1 has fabric B as the primary fabric with failover to A.
Note: This setup is not typical for an ESXi deployment, but more common for Windows or Linux bare-metal deployments. In a VMware setup, failover and load balancing are usually configured at the vSwitch level. However, for this specific example I decided to use ESXi because I already had it running in the lab anyway, and it also illustrates how VM MAC addresses are handled by the fabric failover mechanism.
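For reference, the fabric and failover setting of a vNIC can also be configured from the UCSM CLI. The snippet below is just a minimal sketch of how eth0 could be created with fabric A as primary and failover to B (double-check the exact syntax against the UCSM CLI configuration guide for your version):

UCS-60-B# scope org /
UCS-60-B /org # scope service-profile POD60-ESX-1
UCS-60-B /org/service-profile # create vnic eth0
UCS-60-B /org/service-profile/vnic # set fabric a-b
UCS-60-B /org/service-profile/vnic # commit-buffer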
The screenshot below shows the vNIC setup for my service profile:
In order to analyze the behavior of the fabric failover feature we need to find the vEthernet interfaces that are associated with these vNICs. Using the UCSM GUI we can find them in the “VIF Paths” tab:
As you can see, vNIC eth0 is associated with two virtual Ethernet interfaces: veth 703 on fabric A and veth 704 on fabric B. Likewise, vNIC eth1 is associated with veth 705 on fabric B and veth 706 on fabric A. This view also illustrates the physical component-to-component path used by these interfaces. The same information can be obtained from the UCSM CLI using the show service-profile circuit command:
UCS-60-B# show service-profile circuit name POD60-ESX-1 | egrep "eth|Fabric|VIF"
Fabric ID: A
    VIF  vNIC  Link State  Overall Status  Prot State  Prot Role  Admin Pin  Oper Pin  Transport
    703  eth0  Up          Active          Active      Primary    0/0        0/1       Ether
    706  eth1  Up          Active          Passive     Backup     0/0        0/1       Ether
Fabric ID: B
    VIF  vNIC  Link State  Overall Status  Prot State  Prot Role  Admin Pin  Oper Pin  Transport
    704  eth0  Up          Active          Passive     Backup     0/0        0/2       Ether
    705  eth1  Up          Active          Active      Primary    0/0        0/2       Ether
This command also reveals some additional detail about the failover roles of the VIFs. VIF 703 on fabric A is marked as role primary and state active for eth0, while VIF 704 on fabric B is marked as role backup and state passive for that same vNIC. For eth1 we can see that VIF 705 on fabric B is marked as primary/active and VIF 706 on fabric A as backup/passive. Another way to view the same information from the UCSM CLI is to set the scope to the physical adapter and view the vNICs from there:
UCS-60-A# scope adapter 1/1/1
UCS-60-A /chassis/server/adapter # show host-eth-if

Eth Interface:
    ID         Dynamic MAC Address Name       Operability
    ---------- ------------------- ---------- -----------
    1          00:25:B5:60:00:0E   eth0       Operable
    2          00:25:B5:60:00:0F   eth1       Operable

UCS-60-A /chassis/server/adapter # scope host-eth-if 1
UCS-60-A /chassis/server/adapter/host-eth-if # show vif

VIF:
    ID         Fabric ID Transport Tag   Status      Overall Status
    ---------- --------- --------- ----- ----------- --------------
    703        A         Ether     0     Allocated   Active
    704        B         Ether     0     Allocated   Passive

UCS-60-A /chassis/server/adapter/host-eth-if # up
UCS-60-A /chassis/server/adapter # scope host-eth-if 2
UCS-60-A /chassis/server/adapter/host-eth-if # show vif

VIF:
    ID         Fabric ID Transport Tag   Status      Overall Status
    ---------- --------- --------- ----- ----------- --------------
    705        B         Ether     0     Allocated   Active
    706        A         Ether     0     Allocated   Passive
Now let’s move to the NX-OS CLI and see how this same information is represented in the networking components of the UCS Fabric Interconnects. Fabric Interconnect A shows the following:
UCS-60-A# connect nxos a
UCS-60-A(nxos)# show int veth 703
Vethernet703 is up
    Bound Interface is Ethernet1/1/1
    Hardware: Virtual, address: 000d.ecf3.1140 (bia 000d.ecf3.1140)
    Description: server 1/1, VNIC eth0
    Encapsulation ARPA
    Port mode is trunk
    EtherType is 0x8100
    Rx
    2729 unicast packets  0 multicast packets  36 broadcast packets
    2765 input packets  695320 bytes
    0 input packet drops
    Tx
    1553 unicast packets  76282 multicast packets  116278 broadcast packets
    194113 output packets  18743038 bytes
    0 flood packets
    0 output packet drops

UCS-60-A(nxos)# show int veth 706
Vethernet706 is up
    Bound Interface is Ethernet1/1/1
    Hardware: Virtual, address: 000d.ecf3.1140 (bia 000d.ecf3.1140)
    Description: server 1/1, VNIC eth1
    Encapsulation ARPA
    Port mode is trunk
    EtherType is 0x8100
    Rx
    0 unicast packets  0 multicast packets  0 broadcast packets
    0 input packets  0 bytes
    0 input packet drops
    Tx
    0 unicast packets  0 multicast packets  0 broadcast packets
    0 output packets  0 bytes
    0 flood packets
    0 output packet drops

UCS-60-A(nxos)# show mac address-table int veth 703
Legend:
        * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
        age - seconds since last seen,+ - primary entry using vPC Peer-Link
   VLAN     MAC Address      Type      age     Secure NTFY    Ports
---------+-----------------+--------+---------+------+----+------------------
* 198      0025.b560.000e    static   0          F     F    Veth703
* 199      0050.56b3.3173    dynamic  10         F     F    Veth703
* 932      0050.567e.66fb    dynamic  40         F     F    Veth703

UCS-60-A(nxos)# show mac address-table int veth 706
The output confirms that veth 703 is acting as the active interface for vNIC eth0. The statistics show that this interface has been forwarding packets and in the MAC address table we can see both static (vNIC MAC) and dynamic (VM or VMK MAC) entries associated with the interface. Interface veth 706 is passive for vNIC eth1 and this is confirmed by the fact that all packet counters are set to 0 and there are no MAC addresses associated with the interface.
To complement this information let’s obtain the same output from FI-B:
UCS-60-B(nxos)# show int veth 704, veth 705
Vethernet704 is up
    Bound Interface is Ethernet1/1/1
    Hardware: Virtual, address: 000d.ecf0.7a00 (bia 000d.ecf0.7a00)
    Description: server 1/1, VNIC eth0
    Encapsulation ARPA
    Port mode is trunk
    EtherType is 0x8100
    Rx
    0 unicast packets  0 multicast packets  0 broadcast packets
    0 input packets  0 bytes
    0 input packet drops
    Tx
    0 unicast packets  0 multicast packets  0 broadcast packets
    0 output packets  0 bytes
    0 flood packets
    0 output packet drops

Vethernet705 is up
    Bound Interface is Ethernet1/1/1
    Hardware: Virtual, address: 000d.ecf0.7a00 (bia 000d.ecf0.7a00)
    Description: server 1/1, VNIC eth1
    Encapsulation ARPA
    Port mode is trunk
    EtherType is 0x8100
    Rx
    34 unicast packets  53 multicast packets  72 broadcast packets
    159 input packets  24105 bytes
    0 input packet drops
    Tx
    33 unicast packets  89330 multicast packets  138303 broadcast packets
    227666 output packets  21926655 bytes
    0 flood packets
    0 output packet drops

UCS-60-B(nxos)# show mac address-table interface veth 704
UCS-60-B(nxos)# show mac address-table interface veth 705
Legend:
        * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
        age - seconds since last seen,+ - primary entry using vPC Peer-Link
   VLAN     MAC Address      Type      age     Secure NTFY    Ports
---------+-----------------+--------+---------+------+----+------------------
* 198      0025.b560.000f    static   0          F     F    Veth705
* 199      0050.569a.67a3    dynamic  100        F     F    Veth705
* 199      0050.56b3.6001    dynamic  780        F     F    Veth705
This shows similar results: interface veth 704 is passive for vNIC eth0 and not forwarding any traffic. Interface veth 705, on the other hand, is active for eth1, as can be seen from the traffic statistics and the MAC address table.
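Before actually breaking anything, it can also be useful to check which border (uplink) interface each vEthernet is currently pinned to; this corresponds to the “Oper Pin” column in the show service-profile circuit output. From NX-OS on the fabric interconnects, the pinning can be inspected with the following commands (output not shown here, but they list the vEthernet-to-uplink mappings):

UCS-60-A(nxos)# show pinning server-interfaces
UCS-60-A(nxos)# show pinning border-interfaces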
Now that we have examined the baseline setup in UCSM and NX-OS, it is time to let the fabric failover feature kick in and see what happens. To force a failover, I disable the uplinks to Fabric Interconnect A on the upstream switches. This will leave the vEthernet interfaces on FI-A without an uplink to be pinned to, and consequently they will go down. This should trigger the fabric failover mechanism for vNIC eth0 and make interface veth 704 on fabric B the active interface for eth0. It should also cause the MAC addresses that were learned on fabric A for eth0 to move to fabric B.
So I disable the uplinks to FI-A and then reexamine the situation from NX-OS on FI-A:
UCS-60-A(nxos)# sh int veth 703, veth 706 brief

--------------------------------------------------------------------------------
Vethernet      VLAN   Type Mode   Status  Reason                   Speed
--------------------------------------------------------------------------------
Veth703        198    eth  trunk  down    ENM Source Pin Fail      auto
Veth706        198    eth  trunk  down    ENM Source Pin Fail      auto
As expected, the vEthernet interfaces have gone down due to the lack of an uplink to be pinned to. Now let’s have a look at FI-B:
UCS-60-B(nxos)# show int veth 704
Vethernet704 is up
    Bound Interface is Ethernet1/1/1
    Hardware: Virtual, address: 000d.ecf0.7a00 (bia 000d.ecf0.7a00)
    Description: server 1/1, VNIC eth0
    Encapsulation ARPA
    Port mode is trunk
    EtherType is 0x8100
    Rx
    78 unicast packets  0 multicast packets  16 broadcast packets
    94 input packets  15359 bytes
    0 input packet drops
    Tx
    48 unicast packets  4282 multicast packets  8613 broadcast packets
    12943 output packets  1255967 bytes
    0 flood packets
    0 output packet drops

UCS-60-B(nxos)# show mac address-table int veth 704
Legend:
        * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
        age - seconds since last seen,+ - primary entry using vPC Peer-Link
   VLAN     MAC Address      Type      age     Secure NTFY    Ports
---------+-----------------+--------+---------+------+----+------------------
* 198      0025.b560.000e    static   0          F     F    Veth704
* 199      0050.56b3.3173    dynamic  110        F     F    Veth704
* 932      0050.567e.66fb    dynamic  460        F     F    Veth704
As can be seen from the packet counters, interface veth 704 is now carrying traffic for vNIC eth0 and the MAC addresses have moved to Fabric Interconnect B as well.
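If you would rather follow a single MAC address than a whole vEthernet interface, the NX-OS MAC table can also be filtered by address. For example, to confirm on which interface the static MAC of eth0 now lives (using the MAC address from my lab):

UCS-60-B(nxos)# show mac address-table address 0025.b560.000e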
Not only have the MAC addresses moved inside UCS from FI-A to FI-B, but UCS has also sent out gratuitous ARP packets for each of these MAC addresses to notify the upstream switches of the move. This mechanism prevents packets destined for these MAC addresses from being black-holed in the LAN. To show the gARPs being sent through Fabric Interconnect B, I ran ethanalyzer from NX-OS using the following command:
UCS-60-B(nxos)# ethanalyzer local interface inbound-hi display-filter "arp" limit-captured-frames 0
When I shut down the uplinks to FI-A, ethanalyzer captured the following packets:
2012-09-27 14:55:06.465188 00:50:56:7f:68:d8 -> ff:ff:ff:ff:ff:ff ARP Gratuitous ARP for 0.0.0.0 (Request)
2012-09-27 14:55:06.465210 00:50:56:7f:68:d8 -> ff:ff:ff:ff:ff:ff ARP Gratuitous ARP for 0.0.0.0 (Request)
2012-09-27 14:55:06.470990 00:25:b5:60:00:0c -> ff:ff:ff:ff:ff:ff ARP Gratuitous ARP for 0.0.0.0 (Request)
2012-09-27 14:55:06.471884 00:25:b5:60:00:0c -> ff:ff:ff:ff:ff:ff ARP Gratuitous ARP for 0.0.0.0 (Request)
2012-09-27 14:55:06.472197 00:25:b5:60:00:0c -> ff:ff:ff:ff:ff:ff ARP Gratuitous ARP for 0.0.0.0 (Request)
2012-09-27 14:55:06.480495 00:50:56:b3:31:73 -> ff:ff:ff:ff:ff:ff ARP Gratuitous ARP for 0.0.0.0 (Request)
2012-09-27 14:55:06.480518 00:50:56:b3:31:73 -> ff:ff:ff:ff:ff:ff ARP Gratuitous ARP for 0.0.0.0 (Request)
2012-09-27 14:55:06.480524 00:50:56:7e:66:fb -> ff:ff:ff:ff:ff:ff ARP Gratuitous ARP for 0.0.0.0 (Request)
2012-09-27 14:55:06.480545 00:50:56:7e:66:fb -> ff:ff:ff:ff:ff:ff ARP Gratuitous ARP for 0.0.0.0 (Request)
2012-09-27 14:55:06.488295 00:25:b5:60:00:0e -> ff:ff:ff:ff:ff:ff ARP Gratuitous ARP for 0.0.0.0 (Request)
2012-09-27 14:55:06.488723 00:25:b5:60:00:0e -> ff:ff:ff:ff:ff:ff ARP Gratuitous ARP for 0.0.0.0 (Request)
2012-09-27 14:55:06.488974 00:25:b5:60:00:0e -> ff:ff:ff:ff:ff:ff ARP Gratuitous ARP for 0.0.0.0 (Request)
This clearly shows that gratuitous ARPs are sent for the MAC addresses that were associated with the failed vEthernet interfaces.
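Since ethanalyzer uses the Wireshark display-filter syntax, the capture could also be narrowed down to a single source MAC if the broadcast noise gets in the way. Something along these lines should work (again using the eth0 MAC from my lab):

UCS-60-B(nxos)# ethanalyzer local interface inbound-hi display-filter "arp && eth.src==00:25:b5:60:00:0e" limit-captured-frames 0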
Now let’s revisit the UCSM CLI commands that we looked at before triggering the failover:
UCS-60-B# show service-profile circuit name POD60-ESX-1 | egrep "eth|Fabric|VIF"
Fabric ID: A
    VIF  vNIC  Link State  Overall Status  Prot State     Prot Role  Admin Pin  Oper Pin  Transport
    703  eth0  Error       Error           No Protection  Primary    0/0        0/0       Ether
    706  eth1  Error       Error           No Protection  Backup     0/0        0/0       Ether
Fabric ID: B
    VIF  vNIC  Link State  Overall Status  Prot State     Prot Role  Admin Pin  Oper Pin  Transport
    704  eth0  Up          Active          Active         Backup     0/0        0/2       Ether
    705  eth1  Up          Active          Active         Primary    0/0        0/2       Ether

UCS-60-B# scope adapter 1/1/1
UCS-60-B /chassis/server/adapter # show host-eth-if

Eth Interface:
    ID         Dynamic MAC Address Name       Operability
    ---------- ------------------- ---------- -----------
    1          00:25:B5:60:00:0E   eth0       Operable
    2          00:25:B5:60:00:0F   eth1       Operable

UCS-60-B /chassis/server/adapter # scope host-eth-if 1
UCS-60-B /chassis/server/adapter/host-eth-if # show vif

VIF:
    ID         Fabric ID Transport Tag   Status      Overall Status
    ---------- --------- --------- ----- ----------- --------------
    703        A         Ether     0     Allocated   Link Down
    704        B         Ether     0     Allocated   Active
The output here also indicates that vNIC eth0 has shifted its traffic to fabric B.
Finally, let’s see what the situation looks like in the “VIF Paths” tab in the GUI:
So let’s summarize the behavior that we observed:
When fabric failover is enabled for a vNIC, two vEthernet interfaces are created, one on each fabric. These two VIFs work together as an active/standby failover pair. By default, the vEthernet interface on the primary fabric carries all the traffic, but when the primary fabric fails, the vEthernet on the other fabric assumes the forwarding role. The failover also causes the associated MAC addresses to be relearned, both within UCS and on the upstream switches through the gARP process.
In addition, this post has shown how to use the UCSM GUI, UCSM CLI, and NX-OS CLI to analyze the frame forwarding inside UCS.
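As a quick reference, these are the commands that were used throughout this post:

show service-profile circuit name <profile-name>                                      (UCSM CLI)
scope adapter <chassis>/<blade>/<adapter>, show host-eth-if, show vif                 (UCSM CLI)
connect nxos a|b                                                                      (UCSM CLI to NX-OS)
show interface vethernet <id>                                                         (NX-OS)
show mac address-table interface vethernet <id>                                       (NX-OS)
ethanalyzer local interface inbound-hi display-filter "arp" limit-captured-frames 0   (NX-OS)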
Do you know a CLI command or another way to either bring down the active vEthernet interface or fail over to the backup, if you wanted to test this without taking down the FI uplinks or the IOM module?
Hi Joe,
I don’t know of any action in the GUI or CLI that would allow you to shut down a specific vEthernet interface. However, several alternative approaches come to mind:
– If you select a specific NIC on the blade in the equipment tab, there is a “reset connectivity” button. I haven’t tested this, but with a bit of luck this would trigger the failover, at least for a brief period of time.
– If you use static pinning between the FI and the IOM (instead of a port-channel), then you could disable the link between the FI and the IOM that the blade is pinned to. However, this will also affect other blades that are pinned to that same uplink.
None of these options does exactly what you want, but maybe one of them works well enough for your scenario. This of course really depends on what you are trying to achieve. I suppose you are looking at testing failover for a single blade on a live UCS system?
Tom
P.S. I haven’t had the opportunity to test these options in the lab yet, so you would have to try them out yourself to see if they do what you want.
Hi Tom,
Nice post, and I have one question for clarification.
Veth 705 is active on Fabric B. But from ESXi I have two NICs, one connected to Fabric A and the second to Fabric B.
From ESXi:
vmnic 0
vmnic 1
Even though veth 705 is active via Fabric B, when I go to networking in ESXi I can see:
vmnic 0 = full
vmnic 1 = standby
Why is vmnic 1 showing standby? Please explain.
I would also appreciate it if you could explain how this technology works for VM-FEX.
I heard that with VM-FEX one vmnic has to be used for VM traffic and the other vmnic for vMotion; in that case, how can I provide redundancy?
Hi Abdul,
It’s a bit hard to answer this question without knowing a bit more about your configuration, but the “full” and “standby” states that you are seeing in ESXi do not map to the UCS active and standby states; they refer to the configuration in vSphere/ESXi. The hardware failover function is performed by UCS at the adapter level and is transparent to the OS (ESXi).
Normally, I would either provide redundancy at the network adapter hardware layer through fabric failover, or at the software layer (vSwitch, teaming), not both.
In this lab exercise I created two vNICs with failover enabled. Each of these vNICs has two associated vEthernet interfaces (so four vEths in total) and is protected at the hardware layer. The two vNICs are presented to ESXi, which may then apply its own failover mechanisms at the vSwitch level. If you have this set up as “active/standby” in the vSwitch, that would explain what you are seeing. Keep in mind that it is hard for me to judge this without seeing your actual setup.
Hope this helps,
Tom
Hi Many Thanks for your reply,
We used the M81KR mezzanine card (two ports, one connecting to IOM-1 and the second to IOM-2).
From the fabric interconnects we created two vNICs. In ESXi I can see two NICs, but one of them is standby:
vmnic 0 and vmnic 1 (standby).
My question is: why is vmnic 1 showing as standby in ESXi?
Hi Abdul,
As I said, I think this has more to do with your ESXi vSwitch configuration than with UCS. Hardware failover is transparent to the operating system, so it would not be visible at the ESXi level.
In order to provide a better answer, it would be useful if you could send me a screenshot of what exactly it is that you are seeing, along with screenshots of the vSwitch configuration. You can send them directly to me at tom.lijnse@layerzero.nl and I will have a look to see if I can give you a better explanation.
Tom
Many Thanks,
I observed this during my UCS study class. So at the moment I don’t have any screenshot to share with you.
As you said, if we create two vNICs (one for Fabric A and the other for Fabric B, with failover enabled, for the same VLANs), will both of them be active for data traffic? In that case, is there no chance of duplicate MAC learning and MAC flooding by the uplink switch?
Please explain: if both IO modules are actively forwarding traffic, how is the traffic load balanced?
If we enable failover on the UCS side, is it still required to create two vNICs for the same VLAN?
Hi Abdul,
If you have two vNICs connected to an ESXi host, the load balancing is determined by the VMware vSwitch, and the default method is to load balance on a per-VM basis. So the first VM would use the first vmnic, the next VM would use the second vmnic, and so on. This means that both fabrics are actively used, but each individual VM MAC address only appears on a single fabric.
In this scenario, you do not really need the hardware failover, because the vSwitch will not only load balance, but also fail over to the other vmnic if one of them fails.
On non-hypervisor servers, such as Windows servers, the hardware failover can function as a replacement for software teaming/bonding in the OS. In this case, I would only create a single vNIC for a VLAN, not two. On the server side, you would only see one adapter, configured with the IP address for the VLAN. To the Windows admin it looks like the server is single-homed. However, the NIC is still protected by UCS, without the need to create an adapter team in Windows.
Tom
Well explained and many thanks. I would really appreciate it if you could share some technical notes with screenshots or examples for VM-FEX (normal and universal pass-through), explaining how the VMkernel and VM traffic is distributed over the vSwitch and the VM-FEX pass-through switch.
Hi Abdul,
I am sorry, I don’t really have notes or screenshots on VM-FEX to share. That would be something for another post. Maybe the following blog post has the information that you are looking for: Cisco UCS VM-FEX for VMware vSphere
Tom
Hi Tom,
Any specific reason why VMs share the same veth XX? Is that because of the trunk configured on the interfaces?
Thank you !!
Hi Santosh,
A virtual ethernet interface in UCS represents the port that the vNIC on a physical server (blade or rack mount) connects to. So the veth interface essentially connects to the ESXi host. This implies that multiple VMs on that host could connect through that same vethernet interface.
In this particular case, two vNICs were configured for the host (one on fabric A and one on B). It is then up to the vSwitch on the host to decide how to balance the VMs over these two uplinks.
I hope this answers your question.
Tom
It’s amazing.
Thanks, Tom.
Hi,
I have an issue with a virtual machine not being able to complete a DHCP/PXE request when booting. Each VM has two vNICs (one to Fabric A and one to Fabric B); Fabric A is on one FI and Fabric B is on the other. When the VM is using vNIC 1 (Fabric A) and the PXE server is using vNIC 2 (Fabric B), the server does not boot; if they are connected to the same fabric, the server boots. It looks like something is stopping broadcast requests between the two FIs. Does anyone have any clues?
thanks
Hi Justin,
I am not sure what is going on there.
The only obvious thing to point out is that there is no direct connectivity between the A and B fabric inside the UCS system, so these DHCP requests have to go from Fabric Interconnect A to Fabric Interconnect B across the upstream LAN switches. Is there any chance that these DHCP packets are dropped by the LAN switches? (VLAN does not exist/VLAN not allowed on trunks/…)
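For example, on the upstream switches you could verify that the VLAN exists and is allowed on the relevant trunks with standard commands along these lines (the VLAN ID is a placeholder and the exact output depends on the switch platform):

switch# show vlan id <vlan-id>
switch# show interface trunk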
Kind regards,
Tom
Great Article.
I have one doubt. I have created two MAC address ranges, e.g.:
MAC Address Range 1 - to be used on all vNICs attached to FI-A
MAC Address Range 2 - to be used on all vNICs attached to FI-B
In such a scenario, when my vNIC1, configured with a MAC from Range 1, fails over to FI-B, will there be any issue moving this MAC to FI-B?
Hi Khurram,
Sorry for the late reply.
To answer your question: if a vNIC has an address from MAC range 1, that address will fail over to the B side when the A fabric fails for that vNIC. A gratuitous ARP sent by Fabric Interconnect B will notify the upstream network that this MAC address has moved to FI-B. So even though in the general case the range 1 addresses only show up on the A fabric, during a failure condition some of these MAC addresses will be used on the B fabric.
Hope this helps,
Tom
Very informative; the background process of fabric failover is explained in detail.
Thanks so much, Tom, it’s very informative.