Hi,
Whilst I appreciate Netplan isn’t Ubuntu 22.04 specific, I thought I’d note the version in the title for any eventual changes.
As the title suggests, I was building a bonded interface for a server recently and was having trouble with Netplan not applying the configuration properly.
For background, I was trying to create a simple active-backup bonded interface, so that should one NIC fail, another would carry on serving traffic. When I’m working with network switches of, shall we say, questionable quality, I sometimes prefer a simpler active-backup resiliency mechanism over active-active methods, which can cause compatibility issues between the server OS and the switch, especially when you’re playing with 10/25GbE.
Ways of Monitoring Interface Failures
As a primer, there are two ways of monitoring for an interface failure:
MII Monitoring:
Set with ‘mii-monitor-interval’ within Netplan, MII (Media Independent Interface) monitoring checks the physical link state of your network interface. With copper interfaces, this reflects whether an Ethernet cable is physically plugged in or not. Optical connections, however, introduce problems with this kind of monitoring. If your fibre module is present but the fibre itself isn’t connected, the interface can still report as ‘UP’ even though you can’t pass traffic over it, which can cause outages.
ARP Monitoring:
The alternative to MII monitoring is ARP monitoring. This sends ARP requests to one or more IP addresses (up to 16 at present) and confirms network health by whether those targets are reachable. The benefit is that, provided you’ve got multiple network devices to monitor that don’t share a common point of failure, you can confirm whether the interface is actually capable of reaching other devices. It is also possible to configure whether ‘all’ ARP targets must respond for the interface to be considered healthy, or whether a response from any one of them is sufficient.
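As a rough illustration of the ‘any’ versus ‘all’ distinction, the decision can be modelled as below. This is a simplified sketch and not the bonding driver’s actual logic; the function name and the dictionary of reply results are hypothetical, and treating an empty target list as unhealthy is my own assumption.

```python
def interface_healthy(replies, all_targets_mode="any"):
    """Model the 'any' vs 'all' ARP target decision (illustrative only).

    replies maps each ARP target IP to True if it answered recently.
    """
    if not replies:
        # Assumption for this sketch: with no targets there is nothing to confirm.
        return False
    if all_targets_mode == "all":
        # Every target must have answered.
        return all(replies.values())
    # 'any': a single answering target is enough.
    return any(replies.values())


replies = {"192.168.1.10": True, "192.168.1.20": False, "192.168.1.30": True}
print(interface_healthy(replies, "any"))  # True
print(interface_healthy(replies, "all"))  # False
```

With ‘any’, losing a single target doesn’t take the interface out of service, which is usually what you want if the targets themselves can go down for maintenance.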
The Problem with Mixing Both:
This is where the current documentation for Netplan is lacking. Through trial and error (and a LOT of searching the internet), I’ve determined that if you try to utilise mii-monitor-interval alongside ARP monitoring, Netplan will silently ignore any configuration changes you make, and upon reboot will not restore your interface!
Running “netplan --debug generate” or “netplan --debug apply” will not produce any output indicating that you’ve got an invalid configuration. It will pass all tests and report the configuration as enabled, but Netplan will simply not update your interface with ANY of the configuration changes you’ve made. To echo what I’ve mentioned above, be careful should you choose to reboot your system, as I’ve found that Netplan does not restore any interface configuration containing both sets of settings upon reboot.
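Since Netplan won’t warn you, a small pre-flight check of your own can catch the mix before you apply it. The sketch below is a hypothetical helper, not part of Netplan: it takes the bond’s ‘parameters’ section as a Python dictionary (however you choose to load your YAML) and assumes the conflict is a non-zero mii-monitor-interval together with a non-zero arp-interval. It only recognises a plain 0 as “disabled”, so a value written with a time suffix would need extra handling.

```python
def monitoring_conflict(parameters):
    """Return True if both MII and ARP monitoring appear to be enabled.

    parameters is the 'parameters' mapping of a Netplan bond, already
    loaded into a dict. Hypothetical helper for illustration only.
    """
    def enabled(value):
        # A value of 0 disables the monitor; a missing key counts as disabled too.
        return value not in (None, 0, "0")

    mii_on = enabled(parameters.get("mii-monitor-interval"))
    arp_on = enabled(parameters.get("arp-interval"))
    return mii_on and arp_on


mixed = {"mode": "active-backup", "mii-monitor-interval": 100, "arp-interval": 100}
arp_only = {"mode": "active-backup", "mii-monitor-interval": 0, "arp-interval": 100}
print(monitoring_conflict(mixed))     # True
print(monitoring_conflict(arp_only))  # False
```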
So, what’s the fix?
The fix is to choose whether MII or ARP monitoring is best for you.
I’ve provided some examples of both below so you can be confident you’re building it right, along with some troubleshooting commands to ensure your configuration has taken effect.
MII Monitoring Example:
bonds:
  bond0:
    interfaces: [eth0, eth1]
    parameters:
      mode: active-backup
      mii-monitor-interval: 100
      primary: eth0
    accept-ra: false
    addresses: [192.168.1.1/24]
    link-local: []
ARP Monitoring Example:
bonds:
  bond0:
    interfaces: [eth0, eth1]
    parameters:
      mode: active-backup
      primary: eth0
      arp-ip-targets: [192.168.1.10, 192.168.1.20, 192.168.1.30]
      arp-interval: 100
      arp-validate: all
      arp-all-targets: any
      mii-monitor-interval: 0
    accept-ra: false
    addresses: [192.168.1.1/24]
    link-local: []
To assist with any troubleshooting, you can use the following commands:
To generate your Netplan configuration with a debug output, use the below command:
netplan --debug generate
To apply your Netplan configuration with a debug output, use the below command:
netplan --debug apply
To confirm the current configuration of your bonded network interface, use the below command:
ip -details link show dev bond0
To confirm which interface(s) are currently UP or DOWN, use the below command:
cat /proc/net/bonding/bond0
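If you’d rather check this from a script than by eye, the output of that file is easy enough to parse. The sketch below is a hypothetical helper that pulls out the active interface and the per-interface MII status. The sample text is an abbreviated illustration of the file’s layout, not output captured from a real system, so treat the values as placeholders.

```python
def parse_bonding_status(text):
    """Extract the active slave and per-slave MII status from bonding status text."""
    status = {"active": None, "bond_mii": None, "slaves": {}}
    current = None  # name of the slave section we are currently inside
    for raw in text.splitlines():
        key, sep, value = raw.partition(":")
        if not sep:
            continue  # blank line or a line without "key: value"
        key, value = key.strip(), value.strip()
        if key == "Currently Active Slave":
            status["active"] = value
        elif key == "Slave Interface":
            current = value
            status["slaves"][current] = None
        elif key == "MII Status":
            if current is None:
                # The first MII Status line belongs to the bond itself.
                status["bond_mii"] = value
            else:
                status["slaves"][current] = value
    return status


# Abbreviated, illustrative sample (placeholder values).
SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth0
MII Status: up

Slave Interface: eth0
MII Status: up

Slave Interface: eth1
MII Status: down
"""

print(parse_bonding_status(SAMPLE))
# On a real system, read the file instead:
#   with open("/proc/net/bonding/bond0") as f:
#       print(parse_bonding_status(f.read()))
```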
And there we have it! Hopefully that helps the next person struggling with this configuration to get up and running quickly! Now to go and ask Canonical to update the documentation to reflect this…