High availability for VBR in v13 is not a general clustering feature for every backup server. It is a very specific design: two Linux Software Appliance nodes, one active and one passive, sharing the same external repositories and presenting a single cluster endpoint.
That distinction matters, because a lot of people hear ‘HA cluster’ and assume shared backup data, multi-node active/active processing, or some kind of three-node quorum design. This is not that.
The active node runs backup and replication operations. The passive node sits in standby, continuously receiving a replicated copy of the primary node’s configuration database. If the primary node fails, the passive node is promoted and work continues against the same repository infrastructure. The backup data does not move. The active role does.
What is actually replicating between the nodes
The thing being replicated is the VBR configuration database and the configuration state attached to it.
Patroni handles that part. PostgreSQL changes are streamed from the active node to the standby node as WAL data. As long as connectivity between the nodes is healthy, the standby stays current. If that link drops, the primary keeps accumulating WAL until the standby comes back and catches up.
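Patroni exposes cluster state over a REST API (GET /cluster, on port 8008 by default), which is one way to watch that replication link. The sketch below only parses a response body shaped like Patroni's documented /cluster output; the node names and the lag threshold are placeholder assumptions, and the exact `state` values can vary between Patroni versions.

```python
import json

def summarize_cluster(cluster_json: str, max_lag: int = 10 * 1024 * 1024):
    """Return (leader_name, unhealthy_member_names) from a Patroni
    GET /cluster response body (JSON string)."""
    members = json.loads(cluster_json)["members"]
    leader = next((m["name"] for m in members if m["role"] == "leader"), None)
    unhealthy = []
    for m in members:
        if m["role"] == "leader":
            continue
        # A healthy standby is receiving WAL and not far behind the leader.
        # Patroni reports "streaming" (newer versions) or "running" here.
        if m.get("state") not in ("streaming", "running") or m.get("lag", 0) > max_lag:
            unhealthy.append(m["name"])
    return leader, unhealthy

# Illustrative payload only -- node names are hypothetical.
sample = json.dumps({"members": [
    {"name": "vbr-node-1", "role": "leader", "state": "running"},
    {"name": "vbr-node-2", "role": "replica", "state": "streaming", "lag": 0},
]})
leader, unhealthy = summarize_cluster(sample)
print(leader, unhealthy)  # vbr-node-1 []
```

If the standby shows up in the unhealthy list, that usually maps directly to the connectivity-or-catch-up behavior described above.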
What does not replicate is the backup data itself. Repositories stay where they already are. Both nodes need to be able to reach them, but the repositories are not part of the HA mechanism. That is why failover is about moving the active cluster role, not moving the storage.
The user-facing piece of that design is the virtual cluster IP. You connect to the cluster endpoint, not to the individual node addresses. During failover or switchover, that virtual IP moves to the node that is now primary.
This feature is gated harder than people expect
| Hard limit | Detail |
| --- | --- |
| Node count | Exactly two nodes. No three-node or quorum layout. |
| Platform | Linux Software Appliance only. Windows VBR cannot participate. |
| License | Veeam Data Platform Premium on the primary node. |
| Networking | Both nodes and the virtual IP must be on the same subnet with static addresses. |
| Repositories | Local repository on the backup server cannot be used. All repositories must be external. |
| Management | Windows VBR console required for cluster assembly and management. |
This requires VBR 13.0.1 or later. It was not included in 13.0.0.
It also requires a Veeam Data Platform Premium license on the primary node. Foundation, Advanced, Essentials, Community Edition, and NFR licenses do not qualify for this feature.
And it is Linux Software Appliance only. Windows-based VBR does not participate in this cluster design at all. If the backup server is still on Windows, this feature is not available no matter how much people wish it were.
What to get right before starting
Pre-flight checklist: reserve three IPs, create forward and reverse DNS records, deploy the second appliance on the same VBR version, and load the Premium license on the primary node before opening the wizard.
Preparation is where most failed attempts go wrong, and where they could have been prevented.
You need three static IP addresses on the same subnet: primary node, secondary node, and virtual cluster IP.
You also need DNS to be correct before you touch the wizard. That means forward and reverse records for all three addresses. Not just the two node names. All three. If DNS is wrong, cluster assembly fails and you lose time fixing something that should have been validated earlier.
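The subnet and DNS requirements are easy to validate with a short script before opening the wizard. This is a minimal sketch using only the Python standard library: the hostnames, addresses, and /24 prefix are placeholders for your environment, and the resolver functions are injectable so the logic can be exercised without live DNS.

```python
import ipaddress
import socket

def same_subnet(ips, prefix: str) -> bool:
    """True if every address falls inside the given subnet."""
    net = ipaddress.ip_network(prefix)
    return all(ipaddress.ip_address(ip) in net for ip in ips)

def dns_consistent(name: str, ip: str, forward=None, reverse=None) -> bool:
    """Forward record resolves to ip AND the PTR record points back to name."""
    forward = forward or (lambda n: socket.gethostbyname(n))
    reverse = reverse or (lambda a: socket.gethostbyaddr(a)[0])
    try:
        return forward(name) == ip and reverse(ip).rstrip(".") == name
    except OSError:
        return False

# Placeholder plan: primary node, secondary node, virtual cluster IP.
plan = {"vbr1.lab.local": "10.0.5.11",
        "vbr2.lab.local": "10.0.5.12",
        "vbr-ha.lab.local": "10.0.5.10"}
print(same_subnet(plan.values(), "10.0.5.0/24"))  # True
```

Running `dns_consistent` for all three names before cluster assembly catches exactly the failure mode described above, at a point where it costs minutes instead of a failed wizard run.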
The secondary node needs to be a fresh Linux appliance deployment on the same VBR version as the primary. Same ISO version. Same care in the initial setup.
The Premium license only needs to be applied to the primary node.
The step that catches most people
Before you can even run the cluster wizard, each node has to receive an HA enable request through Host Management.
That is not a Windows-console step.
You open the Host Management console for the primary node, submit the HA request there, and do the same thing on the secondary node. If a Security Officer account is configured on either node, that approval has to happen separately before the request is actually valid.
There is also a time limit on that approval window. If you do not finish cluster assembly soon enough, the request expires and you have to submit it again.
Assembling the cluster
Once both nodes have the HA enable request approved, move to the Windows console and connect directly to the primary node. Not the cluster IP, because that does not exist yet.
From there, launch Create HA Cluster on the primary node, enter the virtual cluster IP and cluster endpoint DNS name, confirm the primary node details, enter the secondary node address and Host Administrator credentials, then review and finish.
Veeam assigns the virtual IP to the primary, connects to the secondary, initializes it as standby, deploys the HA services, and starts synchronization.
After that, disconnect from the individual primary node and reconnect to the cluster endpoint. That becomes the entry point from then on.
If you want the cluster DNS name included in the self-signed certificate alternate names, you need to regenerate the certificate after assembly. That does not happen automatically.
How to confirm it is healthy
After reconnecting to the cluster endpoint, the cluster should show as a single logical entry with both nodes underneath it.
The node roles should be obvious: one primary and one secondary. Both should look healthy.
If the secondary is already showing warnings, check the node-to-node TCP path first, then confirm the Patroni service is alive on both sides. That is the plumbing this whole feature depends on.
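A quick way to test that plumbing is a plain TCP reachability probe from each node toward the other. The sketch below assumes Patroni's default REST port of 8008 and PostgreSQL's default 5432; confirm the actual ports in your deployment, and treat the hostname as a placeholder.

```python
import socket

def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures alike.
        return False

# Run from each node against the other. 8008 is Patroni's default REST
# port and 5432 the default PostgreSQL port -- verify both locally.
for port in (8008, 5432):
    print(port, tcp_reachable("vbr2.lab.local", port))
```

If the ports do not answer, fix the network path before troubleshooting Patroni itself; if they do, move on to the service state on each node.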
Enterprise Manager has to be updated too
If Enterprise Manager is in use, re-add the VBR server there by using the cluster virtual IP or cluster DNS name.
If EM is still pointed at the old individual node address, it will work until the moment a switchover or failover happens, and then it becomes somebody’s problem later.
Planned switchover
Switchover is the clean case. Both nodes are healthy, both are in sync, and you are moving the active role because you want to do maintenance, not because something exploded.
Right-click the HA cluster, choose Switchover, let Veeam verify sync state, stop active jobs on the current primary, move the virtual IP, and promote the standby.
The process takes about ten minutes. Test this before a real incident so you know the virtual IP moves properly and the console reconnects cleanly.
You do not have to switch it back afterward unless you want to.
Unplanned failover
Failover is for the bad day. The primary is down and not coming back in the time available.
At that point the secondary is promoted, the virtual IP is assigned to it, and operations resume there.
This can be triggered manually from the Windows console, or automatically if Veeam ONE alarm actions were configured ahead of time.
Once the failed node is repaired and returns, it rejoins as the new secondary. You do not have to tear the cluster down and rebuild it from zero after every real failover.
Final thoughts
The easiest way to misunderstand the v13 HA cluster is to think of it as storage-level HA for Veeam.
It is not. It is configuration and control-plane HA for the Linux Software Appliance. The repositories stay external. The active role moves. The cluster endpoint stays the thing you connect to.
If I were deploying this in production, I would make sure five things were clean before touching the wizard: correct license, matching versions on both appliances, three static IPs on the same subnet, forward and reverse DNS for all three, and a non-emergency switchover test after assembly.
That is what turns this from a feature on the slide into something you can actually trust during maintenance or a real outage.
Quick operational summary
• Configuration database moves between nodes; repositories do not.
• Connect to the cluster virtual IP or DNS endpoint after assembly.
• Enterprise Manager must be re-added by using the cluster endpoint.
• Switchover is for planned maintenance. Failover is for node loss.
