Veeam Backup Service Fails to Start – AKA How I broke my lab this time…

  • 19 January 2022
  • 7 comments

It feels like everyone is always trying so hard to be perfect. I’m certainly not, but I’m happy to learn from and share my mistakes. I was messing about recently with Veeam Disaster Recovery Orchestrator and made some changes to my lab, and all was fine when I shut down. A few days later I fired up the lab again, but my Veeam Backup & Replication server wouldn’t start its core “Veeam Backup Service”. I over-engineered my response, expecting the cause to be far more complex than it actually was. Here’s my story.

 

Hour 1: The Service Won’t Start

I fired up my Veeam Backup & Replication server, ready to write my next blog post, and uh-oh, the service wouldn’t start! I jumped into the logs and couldn’t see anything of note.

The final three lines of my log contained the following:

[db] Minimal supported database server version is 10.0.1600.22
[db] Connecting to SQL Server (CurrentUser=[NT AUTHORITY\SYSTEM], ServerInstance=[WIN-0MNHMU5DV8K\VEEAMSQL2016], Options=[Direct])
[db] Removing 'Initial Catalog' property from connection string

And then nothing afterwards!

Puzzled? I was! Now let’s troubleshoot this!

 

Hour 2: What did I change?

I’m one of those people who doesn’t lab every day, maybe twice a week, so there are periods of amnesia (not really, just sleeping) between sessions and I can’t always remember what I was in the middle of. This time I could remember: I’d joined the VBR server to the domain. The service runs under the local SYSTEM account though, so surely that wouldn’t do it? After checking through my notes and confirming that wasn’t the problem, I pushed on.

Recently I’d added more VMs to my hypervisor, which is also where my VBR virtual machine resides. It’s on a single 7.2k RPM disk, so the IO is nothing to brag about.

By default Windows will allow 30 seconds for a service to start, so maybe I was just delivering too little IO to the service/VM? To test this, I changed the service timeout from 30 seconds to 120 seconds. You can do this yourself by using Registry Editor and changing the following:

HKLM\SYSTEM\CurrentControlSet\Control\ServicesPipeTimeout

ServicesPipeTimeout is a DWORD value under the Control key, measured in milliseconds. Change it from 30000 (decimal) to whatever value you want; if the value doesn’t exist yet, create it. Be aware there are limits: 120000 (decimal) is the maximum I’ve seen work consistently across different Windows versions. Nothing I can see in the service definition enforces this, as the parameter is just an integer, so the limit may be handled elsewhere. Set the value too high and Windows ignores it and falls back to the default, so there’s definitely logic somewhere outside my view of the source code.
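If you’d rather script the change than click through Registry Editor, here’s a minimal sketch in Python that builds the equivalent reg add command. The helper names (build_timeout_command, apply_timeout) and the 30000 to 120000 guard are my own assumptions for illustration; the upper bound is just the value I found reliable, not a documented limit. By default it only prints the command rather than running it.

```python
import subprocess
import sys

# Key and value name from this post; the bounds are illustrative assumptions.
REG_KEY = r"HKLM\SYSTEM\CurrentControlSet\Control"
VALUE_NAME = "ServicesPipeTimeout"
DEFAULT_MS = 30_000       # default service start timeout mentioned above
MAX_TESTED_MS = 120_000   # highest value I found reliable, not an official limit


def build_timeout_command(timeout_ms: int) -> list[str]:
    """Return the reg.exe arguments that set ServicesPipeTimeout (hypothetical helper)."""
    if not DEFAULT_MS <= timeout_ms <= MAX_TESTED_MS:
        raise ValueError(
            f"timeout_ms should be between {DEFAULT_MS} and {MAX_TESTED_MS}"
        )
    return [
        "reg", "add", REG_KEY,
        "/v", VALUE_NAME,
        "/t", "REG_DWORD",
        "/d", str(timeout_ms),
        "/f",  # overwrite an existing value without prompting
    ]


def apply_timeout(timeout_ms: int, run: bool = False) -> list[str]:
    """Print the command; execute it only on Windows when run=True."""
    command = build_timeout_command(timeout_ms)
    print(" ".join(command))
    if run and sys.platform == "win32":
        # Needs an elevated prompt, since it writes under HKLM.
        subprocess.run(command, check=True)
    return command


if __name__ == "__main__":
    apply_timeout(120_000)
```

As far as I know, Windows only picks up the new timeout after a restart, so reboot before testing the service again.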

I still wasn’t able to get the service to start properly, and felt like hitting my head against a wall, but then noticed the log output had changed…

 

Hour 3: Cause & Resolution

You’re likely here looking for the answer to your problems, and hopefully this is the answer you seek.

When I checked the logs next, I found more information, the first line being the most telling:

[DB | ERROR] Failed to connect to server. (Microsoft.SqlServer.Management.Common.ConnectionFailureException)

Reading through the logs, it hit me: I didn’t want randomly named servers in my domain as part of my testing, so I’d renamed the server at the same time as joining the domain, to VeeamBandR! The answer had actually been staring me in the face earlier in the logs too…

The logs had already told me the server name the service was trying to connect to, along with its ConnectionTimeout values. With the default 30-second limit, Windows was terminating the service before the connection timeout was reached, so I never got the full log output. Once the longer service timeout let the error through, it was clear the server hostname was wrong.
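To spot this kind of mismatch faster next time, here’s a rough Python sketch that pulls the ServerInstance value out of a log line like mine and compares its host part with a hostname you supply. The function names and the regular expression are my own assumptions based on the single log line shown earlier, not a documented log format, and the comparison only makes sense when the configuration database lives on the VBR server itself, as it does in my lab.

```python
import re
import socket

# Matches the ServerInstance=[HOST\INSTANCE] fragment from the log line in this post.
INSTANCE_PATTERN = re.compile(r"ServerInstance=\[([^\]\\]+)(?:\\([^\]]+))?\]")


def parse_server_instance(log_line: str):
    """Return (host, instance) from a log line, or None if nothing matches.

    The instance is None when the line names a default (unnamed) instance.
    """
    match = INSTANCE_PATTERN.search(log_line)
    if match is None:
        return None
    return match.group(1), match.group(2)


def host_mismatch(log_line: str, current_host: str) -> bool:
    """True when the logged SQL host differs from current_host (case-insensitive)."""
    parsed = parse_server_instance(log_line)
    if parsed is None:
        return False
    return parsed[0].lower() != current_host.lower()


if __name__ == "__main__":
    line = (
        r"[db] Connecting to SQL Server (CurrentUser=[NT AUTHORITY\SYSTEM], "
        r"ServerInstance=[WIN-0MNHMU5DV8K\VEEAMSQL2016], Options=[Direct])"
    )
    print(parse_server_instance(line))
    # Compare against the machine this script runs on.
    print(host_mismatch(line, socket.gethostname()))
```

Had I run something like this against my own log, the old WIN-0MNHMU5DV8K host would have been flagged against the new VeeamBandR name straight away.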

With that out of the way, I knew how to fix this.

To do so, open the Veeam Backup & Replication “Configuration Database Connection Settings” tool.

You’ll notice a field labelled “Server name:”, which is the field we need to change. Don’t wait for it to finish resolving the old server name, as it will time out; just change the prefix to the new server name. In my case the field said “WIN-0MNHMU5DV8K\VEEAMSQL2016” when I opened the tool, and I changed it to “VEEAMBANDR\VEEAMSQL2016”. Click through the rest of the screens and Veeam will offer to start the product automatically on exit. Check this box if you’d like, then click Finish.
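The edit itself is just a swap of the host part in front of the backslash, with the SQL instance name left alone. As a tiny illustration, here’s a Python sketch of that string change; replace_instance_host is a hypothetical helper of mine and has nothing to do with how the Veeam tool works internally.

```python
def replace_instance_host(server_name: str, new_host: str) -> str:
    """Swap the host part of 'HOST\\INSTANCE', keeping the instance name unchanged."""
    _old_host, separator, instance = server_name.partition("\\")
    # A default instance has no backslash, so separator and instance are empty
    # and the result is just the new host name.
    return f"{new_host}{separator}{instance}"


if __name__ == "__main__":
    print(replace_instance_host(r"WIN-0MNHMU5DV8K\VEEAMSQL2016", "VEEAMBANDR"))
    # prints VEEAMBANDR\VEEAMSQL2016
```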

In my scenario, the connection was fixed and I was good to keep on labbing!

I also broke my Veeam ONE database with the same rename, so I’ll follow up this post with another about how to fix Veeam ONE’s database.


7 comments

This sounds familiar; I’m not sure how often this has happened to me in a lab :smile:

😆 at least what we practice in the lab we don’t often repeat in production!

Oh, oh, this sounds very familiar 😃😃😃

I had this problem with my very first VBR server years ago. The environment was developing rapidly and the naming scheme of the servers changed… 😎👍🏼

Thank you for sharing the story @MicoolPaul, I love troubleshooting posts :)

That’s true; I don’t remember this happening anytime in production :sweat_smile:

Coffee, biscuits and troubleshooting. That’s my fav morning.

Nice story @MicoolPaul :sunglasses:

Thanks for this… it didn’t actually solve my issue, but it pointed me in the right direction!
