
Hello, please give me suggestions to solve this problem.

 

I have a backup recovery problem that is stressing me out. I want to replicate or restore data from one baremetal host to another, using the setup below, without vSphere.

  1. Baremetal A: Supermicro, ESXi 8.0 host (master node)
  2. Baremetal B: Intel, ESXi 8.0 host (replication node)
  3. Baremetal C: Dell R760, ESXi 8.0 host (replication node)

I ran some replication and recovery tests.

VM test 1

Restored from baremetal A to baremetal B: successful, no problems.

Restored from baremetal A to baremetal C: successful, no problems.

VM test 2

Restored from baremetal A to baremetal B: successful, no problems.

Restored from baremetal A to baremetal C: many problems (error messages below).

Restored from baremetal B to baremetal C: many problems (error messages below).

 

I have also done a file copy test for "VM test 2", but it still has problems. For some reason it refuses to run on the baremetal Dell R760.

 

error message 1, baremetal A to C

Error    Restore job failed Error: Failed to open VDDK disk [[Datastore 3] VM test/VM test.vmdk] ( is read-only mode - [false] )
Error    Restore job failed Error: Logon attempt with parameters [VC/ESX: [x.x.x.x];Port: 902;Login: [user];VMX Spec: [moref=14];Snapshot mor: [14-snapshot-1];Transports: [nbd];Read Only: [false]] failed because of the following errors:
Error    Restore job failed Error: Failed to download disk 'VDDK:vddk://[Datastore 3] VM test/VM test.vmdk'.
 

error message 2, baremetal A to C (datastore1 baremetal A)

Hard disk 1 (150 GB) Error: VDDK async operation error: 14009. Value: 0x00000000000036b9 Unable to read file block. File: [vddk://[datastore1] VM test/VM test.vmdk]. Offset: [13154385920]. Block granularity: [1048576].

error message 3, baremetal A to C

Failed to copy item 'nfc://conn:x.x.x.x,nfchost:ha-host,stg:670827fb-50c3b636@VM test/VM test-flat.vmdk' Error: NFC server [x.x.x.x] is busy. Delivery of a FILE_PUT message has failed. Processed file: [[datastore3] VM test/VM test-flat.vmdk]. Size of the processed data: [7435657216]. Size of the file: [161061273600]. Unable to receive file. Exception from server: Shared memory connection was closed.
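For anyone reading these numbers, here is a small sketch of the arithmetic. It assumes the bracketed values in error messages 2 and 3 read 13154385920 (offset), 1048576 (block granularity), 7435657216 (processed data) and 161061273600 (file size). The variable names are mine, not anything Veeam reports.

```python
# Values as read from error messages 2 and 3 (assumed readings of the logged fields).
GIB = 1024 ** 3

disk_size = 161_061_273_600        # "Size of the file" (error message 3)
nbd_fail_offset = 13_154_385_920   # "Offset" (error message 2)
block = 1_048_576                  # "Block granularity" (error message 2)
nfc_processed = 7_435_657_216      # "Size of the processed data" (error message 3)


def percent(done: int, total: int) -> float:
    """Return done/total as a percentage rounded to one decimal place."""
    return round(100 * done / total, 1)


print(disk_size / GIB)                   # size of the disk in GiB
print(nbd_fail_offset // block)          # index of the 1 MiB block that failed to read
print(nbd_fail_offset % block)           # 0 means the offset is block-aligned
print(percent(nbd_fail_offset, disk_size))
print(percent(nfc_processed, disk_size))
```

If these readings are right, both failures happen a few percent into a 150 GiB disk, which suggests a session that starts and then breaks rather than one that is never established.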

I confirmed that there is no snapshot on the VM.

If you're having trouble restoring a VM, here are a few things that might help:

  1. Check permissions – Make sure the account you're using has the right permissions for the restore. Sometimes "Access Denied" errors are just a permission issue.

  2. Choose the right restore option – Are you doing a full VM restore or Instant Recovery? Depending on the case, one might work better than the other.

  3. Check your environment – Make sure your VMware/Hyper-V setup is all good, and there are no conflicts that could be blocking the restore.

  4. Look at the logs – Veeam logs can give clues about what’s going wrong. You can find them in C:\ProgramData\Veeam\Backup\ on the Veeam server.


Hello ​@fakhri11 

This is often a DNS problem. Please check whether there are differences on the server you have the issue with.
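One way to compare name resolution between servers is a rough sketch like the one below, run on each machine. The hostnames and addresses in the usage comment are hypothetical placeholders, and `find_dns_mismatches` is an illustrative helper, not part of any Veeam or VMware tool.

```python
import socket
from typing import Callable, Dict, List, Tuple


def find_dns_mismatches(
    expected: Dict[str, str],
    lookup: Callable[[str], str] = socket.gethostbyname,
) -> List[Tuple[str, str, str]]:
    """Return (name, expected_ip, actual) for every name that resolves differently."""
    mismatches = []
    for name, want in expected.items():
        try:
            got = lookup(name)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            got = f"lookup failed: {exc}"
        if got != want:
            mismatches.append((name, want, got))
    return mismatches


# Hypothetical usage, with placeholder names and addresses:
# print(find_dns_mismatches({"esxi-c.example.local": "192.0.2.30"}))
```

Running the same script on the backup server and on any proxy should give an empty list everywhere; a difference between machines points at the kind of inconsistency described above.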


Hi @fakhri11, just to confirm, the issue only affects VM 2 when restoring to host C, right?

Where is the repository located?

How big is VM 2?

How much free space do you have on the host C datastore?

Could you send screenshots of the restore process you are running?


I agree with Joe here: check DNS, and if the other suggestions do not help, I would open a ticket with support to get help.


Here is a summary of what I did with all the suggestions.

I made sure that baremetal A, B, and C all have enough free space and ample hardware specifications.

I tried making the DNS settings on baremetal B and C identical, and now I get a new error message.

This is the error log from the restore job and the log from the Veeam logs.

I confirm that this VM-TMS is the host that runs the backup process.


I ran the replication test again with a different VM, replicating from baremetal B to baremetal C with the Veeam Backup host on baremetal B. Total replicated VM disk: 500 GB.

 

 

I don't know what causes some VMs to fail to replicate to the Dell R760.


I will rain on the parade a bit as I’m not confident it’s DNS.

VDDK errors are a bit misleading and often are umbrellas for general failures rather than indicating specific issues.

14009 typically means the network connection for NBD broke, without giving any further reason. Typically this is IPS/IDS somewhere in between, or potentially AV/FW on one of the backup infrastructure components. The main reason I don’t think it’s DNS is that the connection is clearly established (we transfer some data), so at some point we got there. Maybe we had a disconnect and then a DNS issue while trying to reconnect, but that wouldn’t be my first thought, and I would still suspect the issue is more about contacting DNS than a DNS issue itself (e.g., a firewall starts blocking connections from the server after tripping some heuristic).
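As a first rough check of the path between the backup server and the ESXi host, a plain TCP connect test can be sketched as below. Port 902 is the one shown in the logon error earlier in the thread; 443 is my assumption for the host's management interface. The address in the usage comment is a placeholder and `port_reachable` is just an illustrative helper.

```python
import socket


def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, or name not resolved
        return False


# Hypothetical usage from the backup server or proxy (placeholder address):
# for port in (443, 902):
#     print(port, port_reachable("192.0.2.30", port))
```

Keep in mind that a successful connect only proves the session can start. Something in between that drops traffic later, mid-transfer, would still pass this test.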

I can see in the screenshot it says hotadd for the transport mode, but I have some doubts about whether, according to the logs, it actually failed over to NBD.

Regardless, I would advise a Veeam Support Case, as Support will be better equipped to review the logs. The agent logs are correct ones to check, and my guess is that it’s the target vmdk agent logs you need to check but just a guess. Read line-by-line about 50 lines before the error appears and see what’s going on first, then read the errors carefully. VDDK tends to log in riddles, so you have to piece together what we’re trying to do with VDDK at that moment (write data? connect to a vmdk on esxi host? login to vCenter?), then read the following lines and see what went wrong.
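To make the "read about 50 lines before the error" step less tedious, something like this sketch can pull that window out of a log file. The marker string "Error" and the file name in the usage comment are assumptions on my part; adjust them to whatever the agent log actually contains.

```python
from collections import deque
from typing import Iterable, List


def context_before_first_error(
    lines: Iterable[str], marker: str = "Error", before: int = 50
) -> List[str]:
    """Return up to `before` lines preceding the first line containing `marker`,
    followed by that line itself. Returns an empty list if the marker never appears."""
    window = deque(maxlen=before)  # keeps only the most recent `before` lines
    for line in lines:
        if marker in line:
            return list(window) + [line]
        window.append(line)
    return []


# Hypothetical usage (placeholder file name):
# with open("Agent.Target.log", encoding="utf-8", errors="replace") as fh:
#     print("".join(context_before_first_error(fh)))
```

It only finds the window; working out what VDDK was trying to do at that moment still means reading it line by line.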


@fakhri11 I would suggest double-checking your network: look for duplicate IP or MAC addresses, and also check your management network.

You could also check the physical connections just in case. 

You mention the results are different after changing DNS, so DNS was probably part of the issue before the change.
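If you can export IP/MAC pairs from somewhere (an ARP table, a switch, a DHCP server; how you get them is up to your environment), a duplicate check could look roughly like this. The function name and input format are my own assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def find_duplicate_ips(entries: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Given (ip, mac) pairs, return the IPs that are claimed by more than one MAC."""
    macs_by_ip = defaultdict(set)
    for ip, mac in entries:
        # Normalize so "AA-BB-..." and "aa:bb:..." count as the same address.
        macs_by_ip[ip].add(mac.lower().replace("-", ":"))
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}


# Hypothetical usage with made-up pairs:
# print(find_duplicate_ips([("192.0.2.30", "00:11:22:33:44:55"),
#                           ("192.0.2.30", "00:11:22:33:44:66")]))
```

An empty result does not rule out a conflict, since a table captured at one moment may only show whichever device answered last.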


I don't know why, but I tried something odd and it worked.

I added an NFS shared folder as an additional datastore on both sites, B and C, which are connected to each other.

I copied the VM data on site B to the NFS share, then copied it from the NFS share to site C.

This ran without problems.

Yes, the second test VM I tried was the one that had problems when restoring/replicating with Veeam Backup and would not restore to the Dell R760xs.


Interesting approach! Using NFS as an intermediary for copying between sites seems to have worked well in this scenario. Given the errors encountered, it's possible that the NFS method bypassed VDDK-related issues, such as datastore access restrictions, NFC transport failures, or read permission conflicts on the Dell R760. Good to know that this strategy can be a viable alternative when facing direct restore challenges with Veeam.


It turns out all of this was caused by a VLAN problem: packets were being dropped there. I don't know why, since I never set a connection limit between VLANs; maybe there is a default VLAN configuration. It's also strange that only certain VMs could run normally.

So I added a firewall rule to allow connections between the segments (ESXi and Veeam Backup IPs), and everything is running normally now.
 

Thank you to everyone who has given me input, I really appreciate it 😀


Great to hear you resolved this issue.


Nice! It's great to hear that your issue has been resolved.


What motivated you to use an NFS share to try to solve the issue?

