Because Active Directory (AD) recovery is not a routine, everyday operation, many of us lack experience with it, as the article below shows.

Recently, my lab environments have been repeatedly restarting due to a faulty switch. After extensive research to resolve this issue before performing VM recovery, I discovered that the Hyper-V host was experiencing unplanned restarts. This can cause the virtual hard disks connected to a virtual IDE controller to become inconsistent if they are being used by virtual machines.

According to Microsoft, when you virtualise your domain controller (DC) on a Hyper-V host server, a crash or power outage on the Hyper-V host can lead to several issues. The Active Directory database may become corrupted, or the virtual machine may fail to start, resulting in an error message like the one shown below.

Troubleshoot via DSRM

DSRM (Directory Services Repair Mode or Directory Services Restore Mode in versions prior to Windows Server 2012) is a special boot mode for Windows Server domain controllers. It functions similarly to Safe Mode with Networking but does not run Active Directory. Administrators use DSRM to restore Active Directory from a backup. It also helps resolve various issues with AD.

In later versions of Windows Server, use the Advanced Boot Options menu or the Windows Recovery Environment to access DSRM, as shown below. Under Choose an Option, select Troubleshoot.

Select “Advanced Options”, as shown below.

On the Startup Settings screen, select Restart.

Select Directory Services Repair Mode (DSRM), and then log in with the DSRM account. Here is the link to the blog post.
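If the domain controller still boots into Windows, DSRM can also be reached without the recovery menus by setting a boot flag. This is a minimal sketch, assuming an elevated PowerShell or Command Prompt session on the DC and a known DSRM password. It does not help when the VM fails to start at all.

```powershell
# Flag the current boot entry to start in Directory Services Repair Mode
bcdedit /set safeboot dsrepair
# Restart immediately
shutdown /r /t 0

# Later, from inside DSRM: clear the flag so the next boot is a normal one
bcdedit /deletevalue safeboot
shutdown /r /t 0
```

Until the flag is removed, the server keeps booting into DSRM on every restart.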

Verify AD Replication

Active Directory replication issues might cause inconsistencies or make the domain controller unavailable. Use the following commands to check the replication summary and status:

repadmin /replsummary

repadmin /showrepl

As you can see in the images above, the AD replication commands returned two errors: “Win32 Error 1355 (0x54b): The specified domain does not exist.” Learn more about the System Error Codes (1300-1699). Let us also run the netdom verify command, as shown below.
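The exact netdom invocation is not reproduced in the text. A typical form might look like the following, where DC01 and example.local are hypothetical placeholder names rather than values from this lab:

```powershell
# DC01 and example.local are placeholders; substitute your own DC and domain names
netdom verify DC01 /domain:example.local
```

This checks the secure channel between the named machine and the domain.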

Perform AD Database Integrity Check

After reviewing the Event Viewer, I decided to perform an initial integrity check. Launch PowerShell or Command Prompt and type the following command:

ESENTUTL /g C:\windows\NTDS\ntds.dit /!10240 /8 /o

As you can see, the check reports an inconsistency in the database, with error -1811. This is a result of the unplanned shutdown, that is, “the Administrator modified logs or lost I/O flush on shutdown”.
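It can also help to read the database header, which records whether the last shutdown was clean. This is a minimal sketch, assuming the default database path used above and that AD DS is not running (as is the case in DSRM):

```powershell
# Dump the ntds.dit file header; look for the "State:" line in the output
esentutl /mh C:\Windows\NTDS\ntds.dit
```

A state of Dirty Shutdown is consistent with an unplanned host restart, while Clean Shutdown means the database was closed properly.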

Repair NTDS database

Now let us attempt to repair the NTDS database. First, let us run the integrity check once more, this time using ntdsutil:

ntdsutil.exe
activate instance ntds
files
integrity

As you can see, this failed with a new error: -501 JET_errLogFileCorrupt. This error occurs when the hardware corrupts I/O during a write, or when a lost flush leaves the log unusable. It means that the database (DB) has been left in a corrupted state.
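The same ntdsutil check can also be run as a single line, which is easier to paste or script. This sketch assumes the session is already in DSRM:

```powershell
# Same steps as the interactive session above, passed as arguments; quit twice to exit
ntdsutil "activate instance ntds" files integrity quit quit
```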

Solution: Perform VM Recovery

Microsoft recommends restoring the database from a known good backup or reinstalling the domain controller (DC). In my case, I have a backup of the entire VM, so I will restore it. You can learn about the different recovery options here: https://helpcenter.veeam.com/docs/backup/vsphere/vm_restores.html?ver=120

This time, I will perform “Instant Recovery”, which instantly recovers workloads (VMs, EC2 instances, physical servers and so on) directly from compressed and deduplicated backup files as Hyper-V VMs. When you perform Instant Recovery, Veeam Backup & Replication mounts recovered VM images to a host directly from backups stored on backup repositories.

Instant Recovery improves recovery time objectives (RTO) and minimises disruption and downtime of production workloads. However, Instant Recovery offers “temporary spares” for VMs with limited I/O performance.

Instant VM Recovery helps to minimize downtime by quickly making the VM available after a failure or data loss. The VM runs from the backup storage, allowing users to access it almost immediately.

To give the recovered VMs full I/O performance, you must finalize Instant Recovery by migrating the recovered VMs to the production environment.

The VM is available again and usable.

There are no longer replication errors. By the way, this is the only DC in this domain at the moment.
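To confirm this from the command line, the earlier checks can be repeated after the restore. This is a sketch using standard tools on the DC:

```powershell
# Re-run the replication summary from the "Verify AD Replication" step
repadmin /replsummary
# Run the domain controller diagnostics and print only errors
dcdiag /q
```

With a single DC there are no replication partners to report on, so dcdiag is likely the more informative of the two here.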

 

This is a great article as I have used Veeam a few times recently after power outages to recover AD for my homelab.  Nice write-up @Iams3le 


Awesome 👍


Very good write up. I had to restore all 4 DC’s after hours once in a critical situation and was flat down. It’s always good to know what roles are on what DC!


Not a good situation to be in, but glad you got this covered. I agree with you, knowing your infrastructure as you said is a great help in performing AD recovery.


Super-helpful article.  My only addition is to make sure that if you are going to use DSRM, make sure your DSRM password is documented.  I cannot tell you how many clients when we onboard have no DSRM passwords listed in their onboarding documentation.  Any time a DC is built, I log the DSRM password.  I’ve only used it a couple of times, but that’s not zero as well.  Fortunately, the easy button tends to be a restore from backup, or even just building a new DC, assuming that other DC’s in the domain have survived to replicate from.


Good point. Luckily, the DSRM password can be reset.

Good point….for reference:  https://learn.microsoft.com/en-us/troubleshoot/windows-server/active-directory/reset-directory-services-restore-mode-admin-pwd

From the source! This is a very good guide, but you will also find the step-by-step instructions in our blog post very useful: https://techdirectarchive.com/2022/07/04/laps-in-windows-how-to-reset-directory-services-restore-mode-dsrm-password/


Great article @Iams3le 

One of those where you hope you don’t need to use it but it's there!


Exactly, cheers Dips! 

