
Demystifying Veeam Housekeeping: The Silent Guardian of Your Backups
Housekeeping is an automatic, background, and largely invisible maintenance process that manages the lifecycle of backup files in Veeam Backup & Replication.
This process cleans the backup chain, deletes old restore points, merges files, verifies integrity, reclaims free space, and optimizes the storage system.
“Housekeeping is a background mechanism used for the regular cleaning, merging, and verification of backup files.”
Veeam typically uses forward incremental or reverse incremental backup strategies. Over time, these strategies can lead to the following issues:
- The backup chain becomes long and fragmented.
- Storage space is unnecessarily consumed.
- Restore performance degrades.
- Silent data corruption (e.g., bit rot) goes undetected.
- Blocks belonging to deleted VMs remain on disk.
Housekeeping exists to solve these problems. Its goal is to ensure the backup system remains sustainable, reliable, and performant.
When Does Housekeeping Run?
Housekeeping operates entirely automatically without user intervention. The default schedule is as follows:
- Background Retention: Every 24 hours, at 00:30 (server time).
- Health Check: Monthly, typically on the first Friday night.
- Compact Full: Immediately after each merge operation.
- Space Reclamation: Continuously, in the background.
These processes do not interrupt a running backup job. If a backup is active, housekeeping is paused and resumes after the job finishes.
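The default schedule above can be worked out with plain date arithmetic. The following is a minimal sketch, not Veeam's scheduler: the function names are hypothetical, and it only assumes the 00:30 daily run and the "first Friday of the month" rule stated in this article.

```python
from datetime import date, datetime, time, timedelta

# 00:30 server time, as stated in the schedule above
RETENTION_TIME = time(0, 30)


def next_background_retention(now: datetime) -> datetime:
    """Return the next daily 00:30 run strictly after `now` (hypothetical helper)."""
    candidate = datetime.combine(now.date(), RETENTION_TIME)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


def first_friday(year: int, month: int) -> date:
    """Return the first Friday of the given month (hypothetical helper)."""
    first = date(year, month, 1)
    # date.weekday(): Monday is 0, Friday is 4
    offset = (4 - first.weekday()) % 7
    return first + timedelta(days=offset)
```

For example, `first_friday(2025, 10)` gives 3 October 2025, and a call to `next_background_retention` at 09:00 on a given day returns 00:30 on the following day.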
What Exactly Happens in the Background?
Housekeeping is a layered maintenance system composed of multiple sub-processes. The steps below walk through it in order, with technical details on Veeam's internal workings.
1. Trigger
The Veeam Backup Service (service name VeeamBackupSvc, process Veeam.Backup.Service.exe) initiates a housekeeping control loop every 4 hours. At 00:30, Background Retention is officially triggered.
2. Metadata Analysis
The system reads job configurations from the Configuration Database (SQL Server or PostgreSQL).
It then lists the current backup files for each job.
These queries gather critical information such as retention policy, GFS (Grandfather-Father-Son) settings, and the current number of restore points.
3. Chain Analysis
At this stage, the physical file system is scanned:
- Storage Location: e.g., `D:\Backup\SQL Servers Backup\`
- File Types:
  - `.vbk` → Full backup
  - `.vib` → Incremental backup
  - `.vbm` → Metadata (chain map, CBT data, hashes)
- The `.vbm` file keeps track of, for each restore point:
  - Which blocks it contains
  - Their SHA1 hashes
  - The CBT (Changed Block Tracking) state
  - GFS labels
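To make the file-type mapping concrete, here is a small sketch that groups file names by the three extensions listed above. It is an illustration only: `classify_chain` and the role names are hypothetical, it looks at extensions alone, and it does not read the actual contents of a `.vbm` file.

```python
from pathlib import PureWindowsPath

# Extension-to-role mapping taken from the list above
ROLES = {".vbk": "full", ".vib": "incremental", ".vbm": "metadata"}


def classify_chain(file_names):
    """Group backup file names by role, based on extension only."""
    chain = {role: [] for role in ROLES.values()}
    for name in sorted(file_names):
        role = ROLES.get(PureWindowsPath(name).suffix.lower())
        if role is not None:  # ignore unrelated files
            chain[role].append(name)
    return chain
```

Calling it with a mix of names, such as `["VBK_2025-10-16.vbk", "VBK_2025-10-09.vib", "notes.txt"]`, returns a dictionary with one full, one incremental, and no metadata entries; the unrelated file is skipped.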
4. Retention Enforcement
Scenario: Retention = 14 points, Current = 15 points → Retention policy is triggered. (Forward Incremental Scenario)
Decision Engine:
- The oldest incremental is identified: e.g., `VBK_2025-10-09.vib`
- The number of merge points is calculated.
- The first full backup is merged with the oldest incremental backup.
- The oldest incremental file is deleted.
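The merge-then-delete logic can be modelled with a toy data structure. In this sketch, which is an assumption-laden illustration and not Veeam's on-disk format, a full backup is a dictionary mapping block IDs to data and each incremental is a dictionary holding only its changed blocks. The function name `enforce_retention` is hypothetical.

```python
def enforce_retention(full, incrementals, retention):
    """Fold the oldest incrementals into the full until the chain fits.

    full:         dict of block_id -> data (the full backup)
    incrementals: list of (label, changed_blocks) tuples, oldest first
    retention:    number of restore points to keep (full counts as one)
    Returns (new_full, remaining_incrementals, deleted_labels).
    """
    if retention < 1:
        raise ValueError("retention must keep at least one restore point")
    full = dict(full)                  # work on copies, leave inputs untouched
    incrementals = list(incrementals)
    deleted = []
    while 1 + len(incrementals) > retention:
        label, changed_blocks = incrementals.pop(0)
        full.update(changed_blocks)    # merge: newer blocks overwrite older ones
        deleted.append(label)          # the incremental file is no longer needed
    return full, incrementals, deleted
```

With 15 restore points and a retention of 14, the loop runs once: the oldest incremental is merged into the full and then dropped, which matches the scenario above.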
5. Compact Full – Space Cleanup
After a merge, the old .vbk file may contain blocks from VMs that have been deleted.
Compact Process:
- A new temporary file is created: e.g., `VBK_2025-10-16.vbk.compact.tmp`
- Only blocks currently in use are copied.
- The old file is deleted.
- The new file is renamed: e.g., `VBK_2025-10-16.vbk`
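The copy-and-swap pattern described in these steps looks roughly like the sketch below. It assumes a made-up layout of fixed-size blocks and a caller-supplied set of live block indices; the real `.vbk` format is different, and `compact_file` and `BLOCK_SIZE` are hypothetical names.

```python
import os

BLOCK_SIZE = 4  # hypothetical, deliberately tiny so the example is easy to follow


def compact_file(path, live_blocks):
    """Rewrite `path` keeping only the blocks whose index is in `live_blocks`.

    Returns the size of the compacted file in bytes.
    """
    tmp_path = path + ".compact.tmp"
    with open(path, "rb") as src, open(tmp_path, "wb") as dst:
        index = 0
        while True:
            block = src.read(BLOCK_SIZE)
            if not block:
                break
            if index in live_blocks:   # copy only blocks still in use
                dst.write(block)
            index += 1
    # Replace the old file with the compacted one in a single step
    os.replace(tmp_path, path)
    return os.path.getsize(path)
```

Writing to a temporary file first means the original stays intact until the new file is complete, which is the reason this kind of operation needs enough free space for a second copy of the live data.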
6. Health Check – Integrity Verification
Runs monthly, on the first Friday night:
- All `.vbk` and `.vib` files are opened in read-only mode.
- For each block:
  - A CRC32 checksum is calculated.
  - A SHA1 hash is calculated.
  - It is compared against the hash stored in the database (`[Backup.Model.Blocks]`).
- Result:
  - If it doesn't match → Repair from the nearest healthy point is attempted.
  - If repair fails → The file is marked as "Corrupted".
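The per-block comparison can be sketched with Python's standard `zlib` and `hashlib` modules. This is a simplified illustration under stated assumptions: blocks are plain byte strings, the expected digests are passed in as a list rather than read from a database, and the function names are hypothetical.

```python
import hashlib
import zlib


def block_digests(block: bytes):
    """Return the (CRC32, SHA1 hex digest) pair for one block."""
    return zlib.crc32(block), hashlib.sha1(block).hexdigest()


def find_corrupt_blocks(blocks, expected):
    """Return the indices of blocks whose digests differ from the stored ones.

    blocks:   list of bytes objects read from the backup file
    expected: list of (crc32, sha1_hex) pairs recorded when the backup was written
    """
    return [
        index
        for index, (block, stored) in enumerate(zip(blocks, expected))
        if block_digests(block) != stored
    ]
```

An empty result means every block matched; any returned index marks a block that would need repair from another restore point.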
7. Space Reclamation – Free Space Recovery
Runs continuously in the background:
- CBT data cleanup.
- Removal of `.ctk` files for deleted VMs.
- Deduplication Database cleanup.
- `Veeam.Agent.exe` sends a reclaim command to the storage appliance.
- For ReFS/XFS with fast clone, deleted blocks are immediately freed.
8. Repository Maintenance
| Storage Type | Operations |
|---|---|
| ReFS / XFS | Metadata sync, block freeing. |
| Object Storage | If immutability is enabled, deletion is delayed until the hold expires. |
| SOBR (Scale-Out) | Data movement between performance and capacity tiers, extent cleanup. |
| Dedupe Appliance | Sends a maintenance mode signal. |
Deduplication Appliance and the Maintenance Mode Signal: What Does It Mean in Veeam?
As mentioned earlier, the "Dedupe Appliance – Maintenance mode signal" is a compatibility mechanism used in Veeam Backup & Replication's integration with deduplication appliances (such as Dell EMC Data Domain, ExaGrid, HPE StoreOnce). It is designed to prevent conflicts during housekeeping operations. Let's explain this in detail, step by step.
1. What is a Deduplication Appliance?
- Dedupe Appliance: These are specialized storage devices that identify and store duplicate data blocks only once. Veeam uses these devices as backup repositories.
- Example Integrated Appliances:
  - Dell EMC Data Domain
  - ExaGrid
  - HPE StoreOnce
  - Quantum DXi
- Advantage: Can reduce storage consumption by up to 90%. The trade-off is performance sensitivity: read and write operations in particular can be slow.
- Veeam Integration: Veeam connects directly to these appliances using their native APIs. This ensures that Veeam's own compression/deduplication does not conflict with the appliance's deduplication.
2. What is the "Maintenance Mode Signal" from Veeam?
- The Signal: During housekeeping (e.g., merge, compact full, health check operations), Veeam sends an API call or signal to the deduplication appliance. This temporarily places the appliance into a maintenance mode.
- Purpose: To prevent conflicts between Veeam's background housekeeping operations and the appliance's own internal maintenance tasks.
- Example: During housekeeping, Veeam performs intensive read operations while deleting old backups or merging chains. If the appliance's own deduplication optimization runs simultaneously, performance can degrade by up to 50%.
- How the Signal is Sent: It is handled by the `Veeam.Agent.exe` (transport agent). Veeam sends a "maintenance mode" command via the appliance's API → The appliance optimizes its traffic (e.g., by shortening its read queue).
- Automatic for Integrated Appliances: Veeam automatically sends this signal for integrated appliances (see list above). For non-integrated devices, manual configuration may be required.
