We’re planning a large-scale migration of Veeam Backup for Microsoft 365, moving the object repository from one S3 backend to another S3 backend (cluster to cluster). We’re talking about approximately 1.2PB of data.
The challenge: we want to complete the migration without losing restore points / effective retention, and without duplicating the storage footprint during the retention period (duplicating 1.2PB is not an option).
I’d like to ask the community:
1️⃣ Has anyone performed an S3 → S3 migration of VB365 data at this scale?
What approach worked for you (storage-level replication, incremental sync with aws s3 sync, rclone, endpoint switch, etc.)?
What “gotchas” did you encounter (eventual consistency, versioning, object lock, throttling, API limits, lifecycle policies, etc.)?
How did you validate that the backup chain / restore points remained intact?
2️⃣ Metadata / internal structure in S3
Does anyone have detailed knowledge of what metadata Veeam relies on when storing VB365 data in S3?
Are there specific headers, object metadata, tags, ETags, timestamps, ACLs, storage class settings, or versioning attributes that must be preserved exactly for the repository to remain consistent after migration?
Is there any technical documentation available describing the internal object structure, indexes, or manifests?
3️⃣ Operating model without temporary double capacity
Our goal is to migrate while maintaining retention and restore capability throughout the process, but without temporarily doubling 1.2PB of storage.
If anyone has implemented a phased migration model, temporary coexistence of repositories, progressive rehydration, or controlled repository cutover, I’d really appreciate hearing the details.
👉 Are there any Veeam developers or PMs working on this part of the code who could share more technical insight? Especially regarding metadata dependencies, S3 repository consistency, and official recommendations for this type of migration.
Any real-world experience — including “don’t do it this way” lessons learned — would be greatly appreciated 🙏
Thanks!
Best answer by HangTen416
1️⃣ Has anyone performed an S3 → S3 migration of VB365 data at this scale?
What approach worked for you (storage-level replication, incremental sync with aws s3 sync, rclone, endpoint switch, etc.)?
Rclone works
Initial full rclone copy can be run while jobs are enabled
Once this initial rclone copy is complete, disable the jobs
Run the rclone copy again to capture deltas (incremental rclone copy)
Create repositories in VB365 pointing to the new buckets
Allow VB365 to sync and index
Point the jobs to the new repositories and enable the jobs
Some of the above VB365 tasks can be done via PowerShell or the REST API
Useful rclone options:
--stats-one-line (shows progress on a single line; useful to see how many transfers are running versus duplicate checks without flooding the screen with stats)
--transfers # (number of concurrent transfers; experiment with the value and keep an eye on resource consumption on the rclone server)
--checkers # (number of concurrent duplicate-file checks; experiment with the value and keep an eye on resource consumption on the rclone server)
--no-check-certificate (skips TLS certificate verification, for endpoints with self-signed certificates)
--checksum (important for the incremental pass: compares checksums instead of modification time and size, so already-copied files are detected reliably and quickly)
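Putting the steps and flags together, a minimal sketch of the two-pass copy could look like the script below. The remote names (`oldS3`, `newS3`), bucket name, and thread counts are placeholders, not values from this thread — tune them for your environment after setting up the remotes with `rclone config`:

```shell
#!/bin/sh
# Pass 1: bulk copy while VB365 jobs are still enabled.
# "oldS3" and "newS3" are assumed rclone remotes for the source
# and target S3 clusters; "vb365-bucket" is a placeholder name.
rclone copy oldS3:vb365-bucket newS3:vb365-bucket \
  --checksum \
  --transfers 32 \
  --checkers 64 \
  --no-check-certificate \
  --stats-one-line

# After pass 1 completes: disable the VB365 jobs in the console,
# then run the same command again to pick up the deltas written
# while the first pass was running. --checksum makes this pass
# skip everything already copied.
rclone copy oldS3:vb365-bucket newS3:vb365-bucket \
  --checksum \
  --transfers 32 \
  --checkers 64 \
  --no-check-certificate \
  --stats-one-line
```

Running the identical command twice is the whole trick: `rclone copy` is idempotent, so the second invocation only transfers objects that are new or changed since the first pass.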
What “gotchas” did you encounter (eventual consistency, versioning, object lock, throttling, API limits, lifecycle policies, etc.)?
VB365 does not use versioning
Object lock does not prevent copying of data.
Throttling and API limits are based on source and target storage platforms
How did you validate that the backup chain / restore points remained intact?
Do a manual comparison of bucket sizes and object counts between source and target
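For that manual comparison, rclone itself can report both sides and optionally diff them. A sketch (remote and bucket names are placeholders, as above):

```shell
#!/bin/sh
# Report total object count and size on each side; the two
# outputs should match once the final incremental pass is done.
rclone size oldS3:vb365-bucket
rclone size newS3:vb365-bucket

# Optional deeper check: compare hashes between source and target.
# --one-way only flags objects missing or different on the target,
# so objects written to the target after cutover are not reported.
rclone check oldS3:vb365-bucket newS3:vb365-bucket --one-way
```

Note that `rclone check` reads metadata/hashes for every object, so on ~1.2 PB it generates a large number of API calls; running it per-bucket or on a sample may be more practical.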
2️⃣ Metadata / internal structure in S3
Does anyone have detailed knowledge of what metadata Veeam relies on when storing VB365 data in S3?
Are there specific headers, object metadata, tags, ETags, timestamps, ACLs, storage class settings, or versioning attributes that must be preserved exactly for the repository to remain consistent after migration?
Is there any technical documentation available describing the internal object structure, indexes, or manifests?
3️⃣ Operating model without temporary double capacity
Our goal is to migrate while maintaining retention and restore capability throughout the process, but without temporarily doubling 1.2PB of storage.
This is not possible. Even in VBR, VeeaMover temporarily requires double the storage during a move.
If anyone has implemented a phased migration model, temporary coexistence of repositories, progressive rehydration, or controlled repository cutover, I’d really appreciate hearing the details.
@edh - Just wanted to follow up on your question to see if you received an answer. If so, please mark the best answer, even if it was your own reply that resolved the issue.