
Initial problem and considerations

 

As a managed service provider responsible for the data protection of numerous customers, we have seen the complexity and scale of backup administration grow significantly over time. With several thousand virtual and physical machines under management and a continually increasing number of customers, the need for standardization, automation, and scalable practices has become critical.

Most customer environments have evolved to follow a consistent standard, particularly in terms of naming conventions for machines and the structure of backup jobs. However, each customer often brings specific requirements, especially when it comes to service level agreements (SLAs), which remain individually tailored. For new customers or legacy deployments, a gradual process is in place to bring their environments in line with this common standard. This ensures both operational consistency and improved manageability across the entire service portfolio.

Given the dynamic growth of machines requiring backup, it became clear that traditional, manual processes for assigning systems to backup jobs would not scale effectively. The administrative overhead and risk of human error would increase with each new machine or customer onboarded. To address this challenge, the organization moved toward a more automated and intelligent approach to backup job management.

The solution centers around a conceptual framework that enables simplified administration and automation. While full technical details cannot be disclosed, I offer a high-level overview of how this framework operates, emphasizing how its components interact to deliver robust and efficient data protection services at scale. This series of articles is intended to share insights from this real-world implementation, offering a valuable perspective on managing backup operations in a rapidly growing, multi-tenant environment, without compromising on flexibility, control, or reliability.

 

What are tags?

Tags are a powerful and flexible tool for managing IT resources across complex and distributed environments. As customizable metadata labels, they allow administrators to attach meaningful context to virtual machines, storage volumes, applications, and network components. Unlike physical attributes such as location or hardware configuration, tags operate as logical identifiers, typically expressed as key-value pairs, like Environment=Production or Owner=Finance. This abstraction layer enables more efficient, scalable, and policy-driven management practices.

Tags are widely used in virtualized and cloud environments for:

  • Backup policy assignment based on protection tiers
  • Resource organization and cost allocation (chargeback/showback)
  • Automation of deployment and lifecycle operations
  • Security classification and compliance enforcement

By enabling policy-driven and context-aware management, tags are a core element in scalable, automated, and resilient IT operations.

 

Why we use tags

Tags play a central role in our backup management strategy, primarily because they offer a platform-independent method of classifying and managing resources. Most modern virtualization and cloud platforms support tagging, which is a crucial factor in ensuring that our concept remains flexible and future-proof. By relying on tags, we avoid tying our processes and automation logic to a single vendor or technology stack. Instead, we create a consistent, scalable framework that can be applied across the diverse environments we manage - both today and as our infrastructure continues to evolve.

The usefulness of tags extends beyond infrastructure alone. Several leading backup software providers recognize the value of tags and offer native support for using them as selection criteria within backup jobs. This allows us to dynamically associate virtual machines and other resources with the appropriate backup policies without having to maintain complex, manually curated lists. It streamlines operations, reduces the potential for human error, and makes our backup administration more responsive to changes in the environment.

In our current landscape, we rely on Veeam as our primary backup solution, and it fully supports our tag-based approach. At present, we have no reason to move away from Veeam: it meets our needs and integrates well with our operational model. That said, we remain realistic about the future. Technological landscapes shift, requirements evolve, and products come and go. By building our backup architecture around a broadly supported and vendor-agnostic mechanism like tags, we position ourselves to adapt to whatever changes the future may bring, without having to redesign our foundational processes.

 

Concept and components

 

The concept consists of the interaction of the following components:

  • The company-wide configuration database
  • Tag values
  • Automation with Ansible
  • Veeam Backup & Replication
  • PowerShell scripts

We will take a brief look at each of these components and explain its purpose.

 

The company-wide configuration database

The configuration database contains information about all managed machines. What could be more logical than using it to manage the tag values of the machines to be secured?

By definition, backups are only available for machines listed in the database. Machines that exist on a platform but are not listed in the database are automatically excluded from backups. Yes, this is a harsh approach, but it was necessary to bring the database up to a reasonable level of quality.

 

Tag values

The tag values come from three sources:

  • Standard definitions for a customer to which the machine belongs.
  • Information from the machine name.
  • Individual tag values for a specific machine that can be set in the database.

To distinguish the tags used for backup, they begin with the BKP_ prefix. This allows other teams to use their own tags without conflicting with our concept.
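To make the idea of three tag sources more concrete, here is a minimal Python sketch. It is an illustration only, not our actual implementation. The tag keys (BKP_Tier, BKP_Environment), the machine naming pattern, and the precedence order (customer defaults first, then the machine name, then individual values from the database) are all assumptions made for this example.

```python
from __future__ import annotations

BACKUP_PREFIX = "BKP_"

# Hypothetical mapping from a name segment to an environment value.
ENVIRONMENTS = {"prd": "Production", "tst": "Test", "dev": "Development"}


def tags_from_name(machine_name: str) -> dict[str, str]:
    """Derive tags from a machine name.

    Assumes a hypothetical convention <customer>-<env>-<role>,
    for example 'acme-prd-db01'.
    """
    parts = machine_name.lower().split("-")
    if len(parts) < 3:
        return {}
    environment = ENVIRONMENTS.get(parts[1])
    return {"BKP_Environment": environment} if environment else {}


def resolve_backup_tags(
    customer_defaults: dict[str, str],
    machine_name: str,
    machine_overrides: dict[str, str],
) -> dict[str, str]:
    """Merge the three sources; later sources win (assumed precedence)."""
    merged: dict[str, str] = {}
    for source in (customer_defaults, tags_from_name(machine_name), machine_overrides):
        merged.update(source)
    # Only tags with the backup prefix belong to the backup concept.
    return {key: value for key, value in merged.items() if key.startswith(BACKUP_PREFIX)}


if __name__ == "__main__":
    tags = resolve_backup_tags(
        {"BKP_Tier": "Silver", "BKP_Environment": "Test"},
        "acme-prd-db01",
        {"BKP_Tier": "Gold", "Owner": "Finance"},
    )
    print(tags)
```

In this example the individual value BKP_Tier=Gold overrides the customer default, the environment is taken from the machine name, and the Owner tag is ignored because it does not carry the BKP_ prefix.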

 

Automation with Ansible

Automations based on Ansible fulfill several tasks:

  • Setting the tags on the respective platform.
  • Comparing the tags on the respective platform with the desired tags in the configuration database.
  • Adjusting the tags according to the settings in the configuration database.
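The compare-and-adjust steps can be pictured as a small reconciliation routine. The following Python sketch only illustrates the logic; in our environment this work is done by Ansible, and the function and class names here are hypothetical. It assumes that the desired tags from the configuration database contain only BKP_ tags and that tags without this prefix belong to other teams and must not be touched.

```python
from __future__ import annotations

from dataclasses import dataclass, field

BACKUP_PREFIX = "BKP_"


@dataclass
class TagPlan:
    """Changes needed to bring a machine's tags in line with the database."""

    to_set: dict[str, str] = field(default_factory=dict)
    to_remove: list[str] = field(default_factory=list)

    @property
    def in_sync(self) -> bool:
        return not self.to_set and not self.to_remove


def plan_tag_changes(desired: dict[str, str], actual: dict[str, str]) -> TagPlan:
    """Compare desired backup tags with the tags found on the platform."""
    plan = TagPlan()
    # Tags that are missing or carry a different value must be (re)set.
    for key, value in desired.items():
        if actual.get(key) != value:
            plan.to_set[key] = value
    # Backup tags that are no longer desired are removed;
    # tags without the backup prefix are left alone.
    for key in sorted(actual):
        if key.startswith(BACKUP_PREFIX) and key not in desired:
            plan.to_remove.append(key)
    return plan


if __name__ == "__main__":
    plan = plan_tag_changes(
        desired={"BKP_Tier": "Gold", "BKP_Environment": "Production"},
        actual={"BKP_Tier": "Silver", "BKP_Legacy": "yes", "Owner": "Finance"},
    )
    print(plan)
```

Running the comparison regularly keeps the platform in line with the database even if someone changes a backup tag by hand.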

Veeam Backup & Replication

 

Veeam Backup & Replication forms the heart of the concept. The product executes the defined backup jobs and manages the backup data of the protected machines.

Veeam can assign machines to backup jobs based on tag values on various platforms. On platforms where this isn't natively possible, the assignment is done via scripts.

 

PowerShell Scripts

We use PowerShell scripts on the VBR server to automate management tasks within the Veeam application. This spares our administrators manual intervention.

We will consider converting them to use the REST API. This may be useful once the Veeam server is available on both Windows and Linux in the near future.

The scripts perform the following tasks:

  • Checking, monitoring and reporting.
  • Managing and creating backup jobs.
  • Moving backups between backup jobs if the tags of one or more machines have changed.
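The last task depends on detecting which machines now belong to a different job. The Python sketch below shows one way to think about that detection. It is not taken from our PowerShell scripts, and it assumes a hypothetical rule that derives the job name from two tags, BKP_Customer and BKP_Tier. The actual move of the backup data would then be handled by the backup software.

```python
from __future__ import annotations

from typing import Optional


def job_for(tags: dict[str, str]) -> Optional[str]:
    """Hypothetical rule: the job name is built from customer and tier tags."""
    customer = tags.get("BKP_Customer")
    tier = tags.get("BKP_Tier")
    if not customer or not tier:
        return None
    return f"{customer}-{tier}"


def find_moves(
    previous: dict[str, dict[str, str]],
    current: dict[str, dict[str, str]],
) -> list[tuple[str, str, str]]:
    """Return (machine, old_job, new_job) for machines whose job changed."""
    moves = []
    for machine in sorted(current):
        if machine not in previous:
            continue  # New machine: nothing to move yet.
        old_job = job_for(previous[machine])
        new_job = job_for(current[machine])
        if old_job and new_job and old_job != new_job:
            moves.append((machine, old_job, new_job))
    return moves


if __name__ == "__main__":
    before = {
        "acme-prd-db01": {"BKP_Customer": "acme", "BKP_Tier": "Silver"},
        "acme-prd-web01": {"BKP_Customer": "acme", "BKP_Tier": "Silver"},
    }
    after = {
        "acme-prd-db01": {"BKP_Customer": "acme", "BKP_Tier": "Gold"},
        "acme-prd-web01": {"BKP_Customer": "acme", "BKP_Tier": "Silver"},
        "acme-prd-app01": {"BKP_Customer": "acme", "BKP_Tier": "Gold"},
    }
    for machine, old_job, new_job in find_moves(before, after):
        print(f"{machine}: {old_job} -> {new_job}")
```

Here only acme-prd-db01 is reported, because its tier tag changed. The newly added machine has no previous backups to move.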

 

 

I hope I was able to pique your curiosity about our solution.

We will discuss the individual components of this concept in more detail in the coming weeks.

Stay tuned 😎

Another fantastic article, Joe. I love tags with Veeam and we use them all the time, as they make backing up critical servers a breeze. They also allow the people who manage different systems to add their servers to backups themselves simply by tagging them in VMware. Looking forward to more.

 
 
 

I agree that backup tags are very important in the backup strategy. Unfortunately, with vCloud Director, Veeam doesn’t seem to be able to back things up by tag. Hopefully this gets addressed in a future update, or a workaround is available.



Hello @Tommy O'Shea,

did you file a feature request for this?

Veeam has added support for several platforms and increased functionality recently.