Every day is a school day, and today I found out about something really cool that Veeam does, which I never knew about because “It Just Works”.

I had an alert generated from a customer system today that I had never seen before. I’ve seen plenty of alerts for different backup issues, whether caused by networks, BSODs, disk space constraints, etc., but this completely new one surprised me.

 

A Tape Drive Alert, but of an unexpected variety

 

Warning: “TapeDrive alert: The voltage supply to the tape drive is outside the specified range.”

As I said above, I’d never seen this warning before! I didn’t know that Veeam was tracking such attributes of the tape drives it uses. So I set about looking up the root cause of the problem, busted out some “Google-Fu” to find who else had hit this issue in the past, and found this page of Veeam documentation:

Tape Drive Alerts – Veeam Backup Guide for vSphere

This web page lists all of the alert codes, along with the severity of each issue, a description of it and what has caused it.

Now, some of these alerts you might expect to see, such as media errors, data-at-risk warnings or read/write warnings, but some of the warnings here really go above and beyond what Veeam NEEDS to look at to do its job. Examples:

  • Cooling fan failure
  • Voltage supply low/high / Power Consumption low/high
  • Humidity
  • Firmware
  • Redundant Power supply failed.

 

Ok cool, but why is this a big deal?

 

As technologies have evolved, we’ve added more abstraction layers and isolation between tiers, which is great from some perspectives but restrictive from others. These tiers are often unaware of faults or early warning signs above or below them, for example:

  • In the storage world, an operating system doesn’t know that a disk has failed in a RAID array if it’s managed by a RAID controller; as long as it can continue to read/write, it’s unaware of a problem with the underlying layer.
  • In the networking world, a device doesn’t know that a highly available route has lost one of its paths; the traffic still flows, so it is unaware.
  • In the computing world, a virtual machine doesn’t know that the physical host it resides on has lost a redundant power supply; it is still running, so it is unaware.

Because of scenarios like these, we expect a lack of interaction between our systems, and expect that we must either aggregate this information ourselves or use dedicated reporting systems to do it. Some exceptions exist of course, but largely each layer’s awareness of the others is completely siloed. This is a missed opportunity, as awareness of each of these scenarios could provide benefits.

Now, Veeam has its own monitoring and reporting product, Veeam ONE, so surely this would be the perfect time to talk about it and how it collects all of this information? Wrong!

Veeam Backup & Replication actually has some key monitoring tasks built in, with these tape drive alerts being one of them. Realistically, Veeam only needs to concern itself with whether data is being read or written successfully, and normally we would expect this siloed approach to our infrastructure: as long as Veeam can write a backup to tape, its job is done, right?

Veeam goes above and beyond here: sensor information provided by the tape drive is fed back into the job report. If the tape drive is reporting that the humidity is too high, Veeam will tell you! Loss of power redundancy? Veeam will tell you that too! And Veeam puts that information straight into the output of the affected tape jobs, so you can immediately gauge what percentage of your tape jobs are at risk!
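
To make that last point concrete, here is a rough Python sketch of how you might gauge the share of tape jobs affected. It is not a Veeam API: it assumes you have already exported each job’s report as plain text (the job names and the `sample_reports` dictionary are made up for illustration), and it simply searches for the “TapeDrive alert:” prefix seen in the warning quoted earlier.

```python
import re

# Assumption: drive alerts show up in the job report text in the form
# "TapeDrive alert: <description>", as in the warning quoted earlier.
ALERT_PATTERN = re.compile(r"TapeDrive alert:\s*(.+)")


def find_alerts(report_text):
    """Return the description of every tape drive alert found in one report."""
    return [match.strip() for match in ALERT_PATTERN.findall(report_text)]


def share_of_jobs_with_alerts(reports):
    """Take {job name: report text}; return (affected job names, percentage)."""
    affected = sorted(name for name, text in reports.items() if find_alerts(text))
    percentage = 100.0 * len(affected) / len(reports) if reports else 0.0
    return affected, percentage


# Hypothetical job names and report text, purely for illustration.
sample_reports = {
    "Monthly Tape Job": (
        "Warning: TapeDrive alert: The voltage supply to the tape drive "
        "is outside the specified range."
    ),
    "Weekly Tape Job": "Backup to tape completed successfully.",
}

affected, percentage = share_of_jobs_with_alerts(sample_reports)
print(f"{percentage:.0f}% of tape jobs report a drive alert: {affected}")
```

How you get the report text out of your environment is up to you; the point is only that the alert wording is consistent enough to search for.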

 

Conclusion

 

Credit needs to be given to Veeam for including this in their base product and not hiding it away behind an additional monitoring tool. Using these sensors to detect potential issues allows customers to proactively replace failing components and protect their data availability. I’ve gone through the help center version history and can see this alert table was first added in version 9.5u4, so the functionality has existed for some time.

Fine feature :sunglasses: Good to know that hardware problems of a tape drive are shown within Veeam.

I have never seen these alerts up to now. The tape drives we use with Veeam haven’t had a problem since we started working with them…


Very interesting, as we are using tape more often with VCC, so I will look into this.


Great finding, Michael!

Conclusion (2): tape is still not dead, and voltage outside the specified range will not kill it either! :joy:


Nice 😎

But… don't you monitor the library to catch such events?


As promised, an update on this topic.


I’ve heard back from Veeam’s Technical Writers to confirm that the severity of the alert has no bearing on the tape job’s success within reason, and that a critical alert could still result in a successful tape job, whilst a warning could still be cause for a job failure.

As a result these alerts are provided for convenience and informational purposes to assist in maintaining a healthy tape infrastructure :relaxed:


@MicoolPaul : Hats off to you, sir. Hoping to see more posts from you!


Aha, ok. 😀 it is a single drive...

Most libraries have some monitor and notification features on board.


Normally we would, this is a customer with an existing tape drive from before we started providing them services. Due to COVID restrictions we’ve not been able to go to site and inventory their physical estate but they claim they have no monitoring capabilities for the hardware… it’s highly doubtful, but it’s been a long story why we haven’t been given all the information necessary. So until then we’ve said we’ll only manage Veeam. Hardware is on them to support!


I’ve got another update to share on this topic. The customer whose environment generated the voltage warning only performs a tape backup once a month. This month’s backup completely failed, so the voltage warning was a great indicator of a pending problem. Well done Veeam for sharing this, or the customer would’ve been completely taken by surprise! It’s also sped up the troubleshooting process, as we have an idea of what was happening before “the event”.


Thx @MicoolPaul for sharing, I also was not aware of those notifications. As you already mentioned, even when you use the product every day, you are definitely still learning new things 🙂 !


@MicoolPaul : Is there any automatic category defined for these alerts, like Critical, Major, Minor or just Warning?

There are three categories: “Critical”, “Warning” and “Information”. I’ve submitted feedback on this documentation page, as there’s no mention of how these may impact the job’s success/warning/error status. The notification I had was of the “Warning” type and the job finished with a warning, but I haven’t got a spare stack of tape drives to start trashing in different ways to test…

I’ll update this if/when Veeam reply to me :slight_smile:

 

@vNote42 @JMeixner you’ve forgotten about the REAL destroyer of tapes… kids! :joy:

 

Right! But I think even my kids would need some time to destroy one of these tapes! :zap::rofl:



Mhh, the drives are very robust nowadays.

I know only two ways to kill a drive…. water and dust….

and this guy

 



Greeting from Phuket.

 

We are now getting the error shown in the picture below.

Do we need to replace the tape drive because of this error or not?

 

This error is raised every Saturday.


You say, the error occurs every Saturday… is the drive used on Saturdays only or is it used on other days without error?

Is the same tape in the drive when the error occurs?

 

If you have a maintenance contract for the drive/library, I would have it checked.

It is used on other days without error, and we change the tape every 2 days (meaning I have 2 groups of tapes: one for Mon, Wed, Fri and one for Tue, Thu, Sat, Sun).


Ok.

Is the amount of data written to tape larger on Saturdays than on the other days?

 

Like I said, I would have it checked by the manufacturer / maintenance provider.


Yes, the data on Saturdays is larger than on other days.



No, tape is definitely not dead.

Have a look at this roadmap…
https://blocksandfiles.com/2021/02/04/petabyte-tape-cartridges-are-coming/

No it is definitely not.  We have a service here built around Tape using Veeam and another product for offline backups for clients.


:thumbsup:


Haha, yes. But I don’t let the kids play with the tape drives. They may destroy a single tape but not the drives :sunglasses:


Ok, ok 😀😀😀

You are right. I forgot the third way - pure violence….

 

