Solved

snapshots gone amok, yet nowhere to be found


Some time ago K10 went bananas and I had to disable it entirely. I have 20TB used on snapshots, and it just keeps growing. I can delete restore points, yet nothing happens. I’m now stuck with a small army of FCDs that I cannot delete manually, expand, or do anything else with.

 

How do I get rid of all the snapshots K10 created? I’ve deleted EVERY single restore point for each backed-up application. The policy was backing up ~2TB; now I’m filling up my SAN at a rate of 5TB/day unless I disable the policy.

 

I cannot find the snapshots with any of the documented kubectl commands.

 

Latest version of K10, vSphere 7.x, etc.

 

Do you have a tool that uses the VDDK API to delete orphaned (or all) snapshots? I’m getting close to kicking K10 out.

 

 


Best answer by Tipsmark 27 March 2023, 02:01


4 comments


@jaiganeshjk 

After more hours of digging I’m seeing this using govc… please tell me this is not the army of the undeletable? How do I get rid of all these? I tried deleting a few manually using the MOB, but so far no success.

 

 

Yeah this is it - an insane amount of garbage.

 

Veeam, you need to create a tool to clean up this kind of thing - no idea why K10 suddenly went insane and never cleaned up after itself. I wasted a few days searching around for solutions before today. This caused a number of deployment failures due to full disks - who expects a few hundred percent data growth? Not me. I now have to repair some databases and such, as they hit the wall. None will be restored from a K10 backup, that’s for sure. It’s too risky to operate with hidden things showing up here and there.

 

 

And to add further to this - while snapshots played a role, it appears K10 left no fewer than 41 volumes behind - no idea why. I matched the volumeHandle of each PVC against the output of govc disk.ls and deleted everything not used in Kubernetes. It’s a mystery, and in the future I’ll write a script to compare actual PVC usage vs. what is on the datastore.
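A minimal sketch of that comparison, for anyone who lands here with the same problem. It is not a K10 or Veeam tool, and it rests on assumptions: the volumeHandle is read from the PersistentVolume objects (spec.csi.volumeHandle), the handle is assumed to equal the FCD ID as it does for block volumes in my understanding of the vSphere CSI driver, and the first whitespace-separated column of the govc disk.ls output is assumed to be the FCD ID. The function names are hypothetical. It only lists candidates and deletes nothing.

```python
import json


def used_volume_handles(pv_list_json: str) -> set:
    """Collect spec.csi.volumeHandle values from `kubectl get pv -o json` output."""
    handles = set()
    for pv in json.loads(pv_list_json).get("items", []):
        # Non-CSI volumes have no "csi" section; skip them.
        csi = (pv.get("spec") or {}).get("csi") or {}
        handle = csi.get("volumeHandle")
        if handle:
            handles.add(handle)
    return handles


def datastore_disk_ids(disk_ls_output: str) -> set:
    """Collect FCD IDs from `govc disk.ls` text output.

    Assumption: the ID is the first whitespace-separated field on each line.
    """
    ids = set()
    for line in disk_ls_output.splitlines():
        fields = line.split()
        if fields:
            ids.add(fields[0])
    return ids


def orphan_candidates(pv_list_json: str, disk_ls_output: str) -> list:
    """Return FCD IDs present on the datastore but not referenced by any PV."""
    return sorted(datastore_disk_ids(disk_ls_output) - used_volume_handles(pv_list_json))
```

Feed it the saved output of kubectl get pv -o json and of govc disk.ls, then review the resulting list by hand before removing anything: FCDs that belong to other clusters, or to backups you still want, will also show up as unreferenced.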

Oh, and K10 still displays 20TB used even though it has all been deleted and a new report was run. Not exactly robust.
