Solved

VolumeSnapshot and VolumeSnapshotContent not deleted using rook-ceph


Hi,

 

I’m currently using K10 with a “no-local, all S3” policy, meaning that every local snapshot made for a backup must be deleted immediately once it has been exported to S3.

 

I have no problems with backups whatsoever, but I do have a pile (1700+) of VolumeSnapshots that are not deleted.

 

Every VolumeSnapshotContent (VSC) looks like this:

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotContent
metadata:
  annotations:
    snapshot.storage.kubernetes.io/volumesnapshot-being-deleted: "yes"
  creationTimestamp: "2024-03-09T00:03:40Z"
  deletionGracePeriodSeconds: 0
  deletionTimestamp: "2024-03-09T00:04:25Z"
  finalizers:
  - snapshot.storage.kubernetes.io/volumesnapshotcontent-bound-protection
  generation: 3
  name: snapshot-copy-2b4pthg4-content-f019e565-697e-4649-b797-d461f4c18544
  resourceVersion: "175949429"
  uid: c0234817-0c1f-4e7d-8969-44fb7a83d5d5
spec:
  deletionPolicy: Delete
  driver: rook-ceph.rbd.csi.ceph.com
  source:
    snapshotHandle: 0001-0009-rook-ceph-0000000000000002-8eda267e-e1a2-4242-8836-92df0dbeb325
  volumeSnapshotClassName: k10-clone-csi-rbdplugin-snapclass
  volumeSnapshotRef:
    kind: VolumeSnapshot
    name: snapshot-copy-2b4pthg4
    namespace: kasten-io
    uid: ecfa59d1-7353-4a39-af09-ca304808e9a4
status:
  creationTime: 1709942621836570796
  readyToUse: true
  restoreSize: 0
  snapshotHandle: 0001-0009-rook-ceph-0000000000000002-8eda267e-e1a2-4242-8836-92df0dbeb325

 

On the Ceph side, the only relevant logs come from the csi-snapshotter container, which reports:

...
E0419 11:21:58.273585       1 snapshot_controller_base.go:359] could not sync content "snapshot-copy-9jhzn7fz-content-f7e30f08-5d76-4643-9d3f-4604441fb283": failed to delete snapshot "snapshot-copy-9jhzn7fz-content-f7e30f08-5d76-4643-9d3f-4604441fb283", err: failed to delete snapshot content snapshot-copy-9jhzn7fz-content-f7e30f08-5d76-4643-9d3f-4604441fb283: "rpc error: code = InvalidArgument desc = provided secret is empty"
I0419 11:21:58.273685       1 event.go:364] Event(v1.ObjectReference{Kind:"VolumeSnapshotContent", Namespace:"", Name:"snapshot-copy-9jhzn7fz-content-f7e30f08-5d76-4643-9d3f-4604441fb283", UID:"a66cd49d-68ed-4bb3-aa26-efc56e6668ea", APIVersion:"snapshot.storage.k8s.io/v1", ResourceVersion:"204560594", FieldPath:""}): type: 'Warning' reason: 'SnapshotDeleteError' Failed to delete snapshot
...

 

The StorageClass and VolumeSnapshotClass associated with rook-ceph both have secret location annotations.

 

I have backups for both RBD and CephFS, and both plugin pods show the same kind of logs.

I can provide logs from k10_debug.sh tool if needed.

 

Any help appreciated, thanks a lot !

 

 

Best answer by lgromb

Hi @Hagag

 

You’re absolutely right. I managed to delete one VSC by adding these annotations to it:

 

    snapshot.storage.kubernetes.io/deletion-secret-name: rook-csi-rbd-provisioner
    snapshot.storage.kubernetes.io/deletion-secret-namespace: rook-ceph

 

Now I need to find a quick and dirty way to apply those annotations to all snapshots…
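
One possible way to do this in bulk is sketched below. It is only a sketch under assumptions: it shells out to `kubectl`, and it selects only VSCs that already have a `deletionTimestamp` and lack the deletion-secret annotation. The RBD driver and secret names come from this thread; the CephFS driver and secret names are assumed Rook defaults and should be checked against the cluster. The helper names `annotate_args` and `annotate_all` are made up for illustration.

```python
import json
import subprocess

# Driver -> provisioner secret. Only the RBD pair is confirmed in this thread;
# the CephFS pair is an ASSUMPTION (Rook default names), verify before use.
SECRETS = {
    "rook-ceph.rbd.csi.ceph.com": "rook-csi-rbd-provisioner",
    "rook-ceph.cephfs.csi.ceph.com": "rook-csi-cephfs-provisioner",
}
SECRET_NAMESPACE = "rook-ceph"
PREFIX = "snapshot.storage.kubernetes.io/deletion-secret-"


def annotate_args(vsc):
    """Return the kubectl argv for one VSC, or None if it should be skipped."""
    meta = vsc["metadata"]
    secret = SECRETS.get(vsc["spec"]["driver"])
    annotations = meta.get("annotations") or {}
    stuck = "deletionTimestamp" in meta  # deletion requested but not finished
    if secret is None or not stuck or PREFIX + "name" in annotations:
        return None
    return [
        "kubectl", "annotate", "volumesnapshotcontent", meta["name"],
        f"{PREFIX}name={secret}",
        f"{PREFIX}namespace={SECRET_NAMESPACE}",
    ]


def annotate_all(dry_run=True):
    """Annotate every stuck VSC; with dry_run=True only print the commands."""
    out = subprocess.run(
        ["kubectl", "get", "volumesnapshotcontent", "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    for vsc in json.loads(out)["items"]:
        args = annotate_args(vsc)
        if args is None:
            continue
        print(" ".join(args))
        if not dry_run:
            subprocess.run(args, check=True)
```

Calling `annotate_all()` only prints the commands it would run; `annotate_all(dry_run=False)` applies them. Trying it on a handful of VSCs first seems prudent, since `deletionPolicy: Delete` means the underlying Ceph snapshots are removed once the controller can authenticate.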

 

Thanks for your help !


Hagag
  • Experienced User
  • April 22, 2024

Hi @lgromb 
 

It seems there may be an issue originating from the Ceph side, as the responsibility for managing snapshots lies with the Ceph csi-snapshotter. K10 merely requests deletion from the CSI snapshotter. I recommend reaching out to the Ceph or storage team to investigate why errors occur during snapshot deletion.

You can also try manually creating and deleting a snapshot without involving K10, to check whether the same issue occurs.
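
As a rough illustration of such a manual test, the sketch below builds a minimal VolumeSnapshot manifest and prints it as JSON, which `kubectl apply -f -` accepts. All names in it are placeholders: `manual-test`, `default` and `my-pvc` are hypothetical, and `csi-rbdplugin-snapclass` is an assumed VolumeSnapshotClass name that should be replaced with one that exists in the cluster.

```python
import json


def manual_snapshot(name, namespace, pvc, snapshot_class):
    """Build a VolumeSnapshot manifest that snapshots an existing PVC."""
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "volumeSnapshotClassName": snapshot_class,
            "source": {"persistentVolumeClaimName": pvc},
        },
    }


# Placeholder values (assumptions); adjust to a real PVC and snapshot class.
manifest = manual_snapshot(
    "manual-test", "default", "my-pvc", "csi-rbdplugin-snapclass"
)
print(json.dumps(manifest, indent=2))
```

The output can be piped into `kubectl apply -f -`. Once the snapshot reports ready, deleting it with `kubectl delete volumesnapshot manual-test -n default` and checking whether its VolumeSnapshotContent disappears shows whether deletion fails outside K10 as well.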
 

Additionally, I came across a link to a bug report detailing a similar issue.
https://bugzilla.redhat.com/show_bug.cgi?id=1951399


BR,
Ahmed Hagag

