Hello,
I’m trying to set up backup policies, but almost all policy runs fail at the action “Exporting RestorePoint” with the following error:
cause:
cause:
cause:
message: '["{"message":"Failed to export snapshot
data","function":"kasten.io/k10/kio/exec/phases/phase.(*artifactCopier).convertSnapshots.func1","linenumber":408,"file":"kasten.io/k10/kio/exec/phases/phase/copy_snapshots.go:408","fields":[{"name":"type","value":"CSI"},{"name":"id","value":"k10-csi-snap-rttwnskswscjph98"}],"cause":{"message":"Error
creating portable
snapshot","function":"kasten.io/k10/kio/exec/phases/phase.(*gvcConverter).Convert","linenumber":1178,"file":"kasten.io/k10/kio/exec/phases/phase/copy_snapshots.go:1178","cause":{"message":"ActionSet
Failed","function":"kasten.io/k10/kio/kanister.(*Operation).Execute","linenumber":114,"file":"kasten.io/k10/kio/kanister/operation.go:114","fields":[{"name":"message","value":"{\"message\":\"Failed
while waiting for Pod to be
ready\",\"function\":\"kasten.io/k10/kio/kanister/function.copyVolumeDataPodFunc.func1\",\"linenumber\":153,\"file\":\"kasten.io/k10/kio/kanister/function/copy_volume_data.go:153\",\"fields\":[{\"name\":\"pod\",\"value\":\"copy-vol-data-hmrkf\"}],\"cause\":{\"message\":\"Pod
did not transition into running state.
Timeout:15m0s Namespace:kasten-io, Name:copy-vol-data-hmrkf: Context
done while polling: context deadline
exceeded\"}}"},{"name":"actionSet","value":{"metadata":{"name":"k10-copy-k10-persistentvolumeclaim-generic-volume-2.0.20-ksls25","generateName":"k10-copy-k10-persistentvolumeclaim-generic-volume-2.0.20-kanister-pvc-2vd75-kasten-io-pvc-","namespace":"kasten-io","uid":"39748042-0502-43ff-90c5-10878d20a150","resourceVersion":"399477","generation":4,"creationTimestamp":"2022-06-02T10:02:25Z","labels":{"kanister.io/JobID":"ec558fa4-e258-11ec-abba-76c5584997a1"},"managedFields":[{"manager":"Go-http-client","operation":"Update","apiVersion":"cr.kanister.io/v1alpha1","time":"2022-06-02T10:02:25Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:generateName":{},"f:labels":{".":{},"f:kanister.io/JobID":{}}},"f:spec":{".":{},"f:actions":{}},"f:status":{".":{},"f:actions":{},"f:error":{".":{},"f:message":{}},"f:state":{}}}}]},"spec":{"actions":[{"name":"copy","object":{"apiVersion":"","group":"","resource":"","kind":"pvc","name":"kanister-pvc-2vd75","namespace":"kasten-io"},"blueprint":"k10-persistentvolumeclaim-generic-volume-2.0.20","secrets":{"artifactKey":{"apiVersion":"","group":"","resource":"","kind":"secret","name":"k10-content-store-passphrase-m7w2w","namespace":"kasten-io"}},"profile":{"apiVersion":"v1alpha1","group":"","resource":"","kind":"profile","name":"kanister-portable-copy-f2x6p","namespace":"kasten-io"},"podOverride":{"securityContext":{"runAsNonRoot":false,"runAsUser":0},"tolerations":[{"effect":"NoExecute","key":"node.kubernetes.io/not-ready","operator":"Exists","tolerationSeconds":300},{"effect":"NoExecute","key":"node.kubernetes.io/unreachable","operator":"Exists","tolerationSeconds":300}]},"options":{"hostName":"6cd5aec8-5bdb-4d73-a5a3-04f01e6427b0.acid-postgres-cluster.pgdata-acid-postgres-cluster-1","objectStorePath":"repo/6cd5aec8-5bdb-4d73-a5a3-04f01e6427b0/","pvcRepository":"repo/6cd5aec8-5bdb-4d73-a5a3-04f01e6427b0/","userName":"k10-admin"},"preferredVersion":"v1.0.0-alpha"}]},"status":{"state":"failed","actions":[{"name":"copy","object":{
"apiVersion":"","group":"","resource":"","kind":"pvc","name":"kanister-pvc-2vd75","namespace":"kasten-io"},"blueprint":"k10-persistentvolumeclaim-generic-volume-2.0.20","phases":[{"name":"copyToObjectStore","state":"failed"}],"artifacts":{"snapshot":{"keyValue":{"backupIdentifier":"{{
.Phases.copyToObjectStore.Output.backupID }}","backupPath":"{{
.Phases.copyToObjectStore.Output.backupRoot }}","funcVersion":"{{
.Phases.copyToObjectStore.Output.version }}","objectStorePath":"{{
.Options.pvcRepository }}","phySize":"{{
.Phases.copyToObjectStore.Output.phySize }}","size":"{{
.Phases.copyToObjectStore.Output.size
}}"}}},"deferPhase":{"name":"","state":""}}],"error":{"message":"{\"message\":\"Failed
while waiting for Pod to be
ready\",\"function\":\"kasten.io/k10/kio/kanister/function.copyVolumeDataPodFunc.func1\",\"linenumber\":153,\"file\":\"kasten.io/k10/kio/kanister/function/copy_volume_data.go:153\",\"fields\":[{\"name\":\"pod\",\"value\":\"copy-vol-data-hmrkf\"}],\"cause\":{\"message\":\"Pod
did not transition into running state.
Timeout:15m0s Namespace:kasten-io, Name:copy-vol-data-hmrkf: Context
done while polling: context deadline
exceeded\"}}"}}}}]}}}","{"message":"Failed to export snapshot
data","function":"kasten.io/k10/kio/exec/phases/phase.(*artifactCopier).convertSnapshots.func1","linenumber":408,"file":"kasten.io/k10/kio/exec/phases/phase/copy_snapshots.go:408","fields":[{"name":"type","value":"CSI"},{"name":"id","value":"k10-csi-snap-6plgr82jtrwhzwwf"}],"cause":{"message":"Error
creating portable
snapshot","function":"kasten.io/k10/kio/exec/phases/phase.(*gvcConverter).Convert","linenumber":1178,"file":"kasten.io/k10/kio/exec/phases/phase/copy_snapshots.go:1178","cause":{"message":"ActionSet
Failed","function":"kasten.io/k10/kio/kanister.(*Operation).Execute","linenumber":114,"file":"kasten.io/k10/kio/kanister/operation.go:114","fields":[{"name":"message","value":"{\"message\":\"Failed
while waiting for Pod to be
ready\",\"function\":\"kasten.io/k10/kio/kanister/function.copyVolumeDataPodFunc.func1\",\"linenumber\":153,\"file\":\"kasten.io/k10/kio/kanister/function/copy_volume_data.go:153\",\"fields\":[{\"name\":\"pod\",\"value\":\"copy-vol-data-xbzgg\"}],\"cause\":{\"message\":\"Pod
did not transition into running state.
Timeout:15m0s Namespace:kasten-io, Name:copy-vol-data-xbzgg: context
deadline
exceeded\"}}"},{"name":"actionSet","value":{"metadata":{"name":"k10-copy-k10-persistentvolumeclaim-generic-volume-2.0.20-kmp9lp","generateName":"k10-copy-k10-persistentvolumeclaim-generic-volume-2.0.20-kanister-pvc-qfxrw-kasten-io-pvc-","namespace":"kasten-io","uid":"224b6da6-5ba5-4b6d-9562-d5151e5ae335","resourceVersion":"399489","generation":4,"creationTimestamp":"2022-06-02T10:02:25Z","labels":{"kanister.io/JobID":"ec558fa4-e258-11ec-abba-76c5584997a1"},"managedFields":[{"manager":"Go-http-client","operation":"Update","apiVersion":"cr.kanister.io/v1alpha1","time":"2022-06-02T10:02:25Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:generateName":{},"f:labels":{".":{},"f:kanister.io/JobID":{}}},"f:spec":{".":{},"f:actions":{}},"f:status":{".":{},"f:actions":{},"f:error":{".":{},"f:message":{}},"f:state":{}}}}]},"spec":{"actions":[{"name":"copy","object":{"apiVersion":"","group":"","resource":"","kind":"pvc","name":"kanister-pvc-qfxrw","namespace":"kasten-io"},"blueprint":"k10-persistentvolumeclaim-generic-volume-2.0.20","secrets":{"artifactKey":{"apiVersion":"","group":"","resource":"","kind":"secret","name":"k10-content-store-passphrase-4zckf","namespace":"kasten-io"}},"profile":{"apiVersion":"v1alpha1","group":"","resource":"","kind":"profile","name":"kanister-portable-copy-f2x6p","namespace":"kasten-io"},"podOverride":{"securityContext":{"runAsNonRoot":false,"runAsUser":0},"tolerations":[{"effect":"NoExecute","key":"node.kubernetes.io/not-ready","operator":"Exists","tolerationSeconds":300},{"effect":"NoExecute","key":"node.kubernetes.io/unreachable","operator":"Exists","tolerationSeconds":300}]},"options":{"hostName":"6cd5aec8-5bdb-4d73-a5a3-04f01e6427b0.acid-postgres-cluster.pgdata-acid-postgres-cluster-2","objectStorePath":"repo/6cd5aec8-5bdb-4d73-a5a3-04f01e6427b0/","pvcRepository":"repo/6cd5aec8-5bdb-4d73-a5a3-04f01e6427b0/","userName":"k10-admin"},"preferredVersion":"v1.0.0-alpha"}]},"status":{"state":"failed","actions":[{"name":"copy","object":{
"apiVersion":"","group":"","resource":"","kind":"pvc","name":"kanister-pvc-qfxrw","namespace":"kasten-io"},"blueprint":"k10-persistentvolumeclaim-generic-volume-2.0.20","phases":[{"name":"copyToObjectStore","state":"failed"}],"artifacts":{"snapshot":{"keyValue":{"backupIdentifier":"{{
.Phases.copyToObjectStore.Output.backupID }}","backupPath":"{{
.Phases.copyToObjectStore.Output.backupRoot }}","funcVersion":"{{
.Phases.copyToObjectStore.Output.version }}","objectStorePath":"{{
.Options.pvcRepository }}","phySize":"{{
.Phases.copyToObjectStore.Output.phySize }}","size":"{{
.Phases.copyToObjectStore.Output.size
}}"}}},"deferPhase":{"name":"","state":""}}],"error":{"message":"{\"message\":\"Failed
while waiting for Pod to be
ready\",\"function\":\"kasten.io/k10/kio/kanister/function.copyVolumeDataPodFunc.func1\",\"linenumber\":153,\"file\":\"kasten.io/k10/kio/kanister/function/copy_volume_data.go:153\",\"fields\":[{\"name\":\"pod\",\"value\":\"copy-vol-data-xbzgg\"}],\"cause\":{\"message\":\"Pod
did not transition into running state.
Timeout:15m0s Namespace:kasten-io, Name:copy-vol-data-xbzgg: context
deadline exceeded\"}}"}}}}]}}}","{"message":"Failed to export snapshot
data","function":"kasten.io/k10/kio/exec/phases/phase.(*artifactCopier).convertSnapshots.func1","linenumber":408,"file":"kasten.io/k10/kio/exec/phases/phase/copy_snapshots.go:408","fields":[{"name":"type","value":"CSI"},{"name":"id","value":"k10-csi-snap-d8ngsfjr4l9hkk5t"}],"cause":{"message":"Error
creating portable
snapshot","function":"kasten.io/k10/kio/exec/phases/phase.(*gvcConverter).Convert","linenumber":1178,"file":"kasten.io/k10/kio/exec/phases/phase/copy_snapshots.go:1178","cause":{"message":"ActionSet
Failed","function":"kasten.io/k10/kio/kanister.(*Operation).Execute","linenumber":114,"file":"kasten.io/k10/kio/kanister/operation.go:114","fields":[{"name":"message","value":"{\"message\":\"Failed
while waiting for Pod to be
ready\",\"function\":\"kasten.io/k10/kio/kanister/function.copyVolumeDataPodFunc.func1\",\"linenumber\":153,\"file\":\"kasten.io/k10/kio/kanister/function/copy_volume_data.go:153\",\"fields\":[{\"name\":\"pod\",\"value\":\"copy-vol-data-2fsh7\"}],\"cause\":{\"message\":\"Pod
did not transition into running state.
Timeout:15m0s Namespace:kasten-io, Name:copy-vol-data-2fsh7: context
deadline
exceeded\"}}"},{"name":"actionSet","value":{"metadata":{"name":"k10-copy-k10-persistentvolumeclaim-generic-volume-2.0.20-k8r9rv","generateName":"k10-copy-k10-persistentvolumeclaim-generic-volume-2.0.20-kanister-pvc-qpzlt-kasten-io-pvc-","namespace":"kasten-io","uid":"3ada599a-6c69-4e25-9dae-cfdb4dda1879","resourceVersion":"399506","generation":4,"creationTimestamp":"2022-06-02T10:02:26Z","labels":{"kanister.io/JobID":"ec558fa4-e258-11ec-abba-76c5584997a1"},"managedFields":[{"manager":"Go-http-client","operation":"Update","apiVersion":"cr.kanister.io/v1alpha1","time":"2022-06-02T10:02:26Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:generateName":{},"f:labels":{".":{},"f:kanister.io/JobID":{}}},"f:spec":{".":{},"f:actions":{}},"f:status":{".":{},"f:actions":{},"f:error":{".":{},"f:message":{}},"f:state":{}}}}]},"spec":{"actions":[{"name":"copy","object":{"apiVersion":"","group":"","resource":"","kind":"pvc","name":"kanister-pvc-qpzlt","namespace":"kasten-io"},"blueprint":"k10-persistentvolumeclaim-generic-volume-2.0.20","secrets":{"artifactKey":{"apiVersion":"","group":"","resource":"","kind":"secret","name":"k10-content-store-passphrase-r9xjg","namespace":"kasten-io"}},"profile":{"apiVersion":"v1alpha1","group":"","resource":"","kind":"profile","name":"kanister-portable-copy-f2x6p","namespace":"kasten-io"},"podOverride":{"securityContext":{"runAsNonRoot":false,"runAsUser":0},"tolerations":[{"effect":"NoExecute","key":"node.kubernetes.io/not-ready","operator":"Exists","tolerationSeconds":300},{"effect":"NoExecute","key":"node.kubernetes.io/unreachable","operator":"Exists","tolerationSeconds":300}]},"options":{"hostName":"6cd5aec8-5bdb-4d73-a5a3-04f01e6427b0.acid-postgres-cluster.pgdata-acid-postgres-cluster-0","objectStorePath":"repo/6cd5aec8-5bdb-4d73-a5a3-04f01e6427b0/","pvcRepository":"repo/6cd5aec8-5bdb-4d73-a5a3-04f01e6427b0/","userName":"k10-admin"},"preferredVersion":"v1.0.0-alpha"}]},"status":{"state":"failed","actions":[{"name":"copy","object":{
"apiVersion":"","group":"","resource":"","kind":"pvc","name":"kanister-pvc-qpzlt","namespace":"kasten-io"},"blueprint":"k10-persistentvolumeclaim-generic-volume-2.0.20","phases":[{"name":"copyToObjectStore","state":"failed"}],"artifacts":{"snapshot":{"keyValue":{"backupIdentifier":"{{
.Phases.copyToObjectStore.Output.backupID }}","backupPath":"{{
.Phases.copyToObjectStore.Output.backupRoot }}","funcVersion":"{{
.Phases.copyToObjectStore.Output.version }}","objectStorePath":"{{
.Options.pvcRepository }}","phySize":"{{
.Phases.copyToObjectStore.Output.phySize }}","size":"{{
.Phases.copyToObjectStore.Output.size
}}"}}},"deferPhase":{"name":"","state":""}}],"error":{"message":"{\"message\":\"Failed
while waiting for Pod to be
ready\",\"function\":\"kasten.io/k10/kio/kanister/function.copyVolumeDataPodFunc.func1\",\"linenumber\":153,\"file\":\"kasten.io/k10/kio/kanister/function/copy_volume_data.go:153\",\"fields\":[{\"name\":\"pod\",\"value\":\"copy-vol-data-2fsh7\"}],\"cause\":{\"message\":\"Pod
did not transition into running state.
Timeout:15m0s Namespace:kasten-io, Name:copy-vol-data-2fsh7: context
deadline exceeded\"}}"}}}}]}}}"]'
file: kasten.io/k10/kio/exec/phases/phase/copy_snapshots.go:146
function: kasten.io/k10/kio/exec/phases/phase.(*artifactCopier).Copy
linenumber: 146
message: Error converting snapshots
file: kasten.io/k10/kio/exec/phases/phase/export.go:138
function: kasten.io/k10/kio/exec/phases/phase.(*exportRestorePointPhase).Run
linenumber: 138
message: Failed to copy artifacts
message: Job failed to be executed
fields: []
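To make the nested error above easier to read, the `cause` chain can be flattened with a few lines of Python. This is only a sketch under assumptions: `cause_chain` and the shortened `sample` are illustrative names and data modeled on the structure of the log above (a `message`, an optional `cause`, and a `fields` list whose `message` entry holds the Kanister error as a JSON string). It is not part of K10.

```python
import json


def cause_chain(err):
    """Return the messages from the outermost error down to the innermost cause."""
    messages = []
    while isinstance(err, dict):
        if "message" in err:
            messages.append(err["message"])
        nxt = err.get("cause")
        if nxt is None:
            # Assumption: the embedded error sits in a field named "message"
            # whose value is a JSON-encoded string, as in the log above.
            for field in err.get("fields", []):
                if field.get("name") == "message" and isinstance(field.get("value"), str):
                    try:
                        nxt = json.loads(field["value"])
                    except json.JSONDecodeError:
                        nxt = None
                    break
        err = nxt
    return messages


# Shortened, hand-written sample that mirrors one entry of the error list.
sample = {
    "message": "Failed to export snapshot data",
    "cause": {
        "message": "Error creating portable snapshot",
        "cause": {
            "message": "ActionSet Failed",
            "fields": [
                {
                    "name": "message",
                    "value": json.dumps(
                        {
                            "message": "Failed while waiting for Pod to be ready",
                            "cause": {"message": "Pod did not transition into running state."},
                        }
                    ),
                }
            ],
        },
    },
}

chain = cause_chain(sample)
print(" -> ".join(chain))
```

Read this way, every failed export ends in the same innermost message: the copy-vol-data pod never reached the running state within the 15m0s timeout.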
If I check the PVCs in the kasten-io namespace, I see that there are some PVCs stuck in Pending with the following message:
Name: kanister-pvc-8trnj
Namespace: kasten-io
StorageClass: rook-ceph-block
Status: Pending
Volume:
Labels: <none>
Annotations: volume.beta.kubernetes.io/storage-provisioner: rook-ceph.rbd.csi.ceph.com
Finalizers: [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode: Filesystem
DataSource:
APIGroup: snapshot.storage.k8s.io
Kind: VolumeSnapshot
Name: snapshot-copy-9qqdbn56
Used By: copy-vol-data-mmbhd
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ExternalProvisioning 2m48s (x26 over 8m45s) persistentvolume-controller waiting for a volume to be created, either by external provisioner "rook-ceph.rbd.csi.ceph.com" or manually created by system administrator
Normal Provisioning 15s (x12 over 8m45s) rook-ceph.rbd.csi.ceph.com_csi-rbdplugin-provisioner-d8bcc5fc4-bc7kd_44ea7e16-68f8-4506-bc45-928559eaf606 External provisioner is provisioning volume for claim "kasten-io/kanister-pvc-8trnj"
Warning ProvisioningFailed 15s (x12 over 8m45s) rook-ceph.rbd.csi.ceph.com_csi-rbdplugin-provisioner-d8bcc5fc4-bc7kd_44ea7e16-68f8-4506-bc45-928559eaf606 failed to provision volume with StorageClass "rook-ceph-block": error getting handle for DataSource Type VolumeSnapshot by Name snapshot-copy-9qqdbn56: error getting snapshot snapshot-copy-9qqdbn56 from api server: the server could not find the requested resource (get volumesnapshots.snapshot.storage.k8s.io snapshot-copy-9qqdbn56)
However, the snapshot exists:
Name: snapshot-copy-9qqdbn56
Namespace: kasten-io
Labels: <none>
Annotations: <none>
API Version: snapshot.storage.k8s.io/v1
Kind: VolumeSnapshot
Metadata:
Creation Timestamp: 2022-06-02T10:17:57Z
Finalizers:
snapshot.storage.kubernetes.io/volumesnapshot-as-source-protection
Generation: 1
Managed Fields:
API Version: snapshot.storage.k8s.io/v1
Fields Type: FieldsV1
fieldsV1:
f:spec:
.:
f:source:
.:
f:volumeSnapshotContentName:
f:volumeSnapshotClassName:
Manager: executor-server
Operation: Update
Time: 2022-06-02T10:17:57Z
API Version: snapshot.storage.k8s.io/v1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:finalizers:
f:status:
.:
f:boundVolumeSnapshotContentName:
f:creationTime:
f:readyToUse:
f:restoreSize:
Manager: snapshot-controller
Operation: Update
Time: 2022-06-02T10:17:57Z
Resource Version: 399711
UID: e7b7e672-8569-481c-9bd0-7f8c6d6fc60c
Spec:
Source:
Volume Snapshot Content Name: snapshot-copy-9qqdbn56-content-a42450dc-b3d6-4ee2-99c3-3a0d3ea0cb5a
Volume Snapshot Class Name: k10-clone-csi-rbdplugin-snapclass
Status:
Bound Volume Snapshot Content Name: snapshot-copy-9qqdbn56-content-a42450dc-b3d6-4ee2-99c3-3a0d3ea0cb5a
Creation Time: 2022-06-02T10:17:56Z
Ready To Use: true
Restore Size: 0
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SnapshotCreated 9m48s snapshot-controller Snapshot kasten-io/snapshot-copy-9qqdbn56 was successfully created by the CSI dr
Normal SnapshotReady 9m48s snapshot-controller Snapshot kasten-io/snapshot-copy-9qqdbn56 is ready to use.
There are also a few jobs where the export works.
The preflight check was successful as well:
Kubernetes Version Check:
Valid kubernetes version (v1.20.15) - OK
RBAC Check:
Kubernetes RBAC is enabled - OK
Aggregated Layer Check:
The Kubernetes Aggregated Layer is enabled - OK
W0602 10:29:06.839593 7 warnings.go:70] storage.k8s.io/v1beta1 CSIDriver is deprecated in v1.19+, unavailable in v1.22+; use storage.k8s.io/v1 CSIDriver
CSI Capabilities Check:
Using CSI GroupVersion snapshot.storage.k8s.io/v1 - OK
W0602 10:29:07.943947 7 warnings.go:70] storage.k8s.io/v1beta1 CSIDriver is deprecated in v1.19+, unavailable in v1.22+; use storage.k8s.io/v1 CSIDriver
W0602 10:29:07.945922 7 warnings.go:70] storage.k8s.io/v1beta1 CSIDriver is deprecated in v1.19+, unavailable in v1.22+; use storage.k8s.io/v1 CSIDriver
W0602 10:29:07.947792 7 warnings.go:70] storage.k8s.io/v1beta1 CSIDriver is deprecated in v1.19+, unavailable in v1.22+; use storage.k8s.io/v1 CSIDriver
W0602 10:29:09.143837 7 warnings.go:70] storage.k8s.io/v1beta1 CSIDriver is deprecated in v1.19+, unavailable in v1.22+; use storage.k8s.io/v1 CSIDriver
W0602 10:29:09.193755 7 warnings.go:70] storage.k8s.io/v1beta1 CSIDriver is deprecated in v1.19+, unavailable in v1.22+; use storage.k8s.io/v1 CSIDriver
W0602 10:29:09.243422 7 warnings.go:70] storage.k8s.io/v1beta1 CSIDriver is deprecated in v1.19+, unavailable in v1.22+; use storage.k8s.io/v1 CSIDriver
Validating Provisioners:
rook-ceph.rbd.csi.ceph.com:
Is a CSI Provisioner - OK
Storage Classes:
rook-ceph-block
Valid Storage Class - OK
Volume Snapshot Classes:
csi-rbdplugin-snapclass
Has k10.kasten.io/is-snapshot-class annotation set to true - OK
Has deletionPolicy 'Delete' - OK
k10-clone-csi-rbdplugin-snapclass
rook-ceph.cephfs.csi.ceph.com:
Is a CSI Provisioner - OK
Storage Classes:
rook-ceph-fs
Valid Storage Class - OK
Volume Snapshot Classes:
csi-cephfsplugin-snapclass
Has k10.kasten.io/is-snapshot-class annotation set to true - OK
Has deletionPolicy 'Delete' - OK
Validate Generic Volume Snapshot:
Pod Created successfully - OK
GVS Backup command executed successfully - OK
Pod deleted successfully - OK
Thanks in advance for your help.