Question

EFS Storage Class Backup

  • 21 December 2023

Userlevel 2

Hello

I have an app that uses the EFS storage class.

I’ve installed the external snapshotter and created the VolumeSnapshotClass as the docs describe.

Here is the VolumeSnapshotClass:

k get volumesnapshotclasses.snapshot.storage.k8s.io csi-efs-snapclass -o yaml

apiVersion: snapshot.storage.k8s.io/v1
deletionPolicy: Delete
driver: efs.csi.aws.com
kind: VolumeSnapshotClass
metadata:
  annotations:
    k10.kasten.io/is-snapshot-class: "true"
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"snapshot.storage.k8s.io/v1","deletionPolicy":"Delete","driver":"efs.csi.aws.com","kind":"VolumeSnapshotClass","metadata":{"annotations":{"k10.kasten.io/is-snapshot-class":"true"},"name":"csi-efs-snapclass"}}
  creationTimestamp: "2023-12-21T12:27:13Z"
  generation: 1
  name: csi-efs-snapclass
  resourceVersion: "79748590"
  uid: 3a44a554-653e-4aa6-a6aa-a899903a88c2

I also added the “k10.kasten.io/volume-snapshot-class: csi-efs-snapclass” annotation to the StorageClass.

But the backup is failing…

Here are the full details:

k -n jenkins describe backupactions scheduled-jhscs

Name:         scheduled-jhscs
Namespace:    jenkins
Labels:       k10.kasten.io/appName=jenkins
              k10.kasten.io/appNamespace=jenkins
              k10.kasten.io/isRunNow=true
              k10.kasten.io/policyName=jenkins-backup
              k10.kasten.io/policyNamespace=kasten-io
              k10.kasten.io/runActionName=policy-run-7pvwc
Annotations:  <none>
API Version:  actions.kio.kasten.io/v1alpha1
Kind:         BackupAction
Metadata:
  Creation Timestamp:  2023-12-21T13:00:34Z
  Resource Version:    624
  UID:                 edf5935b-a000-11ee-8d69-e682395b78bc
Spec:
  Expires At:  2023-12-21T13:08:19Z
  Filters:
  Profile:
    Name:          aws
    Namespace:     kasten-io
  Scheduled Time:  2023-12-21T13:00:28Z
  Subject:
    Name:       jenkins
    Namespace:  jenkins
Status:
  End Time:  2023-12-21T13:03:20Z
  Error:

    Cause:    {"cause":{"cause":{"cause":{"cause":{"message":"{\"message\":\"Failed to backup data\",\"function\":\"kasten.io/k10/kio/kanister/function.(*backupDataFunc).Exec\",\"linenumber\":131,\"file\":\"kasten.io/k10/kio/kanister/function/backup_data.go:131\",\"cause\":{\"message\":\"2 errors have occurred\",\"errors\":[{\"message\":\"Unable to get command executor\",\"function\":\"kasten.io/k10/kio/kopia.CreateKopiaRepository\",\"linenumber\":487,\"file\":\"kasten.io/k10/kio/kopia/repository.go:487\",\"cause\":{\"message\":\"pod is not yet ready\"}},{\"message\":\"Failed to get command executor\",\"function\":\"kasten.io/k10/kio/kopia.ConnectToKopiaRepository\",\"linenumber\":646,\"file\":\"kasten.io/k10/kio/kopia/repository.go:646\",\"cause\":{\"message\":\"pod is not yet ready\"}}]}}"},"fields":[{"name":"actionSet","value":{"metadata":{"creationTimestamp":"2023-12-21T13:03:19Z","generateName":"k10-backup-k10-namespace-generic-volume-2.0.39-jenkins-jenkins-namespace-","generation":5,"labels":{"kanister.io/JobID":"ee0d2d50-a000-11ee-aa6a-0645134124ff"},"managedFields":[{"apiVersion":"cr.kanister.io/v1alpha1","fieldsType":"FieldsV1","fieldsV1":{"f:status":{".":{},"f:actions":{},"f:error":{".":{},"f:message":{}},"f:progress":{},"f:state":{}}},"manager":"controller","operation":"Update","time":"2023-12-21T13:03:19Z"},{"apiVersion":"cr.kanister.io/v1alpha1","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:generateName":{},"f:labels":{".":{},"f:kanister.io/JobID":{}}},"f:spec":{".":{},"f:actions":{}}},"manager":"executor-server","operation":"Update","time":"2023-12-21T13:03:19Z"}],"name":"k10-backup-k10-namespace-generic-volume-2.0.39-jenkins-jen9qdj9","namespace":"kasten-io","resourceVersion":"79760960","uid":"d298cfe9-3fe1-4402-9cc3-c2d396bbd074"},"spec":{"actions":[{"blueprint":"k10-namespace-generic-volume-2.0.39","name":"backup","object":{"apiVersion":"","group":"","kind":"namespace","name":"jenkins","namespace":"jenkins","resource":""},"options":{"hostName
":"9c4e1b26-6ad6-4ece-a694-30e13a5e8711..cloned-jenkins-6bjcf5r","mountPath":"k10/cloned-jenkins-6bjcf5r","objectStorePath":"repo/9c4e1b26-6ad6-4ece-a694-30e13a5e8711/","pod":"backup-pvc-data-mr2dv","snapshotTags":"JobId:edf5935b-a000-11ee-8d69-e682395b78bc","userName":"k10-admin"},"podOverride":{"securityContext":{"runAsNonRoot":false,"runAsUser":0},"tolerations":[{"effect":"NoExecute","key":"node.kubernetes.io/not-ready","operator":"Exists","tolerationSeconds":300},{"effect":"NoExecute","key":"node.kubernetes.io/unreachable","operator":"Exists","tolerationSeconds":300}]},"preferredVersion":"v1.0.0-alpha","profile":{"apiVersion":"v1alpha1","group":"","kind":"profile","name":"k10-location-npbm9","namespace":"kasten-io","resource":""},"secrets":{"artifactKey":{"apiVersion":"","group":"","kind":"secret","name":"k10-content-store-passphrase-mr62w","namespace":"kasten-io","resource":""}}}]},"status":{"actions":[{"artifacts":{"snapshot":{"keyValue":{"backupIdentifier":"{{ .Phases.backupToObjectStore.Output.backupID }}","backupPath":"{{ .Options.mountPath }}","funcVersion":"{{ .Phases.backupToObjectStore.Output.version }}","objectStorePath":"{{ .Options.objectStorePath }}","phySize":"{{ .Phases.backupToObjectStore.Output.phySize }}","size":"{{ .Phases.backupToObjectStore.Output.size }}"}}},"blueprint":"k10-namespace-generic-volume-2.0.39","deferPhase":{"name":"","progress":{},"state":""},"name":"backup","object":{"apiVersion":"","group":"","kind":"namespace","name":"jenkins","namespace":"jenkins","resource":""},"phases":[{"name":"backupToObjectStore","progress":{},"state":"failed"}]}],"error":{"message":"{\"message\":\"Failed to backup data\",\"function\":\"kasten.io/k10/kio/kanister/function.(*backupDataFunc).Exec\",\"linenumber\":131,\"file\":\"kasten.io/k10/kio/kanister/function/backup_data.go:131\",\"cause\":{\"message\":\"2 errors have occurred\",\"errors\":[{\"message\":\"Unable to get command 
executor\",\"function\":\"kasten.io/k10/kio/kopia.CreateKopiaRepository\",\"linenumber\":487,\"file\":\"kasten.io/k10/kio/kopia/repository.go:487\",\"cause\":{\"message\":\"pod is not yet ready\"}},{\"message\":\"Failed to get command executor\",\"function\":\"kasten.io/k10/kio/kopia.ConnectToKopiaRepository\",\"linenumber\":646,\"file\":\"kasten.io/k10/kio/kopia/repository.go:646\",\"cause\":{\"message\":\"pod is not yet ready\"}}]}}"},"progress":{},"state":"failed"}}}],"file":"kasten.io/k10/kio/kanister/operation.go:153","function":"kasten.io/k10/kio/kanister.(*Operation).WaitForActionSet","linenumber":153,"message":"ActionSet Failed"},"fields":[{"name":"pvcName","value":"jenkins"},{"name":"namespace","value":"jenkins"}],"file":"kasten.io/k10/kio/exec/phases/backup/snapshot_data_phase.go:862","function":"kasten.io/k10/kio/exec/phases/backup.basicVolumeSnapshot.basicVolumeSnapshot.func1.func2","linenumber":862,"message":"Error snapshotting volume"},"fields":[{"name":"appName","value":"jenkins"},{"name":"appType","value":"statefulset"},{"name":"namespace","value":"jenkins"}],"file":"kasten.io/k10/kio/exec/phases/backup/snapshot_data_phase.go:873","function":"kasten.io/k10/kio/exec/phases/backup.basicVolumeSnapshot","linenumber":873,"message":"Failed to snapshot volumes"},"file":"kasten.io/k10/kio/exec/phases/backup/snapshot_data_phase.go:388","function":"kasten.io/k10/kio/exec/phases/backup.processVolumeArtifacts","linenumber":388,"message":"Failed snapshots for workload"}

    Message:  Job failed to be executed
  Exceptions:

    Cause:    {"cause":{"cause":{"cause":{"cause":{"cause":{"message":"resource name may not be empty"},"fields":[{"name":"scName","value":""}],"file":"kasten.io/k10/kio/exec/phases/phase/snapshot.go:853","function":"kasten.io/k10/kio/exec/phases/phase.ForceGVSOnStorageClass","linenumber":853,"message":"Could not get storageclass"},"fields":[{"name":"pvcName","value":"cloned-jenkins-6bjcf5r"},{"name":"namespace","value":"jenkins"}],"file":"kasten.io/k10/kio/snapshotinspector/cleanerops/cleaner_ops.go:134","function":"kasten.io/k10/kio/snapshotinspector/cleanerops.(*ProviderFetch).FetchVolumeInfos","linenumber":134,"message":"Could not check if pvc is gvs"},"file":"kasten.io/k10/kio/snapshotinspector/snapshotcleaner.go:135","function":"kasten.io/k10/kio/snapshotinspector.(*CleanSnapshotOps).FetchProviders","linenumber":135,"message":"Failed to fetch namespace PVCs"},"file":"kasten.io/k10/kio/exec/phases/backup/snapshot_cleanup_phase.go:102","function":"kasten.io/k10/kio/exec/phases/backup.(*SnapshotCleanupPhase).RunHelper","linenumber":102,"message":"Failed to fetch providers"},"fields":[],"message":"Failed to cleanup snapshots"}

    Message:  Failure in snapshot cleanup phase. Snapshots may be orphaned.
  Progress:   100
  Restore Point:
    Name:
  Result:
    Name:
  Start Time:  2023-12-21T13:00:34Z
  State:       Failed
Events:        <none>
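
The Cause fields above are deeply nested JSON, and one of the inner messages is itself a JSON document stored as an escaped string, so the root cause is hard to spot by eye. The following is a minimal sketch, not a K10 tool, of how the chain could be flattened for reading. The function name `cause_chain` and the `sample` data are illustrative assumptions; the sample is an abridged hand copy of the messages in the Error cause above, keeping only the `message`, `cause`, and `errors` keys.

```python
import json


def cause_chain(err):
    """Collect messages from the outermost error down to the innermost cause.

    Follows the "cause" and "errors" keys seen in the output above, and
    decodes messages that are themselves JSON documents stored as strings.
    """
    messages = []

    def walk(node):
        if isinstance(node, str):
            try:
                node = json.loads(node)
            except ValueError:
                # Plain text, not embedded JSON: keep it as a message.
                messages.append(node)
                return
        if not isinstance(node, dict):
            return
        msg = node.get("message")
        if isinstance(msg, str):
            if msg.lstrip().startswith("{"):
                walk(msg)  # message looks like embedded JSON, try to decode it
            else:
                messages.append(msg)
        for sub in node.get("errors", []):
            walk(sub)
        if "cause" in node:
            walk(node["cause"])

    walk(err)
    return messages


# Abridged sample based on the Error cause above (assumption: only the
# message/cause/errors keys matter for reading the chain).
inner = json.dumps({
    "message": "Failed to backup data",
    "cause": {
        "message": "2 errors have occurred",
        "errors": [
            {"message": "Unable to get command executor",
             "cause": {"message": "pod is not yet ready"}},
            {"message": "Failed to get command executor",
             "cause": {"message": "pod is not yet ready"}},
        ],
    },
})
sample = {
    "message": "Failed snapshots for workload",
    "cause": {
        "message": "Failed to snapshot volumes",
        "cause": {
            "message": "Error snapshotting volume",
            "cause": {
                "message": "ActionSet Failed",
                "cause": {"message": inner},
            },
        },
    },
}

for depth, msg in enumerate(cause_chain(sample)):
    print(f"{depth:2d}  {msg}")
```

For this sample the last line printed is "pod is not yet ready", which matches the innermost cause in the backup error: the failure happens while waiting for the backup pod, not in the VolumeSnapshotClass itself.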

 

Appreciate any help


9 comments

Badge

Hello,

Maybe you can try running kubestr (https://kubestr.io/). It’s a diagnostic tool that can give more information to help find the root cause.

Sincerely,

Haythem

Userlevel 2

Hello @Haythem Elkhouly 

Thank you for your reply

Here is the result:

./kubestr

**************************************
  _  ___   _ ___ ___ ___ _____ ___
  | |/ / | | | _ ) __/ __|_   _| _ \
  | ' <| |_| | _ \ _|\__ \ | | |   /
  |_|\_\\___/|___/___|___/ |_| |_|_\

Explore your Kubernetes storage options
**************************************
Kubernetes Version Check:
  Valid kubernetes version (v1.28.4-eks-8cb36c9)  -  OK

RBAC Check:
  Kubernetes RBAC is enabled  -  OK

Aggregated Layer Check:
  The Kubernetes Aggregated Layer is enabled  -  OK

Available Storage Provisioners:

  efs.csi.aws.com:
    Missing CSIDriver Object. Required by some provisioners.
    This is a CSI driver!
    (The following info may not be up to date. Please check with the provider for more information.)
    Provider:            AWS Elastic File System
    Website:             https://github.com/aws/aws-efs-csi-driver
    Description:         A Container Storage Interface (CSI) Driver for AWS Elastic File System (EFS)
    Additional Features:

    Storage Classes:
      * efs-sc
    Volume Snapshot Classes:
      * csi-efs-snapclass

    To perform a FIO test, run-
      ./kubestr fio -s <storage class>

    To perform a check for block device support, run-
      ./kubestr blockmount -s <storage class>

    To test CSI snapshot/restore functionality, run-
      ./kubestr csicheck -s <storage class> -v <volume snapshot class>

  kubernetes.io/aws-ebs:
    This is an in tree provisioner.

    Storage Classes:
      * gp2

    To perform a FIO test, run-
      ./kubestr fio -s <storage class>

    To perform a check for block device support, run-
      ./kubestr blockmount -s <storage class>

Userlevel 7
Badge +22

Try running the CSI check:

 ./kubestr csicheck -s <storage class> -v <volume snapshot class>

Userlevel 7
Badge +22

Take a look here. I am not familiar with the EFS CSI driver, but the docs say that in some cases you need to create an infrastructure profile:

“While the EFS CSI driver has begun supporting dynamic provisioning, it does not create new EFS volumes. Instead, it creates and uses access points within existing EFS volumes. The current AWS APIs do not support backups of individual access points.

However, K10 can take backups of these dynamically provisioned EFS volumes using the Shareable Volume Backup and Restore mechanism.

For all other operations, EFS requires an Infrastructure Profile. Please refer to AWS Infrastructure Profile on how to create one.”

https://docs.kasten.io/latest/install/storage.html#efs-csi

Userlevel 2

Hello @Geoff Burke 

Thanks for your reply

Here is the result

./kubestr csicheck -s efs-sc -v csi-efs-snapclass

Creating application
  -> Created pod (kubestr-csi-original-podkj47d) and pvc (kubestr-csi-original-pvccv8xr)
Taking a snapshot
Cleaning up resources
CSI checker test:
  Failed to create Snapshot: CSI Driver failed to create snapshot for PVC (kubestr-csi-original-pvccv8xr) in Namespace (default): client rate limiter Wait returned an error: rate: Wait(n=1) would exceed context deadline  -  Error
Error: Failed to create Snapshot: CSI Driver failed to create snapshot for PVC (kubestr-csi-original-pvccv8xr) in Namespace (default): client rate limiter Wait returned an error: rate: Wait(n=1) would exceed context deadline

 


 

./k10tools primer

Kubernetes Version Check:
  Valid kubernetes version (v1.28.4-eks-8cb36c9)  -  OK

RBAC Check:
  Kubernetes RBAC is enabled  -  OK

Aggregated Layer Check:
  The Kubernetes Aggregated Layer is enabled  -  OK

CSI Capabilities Check:
  Using CSI GroupVersion snapshot.storage.k8s.io/v1  -  OK

Validating Provisioners:
efs.csi.aws.com:
  Storage Classes:
    efs-sc
      Dynamic EFS CSI driver is supported via K10 Sharable Volume Backup. See https://docs.kasten.io/latest/install/shareable-volume.html.
      Valid Storage Class  -  OK

kubernetes.io/aws-ebs:
  Storage Classes:
    gp2
      Valid Storage Class  -  OK

Validate Generic Volume Snapshot:
  Pod created successfully  -  OK
  GVS Backup command executed successfully  -  OK
  Pod deleted successfully  -  OK

Userlevel 2

Also, as you mentioned, K10 supports EFS CSI.

But I don’t know why it fails…

I also enabled the Kanister sidecar support as the docs describe, but the backup still fails…

Badge

Hi,

Do you have an infrastructure profile for EFS? I know that when you install Kasten you don't need to pass the AWS key and secret, but you still need an infrastructure profile.

Sincerely,

Haythem

Userlevel 2

Hello @Haythem Elkhouly 

Thanks for your reply

I installed K10 with the IAM Roles for Service Accounts option.

After installation, I added a profile for K10 in the dashboard, using an AWS S3 bucket.

Is that what you mean?

Userlevel 6
Badge +2

@alifiroozi80 From the logs, it seems that K10 is using shareable volume backup, so I assume you are using dynamically provisioned EFS PVCs.

We found a problem with this workflow in some versions of K10, starting from 6.0.7.

We have already fixed this in the latest version of K10, and upgrading to it will resolve this issue for you.
