Solved

export jobs failing



Hello 

I set up the policies for the backup job. The snapshots were created successfully, but the export operation failed with errors.

 

The pod that the export job is trying to create for the volume shows the following errors:
 

0/5 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/5 nodes are available: 5 Preemption is not helpful for scheduling.
AttachVolume.Attach failed for volume "kio-57c7a59970eb11ef8059e2b67c8c8805-0" : rpc error: code = Internal desc = failed to attach disk: "bbdf69b2-e841-493c-b015-b6d86a530292" with node: "423e15a9-8fjd-45d9-df64-64208ca72806" err failed to attach cns volume: "bbdf69b2-e841-493c-b015-b6d86a530292" to node vm: "VirtualMachine:vm-93261 [VirtualCenterHost: iu-op-vcsa01.abc.com, UUID: 423e14a9-8fed-45d9-df64-34208ca02806, Datacenter: Datacenter [Datacenter: Datacenter:datacenter-7, VirtualCenterHost: iu-op-vcsa01.abc.com]]". fault: "(*types.LocalizedMethodFault)(0xc001211f20)({\n DynamicData: (types.DynamicData) {\n },\n Fault: (types.CnsFault) {\n BaseMethodFault: (types.BaseMethodFault) <nil>,\n Reason: (string) (len=16) \"VSLM task failed\"\n },\n LocalizedMessage: (string) (len=32) \"CnsFault error: VSLM task failed\"\n})\n". opId: "89eb24e6"

However, we have enough resources available to create more pods and PVCs.
 

Errors in the export job:

- cause:
    cause:
      cause:
        errors:
          - cause:
              cause:
                cause:
                  cause:
                    cause:
                      cause:
                        cause:
                          cause:
                            message: "client rate limiter Wait returned an error: context deadline exceeded"
                          file: github.com/kanisterio/kanister@v0.0.0-20240828182737-b6d930f12c93/pkg/kube/pod.go
                          function: github.com/kanisterio/kanister/pkg/kube.WaitForPodReady
                          linenumber: 412
                          message: Pod did not transition into running state.
                            Timeout:15m0s  Namespace:kasten-io,
                            Name:copy-vol-data-qpj9w
                        file: github.com/kanisterio/kanister@v0.0.0-20240828182737-b6d930f12c93/pkg/kube/pod_controller.go
                        function: github.com/kanisterio/kanister/pkg/kube.(*podController).WaitForPodReady
                        linenumber: 174
                        message: Pod failed to become ready in time
                      fields:
                        - name: pod
                          value: copy-vol-data-qpj9w
                        - name: namespace
                          value: kasten-io
                      file: kasten.io/k10/kio/kanister/function/kio_copy_volume_data.go:304
                      function: kasten.io/k10/kio/kanister/function.CopyVolumeData.copyVolumeDataPodExecFunc.func2
                      linenumber: 304
                      message: failed while waiting for Pod to be ready
                    file: kasten.io/k10/kio/kanister/function/kio_copy_volume_data.go:161
                    function: kasten.io/k10/kio/kanister/function.CopyVolumeData
                    linenumber: 161
                    message: Failed to execute copy volume data pod function
                  file: kasten.io/k10/kio/exec/internal/snapshotconverters/ac_gvc_converter.go:249
                  function: kasten.io/k10/kio/exec/internal/snapshotconverters.(*GVCConverterInternalAPIImpl).genericVolumeCopy
                  linenumber: 249
                  message: failed running copyVolumeData
                file: kasten.io/k10/kio/exec/internal/snapshotconverters/ac_gvc_converter.go:170
                function: kasten.io/k10/kio/exec/internal/snapshotconverters.(*GVCConverterInternalAPIImpl).CopySnapshotRestoredInPVC
                linenumber: 170
                message: failed running genericVolumeCopy
              file: kasten.io/k10/kio/exec/internal/snapshotconverters/ac_gvc_converter.go:77
              function: kasten.io/k10/kio/exec/internal/snapshotconverters.(*GVCConverter).Convert
              linenumber: 77
              message: Error creating portable snapshot
            fields:
              - name: type
                value: FCD
              - name: id
                value: 07ef12f1-6a7c-4804-a54a-bcccb76863a4:72d852ef-f8c6-49e9-8d8a-30fc2360903e_NTIgYTIgZTggMWYgNDggM2UgNDMgMTgtNjMgZmYgZjkgYjUgMDMgOTUgYjMgOTQvMTE=
            file: kasten.io/k10/kio/exec/phases/phase/artifactcopier.go:544
            function: kasten.io/k10/kio/exec/phases/phase.(*ArtifactCopier).convertSnapshots.func1
            linenumber: 544
            message: Failed to export snapshot data
          - cause:
              cause:
                cause:
                  cause:
                    cause:
                      cause:
                        cause:
                          cause:
                            message: "client rate limiter Wait returned an error: context deadline exceeded"
                          file: github.com/kanisterio/kanister@v0.0.0-20240828182737-b6d930f12c93/pkg/kube/pod.go
                          function: github.com/kanisterio/kanister/pkg/kube.WaitForPodReady
                          linenumber: 412
                          message: Pod did not transition into running state.
                            Timeout:15m0s  Namespace:kasten-io,
                            Name:copy-vol-data-dxsxw
                        file: github.com/kanisterio/kanister@v0.0.0-20240828182737-b6d930f12c93/pkg/kube/pod_controller.go
                        function: github.com/kanisterio/kanister/pkg/kube.(*podController).WaitForPodReady
                        linenumber: 174
                        message: Pod failed to become ready in time
                      fields:
                        - name: pod
                          value: copy-vol-data-dxsxw
                        - name: namespace
                          value: kasten-io
                      file: kasten.io/k10/kio/kanister/function/kio_copy_volume_data.go:304
                      function: kasten.io/k10/kio/kanister/function.CopyVolumeData.copyVolumeDataPodExecFunc.func2
                      linenumber: 304
                      message: failed while waiting for Pod to be ready
                    file: kasten.io/k10/kio/kanister/function/kio_copy_volume_data.go:161
                    function: kasten.io/k10/kio/kanister/function.CopyVolumeData
                    linenumber: 161
                    message: Failed to execute copy volume data pod function
                  file: kasten.io/k10/kio/exec/internal/snapshotconverters/ac_gvc_converter.go:249
                  function: kasten.io/k10/kio/exec/internal/snapshotconverters.(*GVCConverterInternalAPIImpl).genericVolumeCopy
                  linenumber: 249
                  message: failed running copyVolumeData
                file: kasten.io/k10/kio/exec/internal/snapshotconverters/ac_gvc_converter.go:170
                function: kasten.io/k10/kio/exec/internal/snapshotconverters.(*GVCConverterInternalAPIImpl).CopySnapshotRestoredInPVC
                linenumber: 170
                message: failed running genericVolumeCopy
              file: kasten.io/k10/kio/exec/internal/snapshotconverters/ac_gvc_converter.go:77
              function: kasten.io/k10/kio/exec/internal/snapshotconverters.(*GVCConverter).Convert
              linenumber: 77
              message: Error creating portable snapshot
            fields:
              - name: type
                value: FCD
              - name: id
                value: 1aa4f67d-b3b0-41f2-bdbf-03e9a4bf1232:869f3053-575b-4059-b53f-d46b02c9a74a_NTIgNTYgZTAgODIgZGIgMDYgMzggMWQtN2YgOWIgZGUgNzUgZTkgMDggMzMgZjkvMTE=
            file: kasten.io/k10/kio/exec/phases/phase/artifactcopier.go:544
            function: kasten.io/k10/kio/exec/phases/phase.(*ArtifactCopier).convertSnapshots.func1
            linenumber: 544
            message: Failed to export snapshot data
        message: 2 errors have occurred
      file: kasten.io/k10/kio/exec/phases/phase/artifactcopier.go:274
      function: kasten.io/k10/kio/exec/phases/phase.(*ArtifactCopier).Copy
      linenumber: 274
      message: Error converting snapshots
    file: kasten.io/k10/kio/exec/phases/phase/export.go:172
    function: kasten.io/k10/kio/exec/phases/phase.(*exportRestorePointPhase).Run
    linenumber: 172
    message: Failed to copy artifacts
  message: Job failed to be executed
- cause:
    cause:
      cause:
        errors:
          - cause:
              cause:
                cause:
                  cause:
                    cause:
                      cause:
                        cause:
                          cause:
                            message: "client rate limiter Wait returned an error: context deadline exceeded"
                          file: github.com/kanisterio/kanister@v0.0.0-20240828182737-b6d930f12c93/pkg/kube/pod.go
                          function: github.com/kanisterio/kanister/pkg/kube.WaitForPodReady
                          linenumber: 412
                          message: Pod did not transition into running state.
                            Timeout:15m0s  Namespace:kasten-io,
                            Name:copy-vol-data-scldg
                        file: github.com/kanisterio/kanister@v0.0.0-20240828182737-b6d930f12c93/pkg/kube/pod_controller.go
                        function: github.com/kanisterio/kanister/pkg/kube.(*podController).WaitForPodReady
                        linenumber: 174
                        message: Pod failed to become ready in time
                      fields:
                        - name: pod
                          value: copy-vol-data-scldg
                        - name: namespace
                          value: kasten-io
                      file: kasten.io/k10/kio/kanister/function/kio_copy_volume_data.go:304
                      function: kasten.io/k10/kio/kanister/function.CopyVolumeData.copyVolumeDataPodExecFunc.func2
                      linenumber: 304
                      message: failed while waiting for Pod to be ready
                    file: kasten.io/k10/kio/kanister/function/kio_copy_volume_data.go:161
                    function: kasten.io/k10/kio/kanister/function.CopyVolumeData
                    linenumber: 161
                    message: Failed to execute copy volume data pod function
                  file: kasten.io/k10/kio/exec/internal/snapshotconverters/ac_gvc_converter.go:249
                  function: kasten.io/k10/kio/exec/internal/snapshotconverters.(*GVCConverterInternalAPIImpl).genericVolumeCopy
                  linenumber: 249
                  message: failed running copyVolumeData
                file: kasten.io/k10/kio/exec/internal/snapshotconverters/ac_gvc_converter.go:170
                function: kasten.io/k10/kio/exec/internal/snapshotconverters.(*GVCConverterInternalAPIImpl).CopySnapshotRestoredInPVC
                linenumber: 170
                message: failed running genericVolumeCopy
              file: kasten.io/k10/kio/exec/internal/snapshotconverters/ac_gvc_converter.go:77
              function: kasten.io/k10/kio/exec/internal/snapshotconverters.(*GVCConverter).Convert
              linenumber: 77
              message: Error creating portable snapshot
            fields:
              - name: type
                value: FCD
              - name: id
                value: 07ef12f1-6a7c-4804-a54a-bcccb76863a4:72d852ef-f8c6-49e9-8d8a-30fc2360903e_NTIgYTIgZTggMWYgNDggM2UgNDMgMTgtNjMgZmYgZjkgYjUgMDMgOTUgYjMgOTQvMTE=
            file: kasten.io/k10/kio/exec/phases/phase/artifactcopier.go:544
            function: kasten.io/k10/kio/exec/phases/phase.(*ArtifactCopier).convertSnapshots.func1
            linenumber: 544
            message: Failed to export snapshot data
          - cause:
              cause:
                cause:
                  cause:
                    cause:
                      cause:
                        cause:
                          cause:
                            message: "client rate limiter Wait returned an error: context deadline exceeded"
                          file: github.com/kanisterio/kanister@v0.0.0-20240828182737-b6d930f12c93/pkg/kube/pod.go
                          function: github.com/kanisterio/kanister/pkg/kube.WaitForPodReady
                          linenumber: 412
                          message: Pod did not transition into running state.
                            Timeout:15m0s  Namespace:kasten-io,
                            Name:copy-vol-data-njnpt
                        file: github.com/kanisterio/kanister@v0.0.0-20240828182737-b6d930f12c93/pkg/kube/pod_controller.go
                        function: github.com/kanisterio/kanister/pkg/kube.(*podController).WaitForPodReady
                        linenumber: 174
                        message: Pod failed to become ready in time
                      fields:
                        - name: pod
                          value: copy-vol-data-njnpt
                        - name: namespace
                          value: kasten-io
                      file: kasten.io/k10/kio/kanister/function/kio_copy_volume_data.go:304
                      function: kasten.io/k10/kio/kanister/function.CopyVolumeData.copyVolumeDataPodExecFunc.func2
                      linenumber: 304
                      message: failed while waiting for Pod to be ready
                    file: kasten.io/k10/kio/kanister/function/kio_copy_volume_data.go:161
                    function: kasten.io/k10/kio/kanister/function.CopyVolumeData
                    linenumber: 161
                    message: Failed to execute copy volume data pod function
                  file: kasten.io/k10/kio/exec/internal/snapshotconverters/ac_gvc_converter.go:249
                  function: kasten.io/k10/kio/exec/internal/snapshotconverters.(*GVCConverterInternalAPIImpl).genericVolumeCopy
                  linenumber: 249
                  message: failed running copyVolumeData
                file: kasten.io/k10/kio/exec/internal/snapshotconverters/ac_gvc_converter.go:170
                function: kasten.io/k10/kio/exec/internal/snapshotconverters.(*GVCConverterInternalAPIImpl).CopySnapshotRestoredInPVC
                linenumber: 170
                message: failed running genericVolumeCopy
              file: kasten.io/k10/kio/exec/internal/snapshotconverters/ac_gvc_converter.go:77
              function: kasten.io/k10/kio/exec/internal/snapshotconverters.(*GVCConverter).Convert
              linenumber: 77
              message: Error creating portable snapshot
            fields:
              - name: type
                value: FCD
              - name: id
                value: 1aa4f67d-b3b0-41f2-bdbf-03e9a4bf1232:869f3053-575b-4059-b53f-d46b02c9a74a_NTIgNTYgZTAgODIgZGIgMDYgMzggMWQtN2YgOWIgZGUgNzUgZTkgMDggMzMgZjkvMTE=
            file: kasten.io/k10/kio/exec/phases/phase/artifactcopier.go:544
            function: kasten.io/k10/kio/exec/phases/phase.(*ArtifactCopier).convertSnapshots.func1
            linenumber: 544
            message: Failed to export snapshot data
        message: 2 errors have occurred
      file: kasten.io/k10/kio/exec/phases/phase/artifactcopier.go:274
      function: kasten.io/k10/kio/exec/phases/phase.(*ArtifactCopier).Copy
      linenumber: 274
      message: Error converting snapshots
    file: kasten.io/k10/kio/exec/phases/phase/export.go:172
    function: kasten.io/k10/kio/exec/phases/phase.(*exportRestorePointPhase).Run
    linenumber: 172
    message: Failed to copy artifacts
  message: Job failed to be executed
- cause:
    cause:
      cause:
        errors:
          - cause:
              cause:
                cause:
                  cause:
                    cause:
                      cause:
                        cause:
                          cause:
                            message: "client rate limiter Wait returned an error: context deadline exceeded"
                          file: github.com/kanisterio/kanister@v0.0.0-20240828182737-b6d930f12c93/pkg/kube/pod.go
                          function: github.com/kanisterio/kanister/pkg/kube.WaitForPodReady
                          linenumber: 412
                          message: Pod did not transition into running state.
                            Timeout:15m0s  Namespace:kasten-io,
                            Name:copy-vol-data-btqgt
                        file: github.com/kanisterio/kanister@v0.0.0-20240828182737-b6d930f12c93/pkg/kube/pod_controller.go
                        function: github.com/kanisterio/kanister/pkg/kube.(*podController).WaitForPodReady
                        linenumber: 174
                        message: Pod failed to become ready in time
                      fields:
                        - name: pod
                          value: copy-vol-data-btqgt
                        - name: namespace
                          value: kasten-io
                      file: kasten.io/k10/kio/kanister/function/kio_copy_volume_data.go:304
                      function: kasten.io/k10/kio/kanister/function.CopyVolumeData.copyVolumeDataPodExecFunc.func2
                      linenumber: 304
                      message: failed while waiting for Pod to be ready
                    file: kasten.io/k10/kio/kanister/function/kio_copy_volume_data.go:161
                    function: kasten.io/k10/kio/kanister/function.CopyVolumeData
                    linenumber: 161
                    message: Failed to execute copy volume data pod function
                  file: kasten.io/k10/kio/exec/internal/snapshotconverters/ac_gvc_converter.go:249
                  function: kasten.io/k10/kio/exec/internal/snapshotconverters.(*GVCConverterInternalAPIImpl).genericVolumeCopy
                  linenumber: 249
                  message: failed running copyVolumeData
                file: kasten.io/k10/kio/exec/internal/snapshotconverters/ac_gvc_converter.go:170
                function: kasten.io/k10/kio/exec/internal/snapshotconverters.(*GVCConverterInternalAPIImpl).CopySnapshotRestoredInPVC
                linenumber: 170
                message: failed running genericVolumeCopy
              file: kasten.io/k10/kio/exec/internal/snapshotconverters/ac_gvc_converter.go:77
              function: kasten.io/k10/kio/exec/internal/snapshotconverters.(*GVCConverter).Convert
              linenumber: 77
              message: Error creating portable snapshot
            fields:
              - name: type
                value: FCD
              - name: id
                value: 1aa4f67d-b3b0-41f2-bdbf-03e9a4bf1232:869f3053-575b-4059-b53f-d46b02c9a74a_NTIgNTYgZTAgODIgZGIgMDYgMzggMWQtN2YgOWIgZGUgNzUgZTkgMDggMzMgZjkvMTE=
            file: kasten.io/k10/kio/exec/phases/phase/artifactcopier.go:544
            function: kasten.io/k10/kio/exec/phases/phase.(*ArtifactCopier).convertSnapshots.func1
            linenumber: 544
            message: Failed to export snapshot data
          - cause:
              cause:
                cause:
                  cause:
                    cause:
                      cause:
                        cause:
                          cause:
                            message: 'container "container" in pod "copy-vol-data-fx9gg" is waiting to
                              start: ContainerCreating'
                          file: github.com/kanisterio/kanister@v0.0.0-20240828182737-b6d930f12c93/pkg/kube/pod.go
                          function: github.com/kanisterio/kanister/pkg/kube.getErrorFromLogs
                          linenumber: 353
                          message: Failed to fetch logs from the pod
                        file: github.com/kanisterio/kanister@v0.0.0-20240828182737-b6d930f12c93/pkg/kube/pod_controller.go
                        function: github.com/kanisterio/kanister/pkg/kube.(*podController).WaitForPodReady
                        linenumber: 174
                        message: Pod failed to become ready in time
                      fields:
                        - name: pod
                          value: copy-vol-data-fx9gg
                        - name: namespace
                          value: kasten-io
                      file: kasten.io/k10/kio/kanister/function/kio_copy_volume_data.go:304
                      function: kasten.io/k10/kio/kanister/function.CopyVolumeData.copyVolumeDataPodExecFunc.func2
                      linenumber: 304
                      message: failed while waiting for Pod to be ready
                    file: kasten.io/k10/kio/kanister/function/kio_copy_volume_data.go:161
                    function: kasten.io/k10/kio/kanister/function.CopyVolumeData
                    linenumber: 161
                    message: Failed to execute copy volume data pod function
                  file: kasten.io/k10/kio/exec/internal/snapshotconverters/ac_gvc_converter.go:249
                  function: kasten.io/k10/kio/exec/internal/snapshotconverters.(*GVCConverterInternalAPIImpl).genericVolumeCopy
                  linenumber: 249
                  message: failed running copyVolumeData
                file: kasten.io/k10/kio/exec/internal/snapshotconverters/ac_gvc_converter.go:170
                function: kasten.io/k10/kio/exec/internal/snapshotconverters.(*GVCConverterInternalAPIImpl).CopySnapshotRestoredInPVC
                linenumber: 170
                message: failed running genericVolumeCopy
              file: kasten.io/k10/kio/exec/internal/snapshotconverters/ac_gvc_converter.go:77
              function: kasten.io/k10/kio/exec/internal/snapshotconverters.(*GVCConverter).Convert
              linenumber: 77
              message: Error creating portable snapshot
            fields:
              - name: type
                value: FCD
              - name: id
                value: 07ef12f1-6a7c-4804-a54a-bcccb76863a4:72d852ef-f8c6-49e9-8d8a-30fc2360903e_NTIgYTIgZTggMWYgNDggM2UgNDMgMTgtNjMgZmYgZjkgYjUgMDMgOTUgYjMgOTQvMTE=
            file: kasten.io/k10/kio/exec/phases/phase/artifactcopier.go:544
            function: kasten.io/k10/kio/exec/phases/phase.(*ArtifactCopier).convertSnapshots.func1
            linenumber: 544
            message: Failed to export snapshot data
        message: 2 errors have occurred
      file: kasten.io/k10/kio/exec/phases/phase/artifactcopier.go:274
      function: kasten.io/k10/kio/exec/phases/phase.(*ArtifactCopier).Copy
      linenumber: 274
      message: Error converting snapshots
    file: kasten.io/k10/kio/exec/phases/phase/export.go:172
    function: kasten.io/k10/kio/exec/phases/phase.(*exportRestorePointPhase).Run
    linenumber: 172
    message: Failed to copy artifacts
  message: Job failed to be executed
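
Each of the nested `cause` chains above ends in one innermost message, which is the part worth reading first. Below is a minimal Python sketch, assuming the error has already been loaded into plain dicts and lists (for example with a YAML parser). It collects the innermost message and the pod name for every chain. The function name `root_causes` and the trimmed `sample` data are illustrative only and are not part of Kasten or Kanister.

```python
def root_causes(node, pod=None):
    """Yield (pod, innermost message) for every cause chain under node."""
    if isinstance(node, list):
        for item in node:
            yield from root_causes(item, pod)
        return
    # Remember the pod name if this level carries one in its `fields` list.
    for field in node.get("fields", []):
        if field.get("name") == "pod":
            pod = field.get("value")
    # A node can continue through a single `cause` or a list of `errors`.
    children = []
    if "cause" in node:
        children.append(node["cause"])
    children.extend(node.get("errors", []))
    if not children:
        # No further nesting: this is the innermost message of the chain.
        yield pod, node.get("message")
        return
    for child in children:
        yield from root_causes(child, pod)


# Trimmed stand-in for the third job attempt shown above (hypothetical shape).
sample = [{
    "message": "Job failed to be executed",
    "cause": {
        "message": "Error converting snapshots",
        "cause": {
            "message": "2 errors have occurred",
            "errors": [
                {
                    "message": "failed while waiting for Pod to be ready",
                    "fields": [{"name": "pod", "value": "copy-vol-data-btqgt"}],
                    "cause": {"message": "client rate limiter Wait returned an error: context deadline exceeded"},
                },
                {
                    "message": "failed while waiting for Pod to be ready",
                    "fields": [{"name": "pod", "value": "copy-vol-data-fx9gg"}],
                    "cause": {"message": 'container "container" in pod "copy-vol-data-fx9gg" is waiting to start: ContainerCreating'},
                },
            ],
        },
    },
}]

for pod, message in root_causes(sample):
    print(f"{pod}: {message}")
```

Read this way, every attempt reduces to the same symptom: the `copy-vol-data-*` pods never leave `ContainerCreating` within the 15 minute timeout, which points at the volume attach error rather than at the export logic itself.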



Running the pre-flight check:

k10tools primer storage check vsphere -f ./vsphere_check.yaml with either 

I can see this error while the test pod is being created:
 

0/5 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/5 nodes are available: 5 Preemption is not helpful for scheduling.
MountVolume.MountDevice failed for volume "primer-test-vsphere-pv-dk87k" : rpc error: code = Internal desc = error in formating and mounting volume. Parameters: {c76493ab-e218-4a3b-a98c-8a8c14eea1c1 ext4 /var/lib/kubelet/plugins/kubernetes.io/csi/csi.vsphere.vmware.com/070d225ca99c107b528f142fa6117f857de5ae79bee817092cef0296be4e13d8/globalmount [] false} err: failed to mount volume as "ext4"; already contains xfs: error: mount failed: exit status 32 mounting arguments: -t ext4 -o defaults /dev/disk/by-id/wwn-0x6000c2960c97253497ea9cc3d3a9b4fd /var/lib/kubelet/plugins/kubernetes.io/csi/csi.vsphere.vmware.com/070d225ca99c107b528f142fa6117f857de5ae79bee817092cef0296be4e13d8/globalmount output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/csi.vsphere.vmware.com/070d225ca99c107b528f142fa6117f857de5ae79bee817092cef0296be4e13d8/globalmount: wrong fs type, bad option, bad superblock on /dev/sdf, missing codepage or helper program, or other error.

Output of the vSphere check:

k10tools primer storage check vsphere -f ./vsphere_check.yaml
Using "./vsphere_check.yaml" file content as config source
-> Setup Provider
-> Create Namespace
-> Create Volume
-> Create Test Pod
-> Write Data
-> Create Snapshot
-> Delete Test Pod
-> Delete Volume
   - Delete PVC 'primer-test-vsphere-pvc-4rpkv'
-> Restore Volume
   - Restore vSphere FCD
   - Restore PV
   - Restore PVC
-> Restore Test Pod
-> Delete Test Pod
-> Delete Snapshot
-> Delete Volume
   - Delete PVC 'primer-test-vsphere-pvc-qbwq4'
-> Delete Namespace
VSphere backup/restore checker:
  Created FCD provider  -  OK
  Created namespace 'primer-test-ns-4jbfn'  -  OK
  Created PVC 'primer-test-vsphere-pvc-4rpkv'  -  OK
  Created test pod 'primer-test-pod-62rbn'  -  OK
  Wrote data '2024-09-12 13:39:21.525605548 +0100 BST m=+5.059465608' to pod 'primer-test-pod-62rbn'  -  OK
  Created snapshot 'a9bdf30a-bc7c-4b16-9895-49f1fb01f165:ce5bfa54-bc8a-4752-91de-bc3c363c63fa' for FCD 'a9bdf30a-bc7c-4b16-9895-49f1fb01f165' (PV 'pvc-c640def9-efa2-4dd2-935e-6096327fd95a')  -  OK
  Deleted test pod 'primer-test-pod-62rbn'  -  OK
  Deleted PVC 'primer-test-vsphere-pvc-4rpkv', PV 'pvc-c640def9-efa2-4dd2-935e-6096327fd95a' and FCD 'a9bdf30a-bc7c-4b16-9895-49f1fb01f165'  -  OK
  Restored FCD 'd01f4b12-44db-4933-b214-6bde89bffecd', PV 'primer-test-vsphere-pv-mdd6l' and PVC 'primer-test-vsphere-pvc-qbwq4' from snapshot 'a9bdf30a-bc7c-4b16-9895-49f1fb01f165:ce5bfa54-bc8a-4752-91de-bc3c363c63fa'  -  OK
  Failed to create test pod: Pod did not transition into running state. Timeout:15m0s  Namespace:primer-test-ns-4jbfn, Name:primer-test-pod-6n76c: client rate limiter Wait returned an error: rate: Wait(n=1) would exceed context deadline  -  Error
  Failed to delete test pod 'primer-test-pod-6n76c': client rate limiter Wait returned an error: context deadline exceeded  -  Error
  Failed to delete FCD snapshot 'a9bdf30a-bc7c-4b16-9895-49f1fb01f165:ce5bfa54-bc8a-4752-91de-bc3c363c63fa': {"message":"Failed to create a task for the DeleteSnapshot invocation on an IVD Protected Entity","function":"kasten.io/k10/kio/storage/blockstorage/vmware.(*FcdProvider).SnapshotDelete.func1","linenumber":342,"file":"kasten.io/k10/kio/storage/blockstorage/vmware/vmware.go:342","cause":{"message":"Post \"https://bs-fp-vcsa01.blacksun.com/vslm/sdk\": context deadline exceeded"}}  -  Error
  Failed to delete PVC 'primer-test-vsphere-pvc-qbwq4': client rate limiter Wait returned an error: context deadline exceeded  -  Error
  Failed to delete namespace 'primer-test-ns-4jbfn': client rate limiter Wait returned an error: context deadline exceeded  -  Error
Error: ["{\"message\":\"Failed to create test pod: Pod did not transition into running state. Timeout:15m0s  Namespace:primer-test-ns-4jbfn, Name:primer-test-pod-6n76c: client rate limiter Wait returned an error: rate: Wait(n=1) would exceed context deadline\",\"function\":\"kasten.io/k10/kio/tools/k10primer.(*TestRetVal).Errors\",\"linenumber\":180,\"file\":\"kasten.io/k10/kio/tools/k10primer/k10primer.go:180\"}","{\"message\":\"Failed to delete test pod 'primer-test-pod-6n76c': client rate limiter Wait returned an error: context deadline exceeded\",\"function\":\"kasten.io/k10/kio/tools/k10primer.(*TestRetVal).Errors\",\"linenumber\":180,\"file\":\"kasten.io/k10/kio/tools/k10primer/k10primer.go:180\"}","{\"message\":\"Failed to delete FCD snapshot 'a9bdf30a-bc7c-4b16-9895-49f1fb01f165:ce5bfa54-bc8a-4752-91de-bc3c363c63fa': {\\\"message\\\":\\\"Failed to create a task for the DeleteSnapshot invocation on an IVD Protected Entity\\\",\\\"function\\\":\\\"kasten.io/k10/kio/storage/blockstorage/vmware.(*FcdProvider).SnapshotDelete.func1\\\",\\\"linenumber\\\":342,\\\"file\\\":\\\"kasten.io/k10/kio/storage/blockstorage/vmware/vmware.go:342\\\",\\\"cause\\\":{\\\"message\\\":\\\"Post \\\\\\\"https://bs-fp-vcsa01.blacksun.com/vslm/sdk\\\\\\\": context deadline exceeded\\\"}}\",\"function\":\"kasten.io/k10/kio/tools/k10primer.(*TestRetVal).Errors\",\"linenumber\":180,\"file\":\"kasten.io/k10/kio/tools/k10primer/k10primer.go:180\"}","{\"message\":\"Failed to delete PVC 'primer-test-vsphere-pvc-qbwq4': client rate limiter Wait returned an error: context deadline exceeded\",\"function\":\"kasten.io/k10/kio/tools/k10primer.(*TestRetVal).Errors\",\"linenumber\":180,\"file\":\"kasten.io/k10/kio/tools/k10primer/k10primer.go:180\"}","{\"message\":\"Failed to delete namespace 'primer-test-ns-4jbfn': client rate limiter Wait returned an error: context deadline 
exceeded\",\"function\":\"kasten.io/k10/kio/tools/k10primer.(*TestRetVal).Errors\",\"linenumber\":180,\"file\":\"kasten.io/k10/kio/tools/k10primer/k10primer.go:180\"}"]
[root@BS-S-K3S-N1 kasten]# 
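
The checker summary above prints one line per step in the form `<description>  -  <status>`. As a rough sketch, assuming that layout holds for every result line, the failed steps can be pulled out with a few lines of Python. The helper name `failed_steps` and the shortened `report` string are illustrative and not part of `k10tools`.

```python
def failed_steps(report):
    """Return the description of every step whose status is 'Error'."""
    failures = []
    for line in report.splitlines():
        # Split on the last "  -  " so descriptions containing dashes survive.
        description, separator, status = line.strip().rpartition("  -  ")
        if separator and status == "Error":
            failures.append(description)
    return failures


# Three lines copied from the summary above.
report = """\
  Created FCD provider  -  OK
  Failed to delete PVC 'primer-test-vsphere-pvc-qbwq4': client rate limiter Wait returned an error: context deadline exceeded  -  Error
  Failed to delete namespace 'primer-test-ns-4jbfn': client rate limiter Wait returned an error: context deadline exceeded  -  Error
"""

for step in failed_steps(report):
    print(step)
```

In the full output, the first failing step is the creation of the test pod on the restored volume. The later delete failures all report `context deadline exceeded`, so they look like fallout from the same timeout.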




k10tools primer storage check csi -f ./csi_check.yaml

Using "./csi_check.yaml" file content as config source
CSI Snapshot Walkthrough:
  Not a supported CSI driver (csi.vsphere.vmware.com)  -  Error
Error: {"message":"Not a supported CSI driver (csi.vsphere.vmware.com)","function":"kasten.io/k10/kio/tools/k10primer.(*TestRetVal).Errors","linenumber":180,"file":"kasten.io/k10/kio/tools/k10primer/k10primer.go:180"}

 

 

k10tools provider-snapshots list -t FCD
This lists all the snapshots that were created successfully.


Our environment:

vSphere Client version 7.0.3.01400
k3s version v1.29.6+k3s2 (b4b156d9)

vSphere CSI: csi.vsphere.vmware.com

 

Images:

vsphere-csi-controller
image: gcr.io/cloud-provider-vsphere/csi/release/driver:v3.0.0

csi-provisioner
image: k8s.gcr.io/sig-storage/csi-provisioner:v3.4.0

csi-snapshotter
image: k8s.gcr.io/sig-storage/csi-snapshotter:v6.2.1


4 comments

  • Comes here often
  • 11 comments
  • September 12, 2024

Hi @msaeed 
  The error 'failed to mount volume as "ext4"; already contains xfs' generally occurs when the underlying virtual disk (VMDK) in vSphere is already formatted with a different file system than the one Kubernetes tries to mount.
  With the vSphere CSI driver, this mismatch can happen when the disk is already formatted with another file system, such as xfs, and Kubernetes tries to mount it as ext4, the default.
  If you want to mount an xfs volume in a Kubernetes pod, you may need to change the fstype parameter in the StorageClass.
  Below is a sample StorageClass with the fstype parameter:
   

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: vsphere-csi-sc
provisioner: csi.vsphere.vmware.com
parameters:
  fstype: xfs   # Make sure this matches the desired file system type
reclaimPolicy: Delete
volumeBindingMode: Immediate
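
To confirm which two file systems are in conflict, the relevant part of the pod event can be extracted mechanically. This is only a sketch, assuming the event text keeps the wording seen in this thread. The pattern and the helper name `fs_mismatch` are illustrative and not part of any Kubernetes or vSphere tooling.

```python
import re

# Matches the wording of the mount failure quoted earlier in this thread.
MISMATCH = re.compile(
    r'failed to mount volume as "(?P<requested>[^"]+)"; '
    r"already contains (?P<existing>\w+)"
)


def fs_mismatch(event_message):
    """Return (requested, existing) file systems, or None if no mismatch is reported."""
    match = MISMATCH.search(event_message)
    if match is None:
        return None
    return match["requested"], match["existing"]


event = (
    'err: failed to mount volume as "ext4"; already contains xfs: '
    "error: mount failed: exit status 32"
)
print(fs_mismatch(event))
```

The first value is the file system the mount was attempted with, and the second is what is already on the disk. The StorageClass should request the second one.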


Thanks,
Pavithra


  • Author
  • Comes here often
  • 8 comments
  • September 12, 2024

Hi Pavithra 

Thanks for responding. 

I ran the pre-flight vsphere_check with a different storage class that is not xfs, and the tests passed.

But when I ran the job again in Kasten for that same storage class, the export jobs still failed with the same error as before:
 

0/5 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/5 nodes are available: 5 Preemption is not helpful for scheduling.
AttachVolume.Attach failed for volume "kio-f946c101710d11ef8059e2b67c8c8805-1" : rpc error: code = Internal desc = failed to attach disk: "b265cef3-fa49-4d58-afc1-703d27f25e70" with node: "423e15a9-8fed-45d9-df64-64208ca72806" err failed to attach cns volume: "b265cef3-fa49-4d58-afc1-703d27f25e70" to node vm: "VirtualMachine:vm-93261 [VirtualCenterHost: ui-ea-vcsa01.abc.com, UUID: 423e15a9-8fed-45d9-df64-64208ca72806, Datacenter: Datacenter [Datacenter: Datacenter:datacenter-7, VirtualCenterHost: ui-ea-vcsa01.abc.com]]". fault: "(*types.LocalizedMethodFault)(0xc000c53980)({\n DynamicData: (types.DynamicData) {\n },\n Fault: (types.CnsFault) {\n BaseMethodFault: (types.BaseMethodFault) <nil>,\n Reason: (string) (len=16) \"VSLM task failed\"\n },\n LocalizedMessage: (string) (len=32) \"CnsFault error: VSLM task failed\"\n})\n". opId: "89eb3af2"


Running the csi_check also returns:
 

Using "./csi_check.yaml" file content as config source
CSI Snapshot Walkthrough:
  Not a supported CSI driver (csi.vsphere.vmware.com)  -  Error
Error: {"message":"Not a supported CSI driver (csi.vsphere.vmware.com)","function":"kasten.io/k10/kio/tools/k10primer.(*TestRetVal).Errors","linenumber":180,"file":"kasten.io/k10/kio/tools/k10primer/k10primer.go:180"}

  • Comes here often
  • 11 comments
  • Answer
  • September 12, 2024

Hi @msaeed 
   Thanks for clarifying that you are now using a different StorageClass that is not xfs.
   
   In that case, the remaining issue is related to the vSphere CSI driver. Please see the GitHub issue below:
   https://github.com/kubernetes-sigs/vsphere-csi-driver/issues/1416
   
   The solution is to enable Changed Block Tracking (CBT) on the existing worker nodes.
   Please refer to the VMware KB article: https://kb.vmware.com/s/article/88193
        
Thanks,
Pavithra


  • Author
  • Comes here often
  • 8 comments
  • September 13, 2024

Hi Pavithra 

Thanks for your help and for providing the useful material.
The issue was resolved after enabling CBT.

 

Thanks 

