Question

Unable to export snapshots when PVCs are in ReadWriteMany access mode


Hi,

I'm running an RKE2 cluster with Portworx storage and 37 deployments.

I have a policy to back up the deployments every day and export the backups to generic S3 storage.

Before September everything ran perfectly.

In September some exports started failing (2 out of 37).

Last week I decided to migrate my PVCs from the native provisioner to CSI.

Now every deployment on a CSI storage class fails to export.

The old deployments that were not migrated still export fine.

I tried rolling the migrated PVCs back to native and... same problem: export failed.

So I ran some more tests (a minimal reproduction sketch follows the list):

  • deploy an application with a ReadWriteOnce volume (native) > export OK
  • deploy an application with a ReadWriteOnce volume (CSI) > export OK
  • deploy an application with a ReadWriteMany volume (native) > export failed
  • deploy an application with a ReadWriteMany volume (CSI) > export failed
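A bare RWX claim on the CSI class is enough to hit the failing case (a sketch; the storage class name comes from the describe output later in this thread, the PVC name is hypothetical, and an application pod using the claim would be backed up as usual):

cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rwx-repro-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteMany   # switching this to ReadWriteOnce makes the export succeed
  resources:
    requests:
      storage: 1Gi
  storageClassName: px-csi-cms
EOF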

- cause:
    cause:
      cause:
        cause:
          cause:
            cause:
              file: kasten.io/k10/kio/kanister/function/kio_copy_volume_data.go:319
              function: kasten.io/k10/kio/kanister/function.CopyVolumeData.copyVolumeDataPodExecFunc.func1
              linenumber: 319
              message: Failed to get snapshot ID from create snapshot output
            file: kasten.io/k10/kio/kanister/function/kio_copy_volume_data.go:129
            function: kasten.io/k10/kio/kanister/function.CopyVolumeData
            linenumber: 129
            message: Failed to execute copy volume data pod function
          file: kasten.io/k10/kio/exec/phases/phase/copy_snapshots.go:1635
          function: kasten.io/k10/kio/exec/phases/phase.(*gvcConverter).Convert
          linenumber: 1635
          message: Error creating portable snapshot
        fields:
          - name: type
            value: CSI
          - name: id
            value: k10-csi-snap-5lkfmkllzwhkxvrb
        file: kasten.io/k10/kio/exec/phases/phase/copy_snapshots.go:442
        function: kasten.io/k10/kio/exec/phases/phase.(*ArtifactCopier).convertSnapshots.func1
        linenumber: 442
        message: Failed to export snapshot data
      file: kasten.io/k10/kio/exec/phases/phase/copy_snapshots.go:210
      function: kasten.io/k10/kio/exec/phases/phase.(*ArtifactCopier).copy
      linenumber: 210
      message: Error converting snapshots
    file: kasten.io/k10/kio/exec/phases/phase/export.go:168
    function: kasten.io/k10/kio/exec/phases/phase.(*exportRestorePointPhase).Run
    linenumber: 168
    message: Failed to copy artifacts
  message: Job failed to be executed
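For reference, the same cause chain should also be visible on the failed export actions themselves (a sketch; the ExportAction resource is documented in the Kasten API reference, and the action name is whatever the first command returns):

# List recent export actions across namespaces
kubectl get exportactions.actions.kio.kasten.io --all-namespaces
# Dump a failed one; its status carries the error shown above
kubectl get exportactions.actions.kio.kasten.io <action-name> -n kasten-io -o yaml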

 

11 comments

Chris.Childerhose
  • Veeam Legend, Veeam Vanguard
  • 8401 comments
  • September 23, 2023

@safiya @Madi.Cristil - please move to Kasten K10 discussion board for better help.


Madi.Cristil
  • Community Manager
  • 616 comments
  • September 23, 2023

jaiganeshjk
  • Experienced User
  • 274 comments
  • September 25, 2023

@Vecteur IT Thanks for posting your question here.

Unfortunately, the log messages don't show much about what is going on in the backend.

Would you mind opening a case with us through `my.veeam.com`, selecting `Kasten by veeam K10 Trial` as the product?

Please collect the debug logs (https://docs.kasten.io/latest/operating/support.html#gathering-debugging-information) and upload them to the case. We will get in touch and take a deep look at what's going on.
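For reference, gathering those logs boils down to running the debug script from the linked page (a sketch, assuming K10 is installed in the default kasten-io namespace):

# Download and run the K10 debug-collection script from the docs
curl -s https://docs.kasten.io/tools/k10_debug.sh | bash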


  • Author
  • Comes here often
  • 14 comments
  • September 25, 2023
jaiganeshjk wrote:

@Vecteur IT Thanks for posting your question here.

Unfortunately, the log messages don't show much about what is going on in the backend.

Would you mind opening a case with us through `my.veeam.com`, selecting `Kasten by veeam K10 Trial` as the product?

Please collect the debug logs (https://docs.kasten.io/latest/operating/support.html#gathering-debugging-information) and upload them to the case. We will get in touch and take a deep look at what's going on.

 

Hi jaiganeshjk,

Thank you for your help.

I have created Case #06321679 and attached the debug logs.

 


  • Author
  • Comes here often
  • 14 comments
  • September 28, 2023

Hi,

In the Kubernetes events I see this error after copy-vol-data-8nnj5 is created:

12m         Warning   VolumeFailedDelete   persistentvolume/vol-e0db55a6-5dd0-11ee-bc68-e20363dc92a1   rpc error: code = Internal desc = Failed to delete volume 555257514601042816: rpc error: code = Internal desc = Failed to detach volume 555257514601042816: Volume 555257514601042816 is mounted at 1 location(s): /var/lib/osd/pxns/555257514601042816
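One way to catch these while the export runs (a sketch; the Warning filter and the grep pattern are just assumptions to cut noise):

# Stream warning events cluster-wide and keep the volume-related ones
kubectl get events --all-namespaces --field-selector type=Warning --watch | grep -i volume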

 


jaiganeshjk
  • Experienced User
  • 274 comments
  • September 28, 2023

@satish.kumar FYI ^^

 


  • Author
  • Comes here often
  • 14 comments
  • September 28, 2023

Hi, I now have a script to capture the copy-vol-data resources; a sketch of the approach is just below.

The TTL of the copy-vol-data pod is about 5 seconds, and it's the same for its PVC.
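A rough sketch of the capture loop (the pod and PVC name prefixes are the real ones from this cluster; the output file names are hypothetical):

#!/usr/bin/env bash
# The copy-vol-data pod and its kanister-pvc-* claim only live a few
# seconds, so poll and describe them as soon as they appear.
while true; do
  for pod in $(kubectl get pods -n kasten-io -o name | grep copy-vol-data); do
    kubectl describe -n kasten-io "$pod" >> captured-pods.txt
  done
  for pvc in $(kubectl get pvc -n kasten-io -o name | grep kanister-pvc); do
    kubectl describe -n kasten-io "$pvc" >> captured-pvcs.txt
  done
  sleep 1
done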

Here is the describe output for copy-vol-data-9rnkg and kanister-pvc-qs9nk:

Name:             copy-vol-data-9rnkg
Namespace:        kasten-io
Priority:         0
Service Account:  k10-k10
Node:             <none>
Labels:           createdBy=kanister
Annotations:      <none>
Status:           Pending
IP:               
IPs:              <none>
Containers:
  container:
    Image:      ghcr.io/kanisterio/kanister-tools:0.96.0
    Port:       <none>
    Host Port:  <none>
    Command:
      bash
      -c
      tail -f /dev/null
    Environment:  <none>
    Mounts:
      /mnt/vol_data/kanister-pvc from vol-4954d9c6-5e09-11ee-bc68-e20363dc92a1 (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-tljw8 (ro)
Conditions:
  Type           Status
  PodScheduled   False 
Volumes:
  vol-4954d9c6-5e09-11ee-bc68-e20363dc92a1:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  kanister-pvc-qs9nk
    ReadOnly:   false
  kube-api-access-tljw8:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age   From   Message
  ----     ------            ----  ----   -------
  Warning  FailedScheduling  0s    stork  0/6 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/6 nodes are available: 6 No preemption victims found for incoming pod..

 

The PVC:

Name:          kanister-pvc-qs9nk
Namespace:     kasten-io
StorageClass:  px-csi-cms
Status:        Bound
Volume:        pvc-c1365038-fafe-4d58-977d-ce58b4aa7f22
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: pxd.portworx.com
               volume.kubernetes.io/storage-provisioner: pxd.portworx.com
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      1Gi
Access Modes:  RWO
VolumeMode:    Filesystem
DataSource:
  APIGroup:  snapshot.storage.k8s.io
  Kind:      VolumeSnapshot
  Name:      snapshot-copy-zzknt9qw
Used By:     <none>
Events:
  Type    Reason                 Age              From                                                                              Message
  ----    ------                 ----             ----                                                                              -------
  Normal  ExternalProvisioning   5s (x2 over 5s)  persistentvolume-controller                                                       waiting for a volume to be created, either by external provisioner "pxd.portworx.com" or manually created by system administrator
  Normal  Provisioning           5s               pxd.portworx.com_px-csi-ext-fc94fdf48-sc2xm_55665d21-835b-45a1-9031-f104ffdf27d6  External provisioner is provisioning volume for claim "kasten-io/kanister-pvc-qs9nk"
  Normal  ProvisioningSucceeded  4s               pxd.portworx.com_px-csi-ext-fc94fdf48-sc2xm_55665d21-835b-45a1-9031-f104ffdf27d6  Successfully provisioned volume pvc-c1365038-fafe-4d58-977d-ce58b4aa7f22

Is it normal that the DataSource is empty?
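(To check that field directly, a quick sketch using the PVC name from the output above:)

# Print the clone PVC's dataSource exactly as recorded in its spec
kubectl get pvc kanister-pvc-qs9nk -n kasten-io -o jsonpath='{.spec.dataSource}{"\n"}'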


I have the same issue after upgrading from 5.5.11 to 6.0.9.

We have RKE1 and the storage is ceph-csi-rbd.


  • Author
  • Comes here often
  • 14 comments
  • October 24, 2023

I have a response from support:

Apparently, ReadWriteMany volumes in Portworx mean the underlying filesystem is sharedv4 (NFS).

Regrettably, there is an issue (we are working on a fix) with reading from NFS when running rootless.

The solution will be available in the near future, so please continue to monitor the K10 release notes and upgrade your K10 once the fix becomes available.
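To confirm the sharedv4 point on a given setup, the storage class parameters are one place to look (a sketch; `sharedv4` is a standard Portworx storage class parameter, though Portworx can back RWX claims with sharedv4 even when it is not set explicitly):

# Check whether the class used above explicitly requests sharedv4 volumes
kubectl get sc px-csi-cms -o jsonpath='{.parameters.sharedv4}{"\n"}'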


Thanks for your reply.

Sorry, I see now that you have the issue only with RWX. We have Ceph, the volume is RWO, and we have the same issue.


  • Author
  • Comes here often
  • 14 comments
  • October 24, 2023

Maybe another rootless problem...

