Solved

Restore from "locked" bucket not possible


Hi,

I’m currently evaluating K10 (V5.0) inside an RKE2 cluster with S3-compatible storage as the backup target.

Since Longhorn probably has some problems with CSI snapshots, I created the sidecar configuration as described in https://docs.kasten.io/latest/install/generic.html#end-to-end-example

I created an “Immutable Backup” location profile; the S3 backend is MinIO.

Backup to that location works fine, but when I try to restore I get the following error message:

cause:
          file: kasten.io/k10/kio/exec/phases/phase/restore_app.go:1610
          function: kasten.io/k10/kio/exec/phases/phase.genericVolumeSnapshotRestore
          linenumber: 1610
          message: No end time recorded for export operation to locked bucket

Backup + restore to/from a location without object lock runs without any problems. 

Any ideas? Perhaps some kind of incompatibility with Minio S3?

Many thanks in advance

Regards

Joern


Best answer by JWester 24 June 2022, 16:00


13 comments

Userlevel 2

Hi Joern,

Thanks for posting your question in the forums. Kasten K10 shouldn’t have any issue with Longhorn using CSI-based snapshots and backup. However, some additional configuration is required. At a high level:

  • Prepare the nodes for iSCSI
  • Install the snapshot CRDs
  • Create the Longhorn VolumeSnapshotClass
  • Set up an S3 backup target in the Longhorn UI (click Settings. In the Backup section, set Backup Target to the S3 target location)

Test with the Kasten primer script or Kubestr.

 

As far as the MinIO S3 object lock issue goes: K10 is definitely qualified for object lock with MinIO. Are you following the bucket setup from https://docs.kasten.io/latest/usage/configuration.html?highlight=minio#immutable-backups ?

 

-Adam

 

Hi Adam,

Longhorn + CSI snapshots:

I found the documentation in the Longhorn manual about installing the CRDs for the external snapshotter (https://longhorn.io/docs/1.2.4/snapshots-and-backups/csi-snapshot-support/enable-csi-snapshot-support/). But I’m still not able to create a CSI snapshot.

And the k10primer reports “Cluster isn't CSI capable”. My knowledge about K8s is not (yet) good enough for further debugging so I skipped to generic backup.

Restore from S3 Object Lock:

Yes, I think I set it up correctly. “Validate bucket” showed no errors. The error message “No end time recorded for export operation to locked bucket” looks like K10 is expecting some values from S3/??? and stops.

I’ll try some other S3-compatible targets (Ceph RADOS and NetApp StorageGRID) over the next few days.

Regards

Joern

Userlevel 2

Can you tell me which K8s version you are running? We are tracking a bug in the primer script that reports “cluster isn't CSI capable” on certain versions of RKE2 and K3s. You might see this error even though K10 works properly. Download Kubestr as an alternative test.

 

-Adam

Userlevel 2

@JWester 

 

Immutable backups with GVS are currently not supported, irrespective of the object storage used (MinIO, S3, etc.). We have filed an internal feature request for this.

 

Regards
Satish

Hi Adam,

v1.23.6+rke2r2, deployed through Rancher.

But I don’t think the primer script has a bug.

I just tried kubestr without options:

Available Storage Provisioners:

  driver.longhorn.io:

    Missing CSIDriver Object. Required by some provisioners.

    This is a CSI driver!

And trying a CSI snapshot:

jw@ubuntu1:~$ ./kubestr csicheck -s longhorn -v longhorn

Creating application

  -> Created pod (kubestr-csi-original-podzpszl) and pvc (kubestr-csi-original-pvcrsx5f)

Taking a snapshot

Cleaning up resources

Error deleteing PVC (kubestr-csi-original-pvcrsx5f) - (context deadline exceeded)

Error deleteing Pod (kubestr-csi-original-podzpszl) - (context deadline exceeded)

CSI checker test:

  Failed to create Snapshot: CSI Driver failed to create snapshot for PVC (kubestr-csi-original-pvcrsx5f) in Namespace (default): context deadline exceeded  -  Error

Error: Failed to create Snapshot: CSI Driver failed to create snapshot for PVC (kubestr-csi-original-pvcrsx5f) in Namespace (default): context deadline exceeded

 

I’ll continue debugging with a colleague on Friday.

Joern

Userlevel 2

Unfortunately, 1.23 is not yet supported. Is it possible to redeploy on 1.22? I’ve personally tested RKE2 with Longhorn on 1.22, and it works.
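
If it helps, here is a small sketch for checking the Kubernetes minor version from an RKE2 version string such as v1.23.6+rke2r2 before deploying. The function names are made up for illustration, and the cutoff of 1.22 only reflects what I said above at the time of this thread; it is not taken from any support matrix.

```shell
# Hypothetical helpers: extract the Kubernetes minor version from an
# RKE2/K3s version string like "v1.22.10+rke2r2".
k8s_minor() {
  version="${1#v}"        # drop the leading "v"       -> 1.22.10+rke2r2
  rest="${version#*.}"    # drop the major version     -> 22.10+rke2r2
  printf '%s\n' "${rest%%.*}"   # keep up to the next dot -> 22
}

# Assumption: 1.22 is the newest minor version that works, per this thread.
is_supported_minor() {
  [ "$(k8s_minor "$1")" -le 22 ]
}

if is_supported_minor "v1.23.6+rke2r2"; then
  echo "minor version is within the tested range"
else
  echo "minor version is newer than the tested range"
fi
```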

 

-Adam

Ok, I’ll deploy a new cluster tomorrow with 1.22 and let you know. Many thanks!

I just installed a new cluster with 

  • v1.22.10+rke2r2
  • deployed Longhorn (open-iscsi is running, I can create PVs)
  • created external-snapshotter CRDs
  • created VolumeSnapshotClass longhorn

but still the same error:

./kubestr csicheck -s longhorn -v longhorn

Creating application

  -> Created pod (kubestr-csi-original-pod77m8n) and pvc (kubestr-csi-original-pvc6kxnp)

Taking a snapshot

Cleaning up resources

Error deleteing PVC (kubestr-csi-original-pvc6kxnp) - (context deadline exceeded)

Error deleteing Pod (kubestr-csi-original-pod77m8n) - (context deadline exceeded)

CSI checker test:

  Failed to create Snapshot: CSI Driver failed to create snapshot for PVC (kubestr-csi-original-pvc6kxnp) in Namespace (default): context deadline exceeded  -  Error

Error: Failed to create Snapshot: CSI Driver failed to create snapshot for PVC (kubestr-csi-original-pvc6kxnp) in Namespace (default): context deadline exceeded

 

k10primer also still reports “Cluster isn't CSI capable”. Do you know how this is checked?

Userlevel 2

Below are my setup notes from my testing on RKE2 with Longhorn. It’s possible your snapshot CRDs need to be installed as I did below. The “Cluster isn’t CSI capable” message from the primer script is a bug that only exists in 1.22+ of RKE2/K3s. Kubestr should work properly, however.

 

Install snapshot CRDs
------------------------------------
git clone https://github.com/kubernetes-csi/external-snapshotter.git
kubectl kustomize external-snapshotter/client/config/crd | kubectl create -f -
kubectl -n kube-system kustomize external-snapshotter/deploy/kubernetes/snapshot-controller | kubectl create -f -

Enable Longhorn Snapshot Class
--------------------------------------

cat <<EOF > longhorn-csi-snapshotclass.yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: longhorn
  annotations:
    k10.kasten.io/is-snapshot-class: "true"
driver: driver.longhorn.io
deletionPolicy: Delete
EOF
kubectl apply -f longhorn-csi-snapshotclass.yaml


Install local MinIO for Longhorn
--------------------------------------
kubectl create -f https://raw.githubusercontent.com/longhorn/longhorn/master/deploy/backupstores/minio-backupstore.yaml

Go to the Longhorn UI. In the top navigation bar, click Settings. In the Backup section, set Backup Target to

s3://backupbucket@us-east-1/

And set Backup Target Credential Secret to:

minio-secret
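
The backup target string above follows Longhorn's s3://<bucket>@<region>/<path> form. As a rough sketch (the function name is hypothetical, not part of Longhorn or K10), this splits such a string into its parts so you can double-check the bucket and region before pasting it into the UI:

```shell
# Hypothetical helper: split a Longhorn-style S3 backup target
# (s3://<bucket>@<region>/<path>) into bucket, region and path.
parse_backup_target() {
  target="${1#s3://}"       # backupbucket@us-east-1/
  bucket="${target%%@*}"    # text before the "@"
  rest="${target#*@}"       # us-east-1/
  region="${rest%%/*}"      # text before the first "/"
  case "$rest" in
    */*) path="${rest#*/}" ;;   # text after the first "/"
    *)   path="" ;;             # no path component given
  esac
  printf 'bucket=%s region=%s path=/%s\n' "$bucket" "$region" "$path"
}

parse_backup_target "s3://backupbucket@us-east-1/"
```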


Testing CSI
---------------------------------------
kubestr csicheck -s longhorn -v longhorn

 

Hi Adam,

many thanks for your setup notes.

Why did you install minio? 

I think CRDs and snapshot controller are installed correctly. 

CSI snapshots just don’t work, but I can see them. They are somehow incomplete.

kc get volumesnapshots

NAME                                         READYTOUSE   SOURCEPVC                       SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS           SNAPSHOTCONTENT   CREATIONTIME   AGE

demo-pvc-volume-snapshot-longhorn-snapshot                demo-pvc                                                              longhorn-snapshot-vsc                                    39s

kubestr-snapshot-20220616150335                           kubestr-csi-original-pvcnfd9m                                         longhorn-snapshot-vsc
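
To make the incomplete snapshots easier to spot, something like the following sketch could filter the list down to those that never became ready. The function name is made up; it assumes two-column input (name, readyToUse) as produced by kubectl's custom-columns output, where a missing field is printed as <none>.

```shell
# Hypothetical filter: read "NAME READY" rows on stdin and print the names
# of snapshots whose READY column is anything other than "true".
not_ready_snapshots() {
  awk 'NR > 1 && $2 != "true" { print $1 }'
}

# Usage against a live cluster (not run here):
#   kubectl get volumesnapshots \
#     -o custom-columns=NAME:.metadata.name,READY:.status.readyToUse \
#     | not_ready_snapshots

# Self-contained demo with sample rows instead of a cluster:
printf 'NAME READY\nsnap-a true\nsnap-b <none>\n' | not_ready_snapshots
```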
       

 

I even tried Longhorn 1.3.0 released yesterday, no success. Ok, enough debugging for today.

Joern

Userlevel 2

Reading the Longhorn 1.3 release notes, it seems a local backup target may no longer be required for CSI-based snapshots. Previously it was required, so using a local target was a workable solution with K10, since K10 would ultimately move the backup to a public cloud/external S3 target.

See https://github.com/longhorn/longhorn/releases and the issue described here https://github.com/longhorn/longhorn/issues/2534

Hi Adam,

I finally got it working: I had forgotten to replace all “default” namespace settings in rbac-snapshot-controller.yaml, so the ServiceAccount had a permission problem.
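
For anyone hitting the same thing, here is a minimal sketch of the kind of replacement I mean. The function name is made up, and kube-system as the target namespace is an assumption based on Adam's notes (which deploy the snapshot controller with -n kube-system); use whatever namespace your controller actually runs in.

```shell
# Hypothetical helper: print a manifest with every "namespace: default"
# entry rewritten to the target namespace. The input file is not modified.
rewrite_namespace() {
  manifest="$1"
  target_ns="$2"
  # First expression: "namespace: default" at end of line.
  # Second expression: "namespace: default" followed by whitespace
  # (for example a trailing comment).
  sed -e "s/^\([[:space:]]*namespace:[[:space:]]*\)default\$/\1${target_ns}/" \
      -e "s/^\([[:space:]]*namespace:[[:space:]]*\)default\([[:space:]]\)/\1${target_ns}\2/" \
      "$manifest"
}

# Usage (not run here); kube-system is an assumption, see above:
#   rewrite_namespace rbac-snapshot-controller.yaml kube-system | kubectl apply -f -
```

A quick grep for "namespace: default" on the rewritten output shows whether any entry was missed.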

Now K10 with the “new” snapshot method in Longhorn 1.3.0 works fine.

Many thanks for your help!

Joern

About my problem with restore from immutable backup: I found this in the 5.0.1 release notes:

Known Issues

  • The generic storage and shareable volume backup and restore workflows are not compatible with immutable backups location profiles; their use together is not supported.
