Hi Joern,
Thanks for posting your question in the forums. Kasten K10 should work with Longhorn using CSI-based snapshots and backup, although some additional configuration is required. At a high level:
- Prepare the nodes for iSCSI
- Install the snapshot CRDs
- Create the Longhorn Volume Snapshot Class
- Set up an S3 backup target in the Longhorn UI (click Settings; in the Backup section, set Backup Target to the S3 target location)
- Test with the Kasten primer script or Kubestr
As for the MinIO S3 Object Lock issue: K10 is qualified with Object Lock on MinIO. Are you following the bucket setup from https://docs.kasten.io/latest/usage/configuration.html?highlight=minio#immutable-backups ?
-Adam
Hi Adam,
Longhorn + CSI snapshots:
I found the documentation in the Longhorn manual about installing the CRDs for the external snapshotter (https://longhorn.io/docs/1.2.4/snapshots-and-backups/csi-snapshot-support/enable-csi-snapshot-support/). But I’m still not able to create a CSI snapshot.
The k10primer also reports “Cluster isn't CSI capable”. My knowledge of K8s is not (yet) good enough for further debugging, so I switched to generic backup.
Restore from S3 Object Lock:
Yes, I think I set it up correctly. “Validate bucket” showed no errors. The error message “No end time recorded for export operation to locked bucket” looks like K10 is expecting some values from S3/??? and stops.
I’ll try some other S3-compatible targets (Ceph RADOS and NetApp StorageGRID) over the next few days.
Regards
Joern
Can you tell me which K8s version you are running? We are tracking a bug in the primer script on certain versions of RKE2 and K3s that produces the “Cluster isn't CSI capable” message. You might see this error even though K10 works properly. Download Kubestr as an alternative test.
-Adam
@JWester
Immutable backups with Generic Volume Snapshot (GVS) are currently not supported, regardless of the object store (MinIO, S3, etc.). We have filed an internal feature request for this.
Regards
Satish
Hi Adam,
v1.23.6+rke2r2, deployed through Rancher.
But I don’t think the primer script has a bug.
I just tried kubestr without options:
Available Storage Provisioners:
driver.longhorn.io:
Missing CSIDriver Object. Required by some provisioners.
This is a CSI driver!
And trying a CSI snapshot:
jw@ubuntu1:~$ ./kubestr csicheck -s longhorn -v longhorn
Creating application
-> Created pod (kubestr-csi-original-podzpszl) and pvc (kubestr-csi-original-pvcrsx5f)
Taking a snapshot
Cleaning up resources
Error deleteing PVC (kubestr-csi-original-pvcrsx5f) - (context deadline exceeded)
Error deleteing Pod (kubestr-csi-original-podzpszl) - (context deadline exceeded)
CSI checker test:
Failed to create Snapshot: CSI Driver failed to create snapshot for PVC (kubestr-csi-original-pvcrsx5f) in Namespace (default): context deadline exceeded - Error
Error: Failed to create Snapshot: CSI Driver failed to create snapshot for PVC (kubestr-csi-original-pvcrsx5f) in Namespace (default): context deadline exceeded
I’ll continue debugging with a colleague on Friday.
Joern
1.23 unfortunately is not yet supported. Is it possible to re-deploy on 1.22? I’ve personally tested RKE2 with Longhorn on 1.22 as working.
-Adam
Ok, I’ll deploy a new cluster tomorrow with 1.22 and let you know. Many thanks!
I just installed a new cluster with
- v1.22.10+rke2r2
- deployed Longhorn (open-iscsi is running, I can create PVs)
- created external-snapshotter CRDs
- created VolumeSnapshotClass longhorn
but still the same error:
./kubestr csicheck -s longhorn -v longhorn
Creating application
-> Created pod (kubestr-csi-original-pod77m8n) and pvc (kubestr-csi-original-pvc6kxnp)
Taking a snapshot
Cleaning up resources
Error deleteing PVC (kubestr-csi-original-pvc6kxnp) - (context deadline exceeded)
Error deleteing Pod (kubestr-csi-original-pod77m8n) - (context deadline exceeded)
CSI checker test:
Failed to create Snapshot: CSI Driver failed to create snapshot for PVC (kubestr-csi-original-pvc6kxnp) in Namespace (default): context deadline exceeded - Error
Error: Failed to create Snapshot: CSI Driver failed to create snapshot for PVC (kubestr-csi-original-pvc6kxnp) in Namespace (default): context deadline exceeded
k10primer also still reports “Cluster isn't CSI capable”. Do you know how this is checked?
Below are my setup notes from my testing on RKE2 with Longhorn. It’s possible your snapshot CRDs need to be installed as I did below. The primer script’s “Cluster isn’t CSI capable” message is a bug that only exists on RKE2/K3s 1.22+. Kubestr should work properly, however.
Install snapshot CRDs
------------------------------------
git clone https://github.com/kubernetes-csi/external-snapshotter.git
kubectl kustomize external-snapshotter/client/config/crd | kubectl create -f -
kubectl -n kube-system kustomize external-snapshotter/deploy/kubernetes/snapshot-controller | kubectl create -f -
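One way to confirm that the CRD step above took effect is to compare the cluster's CRD list against the three snapshot CRDs that the external-snapshotter repository defines. The following is a minimal sketch, not an official check: the function name missing_snapshot_crds is made up for illustration, and it assumes the `kubectl get crd -o name` output format of one `<resource>/<name>` entry per line.

```shell
# Hypothetical helper (not part of kubectl, K10, or Longhorn).
# Reads the output of `kubectl get crd -o name` on stdin and prints
# every snapshot CRD that is NOT present, one per line.
missing_snapshot_crds() {
  present=$(cat)
  for kind in volumesnapshotclasses volumesnapshotcontents volumesnapshots; do
    crd="${kind}.snapshot.storage.k8s.io"
    # -F: fixed-string match; the leading slash anchors the match at
    # the start of the CRD name.
    if ! printf '%s\n' "$present" | grep -qF "/${crd}"; then
      printf '%s\n' "$crd"
    fi
  done
}

# Usage against a live cluster (prints nothing when all three exist):
#   kubectl get crd -o name | missing_snapshot_crds
```

An empty result only shows that the CRDs are registered; it says nothing about whether the snapshot controller is running or has the permissions it needs.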
Enable Longhorn Snapshot Class
--------------------------------------
cat <<EOF > longhorn-csi-snapshotclass.yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: longhorn
  annotations:
    k10.kasten.io/is-snapshot-class: "true"
driver: driver.longhorn.io
deletionPolicy: Delete
EOF
kubectl apply -f longhorn-csi-snapshotclass.yaml
Install local minio for longhorn
--------------------------------------
kubectl create -f https://raw.githubusercontent.com/longhorn/longhorn/master/deploy/backupstores/minio-backupstore.yaml
Go to the Longhorn UI. In the top navigation bar, click Settings. In the Backup section, set Backup Target to
s3://backupbucket@us-east-1/
And set Backup Target Credential Secret to:
minio-secret
Testing CSI
---------------------------------------
kubestr csicheck -s longhorn -v longhorn
Hi Adam,
many thanks for your setup notes.
Why did you install minio?
I think CRDs and snapshot controller are installed correctly.
CSI snapshots just don’t work, but I can see them. They are somehow incomplete.
kc get volumesnapshots
NAME                                         READYTOUSE   SOURCEPVC                       SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS           SNAPSHOTCONTENT   CREATIONTIME   AGE
demo-pvc-volume-snapshot-longhorn-snapshot                demo-pvc                                                              longhorn-snapshot-vsc                                    39s
kubestr-snapshot-20220616150335                           kubestr-csi-original-pvcnfd9m                                         longhorn-snapshot-vsc
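The empty READYTOUSE column in this output is the key symptom: the VolumeSnapshot objects exist, but nothing has marked them ready. A small sketch for listing such snapshots follows. pending_snapshots is a hypothetical helper name, and the sketch assumes input produced with kubectl's custom-columns output, where an unset field prints as `<none>`.

```shell
# Hypothetical helper: reads "NAME READY" pairs on stdin and prints the
# names whose readyToUse value is anything other than "true".
# Blank lines are skipped.
pending_snapshots() {
  awk 'NF && $2 != "true" { print $1 }'
}

# Usage against a live cluster:
#   kubectl get volumesnapshots --no-headers \
#     -o custom-columns=NAME:.metadata.name,READY:.status.readyToUse \
#     | pending_snapshots
```

For each name printed, `kubectl describe volumesnapshot <name>` shows the events recorded for that object, which is usually the next place to look.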
I even tried Longhorn 1.3.0 released yesterday, no success. Ok, enough debugging for today.
Joern
Reading the release notes on Longhorn 1.3 - it seems like a local backup target may no longer be a requirement when calling CSI based snapshots. This was a requirement previously, so using a local target was a solution with K10, as ultimately K10 would move the backup to a public cloud/external S3 target.
See https://github.com/longhorn/longhorn/releases and the issue described here https://github.com/longhorn/longhorn/issues/2534
Hi Adam,
I finally got it working: I forgot to replace all “default” namespace settings in rbac-snapshot-controller.yaml so the ServiceAccount had some permission problem.
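The fix described here amounts to making every namespace field in the RBAC manifest point at the namespace the snapshot controller actually runs in. A minimal sketch of that rewrite follows. It assumes the manifest uses plain `namespace: default` lines and that kube-system is the intended target; both are assumptions, so check your copy of the file. retarget_namespace is a made-up helper name.

```shell
# Hypothetical helper: reads a manifest on stdin and rewrites every
# "namespace: default" field to the namespace given as $1.
# The trailing group requires whitespace or end of line after "default",
# so values such as "defaultish" are left untouched.
retarget_namespace() {
  sed -E "s/(namespace:[[:space:]]*)default([[:space:]]|\$)/\1$1\2/"
}

# Usage (file name taken from the post above; target namespace assumed):
#   retarget_namespace kube-system < rbac-snapshot-controller.yaml \
#     | kubectl apply -f -
```

Reviewing the rewritten manifest before applying it is worthwhile, since a ServiceAccount and its RoleBinding subjects have to agree on the namespace for the permissions to take effect.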
Now K10 with the “new” snapshot method in Longhorn 1.3.0 works fine.
Many thanks for your help!
Joern
About my problem with restore from immutable backup: I found this in the 5.0.1 release notes:
Known Issues