Hello!
We’re facing an issue with the Kasten DR functionality when following the official guide: https://docs.kasten.io/latest/operating/dr.html#recovering-k10-from-a-disaster
The DR restore job pod logs the following error:
Error: {"message":"Failed to scale down Catalog","function":"kasten.io/k10/kio/tools/restorectl.restoreK10","linenumber":138,"file":"kasten.io/k10/kio/tools/restorectl/restore.go:138","cause":{"message":"Failed waiting for deployment replicas","function":"kasten.io/k10/kio/tools/restorectl/servicescaler.(*deploymentScaler).ScaleAndVerifyWithTimeout","linenumber":80,"file":"kasten.io/k10/kio/tools/restorectl/servicescaler/deployment_scaler.go:80","fields":[{"name":"deployment","value":"catalog-svc"},{"name":"replicas","value":0}],"cause":{"message":"context cancelled","function":"kasten.io/k10/kio/tools/restorectl/servicescaler.waitForDeploymentReplicas","linenumber":61,"file":"kasten.io/k10/kio/tools/restorectl/servicescaler/utils.go:61"}}}
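To make the nested error easier to read, here is a minimal Python sketch that unwraps its cause chain. The helper name `cause_chain` is our own (hypothetical, not part of K10), and the payload is trimmed to the `message`, `file`, `fields`, and `cause` keys of the error above.

```python
import json

# Error payload from the restore job pod, trimmed to the keys used here
# (the "function" and "linenumber" keys are omitted for brevity).
ERROR_JSON = """
{"message": "Failed to scale down Catalog",
 "file": "kasten.io/k10/kio/tools/restorectl/restore.go:138",
 "cause": {"message": "Failed waiting for deployment replicas",
           "file": "kasten.io/k10/kio/tools/restorectl/servicescaler/deployment_scaler.go:80",
           "fields": [{"name": "deployment", "value": "catalog-svc"},
                      {"name": "replicas", "value": 0}],
           "cause": {"message": "context cancelled",
                     "file": "kasten.io/k10/kio/tools/restorectl/servicescaler/utils.go:61"}}}
"""


def cause_chain(error):
    """Return one summary line per nesting level, outermost error first."""
    lines = []
    while error is not None:
        # "fields" is optional; when present it is a list of name/value pairs.
        fields = ", ".join(f"{f['name']}={f['value']}" for f in error.get("fields", []))
        suffix = f" [{fields}]" if fields else ""
        lines.append(f"{error['message']} ({error['file']}){suffix}")
        # Descend into the nested cause; None ends the loop.
        error = error.get("cause")
    return lines


chain = cause_chain(json.loads(ERROR_JSON))
for depth, line in enumerate(chain):
    # Indent each level to show the nesting.
    print("  " * depth + line)
```

Read this way, the restore fails while waiting for the catalog-svc deployment to scale down to 0 replicas, and the innermost cause reported is "context cancelled".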
Our setup:
- EKS cluster version 1.25.6
- K10 Helm chart version 5.5.6 using IRSA (IAM Roles for Service Accounts)
- DR enabled with an AWS S3 location and a passphrase
In the k10-restore Helm chart, the following values are set:
- sourceClusterID
- profile.name points to the K10 AWS S3 location name
- secrets.awsIamRole points to the IRSA ARN
- The passphrase is provided via a provisioned K8s secret
Looking forward to any input on this, as a working restore is crucial for us to cover the DR scenario.
All the best,
Widura