
Hi, I am new to Kasten.

 

I am not sure what I am doing wrong. Any help would be great.

 

I was testing backup and restore using Kasten on the same OCP cluster.

Kasten installation details:

  • Red Hat OpenShift 4.12
  • 3 master nodes
  • 3 worker nodes
  • VMware vSphere 7.0.3
  • K10 6.5 (free licensed version)
  • 3 nodes in the cluster, 5 nodes licensed
  • Kasten licenses in place: starter license and trialstarter license
  • Kasten installed with Helm
  • The infrastructure location has been defined correctly for the vCenter
  • VMware thin-csi driver based storageclass
  • VMware thin-csi based volumesnapshotclass with the annotation
    k10.kasten.io/is-snapshot-class: "true"
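
As a quick sanity check on the last item, here is a minimal Python sketch that tests whether a VolumeSnapshotClass manifest carries that annotation. The manifest is a hand-written example; the class name thin-csi-snapclass is hypothetical and the driver name is assumed to be the vSphere CSI driver. In practice the dictionary could come from the JSON output of `oc get volumesnapshotclass <name> -o json`.

```python
# Annotation key taken from the post above.
K10_ANNOTATION = "k10.kasten.io/is-snapshot-class"


def is_k10_snapshot_class(manifest):
    """Return True if the manifest is annotated as a K10 snapshot class."""
    annotations = manifest.get("metadata", {}).get("annotations") or {}
    # Annotation values are always strings, so compare against "true".
    return annotations.get(K10_ANNOTATION) == "true"


# Hypothetical example manifest (name and driver are assumptions).
example = {
    "apiVersion": "snapshot.storage.k8s.io/v1",
    "kind": "VolumeSnapshotClass",
    "metadata": {
        "name": "thin-csi-snapclass",
        "annotations": {K10_ANNOTATION: "true"},
    },
    "driver": "csi.vsphere.vmware.com",
    "deletionPolicy": "Delete",
}

print(is_k10_snapshot_class(example))
```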

     

Here are the other details if relevant.

 

K10 Version 6.5.0
K10 Namespace kasten-io
Kubernetes Version v1.25.14+bcb9a60
Kubernetes Release Type OpenShift
Cluster ID 99160696-5a63-4afc-8e29-90d52f6c99ee

 

  • snapshots are fine
  • snapshot-based backups are fine
  • snapshot-based restores are also fine
  • I can delete a namespace after a backup and restore it with all of its objects, as expected

Then, to enable Kasten DR, the following was done:

  • Linux-based NFS server configured with mode 777 on the NFS export path and export options rw,async,no_root_squash
  • NFS PV created - ReadWriteMany
  • NFS PVC defined in the kasten-io namespace - ReadWriteMany
  • Location profile defined in Kasten
  • K10 DR passphrase defined
  • K10 Cluster ID generated
  • Location profile validated in Kasten
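
For readers who want to reproduce the NFS part of this list, the following Python sketch builds a ReadWriteMany PV and a matching PVC in the kasten-io namespace and prints them as a JSON List. All names here (k10-nfs-pv, k10-nfs-pvc, the nfs storage class label, the server address, the export path, and the size) are hypothetical placeholders, not values from this setup.

```python
import json


def nfs_pv_and_pvc(server, export_path, size="50Gi", namespace="kasten-io"):
    """Build a statically provisioned NFS PV and a PVC that binds to it."""
    pv = {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": "k10-nfs-pv"},  # hypothetical name
        "spec": {
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteMany"],
            "persistentVolumeReclaimPolicy": "Retain",
            "storageClassName": "nfs",  # hypothetical label used for binding
            "nfs": {"server": server, "path": export_path},
        },
    }
    pvc = {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": "k10-nfs-pvc", "namespace": namespace},
        "spec": {
            "accessModes": ["ReadWriteMany"],
            "storageClassName": "nfs",  # must match the PV above
            "volumeName": "k10-nfs-pv",
            "resources": {"requests": {"storage": size}},
        },
    }
    return pv, pvc


# Placeholder server and export path; replace with real values.
pv, pvc = nfs_pv_and_pvc("nfs.example.com", "/exports/k10")
print(json.dumps({"apiVersion": "v1", "kind": "List", "items": [pv, pvc]}, indent=2))
```

The printed JSON can be saved to a file and applied with `oc apply -f`, after which the PVC name is what the NFS location profile would reference.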

With these configured, I tried exporting a previously taken snapshot, and that failed with messages like the ones below:

 

  actionDetails:
phases:
- attempt: 3
endTime: 2023-12-09T16:32:33Z
errors:
- cause: '{"cause":{"cause":{"message":"Failure in exporting
restorepoint"},"fields":[{"name":"FailedSubPhases","value":"{"Err":{"cause":{"cause":{"cause":{"cause":{"message":"Failed
to exec command in pod: command terminated with exit code
1"},"file":"kasten.io/k10/kio/kopia/repository.go:500","function":"kasten.io/k10/kio/kopia.CreateKopiaRepository","linenumber":500,"message":"Failed
to create the backup
repository"},"fields":[{"name":"appNamespace","value":"awx"}],"file":"kasten.io/k10/kio/exec/phases/phase/export.go:266","function":"kasten.io/k10/kio/exec/phases/phase.prepareKopiaRepoIfExportingData","linenumber":266,"message":"Failed
to create Kopia repository for data
export"},"file":"kasten.io/k10/kio/exec/phases/phase/export.go:168","function":"kasten.io/k10/kio/exec/phases/phase.(*exportRestorePointPhase).Run","linenumber":168,"message":"Failed
to copy artifacts"},"fields":[],"message":"Job failed to be
executed"},"ID":"39647c42-96b0-11ee-ab01-0a580a80045d","Phase":"Exporting
RestorePoint"}]}],"file":"kasten.io/k10/kio/exec/phases/phase/queue_and_wait_children.go:195","function":"kasten.io/k10/kio/exec/phases/phase.(*queueAndWaitChildrenPhase).processGroup","linenumber":195,"message":"Failure
in exporting
restorepoint"},"fields":[{"name":"manifestID","value":"390d730c-96b0-11ee-ab01-0a580a80045d"},{"name":"jobID","value":"390df1a6-96b0-11ee-962c-0a580a80045e"},{"name":"groupIndex","value":0}],"file":"kasten.io/k10/kio/exec/phases/phase/queue_and_wait_children.go:96","function":"kasten.io/k10/kio/exec/phases/phase.(*queueAndWaitChildrenPhase).Run","linenumber":96,"message":"Failed
checking jobs in group"}'
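
The useful part of an error like this is the innermost message in the nested "cause" chain. Here is a minimal Python sketch for walking such a chain. It assumes the error has already been obtained as well-formed JSON; the pasted text above is line-wrapped and its inner quotes are unescaped, so it would not parse as is. The sample below is a simplified, hand-built stand-in that reuses three messages from the output.

```python
import json


def cause_messages(err):
    """Collect messages from the outermost error down to the innermost cause."""
    messages = []
    node = err
    while node is not None:
        # Some layers may carry the nested error as a JSON-encoded string.
        if isinstance(node, str):
            try:
                node = json.loads(node)
            except ValueError:
                break
        if not isinstance(node, dict):
            break
        if "message" in node:
            messages.append(node["message"])
        node = node.get("cause")
    return messages


# Simplified stand-in for the real error structure.
sample = {
    "message": "Failed to create Kopia repository for data export",
    "cause": {
        "message": "Failed to create the backup repository",
        "cause": {
            "message": "Failed to exec command in pod: "
                       "command terminated with exit code 1"
        },
    },
}

chain = cause_messages(sample)
for depth, message in enumerate(chain):
    print("  " * depth + message)
```

Here the last entry only says that a command in a pod exited with code 1, so the actual reason has to come from that pod's own logs.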

 

I tried the same with an S3-based location profile:

  • S3 based on an open-source MinIO installation
  • S3 URL uses HTTP, not HTTPS
  • Configured with the needed access and secret keys and the bucket name (a newly created, empty bucket)
  • Location profile added and validated

I tried an export to the S3 location profile. This also fails, with the following errors.

 

phases:
- attempt: 3
endTime: 2023-12-09T17:39:52Z
errors:
- cause: '{"cause":{"cause":{"message":"Failure in exporting
restorepoint"},"fields":[{"name":"FailedSubPhases","value":"{"Err":{"cause":{"cause":{"cause":{"cause":{"message":"Failed
to exec command in pod: command terminated with exit code
1"},"file":"kasten.io/k10/kio/kopia/repository.go:500","function":"kasten.io/k10/kio/kopia.CreateKopiaRepository","linenumber":500,"message":"Failed
to create the backup
repository"},"fields":[{"name":"appNamespace","value":"awx"}],"file":"kasten.io/k10/kio/exec/phases/phase/export.go:266","function":"kasten.io/k10/kio/exec/phases/phase.prepareKopiaRepoIfExportingData","linenumber":266,"message":"Failed
to create Kopia repository for data
export"},"file":"kasten.io/k10/kio/exec/phases/phase/export.go:168","function":"kasten.io/k10/kio/exec/phases/phase.(*exportRestorePointPhase).Run","linenumber":168,"message":"Failed
to copy artifacts"},"fields":[],"message":"Job failed to be
executed"},"ID":"a1bf4da0-96b9-11ee-ab01-0a580a80045d","Phase":"Exporting
RestorePoint"}]}],"file":"kasten.io/k10/kio/exec/phases/phase/queue_and_wait_children.go:195","function":"kasten.io/k10/kio/exec/phases/phase.(*queueAndWaitChildrenPhase).processGroup","linenumber":195,"message":"Failure
in exporting
restorepoint"},"fields":[{"name":"manifestID","value":"9ef0c7ee-96b9-11ee-ab01-0a580a80045d"},{"name":"jobID","value":"9ef188a6-96b9-11ee-962c-0a580a80045e"},{"name":"groupIndex","value":0}],"file":"kasten.io/k10/kio/exec/phases/phase/queue_and_wait_children.go:96","function":"kasten.io/k10/kio/exec/phases/phase.(*queueAndWaitChildrenPhase).Run","linenumber":96,"message":"Failed
checking jobs in group"}'
message: Job failed to be executed

 

What am I doing wrong here?

No luck so far.

 

The logs show that Kopia cannot create the config file it needs to connect to the repository.

 

 

[71] tcp.0: [1702150009.155343360, {"Container"=>"container", "File"=>"pkg/format/format.go", "Function"=>"github.com/kanisterio/kanister/pkg/format.LogWithCtx", "Level"=>"info", "Line"=>90, "LogKind"=>"datapath", "Message"=>"Pod Update", "Out"=>"ERROR unable to connect to repository: error connecting to repository: unable to write config file: error writing file: cannot create temp file: open /tmp/kopia-repository.config875834848: read-only file system", "Pod"=>"create-repo-nnqvw", "Time"=>"2023-12-09T19:26:49.147221615Z", "cluster_name"=>"99160696-5a63-4afc-8e29-90d52f6c99ee", "hostname"=>"executor-svc-666d8f99fc-vslbl", "version"=>"6.5.0"}]
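
When scanning many log lines like this one, it can help to pull out just the path that was reported as read-only. Here is a minimal Python sketch, assuming the message text has already been extracted from the log record; it also strips ANSI color codes, which command-line tools often emit around words like ERROR. The function name is made up for illustration.

```python
import re

# Matches ANSI color sequences such as ESC[31m and ESC[0m.
ANSI_COLOR = re.compile(r"\x1b\[[0-9;]*m")
# Captures the path in "open <path>: read-only file system".
READ_ONLY = re.compile(r"open (\S+): read-only file system")


def read_only_path(message):
    """Return the path reported as read-only, or None if there is none."""
    clean = ANSI_COLOR.sub("", message)
    match = READ_ONLY.search(clean)
    return match.group(1) if match else None


# Message text from the log line above, with color codes added as an assumption.
msg = (
    "\x1b[31mERROR\x1b[0m unable to connect to repository: "
    "error connecting to repository: unable to write config file: "
    "error writing file: cannot create temp file: "
    "open /tmp/kopia-repository.config875834848: read-only file system"
)

path = read_only_path(msg)
print(path)
```

Knowing the exact path helps tell whether the read-only location is the remote storage mount or a directory inside the pod itself.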

Any pointers will be great


@Sujit Kumar Singh 

Thanks for posting.

Based on the error, it looks like it is unable to write to the external storage (NFS) because of a read-only mount.

I would check that the NFS permissions are applied properly to the directories and subdirectories.

Regards
Satish


Thank you for your responses. 

The Kasten setup is working now.

I cleaned up the Kasten installation and reinstalled it following the procedure described here:

https://www.kasten.io/kubernetes/resources/blog/learn-the-best-way-to-install-kasten-k10-on-openshift

Then an S3 service (from Acromove FreeNAS) was used to create the S3 buckets, with the permissions needed for use as a location profile.

I also cleaned up some old SCCs that had been needed by applications previously running on the OCP cluster; I am not sure whether that helped too.

 

