Solved

Red Hat OpenShift 4.12 on VMware vSphere 7.0.3 with K10 6.5: snapshot export fails


Hi, I am new to Kasten.

 

I am not sure what I am doing wrong. Any help would be great.

 

I was testing backup and restore using Kasten on the same OCP cluster.

Kasten installation details:

  • Red Hat OpenShift 4.12
  • 3 master nodes
  • 3 worker nodes
  • VMware vSphere 7.0.3
  • K10 6.5 (free licensed version)
  • 3 nodes in the cluster, 5 nodes licensed
  • Kasten licenses present: starter license and trialstarter license
  • Kasten installed using Helm
  • The infrastructure profile for the vCenter has been defined correctly.
  • VMware thin-csi driver based StorageClass
  • VMware thin-csi based VolumeSnapshotClass with the annotation
    k10.kasten.io/is-snapshot-class: "true"

     

Here are the other details, in case they are relevant.

 

K10 Version 6.5.0
K10 Namespace kasten-io
Kubernetes Version v1.25.14+bcb9a60
Kubernetes Release Type OpenShift
Cluster ID 99160696-5a63-4afc-8e29-90d52f6c99ee

 

  • Snapshots work fine
  • Snapshot-based backups work fine
  • Snapshot-based restores also work fine
  • I can delete a namespace after a backup and restore it with all of its objects, as expected

Then, to enable Kasten DR, the following was done:

  • A Linux-based NFS server was configured with mode 777 on the NFS export path and the export options rw,async,no_root_squash
  • NFS PV created - ReadWriteMany
  • NFS PVC defined in the kasten-io namespace - ReadWriteMany
  • Location profile defined in Kasten
  • K10 DR passphrase defined
  • K10 Cluster ID generated
  • Location profile validated in Kasten
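
For reference, the NFS PV and PVC from the steps above could look roughly like the following. This is only a minimal sketch, not the manifests actually used in this thread: the server address nfs.example.com, the export path /exports/k10, the 10Gi size, the object name k10-nfs and the "nfs" storage class label are all placeholders. It builds both objects in Python and prints them as a JSON List, a format that oc apply accepts.

```python
import json


def nfs_manifests(server, path, size="10Gi", name="k10-nfs", namespace="kasten-io"):
    """Build a statically bound NFS PV/PVC pair (all names are placeholders)."""
    pv = {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteMany"],
            "persistentVolumeReclaimPolicy": "Retain",
            # Only a matching label; PV and PVC must agree on it.
            "storageClassName": "nfs",
            "nfs": {"server": server, "path": path},
        },
    }
    pvc = {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "accessModes": ["ReadWriteMany"],
            "storageClassName": "nfs",
            "volumeName": name,  # bind to the PV defined above
            "resources": {"requests": {"storage": size}},
        },
    }
    return {"apiVersion": "v1", "kind": "List", "items": [pv, pvc]}


if __name__ == "__main__":
    # Hypothetical server and export path; replace with real values.
    print(json.dumps(nfs_manifests("nfs.example.com", "/exports/k10"), indent=2))
```

Assuming the script is saved as nfs_manifests.py (a made-up name), the output can be piped straight into the cluster with: python nfs_manifests.py | oc apply -f -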

With this configured, I tried exporting a previously taken snapshot, and that failed with messages like the one below:

 

  actionDetails:
    phases:
      - attempt: 3
        endTime: 2023-12-09T16:32:33Z
        errors:
          - cause: '{"cause":{"cause":{"message":"Failure in exporting
              restorepoint"},"fields":[{"name":"FailedSubPhases","value":[{"Err":{"cause":{"cause":{"cause":{"cause":{"message":"Failed
              to exec command in pod: command terminated with exit code
              1"},"file":"kasten.io/k10/kio/kopia/repository.go:500","function":"kasten.io/k10/kio/kopia.CreateKopiaRepository","linenumber":500,"message":"Failed
              to create the backup
              repository"},"fields":[{"name":"appNamespace","value":"awx"}],"file":"kasten.io/k10/kio/exec/phases/phase/export.go:266","function":"kasten.io/k10/kio/exec/phases/phase.prepareKopiaRepoIfExportingData","linenumber":266,"message":"Failed
              to create Kopia repository for data
              export"},"file":"kasten.io/k10/kio/exec/phases/phase/export.go:168","function":"kasten.io/k10/kio/exec/phases/phase.(*exportRestorePointPhase).Run","linenumber":168,"message":"Failed
              to copy artifacts"},"fields":[],"message":"Job failed to be
              executed"},"ID":"39647c42-96b0-11ee-ab01-0a580a80045d","Phase":"Exporting
              RestorePoint"}]}],"file":"kasten.io/k10/kio/exec/phases/phase/queue_and_wait_children.go:195","function":"kasten.io/k10/kio/exec/phases/phase.(*queueAndWaitChildrenPhase).processGroup","linenumber":195,"message":"Failure
              in exporting
              restorepoint"},"fields":[{"name":"manifestID","value":"390d730c-96b0-11ee-ab01-0a580a80045d"},{"name":"jobID","value":"390df1a6-96b0-11ee-962c-0a580a80045e"},{"name":"groupIndex","value":0}],"file":"kasten.io/k10/kio/exec/phases/phase/queue_and_wait_children.go:96","function":"kasten.io/k10/kio/exec/phases/phase.(*queueAndWaitChildrenPhase).Run","linenumber":96,"message":"Failed
              checking jobs in group"}'
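
The nested "cause" string above is hard to read by eye. A small sketch like the following can flatten it into an indented list of messages and pick out the innermost one. It assumes the structure seen in this thread (a "cause" key per level, with sub-phase errors under a "FailedSubPhases" field); the SAMPLE string is a trimmed copy of the error above, and the function names are made up for illustration.

```python
import json

# Trimmed copy of the error above; the real string carries more fields.
SAMPLE = (
    '{"cause": {"cause": {"message": "Failure in exporting restorepoint"}, '
    '"fields": [{"name": "FailedSubPhases", "value": [{"Err": {"cause": '
    '{"cause": {"message": "Failed to exec command in pod: command '
    'terminated with exit code 1"}, "message": "Failed to create the backup '
    'repository"}, "fields": [], "message": "Job failed to be executed"}, '
    '"Phase": "Exporting RestorePoint"}]}], '
    '"message": "Failure in exporting restorepoint"}, '
    '"fields": [], "message": "Failed checking jobs in group"}'
)


def walk_messages(node, depth=0):
    """Yield (depth, message) for every error in the nested chain."""
    if not isinstance(node, dict):
        return
    if "message" in node:
        yield depth, node["message"]
    # Sub-phase failures hang off a "FailedSubPhases" field, not "cause".
    for field in node.get("fields", []):
        if field.get("name") == "FailedSubPhases":
            for sub in field.get("value", []):
                yield from walk_messages(sub.get("Err"), depth + 1)
    yield from walk_messages(node.get("cause"), depth + 1)


def root_cause(raw):
    """Return the most deeply nested message of a JSON error string."""
    messages = list(walk_messages(json.loads(raw)))
    return max(messages, key=lambda item: item[0])[1]


if __name__ == "__main__":
    for depth, message in walk_messages(json.loads(SAMPLE)):
        print("  " * depth + message)
    print("root cause:", root_cause(SAMPLE))
```

Even the innermost message here only reports that a command in a pod exited with code 1, so the actual reason has to come from the logs of that pod.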

 

I tried the same with an S3-based location profile:

  • S3 provided by an open-source MinIO installation
  • S3 URL uses HTTP, not HTTPS
  • The required access key, secret key and bucket name were supplied (a newly created, empty bucket)
  • The location profile was added and validated

I then tried exporting to the S3 location profile. This also fails, with the following errors:

 

phases:
  - attempt: 3
    endTime: 2023-12-09T17:39:52Z
    errors:
      - cause: '{"cause":{"cause":{"message":"Failure in exporting
          restorepoint"},"fields":[{"name":"FailedSubPhases","value":[{"Err":{"cause":{"cause":{"cause":{"cause":{"message":"Failed
          to exec command in pod: command terminated with exit code
          1"},"file":"kasten.io/k10/kio/kopia/repository.go:500","function":"kasten.io/k10/kio/kopia.CreateKopiaRepository","linenumber":500,"message":"Failed
          to create the backup
          repository"},"fields":[{"name":"appNamespace","value":"awx"}],"file":"kasten.io/k10/kio/exec/phases/phase/export.go:266","function":"kasten.io/k10/kio/exec/phases/phase.prepareKopiaRepoIfExportingData","linenumber":266,"message":"Failed
          to create Kopia repository for data
          export"},"file":"kasten.io/k10/kio/exec/phases/phase/export.go:168","function":"kasten.io/k10/kio/exec/phases/phase.(*exportRestorePointPhase).Run","linenumber":168,"message":"Failed
          to copy artifacts"},"fields":[],"message":"Job failed to be
          executed"},"ID":"a1bf4da0-96b9-11ee-ab01-0a580a80045d","Phase":"Exporting
          RestorePoint"}]}],"file":"kasten.io/k10/kio/exec/phases/phase/queue_and_wait_children.go:195","function":"kasten.io/k10/kio/exec/phases/phase.(*queueAndWaitChildrenPhase).processGroup","linenumber":195,"message":"Failure
          in exporting
          restorepoint"},"fields":[{"name":"manifestID","value":"9ef0c7ee-96b9-11ee-ab01-0a580a80045d"},{"name":"jobID","value":"9ef188a6-96b9-11ee-962c-0a580a80045e"},{"name":"groupIndex","value":0}],"file":"kasten.io/k10/kio/exec/phases/phase/queue_and_wait_children.go:96","function":"kasten.io/k10/kio/exec/phases/phase.(*queueAndWaitChildrenPhase).Run","linenumber":96,"message":"Failed
          checking jobs in group"}'
        message: Job failed to be executed
      - cause: '{"cause":{"cause":{"message":"Failure in exporting
          restorepoint"},"fields":[{"name":"FailedSubPhases","value":[{"Err":{"cause":{"cause":{"cause":{"cause":{"message":"Failed
          to exec command in pod: command terminated with exit code
          1"},"file":"kasten.io/k10/kio/kopia/repository.go:500","function":"kasten.io/k10/kio/kopia.CreateKopiaRepository","linenumber":500,"message":"Failed
          to create the backup
          repository"},"fields":[{"name":"appNamespace","value":"awx"}],"file":"kasten.io/k10/kio/exec/phases/phase/export.go:266","function":"kasten.io/k10/kio/exec/phases/phase.prepareKopiaRepoIfExportingData","linenumber":266,"message":"Failed
          to create Kopia repository for data
          export"},"file":"kasten.io/k10/kio/exec/phases/phase/export.go:168","function":"kasten.io/k10/kio/exec/phases/phase.(*exportRestorePointPhase).Run","linenumber":168,"message":"Failed
          to copy artifacts"},"fields":[],"message":"Job failed to be
          executed"},"ID":"a1bf4da0-96b9-11ee-ab01-0a580a80045d","Phase":"Exporting
          RestorePoint"}]}],"file":"kasten.io/k10/kio/exec/phases/phase/queue_and_wait_children.go:195","function":"kasten.io/k10/kio/exec/phases/phase.(*queueAndWaitChildrenPhase).processGroup","linenumber":195,"message":"Failure
          in exporting
          restorepoint"},"fields":[{"name":"manifestID","value":"9ef0c7ee-96b9-11ee-ab01-0a580a80045d"},{"name":"jobID","value":"9ef188a6-96b9-11ee-962c-0a580a80045e"},{"name":"groupIndex","value":0}],"file":"kasten.io/k10/kio/exec/phases/phase/queue_and_wait_children.go:96","function":"kasten.io/k10/kio/exec/phases/phase.(*queueAndWaitChildrenPhase).Run","linenumber":96,"message":"Failed
          checking jobs in group"}'
        message: Job failed to be executed
      - cause: '{"cause":{"cause":{"message":"Failure in exporting
          restorepoint"},"fields":[{"name":"FailedSubPhases","value":[{"Err":{"cause":{"cause":{"cause":{"cause":{"message":"Failed
          to exec command in pod: command terminated with exit code
          1"},"file":"kasten.io/k10/kio/kopia/repository.go:500","function":"kasten.io/k10/kio/kopia.CreateKopiaRepository","linenumber":500,"message":"Failed
          to create the backup
          repository"},"fields":[{"name":"appNamespace","value":"awx"}],"file":"kasten.io/k10/kio/exec/phases/phase/export.go:266","function":"kasten.io/k10/kio/exec/phases/phase.prepareKopiaRepoIfExportingData","linenumber":266,"message":"Failed
          to create Kopia repository for data
          export"},"file":"kasten.io/k10/kio/exec/phases/phase/export.go:168","function":"kasten.io/k10/kio/exec/phases/phase.(*exportRestorePointPhase).Run","linenumber":168,"message":"Failed
          to copy artifacts"},"fields":[],"message":"Job failed to be
          executed"},"ID":"a1bf4da0-96b9-11ee-ab01-0a580a80045d","Phase":"Exporting
          RestorePoint"}]}],"file":"kasten.io/k10/kio/exec/phases/phase/queue_and_wait_children.go:195","function":"kasten.io/k10/kio/exec/phases/phase.(*queueAndWaitChildrenPhase).processGroup","linenumber":195,"message":"Failure
          in exporting
          restorepoint"},"fields":[{"name":"manifestID","value":"9ef0c7ee-96b9-11ee-ab01-0a580a80045d"},{"name":"jobID","value":"9ef188a6-96b9-11ee-962c-0a580a80045e"},{"name":"groupIndex","value":0}],"file":"kasten.io/k10/kio/exec/phases/phase/queue_and_wait_children.go:96","function":"kasten.io/k10/kio/exec/phases/phase.(*queueAndWaitChildrenPhase).Run","linenumber":96,"message":"Failed
          checking jobs in group"}'
        message: Job failed to be executed

 

What am I doing wrong here?


3 comments

No luck so far.

 

The logs show that the config file Kopia needs to connect to the repository cannot be created.

 

 

[71] tcp.0: [1702150009.155343360, {"Container"=>"container", "File"=>"pkg/format/format.go", "Function"=>"github.com/kanisterio/kanister/pkg/format.LogWithCtx", "Level"=>"info", "Line"=>90, "LogKind"=>"datapath", "Message"=>"Pod Update", "Out"=>"ERROR unable to connect to repository: error connecting to repository: unable to write config file: error writing file: cannot create temp file: open /tmp/kopia-repository.config875834848: read-only file system", "Pod"=>"create-repo-nnqvw", "Time"=>"2023-12-09T19:26:49.147221615Z", "cluster_name"=>"99160696-5a63-4afc-8e29-90d52f6c99ee", "hostname"=>"executor-svc-666d8f99fc-vslbl", "version"=>"6.5.0"}]
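
When there are many such lines, it can help to pull out just the pod name, the pod's output and the path that could not be written. The following is a rough sketch that assumes the "key"=>"value" layout of the line above; it is a regex over that layout, not an official K10 log parser, and it also strips terminal colour codes such as [31m in case the raw log still carries them. The LINE constant is a shortened copy of the log line with only three of its fields.

```python
import re

# Colour codes, with or without the leading ESC byte (often lost when pasting).
ANSI_RE = re.compile(r"\x1b?\[\d+(?:;\d+)*m")
# Only string-valued pairs such as "Pod"=>"create-repo-nnqvw".
FIELD_RE = re.compile(r'"(\w+)"=>"([^"]*)"')

LINE = (
    '[71] tcp.0: [1702150009.155343360, {"Level"=>"info", "Line"=>90, '
    '"Out"=>"[31mERROR[0m unable to connect to repository: error connecting '
    'to repository: unable to write config file: error writing file: cannot '
    'create temp file: open /tmp/kopia-repository.config875834848: '
    'read-only file system", "Pod"=>"create-repo-nnqvw"}]'
)


def parse_fields(line):
    """Return the string-valued fields of one log line as a dict."""
    return {key: ANSI_RE.sub("", value) for key, value in FIELD_RE.findall(line)}


def read_only_path(message):
    """Return the path a 'read-only file system' error refers to, if any."""
    match = re.search(r"open (\S+): read-only file system", message)
    return match.group(1) if match else None


if __name__ == "__main__":
    fields = parse_fields(LINE)
    print("pod: ", fields["Pod"])
    print("out: ", fields["Out"])
    print("path:", read_only_path(fields["Out"]))
```

For the line above, the path reported is under /tmp and the pod is create-repo-nnqvw, which is worth keeping in mind when deciding whether it is the export target or the pod's own filesystem that is read-only.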

Any pointers would be great.


  • Experienced User
  • 49 comments
  • December 11, 2023

@Sujit Kumar Singh 

Thanks for posting.

Based on the error, it looks like K10 is unable to write to the external storage (NFS) because of a read-only mount.

I would check that the NFS permissions are applied properly to the directories and subdirectories.

Regards
Satish


  • Author
  • Not a newbie anymore
  • 2 comments
  • Answer
  • December 20, 2023

Thank you for your responses.

The Kasten setup is working now.

I cleaned up the Kasten installation and reinstalled it following the procedure described here:

https://www.kasten.io/kubernetes/resources/blog/learn-the-best-way-to-install-kasten-k10-on-openshift

Then an S3 service (from Acromove FreeNAS) was used to create the S3 buckets, with the permissions needed for use as a location profile.

I also cleaned up some old SCCs that had been needed by some applications previously running on the OCP cluster. I am not sure whether that helped too.