Hello,

I’m trying to back up a large Postgres pod (>150 GB), but the backup fails after 45 minutes with the following error message:

- cause:
    cause:
      cause:
        cause:
          cause:
            cause:
              message: context deadline exceeded
            file: kasten.io/k10/kio/poll/poll.go:96
            function: kasten.io/k10/kio/poll.waitWithBackoffWithRetriesHelper
            linenumber: 96
            message: Context done while polling
          fields:
            - name: duration
              value: 44m59.992268957s
          file: kasten.io/k10/kio/poll/poll.go:66
          function: kasten.io/k10/kio/poll.waitWithBackoffWithRetries
          linenumber: 66
          message: Timeout while polling
        fields:
          - name: actionSet
            value: k10-backup-k10-pgo-bp-0.0.3-cinepg-cine--295f2
        file: kasten.io/k10/kio/kanister/operation.go:381
        function: kasten.io/k10/kio/kanister.(*Operation).waitForActionSetCompletion
        linenumber: 381
        message: Error waiting for ActionSet
      file: kasten.io/k10/kio/exec/phases/backup/snapshot_data_phase.go:581
      function: kasten.io/k10/kio/exec/phases/backup.snapshotNamespace
      linenumber: 581
      message: Error performing operator snapshot
    file: kasten.io/k10/kio/exec/phases/backup/snapshot_data_phase.go:407
    function: kasten.io/k10/kio/exec/phases/backup.processNonWorkloadArtifact
    linenumber: 407
    message: Failed snapshot for namespace
  message: Job failed to be executed

 

Are there any parameters I could set in the backup policy to extend this timeout, or what is the actual reason for this issue?

 

BR,

Daniel

Hello @Daniel Moes,

Thank you for using the K10 community!

There is a parameter that increases the Kanister backup timeout. It should apply in your case, since you are probably using a blueprint to back up Postgres.

I would recommend first checking the kanister-svc logs for issues during the backup phase. If you see that the backup was still running and simply did not complete in time, you can try increasing the value.
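A minimal sketch of that log check, assuming K10 is installed in the kasten-io namespace and the Kanister service runs as a deployment named kanister-svc (adjust both if your install differs). The helper names kanister_logs and filter_actionset are made up for illustration; the ActionSet name in the example comes from the error above.

```shell
# Assumption: namespace "kasten-io" and deployment "kanister-svc".
# Prints recent logs from the Kanister service (default: last 2 hours).
kanister_logs() {
  kubectl logs -n kasten-io deployment/kanister-svc --since="${1:-2h}"
}

# Hypothetical helper: keep only the log lines on stdin that mention
# the given ActionSet name (fixed-string match, no regex).
filter_actionset() {
  grep -F -- "$1"
}

# Example (run right after a failed backup):
#   kanister_logs 2h | filter_actionset "k10-backup-k10-pgo-bp-0.0.3-cinepg-cine--295f2"
```

If the filtered lines show the backup still making progress right up to the 45-minute mark, the timeout is the likely culprit rather than a failure inside the blueprint.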

It would also be good to run the same blueprint command directly in the Postgres pod and see how long it takes, so you have an idea of the timeout you need.
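One way to turn that measurement into a timeout value is sketched below. The helper names are hypothetical, the 50% headroom is an arbitrary assumption, and the namespace, pod name and command are placeholders you need to take from your own setup and blueprint.

```shell
# Hypothetical helper: run a command inside a pod and print how many
# whole seconds it took. The command's own stdout is discarded.
#   $1 = namespace, $2 = pod name, $3 = command string from your blueprint
time_in_pod() {
  start=$(date +%s)
  kubectl exec -n "$1" "$2" -- sh -c "$3" >/dev/null
  end=$(date +%s)
  echo $(( end - start ))
}

# Hypothetical helper: convert a measured duration in seconds into a
# timeout in minutes, adding 50% headroom and rounding up.
suggest_timeout_minutes() {
  echo $(( ($1 * 3 / 2 + 59) / 60 ))
}

# Example with placeholders:
#   secs=$(time_in_pod <namespace> <postgres-pod> "<blueprint backup command>")
#   suggest_timeout_minutes "$secs"
```

For instance, a dump that takes 4800 seconds (80 minutes) gives a suggested timeout of 120 minutes.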

If you find that the backup simply needs more time, the Helm value below can be used to increase the Kanister backup timeout:

--set kanister.backupTimeout=<value-minutes>

The default value is 45 minutes, and it can be increased according to your needs. The example below increases it to 120 minutes:

helm get values k10 -n kasten-io > k10_val.yaml
helm upgrade k10 kasten/k10 --namespace=kasten-io -f k10_val.yaml \
--set kanister.backupTimeout=120 --version=YOUR_K10_VERSION
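After the upgrade you can check that the value was actually stored in the release. This is only a sketch: it assumes the release is named k10 in the kasten-io namespace, as in the commands above, and read_backup_timeout is a made-up helper that picks the value out of the YAML printed by helm get values.

```shell
# Hypothetical helper: read Helm values YAML on stdin and print the
# value of kanister.backupTimeout, if it is set.
read_backup_timeout() {
  awk '
    /^kanister:/ { in_k = 1; next }        # entering the kanister section
    /^[^ ]/      { in_k = 0 }              # any other top-level key ends it
    in_k && $1 == "backupTimeout:" { print $2 }
  '
}

# Example:
#   helm get values k10 -n kasten-io | read_backup_timeout
```

If nothing is printed, the value is not set in the release and the default still applies.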

More information about this and other K10 Helm values can be found in our docs:

https://docs.kasten.io/latest/install/advanced.html#complete-list-of-k10-helm-options

Hope it helps.

FRubens

