Solved

Kanister Backup not working after upgrade from 4.5.10 to 4.5.13/4.5.14



Hi,

Since upgrading Kasten K10 from 4.5.10 to the latest releases (4.5.13 and also 4.5.14), my snapshot backups no longer work.

The Kanister log shows the following error:

kopia: error: unable to list sources: 401 Unauthorized, try --help

What could be causing this problem?

 

2022-04-29T13:30:03.953394730+02:00 {
    "ActionSet": "k10-backuptoserver-k10-deployment-generic-volume-2.0.18-pitgs6g",
    "File": "pkg/controller/controller.go",
    "Function": "github.com/kanisterio/kanister/pkg/controller.(*Controller).runAction.func1",
    "Line": 445,
    "Phase": "backupToServer",
    "cluster_name": "d65edb7c-54cd-4606-840a-4465efe18b9f",
    "hostname": "kanister-svc-68d47d45d8-2kg9q",
    "kanister.io/JobID": "91e4950c-c7af-11ec-9d5b-7a258d16cf8b",
    "level": "info",
    "msg": "Executing phase backupToServer",
    "time": "2022-04-29T11:30:03.945715584Z",
    "version": "4.5.14"
}

2022-04-29T13:30:03.953402976+02:00 {
    "Command": "kopia --log-level=error --config-file=/tmp/kopia-repository.config --log-dir=/tmp/kopia-log server status --address=https://10.152.183.200:51515 --server-cert-fingerprint=\u003c****\u003e --server-username=k10-admin@data-mover-server-pod --server-password=\u003c****\u003e",
    "File": "kasten.io/k10/kio/kopia/kopia.go",
    "Function": "kasten.io/k10/kio/kopia.stringSliceCommand",
    "Line": 140,
    "cluster_name": "d65edb7c-54cd-4606-840a-4465efe18b9f",
    "hostname": "kanister-svc-68d47d45d8-2kg9q",
    "level": "info",
    "msg": "kopia command",
    "time": "20220429-11:30:03.947Z",
    "version": "4.5.14"
}

2022-04-29T13:30:03.953409168+02:00 {
    "ActionSet": "k10-backuptoserver-k10-deployment-generic-volume-2.0.18-pitgs6g",
    "File": "pkg/controller/controller.go",
    "Function": "github.com/kanisterio/kanister/pkg/controller.(*Controller).onUpdate",
    "Line": 163,
    "Status": "pending",
    "cluster_name": "d65edb7c-54cd-4606-840a-4465efe18b9f",
    "hostname": "kanister-svc-68d47d45d8-2kg9q",
    "kanister.io/JobID": "91e4950c-c7af-11ec-9d5b-7a258d16cf8b",
    "level": "info",
    "msg": "Updated ActionSet",
    "time": "2022-04-29T11:30:03.948158004Z",
    "version": "4.5.14"
}

2022-04-29T13:30:03.953415643+02:00 {
    "ActionSet": "k10-backuptoserver-k10-deployment-generic-volume-2.0.18-pitgs6g",
    "File": "pkg/controller/controller.go",
    "Function": "github.com/kanisterio/kanister/pkg/controller.(*Controller).onUpdate",
    "Line": 163,
    "Phase": "backupToServer-\u003epending",
    "Status": "running",
    "cluster_name": "d65edb7c-54cd-4606-840a-4465efe18b9f",
    "hostname": "kanister-svc-68d47d45d8-2kg9q",
    "kanister.io/JobID": "91e4950c-c7af-11ec-9d5b-7a258d16cf8b",
    "level": "info",
    "msg": "Updated ActionSet",
    "time": "2022-04-29T11:30:03.94820996Z",
    "version": "4.5.14"
}

2022-04-29T13:30:05.011886067+02:00 {
    "Container": "kanister-sidecar",
    "File": "pkg/format/format.go",
    "Function": "github.com/kanisterio/kanister/pkg/format.LogWithCtx",
    "Line": 61,
    "Out": "kopia: error: unable to list sources: 401 Unauthorized, try --help",
    "Pod": "pihole-6b577bbbdb-p22xt",
    "cluster_name": "d65edb7c-54cd-4606-840a-4465efe18b9f",
    "hostname": "kanister-svc-68d47d45d8-2kg9q",
    "level": "info",
    "msg": "Pod Update",
    "time": "2022-04-29T11:30:05.011457172Z",
    "version": "4.5.14"
}


Best answer by lemassacre 30 April 2022, 21:26


5 comments


Hi @lemassacre, thanks for reaching out to us.

“kopia: error: unable to list sources: 401 Unauthorized, try --help” is an info message and is expected until the Kopia API server is ready. Possibly the actual error could be right after that. Can you provide more info about it?

  • Error YAML of the snapshot job
  • Process followed to upgrade (if Helm, please provide the command used to upgrade)

Debug logs can help us look into this issue. If possible, please provide them with a support ticket.

Regards
Satish Valasa
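
One way to act on the advice above is to skip the expected 401 lines and look at whatever else the sidecar printed. The following is a minimal sketch, assuming the log entries are available as JSON strings with the same keys as in the question ("Out", "msg", "level"). The second sample entry is hypothetical and only illustrates a different "Out" message; it is not taken from a real log.

```python
import json

# Sample entries modeled on the log in the question. The second entry is
# HYPOTHETICAL and only shows what a different "Out" message would look like.
RAW_ENTRIES = [
    '{"level": "info", "msg": "Pod Update", '
    '"Out": "kopia: error: unable to list sources: 401 Unauthorized, try --help"}',
    '{"level": "info", "msg": "Pod Update", '
    '"Out": "hypothetical: some other error text"}',
    '{"level": "info", "msg": "Updated ActionSet", "Status": "running"}',
]

# Marker for the message that is expected until the Kopia API server is ready.
EXPECTED_NOISE = "401 Unauthorized"


def interesting_output(raw_entries):
    """Return the "Out" text of entries that are not the expected 401 noise."""
    results = []
    for raw in raw_entries:
        entry = json.loads(raw)
        out = entry.get("Out")
        # Entries without "Out" are controller status updates, not pod output.
        if out is not None and EXPECTED_NOISE not in out:
            results.append(out)
    return results


remaining = interesting_output(RAW_ENTRIES)
print(remaining)
```

If the result is empty, as it would be for the log in the question, the 401 line is the only pod output and the real cause has to be found elsewhere, for example in the error YAML of the job.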

 


Hi @Satish

After some testing in my test environment, I can now tell that the problem starts with version 4.5.13 (it worked up to v4.5.12).

“Possibly the actual error could be right after that. Can you provide more info about it?”

This is the last log entry; it occurs multiple times in the log.

“Process followed to upgrade (if Helm, please provide the command used to upgrade)”

helm repo update

helm upgrade k10 kasten/k10 --namespace=kasten-io --version 4.5.13 -f helm-kasten.yaml

Before that, I also tried the command from the documentation:

helm repo update && \
helm get values k10 --output yaml --namespace=kasten-io > k10_val.yaml && \
helm upgrade k10 kasten/k10 --namespace=kasten-io -f k10_val.yaml

Error YAML of the snapshot job:

cause:
  cause:
    cause:
      cause:
        cause:
          message: context deadline exceeded
        file: kasten.io/k10/kio/poll/poll.go:95
        function: kasten.io/k10/kio/poll.waitWithBackoffWithRetriesHelper
        linenumber: 95
        message: Context done while polling
      fields:
        - name: duration
          value: 44m59.963318481s
      file: kasten.io/k10/kio/poll/poll.go:65
      function: kasten.io/k10/kio/poll.waitWithBackoffWithRetries
      linenumber: 65
      message: Timeout while polling
    fields:
      - name: actionSet
        value: k10-backuptoserver-k10-deployment-generic-volume-2.0.17-piznnbl
    file: kasten.io/k10/kio/kanister/operation.go:284
    function: kasten.io/k10/kio/kanister.(*Operation).waitForActionSetCompletion
    linenumber: 284
    message: Error waiting for ActionSet
  fields:
    - name: appName
      value: app
    - name: appType
      value: deployment
    - name: namespace
      value: namespace
  file: kasten.io/k10/kio/exec/phases/backup/snapshot_data_phase.go:612
  function: kasten.io/k10/kio/exec/phases/backup.basicVolumeSnapshot
  linenumber: 612
  message: Failed to snapshot volumes
message: Job failed to be executed
fields: []
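
To read an error like this, it helps to follow the chain of cause keys down to the innermost message. Below is a minimal sketch, assuming each cause key nests one level deeper than the one before it (the nesting is inferred from the order of the keys). The dictionary keeps only the message values from the error above; the name cause_chain is made up for this example.

```python
# Nested "cause" chain reduced to the message fields of the error above.
# The nesting depth is an ASSUMPTION inferred from the order of the keys.
ERROR = {
    "message": "Job failed to be executed",
    "cause": {
        "message": "Failed to snapshot volumes",
        "cause": {
            "message": "Error waiting for ActionSet",
            "cause": {
                "message": "Timeout while polling",
                "cause": {
                    "message": "Context done while polling",
                    "cause": {"message": "context deadline exceeded"},
                },
            },
        },
    },
}


def cause_chain(error):
    """Collect messages from the outermost error down to the root cause."""
    messages = []
    node = error
    while isinstance(node, dict):
        if "message" in node:
            messages.append(node["message"])
        # Step into the nested cause; None ends the loop at the innermost level.
        node = node.get("cause")
    return messages


chain = cause_chain(ERROR)
print(" <- ".join(chain))
```

Read this way, the job failure traces back to a context deadline being exceeded while polling for the ActionSet, which points to a timeout rather than an explicit failure inside the backup itself.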

 


I could finally solve the problem. I don’t know the reason for it, but I had to completely delete my workloads and then deploy them again.

After that, the Kanister backups work again, also with the current version 4.5.14.


Hello, Lemassacre,

 

That's great to hear. With a context deadline error like this, the cause would likely have been visible in the events, possibly an issue with a PVC or the pods. I'm glad you have resolved your problem.

 

Thanks

Emmanuel


@lemassacre 

The error below means that the Kanister action timed out (the default is 45 minutes).

The timeout can be increased if you have more data and anticipate that the backup will take longer.

To increase it, upgrade K10 with the Helm value kanister.backupTimeout set accordingly. This document lists all the available Helm values that you can use to configure K10.

        function: kasten.io/k10/kio/poll.waitWithBackoffWithRetriesHelper
        linenumber: 95
        message: Context done while polling
      fields:
        - name: duration
          value: 44m59.963318481s
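
The duration field is what ties this error to the timeout: it stops just short of 45 minutes. A small sketch of that arithmetic follows, assuming the value uses Go-style duration notation restricted to hours, minutes and seconds. The helper parse_duration is written for this example and is not part of K10 or Kanister.

```python
import re

# Default Kanister action timeout of 45 minutes, as stated above.
DEFAULT_TIMEOUT_SECONDS = 45 * 60

UNITS = {"h": 3600.0, "m": 60.0, "s": 1.0}
TOKEN = re.compile(r"(\d+(?:\.\d+)?)(h|m|s)")


def parse_duration(text):
    """Convert a duration such as '44m59.963318481s' to seconds.

    Only h, m and s units are handled; anything else (for example 'ms')
    is rejected instead of being silently misread.
    """
    if not re.fullmatch(r"(?:\d+(?:\.\d+)?(?:h|m|s))+", text):
        raise ValueError(f"unsupported duration: {text!r}")
    return sum(float(value) * UNITS[unit] for value, unit in TOKEN.findall(text))


elapsed = parse_duration("44m59.963318481s")
remaining = DEFAULT_TIMEOUT_SECONDS - elapsed
print(f"elapsed: {elapsed:.3f}s, short of the default timeout by {remaining:.3f}s")
```

The polling ran for all but a few hundredths of a second of the 45 minutes, which is consistent with the action being cut off by the timeout. If a longer timeout is wanted, the Helm upgrade commands shown earlier in the thread could presumably carry something like --set kanister.backupTimeout=<value>; the unit of that value is assumed here to be minutes and should be checked against the Helm values document.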

 
