
Need help troubleshooting this Kopia error

Used the Kanister blueprint for MySQL

Storage class: Longhorn

Hi,

Can you give us a screenshot or logs (show details)? There is not enough information here. Did exporting a snapshot work without the blueprint?

 


These are the logs for the Kanister pod.

Exporting without Kanister did not work.

@Geoff Burke 


I have not done much Kanister troubleshooting, but according to the API section of the docs you can inspect backup actions manually, which might help, with commands like this:

kubectl get backupactions.actions.kio.kasten.io ….

See https://docs.kasten.io/latest/api/actions.html for details.

Otherwise we might have to wait for one of the Kasten folks here to help out.

 

 


@rbhowmik-sds I would suggest checking the logs of the executor-svc and kanister-svc pods while the backup/export is running, focusing on Kopia errors. It would help if you shared the log snippets with us.

kubectl logs -l component=executor -n kasten-io -f

kubectl logs -l component=kanister -n kasten-io -f
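As a rough sketch of how the output of the two commands above could be narrowed down to Kopia-related errors: the helper name `kopia_errors` is hypothetical (not part of K10), and the match on "error" or "fail" is only an assumption about how the relevant lines are worded. Note that when a label selector is used, `kubectl logs` shows only the last 10 lines per pod by default, so `--tail=-1` is added to get the full log.

```shell
# Hypothetical helper (not part of K10): keep only the log lines that
# mention Kopia and also look like errors. Reads log lines on stdin.
kopia_errors() {
  grep -i 'kopia' | grep -iE 'error|fail'
}

# Possible usage against a live cluster (left as comments so that the
# snippet can be sourced safely anywhere):
#   kubectl logs -l component=executor -n kasten-io --tail=-1 | kopia_errors
#   kubectl logs -l component=kanister -n kasten-io --tail=-1 | kopia_errors
```

Running this during the export, with `-f` in place of `--tail=-1`, should show matching lines as they are written.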

Ahmed


Hi @Hagag, these are the logs from both of the commands above.


Hi @rbhowmik-sds, the executor pods seem unable to retrieve the job from the queue for execution, which differs from the initial error you provided.

The logs indicate a read-only file system and that the queued jobs could not be fetched.

Did you get the same Kopia error in the dashboard?

Ahmed


@Hagag The Kopia error appears in the dashboard only while the job is running; otherwise everything in the dashboard works fine. I have already shared a screenshot of the error shown in the dashboard.

Please suggest some troubleshooting steps. We need to fix this error as soon as possible.


@rbhowmik-sds I need to check the debug logs. Would you mind opening a case with us through `my.veeam.com`, selecting `Kasten by veeam K10 Trial` as the product?

Please collect the debug logs (https://docs.kasten.io/latest/operating/support.html#gathering-debugging-information) and upload them to the case. We will get in touch and take a closer look at what's going on.


Hi @Hagag, I created a case with Veeam as you suggested.


@rbhowmik-sds I see that my colleague has taken the case; he will check it and get back to you. From the logs, I can see that the jobs-svc pod has not started, which explains why the executor pods are unable to retrieve jobs from the queue for execution.


jobs-svc-6b999bbdc9-c82bg                0/1     Init:0/1           0                  18h

I also see many K10 pods restarting and in CrashLoopBackOff.

 

services/executor-svc-8d6ff74c4-h9cc8.txt:{"File":"kasten.io/k10/kio/exec/internal/queuepoll/queuepoll.go","Function":"kasten.io/k10/kio/exec/internal/queuepoll.(*QueuePoll).getQueuedJobWithBackoff.func1","Line":58,"cluster_name":"a852c906-4194-4486-b00a-8f0be53f817d","error":{"message":"Error fetching queued job","function":"kasten.io/k10/kio/exec/internal/jobs.(*JobManager).NextQueuedJob","linenumber":145,"file":"kasten.io/k10/kio/exec/internal/jobs/jobmanager.go:145","cause":{"message":"Get \"http://10.111.62.71:8000/v0/queuedjobs\": dial tcp 10.111.62.71:8000: connect: connection refused"}},"hostname":"executor-svc-8d6ff74c4-h9cc8","level":"error","msg":"Error while polling for QueuedJob","time":"2023-12-31T08:22:48.600Z","version":"6.5.0"}
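To find out which service is refusing connections in log lines like the one above, a small filter along these lines might help. This is only a sketch: `refused_endpoints` is a hypothetical helper name, and it assumes the error text has the same `dial tcp <ip>:<port>: connect: connection refused` shape as in this log.

```shell
# Hypothetical helper: read log lines on stdin and print each unique
# "ip:port" for which a TCP connection was refused.
refused_endpoints() {
  grep -oE 'dial tcp [0-9.]+:[0-9]+: connect: connection refused' \
    | awk '{ sub(/:$/, "", $3); print $3 }' \
    | sort -u
}

# Possible usage (left as comments so the snippet is safe to source):
#   refused_endpoints < services/executor-svc-8d6ff74c4-h9cc8.txt
# Then look up which Service owns the reported ClusterIP, for example:
#   kubectl get svc -n kasten-io -o wide | grep 10.111.62.71
```

If the IP maps to the jobs service, that would be consistent with the jobs-svc pod not being up.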




BR,
Ahmed Hagag


@rbhowmik-sds 

My colleague has ownership of the case and will contact you soon.

I also had a quick look at the logs and found that the jobs-svc pod has not started, so the executor pods won't be able to pull jobs from the queue.

jobs-svc-6b999bbdc9-c82bg    This pod has not started.
Also, many pods are restarting and some of them are in CrashLoopBackOff status.
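For anyone wanting to spot pods in this state quickly, here is a minimal sketch that filters the default `kubectl get pods` table. The helper name `unhealthy_pods` is hypothetical, and it assumes the usual column order (NAME, READY, STATUS, ...) with a header row.

```shell
# Hypothetical helper: read the default `kubectl get pods` table on stdin
# and print the name and status of every pod that is not fully ready or
# not in Running status. Completed pods are skipped; the first line is
# treated as the header, so do not combine this with --no-headers.
unhealthy_pods() {
  awk 'NR > 1 && $3 != "Completed" {
         split($2, ready, "/")
         if (ready[1] != ready[2] || $3 != "Running") print $1, $3
       }'
}

# Possible usage (left as comments so the snippet is safe to source):
#   kubectl get pods -n kasten-io | unhealthy_pods
# A pod stuck in Init can then be inspected further, for example:
#   kubectl describe pod jobs-svc-6b999bbdc9-c82bg -n kasten-io
```

The Events section of the `describe` output is usually the first place to look for why an init container has not finished.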

CC @Haythem Elkhouly 

