We have hundreds of kanister-job-* pods in our kasten-io namespace that are in phase "Succeeded" but are never deleted.
$ kubectl get pod --namespace kasten-io --field-selector status.phase=Succeeded | wc -l
355
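Note that plain `kubectl get pod` output includes a header row, so the `wc -l` figure is one higher than the actual number of pods, and it counts every succeeded pod in the namespace, not only the kanister-job-* ones. A minimal sketch for counting just those pods, assuming the `pod/<name>` lines that `kubectl get pod -o name` prints; the function name `count_kanister_jobs` is made up for illustration:

```shell
# Count kanister-job-* pods from `kubectl get pod -o name` output on stdin.
# `grep -c` prints 0 and exits non-zero when nothing matches, so `|| true`
# keeps the exit status clean in that case.
count_kanister_jobs() {
  grep -c '^pod/kanister-job-' || true
}

# Usage (needs cluster access):
#   kubectl get pod --namespace kasten-io \
#     --field-selector status.phase=Succeeded -o name | count_kanister_jobs
```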
Here is the status of one of the jobs:
status:
  phase: Succeeded
  conditions:
    - type: Initialized
      status: 'True'
      lastProbeTime: null
      lastTransitionTime: '2023-06-04T15:11:22Z'
      reason: PodCompleted
    - type: Ready
      status: 'False'
      lastProbeTime: null
      lastTransitionTime: '2023-06-04T15:11:22Z'
      reason: PodCompleted
    - type: ContainersReady
      status: 'False'
      lastProbeTime: null
      lastTransitionTime: '2023-06-04T15:11:22Z'
      reason: PodCompleted
    - type: PodScheduled
      status: 'True'
      lastProbeTime: null
      lastTransitionTime: '2023-06-04T15:11:22Z'
The log of the kanister-svc pod shows the following messages for this pod:
Failed to delete pod - an empty namespace may not be set when a resource name is provided
{
  "ActionSet": "k10-backup-etcd-blueprint-etcd-details-etcd-backup--nkzh5",
  "Container": "container",
  "File": "pkg/output/stream.go",
  "Function": "github.com/kanisterio/kanister/pkg/output.LogAndParse.func1",
  "Line": 56,
  "LogKind": "datapath",
  "Phase": "removeSnapshot",
  "Pod": "kanister-job-glxkr",
  "Pod_Out": "Unable to use a TTY - input is not a terminal or the right kind of file",
  "cluster_name": "2fc2b23f-9017-4015-9c0a-25e1048d2dfc",
  "hostname": "kanister-svc-96f46bf89-fk6dn",
  "kanister.io/JobID": "e0398f6f-02e9-11ee-ab43-0a580a83002b",
  "level": "info",
  "msg": "",
  "time": "2023-06-04T15:11:29.279526021Z",
  "version": "5.5.7"
}
{
  "File": "pkg/kube/pod_controller.go",
  "Function": "github.com/kanisterio/kanister/pkg/kube.(*podController).StopPod",
  "Line": 160,
  "Namespace": "",
  "PodName": "kanister-job-glxkr",
  "cluster_name": "2fc2b23f-9017-4015-9c0a-25e1048d2dfc",
  "error": "an empty namespace may not be set when a resource name is provided",
  "hostname": "kanister-svc-96f46bf89-fk6dn",
  "level": "info",
  "msg": "Failed to delete pod",
  "time": "2023-06-04T15:11:29.307724147Z",
  "version": "5.5.7"
}
{
  "File": "pkg/kube/pod_runner.go",
  "Function": "github.com/kanisterio/kanister/pkg/kube.(*podRunner).Run.func1",
  "Line": 64,
  "PodName": "kanister-job-glxkr",
  "cluster_name": "2fc2b23f-9017-4015-9c0a-25e1048d2dfc",
  "error": "an empty namespace may not be set when a resource name is provided",
  "hostname": "kanister-svc-96f46bf89-fk6dn",
  "level": "info",
  "msg": "Failed to delete pod",
  "time": "2023-06-04T15:11:29.307765475Z",
  "version": "5.5.7"
}
We are using Kasten 5.5.7 on OKD 4.12.0-0.okd-2023-04-16-041331. Any idea what is causing this?
Currently we have to work around this problem by regularly deleting the succeeded pods:
kubectl delete pod --namespace kasten-io --field-selector status.phase=Succeeded
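The command above removes every succeeded pod in the namespace, not only the kanister-job-* ones. A narrower variant could look like the sketch below. It assumes that only pods whose names start with kanister-job- should be cleaned up; the function names are hypothetical, and the namespace defaults to kasten-io as in our setup:

```shell
# Keep only kanister-job-* entries from `kubectl get pod -o name` output.
filter_kanister_jobs() {
  grep '^pod/kanister-job-' || true
}

# Delete succeeded kanister-job-* pods in the given namespace
# (defaults to kasten-io). Does nothing when no pod matches.
cleanup_kanister_jobs() {
  ns="${1:-kasten-io}"
  pods=$(kubectl get pod --namespace "$ns" \
    --field-selector status.phase=Succeeded -o name | filter_kanister_jobs)
  if [ -n "$pods" ]; then
    # $pods is intentionally unquoted so each pod/<name> becomes one argument.
    kubectl delete --namespace "$ns" $pods
  fi
}

# Usage (needs cluster access):
#   cleanup_kanister_jobs kasten-io
```

This is still only a workaround; it does not address the empty namespace that kanister-svc passes when it tries to delete the pod.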