I am trying to deploy K10, but without success. Only the Grafana pod still needs to start, and it hangs:


NAME                                    READY   STATUS                  RESTARTS       AGE
aggregatedapis-svc-64c55c6979-5gswn     1/1     Running                 0              81m
auth-svc-79c7944dcb-mlfsh               1/1     Running                 0              81m
catalog-svc-6548c9f45d-sczm8            2/2     Running                 0              81m
controllermanager-svc-cd748cdff-dsrnn   1/1     Running                 0              81m
crypto-svc-5846c467dc-ddqtn             4/4     Running                 0              81m
dashboardbff-svc-5d4857cdf7-npldc       2/2     Running                 0              81m
executor-svc-6fddc58f98-l6v9f           2/2     Running                 0              81m
executor-svc-6fddc58f98-smfrk           2/2     Running                 0              81m
executor-svc-6fddc58f98-zt8lv           2/2     Running                 0              81m
frontend-svc-5566c779c9-tzslx           1/1     Running                 0              81m
gateway-8567b6f75b-glbw2                1/1     Running                 0              81m
jobs-svc-86c9c4889f-hnrjk               1/1     Running                 0              81m
k10-grafana-78dc667986-4sfgz            0/1     Init:CrashLoopBackOff   10 (90s ago)   27m
kanister-svc-96f46bf89-fm9s2            1/1     Running                 0              81m
logging-svc-76c6d79878-9s49n            1/1     Running                 0              81m
metering-svc-79686c7689-qnrlw           1/1     Running                 0              81m
prometheus-server-7787b6d6dc-l8m7m      2/2     Running                 0              81m
state-svc-6c98bb5657-rvcvp              2/2     Running                 0              81m
 

I'm sharing the pod log below. Has this situation happened to anyone?

 

 oc logs --v=8 k10-grafana-78dc667986-4sfgz
I0405 16:32:21.126157  446944 loader.go:374] Config loaded from file:  /root/.kube/config
I0405 16:32:21.133823  446944 round_trippers.go:463] GET https://api.lab.nanobunkers.io:6443/api/v1/namespaces/kasten-io/pods/k10-grafana-78dc667986-4sfgz
I0405 16:32:21.133855  446944 round_trippers.go:469] Request Headers:
I0405 16:32:21.133866  446944 round_trippers.go:473]     Accept: application/json, */*
I0405 16:32:21.133881  446944 round_trippers.go:473]     User-Agent: oc/4.12.0 (linux/amd64) kubernetes/b05f7d4
I0405 16:32:21.146039  446944 round_trippers.go:574] Response Status: 200 OK in 12 milliseconds
I0405 16:32:21.146077  446944 round_trippers.go:577] Response Headers:
I0405 16:32:21.146085  446944 round_trippers.go:580]     Audit-Id: c0522355-682d-4a25-a6e3-243b9dbd4038
I0405 16:32:21.146093  446944 round_trippers.go:580]     Cache-Control: no-cache, private
I0405 16:32:21.146099  446944 round_trippers.go:580]     Content-Type: application/json
I0405 16:32:21.146111  446944 round_trippers.go:580]     Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
I0405 16:32:21.146120  446944 round_trippers.go:580]     X-Kubernetes-Pf-Flowschema-Uid: 19dd2e7f-a040-4c79-833a-8f06c46f2527
I0405 16:32:21.146129  446944 round_trippers.go:580]     X-Kubernetes-Pf-Prioritylevel-Uid: 02385814-62fe-40ac-bc33-6377b3ac8bed
I0405 16:32:21.146140  446944 round_trippers.go:580]     Date: Wed, 05 Apr 2023 22:32:21 GMT
I0405 16:32:21.146272  446944 request.go:1154] Response Body: {"kind":"Pod","apiVersion":"v1","metadata":{"name":"k10-grafana-78dc667986-4sfgz","generateName":"k10-grafana-78dc667986-","namespace":"kasten-io","uid":"c2688ece-718d-4382-a045-b57828c5c6ef","resourceVersion":"17399859","creationTimestamp":"2023-04-05T22:03:13Z","labels":{"app":"grafana","component":"grafana","pod-template-hash":"78dc667986","release":"k10"},"annotations":{"checksum/config":"0d520b80404b43fe4bd21c306c6272741b865eb1fa10c69d2b78197dec7ffa59","checksum/dashboards-json-config":"eae4f30f696bce8d7ea91891b68e4ab7657565d17ebd27c591066c09aca44e7f","checksum/sc-dashboard-provider-config":"01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b","checksum/secret":"31e08bbb0d21be7f8a9c1b778847e8d15f8d727a8a0c51946be1e3434580b4f1","k8s.v1.cni.cncf.io/network-status":"[{\n    \"name\": \"openshift-sdn\",\n    \"interface\": \"eth0\",\n    \"ips\": [\n        \"10.131.0.136\"\n    ],\n    \"default\": true,\n    \"dns\": {}\n}]","k8s.v1.cni.cncf.io/networks-status":"[{\n    \"name\": \"openshift-s [truncated 11853 chars]
Defaulted container "grafana" out of: grafana, init-chown-data (init), download-dashboards (init)
I0405 16:32:21.148709  446944 round_trippers.go:463] GET https://api.lab.nanobunkers.io:6443/api/v1/namespaces/kasten-io/pods/k10-grafana-78dc667986-4sfgz/log?container=grafana
I0405 16:32:21.148723  446944 round_trippers.go:469] Request Headers:
I0405 16:32:21.148736  446944 round_trippers.go:473]     Accept: application/json, */*
I0405 16:32:21.148746  446944 round_trippers.go:473]     User-Agent: oc/4.12.0 (linux/amd64) kubernetes/b05f7d4
I0405 16:32:21.159851  446944 round_trippers.go:574] Response Status: 400 Bad Request in 11 milliseconds
I0405 16:32:21.159879  446944 round_trippers.go:577] Response Headers:
I0405 16:32:21.159889  446944 round_trippers.go:580]     Audit-Id: 9be9af6b-213f-4284-b64b-a40cb0d76611
I0405 16:32:21.159898  446944 round_trippers.go:580]     Cache-Control: no-cache, private
I0405 16:32:21.159905  446944 round_trippers.go:580]     Content-Type: application/json
I0405 16:32:21.159916  446944 round_trippers.go:580]     Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
I0405 16:32:21.159926  446944 round_trippers.go:580]     Content-Length: 213
I0405 16:32:21.159933  446944 round_trippers.go:580]     Date: Wed, 05 Apr 2023 22:32:21 GMT
I0405 16:32:21.159964  446944 request.go:1154] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"container \"grafana\" in pod \"k10-grafana-78dc667986-4sfgz\" is waiting to start: PodInitializing","reason":"BadRequest","code":400}
I0405 16:32:21.160298  446944 helpers.go:246] server response object: [{
  "metadata": {},
  "status": "Failure",
  "message": "container \"grafana\" in pod \"k10-grafana-78dc667986-4sfgz\" is waiting to start: PodInitializing",
  "reason": "BadRequest",
  "code": 400
}]
Error from server (BadRequest): container "grafana" in pod "k10-grafana-78dc667986-4sfgz" is waiting to start: PodInitializing
 

 

@jaiganeshjk 


Can you try

kubectl describe pod k10-grafana-78dc667986-4sfgz -n kasten-io

just to see, at a higher level, what it is complaining about?
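
Also, since a plain oc logs call defaults to the grafana container (which hasn't started yet), it might help to pull the logs of the failing init container directly. Something like this, assuming the same pod and namespace:

oc logs k10-grafana-78dc667986-4sfgz -c init-chown-data -n kasten-io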

 

cheers


@Geoff Burke 

 

Thanks for answering. This is the output of the command:

 kubectl describe pod k10-grafana-78dc667986-4sfgz -n kasten-io
 

 

 

Name:             k10-grafana-78dc667986-4sfgz
Namespace:        kasten-io
Priority:         0
Service Account:  k10-grafana
Node:             ocp-w-1.lab.dominio.io/192.168.22.211
Start Time:       Wed, 05 Apr 2023 16:03:13 -0600
Labels:           app=grafana
                  component=grafana
                  pod-template-hash=78dc667986
                  release=k10
Annotations:      checksum/config: 0d520b80404b43fe4bd21c306c6272741b865eb1fa10c69d2b78197dec7ffa59
                  checksum/dashboards-json-config: eae4f30f696bce8d7ea91891b68e4ab7657565d17ebd27c591066c09aca44e7f
                  checksum/sc-dashboard-provider-config: 01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b
                  checksum/secret: 31e08bbb0d21be7f8a9c1b778847e8d15f8d727a8a0c51946be1e3434580b4f1
                  k8s.v1.cni.cncf.io/network-status:
                    [{
                        "name": "openshift-sdn",
                        "interface": "eth0",
                        "ips":
                            "10.131.0.136"
                        ],
                        "default": true,
                        "dns": {}
                    }]
                  k8s.v1.cni.cncf.io/networks-status:
                    [{
                        "name": "openshift-sdn",
                        "interface": "eth0",
                        "ips":
                            "10.131.0.136"
                        ],
                        "default": true,
                        "dns": {}
                    }]
                  openshift.io/scc: k10-grafana
Status:           Pending
IP:               10.131.0.136
IPs:
  IP:           10.131.0.136
Controlled By:  ReplicaSet/k10-grafana-78dc667986
Init Containers:
  init-chown-data:
    Container ID:  cri-o://1edf02413f8821c2832e542fec8380c1fd85525d66a1061f86c229d2e35451c9
    Image:         gcr.io/kasten-images/init:5.5.7
    Image ID:      gcr.io/kasten-images/init@sha256:bd844aa544aad9e8447f7f58553b0ef2ecee1f2e3e5e73b0b479f6dd096b02dc
    Port:          <none>
    Host Port:     <none>
    Command:
      chown
      -R
      472:472
      /var/lib/grafana
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Wed, 05 Apr 2023 18:02:06 -0600
      Finished:     Wed, 05 Apr 2023 18:02:06 -0600
    Ready:          False
    Restart Count:  28
    Environment:    <none>
    Mounts:
      /var/lib/grafana from storage (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-d78cd (ro)
  download-dashboards:
    Container ID:
    Image:         gcr.io/kasten-images/init:5.5.7
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/sh
    Args:
      -c
      mkdir -p /var/lib/grafana/dashboards/default && /bin/sh -x /etc/grafana/download_dashboards.sh
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /etc/grafana/download_dashboards.sh from config (rw,path="download_dashboards.sh")
      /var/lib/grafana from storage (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-d78cd (ro)
Containers:
  grafana:
    Container ID:
    Image:          gcr.io/kasten-images/grafana:5.5.7
    Image ID:
    Ports:          80/TCP, 3000/TCP
    Host Ports:     0/TCP, 0/TCP
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Liveness:       http-get http://:3000/api/health delay=60s timeout=30s period=10s #success=1 #failure=10
    Readiness:      http-get http://:3000/api/health delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      GF_SECURITY_ADMIN_USER:      <set to the key 'admin-user' in secret 'k10-grafana'>      Optional: false
      GF_SECURITY_ADMIN_PASSWORD:  <set to the key 'admin-password' in secret 'k10-grafana'>  Optional: false
      GF_PATHS_DATA:               /var/lib/grafana/
      GF_PATHS_LOGS:               /var/log/grafana
      GF_PATHS_PLUGINS:            /var/lib/grafana/plugins
      GF_PATHS_PROVISIONING:       /etc/grafana/provisioning
    Mounts:
      /etc/grafana/grafana.ini from config (rw,path="grafana.ini")
      /etc/grafana/provisioning/dashboards/dashboardproviders.yaml from config (rw,path="dashboardproviders.yaml")
      /etc/grafana/provisioning/datasources/datasources.yaml from config (rw,path="datasources.yaml")
      /var/lib/grafana from storage (rw)
      /var/lib/grafana/dashboards/default/default.json from dashboards-default (rw,path="default.json")
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-d78cd (ro)
Conditions:
  Type              Status
  Initialized       False
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      k10-grafana
    Optional:  false
  dashboards-default:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      k10-grafana-dashboards-default
    Optional:  false
  storage:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  k10-grafana
    ReadOnly:   false
  kube-api-access-d78cd:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
    ConfigMapName:           openshift-service-ca.crt
    ConfigMapOptional:       <nil>
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason          Age                     From               Message
  ----     ------          ----                    ----               -------
  Normal   Scheduled       119m                    default-scheduler  Successfully assigned kasten-io/k10-grafana-78dc667986-4sfgz to ocp-w-1.lab.dominio.io
  Normal   AddedInterface  119m                    multus             Add eth0 [10.131.0.136/23] from openshift-sdn
  Normal   Pulled          117m (x5 over 119m)     kubelet            Container image "gcr.io/kasten-images/init:5.5.7" already present on machine
  Normal   Created         117m (x5 over 119m)     kubelet            Created container init-chown-data
  Normal   Started         117m (x5 over 119m)     kubelet            Started container init-chown-data
  Warning  BackOff         4m15s (x532 over 119m)  kubelet            Back-off restarting failed container
 


This is OpenShift, so I am not really in my neck of the woods, but I believe the Grafana pod is the last to deploy in the K10 deployment. Just as a quick out-of-the-box check, see if you are running out of resources on the nodes, maybe:

oc adm top node

and I think

oc describe node nodename

My amateur eye tells me the issues start at init-chown-data (so it is changing ownership?), so check that the PVC is OK as well.
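
Something along these lines should show whether the Grafana claim actually bound and what it is backed by (the claim name is the one from the describe output above):

oc get pvc k10-grafana -n kasten-io
oc describe pvc k10-grafana -n kasten-io
oc get pv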

I know the Kasten support folks here know OpenShift really well, so I am sure they will chime in and help out as well.

 

cheers


 

@Geoff Burke Thank you for your comments. This is the data that the command returns. I am using NFS storage; it could be the problem. The other pods do take the PV.


Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests      Limits
  --------           --------      ------
  cpu                776m (5%)     0 (0%)
  memory             4412Mi (34%)  1700Mi (13%)
  ephemeral-storage  0 (0%)        0 (0%)
  hugepages-1Gi      0 (0%)        0 (0%)
  hugepages-2Mi      0 (0%)        0 (0%)
Events:              <none>
 


Are you able to deploy other workloads to the NFS without issues?

 

Might seem like a silly question, but I actually had a problem like this once, and I had checked everything else first.
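
If you want a quick way to test it, a throwaway PVC plus a pod that writes to it usually tells you right away. A rough sketch (the names are made up, and storageClassName should match whatever class the k10-grafana PVC uses):

cat <<EOF | oc apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-write-test
  namespace: kasten-io
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 1Gi
  storageClassName: my-nfs-class   # placeholder: use the class backing the k10-grafana PVC
---
apiVersion: v1
kind: Pod
metadata:
  name: nfs-write-test
  namespace: kasten-io
spec:
  containers:
  - name: writer
    image: busybox
    command: ["sh", "-c", "echo hello > /data/test-file && sleep 3600"]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: nfs-write-test
EOF

If the pod starts and /data/test-file shows up on the NFS export, the basic provisioning and write path is fine, and the problem is more likely the chown the init container is trying to do.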

 

 


@Geoff Burke 

Good morning. The pods are already up. The comments gave me an idea and I investigated: it turns out the NFS server side was not configured correctly. Specifically, the exports file was missing parameters. It now looks as follows:

/volk10-1 *(rw,sync,no_root_squash)

Thank you very much for the comments, they are very helpful.

 

NAME                                    READY   STATUS    RESTARTS   AGE
aggregatedapis-svc-64c55c6979-6br5z     1/1     Running   0          16m
auth-svc-79c7944dcb-nr5w7               1/1     Running   0          16m
catalog-svc-6548c9f45d-hl69t            2/2     Running   0          16m
controllermanager-svc-cd748cdff-465js   1/1     Running   0          16m
crypto-svc-5846c467dc-48nbd             4/4     Running   0          16m
dashboardbff-svc-5d4857cdf7-hv5vz       2/2     Running   0          16m
executor-svc-6fddc58f98-jndjk           2/2     Running   0          16m
executor-svc-6fddc58f98-lvh9m           2/2     Running   0          16m
executor-svc-6fddc58f98-nztx5           2/2     Running   0          16m
frontend-svc-5566c779c9-kz4q5           1/1     Running   0          16m
gateway-8567b6f75b-q6m58                1/1     Running   0          16m
jobs-svc-86c9c4889f-7xrw7               1/1     Running   0          16m
k10-grafana-678bbc5cbc-5s2k9            1/1     Running   0          16m
kanister-svc-96f46bf89-mkww8            1/1     Running   0          16m
logging-svc-76c6d79878-5tmq4            1/1     Running   0          16m
metering-svc-79686c7689-7lss5           1/1     Running   0          16m
prometheus-server-7787b6d6dc-f5sw5      2/2     Running   0          16m
state-svc-6c98bb5657-8c5qf              2/2     Running   0          16m
[root@ocp-svc k10]#
 




Ah yes.. the fact that some of those other pods have persistent storage 🙂 I should have thought of that 🤣 Oh well, glad that helped. Was it because of permissions on the storage?

 

cheers


 

It was the configuration file /etc/exports; I had missed adding the flag "no_root_squash". I restarted the NFS service, cleaned up the environment, and brought everything up again from scratch, and this time the pod started.
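
For anyone who hits the same thing, the rough sequence on the NFS server side looks something like this (only the /volk10-1 line is from my setup; the service name assumes a systemd-based NFS server, adjust for your distro):

# /etc/exports
/volk10-1 *(rw,sync,no_root_squash)

exportfs -ra                   # re-read /etc/exports and re-export, or
systemctl restart nfs-server   # restart the NFS service
exportfs -v                    # verify the options actually applied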

 

drwxrwxrwx   5 root             root    90 Apr  6 18:20 volk10-1
drwxrwxrwx   9              472    472 159 Apr  6 18:20 volk10-2
drwxrwxrwx   8 root             root   176 Apr  6 18:11 volk10-3
 

volk10-2 is the export where the pod changed the owner; that is the one Grafana uses, and the init-chown-data container did it during the deployment (chown -R 472:472, Grafana's UID/GID), which only works once no_root_squash lets the pod's root user change ownership on the export.

