I am trying to deploy K10, but without success. Only the Grafana pod still needs to start, and it hangs:


NAME                                    READY   STATUS                  RESTARTS       AGE
aggregatedapis-svc-64c55c6979-5gswn     1/1     Running                 0              81m
auth-svc-79c7944dcb-mlfsh               1/1     Running                 0              81m
catalog-svc-6548c9f45d-sczm8            2/2     Running                 0              81m
controllermanager-svc-cd748cdff-dsrnn   1/1     Running                 0              81m
crypto-svc-5846c467dc-ddqtn             4/4     Running                 0              81m
dashboardbff-svc-5d4857cdf7-npldc       2/2     Running                 0              81m
executor-svc-6fddc58f98-l6v9f           2/2     Running                 0              81m
executor-svc-6fddc58f98-smfrk           2/2     Running                 0              81m
executor-svc-6fddc58f98-zt8lv           2/2     Running                 0              81m
frontend-svc-5566c779c9-tzslx           1/1     Running                 0              81m
gateway-8567b6f75b-glbw2                1/1     Running                 0              81m
jobs-svc-86c9c4889f-hnrjk               1/1     Running                 0              81m
k10-grafana-78dc667986-4sfgz            0/1     Init:CrashLoopBackOff   10 (90s ago)   27m
kanister-svc-96f46bf89-fm9s2            1/1     Running                 0              81m
logging-svc-76c6d79878-9s49n            1/1     Running                 0              81m
metering-svc-79686c7689-qnrlw           1/1     Running                 0              81m
prometheus-server-7787b6d6dc-l8m7m      2/2     Running                 0              81m
state-svc-6c98bb5657-rvcvp              2/2     Running                 0              81m
 

I'm sharing the pod log below. Has this situation happened to anyone?

 

 oc logs --v=8 k10-grafana-78dc667986-4sfgz
I0405 16:32:21.126157  446944 loader.go:374] Config loaded from file:  /root/.kube/config
I0405 16:32:21.133823  446944 round_trippers.go:463] GET https://api.lab.nanobunkers.io:6443/api/v1/namespaces/kasten-io/pods/k10-grafana-78dc667986-4sfgz
I0405 16:32:21.133855  446944 round_trippers.go:469] Request Headers:
I0405 16:32:21.133866  446944 round_trippers.go:473]     Accept: application/json, */*
I0405 16:32:21.133881  446944 round_trippers.go:473]     User-Agent: oc/4.12.0 (linux/amd64) kubernetes/b05f7d4
I0405 16:32:21.146039  446944 round_trippers.go:574] Response Status: 200 OK in 12 milliseconds
I0405 16:32:21.146077  446944 round_trippers.go:577] Response Headers:
I0405 16:32:21.146085  446944 round_trippers.go:580]     Audit-Id: c0522355-682d-4a25-a6e3-243b9dbd4038
I0405 16:32:21.146093  446944 round_trippers.go:580]     Cache-Control: no-cache, private
I0405 16:32:21.146099  446944 round_trippers.go:580]     Content-Type: application/json
I0405 16:32:21.146111  446944 round_trippers.go:580]     Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
I0405 16:32:21.146120  446944 round_trippers.go:580]     X-Kubernetes-Pf-Flowschema-Uid: 19dd2e7f-a040-4c79-833a-8f06c46f2527
I0405 16:32:21.146129  446944 round_trippers.go:580]     X-Kubernetes-Pf-Prioritylevel-Uid: 02385814-62fe-40ac-bc33-6377b3ac8bed
I0405 16:32:21.146140  446944 round_trippers.go:580]     Date: Wed, 05 Apr 2023 22:32:21 GMT
I0405 16:32:21.146272  446944 request.go:1154] Response Body: {"kind":"Pod","apiVersion":"v1","metadata":{"name":"k10-grafana-78dc667986-4sfgz","generateName":"k10-grafana-78dc667986-","namespace":"kasten-io","uid":"c2688ece-718d-4382-a045-b57828c5c6ef","resourceVersion":"17399859","creationTimestamp":"2023-04-05T22:03:13Z","labels":{"app":"grafana","component":"grafana","pod-template-hash":"78dc667986","release":"k10"},"annotations":{"checksum/config":"0d520b80404b43fe4bd21c306c6272741b865eb1fa10c69d2b78197dec7ffa59","checksum/dashboards-json-config":"eae4f30f696bce8d7ea91891b68e4ab7657565d17ebd27c591066c09aca44e7f","checksum/sc-dashboard-provider-config":"01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b","checksum/secret":"31e08bbb0d21be7f8a9c1b778847e8d15f8d727a8a0c51946be1e3434580b4f1","k8s.v1.cni.cncf.io/network-status":"[{\n    \"name\": \"openshift-sdn\",\n    \"interface\": \"eth0\",\n    \"ips\": [\n        \"10.131.0.136\"\n    ],\n    \"default\": true,\n    \"dns\": {}\n}]","k8s.v1.cni.cncf.io/networks-status":"[{\n    \"name\": \"openshift-s [truncated 11853 chars]
Defaulted container "grafana" out of: grafana, init-chown-data (init), download-dashboards (init)
I0405 16:32:21.148709  446944 round_trippers.go:463] GET https://api.lab.nanobunkers.io:6443/api/v1/namespaces/kasten-io/pods/k10-grafana-78dc667986-4sfgz/log?container=grafana
I0405 16:32:21.148723  446944 round_trippers.go:469] Request Headers:
I0405 16:32:21.148736  446944 round_trippers.go:473]     Accept: application/json, */*
I0405 16:32:21.148746  446944 round_trippers.go:473]     User-Agent: oc/4.12.0 (linux/amd64) kubernetes/b05f7d4
I0405 16:32:21.159851  446944 round_trippers.go:574] Response Status: 400 Bad Request in 11 milliseconds
I0405 16:32:21.159879  446944 round_trippers.go:577] Response Headers:
I0405 16:32:21.159889  446944 round_trippers.go:580]     Audit-Id: 9be9af6b-213f-4284-b64b-a40cb0d76611
I0405 16:32:21.159898  446944 round_trippers.go:580]     Cache-Control: no-cache, private
I0405 16:32:21.159905  446944 round_trippers.go:580]     Content-Type: application/json
I0405 16:32:21.159916  446944 round_trippers.go:580]     Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
I0405 16:32:21.159926  446944 round_trippers.go:580]     Content-Length: 213
I0405 16:32:21.159933  446944 round_trippers.go:580]     Date: Wed, 05 Apr 2023 22:32:21 GMT
I0405 16:32:21.159964  446944 request.go:1154] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"container \"grafana\" in pod \"k10-grafana-78dc667986-4sfgz\" is waiting to start: PodInitializing","reason":"BadRequest","code":400}
I0405 16:32:21.160298  446944 helpers.go:246] server response object: [{
  "metadata": {},
  "status": "Failure",
  "message": "container \"grafana\" in pod \"k10-grafana-78dc667986-4sfgz\" is waiting to start: PodInitializing",
  "reason": "BadRequest",
  "code": 400
}]
Error from server (BadRequest): container "grafana" in pod "k10-grafana-78dc667986-4sfgz" is waiting to start: PodInitializing
 

 

@jaiganeshjk 


Can you try

kubectl describe pod k10-grafana-78dc667986-4sfgz -n kasten-io

just to see, at a higher level, what it is complaining about?
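
Also, since a plain oc logs call defaults to the grafana container (which hasn't started yet), it might help to pull the logs of the failing init container directly. Something like this, assuming the same pod and namespace:

oc logs k10-grafana-78dc667986-4sfgz -c init-chown-data -n kasten-io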

 

cheers


@Geoff Burke 

 

Thanks for answering. This is the output of the command:

 kubectl describe pod k10-grafana-78dc667986-4sfgz -n kasten-io
 

 

 

Name:             k10-grafana-78dc667986-4sfgz
Namespace:        kasten-io
Priority:         0
Service Account:  k10-grafana
Node:             ocp-w-1.lab.dominio.io/192.168.22.211
Start Time:       Wed, 05 Apr 2023 16:03:13 -0600
Labels:           app=grafana
                  component=grafana
                  pod-template-hash=78dc667986
                  release=k10
Annotations:      checksum/config: 0d520b80404b43fe4bd21c306c6272741b865eb1fa10c69d2b78197dec7ffa59
                  checksum/dashboards-json-config: eae4f30f696bce8d7ea91891b68e4ab7657565d17ebd27c591066c09aca44e7f
                  checksum/sc-dashboard-provider-config: 01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b
                  checksum/secret: 31e08bbb0d21be7f8a9c1b778847e8d15f8d727a8a0c51946be1e3434580b4f1
                  k8s.v1.cni.cncf.io/network-status:
                    [{
                        "name": "openshift-sdn",
                        "interface": "eth0",
                        "ips":
                            "10.131.0.136"
                        ],
                        "default": true,
                        "dns": {}
                    }]
                  k8s.v1.cni.cncf.io/networks-status:
                    [{
                        "name": "openshift-sdn",
                        "interface": "eth0",
                        "ips":
                            "10.131.0.136"
                        ],
                        "default": true,
                        "dns": {}
                    }]
                  openshift.io/scc: k10-grafana
Status:           Pending
IP:               10.131.0.136
IPs:
  IP:           10.131.0.136
Controlled By:  ReplicaSet/k10-grafana-78dc667986
Init Containers:
  init-chown-data:
    Container ID:  cri-o://1edf02413f8821c2832e542fec8380c1fd85525d66a1061f86c229d2e35451c9
    Image:         gcr.io/kasten-images/init:5.5.7
    Image ID:      gcr.io/kasten-images/init@sha256:bd844aa544aad9e8447f7f58553b0ef2ecee1f2e3e5e73b0b479f6dd096b02dc
    Port:          <none>
    Host Port:     <none>
    Command:
      chown
      -R
      472:472
      /var/lib/grafana
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Wed, 05 Apr 2023 18:02:06 -0600
      Finished:     Wed, 05 Apr 2023 18:02:06 -0600
    Ready:          False
    Restart Count:  28
    Environment:    <none>
    Mounts:
      /var/lib/grafana from storage (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-d78cd (ro)
  download-dashboards:
    Container ID:
    Image:         gcr.io/kasten-images/init:5.5.7
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/sh
    Args:
      -c
      mkdir -p /var/lib/grafana/dashboards/default && /bin/sh -x /etc/grafana/download_dashboards.sh
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /etc/grafana/download_dashboards.sh from config (rw,path="download_dashboards.sh")
      /var/lib/grafana from storage (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-d78cd (ro)
Containers:
  grafana:
    Container ID:
    Image:          gcr.io/kasten-images/grafana:5.5.7
    Image ID:
    Ports:          80/TCP, 3000/TCP
    Host Ports:     0/TCP, 0/TCP
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Liveness:       http-get http://:3000/api/health delay=60s timeout=30s period=10s #success=1 #failure=10
    Readiness:      http-get http://:3000/api/health delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      GF_SECURITY_ADMIN_USER:      <set to the key 'admin-user' in secret 'k10-grafana'>      Optional: false
      GF_SECURITY_ADMIN_PASSWORD:  <set to the key 'admin-password' in secret 'k10-grafana'>  Optional: false
      GF_PATHS_DATA:               /var/lib/grafana/
      GF_PATHS_LOGS:               /var/log/grafana
      GF_PATHS_PLUGINS:            /var/lib/grafana/plugins
      GF_PATHS_PROVISIONING:       /etc/grafana/provisioning
    Mounts:
      /etc/grafana/grafana.ini from config (rw,path="grafana.ini")
      /etc/grafana/provisioning/dashboards/dashboardproviders.yaml from config (rw,path="dashboardproviders.yaml")
      /etc/grafana/provisioning/datasources/datasources.yaml from config (rw,path="datasources.yaml")
      /var/lib/grafana from storage (rw)
      /var/lib/grafana/dashboards/default/default.json from dashboards-default (rw,path="default.json")
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-d78cd (ro)
Conditions:
  Type              Status
  Initialized       False
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      k10-grafana
    Optional:  false
  dashboards-default:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      k10-grafana-dashboards-default
    Optional:  false
  storage:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  k10-grafana
    ReadOnly:   false
  kube-api-access-d78cd:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
    ConfigMapName:           openshift-service-ca.crt
    ConfigMapOptional:       <nil>
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason          Age                     From               Message
  ----     ------          ----                    ----               -------
  Normal   Scheduled       119m                    default-scheduler  Successfully assigned kasten-io/k10-grafana-78dc667986-4sfgz to ocp-w-1.lab.dominio.io
  Normal   AddedInterface  119m                    multus             Add eth0 [10.131.0.136/23] from openshift-sdn
  Normal   Pulled          117m (x5 over 119m)     kubelet            Container image "gcr.io/kasten-images/init:5.5.7" already present on machine
  Normal   Created         117m (x5 over 119m)     kubelet            Created container init-chown-data
  Normal   Started         117m (x5 over 119m)     kubelet            Started container init-chown-data
  Warning  BackOff         4m15s (x532 over 119m)  kubelet            Back-off restarting failed container
 


This is OpenShift, so I am not really in my neck of the woods, but I believe the Grafana pod is the last to deploy in the K10 deployment. Just as a quick out-of-the-box check, see if you are running out of resources on the nodes, maybe:

oc adm top node

and I think

oc describe node nodename

My amateur eye tells me the issues start at init-chown-data (so it is changing ownership?), so check that the PVC is OK as well.
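
Something along these lines should show whether the Grafana claim actually bound and what it is backed by (the claim name is the one from the describe output above):

oc get pvc k10-grafana -n kasten-io
oc describe pvc k10-grafana -n kasten-io
oc get pv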

I know the Kasten support folks here know OpenShift really well, so I am sure they will chime in and help out as well.

 

cheers


 

@Geoff Burke Thank you for your comments. This is the data that the command returns. I am using NFS storage; it could be the problem. The other pods do take the PV.


Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests      Limits
  --------           --------      ------
  cpu                776m (5%)     0 (0%)
  memory             4412Mi (34%)  1700Mi (13%)
  ephemeral-storage  0 (0%)        0 (0%)
  hugepages-1Gi      0 (0%)        0 (0%)
  hugepages-2Mi      0 (0%)        0 (0%)
Events:              <none>
 


Are you able to deploy other workloads to the NFS without issues?

 

Might seem like a silly question, but I actually had a problem like this once, and I had checked everything else first.
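
If you want a quick way to test it, a throwaway PVC plus a pod that writes to it usually tells you right away. A rough sketch (the names are made up, and storageClassName should match whatever class the k10-grafana PVC uses):

cat <<EOF | oc apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-write-test
  namespace: kasten-io
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 1Gi
  storageClassName: my-nfs-class   # placeholder: use the class backing the k10-grafana PVC
---
apiVersion: v1
kind: Pod
metadata:
  name: nfs-write-test
  namespace: kasten-io
spec:
  containers:
  - name: writer
    image: busybox
    command: ["sh", "-c", "echo hello > /data/test-file && sleep 3600"]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: nfs-write-test
EOF

If the pod starts and /data/test-file shows up on the NFS export, the basic provisioning and write path is fine, and the problem is more likely the chown the init container is trying to do.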

 

 


@Geoff Burke 

Good morning. The pods are already up. The comments gave me an idea and I investigated: it turns out the NFS server side was not configured correctly. Specifically, the exports file was missing parameters. It now looks as follows:

/volk10-1 *(rw,sync,no_root_squash)

Thank you very much for the comments, they are very helpful.

 

NAME                                    READY   STATUS    RESTARTS   AGE
aggregatedapis-svc-64c55c6979-6br5z     1/1     Running   0          16m
auth-svc-79c7944dcb-nr5w7               1/1     Running   0          16m
catalog-svc-6548c9f45d-hl69t            2/2     Running   0          16m
controllermanager-svc-cd748cdff-465js   1/1     Running   0          16m
crypto-svc-5846c467dc-48nbd             4/4     Running   0          16m
dashboardbff-svc-5d4857cdf7-hv5vz       2/2     Running   0          16m
executor-svc-6fddc58f98-jndjk           2/2     Running   0          16m
executor-svc-6fddc58f98-lvh9m           2/2     Running   0          16m
executor-svc-6fddc58f98-nztx5           2/2     Running   0          16m
frontend-svc-5566c779c9-kz4q5           1/1     Running   0          16m
gateway-8567b6f75b-q6m58                1/1     Running   0          16m
jobs-svc-86c9c4889f-7xrw7               1/1     Running   0          16m
k10-grafana-678bbc5cbc-5s2k9            1/1     Running   0          16m
kanister-svc-96f46bf89-mkww8            1/1     Running   0          16m
logging-svc-76c6d79878-5tmq4            1/1     Running   0          16m
metering-svc-79686c7689-7lss5           1/1     Running   0          16m
prometheus-server-7787b6d6dc-f5sw5      2/2     Running   0          16m
state-svc-6c98bb5657-8c5qf              2/2     Running   0          16m
[root@ocp-svc k10]#
 




Ah yes.. the fact that some of those other pods have persistent storage 🙂 I should have thought of that 🤣 Oh well, glad that helped. Was it because of permissions on the storage?

 

cheers


 

It was the configuration file /etc/exports; I had missed adding the flag "no_root_squash". I restarted the NFS service, cleaned up the environment, and brought everything up again from scratch, and this time the pod started.
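
For anyone who hits the same thing, the rough sequence on the NFS server side looks something like this (only the /volk10-1 line is from my setup; the service name assumes a systemd-based NFS server, adjust for your distro):

# /etc/exports
/volk10-1 *(rw,sync,no_root_squash)

exportfs -ra                   # re-read /etc/exports and re-export, or
systemctl restart nfs-server   # restart the NFS service
exportfs -v                    # verify the options actually applied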

 

drwxrwxrwx   5 root             root    90 Apr  6 18:20 volk10-1
drwxrwxrwx   9              472    472 159 Apr  6 18:20 volk10-2
drwxrwxrwx   8 root             root   176 Apr  6 18:11 volk10-3
 

volk10-2 is the export where the pod changed the owner; that is the one Grafana uses, and the init-chown-data container did it during the deployment (chown -R 472:472, Grafana's UID/GID), which only works once no_root_squash lets the pod's root user change ownership on the export.

