
Hi,

I want to test Kasten to see if it will fit in my future k8s cluster, but I can’t access the dashboard, and I see that the gateway pod keeps crashing and restarting:

gateway-5765cd558f-b5zbs                 0/1     CrashLoopBackOff   7 (4m32s ago)   11m

When I run:

kubectl describe pod gateway-5765cd558f-b5zbs -n kasten-io

in the events I see this:

Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Created    17m (x2 over 17m)     kubelet            Created container ambassador
  Normal   Started    17m (x2 over 17m)     kubelet            Started container ambassador
  Normal   Killing    16m (x2 over 17m)     kubelet            Container ambassador failed liveness probe, will be restarted
  Normal   Pulled     16m (x3 over 17m)     kubelet            Container image "gcr.io/kasten-images/emissary:6.5.5" already present on machine
  Normal   Scheduled  15m                   default-scheduler  Successfully assigned kasten-io/gateway-5765cd558f-b5zbs to k8s-worker02
  Warning  Unhealthy  12m (x19 over 17m)    kubelet            Liveness probe failed: HTTP probe failed with statuscode: 503
  Warning  BackOff    8m5s (x24 over 14m)   kubelet            Back-off restarting failed container ambassador in pod gateway-5765cd558f-b5zbs_kasten-io(c19265b0-cf49-4779-8597-554dbb06b655)
  Warning  Unhealthy  2m32s (x45 over 17m)  kubelet            Readiness probe failed: HTTP probe failed with statuscode: 503

And when I run:

kubectl -n kasten-io logs gateway-5765cd558f-b5zbs

I see a lot of messages like these:

https://pastebin.com/ETJXrirt

I understand that it is looking for an AMBASSADOR container, but where is this container?

All the other pods are running fine and never crash or restart.

Thanks for your help.

In our experience, this is caused by the Ambassador container exhausting the available file descriptors on the node it is scheduled on; see Too many open files · Issue #1317 · emissary-ingress/emissary · GitHub for details.
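To confirm this, you can compare the node's file descriptor limits against actual usage directly on the node where the gateway pod is scheduled (standard Linux commands):

# per-process soft and hard limits for the current shell
ulimit -Sn
ulimit -Hn

# system-wide: allocated file handles, unused handles, and the maximum
cat /proc/sys/fs/file-nr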

Make sure to set a higher file descriptor limit on your nodes, e.g. by appending the following lines to /etc/security/limits.conf:

*    soft    nofile    1048576
*    hard    nofile    1048576
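To later confirm that the container actually picked up the higher limit, you can check from inside the running gateway pod, for example:

kubectl -n kasten-io exec deploy/gateway -- sh -c 'ulimit -n'

(This assumes the Deployment is named gateway and that the emissary image ships a shell; on very minimal images the exec may fail.)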

After that, kill the pod so that it is re-created and the new file descriptor limit takes effect. If the issue persists, try scheduling the pod on other nodes that have sufficient file descriptors available.
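For example, deleting the current gateway pod (name taken from your describe output above) lets its Deployment recreate it:

kubectl -n kasten-io delete pod gateway-5765cd558f-b5zbs

The replacement pod should eventually show 1/1 Ready in kubectl -n kasten-io get pods.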


@jaiganeshjk 

