Question

Kasten pods do not start

  • 23 November 2022
  • 6 comments
  • 559 views


Hi! I just deployed Kasten and some pods never start. I ran the preflight checks and everything was fine; the only warning was about using generic storage, so I installed with the Helm flag “--set injectKanisterSidecar.enabled=true”.
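For context, the full install command looked roughly like this (the “kasten” Helm repo alias and chart name are the standard ones from the K10 install docs):

helm install k10 kasten/k10 --namespace=kasten-io \
  --set injectKanisterSidecar.enabled=true
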

NAME                                     READY   STATUS    RESTARTS   AGE
aggregatedapis-svc-8557786dbd-npr5j      1/1     Running   0          31m
auth-svc-6c4fcdc8c9-mg4fn                1/1     Running   0          31m
catalog-svc-574b4fd998-6pwnf             0/2     Pending   0          31m
controllermanager-svc-797dbd58bc-b78nj   1/1     Running   0          31m
crypto-svc-84b454686c-957jb              4/4     Running   0          31m
dashboardbff-svc-5bfc458dcc-xk29l        1/1     Running   0          31m
executor-svc-6c59c87c77-4xmj2            2/2     Running   0          31m
executor-svc-6c59c87c77-rq5mc            2/2     Running   0          31m
executor-svc-6c59c87c77-s72x4            2/2     Running   0          31m
frontend-svc-68dcf99f46-5mwdt            1/1     Running   0          31m
gateway-67c7ccdf5c-wttvg                 1/1     Running   0          31m
jobs-svc-54f597c676-9qtpv                0/1     Pending   0          31m
k10-grafana-7c57cf7464-bb585             0/1     Pending   0          31m
kanister-svc-5d798c6cd8-rg7gh            1/1     Running   0          31m
logging-svc-c47544bf-2lbmk               0/1     Pending   0          31m
metering-svc-85846559c4-hdbht            0/1     Pending   0          31m
prometheus-server-849b9ddbb9-25smp       0/2     Pending   0          31m
state-svc-756b856969-k7hwz               2/2     Running   0          31m
 

Any ideas?


6 comments


Hello @gerardotapianqn, can you describe one of these pending pods and share the output?
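
For example, for one of the pending pods from your list:

kubectl describe pod jobs-svc-54f597c676-9qtpv -n kasten-io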

 

Thanks

Ahmed Hagag


Hello Hagag! Here is the output:

 

Name:             jobs-svc-54f597c676-9qtpv
Namespace:        kasten-io
Priority:         0
Service Account:  k10-k10
Node:             <none>
Labels:           app=k10
                  app.kubernetes.io/instance=k10
                  app.kubernetes.io/managed-by=Helm
                  app.kubernetes.io/name=k10
                  component=jobs
                  helm.sh/chart=k10-5.5.1
                  heritage=Helm
                  pod-template-hash=54f597c676
                  release=k10
                  run=jobs-svc
Annotations:      checksum/config: 0b5a4973e7cf2294eb3fa4922bad8db43b0b5729ca490b2e441be3a973ef5067
                  checksum/frontend-nginx-config: 4ef0c228905a86dc1f5b29d324e7e41b980254f990587ddc32d6a069e0ca2915
                  checksum/secret: 545c38b0922de19734fbffde62792c37c2aef6a3216cfa472449173165220f7d
Status:           Pending
IP:
IPs:              <none>
Controlled By:    ReplicaSet/jobs-svc-54f597c676
Init Containers:
  upgrade-init:
    Image:      gcr.io/kasten-images/upgrade:5.5.1
    Port:       <none>
    Host Port:  <none>
    Environment:
      MODEL_STORE_DIR:  <set to the key 'modelstoredirname' of config map 'k10-config'>  Optional: false
    Mounts:
      /mnt/k10state from jobs-persistent-storage (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-kzl8f (ro)
Containers:
  jobs-svc:
    Image:      gcr.io/kasten-images/jobs:5.5.1
    Port:       8000/TCP
    Host Port:  0/TCP
    Requests:
      cpu:      30m
      memory:   380Mi
    Liveness:   http-get http://:8000/v0/healthz delay=300s timeout=1s period=10s #success=1 #failure=3
    Readiness:  http-get http://:8000/v0/healthz delay=3s timeout=1s period=10s #success=1 #failure=3
    Environment:
      VERSION:                                        <set to the key 'version' of config map 'k10-config'>            Optional: false
      MODEL_STORE_DIR:                                <set to the key 'modelstoredirname' of config map 'k10-config'>  Optional: false
      LOG_LEVEL:                                      <set to the key 'loglevel' of config map 'k10-config'>           Optional: false
      POD_NAMESPACE:                                  kasten-io (v1:metadata.namespace)
      CONCURRENT_SNAP_CONVERSIONS:                    <set to the key 'concurrentSnapConversions' of config map 'k10-config'>               Optional: false
      CONCURRENT_WORKLOAD_SNAPSHOTS:                  <set to the key 'concurrentWorkloadSnapshots' of config map 'k10-config'>             Optional: false
      K10_DATA_STORE_PARALLEL_UPLOAD:                 <set to the key 'k10DataStoreParallelUpload' of config map 'k10-config'>              Optional: false
      K10_DATA_STORE_GENERAL_CONTENT_CACHE_SIZE_MB:   <set to the key 'k10DataStoreGeneralContentCacheSizeMB' of config map 'k10-config'>   Optional: false
      K10_DATA_STORE_GENERAL_METADATA_CACHE_SIZE_MB:  <set to the key 'k10DataStoreGeneralMetadataCacheSizeMB' of config map 'k10-config'>  Optional: false
      K10_DATA_STORE_RESTORE_CONTENT_CACHE_SIZE_MB:   <set to the key 'k10DataStoreRestoreContentCacheSizeMB' of config map 'k10-config'>   Optional: false
      K10_DATA_STORE_RESTORE_METADATA_CACHE_SIZE_MB:  <set to the key 'k10DataStoreRestoreMetadataCacheSizeMB' of config map 'k10-config'>  Optional: false
      K10_LIMITER_GENERIC_VOLUME_SNAPSHOTS:           <set to the key 'K10LimiterGenericVolumeSnapshots' of config map 'k10-config'>        Optional: false
      K10_LIMITER_GENERIC_VOLUME_COPIES:              <set to the key 'K10LimiterGenericVolumeCopies' of config map 'k10-config'>           Optional: false
      K10_LIMITER_GENERIC_VOLUME_RESTORES:            <set to the key 'K10LimiterGenericVolumeRestores' of config map 'k10-config'>         Optional: false
      K10_LIMITER_CSI_SNAPSHOTS:                      <set to the key 'K10LimiterCsiSnapshots' of config map 'k10-config'>                  Optional: false
      K10_LIMITER_PROVIDER_SNAPSHOTS:                 <set to the key 'K10LimiterProviderSnapshots' of config map 'k10-config'>             Optional: false
      AWS_ASSUME_ROLE_DURATION:                       <set to the key 'AWSAssumeRoleDuration' of config map 'k10-config'>                   Optional: false
      K10_RELEASE_NAME:                               k10
      KANISTER_FUNCTION_VERSION:                      <set to the key 'kanisterFunctionVersion' of config map 'k10-config'>  Optional: false
    Mounts:
      /mnt/k10state from jobs-persistent-storage (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-kzl8f (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  jobs-persistent-storage:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  jobs-pv-claim
    ReadOnly:   false
  kube-api-access-kzl8f:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age                  From               Message
  ----     ------            ----                 ----               -------
  Warning  FailedScheduling  38s (x152 over 12h)  default-scheduler  0/3 nodes are available: 3 pod has unbound immediate PersistentVolumeClaims. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling.


Name:             k10-grafana-7c57cf7464-bb585
Namespace:        kasten-io
Priority:         0
Service Account:  k10-grafana
Node:             <none>
Labels:           app=grafana
                  component=grafana
                  pod-template-hash=7c57cf7464
                  release=k10
Annotations:      checksum/config: 0d520b80404b43fe4bd21c306c6272741b865eb1fa10c69d2b78197dec7ffa59
                  checksum/dashboards-json-config: b1d3a8c25f5fc2d516af2914ecc796b6e451627e352234c80fa81cc422f2c372
                  checksum/sc-dashboard-provider-config: 01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b
                  checksum/secret: 842b974ff80da56e4188d2e2cc946517195291b069ac275da114083799fadc26
Status:           Pending
IP:
IPs:              <none>
Controlled By:    ReplicaSet/k10-grafana-7c57cf7464
Init Containers:
  init-chown-data:
    Image:      registry.access.redhat.com/ubi8/ubi-minimal:8.7-923
    Port:       <none>
    Host Port:  <none>
    Command:
      chown
      -R
      472:472
      /var/lib/grafana
    Environment:  <none>
    Mounts:
      /var/lib/grafana from storage (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-klmtp (ro)
  download-dashboards:
    Image:      registry.access.redhat.com/ubi8/ubi-minimal:8.7-923
    Port:       <none>
    Host Port:  <none>
    Command:
      /bin/sh
    Args:
      -c
      mkdir -p /var/lib/grafana/dashboards/default && /bin/sh -x /etc/grafana/download_dashboards.sh
    Environment:  <none>
    Mounts:
      /etc/grafana/download_dashboards.sh from config (rw,path="download_dashboards.sh")
      /var/lib/grafana from storage (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-klmtp (ro)
Containers:
  grafana:
    Image:       grafana/grafana:9.1.5
    Ports:       80/TCP, 3000/TCP
    Host Ports:  0/TCP, 0/TCP
    Liveness:    http-get http://:3000/api/health delay=60s timeout=30s period=10s #success=1 #failure=10
    Readiness:   http-get http://:3000/api/health delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      GF_SECURITY_ADMIN_USER:      <set to the key 'admin-user' in secret 'k10-grafana'>      Optional: false
      GF_SECURITY_ADMIN_PASSWORD:  <set to the key 'admin-password' in secret 'k10-grafana'>  Optional: false
      GF_PATHS_DATA:               /var/lib/grafana/
      GF_PATHS_LOGS:               /var/log/grafana
      GF_PATHS_PLUGINS:            /var/lib/grafana/plugins
      GF_PATHS_PROVISIONING:       /etc/grafana/provisioning
    Mounts:
      /etc/grafana/grafana.ini from config (rw,path="grafana.ini")
      /etc/grafana/provisioning/dashboards/dashboardproviders.yaml from config (rw,path="dashboardproviders.yaml")
      /etc/grafana/provisioning/datasources/datasources.yaml from config (rw,path="datasources.yaml")
      /var/lib/grafana from storage (rw)
      /var/lib/grafana/dashboards/default/default.json from dashboards-default (rw,path="default.json")
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-klmtp (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      k10-grafana
    Optional:  false
  dashboards-default:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      k10-grafana-dashboards-default
    Optional:  false
  storage:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  k10-grafana
    ReadOnly:   false
  kube-api-access-klmtp:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age                  From               Message
  ----     ------            ----                 ----               -------
  Warning  FailedScheduling  68s (x152 over 12h)  default-scheduler  0/3 nodes are available: 3 pod has unbound immediate PersistentVolumeClaims. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling.


Name:             logging-svc-c47544bf-2lbmk
Namespace:        kasten-io
Priority:         0
Service Account:  k10-k10
Node:             <none>
Labels:           app=k10
                  app.kubernetes.io/instance=k10
                  app.kubernetes.io/managed-by=Helm
                  app.kubernetes.io/name=k10
                  component=logging
                  helm.sh/chart=k10-5.5.1
                  heritage=Helm
                  pod-template-hash=c47544bf
                  release=k10
                  run=logging-svc
Annotations:      checksum/config: 0b5a4973e7cf2294eb3fa4922bad8db43b0b5729ca490b2e441be3a973ef5067
                  checksum/frontend-nginx-config: 4ef0c228905a86dc1f5b29d324e7e41b980254f990587ddc32d6a069e0ca2915
                  checksum/secret: 545c38b0922de19734fbffde62792c37c2aef6a3216cfa472449173165220f7d
Status:           Pending
IP:
IPs:              <none>
Controlled By:    ReplicaSet/logging-svc-c47544bf
Init Containers:
  upgrade-init:
    Image:      gcr.io/kasten-images/upgrade:5.5.1
    Port:       <none>
    Host Port:  <none>
    Environment:
      MODEL_STORE_DIR:  <set to the key 'modelstoredirname' of config map 'k10-config'>  Optional: false
    Mounts:
      /mnt/k10state from logging-persistent-storage (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-kj7b9 (ro)
Containers:
  logging-svc:
    Image:       gcr.io/kasten-images/logging:5.5.1
    Ports:       8000/TCP, 24224/TCP, 24225/TCP
    Host Ports:  0/TCP, 0/TCP, 0/TCP
    Requests:
      cpu:      2m
      memory:   40Mi
    Liveness:   http-get http://:8000/v0/healthz delay=300s timeout=1s period=10s #success=1 #failure=3
    Readiness:  http-get http://:8000/v0/healthz delay=3s timeout=1s period=10s #success=1 #failure=3
    Environment:
      VERSION:                                        <set to the key 'version' of config map 'k10-config'>            Optional: false
      MODEL_STORE_DIR:                                <set to the key 'modelstoredirname' of config map 'k10-config'>  Optional: false
      LOG_LEVEL:                                      <set to the key 'loglevel' of config map 'k10-config'>           Optional: false
      POD_NAMESPACE:                                  kasten-io (v1:metadata.namespace)
      CONCURRENT_SNAP_CONVERSIONS:                    <set to the key 'concurrentSnapConversions' of config map 'k10-config'>               Optional: false
      CONCURRENT_WORKLOAD_SNAPSHOTS:                  <set to the key 'concurrentWorkloadSnapshots' of config map 'k10-config'>             Optional: false
      K10_DATA_STORE_PARALLEL_UPLOAD:                 <set to the key 'k10DataStoreParallelUpload' of config map 'k10-config'>              Optional: false
      K10_DATA_STORE_GENERAL_CONTENT_CACHE_SIZE_MB:   <set to the key 'k10DataStoreGeneralContentCacheSizeMB' of config map 'k10-config'>   Optional: false
      K10_DATA_STORE_GENERAL_METADATA_CACHE_SIZE_MB:  <set to the key 'k10DataStoreGeneralMetadataCacheSizeMB' of config map 'k10-config'>  Optional: false
      K10_DATA_STORE_RESTORE_CONTENT_CACHE_SIZE_MB:   <set to the key 'k10DataStoreRestoreContentCacheSizeMB' of config map 'k10-config'>   Optional: false
      K10_DATA_STORE_RESTORE_METADATA_CACHE_SIZE_MB:  <set to the key 'k10DataStoreRestoreMetadataCacheSizeMB' of config map 'k10-config'>  Optional: false
      K10_LIMITER_GENERIC_VOLUME_SNAPSHOTS:           <set to the key 'K10LimiterGenericVolumeSnapshots' of config map 'k10-config'>        Optional: false
      K10_LIMITER_GENERIC_VOLUME_COPIES:              <set to the key 'K10LimiterGenericVolumeCopies' of config map 'k10-config'>           Optional: false
      K10_LIMITER_GENERIC_VOLUME_RESTORES:            <set to the key 'K10LimiterGenericVolumeRestores' of config map 'k10-config'>         Optional: false
      K10_LIMITER_CSI_SNAPSHOTS:                      <set to the key 'K10LimiterCsiSnapshots' of config map 'k10-config'>                  Optional: false
      K10_LIMITER_PROVIDER_SNAPSHOTS:                 <set to the key 'K10LimiterProviderSnapshots' of config map 'k10-config'>             Optional: false
      AWS_ASSUME_ROLE_DURATION:                       <set to the key 'AWSAssumeRoleDuration' of config map 'k10-config'>                   Optional: false
      K10_RELEASE_NAME:                               k10
      KANISTER_FUNCTION_VERSION:                      <set to the key 'kanisterFunctionVersion' of config map 'k10-config'>  Optional: false
    Mounts:
      /mnt/conf from logging-configmap-storage (rw)
      /mnt/k10state from logging-persistent-storage (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-kj7b9 (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  logging-persistent-storage:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  logging-pv-claim
    ReadOnly:   false
  logging-configmap-storage:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      fluentbit-configmap
    Optional:  false
  kube-api-access-kj7b9:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age                  From               Message
  ----     ------            ----                 ----               -------
  Warning  FailedScheduling  38s (x152 over 12h)  default-scheduler  0/3 nodes are available: 3 pod has unbound immediate PersistentVolumeClaims. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling.


Name:             metering-svc-85846559c4-hdbht
Namespace:        kasten-io
Priority:         0
Service Account:  k10-k10
Node:             <none>
Labels:           app=k10
                  app.kubernetes.io/instance=k10
                  app.kubernetes.io/managed-by=Helm
                  app.kubernetes.io/name=k10
                  component=metering
                  helm.sh/chart=k10-5.5.1
                  heritage=Helm
                  pod-template-hash=85846559c4
                  release=k10
                  run=metering-svc
Annotations:      checksum/config: 0b5a4973e7cf2294eb3fa4922bad8db43b0b5729ca490b2e441be3a973ef5067
                  checksum/secret: 545c38b0922de19734fbffde62792c37c2aef6a3216cfa472449173165220f7d
Status:           Pending
IP:
IPs:              <none>
Controlled By:    ReplicaSet/metering-svc-85846559c4
Init Containers:
  upgrade-init:
    Image:      gcr.io/kasten-images/upgrade:5.5.1
    Port:       <none>
    Host Port:  <none>
    Environment:
      MODEL_STORE_DIR:  /var/reports/
    Mounts:
      /var/reports/ from metering-persistent-storage (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-mxz66 (ro)
Containers:
  metering-svc:
    Image:      gcr.io/kasten-images/metering:5.5.1
    Port:       8000/TCP
    Host Port:  0/TCP
    Liveness:   http-get http://:8000/v0/healthz delay=90s timeout=1s period=10s #success=1 #failure=3
    Environment:
      VERSION:                       <set to the key 'version' of config map 'k10-config'>   Optional: false
      LOG_LEVEL:                     <set to the key 'loglevel' of config map 'k10-config'>  Optional: false
      POD_NAMESPACE:                 kasten-io (v1:metadata.namespace)
      AGENT_CONFIG_FILE:             /var/ubbagent/config.yaml
      AGENT_STATE_DIR:               /var/reports/ubbagent
      K10_REPORT_COLLECTION_PERIOD:  1800
      K10_REPORT_PUSH_PERIOD:        3600
    Mounts:
      /var/reports/ from metering-persistent-storage (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-mxz66 (ro)
      /var/ubbagent from meter-config (rw)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  meter-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      k10-k10-metering-config
    Optional:  false
  metering-persistent-storage:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  metering-pv-claim
    ReadOnly:   false
  kube-api-access-mxz66:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age                  From               Message
  ----     ------            ----                 ----               -------
  Warning  FailedScheduling  38s (x152 over 12h)  default-scheduler  0/3 nodes are available: 3 pod has unbound immediate PersistentVolumeClaims. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling.


Name:             prometheus-server-849b9ddbb9-25smp
Namespace:        kasten-io
Priority:         0
Service Account:  prometheus-server
Node:             <none>
Labels:           app=prometheus
                  chart=prometheus-15.8.5
                  component=server
                  heritage=Helm
                  pod-template-hash=849b9ddbb9
                  release=k10
Annotations:      <none>
Status:           Pending
IP:
IPs:              <none>
Controlled By:    ReplicaSet/prometheus-server-849b9ddbb9
Containers:
  prometheus-server-configmap-reload:
    Image:      jimmidyson/configmap-reload:v0.5.0
    Port:       <none>
    Host Port:  <none>
    Args:
      --volume-dir=/etc/config
      --webhook-url=http://127.0.0.1:9090/k10/prometheus/-/reload
    Environment:  <none>
    Mounts:
      /etc/config from config-volume (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-hsdlh (ro)
  prometheus-server:
    Image:      quay.io/prometheus/prometheus:v2.34.0
    Port:       9090/TCP
    Host Port:  0/TCP
    Args:
      --storage.tsdb.retention.time=30d
      --config.file=/etc/config/prometheus.yml
      --storage.tsdb.path=/data
      --web.console.libraries=/etc/prometheus/console_libraries
      --web.console.templates=/etc/prometheus/consoles
      --web.enable-lifecycle
      --web.route-prefix=/k10/prometheus
      --web.external-url=/k10/prometheus/
    Liveness:     http-get http://:9090/k10/prometheus/-/healthy delay=30s timeout=10s period=15s #success=1 #failure=3
    Readiness:    http-get http://:9090/k10/prometheus/-/ready delay=30s timeout=4s period=5s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /data from storage-volume (rw)
      /etc/config from config-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-hsdlh (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      k10-k10-prometheus-config
    Optional:  false
  storage-volume:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  prometheus-server
    ReadOnly:   false
  kube-api-access-hsdlh:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age                  From               Message
  ----     ------            ----                 ----               -------
  Warning  FailedScheduling  38s (x152 over 12h)  default-scheduler  0/3 nodes are available: 3 pod has unbound immediate PersistentVolumeClaims. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling.


@gerardotapianqn, do you have preemption enabled in your cluster? I see the error below:

default-scheduler  0/3 nodes are available: 3 pod has unbound immediate PersistentVolumeClaims. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling.
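
Also, since the message points at unbound immediate PersistentVolumeClaims, it is worth checking the claims and the storage classes directly:

kubectl get pvc -n kasten-io
kubectl get storageclass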

 

It seems the pending pods are in the scheduling queue, waiting to be scheduled. They will stay in that queue until sufficient resources are free and they can be scheduled.

The scheduler picks a Pod from the queue and tries to schedule it on a Node. If no Node is found that satisfies all the specified requirements of the Pod, preemption logic is triggered for the pending Pod.

Can you share the output of the following command?


kubectl get priorityclass


Yes, here you go:

 

 


@Hagag, any ideas about this?


@gerardotapianqn, I see you have set a PriorityClass with preemption enabled. The priority level of a PriorityClass is used by the Kubernetes scheduler to determine the order in which pods are scheduled. If not enough resources are available to schedule all pods, the scheduler prioritizes higher-priority pods over lower-priority ones.

That is why some of your K10 pods are in a pending state. Could you make sure the cluster has enough resources for the Kubernetes scheduler to place the remaining pods?
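
For example, to check allocatable versus requested resources per node:

kubectl describe nodes   # look at the "Allocated resources" section for each node
kubectl top nodes        # requires metrics-server to be installed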

Alternatively, you can disable preemption and the priority class if they are not needed; see the sketch below.
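
If you go that route, here is a minimal sketch of a non-preempting PriorityClass (the name and value are just examples, not anything from your cluster):

kubectl apply -f - <<EOF
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: k10-no-preempt   # hypothetical name, pick your own
value: 1000000           # example priority value
preemptionPolicy: Never  # pods in this class never preempt running pods
globalDefault: false
description: "Example non-preempting priority class"
EOF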

 
