Solved

Problem installing k10 - helm

2 years ago
January 19, 2023
3 comments
317 views

stefanodemartini
Not a newbie anymore
1 comment

Hello, I'm trying to install K10 on an on-premises k8s cluster.

Kubernetes v1.25.1 with 4 worker nodes and 3 control-plane nodes.
The StorageClass is provided by the Rook-Ceph filesystem (CephFS), and the Rook-Ceph environment is at the latest release.
In short, everything seems to work fine, including the pre-flight checks, but after installing with the command below, several pods fail to start:

helm install my-k10 kasten/k10 --version 5.5.3

kubectl get pods

aggregatedapis-svc-9b5775d64-db4bj       1/1     Running                 0               23m
auth-svc-cf4646c89-vwjm5                 1/1     Running                 0               23m
catalog-svc-5bddbd67b5-dm7rs             0/2     Init:CrashLoopBackOff   9 (2m29s ago)   23m
controllermanager-svc-78b8fb4bf7-td5z9   1/1     Running                 0               23m
crypto-svc-7f4ff8b479-wzvck              4/4     Running                 0               23m
dashboardbff-svc-d87fc5bc-fmr7h          2/2     Running                 0               23m
executor-svc-7f5dc6c874-2858w            2/2     Running                 0               23m
executor-svc-7f5dc6c874-5cqjg            2/2     Running                 0               23m
executor-svc-7f5dc6c874-kvsz8            2/2     Running                 0               23m
frontend-svc-c69bf6fb6-vnznl             1/1     Running                 0               23m
gateway-9dd654864-s7mq6                  1/1     Running                 0               23m
jobs-svc-c89f77974-wgntb                 0/1     Init:CrashLoopBackOff   9 (2m2s ago)    23m
k10-grafana-7fc8b45cd-6wtvz              1/1     Running                 0               23m
kanister-svc-6656bc89d5-fp5wl            1/1     Running                 0               23m
logging-svc-6d95d9dd85-98mt2             0/1     Init:CrashLoopBackOff   9 (112s ago)    23m
metering-svc-5b855f46c5-g27wq            1/1     Running                 0               23m
prometheus-server-9f7769bbb-24w2l        1/2     CrashLoopBackOff        9 (2m17s ago)   23m
state-svc-7765f59cc-h4kvh                2/2     Running                 0               23m
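
To pick the failing pods out of a listing like the one above, a small filter over the `kubectl get pods` output can help. This is only a minimal sketch: `failing_pods` is a hypothetical helper name, it assumes the default column order (NAME, READY, STATUS, ...), and the `kasten-io` namespace in the usage line is an assumption taken from the PVC listing later in this thread.

```shell
# failing_pods: read `kubectl get pods` output on stdin and print the name
# and status of every pod whose STATUS is neither Running nor Completed.
# (Hypothetical helper; assumes the default column layout.)
failing_pods() {
  awk '$1 != "NAME" && $3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Usage against a live cluster (namespace is an assumption):
#   kubectl get pods -n kasten-io | failing_pods
```

Filtering on the STATUS column rather than READY keeps the helper simple; a pod that is Running but not fully ready would not be reported by this sketch.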

Looking inside those failing pods, for example:
kubectl logs catalog-svc-5bddbd67b5-dm7rs -c upgrade-init

2023/01/19 13:43:54 Fresh install detected
panic: {"message":"Failed to change state owner","function":"main.main","linenumber":55,"file":"kasten.io/k10/cmd/upgrade/upgrade.go:55","cause":{"message":"Failed to create store directory","function":"main.changeStoreOwner","linenumber":30,"file":"kasten.io/k10/cmd/upgrade/upgrade.go:30","fields":[{"name":"model_store_dir","value":"//mnt/k10state/kasten-io/"}],"cause":{"message":"mkdir //mnt/k10state/kasten-io/: permission denied"}}}

goroutine 1 [running]:
main.main()
        /codefresh/volume/k10/go/src/kasten.io/k10/cmd/upgrade/upgrade.go:55 +0x54
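
The panic above nests its causes as JSON, and the innermost `message` is the one that names the actual failure. As a rough sketch, it can be pulled out with standard text tools. `root_cause` is a hypothetical helper name, and the approach assumes the messages contain no escaped double quotes.

```shell
# root_cause: read container logs on stdin and print the last "message"
# field found, which in a nested panic like the one above is the innermost cause.
# (Hypothetical helper; assumes no escaped quotes inside the messages.)
root_cause() {
  grep -o '"message":"[^"]*"' | tail -n 1 | sed 's/^"message":"//; s/"$//'
}

# Usage:
#   kubectl logs catalog-svc-5bddbd67b5-dm7rs -c upgrade-init | root_cause
```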

It looks like a permissions problem, and it also affects the remaining pods in Init:CrashLoopBackOff and CrashLoopBackOff.
Can anyone help with this issue?


Madi.Cristil
Community Manager
617 comments
2 years ago
January 19, 2023

@jaiganeshjk @Yongkang

Madi Cristil

jaiganeshjk
Experienced User
274 comments
2 years ago
January 20, 2023

@stefanodemartini Thanks for posting the question here.

From the error messages above, it looks like there is a problem with the permissions on the volumes backing the PVCs.

All the pods that use PVCs are failing to start or initialise, and the error I see in the logs is "permission denied" (the exception is Grafana, which uses an initContainer to change the permissions of the directory).

Something is wrong with the directory permissions and ownership on those volumes.

Let me check whether CephFS has any limitations with respect to fsGroup settings in the workload.

Jaiganesh
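
The fsGroup investigation mentioned above could start by looking at the security context the workload requests and at the StorageClass behind the PVCs. The following is a sketch only: the deployment name `catalog-svc` and the namespace `kasten-io` are inferred from the pod and PVC listings in this thread, and `pod_security_context` and `storage_class_yaml` are hypothetical helper names.

```shell
# Print the pod-level securityContext (including fsGroup, if set)
# from a deployment's pod template.
pod_security_context() {
  ns=$1
  deploy=$2
  kubectl get deployment "$deploy" -n "$ns" \
    -o jsonpath='{.spec.template.spec.securityContext}'
}

# Print the StorageClass definition so its provisioner and parameters
# can be reviewed.
storage_class_yaml() {
  kubectl get storageclass "$1" -o yaml
}

# Usage (names are assumptions based on the listings in this thread):
#   pod_security_context kasten-io catalog-svc
#   storage_class_yaml rook-cephfs
```

If the deployment sets an fsGroup but the mounted directory is still not writable by that group, that would point toward how the volume is provisioned or mounted rather than toward the chart itself.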

stefanodemartini
Author
Not a newbie anymore
1 comment
Answer
2 years ago
January 20, 2023

Thank you for your kind answer.

k get pvc -n kasten-io
NAME                STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
catalog-pv-claim    Bound    pvc-b77f1112-53bf-4318-bd8c-a2eb1e311e10   20Gi       RWO            rook-cephfs    18h
jobs-pv-claim       Bound    pvc-30ca4284-2231-4d86-bd02-b3a963343e45   20Gi       RWO            rook-cephfs    18h
k10-grafana         Bound    pvc-3a19897f-6f9d-4b64-8080-c846d399e39c   5Gi        RWO            rook-cephfs    18h
logging-pv-claim    Bound    pvc-8d5891ec-76a1-4ac4-8fcb-5d06113c1bbd   20Gi       RWO            rook-cephfs    18h
metering-pv-claim   Bound    pvc-a94319a8-35f4-464b-8395-c82d45b6a7d3   2Gi        RWO            rook-cephfs    18h
prometheus-server   Bound    pvc-80eb78fa-6d15-4fce-b64e-34fbd17321da   8Gi        RWO            rook-cephfs    18h

That's what I guessed, but there is nothing strange about these PVCs, as they were created directly by the Helm chart. I don't think it's a problem with the underlying StorageClass, since other charts don't have problems…
