@dk-do As you mentioned, you are hitting the timeout while executing the Kanister action.
There is a Helm value, kanister.backupTimeout, that can be used to configure this Kanister timeout.
By default this is set to 45 minutes.
You can upgrade K10 and set this Helm value as below:
helm get values k10 --output yaml --namespace=kasten-io > k10_val.yaml && \
helm upgrade k10 kasten/k10 --namespace=kasten-io -f k10_val.yaml --set kanister.backupTimeout=120
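Once the upgrade completes, you can verify that the new value was picked up (a quick check; this assumes K10 is installed in the default kasten-io namespace):
helm get values k10 --namespace=kasten-io --output yaml | grep -i backupTimeout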
Thanks for the fast response. I will try it with 120mins.
I just mentioned 120 mins as an example. It might take more than that depending on the network bandwidth or your environment.
Yes - I understood it like that. I will check how long it will take and adjust the timeout.
Another question: on which volume (/backup) is the backup file stored?
elasticdump --bulk=true --input=http://${host_name}:9200 --output=/backup
gzip /backup
Is it the storage class defined in the helm values (persistence.storageClass)?
@jaiganeshjk: Can you please answer my last question? :)
@dk-do I just looked at the Elasticsearch logical blueprint. We use kubeTask to run the phases in the blueprint.
This kubeTask creates a temporary pod and runs the commands against your ES instance.
We save the dump to a temporary directory and push the dump to your location profile. The local backup file gets deleted along with the temporary pod.
We don’t mount any PVCs in the temporary pod that we create using kubeTask.
Hope this answers your question.
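For reference, the backup phase in that blueprint is roughly shaped like the sketch below. This is a simplified illustration rather than the exact blueprint; the image tag, host name templating and output wiring are assumptions and may differ in your installation:
actions:
  backup:
    phases:
    - func: KubeTask
      name: backupToStore
      args:
        namespace: "{{ .StatefulSet.Namespace }}"
        # image is an assumption; any image with elasticdump and kando available works
        image: ghcr.io/kanisterio/es-sidecar:0.89.0
        command:
        - bash
        - -o
        - errexit
        - -o
        - pipefail
        - -c
        - |
          host_name="elasticsearch-master.{{ .StatefulSet.Namespace }}.svc.cluster.local"
          backup_file_path="backup.gz"
          # Dump all indices to a temporary file inside the pod, compress it,
          # then push it to the configured location profile (e.g. S3).
          elasticdump --bulk=true --input=http://${host_name}:9200 --output=/backup
          gzip /backup
          kando location push --profile '{{ toJson .Profile }}' --path "${backup_file_path}" --output-name "kopiaOutput" /backup.gz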
@dk-do
This kubeTask creates a temporary pod and runs the commands against your ES instance.
We save the dump to a temporary directory and push the dump to your location profile. The local backup file gets deleted along with the temporary pod.
So it is stored in the same PV where Elastic stores its data?
No, it is not stored locally anywhere. It is pushed to the S3 object store or NFS file store that you have configured as a location profile.
Sorry for asking again:
elasticdump --bulk=true --input=http://${host_name}:9200 --output=/backup
==> this command creates a dump on /backup
==> or do I understand it wrong?
==> where is /backup? I think it is some local storageClass?
kando location push --profile '{{ toJson .Profile }}' --path "${backup_file_path}" --output-name "kopiaOutput" /backup.gz
==> this command pushes the created backup to (in our case) S3
elasticdump --bulk=true --input=http://${host_name}:9200 --output=/backup
==> this command creates a dump on /backup
Yea right. We dump it in /backup
which is just a temporary directory in the container. It is not persistent.
It gets deleted as soon as the pod gets deleted.
You can get the YAML of the pod that is created for this operation. The name of the pod would be kanister-job-*. It wouldn’t have any volumes attached to it.
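If you want to check it yourself, something along these lines will show the pod spec (a sketch; substitute your application namespace and the actual kanister-job pod name):
kubectl get pods --namespace=<app-namespace> | grep kanister-job
kubectl get pod <kanister-job-pod-name> --namespace=<app-namespace> --output=yaml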
kando location push --profile '{{ toJson .Profile }}' --path "${backup_file_path}" --output-name "kopiaOutput" /backup.gz
==> this command pushes the created backup to (in our case) S3
Absolutely. This pushes the dump file from the local directory into S3.
elasticdump --bulk=true --input=http://${host_name}:9200 --output=/backup
==> this command creates a dump on /backup
Yea right. We dump it in /backup
which is just a temporary directory in the container. It is not persistent.
It gets deleted as soon as the pod gets deleted.
@jaiganeshjk
Okay, thanks for the info! In this case it is stored in the Docker overlay directory, which by default lives on the root mount (/) of the worker node the container is running on:
bash-5.0# ls /backup -lah
-rw-r--r-- 1 root root 1.2G Feb 16 15:59 /backup
root@ops2-w6:/var/lib/docker/overlay2/5b120335c0139fdc1a32a3da964ff532cccfecf36d1e9100bea3fd2a8dc5ee40/merged# ls -lah
total 1.5G
drwxr-xr-x 1 root root 4.0K Feb 16 15:09 .
drwx--x--- 5 root root 4.0K Feb 16 15:09 ..
-rw-r--r-- 1 root root 1.5G Feb 16 16:12 backup <=== Elastic Backup File
drwxr-xr-x 1 root root 4.0K Feb 10 04:51 bin
drwxr-xr-x 1 root root 4.0K Feb 16 15:09 dev
-rwxr-xr-x 1 root root 0 Feb 16 15:09 .dockerenv
-rwxr-xr-x 1 root root 253 Feb 10 04:51 esdump-setup.sh
drwxr-xr-x 1 root root 4.0K Feb 16 15:09 etc
drwxr-xr-x 2 root root 4.0K Jan 16 2020 home
drwxr-xr-x 1 root root 4.0K Jan 16 2020 lib
drwxr-xr-x 5 root root 4.0K Jan 16 2020 media
drwxr-xr-x 2 root root 4.0K Jan 16 2020 mnt
drwxr-xr-x 2 root root 4.0K Jan 16 2020 opt
dr-xr-xr-x 2 root root 4.0K Jan 16 2020 proc
drwx------ 1 root root 4.0K Feb 16 16:00 root
drwxr-xr-x 1 root root 4.0K Feb 16 15:09 run
drwxr-xr-x 2 root root 4.0K Jan 16 2020 sbin
drwxr-xr-x 2 root root 4.0K Jan 16 2020 srv
drwxr-xr-x 2 root root 4.0K Jan 16 2020 sys
drwxrwxrwt 1 root root 4.0K Feb 10 04:51 tmp
drwxr-xr-x 1 root root 4.0K Jan 16 2020 usr
drwxr-xr-x 1 root root 4.0K Jan 16 2020 var
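For reference, the merged overlay directory of the kanister-job container can be found on the worker node with something like this (assuming Docker as the container runtime; the container ID and resulting path will differ):
docker ps | grep kanister-job
docker inspect --format '{{ .GraphDriver.Data.MergedDir }}' <container-id>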
So we have to make sure that / on the corresponding node has enough space. Is it possible or “recommended” to store those files on an NFS file share?
Hi @jaiganeshjk
I have another question: I changed the timeout to 16 hours.
Unfortunately, 16 hours is not enough for backing up 1200 GB in Elasticsearch. Is there an option to speed it up?
Everything is on the local network (we use our own MinIO installation for tests); the servers are equipped with SSDs and connected via 10 Gbit LAN.
fields:
- name: duration
value: 15h59m59.986648512s
@dk-do You are right. Currently the dump is just written to a temporary directory in the container, as you mentioned in your comment above.
It is a good feature request to back this with a PVC, which would remove the dependency on having enough capacity on the node.
As for the time the copy takes, I am not sure if there is a way to speed it up.