Solved

About VADP API in VMware Tanzu


Hi. My client runs a TKGm guest cluster and wants to do backup and recovery. I want to migrate from Cluster A to Cluster B. The VMware CSI driver does not support VolumeSnapshot during backup because the current host version is 7.0.3. Therefore, a VMware infrastructure profile is registered, and snapshots and exports are performed through the VADP API. When recovering the exported data to another cluster, the message shown in the picture below occurs and the PVC recovery fails.

 

I understand that "generic volume" in the capture refers to a general volume. Why does this message occur? Is it because the restore is performed with something like the sidecar method rather than through the VADP API?

      

I would like to know the exact backup and recovery logic for VMware Tanzu.

 


Best answer by Hagag 14 June 2023, 15:41


4 comments



Hello @tamama3 


Typically, the error message "Failed to exec command in pod command terminated with exit code 137" indicates that the process was killed before it could finish, most often because the resources (usually memory) allocated to the pod were insufficient.
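For context, exit codes above 128 conventionally mean "128 plus the number of the signal that ended the process", so 137 corresponds to signal 9 (SIGKILL), which is what a container receives when it is killed for exceeding its memory limit. A minimal sketch of that arithmetic (the function name is hypothetical, not part of K10 or kubectl):

```shell
# Decode an exit code into the name of the signal that ended the process.
# Convention: exit code = 128 + signal number, so 137 - 128 = 9.
signal_from_exit_code() {
  code="$1"
  if [ "$code" -gt 128 ]; then
    # kill -l <number> prints the signal name without the SIG prefix
    kill -l "$((code - 128))"
  else
    echo "none"
  fi
}

signal_from_exit_code 137
```

A SIGKILL on its own does not prove an out-of-memory kill, so it is still worth checking the container's termination reason in the pod status.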

As an initial troubleshooting step, try raising the resource limits for the restore pod in your recovery cluster by passing the following parameters to the helm command.

For example:
 

--set genericVolumeSnapshot.resources.requests.cpu=100m
--set genericVolumeSnapshot.resources.requests.memory=800Mi
--set genericVolumeSnapshot.resources.limits.cpu=1200m
--set genericVolumeSnapshot.resources.limits.memory=4000Mi


Here is a full example; replace <CURRENT_VERSION> with your K10 version.

 

helm upgrade k10 kasten/k10 --namespace=kasten-io --reuse-values \
--set genericVolumeSnapshot.resources.requests.cpu=100m \
--set genericVolumeSnapshot.resources.requests.memory=800Mi \
--set genericVolumeSnapshot.resources.limits.cpu=1200m \
--set genericVolumeSnapshot.resources.limits.memory=4000Mi \
--version=<CURRENT_VERSION>
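
To find the installed chart version before upgrading, and to confirm afterwards that the new values were stored, something like the following may help. This is only a sketch: it assumes the release is named k10 in the kasten-io namespace, as in the command above, and the function name and the HELM override variable are hypothetical conveniences.

```shell
# Sketch: inspect the K10 Helm release.
# Assumption: release "k10" in namespace "kasten-io", as in the command above.
HELM="${HELM:-helm}"

check_k10_release() {
  # The CHART column of the output includes the installed chart version
  "$HELM" list --namespace kasten-io --filter '^k10$'
  # Print the user-supplied values stored for the release
  "$HELM" get values k10 --namespace kasten-io
}
```

After the upgrade, running check_k10_release should list the genericVolumeSnapshot.resources entries among the stored values.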

 


Hi Hagag,

 

Are you setting these values because the Kasten pod created during recovery lacks the resources it needs? Can the same message occur if the resources required for recovery exceed the limit?

@tamama3 As previously stated, this error indicates that the process could not complete because the allocated resources for the pod were insufficient. However, it is possible that the error is caused by another factor, such as an I/O issue. Therefore, we began by increasing the limits. If you encounter another failure, you can attempt to restore after adjusting the limits, while simultaneously monitoring the resources of the pod and your worker node. This will provide a clearer understanding of why the error occurs.
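
One possible way to do that monitoring with kubectl is sketched below. It assumes metrics-server is installed in the cluster (required for kubectl top); the function name, the KUBECTL override variable, and the namespace argument are hypothetical, and the namespace should be whichever one the restore pod actually runs in.

```shell
# Sketch: show resource usage and termination reasons while a restore runs.
# Assumption: metrics-server is available so that "kubectl top" returns data.
KUBECTL="${KUBECTL:-kubectl}"

show_restore_usage() {
  namespace="$1"   # namespace where the restore pod runs (supplied by the caller)
  # Per-container CPU and memory usage of the pods in that namespace
  "$KUBECTL" top pod --namespace "$namespace" --containers
  # CPU and memory usage of the worker nodes
  "$KUBECTL" top node
  # Last termination reason per pod (for example OOMKilled), if any
  "$KUBECTL" get pods --namespace "$namespace" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[*].lastState.terminated.reason}{"\n"}{end}'
}
```

Calling show_restore_usage with the relevant namespace a few times during the restore should show whether memory usage approaches the configured limit before the failure.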


@Hagag Thank you for your answer.

 

All of Kasten's educational materials and related blog posts only cover tests in small environments.

Customer environments are diverse and much larger, and it is a pity that Kasten has no guide document to prepare for them.

The option you suggested is also an advanced option, but I struggle every time because there aren't enough examples or guides for it.

 

Do you have any references or links on this subject?

 

Thank you
