An X8M-2 Exadata system hosting an Oracle database environment has an issue with some of its database backups.
During backups of very large databases, after many hours of runtime, the Veeam RMAN plugin detects a socket timeout on the connection between the client and the Veeam Backup & Replication (VBR) server on its management port 10006, and throws the following exception:
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on ch1 channel at 09/06/2024 14:25:53
ORA-19506: failed to create sequential file, name="28045ae9-146a-42fd-837a-d32b1ac4a4bc/RMAN_2044054831_TPIIP_20240906_pv34bjb7_1_1.vab", parms=""
ORA-27028: skgfqcre: sbtbackup returned error
ORA-19511: Error received from media manager layer, error text:
Failed to connect to the endpoint [172.26.173.14:10006]. Connection timed out
--tr:Failed to connect to target endpoint.
--tr:Failed to establish reconnectable connection to VBR
When this happens, the Veeam plugin process remains active and keeps trying to run backups. However, all concurrent and pending backups fail, because they all use the same active process and its threads, all of which are now in this unrecoverable state. The only solution we have found is to forcibly kill all active Veeam processes and their related processes on the client server.
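Since every later backup fails with the same stack once the plugin is in this state, it may help to detect the signature automatically before restarting anything. Below is a minimal Python sketch of that idea, assuming the RMAN output is available as text. The function names `find_vbr_connect_failures` and `looks_like_vbr_timeout` are hypothetical helpers written for this post, not part of Veeam or Oracle tooling, and the pattern is based only on the error text shown above.

```python
import re

# Matches the media-manager line from the error stack above. The pattern
# tolerates stray non-digit characters between "endpoint" and the address.
_ENDPOINT_RE = re.compile(
    r"Failed to connect to the endpoint\D*?(\d{1,3}(?:\.\d{1,3}){3}):(\d+)"
)


def find_vbr_connect_failures(log_text):
    """Return (host, port) for each failed endpoint connection in the text."""
    return [(m.group(1), int(m.group(2))) for m in _ENDPOINT_RE.finditer(log_text)]


def looks_like_vbr_timeout(log_text):
    """Heuristic: True if the text contains the failure signature shown above."""
    return (
        "ORA-19511" in log_text
        and "Failed to establish reconnectable connection to VBR" in log_text
        and bool(find_vbr_connect_failures(log_text))
    )
```

Feeding each RMAN log to `looks_like_vbr_timeout` would flag the first failing backup, which is roughly the moment from which the remaining jobs can be expected to fail as well.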
Our analysis has revealed the following things.
Firstly, only very large, terabyte-scale database backups are affected; smaller databases are fine. This probably means the issue is a function of execution time rather than of database size as such. After a long period of time, the connection fails for some reason, which breaks the plugin's entire network stack right up to the parent process level, and it does not recover.
Secondly, we pinpointed when this error started happening: our upgrade from Veeam version 11 to 12.1 last year. All version 11 backups prior to the upgrade were successful.