
I am soliciting feedback from the community and the Vanguards in Slack about an issue we are seeing, so I wanted to get feedback here as well.

Scenario - we have a customer on v12 with the latest patches sending data to one of our DCs, which runs a very large Hitachi HCP Cloud Scale cluster. They are sending over 20 PB of data to us.

Object Lock is turned on for some of the buckets, and we are seeing billions of GET/PUT/LIST calls, which is causing havoc on the HCPCS cluster worker nodes.

The client has configured their jobs with the default block size of 1 MB, and we were thinking of changing that to 4 MB, but that would not make sense for 106-byte files. That is the file size we are seeing the issue with across three buckets, two of which are Windows servers and one Linux.
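For a sense of scale, here is a rough back-of-envelope sketch (my own simplification: roughly one object and at least one PUT per block, ignoring compression, dedupe, metadata objects and offload checkpoints). At 1 MB blocks, 20 PB works out to on the order of twenty billion objects, and moving to 4 MB cuts that by roughly 4x, which is why the API-call volume is so sensitive to block size.

```python
# Back-of-envelope object-count estimate. My own assumption: roughly one
# object and at least one PUT per block; ignores compression, dedupe,
# metadata objects and offload checkpoints.
PB = 1024 ** 5
MB = 1024 ** 2

data_bytes = 20 * PB  # the ~20 PB this customer is sending

for block_mb in (1, 4, 8):
    objects = data_bytes / (block_mb * MB)
    print(f"{block_mb} MB blocks -> ~{objects:,.0f} objects")
```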

Any thoughts, or has anyone run into this before? I am working with Veeam and Hitachi together as well, but if this rings a bell for anyone, your input would be great.

Thanks.

Ugh, still an issue, I take it. I was hoping this would have improved in Veeam 12. I remember the option to reduce concurrency helped a bit, but that was Veeam 11. Someone told me that one of the issues is that it has to constantly poll for instant restore to be available; I don't know if that is accurate. Either way, I would not want to have my S3 with a cloud provider that charges for API calls.


Nothing familiar for me, but I don’t have anything running at that scale.



Yeah, there was a post on the forums about this when v12 came out; my Google is just not working today. 😋😂


Have you checked how many tasks are allowed on the object storage repository? The default is unlimited, but in my experience even 8 or fewer is enough to fully saturate a 10 GbE WAN link.
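As a rough sanity check of that claim (the per-task throughput here is just an assumed figure for illustration, not a Veeam spec):

```python
# Rough sanity check: how few concurrent offload tasks can saturate a WAN link.
# The per-task throughput is an assumed figure for illustration, not a Veeam spec.
link_gbps = 10
link_bytes_per_sec = link_gbps * 1e9 / 8       # ~1.25 GB/s theoretical ceiling

per_task_mb_per_sec = 175                      # assumed average throughput per task
tasks_to_saturate = link_bytes_per_sec / (per_task_mb_per_sec * 1e6)
print(f"~{tasks_to_saturate:.1f} tasks saturate a {link_gbps} GbE link")
```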


Yes we have!
To be honest, the small files are causing a lot of problems for us right now.
We only use NetApp StorageGRID as object storage - yes, I know that's not the favorite child 😉
The system has a database design that can't handle large numbers of small files in a very performant way. Veeam creates an incredible number of them, and this causes a lot of error messages, overloads, etc. for our customers.

I am already talking to Veeam & NetApp, but so far there is no quick solution.

PS: Are we talking exclusively about VM backups, or NAS data as well? The NAS data should have a different layout and also create larger blob files, shouldn't it? Our use case is currently almost always VM data.


How do the performance metrics (CPU/RAM) look when the havoc appears on the HCPCS? Is the Cloud Scale cluster on dedicated hardware or VMs? I hope it's on dedicated hardware for this kind of configuration!

Is all the data on S11 or S10 nodes?



Thanks @haslund, that is something we will look into. I did just find out that one of the sites is still on v11a, which I am thinking is also part of the problem. But I will keep this in mind once we figure things out, as right now they are just doing offloading from a SOBR that uses a Data Domain for the performance tier. 😖



It is a dedicated cluster just for this client, with hardware front-end nodes and S11 back-end nodes. The front-end nodes go crazy when this happens due to the billions of API calls.


Hmmm, good to know; it seems like a software bug from my POV. Keep us posted :D



I think part of it is that for the buckets with issues, the client is still on v11a rather than v12. So we are investigating things with them and trying to get them to upgrade. 👍🏼


We have worked with Veeam and determined that 4 MB blocks are what is needed for Hitachi Cloud Scale (HCPCS), but we are also finding some configuration issues on the client side, so we will be addressing those as well. After reviewing the graphs from Hitachi, it is the misconfigured jobs (though only some of them) that are causing all the small 109-byte objects. If I do find the final answer, I will post back here.
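In case it helps anyone else chase the same pattern, here is a rough sketch of one way to check the object-size distribution in an S3-compatible bucket (endpoint, bucket name, prefix and credentials are placeholders; HCPCS speaks S3, so boto3 against a custom endpoint works). Be careful: a full listing of a bucket this size is itself a huge number of LIST calls, so scope it to a prefix.

```python
# Rough sketch: bucket object sizes on an S3-compatible store to spot a flood
# of tiny objects. Endpoint, bucket, prefix and credentials are placeholders.
# Beware: listing a huge bucket generates many LIST calls, so narrow the prefix.
import boto3
from collections import Counter

s3 = boto3.client(
    "s3",
    endpoint_url="https://hcpcs.example.com",    # placeholder HCPCS S3 endpoint
    aws_access_key_id="ACCESS_KEY",              # placeholder credentials
    aws_secret_access_key="SECRET_KEY",
)

size_histogram = Counter()
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket="veeam-offload", Prefix="some/prefix/"):
    for obj in page.get("Contents", []):
        size = obj["Size"]
        if size < 1024:
            size_histogram["< 1 KB"] += 1
        elif size < 1024 * 1024:
            size_histogram["1 KB - 1 MB"] += 1
        else:
            size_histogram[">= 1 MB"] += 1

for size_range, count in size_histogram.items():
    print(f"{size_range}: {count:,} objects")
```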

Thanks, everyone, for chiming in with suggestions and tips; it is appreciated and is what this community is all about. 😁

