We're installing a StoreOnce 3660 FC and have noticed quite poor performance during network-free backup.

The setup is as follows:

Primary storage: Alletra 5000, 16Gb FC

Backup gateway server: 8Gb FC

StoreOnce 3660: 16Gb FC

Veeam 12.2

The StoreOnce has been configured as a repository with FC activated; the gateway server can reach it correctly and the SAN has been zoned.

The job is configured with storage integration: it creates the Alletra snapshot and mounts it to the gateway server.

With this configuration the backup job runs at about 200MB/s, whereas the same job using a disk-based target (also over FC) runs at 1GB/s, which makes sense given the gateway server's 8Gb HBA.
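
For reference, here is a rough back-of-envelope sketch of what the two link speeds can actually carry (assuming standard FC line rates and encoding overhead; real throughput will be a bit lower once frame headers and fabric congestion are accounted for):

```python
# Rough usable-throughput estimate for FC link speeds (back-of-envelope only).
# Assumption: 8GFC signals at 8.5 GBaud with 8b/10b encoding, 16GFC at
# 14.025 GBaud with 64b/66b encoding (standard Fibre Channel line rates).

def fc_usable_mb_per_s(baud_gbd: float, encoding_efficiency: float) -> float:
    """Approximate one-way payload bandwidth in MB/s for an FC link."""
    usable_gbit_per_s = baud_gbd * encoding_efficiency
    return usable_gbit_per_s * 1000 / 8  # Gbit/s -> MB/s

print(f"8Gb FC  ~ {fc_usable_mb_per_s(8.5, 8 / 10):.0f} MB/s")      # ~850 MB/s
print(f"16Gb FC ~ {fc_usable_mb_per_s(14.025, 64 / 66):.0f} MB/s")  # ~1700 MB/s
```

So the 8Gb HBA tops out somewhere around 800-850MB/s per direction, while the 200MB/s we see toward the StoreOnce is nowhere near that limit.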

The SAN fabric is made of two switches: an 8Gb blade switch and an SN3000 16Gb switch.

The gateway server is connected to the blade switch, while the Alletra and the StoreOnce are connected to the external 16Gb switch.

While the job is running, what shows as the bottleneck in the backup statistics? Source, Proxy, Network or Target?


It says the target.


To me this is not a Veeam problem but rather a configuration/setup issue. You have two different-speed switches, which will certainly factor into the problem, as traffic has to drop to the 8Gb FC switch's speed.

Also, version 12.2 is not the newest release, so you may want to try upgrading to 12.3 to see whether any of the HPE plug-ins were updated as well, which may help address the issue.

Should this continue, I would suggest contacting support for further assistance.

We also need more details on the Veeam environment:

  1. Are you running an all-in-one VBR server? What are its specs (RAM, CPU, etc.)?
  2. How many proxies are deployed, and what are their specs?
  3. What virtualization layer - VMware, etc.?

The details here are not much to go on.

The fabric infrastructure is the same as we have been using so far, and a disk-based target mounted to the proxy/gateway server via FC gives 1GB/s throughput.

The virtualization layer is VMware.

The gateway to the StoreOnce is also the proxy, and it is separate from the VBR server.


If that is the case, then there is something with the storage snapshots causing the slowdown. That is where I would start investigating.


I have a dedicated 40Gb storage network for my iSCSI setup using Nimble as my target, but my HBAs are 10Gb. I also use BfSS, and during my job runs I almost always land around the 200MB/s range; sometimes I may hit 1GB/s. I actually think it’s rare to hit 1GB/s+ speeds for most incremental job runs; initial and active full runs tend to hit those higher numbers.

That said, having the “target” marked in the job as your performance bottleneck is a bit concerning. I’m leaning with Chris in that something here just may not be configured properly on your target side.

Do you have Veeam proxies set up? The gateway server can also be a proxy. Reference the guide on gateway path/setup:
https://helpcenter.veeam.com/docs/backup/vsphere/gateway_server.html?ver=120


VBR is reporting the target as the bottleneck.

I cannot understand how to investigate the source snapshot being the problem, since the Alletra just creates the snapshot and exports it to the gateway/proxy server.
The same job with a different target, still on FC, performs much better.

 


https://forums.veeam.com/post252291.html#p252291

@Sc2111 Can you quickly confirm how many VMs are in your test job?

The post I linked is from a technology partner at HPE, and while it’s about restores, the same concepts Federico describes apply.

Deduplication appliances (not just StoreOnce; DataDomain, Quantum, etc. as well) handle incoming data in streams, and individual streams have a very finite cap depending on your model. The main benefit of these dedup boxes isn’t individual stream speed; it’s total throughput.

Deduplication appliances really shine when there are tons of streams going in/out. A stream here == a concurrent task (machine disk) in Veeam terms, so if your test job was just a few VMs with a single disk each, the throughput probably won’t be great.

I would not compare a deduplication appliance (any of them) to a normal block-based repository. For DataDomain and StoreOnce you must go through their respective libraries to utilize the advanced storage features, and those libraries and the boxes themselves have some overhead and restrictions in place to prevent any one client from overloading the box.

In short, don’t focus on individual-stream (single-VM) backup speeds with StoreOnce (or any deduplication appliance); single-stream performance will always be lower than with a “dumb” block storage device, and this is by design for dedup appliances.

Instead, focus on the total throughput per hour when you’re dumping all the backups in, and you should see the processing rate match your expectations and the advertised ingest rate per hour for the StoreOnce (assuming you are pushing enough streams to it).
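
To make that concrete, here’s a toy model with made-up numbers (the per-stream cap, appliance ceiling, and link figures below are illustrative assumptions, not HPE specs):

```python
# Toy model: aggregate ingest to a dedup appliance is roughly bounded by
# per-stream cap * concurrent streams, the appliance's rated ingest ceiling,
# and the gateway's FC link. All figures below are illustrative assumptions.

def estimated_ingest_mb_s(streams: int,
                          per_stream_cap_mb_s: float = 300.0,     # assumed single-stream cap
                          appliance_ceiling_mb_s: float = 5000.0, # assumed model ingest limit
                          link_mb_s: float = 800.0) -> float:     # ~8Gb FC gateway HBA
    return min(streams * per_stream_cap_mb_s, appliance_ceiling_mb_s, link_mb_s)

for n in (1, 2, 4, 8):
    print(f"{n} stream(s) -> ~{estimated_ingest_mb_s(n):.0f} MB/s")
# 1 stream ~300, 2 ~600, 4+ capped by the 8Gb gateway link at ~800 MB/s
```

With only one or two streams you’re nowhere near the box’s ingest rating; once you push enough parallel disks, the gateway link (or the appliance itself) becomes the limit instead.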

Similarly, I would ask how the gateway is sized for the StoreOnce. Gateway servers must be sized accordingly, and very often I see this missed, so it would be good to know:

  • Gateway server CPU/memory available
  • Any other software or Veeam roles (proxy, repo, etc.) installed on the gateway
  • How many concurrent tasks are currently configured for the gateway

I’m doubtful that much can be done here, but gateway and stream count are probably your best bet. I must also note that the FC configuration for deduplication appliances is no longer a strongly recommended practice. There’s nothing wrong with it, but Veeam’s experience is that it’s usually more trouble than it’s worth, and network connections are easier to manage and offer comparable performance. I am not saying go out and change it immediately, but you may consider testing over the network to see if you can eke out any extra performance.



Thanks for the useful link and info.

The gateway server has the following configuration:

128GB RAM

2x Intel Xeon E5-2667 @ 2.90GHz

Configured with 12 max concurrent tasks


Very happy to help.

The gateway looks alright. 2x Intel Xeon E5-2667 @ 2.90GHz looks to be 12 cores total? I’m willing to bet you can push that up; officially Veeam supports up to 2:1 tasks per core, so you can try throwing a few more tasks on there and adding more VMs to the job. I think the gateway + StoreOnce should be able to handle it, and it will increase your throughput rate.
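
Back-of-envelope, under that 2:1 guidance (and assuming the two E5-2667s are 6 cores each, i.e. 12 physical cores total):

```python
# Task ceiling under the 2:1 tasks-per-core guidance mentioned above.
# Assumption: 2 x Xeon E5-2667 at 6 cores each = 12 physical cores.
cores = 2 * 6
max_tasks = cores * 2
print(f"{cores} cores -> up to {max_tasks} concurrent tasks (12 configured today)")
```

So there is headroom to roughly double the configured task count, provided the job actually has enough disks to keep those extra streams busy.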


So it looks like:

  • if the job has a single VM, even with multiple disks, only one session to the StoreOnce is created
  • each session cannot achieve more than roughly 300-400MB/s?

Is that correct in your experience?

 

EDIT

We moved the StoreOnce to the same SAN switch as the gateway server, bypassing the ISL between the two SAN switches, and we instantly got a transfer rate of about 1GB/s, which makes sense given the 8Gb HBA on the server.

Now the bottleneck is reported as the source, which, by the way, is across the ISL as well.

We’re planning to install a new gateway server with a 16Gb HBA connected to the 16Gb SAN switch.

Thanks all for the useful discussion.

 


Kind of. Give our User Guide on concurrent tasks a quick read; basically each task is usually equivalent to one VM disk. Say you have a proxy with 4 tasks configured and 2 VMs with 2 disks each == 4 concurrent tasks, so no wasted processing time.

Now say you had 2 proxies with 2 tasks configured on each, and 4 VMs with 2 disks each. That’d be 8 disks, so ideally it’d queue the first two VMs, process all four disks, then queue the next two.
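
A minimal sketch of that slot math, with made-up numbers (this is just illustrative arithmetic, not Veeam’s actual scheduler):

```python
# Illustrative only: each VM disk consumes one proxy/gateway task slot, and
# disks are processed in rounds once all slots are busy.
from math import ceil

def processing_rounds(disks_per_vm: list[int], total_task_slots: int) -> int:
    """Rounds needed if every disk occupies one task slot for one round."""
    total_disks = sum(disks_per_vm)
    return ceil(total_disks / total_task_slots)

# 2 proxies x 2 tasks = 4 slots; 4 VMs with 2 disks each = 8 disks -> 2 rounds
print(processing_rounds([2, 2, 2, 2], total_task_slots=4))

# A single VM with 3 disks against 12 slots -> everything in 1 round,
# but only 3 of the 12 slots (streams) are ever used.
print(processing_rounds([3], total_task_slots=12))
```

The takeaway is the same as above: with one VM and a handful of disks, most of the StoreOnce’s stream capacity simply sits idle.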

As for the per-session max rate, that will depend heavily on the model. I don’t think I’ve ever seen even suggested figures from HPE on single-stream performance, which is understandable given their focus on total throughput across all streams. I know DataDomains have a limit around 200-400 MB/s depending on a few factors, and StoreOnce usually falls within that range.

Wow, great to hear it. Can I ask though, is that on a single VM/disk?



It’s a single VM with 3 disks.

Thanks.

