Experience with HPE Apollo or other high-density servers as backup repository?
Hi,
I’d be interested in getting feedback from people who are already using HPE Apollos or other high-density servers with local storage as repository servers. It seems that a lot of people are currently planning to use them.
What is your setup?
Number of servers? Number of disks?
Which type and size of RAID?
Number of VMs / amount of source data?
How is your performance?
Do you use them only as a backup target or as a copy source as well?
Which filesystem: ReFS with block cloning or XFS with reflink?
We just received our two demo Apollo 4510s that will hopefully become our new copy targets. 130 kg! Let’s see if I can also test them against the use cases we have for our primary backup storage. As we are in a hurry to use them as new copy targets (broken copy chains, we have to completely start over), I’m not so sure about that. Anyhow, I’ll be able to share some first impressions soon.
The Apollos are not deployed yet; we are currently installing them with RHEL 8.2. It will take a couple of days until I’m able to test.
We use a lot of HPE StoreEasy devices for smaller environments, including Windows Storage Server: a perfect cost-performance balance. For larger environments an HPE Apollo is a perfect fit: more storage available, more powerful CPUs, more customizable, more flexible (choose your preferred OS).
If you don’t know these links yet, I would recommend reading them:
They are about Veeam configuration; Veeam and HPE ran massive performance tests with v11. There are a lot of tips in there, along with the reasoning behind them.
I know both threads; I’ve already spammed the forum. The point is, I don’t see much real-world user experience. HPE claims that these devices are heavily used by Veeam customers, but our Veeam contacts could not name any (larger) company that uses Apollos.
As I won’t have much time to test our new setup, the decision between SAN + server and a high-density server has to be 95% the final one from the start. If I had more time, I’d just test this in our environment for a couple of weeks.
The benchmarks look very solid, but for us they are not real world, as data will not only be written to the Apollo; it is also the source for copy and offload jobs, and probably SureBackup at some point.
Run status group 0 (all jobs):
   READ: bw=1562MiB/s (1638MB/s), 77.0MiB/s-352MiB/s (81.8MB/s-369MB/s), io=1400GiB (1503GB), run=203566-917518msec
  WRITE: bw=670MiB/s (703MB/s), 33.6MiB/s-151MiB/s (35.2MB/s-158MB/s), io=600GiB (645GB), run=203566-917518msec
Disk stats (read/write): sdb: ios=2866217/1229705, merge=159/59, ticks=14433112/258465, in_queue=13229596, util=100.00%
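For context: the fio job file itself isn’t shown anywhere in the thread. A mixed sequential read/write run roughly along these lines would produce output in that format; every parameter below is an assumption on my part, not the actual job that was run:

  fio --name=repo-mixed --directory=/mnt/backup --rw=rw --rwmixread=70 \
      --bs=512k --ioengine=libaio --direct=1 --iodepth=32 \
      --numjobs=8 --size=250G --group_reporting

With 8 jobs of 250 GiB each and a 70/30 read/write mix, the totals would land near the 1400 GiB read / 600 GiB written reported above.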
Reading @vNote42’s and @Ralf’s exchanges: thank you guys for all your answers!
I’ve been watching this silently for a while but just want to say a huge thank you to both @Ralf and @vNote42 for an awesome technical interaction here.
It’s rare that we as a collective can all see real world numbers outside of specific lab simulations and free from marketing so this has been truly awesome to see. I’m still looking at potentially a Dell PowerEdge XE7100 based system so if I can organise a test I’d love to re-run these benchmarks and get a true Dell vs HPE battle going here!
[this is frustrating, I just added a post with some benchmark results and ended up with nothing but a warning that it exceeds the max number of characters….]
Here are some initial numbers for one 28 x 16 TB RAID 60 XFS volume.
I added the su and sw options to mkfs; I think they should be right, but I don’t know why mkfs.xfs is complaining.
[root@sdeu2000 ~]# mkfs.xfs -b size=4096 -m reflink=1,crc=1 -d su=256k,sw=24 /dev/sdb -f
mkfs.xfs: Specified data stripe width 12288 is not the same as the volume stripe width 6144
meta-data=/dev/sdf               isize=512    agcount=350, agsize=268435392 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1
data     =                       bsize=4096   blocks=93755080704, imaxpct=1
         =                       sunit=64     swidth=1536 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=521728, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
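For what it’s worth, here is my reading of that warning; it is an interpretation based on the numbers above, not something confirmed in the thread:

  specified: su=256k, sw=24  →  24 × 256 KiB = 6144 KiB = 12288 sectors of 512 B
  reported:  volume stripe width = 6144 sectors × 512 B = 3072 KiB = 12 × 256 KiB

A 28-disk RAID 60 built as two 14-disk RAID 6 spans has 12 data disks per span, so the controller apparently exposes the per-span stripe width (12 × 256 KiB), while sw=24 counts the data disks across both spans. mkfs.xfs only warns that the two values differ; the sunit=64 / swidth=1536 it wrote match the su/sw that were specified.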
…
Run status group 0 (all jobs):
   READ: bw=1527MiB/s (1602MB/s), 76.4MiB/s-348MiB/s (80.2MB/s-365MB/s), io=1000GiB (1074GB), run=146941-670315msec
  WRITE: bw=1528MiB/s (1602MB/s), 76.3MiB/s-349MiB/s (80.0MB/s-366MB/s), io=1000GiB (1074GB), run=146941-670315msec
Disk stats (read/write): sdb: ios=2047186/2048082, merge=119/120, ticks=7863530/1349428, in_queue=7637829, util=100.00%
I use Dell PowerEdge R740 and R730 servers. The performance is good.
The Apollos are not deployed yet; we are currently installing them with RHEL 8.2. It will take a couple of days until I’m able to test.
Just curious, why in your case did you choose RHEL 8.2 instead of (e.g.) Ubuntu?
RHEL is the default for all Linux servers here.
For smaller environments (fewer than 500 VMs) we sometimes use a Dell PowerEdge R740xd or xd2.
It can handle up to 24 disks with a maximum of, I think, 18 TB per disk at the moment.
Up to now we have used a ReFS repository with these servers. In the future we will give a Linux installation with an immutable XFS repo a try…
The servers work fine in these environments. No problems so far.
I use Dell PowerEdge R740 and R730 servers. The performance is good.
For how many VMs and how much backup data?
Just noticed that my last post was in the wrong thread. Here are the two options I’m thinking about. We’ll probably try out a backup and copy extent on each server.
I wanted to post some Veeam benchmarks too, but this doesn’t make much sense yet, as I can see that the jobs are limited by the network. We have a 40/50 GbE 547FLR network adapter with a QSFP → SFP adapter, so it is currently limited by 2 x 10 GbE bonded Linux interfaces. I see ~1.3 GB/s and Veeam shows network or source as the bottleneck. Our network equipment will be replaced in the coming months; then I will switch to 40 GbE. As we are testing with v10, I also cannot use the Apollos as proxies to back up directly from storage snapshots. Our existing proxies send the data to the Apollo over the network.
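Quick back-of-the-envelope math on that (my arithmetic, not a measurement): 10 Gbit/s ÷ 8 ≈ 1.25 GB/s per link, so the 2 x 10 GbE bond caps out around 2.5 GB/s in aggregate, and traffic that hashes onto a single member link tops out right around the ~1.3 GB/s observed.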
One piece of advice: make /tmp large if you use a dedicated partition or LVM volume for it. Veeam writes a lot to /tmp, and my first jobs failed because it was only 4 GB.
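A minimal sketch of what that could look like on RHEL with LVM; the volume group and LV names below are placeholders, adjust them to your own layout:

  # if /tmp is already a (too small) logical volume, grow it and resize the filesystem in one step
  lvextend -r -L 32G /dev/rhel/tmp

  # or create a dedicated volume for it
  lvcreate -L 32G -n tmp rhel
  mkfs.xfs /dev/rhel/tmp
  echo '/dev/rhel/tmp  /tmp  xfs  defaults  0 0' >> /etc/fstab
  mount /tmp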
@MicoolPaul, let us know when you get your Dell server! We will test it too.
Thanks are due to @Ralf! He did all the testing! Thanks also from me!
Sounds good! 50 ms on SATA disks is what I would call “acceptable”. And this is near saturation; as you already said, you will probably not see such counters in real life.
Well, depending on the test parameters I can reach 2.5 GB/s. I mean… we will probably not be able to saturate this anyway in the near future. I just want to be sure that my RAID and filesystem setup is correct, as this is hard to modify once data is on the disks.
Run status group 0 (all jobs):
  WRITE: bw=2463MiB/s (2582MB/s), 2463MiB/s-2463MiB/s (2582MB/s-2582MB/s), io=1000GiB (1074GB), run=415836-415836msec
Disk stats (read/write): sdb: ios=0/2047239, merge=0/119, ticks=0/165897764, in_queue=164873511, util=100.00%
I think you can sleep without worries.
Do you see average latency during tests?
For the last test: lat (msec): min=4, max=434, avg=79.64, stdev=37.26
Oh, the average is quite high.
You may be able to reduce it by slightly decreasing the queue depth. Performance will probably not change much if the reduction is small. I would try --iodepth=16.
Similar throughput, but average latency below 50 ms then.
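For completeness, the rerun with the lower queue depth would look roughly like this; apart from --iodepth, every parameter is an assumption, since the actual job file isn’t posted:

  fio --name=repo-write --directory=/mnt/backup --rw=write \
      --bs=512k --ioengine=libaio --direct=1 --iodepth=16 \
      --numjobs=8 --size=125G --group_reporting

Eight jobs of 125 GiB each add up to the 1000 GiB written in the earlier run.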