Experience with HPE Apollo or other high-density servers as backup repository?


Userlevel 6
Badge +1

Hi,

I’d be interested in feedback from people who are already using HPE Apollos or other high-density servers with local storage as repository servers. It seems that a lot of people are currently planning to use them.

  • What is your setup?
  • Number of servers? Number of disks?
  • Which type and size of RAID?
  • Number of VMs / amount of source data?
  • How is your performance?
  • Do you use them only as a backup target or as a copy source as well?
  • Which filesystem: ReFS or XFS with block cloning/reflink?


Userlevel 6
Badge +1

The Apollos are not deployed yet; we are currently installing them with RHEL 8.2. It will take a couple of days until I’m able to test.

 

Just curious, why did you choose RHEL 8.2 in your case instead of, e.g., Ubuntu?

 

RHEL is the default for all Linux servers here. 

Userlevel 6
Badge +1

Just noticed that my last post was in the wrong thread. Here are the two options I’m thinking about. We’ll probably try out a backup and copy extent on each server.

 

 

[Attached image: diagram of the two repository layout options being considered]

 

Userlevel 7
Badge +8

Out of curiosity, what kind of object storage are you using? Any example of copy job performance?

I will deploy on RHEL 8 too; we’re using Kickstart or Satellite provisioning. Do you apply any hardening to your repo? RHEL CIS?

Userlevel 6
Badge +1

For capacity tier offloading we use AWS S3 buckets; we want to look into Wasabi in the next couple of weeks now that they have Object Lock too.
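
If it helps, here is a quick sketch of how Object Lock can be checked on an S3-compatible bucket with the standard AWS CLI; the bucket name and the Wasabi endpoint are just examples:

# Create the bucket with Object Lock enabled
aws s3api create-bucket --bucket veeam-capacity-tier --object-lock-enabled-for-bucket --endpoint-url https://s3.wasabisys.com

# Verify the lock configuration before pointing a capacity tier extent at it
aws s3api get-object-lock-configuration --bucket veeam-capacity-tier --endpoint-url https://s3.wasabisys.com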

 

The servers were deployed by the Linux team; they use Satellite. No special hardening yet.

Userlevel 7
Badge +8

Hey Ralf, any news about your test?

Userlevel 6
Badge +1

No more tests 😉 I have already migrated most of the jobs. I had some problems with sudo/SSH: some jobs failed, a lot of data was left in /opt/veeam (configured in the registry) and in the Veeam user’s home, and /var is growing rapidly. At one point Veeam could not perform any task on one server because the Veeam user was no longer able to connect to its NFS home (>100 defunct processes). After disabling sudo and using the root user this did not happen again, but it’s not so nice needing root for this to work properly.

As we are still on v10, this might be much better in v11 with the persistent data mover. None of these are Apollo issues; it’s Veeam + Linux. Real-world performance is still pretty good. I’m just struggling with the available repository task slots, as 52 cores per server are a bit too few for backup + copy + offload tasks. These 2 Apollos were initially bought only as a copy target, but - as usual - the mission has changed and they are now also used for backup. We’ll have 2-4 additional Apollos shortly; then the task slot issue should also be solved.
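
A couple of commands I find useful for keeping an eye on this kind of thing (just a sketch, nothing Veeam-specific):

# Count defunct (zombie) processes left behind on the repository server
ps -eo stat,ppid,pid,cmd | awk '$1 ~ /^Z/' | wc -l

# See what is actually filling /var (logs vs. something else)
du -xh --max-depth=1 /var | sort -rh | head

# Check how much leftover job data sits in the non-default work directory
du -sh /opt/veeam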

Userlevel 7
Badge +8

Hey @Ralf, when you say /var is growing rapidly, what is it? Logs? Cache? How many MB/GB per day/job?

I’m wondering, how did you mount the XFS partition? Are you using LVM?

Userlevel 7
Badge +13

I would also be interested if you are using LVM. IMHO it is not necessary for repositories.

Userlevel 6
Badge +1

Yes, we use LVM by default, but not for the Veeam extents. /var has a size of 25 GB now; 17 GB are used, 12 GB of which are Veeam logs.

 

xfs mount options (not much tuning):

… type xfs (rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,sunit=512,swidth=6144,noquota)
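
For reference, creating such an extent from scratch usually boils down to something like this (a minimal sketch; /dev/sdb and the mount point are placeholders):

# Format the RAID volume with reflink and CRC enabled (required for fast clone)
mkfs.xfs -b size=4096 -m reflink=1,crc=1 /dev/sdb

# Mount it and check that reflink is active
mkdir -p /mnt/veeam-repo
mount /dev/sdb /mnt/veeam-repo
xfs_info /mnt/veeam-repo | grep reflink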

Userlevel 6
Badge +1

Just a note: we configured the 4510 servers with the 547FLR-QSFP 40G network adapter. This was not the best idea, as the maximum supported DAC cable length is 5 m and the only alternatives are an MPO transceiver or AOC. BiDi transceivers like the “HPE X140 40G QSFP+ LC BiDi MM” are not supported for this adapter; there is an HPE advisory that certain HPE 40/100G network adapters do not work with BiDi due to a power problem. As we do not use MPO or AOC in our datacenter, we will probably have to replace the adapters now.

Userlevel 7
Badge +8

After all, I’m now a member of the Apollo gang :). Don’t forget, like I did, to configure the controller cache; by default it was 100% read. The difference was HUGE. Interesting fact: my processing rate quadrupled compared to my old dedup appliance in the same backup scenario. I will continue my tests with a specific restore scenario to compare read performance on the backup repo.
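
For reference, a rough sketch of how the read/write cache split can be checked and changed with HPE’s ssacli; the slot number and the 10/90 ratio are assumptions, so verify them against your own controller first:

# Show the current cache settings on the array controller (slot number is an example)
ssacli ctrl slot=0 show detail | grep -i cache

# Shift the cache split toward writes, e.g. 10% read / 90% write
ssacli ctrl slot=0 modify cacheratio=10/90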

 

fio --rw=write --name=test --size=50G --direct=1 --bs=512k --numjobs=20 --ioengine=libaio --iodepth=16 --refill_buffers --group_reporting

test: (g=0): rw=write, bs=(R) 512KiB-512KiB, (W) 512KiB-512KiB, (T) 512KiB-512KiB, ioengine=libaio, iodepth=16

...

fio-3.19

Starting 20 processes

test: Laying out IO file (1 file / 51200MiB)

[line repeated once per fio job]

Jobs: 11 (f=10): [f(1),W(1),_(1),W(1),_(1),W(1),_(1),W(2),_(4),W(1),_(2),W(4)][99.5%][w=2398MiB/s][w=4795 IOPS][eta 00m:02s]

test: (groupid=0, jobs=20): err= 0: pid=276626: Fri Feb 18 12:28:06 2022

  write: IOPS=5030, BW=2515MiB/s (2637MB/s)(1000GiB/407115msec); 0 zone resets

    slat (usec): min=5, max=204138, avg=2829.68, stdev=8357.51

    clat (usec): min=1612, max=905853, avg=60461.38, stdev=32858.38

     lat (usec): min=1621, max=914383, avg=63291.66, stdev=33526.12

    clat percentiles (msec):

     |  1.00th=[   35],  5.00th=[   39], 10.00th=[   41], 20.00th=[   43],

     | 30.00th=[   45], 40.00th=[   47], 50.00th=[   50], 60.00th=[   53],

     | 70.00th=[   57], 80.00th=[   65], 90.00th=[   90], 95.00th=[  155],

     | 99.00th=[  186], 99.50th=[  197], 99.90th=[  215], 99.95th=[  224],

     | 99.99th=[  241]

   bw (  MiB/s): min= 1528, max= 4374, per=100.00%, avg=2521.54, stdev=17.08, samples=16126

   iops        : min= 3055, max= 8738, avg=5032.83, stdev=34.26, samples=16126

  lat (msec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.09%, 50=51.84%

  lat (msec)   : 100=39.01%, 250=9.04%, 500=0.01%, 750=0.01%, 1000=0.01%

  cpu          : usr=5.01%, sys=1.05%, ctx=2038785, majf=0, minf=77794

  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%

     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%

     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%

     issued rwts: total=0,2048000,0,0 short=0,0,0,0 dropped=0,0,0,0

     latency   : target=0, window=0, percentile=100.00%, depth=16

 

Run status group 0 (all jobs):

  WRITE: bw=2515MiB/s (2637MB/s), 2515MiB/s-2515MiB/s (2637MB/s-2637MB/s), io=1000GiB (1074GB), run=407115-407115msec

 

Disk stats (read/write):

  sda: ios=0/2046821, merge=0/125, ticks=0/103413090, in_queue=103413090, util=100.00%
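
As a possible counterpart for the restore comparison, a read variant of the same fio test could look like this (a sketch only, reusing the parameters of the write run above):

# Sequential read test to approximate full-restore reads from the repository
fio --rw=read --name=restore-test --size=50G --direct=1 --bs=512k --numjobs=20 --ioengine=libaio --iodepth=16 --group_reporting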
