-
- Veeam ProPartner
- Posts: 59
- Liked: 40 times
- Joined: Jan 08, 2013 4:26 pm
- Full Name: Falk
- Location: Germany
- Contact:
Re: ReFS vs. XFS - a small real world synthetic performance comparison
My standard setup for similar-sized customers is a server with a MegaRAID 9560 and 16x 16TB drives in a RAID 6 (+ hot spare).
In this setup, the 10 Gbit network card is generally the limit, and we can write 1.1 GiB/s sustained.
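(For reference: 10 Gbit/s is about 1.25 GB/s, or roughly 1.16 GiB/s raw, so a sustained 1.1 GiB/s is essentially line rate once protocol overhead is subtracted.)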
-
- Novice
- Posts: 4
- Liked: never
- Joined: Feb 07, 2022 10:14 pm
- Contact:
Re: ReFS vs. XFS - a small real world synthetic performance comparison
Sorry, the parentheses probably hurt clarity in my statement. Microsoft definitely recommends ReFS for all kinds of workloads, but they recommend 3x mirror (or mirror-accelerated parity with 2019+) for performance. Microsoft pretty much recommends parity only be used for archives due to the significant (~order of magnitude) write performance hit.
-
- Veteran
- Posts: 1248
- Liked: 443 times
- Joined: Dec 17, 2015 7:17 am
- Contact:
Re: ReFS vs. XFS - a small real world synthetic performance comparison
Yes, but by parity they mean Storage Spaces parity, if I'm not mistaken, not RAID 6(0) per se.
-
- Novice
- Posts: 4
- Liked: never
- Joined: Feb 07, 2022 10:14 pm
- Contact:
Re: ReFS vs. XFS - a small real world synthetic performance comparison
@mkretzer, not sure if that's in reply to my comment, but if so, yes, I'm referring only to Microsoft's own storage stack (particularly S2D). I'm not sure I've heard a good explanation of why their parity erasure coding is so slow compared to other implementations, but it seems to just be accepted that it is. Hardware parity presumably should perform much better. That's why I'm curious about the details behind these backup stats: if it's S2D, is it parity? Then the poor performance is expected. If it's hardware parity RAID, or S2D and mirrored, it's surprising.
-
- Veteran
- Posts: 1248
- Liked: 443 times
- Joined: Dec 17, 2015 7:17 am
- Contact:
Re: ReFS vs. XFS - a small real world synthetic performance comparison
Both systems (XFS and ReFS) are set up the same way: both use hardware RAID in external FC storage systems (Hitachi G350).
-
- Novice
- Posts: 3
- Liked: 1 time
- Joined: Jan 04, 2023 8:11 pm
- Contact:
Re: ReFS vs. XFS - a small real world synthetic performance comparison
Greetings,
Here is some interesting and related info, focused on RAID settings and ReFS.
TL;DR: RAID type and strip/stripe settings have a big impact on performance. Jump down to Examples 1 & 2 and the conclusions.
I hope to run the same benchmarks for XFS on the same hardware to get an apples-to-apples comparison between ReFS and XFS, but I'm having some issues installing an Ubuntu hardened XFS repo using Veeam's latest instructions.
HP G9 2U server
Hardware RAID with 4GB cache
12x 3.5" SAS 7.2K 4TB drives for data
2x SATA SSDs for OS
Sorry, I can't figure out how to get tables to display, so I didn't post the detailed results ... here is a summary. The tests are Veeam's suggested synthetic benchmark tests. Note that the test file size must be larger than the cache (>4GB in my case) and the test must run long enough to diminish the effect of filling the cache at the start. When I ran tests with 1GB file sizes, they were basically executing entirely inside the cache and I was getting ridiculous numbers.
1. diskspd.exe -Sh -d600 #X
2. diskspd.exe -c25G -b512K -w100 -Sh -d600 d:\testfile.dat
3. diskspd.exe -c100G -b512K -w50 -r4K -Sh -d600 d:\testfile.dat
4. diskspd.exe -c100G -b512K -w67 -r4K -Sh -d600 d:\testfile.dat
5. diskspd.exe -c100G -b512K -r4K -Sh -d600 d:\testfile.dat
6. diskspd.exe -c100G -b512K -Sh -d600 d:\testfile.dat
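Since I want to repeat these on the XFS repo later, here is the rough fio mapping of tests 2-6 I'm planning to use. This is my own translation of the diskspd flags, not something I've verified against Veeam guidance, and /backup01/testfile.dat is just a placeholder path; queue depths and alignment may need tuning to match exactly:
2. fio --name=activefull --filename=/backup01/testfile.dat --size=25G --bs=512k --rw=write --ioengine=libaio --direct=1 --runtime=600 --time_based
3. fio --name=merge --filename=/backup01/testfile.dat --size=100G --bs=512k --rw=randrw --rwmixwrite=50 --ioengine=libaio --direct=1 --runtime=600 --time_based
4. fio --name=reverseinc --filename=/backup01/testfile.dat --size=100G --bs=512k --rw=randrw --rwmixwrite=67 --ioengine=libaio --direct=1 --runtime=600 --time_based
5. fio --name=restoreworst --filename=/backup01/testfile.dat --size=100G --bs=512k --rw=randread --ioengine=libaio --direct=1 --runtime=600 --time_based
6. fio --name=restorebest --filename=/backup01/testfile.dat --size=100G --bs=512k --rw=read --ioengine=libaio --direct=1 --runtime=600 --time_based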
Individual drive (typical results):
Test (Veeam operation)                              | Total MiB/s / IOPS | Read MiB/s / IOPS | Write MiB/s / IOPS
1. Direct Disk Max Read                             | 206.11 / 3297.8    | 206.35 / 3301.64  | 0 / 0
2. Active Full / Forward Incremental                | 202.05 / 404.11    | 0 / 0             | 202.05 / 404.11
3. Synthetic Full / Merge operations                | 52.57 / 105.14     | 26.35 / 52.7      | 26.22 / 52.44
4. Reverse Incremental                              | 50.02 / 100.04     | 16.54 / 33.09     | 33.48 / 66.96
5. Restore / Health Check / SureBackup (worst case) | 55.71 / 111.41     | 55.71 / 111.41    | 0 / 0
6. Restore / Health Check / SureBackup (best case)  | 204.88 / 409.76    | 204.88 / 409.76   | 0 / 0
Theoretical vs. observed maximums:
Config      | Theor. Read MiB/s / IOPS | Theor. Write MiB/s / IOPS | Obs. Read MiB/s / IOPS | Obs. Write MiB/s / IOPS
8x HDD R10  | 1600 / 26400             | 800 / 1600                | 1036.6* / 13129.5      | 810.61 / 1621.22
12x HDD R10 | 2400 / 39600             | 1200 / 2400               | 1211.24* / 19100.75    | 1200.68 / 2401.36
12x HDD R60 | 1600 / 26400             | 1600 / 3200               | 1540.33 / 23014.66     | 1603.42 / 3206.83
12x HDD R6  | 2000 / 33000             | 2000 / 4000               | 1908.74 / 26968.79     | 2003.91 / 4007.82
* I think a small queue depth setting is hurting these read results; queue depth 8 with CrystalDiskMark tests was much better.
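For context on the theoretical columns: they are roughly the single-drive results from the first table scaled by the number of spindles that can service the I/O (all drives for RAID 10 reads, data drives only for parity reads and for all writes). For example, 12x HDD R6 reads: 10 data drives x ~200 MiB/s ≈ 2000 MiB/s and 10 x ~3300 IOPS ≈ 33000 IOPS; 8x HDD R10 writes: 4 mirrored pairs x ~200 MiB/s ≈ 800 MiB/s and 4 x ~400 IOPS ≈ 1600 IOPS.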
Very interesting is the effect of strip/stripe size. Veeam's documented example suggests a 128K strip size so that 4 strips = a 512K stripe. However, I found that the actual "best" throughput was achieved with much larger sizes, with various trade-offs. Unfortunately I don't have anywhere I can post the spreadsheets with all the details.
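To decode the X/Y notation in the examples below: full stripe width = strip size x number of data drives. For the 8-drive RAID 10 (4 data strips), a 512K strip gives a 2M stripe and a 128K strip gives 512K; for the 12-drive RAID 6 (10 data strips), a 512K strip gives a 5M stripe and a 128K strip gives 1.25M.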
Example 1
8x HDDs R10, 512K/2M strip/stripe, ReFS 64K vs. 128K/512K, ReFS 64K
Tests 1 & 2 were basically the same.
Tests 3 & 4 got faster the larger the strip/stripe.
Test 5 improved by about 4% to 5% with each increase in strip/stripe size.
Test 6 was a whopping 29% faster, but then only marginally faster for the next strip/stripe size increase.
Example 2
12x HDDs R6, 512K/5M strip/stripe, ReFS 64K vs. 128K/1.25M, ReFS 64K
Test 1 was 100+% faster than Example 1, at 1685 MB/s.
Test 2 was 148% faster than Example 1, at 2000 MB/s.
Tests 3 & 4 were significantly slower.
Test 5 was marginally slower.
Test 6 in this case was a giant 87% faster on top of the 29% increase from the config above, hitting 1908.74 MB/s.
I also tested but didn't post results for 12xHDDs R10 & 12xHDDs R60.
Conclusions:
RAID 6 provides peak performance for active full backups (#2) and best-case restores (#6) if you have a good hardware RAID card with a lot of cache.
Worst-case restores (#5) take a very small hit in performance.
I think the performance hit to synthetic full / merge operations (#3) is worth it, since XFS block cloning would make up for that (see the mkfs note after this list).
Reverse incrementals (#4) would also suffer, but they can't be used with an immutable XFS repo anyway, so who cares.
I also measured RAID 10 and RAID 6 rebuild times (11 hrs optimal, tested with no load; 53 hrs estimated worst case @ 20%). Considering my Veeam backup server is under load mostly overnight and on weekends, there are a lot of hours available to run a rebuild at 100% speed.
R6 parity array initialization time is < 6 hrs with no load.
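On the XFS side, once I get the repo installed, the format I'm planning to use for block clone support is below. This is my understanding of the reflink-enabled XFS setup for Veeam repositories; the device name is just a placeholder, so double-check the options against the current docs:
mkfs.xfs -b size=4096 -m reflink=1,crc=1 /dev/sdX1
The reflink option is what should let fast clone absorb the synthetic full / merge hit from #3.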
Questions:
My Ubuntu 20 Server install was crashing during setup when prompted for the user and machine name. I have multiple NICs, and using only one port got me past that crash point. Now it's crashing later on. I'm going on vacation for a few days, so I will try again next week.
Does anyone have suggestions/tips/tricks for installing Ubuntu 20 server on HP G9 hardware? I am eager to run the same tests with XFS.
Any suggestions as to where I could publish my detailed testing results?
Thanks!