Comprehensive data protection for all workloads
mporliod
Expert
Posts: 107
Liked: 2 times
Joined: Feb 21, 2020 11:30 am
Full Name: massimiliano porliod
Contact:

Long retention

Post by mporliod »

Good morning, I need some advice. What is the best way to perform a daily backup with 6-month retention? The environment is VMware with about 35 VMs. In your opinion, what would be the most suitable repository? I was thinking of a server with high-capacity disks and a ReFS file system. I was also considering installing Veeam directly on that server rather than keeping it virtual.
Mildur
Product Manager
Posts: 8735
Liked: 2296 times
Joined: May 13, 2017 4:51 pm
Full Name: Fabian K.
Location: Switzerland
Contact:

Re: Long retention

Post by Mildur »

Hello Massimiliano

ReFS for 6 months is fine. Use weekly synthetic full backups for Fast Cloning.
For the hardware, we recommend an enterprise-grade RAID controller with battery-backed write cache.

May I ask, have you also considered a second repository with immutable or air-gapped backups? Preferably in an offsite location? Having only a ReFS repository doesn't follow our 3-2-1 recommendation (3 copies of your data, 2 different media, 1 copy of your backups offsite).

Another option would be to virtualize your backup server and then use the physical server as a hardened repository with XFS. This would allow you to use immutability for your backup files. Immutability protects your backups against malicious attacks.
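For a rough idea of what a setup like this costs in space, here is a back-of-the-envelope sizing sketch. Every input number below (total VM size, change rate, compression ratio) is my own assumption for illustration, not something from this thread; plug in your own figures.

```python
# Back-of-envelope repository sizing for ~35 VMs, daily backups,
# 6-month (~180-day) retention, weekly synthetic fulls with Fast Clone.
# All inputs are assumptions -- replace them with your real numbers.

TB = 1024**4

source_data = 10 * TB        # assumed total size of the 35 VMs
compression = 0.5            # assumed Veeam compression/dedupe ratio
daily_change = 0.05          # assumed 5% daily change rate
retention_days = 180

full_size = source_data * compression
incr_size = source_data * daily_change * compression

# With Fast Clone, a synthetic full only writes changed blocks and
# references the unchanged ones, so each weekly full costs roughly
# one incremental of new data plus metadata -- not a whole full.
weeks = retention_days // 7
physical = full_size + retention_days * incr_size

print(f"Restore points kept: ~{retention_days}")
print(f"Synthetic fulls in retention: ~{weeks}")
print(f"Estimated physical space: {physical / TB:.1f} TB")
```

With these assumed inputs that works out to roughly 50 TB of physical space; without block cloning, 26 real fulls would add well over 100 TB on top.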

Best,
Fabian
Product Management Analyst @ Veeam Software

Re: Long retention

Post by mporliod »

If I use a Data Domain instead, is the same job policy OK?

Re: Long retention

Post by Mildur »

Please note, we recommend using deduplication appliances mainly as a target for backup copy jobs. The expected restore performance (throughput) is generally lower than from a ReFS or XFS repository.

For your design with daily backups, you can use the same job policy: write a backup every day with weekly synthetic full backups (if integrated with DDBoost).
Regular full backups are a strict requirement, because Data Domain limits how long a backup chain can be (120 restore points). Also, starting with V12.1, you can leverage immutable backups on Data Domain.
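A quick sanity check of that chain-length limit against this thread's policy (daily backups, weekly fulls) can be sketched like this:

```python
# Check a daily-backup policy against Data Domain's 120-restore-points-
# per-chain limit mentioned above. Numbers come from this thread.

MAX_CHAIN = 120              # Data Domain chain-length limit
full_interval_days = 7       # weekly (synthetic) full backups

# A chain is one full plus the incrementals until the next full:
chain_length = 1 + (full_interval_days - 1)
print(f"Chain length with weekly fulls: {chain_length} (limit {MAX_CHAIN})")

# A forever-incremental chain over the full 6-month retention would
# blow past the limit, which is why periodic fulls are required here:
forever_incremental_chain = 180
print(f"Forever-incremental chain: {forever_incremental_chain} "
      f"-> over limit: {forever_incremental_chain > MAX_CHAIN}")
```

So with weekly fulls each chain stays at 7 restore points, comfortably under the limit; the 180-point forever-incremental alternative would not be allowed.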

You can find best practices on Dell's website as well as on ours:
- Dell: https://www.dell.com/support/kbdoc/en-u ... mendations
- Veeam: https://helpcenter.veeam.com/docs/backu ... ml?ver=120

Best,
Fabian
Product Management Analyst @ Veeam Software
RGijsen
Expert
Posts: 124
Liked: 25 times
Joined: Oct 10, 2014 2:06 pm
Contact:

Re: Long retention

Post by RGijsen »

I've said this before and I'll say it again: I would strongly vote against ReFS on spinning disks. As with all dedupe solutions (which fast clone is too, in some form), your files WILL get scattered all over the volume. Unless you have a huge number of disks (50+?), I'd absolutely not go that route. Hardware dedupe boxes like StoreOnce have optimizations for this, of course, but 'simple' software implementations like Windows Dedupe or ReFS don't.

In addition, ReFS isn't really meant for standalone volumes. It has, or at least can have, metadata that allows for error detection and repair of a file from a mirror or parity copy. Great stuff, but obviously you need a mirror or parity setup for that. If you don't have one, or even when you DO have one but a file still can't be repaired, ReFS removes the file from the visible namespace while the actual data remains on disk, with no way to free up the space. In the past we had a ReFS volume with a corrupted file taking away several TBs of space. The corruption turned out to be caused by a defective memory module, so ReFS wasn't to blame for that. But even Microsoft support couldn't return the space to the volume without copying all files over to a new volume, which in turn expands all block-cloned files, so your block-cloning savings are gone.
We did that anyway, but about a year later restores in particular became so dreadfully slow, down in the megabytes-per-second range because of all the random reads those poor disks had to do, that it was just unbearable.
We then cut the cord, bought a box of reasonably cheap 10 TB (or 12 TB, I don't remember) enterprise-level SATA disks, put them behind a big HPE SmartArray controller, made one big NTFS volume of it, and never looked back for that repository. It performs WAY better than ReFS ever did after the first month, it keeps performing, and its performance is predictable too.

In our book, NTFS is WAY more mature than ReFS. NTFS has its flaws too, but at least there is a lot of knowledge about the file system, and also a lot of tooling available to fix things when it goes berserk (which hasn't happened since in our environment). The only real tool to manage or troubleshoot ReFS volumes is ReFSUtil.exe, which has very, very limited functionality. Unless you run ReFS on pure SSDs with a mirror in place (costing additional disks, but such a setup actually works very well), I'd definitely go for NTFS. I'd never trust my data to a single non-mirrored ReFS volume again until it matures.

Now for the record, we do have one off-site repository on SSD storage, which does in fact run ReFS with block cloning. While performance has definitely dropped over the years, it's still fast enough to meet our recovery times. But SSDs, too, are much slower at random I/O than at sequential.
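The fragmentation effect described above is easy to see in a toy model: treat each backup file as a list of block IDs in a content-addressed store where new blocks are appended at the "end of the disk". This is purely illustrative and not how ReFS or XFS actually lay out data, but it shows why a synthetic full built by block cloning ends up as random reads on spindles.

```python
# Toy model: a synthetic full reuses the previous chain's unchanged
# blocks, so its blocks end up scattered across allocation order.
# Illustrative only -- not the real ReFS/XFS on-disk layout.
import random

random.seed(1)
store = {}                      # block_id -> allocation offset

def write_blocks(blocks):
    for b in blocks:
        if b not in store:
            store[b] = len(store)    # appended at the "end of disk"

# Week 1: a full of 1000 blocks, then six daily incrementals that each
# change 5% of the blocks (new block IDs appended to the store).
current = [f"w1-{i}" for i in range(1000)]
write_blocks(current)
for day in range(1, 7):
    for i in random.sample(range(1000), 50):
        current[i] = f"d{day}-{i}"
    write_blocks(current)

# The week-2 synthetic full clones 'current': no new data is written,
# but reading it back sequentially means jumping around the volume.
offsets = [store[b] for b in current]
jumps = sum(1 for a, b in zip(offsets, offsets[1:]) if b != a + 1)
print(f"blocks: {len(current)}, non-contiguous jumps: {jumps}")
```

After only six incrementals the cloned full already has hundreds of non-contiguous jumps; months of dailies make nearly every read a seek, which is exactly the spinning-disk pain described above.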
tyler.jurgens
Veeam Legend
Posts: 290
Liked: 128 times
Joined: Apr 11, 2023 1:18 pm
Full Name: Tyler Jurgens
Contact:

Re: Long retention

Post by tyler.jurgens » 1 person likes this post

I'm curious whether @RGijsen has tried XFS. I have had many bad experiences with ReFS (if you do choose ReFS, make sure the hardware RAID controller is on Microsoft's HCL for ReFS), and these days I very much shy away from it unless I have no other choice.

Performance with XFS has been great, and you don't have as many concerns as with ReFS. I would avoid NTFS simply because it lacks block cloning.
Tyler Jurgens
Veeam Legend x2 | vExpert ** | VMCE | VCP 2020 | Tanzu Vanguard | VUG Canada Leader | VMUG Calgary Leader
Blog: https://explosive.cloud
Twitter: @Tyler_Jurgens BlueSky: @tylerjurgens.bsky.social
robnicholsonmalt
Expert
Posts: 116
Liked: 12 times
Joined: Dec 21, 2018 11:42 am
Full Name: Rob Nicholson
Contact:

Re: Long retention

Post by robnicholsonmalt »

Quite ironic that ReFS stands for Resilient File System when, by the accounts here, it's anything but resilient!

Re: Long retention

Post by RGijsen »

I've not tried XFS, as our repositories are on Windows for now. And with spindles, over time, not having block cloning is a blessing. Obviously, when your backup storage has enough random I/O performance (i.e. an all-flash array), then yes, block cloning and/or dedupe can save you quite a bit on drives. However, despite all the error checking on your RAID controller(s) or file system or whatever, if that one important block which all your backup files contain is deduped down to one block on your disk and it gets corrupted some way... all is gone.
For my backups I decided I don't want any of those risks, so I bought a sufficient number of relatively cheap spindle disks (which are still quite fast sequentially), and in the end it was cheaper than going with enterprise-level SSDs and ReFS. The repository I'm looking at now has 12x 10 TB 7200 rpm nearline SATA disks in good old RAID 6, and I write backups to it at about 1.4 to 1.5 GB per second. That is after Veeam's compression, so it corresponds to about 1.8 GB/s of production data. And that's just over the network, with on-host backup. When I use storage integration and do off-host backups, it's even faster, as I take the network out of the equation (we have Fibre Channel storage).
That's quick enough for us, and way more than ReFS delivered after a few months.
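Those figures are plausible for sequential writes on that spindle count; here is a quick sanity check. The per-disk sequential rate is my assumption (a typical ballpark for 7200 rpm nearline drives), not a measured value from this setup.

```python
# Sanity-checking the reported numbers: ~1.4-1.5 GB/s written to a
# 12-disk RAID 6, after Veeam compressed ~1.8 GB/s of production data.

disks = 12
raid6_data_disks = disks - 2    # RAID 6 spends two disks on parity
per_disk_seq = 0.15             # GB/s per drive, assumed ballpark for
                                # a 7200 rpm nearline SATA disk

max_seq_write = raid6_data_disks * per_disk_seq
written = 1.45                  # GB/s, midpoint of the reported range
production = 1.8                # GB/s of source data before compression

print(f"Theoretical sequential ceiling: ~{max_seq_write:.1f} GB/s")
print(f"Implied compression ratio: {production / written:.2f}x")
```

So the array is running close to its sequential ceiling, and the implied ~1.25x compression ratio is modest; the same disks doing random reads (the fragmented-ReFS case above) would deliver only a tiny fraction of this.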

Re: Long retention

Post by mporliod »

Mildur wrote: Mar 04, 2024 9:23 am Hello Massimiliano

ReFS for 6 months is fine. Use weekly synthetic full backups for Fast Cloning.
For the hardware, we recommend an enterprise-grade RAID controller with battery-backed write cache.

May I ask, have you also considered a second repository with immutable or air-gapped backups? Preferably in an offsite location? Having only a ReFS repository doesn't follow our 3-2-1 recommendation (3 copies of your data, 2 different media, 1 copy of your backups offsite).

Another option would be to virtualize your backup server and then use the physical server as a hardened repository with XFS. This would allow you to use immutability for your backup files. Immutability protects your backups against malicious attacks.

Best,
Fabian
I want to know the best way to perform a daily backup with 6-month retention. The environment is VMware with about 35 VMs.
If the repository were Swarm object storage, what would be the best way to do this kind of backup?

Thanks

Re: Long retention

Post by mporliod »

Mildur wrote: Mar 04, 2024 9:59 am Please note, we recommend using deduplication appliances mainly as a target for backup copy jobs. The expected restore performance (throughput) is generally lower than from a ReFS or XFS repository.

For your design with daily backups, you can use the same job policy: write a backup every day with weekly synthetic full backups (if integrated with DDBoost).
Regular full backups are a strict requirement, because Data Domain limits how long a backup chain can be (120 restore points). Also, starting with V12.1, you can leverage immutable backups on Data Domain.

You can find best practices on Dell's website as well as on ours:
- Dell: https://www.dell.com/support/kbdoc/en-u ... mendations
- Veeam: https://helpcenter.veeam.com/docs/backu ... ml?ver=120

Best,
Fabian
So if I understand correctly, I should introduce full backups, say one every month, together with the weekly synthetic full backups?