christiankelly
Service Provider
Posts: 128
Liked: 11 times
Joined: May 06, 2012 6:22 pm
Full Name: Christian Kelly
Contact:

Thoughts on per-VM backup files: per-disk split

Post by christiankelly »

Was there any thought given to splitting VMs out per disk when choosing "per-VM backup files"? I have some VMs that are 2+ TB but are made up of multiple sub-1TB disk drives. Given the Windows Server 2016 dedupe recommendation that files not exceed 1TB, it would be nice to split the backup files down to the disk level rather than having them all in one per-VM file. Is that something Veeam is considering, or are there reasons this wouldn't be a good idea?

Thanks,
Gostev
Chief Product Officer
Posts: 31456
Liked: 6647 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Thoughts on per-VM backup files

Post by Gostev »

Christian - we have no plans for a disk-level split, mostly because too much re-design would be required on our end for a corner case (monster VMs, and one specific to Windows dedupe, as other dedupe engines we integrate with do not have this limit). That said, we've been stress testing Windows Server 2016 dedupe a lot lately, and the real file size limit seems to be around 4 TB, so this should further narrow down the scope of VMs not suitable for Windows Server 2016 dedupe. Thanks!
christiankelly
Service Provider
Posts: 128
Liked: 11 times
Joined: May 06, 2012 6:22 pm
Full Name: Christian Kelly
Contact:

Re: Thoughts on per-VM backup files

Post by christiankelly »

Good to hear that 2016 handles files larger than 1TB. I agree it's a corner case, and I'm sure dedupe will continue to get better. Thanks!
bg.ranken
Expert
Posts: 121
Liked: 21 times
Joined: Feb 18, 2015 8:13 pm
Full Name: Randall Kender
Contact:

[MERGED] Split Backup Files in Repository Settings

Post by bg.ranken »

Hello,

Not really sure where to post feature requests, so if this is the wrong location I apologize.

I really like the per-VM feature in v9 that you can set on a repository, but is there any way to also add a feature to split files larger than a certain size? I have been evaluating Windows Server 2016 Technical Preview 4 and would like to use the data deduplication feature; however, one limit on that is the 1TB file limit (same as it was on 2012 R2). While it does work for files larger than 1TB in my testing, I'd still like to stay within the supported limits and not have to deal with larger files. The per-VM feature definitely helps, but it doesn't help for my VMs that are over 1TB by themselves. I know Gostev mentioned a while ago that he had been talking directly with the dedup team at Microsoft about the 1TB limit, but so far I don't think he's posted anything further regarding his findings.

It would be nice if there was a way to set a repository (say, one used for a Veeam backup copy job with lots of GFS retention points) to split any files over 1TB into multiple files. I've seen the software already do this on backup copy jobs when they get interrupted or fail, so why not add an option that would do this automatically? I assume this feature already exists for tape as well, though I'm not positive since I haven't used tape with Veeam.
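Just to illustrate the kind of split I have in mind, here is a rough sketch in Python, with an assumed 1TB threshold and generic ".partN" names. This is not Veeam's backup format or anything the product actually does today - only the general idea of the requested option:

```python
# Hypothetical illustration of splitting a large backup file into sub-1 TB parts.
# Threshold, buffer size, and ".partN" naming are assumptions for illustration.

SPLIT_SIZE = 1 * 1024 ** 4      # 1 TB per part (assumed threshold)
COPY_BUFFER = 64 * 1024 * 1024  # stream in 64 MB pieces to keep memory use low

def split_backup_file(path: str) -> list[str]:
    """Write <path>.part1, <path>.part2, ... each at most SPLIT_SIZE bytes."""
    parts, part_no, written, dst = [], 0, 0, None
    with open(path, "rb") as src:
        while True:
            data = src.read(COPY_BUFFER)
            if not data:
                break
            # start a new part file when the current one would exceed the threshold
            if dst is None or written + len(data) > SPLIT_SIZE:
                if dst:
                    dst.close()
                part_no += 1
                part_path = f"{path}.part{part_no}"
                dst = open(part_path, "wb")
                parts.append(part_path)
                written = 0
            dst.write(data)
            written += len(data)
    if dst:
        dst.close()
    return parts
```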
bg.ranken
Expert
Posts: 121
Liked: 21 times
Joined: Feb 18, 2015 8:13 pm
Full Name: Randall Kender
Contact:

Re: Thoughts on per-VM backup files: per-disk split

Post by bg.ranken »

So I literally got this reply from a Microsoft rep right now:
You are correct on the performance impact from the (larger) file size. The bigger issue however is with NTFS’ limitation on the number of file extents a file can have. When that limit is reached NTFS will start failing writes to the file and there you have a reliability issue.

That being said, for scenarios like backup/archive where data is written once and only once then gets deduplicated and becomes essentially read-only afterwards the 1TB limitation actually doesn’t apply. It isn’t very clear to me whether your case falls into this camp or not from your email but if yes you are not subject to this limitation.

The reason we put out a warning like this is because all our official scenarios: general purpose file server, VDI over CSV and virtualized backup via DPM, anticipate frequent in-place writes after the data set is deduplicated and hence are all affected by this limitation.

We will definitely pass your feedback to our core file system and work with them to evaluate various options to help improve customer experience next.
While not really official word from Microsoft (I really wish they would just post something like this somewhere), it does seem that for our case regarding backup files we should be in the clear. We just need to make sure any active backup files aren't being deduplicated, which shouldn't be much of an issue since I had planned on doing that from the start.

For instance, if you set a daily backup copy job to keep 4 weekly backup files, 12 monthly backup files, and 30 restore points, setting Microsoft dedup to only deduplicate files older than 35-40 days should be perfectly fine.
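To sanity-check a schedule like that before turning dedup on, something as simple as the script below can report which files in a repository folder have already aged past the cutoff. A minimal sketch only, assuming a hypothetical repository path and the 40-day value above; it is not a Veeam or Microsoft tool, and the age/size checks just mirror the guidance discussed in this thread:

```python
# Report which backup files are older than the proposed dedup age cutoff and
# flag anything over 1 TB. REPO path and the 40-day cutoff are assumptions.
import os
import time

REPO = r"E:\Backups\CopyJob"   # hypothetical repository folder
MIN_FILE_AGE_DAYS = 40         # matches the 35-40 day suggestion above
ONE_TB = 1024 ** 4

now = time.time()
for name in sorted(os.listdir(REPO)):
    path = os.path.join(REPO, name)
    if not os.path.isfile(path):
        continue
    age_days = (now - os.path.getmtime(path)) / 86400
    size_tb = os.path.getsize(path) / ONE_TB
    status = "dedup-eligible" if age_days >= MIN_FILE_AGE_DAYS else "still in retention window"
    flag = " (>1 TB - check dedupe guidance)" if size_tb > 1 else ""
    print(f"{name}: {age_days:.0f} days old, {size_tb:.2f} TB, {status}{flag}")
```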
Gostev
Chief Product Officer
Posts: 31456
Liked: 6647 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Thoughts on per-VM backup files: per-disk split

Post by Gostev »

That is correct. I discussed this limitation directly with the development team behind Windows Server deduplication a few months ago, so I know exactly what causes this 1TB limitation from a technical perspective. To make a long story short, this limitation definitely ONLY applies to files which need to be modified after they have already been deduped, which is obviously not the case with archived backup files, which never change once created.

But it's very good to see Microsoft pushing this information down even to their sales reps! Because they made it very clear to me that they do not intend to test their dedupe with files larger than 1TB regardless. Perhaps together we can make them change their mind ;) I did stress to them multiple times that this is extremely important for wider Windows Server 2016 dedupe adoption by image-level backup applications.
bg.ranken
Expert
Posts: 121
Liked: 21 times
Joined: Feb 18, 2015 8:13 pm
Full Name: Randall Kender
Contact:

Re: Thoughts on per-VM backup files: per-disk split

Post by bg.ranken » 1 person likes this post

Thanks Gostev, I'm glad to hear you're pushing this. While I like the idea of dedup appliances, if I can get a cheaper device that has more performance, fits closer to Veeam's reference architecture, and gives me similar dedup ratios, then I'm all for it. In terms of 2016 adoption, we hadn't planned on rolling out any servers until 2016 R2, but with the changes to dedup we'd probably move our backup server to it within a month of release to get the new dedup engine. And that would be the Trojan horse I need to push 2016 for the rest of our environment, so I definitely agree with you on that.

BTW, a little more info from the Microsoft rep, and he also confirmed they would be updating their online documentation to cover these specific use cases:
There is one more thing. Even if your use case is read-only-after-dedup, we don’t support optimizing files of arbitrary sizes. What will happen here is that we will only optimize the first 4TBs of the file. So if you have a 16TB file, the remaining 12 TB data on it won’t be processed by Dedup. Additionally, the large file size will put more burden on the resources (mainly memory) hence you might run into issues on not-properly-configured hardware you wouldn’t hit otherwise. In short, this isn’t as cut-n-dry as it might’ve sounded to you in my previous emails. There will be some fine-prints when we get the message out more broadly.
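To make that fine print concrete, here is a quick back-of-the-envelope check of the 4 TB optimization cap, using made-up file sizes (the 16 TB case matches the example in the quote):

```python
# Back-of-the-envelope check of the "first 4 TB only" optimization cap quoted above.
# File sizes are hypothetical examples, not measurements.
OPTIMIZE_CAP_TB = 4.0

for file_tb in (0.8, 3.0, 4.0, 16.0):
    processed = min(file_tb, OPTIMIZE_CAP_TB)
    untouched = file_tb - processed
    print(f"{file_tb:>5.1f} TB file -> {processed:.1f} TB optimized, "
          f"{untouched:.1f} TB left unprocessed by dedup")
```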
Gostev
Chief Product Officer
Posts: 31456
Liked: 6647 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Thoughts on per-VM backup files: per-disk split

Post by Gostev »

That's awesome to hear, as I guess this means they've actually listened to me :D

And it is useful to know their plans regarding deduplication of large files in the release... because with the current build (TP4), we found that files larger than 4TB are simply skipped by the dedupe engine entirely.
bg.ranken
Expert
Posts: 121
Liked: 21 times
Joined: Feb 18, 2015 8:13 pm
Full Name: Randall Kender
Contact:

Re: Thoughts on per-VM backup files: per-disk split

Post by bg.ranken »

Yeah, well, it's just one tech that I've spoken to so far, so that doesn't mean they will necessarily implement it. I'll have to do some testing on TP4; I didn't realize it skips files larger than 4TB. From what the Microsoft rep said it shouldn't act like that, so perhaps it's something they've implemented in builds beyond TP4, or he might be mistaken.
mkaec
Veteran
Posts: 462
Liked: 133 times
Joined: Jul 16, 2015 1:31 pm
Full Name: Marc K
Contact:

Re: Thoughts on per-VM backup files: per-disk split

Post by mkaec » 1 person likes this post

Thanks for getting to the bottom of this. I think Microsoft is missing out on a good revenue opportunity here. They are so close to being a solid alternative to dedicated dedup appliances.