TechIsCool
Novice
Posts: 6
Liked: 1 time
Joined: Jan 31, 2013 6:17 pm

Backup Copy Job != ZFS Deduplication

Post by TechIsCool »

Support Case: 00460180

Hey everyone, I have been working on this issue for about a week and thought I would open it up to the community, since it seems that Veeam does not think it's a software problem.

Let me give a little background information.

Specs:
OS: OpenIndiana 151a7 with Nappit front end
Case: Supermicro SC846 TQ-R900B
Motherboard: SUPERMICRO X9DR3-LN4F+
CPU: 2x Intel Xeon E5-2620 / 2 GHz
HBA Cards: 3x (IBM Serveraid M1015 SAS/SATA)
Memory: 64 GB (4x Samsung 16 GB M393B2G70BH0-CH9)

Both servers currently have a single hard drive dedicated to this test, so we don't get conflicting compression or dedupe ratios.

Backup Job layout
Dell PS4100X -> Veeam Backup -> SMB Share server 1 (Backup Jobs)

Backup Copy Job
SMB Share TestDedupe -> Veeam Backup -> SMB Share server 2 (Backup Copy Job)

These are great servers and I have had great results with deduplication on Backup Jobs. That has not been the case for Backup Copy Jobs. Let's say I take 3 VMs and run two active full backups from a Backup Job a week apart, with about 10 GB of changed data out of a 100 GB file. Without deduplication we should have about 220 GB worth of files; with deduplication we should have about 110 GB without compression, so I should get a deduplication ratio somewhere around 2.0 +/- 0.1. This works without any issue. Now, if I take that same job and create a Backup Copy Job that does synthetic fulls for retention, so that we have two copies, I would have expected a ratio around 2.0 +/- 0.1 as well. That is not the case: I normally get about 1.05.
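For reference, here is a rough sketch of the arithmetic I'm expecting (the 220 GB and 110 GB figures are the approximations from above):

    # two active fulls: ~220 GB of backup files, of which ~110 GB is unique data
    echo "scale=2; 220 / 110" | bc    # 2.00 -- roughly what I do see on Backup Jobs
    # the Backup Copy target holds the same bit-identical fulls, yet I measure ~1.05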

Does anyone else have these issues, and if so, how did you solve them?
Gostev
Chief Product Officer
Posts: 31457
Liked: 6647 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Re: Backup Copy Job != ZFS Deduplication

Post by Gostev »

Very strange. I mean, ZFS deduplication is very strange if it does not dedupe bit-identical data blocks, because a Backup Copy job is all about taking data blocks from one file and putting them into another file unmodified.

Do you possibly have compression enabled in one job, but not in the other?
tsightler
VP, Product Management
Posts: 6009
Liked: 2842 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler

Re: Backup Copy Job != ZFS Deduplication

Post by tsightler »

Unless I'm misreading your statement, I believe your observations are actually showing correct deduplication.

For the first example, you state that you ran two active full backups, so you would end up with something like this:

1st Active Full -- 100GB VBK
2nd Active Full -- 100GB VBK

So that's 200GB of data, of which roughly 90GB will be duplicate (based on your 10GB of change), which would give you a savings somewhere around 1.8x, although it will probably be slightly less than that based on ZFS block size and how distributed the changed blocks are.

However, when you use a backup copy you will instead get something like this:

1st Full Copy -- 100GB VBK
2nd Incremental Copy -- 10GB VIB

Obviously, because Veeam only copied the 10GB of changed data there will be much less to dedupe in the second scenario. It will likely find some savings, but not a lot.
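To put numbers on it, a quick sketch with the same round figures:

    # Backup Copy chain: one 100 GB VBK plus one 10 GB VIB = 110 GB logical data,
    # with essentially no duplicate blocks between the two files:
    echo "scale=2; 110 / 110" | bc    # 1.00 -- a little incidental overlap
                                      # would push this toward the ~1.05 you see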

Perhaps you are saying that you are using GFS for the backup copies and having the synthetic process build a new full, so you do have two VBKs on the target? If GFS is enabled on the job and the GFS process has finished, so that you now have two full VBK files on the target, are you still seeing this?
TechIsCool
Novice
Posts: 6
Liked: 1 time
Joined: Jan 31, 2013 6:17 pm

Re: Backup Copy Job != ZFS Deduplication

Post by TechIsCool »

Gostev wrote: Very strange. I mean, ZFS deduplication is very strange if it does not dedupe bit-identical data blocks, because a Backup Copy job is all about taking data blocks from one file and putting them into another file unmodified.

Do you possibly have compression enabled in one job, but not in the other?
Compression is disabled on both the Backup Job and the Backup Copy Job, and the connection setting can be LAN or WAN; it seems to be about the same either way.

tsightler wrote:Unless I'm misreading your statement, I believe your observations are actually showing correct deduplication.

For the first example, you state that you ran two active full backups, so you would end up with something like this:

1st Active Full -- 100GB VBK
2nd Active Full -- 100GB VBK

So that's 200GB of data, of which roughly 90GB will be duplicate (based on your 10GB of change), which would give you a savings somewhere around 1.8x, although it will probably be slightly less than that based on ZFS block size and how distributed the changed blocks are.

However, when you use a backup copy you will instead get something like this:

1st Full Copy -- 100GB VBK
2nd Incremental Copy -- 10GB VIB

Obviously, because Veeam only copied the 10GB of changed data there will be much less to dedupe in the second scenario. It will likely find some savings, but not a lot.

Perhaps you are saying that you are using GFS for the backup copies and having the synthetic process build a new full, so you do have two VBKs on the target? If GFS is enabled on the job and the GFS process has finished, so that you now have two full VBK files on the target, are you still seeing this?
Sorry, I should have been more specific. Yes, I have been trying it both with GFS, where it created synthetic fulls, and with the method I described above, which was just creating a clone of the Backup Copy Job. If the files were bit-identical, it should not matter which job moved the data, as long as there was no changed data in the source location.

When checking dedupe ratios, there have always been two VBK files, either from the same host and the same backup time, or from the same host at different backup times, to see if that made a difference.
tsightler
VP, Product Management
Posts: 6009
Liked: 2842 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler

Re: Backup Copy Job != ZFS Deduplication

Post by tsightler »

What's your fixed block size on ZFS (i.e. the dedupe block size)? Do you definitely have compression disabled on the copy job repository? Alignment?
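If you want to verify from the shell, something like this would show it (the tank/veeam-copy dataset name is just a placeholder for your actual copy job target):

    zfs get recordsize,compression,dedup tank/veeam-copy   # dataset properties
    zpool list tank                                        # the DEDUP column shows the pool-wide ratio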
TechIsCool
Novice
Posts: 6
Liked: 1 time
Joined: Jan 31, 2013 6:17 pm

Re: Backup Copy Job != ZFS Deduplication

Post by TechIsCool »

tsightler wrote:What's your fixed block size on ZFS (i.e. the dedupe block size)? Do you definitely have compression disabled on the copy job repository? Alignment?
Record size is the default right now, 128K.

Just checked the jobs: yes, compression is disabled on the Backup Copy Job and also on the Backup Job.
TechIsCool
Novice
Posts: 6
Liked: 1 time
Joined: Jan 31, 2013 6:17 pm

Re: Backup Copy Job != ZFS Deduplication

Post by TechIsCool »

Some more information: I created a 1GB test VMDK made up of 8 files ranging from small block sizes to large. I backed this file up twice using a Backup Job (Active Full) and got a dedupe ratio of 1.99. Then I created two new Backup Copy Jobs and let them both move the two sets of data, one after another. Somehow with this fileset it works correctly: a dedupe ratio of 2.00 on the files. Still testing; I would like to get to the bottom of this problem.
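For anyone trying to reproduce this, the ZFS debugger can show how many blocks are referenced once versus multiple times, which is more precise than the pool-wide ratio (pool name is a placeholder):

    zdb -DD tank    # prints dedup (DDT) statistics, including a reference-count histogram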
tsightler
VP, Product Management
Posts: 6009
Liked: 2842 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler

Re: Backup Copy Job != ZFS Deduplication

Post by tsightler »

To get decent deduplication from ZFS with Veeam backup files, you'll likely need to use a much smaller record size. At 128K, dedupe requires fixed 128K blocks that are completely identical, and even a small difference from one file to another will keep that from being the case. You'll also want to check the align blocks option on the repository. I've seen quite good success with a 16K record size on ZFS, but you'll need significantly more memory for the dedupe table.
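A sketch of what that could look like (dataset name is a placeholder; also note that recordsize only applies to data written after the change, so existing backup files would need to be copied in again):

    zfs create -o recordsize=16K -o dedup=on -o compression=off tank/veeam-copy
    # each in-core DDT entry costs a few hundred bytes, and 16K records mean
    # 8x as many entries as 128K records -- hence the extra memory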
