-
- Novice
- Posts: 6
- Liked: 1 time
- Joined: Jan 31, 2013 6:17 pm
- Contact:
Backup Copy Job != ZFS Deduplication
Support Case: 00460180
Hey everyone, I have been working on this issue for about a week and thought I would open it up to the community, since it seems Veeam does not think it's a software problem.
Let me give a little background information.
Specs:
OS: OpenIndiana 151a7 with Nappit front end
Case: Supermicro SC846 TQ-R900B
Motherboard: SUPERMICRO X9DR3-LN4F+
CPU: 2x Intel Xeon E5-2620 / 2 GHz
HBA Cards: 3x (IBM Serveraid M1015 SAS/SATA)
Memory: 64GB (4x Samsung memory - 16 GB M393B2G70BH0-CH9)
Both servers currently have a single hard drive set aside for this test so we don't get conflicting compression or dedupe ratios.
Backup Job layout
Dell PS4100X -> Veeam Backup -> SMB Share server 1(Backup Jobs)
Backup Copy Job
SMB Share TestDedupe -> Veeam Backup -> SMB Share server 2(Backup Copy Job)
These are great servers and I have had great results with deduplication on Backup Jobs. That has not been the case for Backup Copy Jobs. Let's say I take 3 VMs and run two active full backups from a Backup Job a week apart, with about 10GB of changed data out of a 100GB file. Without deduplication we should have about 220GB worth of files; with deduplication (and no compression) we should have about 110GB, so I expect a deduplication ratio somewhere around 2.0 ± 0.10, and that is what I get without any issue. But if I take that same job and create a Backup Copy Job that builds synthetic fulls for retention, with two copies on the target I would have expected the same 2.0 ± 0.10 deduplication ratio. That is not the case: I normally get about 1.05.
Does anyone else have these issues, and if so, how have you solved them?
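Since the two boxes need to be apples-to-apples for the ratios to mean anything, it may be worth double-checking that both repository datasets carry identical ZFS settings before comparing numbers. Below is a minimal sketch along those lines; the pool/dataset names tank1/veeam and tank2/veeam-copy are placeholders, not taken from this post.

#!/usr/bin/env python3
# Sketch: compare the ZFS settings that affect dedupe on both Veeam repositories.
# Run on each storage host (or adapt to ssh). Dataset/pool names are placeholders.
import subprocess

DATASETS = ["tank1/veeam", "tank2/veeam-copy"]
PROPS = "recordsize,compression,dedup,checksum"

def zfs_get(dataset):
    # `zfs get -H -o property,value` prints one "property<TAB>value" pair per line
    out = subprocess.check_output(
        ["zfs", "get", "-H", "-o", "property,value", PROPS, dataset], text=True)
    return dict(line.split("\t") for line in out.strip().splitlines())

for ds in DATASETS:
    print(ds, zfs_get(ds))

# The pool-wide dedup ratio ZFS reports (where the 2.0 vs 1.05 numbers come from)
for pool in ("tank1", "tank2"):
    ratio = subprocess.check_output(
        ["zpool", "get", "-H", "-o", "value", "dedupratio", pool], text=True).strip()
    print(pool, "dedupratio =", ratio)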
-
- Chief Product Officer
- Posts: 31806
- Liked: 7299 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: Backup Copy Job != ZFS Deduplication
Very strange. I mean, ZFS deduplication is behaving very strangely if it does not dedupe bit-identical data blocks, because the Backup Copy job is all about taking data blocks from one file and putting them into another file unmodified.
Do you possibly have compression enabled in one job, but not enabled in another one?
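One way to test this point directly is to hash the two full backup files in fixed, aligned chunks matching the ZFS recordsize: if the Backup Copy job really writes the same blocks unmodified, a large share of the 128K chunks should hash identically across the two files, and that share is roughly what ZFS dedupe could reclaim. A rough sketch, assuming compression is off and using placeholder file paths:

#!/usr/bin/env python3
# Sketch: estimate how much of two backup files ZFS dedupe could collapse at a given
# recordsize, by hashing aligned fixed-size chunks. File paths are placeholders.
import hashlib

RECORDSIZE = 128 * 1024  # match the dataset's recordsize property
FILES = ["/tank2/veeam-copy/job1.vbk", "/tank2/veeam-copy/job2.vbk"]

def chunk_hashes(path):
    hashes = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(RECORDSIZE)
            if not chunk:
                break
            hashes.append(hashlib.sha256(chunk).digest())
    return hashes

seen = set()
total = dup = 0
for path in FILES:
    for h in chunk_hashes(path):
        total += 1
        if h in seen:
            dup += 1        # this chunk already exists somewhere -> dedupe candidate
        else:
            seen.add(h)

unique = total - dup
print(f"chunks: {total}, duplicates: {dup}, estimated dedupe ratio ~ {total / unique:.2f}")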
-
- VP, Product Management
- Posts: 6035
- Liked: 2860 times
- Joined: Jun 05, 2009 12:57 pm
- Full Name: Tom Sightler
- Contact:
Re: Backup Copy Job != ZFS Deduplication
Unless I'm misreading your statement, I believe your observations are actually showing correct deduplication.
For the first example you state that you ran two active full backups, so you would end up with something like this:
1st Active Full -- 100GB VBK
2nd Active Full -- 100GB VBK
So that's 200GB of data, of which roughly 190GB will be duplicate (based on your 10GB of change), which would give you savings of somewhere around 1.95x, although it will probably be slightly less than that depending on the ZFS block size and how the changed blocks are distributed.
However, when you use a backup copy you will instead get something like this:
1st Full Copy -- 100GB VBK
2nd Incremental Copy -- 10GB VIB
Obviously, because Veeam only copied the 10GB of changed data there will be much less to dedupe in the second scenario. It will likely find some savings, but not a lot.
Perhaps you are saying that you are using GFS for the backup copies and having the synthetic process build a new full, so you do have two VBKs on the target? If GFS is enabled on the job and the GFS process has finished, do you now have two full VBK files on the target and still see this?
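As a back-of-the-envelope check, the two scenarios above can be plugged into the simple model ratio = logical data written / unique data stored. The numbers below are just the figures from this thread (100GB full, ~10GB change), not measured results, and real ratios shift a little with intra-file duplicates, compression, and block alignment:

# Rough dedupe expectations using the figures from this thread.
# Ratio modelled as logical data written / unique data actually stored.

full_gb, changed_gb = 100, 10

# Scenario 1: two active full VBKs on the target.
logical = 2 * full_gb                # 200GB written
unique = full_gb + changed_gb        # first full plus the blocks that changed
print(f"two fulls        : ~{logical / unique:.2f}x")   # prints ~1.82; ~1.8-2.0x in practice

# Scenario 2: one full VBK plus a 10GB VIB (normal backup copy chain).
logical = full_gb + changed_gb       # 110GB written
unique = full_gb + changed_gb        # the VIB repeats almost nothing from the VBK
print(f"full + increment : ~{logical / unique:.2f}x")   # prints ~1.00; little to dedupe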
-
- Novice
- Posts: 6
- Liked: 1 time
- Joined: Jan 31, 2013 6:17 pm
- Contact:
Re: Backup Copy Job != ZFS Deduplication
Gostev wrote: Do you possibly have compression enabled in one job, but not enabled in another one?
Compression is disabled on both the Backup Job and the Backup Copy Job, and the connection setting (LAN or WAN) doesn't seem to make much difference either way.
tsightler wrote: Unless I'm misreading your statement, I believe your observations are actually showing correct deduplication.
Sorry, I should have been more specific. Yes, I have been trying it both ways, including with GFS, where it creates synthetic fulls. The method I described above was simply to create a clone of the Backup Copy. If the files were bit-identical, it should not matter which job moved the data, as long as there was no changed data in the source location.
When checking dedupe ratios there have always been two VBK files on the target, either from the same host and the same backup time, or the same host and a different backup time, to see if that made any difference.
-
- VP, Product Management
- Posts: 6035
- Liked: 2860 times
- Joined: Jun 05, 2009 12:57 pm
- Full Name: Tom Sightler
- Contact:
Re: Backup Copy Job != ZFS Deduplication
What's your fixed block size on ZFS (i.e. the dedupe block size)? Do you definitely have compression disabled on the copy job repository? Alignment?
-
- Novice
- Posts: 6
- Liked: 1 time
- Joined: Jan 31, 2013 6:17 pm
- Contact:
Re: Backup Copy Job != ZFS Deduplication
tsightler wrote: What's your fixed block size on ZFS (i.e. the dedupe block size)? Do you definitely have compression disabled on the copy job repository? Alignment?
Record size is at the default of 128K right now.
I just checked the jobs: yes, compression is disabled on the Backup Copy Job and also on the Backup Job.
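With the recordsize confirmed, ZFS itself can estimate what it would achieve: zdb -S walks a pool and prints a simulated dedup table plus an estimated ratio, without dedup having to be enabled (it can take a long time and a lot of RAM on a large pool). A minimal sketch with a placeholder pool name:

#!/usr/bin/env python3
# Sketch: ask ZFS what dedupe it could achieve on this pool.
# `zdb -S <pool>` simulates the dedup table; its last line summarises the ratios,
# e.g. "dedup = 1.05, compress = 1.00, copies = 1.00, dedup * compress / copies = 1.05".
import subprocess

POOL = "tank2"   # placeholder name for the backup copy repository pool

out = subprocess.check_output(["zdb", "-S", POOL], text=True)
print(out.splitlines()[-1])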
-
- Novice
- Posts: 6
- Liked: 1 time
- Joined: Jan 31, 2013 6:17 pm
- Contact:
Re: Backup Copy Job != ZFS Deduplication
So, some more information. I created a 1GB test VMDK made up of 8 files ranging from small block sizes to large. I backed this file up twice using a Backup Job (Active Full) and get a dedupe ratio of 1.99. Then I created two new Backup Copy Jobs and let them both move the two sets of data, one after the other. Somehow with this file set it works correctly: a dedupe ratio of 2.00 on the files. Still testing; I would like to get to the bottom of this problem.
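For anyone wanting to reproduce this outside of a VM backup, a source file with a known share of duplicate 128K blocks can be generated deterministically, so the dedupe ratio ZFS should report is predictable in advance. A sketch with arbitrary sizes and a placeholder path:

#!/usr/bin/env python3
# Sketch: build a test file with a known share of duplicate 128K blocks so the
# dedupe ratio ZFS should report is predictable. Path and sizes are arbitrary.
import os

BLOCK = 128 * 1024
TOTAL_BLOCKS = 8192            # 8192 * 128K = 1GB
DUP_EVERY = 2                  # every other block repeats -> expect roughly 2.0x

template = os.urandom(BLOCK)   # the block that gets repeated
with open("/tank1/veeam/testfile.bin", "wb") as f:
    for i in range(TOTAL_BLOCKS):
        if i % DUP_EVERY == 0:
            f.write(template)          # duplicate block
        else:
            f.write(os.urandom(BLOCK)) # unique block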
-
- VP, Product Management
- Posts: 6035
- Liked: 2860 times
- Joined: Jun 05, 2009 12:57 pm
- Full Name: Tom Sightler
- Contact:
Re: Backup Copy Job != ZFS Deduplication
To get decent deduplication from ZFS with Veeam backup files you'll likely need to use a much smaller record size. At 128K you will require fixed 128K blocks that are completely identical. Even a small difference from one file to another will keep this from being the case. You'll also want to check the align blocks option on the repository. I've seen quite good success from 16K record size on ZFS, but you'll need significantly more memory.
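A rough illustration of why a smaller record size costs memory: the in-core dedup table grows with the number of records, so dropping the recordsize from 128K to 16K multiplies the DDT entry count by eight. The ~320 bytes per DDT entry used below is a commonly quoted ballpark, not an exact figure:

# Rough DDT memory estimate: entries = data / recordsize, memory = entries * bytes/entry.
# ~320 bytes per in-core DDT entry is a commonly quoted ballpark, not an exact number.

def ddt_ram_gb(data_tb, recordsize_kb, bytes_per_entry=320):
    entries = (data_tb * 1024**4) / (recordsize_kb * 1024)
    return entries * bytes_per_entry / 1024**3

for rs in (128, 16):
    print(f"{rs:>3}K recordsize, 2TB of backups -> ~{ddt_ram_gb(2, rs):.0f} GB of DDT")
# 128K -> ~5 GB, 16K -> ~40 GB: why 64GB of RAM starts to matter at small recordsizes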