aaron@ARB
Expert
Posts: 138
Liked: 14 times
Joined: Feb 21, 2014 3:12 am
Full Name: ARBCorporationPtyLtd
Contact:

Low Deduplication Ratio, what am i doing wrong?

Post by aaron@ARB »

Hi all,

I have 2 VMware file servers that I am backing up in a single job, and I am not seeing the same sort of deduplication that I saw with Backup Exec 2014, which I am coming from.


I have been running it for a week now and the figures are below:

Full Fri: Data Size: 7.4 TB, Backup Size: 5.4 TB, Dedupe Ratio: 1.1x, Compression Ratio: 1.3x
Inc Mon: Data Size: 39.5 GB, Backup Size: 20.6 GB, Dedupe Ratio: 1.0x, Compression Ratio: 1.9x
Inc Tue: Data Size: 57.5 GB, Backup Size: 39.8 GB, Dedupe Ratio: 1.0x, Compression Ratio: 1.4x
Inc Wed: Data Size: 42.1 GB, Backup Size: 25.4 GB, Dedupe Ratio: 1.0x, Compression Ratio: 1.7x
Inc Thu: Data Size: 30.3 GB, Backup Size: 17.7 GB, Dedupe Ratio: 1.0x, Compression Ratio: 1.7x

So what this tells me is that there was no deduping done at all when it came to the incremental backups and the only thing that was happening was some compression?
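As a rough sanity check on the figures above, the overall reduction (data size divided by backup size) can be compared with the product of the reported dedupe and compression ratios. The sketch below is plain arithmetic on the numbers quoted in this post. The idea that the two reported ratios multiply to give the overall reduction is an assumption, not something confirmed in this thread.

```python
# Figures copied from the job report above:
# (label, data size, backup size, dedupe ratio, compression ratio).
# Sizes use the units reported (TB for the full, GB for the incrementals);
# only the ratio matters, so units cancel out within each row.
runs = [
    ("Full Fri", 7.4, 5.4, 1.1, 1.3),
    ("Inc Mon", 39.5, 20.6, 1.0, 1.9),
    ("Inc Tue", 57.5, 39.8, 1.0, 1.4),
    ("Inc Wed", 42.1, 25.4, 1.0, 1.7),
    ("Inc Thu", 30.3, 17.7, 1.0, 1.7),
]


def overall_reduction(data_size: float, backup_size: float) -> float:
    """Overall reduction factor: how many times smaller the backup is."""
    return data_size / backup_size


for label, data, backup, dedupe, compression in runs:
    measured = overall_reduction(data, backup)
    # Assumption: dedupe and compression ratios combine multiplicatively.
    combined = dedupe * compression
    print(f"{label}: measured {measured:.2f}x, dedupe x compression {combined:.2f}x")
```

For the incrementals the two numbers line up closely, which is consistent with the reading that essentially all of the reduction there comes from compression.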

Obviously there is no simple answer, but does that look acceptable? I have compression set to Optimal and inline dedupe turned on, and everything else is left at whatever was set on install.

Is there anything that I am missing?

I know it will take time to improve, but I will check what it does tonight when I run my next 'active full'. That should ultimately store VERY little, since my incrementals are not that large, which tells me the change rate for these 2 servers is quite low?

I have a high-speed disk array directly attached to the server (my backup proxy is also the repository: a physical Dell R720XD with a ton of 10k disks in a RAID 10 configuration). Would turning compression off in the job, and setting the repository to decompress data before writing it, help raise the deduplication ratio?

TIA
Vitaliy S.
VP, Product Management
Posts: 27055
Liked: 2710 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: Low Deduplication Ratio, what am i doing wrong?

Post by Vitaliy S. »

Hi Aaron,
Aaron wrote:So what this tells me is that there was no deduping done at all when it came to the incremental backups and the only thing that was happening was some compression?
Deduplication is done within the restore point. During an incremental run you are transferring changed blocks, which by definition should be unique, so the deduplication ratio may be lower than for a full job pass.
Aaron wrote:Is there anything that I am missing?
What are your storage optimization settings in the backup job?
Aaron wrote:I know it will take time to improve, but I will check what it does tonight when I run my next 'active full'. That should ultimately store VERY little, since my incrementals are not that large, which tells me the change rate for these 2 servers is quite low?
An active full will not be small: it grabs the entire VM image and applies the settings you've configured in the backup job (deduplication and compression levels).
Aaron wrote:I have a high-speed disk array directly attached to the server (my backup proxy is also the repository: a physical Dell R720XD with a ton of 10k disks in a RAID 10 configuration). Would turning compression off in the job, and setting the repository to decompress data before writing it, help raise the deduplication ratio?
Unless you're using a deduplicating volume as the target for your backup jobs, this will not decrease the size of the backup files.

Thanks!
aaron@ARB
Expert
Posts: 138
Liked: 14 times
Joined: Feb 21, 2014 3:12 am
Full Name: ARBCorporationPtyLtd
Contact:

Re: Low Deduplication Ratio, what am i doing wrong?

Post by aaron@ARB »

Hi Vitaliy,
Vitaliy S. wrote:Deduplication is done within the restore point. During an incremental run you are transferring changed blocks, which by definition should be unique, so the deduplication ratio may be lower than for a full job pass.
Yes, I agree. I do not expect to see a great deduplication ratio when it comes to incremental backups.
Vitaliy S. wrote:What are your storage optimization settings in the backup job?
From the Storage tab in the 'Advanced Settings' menu, the following settings are active:

Image
Vitaliy S. wrote:An active full will not be small: it grabs the entire VM image and applies the settings you've configured in the backup job (deduplication and compression levels).
Yes, but given that the incremental jobs are small (as you have seen) and the amount of static data in the backup is so high, I would expect quite a small increase in the 'size on disk' for the 2nd full backup, as most of it would benefit from a deduplication system. I previously ran the same backup through BE2014 and it works exactly as I would expect. The first full backup has a low deduplication ratio, but the 2nd full backup only slightly increases the size on disk, as most of the data is static.

Below is what is being reported by Veeam:
Image

And here is a deduplication storage volume that I have been using with Backup Exec.
It is worth mentioning that BE2014 calculates its statistics from the whole volume, which is now shared with Veeam, so the used capacity should read 9.19 TB. That affects the ratio, but you get the idea: I am physically storing 9.19 TB for backup data totalling 172 TB, a deduplication ratio of approximately 19:1 once you remove the Veeam data.

Image
Vitaliy S.
VP, Product Management
Posts: 27055
Liked: 2710 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: Low Deduplication Ratio, what am i doing wrong?

Post by Vitaliy S. »

aaron@ARB wrote:From the Storage tab in the 'Advanced Settings' menu, the following settings are active:
If you select a different storage optimization option (one that uses a smaller block size), the backups will be smaller and the dedupe ratio will be higher.
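To illustrate why a smaller block size can raise the dedupe ratio, here is a toy sketch using fixed-size blocks and SHA-256 fingerprints on synthetic data. The data layout, block sizes, and the `dedupe_ratio` helper are all made up for illustration; this is not Veeam's actual algorithm or its real block sizes.

```python
import hashlib


def dedupe_ratio(data: bytes, block_size: int) -> float:
    """Total fixed-size blocks divided by unique blocks (by SHA-256 digest)."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    unique = {hashlib.sha256(block).digest() for block in blocks}
    return len(blocks) / len(unique)


# Synthetic data: every 1024-byte region is 512 identical bytes followed by
# 512 bytes that are unique to that region.
common = b"A" * 512
data = b"".join(common + bytes([i]) * 512 for i in range(1, 9))

# With 1024-byte blocks each block contains a unique half, so nothing matches.
print(dedupe_ratio(data, 1024))
# With 512-byte blocks the repeated half is found and stored once.
print(dedupe_ratio(data, 512))
```

The trade-off, in general terms, is that smaller blocks mean more fingerprints to track for the same amount of data.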
aaron@ARB wrote:Yes, but given that the incremental jobs are small (as you have seen) and the amount of static data in the backup is so high, I would expect quite a small increase in the 'size on disk' for the 2nd full backup, as most of it would benefit from a deduplication system. I previously ran the same backup through BE2014 and it works exactly as I would expect. The first full backup has a low deduplication ratio, but the 2nd full backup only slightly increases the size on disk, as most of the data is static.
Deduplication is done within the restore point, not across the entire backup chain, so your full backup will take as much space as it consumed during the initial run.
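The difference between deduplicating within each restore point and deduplicating across the whole chain can be sketched with a toy model. Block identifiers stand in for block fingerprints, and the two weekly fulls below are invented so that 99% of their blocks are identical; none of the names or numbers come from a real job.

```python
def stored_blocks_per_restore_point(backups: list[list[str]]) -> int:
    """Dedupe only inside each backup file: every file keeps its own unique blocks."""
    return sum(len(set(backup)) for backup in backups)


def stored_blocks_global(backups: list[list[str]]) -> int:
    """Dedupe across all backup files: each distinct block is kept once."""
    seen: set[str] = set()
    for backup in backups:
        seen.update(backup)
    return len(seen)


# Hypothetical data: two weekly fulls of 1000 blocks, 10 of which changed.
full_week1 = [f"blk{i}" for i in range(1000)]
full_week2 = full_week1[:990] + [f"new{i}" for i in range(10)]
backups = [full_week1, full_week2]

print(stored_blocks_per_restore_point(backups))  # second full costs a full again
print(stored_blocks_global(backups))             # second full costs only the changes
```

In the first model the second full takes roughly as much space as the first, which matches the behaviour described above. The second model is what a deduplicating target volume provides.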

Here are some more details about how Veeam deduplication works: Deduplication. Hope this helps!
aaron@ARB
Expert
Posts: 138
Liked: 14 times
Joined: Feb 21, 2014 3:12 am
Full Name: ARBCorporationPtyLtd
Contact:

Re: Low Deduplication Ratio, what am i doing wrong?

Post by aaron@ARB »

Taking a simplistic view: if I stored a 1 MB block last week in my full backup, why would I want to store that same 1 MB block again this week if it has not changed? Why not just use a pointer to the first copy of it? That's what BE appears to be doing, and given that my backups appear to contain a lot of static data, that would be more space efficient. At the moment I could only store approximately 4 full backups in my 20 TB volume, whereas with BE I have them going back to September last year.
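The "pointer to the first copy" idea can be sketched as a small content-addressed block store: each backup becomes a list of fingerprints (the pointers), and a block's bytes are written only the first time that fingerprint is seen. The `BlockStore` class, the 4-byte block size, and the sample data are all hypothetical; this is not how Backup Exec or Veeam is implemented, only an illustration of the concept.

```python
import hashlib


class BlockStore:
    """Toy content-addressed store shared by every backup written to it."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}  # fingerprint -> block bytes

    def write_backup(self, data: bytes, block_size: int = 4) -> list[str]:
        """Store any new blocks and return the backup as a list of pointers."""
        manifest = []
        for i in range(0, len(data), block_size):
            block = data[i:i + block_size]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)  # keep only the first copy
            manifest.append(digest)
        return manifest

    def restore(self, manifest: list[str]) -> bytes:
        """Rebuild a backup by following its pointers into the shared store."""
        return b"".join(self.blocks[digest] for digest in manifest)


store = BlockStore()
week1 = store.write_backup(b"AAAABBBBCCCC")
week2 = store.write_backup(b"AAAABBBBDDDD")  # only the last block is new

print(len(store.blocks))     # distinct blocks physically stored
print(store.restore(week2))  # week 2 is still fully restorable
```

Note that week 2 can only be restored while the blocks first written in week 1 remain intact in the shared store, so the space saving comes with a dependency between backups.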

I have read the FAQ before, but it still does not make enough sense to me. I might have to ring the local Veeam people to explain it. If I compare what Veeam is doing with its 'deduplication' against what Backup Exec does, BE seems far more efficient when I look at the storage footprint of the backup data?
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Low Deduplication Ratio, what am i doing wrong?

Post by foggy »

I cannot comment on BE. However, if you want to keep only changed blocks, just use incremental backup without active fulls. Each full backup in Veeam B&R is a completely independent restore point, and its blocks can be deduplicated against other fulls only if the target volume/storage itself deduplicates.
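A back-of-the-envelope estimate shows how much the choice matters on a 20 TB volume, using the sizes Aaron posted earlier in the thread. Several things here are assumptions: decimal units (1 TB = 1000 GB), a "week" of incrementals equal to the four runs listed (Mon to Thu), no data growth, and no allowance for retention processing or free-space headroom.

```python
GB_PER_TB = 1000  # assumption: decimal units

VOLUME_TB = 20.0
FULL_TB = 5.4                                      # size on disk of the full
WEEKLY_INCREMENTALS_GB = [20.6, 39.8, 25.4, 17.7]  # Mon-Thu runs from the report

weekly_inc_tb = sum(WEEKLY_INCREMENTALS_GB) / GB_PER_TB


def weeks_with_weekly_fulls(volume_tb: float, full_tb: float, inc_tb: float) -> int:
    """Weeks of history that fit when every week stores a full plus incrementals."""
    return int(volume_tb // (full_tb + inc_tb))


def weeks_with_single_full(volume_tb: float, full_tb: float, inc_tb: float) -> int:
    """Weeks of history that fit with one full followed by incrementals only."""
    return int((volume_tb - full_tb) // inc_tb)


print(weeks_with_weekly_fulls(VOLUME_TB, FULL_TB, weekly_inc_tb))
print(weeks_with_single_full(VOLUME_TB, FULL_TB, weekly_inc_tb))
```

The exact numbers should not be taken too seriously, but the gap between the two results shows why an incremental chain stretches much further than weekly active fulls on a non-deduplicating target.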
Vitaliy S.
VP, Product Management
Posts: 27055
Liked: 2710 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: Low Deduplication Ratio, what am i doing wrong?

Post by Vitaliy S. »

I see your point about active full backup, but active fulls are done for a reason, for example, to avoid possible dependencies on the existing backup chain and to be protected from possible block corruption in the original backup file.
aaron@ARB
Expert
Posts: 138
Liked: 14 times
Joined: Feb 21, 2014 3:12 am
Full Name: ARBCorporationPtyLtd
Contact:

Re: Low Deduplication Ratio, what am i doing wrong?

Post by aaron@ARB » 3 people like this post

Coming from a BE world, I do full backups every week because I don't trust BE to keep the backup chain intact. You really do not know what you are going to get when you come in of a morning when it comes to their backups; it's just horribly unreliable. With Veeam, on the other hand, I can say that I have had NOT ONE problem so far with its operation. Sure, there are queries on functionality like the ones I have been posting, but I separate these from the actual operation of the product, which has not put a foot wrong, and that is such a relief.

Going on from what you are saying, I think it's just a case of learning what the product is trying to do. When I think of the term 'deduplication', I think of storage deduplication the way BE does it: if you are doing (in Veeam speak) 'active fulls' each week, it is of little consequence to your backup size, because you're only storing changed blocks. Ostensibly it's just an incremental backup anyway, just called a full backup, so I take your point about incrementals.
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Low Deduplication Ratio, what am i doing wrong?

Post by foggy »

Correct. In our terms, a full backup is actually a full, independent, portable backup that can be taken anywhere and restored, should the need arise, without any dependency on the previous chain.