jgezels
Lurker
Posts: 1
Liked: never
Joined: Apr 19, 2011 5:53 am
Contact:

Dedup between jobs

Post by jgezels »

Hi

We're testing Veeam Backup & Replication and we are wondering what the best practices are concerning backup jobs and dedup.
We tried one "big" job (10 VMs in one job), which works but of course is more prone to errors and failure because it runs longer.
We also tried creating a job per VM, but I'm wondering whether there is any deduping going on between the different jobs.

Can somebody clarify this?

Thank You

Jan
Gostev
Chief Product Officer
Posts: 31544
Liked: 6715 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Dedup between jobs

Post by Gostev »

Hi Jan,

There is no dedupe between jobs. Virtually all customers run multiple VMs per job; this is how Veeam is designed to be used. Putting multiple VMs into a job does not make your backup more or less reliable for a specific VM. The fact that a job runs longer does not really affect anything, plus we implement granular job retries for failed VMs within the job. Errors and failures are not common at all (at least with our product): unless you have actual issues with your infrastructure, there is little that could go wrong or break.
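To illustrate what "no dedupe between jobs" means for storage, here is a minimal sketch, assuming simple fixed-size block hashing. It is not how Veeam or any appliance is actually implemented; the block size, the function names and the two sample "VM images" are all hypothetical and chosen only to make the effect visible.

```python
import hashlib

BLOCK_SIZE = 4  # tiny, hypothetical block size so the example is easy to follow


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut an image into fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def unique_blocks(vm_images: list[bytes]) -> set[str]:
    """Hashes of the distinct blocks across all images in one dedupe scope."""
    seen = set()
    for image in vm_images:
        for block in split_blocks(image):
            seen.add(hashlib.sha256(block).hexdigest())
    return seen


def stored_blocks_per_job(jobs: list[list[bytes]]) -> int:
    """Per-job dedupe: each job only dedupes against its own VMs."""
    return sum(len(unique_blocks(job)) for job in jobs)


def stored_blocks_global(jobs: list[list[bytes]]) -> int:
    """Global dedupe: one dedupe scope shared by every job."""
    all_images = [image for job in jobs for image in job]
    return len(unique_blocks(all_images))


# Two made-up VMs that share an "OS" part and differ in their "data" part.
vm_a = b"OSOSOSOS" + b"AAAA"
vm_b = b"OSOSOSOS" + b"BBBB"

one_job = [[vm_a, vm_b]]    # both VMs in a single job
two_jobs = [[vm_a], [vm_b]]  # one job per VM

print(stored_blocks_per_job(one_job))   # 3: the shared OS block is stored once
print(stored_blocks_per_job(two_jobs))  # 4: the shared OS block is stored in each job
print(stored_blocks_global(two_jobs))   # 3: job layout no longer matters
```

In this toy model, splitting the VMs into separate jobs only costs extra space where the VMs share blocks, which is why grouping similar VMs into one job matters for per-job dedupe.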

Thanks.
jpaul
Enthusiast
Posts: 88
Liked: 2 times
Joined: Apr 19, 2010 1:14 am
Full Name: Justin Paul
Contact:

Re: Dedup between jobs

Post by jpaul »

The only way you will get dedupe between jobs is to lay down a bunch of cash on something like an ExaGrid.
Bunce
Veteran
Posts: 259
Liked: 8 times
Joined: Sep 18, 2009 9:56 am
Full Name: Andrew
Location: Adelaide, Australia
Contact:

Re: Dedup between jobs

Post by Bunce »

jpaul wrote:The only way you will get dedupe between jobs is to lay down a bunch of cash on something like an ExaGrid.
Actually, a number of competing products use a dedup appliance as a target, where the number and type of jobs is irrelevant. (It also negates a large number of the vbk/vib/retention/removal/restore point problems posted on this forum, but that's another kettle of fish.)

However, as Gostev points out, there's negligible additional risk in putting numerous VMs in a job as opposed to one VM per job. If a particular VM is going to fail to back up for whatever reason, it's likely to do so irrespective of whether it's in a job by itself or not.
Gostev
Chief Product Officer
Posts: 31544
Liked: 6715 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Dedup between jobs

Post by Gostev »

Bunce wrote:Actually, a number of competing products use a dedup appliance as a target where the number and type of jobs is irrelevant.
At the cost of putting all eggs in one basket, and making tape (or other removable media) backups impossible or limited to re-hydrated full backup exports requiring 10x more storage space than Veeam backups... but that's another kettle of fish :wink:

The question is always about using the right tool for the job... if you have 10-20 VMs to back up, either approach is good (a single Veeam job or global dedupe will both produce the same results). If you have 100s of VMs to back up, the global dedupe issues mentioned above will quickly start to matter.
Bunce
Veteran
Posts: 259
Liked: 8 times
Joined: Sep 18, 2009 9:56 am
Full Name: Andrew
Location: Adelaide, Australia
Contact:

Re: Dedup between jobs

Post by Bunce »

Gostev wrote: At the cost of putting all eggs in one basket, and making tape (or other removable media) backups impossible or limited to re-hydrated full backup exports requiring 10x more storage space than Veeam backups... but that's another kettle of fish :wink:
Never experienced any of these issues, and we've used both. I always raise an eyebrow at the eggs in one basket argument - Veeam's 'basket' isn't a single file - it's a chain of files - all of which must remain intact - and valid - and available - to support its RPOs.

Appliances self-repair just as Veeam states its files do. I don't think you can advocate placing all VMs in one job to achieve maximum deduplication and then infer that doing so is risky by 'placing all your eggs in one basket' in another product. Either a company believes its product's backup, validation and recovery processes are sufficiently robust or it doesn't. SureBackup is an interesting pro here.. :wink:

Risk is only alleviated once a backup of the target at a point in time is made - which applies to both architectures. Appliances can be backed up or replicated in their entirety or via exports (both of which worked OK for us, albeit slowly - but so are the tapes themselves, which is why we're moving away from them). They also have the advantage of using far less storage space than Veeam, due to not having to continually perform full backups and from proper tiered retention policies that allow unneeded backups to be removed periodically, thereby allowing the total appliance space to be almost constant.

"How can I remove this backup" and "why are these rollbacks still here" are continual questions - often from users who haven't fully read the doco, mind you - but avoidable issues nonetheless in other architectures.

There's a reason VMware chose to go this path after observing a number of virtualisation backup products - unfortunately other aspects of their product stink and are about 4 versions away from being useful. Therein lies the problem - no one's yet come up with a product which ticks all the boxes.

Right tool for the job - agree 100%, but let's not dismiss the pros and cons of each.
Gostev
Chief Product Officer
Posts: 31544
Liked: 6715 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Dedup between jobs

Post by Gostev »

Bunce wrote:Never experienced any of these issues, and we've used both. I always raise an eyebrow at the eggs in one basket argument - Veeam's 'basket' isn't a single file - it's a chain of files - all of which must remain intact - and valid - and available - to support its RPOs.
You are exactly right! And our approach makes it easy to make sure those files are available. Just make a copy of each small incremental backup file every day - to tape, to an external hard drive, to a remote location, and so on. Now, please tell me what you will copy every day in the case of global dedupe?

I am sure we do not need to argue here that protecting your backup storage itself is a must? And how can you do this with global dedupe? Especially if your media is tape? Copy terabytes of data to tape every day? Not gonna work... there is simply not enough time in a day.
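The "not enough time in a day" point is simple arithmetic, sketched below. The data sizes and the 120 MB/s sustained throughput are made-up illustrative numbers, not measurements of any tape drive or product, and `hours_to_copy` is a hypothetical helper.

```python
def hours_to_copy(data_tb: float, throughput_mb_s: float) -> float:
    """Hours needed to stream data_tb terabytes at a sustained throughput.

    Uses decimal units (1 TB = 1e12 bytes, 1 MB = 1e6 bytes).
    """
    seconds = (data_tb * 1e12) / (throughput_mb_s * 1e6)
    return seconds / 3600


# Hypothetical figures: a 10 TB dedupe store vs. a 50 GB daily increment.
print(f"whole store: {hours_to_copy(10, 120):.1f} h")
print(f"increment:   {hours_to_copy(0.05, 120):.2f} h")
```

Under these assumed numbers the whole store needs roughly a full day of uninterrupted streaming, while the daily increment takes a few minutes.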
Bunce wrote:Appliances self repair just as Veeam states its files do.
They do not; this is all marketing. Unfortunately, you cannot magically get missing/corrupt information from nowhere. What happens is they simply "mark" bad blocks as bad, so that they are fetched from the source again during the next incremental run. And users are left to pray they do not need to restore before the next run actually happens... and that the environment to read the missing data from is still available :wink:

BTW, Veeam cannot repair bad blocks either. Our "repair" of a backup file is simply about reverting to the last known good state (the benefit here is that there is no need for a successful incremental run for the repair to happen).
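The two "repair" strategies described above can be contrasted with a toy model. This is only a sketch under loud assumptions: CRC32 checksums, dictionaries standing in for backup files, and the function names are all hypothetical, and none of it reflects the actual internals of Veeam or of any dedupe appliance.

```python
import zlib


def checksum(block: bytes) -> int:
    return zlib.crc32(block)


def find_bad_blocks(blocks: dict[int, bytes], checksums: dict[int, int]) -> list[int]:
    """Indexes of blocks whose content no longer matches the stored checksum."""
    return sorted(i for i, b in blocks.items() if checksum(b) != checksums[i])


def refetch_marked(blocks: dict[int, bytes], bad: list[int],
                   source: dict[int, bytes]) -> dict[int, bytes]:
    """'Mark and re-fetch': works only if the source still holds every bad block."""
    repaired = dict(blocks)
    for i in bad:
        repaired[i] = source[i]  # KeyError if the source is no longer available
    return repaired


def last_known_good(restore_points: list[tuple[dict[int, bytes], dict[int, int]]]):
    """'Revert': index of the newest restore point that fully verifies, else None."""
    for idx in range(len(restore_points) - 1, -1, -1):
        blocks, sums = restore_points[idx]
        if not find_bad_blocks(blocks, sums):
            return idx
    return None


source = {0: b"boot", 1: b"data"}
sums = {i: checksum(b) for i, b in source.items()}
good_point = (dict(source), sums)
corrupt_point = ({0: b"boot", 1: b"dxta"}, sums)  # block 1 is damaged

bad = find_bad_blocks(corrupt_point[0], sums)
print(bad)                                          # [1]
print(refetch_marked(corrupt_point[0], bad, source) == source)  # True while the source exists
print(last_known_good([good_point, corrupt_point]))  # 0: fall back to the older point
```

Neither path recreates lost data: the first depends on the source, the second on an older restore point that still verifies.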
Bunce wrote:I don't think you can advocate placing all VMs in one job to achieve maximum deduplication and then infer that doing so is risky by 'placing all your eggs in one basket' in another product. Either a company believes its product's backup, validation and recovery processes are sufficiently robust or it doesn't. SureBackup is an interesting pro here.. :wink:
Personally, I do not have this one-basket fear. To me, it is a matter of having a copy of your backups, and of the quality of code testing. Incidentally (or maybe not?), those backup companies featuring global dedupe cannot seem to provide a reliable implementation, and always have these issues with corruption in dedupe stores reported on their user forums (and not as an exception, but all the time). So, actually, even having a copy of the backup file does not help their users...
Bunce wrote:Risk is only alleviated once a backup of the target at a point in time is made - which applies to both architectures. Appliances can be backed up or replicated in their entirety or via exports (both of which worked OK for us, albeit slowly - but so are the tapes themselves, which is why we're moving away from them), and have the advantage of using far less storage space than Veeam due to not having to continually perform full backups
Wow, this is not true. With Veeam, you do not have to continually perform full backups if you do not want to. Heck, our product did not even have the ability to create periodic full backups until the most recent versions :D we have been forever-incremental & single-full since version 1.0!

In any case, I am not buying that re-hydrated exports take far less storage space than a deduped and compressed Veeam backup. This just defies logic.
Bunce wrote:and from proper tiered retention policies that allow un-needed backups to be removed periodically, thereby allowing the total appliance space to be almost constant
Just like with reversed incremental backup with Veeam (the recommended way to back up to disk). Total space is almost constant.
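A simplified model shows why space levels off with reversed incremental plus a retention limit. It assumes a constant full-backup size and a constant daily change size, which real environments will not have; the function and its parameters are hypothetical and do not correspond to any product setting.

```python
def reversed_incremental_space(days: int, full_gb: float,
                               daily_change_gb: float,
                               retention_points: int) -> list[float]:
    """Space used after each daily run.

    Model: one full file that always holds the newest state, plus one
    rollback file per older restore point. Once the number of restore
    points exceeds the retention limit, the oldest rollback is deleted.
    """
    usage = []
    rollbacks = 0
    for day in range(1, days + 1):
        if day > 1:
            rollbacks += 1  # yesterday's state is kept as a rollback file
        if rollbacks + 1 > retention_points:
            rollbacks -= 1  # retention removes the oldest rollback
        usage.append(full_gb + rollbacks * daily_change_gb)
    return usage


# Hypothetical numbers: 100 GB full, 5 GB of changes per day, keep 3 restore points.
print(reversed_incremental_space(6, 100, 5, 3))
# [100, 105, 110, 110, 110, 110]
```

After the retention limit is reached, usage stays at the full size plus (retention - 1) rollbacks in this model.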
Bunce wrote:"How can I remove this backup" and "why are these rollbacks still here" are continual questions - often from users who haven't fully read the doco, mind you - but avoidable issues nonetheless in other architectures.
Never had a single question like this on these forums (I am not kidding, just search) until v5 with the new backup modes was released. Before that, when we only had reversed incremental backup mode, there were zero questions asked, because retention worked exactly as people expect it to. Apparently, many people simply do not understand regular incremental backup and how retention works with it, probably because they are new to backup as a whole.
Bunce wrote:There's a reason VMware chose to go this path after observing a number of virtualisation backup products
You are saying this as if VMware were some ultimate backup expert with great experience and expertise in building backup solutions. Come on... they are far from it. And the fact that VMware has chosen this road does not mean that the road is right. Remember, they are addressing a very specific need with their product - free, "good enough", built-in backup for VERY small customers with minimal investment from VMware. Even their own positioning paper for VDR explains that larger customers should use "proper" backup solutions...
Bunce wrote:unfortunately other aspects of their product stink and are about 4 versions away from being useful. Therein lies the problem - no one's yet come up with a product which ticks all the boxes.
No one ever will. Just like a bike, car, bus and truck will never get merged into some ideal implementation. Right tool for the job! And I am not trying to convince you here that our approach is better for YOU specifically. I realize that other approaches might very well be better for your specific needs. I am just explaining why our approach is better for the majority of customers, today.

In any case, no product will ever be good for 100% of customers... this is just something we have to live with. Some may get close, but there will always be popular alternatives. Different needs and requirements, different ways of doing things, different management styles, different environment sizes. Even size alone - a product that works well for a 100 VM shop will not be good for a 100000 VM shop (and, what's interesting, the other way around fully applies too).
Bunce wrote:Right tool for the job - agree 100%, but let's not dismiss the pros and cons of each.
Absolutely agree. One pro I fully agree with, is that global dedupe will always provide better space savings with raw storage, than in case of per-job dedupe. Although this becomes less and less relevant nowadays with every storage getting its own dedupe, which allows it to dedupe between multiple backup files.