luckyinfil
Enthusiast
Posts: 91
Liked: 10 times
Joined: Aug 30, 2013 8:25 pm
Contact:

Re: Version 7 and Deduplication devices.

Post by luckyinfil »

dellock6 wrote:GFS retention makes dedup appliances really helpful. There is a high chance that many blocks in different VBK files are the same, so a dedup appliance can achieve a high deduplication ratio in this situation. With this kind of appliance, keeping several full backup files is not a huge problem.

Luca.

GFS works fine, but only for weekly, monthly, quarterly and yearly points. Based on the information provided, it will have real issues with long-term archival of daily backups. When you create a backup copy job, it natively creates an incremental chain (VBK + VIBs) for the daily restore points. For those who need long-term retention of daily backups, this can result in a very long chain, which means increasingly slow restore times as the chain grows (not to mention that the chain is deduped and needs to be rehydrated, which further impacts restore times). The ideal way to archive daily backups would be a VBK file for each day. Right now, it looks like the only way to do that is to have a separate backup copy job and then copy the result over to a deduped repository using post-job scripts, which is not ideal.
dellock6
VeeaMVP
Posts: 6166
Liked: 1971 times
Joined: Jul 26, 2009 3:39 pm
Full Name: Luca Dell'Oca
Location: Varese, Italy
Contact:

Re: Version 7 and Deduplication devices.

Post by dellock6 »

The whole idea behind GFS and long retention is the low likelihood that you will ever have to restore data from an old restore point. That's why the new V7 works this way: you keep on your primary backup target the number of restore points you think you will need to restore from, and move the remaining restore points to the secondary backups, mainly for archiving purposes.
Restoring from a long backup copy chain will certainly mean a longer RTO than restoring from the primary backup (especially if you are using reversed incremental there), but again, what are the odds you will have to restore from there? And will you really have to satisfy the same RTO as for a two-day-old backup?

Luca.
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software

@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1
tsightler
VP, Product Management
Posts: 6035
Liked: 2860 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: Version 7 and Deduplication devices.

Post by tsightler »

luckyinfil wrote:The ideal way to archive daily backups would be VBK files for each day.
I don't know that I agree with this part. I believe it's better to balance the cost of copying a full VBK against the number of VIB files in the chain, perhaps a VBK once a week or once a month.

What I've done for several customers with this requirement is to create a small PowerShell script that creates a new Backup Copy job automatically each month, with names like "<Job Name> - Oct 2013", "<Job Name> - Nov 2013", etc. The new job creates a new full VBK on the first day of the month and VIB files for the rest of the month. When the script detects that it's the last day of the month, it creates a new job for the next month, then disables and deletes the old job, leaving the backups intact and visible in the GUI as "Imported". It's not a perfect solution, but it works pretty well so far. It also keeps any one chain from growing forever (a 3-year chain with 1000+ restore points would be a mess to use in the GUI), and allows the older backups to be easily moved elsewhere in the future since each month is contained within its own folder.
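For reference, a rough sketch of that rotation logic is below, assuming the Veeam B&R PowerShell snap-in. The job names, repository name, and the Add-VBRViBackupCopyJob call are assumptions on my part; the exact cmdlets and parameters for creating a Backup Copy job vary between versions, so treat it as an outline rather than a drop-in script.

Add-PSSnapin VeeamPSSnapIn                          # Veeam B&R PowerShell snap-in

$baseName  = "SQL Cluster Copy"                     # hypothetical job name prefix
$thisMonth = (Get-Date).ToString("MMM yyyy")        # e.g. "Oct 2013"
$nextMonth = (Get-Date).AddMonths(1).ToString("MMM yyyy")
$lastDay   = [DateTime]::DaysInMonth((Get-Date).Year, (Get-Date).Month)

if ((Get-Date).Day -eq $lastDay) {
    # Last day of the month: stand up next month's copy job, then retire this one.
    $srcJob = Get-VBRJob -Name "SQL Cluster"                 # hypothetical primary backup job
    $repo   = Get-VBRBackupRepository -Name "DedupeRepo"     # hypothetical target repository

    # Assumed cmdlet and parameters; adjust to your Veeam version.
    Add-VBRViBackupCopyJob -Name "$baseName - $nextMonth" -BackupJob $srcJob -Repository $repo

    # Disable and delete the current month's job; its restore points stay on
    # disk and show up in the console as "Imported" backups.
    $oldJob = Get-VBRJob -Name "$baseName - $thisMonth"
    Disable-VBRJob -Job $oldJob
    Remove-VBRJob -Job $oldJob -Confirm:$false
}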

No doubt I would love to see some more flexibility with this in the GUI in future versions, perhaps a "Backup Archive Copy", with a slightly different approach for very long term retention.
luckyinfil
Enthusiast
Posts: 91
Liked: 10 times
Joined: Aug 30, 2013 8:25 pm
Contact:

Re: Version 7 and Deduplication devices.

Post by luckyinfil »

Daily VBKs are the ideal solution for those using dedupe appliances, which would be pretty much everybody (who archives to non-deduplicated storage?). The longer the archive chain, the longer a restore will take. Not only does a long chain of backups slow down restores on its own, the effect is magnified because every file in the chain is in a deduped state, and having to rehydrate that data makes the restore even longer. If the amount of archive data is below the maximum ingest rate of your dedupe storage, why not write a VBK every day?
tsightler
VP, Product Management
Posts: 6035
Liked: 2860 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: Version 7 and Deduplication devices.

Post by tsightler »

luckyinfil wrote:Daily VBKs are the ideal solution for those using dedupe appliances, which would be pretty much everybody (who archives to non-deduplicated storage?). The longer the archive chain, the longer a restore will take. Not only does a long chain of backups slow down restores on its own, the effect is magnified because every file in the chain is in a deduped state, and having to rehydrate that data makes the restore even longer.
The overall increase in restore time is not that significant. Veeam doesn't rehydrate every incremental file to perform a restore; we do have to read metadata from each VBK/VIB in the chain, which increases the "startup time" of a restore somewhat, but for chains of reasonable length (7 or 30 days) it's not typically a huge increase, usually only a few minutes at the start of the restore job.
luckyinfil wrote:If the amount of archive data is below the maximum ingest rate of your dedupe storage, why not write a VBK every day?
I suppose I can accept that premise if the backups are small and a full transfers to the dedupe storage fairly quickly. I know of a few customers that use the same concept for tape archives: they archive their few TBs of full backups to tape every day because the fulls fit on a single tape or two anyway. But this isn't at all common among the customers I work with. Most are sending tens or hundreds of TBs of backup files to dedupe storage, so they're not interested in transferring data for hours every day when transferring 10-20x less data is far faster, and the only cost is potentially a few extra minutes on a restore from the archive, which isn't a common occurrence in the first place.
veremin
Product Manager
Posts: 20413
Liked: 2301 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin
Contact:

Re: Version 7 and Deduplication devices.

Post by veremin »

The ideal way to archive daily backups would be a VBK file for each day. Right now, it looks like the only way to do that is to have a separate backup copy job and then copy the result over to a deduped repository using post-job scripts, which is not ideal.
As mentioned above, if you want to create a full .vbk file on the dedupe appliance on a daily basis, the best options are either to create an additional backup job with an active full scheduled for every day of the week, or to use reversed incremental mode for the primary backup job and run a post-job script that copies the resulting daily .vbk file to the dedupe device.
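If it helps, here is a minimal sketch of such a post-job script in PowerShell, assuming a reversed incremental primary job so the newest .vbk in the repository folder is always the latest full. The folder paths and share name are just examples; adjust them to your environment.

# Copy the most recent full backup file from the primary repository to a
# share exposed by the dedupe appliance. Paths below are assumptions.
$repoPath   = "D:\Backups\SQL Cluster"                 # hypothetical primary repository folder
$dedupePath = "\\dedupe01\veeam-archive\SQL Cluster"   # hypothetical dedupe appliance share

# With reversed incremental, the newest .vbk is the latest full restore point.
$latestVbk = Get-ChildItem -Path $repoPath -Filter *.vbk |
             Sort-Object LastWriteTime -Descending |
             Select-Object -First 1

if ($latestVbk) {
    # Stamp the copy with the date so each day lands as its own full.
    $target = Join-Path $dedupePath ("{0}_{1}.vbk" -f $latestVbk.BaseName, (Get-Date -Format yyyy-MM-dd))
    Copy-Item -Path $latestVbk.FullName -Destination $target
}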

Thanks.
tsightler
VP, Product Management
Posts: 6035
Liked: 2860 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: Version 7 and Deduplication devices.

Post by tsightler »

There can be a significant disadvantage to this approach compared to using a Backup Copy. If you're just copying VBK files, and those files are already compressed by Veeam, you'll get very poor dedupe from the appliance. With a Backup Copy you can choose to have the blocks written uncompressed to the dedupe appliance. It's pretty trivial to create a simple script that causes the Backup Copy to make a new VBK file on each run; it's basically the same as the "Rotated Media" script that I already use, so this is a simple option if you want this behavior and your active backups are compressed.
soylent
Enthusiast
Posts: 61
Liked: 7 times
Joined: Aug 01, 2012 8:33 pm
Full Name: Max
Location: Fort Lauderdale, Florida
Contact:

Re: Version 7 and Deduplication devices.

Post by soylent »

There are numerous mentions of "rehydration" in this thread, and of how certain storage devices can have very long restore processes (specifically the DD2500).

Can someone please explain what kind of real-world performance I would see running an Instant Recovery from a DD2500 vs. an Exagrid (or any other non-inline dedupe storage)? Are we talking a 2x, 5x or 10x+ difference? Or in terms of actual time, is it a few minutes or a few hours?
tsightler
VP, Product Management
Posts: 6035
Liked: 2860 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: Version 7 and Deduplication devices.

Post by tsightler » 1 person likes this post

I can't really speak to the Exagrid, but from testing various devices in customer environments and labs I can give you the following generalization:

Small Web Server
Normal boot (to usable state): ~2 minutes
Instant Restore from backup on regular disk: 3-4 minutes
Instant Restore from backup on dedupe appliance: 20-25 minutes

Domain Controller
Normal boot (to usable state): ~5 minutes
Instant Restore from backup on regular disk: ~8-10 minutes
Instant Restore from backup on dedupe appliance: 60+ minutes

Small Exchange Server
Normal boot (to usable state): ~10 minutes
Instant Restore from backup on regular disk: ~15-17 minutes
Instant Restore from backup on dedupe appliance: N/A -- services failed to start

These numbers were recorded during testing with a client system. The disk-based backup was to an old NetApp filer, so nothing really special. The dedupe device was not a DD2500 but a higher-end midrange DD unit (I believe a 4500); I've observed similar numbers in the lab and in the field with other dedupe devices, and I don't believe there is a huge difference between the inline dedupe devices in this respect. Exagrid may have some advantage because of their landing zone, assuming the backups are configured correctly.

It should also be noted that storing backups on dedupe storage can impact operations other than Instant Restore. For example, one of my clients has to wait about 10 minutes for the GUI to open a large directory tree when the backup is on a DD, and I've seen several other cases where Explorer for Exchange with large directories was quite slow. In general, I'd say performance is at least 10x slower than with backups on normal disk, perhaps more. That doesn't necessarily make the solution unacceptable, but it's something to be aware of. Some of these issues can be mitigated; for example, with file indexing enabled the FLR issue is less of a problem.

That being said, I know plenty of clients that use Veeam with dedupe appliances and are happy with the results even with these limitations.