jadams159
Enthusiast
Posts: 80
Liked: 4 times
Joined: Apr 16, 2012 11:44 am
Full Name: Justin Adams
Location: United States
Contact:

Version 7 and Deduplication devices.

Post by jadams159 »

I know there has been some discussion about hardware dedupe devices and Veeam job settings, but I want to turn the focus to the new version of Veeam. I'm specifically using Data Domain 2500s. It may be worth noting that these latest Data Domains are approved for use as network shares, so they can be read from at reasonable rates.

What are the best practices when it comes to the following:

Storage Advanced Options-
1) Inline Deduplication
2) Compression
3) Storage Optimizations

Job Type-
I'm assuming I want to use forward incrementals, but should I perform periodic active fulls or synthetic fulls? (Note my comment above that DD2500s don't mind being read from.)

Thanks to everyone.
tsightler
VP, Product Management
Posts: 6024
Liked: 2853 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: Version 7 and Deduplication devices.

Post by tsightler »

Not minding being read from as a file share and performing a synthetic full are two drastically different workloads. Opening and closing files are simple reads that are not high I/O, whereas a synthetic full is pounding random I/O. Until proven otherwise, active fulls will continue to be the recommendation for all dedupe appliances.
foggy
Veeam Software
Posts: 21124
Liked: 2137 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Version 7 and Deduplication devices.

Post by foggy »

jadams159 wrote:What are the best practices when it comes to the following:

Storage Advanced Options-
1) Inline Deduplication
2) Compression
3) Storage Optimizations
Actually, there are no major changes regarding these (aside from the new compression level). Our general recommendation is still to use Veeam dedupe as well as the dedupe-friendly compression mode when backing up to dedupe devices.
jadams159
Enthusiast
Posts: 80
Liked: 4 times
Joined: Apr 16, 2012 11:44 am
Full Name: Justin Adams
Location: United States
Contact:

Re: Version 7 and Deduplication devices.

Post by jadams159 »

Found a doc from Data Domain, but it is dated prior to their latest release of devices and focuses on Veeam v.6. Can you guys tell me what your thoughts are on the quotes taken from the document?

For backup jobs regarding Synthetic vs. Full:
"As a general recommendation, synthetic fulls are recommended and do not impact
restore performance. This process can be made more efficient if the option to select an
Active Full backup occasionally is selected, but this choice requires more read duress and
longer backup windows on the primary storage resources."

For backup jobs regarding Dedupe and Compression:
"For the absolute best storage performance for restores, disable Veeam compression and
deduplication. To do this, deselect the Enable inline data deduplication box and set the
compression level to None,"
"Optionally, to reduce the backup transfer burden, the default configuration can be
selected. This leverages Veeam’s deduplication at the local target option (1 MB block
source deduplication). The repository options explained earlier will ensure that the data
lands on the Data Domain uncompressed, making further reduction perform well"

The repository options mentioned above refer to this:
"Align backup file data blocks and Decompress backup data blocks before
storing options should be selected"

Thanks again.
jadams159
Enthusiast
Posts: 80
Liked: 4 times
Joined: Apr 16, 2012 11:44 am
Full Name: Justin Adams
Location: United States
Contact:

Re: Version 7 and Deduplication devices.

Post by jadams159 »

No thoughts on the above recommendations?
Vitaliy S.
VP, Product Management
Posts: 27297
Liked: 2773 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: Version 7 and Deduplication devices.

Post by Vitaliy S. »

Hi Justin,

As Tom correctly said above, Active Full is still the preferable option for a dedupe appliance. However, you can check this yourself by running both options and comparing the results; I'm pretty sure Active Full will be quicker. As for the compression and deduplication configuration, it depends on what you want to achieve. Our recommendations are still the same as in this topic: Veeam, DataDomain and Linux NFS share

Thank you!
NightBird
Expert
Posts: 244
Liked: 57 times
Joined: Apr 28, 2009 8:33 am
Location: Strasbourg, FRANCE
Contact:

Re: Version 7 and Deduplication devices.

Post by NightBird »

Snif... You are not authorised to read this forum.
Vitaliy S.
VP, Product Management
Posts: 27297
Liked: 2773 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: Version 7 and Deduplication devices.

Post by Vitaliy S. »

Fixed the link in my previous post, now it should work.
jadams159
Enthusiast
Posts: 80
Liked: 4 times
Joined: Apr 16, 2012 11:44 am
Full Name: Justin Adams
Location: United States
Contact:

Re: Version 7 and Deduplication devices.

Post by jadams159 »

So I decided to do some testing. The first week of backups is done. I configured my backup job per the integration guide from Data Domain (noted above).

The weird thing is that my Veeam job statistics report 2.3x deduplication on the first full, despite the fact that inline deduplication is NOT checked.

What's up with that?
Gostev
Chief Product Officer
Posts: 31606
Liked: 7095 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Version 7 and Deduplication devices.

Post by Gostev »

Zeroed virtual disk data blocks are deduped regardless of this setting. There is absolutely no good reason to inflate a backup file with a bunch of zeroed blocks :D
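
To illustrate why a job can report a dedupe ratio even with inline deduplication unchecked, here is a minimal sketch that only skips all-zero blocks. It is not Veeam's implementation; the function name is made up for illustration, and the 1 MB default block size is simply the figure quoted from the Data Domain document earlier in this thread.

```python
BLOCK_SIZE = 1024 * 1024  # 1 MB, the block size quoted earlier in the thread


def zero_block_ratio(disk_image: bytes, block_size: int = BLOCK_SIZE) -> float:
    """Return blocks read divided by blocks stored when all-zero blocks are skipped.

    Hypothetical helper for illustration only; no other deduplication is done.
    """
    zero_block = bytes(block_size)
    total = 0
    stored = 0
    for offset in range(0, len(disk_image), block_size):
        block = disk_image[offset:offset + block_size]
        total += 1
        # The last block may be shorter than block_size, so compare by length.
        if block != zero_block[:len(block)]:
            stored += 1
    if total == 0:
        return 1.0  # empty image: nothing read, nothing saved
    if stored == 0:
        return float("inf")  # the whole disk was zeroes
    return total / stored


# A toy "disk" of 10 tiny blocks, 4 with data and 6 zeroed.
image = b"\x01" * (4 * 8) + bytes(6 * 8)
print(zero_block_ratio(image, block_size=8))
```

A thin or mostly empty virtual disk behaves like the toy image: the more zeroed space it contains, the higher the reported ratio on the first full.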
jadams159
Enthusiast
Posts: 80
Liked: 4 times
Joined: Apr 16, 2012 11:44 am
Full Name: Justin Adams
Location: United States
Contact:

Re: Version 7 and Deduplication devices.

Post by jadams159 »

AH HA! Thanks Gostev. Nice meeting you at VMworld.

I'm halfway through my testing and will post a summary of my results towards the end of the week.
luckyinfil
Enthusiast
Posts: 91
Liked: 10 times
Joined: Aug 30, 2013 8:25 pm
Contact:

Re: Version 7 and Deduplication devices.

Post by luckyinfil »

Just wanted to add to this discussion: how do dedupe appliances (such as Data Domains and even Windows Server 2012) affect Veeam restore times and VM performance when using Instant VM Recovery? Since each block needs to be rehydrated before the restore, will it be significantly slower? If so, this leads to my next question of whether one should have an undeduped repository as well as the deduped repository.

i.e.: Veeam backup job -> undeduped repo (say, keep 1 week's worth of backups) and then a Veeam backup copy job -> deduped repository (long-term retention). This would allow restores to come from the undeduped storage, which would increase speeds.

Also, I haven't played around with backup copy jobs yet, but can some of the settings be clarified? If I choose 1 backup copy every day starting at 12 AM and set it to process a backup job, does that mean that every day at 12 AM the latest backup is copied over? What does the backup copy file chain look like: a bunch of VBKs or a VBK+VIB chain? Also, do the restore points include the GFS restore points?
For example, if I wanted an archive of 0 daily restore points (assume I have a primary undeduped storage array), 52 weekly restore points, 36 monthly restore points, and 7 yearly restore points, do I set restore points to 0 (assuming the backup copy job runs daily) or do I set it to 0+52+36+7=95 restore points?

Once again, in the scenario above, how are all these restore points stored? As 95 different VBKs (which would be perfect for dedupe appliances like Data Domain) or as something different (a combination of VBKs and VIBs)? A clear answer to this example would be appreciated.


I really wish Veeam had better documentation. A lot of the settings are unclear and none of the documentation really goes into detail how the backup and GFS jobs work, especially with dedupe appliances.
foggy
Veeam Software
Posts: 21124
Liked: 2137 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Version 7 and Deduplication devices.

Post by foggy »

luckyinfil wrote:how do dedupe appliances (such as data domains and even Windows Server 2012) affect Veeam restore times? Since each block needs to be rehydrated before the restore, will it be significantly slower?
Yes, your assumption is correct.
luckyinfil wrote:If so, this leads to my next question of whether one should have a undeduped repository as well as the deduped repository.
ie: Veeam backup Job -> Undeduped Repo (say keep 1 weeks worth of backups) and then a Veeam backup copy job -> Deduped repository (long term retention). This would allow for restores to come from the undeduped storage which would increase speeds.
Doing so is, in fact, a good practice.
luckyinfil wrote:If I choose 1 backup copy every day starting at 12 AM and set it to process a backup job, does that mean that every day at 12AM, the latest backup is copied over? How do the backup copy file chain look like, a bunch of VBKs or an VBK+VIB chain? Also, do the restore points include the GFS restore points?
Every day at 12AM, the backup copy job will start syncing the latest VM state, if the corresponding restore point is available. Note that backup copy does not copy files but rather synthetically creates restore points in remote location from the changed blocks extracted from the source storage. And backup copy job is always incremental (VBK+VIB chain).
luckyinfil wrote:Also, do the restore points include the GFS restore points?
Not sure I follow you here. You can specify GFS retention settings and backup copy will offload corresponding full restore points right next to the backup chain.
luckyinfil wrote:For example, if I wanted an archive of 0 daily restore points (assume I have a primary undeduped storage array), 52 weekly restore points, 36 monthly restore points, and 7 yearly restore points, do i set restore points to 0 (assuming the backup copy job runs daily) or do I set it to 0+52+36+7=95 restore points?
You cannot specify the number of restore points less than 2.
luckyinfil wrote:I really wish Veeam had better documentation. A lot of the settings are unclear and none of the documentation really goes into detail how the backup and GFS jobs work, especially with dedupe appliances.
Have you checked this user guide section?
dellock6
VeeaMVP
Posts: 6155
Liked: 1968 times
Joined: Jul 26, 2009 3:39 pm
Full Name: Luca Dell'Oca
Location: Varese, Italy
Contact:

Re: Version 7 and Deduplication devices.

Post by dellock6 »

There is, however, a big difference between deduplication appliances: the impact on restore operations is substantial on all those doing inline deduplication. There are a few exceptions (Windows 2012 and Exagrid) where dedup is post-process, and you can choose to keep at least the last version of the backup in a non-deduplicated state. Exagrid also has specific settings for Veeam.
This certainly uses more raw disk space, but at least restores from the last backup (90% or more of all restores are made from the last version) can be executed from a non-deduplicated backup file.

Honestly, however, since Veeam 7.0 it is even better to design a two-tier backup storage: use fast, non-deduplicated storage as Tier 1, holding just 1 or 2 restore points, and use a dedup appliance only as a second (long-term) tier.

Luca.
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software

@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1
luckyinfil
Enthusiast
Posts: 91
Liked: 10 times
Joined: Aug 30, 2013 8:25 pm
Contact:

Re: Version 7 and Deduplication devices.

Post by luckyinfil »

foggy wrote: Every day at 12AM, the backup copy job will start syncing the latest VM state, if the corresponding restore point is available. Note that backup copy does not copy files but rather synthetically creates restore points in remote location from the changed blocks extracted from the source storage. And backup copy job is always incremental (VBK+VIB chain).
So in this case, how does the VBK+VIB chain work? What happens when this chain exceeds the maximum number of restore points? For example, with a max of 4 restore points, what happens to the chain VBK+VIB+VIB+VIB? In a normal incremental job, you would have up to 7 restore points (VBK+VIB+VIB+VIB+VBK+VIB+VIB; the 8th restore point would delete the first VBK chain).


foggy wrote: You cannot specify the number of restore points less than 2.
That does not answer my question at all. In my example above, I want to archive 0 daily restore points, 52 weekly, 36 monthly, and 7 yearly. What number do I set the "Restore points to keep" setting to, and what do the files on the backup copy repository look like?
dellock6
VeeaMVP
Posts: 6155
Liked: 1968 times
Joined: Jul 26, 2009 3:39 pm
Full Name: Luca Dell'Oca
Location: Varese, Italy
Contact:

Re: Version 7 and Deduplication devices.

Post by dellock6 »

If I understood correctly, GFS restore points are not counted towards the maximum restore points setting; that setting applies only to the continuous backup copy chain.

Luca.
foggy
Veeam Software
Posts: 21124
Liked: 2137 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Version 7 and Deduplication devices.

Post by foggy »

luckyinfil wrote:So in this case, how does the VBK+VIB chain work. What happens when this chain exceeds the maximum number of restore points? IE: Max of 4 restore points, what happens to this chain VBK+VIB+VIB+VIB. In a normal incremental job, you would have up to 7 restore points (VBK+VIB+VIB+VIB+VBK+VIB+VIB, the 8th restore point would delete the first VBK chain).
The backup chain is transformed to make room for the most recent restore point. The process is explained on the page I referred to above in detail.
luckyinfil wrote:I want to archive 0 daily restore points, 52 weekly, 36 monthly, and 7 yearly, what is the number i set the "Restore points to keep" setting to and what do the files on the backup copy repository look like.
You can set it to any number, let it be the minimum 2 restore points, for example. You will have the two latest restore points as VBK+VIB and then the specified GFS restore points as VBK files.
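
One way to picture the "transformed to make room" step is the sketch below. It assumes the transform folds the oldest increment into the full so that the full moves forward by one restore point; that detail is an assumption based on the user guide rather than something stated in this thread, and the function and labels are hypothetical.

```python
def sync(chain: list[str], day: int, retention: int) -> list[str]:
    """Add one restore point, then transform the chain to respect retention.

    chain[0] is always the full (VBK); every later entry is an increment (VIB).
    """
    if retention < 2:
        # Per the thread, fewer than 2 restore points cannot be configured.
        raise ValueError("retention must be at least 2 restore points")
    if not chain:
        return [f"VBK(day {day})"]
    chain = chain + [f"VIB(day {day})"]
    while len(chain) > retention:
        # Assumed transform: merge the oldest increment into the full, so the
        # full now represents the state of that increment's day.
        merged_day = chain[1][len("VIB("):-1]
        chain = [f"VBK({merged_day})"] + chain[2:]
    return chain


chain: list[str] = []
for day in range(1, 5):
    chain = sync(chain, day, retention=2)
    print(day, chain)
```

With the minimum retention of 2, the chain never grows beyond one VBK plus one VIB; GFS fulls, if configured, would sit next to this chain as separate VBK files.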
luckyinfil
Enthusiast
Posts: 91
Liked: 10 times
Joined: Aug 30, 2013 8:25 pm
Contact:

Re: Version 7 and Deduplication devices.

Post by luckyinfil »

dellock6 wrote:If I understood correctly, GFS restore points are not counted towards the maximum restore points setting, this is for the continuous backupcopy.

Luca.
If I can get a confirmation, that answers my question but raises another. How are these GFS restore points stored?

ie:

1 VBK for each restore point ( ie: multiple VBKs for weekly, multiple VBKs for monthly, multiple VBKs for yearly etc)
VBK chains (forward or reverse incremental?) for each GFS "type" (i.e. a VBK chain for all the weeklies, a VBK chain for all the monthlies, a VBK chain for all the yearlies, etc.)
1 VBK chain for all GFS types
foggy
Veeam Software
Posts: 21124
Liked: 2137 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Version 7 and Deduplication devices.

Post by foggy »

luckyinfil wrote:1 VBK for each restore point ( ie: multiple VBKs for weekly, multiple VBKs for monthly, multiple VBKs for yearly etc)
This one.
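
Putting the answers so far together, the file layout for the 52/36/7 example can be estimated with a small sketch. The class and function names are hypothetical, and the count assumes that no restore point is shared between GFS tiers, so treat it as an upper bound rather than an exact figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CopyJobRetention:
    restore_points: int  # the "Restore points to keep" setting
    weekly: int = 0
    monthly: int = 0
    yearly: int = 0


def expected_files(r: CopyJobRetention) -> dict[str, int]:
    """Estimate how many VBK and VIB files end up on the target repository."""
    if r.restore_points < 2:
        raise ValueError("the thread states the minimum is 2 restore points")
    gfs_fulls = r.weekly + r.monthly + r.yearly  # one standalone VBK each
    return {
        "vbk": 1 + gfs_fulls,         # the chain's full plus the GFS fulls
        "vib": r.restore_points - 1,  # increments in the regular chain
    }


print(expected_files(CopyJobRetention(restore_points=2, weekly=52, monthly=36, yearly=7)))
```

For the example in this thread, that is the two-point VBK+VIB chain plus 95 archived fulls.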
luckyinfil
Enthusiast
Posts: 91
Liked: 10 times
Joined: Aug 30, 2013 8:25 pm
Contact:

Re: Version 7 and Deduplication devices.

Post by luckyinfil »

In the case where you have Veeam -> undeduped repository (short-term backups, 1 week) -> dedupe appliance (long-term backups), how can you configure the backup copy job so that it only copies data from the backup job (undeduped repository) that is more than 1 week old?

For example, someone may want to keep 52 weeks of daily backup restore points (365 restore points in total). In the above architecture, they would want to have 1 week of daily backups in the undeduped (for fast restores and live restores) and the next 51 weeks of daily backups onto the dedupe appliance (which would be significantly slower for restores).


Also, if someone wanted to keep 365 daily restore points for a backup copy job, that would be 1 VBK and 364 VIBs. Are there any issues with that? Intuitively to me, there would be extremely slow restore times due to the LONG chain of backups, AND the rehydration by the dedupe appliance. Is it possible to store the daily backups as VBKs only similar to the GFS restore points as you've indicated earlier?
foggy
Veeam Software
Posts: 21124
Liked: 2137 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Version 7 and Deduplication devices.

Post by foggy »

luckyinfil wrote:In the case where you have Veeam -> Undeduped Repository (short term backups, 1 week) -> Dedupe Appliance (long term backups), how can you configure backup copy job such that it only copies data from the backup job (undeduped Repository) that are more than 1 week old.
That could be achieved with a regular backup job keeping 7 restore points on the primary repository, and a backup copy job from the primary repository to the secondary repository keeping 365 restore points.
luckyinfil wrote:Also, if someone wanted to keep 365 daily restore points for a backup copy job, that would be 1 VBK and 364 VIBs. Are there any issues with that? Intuitively to me, there would be extremely slow restore times due to the LONG chain of backups, AND the rehydration by the dedupe appliance. Is it possible to store the daily backups as VBKs only similar to the GFS restore points as you've indicated earlier?
Correct, long incremental chains mean longer restores. That's why GFS retention keeps restore points as VBK files. However, there's no built-in ability to offload VBK daily, except doing this manually via some script.
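
The cost of a long chain can be made concrete with a tiny sketch. It assumes a plain forward incremental chain, where restoring point N needs the full plus every increment up to N; the function name is made up for illustration.

```python
def files_needed(restore_point: int, chain_length: int) -> int:
    """Files that must be read to restore a point in a VBK+VIB chain.

    Point 1 is the full (VBK); points 2..chain_length are increments (VIB).
    """
    if not 1 <= restore_point <= chain_length:
        raise ValueError("restore point is outside the chain")
    # The full plus every increment up to and including the chosen point.
    return restore_point


print(files_needed(365, 365))  # newest point in a 365-point chain
print(files_needed(1, 1))      # a standalone full, as GFS keeps them
```

On a dedupe appliance, every one of those files also has to be rehydrated, which is why standalone fulls restore faster there.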
luckyinfil
Enthusiast
Posts: 91
Liked: 10 times
Joined: Aug 30, 2013 8:25 pm
Contact:

Re: Version 7 and Deduplication devices.

Post by luckyinfil »

foggy wrote: That could be achieved by the regular backup job with 7 restore points retention to the primary repository and backup copy job from the primary repository to the secondary repository with 365 restore points retention.
Correct, long incremental chains mean longer restores. That's why GFS retention keeps restore points as VBK files. However, there's no built-in ability to offload VBK daily, except doing this manually via some script.
Regarding the first point, that would create a duplicate of data which I was trying to avoid.

Regarding the second point, that's extremely unfortunate. Seems like the backup copy job still has a lot to improve before it can be considered enterprise grade. There should be an option to have full daily VBKs for the GFS settings so that those who have dedupe appliances and want many daily restore points can still have relatively fast restores.
Vitaliy S.
VP, Product Management
Posts: 27297
Liked: 2773 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: Version 7 and Deduplication devices.

Post by Vitaliy S. »

If you want to store daily restore points as VBKs (full backups), you can create a regular backup job and run active fulls every day. Alternatively, you can configure your primary backup job to use reverse incremental mode and then use a post-backup job script to offload the VBK (full VM backup) to dedupe storage. Hope this helps!
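
A post-job offload script along those lines could look like the sketch below. It is only a starting point under stated assumptions: the folder paths are hypothetical, the newest .vbk in the job folder is assumed to be the full to offload, and how the script is wired into the job is not covered here.

```python
import shutil
from datetime import datetime
from pathlib import Path


def offload_latest_vbk(job_dir: Path, dedupe_dir: Path) -> Path:
    """Copy the most recently modified .vbk in job_dir to dedupe_dir.

    The copy gets a date suffix so each day's full is kept as its own file.
    """
    candidates = sorted(job_dir.glob("*.vbk"), key=lambda p: p.stat().st_mtime)
    if not candidates:
        raise FileNotFoundError(f"no .vbk files found in {job_dir}")
    latest = candidates[-1]
    stamp = datetime.fromtimestamp(latest.stat().st_mtime).strftime("%Y-%m-%d")
    dedupe_dir.mkdir(parents=True, exist_ok=True)
    target = dedupe_dir / f"{latest.stem}_{stamp}{latest.suffix}"
    shutil.copy2(latest, target)  # copy2 also preserves timestamps
    return target


if __name__ == "__main__":
    # Hypothetical paths; replace with the real repository and share.
    print(offload_latest_vbk(Path("D:/Backups/MyJob"), Path("//dd2500/veeam/MyJob")))
```

The script does no pruning, so retention on the dedupe side would have to be handled separately.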
kte
Expert
Posts: 179
Liked: 8 times
Joined: Jul 02, 2013 7:48 pm
Full Name: Koen Teugels
Contact:

Re: Version 7 and Deduplication devices.

Post by kte »

Can you run Windows 2012 dedup on GFS backups, and will Windows be able to keep up? I saw that it can only process 300 GB a day.
Vitaliy S.
VP, Product Management
Posts: 27297
Liked: 2773 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: Version 7 and Deduplication devices.

Post by Vitaliy S. »

Koen,

Yes, you can. However, I've actually heard that Windows dedupe can do 100 GB per hour. Please search for existing topics on Windows dedupe best practices and real-life examples.

Thanks!
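
To see how much the two quoted figures differ in practice, here is a quick back-of-the-envelope sketch. Both rates are taken as quoted in this thread and are not verified, and the 1000 GB full backup size is a made-up example.

```python
def hours_to_process(new_data_gb: float, rate_gb_per_hour: float) -> float:
    """Hours a post-process dedupe job needs to work through new data."""
    if rate_gb_per_hour <= 0:
        raise ValueError("rate must be positive")
    return new_data_gb / rate_gb_per_hour


RATE_LOW = 300 / 24   # "300 GB a day" expressed per hour (12.5 GB/h)
RATE_HIGH = 100.0     # "100 GB per hour"

full_backup_gb = 1000  # hypothetical size of one GFS full
print(hours_to_process(full_backup_gb, RATE_LOW))
print(hours_to_process(full_backup_gb, RATE_HIGH))
```

At the lower rate, a single 1000 GB full would take more than three days to process, while at the higher rate it fits comfortably between daily runs, so it is worth measuring the real rate on your own hardware.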
veremin
Product Manager
Posts: 20335
Liked: 2277 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin
Contact:

Re: Version 7 and Deduplication devices.

Post by veremin »

The additional backup job does seem to be the best option, since the maximum number of restore points a backup copy job can keep with simple retention is 99.

Thanks.
foggy
Veeam Software
Posts: 21124
Liked: 2137 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Version 7 and Deduplication devices.

Post by foggy »

v.Eremin wrote:the maximum value of restore points backup copy job can keep in case of simple retention period is 99.
Specifically to avoid extremely long incremental backup chains, I suppose. So you would still need to use GFS retention for longer term archival.
dellock6
VeeaMVP
Posts: 6155
Liked: 1968 times
Joined: Jul 26, 2009 3:39 pm
Full Name: Luca Dell'Oca
Location: Varese, Italy
Contact:

Re: Version 7 and Deduplication devices.

Post by dellock6 »

GFS retention makes dedup appliances really helpful. There is a high chance that many blocks in different VBK files are the same, so a dedup appliance can achieve a high deduplication ratio in this situation. With this kind of appliance, having several full VBK files is not a huge problem.

Luca.
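
The effect of identical blocks across several fulls can be shown with a toy sketch. It hashes fixed-size blocks, which is a simplification: real appliances typically use their own segmenting and add compression, so this is not how any particular device works, and the function name is made up.

```python
import hashlib


def dedupe_ratio(files: list[bytes], block_size: int) -> float:
    """Logical bytes divided by the bytes of unique blocks across all files."""
    unique: dict[bytes, int] = {}
    logical = 0
    for data in files:
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            logical += len(block)
            # Store each distinct block once, keyed by its hash.
            unique.setdefault(hashlib.sha256(block).digest(), len(block))
    stored = sum(unique.values())
    return logical / stored if stored else 1.0


# Three toy "weekly fulls" of four blocks each; only the last block changes.
fulls = [b"AAAABBBBCCCCDDDD", b"AAAABBBBCCCCEEEE", b"AAAABBBBCCCCFFFF"]
print(dedupe_ratio(fulls, block_size=4))
```

The more fulls share unchanged blocks, the higher the ratio climbs, which is why keeping GFS restore points as standalone VBK files costs relatively little space on such a device.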
opg70
Influencer
Posts: 24
Liked: 3 times
Joined: Oct 06, 2013 8:48 am
Contact:

Re: Version 7 and Deduplication devices.

Post by opg70 »

Alright, since my first post was moderated, I'll be a bit more vague. Does anyone have experience using standalone software (there are open source compression/deduplication tools which seem to perform quite well and provide much better compression than, say, ZIP) with scripting to launch VBK compression and dedup? I'm thinking it should be fairly trivial and might give (with a good server) results just as good as, if not better than, hardware dedup appliances or Windows 2012 dedup (with its limitations on the file capacity it can process). The only disadvantage would be having to manage the extraction of the needed backup files manually.
dellock6
VeeaMVP
Posts: 6155
Liked: 1968 times
Joined: Jul 26, 2009 3:39 pm
Full Name: Luca Dell'Oca
Location: Varese, Italy
Contact:

Re: Version 7 and Deduplication devices.

Post by dellock6 » 1 person likes this post

You already hit the major problem with software-based compression and deduplication: since they create custom files containing the original Veeam file, every restore operation on these files first has to wait for extraction, creating huge bottlenecks in restores. Eventually, the RTO would become so long that the solution would be nearly useless for any scenario.
Also, remember that Veeam needs every file in the backup chain to be available at all times, because, for example, the full backup is the starting point for any incremental backup. Compressing the VBK file into a proprietary format would be equivalent to taking it offline, forcing Veeam to create a new full.

Deduplication and compression ultimately need to be completely transparent to Veeam, so that it can still see its original files.

Luca.