-
- Enthusiast
- Posts: 57
- Liked: 3 times
- Joined: Jul 02, 2013 4:17 am
- Full Name: NIck
- Contact:
EMC Data Domain
Hi,
I am looking for field experience with Veeam and DD. Have any of you done a deployment? If yes, any recommendations, things to worry about, or issues?
What kind of dedup can you achieve?
cheers
-
- Veeam Software
- Posts: 21139
- Liked: 2141 times
- Joined: Jul 11, 2011 10:22 am
- Full Name: Alexander Fogelson
- Contact:
Re: EMC Data Domain
Nick, while you're waiting for others to chime in, I recommend reviewing the following existing topics regarding DataDomain deployment:
EMC Data Domain best practices
Veeam, DataDomain and Linux NFS share
Hope this helps.
-
- Chief Product Officer
- Posts: 31806
- Liked: 7300 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: EMC Data Domain
Hi Nick, generally it works well. I know of happy customers with PBs of Veeam backup data sitting in DataDomain boxes, however they are using pretty high end DD boxes. I've also seen customers struggling with the lowest end DD offering in terms of performance, but the mid-range boxes seem to be fine. Just be sure to follow the EMC recommendations on integrating with Veeam (they have a white paper available), and you should be good.
Dedupe will be decent if you deploy Veeam in the recommended manner (as periodic Active Full backups will dedupe nicely), but dedupe ratios are all over the place, because they depend on the specific workload - so this is the wrong question to ask on a public forum. For example, you will not get much dedupe backing up servers holding X-Ray images, as I saw with one hospital.
Also, you might be interested in our recent announcement > EMC Data Domain Boost coming to Veeam Availability Suite v8!
-
- Novice
- Posts: 9
- Liked: 5 times
- Joined: Mar 24, 2011 2:37 pm
- Contact:
Re: EMC Data Domain
One thing that helped our performance immensely when backing up to a CIFS share on our Data Domain was splitting the jobs up to different backup repositories. The best practice documentation is somewhat lackluster and didn't shed light on the fact that Veeam only has one write stream per backup repository.
So, in order to shrink our backup window, I created four backup repositories that all pointed to the same CIFS share on the Data Domain. I then split our ten backup jobs out to the four backup repositories. I scheduled four jobs to kick off at the same time, each writing to a separate repository. It worked amazingly well. We cut our full backup times by more than half and our incrementals sped up quite a bit as well. We had one full backup job that was taking around 23 hours to complete and that time was reduced to a little under 8 hours. That same job's incremental was 50 minutes before, compared to 14 minutes after.
This http://forums.veeam.com/veeam-backup-re ... 21574.html post by my coworker uncovered this info for us.
I'm interested in what the DD Boost might gain us. Highly looking forward to version 8.
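For anyone who wants to plan a split like the one described above before touching the console, here is a minimal sketch in Python. It does not call any Veeam API: the job and repository names are made up, and round-robin assignment is only one assumed way to spread ten jobs over four repositories. The speedup figures are computed from the rounded times quoted in the post.

```python
from itertools import cycle


def split_jobs(jobs, repositories):
    """Assign jobs to repositories round-robin and return {repository: [jobs]}."""
    assignment = {repo: [] for repo in repositories}
    for job, repo in zip(jobs, cycle(repositories)):
        assignment[repo].append(job)
    return assignment


# Hypothetical names: ten backup jobs, four repositories on the same CIFS share.
jobs = [f"job-{n}" for n in range(1, 11)]
repos = [f"dd-repo-{n}" for n in range(1, 5)]
plan = split_jobs(jobs, repos)

# Rough speedups from the rounded figures in the post.
full_speedup = 23 / 8          # about 23 h before, a little under 8 h after
incremental_speedup = 50 / 14  # 50 min before, 14 min after

for repo, assigned in plan.items():
    print(repo, assigned)
print(f"full: ~{full_speedup:.1f}x, incremental: ~{incremental_speedup:.1f}x")
```

With ten jobs and four repositories, two repositories end up with three jobs and two with two, so the jobs that start together should be picked from different repositories.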
-
- Chief Product Officer
- Posts: 31806
- Liked: 7300 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: EMC Data Domain
tjgrie wrote: The best practice documentation is somewhat lackluster and didn't shed light on the fact that Veeam only has one write stream per backup repository.
Just wanted to note that there is no such limit. We do one write stream per each job hitting the repository. But, depending on your setup, you may indeed end up with a single job using up all available task slots, resulting in (effectively) one write stream per backup repository - just as explained in the above linked topic. Thanks!
-
- VP, Product Management
- Posts: 6035
- Liked: 2860 times
- Joined: Jun 05, 2009 12:57 pm
- Full Name: Tom Sightler
- Contact:
Re: EMC Data Domain
Absolutely correct Anton, I was going to point out the same thing. That being said, unless you take very good care, it's really hard to optimize the number of I/O streams on the repository in v7.
So, I will use this as an opportunity to push one of my feature requests: the ability to set a target "number of I/O streams" for a repository. This could then be used as a hint by the load balancer on how to assign task slots to jobs. If a repository was configured for 16 tasks, but had its ideal number of I/O streams set to 4, the load balancer would attempt to assign those task slots evenly across up to 4 jobs. If there weren't enough jobs pending to keep 4 I/O streams busy, the tasks could still be assigned to the remaining jobs. With this feature, as long as multiple jobs were created, you could get maximum performance from the repository without having to create multiple repositories and manually assign jobs to them. It would also have the advantage of not artificially limiting jobs to a lower number of tasks even when there isn't contention, which is what happens when you manually create multiple repositories and assign jobs to them.
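To make the feature request above concrete, here is a minimal sketch of the proposed behaviour in Python. It is purely illustrative: `assign_task_slots`, its parameters, and the job names are hypothetical and say nothing about how Veeam's load balancer is actually implemented. It only shows the idea of spreading a repository's task slots evenly over at most the target number of jobs, while still handing every slot out when fewer jobs are pending.

```python
def assign_task_slots(total_slots, target_streams, pending_jobs):
    """Spread task slots evenly over at most `target_streams` pending jobs.

    If fewer jobs are pending than the target, the remaining jobs share
    all slots, so no job is artificially limited when there is no contention.
    """
    active = pending_jobs[:target_streams]
    if not active:
        return {}
    base, extra = divmod(total_slots, len(active))
    # The first `extra` jobs get one additional slot when the split is uneven.
    return {job: base + (1 if i < extra else 0) for i, job in enumerate(active)}


# Repository configured for 16 tasks with an ideal of 4 I/O streams.
print(assign_task_slots(16, 4, ["a", "b", "c", "d", "e"]))
print(assign_task_slots(16, 4, ["a", "b"]))
```

The fifth job in the first call simply waits for the next round; how a real scheduler would queue it is outside this sketch.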
-
- Enthusiast
- Posts: 57
- Liked: 3 times
- Joined: Jul 02, 2013 4:17 am
- Full Name: NIck
- Contact:
Re: EMC Data Domain
Thanks very much to everyone. One of the reasons why we are pushing Veeam is the community. We are an EMC partner and we will try to see if we can organize some testing. I will keep you posted, so we might be able to set up a good thread dedicated to this.
-
- Enthusiast
- Posts: 57
- Liked: 3 times
- Joined: Jul 02, 2013 4:17 am
- Full Name: NIck
- Contact:
Re: EMC Data Domain
One question about replication.
Since Veeam still lacks WAN acceleration for replicas, I was thinking of leveraging the EMC replication feature in the DD. My question, though, is this: since I cannot run VMs directly from the DD storage, at a certain point I will need to restore the VMs from the DD to the DR environment. I have heard some weird stories that, when it comes to restores, the DD is pretty slow.
Any thoughts?
cheers
-
- Chief Product Officer
- Posts: 31806
- Liked: 7300 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: EMC Data Domain
v8 has WAN acceleration for replicas, see Veeam Availability Suite v8 (under Data Loss Avoidance).
-
- Novice
- Posts: 9
- Liked: 5 times
- Joined: Mar 24, 2011 2:37 pm
- Contact:
Re: EMC Data Domain
Gostev wrote: Just wanted to note that there is no such limit. We do one write stream per each job hitting the repository. But, depending on your setup, you may indeed end up with a single job using up all available task slots, resulting in (effectively) one write stream per backup repository - just as explained in the above linked topic. Thanks!
Sorry, I mistyped that bit. The take-home lesson for anyone wanting to max out their write performance to their DD is to create multiple backup repositories, set maximum concurrent tasks to 4 or 6 or whatever, split your backup jobs between them, and then run those jobs simultaneously. tsightler's responses to us in the other thread were enormously helpful.
myrdin wrote: thanks very much to everyone, one of the reason why we are pushing veeam is the community. We are EMC partner and we will try to see if we can organize some testings. I will keep posted so we might be able to set up a good thread dedicated to this.
What? An EMC partner not pushing Avamar or Networker? For shame! On a more serious note, we have Avamar but use Veeam to back up our virtual environment. Maybe one day Avamar won't have such a clunky UI, won't be so price prohibitive, and will be nearly as easy to use as Veeam...
myrdin wrote: Since Veeam lacks still the WAN Acceleration for replicas, i was thinking to leverage the EMC Replication feature in the DD. My question thou is, since i cannot run VMs directly from the DD storage, at a certain point i will need to restore the Vms from the DD to the DR environment. I have heard some weird stories about the fact that when it comes to restore, the DD is pretty slow.
I think that you can run VMs directly from the DD, but please hit up an EMC DPAD (formerly BRMS) person to confirm. I've only done some limited testing with SureBackup and the DD, but I was able to get a couple of VMs to boot and run from it. We've only done single file restores so far and they seem to be quick. I've not had to restore an entire VM from the DD yet to test the speed. Perhaps I'll find some time today and test that out.
-
- Enthusiast
- Posts: 61
- Liked: 7 times
- Joined: Aug 01, 2012 8:33 pm
- Full Name: Max
- Location: Fort Lauderdale, Florida
- Contact:
Re: EMC Data Domain
Until version 8 is released, I can say that the Data Domain is NOT compatible with Veeam v7 as primary storage, no matter what their documentation tells you. I've had tickets with EMC about this and they've assured me that they are fully compatible with Veeam (their documentation had version 6), and were very reluctant to admit what was completely obvious.
Yes, Veeam can backup to a DD. However, you have to disable everything on Veeam to make it work. To me, that's like saying Veeam is compatible with a USB thumb drive. Sure, but you're not going to like the results.
I'm very excited that Veeam v8 will support DD Boost, and it will make it so much more bearable. However, it will still only make it partially compatible, and here is why:
Data Domain is designed to be a replacement of tape backups. It's a streaming backup appliance that does in-line deduplication, which means that all data that comes in gets automatically deduped. You can't turn that feature off, it's the whole point of the product.
Here's one thing the marketing brochures and sales guys never talk about: "rehydration". All data going out must be "rehydrated" (more on that later). With Veeam v7, writing backups to the DD is actually pretty fast. There are some extra settings, like you need to disable compression and use the "dedupe friendly" deduplication option. You also need to enable aligning storage blocks for the Data Domain repository. But it's fast because the DD is designed to take in streaming data. Pushing data back out is another matter.
You'll think it's running pretty great during test backups and the first few days of your backup schedule, until you reach the number of restore points you set up in Veeam. Then the problem arises when you try to do anything that requires reading the data, like doing a synthetic full backup. Synthetic fulls require a huge amount of read IO as well as write IO. That's where the "rehydration" comes in. The Data Domain must rehydrate data in real-time as it's being read, which is very CPU intensive and very, very slow. For example, I have a job that's about 1.5 terabytes of data. The daily incremental backups take a couple hours or less. The synthetic full took days to finish.
What else is heavy on read IO? SureBackup jobs, Instant Recovery, file browsing, Virtual Lab, and even Backup Copy jobs. Forget about those features if you're using a Data Domain, unless you don't mind waiting an hour to boot up a single VM.
Veeam v8 is going to address the issue with synthetic fulls and backup copy jobs by using some magic and Data Domain's "DD Boost" technology (*requires an additional license from EMC). But it's not going to have any effect on the performance of SureBackup jobs or Instant Recovery. The Data Domain must rehydrate data in order for it to be read, and nothing short of redesigning the Data Domain's hardware is going to change that. Other in-line dedupe appliances include staging partitions that negate this problem, but DD does not.
Let me be clear, the Data Domain works as intended, and I like the product. But it is simply not designed for backup solutions such as Veeam. My beef is with EMC selling the Data Domain as a primary storage device, and not properly documenting or disclosing the facts about the incompatibilities with some of Veeam's main selling features. You will need a primary storage for staging your backups, and for secondary storage or long-term archiving, the Data Domain works well.
And I hope that Veeam will make it very clear about the limitations when they are pushing v8 with DD Boost support.
-
- Chief Product Officer
- Posts: 31806
- Liked: 7300 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: EMC Data Domain
Hi, Max.
I am sorry, but I don't think you have a correct understanding of the upcoming Veeam integration with DDBoost, because you are listing some issues it resolves as outstanding (for example, synthetic fulls doing read I/O on DataDomain). Also, you are making some incorrect statements on the current state of things (for example, enabling aligning will actually hurt DataDomain integration - as explained right there in the user interface, aligning should only be enabled for constant block size deduplicating storage, while DataDomain uses variable block size).
And we've been very clear on the recommended way of using deduplicating storage with Veeam, promoting the correct architecture for a long while now. In short, you should back up to small but fast storage (for fastest backup and restore), then Backup Copy to large, low cost per TB archival storage (for long term retention, so it can be slow). This is the cheapest and the most effective way to meet the 3-2-1 best practice of backup. Any other backup architecture you can come up with will either leave you with empty pockets for no good reason, or will leave you unprotected.
So, regardless of the fact that DDBoost integration resolves most outstanding issues you have listed below, we will continue to recommend what we call "the ultimate VM backup architecture" even when v8 is out.
And as a final note, we find that most performance issues we are seeing in support are related to lower end DataDomain boxes anyway. As noted earlier, I know of happy customers with petabytes of Veeam backup data sitting in higher end DataDomain boxes, and they actually could not care less about DDBoost integration, because it works very well for them already. One particular customer I am thinking about backs up direct to DataDomain though, and I believe they are using active fulls instead of synthetic fulls.
Thanks!
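As a quick illustration of the 3-2-1 rule mentioned above (at least three copies of the data, on at least two different media, with at least one copy offsite), here is a minimal sketch in Python. The `Copy` class, the media labels, and the example layout are all assumptions made for illustration; in particular, placing the archival dedupe appliance offsite is an assumed deployment choice, not something stated in the post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str
    media: str     # free-form label, e.g. "primary disk" or "dedupe appliance"
    offsite: bool


def meets_3_2_1(copies):
    """Check for >= 3 copies, >= 2 distinct media, and >= 1 offsite copy."""
    return (
        len(copies) >= 3
        and len({c.media for c in copies}) >= 2
        and any(c.offsite for c in copies)
    )


# Hypothetical layout: production data, a small fast backup repository,
# and a Backup Copy on archival dedupe storage at another site.
layout = [
    Copy("production VMs", "primary disk", offsite=False),
    Copy("fast backup repository", "backup disk", offsite=False),
    Copy("Backup Copy archive", "dedupe appliance", offsite=True),
]
print(meets_3_2_1(layout))      # all three conditions hold
print(meets_3_2_1(layout[:2]))  # only two copies, none offsite
```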
-
- Enthusiast
- Posts: 61
- Liked: 7 times
- Joined: Aug 01, 2012 8:33 pm
- Full Name: Max
- Location: Fort Lauderdale, Florida
- Contact:
Re: EMC Data Domain
Gostev wrote: Hi, Max.
I am sorry, but I don't think you have correct understanding of the upcoming Veeam integration with DDBoost, because you are listing some issues it resolves as outstanding (for example, synthetic fulls doing read I/O on DataDomain).
I think I was pretty clear that v8 resolves the synth full issue.
Gostev wrote: And we've been very clear on the recommended way of using deduplicating storage with Veeam, promoting the correct architecture for a long while now.
Do you mean this white paper from 2011? Or this one about v8, where you recommend using the Data Domain as primary storage?
Gostev wrote: So, regardless of the fact that DDBoost integration resolves most outstanding issues you have listed below, we will continue to recommend what we call the "ultimate VM backup architecture" even when v8 is out.
I never said I disagreed with the architecture. I'm questioning the recommendation of Data Domain, and pointing out what version 8 will not do.
Gostev wrote: And as a final note, we find that most performance issues we are seeing in support are related to lower end DataDomain boxes anyway. As noted earlier, I know of happy customers with petabytes of Veeam backup data sitting in higher end DataDomain boxes, and they actually could not care less of DDBoost integration, because it works very well for them already. One particular customer I am thinking about backs up direct to DataDomain though, and I believe they are using active fulls instead of synthetic fulls.
Of course, screw the lower end Data Domain boxes. Perhaps you should add a disclaimer in your advertising, "Veeam v8 with DD Boost *for high-end Data Domain boxes only"
-
- Chief Product Officer
- Posts: 31806
- Liked: 7300 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: EMC Data Domain
Sorry, I've missed you mentioning that down the line.
We could not recommend this architecture in 2011, because we did not have Backup Copy jobs 3 years ago to start with. However, I do not see any recommendations for using DataDomain as the primary storage in the second link. You certainly can, and as noted in my previous post many companies do, so we still need to give them recommendations on backup job settings. For example, one common scenario is to back up direct to DataDomain, then replicate to another DataDomain using native replication to meet the 3-2-1 rule. As long as you already bought both units, and are happy with restore performance with your Data Domain model, why not? DDBoost will make this deployment scenario work even better.
The blog post you link is actually written by the same guy who wrote the "ultimate VM backup architecture" one, so you can be sure he will never advise any other architecture for new deployments.
soylent wrote: Of course, screw the lower end Data Domain boxes. Perhaps you should add a disclaimer in your advertising, "Veeam v8 with DD Boost *for high-end Data Domain boxes only"
Assuming you are talking about backing up direct to Data Domain, that would be an absolutely correct statement (or your restore performance will be abysmal). Although as of 2014, I would say "mid-range and above", because the recent Data Domain hardware refresh bumped the performance dramatically, taking a mid-range device to the level of a former higher-end one.
But you can achieve better results by using a proper VM backup architecture even with a low-end Data Domain unit, so I don't really see any reason for disclaimers. What matters is that, regardless of the Data Domain model a particular customer has, there will be a way to deploy Veeam that allows meeting the required RPOs and RTOs. A higher end Data Domain model will only provide more deployment options. So frankly, I do not see a need for any disclaimers (unless it was a joke).
-
- Enthusiast
- Posts: 61
- Liked: 7 times
- Joined: Aug 01, 2012 8:33 pm
- Full Name: Max
- Location: Fort Lauderdale, Florida
- Contact:
Re: EMC Data Domain
Thanks for the reply. The disclaimer comment was tongue-in-cheek. And I want to say that I am happy with Veeam. I am not happy with Data Domain. I think you misunderstand, or perhaps my comment comes off as an attack on Veeam. My concern is that people will see your endorsement of Data Domain with DD Boost, and purchase a "lower end" model as you put it, and end up in the situation I was in. The main offender here is EMC/Data Domain, because they have been deceptive in their marketing and selling of the Data Domain boxes as being fully compatible with Veeam v6, v7, and of course now with v8. But EMC has never, and will never admit that SureBackup or Instant Recovery or Virtual Lab are unusable on a Data Domain. Because then nobody would buy it if you can't use Veeam's best features with it.
Existing customers with Veeam + Data Domain (like myself) will definitely benefit from DD Boost support, and I am very happy that Veeam has done this. It shows me that Veeam is willing to work with even the most incompatible hardware, and that's one of the things that makes me a loyal Veeam customer. I just don't think it's a good idea to put so much emphasis on the DD Boost support. New customers will purchase Veeam and a lower-end Data Domain, and they will not be happy with it.
-
- Chief Product Officer
- Posts: 31806
- Liked: 7300 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: EMC Data Domain
OK, since we are both appending our posts after we post them, we are kind of playing catch up here. But I think I have accidentally already answered the concern about the low-end DataDomain device in the last paragraph of my previous post, while taking care of your other addition.
Gostev wrote: Assuming you are talking about backing up direct to Data Domain, that would be an absolutely correct statement (or your restore performance will be abysmal). Although as of 2014, I would say "mid-range and above", because the recent Data Domain hardware refresh bumped the performance dramatically, taking a mid-range device to the level of a former higher-end one.
But you can achieve better results by using a proper VM backup architecture even with a low-end Data Domain unit, so I don't really see any reason for disclaimers. What matters is that, regardless of the Data Domain model a particular customer has, there will be a way to deploy Veeam that allows meeting the required RPOs and RTOs. A higher end Data Domain model will only provide more deployment options.
-
- Chief Product Officer
- Posts: 31806
- Liked: 7300 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: EMC Data Domain
P.S. Unlike with v7, because Backup Copy without DDBoost will not provide acceptable transform performance with low-end Data Domain storage.
-
- Enthusiast
- Posts: 61
- Liked: 7 times
- Joined: Aug 01, 2012 8:33 pm
- Full Name: Max
- Location: Fort Lauderdale, Florida
- Contact:
Re: EMC Data Domain
Gostev wrote: OK, since we are both appending our posts after we post them, we are kind of playing catch up here
In my defense, you started it! I am not a fast typer, so I didn't see your edit until after I posted my reply.
-
- Enthusiast
- Posts: 57
- Liked: 3 times
- Joined: Jul 02, 2013 4:17 am
- Full Name: NIck
- Contact:
Re: EMC Data Domain
Well, it looks like I have started something here, eheh.
Can you post the models with which Veeam works fairly well? DD2200?
I have seen that soylent had performance problems with DD, but with which model?
cheers
-
- Enthusiast
- Posts: 61
- Liked: 7 times
- Joined: Aug 01, 2012 8:33 pm
- Full Name: Max
- Location: Fort Lauderdale, Florida
- Contact:
Re: EMC Data Domain
I have the DD2500, purchased early 2014. Gostev mentioned that they did a hardware refresh recently, but I don't know how recent that was.
-
- Chief Product Officer
- Posts: 31806
- Liked: 7300 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: EMC Data Domain
DD2500 is at the top of the low-end offerings bucket for EMC (which includes DD160, DD2200 and DD2500, in that order).
-
- VeeaMVP
- Posts: 6166
- Liked: 1971 times
- Joined: Jul 26, 2009 3:39 pm
- Full Name: Luca Dell'Oca
- Location: Varese, Italy
- Contact:
Re: EMC Data Domain
Max, in general, the 4-digit models are the newest ones, while the 3-digit ones are older. This does not mean a DD2500 is more powerful than a DD980, for example; it is quite the contrary. It only means that it comes from a newer generation.
Luca.
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software
@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1
-
- Veeam ProPartner
- Posts: 64
- Liked: 9 times
- Joined: Apr 26, 2011 10:18 pm
- Full Name: Tomas Olsen
- Contact:
Re: EMC Data Domain
The four-digit models are indeed the new series for Data Domain. Be aware that they also bumped the disk size on the new series. The old ones used 1TB drives, while the new series uses 3TB drives. So besides the hardware being more powerful, they can also store 3x as much as the old ones.
I have used Veeam combined with the old DD140/DD160, DD5xx, DD6xx and several of the new series, with acceptable performance (for its time), since Veeam 5.0.
I have had a few issues using SureBackup, but as long as you are somewhat patient you can easily have a few VMs running in the morning from yesterday's backup. But I agree, SureBackup and Instant Recovery will perform better running off faster disks on current versions.
I have never used synthetic full with Data Domain due to how they handle mixed IO. A weekly active full backup is much more effective. And reversed incremental backups also don't make any sense. Where the network has been the bottleneck and the proxies have had spare CPU cycles, I have also experimented with a low level of compression or deduplication in Veeam to maximize the data sent over a troubled or maxed-out network link.
A few sites have also been set up with a Linux server in front, to present NFS to the Veeam backup server, but I can't say that I have seen amazing performance compared to only using CIFS. It seems a bit faster, but it also adds an extra layer of complexity. One thought, when using virtual proxies, would be to use affinity rules to group the Linux NFS server and the proxy on the same host, to make sure data doesn't travel in and out of hosts several times before it hits the storage repository.
When it comes to deduplication ratios, I couldn't care less. I can easily show you a dedup ratio of 180x on a Data Domain when using another backup software.
It's much more important to look at the footprint of your stored backups. Forget about the ratio. When CBT was introduced, and since Veeam can also do deduplication and compression, the ratio was severely reduced. But that didn't necessarily increase the footprint of your data stored on a Data Domain. You will in some cases see that the storage used might be less than earlier, with a dedup ratio maybe as low as 6-12 times globally.
It's like women shopping: they only tell you how much discount they got, not how much they actually paid.
So I'm actually satisfied with using Data Domain as a repository, even though I look forward to seeing what new versions and features will do for performance. And I do see some potential for improvement. But I believe we are on the right track.
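To make the ratio-versus-footprint point concrete, here is a minimal arithmetic sketch. All figures are made up for illustration and are not measurements from any Data Domain system; the function name and the two scenarios are hypothetical. It only shows that a lower dedup ratio can still mean less space used when the backup software sends less data in the first place.

```python
def stored_footprint_tb(data_sent_tb: float, dedup_ratio: float) -> float:
    """Space actually consumed on the appliance: data written / dedup ratio."""
    if dedup_ratio <= 0:
        raise ValueError("dedup_ratio must be positive")
    return data_sent_tb / dedup_ratio


# Hypothetical scenario A: another backup product writes many large,
# highly redundant full backups and reports an impressive 180x ratio.
legacy_tb = stored_footprint_tb(data_sent_tb=900.0, dedup_ratio=180.0)

# Hypothetical scenario B: CBT plus source-side dedup/compression means
# far less data is written, so the appliance reports only 10x.
veeam_tb = stored_footprint_tb(data_sent_tb=40.0, dedup_ratio=10.0)

print(f"Scenario A: 180x ratio, {legacy_tb:.1f} TB stored")
print(f"Scenario B:  10x ratio, {veeam_tb:.1f} TB stored")
```

With these assumed inputs, the 10x scenario ends up with the smaller footprint, which is the number that matters when sizing the appliance.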
-
- Lurker
- Posts: 2
- Liked: never
- Joined: Jun 16, 2014 12:47 pm
- Full Name: Jason Jones
- Contact:
Re: EMC Data Domain
WARNING: I'm an EMC DPAD Employee - so be nice. And I'm pro anything that speaks BOOST. Us DPAD guys are super excited that Veeam v8 will support BOOST!
After reviewing the v8 Boost whitepaper I had some questions relating to replication:
What's the preferred way to replicate Veeam backups in a Data Domain landscape? Would you use MTREE replication? Directory replication? Or do you Veeam guys prefer to use Veeam replication?
Many 3rd party companies that utilize the BOOST SDK leverage this API to facilitate application-aware replication (we sometimes call that "Managed File Replication" or "Boost Replication"). Are there any plans to have Veeam support this capability? (You would instrument the backup and replication jobs in Veeam, and you would be able to track not only backup status but replication status as well.)
-jj
-
- Chief Product Officer
- Posts: 31806
- Liked: 7300 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: EMC Data Domain
Hi, Jason. Thanks for registering on our forums to post this.
The ideal way to replicate Veeam backups is our Backup Copy job, as it provides more granular control than just copying the entire backup files. For example, you can pick and choose individual VMs to copy (not the entire backup files), do many-jobs-to-one-backup-file type of copy, and so on. It is also storage agnostic, so you can use it to get data into, or from Data Domain world easily.
Now, of course we also have customers who are using native DataDomain replication to copy the entire backup files as they are, but we do not control this native replication in our upcoming product release. We may or may not do this in future, depending on real world feedback on our v8.
Thanks!
-
- Lurker
- Posts: 2
- Liked: never
- Joined: Jun 16, 2014 12:47 pm
- Full Name: Jason Jones
- Contact:
Re: EMC Data Domain
Are there any plans for a Veeam Backup Copy job to support Data Domain Boost / Managed File Replication (when the first landing zone is Data Domain via Boost, of course)?
I assume that a native Veeam Backup Copy job would be one Veeam instance performing the WAN replication to, I suppose, another Veeam instance?
Just curious.
THANKS
-
- Chief Product Officer
- Posts: 31806
- Liked: 7300 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: EMC Data Domain
For your first question, see the last sentence in my previous response
Yes, without going too deep into the woods, it is something like this. The Backup Copy process goes direct from one Veeam repository to another Veeam repository. WAN acceleration is optional, and is normally used for WAN links under 100Mbps. On fast links, standard Backup Copy will simply work much faster than with WAN acceleration enabled, since it does not have to do any extra data processing.
Thanks.
-
- Enthusiast
- Posts: 55
- Liked: 6 times
- Joined: May 25, 2012 2:09 pm
- Full Name: Steve Galbincea
- Location: Houston, TX
- Contact:
Re: EMC Data Domain
Quick question on Backup Copy vs. MTREE Replication:
If we choose to try MTREE replication, what is the easiest way to make Veeam aware of the replicated repository on the other side? Does creating a repository out of the replicated MTREE at DR automatically detect the updated files at a certain interval? Or do we have to manually rescan the repository to update what is in it every time the MTREE replicates?
Senior Solutions Engineer, SLED - VMware
-
- Enthusiast
- Posts: 55
- Liked: 6 times
- Joined: May 25, 2012 2:09 pm
- Full Name: Steve Galbincea
- Location: Houston, TX
- Contact:
Re: EMC Data Domain
To expand on the above, I created the MTREE replication and it worked exactly as I expected - the bits were copied across much faster as they did not have to be reprocessed by Veeam. I added a new repo pointed at the CIFS share for the replicated data on the DR side and was successfully able to see the backups in there. My one major challenge now is that when you go to restore you cannot determine which repo is which since both repositories identify the job name identically.
Senior Solutions Engineer, SLED - VMware
-
- Product Manager
- Posts: 20406
- Liked: 2298 times
- Joined: Oct 26, 2012 3:28 pm
- Full Name: Vladimir Eremin
- Contact:
Re: EMC Data Domain
You can try editing the job name setting directly in the .vbm file of the copied backup job, using any text editor, and see whether it helps. This approach is not recommended, though, so make a copy of the .vbm file before editing. Thanks.
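For the "copy the .vbm before editing" step, a minimal sketch of a safety copy could look like the following. The helper name `backup_vbm`, the timestamped `.bak` naming, and the example path are assumptions for illustration only; nothing here touches the contents of the .vbm file or relies on any Veeam API.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_vbm(vbm_path: str) -> Path:
    """Copy a .vbm file next to itself with a timestamped .bak suffix.

    Hypothetical helper: it only makes a safety copy and never edits
    the original metadata file.
    """
    src = Path(vbm_path)
    if src.suffix.lower() != ".vbm":
        raise ValueError(f"not a .vbm file: {src}")
    if not src.is_file():
        raise FileNotFoundError(src)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    dst = src.with_name(f"{src.name}.{stamp}.bak")
    shutil.copy2(src, dst)  # copy2 also preserves timestamps
    return dst


if __name__ == "__main__":
    # The path below is a placeholder, not a real repository layout.
    example = Path("example-job.vbm")
    if example.is_file():
        print("Safety copy written to", backup_vbm(str(example)))
```

If the manual edit goes wrong, the original can be restored by copying the `.bak` file back over the edited .vbm.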