-
- Influencer
- Posts: 18
- Liked: 1 time
- Joined: Jun 15, 2017 6:44 am
- Full Name: Michael
- Contact:
Deduplication Storage as first target OK nowadays?
Hi all
I would like to know whether the new EMC Data Domain and HP StoreOnce appliances are nowadays good to go as a first target for backups via Veeam (backup jobs and backup copy jobs). I know that in the old days it was not recommended to place the critical "fast" restore objects on them, because you had to wait quite a while until the File Level Explorer loaded.
Is this still true today, or can we place a backup job directly on an appliance as mentioned above? If yes, are there any special job settings or a specific backup method needed, for example forever forward incremental?
Thank you very much for your help
-
- VP, Product Management
- Posts: 7081
- Liked: 1511 times
- Joined: May 04, 2011 8:36 am
- Full Name: Andreas Neufert
- Location: Germany
- Contact:
Re: Deduplication Storage as first target OK nowadays?
The Veeam best practices combine the 3-2-1 rule (3 copies, 2 media types, 1 offsite copy) with customers' demand for fast backup and restore plus cheap long-term storage.
So usually we write the most recent restore points to fast storage and create a copy of the backups on another storage system that is optimized for long-term retention on cheaper (slower) media.
From a support perspective we support all storage systems, including deduplication storage, for both roles. It is up to you and the vendor to select the right storage solution for the fast and the long-term side. The storage type needs to be selected based on requirements. All storage vendors have guidelines on when and when not to use specific storage. The best approach would be to document what you need from an SLA perspective (RTO + RPO + how long you want to keep the backups + data amount + budget) and ask the vendors or partners to design a solution for it. Also document what you use on the primary storage side, as a storage snapshot environment can potentially be used for fast restores.
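As a rough illustration of the 3-2-1 rule mentioned above, here is a minimal Python sketch. The `BackupCopy` class, the `meets_3_2_1` function and the example plan are hypothetical names made up for this illustration (they are not part of any Veeam API), and the sketch assumes that the production data counts as one of the three copies.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    name: str      # hypothetical label for this copy of the data
    media: str     # e.g. "disk", "dedupe-appliance", "tape"
    offsite: bool  # True if the copy is kept at another location


def meets_3_2_1(copies: list[BackupCopy]) -> bool:
    """Check 3 copies, 2 different media types, 1 offsite copy."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


# Hypothetical plan: production data, a fast repository, and an offsite tape.
plan = [
    BackupCopy("production", media="disk", offsite=False),
    BackupCopy("fast-repository", media="disk", offsite=False),
    BackupCopy("long-term-tape", media="tape", offsite=True),
]

print("3-2-1 satisfied:", meets_3_2_1(plan))
```

Dropping the tape copy from the plan makes the check fail on all three conditions, which is the situation the rule is meant to flag.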
-
- Influencer
- Posts: 18
- Liked: 1 time
- Joined: Jun 15, 2017 6:44 am
- Full Name: Michael
- Contact:
Re: Deduplication Storage as first target OK nowadays?
Hi
Very useful feedback, thank you. Basically we would like to use a physical backup proxy, connected to the storage only by iSCSI (direct SAN backup), and back up all data to an HP StoreOnce. In addition we are going to have a second HP StoreOnce for replication and tape media for long-term and off-site backup.
But in this case we do not have any local capacity for short-term backups.
Do you think this is a big disadvantage, or is the setup okay anyway?
Cheers
-
- VP, Product Management
- Posts: 7081
- Liked: 1511 times
- Joined: May 04, 2011 8:36 am
- Full Name: Andreas Neufert
- Location: Germany
- Contact:
Re: Deduplication Storage as first target OK nowadays?
Overall, please speak with HPE about it. As you cannot replicate within StoreOnce today without rehydrating the deduplicated data, you need careful planning regarding performance and backup runtimes.
In this setup I would use something like an HPE Apollo as the backup target and then use 2 Backup Copy Jobs to write to the 2 StoreOnce systems.
In any case, use Catalyst (over IP) as the transport protocol to StoreOnce; there is no iSCSI option.
Again, please check with HPE to find the best setup and architecture for your use case. As we do not know the details, it is hard to say whether the above would work or not.
-
- Expert
- Posts: 245
- Liked: 58 times
- Joined: Apr 28, 2009 8:33 am
- Location: Strasbourg, FRANCE
- Contact:
Re: Deduplication Storage as first target OK nowadays?
Don’t waste your $ on these crappy dedupe appliances (StoreOnce or Data Domain); take an Apollo server with a bunch of disks and a ReFS filesystem.
It rocks, it’s my opinion.
-
- Expert
- Posts: 176
- Liked: 30 times
- Joined: Jul 26, 2018 8:04 pm
- Full Name: Eugene V
- Contact:
Re: Deduplication Storage as first target OK nowadays?
Based on my experience, I would recommend against StoreOnce as a primary target, as there are real restore limitations, especially if your VMs have more than a single disk.
vmware-vsphere-f24/slow-restores-from-h ... =storeonce
-
- Technology Partner
- Posts: 36
- Liked: 38 times
- Joined: Aug 21, 2017 3:27 pm
- Full Name: Federico Venier
- Contact:
Re: Deduplication Storage as first target OK nowadays?
I work at HPE, and in my lab I have tested both configurations:
1) Veeam proxy running and writing to Apollo 4200 Gen9 local disks
2) Veeam proxy/gateway running on the same Apollo and writing directly to StoreOnce using Catalyst in source side deduplication mode.
The Apollo had 2 x 16Gb FC connections to 3PAR, and I configured the Veeam job (18 VMs) to use HW snapshots (I love Veeam integration with 3PAR and Nimble snapshots)
In both cases I measured a stable 3GB/s ingestion rate, which is not far from my 2 FC ports maximum bandwidth.
Apollo configuration: one socket @ 18 cores (yes, one socket only), one SmartArray, 25@12TB SATA in RAID60 (12+12) + 1 HS, strip size=128KB and ReFS bs=64KB. I also had 2 SSDs for the OS and for vPower NFS.
StoreOnce was an old SO 5100 with 24 disks connected via one 10GbE to the Apollo.
During both tests the Apollo system resources were quite loaded. StoreOnce, instead, was not particularly under stress. I guess that I could have connected a second Apollo to it for total 6GB/s, but this was not tested.
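To put the numbers above in perspective, here is a small back-of-the-envelope calculation in Python. It assumes the commonly quoted nominal rate of about 1600 MB/s per direction for a 16Gb FC port and the raw 10 Gbit/s line rate for 10GbE, ignoring protocol overhead; these nominal figures are assumptions and not measurements from the test. The last value is only an inference: if 3 GB/s of source data was ingested while a single 10GbE link connected the Apollo to StoreOnce, the data must have been reduced by at least that factor before it crossed the wire, which is what source-side deduplication is meant to achieve.

```python
# Nominal link rates in GB/s (decimal units); assumptions, not measurements.
FC_16G_GB_PER_S = 1.6       # 16Gb FC: about 1600 MB/s per direction
ETH_10G_GB_PER_S = 10 / 8   # 10GbE line rate, protocol overhead ignored

measured_ingest = 3.0       # GB/s, as reported in the post
fc_ports = 2

# Ceiling of the two FC ports and how close the measured rate comes to it.
fc_ceiling = fc_ports * FC_16G_GB_PER_S
fc_utilization = measured_ingest / fc_ceiling

# Minimum data reduction needed to fit the ingest rate through one 10GbE link.
min_reduction = measured_ingest / ETH_10G_GB_PER_S

print(f"FC ceiling: {fc_ceiling:.2f} GB/s")
print(f"FC utilization: {fc_utilization:.1%}")
print(f"Minimum reduction before the 10GbE link: {min_reduction:.2f}x")
```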
I also wanted to compare the dedupe of the two configurations. I did my best to tune my workload generator to simulate an average real-world workload, and I ran a daily backup with a retention of 56 restore points (8 weeks forward full with weekly synthetic full).
Below are the dedupe ratios measured at the end of the test:
- Veeam compression & dedupe = 2 to 1,
- Veeam comp & dedupe & ReFS = 3 to 1 (I expected something more from ReFS fast clone)
- Veeam & StoreOnce = 16 to 1
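To show what these ratios mean for capacity planning, here is a minimal Python sketch. It assumes that "N to 1" means logical backup data divided by physical space consumed; the `physical_tb` helper and the 100 TB logical figure are hypothetical and only serve the illustration. Real ratios depend heavily on the workload and retention.

```python
def physical_tb(logical_tb: float, reduction_ratio: float) -> float:
    """Physical capacity consumed for a given logical size and N:1 ratio."""
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    return logical_tb / reduction_ratio


# Ratios reported in the post above (lab test, 56 restore points).
ratios = {
    "Veeam compression & dedupe": 2,
    "Veeam comp & dedupe & ReFS": 3,
    "Veeam & StoreOnce": 16,
}

logical_tb = 100.0  # hypothetical logical size of all restore points

for name, ratio in ratios.items():
    print(f"{name}: {physical_tb(logical_tb, ratio):.2f} TB physical")
```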
I cannot say that one configuration is better than the other; it all depends on your requirements.
I like Apollo for its performance, especially on single-stream backup/restore (a single VM backup runs at 1.5GB/s).
I like StoreOnce for its ability to reduce network/storage utilization and for its multi-stream performance.
Personally, I would not design a data protection solution that relies solely on Windows-based storage. The reason is obvious: if ransomware infects my datacenter, it could also infect my Apollo. At that point the malware could easily access my local ReFS volume and corrupt my backup data (.vbk and .vib). Data on StoreOnce, by contrast, is fully protected because it is accessible only through the Catalyst API and is thus invisible to the Windows OS and to any malware running on it.
-
- Lurker
- Posts: 1
- Liked: never
- Joined: Feb 27, 2017 2:45 pm
- Full Name: Tomislav Tezak
- Contact:
Re: Deduplication Storage as first target OK nowadays?
We used to have a StoreOnce 4900 for our main backups, and let me tell you, we had nothing but problems with it.
First we tried the Fibre Channel integration, and the StoreOnce has a built-in limitation which drops all other backups.
When we switched to network, the performance was better, but the backup chain seemed to get corrupted every 2-3 months. We then had to delete backups week by week until it was working again.
We confronted HP about this, and they just said that we should buy the next generation; it would work then.
So what we did was buy an ExaGrid, which uses a landing zone for the newest backups. It then moves older backups into the deduplicated storage. We have never had a problem since.
Be very careful with HPE StoreOnce; it doesn't seem like it's worth the money.
-
- Novice
- Posts: 6
- Liked: 2 times
- Joined: Apr 28, 2015 2:12 pm
- Full Name: Jeff White
- Contact:
Re: Deduplication Storage as first target OK nowadays?
We have been running Data Domain 990s for the past 3 years. We had them for other backup applications (Boost/TSM) prior to using Veeam, so they were available as a repository.
For:
Fast write time
Against:
Restore time was poor - often 3.5 hours just to restore a single file.
MTree replication - definitely worth avoiding when creating offsite copies of Veeam backups. It simply does not work. We use Veeam BCJs (Backup Copy Jobs) instead.
We have recently upgraded to DD9800s:
We still have the same fast writes.
But metadata is now held on separate SSDs, and those 3.5-hour restore times are now 20 minutes max.
We still avoid Data Domain MTree replication and continue with Veeam BCJs - works fine for us.
-
- Enthusiast
- Posts: 42
- Liked: 9 times
- Joined: Feb 03, 2014 7:40 am
- Contact:
Re: Deduplication Storage as first target OK nowadays?
We are in our 5th year with a deduplicating Dell DR4100 appliance as first backup target. Physical storage is about 15 TB and the dedup ratio has been around 7x. It has served us well, with about 600 MB/s write speeds over 10 GbE and 400 MB/s reads. From launching the Veeam console to getting ping replies from the VM being restored takes about 15 minutes for an average single-disk VM of about 30 GB.
Instant recovery is too slow though; the unit does not handle live VMs well, since they both read and write during boot and while running.
I would not hesitate to use deduplicating appliances again if they can compete on price.
-
- Veteran
- Posts: 636
- Liked: 100 times
- Joined: Mar 23, 2018 4:43 pm
- Full Name: EJ
- Location: London
- Contact:
Re: Deduplication Storage as first target OK nowadays?
Well I guess that's the point. The de-dupe devices save money by using less physical disk. But if the device itself is much more expensive it outweighs the savings. So then less effective solutions like ReFS start to become competitive.
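The trade-off described above can be made concrete with a small break-even calculation. The sketch below is only an assumption-laden illustration: it looks at capacity cost alone (no licensing, power, support or restore performance), and it borrows the 16:1 and 3:1 ratios from the lab test earlier in this thread, which will not match every workload. The function names are hypothetical.

```python
def cost_per_logical_tb(price_per_physical_tb: float, reduction_ratio: float) -> float:
    """Effective price of storing 1 TB of logical backup data."""
    return price_per_physical_tb / reduction_ratio


def break_even_price_multiple(appliance_ratio: float, alternative_ratio: float) -> float:
    """How many times more the appliance may cost per physical TB
    before the alternative becomes cheaper per logical TB."""
    return appliance_ratio / alternative_ratio


# Ratios taken from the lab test posted earlier in the thread.
multiple = break_even_price_multiple(appliance_ratio=16, alternative_ratio=3)
print(f"Break-even price multiple per physical TB: {multiple:.2f}x")
```

With these ratios the appliance could cost a little over five times as much per physical TB before plain disks with ReFS win on capacity cost; with a lower appliance ratio, such as the 7x reported for the DR4100 above, the margin shrinks accordingly.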
-
- Enthusiast
- Posts: 26
- Liked: 4 times
- Joined: Dec 28, 2016 3:42 pm
- Contact:
Re: Deduplication Storage as first target OK nowadays?
If you are open to other products, take a look at the Quantum DXi series. They have the Datamover service built into the appliance and it works very well. I have two 4700s. I write all the primary jobs to one and replicate to another one in another location. I also write off a copy to LTO8 Tapes using Quantum's Scalar i3 tape library. Works like a charm.
-
- Enthusiast
- Posts: 37
- Liked: 4 times
- Joined: Nov 05, 2014 4:19 pm
- Contact:
Re: Deduplication Storage as first target OK nowadays?
I found I get much more bang for the buck utilizing Windows Server 2012/2016 Dedup with attached JBOD enclosures in whatever RAID level is appropriate.
A 24-drive JBOD enclosure with 4+ TB HDDs on a dual-link RAID 10 has always provided great access times and is far cheaper than any vendor product out there.
Use one enclosure with no dedup and ReFS for primary storage.
Add a second one for your Backup Copies and enable Windows Dedup on these volumes.
Add a cheap server and a third enclosure for your off-site copy.
-
- Expert
- Posts: 176
- Liked: 30 times
- Joined: Jul 26, 2018 8:04 pm
- Full Name: Eugene V
- Contact:
Re: Deduplication Storage as first target OK nowadays?
FWIW there is an official KB on the topic of Data Domain, but it sadly omits the limitations of multi-disk VMs. I very much interpret this KB as saying that dedupe appliances as a primary repository are NOT recommended.
https://www.veeam.com/kb1956
Quote from the KB: "For quick recovery you may consider using fast primary storage and keeping a several restore points (3-7) for quick restore operations such as Instant Recovery, SureBackup, Windows or Other-OS File restores since they generate the highest amount of random reads. Then use the DataDomain as a secondary storage to store files for long term retention. If an EMC Data Domain Deduplication System will be used as primary storage, it is strongly suggested to leverage alternative restore capabilities within Veeam Backup & Replication such as Entire VM restore and VM files restore. This may result in faster recovery capabilities when used with EMC Data Domain Deduplication Systems than Instant Recovery and File Level Restore operations."
I would add, to stress the point, that if your VMs consist of multiple disks you will not have a good time restoring, even with Full VM restore.
-
- Expert
- Posts: 176
- Liked: 30 times
- Joined: Jul 26, 2018 8:04 pm
- Full Name: Eugene V
- Contact:
Re: Deduplication Storage as first target OK nowadays?
What are your thoughts on ReFS block clone powered synthetic fulls? Our SE has strongly suggested that this feature is more desirable than dedupe capabilities, especially since a dedupe-powered synthetic full has to do a full read/rehydrate/write/dehydrate cycle.
-
- Technology Partner
- Posts: 36
- Liked: 38 times
- Joined: Aug 21, 2017 3:27 pm
- Full Name: Federico Venier
- Contact:
Re: Deduplication Storage as first target OK nowadays?
Below I have summarized some performance data from my tests with ReFS. If you have more questions, feel free to ask.
When talking about synthetic fulls with ReFS and StoreOnce, I like to call them Virtual Synthetic Full (VSF) to differentiate them from the traditional synthetic full, which requires a server to read the old backups, process/merge them, and write the new full.
StoreOnce VSF
It does not rehydrate data; the VSF process is offloaded to StoreOnce. Basically, Veeam completes the incremental backup and then sends StoreOnce a small file with the instructions for building the new full. At this point the backup is over, but the job waits until StoreOnce has completed the VSF and has generated all the new full backup files.
ReFS VSF
It is very fast; it adds just 35" (35 seconds) to the entire job.
Server details: Apollo 4200, 1 socket @18 cores, 24*12TB HDD in RAID60 (12+12), 2 FC ports at 16Gb/s
Job details: 18VMs, Processed 1.3TB, Read 295GB, Transferred 125GB, Processing rate 3GB/s (Average >3GB/s Peak 3.7GB/s)
Job duration: Incremental 5':30", VSF 6':05"
VSF job details
- Job started at 11/10/2018 6:08:45 AM
Building VMs list 00:07
VM size: 1.8 TB (1.3 TB used)
Changed block tracking is enabled
Queued for processing at 11/10/2018 6:09:22 AM
Required backup infrastructure resources have been assigned
Creating storage snapshot 00:00
Processing kv-9450-vm05-3PAR2 02:24
Processing kv-9450-vm08-3PAR2 02:16
Processing kv-8200-vm01-3PAR1 00:41
Processing kv-8200-vm03-3PAR1 02:19
Processing kv-9450-vm00-3PAR2 02:18
Processing kv-9450-vm07-3PAR2 02:19
Processing kv-8200-vm06-3PAR1 02:15
Processing kv-8200-vm05-3PAR1 02:19
Processing kv-8200-vm04-3PAR1 02:15
Processing kv-9450-vm01-3PAR2 02:15
Processing kv-8200-vm08-3PAR1 02:19
Processing kv-9450-vm03-3PAR2 02:15
Processing kv-8200-vm09-3PAR1 02:19
Processing kv-9450-vm02-3PAR2 02:15
Processing kv-8200-vm07-3PAR1 02:15
Processing kv-9450-vm04-3PAR2 02:15
Processing kv-9450-vm06-3PAR2 02:15
Processing kv-8200-vm02-3PAR1 02:15
All VMs have been queued for processing
Deleting storage snapshot 00:00
Synthetic full backup created successfully [fast clone] 00:35
Load: Source 50% > Proxy 89% > Network 76% > Target 0%
Primary bottleneck: Proxy
Job finished at 11/10/2018 6:14:51 AM
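As a quick cross-check of the durations quoted above, the timestamps in this log can be subtracted in a few lines of Python. This is only a sketch: the timestamp format string is my reading of the log lines, and parsing `AM`/`PM` with `%p` assumes an English-style locale.

```python
from datetime import datetime, timedelta

# Format inferred from the log lines, e.g. "11/10/2018 6:08:45 AM".
LOG_TIME_FORMAT = "%m/%d/%Y %I:%M:%S %p"

started = datetime.strptime("11/10/2018 6:08:45 AM", LOG_TIME_FORMAT)
finished = datetime.strptime("11/10/2018 6:14:51 AM", LOG_TIME_FORMAT)

# Total wall-clock time of the job.
total = finished - started

# The log reports 00:35 for the fast-clone synthetic full step.
synthetic_full = timedelta(seconds=35)

# Everything else (snapshot handling, VM processing, queueing).
incremental_part = total - synthetic_full

print("Total job time:", total)
print("Without synthetic full:", incremental_part)
```

The result is within a second or two of the "Incremental 5':30", VSF 6':05"" figures given in the post, so the fast-clone step accounts for roughly 35 seconds of the whole job.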
-
- Enthusiast
- Posts: 37
- Liked: 4 times
- Joined: Nov 05, 2014 4:19 pm
- Contact:
Re: Deduplication Storage as first target OK nowadays?
evilaedmin wrote: ↑Nov 24, 2018 9:46 pm What are your thoughts on ReFS block clone powered synthetic fulls? Our SE has strongly suggested that this feature is more desirable than dedupe capabilities, especially since in a dedupe-powered synthetic full will have to do a full read/rehydrate/write/dehydrate cycle.
You are of course correct that any deduplicated data will likely be painfully slow to restore.
In the model I described above, for example:
The primary array on ReFS may hold, say, 5-7 days of backup data ready for faster recovery.
This would certainly be a good use case for ReFS block clone synthetic fulls.
The secondary on-site and off-site dedup arrays would hold a longer retention (up to disk capacity, or as required; maybe 30 days).
While the dedup data is certainly more "time costly" to access, you very likely will not be conducting a full or large restore from this data set.
Without knowing your data set or retention requirements, I can only speak for what works for me.
For primary storage, if you can get a correctly sized single array and keep your retention on a primary ReFS file system, that would work very well.
For any "Backup Copy" or off-site data I don't see an advantage to ReFS, unless I am missing something (as block cloning only takes place within one volume).