DukeR
Influencer
Posts: 14
Liked: 1 time
Joined: Jun 15, 2017 6:44 am
Full Name: Michael
Contact:

Deduplication Storage as first target OK nowadays?

Post by DukeR » Nov 14, 2018 2:46 pm 1 person likes this post

Hi all

I would like to know whether the new EMC Data Domain and HP StoreOnce appliances are suitable nowadays as a first target for Veeam backups (backup jobs and backup copy jobs). I know that in the old days it was not recommended to place the critical "fast" restore objects on them, because you had to wait quite a while for the File Level Explorer to load.

Is this still true today, or can we point a backup job directly at one of the appliances mentioned above? If yes, are there any special job settings or a specific backup method needed, for example forever forward incremental?

Thank you very much for your help

Andreas Neufert
Veeam Software
Posts: 3907
Liked: 706 times
Joined: May 04, 2011 8:36 am
Full Name: Andreas Neufert
Location: Germany
Contact:

Re: Deduplication Storage as first target OK nowadays?

Post by Andreas Neufert » Nov 14, 2018 3:01 pm 1 person likes this post

The Veeam best practices combine the 3-2-1 rule (3 copies, 2 media types, 1 offsite copy) with what customers ask for: fast backups and restores plus cheap long-term storage.

So usually we write the most recent restore points to fast storage and create a copy of the backups on another storage system that is optimized for long-term retention on cheaper (slower) media.

From a support perspective, we support all storage types, including deduplication storage, on both tiers. It is up to you and the vendor to select the right storage solution for the fast tier and the long-term tier. The storage type needs to be selected based on requirements. All storage vendors have guidelines on when and when not to use specific storage. The best approach is to document what you need from an SLA perspective (RTO, RPO, how long you want to keep the backups, data volume, and budget) and ask the vendors or partners to design a solution for it. Also document what you use on the primary storage side, since a storage snapshot environment can potentially be used for fast restores.
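
As a rough illustration of the 3-2-1 rule described in this post, here is a minimal Python sketch that checks a planned set of backup copies against it. The `BackupCopy` type, the media labels, and the example plan are hypothetical and only serve the illustration; whether the production data counts as one of the three copies is left to whoever builds the list.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BackupCopy:
    name: str      # hypothetical label for this copy
    media: str     # e.g. "disk", "dedupe-appliance", "tape"
    offsite: bool  # True if the copy lives outside the primary site


def satisfies_3_2_1(copies: List[BackupCopy]) -> bool:
    """Return True for >= 3 copies on >= 2 media types with >= 1 offsite copy."""
    media_types = {c.media for c in copies}
    has_offsite = any(c.offsite for c in copies)
    return len(copies) >= 3 and len(media_types) >= 2 and has_offsite


# Hypothetical plan: fast disk repository, long-term dedupe copy, offsite tape.
plan = [
    BackupCopy("fast-repository", "disk", offsite=False),
    BackupCopy("long-term-copy", "dedupe-appliance", offsite=False),
    BackupCopy("tape-vault", "tape", offsite=True),
]
print(satisfies_3_2_1(plan))
```

Dropping the tape entry from the plan makes the check fail, both on the copy count and on the offsite requirement.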

Re: Deduplication Storage as first target OK nowadays?

Post by DukeR » Nov 14, 2018 3:22 pm

Hi

Very useful feedback, thank you. Basically, we would like to use a physical backup proxy connected to the storage by iSCSI (direct SAN backup) and back up all data to an HP StoreOnce. In addition, we are going to have a second HP StoreOnce for replication and tape media for long-term and off-site backup.

But in this case we do not have any local capacity for short-term backups.

Do you think this is a big disadvantage, or is the setup okay anyway?

Cheers

Re: Deduplication Storage as first target OK nowadays?

Post by Andreas Neufert » Nov 14, 2018 3:33 pm

Overall, please speak with HPE about it. Because you currently cannot replicate within StoreOnce without rehydrating the deduplicated data, you need to plan carefully for performance and backup runtimes.

In this setup I would use something like an HPE Apollo as the backup target and then use two Backup Copy Jobs to write to the two StoreOnce systems.

In any case, use Catalyst (over IP) as the transport protocol to StoreOnce; there is no iSCSI option.

Again, please check with HPE to find the best setup and architecture for your use case. As we do not know the details, it is hard to say whether the above would work.

NightBird
Service Provider
Posts: 178
Liked: 32 times
Joined: Apr 28, 2009 8:33 am
Location: Strasbourg, FRANCE
Contact:

Re: Deduplication Storage as first target OK nowadays?

Post by NightBird » Nov 14, 2018 4:22 pm 3 people like this post

Don’t waste your $ on these crappy dedupe appliances (StoreOnce or Data Domain); take an Apollo server with a bunch of disks and the ReFS filesystem.
It rocks, in my opinion.

evilaedmin
Expert
Posts: 111
Liked: 13 times
Joined: Jul 26, 2018 8:04 pm
Full Name: Eugene V
Contact:

Re: Deduplication Storage as first target OK nowadays?

Post by evilaedmin » Nov 18, 2018 11:20 pm 1 person likes this post

Based on my experience, I would recommend against StoreOnce as a primary target, as there are real restore limitations, especially if your VMs have more than a single disk.

vmware-vsphere-f24/slow-restores-from-h ... =storeonce

FedericoV
Influencer
Posts: 11
Liked: 10 times
Joined: Aug 21, 2017 3:27 pm
Full Name: Federico Venier
Contact:

Re: Deduplication Storage as first target OK nowadays?

Post by FedericoV » Nov 19, 2018 6:32 am 1 person likes this post

I work at HPE, and in my lab I have tested both configurations:
1) Veeam proxy running and writing to Apollo 4200 Gen9 local disks
2) Veeam proxy/gateway running on the same Apollo and writing directly to StoreOnce using Catalyst in source side deduplication mode.
The Apollo had 2 x 16Gb FC connections to 3PAR, and I configured the Veeam job (18 VMs) to use HW snapshots (I love Veeam integration with 3PAR and Nimble snapshots)
In both cases I measured a stable 3GB/s ingestion rate, which is not far from my 2 FC ports maximum bandwidth.
Apollo configuration: one socket @ 18 cores (yes, one socket only), one SmartArray, 25 x 12TB SATA disks in RAID 60 (12+12) plus 1 hot spare, strip size 128KB and ReFS block size 64KB. I also had 2 SSDs for the OS and for vPower NFS.
StoreOnce was an old SO 5100 with 24 disks connected via one 10GbE to the Apollo.
During both tests the Apollo system resources were quite loaded. StoreOnce, instead, was not particularly under stress. I guess that I could have connected a second Apollo to it for total 6GB/s, but this was not tested.
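
To put the 3GB/s figure next to the Fibre Channel limit mentioned above, here is a small Python sketch of the arithmetic. It assumes the nominal 16GFC data rate of 1600 MB/s per port and direction and treats 1 GB as 1000 MB; the function name is only illustrative.

```python
# Nominal data throughput of one 16GFC port, per direction, in MB/s.
# 16GFC runs at 14.025 Gbaud with 64b/66b encoding and is rated at 1600 MB/s.
FC16_MB_PER_S = 1600


def link_utilization(measured_gb_per_s: float, ports: int) -> float:
    """Fraction of the aggregate nominal FC bandwidth used by the measured rate."""
    aggregate_mb_per_s = ports * FC16_MB_PER_S
    return measured_gb_per_s * 1000 / aggregate_mb_per_s


utilization = link_utilization(measured_gb_per_s=3.0, ports=2)
print(f"{utilization:.0%} of nominal 2 x 16Gb FC bandwidth")
```

Under these assumptions the measured ingest sits at roughly 94% of what the two ports can nominally carry, which is consistent with the "not far from maximum" remark.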
I also wanted to compare the dedupe of the two configurations. I did my best to tune my workload generator to simulate an average real-world workload, and I ran a daily backup with a retention of 56 restore points (8 weeks, forward with weekly synthetic full).
Below there are the dedupe ratios measured at the end of the test:
- Veeam compression & dedupe = 2 to 1,
- Veeam comp & dedupe & ReFS = 3 to 1 (I expected something more from ReFS fast clone)
- Veeam & StoreOnce = 16 to 1
I cannot say that one configuration is better than the other, it all depends on our requirements.
I like Apollo for its performance especially on single stream backup/restore (a single VM backup runs at 1.5GB/s)
I like StoreOnce for its ability to reduce network/storage utilization and for multi stream performance.
Personally, I would not design a data protection solution that relies solely on Windows-based storage. The reason is obvious: if ransomware infects my datacenter, it could also infect my Apollo. At that point the malware could easily access my local ReFS volume and corrupt my backup data (.vbk and .vib). Data on StoreOnce, instead, is fully protected because it is accessible only through the Catalyst API and is therefore invisible to the Windows OS and to any malware running on it.
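
As a rough sketch of what the three measured ratios (2:1, 3:1 and 16:1) mean for disk consumption, the Python snippet below converts a logical backup size into physical capacity. The 96 TB logical size is a hypothetical input chosen only for the illustration; the ratios are the ones reported in this post and will differ for other workloads.

```python
# Reduction ratios reported in the post above (logical : physical).
RATIOS = {
    "Veeam compression & dedupe": 2.0,
    "Veeam comp & dedupe & ReFS": 3.0,
    "Veeam & StoreOnce": 16.0,
}


def physical_tb(logical_tb: float, ratio: float) -> float:
    """Physical capacity consumed for a given logical backup size and ratio."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return logical_tb / ratio


# Hypothetical logical size of all retained restore points, in TB.
logical_tb = 96.0
for name, ratio in RATIOS.items():
    print(f"{name}: {physical_tb(logical_tb, ratio):.1f} TB on disk")
```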

GastroTom
Lurker
Posts: 1
Liked: never
Joined: Feb 27, 2017 2:45 pm
Full Name: Tomislav Tezak
Contact:

Re: Deduplication Storage as first target OK nowadays?

Post by GastroTom » Nov 19, 2018 7:09 am

We used to have a StoreOnce 4900 for our main backups, and let me tell you, we had nothing but problems with it.
First we tried the Fibre Channel integration, and the StoreOnce has a built-in limitation that drops all other backups.
When we switched to network, the performance was better, but it seemed like the backup chain was corrupted every 2-3 months. We then had to delete backups week by week until it was working again.
We confronted HP about this, and they just said that we should buy the next generation and it would work then.
So what we did was buy an ExaGrid, which uses a landing zone for the newest backups. It then moves older backups into the deduplicated storage. We have never had a problem since.
Be very careful with HPE StoreOnce; it doesn't seem like it's worth the money.

mk2311
Novice
Posts: 4
Liked: 2 times
Joined: Apr 28, 2015 2:12 pm
Full Name: Jeff White
Contact:

Re: Deduplication Storage as first target OK nowadays?

Post by mk2311 » Nov 19, 2018 8:16 am 2 people like this post

We have been running Data Domain 990s for the past 3 years. We had them for other backup applications (Boost/TSM) prior to using Veeam, so they were available as a repository.
For:
Fast write time
Against:
Restore time was poor, often 3.5 hours just to restore a single file.
MTree replication: definitely worth avoiding when creating offsite copies of Veeam backups. It simply does not work. We use Veeam backup copy jobs (BCJs) instead.

We have recently upgraded to DD9800s.
We still have the same fast writes.
But metadata is now held on separate SSDs, and those 3.5-hour restore times are now 20 minutes at most.

We still avoid Data Domain MTree replication and continue with Veeam BCJs, which work fine for us.

tomhkr
Enthusiast
Posts: 41
Liked: 9 times
Joined: Feb 03, 2014 7:40 am
Contact:

Re: Deduplication Storage as first target OK nowadays?

Post by tomhkr » Nov 19, 2018 9:59 am

We are in our fifth year with a deduplicating Dell DR4100 appliance as first backup target. Physical storage is about 15 TB and the dedup ratio has been around 7x. It has served us well, with write speeds of about 600 MB/s over 10 GbE and reads of about 400 MB/s. From launching the Veeam console to getting ping replies from the VM being restored takes about 15 minutes for an average single-disk VM of about 30 GB.

The instant recovery is too slow though, the unit does not handle live VMs well since they both read and write during bootup/running.

I would not hesitate to use deduplicating appliances again if they can compete on price.
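
For a sense of how the figures in this post relate, here is a back-of-the-envelope Python sketch. It assumes the roughly 400 MB/s read rate is sustained for the whole 30 GB VM and that 1 GB is 1000 MB; everything outside the pure data transfer (console start, VM registration, boot) is lumped together as "other steps".

```python
def transfer_seconds(size_gb: float, rate_mb_per_s: float) -> float:
    """Time to stream size_gb at a sustained rate, using 1 GB = 1000 MB."""
    return size_gb * 1000 / rate_mb_per_s


read_seconds = transfer_seconds(size_gb=30, rate_mb_per_s=400)  # pure data transfer
restore_seconds = 15 * 60                                      # reported end-to-end time
overhead_seconds = restore_seconds - read_seconds               # everything else
logical_tb = 15 * 7  # about 15 TB physical at a ratio of about 7x

print(f"pure read: {read_seconds:.0f} s, other steps: {overhead_seconds:.0f} s")
print(f"logical data held: about {logical_tb} TB")
```

Under these assumptions the data transfer itself accounts for little more than a minute of the 15-minute restore.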

ejenner
Expert
Posts: 421
Liked: 64 times
Joined: Mar 23, 2018 4:43 pm
Full Name: EJ
Location: London
Contact:

Re: Deduplication Storage as first target OK nowadays?

Post by ejenner » Nov 19, 2018 11:55 am 2 people like this post

Well I guess that's the point. The de-dupe devices save money by using less physical disk. But if the device itself is much more expensive it outweighs the savings. So then less effective solutions like ReFS start to become competitive.
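
The trade-off can be made concrete with a short Python sketch. It borrows the 3:1 (Veeam plus ReFS) and 16:1 (Veeam plus StoreOnce) ratios measured earlier in this thread; the prices per physical TB are hypothetical placeholders, not quotes, and real ratios depend on the workload.

```python
def cost_per_logical_tb(price_per_physical_tb: float, ratio: float) -> float:
    """Cost of storing one logical TB of backups given a reduction ratio."""
    return price_per_physical_tb / ratio


def break_even_price_multiple(appliance_ratio: float, server_ratio: float) -> float:
    """How many times more per physical TB the appliance may cost and still tie."""
    return appliance_ratio / server_ratio


multiple = break_even_price_multiple(appliance_ratio=16.0, server_ratio=3.0)
print(f"appliance breaks even at {multiple:.1f}x the price per physical TB")

# Hypothetical prices per physical TB, chosen only to exercise the functions.
server_price, appliance_price = 100.0, 600.0
server_wins = cost_per_logical_tb(server_price, 3.0) < cost_per_logical_tb(appliance_price, 16.0)
print(f"server cheaper per logical TB: {server_wins}")
```

With these ratios, the appliance stays competitive only while its price per physical TB is below about 5.3 times that of the plain server.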

tommyo
Influencer
Posts: 22
Liked: 4 times
Joined: Dec 28, 2016 3:42 pm
Contact:

Re: Deduplication Storage as first target OK nowadays?

Post by tommyo » Nov 19, 2018 1:43 pm 1 person likes this post

If you are open to other products, take a look at the Quantum DXi series. They have the Datamover service built into the appliance and it works very well. I have two 4700s. I write all the primary jobs to one and replicate to another one in another location. I also write off a copy to LTO8 Tapes using Quantum's Scalar i3 tape library. Works like a charm.

Murigar
Enthusiast
Posts: 32
Liked: 4 times
Joined: Nov 05, 2014 4:19 pm
Contact:

Re: Deduplication Storage as first target OK nowadays?

Post by Murigar » Nov 19, 2018 3:07 pm

I have found that I get much more bang for the buck using Windows Server 2012/2016 dedup with attached JBOD enclosures in whatever RAID level is appropriate.

A 24-drive JBOD enclosure with 4+ TB HDDs on a dual-link RAID 10 has always provided great access times and is far cheaper than any vendor product out there.
Use one enclosure with no dedup, formatted with ReFS, for primary storage.
Add a second one for your backup copies and enable Windows dedup on these volumes.
Add a cheap server and a third enclosure for your off-site copy.

Re: Deduplication Storage as first target OK nowadays?

Post by evilaedmin » Nov 24, 2018 9:44 pm

FWIW, there is an official KB on the topic of Data Domain, but it sadly omits the limitations with multi-disk VMs. I very much interpret this KB as saying that dedupe appliances are NOT recommended as a primary repository.
For quick recovery you may consider using fast primary storage and keeping a several restore points (3-7) for quick restore operations such as Instant Recovery, SureBackup, Windows or Other-OS File restores since they generate the highest amount of random reads. Then use the DataDomain as a secondary storage to store files for long term retention. If an EMC Data Domain Deduplication System will be used as primary storage, it is strongly suggested to leverage alternative restore capabilities within Veeam Backup & Replication such as Entire VM restore and VM files restore. This may result in faster recovery capabilities when used with EMC Data Domain Deduplication Systems than Instant Recovery and File Level Restore operations.
I would stress that if your VMs consist of multiple disks, you will not have a good time even with a full VM restore.

https://www.veeam.com/kb1956

Re: Deduplication Storage as first target OK nowadays?

Post by evilaedmin » Nov 24, 2018 9:46 pm

Murigar wrote:
Nov 19, 2018 3:07 pm
Add a second one for your Backup Copies and enable Windows Dedup on these volumes.
What are your thoughts on ReFS block clone powered synthetic fulls? Our SE has strongly suggested that this feature is more desirable than dedupe capabilities, especially since a dedupe-powered synthetic full has to do a full read/rehydrate/write/dehydrate cycle.

Re: Deduplication Storage as first target OK nowadays?

Post by FedericoV » Nov 26, 2018 11:32 am

Below I have summarized some performance data from my tests with ReFS. If you have more questions, feel free to ask.
Talking about synthetic fulls with ReFS and StoreOnce, I like to call them Virtual Synthetic Full (VSF) to differentiate them from the traditional synthetic full, which required a server to read the old backups, process and merge them, and write the new full.
StoreOnce VSF
It does not rehydrate data, the VSF process is offloaded to StoreOnce. Basically, Veeam completes the incremental backup and then it sends to StoreOnce a small file with the instructions for building the new full. At this point the backup is over, but the job holds on until StoreOnce has completed the VSF and has generated all the new full backup files.
ReFS VSF
It is very fast; it adds just 35 seconds to the entire job.
Server details: Apollo 4200, 1 socket @18 cores, 24*12TB HDD in RAID60 (12+12), 2 FC ports at 16Gb/s
Job details: 18VMs, Processed 1.3TB, Read 295GB, Transferred 125GB, Processing rate 3GB/s (Average >3GB/s Peak 3.7GB/s)
Job duration: Incremental 5 min 30 s, VSF 6 min 05 s
VSF job details
  • Job started at 11/10/2018 6:08:45 AM
    Building VMs list 00:07
    VM size: 1.8 TB (1.3 TB used)
    Changed block tracking is enabled
    Queued for processing at 11/10/2018 6:09:22 AM
    Required backup infrastructure resources have been assigned
    Creating storage snapshot 00:00
    Processing kv-9450-vm05-3PAR2 02:24
    Processing kv-9450-vm08-3PAR2 02:16
    Processing kv-8200-vm01-3PAR1 00:41
    Processing kv-8200-vm03-3PAR1 02:19
    Processing kv-9450-vm00-3PAR2 02:18
    Processing kv-9450-vm07-3PAR2 02:19
    Processing kv-8200-vm06-3PAR1 02:15
    Processing kv-8200-vm05-3PAR1 02:19
    Processing kv-8200-vm04-3PAR1 02:15
    Processing kv-9450-vm01-3PAR2 02:15
    Processing kv-8200-vm08-3PAR1 02:19
    Processing kv-9450-vm03-3PAR2 02:15
    Processing kv-8200-vm09-3PAR1 02:19
    Processing kv-9450-vm02-3PAR2 02:15
    Processing kv-8200-vm07-3PAR1 02:15
    Processing kv-9450-vm04-3PAR2 02:15
    Processing kv-9450-vm06-3PAR2 02:15
    Processing kv-8200-vm02-3PAR1 02:15
    All VMs have been queued for processing
    Deleting storage snapshot 00:00
    Synthetic full backup created successfully [fast clone] 00:35
    Load: Source 50% > Proxy 89% > Network 76% > Target 0%
    Primary bottleneck: Proxy
    Job finished at 11/10/2018 6:14:51 AM
P.S. This week I'm at HPE Discover Madrid. If you are there, stop by my desk and I'll show it live.
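
The timings in the job log can be cross-checked with a few lines of Python. This is only a sketch: the start and finish timestamps and the fast clone duration are copied from the log above, and the timestamp format string is an assumption about how those values are laid out.

```python
from datetime import datetime, timedelta

FMT = "%m/%d/%Y %I:%M:%S %p"  # matches timestamps such as "11/10/2018 6:08:45 AM"


def mmss(text: str) -> timedelta:
    """Convert a "MM:SS" duration from the log into a timedelta."""
    minutes, seconds = text.split(":")
    return timedelta(minutes=int(minutes), seconds=int(seconds))


started = datetime.strptime("11/10/2018 6:08:45 AM", FMT)
finished = datetime.strptime("11/10/2018 6:14:51 AM", FMT)
total = finished - started        # whole job, wall clock
fast_clone = mmss("00:35")        # "Synthetic full backup created successfully"
incremental = total - fast_clone  # everything before the synthetic full

print(f"total {total}, incremental part {incremental}, fast clone {fast_clone}")
print(f"fast clone share: {fast_clone / total:.1%}")
```

The wall-clock total comes out within a second of the quoted VSF job duration, and the fast clone step is a bit under a tenth of the whole run.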

Re: Deduplication Storage as first target OK nowadays?

Post by Murigar » Dec 03, 2018 6:43 pm

evilaedmin wrote:
Nov 24, 2018 9:46 pm
What are your thoughts on ReFS block clone powered synthetic fulls? Our SE has strongly suggested that this feature is more desirable than dedupe capabilities, especially since a dedupe-powered synthetic full has to do a full read/rehydrate/write/dehydrate cycle.
You are of course correct that data on any dedup storage will likely be painfully slow to restore.
Take the model I described above as an example.
The primary array on ReFS may hold, say, 5-7 days of backup data, ready for a faster recovery.
This would certainly be a good use case for ReFS block clone synthetic fulls.

The secondary on site and off site dedup arrays would hold a longer retention. (Up to disk capacity or as required 30 days maybe.)
While the dedup data is certainly more "time costly" to access you very likely will not be conducting a full or large restore from this data set.

Without knowing your data set or retention requirements I can only speak for what works for me.
For primary storage, if you can get a correctly sized single array and keep your retention within the primary ReFS file system, that would work very well.
For any "Backup Copy" or off-site data, I don't see an advantage to ReFS unless I am missing something (block cloning only takes place within one volume).
