Deduplication appliance recommendation

by rasmusan » Sat Jun 06, 2015 5:26 pm

Hello

I am looking for some recommendations regarding a deduplication solution. Which products/solutions do you have experience with (good or bad)?

Specifically, I have a customer with around 40TB of data, a big part of which is graphical/CAD data. The deduplication solution is to be used for longer-term retention, as they have another storage solution for the first copy of the backup data. We have been looking at EMC Data Domain DD2500 with DDBoost as a possible solution, but it is quite expensive...

Also, what about Windows Server 2012 R2 deduplication for this amount of data? Does anyone have experience with this?
rasmusan
Enthusiast
 
Posts: 28
Liked: never
Joined: Tue Jan 20, 2015 9:03 pm
Full Name: Rasmus Andersen

Re: Deduplication appliance recommendation

by Gostev » Sat Jun 06, 2015 5:45 pm

The most optimistic Windows Server 2012 R2 dedupe scalability limit I have seen reported was around 10TB, and I personally recommend no more than 5TB. That said, are you sure you want deduplication in this particular case? Most likely, this data will not dedupe very well... so raw storage may end up much cheaper, and will guarantee the performance levels too. Remember, with Veeam you can just go with an industry-standard server stuffed with large hard drives.
Gostev
Veeam Software
 
Posts: 21517
Liked: 2383 times
Joined: Sun Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Re: Deduplication appliance recommendation

by rasmusan » Sat Jun 06, 2015 6:00 pm

Hi Gostev

Yes, I was not that into Server 2012 R2 dedup either... I just could not find any documentation stating the limits, but thanks for pointing that out :)

Well, I already have a traditional storage array for short retention (also for performance reasons); this is for the longer retention. If you want to retain a year's worth of data at a monthly interval, and Veeam can compress it to, let's say, 25TB, you would need quite a lot of raw disk space. So the purpose of the dedup is to hold these multiple archive copies... makes sense?
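
To put rough numbers on that, here is a minimal sketch of the arithmetic in Python. The 25TB full size and the twelve monthly copies come from the paragraph above; the reduction ratios are placeholder assumptions, not measurements, and the function names are made up for illustration.

```python
def raw_capacity_tb(full_backup_tb: float, copies: int) -> float:
    """Space needed to keep `copies` independent full backups with no dedup."""
    return full_backup_tb * copies


def reduced_capacity_tb(full_backup_tb: float, copies: int, reduction: float) -> float:
    """Space needed if the target removes `reduction` (0.0 to <1.0) of the raw total."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be at least 0.0 and below 1.0")
    return raw_capacity_tb(full_backup_tb, copies) * (1.0 - reduction)


full_tb = 25.0        # compressed full backup size assumed in the post
monthly_copies = 12   # one year of monthly restore points

print(f"no dedup: {raw_capacity_tb(full_tb, monthly_copies):.0f} TB")
for reduction in (0.25, 0.50, 0.75):  # hypothetical ratios, not measured values
    needed = reduced_capacity_tb(full_tb, monthly_copies, reduction)
    print(f"{reduction:.0%} reduction: {needed:.0f} TB")
```

Twelve independent 25TB fulls come to 300TB of raw space, which is why the reduction ratio actually achieved on CAD-heavy data matters so much here.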
rasmusan
Enthusiast
 
Posts: 28
Liked: never
Joined: Tue Jan 20, 2015 9:03 pm
Full Name: Rasmus Andersen

Re: Deduplication appliance recommendation

by Gostev » Sat Jun 06, 2015 7:12 pm

Yes, it certainly does. I did not know if you were thinking GFS, or something else.

By the way, depending on the time scale of this project, you may also want to consider evaluating Windows Server 2016 deduplication, which has major deduplication engine enhancements coming, many of them specifically aimed at performance and scale.
Gostev
Veeam Software
 
Posts: 21517
Liked: 2383 times
Joined: Sun Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Re: Deduplication appliance recommendation

by Matte » Sun Jun 07, 2015 10:14 pm

Looking at the enhancements Microsoft has made in Windows Server 2016 deduplication, I'm not sure they are enough to make it viable for such a case.

The limits in Windows Server 2012 R2 are volumes < 10TB, but files approaching 1TB aren't good candidates.
The limits in Windows Server 2016 are volumes up to 64TB (as it is now multithreaded) and files up to 1TB.

Can Windows Server deduplication handle larger files? Probably, but it would require some case-specific testing to see the actual results and performance. I have personally seen Windows Server 2012 R2 have issues with .vbk files larger than 1TB, and it isn’t pretty when that happens. Remember, the official Microsoft recommendation is that these files "aren't good candidates".

Of course you could split your VMs across multiple backup jobs to keep the .vbk size down, but considering it's graphical/CAD data, the customer probably has large file server(s), which wouldn't make that a good option. That’s just an assumption, though.
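
If you want to see where an existing repository stands against that 1TB guidance, a quick scan could be sketched like this. It assumes Python 3.9+, reads "1TB" as 1 TiB (1024^4 bytes), and uses a made-up repository path and made-up function names; it only reads file sizes and changes nothing.

```python
from pathlib import Path

ONE_TB = 1024 ** 4  # assumption: read the "1TB" guidance as 1 TiB


def backup_sizes(root: Path, pattern: str = "*.vbk") -> dict[str, int]:
    """Map every matching file under `root` to its size in bytes."""
    return {str(p): p.stat().st_size for p in root.rglob(pattern) if p.is_file()}


def over_limit(sizes: dict[str, int], limit_bytes: int = ONE_TB) -> list[str]:
    """Return the names of files larger than `limit_bytes`, sorted by name."""
    return sorted(name for name, size in sizes.items() if size > limit_bytes)


if __name__ == "__main__":
    repo = Path("D:/Backups")  # hypothetical repository path
    if repo.is_dir():
        for name in over_limit(backup_sizes(repo)):
            print("larger than 1TB:", name)
```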
Matte
Service Provider
 
Posts: 1
Liked: never
Joined: Mon Oct 27, 2014 10:29 am

Re: Deduplication appliance recommendation

by dellock6 » Mon Jun 08, 2015 11:48 am

Just a hint from a Linux guy... If you are looking for a cheap/free solution, give opendedup a try; it has no file size limit, as far as I remember ;)
Luca Dell'Oca
EMEA Cloud Architect @ Veeam Software

@dellock6
http://www.virtualtothecore.com
vExpert 2011-2012-2013-2014-2015-2016
Veeam VMCE #1
dellock6
Veeam Software
 
Posts: 5118
Liked: 1361 times
Joined: Sun Jul 26, 2009 3:39 pm
Location: Varese, Italy
Full Name: Luca Dell'Oca

Re: Deduplication appliance recommendation

by Gostev » Mon Jun 08, 2015 2:32 pm

Matte, my recommendation above is also based on the fact that there is a new backup storage option in B&R v9 which goes hand in hand with Windows Server 2016 deduplication needs and requirements. It's a part of one top secret feature that we will not announce until closer to the release though. Thanks!
Gostev
Veeam Software
 
Posts: 21517
Liked: 2383 times
Joined: Sun Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Re: Deduplication appliance recommendation

by rasmusan » Mon Jun 08, 2015 5:25 pm

I am not necessarily looking for a free/cheap solution, just the "optimal" solution for deduplication with fairly large amounts of data... What do other customers do? What are your experiences?
rasmusan
Enthusiast
 
Posts: 28
Liked: never
Joined: Tue Jan 20, 2015 9:03 pm
Full Name: Rasmus Andersen

Re: Deduplication appliance recommendation

by hans_lenze » Mon Jun 08, 2015 8:11 pm 2 people like this post

We mostly use standard rack servers with Windows Server 2012 R2 deduplication and a lot of local storage. It works fine, but you have to tick all the boxes when you set it up or you'll be in a world of hurt down the road. The maximum file fragmentation count for NTFS volumes is a nasty little detail that can cause big problems (format with /L and edit the registry to quickly defragment individual files). We store GFS backups offsite, and you can expect 60 to 80% reduction on top of the Veeam dedup and compression. The full backup in most chains is over 2.5TB, so big files work kind of okay with Windows Server dedup. The biggest problem is that it's a single-threaded process, and it slows down when it runs out of memory or when the file is very big. In time it will deduplicate the whole file, but it will take a while to process that first full backup. The deduplication process remembers which blocks were processed and picks up where it left off. You just won't see any additional free disk space until the whole file has been processed.
Remember that you have to keep sufficient free disk space to fill the chunk store, or the process will fail and you're stuck.

We've been looking at dedicated appliances and got some ExaGrid boxes. So far they are a dream to work with. The onboard Veeam data mover and dedicated "landing zone" work as advertised, and the performance is very good. I've seen 600MB/s when restoring VMs to the virtualization platform. They don't need any additional licenses and they work as a repository straight out of the box.
hans_lenze
Service Provider
 
Posts: 16
Liked: 4 times
Joined: Fri Sep 07, 2012 7:07 am

Re: Deduplication appliance recommendation

by damasta » Sun Jun 21, 2015 10:27 am 1 person likes this post

Hi,

If you prefer a turn-key appliance, you might want to check out the Fujitsu ETERNUS CS800, which is cheaper and faster than the EMC DD boxes.

But if you prefer a DIY solution, what's wrong with Windows native dedup? It's free and included in every 2012 R2 server you buy. Just switch it on.

Microsoft officially supports dedup with their own product DPM, provided you follow these tweaking guidelines:
https://technet.microsoft.com/en-us/lib ... 91438.aspx
In short, they tell you to split your backup repository across multiple volumes of 5-7 TB each, change the dedup operation so it is better suited to very large container files, and set up scheduling so that dedup doesn't collide with your data protection jobs.
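
As a rough illustration of the volume split, the number of volumes follows from a ceiling division. The 25TB repository size below is only an example figure borrowed from earlier in the thread, and `volumes_needed` is a made-up helper name.

```python
import math


def volumes_needed(repository_tb: float, max_volume_tb: float) -> int:
    """Smallest number of volumes of at most `max_volume_tb` that hold `repository_tb`."""
    if repository_tb < 0 or max_volume_tb <= 0:
        raise ValueError("repository size must be >= 0 and volume size must be > 0")
    return math.ceil(repository_tb / max_volume_tb)


repository_tb = 25.0  # example figure, not a recommendation
for volume_tb in (5.0, 7.0):  # the 5-7 TB range mentioned above
    count = volumes_needed(repository_tb, volume_tb)
    print(f"{volume_tb:.0f} TB volumes: {count}")
```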

As Matte pointed out, most Windows dedup limitations have been addressed in Server 2016:
http://blogs.technet.com/b/filecab/arch ... iew-2.aspx
I am evaluating that as I write this - with great results so far.

Personally, I agree with hans_lenze's warning: Windows dedup is offline (post-process), so the target must always have enough free space to ingest a full backup. And if for some reason dedup can't finish processing all data before the next dump, the problem is vastly exacerbated.
The DD and CS appliances don't have that issue. They dedup in memory, before anything hits the drives, so you don't need to worry about free disk space. Or at least not until much, much later than with any offline dedup scheme.
Also, a full backup of 40TB will be... a challenge for any target. So get something that performs well! In three to five years, those 40TB will easily swell to 80TB...
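
The free-space point can be turned into a simple pre-flight check for a post-process dedup target. This is a minimal sketch: the 10% safety margin, the function names, and the example drive letter are assumptions, while `shutil.disk_usage` is the standard-library call for reading free space.

```python
import shutil


def can_ingest(free_bytes: int, next_full_bytes: int, margin: float = 0.10) -> bool:
    """True if the next full backup plus a safety margin fits in the free space."""
    if margin < 0:
        raise ValueError("margin must not be negative")
    return free_bytes >= next_full_bytes * (1.0 + margin)


def target_can_ingest(path: str, next_full_bytes: int, margin: float = 0.10) -> bool:
    """Same check, reading the free space of the volume that holds `path`."""
    return can_ingest(shutil.disk_usage(path).free, next_full_bytes, margin)
```

For example, `target_can_ingest("E:/", 25 * 1024**4)` would ask whether a hypothetical E: volume can take a 25 TiB full before the dedup job has had a chance to reclaim anything.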

At the end of the day, evaluating Windows dedup is only going to cost you some time and a bunch of TB to scratch around on. Spin up a Windows Server VM and test dedup on a volume of your choice. Since dedup doesn't depend on Storage Spaces or Hyper-V, you can run it comfortably inside a VM, for testing and production purposes. In fact, my current test setup is a ~1.5 TB VHDX residing on a QNAP NAS box, which I mounted on my laptop and passed through client Hyper-V to my Server 2016 test VM. No, it's not fast. ;-) But I am more interested in compression rates than performance right now.
damasta
Lurker
 
Posts: 1
Liked: 1 time
Joined: Sun Jun 21, 2015 9:43 am

Re: Deduplication appliance recommendation

by rgarrison » Mon Jun 22, 2015 3:33 pm

I'm a fan of Data Domain, which has been working very well for us with DDBoost and Veeam. We have 12 Data Domain appliances receiving backup data and replicating, and after tweaking the settings, it "just works". It took some time and trial and error to find the optimal settings for the Veeam jobs, but once we did, it's been smooth sailing. Dedup ratio is good for our data, but that is obviously very dependent on the nature of the source material.

DD and DDBoost are definitely expensive, but they do work very well with Veeam. Once Veeam adds the managed file replication capability, it will be close to perfect in my eyes.

In no way am I trying to bad mouth Windows dedup (I don't have experience with it), but I thought I'd at least share a positive experience with Data Domain and Veeam.
rgarrison
Novice
 
Posts: 7
Liked: never
Joined: Thu Jan 08, 2015 1:45 pm
Full Name: Ryan Garrison

Re: Deduplication appliance recommendation

by jian17 » Mon Jun 29, 2015 6:58 pm

Hello rgarrison, what kind of restore speeds are you seeing from your DD? Both full VM and file level recovery?

We are seeing pretty slow restores from our DD2500 and have tried every setting suggested in the other threads.
jian17
Novice
 
Posts: 3
Liked: never
Joined: Fri Jun 26, 2015 2:18 pm

Re: Deduplication appliance recommendation

by smd32 » Sun Jan 24, 2016 6:35 am

Gostev wrote: Matte, my recommendation above is also based on the fact that there is a new backup storage option in B&R v9 which goes hand in hand with Windows Server 2016 deduplication needs and requirements. It's a part of one top secret feature that we will not announce until closer to the release though. Thanks!

Did this secret feature turn out to be the scale-out backup repositories or something else?
smd32
Service Provider
 
Posts: 13
Liked: never
Joined: Sun Jan 24, 2016 4:34 am
Full Name: Scott Drassinower

Re: Deduplication appliance recommendation

by Gostev » Sun Jan 24, 2016 9:20 pm

Per-VM backup file chains. This option lets you keep individual backup file size small (according to each individual VM size) without having to create a dedicated job for every VM. Microsoft does not recommend giving Windows dedupe large files to work with.
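
A small sketch of why that helps: with one file per job, the full backup is roughly the sum of all VMs in the job, while per-VM chains yield one file per VM. The VM names and sizes below are invented, the 1TB threshold is the guidance quoted earlier in the thread, and real backup sizes also depend on compression and dedup inside the job.

```python
def full_backup_files_tb(vm_sizes_tb: dict[str, float], per_vm: bool) -> list[float]:
    """Sizes of the full backup files a job would produce under each layout."""
    if per_vm:
        return list(vm_sizes_tb.values())   # one file per VM
    return [sum(vm_sizes_tb.values())]      # one file for the whole job


def count_over(sizes_tb: list[float], limit_tb: float = 1.0) -> int:
    """Number of files larger than `limit_tb`."""
    return sum(1 for size in sizes_tb if size > limit_tb)


job = {"fileserver01": 0.9, "cad-share01": 0.8, "sql01": 0.4}  # hypothetical sizes in TB

for per_vm in (False, True):
    files = full_backup_files_tb(job, per_vm)
    layout = "per-VM" if per_vm else "per-job"
    print(f"{layout}: {len(files)} file(s), {count_over(files)} over 1TB")
```

Note that a single VM that is itself larger than the limit still produces an oversized file under either layout.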
Gostev
Veeam Software
 
Posts: 21517
Liked: 2383 times
Joined: Sun Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Re: Deduplication appliance recommendation

by smd32 » Mon Jan 25, 2016 4:16 am

So per-VM backup file chains, scale-out repository, and leave dedupe to Windows Server 2012 R2 or 2016 instead of Veeam -- gets you most of the functionality of a dedupe appliance? Or go with the first two and stick with Veeam dedupe but just spend the extra cash for more disk?
smd32
Service Provider
 
Posts: 13
Liked: never
Joined: Sun Jan 24, 2016 4:34 am
Full Name: Scott Drassinower
