v.Eremin wrote:
So, you're using two Tegile devices - one as production datastore, the other (deduplication unit) as a target for backup jobs, right? What particular settings allowed you to achieve decent deduplication rates? The information about both job (compression, deduplication) and device settings might be helpful for future readers. Thanks.
Hello v.Eremin,
Only one Tegile device, an HA2100. It has active/active controllers: one controller serves storage over Fibre Channel to our Hyper-V cluster, and the other provides CIFS shares as a target for the backup jobs.
To provide some additional background, we have several large VMs (2 TB) that we need to keep on disk for at least 6 months. The goal was to get dedupe between the large weekly Veeam fulls for these VMs. After moving the Veeam jobs from CIFS shares on our NetApp to the Tegile, our bottleneck shifted from target to source (our VMs were still on the NetApp) and backup job times were about 10% faster. Space savings between backup jobs with the default settings were OK, but not as high as I had hoped. After some trial and error, we almost doubled our space savings with the following settings:
Veeam:
"Align backup file data blocks" enabled
"Deduplication" disabled
Compression set to "none"
Jobs are incrementals with a weekly active full backup
Tegile:
Deduplication on; for compression, either gzip or lz4. (We used gzip initially and it gets better compression, but it is CPU intensive. With the number of simultaneous jobs we were running, we were at 80% CPU, so we switched to lz4, which dropped it to no higher than 40%. The HA2100 we have is the entry model; higher models have additional CPU resources. I'd recommend gzip for backups on any model above the HA2100.)
With dedupe and compression, we have 56 TB of backups taking up 7 TB of space on the Tegile. While I've seen better dedupe with other products, I haven't seen better performance. Whether you are doing a restore from 6 months ago or from the previous night, the speed is the same; there is no performance penalty for dedupe and compression. After migrating our VMs over to the 2nd Tegile controller from the NetApp, backups of those VMs sped up another 30%. Backups are almost (but not quite) twice as fast.
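For readers who want to compare against their own numbers, here is a minimal sketch of the arithmetic behind the 56 TB / 7 TB figure above. The function name is made up for illustration; only the two capacity values come from our environment.

```python
def reduction_stats(logical_tb: float, physical_tb: float) -> tuple[float, float]:
    """Return (reduction ratio, percent space saved) for a backup repository.

    logical_tb  -- size of the backup files as written by the backup jobs
    physical_tb -- space actually consumed on the array after dedupe/compression
    """
    if logical_tb <= 0 or physical_tb <= 0:
        raise ValueError("sizes must be positive")
    ratio = logical_tb / physical_tb
    saved_pct = (1 - physical_tb / logical_tb) * 100
    return ratio, saved_pct


# Figures quoted above: 56 TB of backups stored in 7 TB on the Tegile.
ratio, saved_pct = reduction_stats(56, 7)
print(f"{ratio:.1f}:1 reduction, {saved_pct:.1f}% space saved")
# prints "8.0:1 reduction, 87.5% space saved"
```

This is the combined effect of dedupe and compression; the array's own counters may report the two separately.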
A few things to consider:
1) With dedupe and compression off in Veeam and taking place completely on the Tegile, if you archive to tape or cloud, your backup files will be larger.
2) In our environment, I was surprised that there was no difference in backup times between running with Veeam dedupe and compression off and running with them on. If the backup takes 4 hours with dedupe/compression off, it still takes 4 hours with them on, even though on some jobs it's sending 75% less data. YMMV.
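To put point 2 in numbers, here is a rough sketch. The 4 hours and the 75% come from the example above; pairing them with a 2 TB job is only an assumption for illustration, and the function name is made up. If the duration stays the same while the data sent drops by 75%, the average rate into the target drops to a quarter, which suggests the transfer to the target was not the limiting step for that job.

```python
def avg_rate_mb_s(data_tb: float, hours: float) -> float:
    """Average transfer rate in MB/s (decimal units) for data_tb sent in hours."""
    return data_tb * 1_000_000 / (hours * 3600)


source_tb = 2.0   # ASSUMPTION: job size picked for illustration only
hours = 4.0       # same duration with Veeam dedupe/compression on or off
reduction = 0.75  # "75% less data" sent when dedupe/compression is on

rate_off = avg_rate_mb_s(source_tb, hours)
rate_on = avg_rate_mb_s(source_tb * (1 - reduction), hours)

print(f"off: {rate_off:.0f} MB/s to target, on: {rate_on:.0f} MB/s to target")
print(f"target-side rate with dedupe/compression on is {rate_on / rate_off:.0%} of the rate with it off")
```

That would fit with the bottleneck sitting on the source side, but it is only one possible reading of the numbers.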
Next year we plan on adding a second, larger Tegile to take over the production VMs currently on the HA2100; the 2100 will be moved to our CoLo for replication. Veeam is currently replicating VMs to our CoLo with storage on a NetApp.