-
- Enthusiast
- Posts: 56
- Liked: 6 times
- Joined: Jun 18, 2009 2:27 pm
- Full Name: Yves Smolders
- Contact:
Nakivo style backup storage
Hello,
I was wondering if a Nakivo-style backup repository is being considered for Veeam?
Nakivo is in a different league than Veeam, but one thing they seem to be doing really well is the backup repository approach.
Instead of working with full backup & incremental files, the repository saves compressed, deduplicated blocks in smaller files. Every backup simply points to blocks already stored or creates new ones. Blocks don't get "deleted" until no backup references them anymore.
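Roughly, the mechanism can be sketched in a few lines of Python. This is a minimal illustration with a made-up on-disk layout, not Nakivo's actual format, and real products use content-defined chunking rather than the fixed-size blocks used here:

```python
import hashlib
import zlib
from pathlib import Path

STORE = Path("repo/chunks")  # hypothetical layout, not Nakivo's actual format


def put_chunk(data: bytes) -> str:
    """Store a compressed chunk under its content hash; reuse it if it exists."""
    digest = hashlib.sha256(data).hexdigest()
    path = STORE / digest[:2] / digest
    if not path.exists():  # dedup: identical blocks map to the same path, repo-wide
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(zlib.compress(data))
    return digest


def backup(source: Path, manifest: Path, block_size: int = 1 << 20) -> None:
    """A backup is just a manifest: an ordered list of chunk hashes."""
    with source.open("rb") as src, manifest.open("w") as m:
        while block := src.read(block_size):
            m.write(put_chunk(block) + "\n")
```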
I see a few advantages:
1. Repository-wide deduplication
2. No issues with full backups/synthetic fulls/merging. (No extra disk space needed, no long operations that can fail, ...)
3. Easy retention and cleanup, as there are no files to be deleted: just remove pointers to blocks and eventually clean up the blocks (see the sketch after this list)
4. No need for fancy filesystems to speed up merging/synthetic backups
5. No open-file issues: backup jobs can be read/mounted/restored even while a backup is running.
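To make point 3 above concrete, continuing the same hypothetical sketch: deleting a restore point just deletes its manifest, and a mark-and-sweep pass reclaims chunks no manifest references anymore:

```python
from pathlib import Path


def prune(manifest_dir: Path, store: Path) -> None:
    """Retention = delete a manifest, then sweep unreferenced chunks."""
    live = set()
    for manifest in manifest_dir.glob("*.manifest"):  # mark: collect referenced hashes
        live.update(manifest.read_text().split())
    for chunk in store.glob("*/*"):  # sweep: drop chunks nobody points to
        if chunk.name not in live:
            chunk.unlink()
```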
This style of backup storage is becoming more & more widespread, not just in products like Nakivo, but also in Duplicacy, Restic, Kopia, and others.
This would seem a great match for Veeam, which does everything else right.
Thanks
-
- Product Manager
- Posts: 14726
- Liked: 1707 times
- Joined: Feb 04, 2013 2:07 pm
- Full Name: Dmitry Popov
- Location: Prague
- Contact:
Re: Nakivo style backup storage
Hello Yves,
Thanks for the feedback. Based on your description, it looks like a branded Windows repository on ReFS with Windows deduplication enabled?
-
- VeeaMVP
- Posts: 1007
- Liked: 314 times
- Joined: Jan 31, 2011 11:17 am
- Full Name: Max
- Contact:
Re: Nakivo style backup storage
You almost have all those advantages with ReFS and XFS, except for the dedup part. Surely Nakivo also uses some kind of open-source filesystem/technology?
I don't know Nakivo, but I wouldn't like to be forced to use a proprietary backup storage solution where I cannot easily export/move away my files.
-
- Product Manager
- Posts: 2581
- Liked: 708 times
- Joined: Jun 14, 2013 9:30 am
- Full Name: Egor Yakovlev
- Location: Prague, Czech Republic
- Contact:
Re: Nakivo style backup storage
Pros are always measured alongside cons, and global deduplication has its share (just to name a few):
- Loss of backup portability (and of control over backup files overall): it is simply impossible to copy a certain backup outside the product UI. Want a backup file copy on tape? On an external drive? On a network share? In the cloud? Forget about it... and if you have 1,000,000 chunks on a repository, what happens when you want to delete a certain backup?
- Heavy (and I mean it) dependency on repository metadata surviving: if the meta is corrupted, your entire repository of backups goes pumpkin. That's why all global deduplication appliances have multi-node meta replication and do daily meta cleanups, daily meta health checks, etc.
- Global deduplication puts an even higher load on the repository, due to the constant need to organize that "ever-changing live beast" with daily space reclaim, daily block realignment, etc., hammering your storage disks non-stop.
- Restore patterns can be very different with a globally deduplicated backup, where small chunks of data are spread across the storage. Where you could sequentially read an entire VM backup from a single .vbk file, global deduplication will drown in random IO, seeking for blocks here and there. Restore times can be many times slower, and "Instant Restore" machines running from a deduplicated backup can be rendered unusable.
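To make that last point concrete, here is a rough sketch (a hypothetical content-addressed layout, purely for illustration): a monolithic backup file restores as one sequential stream, while a globally deduplicated restore pays a lookup, and on spinning disks a seek, per chunk:

```python
import shutil
import zlib
from pathlib import Path


def restore_monolithic(vbk: Path, target: Path) -> None:
    shutil.copyfile(vbk, target)  # one large sequential read of a single .vbk


def restore_deduped(manifest: Path, store: Path, target: Path) -> None:
    with target.open("wb") as out:
        for digest in manifest.read_text().split():
            # one lookup (and potentially one seek) per chunk: chunks sit
            # wherever their hash placed them, not in the order they are read
            chunk = store / digest[:2] / digest
            out.write(zlib.decompress(chunk.read_bytes()))
```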
-
- Veteran
- Posts: 643
- Liked: 312 times
- Joined: Aug 04, 2019 2:57 pm
- Full Name: Harvey
- Contact:
Re: Nakivo style backup storage
@Yves,
I've not actually used Nakivo before (I like their articles (sometimes), but they're not really a common name I see), so I had to look up how their repos actually work.
https://helpcenter.nakivo.com/display/N ... Repository
Reading your description, I do see some of the benefits -- global dedup across a backup repository is enticing, but I have some concerns:
1. The big "important" warning is really concerning, and it kind of seems like exposing the Windows Dedup chunk store much more publicly. Having the dedup store exposed like this is a bit worrying, imo.
2. 128 TB is kind of small for what I see in medium or larger-sized businesses. For many of the environments I work in, two full backups from VMs would consume this, and the OS dedup really wouldn't help. To be clear, I can see this working well for many environments, but it's a pretty hard limit for a lot of clients I know (physical machine backups would be almost impossible).
The rest of the points, I suppose, are really doable now with virtually any backup solution and XFS, just without a lot of the built-in limits.
Can I ask: is it mostly the convenience that you can put the repository on top of any existing infrastructure you like, compared to an XFS volume?
I can see this working out really well for a lot of small clients for that very reason, and that's definitely cool, but I'm also always a bit wary of deduplication as it's a black box. Especially with OS-style dedup, when the dedup store is exposed like that, it seems a bit risky.
But good to read about another method.
(edit: also, a quick look on Google and I can't seem to find comparison numbers between Nakivo's native dedup and Windows Dedup, or really any significant studies on how performant it is. Maybe you have some references?)
-
- Enthusiast
- Posts: 56
- Liked: 6 times
- Joined: Jun 18, 2009 2:27 pm
- Full Name: Yves Smolders
- Contact:
Re: Nakivo style backup storage
Thanks all, for your replies.
I'm running a Nakivo setup in parallel with Veeam for a year, on small/underpowered hardware (a worst-case scenario, which we unfortunately have at some small clients), so I'll see how it performs by then.
@Egor,
Most of the issues you point at are already solved by the software (export/portability), but the points you make about speed/fragmentation are valid - I'll see what happens after one year.
"Deleting" is actually easy on these kinds of systems: you can remove any point-in-time snapshot you want, as there is no full backup file on which it depends. In that regard it is very flexible and efficient.
Our smaller customers struggle with backup storage pricing: they can hardly shell out for a lower-end NAS, don't understand that RAID5/6 can't really be used for backup, and surely don't want to pay for (at least) another Windows server license just to be able to use ReFS. I know I can now use a Linux repository with XFS instead, but I run into resistance to that with clients/colleagues ("Open source! Oh no!") - but they DO want the impossibly long retention & short RTOs.
No need to keep this post going any longer; it is not my intention to undermine Veeam or prop up another product here (that would be preaching to the choir), I'm just curious about the storage side of things.
Thanks!
-
- Enthusiast
- Posts: 56
- Liked: 6 times
- Joined: Jun 18, 2009 2:27 pm
- Full Name: Yves Smolders
- Contact:
Re: Nakivo style backup storage
Here is some more detail about Nakivo's system - https://helpcenter.nakivo.com/display/K ... Are+Stored
Another, comparable storage system (not for virtual machines) is Duplicacy - https://forum.duplicacy.com/t/snapshot-file-format/1106
Can't immediately find anything for Restic and/or Kopia.
-
- Veteran
- Posts: 643
- Liked: 312 times
- Joined: Aug 04, 2019 2:57 pm
- Full Name: Harvey
- Contact:
Re: Nakivo style backup storage
Hi Yves,
Thanks for those, but I was thinking more of direct number comparisons. Like, if we sent 2 TB of VM data to a Nakivo repository, then did the same to a standard Windows repository with Windows Dedup and let the optimization job run, which would get better savings?