DonZoomik
Service Provider
Posts: 372
Liked: 120 times
Joined: Nov 25, 2016 1:56 pm
Full Name: Mihkel Soomere
Contact:

Experiences with virtualized GlusterFS Distributed mode backup and restore

Post by DonZoomik »

Background: a large, constantly growing data set of files, organized as a filesystem of semi-random (hash-based) paths that we don't control. It is currently stored on physical hardware but is planned to be virtualized. That would make for a pretty inconvenient VM size (20TB+), so we're looking for alternatives. GlusterFS in Distributed mode looks pretty good, as you can present it as a file system with the native client and add reasonably sized nodes as required. The loss of resiliency is not really a problem, as the virtualized storage is on a SAN.

However, there is little information about backup and recovery of such a system. What if we need to restore a node: how does the system react to a rollback? If we start replicas in a DR event, they will have metadata mismatches (backups and node starts will be quite randomly distributed in time). Will it self-heal?

Anybody have experiences with such a system?
HannesK
Product Manager
Posts: 14844
Liked: 3086 times
Joined: Sep 01, 2014 11:46 am
Full Name: Hannes Kasparick
Location: Austria
Contact:

Re: Experiences with virtualized GlusterFS Distributed mode backup and restore

Post by HannesK »

Hello,
I found some earlier conversations, but with no final feedback from customers.

1) Consistency seems to be possible with snapshots: https://docs.gluster.org/en/latest/Admi ... Snapshots/
2) Since, as you mention, you can present it as a file system, I would go with NAS backup. That way you can easily restore the data.


Best regards,
Hannes
DonZoomik

Re: Experiences with virtualized GlusterFS Distributed mode backup and restore

Post by DonZoomik »

How would snapshots help? The only scenario I can think of: keep all snapshots taken between Veeam backups and, when restoring, revert Gluster to the earliest common snap. For example, if the nodes were backed up at 00:10, 00:50 and 01:20, revert to 00:00 or similar. Or just create snaps with integration scripts on each node and revert to the one with the earliest snap.
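To make the "earliest common snap" idea concrete, here is a minimal Python sketch. It only does the time arithmetic and does not call Gluster or Veeam at all. The function name `pick_revert_snapshot`, the helper `t`, the placeholder date, and the snapshot times 00:30 and 01:00 are made up for illustration; only the backup times and the 00:00 target come from the example above. The assumption is that the right target is the latest snapshot taken at or before the earliest node backup.

```python
from datetime import datetime


def pick_revert_snapshot(node_backup_times, snapshot_times):
    """Return the latest snapshot taken at or before the earliest node backup.

    Hypothetical helper: it only compares timestamps.
    """
    earliest_backup = min(node_backup_times)
    candidates = [s for s in snapshot_times if s <= earliest_backup]
    if not candidates:
        raise ValueError("no snapshot predates the earliest node backup")
    return max(candidates)


def t(hhmm):
    # Placeholder date; only the time of day matters in this example.
    return datetime.strptime("2024-01-01 " + hhmm, "%Y-%m-%d %H:%M")


# Node backup times from the example in the post.
backups = [t("00:10"), t("00:50"), t("01:20")]
# Assumed snapshot schedule (every 30 minutes), purely illustrative.
snapshots = [t("00:00"), t("00:30"), t("01:00")]

target = pick_revert_snapshot(backups, snapshots)
print(target.strftime("%H:%M"))  # prints 00:00
```

Anything newer than 00:00 would postdate at least one node's backup, which is why the later snapshots are rejected here.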

NAS backup would seem to add unneeded cost (if the nodes are already virtualized) and very long RTOs (needing to push a lot, or everything, back to GFS instead of using Instant Recovery or ready-to-start replicas).
borismittelmann
Veeam Software
Posts: 149
Liked: 23 times
Joined: Jul 01, 2013 1:27 pm
Full Name: Boris Mittelmann
Contact:

Re: Experiences with virtualized GlusterFS Distributed mode backup and restore

Post by borismittelmann »

Hi @DonZoomik ,
I believe @HannesK was thinking of using a snapshot for consistency, and of offloading the backup from the production filesystem by reading the data from that snapshot, which is possible with Veeam NAS backup.

Bo.
DonZoomik

Re: Experiences with virtualized GlusterFS Distributed mode backup and restore

Post by DonZoomik »

Exhuming old threads I see! :D
Anyway, this project went live with just VM-based backups. We played around a lot, and distributed GlusterFS turned out to be tolerant of rollbacks. Files missing from a node's filesystem (due to rollback/restore) just disappear from the namespace. It's true that NAS backup with Gluster snaps would have been more consistent, but the result was deemed good enough as-is, with no extra costs.
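A toy Python model of what we observed, for anyone wondering what "disappear from namespace" means in practice. This is not Gluster code and says nothing about how its hashing works. It just assumes that in Distributed mode each file lives on exactly one node and that the client sees the union of all nodes. The node names and file paths are made up.

```python
def namespace(bricks):
    """Client-visible namespace in this toy model: union of files on all nodes."""
    visible = set()
    for files in bricks.values():
        visible |= files
    return visible


# Hypothetical state before the restore: each file lives on exactly one node.
bricks_now = {
    "node1": {"a/1.dat", "c/3.dat"},
    "node2": {"b/2.dat", "d/4.dat"},
}

# node2 is rolled back to a backup taken before d/4.dat was written.
bricks_restored = dict(bricks_now, node2={"b/2.dat"})

lost = namespace(bricks_now) - namespace(bricks_restored)
print(sorted(lost))  # prints ['d/4.dat']
```

Files on the nodes that were not rolled back stay visible; only what the restored node no longer holds goes missing.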