VM as repo

flibouille · Aug 05, 2022 7:16 am

Hi,
This may be an odd question, but is it appropriate to use a VM as a repo (for example a ReFS disk on the Veeam VM) or is it strongly advised to have only physical machines (NAS, SAN, linux server...)?
Thanks.

LickABrick · Aug 05, 2022 7:44 am

If you use a VM and someone gets into your vCenter/Hyper-V server, they can simply delete the disk/VM with all the backups on it. So yes, it can be used, but it is not advised.

Mildur · Aug 05, 2022 7:45 am

Hello

Our recommendation is to use a physical server as the backup repository.
A backup server VM that runs on the production hypervisor can easily be taken over by an attacker who has access to the hypervisor. Your backup server should not run on the same hardware that you want to protect.

If you can, go for a standalone backup server with built-in disks and use Windows Server 2019/2022 with a ReFS-formatted backup volume.
For the second copy, use Object Storage as immutable backup storage so your backups are protected. If you don't want to use Object Storage, you can also use Backup to Tape jobs. It is important to rotate the tapes regularly. Don't leave them in the tape drive, or an attacker can erase them.

Another option would be using a Veeam Cloud Connect Provider as a second target. With a VCC provider, your backups can also be protected by a feature called insider protection.

Thanks
Fabian

flibouille · Aug 05, 2022 8:27 am

Thanks for your answers.

So, even Veeam should be on a standalone server?

I already have an immutable cloud backup, but it needs a repo (hence my question): it's impossible (for now) to back up directly to the cloud storage.

Mildur · Aug 05, 2022 8:37 am

flibouille wrote: So, even Veeam should be on a standalone server?

If possible, then yes. It would give you better security for the backup environment.
Also, if you have issues with your hypervisor, you also lose your backup server. If your company requires short restore times, you don't want to set up a new VBR server first before restoring the data.
And a physical server most likely gives you better restore performance. In the end, backups are done so that you can restore in an emergency, so we have to think about the restore when we plan the backups.

flibouille wrote: I already have an immutable cloud backup, but it needs a repo (hence my question): it's impossible (for now) to back up directly to the cloud storage.

That's good. Having protected backups is important today.
Direct Backup/Direct Backup copy to Object Storage will be in our next version, as you probably have heard already.

Aug 05, 2022 4:51 pm

flibouille wrote: Aug 05, 2022 7:16 am This may be an odd question, but is it appropriate to use a VM as a repo (for example a ReFS disk on the Veeam VM) or is it strongly advised to have only physical machines (NAS, SAN, linux server...)?

Another good reading on a similar matter.

Aug 05, 2022 6:55 pm

I will disagree with some of the opinions here about using a VM. Yes, if an attacker gets in, they can delete the VM or VM disk files, but if you have proper security, etc., this "should" never happen. Also, buying hardware specifically for installing Veeam is not always practical. I work for an MSP, and every one of our Veeam VBK, VCC, and O365 servers are VMs. The only things we have that are physical are tape servers right now.

So is using a VM an option? Yes, and it all depends on what your infrastructure consists of and how secure it is. Is it best practice to use physical servers? I am not so sure, as the BP site for Veeam, which is written by the Veeam Architects, mentions both physical and virtual. To me, that means you can use a VM if you want to, as it will not disqualify you from support.

Mildur · Aug 05, 2022 7:19 pm

Is it best practice to use physical servers? I am not so sure, as the BP site for Veeam, which is written by the Veeam Architects, mentions both physical and virtual.

Pretty clear in the BP guide for VBR. Repositories should be physical.

For smaller environments, an all-in-one physical server is recommended (that's also in the BP guide).

https://bp.veeam.com/vbr/2_Design_Struc ... ositories/

Physical or Virtual?
In general, we recommend whenever possible to use physical machines as repositories, in order to maximize performance and have a clear separation between the production environment that needs to be protected and the backup storage. It is also recommended combining this with the proxy role if backup from Storage Snapshots is possible. This keeps overheads on the virtual environment and network to a minimum

@chris.childerhose
Can you show me the link, where the guide has a statement that we recommend virtual and physical repositories? Would be good to discuss that internally.
Thank you very much
Fabian

Aug 05, 2022 7:29 pm

For repos, it mentions physical, as you have noted (virtual was mentioned for other roles; I was looking at another part of the guide). My point was that yes, physical is recommended, but can you use VMs? Yes, you can, as there are other ways to protect them from attackers, like SAN replication. I am not trying to argue about what the BP guide says, just that, of course, everyone from Veeam will say physical. I get that, but coming from another perspective, VMs do work, and they work very well for us.

Aug 08, 2022 1:42 pm

I hate to add an extra voice to the conversation, but I think you, Chris and Fabian, both have valid points and are just looking at it from different perspectives.

From the support side, there ultimately isn't much of a difference: whether you put your Veeam infrastructure on virtual or physical boxes, it is much the same from a pure Support perspective. The biggest and _best_ thing you can do is test all aspects of your setup. Test your failover. Test your network with iPerf and make sure that the proposed limit matches the reality of connections. Walk through it entirely and try to imagine the worst of worst situations.
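To make the iPerf point concrete, here is a minimal Python sketch of the comparison I mean: take the throughput you measured by hand and check it against the limit you designed for. The function name, the 80% tolerance, and the sample figures are all assumptions for illustration, not anything Veeam or iPerf prescribes.

```python
def link_meets_expectation(measured_mbps: float,
                           expected_mbps: float,
                           tolerance: float = 0.8) -> bool:
    """Return True when measured throughput reaches the tolerated share of the expected rate.

    All inputs are illustrative; the 0.8 default tolerance is an arbitrary assumption.
    """
    if expected_mbps <= 0:
        raise ValueError("expected_mbps must be positive")
    if not 0 < tolerance <= 1:
        raise ValueError("tolerance must be in (0, 1]")
    return measured_mbps >= expected_mbps * tolerance


# Hypothetical figures for a link designed for 10,000 Mbit/s.
print(link_meets_expectation(9400, 10000))  # True
print(link_meets_expectation(4000, 10000))  # False
```

If the second case is what you see, that is the moment to dig into the network, not during a restore.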

The difference comes in how virtualization and physical differ in performance and other practical properties. I think that the Best Practices (BP) recommendations and Fabian's position are meant to avoid the "all eggs in one basket" situation that virtualized repositories give. If your main datastore gets hit, even with SAN replication, depending on "how" you got hit, this damage gets replicated to the Disaster Recovery target also (think ransomware or bit flips, etc). There is some protection for bit flips with SAN-level replication, and if you're doing LUN snapshots, you have this as a backup also, but this adds overhead to the recovery process, which is on the SAN administrator to resolve.

The failover process also may not be as transparent as some might like (this depends _heavily_ on how the failover is done), and it might require a fresh Veeam Configuration DB to avoid headaches with getting the DR storage online in Veeam. It will "work", but remember that depending on how the failover happens, Veeam won't be aware in most cases that it's a failover and might treat the backups you're trying to work with as "new", which gets interesting. Not insurmountable, but it definitely is not something you want to deal with during a disaster.

That isn't to say it's a bad idea to implement, it's definitely saved quite a few customers in many support cases to have this set up, but it needs to be understood that all storage manufacturers handle such failovers a bit differently and you need to test this so there aren't surprises when you need to use it. (And experience with many ransomware cases says that most people will have to face this eventually)

I'd also like to address physical versus virtual, as both come with drawbacks that often aren't considered.

Starting with physical repositories, the biggest drawback is obviously the additional cost -- it's an extra server to maintain, it's an extra Windows license or it's a linux box someone has to maintain, both of which have obvious costs involved. Replication of the physical repository can be more difficult, though the answer to this is typically Capacity Tier or Backup Copy to another location (the latter of which involves yet another storage box to maintain).

The main advantage though of physical is that the box is dedicated, not borrowing resources from production, and you don't have the concerns of virtualization to deal with, and as wonderful as virtualization is, the hardware and networking abstraction leads to unexpected issues and unintuitive performance bottlenecks.

Virtualization is convenient because you can much more easily create a setup which isolates the storage drives from the OS drive, and recovery is pretty simple. Virtualization has a lot of advantages for fault tolerance, but keep in mind these are not always as straightforward for DR as they might seem at first blush. The data will not be lost, but you need to understand what a failover looks like for your environment before a disaster happens. Testing is essential, and again, depending on the failover methodology used, it will not always be so straightforward.

With virtualization I usually have two main concerns: Performance and the "One Basket" scenario.

One-Basket is not as much of an issue if you're following the 3-2-1 rule and keeping multiple copies of your backup data, but budgets are often cited as the reason this isn't followed. This is understandable and I can't refute the point, but understand that this means you're 100% reliant on one (or maybe a few) servers to protect your primary data without any alternative. If your production datastore(s) go, so does your backup data. If your vCenter or ESXi boot disk goes, you're dead in the water until your hypervisor environment is up and running again. There are hacks around this (e.g., if your vCenter dies or DNS is dead, you can use hostfile hacks to make Veeam see the new vCenter and start restores right away), but of course you need to know the hacks to use them effectively.
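For anyone who wants to sanity-check their own layout against 3-2-1, here is a rough Python sketch. It uses the usual reading of the rule (at least three copies of the data, on at least two different media types, with at least one copy offsite) and counts the production data as one of the copies. The `BackupCopy` type and the sample names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class BackupCopy:
    name: str      # hypothetical label for the copy
    media: str     # e.g. "disk", "object", "tape"
    offsite: bool  # True if stored away from the primary site


def satisfies_3_2_1(copies: Iterable[BackupCopy]) -> bool:
    """Check: >= 3 copies, >= 2 distinct media types, >= 1 offsite copy."""
    copies = list(copies)
    media_types = {c.media for c in copies}
    has_offsite = any(c.offsite for c in copies)
    return len(copies) >= 3 and len(media_types) >= 2 and has_offsite


# Production data plus a single virtual repository on the same site: one basket.
one_basket = [
    BackupCopy("production", "disk", offsite=False),
    BackupCopy("vm-repo", "disk", offsite=False),
]

# Same setup with an offsite object storage copy added.
with_offsite = one_basket + [BackupCopy("cloud-copy", "object", offsite=True)]

print(satisfies_3_2_1(one_basket))    # False
print(satisfies_3_2_1(with_offsite))  # True
```

Note that this only counts copies; it says nothing about whether two of them share the same datastore or hypervisor, which is the real One-Basket risk described above.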

Performance becomes another discussion because vCPUs don't exactly work like you'd imagine with physical CPUs. That is, you can't just throw more vCPUs at a situation and expect a linear performance increase, because the hypervisor isn't allocating physical CPU resources in the same way. You should still be sizing as recommended by the Veeam User Guide, but understand that over-sizing can heavily impact the hypervisor performance in a very large and unintuitive way. VMware produced this article for Exchange a while back: https://www.vmware.com/content/dam/digi ... -guide.pdf Exchange works a bit differently than a data mover thread in Veeam does, but a lot of the concerns about reservation and over-subscribing a Virtual Machine remain true. The takeaway from this should be that even if you make a nice and chunky virtual Veeam Repository, you still might get lousy performance depending on what else the host is handling and how the repository is sized.

Networking with virtualization also gets...interesting and leads to a lot of "shoot in the foot" situations that you wouldn't expect if you aren't thinking in terms of virtualization. To give a plain example, consider a physical direct SAN proxy writing to a virtual repository. The system administrator takes careful time to set up a ton of redundancy on the repository VM, but forgets that all the data is going over a single VMkernel NIC, and thus there is only an illusion of redundancy: if that NIC fails, all of the extra work is effectively useless. I won't belabor all the other virtual networking points, but I've worked countless cases where there should have been exceptionally fast VMXNET3 networks set up, yet the VM-to-VM communication was dragging at an awful 500 KB/s or even less. In one such case, the client ultimately pursued it with VMware and, after a follow-up, just elected to go with E1000 adapters at reduced but still better performance.

Basically, the takeaway here isn't that it's impossible to have a "good" recommendation, but instead that both can work, and you must understand that there are downsides to both that need to be considered, accommodated, and, most importantly, tested. It's not as simple as just saying "make a virtual repository". My lab environment is well maintained, and with hotadd + a virtualized XFS repository, I can move lab Exchange environments to the repository in about 2 minutes for 150 GiB. I think this is pretty good. But I've also had to rebuild from scratch _many_ times after we had a datastore failure and the VM wasn't recoverable. In the lab I just groan a bit, then re-deploy a new repository and run active fulls, but that's obviously not the same as a production environment.
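To put the lab figure in perspective, 150 GiB in about 2 minutes works out to roughly 1.25 GiB/s of processed data. A short Python sketch of the arithmetic follows; the function name is just for illustration, and the result is the rate of source data processed, which is not necessarily what crosses the wire once compression is involved.

```python
GIB = 2 ** 30  # bytes per GiB


def processing_rate(gib_processed: float, seconds: float) -> tuple[float, float]:
    """Return (GiB/s, Gbit/s) for a job that processed `gib_processed` GiB in `seconds`."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    bytes_per_second = gib_processed * GIB / seconds
    return bytes_per_second / GIB, bytes_per_second * 8 / 1e9


gib_s, gbit_s = processing_rate(150, 120)
print(f"{gib_s:.2f} GiB/s, {gbit_s:.2f} Gbit/s")  # 1.25 GiB/s, 10.74 Gbit/s
```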

Personally, I'd lean towards a physical box for the actual storage just to avoid loading the virtual environment. All modern hypervisors can make some amazing use of their physical CPUs to allow for amazing density of workloads, but it needs to be understood there is no magic when it comes to what makes backups as space efficient and fast as they are; it requires power and resources, and often more than people realize and are ready for.

In my opinion (not official Veeam recommendation), if you're going to go the virtualization route, keep it simple. XFS repository (preferable) with copies to another location. Don't get fancy with the disk layouts, just go with "dumb" volumes separate from the OS volume so recovery is fast and simple. If you go with Windows, again keep it simple with ReFS volumes. If you have the space for mirror-parity on either XFS or ReFS, do it and keep the mirror on another datastore. If you virtualize, there is no reason to suffer with network shares (NFS/SMB). Add the repositories as Windows/Linux and get the full advantage of the data mover on the repository.

If you're going with physical, I would still strongly recommend avoiding SMB/NFS for the same reasons. Isolate the physical server from everything else; only the required Veeam infrastructure should be able to talk to it over the necessary ports. This might mean some inconvenience for maintenance, as you have to physically walk over to the box and look at a real monitor, but it simplifies the securing aspect a ton.

Aug 08, 2022 1:47 pm

Thanks for this feedback @david.domask. As noted, I approached it from a different angle: being an MSP with a ton of horsepower in the virtual realm is why we do this. I really like how you laid this out to compare and contrast both. In the end, it is all up to the user, and knowing the right way to do things is what makes the difference.

Aug 08, 2022 8:52 pm

I think we need to be careful not to conflate a "virtual backup infrastructure" with a "virtual backup infrastructure sitting on the same hardware as your production infrastructure". The latter is always a bad idea.

I use a combination of physical and virtual concepts for our backup infrastructure. Like others have said, each setup will have its pros and cons, but this is what we do for most of our customers in the SMB space.

1. We use a dedicated backup server. This is a non-negotiable. We do not run backup infrastructure, except proxies, on the production servers. If the customer won't even buy that, they aren't a customer of ours for very long.
a. Sometimes we'll have 2 dedicated servers - one for backups, and another to hold replicas of the most critical VM infrastructure or for backup copies.

2. We install ESXi on our Veeam servers and virtualize our components. We do NOT add the hosts to the production vCenter, they stay as standalone ESXi hosts. I also like to use raw device mapping for our VM so that it writes straight to the RAID group, bypassing VMware / VMDK, helping performance and reducing VMware storage stack issue potential.
Pros of a virtual infrastructure:
-Easier to troubleshoot the VMs if there's an issue.
-You can run SureBackup on the same host. I find you get better performance this way since your backup files stay local to the host and aren't traversing the network to run. Faster SureBackups is a plus.
-Instant VM restore to an "isolated" ESXi host out of production servers if needed in a pinch.
-Easy to migrate the OS to new hardware.
-You can replicate your Veeam C:\ to another host for DR (for your DR!). Yes, I know that there are tons of other ways to protect your Veeam servers, but I like having this trick in my bag. Example - we have one ESXi host that runs the backup server and a repo, and another ESXi host that runs a second repo only for backup copies. We replicate the Veeam C:\ to the second ESXi host, and on more than one occasion, we've had to turn it on due to an issue with our main backup host, and we can access our secondary repo within minutes for restore. Pretty handy. We can also run new backups temporarily here until our main backup server is fixed.
-A virtualized ESXi host allows you to put maybe one or two other small VMs on that host for emergencies, such as an additional domain controller so that you always have a domain, just in case production goes down. Heck, you could even put your vCenter, or a replicated copy of it there for quicker restores.

Cons of a virtual infrastructure:
-More to update, secure, and harden. In addition to your Windows and Linux operating systems, now you also have to update your ESXi host. Introducing additional software and operating systems inherently introduces additional security vulnerabilities.
-Performance can be hit or miss when you get into huge infrastructures / datasets. We have less than 100TB infrastructures, and almost everything is 10GB, so we aren't on the bleeding edge where we may see some of these weird performance issues.
-A hypervisor inherently opens you up to more bugs and problems. I'd say 50% of my Veeam server outages are due to VMware ESXi taking a dump for some reason. You can't have a PSOD on VMware if you don't have VMware on your server.

3. Harden, harden, harden. We have an entire in-house procedure for this. Veeam components aren't on the domain or joined to the production vCenter (they may have a dedicated vCenter, but never the same one as production), and we use MFA, network isolation, etc.

TL;DR: We PHYSICALLY separate our infrastructure from production, but use VIRTUALIZATION technology where it makes sense to give us additional flexibility and options. Is it perfect? No. However, we think it's the best combination of functionality, security, and performance if done correctly.
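The separation rules in this post (no backup components on production hardware except proxies, and no backup hosts in the production vCenter) can be sketched as a simple inventory check. This is only an illustration in Python; the `Component` type, the role names, and every hostname are made up, and a real audit would pull this data from your own inventory.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Component:
    name: str
    role: str               # "production", "backup", or "proxy" (hypothetical labels)
    physical_host: str      # the physical server the component runs on
    vcenter: Optional[str]  # None for a standalone ESXi host


def separation_violations(components: Iterable[Component]) -> List[str]:
    """List backup components that share hardware or a vCenter with production.

    Components with role "proxy" are skipped, mirroring the exception in the post.
    """
    components = list(components)
    production = [c for c in components if c.role == "production"]
    prod_hosts = {c.physical_host for c in production}
    prod_vcenters = {c.vcenter for c in production if c.vcenter is not None}

    problems = []
    for c in components:
        if c.role != "backup":
            continue
        if c.physical_host in prod_hosts:
            problems.append(f"{c.name}: shares physical host {c.physical_host} with production")
        if c.vcenter is not None and c.vcenter in prod_vcenters:
            problems.append(f"{c.name}: managed by production vCenter {c.vcenter}")
    return problems


inventory = [
    Component("app-vm", "production", "esx-prod-01", "vc-prod"),
    Component("veeam-proxy", "proxy", "esx-prod-01", "vc-prod"),
    Component("veeam-vbr", "backup", "esx-backup-01", None),
    Component("veeam-repo", "backup", "esx-prod-01", "vc-prod"),
]

for problem in separation_violations(inventory):
    print(problem)
```

With this made-up inventory, only `veeam-repo` is flagged, once for the shared host and once for the shared vCenter.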

Aug 09, 2022 4:29 pm

If I have a virtual server, I use a NAS as my backing storage and connect to it either as an SMB share or, preferably, as an iSCSI volume attached as an RDM disk. Deleting the VM will not delete the contents of the volume. But I'm also not a huge fan of using a NAS for the backing storage. So my better thought would be to use a SAN, even an old SAN, as the backing storage and still present the iSCSI volumes (or SAS in the case of a DAS) to the VM as an RDM disk. Ultimately, I prefer a physical server for the repository using local storage, a battery-backed RAID card, etc.

I don't mind using a NAS at a recovery site for a copy job target if needed. In fact, that's what I'm going to be doing for a client. I have a client that is currently using a QNAP NAS as their primary storage target, and as soon as their new SAN gets delivered, we'll repurpose the NAS as a copy job target at their recovery site. The downside is that the repository server at the recovery site will be a VM, which isn't ideal, but it is much better than what they had. We've already placed a Dell PowerEdge R540 as the local/primary storage repository. I haven't decided where I'm putting the backup server, but it will more than likely be relocated as a VM on the host at the recovery site. I could repurpose an old server running Linux to act as a hardened repository at the recovery site, which isn't a terrible idea. It just wasn't part of the original plan, as I had planned on the Linux repo server being virtual. But, as previously noted, if someone were to breach the vCenter or host, they could get into the repo VM at the console and in theory delete the data from there one way or another.
