mkaec
Veteran
Posts: 465
Liked: 136 times
Joined: Jul 16, 2015 1:31 pm
Full Name: Marc K
Contact:

Re: feature request: split vbk

Post by mkaec »

When backing up to a deduplicating appliance, the device doesn't start deduplicating until the file is unlocked. If you back up a 12 TB file server, the deduplicating appliance will sit there doing nothing until the job is complete. If Veeam were splitting the VBK files, the device could work on them as the job progresses.
Gostev
Chief Product Officer
Posts: 31807
Liked: 7300 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: feature request: split vbk

Post by Gostev »

Marc, I got curious: what specific appliance are you talking about? Among the market leaders, I'm only aware of two types:
1. Dell EMC Data Domain and HPE StoreOnce do inline deduplication: data is deduped before it lands on disks in the dedupe appliance.
2. ExaGrid does post-process deduplication: data is deduped after spending a few days in the landing zone, so latest backups are never touched.

Re: feature request: split vbk

Post by mkaec »

ExaGrid. It does do post-process deduplication, but it's not lazy about it. I'm not aware of the specific logic, but there is a balance it applies between making sure there are resources to land new files, and deduplicating files that have already landed. After it's done deduplicating, it leaves the original data sitting in the landing zone, which essentially serves as a cache.

I have a 6 TB VM that is backed up by Veeam. As it's going, I see the usage in the landing zone steadily increase. Then after the Veeam job completes, I see the usage slowly decrease. (The ExaGrid UI only shows the landing zone as used when it contains data that hasn't been deduplicated yet.) Contrast that with an older application that splits files into 20 GB chunks. I have a 2 TB job with that application. I can run an ExaGrid deduplication report before the job is complete and see what kind of ratios were achieved with the files landed earlier in the job.

Re: feature request: split vbk

Post by Gostev »

I see. Our joint (Veeam+ExaGrid) recommendation has been to size the landing zone so that latest backups remain undeduped and thus available for fast restores, so my comment was based on this best practice rather than anything else.
stoyo
Lurker
Posts: 1
Liked: never
Joined: Aug 16, 2019 8:38 pm
Full Name: S. Toyo
Contact:

[MERGED] Have Veeam Backup & Replication break up backup into smaller chunks

Post by stoyo »

We need a way to set Veeam Backup & Replication to break the backup file into smaller chunks. We are thinking of something like 1-2 TB chunks. Right now it's getting so large that we’re only able to store it on extremely large disks. Is this even possible? Anyone?

MOD EDIT: reduced to normal size
HannesK
Product Manager
Posts: 14840
Liked: 3086 times
Joined: Sep 01, 2014 11:46 am
Full Name: Hannes Kasparick
Location: Austria
Contact:

Re: Have Veeam Backup & Replication break up backup into smaller chunks

Post by HannesK »

Hello,
and welcome to the forums.

The backup files are 1 file per VM per backup run. There is no supported way to split them.

In case you did not activate per-vm backup files yet, the following checkbox might solve your issues: https://helpcenter.veeam.com/docs/backu ... ml?ver=100

I'm not sure what you mean by "extremely large disks", but today I would go for at least 64 TB, and with ReFS up to 300-500 TB per volume, to minimize management overhead.

Best regards,
Hannes
Akkim
Novice
Posts: 3
Liked: never
Joined: Aug 21, 2020 2:01 pm
Full Name: Micah Imparato
Contact:

Re: feature request: split vbk

Post by Akkim »

For years the request has remained the same:

Please, please, please, please PLEASE allow us to split/chunk VEEAM backup content. This thread itself goes back years and the problem remains the same as it always has been - the ability to get VEEAM to be used more commonly boils down to being able to make the backups more portable.

As VM sizes grow, the ability to off-load them to USB for off-site copies remains a huge element people want (not everyone goes "yay, let's dump this all in S3 and hope for the best").

This is undoubtedly one of the most consistently and most visibly asked for items from the User Community (perhaps real Deduplication is second) but always seems to go ignored.

If you can respond to a thread that started more than 4 years ago.. you can recognize the importance of this to the userbase and push it up the list.

--Micah

Re: feature request: split vbk

Post by mkaec » 1 person likes this post

Who says it's on the list?

Re: feature request: split vbk

Post by Akkim »

mkaec wrote: Who says it's on the list?
foggy wrote: Nov 18, 2019 12:46 pm Hi Jeff, so far it didn't get high enough in the priority list.
Good enough?

--Micah

Re: feature request: split vbk

Post by mkaec »

Yes. He even called it a priority list.
soncscy
Veteran
Posts: 643
Liked: 312 times
Joined: Aug 04, 2019 2:57 pm
Full Name: Harvey
Contact:

Re: feature request: split vbk

Post by soncscy » 1 person likes this post

Isn't that a bit pedantic? I wouldn't hold people to such a statement; priorities change.

I'm sorry, call me privileged, but if the data is that important to you, pay to protect it.

Splitting, imho, is not a great answer, as it complicates the risk instead of reducing it. Now instead of one device to worry about bitrot or hardware failures on, I have 1+N devices I need to ensure are well kept in a client's environment, and they don't have a built-in redundancy system. Copying the files also isn't really a valid answer, as now I need to track these copies and worry about N + (N × B) devices, where B is the number of copies kept per backup drive, all of which are completely unaware of each other. The idea is disastrous from my perspective, and it's not just about mild convenience for me, it's about feasibly monitoring these devices.

Testing the backups becomes that much more difficult (I cannot even imagine how things like Surebackup or Instant Recovery work like this...), and all this has done is allow my clients to save a few $$$ by splitting the file while increasing the risk factor of hardware failure N times.
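
To put a rough number on the "1+N devices" concern above: if a split backup is only restorable when every drive holding a part survives, the chance of losing it grows with the number of drives. The following is a minimal Python sketch, not anything from Veeam. It assumes drives fail independently with the same probability, and the 0.02 figure is a made-up placeholder rather than a measured failure rate. The function name `loss_probability` is hypothetical.

```python
def loss_probability(per_drive_failure: float, drives: int) -> float:
    """Probability that at least one of `drives` independent drives fails.

    Assumes a split backup needs every drive intact, and that drives fail
    independently with the same probability (both are simplifications).
    """
    if not 0.0 <= per_drive_failure <= 1.0:
        raise ValueError("per_drive_failure must be between 0 and 1")
    if drives < 1:
        raise ValueError("drives must be at least 1")
    return 1.0 - (1.0 - per_drive_failure) ** drives


if __name__ == "__main__":
    # 0.02 is an illustrative placeholder, not a measured failure rate.
    for n in (1, 2, 4, 8):
        print(n, round(loss_probability(0.02, n), 4))
```

For small per-drive probabilities the result is close to N times the single-drive figure, which is in line with the "N times" estimate in the post. Under this model the risk is never literally N times higher, only approximately so.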

I have this conversation with clients a lot and while I get the idea and the desire, I have never really heard a compelling reason besides saving $$$. Rotation is not a real deal breaker as you can build such rotating arrays pretty easily and have the data redundancy.

I'm not saying there is no merit to the idea, but I think it bears scrutiny as to what recovering from such backups really entails; having "successful" backups means nothing if your disaster plan means that a single USB malfunction or a tiny little bit of bitrot can tank your entire backup.

I know times are tough and budgets are tight, but from my perspective, splitting is just investing in higher-risk, low-reward backups. I personally hope it never becomes a thing.
Seve CH
Enthusiast
Posts: 89
Liked: 35 times
Joined: May 09, 2016 2:34 pm
Full Name: JM Severino
Location: Switzerland
Contact:

Re: feature request: split vbk

Post by Seve CH »

soncscy, I think that you won the prize for "The most pedantic answer on Veeam Forum" with your answer.

You have your own problems. We have others. That simple.

My offsite/offline backup can't be done on common USB disks because it doesn't fit. I am talking about the third backup of the same data:
1. Online/Onsite backup
2. Online/Offsite backup (copy job)
3. Offline/Offsite "air-gapped" backup (USB Disks via copy job) in a safe, far from hackers and ransomware.

If I wanted to use some third-party file storage, I would be limited by file size (it won't work, or it would be so painfully slow that it won't matter whether it is supported or not).
If I wanted to do fast operations, it would take a while, because we are talking about 1 VM = 1 file = 1 thread for copy jobs.

So it doesn't have anything to do with Surebackup, data at risk and so on. It is better to have a backup than no backup. That simple. I refuse to buy a tape library.

If you do not want file splitting because your specific business case doesn't allow for it, it is easy: do not check the checkbox "allow file splitting / per disk backup file / etc..".

Now I'm lucky: in my new company (subject to FDA's GxP by the way, well that is not being lucky ;-)), I no longer have huge VMs (10 TB is the biggest). In my former one, 30 TB VMs full of incompressible data (medical imaging) were not an exception.

My needs for my feature request from 2017 have changed a bit, but the underlying reasons for it are still valid. As somebody else said: I suspect that this is not being implemented because of some architectural limitations in Veeam (they might need to re-engineer the whole VM processing system).

I am still happy with Veeam. Veeam has other functionality that compensates for this (we are buying 30 additional Veeam Core Ent+ licenses soon). Sadly, this function is more on the technical/operations side than the "bling-bling" shiny functions that may appeal to upper management on a specification sheet, so I do not expect any solution soon (if ever).

Best regards.
crackocain
Service Provider
Posts: 248
Liked: 28 times
Joined: Dec 14, 2015 8:20 pm
Full Name: Mehmet Istanbullu
Location: Türkiye
Contact:

Re: feature request: split vbk

Post by crackocain »

I think this feature is critical for file integrity.

As you know, proxies process virtual disks. So if every virtual disk were written to an independent VBK file, the dehydration process would go faster and there would be less file modification. Splitting the VBK therefore distributes the risk.
VMCA v12

Re: feature request: split vbk

Post by Akkim »

The reality is - when you have LARGE systems being backed up, your options for getting them onto removable devices for off-site storage become a challenge. Given that VEEAM doesn't really allow you to be overly selective about application-aware backups, the larger your Exchange/SQL environments get, the larger your VBK files are. If we could specify which databases were backed up in Exchange, that would also help - but that's a less-expected feature.

If we're not able to split the size of the VBK files up within the VEEAM Engine itself (i.e., create multi-part VBK files, splitting each file into a max of X size) - we would need some other mechanism to handle getting this content off-loaded. I'm certainly not here to say that it is without risks, but it is most definitely an expectation for most of my clients to be able to take all of their content on a removable device.

While external USB drives are getting larger, the size of a single drive is the ultimate gate here. Sure there's 14 TB drives out there - but I have a few customers who are insane email hoarders and a full backup of their Exchange environment (the passive copy of their databases is on a DAG member that gets backed up to minimize impact) is almost 10 TB itself.

If I could deal with size limits and multi-part VBK files, it would be exceptionally easy to balance and fill up USB devices to get them off-site (accepting and knowing the risks of multiple devices, etc.. similar to what happens with jobs spanning across tapes).
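
The "balance and fill up USB devices" idea above is essentially a bin-packing problem. The following is a minimal Python sketch of one common heuristic, first-fit decreasing, under the assumption that multi-part files of known size existed (Veeam does not produce them today). The function name `pack_parts` and the sizes in the example are hypothetical.

```python
def pack_parts(part_sizes_gb, drive_capacity_gb):
    """Assign backup parts to drives using first-fit decreasing.

    Returns a list of drives, each a list of the part sizes (GB) placed on it.
    """
    drives = []  # parts placed on each drive
    free = []    # remaining capacity (GB) of each drive
    for size in sorted(part_sizes_gb, reverse=True):
        if size > drive_capacity_gb:
            raise ValueError(f"part of {size} GB exceeds drive capacity")
        for i, remaining in enumerate(free):
            if size <= remaining:
                # The part fits on a drive that is already in use.
                drives[i].append(size)
                free[i] -= size
                break
        else:
            # No existing drive has room, so start a new one.
            drives.append([size])
            free.append(drive_capacity_gb - size)
    return drives


if __name__ == "__main__":
    # Hypothetical part sizes on hypothetical 10 GB drives.
    print(pack_parts([6, 5, 4, 3, 2], 10))
```

First-fit decreasing does not always find the minimum number of drives, but it is simple and usually close. Note that a single part larger than one drive still cannot be placed, which is the limit the post describes for today's single-file backups.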

It shouldn't be hard to implement something like this, it shouldn't really take years of development to solve it.. and it really should be fairly up there on the priority list given that it's been kicked around in this thread alone for years.

--Micah
vmtech123
Veeam Legend
Posts: 251
Liked: 136 times
Joined: Mar 28, 2019 2:01 pm
Full Name: SP
Contact:

Re: feature request: split vbk

Post by vmtech123 » 3 people like this post

Tape is great. I can hold entire jobs, or HUGE VMs, on a single tape. I can even bring that tape to another location, import it and away I go.

Saying you REFUSE to buy tape but insisting that someone add a feature is interesting. Why do you refuse to use tape? It's quite common among large enterprise customers. The cost savings are good, and you can replicate to multiple sites, move to a vault, etc.

Re: feature request: split vbk

Post by Seve CH »

I do not see why you find it interesting that I dislike tape yet ask for a different, unrelated feature.

IMHO, tape is dead except for some corner use cases. Even CERN is moving away from tape and these guys understand a bit about storage.
https://indico.cern.ch/event/862873/con ... STI_v2.pdf

I do not want to turn this thread into a tape vs. HDs one. You may want tape, I do not. One of the main advantages of HDs is flexibility:

With USB disks, you only need a laptop (or better, a server with 10G/40Gbit NICs), an internet connection to download the Veeam setup, and the encryption keys, and in a few minutes you start to be back in business via VM instant recovery.

But... your backup files must be small enough to fit onto USB disks, hence the feature request ;-)

Regards.

Re: feature request: split vbk

Post by mkaec »

The other advantage split VBKs would bring to an ExaGrid environment is to allow replication to start sooner. If deduplication is delayed until the end of the job, then replication won't start until after deduplication finishes, which itself will take a while because deduplicating several TB takes a long time. If the VBKs were split into 20 GB segments, many of the files would be deduplicated and replicated before the job finished.
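
The timing argument above can be illustrated with a toy three-stage pipeline (backup, deduplicate, replicate). This is a minimal Python sketch, not a model of ExaGrid's real scheduler, whose logic is not described in this thread. It assumes each stage handles one chunk at a time, that stages do not compete for resources, and that chunks are equal in size. The function name `finish_time` and the hour figures are hypothetical.

```python
def finish_time(chunks: int, backup_h: float, dedupe_h: float, replicate_h: float) -> float:
    """Hours until the last chunk is replicated in a three-stage pipeline.

    A chunk enters a stage only after it has left the previous stage and the
    stage has finished the previous chunk. The total work per stage is fixed;
    splitting into more chunks only changes how much the stages overlap.
    """
    if chunks < 1:
        raise ValueError("chunks must be at least 1")
    stage_hours = [backup_h / chunks, dedupe_h / chunks, replicate_h / chunks]
    done = [0.0, 0.0, 0.0]  # time at which each stage finished its latest chunk
    for _ in range(chunks):
        ready = 0.0  # time at which the current chunk is available to the next stage
        for stage, hours in enumerate(stage_hours):
            start = max(ready, done[stage])
            done[stage] = start + hours
            ready = done[stage]
    return done[-1]


if __name__ == "__main__":
    # Hypothetical totals: 6 h backup, 3 h deduplication, 3 h replication.
    print(finish_time(1, 6, 3, 3))  # one monolithic file: stages run back to back
    print(finish_time(3, 6, 3, 3))  # three chunks: stages overlap
```

With these placeholder numbers the monolithic case takes 12 hours and the three-chunk case 8 hours, because deduplication and replication of early chunks overlap with the rest of the backup. Real gains would depend on how the appliance actually balances landing new data against deduplicating landed data.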
jcamping
Lurker
Posts: 1
Liked: never
Joined: Mar 23, 2016 6:31 am
Full Name: James Camping
Contact:

Re: feature request: split vbk

Post by jcamping »

One option I would like is splitting the job by disks on the VM so I can get around filesystem limitations. Our large VMs create problematic backup files for our storage.


Thanks,
-James
PetrM
Veeam Software
Posts: 3626
Liked: 608 times
Joined: Aug 28, 2013 8:23 am
Full Name: Petr Makarov
Location: Prague, Czech Republic
Contact:

Re: feature request: split vbk

Post by PetrM »

James, what about creating jobs per disk by excluding the disks that are not supposed to be processed?

Thanks!

Re: feature request: split vbk

Post by soncscy » 1 person likes this post

Seve CH wrote: Aug 24, 2020 6:19 am soncscy, I think that you won the prize for "The most pedantic answer on Veeam Forum" with your answer.

You have your own problems. We have others. That simple.
Somehow I missed this from back in August, but I think you fundamentally misunderstand my post. The "pedantic" part was in response to the person wondering where the feature was if a single PM mentioned it was being considered. Nothing more, nothing less. But you raise some common complaints/comments I've had to address, so sure, lemme take a shot.
Seve CH wrote: Aug 24, 2020 6:19 am My offsite/offline backup can't be done on common USB disks because it doesn't fit. I am talking about the third backup of the same data:
1. Online/Onsite backup
2. Online/Offsite backup (copy job)
3. Offline/Offsite "air-gapped" backup (USB Disks via copy job) in a safe, far from hackers and ransomware.
What stops you from just unplugging set 3 when it's a proper storage system? You can achieve the same net-effect of rotated drives by just investing a few $$$ in a physical kill switch. You can just look on whatever local tech retailer you have for network kill switch, and get the exact same effect, without having to deal with split files.
Seve CH wrote: Aug 24, 2020 6:19 am If I wanted to use some third-party file storage, I would be limited by file size (it won't work, or it would be so painfully slow that it won't matter whether it is supported or not).
If I wanted to do fast operations, it would take a while, because we are talking about 1 VM = 1 file = 1 thread for copy jobs.
Not really sure what this is in reference to as it's not something I mentioned.
Seve CH wrote: Aug 24, 2020 6:19 am So it doesn't have anything to do with Surebackup, data at risk and so on. It is better to have a backup than no backup. That simple. I refuse to buy a tape library.
Nowhere in my post do I mention tape. And I think your statement "it's better to have a backup than no backup" is misguided. A corrupt backup file is just wasted space, and USB drives just aren't built to handle the kind of workloads that backup applications send. The risk factor is distributed much more evenly with tiny files; lose a few sectors, and meh, you lost a few spreadsheets. Maybe. But with a compressed/deduped image level backup file, you lose even a few sectors, you potentially tank the entire backup. And when you add in multiple hardware devices that have no means of mirroring or redundancy, what's the point of such a backup? You can't validate it, you can't test it, it's inherently more at risk than any other backup you have by nature of the connection. A single drive goes, and your entire backup goes.
Seve CH wrote: Aug 24, 2020 6:19 am If you do not want file splitting because your specific business case doesn't allow for it, it is easy: do not check the checkbox "allow file splitting / per disk backup file / etc..".
Doesn't work that way with clients -- give people an option, they will use it. Tell them it's unsupported, they'll still use it. Tell them it's the worst decision they can possibly make, they'll still use it. As long as the number on the budget sheet is smaller than the alternative, they will use it.

This is of course my opinion and position, and it's based off of years of experience watching the faces of clients when their "last resort" USB drive wouldn't even light up the LEDs when plugging it in, never mind the countless drives that hit sudden drive errors as soon as we tried to read from it from any number of machines.
Seve CH wrote: Aug 24, 2020 6:19 am Now I'm lucky: in my new company (subject to FDA's GxP by the way, well that is not being lucky ;-)), I no longer have huge VMs (10 TB is the biggest). In my former one, 30 TB VMs full of incompressible data (medical imaging) were not an exception.

My needs for my feature request from 2017 have changed a bit, but the underlying reasons for it are still valid. As somebody else said: I suspect that this is not being implemented because of some architectural limitations in Veeam (they might need to re-engineer the whole VM processing system).
I'm not really sure I get your point on the first part -- 30 TB is not even that big, it's chunky. For anything hosting a DB, this is a normal size. My statement from above still stands -- if it's important data, pay for it. If the cost of being without that data is greater than the cost of storage for it, then the storage cost is a good investment. If you're truly in such a position that you must split it out, then just download 7zip and do a store and split. You get the exact same effect (of course with a temporary local storage cost).
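
For readers unsure what a "store and split" amounts to, the idea is a plain byte-level split with no compression, followed by a rejoin before restore. The following is a minimal Python sketch of that idea only. It does not produce 7-Zip's archive format and is not a Veeam feature. The names `split_file` and `join_files` and the `.001`-style suffixes are hypothetical.

```python
from pathlib import Path


def split_file(source: Path, part_size: int, buffer_size: int = 1024 * 1024) -> list[Path]:
    """Split `source` into numbered parts of at most `part_size` bytes each."""
    if part_size < 1:
        raise ValueError("part_size must be at least 1 byte")
    parts = []
    with source.open("rb") as src:
        index = 1
        while True:
            written = 0
            part = source.with_name(f"{source.name}.{index:03d}")
            with part.open("wb") as dst:
                # Copy in small buffers so a multi-TB file never sits in memory.
                while written < part_size:
                    block = src.read(min(buffer_size, part_size - written))
                    if not block:
                        break
                    dst.write(block)
                    written += len(block)
            if written == 0:
                part.unlink()  # nothing was left to write; drop the empty part
                break
            parts.append(part)
            index += 1
    return parts


def join_files(parts: list[Path], target: Path, buffer_size: int = 1024 * 1024) -> None:
    """Concatenate `parts` in the given order into `target`."""
    with target.open("wb") as dst:
        for part in parts:
            with part.open("rb") as src:
                while block := src.read(buffer_size):
                    dst.write(block)
```

As the post notes, this needs temporary space for the parts, and the rejoined file must be complete before Veeam can use it. Losing any one part makes the whole file unusable.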

Re: feature request: split vbk

Post by mkaec »

PetrM wrote: Oct 06, 2020 10:28 pm James, what about creating jobs per disk by excluding the disks that are not supposed to be processed?
That strategy has some issues. It breaks Enterprise Manager Searches. And I don't think Instant Restore would work.

Re: feature request: split vbk

Post by HannesK » 5 people like this post

jcamping wrote: Our large VMs create problematic backup files for our storage.
It's the year 2020 and even good old NTFS goes beyond 16 TB per file (I know that there is crappy storage still out there with the 16 TB LUN limit). VMs with 30 TB and larger usually do not "appear out of nowhere". Backup repositories need to be planned according to source requirements.

In the end, there are two file systems today that are good for repositories: ReFS and XFS. No relevant limits with them.

To be realistic: designing the repository following the best practices will be more successful than waiting for the software to change the backup format.

Re: feature request: split vbk

Post by mkaec » 2 people like this post

The other realistic viewpoint is to choose a different backup software if this is a breaking item. It's always important to buy software based on what features currently exist versus what could happen in the future.
yasuda
Enthusiast
Posts: 64
Liked: 10 times
Joined: May 15, 2014 3:29 pm
Full Name: Peter Yasuda
Contact:

Re: feature request: split vbk

Post by yasuda »

Seve CH wrote: Aug 31, 2020 9:57 am With USB disks, you only need a laptop (or better, a server with 10G/40Gbit NICs), an internet connection to download the Veeam setup, and the encryption keys, and in a few minutes you start to be back in business via VM instant recovery.
Trying to wrap my head around this. Instant recovery of a 10+ TB VBK from a laptop over USB? Would that work? I've never tried it, never even considered it, but if it works in any useful manner, I am impressed.

So if you had your split VBKs, you'd plug both into your laptop, and expect VIR to work?

Would this work for you? Plug 2 or more USB HDs into your backup server, use Storage Spaces to make a single big Simple Resiliency volume, and copy your VBK to that?

py

Re: feature request: split vbk

Post by Seve CH » 2 people like this post

Hi Yasuda,

With VM Instant Recovery, it doesn't matter how big the disks are: you are booting from backup (you will need to do storage motion later). And you do not require the old Veeam server: just download it from the web site, use a demo license and an ESX host by IP + root user, and start booting stuff.

In my company, we tried to restore important VMs from the USB disk at the same time. It is part of our DR procedures. We were able to boot 20-30 VMs from 24 TB of Veeam backup files and were able to read data from our GxP systems, extract Solidworks models, etc. So we would be working in a degraded state, but we would be able to recover part of the production and client support in 3-4 hours.

In my former company, we had 60-70TB of Veeam backups on USB disks (several) and we were able to identify our patients, use the laboratory system and recover the ICU Care and Emergency Room (slow, but functional) in less than 3 hours.

Important to note:
You will need lots of RAM on your "veeam server" (our laptop had 16GB, in the hospital we had a dedicated rack server for that)
You will need plenty of temporary write cache space (or redirect writes somewhere)
You will be limited on the IOPS that the USB disk can provide (we use Western Digital my Duo, that is RAID0 of 2x 3.5" HDs)
And you will need an ESX host to start the VMs (in a ransomware attack, you already have them)

Regards
pkelly_sts
Veteran
Posts: 600
Liked: 66 times
Joined: Jun 13, 2013 10:08 am
Full Name: Paul Kelly
Contact:

Re: feature request: split vbk

Post by pkelly_sts » 1 person likes this post

I have to say, to completely exclude tape, which is fundamentally designed to provide the ability to shift data off-site in manageable chunks, seems short-sighted. Just because the wheel has been around for thousands of years doesn't mean its use case has been outgrown. Tape absolutely still has a place for most businesses as part of a comprehensive total solution.
LeighdePaor
Service Provider
Posts: 4
Liked: never
Joined: Jun 05, 2018 1:30 pm
Full Name: Leigh de Paor
Contact:

Re: feature request: split vbk

Post by LeighdePaor »

You could consider setting up a virtual tape library (VTL) as the Veeam destination and the VTL could have the deduped storage as destination - use a tape format with a size that suits your dedup environment.
E.g., a VTL like StarWind using LTO-2 emulation would generate 100-200 GB files on the dedup landing zone while taking your entire server backup.
It does add a layer of complexity but if storage cost is your number 1 priority this could help reduce the constraints.
If cost is a big barrier to a POC maybe have a look at Quadstor VTL - https://www.quadstor.com/virtual-tape-library.html
robnicholsonmalt
Expert
Posts: 135
Liked: 24 times
Joined: Dec 21, 2018 11:42 am
Full Name: Rob Nicholson
Contact:

Re: feature request: split vbk

Post by robnicholsonmalt »

So here we are a year later - did this request just fade away? It's not just a problem for Enterprise customers. I use Veeam on my own home development server and I've run out of disk space for the Veeam backup (4 TB). So I'm going to upgrade the disk enclosure - I'm currently copying the files to an 8 TB USB drive and there is a single 2 TB VBK to copy. I'm using robocopy, which might, for whatever reason, stop at the last byte. It would help if this VBK was made up of smaller chunks. In addition, because I don't trust USB, I'm going to use Get-FileHash to check source and target. That would be more manageable if the chunks were smaller.

This is an old long thread and I admit I've not read it all. Was there a technical reason put forward why smaller chunks are a bad idea? Zip seems to have coped with it for many years.
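
Even with a single large file, the source and target can be compared region by region instead of with one whole-file hash. The following is a minimal Python sketch of that approach using SHA-256 from the standard `hashlib` module. It is a stand-in for the Get-FileHash check mentioned above, not a replacement for it. The function names `chunk_hashes` and `mismatched_chunks` and the 64 MiB default are hypothetical choices.

```python
import hashlib
from pathlib import Path


def chunk_hashes(path: Path, chunk_size: int = 64 * 1024 * 1024) -> list[str]:
    """Return one SHA-256 hex digest per `chunk_size` slice of the file."""
    hashes = []
    with path.open("rb") as handle:
        while chunk := handle.read(chunk_size):
            hashes.append(hashlib.sha256(chunk).hexdigest())
    return hashes


def mismatched_chunks(source: Path, target: Path, chunk_size: int = 64 * 1024 * 1024) -> list[int]:
    """Return the indexes of chunks that differ or exist in only one file."""
    src = chunk_hashes(source, chunk_size)
    dst = chunk_hashes(target, chunk_size)
    longest = max(len(src), len(dst))
    return [
        i for i in range(longest)
        if i >= len(src) or i >= len(dst) or src[i] != dst[i]
    ]
```

An empty result means the copy matches. A non-empty result shows which regions to re-copy, so a failure near the end of a 2 TB transfer does not force hashing or copying everything again.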

Re: feature request: split vbk

Post by HannesK »

It depends on what you mean by "fade away" :-) We are aware of that request, but that change requires a lot of re-design in the core architecture. So there are no changes planned in the near future.

Side note: with object storage, there are many small objects instead of large backup files. And we see object storage as "the future".

[MERGED] Feature Request

Post by vmtech123 »

Split Veeam job into several files.

I don't know if this has been asked, or why it's not something brought up more.

Could we have the option to split a job into multiple backup files? Would this allow us more parallelization? For example, a 30 TB file server would become three 10 TB files. Would it also use 3 tape drives at once when backing up to tape, rather than using 1 drive and filling 3 tapes in a row?