-
- Novice
- Posts: 5
- Liked: 4 times
- Joined: Jun 21, 2017 7:37 am
- Full Name: Lars Jorgensen
- Contact:
[Feature Request] Better handling of large NAS backups
Hi
We are currently in the process of switching from CommVault to Veeam and really enjoying the whole "it just works" thing while backing up virtual machines, SQL and Exchange servers. Restoring is also a breeze.
In my continuing quest to abolish CommVault within our organisation, I'm currently working hard to migrate the NAS backups to Veeam using File to Tape jobs via SMB shares. This has much less of the "just works" feel about it; it's more like uncharted territory still being forged with machetes, under the constant dread of weird bugs.
We have a Hitachi HNAS with about 400 TB of data distributed across two EVSs (virtual file servers), each with four major share nodes. It looks a bit like this:
\\nas-01\cifs01$\Data
\\nas-01\cifs01$\User
\\nas-01\cifs01$\Shared
\\nas-01\cifs02$\Data
And so on.
I set up eight jobs covering the eight major share nodes, like this:
Job #1: \\nas-01\cifs01$
Job #2: \\nas-01\cifs02$
Reading the documentation, I quickly realised that an external SQL database was needed, since the bundled SQL Express is not up to the task of 1 million files or more. I don't know exactly how many files that NAS has, but my guess would be between 20 and 30 million, so I pointed Veeam at an external SQL Standard server.
Veeam does not provide a guideline on database sizing for that many files, other than "don't do it on SQL Express". So I had a lot of fun hitting various filesystem and database limits on the SQL server while trying to back up that NAS, especially the explosive growth of the TEMPDB. I opened case #02266022 with support, and they quickly recognised the problem: it seems Veeam is not equipped to handle a lot of files per job, so the advice was to split the jobs even further and to try not to run them simultaneously.
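To figure out where to split, I'm thinking of something like this rough Python sketch to get per-folder file counts before creating the jobs. The share paths are just examples from my setup, and this is purely my own helper script, nothing Veeam-specific:

# Rough helper: count files under each top-level folder of the shares,
# so I know roughly how to split them into File to Tape jobs.
import os

SHARES = [r"\\nas-01\cifs01$", r"\\nas-01\cifs02$"]  # example share roots

def count_files(root):
    """Walk a directory tree and return the total number of files."""
    total = 0
    for _, _, files in os.walk(root):
        total += len(files)
    return total

for share in SHARES:
    for entry in os.scandir(share):
        if entry.is_dir():
            print(f"{entry.path}\t{count_files(entry.path):,} files")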
I've split the jobs up further, so I now have:
Job #1: \\nas-01\cifs01$\Data
Job #2: \\nas-01\cifs01$\User
Job #3: \\nas-01\cifs01$\Shared
Job #4: \\nas-01\cifs02$\Data
Since I have four drives in the tape library, I've decided to run only three jobs simultaneously, so I'm still able to restore. The first three (with about 1.5 million files each) have been running for a few days without hitches, but this setup is not easy to automate. I need to build a staggered schedule for 20+ jobs and hope that not too many of them run at the same time. Regular eyes-on monitoring will be needed to make sure that no job has grown so large that it won't finish before the next one starts.
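For the staggered schedule itself, something like this rough sketch is what I have in mind: pack the per-folder file counts from the script above into jobs that stay under a cap, then start them in waves of three so one drive stays free for restores. The cap, the interval and the counts below are just example figures from my environment, not anything Veeam recommends:

# Rough sketch of a staggered schedule: greedily pack folders into jobs
# that stay under a file-count cap, then start them in waves of three.
from datetime import datetime, timedelta

FILE_CAP = 1_500_000               # max files per job that seems to survive
MAX_CONCURRENT = 3                 # 4 drives in the library, keep 1 free for restores
WAVE_INTERVAL = timedelta(hours=8) # my rough guess at a job's run time

# (folder, file count) pairs - example figures only
folders = [
    (r"\\nas-01\cifs01$\Data",   1_400_000),
    (r"\\nas-01\cifs01$\User",   2_600_000),
    (r"\\nas-01\cifs01$\Shared",   900_000),
    (r"\\nas-01\cifs02$\Data",   1_200_000),
]

# Greedy first-fit packing: each job takes folders until the cap would be exceeded.
# NB: a single folder above FILE_CAP still ends up in its own oversized job;
# those have to be split further by subfolder, by hand.
jobs, current, current_count = [], [], 0
for path, n in sorted(folders, key=lambda f: f[1], reverse=True):
    if current and current_count + n > FILE_CAP:
        jobs.append(current)
        current, current_count = [], 0
    current.append(path)
    current_count += n
if current:
    jobs.append(current)

# Start jobs in waves of MAX_CONCURRENT, one wave every WAVE_INTERVAL.
start = datetime(2017, 7, 1, 20, 0)  # first wave on a Friday evening
for i, job in enumerate(jobs):
    wave = i // MAX_CONCURRENT
    print(f"Job #{i + 1}: start {start + wave * WAVE_INTERVAL:%Y-%m-%d %H:%M}")
    for path in job:
        print(f"    {path}")

A folder that is bigger than the cap on its own (like the User example above) would still need to be split further by subfolder, which is exactly the manual busywork I'd like Veeam to make unnecessary.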
This is not production-ready software. I understand that you're working on implementing NDMP, which will be a real boon. Using SMB, I've found that permissions can be a real problem and hard to work around. I expect NDMP to solve this, and hopefully to work with much larger data sets than the current SMB implementation, so I can use fewer jobs to get that backup running.
Then there's the Restore functionality. No file browser. Are you joking? You can't expect people to remember the exact name of the file they are missing. Not being able to browse directories and files for a restore seems ridiculous. I expect this is fine for restoring VMs from tape (which is what the tape subsystem seems to be built for at the moment), but once the database issues with backing up large NASes are solved, you really need to focus on letting users browse for the files they want to restore.
-
- Enthusiast
- Posts: 31
- Liked: 14 times
- Joined: Jun 20, 2017 3:17 pm
- Contact:
Re: [Feature Request] Better handling of large NAS backups
I agree with you; personally, I wouldn't use Veeam if the main use case is file-based backup to tape over SMB, at least with the current versions.
Bear in mind that while the "traditional old" backup suites started with file-based backup to tape and only later added VM support and disk-based backup (half baked, at least in the first iterations), Veeam went the other way: they started with VM, disk-based backup (good) and then, almost as an afterthought, added tape backup for file-based backups (let's be honest, quite half baked, as you say), largely due to heavy demand from enterprises like yours. If a company already has a "legacy backup system", Veeam is difficult to sell, especially if it handles tapes and files worse.
I use tape handling in Veeam just to ship the disk backup repository images to tape, for archival and added redundancy, and for that it is OK.
But I expect Veeam's direct file-to-tape/SMB backup to become better in, say, one or two versions, especially to win more of the market (cases like yours). Many companies can't afford more than one backup software system.
-
- Product Manager
- Posts: 14726
- Liked: 1706 times
- Joined: Feb 04, 2013 2:07 pm
- Full Name: Dmitry Popov
- Location: Prague
- Contact:
Re: [Feature Request] Better handling of large NAS backups
Hello Lars,
Thanks for the fair feedback, we will discuss it with our tape team for sure.
Lars Jorgensen wrote: I've split the jobs up further
We are working on some improvements that should make file to tape jobs easier to create, so you don't have to add the folder structure and credentials manually to the file to tape job.
Lars Jorgensen wrote: Then there's the Restore functionality. No file browser.
Have you tried the Files node? It shows the index of backed-up files, and you can click and restore a specific file from tape.
Lars Jorgensen wrote: Hitachi HNAS
Out of curiosity, what NDMP version does this NAS support? Thanks in advance.
-
- Novice
- Posts: 5
- Liked: 4 times
- Joined: Jun 21, 2017 7:37 am
- Full Name: Lars Jorgensen
- Contact:
Re: [Feature Request] Better handling of large NAS backups
Dima P. wrote: We are working on some improvements that should make file to tape jobs easier to create, so you don't have to add the folder structure and credentials manually to the file to tape job.
Nice. The real improvement would be not having to create 30+ jobs to back up one NAS because of the "millions of files" problem.
Dima P. wrote: Out of curiosity, what NDMP version does this NAS support? Thanks in advance.
Had a quick look in the interface. It supports versions 2, 3 and 4, and recommends 4, which is the default.
-
- Product Manager
- Posts: 14726
- Liked: 1706 times
- Joined: Feb 04, 2013 2:07 pm
- Full Name: Dmitry Popov
- Location: Prague
- Contact:
Re: [Feature Request] Better handling of large NAS backups
Lars,
Thanks for the heads up!