Host-based backup of VMware vSphere VMs.
sbbots
Enthusiast
Posts: 96
Liked: 25 times
Joined: Aug 16, 2013 5:44 pm
Full Name: Matt B

Unusually large VBK size

Post by sbbots »

Perhaps one of the smarter people on the forum can help me regarding the size of vbk backup files in two scenarios:

Site 1 - Veeam running on a Dell PowerVault stand-alone server. The backup repository is an NTFS volume on the server; 4K block size.
Site 2 - Veeam running on a VM on the host. The backup repository is an EXT4 (CIFS) volume located on a Synology NAS; 64K block size.

The sites are identical in the number of VMs and the amount of data to back up. When running an Active Full:

Site 1 - vbk file is ~460GB.
Site 2 - vbk file is ~780GB.

My assumption is that the discrepancy in vbk size is due to the block size of each volume. The Synology NAS uses a larger block size (64K) for performance, which means it is less efficient at data storage. But should it be using that much more space? What do you think? Just curious to know if my assumption is correct and I can go on living knowing that the NAS will use more space, or whether I messed up somewhere. Any help/suggestions appreciated.

PS - The Synology DS412+ NAS, which was recommended to me on the VMware forums, absolutely flies through backups. Regardless of vbk size, it is a screamer for a small, inexpensive NAS.
tsightler
VP, Product Management
Posts: 6035
Liked: 2860 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: VBK Size - NTFS (4K) vs EXT4 (64K)

Post by tsightler » 1 person likes this post

I'd say there's no way the difference you are seeing is caused by the block size of the storage system. The block size is simply the smallest allocation unit on the underlying filesystem/block device, so for a single large file the difference in wasted space between 4K and 64K can't be more than 60K. Sure, if you had a thousand 1K files, then you're looking at using 64MB vs 4MB, but for a single large file the waste is limited to part of the final block, whether that block is 4K or 64K.

However, even when the space is "wasted", a directory listing will still show the actual size of the file, not the wasted space. For example, on Windows the difference between the actual file size and the allocation size can easily be seen in the file properties as the difference between "Size" and "Size on disk". The "Size" value displays the actual size of the file down to the byte, but the "Size on disk" value is always rounded up to a multiple of 4K (assuming NTFS with a 4K cluster size and no NTFS compression or other factors). For a filesystem with a 64K block size, the difference between the file size and the actual size on disk could be up to 64K - 1 byte, but no more, and the OS will still report the actual size of the file to the byte, not the allocated size.
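
To put numbers on the allocation argument, here is a minimal sketch in Python. The helper names (`allocated_size`, `slack`) and the example file size are made up for illustration, and the model assumes a plain filesystem that rounds each file up to whole clusters (no compression, sparse files, or tiny files stored inside filesystem metadata).

```python
KIB = 1024
GIB = 1024 ** 3


def allocated_size(file_size: int, cluster_size: int) -> int:
    """Bytes consumed on disk when file_size is rounded up to whole clusters."""
    clusters = -(-file_size // cluster_size)  # ceiling division
    return clusters * cluster_size


def slack(file_size: int, cluster_size: int) -> int:
    """Allocated-but-unused bytes at the tail of the file."""
    return allocated_size(file_size, cluster_size) - file_size


# One large backup file (hypothetical size: 460 GiB plus one odd byte).
vbk = 460 * GIB + 1
extra = allocated_size(vbk, 64 * KIB) - allocated_size(vbk, 4 * KIB)

# A thousand 1K files: here the cluster size really does matter.
small_4k = 1000 * allocated_size(1 * KIB, 4 * KIB)
small_64k = 1000 * allocated_size(1 * KIB, 64 * KIB)

if __name__ == "__main__":
    print(f"extra allocation for one large file: {extra} bytes")
    print(f"1000 x 1K files: {small_4k} bytes (4K) vs {small_64k} bytes (64K)")
```

For the single large file the extra allocation stays within a few tens of kilobytes no matter how big the file is, which is nowhere near a gap of hundreds of GB.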

So, when comparing these two backups to really know what's different between them I would suggest looking at the Processed, Read, and Transferred statistics for the jobs. Hopefully posting that information should shine a light on the difference between those two jobs. My guess is you'll see more data read than expected at the larger site, but I'm certainly interested to find out.
dellock6
VeeaMVP
Posts: 6166
Liked: 1971 times
Joined: Jul 26, 2009 3:39 pm
Full Name: Luca Dell'Oca
Location: Varese, Italy
Contact:

Re: VBK Size - NTFS (4K) vs EXT4 (64K)

Post by dellock6 »

The only way to be sure the discrepancy is caused by the block size would be to save the SAME backup file in both repositories.
In this situation, I would say the two sites are "really similar" but are not the same: different traffic, different I/O patterns, different content in every VM, and so on. Even if the used space of all VM disks is the same, this does not mean the amount of unique blocks is the same...
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software

@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1

Re: VBK Size - NTFS (4K) vs EXT4 (64K)

Post by sbbots »

Thank you both for the test ideas.
tsightler wrote:...I would suggest looking at the Processed, Read, and Transferred statistics for the jobs.
This is something I noticed and should have mentioned in my original post. Site #2 is processing much more data than Site #1 (~226GB more), yet the data size is basically the same at both sites. There is a small dedupe ratio difference (1.2x vs 1.3x) but that does not explain the difference either. It is almost like Site #2 is backing up blank space on one particular VM...

Which made me think of a change we recently made to one particular VM at Site #2: the data drive vmdk was expanded from 800GB to 1024GB (~224GB), which happens to be almost exactly the difference in vbk size. There is no new data in that portion, as it was only expanded to accommodate future tape restores, but it makes me think. For some reason, is Veeam backing up that expanded portion of the vmdk as if it were full of data? Hmmm. I may recreate that data drive vmdk this weekend, run a full backup and see what happens.
dellock6 wrote:The only way to be sure the discrepancy is caused by the block size would be to save the SAME backup file in both repositories.
Doing that now. Running an active full to a USB 3.0 drive formatted as NTFS with a 4096-byte allocation unit size. I am watching the data processed and so far it looks like the block size is not making a difference; the backup is still unusually large.
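
A quick arithmetic check on the figures quoted in this thread, as a rough Python sketch. All values are the approximate GB numbers from the posts above; the variable names are just for illustration.

```python
# Approximate figures quoted in the thread, in GB.
vmdk_before, vmdk_after = 800, 1024
vbk_site1, vbk_site2 = 460, 780
extra_processed = 226  # Site 2 processed about this much more than Site 1

expansion = vmdk_after - vmdk_before  # size added to the data drive
vbk_gap = vbk_site2 - vbk_site1       # raw difference between the two vbk files

if __name__ == "__main__":
    print(f"vmdk expansion:       {expansion} GB")
    print(f"extra data processed: {extra_processed} GB")
    print(f"vbk size gap:         {vbk_gap} GB")
```

The expansion lines up closely with the extra data processed; the raw gap between the two vbk files is somewhat larger, so the match is suggestive rather than exact.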

Re: VBK Size - NTFS (4K) vs EXT4 (64K)

Post by sbbots »

tsightler wrote:One thing you should keep in mind is that Veeam is fully an image-based backup program. It is not aware of the file usage of a volume at all, only whether the blocks have ever been used or not. If they have ever had data written, Veeam will back them up.

So, for example, say I have a 500GB volume with only 250GB of files and the rest free, but some blocks in the free space were previously written and the files later deleted. Veeam will still back up these blocks, as they still exist on the disk image. The only way to prevent this is to use a tool like sdelete to zero out the "free" space, although you must first defragment the disk and then use sdelete, otherwise the existing files may be fragmented over a larger space.

That is exactly the answer to the problem. Somewhere along the line we must have moved several hundred GB of data on that drive. Even though the actual data is not there anymore, Veeam can see those data blocks as changed and thus backs them up.

Thank you again for your expertise. I am most appreciative.
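
The behaviour described above can be illustrated with a toy model in Python. This is only a sketch under stated assumptions, not how Veeam is implemented: `ToyDisk` and its methods are hypothetical, blocks are a few bytes each, the "backup" simply counts blocks that are not all zeros, and defragmentation is not modelled.

```python
class ToyDisk:
    """Toy block device: a list of fixed-size blocks plus a file table."""

    def __init__(self, n_blocks: int, block_size: int = 4):
        self.block_size = block_size
        self.blocks = [bytes(block_size) for _ in range(n_blocks)]
        self.files = {}  # file name -> list of block indices

    def _free_blocks(self):
        used = {i for idxs in self.files.values() for i in idxs}
        return [i for i in range(len(self.blocks)) if i not in used]

    def write_file(self, name: str, n_blocks: int) -> None:
        free = self._free_blocks()
        if n_blocks > len(free):
            raise ValueError("disk full")
        chosen = free[:n_blocks]
        for i in chosen:
            self.blocks[i] = b"\xff" * self.block_size  # non-zero payload
        self.files[name] = chosen

    def delete_file(self, name: str) -> None:
        # Deleting only drops the file-table entry; the old bytes stay on disk.
        del self.files[name]

    def zero_free_space(self) -> None:
        # Stand-in for zeroing free space inside the guest (e.g. sdelete -z).
        for i in self._free_blocks():
            self.blocks[i] = bytes(self.block_size)

    def image_backup_blocks(self) -> int:
        # Image-level view: count non-zero blocks, with no knowledge of files.
        return sum(1 for block in self.blocks if any(block))


disk = ToyDisk(n_blocks=16)
disk.write_file("old-data", 6)
disk.write_file("keep", 4)
disk.delete_file("old-data")
after_delete = disk.image_backup_blocks()  # deleted data still counted
disk.zero_free_space()
after_zero = disk.image_backup_blocks()    # only live data remains

if __name__ == "__main__":
    print(f"blocks backed up after delete:  {after_delete}")
    print(f"blocks backed up after zeroing: {after_zero}")
```

In this model the deleted file keeps inflating the image until the free space is zeroed, which is the same pattern as the oversized vbk.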

Re: Unusually large VBK size

Post by dellock6 »

Hi Matt,
glad you nailed the problem. Yes, that can be the reason: using CBT for block processing, Veeam sees those blocks as "changed" even if they were later emptied. You can reclaim used space in the VBK file by running an sdelete operation in the guest, and then a full backup in Veeam (not an incremental).

Luca.

Re: Unusually large VBK size

Post by sbbots »

dellock6 wrote:Hi Matt,
glad you nailed the problem. Yes, that can be the reason: using CBT for block processing, Veeam sees those blocks as "changed" even if they were later emptied. You can reclaim used space in the VBK file by running an sdelete operation in the guest, and then a full backup in Veeam (not an incremental).
I defragged the drive twice and then started sdelete to zero-out the empty space. Sdelete is still running but when it is finished I will run an Active Full backup. If this does the trick then the problem = nailed and you guys are geniuses :mrgreen:

A Google search for "sdelete" will find it (not sure if I can post links). The latest is version 1.61. Also, make sure to use the switch -z in an elevated command prompt. I guess there was some confusion years ago because it was originally -c. So in my case I am using "sdelete -z e:" to zero-out the E drive.

I have a certain, recorded "walking" zombie show to watch, so I will let you know in the morning if it worked. Thanks again guys.
veremin
Product Manager
Posts: 20400
Liked: 2298 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin
Contact:

Re: Unusually large VBK size

Post by veremin »

Yep, you're completely right here. In Secure Delete v1.51, the "-c" parameter was responsible for zeroing free space. Starting with Secure Delete v1.6, you should use "-z" for the same purpose.

Thanks.
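
For anyone scripting this, a small Python sketch of picking the right switch. The function names are hypothetical, and the rule is only what is stated in this thread: "-c" for v1.51 and "-z" from v1.6 onward. Treating every version below 1.6 like v1.51, and comparing version strings as decimals (so that 1.51 < 1.6 < 1.61), are assumptions, so check the usage text of your own sdelete copy before relying on it.

```python
from decimal import Decimal
from typing import List


def zero_free_space_flag(sdelete_version: str) -> str:
    """Return the switch that zeroes free space for the given SDelete version."""
    # Assumption: versions compare as decimals, so 1.51 < 1.6 < 1.61.
    if Decimal(sdelete_version) >= Decimal("1.6"):
        return "-z"
    return "-c"


def build_command(sdelete_version: str, drive: str) -> List[str]:
    """Build the argument list, e.g. to run from an elevated prompt."""
    return ["sdelete", zero_free_space_flag(sdelete_version), drive]


if __name__ == "__main__":
    print(" ".join(build_command("1.61", "e:")))
```

The sketch only builds the command line; it does not run anything, since zeroing free space takes a long time and needs an elevated prompt.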

Re: Unusually large VBK size

Post by sbbots »

Secure Delete worked PERFECTLY on the server last night :D The backup vbk size went from 788GB down to 387GB. Tom and Luca - I really appreciate the help.

[Image: before]

[Image: after]

Re: Unusually large VBK size

Post by dellock6 »

Glad it was helpful, and thanks for posting the results, it's always nice to see real data about these solutions.

Luca.