Backup size clarification

Post by **camerond** » Jul 25, 2016 7:54 am

I saw the backup size mentioned in the list of known issues and limitations, but I'd like some details clarified. I'm backing up two volumes using Entire Computer: the OS on four disks in RAID10, and data on two disks in RAID1. The OS volume is 251GB with 4.6GB used; the data volume is 1.8TB with 110GB used. From my understanding of the limitation as described, the backup will look at all the dirty blocks and back those up, even ones that belonged to files now deleted.

The four disks used for the OS were pulled from other servers, most likely hardware RAID5 configurations, so I expected a bit of a grey area with the OS volume in terms of dirty blocks. The agent proceeded to read and process 251GB and transferred about 90GB to the repository. Knowing that subsequent backups for this volume would be minuscule, this large initial backup was forgivable.

The data volume is a little different though. The disks were new and unused, were configured in RAID1 and initialised by the controller before being presented to the OS for formatting. The volume has 7% of the space used. The files stored on it haven't changed much over the years: new ones are added and some are occasionally overwritten. However, when the agent came to the data volume it treated every block as dirty and went about processing the whole thing. I stopped it about 40% through because I didn't see the point in filling up the repository with a backup that had very little actual data.

I'm going to experiment with zero filling free space on the volumes, but that very act will ensure that every single free space block will become dirty, and so the next backup will become just as large. Unless dedupe and compression will counteract this? Is there a way to zero fill the free space without everything being marked dirty?
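On the compression part of that question, a minimal sketch of generic compressor behaviour may help. It uses Python's zlib purely as a stand-in; the compression and dedupe the agent actually applies are not described in this thread, and the 1 MiB block size is an arbitrary choice for illustration.

```python
import os
import zlib

# Arbitrary block size for illustration; not the agent's real block size.
BLOCK_SIZE = 1024 * 1024


def compressed_size(block: bytes) -> int:
    """Return the zlib-compressed size of one block, in bytes."""
    return len(zlib.compress(block))


# What zero-filled free space looks like on disk.
zero_block = bytes(BLOCK_SIZE)
# Stand-in for leftover, incompressible data from deleted files.
random_block = os.urandom(BLOCK_SIZE)

print("zero block  :", compressed_size(zero_block), "bytes")
print("random block:", compressed_size(random_block), "bytes")
```

With a general-purpose compressor the zeroed block shrinks to a tiny fraction of its size while the random block does not shrink at all. That suggests zero-filled space should cost little in the repository even if it is read and processed, though only a test against the real agent would confirm it.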

Specs

Source "Cleveland"

CentOS 6.8 x64, Xeon 5450 with 32GB RAM
Kernel 2.6.32-642.3.1.el6.x86_64
Intel SR2520SAXSR with S5000VSA, BIOS S5000.86B.12.00.0098.062320091136 06/23/2009
Intel Embedded Server RAID Technology using LSI MegaSR RAID5 version v15.04.2013.1016, built on Oct 16 2013 at 19:20:04 driver
4x Seagate ST3146356SS in RAID10, 2x Seagate ST2000DL003 in RAID1
Note: one of the 2TB drives died which has been replaced with a ST3000DM001, rebuild status unknown (next job)

Target

CIFS on Windows Server 2008 R2 x64

Post by **nielsengelen** » Jul 25, 2016 8:02 am

There are tools that can do this, such as zerofree (Ubuntu manpage), but tools of this kind are always used at your own risk. Before using it, make sure you run fsck on the disks you will use zerofree on. I strongly advise trying it in a test VM first (if possible) to learn how it works, and catching up by reading the man page and feedback from other users via Google results.

Post by **PTide** » Jul 25, 2016 10:55 am

Hi,

We plan to implement a functional analogue of Bitlooker in VAL to make it possible to skip dirty blocks.

Thanks

Post by **camerond** » Jul 26, 2016 1:06 am

vmniels wrote: There are tools that can do this, such as zerofree (Ubuntu manpage), but tools of this kind are always used at your own risk. Before using it, make sure you run fsck on the disks you will use zerofree on. I strongly advise trying it in a test VM first (if possible) to learn how it works, and catching up by reading the man page and feedback from other users via Google results.

I'm not sure how accurately I could simulate the physical setup as a VM, or if I'd be able to know I was even close? By that I mean I'm not sure how much bearing the physical side of things has on the "dirty" blocks. Aside from that, I suppose I could provision some thick VM disks that are eager zeroed, and once the OS is on there I could fill free space with /dev/random, then zero fill it, and then see what VAL makes of it.
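For an experiment like this, a rough way to see what a block-level reader would find is to count the non-zero blocks in a disk image before and after the zero fill. The sketch below is only an illustration: the 4 KiB block size and the helper names are assumptions, it runs against a temporary file rather than a real volume, and it says nothing about how VAL itself decides which blocks to back up.

```python
import os
import tempfile

# Assumed block size for illustration; a real filesystem may use another.
BLOCK_SIZE = 4096


def count_nonzero_blocks(path, block_size=BLOCK_SIZE):
    """Return (nonzero_blocks, total_blocks) for the file or device at path."""
    zero = bytes(block_size)
    nonzero = total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += 1
            # A short final chunk is compared against zeros of equal length.
            if chunk != zero[:len(chunk)]:
                nonzero += 1
    return nonzero, total


def demo():
    """Build a small test image: 8 random blocks followed by 24 zeroed blocks."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(BLOCK_SIZE * 8))  # stand-in for stale data
        tmp.write(bytes(BLOCK_SIZE * 24))      # stand-in for zero-filled space
        path = tmp.name
    try:
        return count_nonzero_blocks(path)
    finally:
        os.unlink(path)


print(demo())
```

Pointing `count_nonzero_blocks` at a test VM's disk image (read-only, with the volume unmounted) before and after the zero fill would give a number to compare against the amount the agent reports as processed.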

Post by **camerond** » Jul 26, 2016 1:07 am

PTide wrote: We plan to implement a functional analogue of Bitlooker in VAL to make it possible to skip dirty blocks.

That's great news. I'll continue testing once the failed disk has been sorted out.

Post by **nielsengelen** » Jul 26, 2016 7:08 am

You could simulate this with dd and random read/write tools such as fio. My main reason for suggesting a VM test is so you can learn the zerofree tool before running it on a live server. Better safe than sorry.
