ashleyw
Service Provider
Posts: 181
Liked: 30 times
Joined: Oct 28, 2010 10:55 pm
Full Name: Ashley Watson
Contact:

Veeam v6 and sdelete/defrag optimisations for CBT.

Post by ashleyw »

Hi,

In the past it's been suggested that, to reduce full backup times (for Windows VMs at least) and subsequent CBT backups, the process should be:
1. Run defrag on each drive on each VM.
2. Run sdelete on each drive (which will max out the used space to match the allocated space, but write zeros to the unused areas).
3. Run a storage vMotion to a workspace area (in thin provisioned mode).
4. Run a storage vMotion from the workspace area back to the VM's original location (in thin provisioned mode).
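
To illustrate why step 2 matters, here is a minimal Python sketch. It is not part of any Veeam or VMware tooling; it only assumes that a thin-provisioned copy (steps 3 and 4) does not need to keep blocks that contain nothing but zeros. The 4096-byte block size and the toy "disk" contents are made up for the example.

```python
from typing import Tuple


def count_zero_blocks(image: bytes, block_size: int = 4096) -> Tuple[int, int]:
    """Return (zero_blocks, total_blocks) for a raw byte image."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    zero = 0
    total = 0
    for offset in range(0, len(image), block_size):
        block = image[offset:offset + block_size]
        total += 1
        # any() is False only when every byte in the block is 0
        if not any(block):
            zero += 1
    return zero, total


# Toy 4-block "disk": live data, stale data left by a deleted file, two empty blocks.
live = b"A" * 4096
stale = b"B" * 4096
empty = bytes(4096)

before_wipe = live + stale + empty + empty
after_wipe = live + empty + empty + empty  # the stale block has been zeroed

print(count_zero_blocks(before_wipe))
print(count_zero_blocks(after_wipe))
```

Before the wipe only the two never-used blocks are zero; after it, the stale block is zero as well, so under the stated assumption a thin copy would have one less block to keep.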

All good, but doing this manually is time consuming. I know there have been some PowerShell scripts to make things slightly easier.

My questions are:
1. Are there any changes in v6 to make this easier or automatic? (Being able to skip the page files will help a little.)
2. Are there any automatic reports we can pull from the backup logs to build a list, so we can prioritise which VMs we need to do this on? Currently I'm managing a farm of around 400 VMs and would like an easy way to identify any quick wins.
3. We have a number of VMs running Linux (mainly CentOS) with ext3 and some with LVM etc - what process should we follow on these to achieve the same results?

cheers
Ashley
Vitaliy S.
VP, Product Management
Posts: 27055
Liked: 2710 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: Veeam v6 and sdelete/defrag optimisations for CBT.

Post by Vitaliy S. »

Hi Ashley,

I believe this is reasonable only for small shops with dozens of VMs. In bigger environments with hundreds of VMs (like yours), such micro-management is hardly possible and, as you've correctly noted, very time consuming.

The only change coming in v6 is swap file exclusion. Swap files are the largest source of changed blocks inside the guest OS.

That said, I think you would prefer not to stress your storage with regular sVMotion tasks just to save some time in your backup window. I agree that you would get some improvement by doing this, but the bigger your environment, the smaller the benefit.

Thanks.
Gostev
Chief Product Officer
Posts: 31455
Liked: 6646 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Veeam v6 and sdelete/defrag optimisations for CBT.

Post by Gostev »

Ashley, regarding (3) - unlike NTFS, ext3 zeroes deleted blocks automatically, so there is no point of similar procedures there.
tsightler
VP, Product Management
Posts: 6009
Liked: 2842 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: Veeam v6 and sdelete/defrag optimisations for CBT.

Post by tsightler »

Gostev wrote: Ashley, regarding (3) - unlike NTFS, ext3 zeroes deleted blocks automatically, so there is no point of similar procedures there.

I'm not sure what information this is based on, but I don't believe this is true. There are tools that allow undelete of files from an ext3 filesystem, which would be pretty much impossible if the blocks were zeroed. The only thing different regarding deletes on an ext3 filesystem that I'm aware of is that the filesize and block addresses of the file are removed from the inode, making the actual data much more difficult to find, thus complicating an undelete, but the data blocks are not actually touched.

Re: Veeam v6 and sdelete/defrag optimisations for CBT.

Post by Gostev »

It was a very long time ago that I ran into this behavior. Possibly it was an option and not the default behavior. I just did a quick Google search and found many mentions of a "zerofree" option for ext3; that could be it.

Re: Veeam v6 and sdelete/defrag optimisations for CBT.

Post by tsightler »

Gotcha. I know there are patches out there that add a feature like this, but I'm not aware of this ever becoming part of the mainstream kernel (although it is possible, I don't follow kernel development as closely as I used to) or being used by default in any mainstream distro.

There is a small, standalone binary also called "zerofree", available in many repositories, that can zero out the unused space of an unmounted (or read-only mounted) ext3 filesystem, but I've never used it.

Re: Veeam v6 and sdelete/defrag optimisations for CBT.

Post by ashleyw »

Thanks for this, guys. I have a couple of questions.
Even though we don't want to micromanage manually, I don't mind if a set of scripts can do this (as long as I can put my feet up!).
I've been checking through the VMware APIs, and it appears that the ability to run OS-level commands has now been incorporated into the core APIs: http://blogs.vmware.com/vix/2011/07/vix ... -apis.html

So theoretically it would be easy enough to create a script that runs sdelete inside the guests, forces an sVMotion of the VM afterwards, and repeats this for specific VMs...

But to see any real benefits as you point out, we'd need to target this to the VMs that would give the most gain (in terms of Veeam speed up).

So, my question is, how do we determine the number of "dirty" unused blocks, i.e. the number of blocks in a VM that are unused by the OS but are not zeroed? If we could get the dirty block count as a percentage for each VM virtual disk and then sort this (per backup job), we'd end up with a top X list of VMs on which to run the time-consuming sdelete process, which would immediately reduce our full backup times and synthetic full rollups.
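
The ranking itself is the easy part and could be sketched as below. This is only an illustration: it assumes the three block counts per virtual disk have already been collected somehow (which is the hard part), and it assumes every block in use by the guest is non-zero, so that "dirty" can be estimated as non-zero blocks minus used blocks. The class, field names, VM names and numbers are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DiskStats:
    vm: str
    job: str
    allocated_blocks: int  # size of the virtual disk in blocks
    used_blocks: int       # blocks the guest filesystem reports as in use
    nonzero_blocks: int    # blocks on the virtual disk holding non-zero data

    @property
    def dirty_blocks(self) -> int:
        # Non-zero blocks the guest no longer uses (clamped at zero).
        return max(self.nonzero_blocks - self.used_blocks, 0)

    @property
    def dirty_percent(self) -> float:
        if self.allocated_blocks == 0:
            return 0.0
        return 100.0 * self.dirty_blocks / self.allocated_blocks


def top_dirty(stats: List[DiskStats], n: int) -> List[DiskStats]:
    """Return the n disks with the highest dirty percentage."""
    return sorted(stats, key=lambda s: s.dirty_percent, reverse=True)[:n]


# Made-up sample data for three VMs in two jobs.
stats = [
    DiskStats("vm-a", "job-1", allocated_blocks=1000, used_blocks=400, nonzero_blocks=900),
    DiskStats("vm-b", "job-1", allocated_blocks=1000, used_blocks=700, nonzero_blocks=720),
    DiskStats("vm-c", "job-2", allocated_blocks=2000, used_blocks=500, nonzero_blocks=1100),
]

for disk in top_dirty(stats, 2):
    print(disk.job, disk.vm, f"{disk.dirty_percent:.1f}%")
```

With this sample data vm-a (50% dirty) and vm-c (30% dirty) come out on top, while vm-b (2%) would not be worth the effort.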

Any ideas?

cheers
Ashley

Re: Veeam v6 and sdelete/defrag optimisations for CBT.

Post by Gostev »

This would require some low level programming skills and knowledge of MFT format. Not something that can be scripted for sure.

By the way, why are you so concerned about this?

First, doing this will not help your incremental (CBT) backups at all, because deleting data in NTFS does not actually touch the content of the blocks the deleted file consists of. So nothing changes on disk, CBT will not see any disk changes either way, and your backup will fly. In fact, wiping will slow your incremental backups down instead, because it makes actual changes to the content of disk blocks, and CBT will return those blocks as changed, requiring the backup engine to read and process their contents.
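
The effect can be shown with a toy model in Python. This is a sketch, not how VMware CBT is implemented: the "disk" is a list of blocks, the `changed` set stands in for the CBT map, and all names are made up. One assumption beyond the text above: the model records a single metadata block write on delete, to represent the small filesystem bookkeeping update, while the file's data blocks stay untouched.

```python
METADATA_BLOCK = 0  # hypothetical block holding the file table


class ToyDisk:
    """Block device that remembers which blocks were written since the last backup."""

    def __init__(self, num_blocks: int) -> None:
        self.blocks = [b"\x00"] * num_blocks
        self.changed = set()  # stands in for the CBT map

    def write(self, index: int, data: bytes) -> None:
        self.blocks[index] = data
        self.changed.add(index)


def create_file(disk, files, name, indexes, fill):
    """Write the file's data blocks and record it in the file table."""
    for i in indexes:
        disk.write(i, fill)
    files[name] = list(indexes)
    disk.write(METADATA_BLOCK, b"M")


def delete_file(disk, files, name):
    """Drop the file record; the data blocks are not touched."""
    freed = files.pop(name)
    disk.write(METADATA_BLOCK, b"M")
    return freed


def wipe(disk, freed):
    """sdelete-style wipe: overwrite every freed block with zeros."""
    for i in freed:
        disk.write(i, b"\x00")


disk, files = ToyDisk(8), {}
create_file(disk, files, "big.iso", [1, 2, 3], b"X")
disk.changed.clear()  # a backup has run; tracking starts fresh

freed = delete_file(disk, files, "big.iso")
changed_after_delete = set(disk.changed)

wipe(disk, freed)
changed_after_wipe = set(disk.changed)

print(len(changed_after_delete), len(changed_after_wipe))
```

After the delete only the metadata block is marked as changed; after the wipe all three data blocks are marked as well, so the next incremental has to read them.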

Second, with typical server workloads, the amount of data on the server is either constantly increasing (file server) or steady (transaction logs volume). In both cases, as soon as a file is deleted, its blocks are quickly overwritten by new data. So the following full backup will not benefit much from sdelete, as all "dirty" disk blocks would still belong to some existing file.

Finally, even in cases when you simply delete some file and nothing takes its place... it is still very likely that the same file is also stored on some other server. For example, you moved the file there (we rarely just delete stuff, right?). Or there could be a similar file (for example, a newer version). Now, because we do deduplication, each unique block will be stored in the backup only once anyway. So even if you wipe the blocks belonging to the deleted file on one VM, this won't really affect the overall backup size, as the same blocks would still have to be stored there, because they are part of some other file stored on another VM.
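
A rough Python sketch of this argument, assuming simple fixed-size block deduplication across the VMs in one job. The real backup engine is not shown here; the hashing scheme, the 4096-byte block size and the VM contents are assumptions made for the example, and all-zero blocks are treated as not stored.

```python
import hashlib
from typing import Dict, List, Set

BLOCK = 4096


def stored_digests(job: Dict[str, List[bytes]]) -> Set[str]:
    """Return digests of the distinct non-zero blocks across all VMs in a job."""
    digests = set()
    for blocks in job.values():
        for block in blocks:
            if not any(block):
                continue  # all-zero block: nothing to store in this model
            digests.add(hashlib.sha256(block).hexdigest())
    return digests


os_block = b"O" * BLOCK    # identical OS data on both VMs
app_block = b"A" * BLOCK
db_block = b"D" * BLOCK
doc_block = b"F" * BLOCK   # file deleted on vm1 but still live on vm2
zero_block = bytes(BLOCK)

# vm1 still carries the stale blocks of the deleted file.
before = {"vm1": [os_block, app_block, doc_block],
          "vm2": [os_block, db_block, doc_block]}
# Same job after wiping the stale block on vm1.
after = {"vm1": [os_block, app_block, zero_block],
         "vm2": [os_block, db_block, doc_block]}

print(len(stored_digests(before)), len(stored_digests(after)))
```

In this model the job stores four unique blocks both before and after the wipe, because the deleted file's block still has to be kept for vm2.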

Of course, wiping a disk may give extremely impressive results in artificial conditions (such as deleting a 4 GB DVD ISO from a Windows XP VM with a 2 GB footprint, and having a job with that single VM). This seems to make people think that wiping disks is a big thing that must be done regularly. But if we are talking about backing up real-world workloads with larger jobs containing multiple VMs, you won't really see much benefit from doing this even for full backups. Moreover, as I've already explained, incremental CBT backups will do better if you DON'T wipe the disk.

Re: Veeam v6 and sdelete/defrag optimisations for CBT.

Post by ashleyw »

Thanks. Well, once a month we run full backups. Once a week we run synthetic fulls. Nightly we run CBT incrementals. Despite splitting our backups into multiple jobs over multiple engines, creating custom scheduling to balance the load between the engines, and using a highly optimised ZFS-based (NexentaStor) backup target, we are still stuck with a full backup process that takes approximately 3.5 days, a weekly rollup process that takes around 1.5 days, and nightly CBT backups that take around 9 hours (much of the slowness is due to our crap primary storage, which we are in the process of addressing).

There appear to be major architectural changes in v6 that we are holding out for, which will most likely have a dramatic impact on our backup throughput. We can't wait!

But I'm still hoping to find that "magic bullet" that will give us optimal performance. Perhaps sdelete isn't the answer (particularly as we are a dev shop and already make extensive use of thin provisioning).

cheers
Ashley

Re: Veeam v6 and sdelete/defrag optimisations for CBT.

Post by Gostev »

Good point, thin-provisioned disks are another no-play area for wiping, as doing this will expand them.

Yes, I'd say you should wait until v6 is out (hopefully by the end of this month). This should be your magic bullet, as v6 makes it very easy to scale your backups, assuming production and backup storage are not your primary bottlenecks. And even if they are, you will be able to see that clearly with our built-in bottleneck detector, and then address that too.
jakesterpdx
Enthusiast
Posts: 28
Liked: 3 times
Joined: Jul 23, 2010 10:10 pm
Full Name: Jake H
Contact:

Re: Veeam v6 and sdelete/defrag optimisations for CBT.

Post by jakesterpdx »

Has anybody tried out UBERAlign?
http://nickapedia.com/2011/11/03/straig ... uberalign/

Seems to automate this whole process. I've downloaded the .ova and intend to play around with it.