
Veeam replication and size of roll-backs

Posted: Aug 03, 2009 3:48 pm
by cby
Have done some tests with Veeam data replication and found it to work fine. However, it did highlight an issue concerning roll-backs, or more precisely the size of roll-backs.

As an example, we back up 2 Linux VMs in a single scheduled job: 155GB native (including 10GB for the 2 swap partitions), 79GB after de-duplication and compression. We keep the previous 2 days of roll-backs. The Veeam roll-backs appear to be unusually big given the amount of activity in a typical day on the 2 VMs. They are approximately 4GB every day, even at weekends when there is no user activity. This prompted a number of questions:

- How are the roll-backs generated? Block-based?

- What are the changes that go on at system level that create so many deltas? Syslog, cron jobs, background tasks, file-access recording -- none of these are excessive or create many disk changes.

- Is the swap partition the main culprit when it comes to roll-back size? (There is a certain amount of swap activity, which will be addressed by allocating more memory!)

- Would swap on a Raw Disk Device reduce the size by not getting backed up? Do deltas in the 10GB of total swap on the 2 VMs get backed up in the Veeam roll-backs?

- If we move to RDD what happens to swap when we have to restore from a replicated data set? Does the target host have a predefined swap on a RDD?


Our EVA LUN snapshots confirm the daily size of deltas so the Veeam roll-back sizes are consistent with this.


Would appreciate any advice/observations especially if you've been running Veeam replication with Linux VMs.

Re: Veeam replication and size of roll-backs

Posted: Aug 03, 2009 5:15 pm
by Gostev
Hello, yes, increments are done at block level (1024KB), so a single bit change in a 1024KB block marks the whole block as "dirty", and that block will be processed as part of incremental replication. Thus, defragmenting your VMs usually helps to reduce the size of incrementals, because disk changes become more physically consolidated. On fragmented VMs, new files are scattered over multiple parts of the disk, making more disk blocks "dirty" than on defragmented disks.
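
To make the block-level point concrete, here is a minimal Python sketch of how 1024KB dirty-block accounting behaves. It is only an illustration under stated assumptions: the function names and the write patterns are hypothetical, and this is not Veeam's actual implementation.

```python
BLOCK_SIZE = 1024 * 1024  # 1024KB, the block size quoted in this post


def dirty_blocks(writes, block_size=BLOCK_SIZE):
    """Return the set of block indexes touched by (offset, length) writes."""
    dirty = set()
    for offset, length in writes:
        if length <= 0:
            continue  # a zero-length write touches nothing
        first = offset // block_size
        last = (offset + length - 1) // block_size
        dirty.update(range(first, last + 1))
    return dirty


def delta_bytes(writes, block_size=BLOCK_SIZE):
    """Bytes an incremental would carry if every dirty block is copied whole."""
    return len(dirty_blocks(writes, block_size)) * block_size


# Hypothetical pattern 1: 100 writes of 4KB, each landing 10MB apart.
scattered = [(i * 10 * BLOCK_SIZE, 4096) for i in range(100)]
# Hypothetical pattern 2: the same 100 writes laid out back to back.
consolidated = [(i * 4096, 4096) for i in range(100)]

print(len(dirty_blocks(scattered)))     # 100 blocks, i.e. 100MB of delta
print(len(dirty_blocks(consolidated)))  # 1 block, i.e. 1MB of delta
```

Both patterns write the same 400KB of data, but the scattered one dirties 100 times as many blocks.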

Unfortunately, I cannot really comment on what exactly is making so many changes on your VM disks - probably swap and logs? I am not a big Linux guru, but I cannot think of anything else. You can try experimenting with moving swap to RDD; this may indeed help if swap is the main reason for the changes.

Re: Veeam replication and size of roll-backs

Posted: Aug 03, 2009 6:25 pm
by drbarker
By default, most Linux distros tend to allocate more swap space than a Windows server - Linux will proactively swap stuff out to disk to make room for more disk cache. In general, this is a good thing (tm) but you'll get through more backup space with Veeam.

In practice, we haven't bothered optimising swap space in guests:
- Veeam has already taken 2.9TB of production data and deduped it down to 510GB.
- We tend to have more Windows than Linux boxes - we only get through ~25GB incr per day for ~100 VMs
- We're happy enough with things :-)
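
For a rough side-by-side of the figures quoted so far in this thread, a small Python sketch follows. It assumes the sizes are terabytes and gigabytes with 1TB = 1024GB, and treats "~100 VMs" as exactly 100; the function and variable names are made up for the illustration.

```python
def reduction_ratio(raw_gb, stored_gb):
    """Raw size divided by size after dedupe and compression."""
    return raw_gb / stored_gb


def daily_increment_per_vm(total_incr_gb, vm_count):
    """Average daily incremental size per VM, in GB."""
    return total_incr_gb / vm_count


# Figures as posted in the thread; 1TB = 1024GB is an assumption.
figures = {
    "cby, 2 Linux VMs": {"raw_gb": 155, "stored_gb": 79, "incr_gb": 4, "vms": 2},
    "drbarker, ~100 VMs": {"raw_gb": 2.9 * 1024, "stored_gb": 510, "incr_gb": 25, "vms": 100},
}

for name, f in figures.items():
    ratio = reduction_ratio(f["raw_gb"], f["stored_gb"])
    per_vm = daily_increment_per_vm(f["incr_gb"], f["vms"])
    print(f"{name}: {ratio:.2f}:1 reduction, {per_vm:.2f}GB incremental per VM per day")
```

On these numbers the two Linux VMs produce about 2GB of incremental per VM per day, against about 0.25GB per VM in the mostly-Windows estate.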

Re: Veeam replication and size of roll-backs

Posted: Aug 03, 2009 8:09 pm
by Gostev
Oh, thank you indeed - always interesting to see real-world numbers!

Re: Veeam replication and size of roll-backs

Posted: Aug 03, 2009 9:01 pm
by tsightler
drbarker wrote: In practice, we haven't bothered optimising swap space in guests:
- Veeam has already taken 2.9TB of production data and deduped it down to 510GB.
- We tend to have more Windows than Linux boxes - we only get through ~25GB incr per day for ~100 VMs
Wow, are your VMs very similar? We don't see anything near this level of dedupe on our systems, but our ~50 VMs are all pretty different and many already contain compressed data.

Re: Veeam replication and size of roll-backs

Posted: Aug 03, 2009 9:55 pm
by Gostev
Good dedupe ratios do require that VMs are made from the same template, as opposed to building them all from scratch.

Re: Veeam replication and size of roll-backs

Posted: Aug 03, 2009 10:44 pm
by drbarker
Yes, templates rock :D

The only thing I wish we did that we're currently not is the sdelete trick... I keep hoping vmware will update vmtools to let you automate a disk sdelete - it would certainly help with dedup backups & deduping storage arrays. I suspect we could get the ratio up a bit more if we tried :-)

Re: Veeam replication and size of roll-backs

Posted: Aug 03, 2009 11:10 pm
by tsightler
Our systems are deployed from templates, but they still diverge significantly from that point. We do see excellent dedupe and compression on systems that are similar but most of our systems are very different.

Re: Veeam replication and size of roll-backs

Posted: Aug 04, 2009 12:35 am
by Gostev
drbarker wrote:The only thing I wish we did that we're currently not is the sdelete trick... I keep hoping vmware will update vmtools to let you automate a disk sdelete - it would certainly help with dedup backups & deduping storage arrays. I suspect we could get the ratio up a bit more if we tried :-)
What would you say if these kinds of VM optimizations were a part of Veeam Backup VM processing (the process kicking in automatically after backup is completed for a specific VM, so if anything goes wrong you can simply roll back)?

Re: Veeam replication and size of roll-backs

Posted: Aug 04, 2009 3:24 am
by fredbloggs
Gostev wrote:What would you say if these kinds of VM optimizations were a part of Veeam Backup VM processing (the process kicking in automatically after backup is completed for a specific VM, so if anything goes wrong you can simply roll back)?
Personally, I think it'd be a nice feature, but I don't find it too difficult to simply use the Windows scheduler to run an sdelete on a weekly basis.

I also store my pagefile (swap) on a separate vmdk and configure this disk so that it is independent and persistent. In this way Veeam doesn't back up these file systems (unable to snapshot), which helps to keep the backup / replication jobs down. It's fine from a DR perspective, where the DR VM would get powered on and recreate this stuff anyway.

Re: Veeam replication and size of roll-backs

Posted: Aug 04, 2009 7:32 am
by drbarker
Gostev wrote: What would you say if these kinds of VM optimizations were a part of Veeam Backup VM processing (the process kicking in automatically after backup is completed for a specific VM, so if anything goes wrong you can simply roll back)?
If Veeam could do it, that would be great - but...

My concern would be that I don't want 100 VM's to zero their disks at the same time - the effect on the backend disk would be 'dramatic' ;-) Ideally, anything that was tidying up disks would be virtual center aware - e.g. if SCSI queues started building up, back off a bit.

Re: Veeam replication and size of roll-backs

Posted: Aug 04, 2009 10:19 am
by Gostev
fredbloggs wrote:I also store my pagefile (swap) on a separate vmdk and configure this disk so that it is independent and persistent. In this way Veeam doesn't back up these file systems (unable to snapshot), which helps to keep the backup / replication jobs down. It's fine from a DR perspective, where the DR VM would get powered on and recreate this stuff anyway.
QFE, this is great best practice.

Re: Veeam replication and size of roll-backs

Posted: Aug 04, 2009 10:20 am
by Gostev
drbarker wrote:My concern would be that I don't want 100 VM's to zero their disks at the same time - the effect on the backend disk would be 'dramatic' ;-) Ideally, anything that was tidying up disks would be virtual center aware - e.g. if SCSI queues started building up, back off a bit.
Yes, good point - this will need to be taken into account. Activities need to be coordinated across all "optimization" jobs as well as regular backup and replication jobs.

Re: Veeam replication and size of roll-backs

Posted: Aug 05, 2009 9:33 am
by cby
People

Thanks for the helpful information.

Currently backing up 8 RHEL5 VMs, 1.1TB raw, de-duped to 715GB. Not too shabby given there are a number of large Oracle databases and millions of scanned archive images. Moving the swap to RDD or an independent vmdk is definitely worth investigating. The number of VMs will increase to 20+ by year-end, so extrapolating these figures produces very big and potentially unmanageable numbers with the existing infrastructure and working practices (read on...)

To make matters 'interesting' the databases are shut down and exported every night. Leaving aside the need to shut/start Oracle databases, we have requested the DBAs run hot backups (with RMAN for example) but to no avail. The export file on the heaviest used VM is around 20GB which probably explains the size of the deltas. I'm not familiar with the export process but presumably any changes to the database, however small and reflected in the export file, would have a profound effect on the underlying block structure of the disk as far as 'dirty' bits are concerned. The consequences of shutting/starting the databases every night before backup must also have an impact.
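
One way to test the presumption about the export file would be to keep two consecutive nightly exports and count how many 1024KB chunks differ between them. The Python sketch below is a hypothetical helper and only a proxy: it compares file contents, not the disk blocks the backup actually sees, and a rewritten file may land on different disk blocks even when its contents are identical.

```python
BLOCK_SIZE = 1024 * 1024  # same 1024KB granularity as the incrementals


def changed_blocks(old_path, new_path, block_size=BLOCK_SIZE):
    """Compare two files chunk by chunk.

    Returns (changed, total), where total counts chunk positions present
    in either file and changed counts positions whose bytes differ.
    """
    changed = 0
    total = 0
    with open(old_path, "rb") as old, open(new_path, "rb") as new:
        while True:
            a = old.read(block_size)
            b = new.read(block_size)
            if not a and not b:
                break  # both files exhausted
            total += 1
            if a != b:
                changed += 1
    return changed, total
```

Calling `changed_blocks("export_mon.dmp", "export_tue.dmp")` (file names made up here) on a 20GB export would scan about 20,480 chunk positions. If a small change near the start of the file shifts everything after it, nearly all of them will show as changed, which would be consistent with the large deltas.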

Our VMs are deployed from templates but change significantly within a short period, especially the database servers.

I guess I knew the answer to the issue of large deltas -- a combination of database exports, non-independent/non-RDD swap, log files getting updated all over the place.

Defragmentation of Linux filesystems is a moot point -- we run a regularly scheduled defrag of our (non-VM!) Tru64 ADVFS system and achieve 5-10% I/O improvements. Not sure whether defragging or equivalent would achieve this on RHEL5.