cby
Expert
Posts: 109
Liked: 6 times
Joined: Feb 24, 2009 5:02 pm

Veeam replication and size of roll-backs

Post by cby »

Have done some tests with Veeam data replication and found it to work fine. However, it did highlight an issue concerning roll-backs, or more precisely the size of roll-backs.

As an example, we back up 2 Linux VMs in a single scheduled job: 155GB native (including 10GB for the 2 swap partitions), 79GB after de-duplication and compression. We keep the previous 2 days of roll-backs. The Veeam roll-backs appear unusually big given the amount of activity in a typical day on the 2 VMs. They are approximately 4GB every day, even at weekends when there is no user activity. This prompted a number of questions:

- How are the roll-backs generated? Block-based?

- What are the changes that go on at system level that create so many deltas? Syslog, cron jobs, background tasks, file-access recording -- none of these are excessive or create many disk changes.

- Is the swap partition the main culprit when it comes to roll-back size? (There is an amount of swap activity that will be addressed by allocating more memory!)

- Would swap on a Raw Disk Device reduce the size by not getting backed up? Do deltas in the 10GB of total swap on the 2 VMs get backed up in the Veeam roll-backs?

- If we move to RDD what happens to swap when we have to restore from a replicated data set? Does the target host have a predefined swap on a RDD?


Our EVA LUN snapshots confirm the daily size of deltas so the Veeam roll-back sizes are consistent with this.


Would appreciate any advice/observations especially if you've been running Veeam replication with Linux VMs.
Gostev
Chief Product Officer
Posts: 31460
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Re: Veeam replication and size of roll-backs

Post by Gostev »

Hello, yes, increments are done at block level (1024KB), so a single bit change in a 1024KB block marks the whole block as "dirty", and that block will be processed as part of incremental replication. Thus, defragmenting your VMs usually helps to reduce the size of incrementals, because disk changes become more physically consolidated. On fragmented VMs, when new files are created, they scatter over multiple parts of the disk, making more disk blocks "dirty" than on defragmented disks.

Unfortunately, I cannot really comment on exactly what makes so many changes on your VM disks. Probably swap and logs? I am not a big Linux guru, but I cannot think of anything else. You can try experimenting with moving swap to RDD; this may indeed help if the main reason for the changes is swap.
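
To illustrate the block-level behaviour described above, here is a minimal Python sketch. It is not Veeam's actual implementation: the function names and the write patterns are made up for illustration, and the only figure taken from the post is the 1024KB block size. It shows why the same amount of written data can produce very different incremental sizes depending on how scattered the writes are.

```python
BLOCK_SIZE = 1024 * 1024  # 1024KB, the block size quoted in the post


def dirty_blocks(writes, block_size=BLOCK_SIZE):
    """Return the set of block indices touched by (offset, length) writes.

    Hypothetical helper: any write, however small, marks every block
    it overlaps as dirty.
    """
    dirty = set()
    for offset, length in writes:
        if length <= 0:
            continue
        first = offset // block_size
        last = (offset + length - 1) // block_size
        dirty.update(range(first, last + 1))
    return dirty


def incremental_bytes(writes, block_size=BLOCK_SIZE):
    """Bytes an incremental pass would have to process for these writes."""
    return len(dirty_blocks(writes, block_size)) * block_size


# 100 writes of 4KB each (about 400KB of real change in both cases).
# Scattered: one write every 10MB, so each lands in a different block.
scattered = [(i * 10 * BLOCK_SIZE, 4096) for i in range(100)]
# Contiguous: all writes back to back at the start of the disk.
contiguous = [(i * 4096, 4096) for i in range(100)]

print(incremental_bytes(scattered) // BLOCK_SIZE, "blocks dirty (scattered)")
print(incremental_bytes(contiguous) // BLOCK_SIZE, "blocks dirty (contiguous)")
```

In this toy model the scattered pattern dirties 100 blocks (100MB to process) while the contiguous one dirties a single block, which is the effect fragmentation, swap activity and scattered log updates would have on roll-back size.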
drbarker
Enthusiast
Posts: 45
Liked: never
Joined: Feb 17, 2009 11:50 pm

Re: Veeam replication and size of roll-backs

Post by drbarker »

By default, most Linux distros tend to allocate more swap space than a Windows server - Linux will proactively swap stuff out to disk to make room for more disk cache. In general, this is a good thing (tm), but you'll get through more backup space with Veeam.

In practice, we haven't bothered optimising swap space in guests:
- Veeam has already taken 2.9TB of production data and deduped it down to 510GB.
- We tend to have more Windows than Linux boxes - we only get through ~25GB incr per day for ~100 VMs
- We're happy enough with things :-)
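
For comparison, the figures quoted so far in this thread can be put side by side with a few lines of Python. This is only back-of-the-envelope arithmetic: the helper names are made up, and treating 2.9TB as 2900GB (decimal units) is an assumption.

```python
def reduction_ratio(raw_gb, stored_gb):
    """Raw size divided by stored size after dedupe/compression."""
    return raw_gb / stored_gb


def daily_incremental_per_vm(incremental_gb, vm_count):
    """Average daily incremental size per VM, in GB."""
    return incremental_gb / vm_count


# Figures from the first post: 2 Linux VMs, 155GB raw, 79GB stored, ~4GB/day.
print(round(reduction_ratio(155, 79), 2))        # about 1.96
print(daily_incremental_per_vm(4, 2))            # 2.0 GB per VM per day

# Figures from this post: ~100 VMs, 2.9TB raw (assumed 2900GB), 510GB stored,
# ~25GB/day of incrementals.
print(round(reduction_ratio(2900, 510), 2))      # about 5.69
print(daily_incremental_per_vm(25, 100))         # 0.25 GB per VM per day
```

On these numbers the two Linux VMs generate roughly eight times more incremental data per VM per day than the mostly Windows estate, which is consistent with swap and scattered writes being worth a look.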
Gostev
Chief Product Officer
Posts: 31460
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Re: Veeam replication and size of roll-backs

Post by Gostev »

Oh, thank you indeed - always interesting to see real-world numbers!
tsightler
VP, Product Management
Posts: 6009
Liked: 2843 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler

Re: Veeam replication and size of roll-backs

Post by tsightler »

drbarker wrote: In practice, we haven't bothered optimising swap space in guests:
- Veeam has already taken 2.9TB of production data and deduped it down to 510GB.
- We tend to have more Windows than Linux boxes - we only get through ~25GB incr per day for ~100 VMs
Wow, are your VMs very similar? We don't see anything near this level of dedupe on our systems, but our ~50 VMs are all pretty different and many already contain compressed data.
Gostev
Chief Product Officer
Posts: 31460
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Re: Veeam replication and size of roll-backs

Post by Gostev »

Good dedupe ratios do require that VMs are made from the same template, as opposed to building them all from scratch.
drbarker
Enthusiast
Posts: 45
Liked: never
Joined: Feb 17, 2009 11:50 pm

Re: Veeam replication and size of roll-backs

Post by drbarker »

Yes, templates rock :D

The only thing I wish we did that we're currently not is the sdelete trick... I keep hoping vmware will update vmtools to let you automate a disk sdelete - it would certainly help with dedup backups & deduping storage arrays. I suspect we could get the ratio up a bit more if we tried :-)
tsightler
VP, Product Management
Posts: 6009
Liked: 2843 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler

Re: Veeam replication and size of roll-backs

Post by tsightler »

Our systems are deployed from templates, but they still diverge significantly from that point. We do see excellent dedupe and compression on systems that are similar but most of our systems are very different.
Gostev
Chief Product Officer
Posts: 31460
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Re: Veeam replication and size of roll-backs

Post by Gostev »

drbarker wrote:The only thing I wish we did that we're currently not is the sdelete trick... I keep hoping vmware will update vmtools to let you automate a disk sdelete - it would certainly help with dedup backups & deduping storage arrays. I suspect we could get the ratio up a bit more if we tried :-)
What would you say if this kind of VM optimization were a part of Veeam Backup VM processing (the process kicking in automatically after backup is completed for a specific VM, so if anything goes wrong you can simply roll back)?
fredbloggs
Service Provider
Posts: 47
Liked: never
Joined: Mar 18, 2009 1:05 am

Re: Veeam replication and size of roll-backs

Post by fredbloggs »

Gostev wrote: What would you say if this kind of VM optimization were a part of Veeam Backup VM processing (the process kicking in automatically after backup is completed for a specific VM, so if anything goes wrong you can simply roll back)?
Personally, I think it'd be a nice feature, but I don't find it too difficult to simply use the Windows scheduler to run an sdelete on a weekly basis.

I also store my pagefile (swap) on a separate vmdk and configure this disk so that it is independent and persistent. This way Veeam doesn't back up these file systems (it is unable to snapshot them), which helps to keep the size of the backup/replication jobs down. It's fine from a DR perspective, where the DR VM would get powered on and recreate this stuff anyway.
drbarker
Enthusiast
Posts: 45
Liked: never
Joined: Feb 17, 2009 11:50 pm

Re: Veeam replication and size of roll-backs

Post by drbarker »

Gostev wrote: What would you say if this kind of VM optimization were a part of Veeam Backup VM processing (the process kicking in automatically after backup is completed for a specific VM, so if anything goes wrong you can simply roll back)?
If Veeam could do it, that would be great - but...

My concern would be that I don't want 100 VMs to zero their disks at the same time - the effect on the backend disk would be 'dramatic' ;-) Ideally, anything that was tidying up disks would be virtual center aware - e.g. if SCSI queues started building up, back off a bit.
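
A rough Python sketch of the kind of throttling meant here, purely as an illustration. Nothing in it is a Veeam or VMware API: `optimise` stands for whatever zeroes a guest's free space, and `storage_busy` is a hypothetical callback that would report whether the backend is under pressure (for example, from queue-depth monitoring).

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_throttled(vm_names, optimise, max_concurrent=4,
                  storage_busy=lambda: False, poll_seconds=1.0):
    """Run optimise(vm) for every VM with two safeguards.

    - At most max_concurrent jobs run at the same time.
    - Before starting, each job waits while storage_busy() returns True.

    Both optimise and storage_busy are placeholders supplied by the caller.
    """
    def guarded(vm):
        # Back off while the (hypothetical) storage health check says "busy".
        while storage_busy():
            time.sleep(poll_seconds)
        return optimise(vm)

    # The pool size caps how many disks are being zeroed simultaneously.
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        return list(pool.map(guarded, vm_names))
```

With 100 VMs and `max_concurrent=4`, only four guests would be zeroing disks at any moment, and new jobs would hold off while the health check reports contention.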
Gostev
Chief Product Officer
Posts: 31460
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Re: Veeam replication and size of roll-backs

Post by Gostev »

fredbloggs wrote: I also store my pagefile (swap) on a separate vmdk and configure this disk so that it is independent and persistent. This way Veeam doesn't back up these file systems (it is unable to snapshot them), which helps to keep the size of the backup/replication jobs down. It's fine from a DR perspective, where the DR VM would get powered on and recreate this stuff anyway.
QFE, this is a great best practice.
Gostev
Chief Product Officer
Posts: 31460
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Re: Veeam replication and size of roll-backs

Post by Gostev »

drbarker wrote: My concern would be that I don't want 100 VMs to zero their disks at the same time - the effect on the backend disk would be 'dramatic' ;-) Ideally, anything that was tidying up disks would be virtual center aware - e.g. if SCSI queues started building up, back off a bit.
Yes, good point - this will need to be taken into account. Activities need to be coordinated across all "optimization" jobs as well as regular backup and replication jobs.
cby
Expert
Posts: 109
Liked: 6 times
Joined: Feb 24, 2009 5:02 pm

Re: Veeam replication and size of roll-backs

Post by cby »

People

Thanks for the helpful information.

Currently backing up 8 RHEL5 VMs, 1.1TB raw, de-duped to 715GB. Not too shabby given there are a number of large Oracle databases and millions of scanned archive images. Moving the swap to RDD or an independent vmdk is definitely worth investigating. The number of VMs will increase to 20+ by year-end, so extrapolating these figures produces very big and potentially unmanageable numbers with the existing infrastructure and working practices (read on...)

To make matters 'interesting' the databases are shut down and exported every night. Leaving aside the need to shut down and restart the Oracle databases, we have asked the DBAs to run hot backups (with RMAN, for example), but to no avail. The export file on the most heavily used VM is around 20GB, which probably explains the size of the deltas. I'm not familiar with the export process, but presumably any changes to the database, however small, that are reflected in the export file would have a profound effect on the underlying block structure of the disk as far as 'dirty' blocks are concerned. Shutting down and restarting the databases every night before backup must also have an impact.

Our VMs are deployed from templates but change significantly within a short period, especially the database servers.

I guess I knew the answer to the issue of large deltas -- a combination of database exports, non-independent/non-RDD swap, log files getting updated all over the place.

Defragmentation of Linux filesystems is a moot point -- we run a regularly scheduled defrag of our (non-VM!) Tru64 ADVFS system and achieve 5-10% I/O improvements. Not sure whether defragging or equivalent would achieve this on RHEL5.