Large VIB file on a small static server

J1mbo · Post by **J1mbo** » Mar 09, 2012 9:04 am this post

Anything that changes a block will add to the incremental. Size the VM and host RAM to avoid paging. Do you have any defrag utility? Or some indexing function? Another potential culprit could be NTFS last access time updates if the server is pre-2008 (under the key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem, set the DWORD value NtfsDisableLastAccessUpdate to 1, then reboot).

Unison · Post by **Unison** » Mar 12, 2012 3:30 am this post

Hi Jimbo
Disk defrag reports the drives are not badly fragmented and no there is no indexing happening on this server. All our servers are 2003 so i will try that key you mention. Can you pls give me any extra information on what exactly that key does or why it might help to resolve this large increment issue?

Also my backups are set to 'local target' - but seems as though another poster got good results from setting to LAN Target. I might try this but wanted to know if anyone is aware of a reason why you wouldnt want to set the jobs as LAN target even though the target is actually local to the veeam server? Could this impact speed or put any extra traffic on the network etc?

Thanks
Gav

Post by **Gostev** » Mar 12, 2012 9:56 am this post

LAN and especially WAN target use smaller blocks to track the changes in the virtual disk, which reduces the incremental backup size (typically 2x with WAN target compared to Local target). However, this may also reduce backup performance because there are many more blocks to process (this is actually explained right there in the UI in the option descriptions). This speed reduction will not be seen when the job is bound by WAN link speed anyway, but you will definitely see the performance drop with local backups.

Unison · Post by **Unison** » Mar 21, 2012 3:47 am this post

Hey guys,
I have tried setting the NtfsDisableLastAccessUpdate to 1 as per J1mbo's suggestion however that is still not helping to reduce the increment size on these servers. I am puzzled as to why a different imaging program that was backing these servers up for years only produced increments of a few hundred MB at most but veeam is popping out increments 6-7-8 gig in size.
Does anyone have any more ideas on how to pin point what is causing this? Really appreciate any help/advice.
Thanks

Post by **Gostev** » Mar 21, 2012 10:30 am this post

Could disk fragmentation or workload specifics result in those few hundred MB of changes dirtying 6-8 GB of blocks spread all around the disk?

Unison · Post by **Unison** » Mar 22, 2012 1:47 am this post

Thanks Gostev,
Tho not exactly sure of your question - i just cant see why one backup program only sees (for years) a few hundred MB of changes per increment but Veeam is seeing Gig's and Gig's.

I might try disabling CBT on these jobs to see if that helps however i dont believe that will make any impact on this issue because CBT is just telling veeam what has changed since the last backup....which i think veeam will find the same changes when it does the backup without CBT.....OR is it possible that CBT is reporting back changes that veeam and other backup applications would just not consider as a 'change' or necessary to backup?
Will veeam backup every block that CBT says has changed or will veeam still read from the blocks presented by CBT and make a decision on whether or not to backup those blocks?

Post by **tsightler** » Mar 22, 2012 2:50 am this post

So was the other "image based" product backing up via the hypervisor, or was it using some type of in-guest agent? If it was an in-guest agent my guess is it was using a very small block size. Veeam uses large blocks (1MB by default for "local target"), which means that a disk with many very small changes spread across the disk will show a large change rate. You can try using "WAN target", which uses 256K blocks, even if your disks are local, as what this actually sets is the smallest block size that Veeam will process when a change to the disk occurs. This can yield huge savings on systems where there are a lot of very small changes; however, that's still typically much larger than agent based image products, which generally use something along the lines of the NTFS cluster size, commonly 4KB. This is pretty well covered in other threads.
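To make the block size effect concrete, here is a rough Python sketch (not Veeam code). It assumes a hypothetical workload of 2,000 scattered 4 KB writes on a 40 GB disk, and assumes that every tracking block touched by a write is copied whole. The function name `increment_size` and the workload numbers are made up for illustration.

```python
import random

KB = 1024
MB = 1024 * KB

def increment_size(write_offsets, write_len, block_size):
    """Bytes a block-level backup would copy for the given writes.

    Every tracking block touched by a write is copied whole, so the
    result is (number of distinct dirty blocks) * block_size.
    """
    dirty = set()
    for off in write_offsets:
        first = off // block_size
        last = (off + write_len - 1) // block_size
        dirty.update(range(first, last + 1))
    return len(dirty) * block_size

# Hypothetical workload: 2,000 scattered, 4 KB-aligned writes on a 40 GB disk.
random.seed(0)
disk_size = 40 * 1024 * MB
offsets = [random.randrange(0, disk_size // (4 * KB)) * 4 * KB for _ in range(2000)]

for label, bs in [
    ("Local target (1 MB blocks)", MB),
    ("WAN target (256 KB blocks)", 256 * KB),
    ("4 KB granularity (assumed agent-style)", 4 * KB),
]:
    print(f"{label}: {increment_size(offsets, 4 * KB, bs) / MB:.1f} MB")
```

The same few megabytes of real change cost far more at 1 MB granularity than at 256 KB, and both cost more than a tool tracking at a few KB. Compression and dedupe are ignored here.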

Unison · Post by **Unison** » Mar 22, 2012 3:34 am this post

Thanks tsightler
I have just done some testing on some VM's and turned off CBT on their jobs - i expected the increments to take a bit longer but no, they are taking ages. Some increments taking over 1hr when with CBT they took a few minutes. HOWEVER the increment size is extremely small - granted there was not much time for change during the increment tests but all increments were well and truly small, all less than half the size of an increment with CBT.....and one was down to 50MB!!!! perfect increment size for a 'low activity' VM!
So my question in the last post still stands - can anyone help? The question being:

Will veeam backup every block that CBT says has changed or will veeam still read from the blocks presented by CBT and make a decision on whether or not to backup those blocks?

It seems that with veeam doing the backup on its own without CBT, the increments are smaller (but take a lot longer because veeam has to process the whole vm again) - could CBT be more 'sensitive' than veeam and when CBT is enabled, veeam actually just backs up anything and everything CBT tells it to (by marking more blocks as changed compared to what veeam would consider changed)?

Also, yes the other imaging product was agent based - symantec system recovery - i understand what is said about different block size processing of the different products and could see how this would have an impact on some servers. It doesnt make sense on these servers though because some of them are doing very (very!) little.
BUT, what i will do now is set these backup jobs to have CBT turned back on (so i get the speed back!!!) AND i will change them to a WAN TARGET so we get the smallest possible block size...I will then report back what speed penalty there is (thats why i didnt change from local to wan already because of the performance hit) and if the increment sizes improve.

If anyone has more info on why veeam might be able to produce a smaller increment on its own without using CBT compared to when it is using CBT i would be really interested to know.

Thanks guys

Unison · Post by **Unison** » Mar 22, 2012 5:43 am this post

POSSIBLE SOLUTION / CAUSE....

(Sry, long post but hopefully the pro's can shed some light)

I think i found the answer to the above question (Does veeam just backup everything CBT says has changed?)....Short answer....YES.

Through some testing today, i have found something undesirable (or maybe realised it rather) - which is probably the cause of my larger than expected increments.

One of these servers suffering from large increments has one 40gig drive. 22gig of that is used - when veeam does a full backup of this server the VBK file is about 7gig - a veeam increment for this server with CBT is normally around 500MB.
Today i copied an 8gig file to this server, deleted it, copied it over again and deleted it again - after this the 8gig file was now gone from this server and its used space was again 22gig. I then ran another increment of this server. Some of you might at this point see where this is going....
With no new data added to this server, the increment should be extremely small....but in this case, because CBT is enabled on this server, this next increment was 9GIG (which is even bigger than this servers base image!)......obviously this is because when i copied the 8gig file to this server, all those blocks changed, then i deleted it, then i added the 8gig file again more blocks changed.....at the end of this process, 9gigs worth of blocks had changed so when the next increment was run CBT told veeam that alllllllll these blocks had changed. Simple. But - no data was on these blocks....they had just been changed.
This was what i was trying to find out and it seems this is the case - CBT tells veeam that a block has changed, veeam just listens and then backs up that block, veeam doesnt do any further analysis (suppose there is an obvious answer here as backing up block level veeam cant actually see what is in the block so doesnt know if it holds data or not, it just knows that its changed). I suspect this is why the increments are so large, when a file is added then deleted or just deleted, blocks are changed and then because they are changed they are backed up - even if those blocks actually contain no data any more - like in this test, 9 gigs worth of empty blocks were backed up even though there was no data in any of the blocks - basically creating a huge increment with absolutely no real data in it.

Is it possible......when CBT hands veeam a list of all the blocks that have changed, is there any way that veeam could then process each block and test/analyse if it actually contains 'real data' - so that the backup is fast because only changed blocks are being processed BUT blocks which contain no data are NOT backed up? (or is my comment above about not being able to analyse the block level the limiting factor?).

I also suspect that this is why the veeam increments are smaller in size when you turn OFF CBT - because veeam is then processing the vm like other imaging products (although through the HV), not just listening to CBT about what blocks have changed. So without CBT your increments only contain new data so are small, but with CBT your increments not only contain new data but also possibly empty blocks which will blow out the size of your increments.

From testing - without using CBT the increments take just as long as the full backup, obviously because veeam has to process the VM each increment. Does anyone know why, without CBT veeam takes so long to process the VM and create a new increment? With agent based imaging apps, the entire server is processed with each increment - why can it do it in minutes but veeam takes the same amount of time as its full backup? I would like to turn off CBT to reduce the increment sizes BUT i cant because without CBT the increments just take too long. Both agent based imaging and veeam (without CBT) process the entire server for an increment but the agent based is so much faster - why does veeam take so long when its doing the same thing?

Thanks in advance for taking the time to read/respond

Post by **Gostev** » Mar 22, 2012 9:54 am this post

The size of the produced increment is going to be exactly the same with and without CBT, because the only difference is how we identify changed blocks (by using CBT data, or by reading each and every block and comparing its state to the previous state). As you can imagine, both methods will produce exactly identical results - but one of them will be real slow.
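As a toy illustration of the two detection methods described above (not how Veeam or VMware actually implement them), the Python sketch below models a disk that records written blocks in a set standing in for CBT, and compares that set with a full scan that hashes every block against the previous backup. `ToyDisk`, `block_hashes`, `changed_by_scan` and the 4-byte block size are all hypothetical.

```python
import hashlib

BLOCK = 4  # toy block size in bytes, chosen only to keep the example small

class ToyDisk:
    """Hypothetical disk model that records which blocks were written."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.cbt = set()  # stand-in for the hypervisor's changed-block list

    def write(self, offset, payload):
        self.data[offset:offset + len(payload)] = payload
        first = offset // BLOCK
        last = (offset + len(payload) - 1) // BLOCK
        self.cbt.update(range(first, last + 1))

def block_hashes(data):
    """One SHA-256 digest per fixed-size block."""
    return [hashlib.sha256(bytes(data[i:i + BLOCK])).digest()
            for i in range(0, len(data), BLOCK)]

def changed_by_scan(old_hashes, data):
    """Read every block and compare it with its state at the last backup."""
    new_hashes = block_hashes(data)
    return {i for i, (old, new) in enumerate(zip(old_hashes, new_hashes)) if old != new}

disk = ToyDisk(32)
baseline = block_hashes(disk.data)  # state at the previous backup
disk.cbt.clear()

disk.write(5, b"abc")    # stays inside block 1
disk.write(14, b"wxyz")  # straddles blocks 3 and 4

via_cbt = disk.cbt
via_scan = changed_by_scan(baseline, disk.data)
print(sorted(via_cbt), sorted(via_scan))  # → [1, 3, 4] [1, 3, 4]
```

In this toy model the two sets match because each write changes the stored bytes; the scan simply has to read the whole disk to find them, which is where the extra time goes. How a real product treats rewrites of identical data is outside what this sketch shows.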

Your testing shows difference in backup size because you are not doing it correctly. If you re-run the same job again soon, of course the incremental backup size produced will be smaller than with once-per-day run, because very little data is changed in this short time span. If you create 2 separate jobs for the same VM (with and without CBT) and run those on the same schedule, you will see that the size of the backup files they produce is the same.

Your scenario with adding and deleting file is also not valid. As I already posted in another topic yesterday, in real world server workloads, data is never just removed, but is always replaced with more data (immediately, or eventually). When was the last time that you removed a hard drive from any of your computers, because you did not need the disk space that you needed before? We only keep adding.

The real reasons for increments being large for some servers are well known, and were explained by Tom just before your two posts.

Post by **tsightler** » Mar 22, 2012 2:58 pm this post

I believe you may not be understanding exactly how Veeam works, as it works on a completely different principle from an agent based image backup solution. Agent based solutions effectively backup the "filesystem image", however, Veeam backs up the VM disk image, exactly as it exists, without regard to the filesystem (with one exception in V6 with regards to the Windows swapfile). If you create a 40GB virtual disk, add 20GB of data to it, and then delete it, then take a Veeam backup, the Veeam increment will be 20GB (assuming no compression). The idea here is that, when you restore this VM, it will be an EXACT image of what that VM looked like when it was backed up, including any deleted data as, even if the data has been deleted, the blocks themselves did change. This actually offers some advantages. For example, I was once able to use a Veeam backup to recover a file that had been created and deleted on the same day by using an NTFS "undelete" program on a Veeam backup from a week earlier. This is something that would have been simply not possible with a traditional backup product. Also, a Veeam backup of a powered off VM is a forensically accurate image of the system, which can be very valuable for security related analysis.

Certainly, the disadvantage of this could be that a system which does have a lot of files being created and removed will show large increments as Veeam will literally backup every block that has changed on this disk. For example, a server which processes image files or archive files using a drop folder may see a very large increment. The only real workaround for this is to use some tool like "sdelete" that "zeros" the data whenever a file is deleted. Veeam will still see these as changed blocks, but since they will contain zeros it will not grow the incremental backup.
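The create/delete/zero behaviour can be sketched in Python with a toy disk. This is only an illustration under stated assumptions: the 64 KB block size is arbitrary, `increment_bytes` is a made-up helper, and zlib compression stands in for whatever a real backup product does with all-zero blocks.

```python
import os
import zlib

BLOCK = 64 * 1024  # toy tracking block size (an assumption, not a Veeam value)

def increment_bytes(disk, dirty_blocks):
    """Compressed size of all dirty blocks, a stand-in for increment size."""
    payload = b"".join(bytes(disk[b * BLOCK:(b + 1) * BLOCK])
                       for b in sorted(dirty_blocks))
    return len(zlib.compress(payload))

disk = bytearray(64 * BLOCK)  # 4 MB toy disk, initially all zeros
dirty = set()

# Copy a 1 MB "file" of incompressible data onto blocks 8..23.
file_blocks = range(8, 24)
for b in file_blocks:
    disk[b * BLOCK:(b + 1) * BLOCK] = os.urandom(BLOCK)
    dirty.add(b)

# Deleting the file only touches filesystem metadata (not modelled here):
# the old contents stay in the blocks, which remain marked as changed.
after_delete = increment_bytes(disk, dirty)

# Zeroing the freed space, as an sdelete-style tool would, rewrites the
# same blocks. They are still "changed", but now hold only zeros.
for b in file_blocks:
    disk[b * BLOCK:(b + 1) * BLOCK] = bytes(BLOCK)
    dirty.add(b)
after_zeroing = increment_bytes(disk, dirty)

print(f"increment after delete:  {after_delete} bytes")
print(f"increment after zeroing: {after_zeroing} bytes")
```

The number of changed blocks is identical in both cases; only their contents differ, which is why zeroing helps in this model even though the block list does not shrink.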

Unison · Post by **Unison** » Mar 22, 2012 11:47 pm this post

tsightler - thank you! I really appreciate your response - thanks for delivering it with 'teaching/sharing' in mind rather than arrogance.
You're helping me to see this more clearly. It makes sense why a veeam full backup is slower than the old agent based imaging we used, because for the FULL veeam is looking at every block of the vm regardless of whether that block contains data or not - i.e. even if your disk has 200gig of 'free space' the veeam backup process will still look at every single block in that 200gig, whereas a 'file system' aware imaging app will recognise the space as empty....hence the longer times for veeam backups.
It doesnt look like i am going to be able to do anything about these large increments - on top of the fact a larger block size is being used (also tried WAN Target for the smaller block but that didnt reduce the size by much) i think that the large increment size issue is compounded by the fact that the increments will contain 'changed blocks' that actually contain no data. Thanks for pointing out the benefits of this - the fact you have an exact forensically viable disc copy AND you can recover deleted files from restore points because only the addressing is removed when a file is deleted, not the actual data. I would rather only have increments that contain new data than have these benefits but you cant have everything and i suppose i chalk this one up to new requirements/processes of our new VM/backup environment.
Because of dedup and better compression, i was actually expecting smaller backups for all my VMs compared to the old backup system on the physical systems - and this is true for the FULL backups but not with the increments. With the veeam Fulls they are a lot smaller than the fulls with the old software so i save heaps of space there but then when the increments kick in, 1-2 days into an imaging set i am already using more backup space than a full 7 day imaging set with the old solution because the increments are so much smaller with the old backup product/technology.

Also thanks Gostev. As mentioned in my post, my testing was fairly rough + quickly done and i realise it was not exactly scientific but it did seem to show me different results which is why i fired off the question to you guys who know a tremendous amount about how veeam and the backup process work. You did answer my question though, so thank you - confirming that veeam will recognise the exact same 'changed blocks' on its own as it would with just listening to CBT (essentially the increment sizes will be exactly the same size with or without CBT - so you have helped me remove suspicion from the CBT process). I didnt see your other post, sorry - but can you see how increments can be blown up as in my test simply by adding a file to a server then deleting it before an increment - you have deleted the file but because the blocks have changed, the increment will grow by the size of the file you added/deleted even though that file doesn't exist any more (i.e. if no compression etc is used). I didnt realise this so if someone else finds this post because they too are noticing huge increments - perhaps they will realise that even if a file is placed on a vm then deleted before a backup, that file will have caused changed blocks so those blocks will be backed up and add to the size of your increment. If you have a system where files are added/removed a lot, this will likely be your cause for large increments - would you consider this one of the 'real reasons' increments are huge?

So far i believe the reasons mentioned to be:
1- high activity discs (no help for you in this case)
2- Regularly defragging your disc (will cause huge amount of changed blocks so huge increments - dont defrag regularly in this case or defrag only right before your next full)
3- lots of small changes to lots of small files - i.e. resulting in full 1MB blocks being backed up for even a small 50KB file change (switching to WAN target may help you in this case)
4- NTFS last access update reg key (I had this set but didnt seem to help at all so probably wasnt causing issues in my environment)
5- If files are added to a vm then removed - even those now 'empty' blocks will be backed up (even though we know they are not technically empty)

I suspect reasons 5 and 3 are whats causing the large increments for me - even though switching to 256k blocks (down from 1mb blocks) via setting the WAN TARGET didnt really help. I cant do anything about 'empty' blocks being backed up and really a changed block is a changed block so it has to be backed up by veeam. Setting the backup job to WAN target didnt really produce any slower backups or any performance hits (not sure why that is? shouldn't i see performance problems going from local target to WAN target?) and didnt result in much smaller increments.

I think others who become new to veeam and discover larger than expected increments should be able to find all the reasons here and possible tweaks so thanks to all who have responded so far.

ChrisL · Post by **ChrisL** » Apr 09, 2012 7:15 pm this post

[merged]

Hi all,

Interesting one for you, it's not specifically a problem with Veeam, that bit is working just fine, but it's sort of related.

Long story short, got a Server 2003 VM backing up in its own Veeam job set to use reverse incrementals, so it creates a .vbk and corresponding .vrb's. The curious thing is that the vrb files are consistently quite large, regardless of how much the VM is used. It's a main file server, so I would expect some pretty heavy usage but since we are a school, you would expect the usage (and hence the rollback sizes) to drop right off during the holidays, as we are right now, but the rollbacks are still about the same size as normal.

Veeam reports the VM size as 900GB and the rollbacks are usually about 60-70GB, whether the system is used or not. I'm not seeing any errors in the reports, CBT is working fine as far as I can tell, and this is the only job of 22 jobs that is behaving in this way - its not actually failing, just taking a long time to complete.

So, im sure Veeam is working fine, it's presumably something to do with the VM, my question is what might it be doing that is causing the large amounts of data to change? There's no SQL on the server and as far as I can tell there's no background defragging going on. Alternatively, is there any way that I can see what is actually being backed up in the rollback file which might give a clue as to what the actual differences are?

Historically the VM was P2V'd from a physical box if that's relevant in any way..

Any thoughts welcomed..

ChrisL · Post by **ChrisL** » Apr 09, 2012 9:10 pm this post

Thanks for merging, I knew I should have checked the forums first, d'oh!

I'll withdraw the last question then, about whether it's possible to look 'inside' the rollback. I realise now that that's a silly question that shouldn't have been asked; it seems so obvious now, thinking about how Veeam actually works at a VMDK block level! The first bit remains though, I wonder what exactly the VM is actually doing during the day that can cause so many block changes that are then being recorded by VMware and hence captured by Veeam..

Post by **Vitaliy S.** » Apr 09, 2012 9:56 pm this post

Hi Chris, actually it might be anything starting from last access time updates up to daily antivirus checks or file relocation. Please look through the first pages of this topic for more info. Hope this helps!

Unison · Post by **Unison** » Apr 10, 2012 1:31 am this post

I have had some positive results since my last post on this issue above…

Based on some of the input in this topic I have gone back to my VM’s/Veeam setup and changed/tried a few things – my increments are MUCH smaller now! What worked for me may not work for you, but if you’re seeing large increments – you could try what I did.

It seems that ‘reason 3’ in my last post above was the main cause for my issue with large increments. The issue is primarily due to the large block size of 1mb used by veeam when you specify the target as ‘local’ on your backup jobs – I could probably word that better so please do not interpret that as me ‘blaming’ veeam for this issue….I am not. I am basically saying, if veeam used a smaller block size…..this wouldn’t be much of an issue at all – if this is not too much of a ridiculous question, I wonder if someone can please provide a clear answer: why does veeam use such large block sizes (even 256k for WAN target is very large but is the smallest veeam can use)? Why not something much smaller still like 32 or 64k to get closer to agent based imaging increment sizes?

What IS working for me now though…
I trawled through the discs on all my VM’s and did a massive clean out – clearing old files, archiving others, deleting temp files etc etc…some of these servers are citrix servers so I got rid of temp internet folders and cookies where some of those folders had tens of thousands of files in them of no more than a few KB each!...after reclaiming some gigs I defragged all discs – everything looked a bit ‘cleaner’ from the defrag report.
I then did some backups with veeam – and my increments, were smaller! Clearing out the discs, removing unnecessary fragmented files and then doing a fresh defrag certainly appears to have helped. Increments have pretty much halved in size just by doing this – but obviously didn’t help much on servers that did not need a defrag and did not have much data ‘cleaned’ off.
Now to the block size issue – as in above posts, I have now also changed ALL my backup jobs and set their targets as ‘WAN Target’. This is so I can get the smallest possible block size with veeam of 256KB….now im getting super small increments again, pretty damn close to what I was getting with agent based imaging of the physical servers. With the tidy/defrag and the WAN Target setting, my increments have gone from being about 8-10gig on the worst end to now being around 2gig-500mb!! A massive improvement!

I will complete defrags on a more regular basis now (just before a new image base) and I will apply this WAN Target setting to all other VM’s as they come on board. There was talk of a performance issue or overhead from using the ‘WAN Target’ setting over the Local Target setting however I am seeing no such performance issues and I have been running like this for a week now. In most cases the jobs set as WAN target were finishing before/quicker than those jobs that were set as local target – Im not sure if this is weird, but it seems weird to me….i would expect the ‘wan target’ jobs to take longer – but they do not.

Appreciate all the input on this topic to date – thanks! Will keep an eye on this topic and eager to see if anyone can answer my queries above.
If you are having issues with your VM’s and Veeam creating large increments – hopefully you are able to use some of the information in this topic to get veeam and your vm’s producing acceptable increment sizes!

ChrisL · Post by **ChrisL** » Apr 10, 2012 12:11 pm this post

Thanks Vitaliy, I'll have a poke around and see what might be happening. The good news is that the server, being an old 2003 'box', is due to be decommissioned soonish and probably split up into a few smaller new 2008 VMs. Hopefully the new ones will play a bit nicer!

Post by **dellock6** » Apr 10, 2012 5:42 pm this post

Gav, reducing the block size Veeam is processing will surely improve deduplication ratio, but on the other side it is going to increase the burden on the proxies' CPUs: 256k is 4 times smaller than 1MB, so you will have 4x the number of blocks to compare in order to dedupe. Going even smaller on blocks will surely require much more CPU power, and probably memory/cache to store hashes for dedup.
Probably in the future, taking advantage of cpu constantly increasing their power, Veeam will be able to reduce this value.
My 2 cents.

Unison · Post by **Unison** » Apr 11, 2012 2:40 am this post

Thanks for the response dellock6.
Basically the lowest block size is set by veeam at 256k (using WAN Target) because veeam thinks that any smaller block size would be too much for modern processors to handle and would overload veeam proxies during dedup/compression?
You might not exactly know why veeam have set the lowest block size at 256k but is that basically what your thinking is the reason for the limit?

During my tests, I didn’t see any performance hit on my veeam server using the different ‘block sizes’ of WAN (256k) and Local (1MB) targets – which is also the proxy. Its CPU is an i7 2700K 3.5ghz 8 cores – running W7 pro. Due to the fact I see no penalty in backup time using WAN or Local target (to my surprise I might add) – I now set all my veeam jobs to use WAN target so I get the smallest possible increment size.
It seems that with this processor (which is not the best, not server grade, not the fastest, not very expensive) we could all expect that smaller block sizes would not over load our veeam servers or blow out our backup windows….even when you say something scary like a 256k block size will take 4x more ‘effort’ to process compared to 1MB blocks…..in the real world, it doesn’t translate and doesn’t seem to impact processing time at all…..even with a desktop processor like this one I am using (obviously if your using a lower grade of processor than what I am, you might indeed see processing issues the smaller you set the block size).

Veeam lets us choose a block size from 256k to 1MB using the target type drop down in the job setup – would it be possible for veeam to add a few more items to this drop down list so we can drop the block size even further down past 256k – say adding a 128, 64 and 32k option? That may be massively over simplifying what would be required – but I wonder if you or one of the veeam team could comment on the possibility of allowing us to choose a smaller block size than 256k?
We could then test our veeam servers against the different block sizes and determine what gives us the best increment size without causing a processing overload on our veeam servers – this would allow us to choose a block size that takes more advantage of our individual processing capabilities and allow us to get maximum use out of our storage systems.
Whats the harm in letting us choose a block size lower than 256k? If it can be done, can it be added to a future patch/release?

Post by **tsightler** » Apr 11, 2012 3:25 am this post

I would have to say that yes, you are probably making it a little too simple.

Since you didn't really provide any details of the VMs you were testing against it's pretty much impossible to know for sure, however, my guess is that your VMs are generally small, and that your testing involved running a limited number of jobs at a time, with minimal restore points. Backing up a 2TB VM using 256KB blocks would naturally have a much larger impact than a 100GB VM as the 100GB VM would potentially need 400,000 hashes, while the 2TB VM would need 8 million. On the other hand, a 2TB VM with 1MB blocks would only need 2 million. All those hashes have to be stored in memory, and compared against as the job runs to determine duplicate blocks. As the VM size gets bigger and bigger, the more impact the smaller block size will have. Imagine the impact of 32K blocks on the 2TB VM? Suddenly it would potentially need 64 million hashes, which would certainly be impactful.
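The hash counts quoted above follow from dividing the disk size by the block size. A short Python check, assuming binary units (1 KB = 1024 bytes) and one hash per block; `hash_count` is just an illustrative helper:

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB
TB = 1024 * GB

def hash_count(disk_bytes, block_bytes):
    """Upper bound on block hashes: one per block, rounding up."""
    return -(-disk_bytes // block_bytes)

cases = [
    ("100 GB VM, 256 KB blocks", 100 * GB, 256 * KB),  # 409,600
    ("2 TB VM, 256 KB blocks", 2 * TB, 256 * KB),      # 8,388,608
    ("2 TB VM, 1 MB blocks", 2 * TB, MB),              # 2,097,152
    ("2 TB VM, 32 KB blocks", 2 * TB, 32 * KB),        # 67,108,864
]
for label, disk, block in cases:
    print(f"{label}: {hash_count(disk, block):,} hashes")
```

These are worst-case figures for a fully allocated disk; the memory needed per hash is not stated in this thread, so it is left out.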

You then also need a storage technology that can store all those small blocks efficiently, track the blocks across incrementals and reverse incrementals, and do so efficiently for things like file level recoveries, instant restores, surebackup, etc. The smaller the block size the more likely that the file will become fragmented over time, thus adding another factor to performance.

That being said, if your VMs are fairly small, using WAN mode generally will not have a significant negative impact on performance, at least for backup, but it is a balancing act that has many more variables than your simple test addresses, specifically around restore times and capabilities. Even expensive hardware dedupe appliances have a very negative impact on features like Instant Restore and Surebackup due to the overhead of rehydrating data from their small block dedupe.

In other words, it's not just about about reducing the block size, but about doing so without impacting backup performance, restore performance, and scalability. Right now, 256K seems to be the lower bound that makes sense when balancing those requirements, but certainly I wouldn't rule out a smaller lower block size in the future.

Unison · Post by **Unison** » Apr 11, 2012 4:24 am this post

Thanks Tom, lots of great information and detail in there – appreciate your insight.
We will have 9 VMs when I am finished and they mostly are pretty small….around 100-200gig each with just one being around 600gig….probably explains the reason why I don’t see a performance hit when I use the smaller or larger block size because we are not playing around with large enough numbers to reveal any of the scalability/performance/restore problems you mention.
I can see your point when you start to talk about TB’s….you clearly show where the problems will come in on the larger end of the scale – but on the lower end of the scale where a smaller number of VMs are involved and smaller VM sizes….even the 256k block size might be a bit much and a bit inflexible to the ‘needs’ of the smaller guys. We are an SMB Accounting firm with 2 offices and 60 staff….not terribly small and I am sure there are many businesses around this size or even smaller using Veeam. A smaller block size selection will help us, not the big guys ……but I absolutely agree that based on the potential problems as things scale up, a smaller block size is not practical.
1MB block is a good size for large systems to balance out the performance/risk issues…..however 1MB for smaller shops….even 256k for smaller shops really might not be low enough…..but as you said you wouldn’t rule out a smaller block size option in future, I too hope that comes about to recognise the differences in this section of the Veeam market and user base.

With the issues regarding recovery – I begin a new base and backup set every week, so that the set does not become too ‘complicated’. This also helps with fragmentation issues because last week’s images are cleared off the main drive and the new set is laid down basically on a clean drive each time and only has increments added to it for one week, minimising the fragmentation issue and taking complication out at restore time. Do you think that going to new sets more regularly will improve some of the potential recovery problems when using smaller block sizes because of the short set window, blocks not needing to be tracked very long etc etc?

Also with certain recovery options like instant restore, sure backup and file level recovery with smaller vm sizes and smaller block sizes – is it very common to see a recovery fail because the ‘system’ has trouble processing a larger number of blocks during the hydration process? I have not seen it ever happen – but then again, I have not been using veeam for long (note, the only dedup I am using is what veeam does – allowing only veeam dedup in the backup process removes a layer of complexity/potential-issue?).

As you describe, the risk and balancing act becomes more and more complicated as you scale up – like with most things. However hopefully veeam can provide lower block sizes for the smaller scale setups that would be impacted more by a larger block size as opposed to a smaller block size.

In smaller setups, a larger block size means larger increments and not much else – but smaller block size means smaller increments with no real processing performance impact, no real recovery impact, no real impact to backup window.

In a larger setup, a small block size means not much in the way of saving space (because you will have heaps anyway ), it means huge excess processing demands, extra requirements on storage, longer recovery times, flaky recoveries etc etc…lots of bad stuff.

In some scenarios the smaller block sizes would be useful, in some they would not – but if possible, hopefully veeam in future offer some smaller selections. I’m sure many people have been surprised by their new large increments after moving from physical to virtual servers….some might speak up, like here in this post, others might just accept it and not say anything…..possibly though, a lot of this ‘surprise/shock’ could be avoided with the use of a smaller block size for smaller VM’s (not that large block size is the only cause for large increments – but I think it would be a key cause most of the time).
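A minimal sketch of why block size matters so much for increment size, assuming a simplified model in which any tracking block touched by a write is copied in full. The disk size, write size and number of writes are made-up values, and this is not a description of Veeam's actual change tracking.

```python
import random


def increment_bytes(write_offsets, write_size, block_size):
    """Bytes a block-level incremental would copy for the given writes.

    Any tracking block touched by at least one write is copied in full.
    """
    dirty = set()
    for off in write_offsets:
        first = off // block_size
        last = (off + write_size - 1) // block_size
        dirty.update(range(first, last + 1))
    return len(dirty) * block_size


KIB, MIB, GIB = 1024, 1024 ** 2, 1024 ** 3

rng = random.Random(0)   # fixed seed so the run is repeatable
disk_size = 100 * GIB    # hypothetical VM disk
write_size = 4 * KIB     # hypothetical small write
# 20,000 scattered, 4 KB-aligned writes: at most about 78 MB of real change
offsets = [rng.randrange(disk_size // write_size) * write_size
           for _ in range(20_000)]

for label, block in [("1 MB", MIB), ("256 KB", 256 * KIB), ("4 KB", 4 * KIB)]:
    size = increment_bytes(offsets, write_size, block)
    print(f"{label:>7} blocks -> increment ~{size / MIB:,.0f} MB")
```

In this model the same small amount of real change produces a much larger increment at 1 MB than at 256 KB, simply because each scattered write drags a whole block along with it.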

Post by **dellock6** » Apr 11, 2012 7:22 am this post

I think the fact is that disks always tend to grow in size and performance at the same price, so it is really easier to buy bigger disks and let the backup grow in size. Think about SATA disks: two years ago 1TB was the biggest size, now we have already reached 3TB. Using larger storage for backup gives you more space for retention, and can let Veeam run faster by allowing a bigger block size.

Unison · Post by **Unison** » Apr 11, 2012 10:47 pm this post

I agree that there is that argument too – ‘discs are so cheap…just buy more’. Without getting right into that and arguing against it, I think that really there is a possibility veeam could be ‘improved’ here by getting it to use the space that it has available now more efficiently (by allowing us to select a wider range of block sizes)……rather than just adding more drives as it sucks up huge amounts of space due to large block sizes.
Making it possible to select smaller block sizes (smaller than 256k) would allow many setups to overcome the huge increment size problem without impacting other areas (like cpu performance, backup speed, recovery time, recovery complexity etc etc).
Those with big systems obviously wouldn’t get much benefit from using the smaller block sizes….but those of us with a smaller number of servers and with servers of only a couple hundred gig each are going to see much benefit.
I was able to massively reduce my increment sizes across all servers and that was largely due to the fact I was able to drop down to a block size of 256k from 1mb without any performance or speed issues – my increments were touching 10gig each….NOW they are back down to pre veeam days of between 2gig and 500mb. If I am taking several snap shots a day…..those massive increments will soak up all my storage in less than a week…..whereas with a more efficient (smaller) block size, the same amount of storage will allow me to retain months of backups!
Hopefully veeam look at adding some smaller block size selections in a patch or future release – I think it’s pretty clear their product could do it but it will be up to us to select the appropriate block size to fit our performance, restore and storage capabilities.
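The retention argument above can be checked with some quick arithmetic. This is only a sketch: the 10 GB, 2 GB and 0.5 GB increment sizes are the ones quoted in this post, while the 300 GB of space left for increments and the 4 runs per day are assumptions made up for illustration.

```python
def days_of_retention(space_gb: float, increment_gb: float,
                      runs_per_day: int) -> float:
    """Days of increments that fit in the given space (whole increments only)."""
    increments = int(space_gb // increment_gb)
    return increments / runs_per_day


SPACE_GB = 300      # hypothetical space left for one server's increments
RUNS_PER_DAY = 4    # hypothetical: "several snap shots a day"

for inc_gb in (10, 2, 0.5):  # increment sizes quoted in this post
    days = days_of_retention(SPACE_GB, inc_gb, RUNS_PER_DAY)
    print(f"{inc_gb:>4} GB increments -> about {days:.1f} days of retention")
```

With these assumed values the 10 GB increments last about a week, while the 0.5 GB increments stretch the same space to several months.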

Anton Gostev, you’re a veeam product manager and have contributed in this discussion – is there any possibility that smaller block sizes could be added to the selection list to help smaller shops with this large increment issue? It seems like a small change which could make a big difference to many in your SMB market.

Post by **Gostev** » Apr 11, 2012 11:01 pm this post

No, smaller block sizes are not possible for a number of technical reasons, as well as backup performance considerations and unacceptable memory requirements for the data mover agent process. Tom has a good summary above.

Smaller increments for archival purposes, on the other hand, are achievable.

Unison · Post by **Unison** » Apr 11, 2012 11:20 pm this post

Thanks Tom for all your input here - and thanks Gostev for providing that final answer on this possibility.

Anton, not sure what you mean here though "Smaller increments for archival purposes, on the other hand, are achievable." - can you pls explain what you are referring to? There are other ways besides what i have done to get smaller increments?

Post by **Gostev** » Apr 11, 2012 11:32 pm this post

Sorry, I meant to say "possible in future". This will require some new functionality in our product... which we are considering to be high priority for some unrelated reasons anyway.

Unison · Post by **Unison** » Apr 11, 2012 11:37 pm this post

great, thanks for clearing that up

Look forward to seeing the product evolve and backups becoming as small as possible.

Thomy · Post by **Thomy** » Apr 16, 2012 1:45 pm this post

Hello

I can confirm this behaviour; I have the same problem with Windows VMs. My Linux-based VMs are backed up fine, with about 5GB of incrementals daily for 70 VMs, but I have about 50GB daily for my 50 Windows VMs.
I am still searching for a solution (Veeam 6.0 Patch 3 on ESXi4.1U2 with vCenter Server 5.0).

Best regards

Thomas

Unison · Post by **Unison** » Apr 17, 2012 12:52 am this post

Hey Thomy,
Your 50VMs are generating 50GIG worth of increments (across all of them) each day? How long have you been using Veeam? What were you using before veeam to backup and how big were the increments with it? If it was agent based physical imaging as discussed above - it may relate to the fact veeam uses a MUCH bigger block size than agent based imaging.

Have you tried any of the suggestions in above posts to see if they help you?
# Seeing if your VM discs are badly fragmented
# Not running defrags too regularly and only running a defrag after the last image of a set and just before the first image of a new set.
# Changing your Target Type to WAN Target in the backup jobs

Be interested to see what you find...

ChrisL · Post by **ChrisL** » Apr 22, 2012 5:04 pm this post

Hi all, little update for the knowledge base from here. We were finding we had large rollbacks on a relatively static file server but these have dropped off since running a defrag on the server. Previously we had rollbacks of about 60-70GB even during quiet times, now we have rollbacks of 20GB during busy times. On top of this, the backup time for the job (we have one job per VM) has dropped from 6+ hours to just over 2 hours - not bad for a 1.2TB VM!

For the record, I defragged both the system disk and the data disk, but didn't run a job between doing them, so I don't really know which one made the biggest difference. It's possible that the large rollbacks were actually being caused by the system disk, which dropped from ~30% fragmented to 0% fragmented.

Either way, defrag has made a huge difference here and should be recommended for all.

The bad bit however is that we had to pretty much delete the backup chain and start a new one from scratch. The first job after the defrag obviously made a huge rollback, as you would expect, and this together with the large full VBK file pretty much filled the repository. Deleting everything and running a new full has obviously resolved this, but with a slightly scary few days while we were a little less protected than we would have liked - the new full took about 36 hours to complete so we were dependent on the tape backups if anything had happened.

A thought for the Veeam team.. Maybe a mention in the installation guide to include the advice to run a defrag on each server before running the first full on it. Without fully understanding how Veeam processes the backups (which you may not during the initial install since you are new to the technology) it doesn't seem like an obvious thing to do, but it clearly helps hugely to start with a clean snapshot.
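One plausible explanation for the defrag effect, offered as an assumption rather than anything confirmed in this thread: on a fragmented disk, newly written data lands in many small scattered gaps, and each gap can dirty a separate tracking block. A minimal sketch with made-up sizes and an extreme, artificial layout:

```python
def dirty_blocks(extents, block_size):
    """Count tracking blocks touched by a list of (offset, length) extents."""
    dirty = set()
    for offset, length in extents:
        first = offset // block_size
        last = (offset + length - 1) // block_size
        dirty.update(range(first, last + 1))
    return len(dirty)


KIB, MIB = 1024, 1024 ** 2
BLOCK = 1 * MIB        # 1 MB tracking block, as discussed in this thread
DATA = 64 * MIB        # hypothetical amount of newly written data
CLUSTER = 4 * KIB      # hypothetical file system cluster size

# Case 1: free space is contiguous, so the data lands in one extent.
contiguous = [(0, DATA)]

# Case 2: free space is badly fragmented; every 4 KB cluster lands in a
# different 1 MB region (an extreme, made-up layout).
fragmented = [(i * BLOCK, CLUSTER) for i in range(DATA // CLUSTER)]

print(dirty_blocks(contiguous, BLOCK))   # 64 blocks, i.e. a 64 MB increment
print(dirty_blocks(fragmented, BLOCK))   # 16384 blocks, i.e. a 16 GB increment
```

Real disks sit somewhere between these two extremes. The same model also explains the huge first rollback after the defrag itself, since moving data around changes a large number of blocks in one go.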
