Large amount of data change every 28 days

by aharvey » Mon Dec 01, 2014 7:05 pm

I have a Veeam job set up to do a daily incremental backup of several servers for an offsite backup. One of the servers in the list is our main file server. On average, the changed data for this server is around 5 GB per day. But every 28 days the changed data explodes, and on that day the backup file is between 350 and 400 GB. As a result, it takes a couple of days to get the backup file offsite over our WAN connection. The file server is running Server 2012. I have looked at scheduled jobs to see if anything would be causing data to change, and haven't found anything. Defrag is not running, I don't see anything in the event logs, and nothing else seems to coincide with the 28-day schedule.

I'm looking for any advice or tips on finding out what data is changing and why.

I thought about looking at what the incremental file contains to see what changed, in case that would point me in a helpful direction, but Veeam appears to show all files on the server whether or not they were backed up as part of that specific incremental backup job. There may be a different way to view the .vib file so that it shows only what it specifically contains, but since this is a block-based backup, that might not be possible, or even useful.

Any other insight into why so much data is changed every 28 days?
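
For a rough sense of why the spike hurts, the transfer time can be estimated from the backup size and the link speed. The sketch below is only illustrative: the thread does not state the WAN speed, so `LINK_MBPS = 20` is a hypothetical value, and the sizes are the approximate figures from the post.

```python
def transfer_hours(size_gb: float, link_mbps: float) -> float:
    """Hours needed to send size_gb (decimal gigabytes) over a link_mbps link."""
    bits = size_gb * 8 * 1000**3              # gigabytes -> bits
    seconds = bits / (link_mbps * 1000**2)    # megabits per second -> bits per second
    return seconds / 3600


LINK_MBPS = 20  # hypothetical WAN speed; not stated in the thread

for label, size_gb in [("typical day", 5), ("28-day spike", 400)]:
    print(f"{label}: {transfer_hours(size_gb, LINK_MBPS):.1f} h")
```

At that assumed speed the 400 GB increment needs roughly 44 hours, which is in line with the couple of days described above, while a typical 5 GB increment finishes in well under an hour.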
aharvey
Novice
 
Posts: 8
Liked: 1 time
Joined: Fri Apr 08, 2011 2:47 pm
Full Name: Aaron Harvey

Re: Large amount of data change every 28 days

by foggy » Mon Dec 01, 2014 8:20 pm 2 people like this post

Is someone uploading a huge file (e.g. a monthly personal backup) to this server?
foggy
Veeam Software
 
Posts: 15303
Liked: 1133 times
Joined: Mon Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson

Re: Large amount of data change every 28 days

by Gostev » Mon Dec 01, 2014 8:50 pm 2 people like this post

I would just log all write I/O activity on the file server with Process Monitor on that day, before the big backup runs, and then review it offline. There has to be a huge amount of writes going on; there is no magic.
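
The offline review suggested above could be sketched as follows. This is a minimal example, assuming the Process Monitor trace was exported to CSV with the usual `Operation`, `Path`, and `Detail` columns and that `WriteFile` rows carry a `Length: N` entry in `Detail`; the sample rows, process names, and paths are made up for illustration.

```python
import csv
import io
import re
from collections import Counter

# Assumed format of the Detail column for WriteFile rows, e.g. "Offset: 0, Length: 4,096"
LENGTH_RE = re.compile(r"Length: ([\d,]+)")


def bytes_written_by_path(rows):
    """Sum the WriteFile lengths per path from Process Monitor CSV rows."""
    totals = Counter()
    for row in rows:
        if row.get("Operation") != "WriteFile":
            continue  # ignore reads, registry activity, and so on
        match = LENGTH_RE.search(row.get("Detail", ""))
        if match:
            totals[row["Path"]] += int(match.group(1).replace(",", ""))
    return totals


# Hypothetical sample standing in for a real export.
SAMPLE_LINES = [
    r'"Time of Day","Process Name","PID","Operation","Path","Result","Detail"',
    r'"1:00:00 AM","example.exe","100","WriteFile","D:\data\a.bin","SUCCESS","Offset: 0, Length: 4,096"',
    r'"1:00:01 AM","example.exe","100","WriteFile","D:\data\a.bin","SUCCESS","Offset: 4,096, Length: 8,192"',
    r'"1:00:02 AM","other.exe","200","ReadFile","D:\data\b.bin","SUCCESS","Offset: 0, Length: 1,024"',
    r'"1:00:03 AM","other.exe","200","WriteFile","D:\data\c.bin","SUCCESS","Offset: 0, Length: 512"',
]

totals = bytes_written_by_path(csv.DictReader(io.StringIO("\n".join(SAMPLE_LINES))))
for path, written in totals.most_common(5):
    print(f"{written:>12,d}  {path}")
```

For a real trace, the same function could be fed a `csv.DictReader` opened on the exported file. Sorting by bytes written should make whatever is producing hundreds of gigabytes of writes stand out quickly.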
Gostev
Veeam Software
 
Posts: 21622
Liked: 2411 times
Joined: Sun Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Re: Large amount of data change every 28 days

by Vitaliy S. » Mon Dec 01, 2014 9:05 pm 2 people like this post

How many virtual disks do you have on this server? If you have many of them, I would also suggest checking the job session stats to see which disks generate the most data.
Vitaliy S.
Veeam Software
 
Posts: 19974
Liked: 1145 times
Joined: Mon Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov

Re: Large amount of data change every 28 days

by v.Eremin » Tue Dec 02, 2014 8:18 am 2 people like this post

Are you using the Windows Server 2012 deduplication feature, by any chance? Monthly deduplication activity, such as optimization and garbage collection, might increase the number of changed blocks dramatically. Thanks.
v.Eremin
Veeam Software
 
Posts: 13734
Liked: 1027 times
Joined: Fri Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin

Re: Large amount of data change every 28 days

by Shestakov » Tue Dec 02, 2014 10:40 am 2 people like this post

It might take more time, but you can also do an instant recovery of two restore points: the one just before the big change and the one just after it. Do not power them on automatically, so as not to harm your production file server VM. Connect them to an isolated network and compare the restored files in the guest OS using a file comparison tool. That way you will be able to see the differences between the files. Thanks.
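
If no comparison tool is at hand, the comparison step could be approximated with a short script. The sketch below assumes both restored volumes are reachable as directory trees from one machine; it flags files that are new or differ in size or modification time, and the demo directories and file name are hypothetical.

```python
import os
import tempfile


def snapshot(root):
    """Map each relative file path under root to (size, mtime in whole seconds)."""
    files = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            info = os.stat(full)
            files[os.path.relpath(full, root)] = (info.st_size, int(info.st_mtime))
    return files


def changed_files(before_root, after_root):
    """Return (path, size_after) for new or modified files, largest first."""
    before = snapshot(before_root)
    after = snapshot(after_root)
    changed = [(path, meta[0]) for path, meta in after.items() if before.get(path) != meta]
    return sorted(changed, key=lambda item: item[1], reverse=True)


# Small self-contained demo with two temporary directories.
with tempfile.TemporaryDirectory() as before_dir, tempfile.TemporaryDirectory() as after_dir:
    for root, content in ((before_dir, b"old"), (after_dir, b"new data")):
        with open(os.path.join(root, "report.txt"), "wb") as handle:
            handle.write(content)
    demo = changed_files(before_dir, after_dir)

print(demo)
```

Note that a file-level comparison like this only catches changes that alter a file's size or timestamp. Block changes that leave both untouched will not show up, so an empty result does not rule out a large number of changed blocks.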
Shestakov
Veeam Software
 
Posts: 5145
Liked: 430 times
Joined: Wed May 21, 2014 11:03 am
Location: Saint Petersburg
Full Name: Nikita Shestakov

Re: Large amount of data change every 28 days

by Dima P. » Tue Dec 02, 2014 12:23 pm 2 people like this post

It might be monthly scheduled antivirus activity, and the Windows Task Scheduler is worth checking. Also, I would check the group policies applied to this file server for some kind of monthly software distribution: cmd > gpresult /Scope User /v
Dima P.
Veeam Software
 
Posts: 6698
Liked: 479 times
Joined: Mon Feb 04, 2013 2:07 pm
Location: SPb
Full Name: Dmitry Popov

Re: Large amount of data change every 28 days

by aharvey » Tue Dec 02, 2014 6:01 pm

Thanks for all the suggestions. Vladimir's caught my attention, because I am using Windows deduplication on 2 volumes of this server and initially suspected it could be the culprit, but I haven't found any specific dedup settings that would run every 4 weeks. The Optimization runs daily, and the Garbage Collection and Scrubbing run weekly, so I would expect to see the data change every week instead of every 4 weeks. Those jobs do happen to run on Saturdays, which is when the data changes occur. So my instincts are still pointing me to dedup, but I haven't found any TechNet information or other articles that point to a 28-day event that would be the cause. Are you aware of any dedup activities that occur monthly?

To answer a few of the other questions: this server has 3 volumes on 3 different vmdk files (the OS and 2 data drives). There is no antivirus running, and I have checked the group policies and looked through the task scheduler without finding any glaring culprits. We've thought about the personal backup possibility and haven't seen any sign of one yet, but we need to truly verify that with either a comparison or I/O logging. I'll look into one or both of those options and dig some more. Thanks again for the ideas. (I'm still leaning toward dedup.)

Re: Large amount of data change every 28 days

by v.Eremin » Wed Dec 03, 2014 8:19 am

Deduplication is still the major suspect, from my perspective, as it is notorious for causing a huge number of changed blocks. Nevertheless, you're right that most deduplication activities run on a weekly schedule rather than a monthly one.

Re: Large amount of data change every 28 days

by bouo2492 » Mon May 09, 2016 12:51 pm

OP, did you ever find a solution for this problem? I have exactly the same thing happening to me. Thanks
bouo2492
Novice
 
Posts: 3
Liked: 6 times
Joined: Mon May 09, 2016 12:36 pm

Re: Large amount of data change every 28 days

by aharvey » Mon May 09, 2016 9:09 pm 1 person likes this post

bouo2492 - It was Windows Deduplication. I disabled it on this server, and never had the huge data change again. It's been a long time, but I think I came across some article/reference that talked about a 28-day "cleanup" that I suspect was the cause.

Re: Large amount of data change every 28 days

by bouo2492 » Wed May 11, 2016 5:12 pm 5 people like this post

I think I found the exact reason why it's happening. You were right with the cleanup job. It's called Deduplication garbage collector. It's very well explained here:
http://social.technet.microsoft.com/wik ... rview.aspx

Re: Large amount of data change every 28 days

by ahsict » Thu May 12, 2016 9:17 am

Thanks! I have been struggling with this problem for ages. What confused me was that when you look at Task Scheduler, it says that the dedupe tasks run weekly:

[Screenshot: Task Scheduler showing the dedupe tasks on a weekly schedule]

I must have looked at this a dozen times over the last year or so and always discarded de-dupe as the culprit because I was looking for a monthly task!

Little did I realise that it was coded in such a way that every 4th run it acts differently.

Thanks for posting your solution
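
The arithmetic behind the 28-day pattern is simple once the "every 4th run" behavior described above is taken into account: a weekly job whose fourth run behaves differently produces a heavy run every 4 × 7 = 28 days. A minimal sketch, assuming the weekly job runs on Saturdays as reported earlier in the thread; the start date is a hypothetical Saturday chosen for illustration, not a date taken from the server.

```python
from datetime import date, timedelta


def heavy_run_dates(first_heavy_run: date, count: int, weekly_runs_per_heavy: int = 4):
    """Dates of the runs that behave differently, given a weekly schedule."""
    interval = timedelta(weeks=weekly_runs_per_heavy)  # 4 weekly runs = 28 days
    return [first_heavy_run + i * interval for i in range(count)]


start = date(2014, 11, 29)  # hypothetical Saturday
for run in heavy_run_dates(start, 3):
    print(run.isoformat(), run.strftime("%A"))
```

A list like this can also be used to line up other schedules, such as a full backup, with the days on which the large change is expected.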
ahsict
Novice
 
Posts: 3
Liked: never
Joined: Tue Jul 07, 2015 11:44 am
Full Name: Martin Simpson

Re: Large amount of data change every 28 days

by bouo2492 » Thu May 12, 2016 3:07 pm 1 person likes this post

ahsict - I had been having the problem for a long time too before finding the explanation this week. I had the same reasoning as you: why was it happening once a month when there was no monthly job? I think that Microsoft has done a poor job documenting deduplication...
To correct the situation, I scheduled my full backup on the same day that the garbage collector job finishes.

Re: Large amount of data change every 28 days

by mkreitzer » Mon May 16, 2016 3:17 pm

Since my company has been considering employing Windows deduplication on large file servers, I found this alarming, so I tried to dig a little. I found this:

https://support.microsoft.com/en-us/kb/3066175

It seems you can disable that "every 4th run" behavior and still retain "95%" of the benefit of dedup. I'm wondering if there's a way to change the behavior of the job to avoid touching large numbers of blocks as well.
mkreitzer
Novice
 
Posts: 8
Liked: never
Joined: Thu Dec 17, 2015 3:54 pm
Full Name: Michael Kreitzer

