aharvey
Novice
Posts: 8
Liked: 1 time
Joined: Apr 08, 2011 2:47 pm
Full Name: Aaron Harvey
Contact:

Large amount of data change every 28 days

Post by aharvey » Dec 01, 2014 7:05 pm

I have a Veeam job set up to do a daily incremental backup of several servers for an offsite backup. One of the servers in the list is our main file server. On average, the changed data for this server is around 5 GB per day. But every 28 days, the changed data explodes and on that day, the backup file is between 350 and 400 GB. This causes it to take a couple days to get the backup file offsite over the WAN connection we have. The file server is running Server 2012. I have looked at scheduled jobs to see if anything would be causing data to change, and haven't found any. Defrag is not running, I don't see anything in the event logs, and nothing else seems to coincide with the 28 day schedule.

I'm looking for any advice or tips on finding out what data is changing and why.

I thought about looking at what the incremental file contains to see what changed, in case that would point me in a helpful direction, but Veeam appears to show all files on the server whether or not they were backed up as part of that specific incremental run. There may be a different way to view the vib file that shows only what it specifically contains, but since this is a block-based backup, that might not be possible, or even useful.

Any other insight into why so much data is changed every 28 days?

foggy
Veeam Software
Posts: 16662
Liked: 1338 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Large amount of data change every 28 days

Post by foggy » Dec 01, 2014 8:20 pm 2 people like this post

Someone uploading a huge file (e.g. monthly personal backup) on this server?

Gostev
Veeam Software
Posts: 22770
Liked: 2798 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Large amount of data change every 28 days

Post by Gostev » Dec 01, 2014 8:50 pm 2 people like this post

I would just log all write I/O activity on the file server with Process Monitor on the day before the big backup, and then review it offline. There's got to be a huge amount of writes going on; there is no magic.
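For the offline review, a small script can total the bytes written per path once the Process Monitor capture has been saved as CSV. The following is only a sketch: it assumes the export has `Operation`, `Path` and `Detail` columns and that `WriteFile` rows carry a `Length: 4,096`-style value in `Detail`; the sample rows and the helper name `bytes_written_by_path` are made up for illustration.

```python
import csv
import io
import re
from collections import Counter

# Assumption: WriteFile rows carry a Detail field such as
# "Offset: 0, Length: 4,096, Priority: Normal" (thousands separators included).
LENGTH_RE = re.compile(r"Length:\s*([\d,]+)")


def bytes_written_by_path(csv_file):
    """Sum the WriteFile lengths per path from a Process Monitor CSV export."""
    totals = Counter()
    for row in csv.DictReader(csv_file):
        if row.get("Operation") != "WriteFile":
            continue
        match = LENGTH_RE.search(row.get("Detail") or "")
        if match:
            # Strip thousands separators (and any trailing comma) before converting.
            totals[row["Path"]] += int(match.group(1).replace(",", ""))
    return totals


# Made-up sample rows standing in for a real export.
SAMPLE = r'''"Time of Day","Process Name","PID","Operation","Path","Result","Detail"
"1:00:00 AM","System","4","WriteFile","D:\data\a.bin","SUCCESS","Offset: 0, Length: 4,096, Priority: Normal"
"1:00:01 AM","System","4","WriteFile","D:\data\a.bin","SUCCESS","Offset: 4,096, Length: 8,192"
"1:00:02 AM","app.exe","99","ReadFile","D:\data\b.bin","SUCCESS","Offset: 0, Length: 512"
"1:00:03 AM","app.exe","99","WriteFile","D:\data\b.bin","SUCCESS","Offset: 0, Length: 512"
'''

totals = bytes_written_by_path(io.StringIO(SAMPLE))
for path, size in totals.most_common(10):
    print(f"{size:12,d}  {path}")
```

For a real capture, the same function could be fed `open(path, newline="", encoding="utf-8-sig")` instead of the in-memory sample; the encoding is an assumption about how the CSV was saved.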

Vitaliy S.
Veeam Software
Posts: 21382
Liked: 1273 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: Large amount of data change every 28 days

Post by Vitaliy S. » Dec 01, 2014 9:05 pm 2 people like this post

How many virtual disks do you have on this server? If you have many of them, then I would also suggest checking the job session stats to see which disks generate the most data.

v.Eremin
Veeam Software
Posts: 15058
Liked: 1131 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin
Contact:

Re: Large amount of data change every 28 days

Post by v.Eremin » Dec 02, 2014 8:18 am 2 people like this post

Are you using the Windows Server 2012 deduplication feature, by any chance? Monthly deduplication activity, such as optimization and garbage collection, might increase the number of changed blocks dramatically. Thanks.

Shestakov
Veeam Software
Posts: 5919
Liked: 513 times
Joined: May 21, 2014 11:03 am
Full Name: Nikita Shestakov
Location: Prague
Contact:

Re: Large amount of data change every 28 days

Post by Shestakov » Dec 02, 2014 10:40 am 2 people like this post

It might take more time, but you can also do instant recovery of two restore points: the one before the big change and another just after it. Do not power them on automatically, so as not to interfere with your production file server VM. Connect them to an isolated network and compare the restored files in the guest OS using a file comparison tool. That way you will be able to see which files differ. Thanks.
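As a rough illustration of the comparison step, the sketch below walks two directory trees and reports files that were added, removed, or changed in size or modification time. The function names and the idea of pointing it at two mounted copies of a data volume are assumptions for illustration, not a Veeam feature. It only surfaces file-level differences; block-level churn that leaves file sizes and timestamps untouched would not show up.

```python
import os


def snapshot(root):
    """Map each file's path relative to root to its (size, mtime) pair."""
    files = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            files[os.path.relpath(full, root)] = (st.st_size, int(st.st_mtime))
    return files


def diff_trees(before_root, after_root):
    """Return sorted lists of added, removed and changed relative paths."""
    before = snapshot(before_root)
    after = snapshot(after_root)
    added = sorted(set(after) - set(before))
    removed = sorted(set(before) - set(after))
    changed = sorted(
        path for path in set(before) & set(after) if before[path] != after[path]
    )
    return added, removed, changed
```

Usage would look like `added, removed, changed = diff_trees(old_root, new_root)`, where the two roots are hypothetical paths to the same volume as seen from each restore point.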

Dima P.
Veeam Software
Posts: 8401
Liked: 614 times
Joined: Feb 04, 2013 2:07 pm
Full Name: Dmitry Popov
Location: Prague
Contact:

Re: Large amount of data change every 28 days

Post by Dima P. » Dec 02, 2014 12:23 pm 2 people like this post

It might be monthly scheduled antivirus activity, and Windows Task Scheduler is worth checking. I would also check the group policies applied to this file server for some kind of monthly software distribution: cmd > gpresult /Scope User /v

aharvey
Novice
Posts: 8
Liked: 1 time
Joined: Apr 08, 2011 2:47 pm
Full Name: Aaron Harvey
Contact:

Re: Large amount of data change every 28 days

Post by aharvey » Dec 02, 2014 6:01 pm

Thanks for all the suggestions. Vladimir's caught my attention, because I am using Windows deduplication on 2 volumes of this server and initially suspected it could be the culprit, but I haven't found any dedup settings that would run every 4 weeks. Optimization runs daily, and Garbage Collection and Scrubbing run weekly, so I would expect to see the data change every week instead of every 4 weeks. Those jobs do happen to run on Saturdays, which is when the data changes occur. So my instincts are still pointing me to dedup, but I haven't found any TechNet information or other articles that point to a 28-day event that would be the cause. Are you aware of any that occur monthly?

To answer a few of the other questions: this server has 3 volumes on 3 different vmdk files (OS and 2 data drives). There is no antivirus running. I have checked the group policies and looked through the task scheduler, but don't see any glaring culprits. We've thought about the personal backup issue and haven't seen any evidence of it yet, but we need to verify that properly with either a comparison or I/O logging. I'll look into one or both of those options and dig some more. Thanks again for the ideas. (I'm still leaning toward dedup.)

v.Eremin
Veeam Software
Posts: 15058
Liked: 1131 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin
Contact:

Re: Large amount of data change every 28 days

Post by v.Eremin » Dec 03, 2014 8:19 am

Deduplication is still the major suspect, from my perspective, as it's notorious for causing a huge number of changed blocks. Nevertheless, you're right in saying that most deduplication activities have a weekly schedule rather than a monthly one.

bouo2492
Novice
Posts: 3
Liked: 6 times
Joined: May 09, 2016 12:36 pm

Re: Large amount of data change every 28 days

Post by bouo2492 » May 09, 2016 12:51 pm

OP, did you ever find a solution for this problem? I have exactly the same thing happening to me. Thanks

aharvey
Novice
Posts: 8
Liked: 1 time
Joined: Apr 08, 2011 2:47 pm
Full Name: Aaron Harvey
Contact:

Re: Large amount of data change every 28 days

Post by aharvey » May 09, 2016 9:09 pm 1 person likes this post

bouo2492- It was Windows Deduplication. I disabled it on this server, and never had the huge data change again. It's been a long time, but I think I came across some article/reference that talked about a 28 day "cleanup" that I suspect was the cause.

bouo2492
Novice
Posts: 3
Liked: 6 times
Joined: May 09, 2016 12:36 pm

Re: Large amount of data change every 28 days

Post by bouo2492 » May 11, 2016 5:12 pm 5 people like this post

I think I found the exact reason why it's happening. You were right with the cleanup job. It's called Deduplication garbage collector. It's very well explained here:
http://social.technet.microsoft.com/wik ... rview.aspx

ahsict
Novice
Posts: 3
Liked: never
Joined: Jul 07, 2015 11:44 am
Full Name: Martin Simpson
Contact:

Re: Large amount of data change every 28 days

Post by ahsict » May 12, 2016 9:17 am

Thanks! I have been struggling with this problem for ages. What confused me was that when you look at Task Scheduler, it says that the dedupe tasks run weekly.

I must have looked at this a dozen times over the last year or so and always discarded de-dupe as the culprit because I was looking for a monthly task!

Little did I realise that it was coded in such a way that every 4th run it acts differently.

Thanks for posting your solution
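The arithmetic behind the 28-day cycle can be sketched in a few lines: a job that runs weekly but behaves differently on every 4th run produces a heavy run every 4 x 7 = 28 days, even though no monthly task exists anywhere. The start date below is an arbitrary Saturday chosen for illustration, and the interval of 4 is taken from the description in this thread rather than from any setting verified here.

```python
from datetime import date, timedelta


def heavy_run_dates(first_run, weekly_runs, every_nth=4):
    """Return the dates of every `every_nth` run in a weekly schedule."""
    return [
        first_run + timedelta(weeks=i)
        for i in range(weekly_runs)
        if (i + 1) % every_nth == 0
    ]


# Twelve weekly runs starting on an arbitrary Saturday.
runs = heavy_run_dates(date(2014, 11, 1), 12)
gaps = [(later - earlier).days for earlier, later in zip(runs, runs[1:])]
print(runs)
print(gaps)
```

Every gap between heavy runs comes out as 28 days, which matches the pattern reported at the top of the thread.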

bouo2492
Novice
Posts: 3
Liked: 6 times
Joined: May 09, 2016 12:36 pm

Re: Large amount of data change every 28 days

Post by bouo2492 » May 12, 2016 3:07 pm 1 person likes this post

ahsict - I was having the problem for a long time too before finding the explanation this week. I had the same reasoning as you: why was it happening once a month when there was no monthly job? I think that Microsoft has done a poor job documenting deduplication...
To correct the situation, I scheduled my full backup the same day as the end of the garbage collector job.

mkreitzer
Novice
Posts: 8
Liked: never
Joined: Dec 17, 2015 3:54 pm
Full Name: Michael Kreitzer
Contact:

Re: Large amount of data change every 28 days

Post by mkreitzer » May 16, 2016 3:17 pm

Since my company has been considering employing Windows deduplication on large file servers, I found this alarming, so I tried to dig a little. I found this:

https://support.microsoft.com/en-us/kb/3066175

It seems you can disable that "every 4th run" behavior and still retain "95%" of the benefit of dedup. I'm wondering if there's a way to change the behavior of the job to avoid touching large numbers of blocks as well.
