aharvey
Novice
Posts: 8
Liked: 1 time
Joined: Apr 08, 2011 2:47 pm
Full Name: Aaron Harvey
Contact:

Large amount of data change every 28 days

Post by aharvey »

I have a Veeam job set up to do a daily incremental backup of several servers for an offsite backup. One of the servers in the list is our main file server. On average, the changed data for this server is around 5 GB per day. But every 28 days, the changed data explodes and on that day, the backup file is between 350 and 400 GB. This causes it to take a couple days to get the backup file offsite over the WAN connection we have. The file server is running Server 2012. I have looked at scheduled jobs to see if anything would be causing data to change, and haven't found any. Defrag is not running, I don't see anything in the event logs, and nothing else seems to coincide with the 28 day schedule.

I'm looking for any advice or tips on finding out what data is changing and why.

I thought about looking at what the incremental file contains to see what changed, hoping that would point me in a helpful direction, but Veeam appears to show all files on the server whether or not they were backed up as part of that specific incremental run. There may be a different way to view the .vib file so that it shows only what it specifically contains, but since this is a block-based backup, that might not be possible, or even useful.

Any other insight into why so much data is changed every 28 days?
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Large amount of data change every 28 days

Post by foggy » 2 people like this post

Someone uploading a huge file (e.g. monthly personal backup) on this server?
Gostev
Chief Product Officer
Posts: 31428
Liked: 6633 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Large amount of data change every 28 days

Post by Gostev » 2 people like this post

I would just log all write I/O activity on the file server with Process Monitor on that day, before the big backup runs, and then review it offline. There's got to be a huge amount of writes going on; there is no magic.
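
For the offline review, a short script can total the bytes written per path. This is only a sketch under assumptions not stated in the thread: that the Process Monitor capture was saved as CSV with "Operation", "Path" and "Detail" columns, and that the Detail text of write events contains something like "Length: 4,096". The function name and the sample rows are made up for illustration.

```python
import csv
import io
import re
from collections import Counter

# Assumed format: Process Monitor CSV export with "Operation", "Path" and
# "Detail" columns, where Detail holds text like "Offset: 0, Length: 4,096".
LENGTH_RE = re.compile(r"Length:\s*([\d,]+)")


def bytes_written_by_path(rows):
    """Sum the WriteFile lengths per path from Procmon-style rows."""
    totals = Counter()
    for row in rows:
        if row.get("Operation") != "WriteFile":
            continue  # only write events matter for changed blocks
        match = LENGTH_RE.search(row.get("Detail", ""))
        if match:
            # Strip thousands separators before converting to int.
            totals[row["Path"]] += int(match.group(1).replace(",", ""))
    return totals


# Tiny made-up sample standing in for a real export file.
sample = io.StringIO(
    '"Operation","Path","Detail"\n'
    '"WriteFile","D:\\data\\a.bin","Offset: 0, Length: 4,096"\n'
    '"ReadFile","D:\\data\\a.bin","Offset: 0, Length: 4,096"\n'
    '"WriteFile","D:\\data\\a.bin","Offset: 4,096, Length: 1,024"\n'
    '"WriteFile","D:\\data\\b.bin","Offset: 0, Length: 512"\n'
)
totals = bytes_written_by_path(csv.DictReader(sample))

# Heaviest writers first.
for path, size in totals.most_common(10):
    print(size, path)
```

With a real capture, replace the in-memory sample with an opened CSV file. If the export uses different column names or detail text, the regular expression and keys need adjusting.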
Vitaliy S.
VP, Product Management
Posts: 27025
Liked: 2709 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: Large amount of data change every 28 days

Post by Vitaliy S. » 2 people like this post

How many virtual disks do you have on this server? If you have many of them, then I would also suggest checking the job session stats to see which disks generate the most data.
veremin
Product Manager
Posts: 20261
Liked: 2249 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin
Contact:

Re: Large amount of data change every 28 days

Post by veremin » 2 people like this post

Are you using the Windows Server 2012 deduplication feature, by any chance? Monthly deduplication activity, such as optimization and garbage collection, might increase the number of changed blocks dramatically. Thanks.
Shestakov
Veteran
Posts: 7328
Liked: 781 times
Joined: May 21, 2014 11:03 am
Full Name: Nikita Shestakov
Location: Prague
Contact:

Re: Large amount of data change every 28 days

Post by Shestakov » 2 people like this post

It might take more time, but you can also do an instant recovery of two restore points: the one just before the big change and the one just after. Do not power them on automatically, so as not to harm your production file server VM. Connect them to an isolated network and compare the restored files in the guest OS using a file comparison tool. That way you will be able to see the differences between the files. Thanks.
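
As one possible form of that comparison, here is a minimal sketch that diffs two directory trees by file size and modification time. It assumes both restored volumes are reachable as ordinary paths from wherever the script runs. The function names and the demo files are hypothetical, and block-level changes that leave file metadata untouched will not show up in this kind of check.

```python
import os
import tempfile


def snapshot(root):
    """Map each file's path relative to root to its (size, mtime)."""
    files = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            info = os.stat(full)
            files[os.path.relpath(full, root)] = (info.st_size, int(info.st_mtime))
    return files


def diff_trees(before_root, after_root):
    """Return (added, removed, changed) relative paths between two trees."""
    before, after = snapshot(before_root), snapshot(after_root)
    added = sorted(set(after) - set(before))
    removed = sorted(set(before) - set(after))
    changed = sorted(p for p in set(before) & set(after) if before[p] != after[p])
    return added, removed, changed


# Demo on two throwaway directories standing in for the two restore points.
with tempfile.TemporaryDirectory() as old, tempfile.TemporaryDirectory() as new:
    layout = (
        (old, {"same.txt": "x", "gone.txt": "y", "edit.txt": "a"}),
        (new, {"same.txt": "x", "new.txt": "z", "edit.txt": "ab"}),
    )
    for root, files in layout:
        for name, text in files.items():
            path = os.path.join(root, name)
            with open(path, "w") as handle:
                handle.write(text)
            # Pin timestamps so that only the size difference counts here.
            os.utime(path, (1_000_000_000, 1_000_000_000))
    result = diff_trees(old, new)

print(result)
```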
Dima P.
Product Manager
Posts: 14388
Liked: 1566 times
Joined: Feb 04, 2013 2:07 pm
Full Name: Dmitry Popov
Location: Prague
Contact:

Re: Large amount of data change every 28 days

Post by Dima P. » 2 people like this post

It might be monthly scheduled antivirus activity, and the Windows Task Scheduler is worth checking. Also, I would check the group policies applied to this file server for some kind of monthly software distribution: cmd > gpresult /Scope User /v
aharvey
Novice
Posts: 8
Liked: 1 time
Joined: Apr 08, 2011 2:47 pm
Full Name: Aaron Harvey
Contact:

Re: Large amount of data change every 28 days

Post by aharvey »

Thanks for all the suggestions. Vladimir's caught my attention, because I am using Windows deduplication on 2 volumes of this server, and I also initially suspected it could be the culprit, but I haven't found any specific dedup settings that would run every 4 weeks. The Optimization runs daily, and the Garbage Collection and Scrubbing run weekly. I would expect to see the data change every week instead of every 4 weeks. Those do happen to run on Saturdays, which is when the data changes occur. So my instincts are still pointing me to dedup, but I haven't found any TechNet information or other articles that point to a 28 day event that would be the cause. Are you aware of any that occur monthly?

To answer a few of the other questions: This server has 3 volumes on 3 different vmdk files (OS, and 2 data drives). There is no antivirus running, I have checked the group policies, and I also looked through the task scheduler, but I don't see any glaring culprits. We've thought about the personal backup issue and haven't seen any sign of it yet, but we need to truly verify with either a comparison or I/O logging. I'll look into one or both of those options and dig some more. Thanks again for the ideas. (I'm still leaning toward dedup)
veremin
Product Manager
Posts: 20261
Liked: 2249 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin
Contact:

Re: Large amount of data change every 28 days

Post by veremin »

Deduplication is still the major suspect, from my perspective, as it's notorious for causing a huge amount of changed blocks. Nevertheless, you're right in saying that most deduplication activities have a weekly schedule rather than a monthly one.
bouo2492
Novice
Posts: 3
Liked: 7 times
Joined: May 09, 2016 12:36 pm

Re: Large amount of data change every 28 days

Post by bouo2492 »

OP, did you ever find a solution for this problem? I have exactly the same thing happening to me. Thanks
aharvey
Novice
Posts: 8
Liked: 1 time
Joined: Apr 08, 2011 2:47 pm
Full Name: Aaron Harvey
Contact:

Re: Large amount of data change every 28 days

Post by aharvey » 1 person likes this post

bouo2492- It was Windows Deduplication. I disabled it on this server, and never had the huge data change again. It's been a long time, but I think I came across some article/reference that talked about a 28 day "cleanup" that I suspect was the cause.
bouo2492
Novice
Posts: 3
Liked: 7 times
Joined: May 09, 2016 12:36 pm

Re: Large amount of data change every 28 days

Post by bouo2492 » 6 people like this post

I think I found the exact reason why it's happening. You were right about the cleanup job. It's called the deduplication garbage collector. It's very well explained here:
http://social.technet.microsoft.com/wik ... rview.aspx
ahsict
Novice
Posts: 3
Liked: never
Joined: Jul 07, 2015 11:44 am
Full Name: Martin Simpson
Contact:

Re: Large amount of data change every 28 days

Post by ahsict »

Thanks! I have been struggling with this problem for ages - what confused me was that when you look at task scheduler it says that the dedupe tasks run weekly:

[Image: Task Scheduler view of the weekly dedupe tasks]

I must have looked at this a dozen times over the last year or so and always discarded de-dupe as the culprit because I was looking for a monthly task!

Little did I realise that it was coded in such a way that every 4th run it acts differently.

Thanks for posting your solution
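
The arithmetic behind the 28-day pattern can be sketched in a few lines. This assumes, as described in this thread, a weekly job whose every 4th run takes a heavier path. The start date and the function name are hypothetical, and the real schedule on any given server may differ.

```python
from datetime import date, timedelta


def heavy_run_dates(first_run, runs, every_nth=4, interval_days=7):
    """Dates on which a recurring job would take its heavier every-nth path."""
    return [
        first_run + timedelta(days=interval_days * i)
        for i in range(runs)
        if (i + 1) % every_nth == 0  # runs 4, 8, 12, ... are the heavy ones
    ]


# Hypothetical first run on Saturday, 2 January 2016, followed for 12 weeks.
heavy = heavy_run_dates(date(2016, 1, 2), runs=12)
gaps = [(later - earlier).days for earlier, later in zip(heavy, heavy[1:])]

for day in heavy:
    print(day.isoformat())
print(gaps)
```

A job that Task Scheduler lists as weekly therefore still produces a spike only once every 28 days, which matches what was seen in the backups.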
bouo2492
Novice
Posts: 3
Liked: 7 times
Joined: May 09, 2016 12:36 pm

Re: Large amount of data change every 28 days

Post by bouo2492 » 1 person likes this post

ahsict - I was having the problem for a long time too before finding the explanation this week. I had the same reasoning as you. Why was it happening once a month when there was no monthly job? I think that Microsoft has done a poor job documenting deduplication...
To correct the situation, I scheduled my full backup the same day as the end of the garbage collector job.
mkreitzer
Novice
Posts: 8
Liked: 1 time
Joined: Dec 17, 2015 3:54 pm
Full Name: Michael Kreitzer
Contact:

Re: Large amount of data change every 28 days

Post by mkreitzer » 1 person likes this post

Since my company has been considering employing Windows deduplication on large file servers, I found this alarming, so I tried to dig a little. I found this:

https://support.microsoft.com/en-us/kb/3066175

It seems you can disable that "every 4th run" behavior and still retain "95%" of the benefit of dedup. I'm wondering if there's a way to change the behavior of the job to avoid touching large numbers of blocks as well.