-
- Novice
- Posts: 8
- Liked: 1 time
- Joined: Apr 08, 2011 2:47 pm
- Full Name: Aaron Harvey
- Contact:
Large amount of data change every 28 days
I have a Veeam job set up to do a daily incremental backup of several servers for an offsite backup. One of the servers in the list is our main file server. On average, the changed data for this server is around 5 GB per day. But every 28 days the changed data explodes, and the incremental backup file for that day is between 350 and 400 GB, which takes a couple of days to get offsite over our WAN connection. The file server is running Server 2012. I have looked at scheduled jobs to see if anything would be causing data to change, and haven't found any. Defrag is not running, I don't see anything in the event logs, and nothing else seems to coincide with the 28-day schedule.
I'm looking for any advice or tips on finding out what data is changing and why.
I thought about just looking at what the incremental file contains, to see if the changed data would point me in a helpful direction, but Veeam appears to show you all files on the server, whether or not they were backed up as part of that specific incremental run. Maybe there is a different way to view the .vib file that shows only what it specifically contains, but since this is a block-based backup, that might not be possible, or even useful.
Any other insight into why so much data is changed every 28 days?
-
- Veeam Software
- Posts: 21139
- Liked: 2141 times
- Joined: Jul 11, 2011 10:22 am
- Full Name: Alexander Fogelson
- Contact:
Re: Large amount of data change every 28 days
Is someone uploading a huge file (e.g. a monthly personal backup) to this server?
-
- Chief Product Officer
- Posts: 31814
- Liked: 7302 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: Large amount of data change every 28 days
I would just log all write I/O activity on the file server with Process Monitor on that day, before the big backup, and then review it offline. There has to be a huge amount of writes going on; there is no magic.
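If you want the capture to be hands-off, Process Monitor can also be started ahead of time from the command line (flag names are from the Sysinternals documentation; the log path is just an example):

```bat
:: Start capturing to a backing file before the expected churn window
procmon.exe /AcceptEula /Quiet /Minimized /BackingFile C:\Logs\fileserver.pml

:: After the big backup day, stop the capture and review the log offline
procmon.exe /Terminate
procmon.exe /OpenLog C:\Logs\fileserver.pml
```

Filtering the loaded log to Operation = WriteFile and grouping by path should make the offender stand out.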
-
- VP, Product Management
- Posts: 27377
- Liked: 2800 times
- Joined: Mar 30, 2009 9:13 am
- Full Name: Vitaliy Safarov
- Contact:
Re: Large amount of data change every 28 days
How many virtual disks do you have on this server? If you have many of them, then I would also suggest checking the job session stats to see which disks generate the most data.
-
- Product Manager
- Posts: 20415
- Liked: 2302 times
- Joined: Oct 26, 2012 3:28 pm
- Full Name: Vladimir Eremin
- Contact:
Re: Large amount of data change every 28 days
Are you using the Server 2012 deduplication feature, by any chance? Monthly deduplication activity, such as optimization and garbage collection, can increase the number of changed blocks dramatically. Thanks.
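If the feature is enabled, the dedup cmdlets will show which jobs are scheduled and when they last ran (a quick sketch; requires the Server 2012 Data Deduplication role and its PowerShell module):

```powershell
# Volumes with dedup enabled and their space savings
Get-DedupVolume

# All scheduled dedup jobs (Optimization, GarbageCollection, Scrubbing),
# including the days and start times they run on
Get-DedupSchedule

# Last run times and results per volume
Get-DedupStatus | Format-List
```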
-
- Veteran
- Posts: 7328
- Liked: 781 times
- Joined: May 21, 2014 11:03 am
- Full Name: Nikita Shestakov
- Location: Prague
- Contact:
Re: Large amount of data change every 28 days
It might take more time, but you can also do an instant recovery of two restore points: the one just before the big change and another just after it. Do not power them on automatically, so you don't harm your production file server VM. Connect them to an isolated network and compare the restored files in the guest OS with a file-comparison tool. That way you will be able to see the difference between the files. Thanks.
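One way to do that comparison inside the guest OS is to hash both restored trees and diff the results (a rough sketch; the UNC paths are placeholders, and Get-FileHash requires PowerShell 4.0 or later):

```powershell
# Hash every file on the data volume of each restored VM
$before = Get-ChildItem -Recurse -File '\\restored-before\d$' | Get-FileHash
$after  = Get-ChildItem -Recurse -File '\\restored-after\d$'  | Get-FileHash

# Files whose content differs between the two restore points
Compare-Object $before $after -Property Hash -PassThru |
    Select-Object Path | Sort-Object Path -Unique
```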
-
- Product Manager
- Posts: 14726
- Liked: 1706 times
- Joined: Feb 04, 2013 2:07 pm
- Full Name: Dmitry Popov
- Location: Prague
- Contact:
Re: Large amount of data change every 28 days
It might be monthly scheduled antivirus activity, so the Windows Task Scheduler is worth checking. I would also check the group policies applied to this file server; possibly there is some kind of monthly software distribution: cmd > gpresult /Scope User /v
-
- Novice
- Posts: 8
- Liked: 1 time
- Joined: Apr 08, 2011 2:47 pm
- Full Name: Aaron Harvey
- Contact:
Re: Large amount of data change every 28 days
Thanks for all the suggestions. Vladimir's caught my attention, because I am using Windows deduplication on 2 volumes of this server and also initially suspected it could be the culprit, but I haven't found any specific dedup settings that would run every 4 weeks. Optimization runs daily, and Garbage Collection and Scrubbing run weekly, so I would expect to see the data change every week instead of every 4 weeks. Those jobs do happen to run on Saturdays, which is when the data changes occur. So my instincts still point to dedup, but I haven't found any TechNet information or other articles that point to a 28-day event that would be the cause. Are you aware of any that occur monthly?
To answer a few of the other questions: this server has 3 volumes on 3 different vmdk files (OS and 2 data drives). There is no antivirus running, and I have checked the group policies and looked through the Task Scheduler, but don't see any glaring culprits. We've considered the personal-backup possibility but haven't seen any evidence of it yet; I need to truly verify with either a comparison or I/O logging. I'll look into one or both of those options and dig some more. Thanks again for the ideas. (I'm still leaning toward dedup.)
-
- Product Manager
- Posts: 20415
- Liked: 2302 times
- Joined: Oct 26, 2012 3:28 pm
- Full Name: Vladimir Eremin
- Contact:
Re: Large amount of data change every 28 days
Deduplication is still the major suspect, from my perspective, as it's notorious for causing a huge amount of changed blocks. Nevertheless, you're right that most deduplication activities run on a weekly schedule rather than a monthly one.
-
- Novice
- Posts: 3
- Liked: 7 times
- Joined: May 09, 2016 12:36 pm
Re: Large amount of data change every 28 days
OP, did you ever find a solution for this problem? I have exactly the same thing happening to me. Thanks
-
- Novice
- Posts: 8
- Liked: 1 time
- Joined: Apr 08, 2011 2:47 pm
- Full Name: Aaron Harvey
- Contact:
Re: Large amount of data change every 28 days
bouo2492 - It was Windows Deduplication. I disabled it on this server and never had the huge data change again. It's been a long time, but I think I came across some article/reference that talked about a 28-day "cleanup" that I suspect was the cause.
-
- Novice
- Posts: 3
- Liked: 7 times
- Joined: May 09, 2016 12:36 pm
Re: Large amount of data change every 28 days
I think I found the exact reason why it's happening. You were right about the cleanup job: it's the deduplication garbage collection job. It's very well explained here:
http://social.technet.microsoft.com/wik ... rview.aspx
-
- Novice
- Posts: 3
- Liked: never
- Joined: Jul 07, 2015 11:44 am
- Full Name: Martin Simpson
- Contact:
Re: Large amount of data change every 28 days
Thanks! I have been struggling with this problem for ages. What confused me was that when you look at Task Scheduler, it says the dedupe tasks run weekly.
I must have looked at this a dozen times over the last year or so and always discarded de-dupe as the culprit because I was looking for a monthly task!
Little did I realise that it was coded in such a way that every 4th run it acts differently.
Thanks for posting your solution
-
- Novice
- Posts: 3
- Liked: 7 times
- Joined: May 09, 2016 12:36 pm
Re: Large amount of data change every 28 days
ahsict - I was having the problem for a long time too before finding the explanation this week. I had the same reasoning as yours: why was it happening once a month when there was no monthly job? I think Microsoft has done a poor job documenting deduplication...
To correct the situation, I scheduled my full backup the same day as the end of the garbage collector job.
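If you would rather control the timing yourself than chase the built-in every-4th-run behavior, the garbage collection can also be kicked off manually on a day of your choosing (a sketch, untested here; adjust the volume letter to your environment):

```powershell
# Run a full (deep) garbage collection right before the scheduled full
# backup, so the block churn always lands on a predictable day
Start-DedupJob -Volume 'D:' -Type GarbageCollection -Full -Wait
```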
-
- Novice
- Posts: 8
- Liked: 1 time
- Joined: Dec 17, 2015 3:54 pm
- Full Name: Michael Kreitzer
- Contact:
Re: Large amount of data change every 28 days
Since my company has been considering employing Windows deduplication on large file servers, I found this alarming, so I tried to dig a little. I found this:
https://support.microsoft.com/en-us/kb/3066175
It seems you can disable that "every 4th run" behavior and still retain "95%" of the benefit of dedup. I'm wondering if there's also a way to change the behavior of the job itself so it avoids touching large numbers of blocks.
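For what it's worth, my reading of that KB is that the "every 4th run" full garbage collection is controlled by a registry value; I'm reproducing this from memory, so verify the exact path and value against the article before applying it:

```powershell
# Disable the periodic full (deep) garbage collection
# (path and value as I recall them from KB3066175 - double-check first)
New-ItemProperty -Path 'HKLM:\SYSTEM\CurrentControlSet\Services\ddpsvc\Settings' `
    -Name 'DeepGCInterval' -PropertyType DWord -Value 0xFFFFFFFF -Force
```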