FrancWest
Veteran
Posts: 528
Liked: 104 times
Joined: Sep 17, 2017 3:20 am
Full Name: Franc
Contact:

Success notification mail, but job didn't copy any data

Post by FrancWest »

Hi,

Case #03388750. We had a connectivity issue with our remote repository. Because the outage lasted longer than the retry timeout, no data from our copy jobs was transferred, so I received a failure notification. So far, so good. However, once the connection was restored, the copy job started its GFS merge, and once that was complete a 'success' notification mail was sent. Because of this mail, the provider that hosts our remote repository concluded that the copy job had recovered by itself and that all data had been copied. If you look closer at the notification mail, you can see that the number of bytes copied was zero. Still, the job was reported as successful.

In my opinion this shouldn't have happened. Only the GFS merge was successful, not the copy of the data itself, so the job shouldn't have been reported as successful. Since our provider concluded everything was OK, no further action was taken, and the replication job that uses the remote repository failed later that day because no new data had been copied.

Franc.
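
A recipient of these mails could guard against this kind of misreading with a check along the following lines. This is only a sketch in Python: `JobReport`, its fields, and the `expect_new_data` flag are hypothetical names chosen for illustration, not part of any Veeam API, and extracting the values from the notification mail is left out.

```python
from dataclasses import dataclass


@dataclass
class JobReport:
    """Hypothetical summary of one copy-job notification (not a Veeam type)."""
    status: str             # status as reported by the job, e.g. "Success"
    bytes_transferred: int  # amount of data the report says was copied


def effective_status(report: JobReport, expect_new_data: bool) -> str:
    """Downgrade a zero-byte 'Success' to 'Warning' when new data was expected."""
    if report.status != "Success":
        # Failures and warnings are passed through unchanged.
        return report.status
    if report.bytes_transferred == 0 and expect_new_data:
        return "Warning"
    return "Success"


# After an outage new data was expected, so a zero-byte success gets flagged.
print(effective_status(JobReport("Success", 0), expect_new_data=True))
# A run where no new data was expected stays a success.
print(effective_status(JobReport("Success", 0), expect_new_data=False))
```

Whether new data is expected has to come from outside the report, for example from knowing that the previous interval failed.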
foggy
Veeam Software
Posts: 21139
Liked: 2141 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Success notification mail, but job didn't copy any data

Post by foggy »

Hi Franc, the job wouldn't have sent two notifications in the described case; there would have been a single cumulative report. However, in the case notes I can see the engineer mentioning that the job was started manually at some point in the middle of the interval (using the Sync Now command or maybe after enabling/disabling the job), which messed things up a bit. It's hard to judge without deep log analysis, which I believe the engineer has done. Anyway, a success notification can be sent when there's no data to copy because the latest point was already copied during the previous interval and only a merge is required.

Post by FrancWest »

Hi Foggy,

I do indeed have a post-job script that runs after the hourly backup and starts a new copy interval, so that the backup gets copied to the remote location immediately.

Is there a way to prevent this in the future? Also, if there's no data to copy, shouldn't it be at least a warning instead of a success?

Post by foggy »

Why don't you just configure an hourly interval? The backup copy job will watch for new data and start copying immediately.

A warning would be undesirable in cases where this is expected behavior.

Post by FrancWest »

I had some issues with that: sometimes the job was restarted while the copy job was still in progress.

Post by foggy »

But does the post-job script change anything in this regard?

Post by FrancWest »

Yes. The copy job interval is set to 24 hours, and the script checks whether the copy job is still running; if so, it doesn't restart the copy job.
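
The guard described in this post could be sketched as follows. This is Python illustrating the logic only, not the actual post-job script: `is_copy_job_running` and `start_copy_interval` are hypothetical callables standing in for whatever the script uses to query and trigger the copy job.

```python
from typing import Callable


def trigger_copy_after_backup(
    is_copy_job_running: Callable[[], bool],
    start_copy_interval: Callable[[], None],
) -> bool:
    """Start a new copy interval unless the copy job is still transferring.

    Returns True if a new interval was started, False if it was skipped.
    Both callables are hypothetical placeholders, not Veeam API calls.
    """
    if is_copy_job_running():
        # Leave the running transfer alone; this backup is not copied now.
        return False
    start_copy_interval()
    return True


# Usage with stand-in functions that only record what happened.
calls = []
print(trigger_copy_after_backup(lambda: False, lambda: calls.append("sync")))
print(trigger_copy_after_backup(lambda: True, lambda: calls.append("sync")))
print(calls)
```

The return value makes it possible to log skipped triggers, which is useful for spotting backups that never started a copy run of their own.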

Post by foggy »

Do you mean the copy job was restarted? The copy job doesn't restart by itself; it works according to the configured copy interval. Even if it wasn't able to copy everything it should have during the current interval, it resumes during the next one, copying all the blocks required to build the latest VM state. To avoid such overlaps, you should either increase the interval or analyze the bottleneck stats to see whether job performance can be improved.

Post by FrancWest »

When the copy job is still running and a new interval starts, it restarts the job and starts copying all over again. That's why I implemented the script. Sometimes it takes longer for the backup files to become available (for example, when the job does a health check first or a merge takes place). It can then happen that the backup files are only ready for copying, say, 15 minutes before the new interval. When the interval expires, the job is restarted and starts copying again. Increasing the interval isn't an option, since we always want the latest available backup to be copied off-site, and with regular copy jobs and intervals you can't be sure that's the case. That's why I invoke the copy job right after the backup job completes.

Post by foggy »

FrancWest wrote: "When the copy job is still running and a new interval is started, it restarts the job and starts copying all over again."

Not all over: it doesn't copy blocks that were already copied, only the new ones required to build the latest restore point.

FrancWest wrote: "Then it could happen that for example 15 minutes before the new interval the backup files are ready for copying. When the interval expires it restarts the job and starts copying again."

Scheduling the copy job to start just a few minutes after the original backup job starts should address such cases: the job will sit idle until the backup job completes and then start copying immediately.

FrancWest wrote: "Increasing the interval isn't an option, since we always want the latest available backup to be copied off-site and with regular copy-jobs and intervals you aren't sure if that's the case. That's why I invoke the copy-job right after the backup job completes."

But in this case, if the job is not able to finish copying during the interval (which could still happen), you're missing one restore point as well, since the script doesn't trigger the job while it is still copying.
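
The trade-off in this last reply can be illustrated with a small simulation. It is a sketch under made-up assumptions: backups finish at the given times (in hours), every copy run takes a fixed `copy_duration`, and the post-job script skips its trigger while a copy is still running. The function name and all numbers are hypothetical.

```python
def simulate_guarded_triggers(backup_times, copy_duration):
    """Return (started, skipped) backup times under a skip-if-running guard."""
    started, skipped = [], []
    busy_until = float("-inf")  # time at which the current copy run ends
    for t in backup_times:
        if t < busy_until:
            # Copy job still running: the script does not trigger it.
            skipped.append(t)
        else:
            started.append(t)
            busy_until = t + copy_duration
    return started, skipped


# Hourly backups, with a copy run that takes 1.5 hours.
started, skipped = simulate_guarded_triggers([0, 1, 2, 3, 4], copy_duration=1.5)
print("copy runs started after backups at:", started)
print("triggers skipped for backups at:", skipped)
```

With these numbers every second backup never starts a copy run of its own, which is the gap the reply points out.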