glennsantacruz
Enthusiast
Posts: 55
Liked: never
Joined: Mar 01, 2010 5:57 pm
Full Name: Glenn Santa Cruz
Contact:

Replication job hangs when SQL connection disrupted

Post by glennsantacruz »

According to this topic:

http://www.veeam.com/forums/viewtopic.p ... 322#p10322

Veeam version 4.1 fixed the "hang" issue with loss of WAN connectivity. We are seeing a similar issue when Veeam loses SQL connectivity.

To reproduce:
1) Define and start a replication job.
2) Part-way through the replication job, stop the SQL server hosting the Veeam database.
3) Veeam will hang indefinitely; if you resume the SQL server, Veeam will respond and properly "fail" the replication job.

Anyone else have experience with this behavior? To be fair, we're testing a scenario outlined in this thread: http://www.veeam.com/forums/viewtopic.p ... 463#p14463

tsightler
VP, Product Management
Posts: 5689
Liked: 2515 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: Replication job hangs when SQL connection disrupted

Post by tsightler »

Perhaps this is to be expected. If Veeam can't contact the database to update its status to "failed", perhaps it simply keeps trying over and over. I'm sure you could just kill the process in the case where this is a permanent failure. What do you perceive as the issue with this behavior? Since, 99% of the time, you'd probably want it to wait until the SQL server was available again to continue, it's hard for me to understand what the "correct" behavior would be in this case.

Re: Replication job hangs when SQL connection disrupted

Post by glennsantacruz »

I would expect Veeam to time out (perhaps via a configurable parameter) instead of hanging indefinitely. Our concern is that backup jobs (as well as replication jobs) would hang until the SQL connection was re-established. Basically, we're testing the option of "cross-site" SQL servers for push replication, and making sure we know how things behave when "bad things happen." I was surprised that the Veeam server hung indefinitely instead of timing out. Not the end of the world, for sure, but the earlier thread (about WAN failure) gave me the impression that this was unexpected behavior.
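To make the request concrete, here is a minimal sketch of the behavior being asked for: keep retrying the database call, but give up once a configurable deadline passes. This is not Veeam code and says nothing about how Veeam is implemented; `DatabaseUnavailable`, `retry_until_deadline`, and all parameter names are hypothetical, chosen only for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class DatabaseUnavailable(Exception):
    """Hypothetical error raised when the status database cannot be reached."""


def retry_until_deadline(
    operation: Callable[[], T],
    timeout_s: float,
    retry_interval_s: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `operation` until it succeeds or `timeout_s` seconds have passed.

    `clock` and `sleep` are injectable so the loop can be tested without
    actually waiting.
    """
    deadline = clock() + timeout_s
    while True:
        try:
            return operation()
        except DatabaseUnavailable:
            remaining = deadline - clock()
            if remaining <= 0:
                # Deadline reached: fail the job instead of hanging forever.
                raise TimeoutError(
                    f"database still unreachable after {timeout_s} seconds"
                ) from None
            # Wait before the next attempt, but never sleep past the deadline.
            sleep(min(retry_interval_s, remaining))
```

With a very large `timeout_s` this behaves like the current "wait for SQL to come back" behavior; a smaller value would let the job fail on its own and free up the scheduler.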

Re: Replication job hangs when SQL connection disrupted

Post by tsightler »

Right, I could tell from your question that you expected it to time out, but I guess what I'm not getting is what problem the lack of a timeout actually causes. Thinking about how Veeam is put together, the Veeam UI reads all of its status info from the database. If the VeeamAgent failed while the database was down, I'm not even sure how the Veeam UI could reflect that. It just seems like a very strange failure state to have to deal with. Waiting for the SQL server to return, and otherwise requiring manual intervention, doesn't seem so bad to me, but I'm thinking maybe you've got another idea of what it should do. Are you saying that the Veeam UI/Agent should detect the loss of the database, kill themselves, and report an error? That certainly does seem cleaner.
