Discussions specific to the VMware vSphere hypervisor
bronyrafon
Lurker
Posts: 1
Liked: never
Joined: Oct 28, 2011 8:02 am
Full Name: Gary Williams
Contact:

Cluster fails to start

Post by bronyrafon »

Hi,
I'm testing Veeam Backup 5 on our Dev/Test virtual servers. I carried out a backup overnight, and one of our servers (a single-node Windows 2003 cluster) is no longer able to start the cluster service. The quorum and data disks show as failed if I start the cluster with the /fixquorum switch. If I disable the cluster disk driver and reboot, I can access these drives and the data appears to be intact, but when I re-enable the cluster disk driver, reboot, and try to start the cluster, it fails to start.
Anyone got any ideas on what I need to do to fix this issue?
Thanks

Gostev
SVP, Product Management
Posts: 26700
Liked: 4276 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Cluster fails to start

Post by Gostev »

Generally speaking, Microsoft clusters require SCSI bus sharing for storage, which in turn makes VM snapshots impossible because of VMware limitations. However, image-level backup requires a VM snapshot to be created. Your backup is probably simply incorrect because of that.

AndyGDIT
Enthusiast
Posts: 68
Liked: 9 times
Joined: Nov 14, 2011 3:15 pm
Full Name: Andrew Frye
Contact:

Re: Cluster fails to start

Post by AndyGDIT »

Seeing the exact same issue... Was there a resolution? How did he get this fixed?

Vitaliy S.
Product Manager
Posts: 24256
Liked: 1863 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: Cluster fails to start

Post by Vitaliy S. »

Andrew,

I'm not quite sure I understand what your issue is. Could you please elaborate? Are you trying to restore a cluster VM, or does the cluster service fail at the beginning of the backup job?

Thanks.

Re: Cluster fails to start

Post by AndyGDIT »

Seeing the exact same issues as bronyrafon was seeing.

When we added this server to the Veeam backup job and it took the snapshot, it froze all of the databases on the server. I then got several errors in the Event Viewer:

Error code 1:
SQLVDI: Loc=TriggerAbort. Desc=invoked. ErrorCode=(0). Process=5480. Thread=5108. Server. Instance=MSSQLSERVER. VD=Global\{7B61F660-4ED1-40DC-9384-818FC619715E}4_SQLVDIMemoryName_0.

Error Code 18210:
BackupVirtualDeviceFile::TakeSnapshot: failure on backup device '{7B61F660-4ED1-40DC-9384-818FC619715E}3'. Operating system error 995(The I/O operation has been aborted because of either a thread exit or an application request.).
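
In case it helps anyone triaging the same events, here is a minimal Python sketch that splits the SQLVDI message above into fields. The layout it relies on (Key=Value pairs separated by a period and a space, plus bare tokens such as "Server") is an assumption read off this one message, not a documented format, and `parse_sqlvdi_event` is a hypothetical helper name.

```python
# Sample text copied from the event quoted above.
SAMPLE = (
    r"SQLVDI: Loc=TriggerAbort. Desc=invoked. ErrorCode=(0). Process=5480. "
    r"Thread=5108. Server. Instance=MSSQLSERVER. "
    r"VD=Global\{7B61F660-4ED1-40DC-9384-818FC619715E}4_SQLVDIMemoryName_0."
)


def parse_sqlvdi_event(message):
    """Split a SQLVDI event message into a dict of fields.

    Assumption: fields are separated by '. ' and written as Key=Value.
    Tokens without '=' (for example 'Server') are collected under 'flags'.
    A value that itself contains '. ' would be split incorrectly.
    """
    prefix, sep, body = message.partition(":")
    if not sep:
        # No "SQLVDI:" style prefix, so treat the whole text as the body.
        prefix, body = "", message
    fields = {"source": prefix.strip(), "flags": []}
    body = body.strip()
    if body.endswith("."):
        body = body[:-1]  # drop the trailing period
    for token in body.split(". "):
        key, eq, value = token.partition("=")
        if eq:
            fields[key.strip()] = value.strip()
        elif token.strip():
            fields["flags"].append(token.strip())
    return fields


event = parse_sqlvdi_event(SAMPLE)
print(event["Loc"], event["Process"], event["Instance"])
```

The process and thread IDs pulled out this way can then be matched against the SQL Server error log entries from the same moment.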


I called Support and they found that it was VSS that had brought the cluster service down, but we could not get it back up. We worked for over 16 hours with Microsoft and VMware on this issue, and in the end had to move the databases to another, non-clustered server to get them back online.

I was wondering whether you had seen any issues with VSS and the Cluster service?

Re: Cluster fails to start

Post by Vitaliy S. »

Does it behave the same when you manually snapshot this VM (with both VMware Tools quiescence enabled and not)?
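
If you want to script that comparison, a rough Python sketch follows. It assumes a pyVmomi-style VM object that exposes `CreateSnapshot_Task(name, description, memory, quiesce)`; `snapshot_both_ways` and the `wait_for` callback are hypothetical names introduced here, and it is safer to try this on a test VM before touching a clustered one.

```python
def snapshot_both_ways(vm, wait_for):
    """Take one quiesced and one non-quiesced snapshot and report each result.

    vm       -- object with a pyVmomi-style CreateSnapshot_Task method (assumed)
    wait_for -- caller-supplied callable that blocks until a task finishes
                and raises if the task failed (hypothetical helper)
    """
    results = {}
    for quiesce in (True, False):
        label = "quiesced" if quiesce else "non-quiesced"
        try:
            task = vm.CreateSnapshot_Task(
                name="manual-test-" + label,
                description="Manual snapshot test",
                memory=False,      # no memory dump, as with backup snapshots
                quiesce=quiesce,   # True asks VMware Tools to quiesce the guest
            )
            wait_for(task)
            results[label] = "ok"
        except Exception as exc:
            # Record the failure and still try the other variant.
            results[label] = "failed: %s" % exc
    return results
```

The snapshots are left in place so you can inspect the guest after each one; remove them afterwards in the vSphere client.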

Re: Cluster fails to start

Post by AndyGDIT »

Hi Vitaliy

I cannot give you an answer on this. I cannot replicate the issue, because we were never able to get the Cluster service to restart with the drives available.

Our server was being backed up with a NetBackup agent before we moved to Veeam, so we were not using the VMware technology (vStorage). This was the first time that we had done anything like this on this server.

Re: Cluster fails to start

Post by Vitaliy S. »

Ah... that explains it. Well, there are certain things (such as an I/O freeze on one of the cluster nodes) that may affect the Cluster service during snapshotting.

tsightler
VP, Product Management
Posts: 5679
Liked: 2497 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: Cluster fails to start

Post by tsightler »

Hey Andy. Have you looked at this Hotfix?

http://support.microsoft.com/kb/934396

Seems pretty close. What version, service pack, and cumulative update are you running?

Re: Cluster fails to start

Post by AndyGDIT »

Hey Tom

This backup actually failed when we got these errors. I investigated the issue and found a lot of errors in the Event Viewer, which led us to discover that the application had stopped working after we tried the backup.

I will pass this along to the group that is working the issue and get back to you.

Re: Cluster fails to start

Post by AndyGDIT »

Sorry, I misread your post.

Windows 2003 R2 Enterprise x64

SQL Server 2005 x64 SP4

I believe it is at 9.4.5000

Re: Cluster fails to start

Post by tsightler »

In general I would make 100% sure that all SQL 2005 SP4 cumulative updates were installed (I think the latest for SP4 is CU3), as well as the most recent Windows 2003 VSS rollup (I think it's http://support.microsoft.com/kb/940349 but not sure). This rollup hotfix contains numerous fixes for VSS. I don't know if this will address the issue, but earlier rollups for VSS included many fixes for SQL and cluster service issues.
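
For a quick sanity check on the SQL side, here is a small Python sketch that classifies a build string relative to the SP4 baseline. It assumes the string comes from `SELECT SERVERPROPERTY('ProductVersion')` and that SQL Server 2005 SP4 reports build 9.00.5000; `classify_build` is a hypothetical helper name, and the sketch does not try to map builds to specific cumulative updates.

```python
SP4_BASELINE = (9, 0, 5000)  # SQL Server 2005 SP4 build, assumed as 9.00.5000


def parse_build(version):
    """Turn a dotted version string such as '9.00.5000.00' into a tuple of ints."""
    return tuple(int(part) for part in version.strip().split("."))


def classify_build(version):
    """Describe a ProductVersion string relative to the SP4 baseline."""
    build = parse_build(version)[:3]
    if build[:2] != SP4_BASELINE[:2]:
        return "not a SQL Server 2005 build"
    if build < SP4_BASELINE:
        return "older than SP4"
    if build == SP4_BASELINE:
        return "SP4 with no later update"
    return "SP4 plus a later update"


print(classify_build("9.00.5000.00"))
```

Anything that comes back as "SP4 with no later update" means no post-SP4 cumulative update has been applied yet.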

As far as recovering from your current issue, if you followed the advice offered higher up in this thread and used the "/fixquorum" switch, that basically just allows the cluster service to start without quorum. You would still need to take the appropriate action to fix the quorum device, either by repairing the quorum data or simply creating a new quorum log. Usually you can fix it with the "ClusterRecovery.exe" utility http://www.microsoft.com/download/en/de ... x?id=10047 but honestly Microsoft should be able to talk you through quorum disk recovery.
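
For reference, the forced start mentioned above is normally issued on Windows Server 2003 as `net start clussvc /fixquorum`. A minimal Python wrapper is sketched below; `fixquorum_command` and `start_without_quorum` are hypothetical helper names, and the call only makes sense when run locally on the cluster node with administrative rights.

```python
import subprocess

CLUSTER_SERVICE = "clussvc"  # service name of the Windows Server 2003 Cluster service


def fixquorum_command():
    """Build the command that starts the Cluster service without quorum."""
    return ["net", "start", CLUSTER_SERVICE, "/fixquorum"]


def start_without_quorum(run=subprocess.run):
    """Run the forced start and return the completed process.

    The runner is injectable so the command can be inspected or logged
    without actually touching the service.
    """
    return run(fixquorum_command(), capture_output=True, text=True)
```

This only gets the service running. The quorum resource still has to be repaired or recreated afterwards, as described above.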

Re: Cluster fails to start

Post by AndyGDIT »

Thanks Tom

I will work with our DBAs and find out what cumulative updates are installed on the server. As for Microsoft, they were no help, as we made the executive decision to move the DB to another server. We were not able to create a new quorum or repair the old one. I will check out the link you sent me and pass it along as well.

I think the issue was that our cluster was a physical server that had been virtualized, and when we dropped the cluster it was not properly removed, which left us with a one-node cluster. I think that once we tried to back up using Veeam, the cluster freaked out and tried to fail over, and when it could not, it just gave up and crashed. After all of the issues we had, we are leaning toward rebuilding the server without clustering.

Re: Cluster fails to start

Post by Gostev »

AndyGDIT wrote: After all of the issues we had, we are leaning toward rebuilding the server without clustering
LOL, something I was just going to propose ;)
