bteichner
Enthusiast
Posts: 30
Liked: 2 times
Joined: Apr 30, 2012 5:54 pm
Full Name: Brian Teichner

SQL 2005 cluster and NetApp SnapDrive

Post by bteichner »

We have a SQL 2005 cluster that we are trying to back up with Veeam, but we are running into an issue where the cluster fails over at least once a week. Here are the details on the SQL cluster:

2 Windows Server 2008 R2 VMs
SQL 2005 SP4 cluster
NetApp SnapDrive

All the database and log LUNs are mounted with SnapDrive to drive letters. For nightly backups, we have a SQL management plan that backs up the databases, and that is working fine. After this backup, the Veeam backup runs (and is successful) but causes the cluster to fail over. For the Veeam backup, we are processing all disks and using application-aware processing (with log truncation disabled, since this is done by the SQL management plan). I'm seeing these entries in the Windows event log at the time that Veeam prepares the guest for a backup:

-ONTAP VSS hardware provider service has started.
-Data ONTAP VSS hardware provider is loaded.
-Data ONTAP VSS hardware provider is adding a source lun
-SnapDrive is ready to create Snapshot copy of LUN(s).
-I/O is frozen on database ... No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
-Snapshot of LUN(s) on storage system volume was successfully created.
-Data ONTAP VSS hardware provider has successfully completed CommitSnapshots for SnapshotSetId in 749 milliseconds.
-I/O was resumed on database. No user action is required.

My thought is that because application aware processing is enabled, it is initiating VSS for SnapDrive, which then initiates a NetApp snapshot/backup with SnapDrive.
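
To check how long I/O actually stays frozen during the VSS snapshot, the freeze and resume events above can be paired up by timestamp. This is only a rough sketch in Python: it assumes the event log has already been exported as (timestamp, message) pairs, the function name `freeze_windows` is made up for illustration, and the sample timestamps are hypothetical, not taken from our logs. It simply pairs each freeze event with the next resume event and does not distinguish between databases.

```python
from datetime import datetime, timedelta

# Message fragments as they appear in the Windows event log entries above.
FREEZE_MARKER = "I/O is frozen on database"
RESUME_MARKER = "I/O was resumed on database"


def freeze_windows(events):
    """Pair each freeze event with the next resume event.

    events: iterable of (datetime, message) tuples, sorted by time.
    Returns a list of (frozen_at, resumed_at, duration) tuples.
    """
    windows = []
    frozen_at = None
    for timestamp, message in events:
        if FREEZE_MARKER in message:
            # Keep the earliest freeze if several arrive before a resume.
            if frozen_at is None:
                frozen_at = timestamp
        elif RESUME_MARKER in message and frozen_at is not None:
            windows.append((frozen_at, timestamp, timestamp - frozen_at))
            frozen_at = None
    return windows


# Hypothetical timestamps, for illustration only.
sample_events = [
    (datetime(2012, 5, 1, 1, 0, 4), "SnapDrive is ready to create Snapshot copy of LUN(s)."),
    (datetime(2012, 5, 1, 1, 0, 5), "I/O is frozen on database ... No user action is required."),
    (datetime(2012, 5, 1, 1, 0, 6), "Snapshot of LUN(s) on storage system volume was successfully created."),
    (datetime(2012, 5, 1, 1, 0, 7), "I/O was resumed on database. No user action is required."),
]

for frozen_at, resumed_at, duration in freeze_windows(sample_events):
    print(f"frozen {frozen_at} -> resumed {resumed_at} ({duration.total_seconds():.0f} s)")
```

A freeze window that is long relative to the cluster's health-check timeouts would point at the VSS freeze as the trigger for the failover.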

After that, I start to get errors as the Veeam job is completing:

-Volume Shadow Copy Service error: Unexpected error DeviceIoControl(\\?\fdc#generic_floppy_drive#) Incorrect function.
-SendErrorToErrLog: Operating system error 21(The device is not ready.) encountered.
-The log for database is not available. Check the event log for related error messages. Resolve any errors and restart the database.

As far as I can tell in the Windows logs, it is taking about 1 hour for the database to recover on the second node. Has anyone had any success backing up a SQL cluster with SnapDrive installed? My next thought is to disable application aware processing to see if that fixes the cluster failover problem, but then I won't have the most consistent backup if I need to restore. Sorry this is a lot of info, but hopefully it makes sense.

Thanks
Vitaliy S.
VP, Product Management
Posts: 27105
Liked: 2717 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov

Re: SQL 2005 cluster and NetApp SnapDrive

Post by Vitaliy S. »

Hi Brian,
bteichner wrote: My thought is that because application aware processing is enabled, it is initiating VSS for SnapDrive, which then initiates a NetApp snapshot/backup with SnapDrive.
Yes, most likely this happens because of the VSS freeze.
bteichner wrote: For nightly backups, we have a SQL management plan that backs up the databases
bteichner wrote: My next thought is to disable application aware processing to see if that fixes the cluster failover problem, but then I won't have the most consistent backup if I need to restore.
Since you're already doing consistent backups with native tools, there is no need to additionally enable application-aware image processing. You should be able to recover both from these backups and from the backups that were performed by native tools.

Thanks!