Comprehensive data protection for all workloads
Post Reply
alexandre.nakagawa
Influencer
Posts: 21
Liked: 4 times
Joined: Oct 14, 2021 4:57 pm
Full Name: alexandre nakagawa
Contact:

SQL Instant Recovery Not Instant

Post by alexandre.nakagawa »

Hi, I have a large SQL server (45+tb). it´s a vm (esxi), and I´m using veeam agent (can´t use vmware based backup because it takes 18 hours to complete the backup and the datastore space runs out)

The problem is
if I try to run a sql instant recovery and select (image level point), it´s takes 4 minutes to publish the database. But if I select Point in time recovery (to apply log backups), the database takes 3 to 4 hours, only to publish.

anyone had a similar problem?

ESX hosts 20GB network cards. and connect to pure storage using fiber

Repository is a windows 2019 VM with 120TB formated with Refs.

SQL Server is a SQL 2019.

Case #05066475
Gostev
Chief Product Officer
Posts: 32737
Liked: 7958 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: SQL Instant Recovery Not Instant

Post by Gostev »

So the delay is due to transaction log backups replay which takes a lot of time.

I would look at the performance of the following storage devices:
1. IR cache folder on the mount server for the corresponding backup repository. Are you using SSD as recommended?
2. Backup repository itself as you're effectively running your database from it.
alexandre.nakagawa
Influencer
Posts: 21
Liked: 4 times
Joined: Oct 14, 2021 4:57 pm
Full Name: alexandre nakagawa
Contact:

Re: SQL Instant Recovery Not Instant

Post by alexandre.nakagawa »

1 - IRCache folder, is a vmdk stored on PureStorage Flash Array
2. Backup repository is also on PureStorage.

Disk performance for ircache or repository is not he problem.

Code: Select all

Command Line: DiskSpd\amd64\diskspd.exe -c25G -b512K -w100 -Sh -d600 R:\data01\testfile.dat
Input parameters:
        timespan:   1
        -------------
        duration: 600s
        warm up time: 5s
        cool down time: 0s
        random seed: 0
        path: 'R:\data01\testfile.dat'
                think time: 0ms
                burst size: 0
                software cache disabled
                hardware write cache disabled, writethrough on
                performing write test
                block size: 512KiB
                using sequential I/O (stride: 512KiB)
                number of outstanding I/O operations per thread: 2
                threads per file: 1
                using I/O Completion Ports
                IO priority: normal
System information:
        computer name: veeam-v-1p05
        start time: 2021/10/02 13:52:11 UTC
Results for timespan 1:
*******************************************************************************
actual test time:       600.01s
thread count:           1
proc count:             8
CPU |  Usage |  User  |  Kernel |  Idle
-------------------------------------------
   0|  26.71%|   3.98%|   22.73%|  73.29%
   1|  24.47%|   4.74%|   19.73%|  75.53%
   2|  11.10%|   3.52%|    7.58%|  88.90%
   3|   8.15%|   3.06%|    5.09%|  91.85%
   4|   7.84%|   3.29%|    4.55%|  92.16%
   5|   7.80%|   3.68%|    4.12%|  92.20%
   6|   8.78%|   3.73%|    5.05%|  91.22%
   7|  13.06%|   2.97%|   10.09%|  86.94%
-------------------------------------------
avg.|  13.49%|   3.62%|    9.87%|  86.51%
Total IO
thread |       bytes     |     I/Os     |    MiB/s   |  I/O per s |  file
------------------------------------------------------------------------------
     0 |    651801657344 |      1243213 |    1036.00 |    2072.00 | R:\data01\testfile.dat (25GiB)
------------------------------------------------------------------------------
total:      651801657344 |      1243213 |    1036.00 |    2072.00
Read IO
thread |       bytes     |     I/Os     |    MiB/s   |  I/O per s |  file
------------------------------------------------------------------------------
     0 |               0 |            0 |       0.00 |       0.00 | R:\data01\testfile.dat (25GiB)
------------------------------------------------------------------------------
total:                 0 |            0 |       0.00 |       0.00
Write IO
thread |       bytes     |     I/Os     |    MiB/s   |  I/O per s |  file
------------------------------------------------------------------------------
     0 |    651801657344 |      1243213 |    1036.00 |    2072.00 | R:\data01\testfile.dat (25GiB)
------------------------------------------------------------------------------
total:      651801657344 |      1243213 |    1036.00 |    2072.00
Gostev
Chief Product Officer
Posts: 32737
Liked: 7958 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: SQL Instant Recovery Not Instant

Post by Gostev »

Judging on the block size, you seem to be testing your storage device against Veeam workload. However, during instant recovery the I/O workload is generated by the actual SQL Server, which most certainly does not operate with 512KB blocks :) if I remember correctly, it uses 8KB blocks.
alexandre.nakagawa
Influencer
Posts: 21
Liked: 4 times
Joined: Oct 14, 2021 4:57 pm
Full Name: alexandre nakagawa
Contact:

Re: SQL Instant Recovery Not Instant

Post by alexandre.nakagawa »

So, what is the recomendation? is there anything that can be done? since it´s fastest storage that we have and purestorage says that it´s fastest in the world :D
PetrM
Veeam Software
Posts: 3986
Liked: 686 times
Joined: Aug 28, 2013 8:23 am
Full Name: Petr Makarov
Location: Prague, Czech Republic
Contact:

Re: SQL Instant Recovery Not Instant

Post by PetrM »

Hello,

Do you run image-level backup on daily basis and what is the log backup period? I believe that publish will take less time if you specify PIT which is closer to the time of image-level backup.

Thanks!
alexandre.nakagawa
Influencer
Posts: 21
Liked: 4 times
Joined: Oct 14, 2021 4:57 pm
Full Name: alexandre nakagawa
Contact:

Re: SQL Instant Recovery Not Instant

Post by alexandre.nakagawa »

image level each 12 hours. backuplog at 15 min interval.

one thing that I noted. IR publish the database and copy database files at same time;
it´s possible to change this behavior? to first publish and replay the logs, and only after publishing the database, start copying the files?
alexandre.nakagawa
Influencer
Posts: 21
Liked: 4 times
Joined: Oct 14, 2021 4:57 pm
Full Name: alexandre nakagawa
Contact:

Re: SQL Instant Recovery Not Instant

Post by alexandre.nakagawa »

My repository is 4x 30tb vmdks using storage spaces to create a storage pool of 120TB (just for this sql server)

maybe it´s better to create smaller disks to distribute the IO?

this repository is a landing zone to fast recovery, after that the backup is pushed to azure (scale-out repository)
Gostev
Chief Product Officer
Posts: 32737
Liked: 7958 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: SQL Instant Recovery Not Instant

Post by Gostev » 1 person likes this post

I would wait for the devs to investigate where the bottleneck is first. Clearly it has to deal with SQL Server performance of replaying logs from mounted backups.

Meanwhile, I think it would be a good test to do point-in-time database restore to production storage, using the same restore point. If it takes comparable time then maybe it's just what it is: SQL Server is slow to replay big transaction log on a huge database, but this process we can't control or optimize.
alexandre.nakagawa
Influencer
Posts: 21
Liked: 4 times
Joined: Oct 14, 2021 4:57 pm
Full Name: alexandre nakagawa
Contact:

Re: SQL Instant Recovery Not Instant

Post by alexandre.nakagawa »

during restore, i can see this wait in restore log command

percent_complete start_time status command Minutes to Completion Minutes Elapsed wait_type last_wait_type
100 2021-10-14 19:34:18.773 suspended RESTORE LOG 0 5 SLEEP_BPOOL_FLUSH SLEEP_BPOOL_FLUSH

when I do point in time recovery, it´s possible to see the time to replay logs? I only see the total time taken to restore database, about 15 hours.
Gostev
Chief Product Officer
Posts: 32737
Liked: 7958 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: SQL Instant Recovery Not Instant

Post by Gostev »

Yes, devs should be able to see that from debug logs.
alexandre.nakagawa
Influencer
Posts: 21
Liked: 4 times
Joined: Oct 14, 2021 4:57 pm
Full Name: alexandre nakagawa
Contact:

Re: SQL Instant Recovery Not Instant

Post by alexandre.nakagawa »

good idea.
i will stop log backups, restore database to point in time with no recovery, them I will ask to my DBA to create new log backups from sql and apply to this sql server. to see how much time it takes.

:D :mrgreen:
Post Reply

Who is online

Users browsing this forum: Amazon [Bot], J222, Semrush [Bot] and 30 guests