mdiver
Veeam Legend
Posts: 201
Liked: 33 times
Joined: Nov 04, 2009 2:08 pm
Location: Heidelberg, Germany
Contact:

Potential crash with log-shipping databases

Post by mdiver »

Log-shipping with VBR involves the SQL or Oracle server, optionally a log-shipping server and a repo to store the VLBs.
With Hyper-V underneath, VBR might also log-ship directly through the hypervisor by using PowerShell direct.

During the process VBR creates temp files of the logs to be shipped. First on the DB server itself and then on the log-shipping server.
These temp files get deleted once the logs have been shipped to the repo.

On the DB server the temp path for logs is defined in the registry and since 9.5U4a defaults to the largest disk the server has.
The path can be changed: https://www.veeam.com/kb2642

On the log-shipping server the process is always using the %TEMP% path to store the data.

See also: veeam-backup-replication-f2/sqltemplogp ... 77142.html

The problem is that VBR does not seem to check the free space on either path before it copies the raw transaction log data during each cycle.
So a large transaction log being shipped might completely fill up the disk of the corresponding system.

This seems especially critical on a Hyper-V host that might have only a single boot device, posing a risk to all VMs on the hypervisor.

Any experience or best practices on that?
PetrM
Veeam Software
Posts: 3264
Liked: 528 times
Joined: Aug 28, 2013 8:23 am
Full Name: Petr Makarov
Location: Prague, Czech Republic
Contact:

Re: Potential crash with log-shipping databases

Post by PetrM »

Hello,

I'd say the primary question here is why the transaction log is so large that it can fill up the disk of the corresponding system. Basically, logs are truncated automatically after backup, so perhaps you should shorten the log backup interval to create log backups more frequently? By the way, periodically shrinking the logs might be a way to go as well.

One more detail to remember is that log shipping server will be used only if there is no direct connection between the guest VM and the repository. If the connection exists, logs will be transferred directly to the repository bypassing the log shipping server.

Thanks!
mdiver
Veeam Legend
Posts: 201
Liked: 33 times
Joined: Nov 04, 2009 2:08 pm
Location: Heidelberg, Germany
Contact:

Re: Potential crash with log-shipping databases

Post by mdiver »

I can't fully agree here, @PetrM.

First of all, AFAIK logs are truncated after the image-level backup, not during the log-shipping cycles.
But that doesn't matter here, since we are talking about the logs newly produced in a single cycle anyway.

As you mentioned, during normal operations a database might have much smaller transaction logs during each cycle.
But there are things like database maintenance plans that might put a lot of stress on the DB and generate a lot of changes.
As the Veeam log-shipping runs 24/7, you will at some stage face those large transactions.

You are right - log-shipping via network will be used when available - but in some cases (e.g. larger enterprise environment or hosting providers) you especially want to leverage network isolation.
Using the PowerShell direct way for the logs through the Hyper-V host is a great option to achieve that.

But here it gets even more dangerous: when the Hyper-V host becomes the log-shipping server through PS direct, the logs are written to a directory that is not documented anywhere, AFAIK.
For a regular log-shipping server it is %temp%. For Hyper-V it becomes %windir%\<GUID>, so it seems it will fill up your Hyper-V boot disk. I did not find a method to change this.
The risk here is that e.g. a maintenance plan or even a rogue server operator might crash the virtual infrastructure just through large transactions (a kind of DoS attack).

IMHO VBR should check the total size of the logs accumulated during each cycle BEFORE transferring them to the temporary folders on each of the three systems involved (DB server, log-shipping server, Hyper-V host) to prevent the disk from filling up.
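
A minimal Python sketch of the kind of pre-flight check suggested here. This is not how VBR works internally; `has_room`, its parameters and the reserve value are hypothetical and only illustrate the idea of comparing the size of the pending logs with the free space on the staging volume before copying anything.

```python
import shutil


def has_room(path: str, needed_bytes: int, reserve_bytes: int = 0) -> bool:
    """Return True if the volume holding `path` can take `needed_bytes`
    and still keep at least `reserve_bytes` free afterwards.

    `path` must exist. `reserve_bytes` is an assumed safety margin,
    not a Veeam setting.
    """
    free = shutil.disk_usage(path).free
    return free - needed_bytes >= reserve_bytes


if __name__ == "__main__":
    GB = 1024 ** 3
    # Hypothetical numbers: a 10 GB log chunk, keep 20 GB free on the volume.
    if has_room(".", 10 * GB, reserve_bytes=20 * GB):
        print("enough space, staging could proceed")
    else:
        print("not enough space, staging should be refused with an error")
```

The point of the reserve is that a boot disk should never be driven to 0 bytes free, so the check refuses earlier than a plain "does it fit" comparison would.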

If the log-shipping job gets stuck because of a full disk, there is no way to clear things up from inside VBR. You have to manually delete those logs, switch the DB to simple mode and restart the Veeam Backup service.

Thanks
Mike
mdiver
Veeam Legend
Posts: 201
Liked: 33 times
Joined: Nov 04, 2009 2:08 pm
Location: Heidelberg, Germany
Contact:

Re: Potential crash with log-shipping databases

Post by mdiver »

PetrM wrote: Oct 19, 2021 5:56 pm Hello,

I'd say that the primary question to answer here would be to understand why is transaction so large that it can fill up the disk of the corresponding system? Basically, logs are truncated after backup automatically, perhaps you should change log backup interval to create log backups more frequently? By the way, periodic shrink of logs might be a way to go as well.

One more detail to remember is that log shipping server will be used only if there is no direct connection between the guest VM and the repository. If the connection exists, logs will be transferred directly to the repository bypassing the log shipping server.

Thanks!
Case was filed: #05090044

Thanks
Mike
PTide
Product Manager
Posts: 6431
Liked: 729 times
Joined: May 19, 2015 1:46 pm
Contact:

Re: Potential crash with log-shipping databases

Post by PTide »

@mdiver,
mdiver wrote: IMHO VBR should check the total size of the logs accumulated during each cycle BEFORE transferring them to the temporary folders on each of the three systems involved (DB server, log-shipping server, Hyper-V host) to prevent the disk from filling up.

Question 1:

Assuming that that thing is implemented, what would you suggest the backup server should do next? Stop the log replication completely, search for another volume with more free space, or something else?

Question 2:

Should VBR UI offer an option for users to define a specific place to store the logs?

Thanks!
mdiver
Veeam Legend
Posts: 201
Liked: 33 times
Joined: Nov 04, 2009 2:08 pm
Location: Heidelberg, Germany
Contact:

Re: Potential crash with log-shipping databases

Post by mdiver » 1 person likes this post

Rather than crashing the database or hypervisor server, I would prefer log-shipping to be stopped with an error shown in the GUI.
It will stop anyway once the disk is completely full, but then without any control from within the GUI: the log-shipping process gets stuck and you need a VBR service restart.

The RegKey is OK. But it seems to me this key is for the SQL server itself, while the log-shipping server uses %temp%. In the special case of Hyper-V with PS direct, however, it's %windir%. Maybe this can be adjusted already? I didn't find anything on this.

The problem is valid on three layers:
  • Database server
  • log-shipping server (optional)
  • Hyper-V host (optional)
On all three levels VBR could know in advance whether the next chunk of logs will fit. Instead of shipping and crashing, it could stop the job gracefully.
Once the space problem is solved, log-shipping could resume automatically, so no GUI changes would be needed.
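
To make the "stop gracefully, resume later" idea concrete, here is a small Python sketch of a per-cycle decision across the three layers. It is only an illustration: the `Hop` class, the hop names and the `decide` function are made up, and the free-space figures would have to come from the real systems.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Hop:
    """One system that stages a temporary copy of the logs (hypothetical model)."""
    name: str
    free_bytes: int
    reserve_bytes: int = 0  # assumed safety margin to keep free


def decide(log_bytes: int, hops: List[Hop]) -> Tuple[str, List[str]]:
    """Return ("ship", []) if every hop can take the chunk, otherwise
    ("defer", [names of the hops that would run short])."""
    blocked = [h.name for h in hops
               if h.free_bytes - log_bytes < h.reserve_bytes]
    return ("defer", blocked) if blocked else ("ship", [])


if __name__ == "__main__":
    GB = 1024 ** 3
    hops = [
        Hop("db-server", 200 * GB),
        Hop("log-shipping-server", 50 * GB),
        Hop("hyperv-host", 5 * GB, reserve_bytes=2 * GB),
    ]
    print(decide(10 * GB, hops))  # the Hyper-V host would run short
```

Because the decision is re-evaluated on every cycle, a deferred job would pick up again by itself as soon as enough space is free on the blocking hop.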

Thanks.
PetrM
Veeam Software
Posts: 3264
Liked: 528 times
Joined: Aug 28, 2013 8:23 am
Full Name: Petr Makarov
Location: Prague, Czech Republic
Contact:

Re: Potential crash with log-shipping databases

Post by PetrM » 1 person likes this post

Hello,

OK, I've got the idea. Let's consider it one of the potential improvements for future releases; however, we cannot prioritize it, as we don't see such requests often. Moreover, I have never seen a support case related to the described scenario, although it is no doubt possible in theory.

I'm not sure what the purpose of support case 05090044 is, as everything seems to work by design. Nevertheless, I'll try to find out whether there is an option to change the path for log-shipping servers and Hyper-V hosts. For the database server itself, it is:

HKEY_LOCAL_MACHINE\SOFTWARE\Veeam\Veeam Backup and Replication
Key: SqlTempLogPath
Type: REG_SZ
Thanks!
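
For anyone who wants to roll out the value above to several DB servers, here is a small Python sketch that only builds the text of a .reg file for it. The key path and the value name are taken from the post; the function name `reg_file_text` and the example path D:\VeeamTemp are made up for illustration. Please check KB2642 before importing anything into a production server.

```python
# Key path as given in the post above.
VEEAM_KEY = r"HKEY_LOCAL_MACHINE\SOFTWARE\Veeam\Veeam Backup and Replication"


def reg_file_text(value_name: str, path: str) -> str:
    """Build the text of a .reg file that sets one REG_SZ value under VEEAM_KEY.

    Backslashes in string values have to be doubled in .reg files.
    """
    escaped = path.replace("\\", "\\\\")
    return (
        "Windows Registry Editor Version 5.00\r\n"
        "\r\n"
        f"[{VEEAM_KEY}]\r\n"
        f'"{value_name}"="{escaped}"\r\n'
    )


if __name__ == "__main__":
    # D:\VeeamTemp is a hypothetical target folder on a non-system volume.
    print(reg_file_text("SqlTempLogPath", r"D:\VeeamTemp"))
```

The function only returns text and does not touch the registry, so the result can be reviewed before it is imported on a server.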
mdiver
Veeam Legend
Posts: 201
Liked: 33 times
Joined: Nov 04, 2009 2:08 pm
Location: Heidelberg, Germany
Contact:

Re: Potential crash with log-shipping databases

Post by mdiver » 1 person likes this post

For the log-shipping server it's %temp%, so that is changeable as well.
For Hyper-V, such a setting would still be needed.

Thanks.
mdiver
Veeam Legend
Posts: 201
Liked: 33 times
Joined: Nov 04, 2009 2:08 pm
Location: Heidelberg, Germany
Contact:

Re: Potential crash with log-shipping databases

Post by mdiver »

PetrM wrote: Oct 21, 2021 3:51 pm Hello,

Ok, I've got the idea. Let's consider it as one of the potential improvements for future releases, however we cannot prioritize it as we don't see these requests quite often. Moreover, I never saw a support case related to the described scenario which is no doubts possible in theory.

Not sure what is the purpose of the support case 05090044, it seems everything works by design. Nevertheless, I'll try to find out if there is an option to change path for log shipping servers and Hyper-V hosts. For database server itself, it is:

HKEY_LOCAL_MACHINE\SOFTWARE\Veeam\Veeam Backup and Replication
Key: SqlTempLogPath
Type: REG_SZ
Thanks!
@PetrM, thanks for investigating.
The customer had a Hyper-V host stop because VBR filled its boot disk with logs down to 0 bytes free, which he did not expect to happen by design.

REG_SZ above is only valid for the log-shipping server. On the Hyper-V host we see those temp files appear in C:\WINDOWS\<GUID>\<GUID>.bak
They are being generated by a process "VeeamPSDirectCtrl_X64.exe" on the Hyper-V host itself. Is there a way to change the directory here?

Thanks
Mike
PetrM
Veeam Software
Posts: 3264
Liked: 528 times
Joined: Aug 28, 2013 8:23 am
Full Name: Petr Makarov
Location: Prague, Czech Republic
Contact:

Re: Potential crash with log-shipping databases

Post by PetrM »

Hi Mike,

Agree that the issue is not expected at all, thanks for clarifying!

There is a way to change the path on the Hyper-V host, you may try this key:

HKEY_LOCAL_MACHINE\SOFTWARE\Veeam\Veeam Backup and Replication
Key: TempPathDir
Type: REG_SZ
Thanks!
jotge
Influencer
Posts: 18
Liked: never
Joined: May 20, 2019 11:44 am
Full Name: Jan Groschopp
Location: Deutschland
Contact:

[MERGED] Feature Request: VBR should not use a hard-coded location for log backups in a Hyper-V environment.

Post by jotge »

Hi all!

We observe the following issue when backing up SQL transaction logs via PowerShell Direct in a Hyper-V environment:
1. The log files to be backed up are temporarily stored as .bak files in the VM under "SqlTempLogPath".
2. These files are then forwarded to the Hyper-V host where the VM is currently hosted. The .bak files are stored in C:\Windows\xxxxxxx-xxxxxxx-xxxxxxxxx\.
3. The .bak files are transferred to the Veeam repository server and stored in the VM's backup repository in a Veeam proprietary format.
4. The cached .bak files on the Hyper-V host and in the VM are deleted, and the space used for them is freed.

From our point of view, this procedure has a great potential for system crashes of all VMs running on the Hyper-V host if the C: drive fills up due to this process.

Unfortunately, we run into this again and again: very large log files fill up drive C: of our Hyper-V hosts down to a few MB of free space. We then have to intervene manually (stop the backup job, delete the .bak files, shrink the SQL log files, etc.) to prevent worse.

We addressed this issue almost a year ago in Veeam Case #05090044 and Veeam support told us that this issue was handled internally as a Feature Request at Veeam.

What is the status of this FR, when can we expect a resolution?

As a service provider, we run a very large Hyper-V environment with about 16 clusters and 80 Hyper-V hosts running over 1000 VMs in different customer environments. Each customer environment runs at least one, and up to four, SQL log backup jobs. It is almost a gamble as to when which Hyper-V host will be brought down by the process described.

Again the FR: It must be possible to store these cached .bak files on a drive other than C:.

BTW: The same is true for Oracle log backups.


Thank you and have a nice day

Jan
Mildur
Product Manager
Posts: 8735
Liked: 2294 times
Joined: May 13, 2017 4:51 pm
Full Name: Fabian K.
Location: Switzerland
Contact:

Re: Potential crash with log-shipping databases

Post by Mildur »

Hi Jan

Can you try the registry value from Petr's post?

Thanks
Fabian
Product Management Analyst @ Veeam Software
jotge
Influencer
Posts: 18
Liked: never
Joined: May 20, 2019 11:44 am
Full Name: Jan Groschopp
Location: Deutschland
Contact:

Re: Potential crash with log-shipping databases

Post by jotge »

Hello Fabian,

I have set the key and observe the same behavior as before.

The logs are temporarily stored on the Hyper-V host in C:\WINDOWS\<GUID>\<GUID>.bak.

Regards
Jan
mdiver
Veeam Legend
Posts: 201
Liked: 33 times
Joined: Nov 04, 2009 2:08 pm
Location: Heidelberg, Germany
Contact:

Re: Potential crash with log-shipping databases

Post by mdiver »

Any news on this? Maybe towards V12 GA?

The regkeys provided above don't solve the issue.
The Hyper-V host always uses C:\WINDOWS\<GUID>\<GUID>.bak as the temporary location for the logs during the transfer via PS direct.

Thanks,
Mike
Mildur
Product Manager
Posts: 8735
Liked: 2294 times
Joined: May 13, 2017 4:51 pm
Full Name: Fabian K.
Location: Switzerland
Contact:

Re: Potential crash with log-shipping databases

Post by Mildur »

Hi Mike

We have it on our roadmap, but it will not be a V12 feature. Unfortunately I cannot provide a version or ETA when it will be included.

Thanks
Fabian
Product Management Analyst @ Veeam Software
jotge
Influencer
Posts: 18
Liked: never
Joined: May 20, 2019 11:44 am
Full Name: Jan Groschopp
Location: Deutschland
Contact:

Re: Potential crash with log-shipping databases

Post by jotge »

Hello,

We had another incident today.

The C: drive of one of the Hyper-V cluster nodes filled up because of log shipping. As a result, 17 other VMs from 5 different customer environments were potentially compromised. Fortunately, we detected the problem in time.

This time we were still lucky...

Regards

Jan
GabesVirtualWorld
Expert
Posts: 244
Liked: 38 times
Joined: Jun 15, 2009 10:49 am
Full Name: Gabrie van Zanten
Contact:

Re: Potential crash with log-shipping databases

Post by GabesVirtualWorld »

Glad I found this thread. The C: drive of a Hyper-V host had run out of space and we found a lot of these <GUID>.bak files. We couldn't trace where they were coming from until I found this thread. We were then able to find the backups causing this.

It would be great if there were a permanent fix so that logs are only written if at least, say, 20 GB of free space is left.
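
Until such a fix exists, a small scan can at least show where the space went. The Python sketch below looks for <GUID>.bak files inside <GUID> folders directly under a given root, which is the layout reported in this thread for C:\WINDOWS on the Hyper-V host. The function name `find_staged_baks` is made up, and the assumption that the names are standard 8-4-4-4-12 GUIDs is mine. Scanning the Windows directory will usually need an elevated prompt.

```python
import os
import re

# Assumption: folder and file names are standard GUIDs, optionally in braces.
GUID_RE = re.compile(
    r"^\{?[0-9a-fA-F]{8}-(?:[0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}\}?$"
)


def find_staged_baks(root):
    """Return a list of (path, size_in_bytes) for every <GUID>.bak file
    found in a <GUID> subfolder directly under `root`."""
    hits = []
    with os.scandir(root) as entries:
        for entry in entries:
            if not entry.is_dir(follow_symlinks=False):
                continue
            if not GUID_RE.match(entry.name):
                continue
            with os.scandir(entry.path) as files:
                for f in files:
                    stem, ext = os.path.splitext(f.name)
                    if f.is_file() and ext.lower() == ".bak" and GUID_RE.match(stem):
                        hits.append((f.path, f.stat().st_size))
    return hits


if __name__ == "__main__":
    root = os.environ.get("WINDIR", ".")
    found = find_staged_baks(root)
    total = sum(size for _, size in found)
    print(f"{len(found)} staged .bak file(s), {total / 1024 ** 3:.2f} GiB in total")
```

Run on a schedule and combined with a free-space threshold such as the 20 GB mentioned above, this could feed an alert before the boot disk is actually full.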
jotge
Influencer
Posts: 18
Liked: never
Joined: May 20, 2019 11:44 am
Full Name: Jan Groschopp
Location: Deutschland
Contact:

Re: Potential crash with log-shipping databases

Post by jotge »

Hello Gabrie,

We are still in contact with Veeam about the problem.

The current status is that an originally planned private fix will not come. Veeam does not classify the problem as a bug but sees it as a feature request. Accordingly, a solution from Veeam will come with v12a at the earliest. Veeam has promised to work on the problem. What the solution will look like, and whether it really comes with v12a, remains open; I have no further information.
jotge
Influencer
Posts: 18
Liked: never
Joined: May 20, 2019 11:44 am
Full Name: Jan Groschopp
Location: Deutschland
Contact:

Re: Potential crash with log-shipping databases

Post by jotge »

Hello,

Time for a conclusion on the implementation of the feature request in v12.1.

As a reminder: the request we formulated in Case #05090044 on 21.10.2021 was as follows:
Now we have the following requests:

Workaround and/or Feature Request on VM level:
When backing up SQL and Oracle transaction logs in Hyper-V environments using PowerShell Direct, before creating the copy of the log file currently being backed up, it should be checked if there is enough free space at the copy destination. If not, the backup operation should be aborted with an error message and a corresponding log entry should be displayed in the log backup job.

Workaround and/or Feature Request on Hyper-V level:
Since backup via PowerShell Direct can only be used in Hyper-V environments, we prefer for the log shipping process “VeeamPSDirectCtrl_X64.exe” to use the VM's home directory for the files to be temporarily stored instead of the default directory "%WINDIR%". If this should not be possible, we need at least a solution to change the default directory to another path, e.g. via a registry key. In both cases the process should check if there is enough free space before storing the temporary files in the destination directory. If not, the backup process has to be aborted with an error message and a corresponding log entry should be displayed in the log backup job.

As far as I can see, the only thing realized in v12.1 is that the registry key mentioned by PetrM now works.

HKEY_LOCAL_MACHINE\SOFTWARE\Veeam\Veeam Backup and Replication
Key: TempPathDir
Type: REG_SZ
The possibility of setting the target for the temporary logshipping data at hypervisor level in the VM's home directory could obviously not be realized.

The following points also remain open:
Workaround and/or Feature Request on VM level:
...it should be checked if there is enough free space at the copy destination...
Workaround and/or Feature Request on Hyper-V level:
...the process should check if there is enough free space before storing the temporary files in the destination directory....
In our view, an important problem with the way Veeam processes the logshipping data still exists. Even if a drive is provided at the hypervisor level as a logshipping destination, when it's full, all guest VMs that also use this drive will be affected. This can have an impact across tenants, as a hypervisor host runs VMs from different tenants. As a result, databases with transaction log processing and log backup enabled by Veeam will no longer process data - which means a standstill. The most elegant way to avoid this would be to use the home directory of the VM that generates the logshipping data. If this is full, then only the VM that caused the logshipping data is affected.

Another important point from a service provider's point of view is, of course, the cost of providing additional storage. We only want to provide as much storage per client as is actually needed and then invoice for it. This is not possible with a global setting of the target directory at the hypervisor level.

We would like to renew our original feature request.

Thanks and have a nice day

Jan