Backup of NAS, file shares, file servers and object storage.
jcofin13
Service Provider
Posts: 217
Liked: 25 times
Joined: Feb 01, 2016 10:09 pm

Second opinion on C drive filling up over time

Post by jcofin13 »

Case #07844016
Veeam 12.3.2.3617

I'm getting some feedback on the case but no clear answers to my questions below. I'm just looking for a second opinion so I don't make a mess of the server by purging or modifying something I shouldn't.

Anyway.....
I have a Veeam server where the C drive continues to fill up. As best I can tell, it's related to the C:\ProgramData\Veeam\Backup\System\CheckpointRemoval folder. This was troubleshot with Veeam Support in April 2025 (Case #07599135).

This folder contains dated folders that go back a couple of years.

The dated folders range from very small to a few hundred MB per day; over the course of a month they currently add up to 10-12 GB. We had this issue once before, and in some instances it was causing the folders to grow by many GB per night.

Anyway.....if I go into a dated folder (C:\ProgramData\Veeam\Backup\System\CheckpointRemoval\2025-10-06\<AWS-S3-REPO-Name>) and review a log file (Agent.Cleanup.Blob.REpo.log.Index.log), I see the following:

[05.10.2025 19:28:25.034] < 33064> aws | WARN|HTTP request failed, retry in [4] seconds, attempt number [4], total retry timeout left: [1784] seconds
[05.10.2025 19:28:25.034] < 33064> aws | >> |S3 error: Please reduce your request rate.
[05.10.2025 19:28:25.034] < 33064> aws | >> |Code: SlowDown


When we had this issue in the past (April 2025), the ticket stated that we reduced the concurrent operations to S3 by modifying/adding a registry entry:

Computer\HKEY_LOCAL_MACHINE\SOFTWARE\Veeam\Veeam Backup and Replication

S3ConcurrentTaskLimit
dword: 15


As I mentioned, this seemed to help, but it's happening again. It doesn't seem as drastic; it isn't creating as many logs or using as much space as it did before (April 2025).
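
For reference, this is roughly how I check (and could change) that value with Python's standard winreg module instead of regedit. This is just my own sketch, not anything from the ticket, and the note about a service restart is my assumption:

import winreg

KEY_PATH = r"SOFTWARE\Veeam\Veeam Backup and Replication"
VALUE_NAME = "S3ConcurrentTaskLimit"

def get_limit():
    # Returns the current DWORD, or None if the value has not been created yet.
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
            value, _type = winreg.QueryValueEx(key, VALUE_NAME)
            return value
    except FileNotFoundError:
        return None

def set_limit(new_value):
    # Writes the DWORD; I assume a Veeam service restart is needed before it takes effect.
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH, 0, winreg.KEY_SET_VALUE) as key:
        winreg.SetValueEx(key, VALUE_NAME, 0, winreg.REG_DWORD, new_value)

print("Current S3ConcurrentTaskLimit:", get_limit())
# set_limit(10)   # uncomment deliberately, e.g. to drop from 15 to 10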

My questions are:

1. How do I clean this up so I can get space back, especially from the older folders? Can anyone confirm the dated folders in the CheckpointRemoval folder are OK to delete? (See the sketch after this list for the kind of cleanup I have in mind.)
2. Is the error I'm seeing in the agent cleanup logs shown above the correct place to look to see what's going on?
3. Is the correct action to reduce the registry value from 15 to 10 or lower to reduce the AWS S3 request denials, and then just keep monitoring it over time?
4. Is it normal for the dated folders in the CheckpointRemoval folder to total 10-12 GB per month?
5. Is it possible to monitor for this behavior inside Veeam rather than looking at the text logs on the server and spot checking them over time? Can Veeam ONE monitor for this type of thing?
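
For question 1, this is the kind of cleanup I'm picturing: a dry-run report of each dated folder's size, with an optional delete of folders older than the previous ticket. The cutoff date and the delete flag are my own assumptions, and I wouldn't enable the delete until Support confirms the folders are safe to remove.

import shutil
from datetime import date
from pathlib import Path

ROOT = Path(r"C:\ProgramData\Veeam\Backup\System\CheckpointRemoval")
CUTOFF = date(2025, 4, 30)   # e.g. everything from before the previous ticket
DELETE = False               # dry run by default

def folder_size(path):
    # Total size in bytes of all files under the folder.
    return sum(f.stat().st_size for f in path.rglob("*") if f.is_file())

for folder in sorted(ROOT.iterdir()):
    if not folder.is_dir():
        continue
    try:
        folder_date = date.fromisoformat(folder.name)   # folders are named YYYY-MM-DD
    except ValueError:
        continue
    size_mb = folder_size(folder) / (1024 * 1024)
    print(f"{folder.name}  {size_mb:8.1f} MB")
    if DELETE and folder_date <= CUTOFF:
        shutil.rmtree(folder)
        print(f"  deleted {folder.name}")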

My guess is that the AWS message is driving the log growth and we need to throttle down the request rate as suggested, but I'd like to know if I'm on the correct path in my troubleshooting.


All files dated prior to 4/30/2025 total ~68 GB (per the previous ticket).
All files dated 5/1/2025 to current total ~58 GB.

The folder is ~126 GB in total, which seems excessive, and it appears to be growing 10-12 GB per month due to the logs in this folder.
david.domask
Veeam Software
Posts: 3009
Liked: 697 times
Joined: Jun 28, 2016 12:12 pm

Re: Second opinion on C drive filling up over time

Post by david.domask »

Hi jcofin13,

Thank you for sharing the case numbers and for the detailed write up.

I'll go through your questions one by one, then a bit more commentary:

1. Debug logs are always safe to delete; it just means it might be difficult to troubleshoot some issues if the corresponding logs have been cleared. In this case, if it's really the S3 "SlowDown" response filling the logs, it's fine to delete the older ones.
2. Likely yes, but without a review of the debug logs and the environment it will be difficult to give a more assertive answer (see the commentary below).
3. Potentially it can help, yes. This configuration parameter changes the number of concurrent HTTP requests made during S3 operations; however, it works in conjunction with the total concurrent tasks allowed for the repository. Each concurrent task can have up to S3ConcurrentTaskLimit HTTP connections at once, so with 10 tasks and S3ConcurrentTaskLimit set to 10 you could see up to 100 concurrent HTTP requests (see the sketch after this list).
4. Depends on the environment -- if you have many S3 repositories and a lot of jobs using them, potentially, but at first blush this seems a bit high.
5. Can you be more specific on this one? Do you mean the log folder size or the S3 "SlowDown" responses? For the latter, Veeam utilizes an exponential 'back-off' mechanism whenever an S3 system returns a SlowDown response; during these periods, Veeam slows the rate of requests.
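
To make point 3 and the back-off in point 5 concrete, here is a rough illustration only (not our actual code) of the arithmetic and of a generic exponential back-off loop with jitter:

import random
import time

# Point 3: upper bound on concurrent HTTP requests against the bucket (example values).
repo_concurrent_tasks = 10        # repository "max concurrent tasks" setting
s3_concurrent_task_limit = 10     # the registry value discussed above
print("worst case concurrent requests:",
      repo_concurrent_tasks * s3_concurrent_task_limit)   # 100

# Point 5: generic exponential back-off pattern for a throttled request.
class SlowDownError(Exception):
    # Stand-in for an HTTP 503 response with "Code: SlowDown".
    pass

def call_with_backoff(request, max_attempts=8, base_delay=1.0, max_delay=60.0):
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except SlowDownError:
            if attempt == max_attempts:
                raise
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            time.sleep(delay * random.uniform(0.5, 1.0))   # jitter spreads retries out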

As for the issue itself, as alluded to above, the config parameter may help, but depending on how heavily S3 is being utilized there may be other places to check.

I would advise continuing with Support and pushing for a closer review of the behavior here (e.g., does it occur only during deletes, only during specific time periods, etc.) -- if it's really about checkpoint removal then there may be other options to adjust, but it's best to wait for Support's review.

If there are concerns on the handling of the case, please use the Talk to a Manager button in the case portal; this will connect you with Support Management and they will review the case and allocate additional resources if necessary.
David Domask | Product Management: Principal Analyst
jcofin13
Service Provider
Posts: 217
Liked: 25 times
Joined: Feb 01, 2016 10:09 pm

Re: Second opinion on C drive filling up over time

Post by jcofin13 »

Thank you. I will continue to wait for Support's response and work with them, but I appreciate this.

I will clean up some of the older logs in the System\CheckpointRemoval folder dated prior to April of this year (when we had our original ticket on this).

For 5, I'm specifically looking for alerting on the SlowDown message in the logs with an outside tool like Veeam ONE, or with VBR itself, so it alerts on it and we can get it looked at before the drive becomes full.
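
In the meantime, something as simple as this (my own sketch, not a Veeam or Veeam ONE feature), scheduled daily, would probably cover it: count the SlowDown lines in yesterday's CheckpointRemoval logs and alert past some arbitrary threshold.

from datetime import date, timedelta
from pathlib import Path

ROOT = Path(r"C:\ProgramData\Veeam\Backup\System\CheckpointRemoval")
THRESHOLD = 100   # alert if more than this many SlowDown lines in one day (arbitrary)

yesterday = (date.today() - timedelta(days=1)).isoformat()
day_folder = ROOT / yesterday

hits = 0
if day_folder.is_dir():
    for log_file in day_folder.rglob("*.log"):
        with open(log_file, errors="ignore") as fh:
            hits += sum(1 for line in fh if "Code: SlowDown" in line)

print(f"{yesterday}: {hits} SlowDown responses")
if hits > THRESHOLD:
    print("ALERT: S3 is throttling requests -- check job activity and the request rate.")
    # hook this into whatever email/monitoring mechanism you already have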

I wonder if the request rate it's complaining about is per bucket, per source IP, or per AWS account, or what the actual metric is that causes it to request a slow-down. The account itself has a single bucket. I'm going to look at the AWS side today to see if there is any indication of this error.