We have a SOBR with a local extent (NAS CIFS share) and object storage as the capacity extent (Wasabi). To be honest, I made a small mistake: instead of using copy jobs I just created a reverse incremental backup job (using this SOBR as the repository) and specified the offload policy. Now I've got 7 VMs in my backup chain and roughly 750 restore points per VM, of which about 650 have been offloaded.
A few weeks ago something went wrong and Veeam started to offload to a new folder on S3, which means that all "old" restore points on S3 became unavailable (case # 03620219). We don't know why, but we tried to fix it with the following action plan:
- download the restore points that had already been offloaded to the new folder (see the sketch right after this list)
- drop the new folder on S3
- update the database so it points to the correct S3 folder again
- run a rescan
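To illustrate the first step: pulling everything under the new offload folder straight out of the bucket can be done with something like the sketch below. Bucket name, prefix, endpoint and paths are placeholders, and this is just plain boto3 against the Wasabi S3 endpoint, not anything Veeam-specific:

# Rough sketch of the "download the new folder" step, assuming plain S3 access
# to the Wasabi bucket with boto3. Bucket, prefix, endpoint and target path
# are placeholders - the real values come from the capacity extent config.
import os
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.eu-central-1.wasabisys.com",  # placeholder region
    aws_access_key_id="...",
    aws_secret_access_key="...",
)

BUCKET = "veeam-capacity-tier"        # placeholder
PREFIX = "Veeam/Archive/NewFolder/"   # placeholder for the "new" offload folder
TARGET = r"D:\s3-download"            # local staging area

paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX):
    for obj in page.get("Contents", []):
        key = obj["Key"]
        local_path = os.path.join(TARGET, key.replace("/", os.sep))
        os.makedirs(os.path.dirname(local_path), exist_ok=True)
        s3.download_file(BUCKET, key, local_path)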
We're still working on the issue, but the reason I created this thread is the rescan performance. I have the feeling this process could be improved; here are my observations:
- when Veeam downloads the index files to the temp directory (c:\windows\temp\), it does this one at a time. It doesn't download files in parallel; it waits for some other task to finish and then downloads the next file. It would be faster to download ahead of time and in parallel (see the first sketch after this list)
- Veeam synchronizes the indexes one by one, comparing/synchronizing the index from S3 with the local one. On a few attempts we ran out of disk space (because of the huge size of the downloaded indexes) after several indexes had already been processed successfully, and when I restarted the rescan it started from the beginning! It didn't realize it had already processed some indexes and therefore wasted a lot of time. It would be better to use checksums to compare the S3 contents with the local ones, so already-processed indexes can be skipped (see the second sketch after this list)
- the whole rescan took 5 days, and some indexes took 24 hours to process. I noticed that the first 100 index files were processed very quickly (in about one hour), but the further the processing went, the slower it got. So once again: for e.g. 650 restore points, the first 100 were processed in about 1 hour and the remaining 550 needed 23 hours. I also noticed that Veeam was reading around 300-600 Mbit/s from the CIFS share (local extent), and I really don't get why it reads that much data
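On the first point, what I mean by "download ahead of time and in parallel" is roughly the following sketch: prefetch the next index files with a small worker pool while the current one is being processed. Again, bucket, prefix, temp path and the process_index() function are placeholders for illustration, not anything Veeam actually exposes:

# Sketch of prefetching index files in parallel while processing continues.
# boto3 clients are thread-safe, so a small worker pool is enough here.
import os
from concurrent.futures import ThreadPoolExecutor
import boto3

s3 = boto3.client("s3", endpoint_url="https://s3.eu-central-1.wasabisys.com")
BUCKET = "veeam-capacity-tier"        # placeholder
PREFIX = "Veeam/Archive/Indexes/"     # placeholder for the index objects
TEMP = r"C:\Windows\Temp\rescan"

def fetch(key):
    local = os.path.join(TEMP, os.path.basename(key))
    s3.download_file(BUCKET, key, local)
    return local

def process_index(path):
    # stand-in for whatever the rescan does with a downloaded index
    pass

keys = []
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX):
    keys += [o["Key"] for o in page.get("Contents", [])]

os.makedirs(TEMP, exist_ok=True)
with ThreadPoolExecutor(max_workers=4) as pool:
    # downloads run ahead in the background; processing consumes them in order
    for local_path in pool.map(fetch, keys):
        process_index(local_path)
        os.remove(local_path)  # keep the temp footprint small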
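And on the second point, the restart-from-zero behaviour could be avoided with something as simple as a small manifest of already-processed indexes, keyed by the object's ETag, so a restarted rescan can skip anything that hasn't changed. (The ETag only equals the MD5 for non-multipart uploads, but as a "has this object changed" marker it would be good enough.) Again just a sketch with placeholder names:

# Sketch of resumable processing: remember the ETag of every index already
# handled and skip it on restart if the object hasn't changed.
import json
import os
import boto3

s3 = boto3.client("s3", endpoint_url="https://s3.eu-central-1.wasabisys.com")
BUCKET = "veeam-capacity-tier"     # placeholder
PREFIX = "Veeam/Archive/Indexes/"  # placeholder
MANIFEST = r"C:\Windows\Temp\rescan-manifest.json"

done = {}
if os.path.exists(MANIFEST):
    with open(MANIFEST) as f:
        done = json.load(f)

paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX):
    for obj in page.get("Contents", []):
        key, etag = obj["Key"], obj["ETag"]
        if done.get(key) == etag:
            continue  # already processed in a previous run, object unchanged
        # ... download and synchronize this index (as in the sketch above) ...
        done[key] = etag
        with open(MANIFEST, "w") as f:
            json.dump(done, f)  # persist progress after every index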
Can somebody please shed some light on this process? Why is it so slow? How could it be accelerated (on my side)? For how many restore points has the object storage rescan been tested/designed? As I said, my approach wasn't ideal, but what if you have a disaster in the future and need to access your backups from S3 - if you then have to wait 5 days until the rescan has finished, it would be more than painful.
By the way, during the rescan the VM had 1 GB of RAM available and the local temp directory points to an SSD. I've already asked the support team for clarification, but they told me that QA is currently under heavy load testing the new v10, which of course has high priority, so I hope I get my answers here on the forum.
Thanks in advance!