copy job to VCC provider problems

backupquestions · Oct 17, 2019 2:50 pm

Case # 03807945. I have a ticket at iLand as well and have given that info to Veeam support in hopes they can collaborate.

I have a Windows Server 2016 setup with DAS and ReFS volumes formatted with 64K blocks. Performance is fine locally on everything.

I have a backup copy job to an iLand repository covering 70 VMs. It usually sends around 350 GB of data per day. This ran fine for months. I provided the job report to iLand as they requested. The time to complete varied a lot, which seems odd to me: it would finish in, say, 8 hours, or 12, or 14, or sometimes even 20. But at least it would complete within the day.

All of a sudden the performance took a nosedive this past weekend, and the merge times (despite being fast clones!) have skyrocketed. This repo uses per-VM backup files. Merges used to complete in what I believe was 4 hours or so, and now I've had some take 20 or 30 hours. Since these are just metadata operations and fast clones, I would expect performance to stay where it used to be, not randomly fall off a cliff.

A Veeam engineer told me to try the defrag and compact option. I told him I can't really do that because I used to use it and it would take 3 or 4 days to run, putting us way out of compliance and leaving us 3 or 4 days behind in the cloud in case of disaster. iLand told me to turn it off and said that with only 5 restore points (which I have) it is not needed. It seems like their storage performance is just bad, or there is not enough CPU/memory (since I believe the compact operation should also be metadata-only fast cloning).

This is becoming a huge issue as we are behind on backups. We have backups as of Monday night in the cloud repo but are missing Tuesday and Wednesday due to this. If we had a disaster right now and needed them, we would be ruined.

Aside from the merge process alone, the overall job performance just isn't good enough. Our upload bandwidth is 1 Gbps, yet the job usually uploads at only 300 Mbps or so. I don't know if iLand throttles ingestion or what. But even at this speed, uploading 350 GB of data at 300 Mbps should take only 3 hours or less, judging from file transfer calculators. I know Veeam has to calculate what to send so it won't be that fast, but something seems way off. I have noticed that it takes 30 minutes from job start before the list of VMs even shows up and the transfer begins. Then it takes about 7 minutes per VM while it says "initializing storage". All of this is adding up.
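As a rough sanity check on the figures quoted above, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope estimate: the function names are made up for illustration, it assumes decimal units (1 GB = 10^9 bytes) and no protocol overhead, and the overhead figure assumes the VMs are initialized strictly one after another, which may not match how the job actually schedules its tasks.

```python
# Back-of-the-envelope check using only the numbers quoted in the post.
# Assumptions (illustrative, not taken from Veeam or iLand):
#   - decimal units: 1 GB = 10**9 bytes, 1 Mbps = 10**6 bits per second
#   - no protocol or compression overhead
#   - per-VM "initializing storage" delays happen serially


def transfer_hours(data_gb: float, rate_mbps: float) -> float:
    """Hours needed to move data_gb gigabytes at rate_mbps megabits/s."""
    bits = data_gb * 1e9 * 8
    seconds = bits / (rate_mbps * 1e6)
    return seconds / 3600


def serial_overhead_hours(startup_min: float, per_vm_min: float, vm_count: int) -> float:
    """Hours of fixed delay if every VM waits its turn (worst case)."""
    return (startup_min + per_vm_min * vm_count) / 60


if __name__ == "__main__":
    print(f"350 GB at 300 Mbps: {transfer_hours(350, 300):.2f} h")
    print(f"350 GB at 1 Gbps:   {transfer_hours(350, 1000):.2f} h")
    print(f"30 min startup + 7 min x 70 VMs: {serial_overhead_hours(30, 7, 70):.2f} h")
```

Under these assumptions the raw transfer comes to roughly 2.6 hours at 300 Mbps, while the startup and per-VM delays would add up to roughly 8.7 hours if nothing ran in parallel. Concurrent tasks would shrink that second number, so it is best read as an upper bound.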

iLand told me to reboot my Veeam server and claimed 2 other customers solved it that way, but I'm skeptical, as I thought the merge process happens on the remote end, on their system. I did reboot the Veeam server last night just to see if it would actually help.

Anyway, I'm hoping to hear there are larger customers with 70 or more VMs going to cloud connect and not having this much trouble. I've got to get this fixed as it is looking very bad for us.

We don't use WAN acceleration because iLand claims (or at least claimed when we started) that you are better off without it if your internet connection is fast, which ours is. Plus, I believe they charge extra per month to use it. The job ran without it for quite a while with no issue, so I don't think that is relevant.

backupquestions · Oct 17, 2019 4:42 pm

I've added more notes for the Veeam tech to the ticket as well. Looking at the throughput chart, I find 4 periods of time where no activity is shown: 0 KB/s read and 0 KB/s write, and these add up to over 3.5 hours. I don't know if this means the iLand repo is not responding or if it is a local server issue, but since all other jobs and repos are fine I would think it is an iLand problem. Straightening this part out would obviously help a lot, but it doesn't even address the merge time issue, which is a totally separate deal.

foggy · Oct 17, 2019 4:58 pm

backupquestions wrote:
I have noticed that it takes 30 minutes from the job start to even getting the list of VMs part to be showing and begin transfer. Then it takes like 7 minutes per vm while it says "initializing storage". All of this stuff is adding up.

Aside from the transfer, this also doesn't look right and should be investigated by our engineers in the first place. Could be some database issues, for example.

backupquestions · Oct 17, 2019 7:13 pm

Something else iLand is currently telling me is that I should split the job up into smaller jobs. I have one job with 70 VMs in it that all copy to VCC.

I don't see why this would really help, because I am using per-VM backup files on their repo. The same amount of data would need to be transferred, as there is no dedupe between the VMs. I would think running it all in one job would actually be more efficient.

Plus, I don't want to have to re-seed everything, which it seems like I would have to do if I split it up into a few jobs. I like keeping things simple, with fewer jobs and more VMs in each.

What do you think regarding that?

foggy · Oct 18, 2019 10:25 am

You wouldn't need to re-seed since you have per-VM chains - you will be able to copy backups into different folders and map the jobs to existing backups. I don't think this will help with the performance though. Investigation is required.

backupquestions · Oct 21, 2019 1:38 pm

Could Veeam push iLand and work together on this ticket? Could you have a manager push this issue? It's been several days and it is not getting better. I uploaded logs for Veeam, and I have asked iLand multiple times to upload the logs from their side, which they have not done, or at least they have not notified me that they have.

A ReFS fast clone merge on per-VM chains should perform decently fast, right? My seed size is 20 TB and my incrementals are about 300 GB total for the job each day. The merge is taking 12-17 hours or sometimes even more. I would not expect this on ReFS on their repo. On my local repo, a job with far fewer VMs (but the largest ones) takes only a few minutes to merge.

We have 2 companies on Veeam and iLand and 2 more to implement, and I won't be able to continue down this path if we can't have stable VCC jobs. There have to be larger customers using it without this much trouble with merges and stability.

Today the cloud repo says "resource not ready: cloud task" with one VM remaining in the job. It would have been done uploading several hours ago if not for this, but then the merge would start and that would probably take forever. So overall once this finally finishes it will still take over 24 hours and put me behind again.

We need to add more servers as time goes on and I don't see how that will ever work.

foggy · Oct 21, 2019 4:00 pm

You can use the Talk to a Manager option to ask for the case to be escalated to a higher tier, as well as for assistance with the investigation with iLand.
