-
- Veteran
- Posts: 264
- Liked: 30 times
- Joined: Mar 22, 2011 7:43 pm
- Full Name: Ted
- Contact:
Incremental to rollback transform: possible bottlenecks?
Transforming my backups from incremental to rollback is taking far longer with v6. I'm just trying to find a bottleneck, but I'm not seeing anything.
With v5, the backup/transform combination took around 6 hours. Since v6, it's been taking at least 24 hours. The incremental file sizes have been similar.
CPU on the backup server (which is also the proxy) runs between 20% and 30%. It's not doing anything else and has 4 vCPUs. The server has 6 GB of RAM, which I'd think is enough, though Task Manager shows 0 free (no process uses more than 75 MB). Maybe I'll hunt around and try to find that memory -- file cache size issue I vaguely remember reading about.
The job doesn't seem to track the bottleneck during a transform . . ?
The storage is on an iSCSI SAN. Admittedly low-end. However, the transform job is only reading/writing at around 5 MB/s. I can copy files to and from the backup drive during the transform at 50 MB/s. Even allowing for RAID, iSCSI, and randomness that is a very big drop in storage performance.
Could there be something else going on or somewhere else I should look to identify a bottleneck? I don't want to automatically think it could be a v6 issue, but it's hard to ignore the definite difference from 5 to 6 when I don't believe anything else would have changed in my environment.
-
- Chief Product Officer
- Posts: 31707
- Liked: 7212 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: transform incremental to rollback possible bottlenecks
Please note that the transform operation is performed by the backup repository agent, not the backup proxy. However, I assume that your backup repository is also local.
averylarry wrote: The storage is on an iSCSI SAN. Admittedly low-end. However, the transform job is only reading/writing at around 5 MB/s. I can copy files to and from the backup drive during the transform at 50 MB/s. Even allowing for RAID, iSCSI, and randomness that is a very big drop in storage performance.
Actually, a performance drop of an order of magnitude is quite expected with regular hard drives. See the below for a sequential vs. random I/O performance comparison by hard drive type (regular vs. SSD); it speaks for itself...
Nevertheless, I will ask QC to compare v5 and v6 on the same data.
-
- Veteran
- Posts: 264
- Liked: 30 times
- Joined: Mar 22, 2011 7:43 pm
- Full Name: Ted
- Contact:
Re: transform incremental to rollback possible bottlenecks
Thanks -- I know that drastic drops from random access are common. This just seems like way more than can be ascribed to storage. Not that it can't be -- but I can max out at 200MB/s sequential (2 Gb aggregated iSCSI ports).
So before I just say that it's storage, I'd like to explore other possibilities.
-
- Novice
- Posts: 3
- Liked: never
- Joined: Dec 22, 2011 8:10 pm
- Full Name: Dave Bartel
- Contact:
Re: transform incremental to rollback possible bottlenecks
Perhaps I'm misinterpreting things here, but as I understand it, your Veeam server is pulling data from this SAN, performing the transform and pushing the transformed data back to the same SAN?
If this is the case, what you're likely running into is performance issues when trying to read from and write to the disk at the same time. I have run into similar performance issues, especially in some of the cheaper SAN solutions using software RAID. Also, have you examined your CPU/memory utilization during the transformation process? Task manager should be able to show you quite quickly if you are being bottlenecked by one of those resources.
I'm guessing you've probably checked these items already. But if not, food for thought.
-
- VP, Product Management
- Posts: 6027
- Liked: 2855 times
- Joined: Jun 05, 2009 12:57 pm
- Full Name: Tom Sightler
- Contact:
Re: transform incremental to rollback possible bottlenecks
Did the target change with the move from V5 -> V6? What is the target? Do you have any load balancing in use talking to this iSCSI target? (I ask because an earlier forum post indicated that disabling round-robin load-balancing made a huge difference in that poster's case.)
Also, how many spindles do you have? Synthetic full is all about random IOPS. It only takes 3-4 drives to get to 200MB/s sequential throughput, but those same drives will deliver at best maybe 250-400 random IOPS.
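To put those two figures side by side, here is a rough back-of-the-envelope sketch of how a random IOPS budget translates into throughput. The 64 KB I/O size is purely an assumption for illustration (the thread does not say what block size the transform uses), and `iops_to_mb_per_s` is a hypothetical helper name.

```python
def iops_to_mb_per_s(iops: float, io_size_kb: float) -> float:
    """Convert an IOPS figure into MB/s for a given I/O size (1 MB = 1024 KB)."""
    return iops * io_size_kb / 1024.0


# Random IOPS range quoted above for 3-4 drives.
for iops in (250, 400):
    # 64 KB per I/O is an assumed value, not something taken from Veeam.
    print(f"{iops} IOPS x 64 KB = {iops_to_mb_per_s(iops, 64):.1f} MB/s")
```

Under that assumption the same drives that stream 200MB/s sequentially would land somewhere in the range of 15 to 25 MB/s for purely random I/O, before any RAID overhead.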
-
- Veteran
- Posts: 264
- Liked: 30 times
- Joined: Mar 22, 2011 7:43 pm
- Full Name: Ted
- Contact:
Re: transform incremental to rollback possible bottlenecks
1) CPU and memory -- as stated above, 20%-30% cpu. Memory is totally chewed up by OS cache, but that's how 2008 works. No processes are eating up ram. Good memory discussion here: Memory performance, but the hotfix said it doesn't apply to my server.
2) The target did not change. The server is a virtual machine and has the same virtual disk that it has always used.
3) I just tend to think storage is the wrong path to follow because I did not have this bad of performance when running the same thing under v5. I have 11 spindles running raid 6. I've tried fixed path and round robin with no difference. Additionally, I can easily read/write multiple separate large files at the same time as the transform (simulating pseudo-random simultaneous read/write iops) and get 50MB/s.
My suspicion is the disk cache/memory thing but I can't find a fix for my server (2008 R2 sp1).
Further clues leading towards the memory issue -- the job is still running, veeamagent is using 25% cpu, but the .vrb file hasn't been modified for about an hour and a half. The .vbk file hasn't been modified for almost 4 hours. No other files have been touched for 6 hours.
-
- Chief Product Officer
- Posts: 31707
- Liked: 7212 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: transform incremental to rollback possible bottlenecks
Hi Ted, yesterday I asked QC to compare v5 and v6 transform rates on the same data set and just heard back from them: v6 showed the same (even slightly faster) transform performance. Thanks.
-
- Veteran
- Posts: 264
- Liked: 30 times
- Joined: Mar 22, 2011 7:43 pm
- Full Name: Ted
- Contact:
Re: transform incremental to rollback possible bottlenecks
47 hours and still running. If it's the memory issue I described, could I pursue this with tech support? I don't have any type of Microsoft tech support . . .
-
- Veteran
- Posts: 264
- Liked: 30 times
- Joined: Mar 22, 2011 7:43 pm
- Full Name: Ted
- Contact:
Re: transform incremental to rollback possible bottlenecks
Also -- any other ideas out there? I've been around computers and storage enough to feel confident of 2 things --
1) I fully understand that storage is a huge probability and I won't convince anyone differently without some type of proof.
2) The read/writes are so ridiculously slow that I have to believe something else is at the very least exacerbating the problem.
I'll see if I can throw more ram at the server once the transform finishes. Also, I think I'll run a transform daily instead of weekly.
-
- Chief Product Officer
- Posts: 31707
- Liked: 7212 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: transform incremental to rollback possible bottlenecks
You should open a support case, at least to have them check the logs and see if everything works OK on the Veeam side. They are unlikely to be able to help you if the issue is on the Microsoft side. But I doubt there are issues on the Microsoft side here (at least, we are not seeing them).
My only idea: was the job in question upgraded from v5, or created in v6 from scratch?
-
- Novice
- Posts: 3
- Liked: never
- Joined: Dec 22, 2011 8:10 pm
- Full Name: Dave Bartel
- Contact:
Re: transform incremental to rollback possible bottlenecks
Sorry I missed the performance information in the original post, feeling like a bit of an idiot now.
Another long-shot, but do you possibly have enough local storage space on the ESX host running your Veeam VM that you could mount and create a backup repository there to potentially rule out storage as the issue?
I've seen some unusual performance issues lately as well with jobs freezing with no apparent changes for hours lately (Both with V5 and V6), but they've been tough to troubleshoot as we are having some hardware issues as well. If I find anything useful, I'll pass it on.
-
- Veteran
- Posts: 264
- Liked: 30 times
- Joined: Mar 22, 2011 7:43 pm
- Full Name: Ted
- Contact:
Re: transform incremental to rollback possible bottlenecks
DaveBartel wrote: Sorry I missed the performance information in the original post, feeling like a bit of an idiot now.
Another long-shot, but do you possibly have enough local storage space on the ESX host running your Veeam VM that you could mount and create a backup repository there to potentially rule out storage as the issue?
I've seen some unusual performance issues lately as well with jobs freezing with no apparent changes for hours lately (Both with V5 and V6), but they've been tough to troubleshoot as we are having some hardware issues as well. If I find anything useful, I'll pass it on.
Ha ha ha ha! I wish I had enough storage to mess around with.
Is there a built-in 48 hour limit on transforms? I've had the transform fail twice now at exactly 48:01:08.
The .vbk file is ~990GB. One particular .vib file is ~445GB. There are 7 other .vib files from 1GB to 22GB. The .vrb file gets up to around 150GB before the transform job fails. The disk usage seems to degrade over time, but I haven't really kept track closely.
So right now -- the transform has been running for 4 hours 39 minutes. The .vbk file that's being processed is at 56GB. The disk usage is about 3.8MB/s. I'll try to keep track over time.
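For keeping track over time, a minimal sketch along these lines could log the size and idle time of the backup files at a fixed interval instead of checking by hand. The function names and the default 10-minute interval are hypothetical choices, and any file paths passed in are placeholders.

```python
import os
import time
from typing import Optional, Tuple


def file_status(path: str, now: Optional[float] = None) -> Tuple[float, float]:
    """Return (size in GB, minutes since the file was last modified)."""
    st = os.stat(path)
    if now is None:
        now = time.time()
    return st.st_size / 1e9, (now - st.st_mtime) / 60.0


def log_status(paths, interval_s: float = 600, samples: int = 6) -> None:
    """Print one line per file every `interval_s` seconds, `samples` times."""
    for i in range(samples):
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        for path in paths:
            size_gb, idle_min = file_status(path)
            print(f"{stamp}  {path}  {size_gb:8.1f} GB  idle {idle_min:6.1f} min")
        if i < samples - 1:
            time.sleep(interval_s)
```

Calling `log_status` with the paths of the .vbk and .vrb files would show both whether the files are still growing and how long each has gone untouched.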
-
- Veteran
- Posts: 264
- Liked: 30 times
- Joined: Mar 22, 2011 7:43 pm
- Full Name: Ted
- Contact:
Re: transform incremental to rollback possible bottlenecks
The transform has now been running for 7 hours and 55 minutes. The .vrb (I meant .vrb in the last post) is up from 56GB to 72GB. The disk usage is down to about 2.8MB/s.
-
- Chief Product Officer
- Posts: 31707
- Liked: 7212 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: transform incremental to rollback possible bottlenecks
This is very slow - sounds like a storage issue indeed.
-
- Veteran
- Posts: 264
- Liked: 30 times
- Joined: Mar 22, 2011 7:43 pm
- Full Name: Ted
- Contact:
Re: transform incremental to rollback possible bottlenecks
17 hours, 27 minutes. Up to 110GB .vrb file. The disk usage continues to get worse at 1.8MB/s.
-
- Veteran
- Posts: 264
- Liked: 30 times
- Joined: Mar 22, 2011 7:43 pm
- Full Name: Ted
- Contact:
Re: transform incremental to rollback possible bottlenecks
28 hours 32 minutes. 136GB .vrb file. No apparent activity at all. The .vrb file's last modified time is 40 minutes ago. The .vbk file hasn't changed in over 2 hours. Pretty much 0 usage.
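The four progress reports above can be turned into an average .vrb growth rate per interval. This is only a sketch over the figures posted in this thread. It uses decimal units (1 GB = 1000 MB) and measures how fast the .vrb file grew, not the total disk traffic, so it is not directly comparable to the disk usage numbers.

```python
# (elapsed hours, elapsed minutes, .vrb size in GB) as reported in the posts above
samples = [(4, 39, 56), (7, 55, 72), (17, 27, 110), (28, 32, 136)]


def growth_rates(samples):
    """Return the average .vrb growth in MB/s between consecutive samples."""
    rates = []
    for (h0, m0, gb0), (h1, m1, gb1) in zip(samples, samples[1:]):
        seconds = ((h1 * 60 + m1) - (h0 * 60 + m0)) * 60
        rates.append((gb1 - gb0) * 1000 / seconds)  # decimal GB -> MB
    return rates


for rate in growth_rates(samples):
    print(f"{rate:.2f} MB/s")
```

The rate falls in every interval, from roughly 1.4 MB/s early on to roughly 0.65 MB/s in the last one, which matches the degradation described.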
-
- Novice
- Posts: 3
- Liked: never
- Joined: Dec 22, 2011 8:10 pm
- Full Name: Dave Bartel
- Contact:
Re: transform incremental to rollback possible bottlenecks
Depending on your SAN, this may not be relevant, but I had similar performance issues with a low-end SAN we were using for backup storage with Veeam. For some inexplicable reason, the solution was for me to change the NIC-teaming mode on the SAN from round-robin to failover. In theory this should reduce overall throughput, but it returned my backup performance to what it was months ago. Best of luck to you in your continued struggle with this.
-
- Veteran
- Posts: 264
- Liked: 30 times
- Joined: Mar 22, 2011 7:43 pm
- Full Name: Ted
- Contact:
Re: transform incremental to rollback possible bottlenecks
Once you fill up all the various locations where data can cache (the OS, the network, the host, the SAN cache, or the physical drives), you'll be at your "worst" storage performance (in general -- I'm simplifying). After over 100GB of data, I believe that all the separate caches are full, and have been for a while. At this point, there's little left to explain the symptoms I've described from an exclusively storage viewpoint. This is my opinion. I don't have a good idea of how to prove it.
I'm getting very close to deleting this entire backup (for at least the 3rd time since upgrading to v6). I know tech support is busy and it'll be difficult to get past "this is a storage problem". I can't wait 2 days for it to fail and then try again. And again. And again.
My leading ideas:
1) This is the issue with Microsoft dynamic cache as I linked earlier -- http://support.microsoft.com/default.as ... -US;979149 Microsoft does not appear to have a workaround for 2008 R2 (the hotfix isn't applicable because it's included in SP1 but plenty of people still have the issue after SP1) and simply claims that the vendor should "configure the application to flush data more frequently".
2) Some bizarre corruption in the .vib or .vbk file is causing Veeam to slow down and eventually get caught in a "wait-forever" loop or something until a higher-level 48 hour timeout finally kills the job.
3) There is some scaling issue with my server and Veeam v6. Everything is fine with <100GB .vib files. The 445GB .vib file seems to die a slow death.
-
- Chief Product Officer
- Posts: 31707
- Liked: 7212 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: transform incremental to rollback possible bottlenecks
It would help tremendously if you could dedicate some other storage just for the sake of experiment (even a share on a desktop with a large 2TB hard drive). This would rule out (or prove) a possible issue with the specific storage device you are currently having issues with.
-
- Veteran
- Posts: 264
- Liked: 30 times
- Joined: Mar 22, 2011 7:43 pm
- Full Name: Ted
- Contact:
Re: transform incremental to rollback possible bottlenecks
Indeed -- a decent idea. Don't think I have a 2TB drive, and if I did I can't imagine how long it would take to copy 2TB from the SAN to a single local SATA.
-
- Chief Product Officer
- Posts: 31707
- Liked: 7212 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: transform incremental to rollback possible bottlenecks
Even "green" 2TB guys can do 100MB/s sequential writes. Sure, they will be very bad with random I/O (transform), but still they surely should be able to do better than a few MB/s (to complete halt), as you are seeing...
-
- Veteran
- Posts: 264
- Liked: 30 times
- Joined: Mar 22, 2011 7:43 pm
- Full Name: Ted
- Contact:
Re: transform incremental to rollback possible bottlenecks
I got a different veeamtransport.exe file from support. 116 hours and counting. The .vrb file is up to 219GB.
-
- Chief Product Officer
- Posts: 31707
- Liked: 7212 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: transform incremental to rollback possible bottlenecks
I am not quite sure what they gave you, or why. As I said in my response above, there are no known issues with transform performance in v6, so there are no patches or fixes of any kind for this functionality available at this time.
-
- Veteran
- Posts: 264
- Liked: 30 times
- Joined: Mar 22, 2011 7:43 pm
- Full Name: Ted
- Contact:
Re: transform incremental to rollback possible bottlenecks
Case 5163515
I presumed that there is a hard coded 48 hour limit on a transform and the new file has no limit. It's actually veeamtransportsvc.exe. There hasn't been any reference to possible performance issues. I think this is just so it can (eeevveeentuallllllyyyy) finish.
-
- Veteran
- Posts: 264
- Liked: 30 times
- Joined: Mar 22, 2011 7:43 pm
- Full Name: Ted
- Contact:
Re: transform incremental to rollback possible bottlenecks
No joy. There's a 7 day global timeout, which I hit, and the job failed. Allowing for triple data I/O (read from .vib, write to .vbk, write to .vrb), that still means I was running below 2 MB/s.
I'm going to set up iometer and see what it shows.
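The arithmetic behind that estimate can be sketched as follows. It assumes the 445GB .vib mentioned earlier is the data being processed, counts every byte three times as described, and uses decimal units. Because the job failed before finishing, the result is an upper bound rather than a measurement. `avg_mb_per_s` is a hypothetical helper name.

```python
def avg_mb_per_s(data_gb: float, io_factor: int, hours: float) -> float:
    """Average throughput in MB/s if data_gb were moved io_factor times in `hours`."""
    return data_gb * 1000 * io_factor / (hours * 3600)


# 445 GB .vib, triple I/O, 7-day timeout (assumes the whole .vib was processed).
print(f"{avg_mb_per_s(445, 3, 7 * 24):.2f} MB/s upper bound")
```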
-
- Veteran
- Posts: 264
- Liked: 30 times
- Joined: Mar 22, 2011 7:43 pm
- Full Name: Ted
- Contact:
Re: transform incremental to rollback possible bottlenecks
Grr. iometer doesn't seem to like my backup disk (dynamic ntfs spanned volume).
I'm considering wiping my entire 4 TB of backups so that I can recreate the datastore as native vmfs5.
-
- Lurker
- Posts: 2
- Liked: never
- Joined: Oct 27, 2011 3:03 pm
- Contact:
Re: transform incremental to rollback possible bottlenecks
I upgraded from v5 to v6 patch 2 last week, with no hardware changes, and my five transform jobs (on a Data Domain appliance) are now taking at least 2x longer to complete. The backup server runs Windows 2003, and CPU/memory usage on it is very low (25%). I really think there is something in v6 slowing down transform jobs.
-
- VP, Product Management
- Posts: 27325
- Liked: 2778 times
- Joined: Mar 30, 2009 9:13 am
- Full Name: Vitaliy Safarov
- Contact:
Re: transform incremental to rollback possible bottlenecks
How many transform operations did you already have? Was this the first time you tried to do that after upgrade?
You may want to open a support case, at least to have our engineers check the logs and see if everything works fine on our side. Maybe there is something else that affects transform performance for the backup job.
-
- Veteran
- Posts: 391
- Liked: 32 times
- Joined: Jul 18, 2011 9:30 am
- Full Name: Hussain Al Sayed
- Location: Bahrain
- Contact:
Re: transform incremental to rollback possible bottlenecks
Hi,
I'm having the same issue, but with no job failure; it just takes longer to finish the rollback synthetic backup. Backup speed is around 700/MBs and 1001/MBs. The VMDK files are around 200GB and less. The VBK file is around 55GB.
Is this normal, or is there a hotfix available for it?
Thanks,
-
- Chief Product Officer
- Posts: 31707
- Liked: 7212 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: transform incremental to rollback possible bottlenecks
Transform is a very I/O intensive operation, so it is normal for it to take a long time if the backup target cannot dish out good IOPS.