Ciao Pavel,
I have a lot of new information, gathered after days of log analysis.
There is a big difference in backup speed between the VMs.
The slowest VM to back up is server04, which is the Veeam B&R server itself and the only backup proxy, left with the default setup: transport mode = "Automatic selection" and "Max concurrent tasks" = 2.
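By the way, this is how the proxy setup can be double-checked from PowerShell (a minimal sketch, assuming the Veeam B&R PowerShell snap-in is installed; "localhost" is just a placeholder for the B&R server, and the snap-in name can vary by version):
Code: Select all
# Dump the VMware backup proxy configuration without guessing property names
Add-PSSnapin VeeamPSSnapin            # Veeam B&R PowerShell snap-in
Connect-VBRServer -Server localhost   # run on the B&R server itself, or point to it by name

# Lists every VMware backup proxy with all its settings
# (transport mode, concurrent task limit, etc.)
Get-VBRViProxy | Format-List *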
The backup job named "Backup Dati" consists of 3 VM task sessions; this is their timing:
Code: Select all
18/01/2016 tasks period
overalljob 21:00:12------------------------------------------------21:41:13
server01 ---21:02:12---------21:15:02
server04 --------------------------------21:16:07------------21:40:33
server06 ---------------21:09:41---21:15:52
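If you want to pull the same per-VM timings programmatically, something along these lines should work (a sketch only: GetTaskSessions() and the property layout come from the v9-era snap-in and are assumptions that may need adjusting for your version):
Code: Select all
# Same snap-in / connection as in the sketch above
$job  = Get-VBRJob -Name "Backup Dati"
$last = Get-VBRBackupSession |
        Where-Object { $_.JobId -eq $job.Id } |
        Sort-Object CreationTime -Descending |
        Select-Object -First 1

# One task session per VM, with its own status and timing information
$last.GetTaskSessions() | Format-List *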
The task sessions are not really concurrent but seem sequential; moreover, server04 has an unexpected read size and duration. These are the job's historical statistics with my notes/doubts:
Code: Select all
Backupjob "Backup Dati"
18/01/2016 21:00:14 :: Job started at 18/01/2016 21:00:12
18/01/2016 21:00:15 :: Building VM list
18/01/2016 21:00:55 :: VM size: 1,3 TB
18/01/2016 21:00:56 :: Changed block tracking is enabled
18/01/2016 21:02:07 :: Preparing next VM for processing-------------------------this is for server01
18/01/2016 21:02:12 :: Preparing next VM for processing-------------------------this is for server06
18/01/2016 21:02:12 :: Processing server01
18/01/2016 21:09:41 :: Processing server06--------------------------------------[color=#4040FF]Why wait from 21:02:12 until 21:09:41 (0:07:29) before starting processing?[/color]
18/01/2016 21:09:41 :: Waiting for the next task
18/01/2016 21:15:53 :: Preparing next VM for processing-------------------------this is for server04
18/01/2016 21:16:06 :: Processing server04--------------------------------------This takes a long time, 0:24:26; see the related session statistics below.
18/01/2016 21:40:33 :: All VMs have been queued for processing
18/01/2016 21:41:09 :: Removing '\\192.168.23.239\Backup_VM\Backup Dati\Backup Dati2016-01-11T210057.vrb' per retention policy
18/01/2016 21:41:13 :: Load: Source 53% > Proxy 24% > Network 57% > Target 29%
18/01/2016 21:41:13 :: Primary bottleneck: Network
18/01/2016 21:41:13 :: Job finished at 18/01/2016 21:41:13
Note that server04 was delayed from 21:02:07 until 21:15:53 (0:13:46): this is expected behavior, because it is the Veeam B&R server itself and its backup is done in sequential order: it is "placed at the bottom of the queue and wait for the resources assignment until all other job tasks are completed." ref
http://forums.veeam.com/vmware-vsphere- ... 25692.html
Code: Select all
server04 task session
18/01/2016 21:02:07 :: Waiting for completion of other tasks (VM is backup server - not a parallel task)
18/01/2016 21:15:53 :: Queued for processing at 18/01/2016 21:15:53
18/01/2016 21:15:54 :: Required backup infrastructure resources have been assigned
18/01/2016 21:16:07 :: VM processing started at 18/01/2016 21:16:07
18/01/2016 21:16:07 :: VM size: 180,0 GB
18/01/2016 21:16:09 :: Getting VM info from vSphere
18/01/2016 21:16:50 :: Inventorying guest system
18/01/2016 21:17:04 :: Excluding own configuration database from VSS snapshot
18/01/2016 21:17:04 :: Preparing guest for hot backup
18/01/2016 21:17:18 :: Creating snapshot
18/01/2016 21:17:21 :: Releasing guest
18/01/2016 21:17:31 :: Indexing guest file system
18/01/2016 21:17:38 :: Saving [vmware00_Datastore_03] server04/server04.vmx
18/01/2016 21:17:38 :: Saving [vmware00_Datastore_03] server04/server04.vmxf
18/01/2016 21:17:39 :: Saving [vmware00_Datastore_03] server04/server04.nvram
18/01/2016 21:17:40 :: Using backup proxy VMware Backup Proxy for disk Hard disk 1 [nbd]----------The backup proxy cannot back itself up in hot-add mode, so nbd = network mode is used
18/01/2016 21:17:40 :: Using backup proxy VMware Backup Proxy for disk Hard disk 2 [nbd]----------The backup proxy cannot back itself up in hot-add mode, so nbd = network mode is used
18/01/2016 21:17:42 :: Hard disk 1 (80,0 GB) 54,9 GB read at 44 MB/s [CBT]
18/01/2016 21:17:42 :: Hard disk 2 (100,0 GB) 11,4 GB read at 27 MB/s [CBT]
18/01/2016 21:18:08 :: Getting list of guest file system local users
18/01/2016 21:39:42 :: Saving GuestIndexData.zip
18/01/2016 21:39:43 :: Saving GuestMembers.xml
18/01/2016 21:39:45 :: Removing VM snapshot
18/01/2016 21:40:21 :: Publishing guest file system index to catalog
18/01/2016 21:40:22 :: Finalizing
18/01/2016 21:40:23 :: Truncating transaction logs
18/01/2016 21:40:27 :: Truncating SQL server transaction logs
18/01/2016 21:40:31 :: Swap file blocks skipped: 4,0 GB
18/01/2016 21:40:33 :: Busy: Source 96% > Proxy 36% > Network 5% > Target 2%
18/01/2016 21:40:33 :: Primary bottleneck: Source
18/01/2016 21:40:33 :: Network traffic verification detected no corrupted blocks
18/01/2016 21:40:33 :: Processing finished at 18/01/2016 21:40:33
Now, looking at the two hard disk statistic rows above, there are several considerations:
1) I'm sure the "read at ... MB/s" value is NOT a read-only figure; it is the end-to-end average speed (read from source, transport over the network to the repository, write to disk), i.e. the processing rate we talked about above. In fact 54,9 GB at 44 MB/s is about 21 minutes, which roughly matches the 24-minute task window.
2) All the used space on each disk (54,9 GB and 11,4 GB) is read. Why? We expected only changed data to be read.
3) It says [CBT]?! BURP! I expected CBT to be disabled automatically on any VM that acts as a backup proxy; moreover, the VM server04 has ExtensionData.Config.ChangeTrackingEnabled = $false (a quick PowerCLI check is sketched below). Do you agree with me?
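For reference, this is the kind of check I mean (a minimal PowerCLI sketch; "vcenter01" is a placeholder for our vCenter):
Code: Select all
Connect-VIServer -Server vcenter01    # placeholder vCenter name

$vm = Get-VM -Name server04

# VM-level CBT flag, the same property quoted above
$vm.ExtensionData.Config.ChangeTrackingEnabled

# Per-disk ctkEnabled keys, if present in the VM's advanced configuration
$vm.ExtensionData.Config.ExtraConfig |
    Where-Object { $_.Key -like '*ctkEnabled*' } |
    Select-Object Key, Value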
About the other backup job, the one that always takes 4-5 hours: it backs up 2 VMs, both running IBM Domino (a non-VSS-aware application), so a pre-freeze/post-thaw script pair is used to stop/start the service.
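For context, the pre-freeze/post-thaw pair does nothing more than stop and start the Domino service; a minimal sketch of what such a pair can look like (ours may differ, and the 'Lotus Domino*' display name is a placeholder, the real service name has to be checked in the guest with Get-Service):
Code: Select all
# pre-freeze.ps1 - runs inside the guest just before the snapshot is taken
# 'Lotus Domino*' is a placeholder; verify the real display name with Get-Service
Get-Service -DisplayName 'Lotus Domino*' | Stop-Service -Force

# post-thaw.ps1 - runs inside the guest right after the snapshot has been taken
Get-Service -DisplayName 'Lotus Domino*' | Start-Service
These are the statistics of the bigger of the two VMs: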
Code: Select all
Domino task session
19/01/2016 03:31:40 :: Queued for processing at 19/01/2016 03:31:40
19/01/2016 03:31:40 :: Required backup infrastructure resources have been assigned
19/01/2016 03:31:45 :: VM processing started at 19/01/2016 03:31:45
19/01/2016 03:31:45 :: VM size: 680,0 GB
19/01/2016 03:31:48 :: Getting VM info from vSphere
19/01/2016 03:31:56 :: Inventorying guest system
19/01/2016 03:32:04 :: Running pre-freeze script in the guest OS
19/01/2016 03:33:06 :: Preparing guest for hot backup
19/01/2016 03:33:14 :: Creating snapshot
19/01/2016 03:33:18 :: Releasing guest
19/01/2016 03:33:27 :: Running post-thaw script in the guest OS
19/01/2016 03:34:29 :: Indexing guest file system
19/01/2016 03:34:36 :: Saving [vmware00_Datastore_03] domino/domino.vmx
19/01/2016 03:34:36 :: Saving [vmware00_Datastore_03] domino/domino.vmxf
19/01/2016 03:34:37 :: Saving [vmware00_Datastore_03] domino/domino.nvram
19/01/2016 03:34:38 :: Using backup proxy VMware Backup Proxy for disk Hard disk 1 [hotadd]
19/01/2016 03:34:38 :: Using backup proxy VMware Backup Proxy for disk Hard disk 2 [hotadd]
19/01/2016 03:34:40 :: Changed block tracking has been disabled for backup proxy VM.------------[color=#4040FF]??????? Weird, see my note below.[/color]
19/01/2016 03:35:35 :: Hard disk 1 (80,0 GB) 3,5 GB read at 5 MB/s [CBT]
19/01/2016 03:35:36 :: Hard disk 2 (200,0 GB) 102,9 GB read at 7 MB/s [CBT]
19/01/2016 03:35:40 :: Getting list of guest file system local users
19/01/2016 03:50:52 :: Using backup proxy VMware Backup Proxy for disk Hard disk 3 [hotadd]
19/01/2016 03:53:06 :: Hard disk 3 (400,0 GB) 260,5 MB read at 9 MB/s [CBT]
19/01/2016 07:32:25 :: Saving GuestIndexData.zip
19/01/2016 07:32:25 :: Saving GuestMembers.xml
19/01/2016 07:32:28 :: Removing VM snapshot
19/01/2016 07:34:48 :: Publishing guest file system index to catalog
19/01/2016 07:34:50 :: Finalizing
19/01/2016 07:34:52 :: Truncating transaction logs
19/01/2016 07:35:00 :: Swap file blocks skipped: 2,8 GB
19/01/2016 07:35:02 :: Busy: Source 7% > Proxy 13% > Network 21% > Target 78%
19/01/2016 07:35:02 :: Primary bottleneck: Target
19/01/2016 07:35:02 :: Network traffic verification detected no corrupted blocks
19/01/2016 07:35:02 :: Processing finished at 19/01/2016 07:35:02
The only info I found is "You will see warning messages during your VM backup if the changed block tracking is enabled and the VM has an active snapshot"... Yes, true, there is "an active snapshot", the one Veeam B&R created at 03:33:14. What do you think about it?
The long duration of 4-5 hours is due to the reverse incremental mode we use... not so smart for a disk with size = 200 GB, used space = 119,0 GB and daily changes = 102,9 GB.
That's a lot of random I/O on the NAS (with reverse incremental, every changed block costs roughly three I/O operations on the repository: the new block is injected into the full backup file, the block it replaces is read back out and written into the .vrb), and likely why the job reports the target as the primary bottleneck. I changed the job to forever forward incremental and now, starting from the 2nd backup, it takes 25 minutes: "Hard disk 2 (200,0 GB) 110,2 GB read at 85 MB/s [CBT]", but the warning "Changed block tracking has been disabled for backup proxy VM." still remains.
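As a sanity check, the per-disk rates alone account for the two durations (a quick PowerShell calculation with the numbers taken from the sessions above):
Code: Select all
# Hard disk 2, reverse incremental run: 102,9 GB read at 7 MB/s
(102.9GB / 7MB) / 3600     # ~4.2 hours

# Hard disk 2, forever forward incremental run: 110,2 GB read at 85 MB/s
(110.2GB / 85MB) / 60      # ~22 minutes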
A great improvement, but we are anyway investigating why the daily change rate on that disk is so high.
Thanks for your help, and for your patience in reading all this.