At present, the user guide contains a brief explanation of the real-time statistics window.
However, occasionally I'll speak to a customer who wants as detailed an explanation as possible of these statistics, so I wanted to outline what I know, and to see if Product Management could clarify the finer points and update anything that may have changed for v8.
On the far right, the "Status" box summarizes individual task results only; it does not count the total quantity of error messages or any other factor. So for a synthetic full failure, you could see several successes and 0 errors, because the synthetic full is part of the job operation, not a separate task. Note that an error indicates a failed task.
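To make the per-task counting concrete, here is a minimal sketch (the function name and input format are mine, not Veeam's): the tally has one entry per task, so a job-level operation like a failed synthetic full contributes nothing to it.

```python
from collections import Counter

def status_summary(task_results):
    """Tally per-task outcomes the way the Status box appears to:
    one entry per task (e.g. per VM), never per error message."""
    return Counter(task_results)

# Hypothetical job: three VM tasks succeeded; the synthetic full that
# failed afterwards is a job-level operation, not a task, so the
# summary still shows 3 successes and 0 errors.
tasks = ["Success", "Success", "Success"]
print(status_summary(tasks))
```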
In the middle top, the "Data" box lists the following:

Processed:
This number is supposed to be based on the total size of all VM disks added to the job. I believe there have been bugs affecting this number in the past, but I'm not sure of the current status. While the job is running, this number increases as data blocks are read or skipped, so if you're watching the job in real time you will sometimes see this number jump suddenly due to CBT.

Read:
This number indicates how much data passed from your datastores to the source proxies - the same data path described by "source" in the bottleneck info. This should always be equal to or lower than "processed", but can be significantly lower due to CBT.

Transferred:
This number indicates how much data passed from the source proxies to the target-side data mover. This does not directly indicate how large the backup file will be, because of additional deduplication, overhead, and in some cases decompression by the data mover prior to writing the file to disk. Multiply this number by the number in parentheses and you should obtain the amount "read", above. This is the same data path described by "network" in the bottleneck info.
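The relationships among the three counters can be sketched in a few lines. This is purely illustrative (the function name, units, and sample numbers are mine); it just restates the two rules above: read ≤ processed because CBT skips unchanged blocks, and transferred × the parenthesized ratio ≈ read.

```python
def data_box(processed_gb, read_gb, transferred_gb):
    """Illustrative relationships between the Data box counters.
    Read <= Processed because CBT lets the proxy skip unchanged
    blocks; Read / Transferred is the ratio shown in parentheses."""
    cbt_savings = processed_gb - read_gb      # blocks skipped via CBT
    ratio = read_gb / transferred_gb          # the number in parentheses
    return cbt_savings, ratio

# Hypothetical incremental run: 500 GB of disks in the job,
# 40 GB of changed blocks read, 20 GB sent to the data mover.
savings, ratio = data_box(500, 40, 20)
print(f"CBT skipped {savings} GB; ratio {ratio:.1f}x")
# -> CBT skipped 460 GB; ratio 2.0x
```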
On the left, the Summary box lists:

Duration:
Time from start of job to end of job.

Processing rate:
From 6.5 to 7.0 this number changed quite a bit. It used to be simply "processed" divided by "duration". In 7.0, and by all appearances 8.0 as well, it's derived from the individual disk read rates. If you're backing up only one disk, it will precisely match the read rate listed for that disk in the Action column. When you're backing up several disks in one job, with some durations overlapping due to parallel processing but others separated due to concurrent task limits, it can be difficult to see how the numbers relate. I assume the principle is the same; I just don't know the actual algorithm.

Bottleneck:
Highest percentage, as described in the bottleneck information.

Throughput graph:
The green read rate differs from the processing rate (in the "Summary" box) only in that the graph shows the rate at any given moment, whereas the processing rate can be thought of as more of an average rate over the duration of the job. So, you might see the green graph swing up and down dramatically as the job is running, while the processing rate is slow to change. Also, note that the "speed" number just describes where the black line currently sits; it's not nearly as useful a number as the processing rate.
The dark red area is analogous to the "transferred" number in the "Data" box.
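The contrast between the instantaneous green line and the slower-moving processing rate is essentially the difference between a raw sample and a running average. A minimal sketch (my own helper, not anything from the product):

```python
def running_average(samples):
    """Running average of per-second read-rate samples (MB/s).
    The green line is the instantaneous sample; the processing rate
    behaves more like the running average, so it changes slowly."""
    averages, total = [], 0.0
    for count, sample in enumerate(samples, start=1):
        total += sample
        averages.append(total / count)
    return averages

# Instantaneous rate swings widely; the average barely moves.
samples = [200, 50, 400, 100]
print(running_average(samples))   # smooths toward ~187 MB/s
```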
As to the read rate for each disk in the action column, I would have thought it's just the amount read divided by the duration listed for that disk, but in practice it tends to be a bit higher. Presumably the duration of the "Hard disk" action includes operations that don't count toward the read time, either before or after the actual disk processing.
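The naive per-disk calculation described above looks like this (a sketch under my reading of the UI, with made-up numbers); the real figure tends to come out a bit higher than this floor, presumably because the listed duration includes non-read overhead.

```python
def disk_read_rate(read_gb, duration_s):
    """Naive per-disk rate: amount read divided by the listed duration,
    converted to MB/s. Treat this as a lower bound; the UI's figure
    tends to be slightly higher, likely because the 'Hard disk' action
    duration includes open/close overhead that isn't read time."""
    return read_gb * 1024 / duration_s

# Hypothetical disk: 40 GB read over a listed 600 s duration.
print(f"{disk_read_rate(40, 600):.0f} MB/s")  # ~68 MB/s
```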
That brings us to the point where all this gets complicated: figuring out which individual operations within the backup process do and do not affect each number. For example, while snapshot removal could have a big impact on processing rate in 6.5, in recent versions it can only impact the duration. On the other hand, if a hypothetical problem causes Veeam to spend five minutes opening a disk that takes only two minutes to read, you might see a very low processing rate even though your real concern has nothing to do with throughput.
The numbers can also become confusing when CBT fails. Suddenly your data read matches your data processed, and the job runs twice as long even though your processing rate looks fantastic. The amount transferred should remain the same.
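The CBT-failure scenario can be sketched numerically (function and numbers are mine, for illustration): when CBT fails, every block must be read, so read equals processed and the duration balloons, even though the read rate itself, and the amount transferred, stay the same.

```python
def job_estimate(processed_gb, changed_gb, rate_mbs, cbt_ok):
    """Sketch of the counters with and without CBT: if CBT fails,
    Read == Processed and the duration grows proportionally, while
    the read rate (and the transferred amount) is unchanged."""
    read_gb = changed_gb if cbt_ok else processed_gb
    duration_s = read_gb * 1024 / rate_mbs
    return read_gb, duration_s

print(job_estimate(500, 40, 200, cbt_ok=True))   # (40, 204.8)
print(job_estimate(500, 40, 200, cbt_ok=False))  # (500, 2560.0)
```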
Anyway, if performance is a concern and the problem isn't obvious, don't panic, just collect logs and open a support case.