At present, the user guide contains a brief explanation of the real-time statistics window.
However, occasionally I'll speak to a customer who wants as detailed an explanation as possible of these statistics, so I wanted to outline what I know, and to see if Product Management could clarify the finer points and update anything that may have changed for v8.
On the far right, the "Status" box summarizes individual task results only; it does not count the total quantity of error messages or any other factor. So for a synthetic full failure, you could see several successes and 0 errors, because the synthetic full is part of the job operation, not a separate task. Note that an error indicates a failed task.
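To make the per-task counting concrete, here is a minimal sketch (the function name and input format are mine, not Veeam's): the tally has one entry per task, so a job-level operation like a failed synthetic full contributes nothing to it.

```python
from collections import Counter

def status_summary(task_results):
    """Tally per-task outcomes the way the Status box appears to:
    one entry per task (e.g. per VM), never per error message."""
    return Counter(task_results)

# Hypothetical job: three VM tasks succeeded; the synthetic full that
# failed afterwards is a job-level operation, not a task, so the
# summary still shows 3 successes and 0 errors.
tasks = ["Success", "Success", "Success"]
print(status_summary(tasks))
```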
In the middle top, the "Data" box lists the following:

Processed:
This number is supposed to be based on the total size of all VM disks added to the job. I believe there have been bugs affecting this number in the past, but I'm not sure of the current status. While the job is running, this number increases as data blocks are read or skipped, so if you're watching the job in real time you will sometimes see this number jump suddenly due to CBT.

Read:
This number indicates how much data passed from your datastores to the source proxies - the same data path described by "source" in the bottleneck info. This should always be equal to or lower than "processed", but can be significantly lower due to CBT.

Transferred:
This number indicates how much data passed from the source proxies to the target-side data mover. This does not directly indicate how large the backup file will be, because of additional deduplication, overhead, and in some cases decompression by the data mover prior to writing the file to disk. Multiply this number by the number in parentheses and you should obtain the amount "read", above. This is the same data path described by "network" in the bottleneck info.
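The relationships among the three counters can be sketched in a few lines. This is purely illustrative (the function name, units, and sample numbers are mine); it just restates the two rules above: read ≤ processed because CBT skips unchanged blocks, and transferred × the parenthesized ratio ≈ read.

```python
def data_box(processed_gb, read_gb, transferred_gb):
    """Illustrative relationships between the Data box counters.
    Read <= Processed because CBT lets the proxy skip unchanged
    blocks; Read / Transferred is the ratio shown in parentheses."""
    cbt_savings = processed_gb - read_gb      # blocks skipped via CBT
    ratio = read_gb / transferred_gb          # the number in parentheses
    return cbt_savings, ratio

# Hypothetical incremental run: 500 GB of disks in the job,
# 40 GB of changed blocks read, 20 GB sent to the data mover.
savings, ratio = data_box(500, 40, 20)
print(f"CBT skipped {savings} GB; ratio {ratio:.1f}x")
# -> CBT skipped 460 GB; ratio 2.0x
```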
On the left, the Summary box lists:

Duration:
Time from start of job to end of job.

Processing rate:
From 6.5 to 7.0 this number changed quite a bit. It used to be simply "processed" divided by "duration". In 7.0, and by all appearances 8.0 as well, it's derived from the individual disk read rates. If you're backing up only one disk, it will precisely match the read rate listed for that disk in the Action column. When you're backing up several disks in one job, with some durations overlapping due to parallel processing but others separated due to concurrent task limits, it can be difficult to see how the numbers relate. I assume the principle is the same; I just don't know the actual algorithm.

Bottleneck:
Highest percentage, as described in the bottleneck information.

Throughput graph:
The green read rate differs from the processing rate (in the "Summary" box) only in that the graph shows the rate at any given moment, whereas the processing rate can be thought of as more of an average rate over the duration of the job. So, you might see the green graph swing up and down dramatically as the job is running, while the processing rate is slow to change. Also, note that the "speed" number just describes where the black line currently sits; it's not nearly as useful a number as the processing rate.
The dark red area is analogous to the "transferred" number in the "Data" box.
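The contrast between the instantaneous green line and the slower-moving processing rate is essentially the difference between a raw sample and a running average. A minimal sketch (my own helper, not anything from the product):

```python
def running_average(samples):
    """Running average of per-second read-rate samples (MB/s).
    The green line is the instantaneous sample; the processing rate
    behaves more like the running average, so it changes slowly."""
    averages, total = [], 0.0
    for count, sample in enumerate(samples, start=1):
        total += sample
        averages.append(total / count)
    return averages

# Instantaneous rate swings widely; the average barely moves.
samples = [200, 50, 400, 100]
print(running_average(samples))   # smooths toward ~187 MB/s
```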
As to the read rate for each disk in the action column, I would have thought it's just the amount read divided by the duration listed for that disk, but in practice it tends to be a bit higher. Presumably the duration of the "Hard disk" action includes operations that don't count toward the read time, either before or after the actual disk processing.
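The naive per-disk calculation described above looks like this (a sketch under my reading of the UI, with made-up numbers); the real figure tends to come out a bit higher than this floor, presumably because the listed duration includes non-read overhead.

```python
def disk_read_rate(read_gb, duration_s):
    """Naive per-disk rate: amount read divided by the listed duration,
    converted to MB/s. Treat this as a lower bound; the UI's figure
    tends to be slightly higher, likely because the 'Hard disk' action
    duration includes open/close overhead that isn't read time."""
    return read_gb * 1024 / duration_s

# Hypothetical disk: 40 GB read over a listed 600 s duration.
print(f"{disk_read_rate(40, 600):.0f} MB/s")  # ~68 MB/s
```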
That brings us to the point where all this gets complicated: figuring out which individual operations within the backup process do and do not affect each number. For example, while snapshot removal could have a big impact on processing rate in 6.5, in recent versions it can only impact the duration. On the other hand, if a hypothetical problem causes Veeam to spend five minutes opening a disk that takes only two minutes to read, you might see a very low processing rate even though your real concern has nothing to do with throughput.
The numbers can also become confusing when CBT fails. Suddenly your data read matches your data processed, and the job runs twice as long even though your processing rate looks fantastic. The amount transferred should remain the same.
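The CBT-failure scenario can be sketched numerically (function and numbers are mine, for illustration): when CBT fails, every block must be read, so read equals processed and the duration balloons, even though the read rate itself, and the amount transferred, stay the same.

```python
def job_estimate(processed_gb, changed_gb, rate_mbs, cbt_ok):
    """Sketch of the counters with and without CBT: if CBT fails,
    Read == Processed and the duration grows proportionally, while
    the read rate (and the transferred amount) is unchanged."""
    read_gb = changed_gb if cbt_ok else processed_gb
    duration_s = read_gb * 1024 / rate_mbs
    return read_gb, duration_s

print(job_estimate(500, 40, 200, cbt_ok=True))   # (40, 204.8)
print(job_estimate(500, 40, 200, cbt_ok=False))  # (500, 2560.0)
```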
Anyway, if performance is a concern and the problem isn't obvious, don't panic, just collect logs and open a support case.