-
- Novice
- Posts: 6
- Liked: never
- Joined: Oct 05, 2012 7:46 pm
- Full Name: Mervin Cinq-Mars
- Contact:
VMWare VM CPU Usage: Feature Request extended time options
We have a pure VMware environment, and we are increasingly relying on Veeam ONE to monitor it.
We use the VM CPU Usage alarm to accomplish two goals.
1) Find VMs that could use additional vCPUs. Planning does not always keep up with reality, and sometimes systems get used more than they were in the past. Our Operations team would like to find these systems and add more CPU resources to keep these machines running well.
2) Find machines that are using more CPU time than they legitimately should because of a problem, such as a runaway process, a hung machine, or an unintentionally heavy task.
Like most environments, we have a mix of systems:
a) some that idle the majority of the time and spike for a few minutes or so
b) some that idle the majority of the time and spike heavily for 30 minutes, or one, two, or three hours
c) some that regularly do significant CPU work, including short spikes for much of the workday
On type "a" systems, the default VM CPU usage alert works well: no false positives, and if we get an alert it means something has gone wrong, so we investigate and fix the issue. On type "b" or "c" systems, it does not work well: too many false positives.
Excluding type "b" or "c" machines is not an option; otherwise we do not meet goal #2.
So I created an additional alert, a copy of VM CPU usage, called VM CPU usage -HEAVY. It is set to alert on 98% usage sustained for the maximum available time of 60 minutes. Types "b" and "c" are assigned to this alert and excluded from the default alert.
On type "b" systems, the VM CPU usage -HEAVY alert works well for only some of them. I still get too many false positives, since some of them have a heavy CPU task that can last more than 60 minutes. On type "c" systems, the VM CPU usage -HEAVY alert works well: no false positives, and if we get an alert it means something has gone wrong, so we investigate and fix the issue.
I am left with machines that idle almost all the time but occasionally work hard for more than 60 minutes straight.
If Veeam had an option to alert after 90/120/180 minutes, etc., this group could be taken care of.
Thus my feature request.
PS I don't add more vCPUs to these machines because their tasks are batch processes, and it does not matter that they take an hour or two. Adding more vCPUs to them would soon get me into trouble with CPU scheduling, impacting the environment as a whole. We stick to the "use as few vCPUs as the job requires" rule as often as possible.
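To illustrate the request above, here is a minimal Python sketch of a sustained-threshold check. It is not how Veeam ONE evaluates alarms; the function names, the 5-minute sampling interval, and the sample data are all assumptions made for illustration. It shows why a 90-minute batch job trips a 60-minute rule but not a 120-minute one, while a three-hour runaway process trips both.

```python
from typing import Iterable


def longest_run_minutes(samples: Iterable[float],
                        threshold: float,
                        sample_minutes: int) -> int:
    """Length, in minutes, of the longest unbroken run of samples
    at or above the threshold."""
    longest = 0
    current = 0
    for usage in samples:
        # Extend the current run, or reset it when usage drops below threshold.
        current = current + 1 if usage >= threshold else 0
        longest = max(longest, current)
    return longest * sample_minutes


def should_alert(samples: Iterable[float],
                 threshold: float = 98.0,
                 duration_minutes: int = 60,
                 sample_minutes: int = 5) -> bool:
    """True if usage stayed at or above the threshold for at least
    duration_minutes without interruption."""
    return longest_run_minutes(samples, threshold, sample_minutes) >= duration_minutes


# Hypothetical data, one sample every 5 minutes (assumed interval).
batch_job = [5.0] * 12 + [100.0] * 18 + [5.0] * 12   # 90 minutes of heavy work
runaway = [5.0] * 6 + [100.0] * 36                   # 180 minutes and still going

if __name__ == "__main__":
    for minutes in (60, 120):
        print(minutes,
              should_alert(batch_job, duration_minutes=minutes),
              should_alert(runaway, duration_minutes=minutes))
```

With a 60-minute window both workloads raise an alert; with a 120-minute window only the runaway process does, which is the behavior the longer time options are meant to provide for type "b" machines.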
-
- VP, Product Management
- Posts: 27377
- Liked: 2800 times
- Joined: Mar 30, 2009 9:13 am
- Full Name: Vitaliy Safarov
- Contact:
Re: VMWare VM CPU Usage: Feature Request extended time optio
Hi Mervin,
mervincm wrote: Find VMs that could use additional vCPUs. Planning does not always keep up with reality, and sometimes systems get used more than they were in the past. Our Operations team would like to find these systems and add more CPU resources to keep these machines running well.
You can also use the CPU ready metric to locate VMs that are experiencing a lack of CPU resources.
mervincm wrote: I still get too many false positives, since some of them have a heavy CPU task that can last more than 60 minutes.
mervincm wrote: If Veeam had an option to alert after 90/120/180 minutes, etc., this group could be taken care of.
Yes, we can add more options to this list. However, if these tasks happen on a regular basis, have you considered configuring a suppression period for a specific time window?
P.S. Phew... it took me a while to read it and mull it over. Great post, btw!
Thanks!
-
- Novice
- Posts: 6
- Liked: never
- Joined: Oct 05, 2012 7:46 pm
- Full Name: Mervin Cinq-Mars
- Contact:
Re: VMWare VM CPU Usage: Feature Request extended time optio
mervincm wrote: Find VMs that could use additional vCPUs. Planning does not always keep up with reality, and sometimes systems get used more than they were in the past. Our Operations team would like to find these systems and add more CPU resources to keep these machines running well.
You can also use the CPU ready metric to locate VMs that are experiencing a lack of CPU resources.
By lack of CPU resources, I mean "didn't add enough vCPUs to satisfy the workload".
I understand CPU ready to be useful for finding cases of "added more vCPUs than the physical hardware could provide because it was busy feeding other virtual machines".
Do I have this incorrect?
mervincm wrote: I still get too many false positives, since some of them have a heavy CPU task that can last more than 60 minutes.
mervincm wrote: If Veeam had an option to alert after 90/120/180 minutes, etc., this group could be taken care of.
Yes, we can add more options to this list. However, if these tasks happen on a regular basis, have you considered configuring a suppression period for a specific time window?
Suppressions would work if they could be done per VM, if there was a repeating pattern, and if I had the time to learn the pattern.
I just think a few more options here would be a lot easier and would take care of the majority of my false positives.
-
- VP, Product Management
- Posts: 27377
- Liked: 2800 times
- Joined: Mar 30, 2009 9:13 am
- Full Name: Vitaliy Safarov
- Contact:
Re: VMWare VM CPU Usage: Feature Request extended time optio
mervincm wrote: I understand CPU ready to be useful for finding cases of "added more vCPUs than the physical hardware could provide because it was busy feeding other virtual machines". Do I have this incorrect?
You got this correct; I was just trying to point out that you can also track physical CPU usage via this metric.
mervincm wrote: I just think a few more options here would be a lot easier and would take care of the majority of my false positives.
Thanks for the feedback, I will ask our dev team to add more options then.
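On the CPU ready point above: vSphere reports CPU ready as a summation in milliseconds per sampling interval, and it is commonly converted to a percentage before being compared against a threshold. A rough sketch of that conversion follows. The function name is hypothetical, the 20-second default is the real-time chart interval, and dividing by the vCPU count to get a per-vCPU figure is an assumption about how the value is being read.

```python
def cpu_ready_percent(ready_ms: float,
                      interval_seconds: int = 20,
                      vcpus: int = 1) -> float:
    """Convert a CPU ready summation (milliseconds) into a percentage
    of the sampling interval, averaged per vCPU.

    interval_seconds: length of the sampling interval (20 s is assumed,
                      matching real-time charts).
    vcpus:            number of vCPUs the summation covers.
    """
    if interval_seconds <= 0 or vcpus <= 0:
        raise ValueError("interval_seconds and vcpus must be positive")
    return ready_ms * 100 / (interval_seconds * 1000 * vcpus)


if __name__ == "__main__":
    # 2000 ms of ready time in a 20 s interval on one vCPU is 10%.
    print(cpu_ready_percent(2000))
    # The same summation spread over 4 vCPUs is 2.5% per vCPU.
    print(cpu_ready_percent(2000, vcpus=4))
```

A high value here points to contention for physical CPU on the host, which is a different condition from a VM simply having too few vCPUs for its workload.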
-
- Novice
- Posts: 6
- Liked: never
- Joined: Oct 05, 2012 7:46 pm
- Full Name: Mervin Cinq-Mars
- Contact:
Re: VMWare VM CPU Usage: Feature Request extended time optio
Is there any feedback on whether this can be done, and if so, some idea of the time frame? We need to decide whether this can or cannot be used for our purposes.
-
- VP, Product Management
- Posts: 27377
- Liked: 2800 times
- Joined: Mar 30, 2009 9:13 am
- Full Name: Vitaliy Safarov
- Contact:
Re: VMWare VM CPU Usage: Feature Request extended time optio
Yes, I have asked to add more time periods to the options list in v8. Meanwhile, I believe it might be possible to set different options via an SQL script. I will check with the devs tomorrow.
-
- Novice
- Posts: 6
- Liked: never
- Joined: Oct 05, 2012 7:46 pm
- Full Name: Mervin Cinq-Mars
- Contact:
Re: VMWare VM CPU Usage: Feature Request extended time optio
great, thanks for the amazing customer interaction!
-
- VP, Product Management
- Posts: 27377
- Liked: 2800 times
- Joined: Mar 30, 2009 9:13 am
- Full Name: Vitaliy Safarov
- Contact:
Re: VMWare VM CPU Usage: Feature Request extended time optio
Unfortunately, it is not possible to adjust these periods in v7, but at least I've seen them already added in one of the builds of the next version.