How to identify an idle VM
Why does this matter? Simply because creating virtual machines is quick and easy, and after a few months there can be hundreds or even thousands of them. When you create a VM, you normally know something about its use and lifecycle: how long it will run and when it should be deleted. But after a few days of working on other projects you are likely to forget about it, and the VM will keep running in your infrastructure forever.
In some cases this happens with hundreds of VMs, which remain in your infrastructure consuming resources, limiting server performance and forcing you to buy new equipment. This is usually the point at which we start to wonder which VMs are really being used and, for lack of information, the question we ask is: how can we identify the useless VMs in the infrastructure?
You can answer this question from different perspectives. Let's look at some of them:
- Contact the VM manager and ask whether the VM is still in use.
- Look up the records we have about the VM, in particular its lifecycle.
- Check the VM's resource consumption to see whether it is actually being used.
The first two options may answer the question. However, in a medium or large infrastructure, contacting every VM owner can be very time-consuming, and keeping records for every VM is not common practice when sysadmins have other priorities. The third option is the most accurate approach, but without a suitable tool it is impossible to analyze behavior VM by VM.
DC Scope lets you track and centralize the information for each VM, including the owner, the person in charge and its complete lifecycle. Additionally, it analyzes the performance and consumption rates of the VMs every minute. Each virtual machine is then labeled with one of the following categories:
Idle: If during a given period of time, 100% of the processor and memory consumption values are below a certain activity rate (Y%). The activity rate is proportional to the processor and memory capacity of the server where the VM is hosted. This category contains all the VMs with low consumption rates, which are therefore likely not being used.
Lazy: Idle VMs where one of the resources (processor or memory) has consumption peaks higher than 30% for less than 10% of the period.
Undersized: If during a given period of time, 100% of the processor and memory consumption values are higher than 70%.
Busy: If during a given period of time, 100% of the processor and memory consumption values are higher than 90%.
Oversized: If during a given period of time, 100% of the processor and memory consumption values are less than 30%.
Ghost: VMs turned off.
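To make the rules above concrete, here is a minimal Python sketch of how such a classification could work. It is not DC Scope's implementation: the `VmSamples` type, the `classify` function, the order in which the rules are tested and the default activity rate of 2% are all assumptions made for illustration. Lazy is left out because the rule as stated does not say how its peaks combine with the Idle condition.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class VmSamples:
    """Hypothetical container for one VM's samples over the analysis period."""
    powered_on: bool
    cpu: Sequence[float]  # processor consumption, in percent
    mem: Sequence[float]  # memory consumption, in percent


def all_below(values: Sequence[float], limit: float) -> bool:
    return all(v < limit for v in values)


def all_above(values: Sequence[float], limit: float) -> bool:
    return all(v > limit for v in values)


def classify(vm: VmSamples, idle_rate: float = 2.0) -> str:
    """Apply the strict (100% of values) rules; the test order is an assumption."""
    if not vm.powered_on:
        return "Ghost"
    if not vm.cpu or not vm.mem:
        return "Unclassified"  # no samples, nothing to decide on
    # The most specific rule is tested first, because the categories overlap:
    # every Busy VM also satisfies Undersized, every Idle VM also Oversized.
    if all_above(vm.cpu, 90) and all_above(vm.mem, 90):
        return "Busy"
    if all_above(vm.cpu, 70) and all_above(vm.mem, 70):
        return "Undersized"
    if all_below(vm.cpu, idle_rate) and all_below(vm.mem, idle_rate):
        return "Idle"
    if all_below(vm.cpu, 30) and all_below(vm.mem, 30):
        return "Oversized"
    return "Unclassified"
```

For example, `classify(VmSamples(True, [1.0, 1.5], [0.5, 1.0]))` returns `"Idle"`, while the same VM with `powered_on=False` is reported as `"Ghost"` regardless of its samples.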
So, how do we identify the useless VMs? Let's focus on the Idle (unused) and Ghost VMs.
Ghost VMs are VMs that have been turned off. That does not make them useless (think about templates), but it is likely that some of them are simply VMs we no longer use. In this case it is important to check the end date of these VMs. If a VM's end date has already passed and it has not been turned on since then, that is a clue that it can be deleted.
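The end-date check for Ghost VMs can be sketched in a few lines of Python. The `GhostVm` record and its fields (`end_date`, `last_powered_on`, `is_template`) are hypothetical names chosen for this example; they do not describe how DC Scope stores this information.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass
class GhostVm:
    """Hypothetical record for a powered-off VM."""
    name: str
    end_date: Optional[date]         # planned end of life, if one was recorded
    last_powered_on: Optional[date]  # None if no power-on is on record
    is_template: bool = False


def deletion_candidates(vms: Iterable[GhostVm], today: date) -> List[str]:
    """Return the names of ghost VMs that are expired and never restarted since."""
    candidates = []
    for vm in vms:
        # Templates are powered off on purpose; VMs without an end date
        # cannot be judged by this rule and need a manual check.
        if vm.is_template or vm.end_date is None:
            continue
        expired = vm.end_date < today
        not_restarted = (
            vm.last_powered_on is None or vm.last_powered_on <= vm.end_date
        )
        if expired and not_restarted:
            candidates.append(vm.name)
    return candidates
```

The result is only a list of candidates: a VM that was powered on again after its end date is deliberately skipped, since someone apparently still needed it.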
For Idle VMs, the process is a little trickier. To detect useless VMs more precisely, DC Scope includes a notion of noise. In the previous definitions, detection is strict: 100% of the points must be below or above a certain threshold. However, a virtual machine still shows some activity (activity peaks) even if it is not being used. This activity comes from the operating system (checks, installations, updates, etc.) or from the temporary activity of an application in the VM (updates). So, to improve the detection of useless VMs, DC Scope performs these analyses according to four thresholds: 100%, 99.99%, 99.9% and 99%.
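The noise tolerance described above can be illustrated with a short Python sketch. It assumes that each threshold is the minimum share of samples that must stay below the activity rate; the function names and the default activity rate of 2% are assumptions for this example, not DC Scope's actual code.

```python
from typing import Optional, Sequence

# Share of samples (in percent) that must stay below the activity rate,
# from the strictest analysis to the most tolerant one.
TOLERANCES = (100.0, 99.99, 99.9, 99.0)


def share_below(values: Sequence[float], limit: float) -> float:
    """Percentage of samples strictly below `limit`."""
    return 100.0 * sum(v < limit for v in values) / len(values)


def strictest_idle_tolerance(
    cpu: Sequence[float], mem: Sequence[float], idle_rate: float = 2.0
) -> Optional[float]:
    """Return the strictest tolerance at which the VM still counts as idle.

    Returns None if the VM is not idle even at the 99% tolerance,
    or if there are no samples to analyze.
    """
    if not cpu or not mem:
        return None
    cpu_share = share_below(cpu, idle_rate)
    mem_share = share_below(mem, idle_rate)
    for tolerance in TOLERANCES:
        if cpu_share >= tolerance and mem_share >= tolerance:
            return tolerance
    return None
```

With 1,000 samples per resource, a single processor peak (for instance during an operating system update) means the VM no longer passes the 100% and 99.99% analyses but still passes the 99.9% one, so it remains a candidate instead of being discarded.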
Here you can check the analysis, starting with the VMs that are idle under the lowest activity threshold (2%) for the largest share of the period (100%). Just like Ghost VMs, some Idle VMs may have a specific function and cannot be deleted, but if they fall into this category it is very likely that they are useless.
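If you export such results and want to review them in that order yourself, a sort on two keys is enough. The sketch below is only an illustration: `IdleResult` and its fields are hypothetical names, not a DC Scope export format.

```python
from typing import Iterable, List, NamedTuple


class IdleResult(NamedTuple):
    """Hypothetical result row for one idle VM."""
    name: str
    activity_rate: float  # activity threshold the VM stayed under, in percent
    share_of_time: float  # share of the period spent under it, in percent


def review_order(results: Iterable[IdleResult]) -> List[IdleResult]:
    """Lowest activity threshold first, then largest share of time first."""
    return sorted(
        results,
        key=lambda r: (r.activity_rate, -r.share_of_time, r.name),
    )
```

The VM name is used as a last sort key only to keep the order stable and predictable when two VMs have the same values.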
Once you have gone through the list, you can start the deletion process. We recommend turning the VMs off before deleting them; if nobody complains, you can go ahead and delete the VM.
Deleting useless VMs is a fundamental optimization and cleaning step for managing your infrastructure correctly. It allows you to:
- Reduce the vCPU/vRAM allocation rates on the ESX hosts.
- Reclaim storage (VM disks, swap, logs).
- Improve the visibility of the VMs in your infrastructure.
Now that you have the information, you just need to download DC Scope to start an audit of your infrastructure, delete all the useless VMs and recover all those wasted resources!
Here are a few useful tips:
- Keep track of the creation and end-of-life dates of each VM.
- Analyze all the Ghost VMs as a priority. They could have been there for a long time.
- When checking the Idle VMs, start with those under the lowest activity threshold over the longest periods of time.