
How to Measure CPU Ready in OpenShift Virtualization?


eprieto
  • On the path to Greatness
  • 159 comments

Everyone managing VMware understands what we mean when we talk about CPU Ready (%RDY) as a Performance Metric.

In this explanation, I will define what it is and how it is measured in OpenShift Virtualization.

 

What is CPU Ready?

The term "CPU Ready" can be misleading. One might think that it represents the amount of CPU ready to be used and that a high value is a good thing. However, the higher the CPU Ready, the worse the performance of the vSphere infrastructure, and the more applications will suffer.

Official VMware Definition:

It is the percentage of time a "world" is ready to run and waiting for approval from the CPU Scheduler.

In vSphere, a "world" is a process.

  • The higher the CPU Ready, the more time VMs spend waiting instead of executing.
  • In other words, a "world" (vCPU) is waiting to be scheduled on a physical CPU.
  • CPU Ready measures the time a vCPU waits to be executed on a physical core.

What Causes High CPU Ready?

Identifying high CPU usage is easy, but determining the cause of high CPU Ready can be more challenging.
The two main causes of high CPU Ready are:

  1. High CPU Oversubscription
  2. Use of CPU Limits

CPU Oversubscription

The most common reason for high CPU Ready is assigning more vCPUs than the physical CPUs can handle.

General rules of thumb for vCPU to pCPU ratios (written here as pCPU:vCPU):

  • 1:1 to 1:3 → No problem.
  • 1:3 to 1:5 → Performance degradation may begin.
  • 1:5 or higher → Likely CPU Ready issues.
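
The rule of thumb above can be sketched as a small check. This is only an illustration: the function names are hypothetical, and because the ranges in the list share their endpoints, the sketch assumes a ratio of exactly 3 still counts as fine and a ratio of exactly 5 already counts as a likely problem.

```python
def overcommit_ratio(total_vcpus: int, physical_cores: int) -> float:
    """Return how many vCPUs are assigned per physical core on a host."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return total_vcpus / physical_cores


def classify_overcommit(ratio: float) -> str:
    """Map a vCPU-per-pCPU ratio onto the rule-of-thumb bands from the post."""
    if ratio <= 3:   # assumption: the shared boundary at 3 is treated as OK
        return "no problem"
    if ratio < 5:    # assumption: the shared boundary at 5 is treated as risky
        return "performance degradation may begin"
    return "likely CPU Ready issues"


if __name__ == "__main__":
    # Example host: 16 physical cores running VMs with 64 vCPUs in total.
    ratio = overcommit_ratio(total_vcpus=64, physical_cores=16)
    print(ratio, classify_overcommit(ratio))
```

These bands are a starting point, not a guarantee. A host with a low ratio can still show high CPU Ready if a few large VMs compete for the same cores.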

How to View CPU Ready in VMware?

  • The best way to analyze CPU Ready is at the VM level and per vCPU, not at the host level.
  • In vSphere Client, you can add CPU Ready as a performance metric in the performance charts.
  • In esxtop, you can view %RDY per VM.

 

What is a Normal CPU Ready Value?

  • VMware recommends keeping CPU Ready below 5% per vCPU.
  • CPU Ready is measured in milliseconds (ms) in the vSphere UI.
  • To convert it to a percentage:
    • Example: If a VM has 2173 ms of CPU Ready in a 20-second period (20000 ms): (2173 / 20000) × 100 ≈ 10.87%
    • This is considered high and indicates a problem.
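
The conversion above can be written as a short helper. This is a minimal sketch: the function name is made up for illustration, and the optional vcpus argument is an assumption for the case where the millisecond value is a VM-level total summed across all vCPUs rather than a per-vCPU reading.

```python
def cpu_ready_percent(ready_ms: float, interval_s: float = 20.0, vcpus: int = 1) -> float:
    """Convert a CPU Ready summation in milliseconds to a percentage.

    ready_ms   -- CPU Ready value reported for the sampling interval
    interval_s -- length of the sampling interval in seconds (20 s in the example)
    vcpus      -- assumption: divide by the vCPU count when ready_ms is a
                  VM-level total; leave at 1 for a per-vCPU value
    """
    if interval_s <= 0 or vcpus <= 0:
        raise ValueError("interval_s and vcpus must be positive")
    return ready_ms / (interval_s * 1000.0 * vcpus) * 100.0


if __name__ == "__main__":
    # The example from the post: 2173 ms of CPU Ready in a 20-second period.
    print(f"{cpu_ready_percent(2173):.3f}%")
```

Comparing the result against the 5% guideline tells you whether the VM is above the recommended level.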

 

How to Measure CPU Ready in OpenShift Virtualization?

In VMware vSphere, CPU Ready (%RDY) measures the time a vCPU spends waiting to be scheduled on a physical CPU.

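In OpenShift Virtualization, there is no metric named "CPU Ready". The closest equivalent covered here is CPU throttling, reported for the pod that runs the VM. Throttling counts the time the pod was held back by its CFS quota (its CPU limit), so treat it as a proxy for %RDY rather than an exact match.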

📌 Equivalent Metric in OpenShift Virtualization

container_cpu_cfs_throttled_seconds_total

  • Measures the total time (in seconds) that a container (or VM in OpenShift Virtualization) has been throttled, meaning it has been restricted in its CPU usage because it exceeded assigned resources.

📌 Other Related Metrics:

  • container_cpu_cfs_periods_total → Total number of CFS enforcement periods that have elapsed for the container.
  • container_cpu_cfs_throttled_periods_total → Number of those periods in which the container (the VM's pod) was throttled.
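
One way to turn these two counters into a percentage is to divide the increase in throttled periods by the increase in total periods over the same window. The sketch below assumes you have already read each counter twice (for example from Prometheus); the function name and the sample values are hypothetical.

```python
def throttled_percent(periods_start: float, periods_end: float,
                      throttled_start: float, throttled_end: float) -> float:
    """Percentage of CFS periods in which the container was throttled.

    All four arguments are raw counter readings taken at the start and end
    of the same time window.
    """
    elapsed_periods = periods_end - periods_start
    if elapsed_periods <= 0:
        # No periods elapsed (or the counter was reset): nothing to report.
        return 0.0
    throttled_periods = throttled_end - throttled_start
    return 100.0 * throttled_periods / elapsed_periods


if __name__ == "__main__":
    # Hypothetical readings: 600 periods elapsed, 60 of them throttled.
    print(throttled_percent(1000, 1600, 50, 110))
```

Working with the increase between two readings matters because both metrics are ever-growing counters; the raw totals say little on their own.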

How to Monitor CPU Throttling in OpenShift?

📌 Using Grafana in OpenShift Monitoring

If you have Prometheus and Grafana configured, you can create a dashboard with:

  • Metric: container_cpu_cfs_throttled_seconds_total
  • Filter: {pod=~"virt-launcher-my-vm-.*"} (to focus on specific VMs; each VM runs in a pod named virt-launcher-<vm-name>-<suffix>)
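
For a dashboard panel, a ratio of rates is usually easier to read than the raw counter. The sketch below builds such a PromQL expression as a string. It assumes the VM's pod follows the KubeVirt virt-launcher-<vm-name>-<suffix> naming; the function name, namespace, VM name, and 5m window are illustrative placeholders, not values from the original post.

```python
def throttling_ratio_query(namespace: str, vm_name: str, window: str = "5m") -> str:
    """Build a PromQL expression for the share of throttled CFS periods.

    The result is a value between 0 and 1 per pod; multiply by 100 in the
    panel (or in the query) to display a percentage.
    """
    # Label selector shared by both sides of the division.
    selector = f'namespace="{namespace}", pod=~"virt-launcher-{vm_name}-.*"'
    throttled = f"sum by (pod) (rate(container_cpu_cfs_throttled_periods_total{{{selector}}}[{window}]))"
    total = f"sum by (pod) (rate(container_cpu_cfs_periods_total{{{selector}}}[{window}]))"
    return f"{throttled} / {total}"


if __name__ == "__main__":
    # Paste the printed expression into a Grafana panel or the metrics UI.
    print(throttling_ratio_query("my-namespace", "my-vm"))
```

Summing by pod keeps one series per VM even when the pod contains more than one container.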

📌 Using oc adm top to Get Real-Time CPU Usage

Command:

oc adm top pods -n my-namespace
  • Displays current CPU and memory usage for each pod in the namespace, including the virt-launcher pods that run the VMs. It does not report throttling.

 

🚀 Recommendations to Avoid CPU Throttling in OpenShift Virtualization

 

1️⃣ Avoid Unnecessary CPU Limits

  • If a VM has a low CPU Limit, it can be constantly throttled, leading to high CPU Throttling.
  • Instead, define only CPU Requests and leave the Limit open if the host has available resources.

2️⃣ Monitor and Adjust Assigned Resources

  • Regularly review CPU Throttling metrics in Prometheus/Grafana.
  • Ensure critical VMs have sufficient resources.

 

3️⃣ Avoid vCPU Overcommitment on Physical Nodes

  • In OpenShift, assigning more vCPUs than the physical CPU capacity of the node can cause contention and throttling.
  • Follow recommended vCPU to pCPU ratios.

📊 Thresholds for Interpreting CPU Throttling in OpenShift

CPU Throttling (%) | Status | Performance Impact
0 - 5% | 🔵 Optimal | No impact on performance.
5 - 10% | 🟡 Warning | Possible latency in CPU-sensitive applications.
10 - 20% | 🟠 Moderate Issue | Delays and performance degradation may occur.
> 20% | 🔴 Critical | The VM or container is heavily throttled, severely impacting performance.
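
These thresholds can be applied automatically, for instance in a script that reviews throttling values exported from Prometheus. The sketch below is illustrative only: the function name is hypothetical, and since the ranges share their endpoints, it assumes each upper bound belongs to the lower band (exactly 5% is still Optimal, exactly 20% is still Moderate Issue).

```python
def throttling_status(percent: float) -> str:
    """Map a CPU throttling percentage onto the status bands from the table."""
    if percent < 0:
        raise ValueError("percent must not be negative")
    if percent <= 5:    # assumption: upper bounds are inclusive
        return "Optimal"
    if percent <= 10:
        return "Warning"
    if percent <= 20:
        return "Moderate Issue"
    return "Critical"


if __name__ == "__main__":
    for value in (2.0, 7.5, 15.0, 35.0):
        print(f"{value}% -> {throttling_status(value)}")
```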

 

🛠️ Quick Comparison: VMware vs. OpenShift Virtualization

Concept | VMware (vSphere) | OpenShift Virtualization (KubeVirt)
CPU Ready (%) | %RDY in esxtop | container_cpu_cfs_throttled_seconds_total
GUI Monitoring | vSphere Client | OpenShift Grafana Dashboard
CLI Monitoring | esxtop | oc adm top pods + Prometheus
Main Cause | CPU Oversubscription | CPU Limits or node overload

 

 

 

3 comments

Chris.Childerhose
  • Veeam Legend, Veeam Vanguard
  • 8459 comments
  • March 15, 2025

Never used OpenShift before but this was a very interesting read.  Thanks for sharing.


Tommy O'Shea
  • Experienced User
  • 85 comments
  • March 17, 2025

This was very informative. Recently we discovered a customer's performance issues were due to CPU oversubscription, so the timing on this post was great. 


eprieto
  • Author
  • On the path to Greatness
  • 159 comments
  • March 17, 2025
Tommy O'Shea wrote:

This was very informative. Recently we discovered a customer's performance issues were due to CPU oversubscription, so the timing on this post was great. 

Hi Tommy, I'm very happy to be able to help you.

