Investigating high CPU utilization of JVM processes – Finding the Culprit

Recently, at one of our client sites, we observed a CPU utilization problem. At random intervals, CPU usage climbed to 100% and stayed there until the application was restarted. The application is a standalone Java application that performs heavy database operations. It was critical to find and fix the issue quickly, because the high CPU usage affected other applications on the same server. In this post I explain the methodology we used to find and fix it.

Environment
Sun Solaris Server – 12 cores

  1. Finding the process
    We used the “top” command to list the applications using the most CPU.
    Because the server has 12 cores, top by default reported each process's CPU usage summed across all cores.
    After we turned off Irix mode (by pressing ‘I’), it reported usage divided by the number of CPUs, giving an average per process. We then noted the PID of the process using the most CPU. The top manual describes the toggle as follows:
Irix/Solaris_Mode_toggle
When operating in 'Solaris mode' ('I' toggled Off), a
task's CPU usage will be divided by the total number of
CPUs. After issuing this command, you'll be informed of
the new state of this toggle.
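
To make the difference between the two modes concrete, here is a small arithmetic sketch in Java that follows the rule in the manual excerpt above. The class and method names are made up for illustration, and the 100% reading is a hypothetical value for one thread saturating one core, not a measurement from our server.

```java
// Illustrative only: converts an Irix-mode CPU reading from top into the
// Solaris-mode value by dividing it by the number of CPUs.
final class TopModeMath {

    // In Solaris mode, a task's CPU usage is divided by the total number of CPUs.
    static double toSolarisMode(double irixModePercent, int cpuCount) {
        if (cpuCount <= 0) {
            throw new IllegalArgumentException("cpuCount must be positive");
        }
        return irixModePercent / cpuCount;
    }

    public static void main(String[] args) {
        int cpuCount = 12;           // cores on the server described above
        double irixReading = 100.0;  // hypothetical: one thread saturating one core
        System.out.println("Solaris mode: " + toSolarisMode(irixReading, cpuCount) + "%");
    }
}
```

Under these assumptions, a thread that pins one of the 12 cores appears as roughly 8.3% in Solaris mode, which is why it matters which mode is active when reading the numbers.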

  2. Finding the thread
    With the PID of the process in hand, we pressed the ‘H’ key to list the threads of that process. We then noted the NID, or ‘soft process ID’ (the thread ID), of the busiest thread from the first column.

  3. Get the thread dump
    We used the following command to obtain a thread dump of the JVM process identified in step 1.
jstack -l PID > jstack.txt

  4. Isolate the stack trace

The thread dump file contains the stack trace of every thread, so the important part is identifying the correct one. Each stack trace has an nid field in hexadecimal that corresponds to the thread ID obtained in step 2. After converting that thread ID to hexadecimal, we searched the dump for the matching nid and isolated the stack trace.
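
The conversion and lookup can be sketched in Java as follows. This is a minimal illustration rather than the tooling we used: the class name, method names, and the sample dump text are hypothetical, and the code assumes the usual jstack layout in which each thread entry begins with a header line containing nid=0x... and entries are separated by blank lines.

```java
import java.util.Optional;

final class NidLookup {

    // top shows the thread ID in decimal; jstack prints it as nid=0x<hex>.
    static String toNid(long threadId) {
        return "0x" + Long.toHexString(threadId);
    }

    // Returns the dump entry whose header line carries the given nid, if any.
    // Assumes entries are separated by blank lines.
    static Optional<String> findStackTrace(String dump, long threadId) {
        String marker = "nid=" + toNid(threadId);
        for (String entry : dump.split("\\R\\s*\\R")) {
            String header = entry.split("\\R", 2)[0];
            // Compare whole tokens so that 0x1a does not also match 0x1ab.
            for (String token : header.split("\\s+")) {
                if (token.equals(marker)) {
                    return Optional.of(entry.trim());
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical two-thread dump, trimmed to the relevant fields.
        String dump = String.join("\n",
            "\"worker-1\" prio=3 tid=0x0000000100a1 nid=0x1ab runnable",
            "   java.lang.Thread.State: RUNNABLE",
            "\tat com.example.Poller.run(Poller.java:42)",
            "",
            "\"worker-2\" prio=3 tid=0x0000000100a2 nid=0x1a waiting on condition",
            "   java.lang.Thread.State: TIMED_WAITING");
        System.out.println(toNid(26)); // prints 0x1a
        System.out.println(findStackTrace(dump, 26).orElse("not found"));
    }
}
```

Matching on the whole nid token, rather than on a substring, avoids picking up a different thread whose hexadecimal ID merely starts with the same digits.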

  5. Find the culprit code block

From the stack trace, we identified the code block that was running at 100% CPU. In our case, it was a while loop that entered an infinite cycle when a certain condition was met.
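
We can't reproduce the production code here, so the following Java sketch only illustrates the general pattern: a while loop whose exit condition can never become true for certain inputs, next to a version that always terminates. The class, the methods, and the step-by-two logic are all invented for illustration and are not taken from our application.

```java
final class LoopGuard {

    // Buggy pattern (never called here): with a step of 2 and an odd target,
    // i skips over the target, so i != target stays true and the thread spins.
    static int countStepsBuggy(int target) {
        long i = 0;
        int steps = 0;
        while (i != target) {
            i += 2;
            steps++;
        }
        return steps;
    }

    // Fixed pattern: the loop also exits once i has passed the target.
    static int countSteps(int target) {
        long i = 0;
        int steps = 0;
        while (i < target) {
            i += 2;
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(countSteps(7)); // prints 4
    }
}
```

A thread stuck in a loop like the buggy one shows up in the thread dump as RUNNABLE with the same frame at the top every time the dump is taken, which is what makes this kind of defect easy to spot once the right stack trace is isolated.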

  6. Fix it!

We fixed the loop, tested the change, and deployed it to production. Now the JVM process runs so smoothly that even the CPU doesn’t notice it 🙂

Dinusha

About

Dinusha is a senior software engineer at Interblocks Ltd., currently working closely with Java, fintech data analytics, and real-time transaction processing technologies. https://www.linkedin.com/in/dinusha-thilakarathna-b4642736

Author: Dinusha Thilakarathne
