rawops.dev

P2

High CPU Usage — Linux Troubleshooting Guide

Identify and resolve processes causing high CPU utilization on Linux. Covers real-time monitoring, runaway process detection, and safe mitigation options.

15 min, 7 steps

Get a quick overview of system load and CPU utilization.

uptime && echo '---' && mpstat 1 3 2>/dev/null || vmstat 1 3
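Expected: Load average and CPU breakdown (user, system, iowait, idle). Compare the load average against the core count (nproc): sustained values above it mean tasks are queuing. High iowait points to a storage or I/O bottleneck rather than CPU-bound work.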
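
A load average is only meaningful relative to the number of cores. The following is a minimal sketch of that comparison, assuming a Linux /proc/loadavg and the coreutils nproc command; the function name load_per_core is illustrative and not part of any tool.

```shell
# Divide a load average by the core count. Values well above 1.00 suggest
# more runnable (or I/O-blocked) tasks than the CPUs can serve.
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# Usage on a live system (the 1-minute load average is the first field):
#   load_per_core "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
```

On Linux the load average also counts tasks in uninterruptible sleep, so a high ratio with high iowait usually points at storage rather than the CPU.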

Find which processes are using the most CPU.

ps aux --sort=-%cpu | head -15
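Expected: Process list sorted by CPU. In ps, %CPU is total CPU time divided by the time the process has been running, so it is a lifetime average rather than a live reading. It is relative to a single core and can exceed 100% for multi-threaded processes.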
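
Because ps averages over a process's whole lifetime, a process that only recently started spinning can still show a low figure. One way to measure usage over a short window instead is sketched below. It assumes Linux /proc/PID/stat (utime and stime are fields 14 and 15) and getconf CLK_TCK for the tick rate; the function names are hypothetical helpers.

```shell
# Convert a tick delta into a percentage of one core.
# Args: delta_ticks ticks_per_second interval_seconds
ticks_to_pct() {
    awk -v d="$1" -v hz="$2" -v s="$3" 'BEGIN { printf "%.1f\n", 100 * d / (hz * s) }'
}

# Total CPU ticks (utime + stime) a process has consumed so far.
proc_ticks() {
    # Drop everything through the last ") " so spaces in the command name
    # cannot shift fields; utime and stime are then fields 12 and 13.
    sed 's/^.*) //' "/proc/$1/stat" | awk '{ print $12 + $13 }'
}

# Measure one PID over a short window (default 2 seconds).
sample_cpu() {
    pid=$1
    secs=${2:-2}
    [ -r "/proc/$pid/stat" ] || return 1
    t0=$(proc_ticks "$pid")
    sleep "$secs"
    [ -r "/proc/$pid/stat" ] || return 1
    t1=$(proc_ticks "$pid")
    ticks_to_pct "$((t1 - t0))" "$(getconf CLK_TCK)" "$secs"
}

# Usage: sample_cpu PID 2
```

As with ps, the result is relative to one core, so a process with several busy threads can report more than 100.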

Watch CPU usage in real-time to see if it's sustained or spiky.

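top -b -n 3 -d 2 | awk '/^top - /{n=0} ++n<=17'
Expected: Three snapshots, each trimmed to the summary header and the top few processes. Look for processes that stay near the top in every snapshot; the first snapshot is often less representative than the later ones.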
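
To separate sustained load from a brief spike, it can help to count how many snapshots each process spends above a threshold. This is a rough sketch that assumes top's default batch columns (PID in field 1, %CPU in field 9, COMMAND in field 12) and a locale that uses a decimal point; hot_in_snapshots is an illustrative name.

```shell
# Count, per PID, in how many top snapshots %CPU exceeded a threshold.
# Reads top's batch output on stdin; the threshold defaults to 50.
hot_in_snapshots() {
    awk -v limit="${1:-50}" '
        $1 ~ /^[0-9]+$/ && $9 + 0 > limit { hits[$1 " " $12]++ }
        END { for (k in hits) print hits[k], k }
    ' | sort -rn
}

# Usage: top -b -n 5 -d 2 | hot_in_snapshots 50
```

A process that appears in every snapshot is a sustained consumer; one that appears once is more likely a spike.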

Look for processes using >50% CPU that shouldn't be.

ps -eo pid,ppid,user,%cpu,%mem,etime,comm --sort=-%cpu | awk 'NR==1 || $4 > 50'
Expected: Column header plus any process above 50% CPU. Check etime (elapsed time): a long-running process that still averages high CPU deserves a closer look.
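
The elapsed-time check can also be done by the filter itself. The sketch below assumes a procps-ng ps that supports the etimes column (elapsed time in seconds); the function name and both default thresholds are illustrative.

```shell
# Keep the header plus rows with %CPU above $1 and elapsed seconds above $2.
# Expects ps columns in this order: pid user %cpu etimes comm.
filter_long_hot() {
    awk -v cpu="${1:-50}" -v age="${2:-3600}" \
        'NR == 1 || ($3 + 0 > cpu && $4 + 0 > age)'
}

# Usage: ps -eo pid,user,%cpu,etimes,comm --sort=-%cpu | filter_long_hot 50 3600
```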

Get more details about the offending process.

# Replace PID with the actual process ID:
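ls -l /proc/PID/exe; tr '\0' ' ' < /proc/PID/cmdline; echo
Expected: Full path to the executable and the command line used to start it. Run with sudo for processes owned by another user. A path ending in "(deleted)" means the binary was removed or replaced after the process started.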

If it's a systemd service, check its status and recent logs.

# Replace SERVICE with the service name:
systemctl status SERVICE --no-pager; journalctl -u SERVICE --since '1 hour ago' --no-pager | tail -30
Expected: Service status and recent log entries. Look for error patterns, restarts, or resource warnings.
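
Scanning the journal by eye is easy to get wrong on a noisy service. As a rough aid, the sketch below counts lines that match a few common trouble keywords. The keyword list is an assumption and should be tuned per service; count_trouble is an illustrative name.

```shell
# Count log lines on stdin that match common trouble keywords (case-insensitive).
# grep -c exits non-zero when the count is 0, so "|| true" keeps the status clean.
count_trouble() {
    grep -Eic 'error|fail|killed|out of memory|timed? out' || true
}

# Usage: journalctl -u SERVICE --since '1 hour ago' --no-pager | count_trouble
```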

Either restart the service, reduce its priority, or kill it if necessary.

# Option 1: Restart a service
systemctl restart SERVICE
# Option 2: Lower priority (nice value)
renice +10 -p PID
# Option 3: Kill as last resort
kill PID  # SIGTERM: asks the process to exit and lets it clean up
kill -9 PID  # SIGKILL: cannot be caught or ignored (last resort)
Expected: Service restarted, process priority lowered, or process terminated. Verify with 'top' or 'ps'.
Use kill -9 only as a last resort: SIGKILL cannot be caught, so the process gets no chance to flush data, remove lock files, or shut down its children cleanly.
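
The graceful-then-forced order can be wrapped in a small helper so that SIGKILL is sent only after SIGTERM has had time to work. This is a minimal sketch; the function name stop_gracefully and the 10-second default are assumptions, not part of any standard tool.

```shell
# Send SIGTERM, wait up to N seconds for the process to exit, then SIGKILL.
# Args: pid [timeout_seconds]
stop_gracefully() {
    pid=$1
    timeout=${2:-10}
    # Nothing to do if the process is gone or we may not signal it.
    kill -0 "$pid" 2>/dev/null || return 0
    kill "$pid" 2>/dev/null
    i=0
    while [ "$i" -lt "$timeout" ]; do
        kill -0 "$pid" 2>/dev/null || return 0   # exited after SIGTERM
        sleep 1
        i=$((i + 1))
    done
    kill -9 "$pid" 2>/dev/null                   # still running: force it
}

# Usage: stop_gracefully PID 10
```

For a systemd-managed service, restarting or stopping the unit is usually preferable, since systemd may otherwise restart a killed process according to the unit's Restart= setting.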