A practical guide to checking memory usage in Linux. Covers essential commands (ps, top, htop), explains VSZ vs RSS, and discusses memory visibility issues inside Docker containers.
When a Linux server starts acting sluggish, the first suspect is usually memory. Whether you are debugging a standalone server or a Kubernetes pod, knowing exactly how to pinpoint the “memory hogs” is a critical skill.
This post covers the quick commands to find the culprit, explains what the metrics actually mean, and dives into why memory analysis inside Docker containers can be deceptive.
1. The Quick Fix: Finding the Culprit
If you need to identify the process consuming the most memory immediately, you don’t need to install anything.
Method 1: The Script-Friendly Way (ps)
The ps command is available on almost every Linux system. To list the top 10 memory-consuming processes:
ps aux --sort=-%mem | head -n 11
Breakdown:
- ps aux: Lists all processes for all users.
- --sort=-%mem: Sorts by memory percentage in descending order (note the minus sign).
- head -n 11: Shows the top 11 lines (1 header line + top 10 processes).
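If you want tighter, script-friendly output, ps can also print just the columns you need. A minimal sketch, assuming the procps-ng ps that most distributions ship:

# PID, command name, memory percentage, and resident memory, sorted by RSS
ps -eo pid,comm,%mem,rss --sort=-rss | head -n 11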
Method 2: The Interactive Way (htop / top)
While ps is great for snapshots, interactive tools are better for monitoring.
- htop (Recommended): If installed, use this. You can click the MEM% column header to sort, or press F6 to choose the sort column. Its meter bars and color coding give a much clearer visual overview.
- top (Built-in): If you don’t have htop, run top. By default, it sorts by CPU. Press Shift + M inside the interface to switch to Memory sorting.
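If you would rather not press keys every time, recent procps-ng versions of top also accept a sort field on the command line (treat this flag as an assumption if your top comes from BusyBox or an older procps):

# Start top already sorted by memory usage
top -o %MEM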
2. Understanding the Metrics: VSZ vs. RSS
When you run the commands above, you will see two confusing columns: VSZ and RSS. Understanding the difference is vital.
| Metric | Full Name | Definition |
|---|---|---|
| VSZ | Virtual Memory Size | The total amount of memory a process has asked for (allocated). This includes memory that has been mapped but not yet used. |
| RSS | Resident Set Size | The actual amount of physical RAM the process is currently using. This is the number you usually care about. |
The Catch: RSS can overstate a process's true footprint because it includes shared libraries. If 10 processes all use libc, the pages libc occupies are counted in full in the RSS of each of those 10 processes, so summing RSS across processes over-counts shared memory.
Pro Tip: If you need absolute precision (e.g., “How much RAM will I get back if I kill this process?”), you should look at PSS (Proportional Set Size), which divides shared memory among processes. Tools like smem can show this.
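If smem isn't installed, the kernel exposes the same per-process data under /proc. A minimal sketch, assuming a reasonably recent kernel (smaps_rollup appeared in 4.14) and using <PID> as a placeholder for the target process:

# Per-process PSS straight from the kernel
grep '^Pss:' /proc/<PID>/smaps_rollup

# On older kernels, sum the Pss entries from smaps instead
awk '/^Pss:/ {sum += $2} END {print sum " kB"}' /proc/<PID>/smaps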
3. The System-Level View: Don’t Panic
Before hunting individual processes, check the global system state:
free -h
A common source of confusion for new Linux users is the “free” column. You might see:
              total        used        free      shared  buff/cache   available
Mem:           15Gi       4.5Gi       200Mi       1.0Gi        10Gi        10Gi
- Free: 200Mi. (Panic? No.)
- Available: 10Gi. (Relax.)
Linux follows the philosophy that “unused RAM is wasted RAM.” It automatically uses free memory to cache disk files (buff/cache) to speed up I/O. If applications need more RAM, the kernel instantly reclaims it from the cache. Always look at the “available” column.
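For scripts and monitoring checks, read "available" directly rather than "free". A small sketch, assuming a kernel new enough to expose MemAvailable (3.14+) and a modern procps-ng free (which determines the column positions):

# Available memory in kB, straight from the kernel
awk '/^MemAvailable:/ {print $2 " kB"}' /proc/meminfo

# Or pull the "available" column from free, in MiB
free -m | awk '/^Mem:/ {print $7 " MiB available"}'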
When Memory Truly Runs Out: The OOM Killer
If “available” memory hits zero, the Linux kernel invokes the OOM (Out of Memory) Killer. It sacrifices a process to save the system.
To check if your application (e.g., a Python script or Java app) was a victim:
dmesg | grep -i "out of memory"
# Or check the logs
grep -i "killed" /var/log/kern.log
4. The Container “Lie”: Memory in Docker & Kubernetes
If you are running these commands inside a Docker container or a Kubernetes Pod, what you see might not be real.
The Visibility Problem
You might set a Kubernetes limit of 512MB for your pod. However, if you run free -h or top inside that pod, you will likely see 64GB (or whatever the host node has).
Why?
Containers are not Virtual Machines. They utilize Namespaces for isolation and Cgroups for resource limitation.
- Cgroups strictly enforce the limit. If the container's processes exceed 512MB, the kernel OOM-kills one of them.
- Namespaces isolate PIDs, mounts, and networks, but memory statistics in /proc (such as /proc/meminfo) are not namespaced.
Tools like free and top read from /proc/meminfo. Because those numbers are host-wide kernel statistics, the container "sees" the host's total resources, even though it can't use them.
The Consequence: This causes issues with runtimes like the JVM (Java). Older Java versions would see 64GB RAM, create a massive Heap, and immediately get killed by the container Cgroup limit.
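Modern JDKs (10+, and 8u191+) read the cgroup limit by default, and you can also cap the heap relative to that limit explicitly. A sketch, where app.jar is a placeholder for your application:

# Size the heap from the container limit instead of host RAM
java -XX:+UseContainerSupport -XX:MaxRAMPercentage=75.0 -jar app.jar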
How to Fix It?
- Trust Cgroups, not /proc: Check the Cgroup files (e.g., /sys/fs/cgroup/memory/ on cgroup v1, or /sys/fs/cgroup/ directly on cgroup v2) for the real usage and limits; see the commands after this list.
- LXCFS: In production Kubernetes clusters, we often use LXCFS. This is a FUSE-based filesystem that “masks” /proc files. When a process inside a container reads /proc/meminfo, LXCFS intercepts the call and returns values consistent with the container’s Cgroup limits.
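What that looks like in practice depends on the cgroup version; the paths below are the common defaults inside a container:

# cgroup v2 (most current distributions)
cat /sys/fs/cgroup/memory.max       # the limit ("max" means unlimited)
cat /sys/fs/cgroup/memory.current   # current usage in bytes

# cgroup v1 (older hosts)
cat /sys/fs/cgroup/memory/memory.limit_in_bytes
cat /sys/fs/cgroup/memory/memory.usage_in_bytes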
A Note on Disk Space (df -h)
Similarly, running df -h inside a container usually shows the host’s disk size and usage.
This is because the default Docker storage driver, overlay2, operates at the file level. It shares the underlying filesystem (ext4/xfs) of the host. There is no separate “virtual disk” created for the container. While you can limit writable space using XFS project quotas, standard tools like df will essentially show you the underlying host partition.
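To see what a container is actually writing, it is usually more useful to ask Docker itself than to run df inside the container; these are standard Docker CLI commands:

# Per-container writable-layer size
docker ps --size

# Aggregate disk usage for images, containers, and volumes
docker system df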
Conclusion
- Quick Check: Use htop or ps aux --sort=-%mem.
- Analysis: Focus on RSS for physical memory usage, but remember it includes shared memory.
- Containers: Be skeptical of system tools inside Docker. The container might claim it has 64GB of RAM and 1TB of disk, but Cgroups (the enforcement layer) usually disagree.