Re: Possible memory leak in 6.17.7
From: David Wang
Date: Mon Dec 08 2025 - 06:09:20 EST
On Mon, 10 Nov 2025 18:20:08 +1000
Mal Haak <malcolm@xxxxxxxxxx> wrote:
> Hello,
>
> I have found a memory leak in 6.17.7 but I am unsure how to track it
> down effectively.
>
> I am running a server that has a heavy read/write workload to a cephfs
> file system. It is a VM.
>
> Over time it appears that the non-cache usage of kernel dynamic
> memory increases. The kernel seems to think the pages are reclaimable
> however nothing appears to trigger the reclaim. This leads to
> workloads getting killed via oomkiller.
>
> smem -wp output:
>
> Area                           Used      Cache   Noncache
> firmware/hardware             0.00%      0.00%      0.00%
> kernel image                  0.00%      0.00%      0.00%
> kernel dynamic memory        88.21%     36.25%     51.96%
> userspace memory              9.49%      0.15%      9.34%
> free memory                   2.30%      2.30%      0.00%
>
> free -h output:
>
>                total        used        free      shared  buff/cache   available
> Mem:            31Gi       3.6Gi       500Mi       4.0Mi        11Gi        27Gi
> Swap:          4.0Gi       179Mi       3.8Gi
>
> Reverting to the previous LTS fixes the issue
>
> smem -wp output:
> Area                           Used      Cache   Noncache
> firmware/hardware             0.00%      0.00%      0.00%
> kernel image                  0.00%      0.00%      0.00%
> kernel dynamic memory        80.22%     79.32%      0.90%
> userspace memory             10.48%      0.20%     10.28%
> free memory                   9.30%      9.30%      0.00%
>
I think the `memory allocation profiling` feature can help.
https://docs.kernel.org/mm/allocation-profiling.html
You would need to build a kernel with:
    CONFIG_MEM_ALLOC_PROFILING=y
    CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=y
Then check /proc/allocinfo for suspicious allocations, i.e. call sites
that hold more memory than expected.
(I once caught an nvidia driver memory leak this way.)
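For what it's worth, here is a rough sketch of how one might sift through
/proc/allocinfo. It assumes each data line has the form
"<size> <calls> <tag info>" as described in the documentation linked above,
and that the file is readable (probably root only). The function names
(parse_allocinfo, top_by_size, growth) are just illustrative helpers, not
part of any kernel tooling. Since a leak shows up as growth, comparing two
snapshots taken some time apart is likely more telling than a single
reading.

```python
import sys

ALLOCINFO_PATH = "/proc/allocinfo"  # present only with CONFIG_MEM_ALLOC_PROFILING=y


def parse_allocinfo(text):
    """Return {tag: (bytes, calls)}; lines not starting with two integers are skipped."""
    entries = {}
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3 or not (parts[0].isdigit() and parts[1].isdigit()):
            continue  # header or malformed line
        tag = parts[2].strip()
        prev_size, prev_calls = entries.get(tag, (0, 0))
        entries[tag] = (prev_size + int(parts[0]), prev_calls + int(parts[1]))
    return entries


def top_by_size(entries, n=10):
    """Return the n (tag, (bytes, calls)) pairs holding the most memory."""
    return sorted(entries.items(), key=lambda kv: kv[1][0], reverse=True)[:n]


def growth(before, after, n=10):
    """Return the n (tag, delta_bytes) pairs that grew the most between snapshots."""
    grown = []
    for tag, (size, _calls) in after.items():
        delta = size - before.get(tag, (0, 0))[0]
        if delta > 0:
            grown.append((tag, delta))
    return sorted(grown, key=lambda kv: kv[1], reverse=True)[:n]


def main():
    try:
        with open(ALLOCINFO_PATH) as f:
            entries = parse_allocinfo(f.read())
    except OSError as err:
        print(f"cannot read {ALLOCINFO_PATH}: {err}", file=sys.stderr)
        return
    for tag, (size, calls) in top_by_size(entries):
        print(f"{size:>14} {calls:>10} {tag}")


if __name__ == "__main__":
    main()
```

To compare over time, save the file contents twice (say an hour apart under
load), run parse_allocinfo() on each, and pass both results to growth().
Call sites that keep growing across snapshots are the ones to look at first.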
FYI
David