Re: [PATCH v6 6/9] mm: multigenerational lru: aging
From: Alexey Avramov
Date: Tue Jan 11 2022 - 09:23:05 EST
> I do not really see any arguments why a userspace-based thrashing
> detection cannot be used for those.
Firstly,
because this is the task of the kernel, not the user space.
Memory is managed by the kernel, not by the user space.
The absence of such a mechanism in the kernel is a fundamental problem.
The userspace tools are ugly hacks:
some of them consume a lot of CPU [1],
some of them consume a lot of memory [2],
some of them cannot use process_mrelease() (earlyoom, nohang),
some of them can kill only a whole cgroup (systemd-oomd, oomd) [3]
and depend on systemd and cgroup_v2 (oomd, systemd-oomd).
One of the biggest challenges for userspace oom-killers is that they may
have to function under intense memory pressure and are prone to getting
stuck in memory reclaim themselves [4].
It is strange that after decades of user complaints about thrashing and
a non-working OOM killer I have to explain the obvious things.
The basic mechanism must be implemented in the kernel.
Stop shifting responsibility to the user space!
Secondly,
the real reason for the min_ttl_ms mechanism is that without it,
multi-minute stalls are possible [5] even when the killer is expected to
arrive, and memory pressure is close to 100 during this period [6].
This fixes a bug that does not exist in the mainline LRU (this is an
MGLRU-specific bug). BTW, similar symptoms were recently fixed in
mainline [7].
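
For readers unfamiliar with the "memory pressure close to 100" metric: it
refers to PSI, which the kernel exposes in /proc/pressure/memory as "some"
and "full" lines with avg10/avg60/avg300 percentages and a cumulative
total in microseconds. Below is a minimal sketch of how a userspace tool
might parse that format. The function names, the 90.0 threshold, and the
sample values are assumptions made up for illustration; they are not
measurements and are not taken from any of the tools mentioned above.

```python
def parse_psi(text):
    """Parse text in the /proc/pressure/memory format into nested dicts."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        kind = fields[0]  # "some" or "full"
        values = {}
        for item in fields[1:]:
            key, _, value = item.partition("=")
            # "total" is a cumulative stall time in microseconds;
            # the avg* fields are percentages over 10/60/300 seconds.
            values[key] = int(value) if key == "total" else float(value)
        result[kind] = values
    return result


def is_thrashing(psi, threshold=90.0):
    """Hypothetical check: is the 10-second 'full' average above threshold?"""
    return psi.get("full", {}).get("avg10", 0.0) >= threshold


# Illustrative sample only (not real data).
SAMPLE = (
    "some avg10=99.12 avg60=85.40 avg300=40.02 total=123456789\n"
    "full avg10=97.80 avg60=80.11 avg300=37.55 total=113456789\n"
)

if __name__ == "__main__":
    psi = parse_psi(SAMPLE)
    print(psi["full"]["avg10"], is_thrashing(psi))
```

On a real system the input would come from reading /proc/pressure/memory,
which is exactly the step that can itself stall under heavy reclaim.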
[1] https://github.com/facebookincubator/oomd/issues/79
[2] https://github.com/hakavlad/nohang#memory-and-cpu-usage
[3] https://github.com/facebookincubator/oomd/issues/125
[4] https://lore.kernel.org/all/CALvZod7vtDxJZtNhn81V=oE-EPOf=4KZB2Bv6Giz+u3bFFyOLg@xxxxxxxxxxxxxx/
[5] https://github.com/zen-kernel/zen-kernel/issues/223
[6] https://raw.githubusercontent.com/hakavlad/cache-tests/main/mg-LRU-v3_vs_classic-LRU/3-firefox-tail-OOM/mg-LRU-1/psi2
[7] https://lore.kernel.org/linux-mm/20211202150614.22440-1-mgorman@xxxxxxxxxxxxxxxxxxx/
[I am duplicating a previous message here - it was not delivered to the mailing lists]