Re: [PATCH v4 2/8] mm: vmscan: make global slab shrink lockless

From: Vlastimil Babka
Date: Wed Mar 08 2023 - 10:03:12 EST


On 3/7/23 07:55, Qi Zheng wrote:
> The shrinker_rwsem is a global read-write lock in
> the shrinkers subsystem, which protects most operations
> such as slab shrink, registration and unregistration
> of shrinkers, etc. This can easily cause problems in
> the following cases.
>
> 1) When the memory pressure is high and there are many
> filesystems mounted or unmounted at the same time,
> slab shrink will be affected (down_read_trylock()
> fails).
>
> One example is the real workload mentioned by Kirill Tkhai:
>
> ```
> One of the real workloads from my experience is start
> of an overcommitted node containing many starting
> containers after node crash (or many resuming containers
> after reboot for kernel update). In these cases memory
> pressure is huge, and the node goes round in long reclaim.
> ```
>
> 2) If a shrinker is blocked (such as the case mentioned
> in [1]) and a writer comes in (such as mounting a fs),
> then this writer will be blocked and cause all
> subsequent shrinker-related operations to be blocked.
>
> Even if there is no competitor when shrinking slab, there
> may still be a problem. If we have a long shrinker list
> and we do not reclaim enough memory with each shrinker,
> then the down_read_trylock() may be called with high
> frequency. Because of the poor multicore scalability of
> atomic operations, this can lead to a significant drop
> in IPC (instructions per cycle).
>
> Several times in the past ([2],[3],[4],[5]), people
> wanted to replace the shrinker_rwsem trylock with SRCU in
> the slab shrink, but all these patches were abandoned
> because SRCU was not unconditionally enabled.
>
> But now, since commit 1cd0bd06093c ("rcu: Remove CONFIG_SRCU"),
> SRCU is unconditionally enabled. So it's time to use
> SRCU to protect readers who previously held shrinker_rwsem.
>
> This commit uses SRCU to make global slab shrink lockless;
> the memcg slab shrink is handled in a subsequent patch.
>
> [1]. https://lore.kernel.org/lkml/20191129214541.3110-1-ptikhomirov@xxxxxxxxxxxxx/
> [2]. https://lore.kernel.org/all/1437080113.3596.2.camel@xxxxxxxxxxxx/
> [3]. https://lore.kernel.org/lkml/1510609063-3327-1-git-send-email-penguin-kernel@xxxxxxxxxxxxxxxxxxx/
> [4]. https://lore.kernel.org/lkml/153365347929.19074.12509495712735843805.stgit@localhost.localdomain/
> [5]. https://lore.kernel.org/lkml/20210927074823.5825-1-sultan@xxxxxxxxxxxxxxx/
>
> Signed-off-by: Qi Zheng <zhengqi.arch@xxxxxxxxxxxxx>

Acked-by: Vlastimil Babka <vbabka@xxxxxxx>