Re: [PATCH] mm/vmscan: don't scan adjust too much if current is not kswapd

From: Matthew Wilcox
Date: Wed Sep 14 2022 - 19:02:28 EST


On Wed, Sep 14, 2022 at 03:51:42PM -0700, Andrew Morton wrote:
> On Wed, 14 Sep 2022 10:33:18 +0800 Hongchen Zhang <zhanghongchen@xxxxxxxxxxx> wrote:
>
> > When a process takes a page fault and there is not enough free
> > memory, it will do direct reclaim. At the same time, it is holding
> > mmap_lock. So in the multi-threaded case, it should exit from the page
> > fault ASAP.
> > When reclaiming memory, we do scan adjustment between the anon and
> > file LRUs, which may cost too much time and trigger hung task warnings
> > for other threads. So for a process which is not kswapd, it should do
> > only a little scan adjustment.
>
> Well, that's a pretty nasty bug. Before diving into a possible fix,
> can you please tell us more about how this happens? What sort of
> machine, what sort of workload. Can you suggest why others are not
> experiencing this?

One thing I'd like to know is whether the page fault is for an anonymous or
file-backed page. We already drop the mmap_lock for doing file I/O
(or we should ...) and maybe we also need to drop the mmap_lock for
doing direct reclaim?