Re: [PATCH] mm: vmscan: fix not scanning anonymous pages when detecting file refaults

From: Minchan Kim
Date: Fri Jun 28 2019 - 19:34:21 EST


On Fri, Jun 28, 2019 at 10:22:52AM -0400, Johannes Weiner wrote:
> Hi Minchan,
>
> On Fri, Jun 28, 2019 at 03:51:38PM +0900, Minchan Kim wrote:
> > On Thu, Jun 27, 2019 at 02:41:23PM -0400, Johannes Weiner wrote:
> > > On Wed, Jun 19, 2019 at 04:08:35PM +0800, Kuo-Hsin Yang wrote:
> > > > Fixes: 2a2e48854d70 ("mm: vmscan: fix IO/refault regression in cache workingset transition")
> > > > Signed-off-by: Kuo-Hsin Yang <vovoy@xxxxxxxxxxxx>
> > >
> > > Acked-by: Johannes Weiner <hannes@xxxxxxxxxxx>
> > >
> > > Your change makes sense - we should indeed not force cache trimming
> > > only while the page cache is experiencing refaults.
> > >
> > > I can't say I fully understand the changelog, though. The problem of
> >
> > I guess the point of the patch is that the "actual_reclaim" parameter
> > introduced a divergence in how get_scan_count balances the file vs. anon
> > LRUs. Thus, it ends up scanning the file LRU active/inactive lists in the
> > file thrashing state.
>
> Look at the patch again. The parameter was only added to retain
> existing behavior. We *always* did file-only reclaim while thrashing -
> all the way back to the two commits I mentioned below.

Yeah, I know that we forced file reclaim if we had enough file LRU.
What confused me in the description was the "actual_reclaim" part.
Thanks for pointing that out, Johannes. I confirmed it kept the old
behavior in get_scan_count.

>
> > So, Fixes: 2a2e48854d70 ("mm: vmscan: fix IO/refault regression in cache workingset transition")
> > would make sense to me since it introduces the parameter.
>
> What is the observable behavior problem that this patch introduced?
>
> > > forcing cache trimming while there is enough page cache is older than
> > > the commit you refer to. It could be argued that this commit is
> > > incomplete - it could have added refault detection not just to
> > > inactive:active file balancing, but also the file:anon balancing; but
> > > it didn't *cause* this problem.
> > >
> > > Shouldn't this be
> > >
> > > Fixes: e9868505987a ("mm,vmscan: only evict file pages when we have plenty")
> > > Fixes: 7c5bd705d8f9 ("mm: memcg: only evict file pages when we have plenty")
> >
> > Those would be affected too, but a stable backport would be troublesome
> > since we don't have the refault machinery in there.
>
> Hm? The problematic behavior is that we force-scan file while file is
> thrashing. We can obviously only solve this in kernels that can
> actually detect thrashing.

What I meant is that I thought it's -stable material, but back in v3.8
we don't have the refault machinery.
I agree this patch fixes the two commits you mentioned above, so we should
use them.
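
For reference, the behavior discussed above can be modelled roughly as
below. This is only a simplified sketch, not the actual mm/vmscan.c code:
the function names and the honor_refaults flag are hypothetical, and the
fixed ratio of 1 stands in for the real inactive:active ratio calculation.
It is meant to show why forcing a file-only scan while the file LRU is
refaulting skips anon, and how taking refaults into account avoids that.

```c
#include <stdbool.h>

/*
 * Hypothetical, simplified model of the "inactive file list is low" check.
 * When refaults are honored and the file LRU is refaulting, the ratio is
 * dropped to 0 so the inactive list counts as low whenever active > 0.
 */
static bool model_inactive_file_is_low(unsigned long inactive,
				       unsigned long active,
				       bool refaulting, bool honor_refaults)
{
	/* Assumption: a ratio of 1 replaces the size-based kernel ratio. */
	unsigned long ratio = (honor_refaults && refaulting) ? 0 : 1;

	return inactive * ratio < active;
}

/*
 * Hypothetical model of the file-only decision: force a file-only scan if
 * the inactive file list is not low and is still non-zero after scaling
 * by the reclaim priority.
 */
static bool model_force_file_scan(unsigned long inactive,
				  unsigned long active, int priority,
				  bool refaulting, bool honor_refaults)
{
	if (model_inactive_file_is_low(inactive, active, refaulting,
				       honor_refaults))
		return false;

	return (inactive >> priority) != 0;
}
```

With inactive = 1000, active = 100 and priority = 2, the model forces a
file-only scan while refaulting when refaults are ignored, and falls back
to balancing file vs. anon once refaults are honored.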