Re: [PATCH 1/5] mm/vmscan: put the redirtied MADV_FREE pages back to anonymous LRU list

From: John Hubbard
Date: Wed Jul 14 2021 - 15:48:34 EST


On 7/14/21 4:48 AM, Matthew Wilcox wrote:
On Wed, Jul 14, 2021 at 07:36:57PM +0800, Miaohe Lin wrote:
On 2021/7/13 21:34, Matthew Wilcox wrote:
On Tue, Jul 13, 2021 at 09:13:51PM +0800, Miaohe Lin wrote:
When the MADV_FREE pages are redirtied before they could be reclaimed, the pages
should be put back to anonymous LRU list by setting SwapBacked flag, thus the
pages will be reclaimed in normal swapout way.

Agreed. But the question is why this needs an explicit handling here
when we already do handle this case when trying to unmap the page.

This makes me think more. It seems even the page_ref_freeze call is guaranteed to
success as no one can grab the page refcnt after the page is successfully unmapped.

NO! This is wrong. Every page can have its refcount speculatively raised
(and then lowered). The two prime candidates for this are lockless GUP
and page cache lookups, but there can be others too.


Many thanks for pointing this out. My overlook! Sorry!
So, it seems lockless GUP can redirty the MADV_FREE page. But is it ok to just release
a redirtied MADV_FREE pages? Because we hold the last reference here and the page will
be freed anyway...

I don't see how lockless GUP can redirty the page. It can grab the
refcount, thus making the refcount here two. Then the call to freeze
here fails and the page stays on the list. But the lockless GUP checks
the page is still in the page table (and discovers it isn't, so releases
the reference count). Am I missing a path that lets lockless GUP dirty
the page?


If a device driver pins some pages using gup, and the device then uses dma
to write to those pages, then you could get there. That story is part of the
reasoning that led to creating pin_user_pages(), which btw does not yet
fully solve that case.

Basically, though, unless a non-CPU device has access to the page, it's
hard to see how gup itself can lead to a page getting dirtied.

thanks,
--
John Hubbard
NVIDIA