Re: [PATCH 0/2] don't use mapcount() to check large folio sharing

From: David Hildenbrand
Date: Wed Aug 02 2023 - 08:46:05 EST


On 02.08.23 14:40, Ryan Roberts wrote:
On 02/08/2023 13:35, Yin, Fengwei wrote:


On 8/2/2023 6:27 PM, Ryan Roberts wrote:
On 28/07/2023 17:13, Yin Fengwei wrote:
In madvise_cold_or_pageout_pte_range() and madvise_free_pte_range(),
folio_mapcount() is used to check whether the folio is shared. But it's
not correct, as folio_mapcount() returns the total mapcount of a large folio.

Use folio_estimated_sharers() here as the estimated number is enough.

Yin Fengwei (2):
madvise: don't use mapcount() against large folio for sharing check
madvise: don't use mapcount() against large folio for sharing check

mm/huge_memory.c | 2 +-
mm/madvise.c | 6 +++---
2 files changed, 4 insertions(+), 4 deletions(-)


As a set of fixes, I agree this is definitely an improvement, so:

Reviewed-By: Ryan Roberts
Thanks.



But I have a couple of comments around further improvements:

Once we have the scheme that David is working on to be able to provide precise
exclusive vs shared info, we will probably want to move to that. Although that
scheme will need access to the mm_struct of a process known to be mapping the
folio. We have that info, but it's not passed to folio_estimated_sharers() so we
can't just reimplement folio_estimated_sharers() - we will need to rework these
call sites again.
Yes. This could be extra work. Maybe we should delay until David's work is done.

What you have is definitely an improvement over what was there before. And is
probably the best we can do without David's scheme. So I wouldn't delay this.
Just pointing out that we will be able to make it even better later on (if
David's stuff goes in).

Agreed, we should just be careful to clearly spell out the implications, and that this is ultimately not 100% what we want either.

That MADV_PAGEOUT now fails on a PTE-mapped THP -- as can be seen when executing the cow selftest where MADV_PAGEOUT will essentially fail -- is certainly undesired and should be fixed IMHO.
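
To make the difference between the two checks concrete, here is a small userspace toy model. It is only a sketch, not kernel code: struct toy_folio and all toy_* helpers are hypothetical names invented for illustration, and the 512-subpage size assumes a 2 MiB THP with 4 KiB base pages. It models the total mapcount as the sum of the per-subpage mapcounts and the estimate as the mapcount of the first subpage only, which is my understanding of what folio_estimated_sharers() does.

```c
#include <stdbool.h>

#define TOY_NR_PAGES 512	/* assumption: 2 MiB THP, 4 KiB base pages */

/* Hypothetical stand-in for a large folio: one mapcount per subpage. */
struct toy_folio {
	int nr_pages;
	int page_mapcount[TOY_NR_PAGES];
};

/* Models the total mapcount: the sum over all subpages. */
static int toy_total_mapcount(const struct toy_folio *f)
{
	int sum = 0;

	for (int i = 0; i < f->nr_pages; i++)
		sum += f->page_mapcount[i];
	return sum;
}

/* Models the estimate: only the first subpage is looked at. */
static int toy_estimated_sharers(const struct toy_folio *f)
{
	return f->page_mapcount[0];
}

/* One process maps every subpage through its own PTE. */
static void toy_map_by_pte(struct toy_folio *f)
{
	for (int i = 0; i < f->nr_pages; i++)
		f->page_mapcount[i]++;
}

/* Old-style check: anything with a total mapcount != 1 looks shared. */
static bool toy_old_check_says_shared(const struct toy_folio *f)
{
	return toy_total_mapcount(f) != 1;
}

/* New-style check: shared only if the estimate is above one. */
static bool toy_new_check_says_shared(const struct toy_folio *f)
{
	return toy_estimated_sharers(f) > 1;
}
```

In this model, a folio that a single process has PTE-mapped ends up with a total mapcount of 512, so the old-style check treats it as shared even though it is exclusive, while the estimate stays at 1. After a second toy_map_by_pte() call (a second process mapping the same folio), both checks report it as shared. The estimate is still only an estimate: a folio whose first subpage is mapped once but whose other subpages are mapped by other processes would be misjudged.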

--
Cheers,

David / dhildenb