Re: [PATCH v1 04/18] mm: track mapcount of large folios in single value

From: Yin, Fengwei
Date: Fri Apr 19 2024 - 10:07:16 EST

Next message: Yin, Fengwei: "Re: [PATCH v1 05/18] mm: improve folio_likely_mapped_shared() using the mapcount of large folios"
Previous message: Will Deacon: "Re: [PATCH 1/3] x86/cpu: Actually turn off mitigations by default for SPECULATION_MITIGATIONS=n"
In reply to: Lance Yang: "Re: [PATCH v1 04/18] mm: track mapcount of large folios in single value"

On 4/10/2024 3:22 AM, David Hildenbrand wrote:

Let's track the mapcount of large folios in a single value. The mapcount of
a large folio currently corresponds to the sum of the entire mapcount and
all page mapcounts.

This sum is what we actually want to know in folio_mapcount() and it is
also sufficient for implementing folio_mapped().

With PTE-mapped THP becoming more important and more widely used, we want
to avoid looping over all pages of a folio just to obtain the mapcount
of large folios. The comment "In the common case, avoid the loop when no
pages mapped by PTE" in folio_total_mapcount() no longer holds for
mTHP that are always mapped by PTE.

Further, we are planning on using folio_mapcount() more
frequently, and might even want to remove page mapcounts for large
folios in some kernel configs. Therefore, allow for reading the mapcount of
large folios efficiently and atomically without looping over any pages.

Maintain the mapcount also for hugetlb pages for simplicity. Use the new
mapcount to implement folio_mapcount() and folio_mapped(). Make
page_mapped() simply call folio_mapped(). We can now get rid of
folio_large_is_mapped().
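
For illustration only, here is a minimal userspace sketch of the idea
described above, not the code from the patch. The struct toy_folio, its
fields, and all toy_*() helpers are hypothetical names, and the counters start
at 0 for simplicity, which may differ from how the kernel stores them. It
contrasts summing the entire mapcount and all page mapcounts in a loop with
reading one value that is kept in sync on every map operation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_NR_PAGES 4	/* hypothetical folio size in pages */

/* Toy stand-in for a large folio; not the kernel's struct folio. */
struct toy_folio {
	atomic_int entire_mapcount;		/* mappings of the whole folio */
	atomic_int page_mapcount[TOY_NR_PAGES];	/* per-page (PTE) mappings */
	atomic_int large_mapcount;		/* sum of all of the above */
};

static void toy_folio_init(struct toy_folio *f)
{
	atomic_init(&f->entire_mapcount, 0);
	for (int i = 0; i < TOY_NR_PAGES; i++)
		atomic_init(&f->page_mapcount[i], 0);
	atomic_init(&f->large_mapcount, 0);
}

/* Loop-based total: entire mapcount plus every page mapcount. */
static int toy_total_mapcount_loop(struct toy_folio *f)
{
	int sum = atomic_load(&f->entire_mapcount);

	for (int i = 0; i < TOY_NR_PAGES; i++)
		sum += atomic_load(&f->page_mapcount[i]);
	return sum;
}

/* Single-value total: one atomic read, no loop over pages. */
static int toy_folio_mapcount(struct toy_folio *f)
{
	return atomic_load(&f->large_mapcount);
}

/* The same single value is enough to answer "is it mapped at all?". */
static bool toy_folio_mapped(struct toy_folio *f)
{
	return toy_folio_mapcount(f) >= 1;
}

/* Map one page of the folio: one extra atomic add for the single value. */
static void toy_map_page(struct toy_folio *f, int idx)
{
	assert(idx >= 0 && idx < TOY_NR_PAGES);
	atomic_fetch_add(&f->page_mapcount[idx], 1);
	atomic_fetch_add(&f->large_mapcount, 1);
}

/* Map the folio as a whole: again one extra atomic add. */
static void toy_map_entire(struct toy_folio *f)
{
	atomic_fetch_add(&f->entire_mapcount, 1);
	atomic_fetch_add(&f->large_mapcount, 1);
}
```

After any sequence of toy_map_page() and toy_map_entire() calls,
toy_folio_mapcount() and toy_total_mapcount_loop() should agree in this model,
but only the former is independent of the folio size.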

_nr_pages_mapped is now only used in rmap code and for debugging
purposes. Keep folio_nr_pages_mapped() around, but document that its use
should be limited to rmap internals and debugging purposes.

This change implies one additional atomic add/sub whenever
mapping/unmapping (parts of) a large folio.

As we now batch RMAP operations for PTE-mapped THP during fork(),
during unmap/zap, and when PTE-remapping a PMD-mapped THP, and we adjust
the large mapcount for a PTE batch only once, the added overhead in the
common case is small. Only when unmapping individual pages of a large folio
(e.g., during COW) might the overhead be bigger in comparison, and even then
it is essentially one additional atomic operation.
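
The batching argument above can be sketched in userspace as follows. This is
an assumption-laden toy model, not the kernel's rmap code: struct
toy_batch_folio, toy_map_ptes(), toy_unmap_ptes() and the large_atomic_ops
counter are hypothetical and exist only to count how many atomic operations
hit the single large mapcount.

```c
#include <assert.h>
#include <stdatomic.h>

#define TOY_BATCH_NR_PAGES 16	/* hypothetical folio size in pages */

/* Toy stand-in for a PTE-mapped large folio. */
struct toy_batch_folio {
	atomic_int page_mapcount[TOY_BATCH_NR_PAGES];
	atomic_int large_mapcount;
	int large_atomic_ops;	/* instrumentation: atomics on large_mapcount */
};

static void toy_batch_init(struct toy_batch_folio *f)
{
	for (int i = 0; i < TOY_BATCH_NR_PAGES; i++)
		atomic_init(&f->page_mapcount[i], 0);
	atomic_init(&f->large_mapcount, 0);
	f->large_atomic_ops = 0;
}

/*
 * Map a batch of nr consecutive pages starting at first. The per-page
 * counters are still touched one by one, but the large mapcount is
 * adjusted only once for the whole batch.
 */
static void toy_map_ptes(struct toy_batch_folio *f, int first, int nr)
{
	assert(first >= 0 && nr > 0 && first + nr <= TOY_BATCH_NR_PAGES);
	for (int i = first; i < first + nr; i++)
		atomic_fetch_add(&f->page_mapcount[i], 1);
	atomic_fetch_add(&f->large_mapcount, nr);
	f->large_atomic_ops++;
}

/* Unmap a batch; with nr == 1 this models unmapping an individual page. */
static void toy_unmap_ptes(struct toy_batch_folio *f, int first, int nr)
{
	assert(first >= 0 && nr > 0 && first + nr <= TOY_BATCH_NR_PAGES);
	for (int i = first; i < first + nr; i++)
		atomic_fetch_sub(&f->page_mapcount[i], 1);
	atomic_fetch_sub(&f->large_mapcount, nr);
	f->large_atomic_ops++;
}
```

In this model, mapping all 16 pages in one batch costs a single atomic on the
large mapcount, while unmapping them one page at a time would cost 16, which
is the per-page case the paragraph above refers to.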

Note that our refcount would overflow before the new mapcount would:
each mapping requires a folio reference. Extend the
documentation of folio_mapcount().

Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>

Reviewed-by: Yin Fengwei <fengwei.yin@xxxxxxxxx>
