Re: [PATCH v2 2/3] mm: rmap: Move the cache flushing to the correct place for hugetlb PMD sharing

From: Baolin Wang
Date: Thu Apr 28 2022 - 03:06:30 EST




On 4/28/2022 1:55 PM, Muchun Song wrote:
On Wed, Apr 27, 2022 at 06:52:06PM +0800, Baolin Wang wrote:
The cache level flush will always be first when changing an existing
virtual->physical mapping to a new value, since this allows us to
properly handle systems whose caches are strict and require a
virtual->physical translation to exist for a virtual address. So we
should move the cache flushing before huge_pmd_unshare().
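
That is, the required ordering is (a simplified sketch of the pattern,
not the exact rmap code):

	/*
	 * Flush caches first, while the virtual->physical translation
	 * still exists, then change the mapping, then flush any stale
	 * TLB entries.
	 */
	flush_cache_range(vma, range.start, range.end);
	if (huge_pmd_unshare(mm, vma, &address, pvmw.pte)) {
		flush_tlb_range(vma, range.start, range.end);
		...
	}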


Right.

As Muchun pointed out[1], the architectures that support hugetlb PMD
sharing currently have no cache flush issues in practice. But I think
we should still follow the cache/TLB flushing rules when changing a
valid virtual address mapping, in case of potential issues in the
future.

Right. One point I need to clarify: I do not object to this change,
but I want you to clarify this (that it is not an issue in practice)
in the commit log, to let others know they do not need to backport
this.


[1] https://lore.kernel.org/all/YmT%2F%2FhuUbFX+KHcy@xxxxxxxxxxxxxxxxxxxxx/
Signed-off-by: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
---
mm/rmap.c | 40 ++++++++++++++++++++++------------------
1 file changed, 22 insertions(+), 18 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 61e63db..4f0d115 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1535,15 +1535,16 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
* do this outside rmap routines.
*/
VM_BUG_ON(!(flags & TTU_RMAP_LOCKED));
+ /*
+ * huge_pmd_unshare may unmap an entire PMD page.
+ * There is no way of knowing exactly which PMDs may
+ * be cached for this mm, so we must flush them all.
+ * start/end were already adjusted above to cover this
+ * range.
+ */
+ flush_cache_range(vma, range.start, range.end);
+

flush_cache_range() is always called even if we do not need to flush.

Right, this is intended. In the original code, if it is not a shared
PMD, we use flush_cache_page() to do the cache flushing. However,
flush_cache_page() cannot cover the whole size of a hugetlb page on
some architectures, which is fixed by patch 3.
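
To illustrate the difference (a rough sketch only, using the generic
cache flushing helpers; flush_cache_page() covers a single PAGE_SIZE
page, while the hugetlb mapping spans a larger range):

	/* What the original code flushed: one PAGE_SIZE page. */
	flush_cache_page(vma, address, pte_pfn(*pvmw.pte));

	/* What a hugetlb page actually needs: the whole range. */
	flush_cache_range(vma, range.start, range.end);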

How about introducing a new helper like hugetlb_pmd_shared() which
returns true for shared PMD? Then:

if (hugetlb_pmd_shared(mm, vma, pvmw.pte)) {
	flush_cache_range(vma, range.start, range.end);
	huge_pmd_unshare(mm, vma, &address, pvmw.pte);
	flush_tlb_range(vma, range.start, range.end);
}

The code could be a little simpler. Right?
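
One possible shape for such a helper (just a sketch; it assumes each
sharer of a PMD holds an extra reference on the PMD page table page,
which is what huge_pmd_unshare() checks today):

	static bool hugetlb_pmd_shared(struct mm_struct *mm,
				       struct vm_area_struct *vma,
				       pte_t *ptep)
	{
		/*
		 * Each sharer holds a reference on the PMD page table
		 * page, so a refcount above 1 means it is shared.
		 */
		return page_count(virt_to_page(ptep)) > 1;
	}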

IMHO, after patch 3 the code will be changed as below, so there seems
to be no need to separate the validation of the shared PMDs from
huge_pmd_unshare() into a new function.

if (folio_test_hugetlb(folio)) {
	flush_cache_range(vma, range.start, range.end);

	if (!folio_test_anon(folio)) {
		VM_BUG_ON(!(flags & TTU_RMAP_LOCKED));

		if (huge_pmd_unshare(mm, vma, &address, pvmw.pte)) {
			flush_tlb_range(vma, range.start, range.end);
			......
			break;
		}
	}
} else {
	......
}