Re: [GIT PULL] arm64 updates for 6.13-rc1

From: Yang Shi
Date: Wed Dec 04 2024 - 11:06:34 EST




On 12/4/24 7:50 AM, Catalin Marinas wrote:
On Wed, Dec 04, 2024 at 04:32:11PM +0100, David Hildenbrand wrote:
On 04.12.24 16:29, Catalin Marinas wrote:
On Mon, Dec 02, 2024 at 08:22:57AM -0800, Yang Shi wrote:
On 11/28/24 1:56 AM, David Hildenbrand wrote:
On 28.11.24 02:21, Yang Shi wrote:
diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c
index 87b3f1a25535..ef303a2262c5 100644
--- a/arch/arm64/mm/copypage.c
+++ b/arch/arm64/mm/copypage.c
@@ -30,9 +30,9 @@ void copy_highpage(struct page *to, struct page *from)
 	if (!system_supports_mte())
 		return;
-	if (folio_test_hugetlb(src) &&
-	    folio_test_hugetlb_mte_tagged(src)) {
-		if (!folio_try_hugetlb_mte_tagging(dst))
+	if (folio_test_hugetlb(src)) {
+		if (!folio_test_hugetlb_mte_tagged(src) ||
+		    !folio_try_hugetlb_mte_tagging(dst))
 			return;
 		/*
I wonder why we had a 'return' here originally rather than a
WARN_ON_ONCE() as we do further down for the page case. Do you see any
issue with the hunk below? Destination should be a new folio and not
tagged yet:
Yes, I did see a problem. We copy tags for all subpages and then set the
folio as MTE tagged when copying the data for the first subpage, so the
warning will be triggered when we copy the second subpage.
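
For illustration only, the sequence described above can be modelled in
user space. This is a rough sketch, not kernel code: struct model_folio
and every model_* function are hypothetical names, and the real
folio_try_hugetlb_mte_tagging() does more than test a flag. It only
shows why a warn-on-failure check placed on the "try tagging" step would
be hit on every subpage after the first one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_folio {
	size_t nr_pages;
	bool mte_tagged;	/* a single flag covering the whole folio */
};

/* Simplified "try tagging": succeeds only if the folio is not tagged yet. */
static bool model_try_tagging(const struct model_folio *f)
{
	return !f->mte_tagged;
}

/*
 * Models one per-subpage copy call. Returns true when a
 * warn-on-failure check on the "try tagging" step would have been hit.
 */
static bool model_copy_subpage(const struct model_folio *src,
			       struct model_folio *dst)
{
	if (!src->mte_tagged)
		return false;		/* nothing to copy, no warning */

	if (!model_try_tagging(dst))
		return true;		/* dst already tagged by an earlier call */

	/* Tags for all subpages would be copied here, then: */
	dst->mte_tagged = true;
	return false;
}

/* Calls the per-subpage copy once per subpage and counts the failed checks. */
static size_t model_copy_folio_count_warnings(const struct model_folio *src,
					      struct model_folio *dst)
{
	size_t warnings = 0;

	assert(src->nr_pages == dst->nr_pages);
	for (size_t i = 0; i < src->nr_pages; i++)
		if (model_copy_subpage(src, dst))
			warnings++;
	return warnings;
}
```

In this model, the first call tags the destination and every later call
for the same folio fails the check, which matches the behaviour reported
above for the second subpage.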
It's rather weird, though. We're instructed to copy a single page, yet
copy tags for all pages.

This really only makes sense when called from folio_copy(), where we are
guaranteed to copy all pages.

I'm starting to wonder if we should be able to hook into / overload
folio_copy() instead, to just handle the complete hugetlb copy ourselves
in one shot, and assume that copy_highpage() will never be called for
hugetlb pages (WARN and don't copy tags).
Actually, folio_copy() is only called by migration. Copying a huge page in
CoW is more complicated and uses copy_user_highpage()->copy_highpage()
instead of folio_copy(). It may start the page copy from any subpage. For
example, if the CoW is triggered by an access to an address in the middle
of a 2M huge page, the kernel may copy the second half first and then the
first half, so that the accessed data ends up in the cache.
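
A small user-space sketch of that point follows. It is a deliberately
simplified stand-in for the ordering, not the kernel's actual algorithm,
and the model_* names are hypothetical. It only demonstrates that, when
the faulting subpage is copied last, the first per-page copy call need
not be for subpage 0.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model: fill order[] (nr_pages entries) with subpage
 * indexes so that the faulting subpage is copied last. Pages after the
 * fault are visited first (from the end towards the fault), then the
 * pages before it, then the faulting page itself.
 */
static void model_cow_copy_order(size_t nr_pages, size_t fault_idx,
				 size_t *order)
{
	size_t n = 0;

	assert(fault_idx < nr_pages);

	for (size_t i = nr_pages; i-- > fault_idx + 1; )
		order[n++] = i;
	for (size_t i = 0; i < fault_idx; i++)
		order[n++] = i;
	order[n++] = fault_idx;
}

/* Returns how many copy calls happen before subpage 0 is reached. */
static size_t model_calls_before_head(const size_t *order, size_t nr_pages)
{
	for (size_t i = 0; i < nr_pages; i++)
		if (order[i] == 0)
			return i;
	return nr_pages;
}
```

With 8 subpages and a fault on subpage 4, this model visits 7, 6, 5, 0,
1, 2, 3, 4, so three copy calls are made before the head page is seen.
Any logic keyed on "first call" rather than on a specific subpage has to
cope with that.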
Still trying to understand the possible call paths here. If we get a
write fault on a large folio, does the core code allocate a folio of the
same size for CoW, or does it start with smaller ones? wp_page_copy()
allocates order 0 AFAICT, though if it was a pmd fault, it takes a
different path in handle_mm_fault(). But we also have huge pages using
contiguous ptes.

Unless the source and destination folios are exactly the same size, it
will break many assumptions in the code above. The other way around is
also wrong: if dst is larger than src, we are not initialising the whole
dst folio.
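
To make the two failure cases concrete, here is a minimal user-space
sketch. The enum and function names are hypothetical and only classify
what an "all subpages of src" tag copy would do for a given pair of
folio sizes. It is not a proposal for the kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of a whole-folio tag copy. */
enum model_copy_result {
	MODEL_COPY_OK,		/* same size: every dst subpage gets tags */
	MODEL_COPY_DST_OVERRUN,	/* dst smaller: copy would run past dst */
	MODEL_COPY_DST_PARTIAL,	/* dst larger: tail of dst stays untagged */
};

/*
 * src_nr and dst_nr are subpage counts. *dst_untagged is set to the
 * number of dst subpages that would be left without tags.
 */
static enum model_copy_result model_check_sizes(size_t src_nr, size_t dst_nr,
						size_t *dst_untagged)
{
	assert(dst_untagged != NULL);
	*dst_untagged = 0;

	if (src_nr == dst_nr)
		return MODEL_COPY_OK;
	if (src_nr > dst_nr)
		return MODEL_COPY_DST_OVERRUN;

	*dst_untagged = dst_nr - src_nr;
	return MODEL_COPY_DST_PARTIAL;
}
```

Only the equal-size case is safe in this model, which is the property a
size check on the two folios would be guarding.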

Maybe going back to per-page PG_mte_tagged flag rather than per-folio
would keep things simple, less risk of wrong assumptions.
I think the magic bit here is that for hugetlb, we only get hugetlb folios
of the same size, and no mixtures.

Yes, hugetlb always allocates a folio of the same order for CoW. The hugetlb CoW path is:

handle_mm_fault() ->
  hugetlb_fault() ->
    hugetlb_wp()

Ah, ok, we do check for this and only do the advance copy for hugetlb
folios. I'd add a check for folio size just in case, something like
below (I'll add some description and post it properly):

diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c
index 87b3f1a25535..c3a83db46ec6 100644
--- a/arch/arm64/mm/copypage.c
+++ b/arch/arm64/mm/copypage.c
@@ -30,11 +30,14 @@ void copy_highpage(struct page *to, struct page *from)
 	if (!system_supports_mte())
 		return;
-	if (folio_test_hugetlb(src) &&
-	    folio_test_hugetlb_mte_tagged(src)) {
-		if (!folio_try_hugetlb_mte_tagging(dst))
+	if (folio_test_hugetlb(src)) {
+		if (!folio_test_hugetlb_mte_tagged(src) ||
+		    from != folio_page(src, 0) ||
+		    WARN_ON_ONCE(folio_nr_pages(src) != folio_nr_pages(dst)))

The check is OK, but TBH I don't see much benefit. The same order is guaranteed by the hugetlb fault handler, and I don't think we will support mixed orders for hugetlb in the foreseeable future.

 			return;
+		WARN_ON_ONCE(!folio_try_hugetlb_mte_tagging(dst));
+
 		/*
 		 * Populate tags for all subpages.
 		 *