Re: [PATCH 1/3] mm: Move arch_do_swap_page() call to before swap_free()

From: David Hildenbrand
Date: Wed May 17 2023 - 04:37:17 EST


On 16.05.23 04:35, Peter Collingbourne wrote:
On Mon, May 15, 2023 at 05:16:09PM -0700, Peter Collingbourne wrote:
On Sat, May 13, 2023 at 05:29:53AM +0200, David Hildenbrand wrote:
On 13.05.23 01:57, Peter Collingbourne wrote:
Commit c145e0b47c77 ("mm: streamline COW logic in do_swap_page()") moved
the call to swap_free() before the call to set_pte_at(), which meant that
the MTE tags could end up being freed before set_pte_at() had a chance
to restore them. One other possibility was to hook arch_do_swap_page(),
but this had a number of problems:

- The call to the hook was also after swap_free().

- The call to the hook was after the call to set_pte_at(), so there was a
racy window where uninitialized metadata may be exposed to userspace.
This likely also affects SPARC ADI, which implements this hook to
restore tags.

- As a result of commit 1eba86c096e3 ("mm: change page type prior to
adding page table entry"), we were also passing the new PTE as the
oldpte argument, preventing the hook from knowing the swap index.

Fix all of these problems by moving the arch_do_swap_page() call before
the call to free_page(), and ensuring that we do not set orig_pte until
after the call.

Signed-off-by: Peter Collingbourne <pcc@xxxxxxxxxx>
Suggested-by: Catalin Marinas <catalin.marinas@xxxxxxx>
Link: https://linux-review.googlesource.com/id/I6470efa669e8bd2f841049b8c61020c510678965
Cc: <stable@xxxxxxxxxxxxxxx> # 6.1
Fixes: ca827d55ebaa ("mm, swap: Add infrastructure for saving page metadata on swap")
Fixes: 1eba86c096e3 ("mm: change page type prior to adding page table entry")

I'm confused. You say c145e0b47c77 changed something (which was after above
commits), indicate that it fixes two other commits, and indicate "6.1" as
stable which does not apply to any of these commits.

Sorry, the situation is indeed a bit confusing.

- In order to make the arch_do_swap_page() hook suitable for fixing the
bug introduced by c145e0b47c77, patch 1 addresses a number of issues,
including fixing bugs introduced by ca827d55ebaa and 1eba86c096e3,
but we haven't fixed the c145e0b47c77 bug yet, so there's no Fixes:
tag for it yet.

- Patch 2, relying on the fixes in patch 1, makes MTE install an
arch_do_swap_page() hook (indirectly, by making arch_swap_restore()
also hook arch_do_swap_page()), thereby fixing the c145e0b47c77 bug.

- 6.1 is the first stable version in which all 3 commits in my Fixes: tags
are present, so that is the version that I've indicated in my stable
tag for this series. In theory patch 1 could be applied to older kernel
versions, but it wouldn't fix any problems that we are facing with MTE
(because it only fixes problems relating to the arch_do_swap_page()
hook, which older kernel versions don't hook with MTE), and there are
some merge conflicts if we go back further anyway. If the SPARC folks
(the previous only user of this hook) want to fix these issues with ADI,
they can propose their own backport.

@@ -3959,7 +3960,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
VM_BUG_ON(!folio_test_anon(folio) ||
(pte_write(pte) && !PageAnonExclusive(page)));
set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte);
- arch_do_swap_page(vma->vm_mm, vma, vmf->address, pte, vmf->orig_pte);
folio_unlock(folio);
if (folio != swapcache && swapcache) {


You are moving the folio_free_swap() call after the folio_ref_count(folio)
== 1 check, which means that such (previously) swapped pages that are
exclusive cannot be detected as exclusive.

Ack. I will fix this in v2.

I gave this some thought and concluded that the added complexity needed
to make this hook suitable for arm64 without breaking sparc probably
isn't worth it in the end, and as I explained in patch 2, sparc ought
to be moving away from this hook anyway. So in v2 I replaced patches 1
and 2 with a patch that adds a direct call to the arch_swap_restore()
hook before the call to swap_free().

As a side note, I recall that sparc code might be a bit fragile and eventually broken already (arch_unmap_one()):

https://lkml.kernel.org/r/d98bd1f9-e9b7-049c-7bde-3348b074eb18@xxxxxxxxxx

--
Thanks,

David / dhildenb