Re: [PATCH] arm64/pageattr: Propagate return value from __change_memory_common

From: Yang Shi

Date: Tue Nov 11 2025 - 18:59:41 EST

On 11/11/25 9:52 AM, Ryan Roberts wrote:

On 11/11/2025 05:12, Dev Jain wrote:

On 11/11/25 10:38 am, Yang Shi wrote:

On 11/10/25 8:55 PM, Dev Jain wrote:

On 11/11/25 10:14 am, Yang Shi wrote:

On 11/10/25 8:37 PM, Dev Jain wrote:

On 11/11/25 9:47 am, Yang Shi wrote:

On 11/10/25 7:39 PM, Dev Jain wrote:

On 05/11/25 9:27 am, Dev Jain wrote:

On 04/11/25 6:26 pm, Will Deacon wrote:

On Tue, Nov 04, 2025 at 09:06:12AM +0530, Dev Jain wrote:

On 04/11/25 12:15 am, Yang Shi wrote:

On 11/3/25 7:16 AM, Will Deacon wrote:

On Mon, Nov 03, 2025 at 11:43:06AM +0530, Dev Jain wrote:

Post a166563e7ec3 ("arm64: mm: support large block mapping when
rodata=full"),
__change_memory_common has a real chance of failing due to split
failure.
Before that commit, this line was introduced in c55191e96caa,
still having
a chance of failing if it needs to allocate pagetable memory in
apply_to_page_range, although that has never been observed to be true.
In general, we should always propagate the return value to the caller.

Cc: stable@xxxxxxxxxxxxxxx
Fixes: c55191e96caa ("arm64: mm: apply r/o permissions of VM
areas to its linear alias as well")
Signed-off-by: Dev Jain <dev.jain@xxxxxxx>
---
Based on Linux 6.18-rc4.

   arch/arm64/mm/pageattr.c | 5 ++++-
   1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
index 5135f2d66958..b4ea86cd3a71 100644
--- a/arch/arm64/mm/pageattr.c
+++ b/arch/arm64/mm/pageattr.c
@@ -148,6 +148,7 @@ static int change_memory_common(unsigned
long addr, int numpages,
       unsigned long size = PAGE_SIZE * numpages;
       unsigned long end = start + size;
       struct vm_struct *area;
+    int ret;
       int i;
         if (!PAGE_ALIGNED(addr)) {
@@ -185,8 +186,10 @@ static int change_memory_common(unsigned
long addr, int numpages,
       if (rodata_full && (pgprot_val(set_mask) == PTE_RDONLY ||
                   pgprot_val(clear_mask) == PTE_RDONLY)) {
           for (i = 0; i < area->nr_pages; i++) {
- __change_memory_common((u64)page_address(area->pages[i]),
+            ret =
__change_memory_common((u64)page_address(area->pages[i]),
                              PAGE_SIZE, set_mask, clear_mask);
+            if (ret)
+                return ret;

Hmm, this means we can return failure half-way through the
operation. Is
that something callers are expecting to handle? If so, how can they
tell
how far we got?

IIUC the callers don't have to know whether it is half-way or not
because the callers will change the permission back (e.g. to RW) for the
whole range when freeing memory.

Yes, it is the caller's responsibility to set VM_FLUSH_RESET_PERMS flag.
Upon vfree(), it will change the direct map permissions back to RW.

Ok, but vfree() ends up using update_range_prot() to do that and if we
need to worry about that failing (as per your commit message), then
we're in trouble because the calls to set_area_direct_map() are unchecked.

In other words, this patch is either not necessary or it is incomplete.

Here is the relevant email, in the discussion between Ryan and Yang:

https://lore.kernel.org/all/fe52a1d8-5211-4962-afc8-
c3f9caf64119@xxxxxxxxxxxxxxxxxxxxxx/

We had concluded that all callers of set_memory_ro() or set_memory_rox()
(which require the
linear map perm change back to default, upon vfree() ) will call it for
the entire region (vm_struct).
So, when we do the set_direct_map_invalid_noflush, it is guaranteed that
the region has already
been split. So this call cannot fail.

https://lore.kernel.org/all/f8898c87-8f49-4ef2-86ae-
b60bcf67658c@xxxxxxxxxxxxxxxxxxxxxx/

This email notes that there is some code doing set_memory_rw() and
unnecessarily setting the VM_FLUSH_RESET_PERMS
flag, but in that case we don't care about the
set_direct_map_invalid_noflush call failing because the protections
are already RW.

Although we had also observed that all of this is fragile and depends on
the caller doing the
correct thing. The real solution should be somehow getting rid of the
BBM style invalidation.
Ryan had proposed some methods in that email thread.

One solution which I had thought of, is that, observe that we are doing
an overkill by
setting the linear map to invalid and then default, for the *entire*
region. What we
can do is iterate over the linear map alias of the vm_struct *area and
only change permission
back to RW for the pages which are *not* RW. And, those relevant
mappings are guaranteed to
be split because they were changed from RW to not RW.

@Yang and Ryan,

I saw Yang's patch here:
https://lore.kernel.org/all/20251023204428.477531-1-
yang@xxxxxxxxxxxxxxxxxxxxxx/
and realized that currently we are splitting away the linear map alias of
the *entire* region.

Shouldn't this then imply that set_direct_map_invalid_noflush will never
fail, since even

a set_memory_rox() call on a single page will split the linear map for
the entire region,

and thus there is no fragility here which we were discussing about? I may
be forgetting

something, this linear map stuff is confusing enough already.

It still may fail due to page table allocation failure when doing split.
But it is still fine. We may run into 3 cases:

1. set_memory_rox succeed to split the whole range, then
set_direct_map_invalid_noflush() will succeed too
2. set_memory_rox fails to split, for example, just change partial range
permission due to page table allocation failure, then
set_direct_map_invalid_noflush() may
a. successfully change the permission back to default till where
set_memory_rox fails at since that range has been successfully split. It
is ok since the remaining range is actually not changed to ro by
set_memory_rox at all
b. successfully change the permission back to default for the whole
range (for example, memory pressure is mitigated when
set_direct_map_invalid_noflush() is called). It is definitely fine as well

Correct, what I mean to imply here is that, your patch will break this? If
set_memory_* is applied on x till y, your patch changes the linear map alias

only from x till y - set_direct_map_invalid_noflush instead operates on 0
till size - 1, where 0 <=x <=y <= size - 1. So, it may encounter a -ENOMEM

on [0, x) range while invalidating, and that is *not* okay because we must
reset back [0, x) to default?

I see your point now. But I think the callers need to guarantee they call
set_memory_rox and set_direct_map_invalid_noflush on the same range, right?
Currently kernel just calls them on the whole area.

Nope. The fact that the kernel changes protections, and undoes the changed
protections, on the *entire* alias of the vm_struct region, protects us from
the fragility we were talking about earlier.

This is what I meant "kernel just calls them on the whole area".

Suppose you have a range from 0 till size - 1, and you call set_memory_* on a
random point (page) p. The argument we discussed above is independent of p,
which lets us drop our

previous erroneous conclusion that all of this works because no caller does a
partial set_memory_*.

Sorry I don't follow you. What "erroneous conclusion" do you mean? You can
call set_memory_* on a random point, but set_direct_map_invalid_noflush()
should be called on the random point too. The current code of
set_area_direct_map() doesn't consider this case because there is no such
call. Is this what you meant?

I was referring to the discussion in the linear map work - I think we had
concluded that we don't need to worry about the BBM style invalidation failing,
*because* no one does a partial set_memory_*.

What I am saying - we don't care whether caller does a partial or a full
set_memory_*, we are still safe, because the linear map alias change on both
sides (set_memory_* -> __change_memory_common, and vm_reset_perms ->
set_area_direct_map() )

operate on the entire region.

I'm thoughoughly confused again. I thought we had concluded this was all safe
when discussing in the context of the "block mapping the linear map" series. But
now I'm a bit unclear on whether we have a bug. I think I'm hearing that we
don't need this patch and Dev will submit an alternative which just adds some
comments to explain why this is safe?

IIUC, it is still all safe. I think Dev's patch is right. We should check the return value of __change_memory_common() because page table split may fail. I had the return value check in my old patches which called page table split function outside __change_memory_common().

W/o this patch, the callers of set_memory_*() may continue to run with some memory in wrong permissions, for example, RW memory but they are supposed to be RO, instead of jumping to error handling path.

Thanks,
Yang

Thanks,
Ryan

I would like to send a patch clearly documenting this behaviour, assuming no
one else finds a hole in this reasoning.

Proper comment to explain the subtle behavior is definitely welcome.

Thanks,
Yang

Thanks,
Yang

Hopefully I don't miss anything.

Thanks,
Yang

Will