Re: [PATCH v4] arm64: Enable permission change on arm64 kernel block mappings

From: Dev Jain
Date: Thu Jul 24 2025 - 06:41:28 EST



On 24/07/25 1:49 pm, Catalin Marinas wrote:
On Thu, Jul 03, 2025 at 08:44:41PM +0530, Dev Jain wrote:
This patch paves the path to enable huge mappings in vmalloc space and
linear map space by default on arm64. For this we must ensure that we can
handle any permission games on the kernel (init_mm) pagetable. Currently,
__change_memory_common() uses apply_to_page_range() which does not support
changing permissions for block mappings. We attempt to move away from this
by using the pagewalk API, similar to what riscv does right now;
RISC-V seems to do the splitting as well and then use
walk_page_range_novma().

however,
it is the responsibility of the caller to ensure that we do not pass a
range overlapping a partial block mapping or cont mapping; in such a case,
the system must be able to support range splitting.
How does the caller know what the underlying mapping is? It can't really
be its responsibility, so we must support splitting at least at the
range boundaries. If you meant the caller of the internal/static
update_range_prot(), that's an implementation detail where a code

Yes, I have added the comment in update_range_prot().

comment would suffice. But you can't require such awareness from the
callers of the public set_memory_*() API.

Yes, it converges to the architecture being aware that, since it supports
block mappings, a block mapping may be partially changed, so the internal
API must split at the start and end.


This patch is tied with Yang Shi's attempt [1] at using huge mappings
in the linear mapping in case the system supports BBML2, in which case
we will be able to split the linear mapping if needed without
break-before-make. Thus, Yang's series, IIUC, will be one such user of my
patch; suppose we are changing permissions on a range of the linear map
backed by PMD-hugepages, then the sequence of operations should look
like the following:

split_range(start)
split_range(end);
__change_memory_common(start, end);
This makes sense if that's the end goal but it's not part of this patch.

However, this patch can be used independently of Yang's; since currently
permission games are being played only on pte mappings (due to
apply_to_page_range not supporting otherwise), this patch provides the
mechanism for enabling huge mappings for various kernel mappings
like linear map and vmalloc.
Does this patch actually have any user without Yang's series?
can_set_direct_map() returns true only if the linear map uses page
granularity, so I doubt it can even be tested on its own. I'd rather see
this patch included with the actual user or maybe add it later as an
optimisation to avoid splitting the full range.

Hmmm. Makes sense. Although I don't think that this patch is an optimization;
post Yang's series (without this patch), if we change permissions on a partial
linear map block mapping, set_memory_* will return -EINVAL, causing problems.


---------------------
Implementation
---------------------

arm64 currently changes permissions on vmalloc objects locklessly, via
apply_to_page_range, whose limitation is to deny changing permissions for
block mappings. Therefore, we move away to use the generic pagewalk API,
thus paving the path for enabling huge mappings by default on kernel space
mappings, thus leading to more efficient TLB usage. However, the API
currently enforces the init_mm.mmap_lock to be held. To avoid the
unnecessary bottleneck of the mmap_lock for our usecase, this patch
extends this generic API to be used locklessly, so as to retain the
existing behaviour for changing permissions.
Is it really a significant bottleneck if we take the lock? I suspect if
we want to make this generic and allow splitting, we'll need a lock
anyway (though maybe for shorter intervals if we only split the edges).

The splitting primitive may or may not require a lock, Ryan and Yang had
some discussion on the linear map block mapping thread. I am assuming
that since we didn't need a lock in the past, there is no need to take it now,
since we are only changing the pagetable walk API being used.