Re: [PATCH v3] riscv: mm: Avoid spurious fault after hotplugging vmemmap

From: Vivian Wang

Date: Mon Jun 08 2026 - 10:28:43 EST

On 6/8/26 21:18, David Hildenbrand (Arm) wrote:

> On 6/8/26 14:43, David Hildenbrand (Arm) wrote:
>> On 6/8/26 04:24, Vivian Wang wrote:
>>> On 6/5/26 17:16, David Hildenbrand (Arm) wrote:
>>>> How does mark_new_valid_map() handle concurrent access?
>>>>
>>>> There is some comment in there, but I am not sure if that actually implies that
>>>> mark_new_valid_map() can be called concurrently from arbitrary context?
>>> AFAICT, yes, it is intended to be used concurrently from arbitrary
>>> context, and this is why I used them.
>>>
>>> I had not originally written the code, so some riscv maintainer can
>>> maybe chime in on this, but as I understand it the idea is that we only
>>> ever set bits in the bitmap in mark_new_valid_map() while "allocating",
>>> whereas the assembly code in handle_exception in
>>> arch/riscv/kernel/entry.S only ever clears individual bits. So at least
>>> the access to the bitmap itself should be fine.
>> Right, flush_cache_vmap() looks like a bitmap_set() -- non-atomically.
>>
>> .Lnew_vmalloc_kernel_address does an atomic bitmap clearing ("Atomically reset
>> the current cpu bit in new_vmalloc").
>>
>> I am primarily wondering whether we'd want atomic bit-setting as well?
>>
>>> This was added in commit 503638e0babf ("riscv: Stop emitting preventive
>>> sfence.vma for new vmalloc mappings").
>>>
>>> Do you think the mark_new_valid_map() function needs extra
>>> synchronization like smp_wmb()?
>> Right, I wonder about atomic access, but also about memory barriers.

Oh dear, I've dug up quite the rabbit hole, haven't I...

I'm going to leave that at "I don't know" for now. Maybe someone more
confident in their memory-ordering knowledge can chime in.
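
For reference, here is a minimal userspace sketch of what "atomic set
plus atomic clear" could look like, written with C11 atomics rather than
kernel primitives. Every name in it (model_new_valid_map,
model_mark_new_valid_map, model_test_and_clear_cpu, MODEL_NR_CPUS) is
hypothetical, and the release/acquire orderings are an assumption for
illustration only. It is not the code in arch/riscv, and it does not
settle whether the real code needs any of this.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: one 64-bit word with one bit per CPU. */
#define MODEL_NR_CPUS 64

/* Hypothetical stand-in for the "new mapping" bitmap discussed above. */
static _Atomic uint64_t model_new_valid_map;

/*
 * Writer side: after installing a new valid PTE, flag every CPU.
 * The release ordering (an assumption of this sketch) keeps earlier
 * stores, such as the PTE write, ordered before the bits become visible.
 */
static void model_mark_new_valid_map(void)
{
	atomic_fetch_or_explicit(&model_new_valid_map, UINT64_MAX,
				 memory_order_release);
}

/*
 * Fault side: atomically clear this CPU's bit and report whether it
 * was set, i.e. whether the fault may have been spurious.
 */
static bool model_test_and_clear_cpu(unsigned int cpu)
{
	assert(cpu < MODEL_NR_CPUS);

	uint64_t mask = UINT64_C(1) << cpu;
	uint64_t old = atomic_fetch_and_explicit(&model_new_valid_map, ~mask,
						 memory_order_acquire);

	return (old & mask) != 0;
}
```

In this model, a CPU calling model_test_and_clear_cpu() after a
model_mark_new_valid_map() sees its bit set exactly once; a second call
returns false until the map is marked again.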

> BTW, does riscv support a better mechanism to avoid such spurious faults?
>
> E.g., on arm64 we have emit_pte_barriers() where we issue some barriers:
>
> "The isb ensures that any previous speculative "invalid translation" marker that
> is in the CPU's pipeline gets cleared, so that any access to that address after
> setting the pte to valid won't cause a spurious fault."
>
> I assume the riscv case is different, as it's about actual TLB entries, and not
> just translation markers in the pipeline?

So yeah, conceptually the "TLB" on RISC-V caches the actual pte_t words,
and RISC-V cores are permitted to cache these valid=0 PTEs indefinitely
(without the "Svvptc" extension), or only for a finite amount of time
(with the "Svvptc" extension).

As mentioned in the kfence patches [1], the SpacemiT X60 cores in the
SpacemiT K1 appear to be able to cache these valid=0 PTEs even across a
WFI, so the "indefinitely" part is probably real. That said, it could
still be pipeline-only if all the WFI does is stall the pipeline.
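
To make the caching behaviour concrete, here is a toy model in plain C
of a translation cache that holds the raw PTE word whether or not it is
valid. The struct and function names (toy_tlb, toy_access,
toy_sfence_vma) are made up for this sketch. It says nothing about how
the X60 or any other core is actually built, and it models the case
where the stale entry is never dropped on its own.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 is the V (valid) bit in the RISC-V PTE format. */
#define TOY_PTE_VALID UINT64_C(1)

/* Hypothetical single-entry "TLB" that caches the raw PTE word. */
struct toy_tlb {
	bool filled;
	uint64_t cached_pte;
};

/*
 * Access through the toy TLB. Returns true if the access succeeds and
 * false if it would raise a page fault. A cached valid=0 PTE keeps
 * faulting even after the PTE in memory has been made valid.
 */
static bool toy_access(struct toy_tlb *tlb, const uint64_t *pte_in_memory)
{
	if (!tlb->filled) {
		/* Miss: walk the "page table" and cache whatever is there. */
		tlb->cached_pte = *pte_in_memory;
		tlb->filled = true;
	}

	return (tlb->cached_pte & TOY_PTE_VALID) != 0;
}

/* Stand-in for sfence.vma: drop the entry so the next access re-walks. */
static void toy_sfence_vma(struct toy_tlb *tlb)
{
	tlb->filled = false;
}
```

In this model, an access made while the PTE is still invalid fills the
cache with valid=0; setting the PTE valid in memory afterwards still
yields a (spurious) fault until toy_sfence_vma() is called.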

Not that it would really help, since RISC-V also defines nothing like
the lightweight pipeline-clearing mechanism you mentioned.

Vivian "dramforever" Wang

[1]: https://lore.kernel.org/linux-riscv/20260303-handle-kfence-protect-spurious-fault-v2-0-f80d8354d79d@xxxxxxxxxxx/