Re: [RFC][alpha] saner vmalloc handling (was Re: [Bug report] hash_name() may cross page boundary and trigger sleep in RCU context)

From: Al Viro

Date: Sun Nov 30 2025 - 15:31:09 EST


On Sun, Nov 30, 2025 at 07:03:31PM +0000, david laight wrote:

> (Doing the updates in the page fault handler definitely sounds like a recipe
> for disaster.)

See the comments in arch/x86/mm/fault.c regarding the need to do it in
#PF on i386. Basically, you have

CPU1: does vmalloc(), needs to grab a new slot in top-level table.
Updates the page table tree for init_mm (rooted at swapper_pg_dir).
Starts propagating that change to other threads.

CPU2: does vmalloc(), which grabs another address in the range
covered by the same slot. It works with the same page table tree,
so it sees that slot already occupied, inserts a new page reference
into the page table hanging off it. No top-level changes to
propagate, so it returns the address it has grabbed to the caller.

CPU2: caller of vmalloc() dereferences the returned object.
If propagation from CPU1 has not reached the top-level page table
CPU2 is using, the top-level slot is empty and MMU of CPU2 raises
#PF.

BTW, it might be possible for two parallel allocations to grab two
areas, each requiring grabbing its own top-level slot ;-/
It certainly can happen on x86 - two vmalloc(SZ_4M) in parallel
will do just that, but with NUMA you can get it 64bit boxen
as well.

So simple generation count on root page table won't solve it either...