Re: [PATCH 2/8] bpf: Recover arena kernel faults with scratch page

From: Tejun Heo

Date: Fri May 29 2026 - 15:04:39 EST


Hello,

On Fri, May 29, 2026 at 11:38:21AM -0700, Alexei Starovoitov wrote:
...
> > 1. The racing write. set_pte_at() and the scratch installer's
> > ptep_try_set() hit the same PTE with no common lock. On x86-64 and arm64
> > set_pte_at() is a single atomic store, so it can't tear against the
> > cmpxchg, but a plain store racing a cmpxchg isn't atomic in general.
> > David, is that the worry - an arch where set_pte_at() is split and could
> > tear - or something else?
> >
> > 2. The SEGV. It's a BPF program failure propagating out as a SEGV. Maybe
> > not ideal, but as long as we surface the BPF error properly, it doesn't
> > necessarily seem broken to me.
>
> returning EBUSY because apply_range_set_cb() hit scratch page
> is SEGV out of arena_vm_fault() and arguably ok-ish,
> but bpf_arena_alloc_pages() returning NULL because scratch page
> was in the range just sucks.
> Earlier bpf prog passed the wrong arena addr to kfunc and triggered
> that scratch page. It broke the contract and kept the pieces,
> so ok-ish too, but overwriting scratch page with proper page
> during bpf_arena_alloc_pages() is imo much better behavior.
> That scratch page will cause all future bpf_arena_alloc_pages() to fail as well.
> Hence I prefer that check removed.

Yeah, let's do that. David, would that be enough? Or are you still concerned
about set_pte_at() competing with ptep_try_set()?

Thanks.

--
tejun