Re: [PATCH 2/8] bpf: Recover arena kernel faults with scratch page
From: Tejun Heo
Date: Sun May 31 2026 - 13:48:18 EST
Hello,
I posted the check removal [1], and Sashiko's review flagged a
break-before-make problem with it [2] that I think is real.
The scratch page is a present PAGE_KERNEL mapping, so having
apply_range_set_cb() overwrite it via set_pte_at() during
bpf_arena_alloc_pages() is a valid->valid PFN change. I'm not familiar with
arm at all. David, my understanding is that's a break-before-make violation
on arm64, and that on any arch the stale TLB entry keeps resolving to the
shared scratch page until it's flushed, so a later access can hit scratch
instead of the new page. Is that what you were worried about?
So instead of just dropping the check, the install should route through an
invalid entry rather than overwrite in place:
	while (!ptep_try_set(pte, mk_pte(page, PAGE_KERNEL))) {
		old = ptep_get(pte);
		if (pte_none(old))
			continue;
		if (WARN_ON_ONCE(pte_page(old) != arena->scratch_page))
			return -EBUSY;
		ptep_get_and_clear(&init_mm, addr, pte);
		broke_scratch = true;
	}
ptep_try_set() only fills a none slot, so the slot goes scratch->none->page
and never valid->valid, and the loop copes with a concurrent fault
re-scratching it. This also closes the set_pte_at()-vs-ptep_try_set() race
I raised earlier, since both sides now use cmpxchg. A scratch entry that
gets broken was live and may still be cached in the TLB, so the caller
calls flush_tlb_kernel_range() on those pages when broke_scratch is set,
like arena_free_pages() already does after clearing.
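To illustrate the scratch->none->page sequence outside the kernel, here is a minimal userspace sketch using C11 atomics. It is only a model under stated assumptions: model_pte_t, model_try_set() and model_install() are hypothetical names standing in for the PTE slot, ptep_try_set() and the install loop above. There is no TLB here, so broke_scratch only marks where the caller would need to flush.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a PTE slot: 0 is "none", anything else is valid. */
typedef _Atomic uintptr_t model_pte_t;

#define MODEL_PTE_NONE ((uintptr_t)0)

/* Models ptep_try_set(): fills the slot only if it is currently none. */
static bool model_try_set(model_pte_t *slot, uintptr_t new_entry)
{
	uintptr_t expected = MODEL_PTE_NONE;

	return atomic_compare_exchange_strong(slot, &expected, new_entry);
}

/*
 * Models the install loop: the slot only ever goes scratch -> none -> new,
 * never directly from one valid entry to another.
 * Returns 0 on success, -EBUSY if a non-scratch entry is already present.
 */
static int model_install(model_pte_t *slot, uintptr_t scratch,
			 uintptr_t new_entry, bool *broke_scratch)
{
	*broke_scratch = false;

	while (!model_try_set(slot, new_entry)) {
		uintptr_t old = atomic_load(slot);

		if (old == MODEL_PTE_NONE)
			continue;	/* raced with a clear, retry the set */
		if (old != scratch)
			return -EBUSY;	/* real mapping present, refuse */

		/* Break step: replace the scratch entry with none. */
		atomic_exchange(slot, MODEL_PTE_NONE);
		*broke_scratch = true;	/* a real caller would flush here */
	}
	return 0;
}
```

Calling model_install() on a slot holding the scratch value ends with the new entry installed and broke_scratch set. On a none slot it installs without setting the flag, and on a slot holding any other entry it returns -EBUSY and leaves the slot untouched.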
[1] https://lore.kernel.org/r/20260531165852.555930-1-tj@xxxxxxxxxx
[2] https://lore.kernel.org/r/20260531170854.31EA51F00893@xxxxxxxxxxxxxxx
Thanks.
--
tejun