Re: [PATCH] x86/alternatives: remove false sharing in poke_int3_handler()

From: Peter Zijlstra
Date: Tue Mar 25 2025 - 06:32:35 EST


On Tue, Mar 25, 2025 at 09:41:10AM +0100, Ingo Molnar wrote:
>
> * Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> > On Mon, Mar 24, 2025 at 08:53:31AM +0100, Eric Dumazet wrote:
> >
> > > BTW the atomic_cond_read_acquire() part is never called even during my
> > > stress test.
> >
> > Yes, IIRC this is due to text_poke_sync() serializing the state, as that
> > does a synchronous IPI broadcast, which by necessity requires all
> > previous INT3 handlers to complete.
> >
> > You can only hit that case if the INT3 remains after step-3 (IOW you're
> > actively writing INT3 into the text). This is exceedingly rare.
>
> Might make sense to add a comment for that.

Sure, find below.

> Also, any strong objections against doing this in the namespace:
>
> s/bp_/int3_
>
> ?
>
> Half of the code already calls it a variant of 'int3', half of it 'bp',
> which I had to think for a couple of seconds goes for breakpoint, not
> base pointer ... ;-)

It actually is breakpoint, as in INT3 raises #BP. For complete confusion,
the things commonly known as debug breakpoints (the ones set up through
DR7) raise #DB, the debug exception.

> Might as well standardize on int3_ and call it a day?

Yeah, perhaps. At some point you've got to know that INT3->#BP and
DR7->#DB and it all sorta makes sense, but *shrug* :-)
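
For anyone keeping score, the two exceptions sit on different vectors. A
minimal sketch of the mapping follows; the enum names follow the kernel's
X86_TRAP_* naming, while trap_name() is a hypothetical helper made up for
this illustration and not something in the tree:

```c
/* Architectural x86 exception vectors for the two cases discussed here. */
enum x86_trap {
	X86_TRAP_DB = 1,	/* #DB: debug exception (DR7-enabled breakpoints, single-step) */
	X86_TRAP_BP = 3,	/* #BP: breakpoint exception (INT3, opcode 0xCC) */
};

/* Hypothetical helper: map a vector to its mnemonic. */
static const char *trap_name(enum x86_trap vector)
{
	switch (vector) {
	case X86_TRAP_DB:
		return "#DB";
	case X86_TRAP_BP:
		return "#BP";
	}
	return "?";
}
```

So anything named bp_ in alternative.c is about vector 3, never vector 1.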


---
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index bf82c6f7d690..01e94603e767 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2749,6 +2749,13 @@ static void text_poke_bp_batch(struct text_poke_loc *tp, unsigned int nr_entries

 	/*
 	 * Remove and wait for refs to be zero.
+	 *
+	 * Notably, if after step-3 above the INT3 got removed, then the
+	 * text_poke_sync() will have serialized against any running INT3
+	 * handlers and the below spin-wait will not happen.
+	 *
+	 * IOW, unless the replacement instruction is INT3, this case goes
+	 * unused.
 	 */
 	if (!atomic_dec_and_test(&bp_desc.refs))
 		atomic_cond_read_acquire(&bp_desc.refs, !VAL);
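
To illustrate the shape of that final drop, here is a rough userspace model
using C11 atomics. It is a sketch under assumptions, not the kernel code:
all function names are hypothetical, `refs` merely stands in for
bp_desc.refs, the memory orderings are approximations of the kernel
primitives, and there is no equivalent of the text_poke_sync() IPI in it.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for bp_desc.refs; zero means no update in flight. */
static atomic_int refs;

/* Updater: publish the descriptor by taking the initial reference. */
static void updater_begin(void)
{
	atomic_store_explicit(&refs, 1, memory_order_release);
}

/* Handler entry: take a reference unless the count is already zero. */
static bool handler_try_get(void)
{
	int old = atomic_load_explicit(&refs, memory_order_relaxed);

	while (old != 0) {
		/* On failure 'old' is reloaded and the zero check is repeated. */
		if (atomic_compare_exchange_weak_explicit(&refs, &old, old + 1,
							  memory_order_acquire,
							  memory_order_relaxed))
			return true;
	}
	return false;
}

/* Handler exit: drop the reference taken by handler_try_get(). */
static void handler_put(void)
{
	atomic_fetch_sub_explicit(&refs, 1, memory_order_release);
}

/*
 * Updater: drop the initial reference. Returns true only if a handler
 * still held a reference and the updater had to spin for it.
 */
static bool updater_end(void)
{
	if (atomic_fetch_sub_explicit(&refs, 1, memory_order_acq_rel) == 1)
		return false;	/* count reached zero: the common case */

	while (atomic_load_explicit(&refs, memory_order_acquire) != 0)
		;		/* rare case: wait for the last handler */
	return true;
}

/* Single-threaded walk through the common case; true means no spin. */
static bool fast_path_demo(void)
{
	updater_begin();
	if (!handler_try_get())
		return false;
	handler_put();		/* handler finishes before the final drop */
	return !updater_end();
}
```

In the model, the spin in updater_end() is reached only when a handler is
still between handler_try_get() and handler_put() at the final drop. In the
kernel that window is normally closed by text_poke_sync() having already
waited for running handlers, which matches Eric's observation that the
atomic_cond_read_acquire() path never triggered under his stress test.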