Re: [RFC][PATCH 1/2] x86: Allow breakpoints to emulate call functions

From: Linus Torvalds
Date: Mon May 06 2019 - 21:12:50 EST

On Mon, May 6, 2019 at 5:10 PM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> But the CPU that was rewriting instructions does a run_sync() after
> removing the int3:
> static void run_sync(void)
> {
> int enable_irqs;
> /* No need to sync if there's only one CPU */
> if (num_online_cpus() == 1)
> return;
> enable_irqs = irqs_disabled();
> /* We may be called with interrupts disabled (on bootup). */
> if (enable_irqs)
> local_irq_enable();
> on_each_cpu(do_sync_core, NULL, 1);
> if (enable_irqs)
> local_irq_disable();
> }
> Which sends an IPI to all CPUs to make sure they no longer see the int3.

Duh. I have been looking back and forth in that file, and I was mixing
ftrace_modify_code_direct() (which only does a local sync) with
ftrace_modify_code() (which does run_sync()). The dangers of moving
around by searching for function names.

That file is a maze of several functions that are very similarly named
and do slightly different things.

But yes, I was looking at the "direct" sequence.

> I think you are missing the run_sync() which is the heavy hammer to
> make sure all CPUs are in sync. And this is done at each stage:
> add int3
> run_sync();
> update call cite outside of int3
> run_sync()
> remove int3
> run_sync()
> HPA said that the last run_sync() isn't needed, but I kept it because I
> wanted to make sure. Looks like your analysis shows that it is needed.

Absolutely. I think we could get rid of it, but yes, to then avoid the
race we'd need to be a lot more clever.

Yeah, with the three run_sunc() things, the races I thought it had can't happen.