Re: [PATCH] x86/alternatives: Add cond_resched() to text_poke_bp_batch()

From: Peter Zijlstra
Date: Tue May 30 2023 - 08:02:11 EST


On Sun, May 28, 2023 at 08:46:52AM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (Google)" <rostedt@xxxxxxxxxxx>
>
> Kernel debugging options have started slowing down the kernel by a
> noticeable amount. The ftrace start up tests are triggering the softlockup
> watchdog on some boxes. This is caused by the start up tests that enable
> function and function graph tracing several times. Sprinkling
> cond_resched() just in the start up test code was not enough to stop the
> softlockup from triggering. It would sometimes trigger in the
> text_poke_bp_batch() code.
>
> text_poke_bp_batch() runs in schedulable context. Add
> cond_resched() between each phase (adding the int3, updating the code, and
> removing the int3). This keeps the softlockup from triggering in the start
> up tests.
>
> Signed-off-by: Steven Rostedt (Google) <rostedt@xxxxxxxxxxx>
> ---
> arch/x86/kernel/alternative.c | 13 ++++++++++++-
> 1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
> index f615e0cb6d93..e024eddd457f 100644
> --- a/arch/x86/kernel/alternative.c
> +++ b/arch/x86/kernel/alternative.c
> @@ -1953,6 +1953,14 @@ static void text_poke_bp_batch(struct text_poke_loc *tp, unsigned int nr_entries
>  	 */
>  	atomic_set_release(&bp_desc.refs, 1);
>
> +	/*
> +	 * Function tracing can enable thousands of places that need to be
> +	 * updated. This can take quite some time, and with full kernel debugging
> +	 * enabled, this could cause the softlockup watchdog to trigger.
> +	 * Add cond_resched() calls to each phase.
> +	 */
> +	cond_resched();

But but but... you can only have TP_VEC_MAX pokes queued, which is 256
on normal setups.

Please explain how this leads to problems and why you need _3_
reschedule points here.