Re: [PATCH] x86/alternatives: Add cond_resched() to text_poke_bp_batch()

From: Masami Hiramatsu (Google)
Date: Sun May 28 2023 - 22:52:57 EST


On Sun, 28 May 2023 08:46:52 -0400
Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:

> From: "Steven Rostedt (Google)" <rostedt@xxxxxxxxxxx>
>
> Debugging in the kernel has started slowing down the kernel by a
> noticeable amount. The ftrace start up tests are triggering the softlockup
> watchdog on some boxes. This is caused by the start up tests that enable
> function and function graph tracing several times. Sprinkling
> cond_resched() just in the start up test code was not enough to stop the
> softlockup from triggering. It would sometimes trigger in the
> text_poke_bp_batch() code.
>
> The text_poke_bp_batch() function runs in a schedulable context. Add
> cond_resched() between each phase (adding the int3, updating the code, and
> removing the int3). This keeps the softlockup from triggering in the start
> up tests.
>
> Signed-off-by: Steven Rostedt (Google) <rostedt@xxxxxxxxxxx>
> ---
> arch/x86/kernel/alternative.c | 13 ++++++++++++-
> 1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
> index f615e0cb6d93..e024eddd457f 100644
> --- a/arch/x86/kernel/alternative.c
> +++ b/arch/x86/kernel/alternative.c
> @@ -1953,6 +1953,14 @@ static void text_poke_bp_batch(struct text_poke_loc *tp, unsigned int nr_entries
> */
> atomic_set_release(&bp_desc.refs, 1);
>
> + /*
> + * Function tracing can enable thousands of places that need to be
> + * updated. This can take quite some time, and with full kernel debugging
> + * enabled, this could cause the softlockup watchdog to trigger.
> + * Add cond_resched() calls to each phase.
> + */
> + cond_resched();

Hmm, why don't you put this between the first step (putting the int3) and
the second step (putting the other bytes)? I guess those would take more time.
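
For illustration, a rough sketch of that placement (simplified from the
current layout of text_poke_bp_batch(), with the second-step loop body
elided):

	/* First step: add an int3 trap at each address to be patched. */
	for (i = 0; i < nr_entries; i++) {
		tp[i].old = *(u8 *)text_poke_addr(&tp[i]);
		text_poke(text_poke_addr(&tp[i]), &int3, INT3_INSN_SIZE);
	}

	text_poke_sync();

	/* Suggested: reschedule here, before the longer second step. */
	cond_resched();

	/* Second step: update all but the first byte of the patched range. */
	for (do_sync = 0, i = 0; i < nr_entries; i++) {
		...
	}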

Thank you,

> +
> /*
> * Corresponding read barrier in int3 notifier for making sure the
> * nr_entries and handler are correctly ordered wrt. patching.
> @@ -2030,6 +2038,7 @@ static void text_poke_bp_batch(struct text_poke_loc *tp, unsigned int nr_entries
> * better safe than sorry (plus there's not only Intel).
> */
> text_poke_sync();
> + cond_resched();
> }
>
> /*
> @@ -2049,8 +2058,10 @@ static void text_poke_bp_batch(struct text_poke_loc *tp, unsigned int nr_entries
> do_sync++;
> }
>
> - if (do_sync)
> + if (do_sync) {
> text_poke_sync();
> + cond_resched();
> + }
>
> /*
> * Remove and wait for refs to be zero.
> --
> 2.39.2
>


--
Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx>