Re: [RFC][PATCH] sched,livepatch: Untangle cond_resched() and live-patching

From: Miroslav Benes
Date: Wed Mar 26 2025 - 10:38:12 EST


Hi,

On Wed, 26 Mar 2025, Peter Zijlstra wrote:

> On Wed, Mar 26, 2025 at 10:49:10AM +0100, Petr Mladek wrote:
> > On Mon 2025-03-24 14:49:09, Peter Zijlstra wrote:
> > >
> > > With the goal of deprecating / removing VOLUNTARY preempt, live-patch
> > > needs to stop relying on cond_resched() to make forward progress.
> > >
> > > Instead, rely on schedule() with TASK_FREEZABLE set. Just like
> > > live-patching, the freezer needs to be able to stop tasks in a safe /
> > > known state.
> >
> > > Compile tested only.
> > >
> > > Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> > > ---
> > > include/linux/livepatch_sched.h | 15 +++++--------
> > > include/linux/sched.h | 6 -----
> > > kernel/livepatch/transition.c | 30 ++++++-------------------
> > > kernel/sched/core.c | 50 +++++++----------------------------------
> > > 4 files changed, 21 insertions(+), 80 deletions(-)
> > >
> > > diff --git a/include/linux/livepatch_sched.h b/include/linux/livepatch_sched.h
> > > index 013794fb5da0..7e8171226dd7 100644
> > > --- a/include/linux/livepatch_sched.h
> > > +++ b/include/linux/livepatch_sched.h
> > > @@ -3,27 +3,24 @@
> > > #define _LINUX_LIVEPATCH_SCHED_H_
> > >
> > > #include <linux/jump_label.h>
> > > -#include <linux/static_call_types.h>
> > > +#include <linux/sched.h>
> > > +
> > >
> > > #ifdef CONFIG_LIVEPATCH
> > >
> > > void __klp_sched_try_switch(void);
> > >
> > > -#if !defined(CONFIG_PREEMPT_DYNAMIC) || !defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
> > > -
> > > DECLARE_STATIC_KEY_FALSE(klp_sched_try_switch_key);
> > >
> > > -static __always_inline void klp_sched_try_switch(void)
> > > +static __always_inline void klp_sched_try_switch(struct task_struct *curr)
> > > {
> > > - if (static_branch_unlikely(&klp_sched_try_switch_key))
> > > + if (static_branch_unlikely(&klp_sched_try_switch_key) &&
> > > + READ_ONCE(curr->__state) & TASK_FREEZABLE)
> > > __klp_sched_try_switch();
> > > }
> >
> > Do we really need to check the TASK_FREEZABLE state, please?
> >
> > My understanding is that TASK_FREEZABLE is set when kernel kthreads go into
> > a "freezable" sleep, e.g. wait_event_freezable().
>
> Right.
>
> > But __klp_sched_try_switch() should be safe when the task is not
> > running and the stack is reliable. IMHO, it should be safe anytime
> > it is being scheduled out.
>
> So for the reasons you touched upon in the next paragraph, FREEZABLE
> seemed like a more suitable location.
>
> > Note that wait_event_freezable() is a good location. It is usually called in
> > the main loop of the kthread where the stack is small. So that the chance
> > that it is not running a livepatched function is higher than on
> > another random schedulable location.
>
> Right, it is the natural quiescent state of the kthread, it holds no
> resources.
>
> > But we actually wanted to have it in cond_resched() because
> > it might take a long time to reach the main loop, and sleep there.
>
> Well, cond_resched() is going to get deleted, so we need to find
> something else. And I was thinking that the suspend people want
> reasonable timeliness too -- you don't want your laptop to continue
> running for many seconds after you close the lid and stuff it in your
> bag, now do you.
>
> So per that reasoning I figured FREEZABLE should be good enough.
>
> Sharing the pain with suspend can only lead to improving both -- faster
> patching progress leads to faster suspend and vice-versa.

If I remember correctly, we had something like this in the old kGraft
implementation of the live patching (SUSE way). We exactly had a hook
somewhere in the kthread freezing code. This looks much cleaner and as far
as I know the fridge went through improvements recently.

Peter, so that I understand it correctly... we would rely on all kthreads
becoming freezable eventually so that both suspend and livepatch benefit.
Is that what you meant by the above?

Miroslav