Re: [RFC] sched,livepatch: call klp_try_switch_task in __cond_resched

From: Song Liu
Date: Mon May 09 2022 - 12:22:28 EST




> On May 9, 2022, at 8:07 AM, Petr Mladek <pmladek@xxxxxxxx> wrote:
>
> On Sat 2022-05-07 10:46:28, Song Liu wrote:
>> Busy kernel threads may block the transition of livepatch. Call
>> klp_try_switch_task from __cond_resched to make the transition easier.
>
> Do you have any numbers on how much this speeds up the transition
> and how much it slows down the scheduler, please?

We don’t have numbers on how much this would slow down the scheduler.
For the transition, we see cases where the transition cannot finish
within 60 seconds (how long "kpatch load" waits by default).

>
> cond_resched() is typically called in loops with many iterations
> where the task might spend a lot of time. There are two possibilities.
> cond_resched() is called in:
>
> + livepatched function
>
> In this case, klp_try_switch_task(current) will always fail.
> And it will unnecessarily slow down every iteration by
> checking the very same stack.
>
>
> + non-livepatched function
>
> In this case, the transition will succeed on the first attempt.
> OK, but it would also succeed without that patch. The task would
> most likely sleep in this cond_resched(), so it could
> be successfully transitioned on the next occasion.

We are in the non-livepatched case. But the transition didn’t happen
in time, because the kernel thread doesn’t go to sleep. While there is
clearly something weird with this thread, we think livepatch should
work because the thread does call cond_resched from time to time.

Thanks,
Song

>
>
> From my POV this patch brings more harm than good.
>
> Note that scheduling is a fast path. It is repeated a zillion times
> on any system. But livepatch transition is a slow path. It does not
> matter if it takes 1 second or 1 hour.
>
> Best Regards,
> Petr
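
For readers following the two cases above, here is a minimal userspace
sketch of the trade-off. It is not the kernel code and not the proposed
patch: struct model_task, model_try_switch() and model_busy_loop() are
hypothetical names made up for illustration, and the "stack check" is
reduced to a string comparison over a list of function names. It only
assumes what the thread states: a switch attempt fails while a patched
function is on the task's stack, and succeeds otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a task: the function names on its stack. */
struct model_task {
	const char *const *stack;   /* frames, innermost first */
	size_t depth;               /* number of frames */
	bool switched;              /* true once transitioned */
	unsigned long stack_checks; /* how often the stack was walked */
};

/* True if any frame on the stack is one of the functions being patched. */
static bool stack_has_patched_func(const struct model_task *t,
				   const char *const *patched, size_t n)
{
	for (size_t i = 0; i < t->depth; i++)
		for (size_t j = 0; j < n; j++)
			if (strcmp(t->stack[i], patched[j]) == 0)
				return true;
	return false;
}

/* Stand-in for a transition attempt (assumption, not the kernel logic). */
static bool model_try_switch(struct model_task *t,
			     const char *const *patched, size_t n)
{
	assert(t != NULL && patched != NULL);
	if (t->switched)
		return true;        /* already transitioned: nothing to do */
	t->stack_checks++;          /* cost paid on every failed attempt */
	if (stack_has_patched_func(t, patched, n))
		return false;       /* patched function still on the stack */
	t->switched = true;
	return true;
}

/*
 * Model of a busy loop that reaches a cond_resched()-like hook once per
 * iteration. Returns the total number of stack checks performed.
 */
unsigned long model_busy_loop(struct model_task *t,
			      const char *const *patched, size_t n,
			      unsigned long iterations)
{
	for (unsigned long i = 0; i < iterations; i++)
		model_try_switch(t, patched, n);
	return t->stack_checks;
}
```

In this model, a task looping inside a patched function pays for one
stack check per iteration and never switches, while a task looping in
an unpatched function switches on its first attempt and pays for a
single check. The real cost of a stack check in the scheduler path is
exactly the number that was asked for above and is not measured here.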