Re: [RFC] sched,livepatch: call klp_try_switch_task in __cond_resched

From: Josh Poimboeuf
Date: Tue May 10 2022 - 14:42:35 EST


On Tue, May 10, 2022 at 06:07:00PM +0000, Rik van Riel wrote:
> On Tue, 2022-05-10 at 09:52 -0700, Josh Poimboeuf wrote:
> > On Tue, May 10, 2022 at 04:07:42PM +0000, Rik van Riel wrote:
> > > >
> > > Now I wonder if we could just hook up a preempt notifier
> > > for kernel live patches. All the distro kernels already
> > > need the preempt notifier for KVM, anyway...
> > >
> >
> > I wouldn't be opposed to that, but how does it solve this problem? 
> > If
> > as Peter said cond_resched() can be a NOP, then preemption would have
> > to
> > be from an interrupt, in which case frame pointers aren't reliable.
> >
> The systems where we are seeing problems do not, as far
> as I know, throw softlockup errors, so the kworker
> threads that fail to transition to the new KLP version
> are sleeping and getting scheduled out at times.

Are they sleeping due to an explicit call to cond_resched()?

> A KLP transition preempt notifier would help those
> kernel threads transition to the new KLP version at
> any time they reschedule.

... unless cond_resched() is a no-op due to CONFIG_PREEMPT?

> How much it will help is hard to predict, but I should
> be able to get results from a fairly large sample size
> of systems within a few weeks :)

As Peter said, keep in mind that we will need to fix other cases beyond
Facebook, i.e., CONFIG_PREEMPT combined with non-x86 arches which don't
have ORC so they can't reliably unwind from an IRQ.

--
Josh