Re: [PATCH v4 28/39] unwind_user/deferred: Add deferred unwinding interface
From: Peter Zijlstra
Date: Thu Jan 23 2025 - 17:13:48 EST
On Thu, Jan 23, 2025 at 10:43:05AM -0800, Josh Poimboeuf wrote:
> On Thu, Jan 23, 2025 at 09:25:34AM +0100, Peter Zijlstra wrote:
> > On Wed, Jan 22, 2025 at 08:05:33PM -0800, Josh Poimboeuf wrote:
> >
> > > However... would it be a horrible idea for 'next' to unwind 'prev' after
> > > the context switch???
> >
> > The idea isn't terrible, but it will be all sorts of tricky.
> >
> > The big immediate problem is that the CPU doing the context switch
> > loses control over prev at:
> >
> >   __schedule()
> >     context_switch()
> >       finish_task_switch()
> >         finish_task()
> >           smp_store_release(&prev->on_cpu, 0);
> >
> > And this is before we drop rq->lock.
> >
> > The instruction after that store another CPU is free to claim the task
> > and run with it. Notably, another CPU might already be spin waiting on
> > that state, trying to wake the task back up.
> >
> > By the time we get to a schedulable context, @prev is completely out of
> > bounds.
>
> Could unwind_deferred_request() call migrate_disable() or so?

That's pretty vile... and might cause performance issues. You really
don't want things to magically start behaving differently just because
you're tracing.

> How bad would it be to set some bit in @prev to prevent it from getting
> rescheduled until the unwind from @next has been done? Unfortunately
> two tasks would be blocked on the unwind instead of one.

Yeah, not going to happen. Those paths are complicated enough as is.

> BTW, this might be useful for another reason. In Steve's sframe meeting
> yesterday there was some talk of BPF needing to unwind from
> sched-switch, without having to wait indefinitely for @prev to get
> rescheduled and return to user.

-EPONIES, you cannot take faults from the middle of schedule(). They can
always use the best-effort FP unwind we have today.
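
For readers following the on_cpu argument quoted earlier in this thread, here is a rough user-space sketch of why @prev is off limits once the release store has happened. It is not kernel code: `struct task`, `old_cpu()`, `waker_cpu()` and `run_handoff()` are hypothetical names, and C11 atomics with pthreads stand in for smp_store_release() and the acquire-side spin a waking CPU would do. The only point it illustrates is the ownership handoff: everything written before the release store is visible to the waiter, and nothing after it is safe for the old owner to touch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a task; NOT the kernel's struct task_struct. */
struct task {
	atomic_int on_cpu;	/* 1 while the old CPU still owns the task */
	int state;		/* plain data handed over by the release store */
};

/* Models the CPU switching away from prev. */
static void *old_cpu(void *arg)
{
	struct task *prev = arg;

	prev->state = 42;	/* last access the old owner may make */
	atomic_store_explicit(&prev->on_cpu, 0, memory_order_release);
	/* From here on, touching prev would race with the new owner. */
	return NULL;
}

/* Models another CPU that is already spin waiting to claim the task. */
static void *waker_cpu(void *arg)
{
	struct task *p = arg;

	while (atomic_load_explicit(&p->on_cpu, memory_order_acquire))
		;		/* spin until the old owner lets go */

	/* Acquire pairs with the release: the earlier write is visible. */
	assert(p->state == 42);
	p->state += 1;		/* the new owner may modify the task freely */
	return NULL;
}

/* Runs one handoff and returns the final state seen after both threads exit. */
int run_handoff(void)
{
	struct task t = { .on_cpu = 1, .state = 0 };
	pthread_t waker, old;

	/* Start the waiter first so it is likely already spinning. */
	pthread_create(&waker, NULL, waker_cpu, &t);
	pthread_create(&old, NULL, old_cpu, &t);
	pthread_join(old, NULL);
	pthread_join(waker, NULL);
	return t.state;
}
```

In this model, an unwinder running on the old CPU after the store would be reading `state` while the waiter is already writing it, which is the same class of problem as 'next' trying to unwind 'prev' after finish_task().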