Re: [PATCH v3 2/4] perf: Enqueue SIGTRAP always via task_work.

From: Frederic Weisbecker
Date: Wed Apr 10 2024 - 07:37:16 EST


Le Tue, Apr 09, 2024 at 03:47:29PM +0200, Sebastian Andrzej Siewior a écrit :
> On 2024-04-09 14:36:51 [+0200], Frederic Weisbecker wrote:
> > > That wake_up() within preempt_disable() section breaks on RT.
> >
> > Ah, but the wake-up still wants to go inside recursion protection somehow or
> > it could generate task_work loop again due to tracepoint events...
>
> okay.
>
> > Although... the wake up occurs only when the event is dead after all...
>
> corner case or not, it has to work, right?

Yep.

>
> > > How do we go on from here?
> >
> > I'd tend to think you need my patchset first because the problems it
> > fixes were not easily visible as long as there was an irq work to take
> > care of things most of the time. But once you rely on task_work only then
> > these become a real problem. Especially the sync against perf_release().
>
> I don't mind rebasing on top of your series. But defaulting to task_work
> is not an option then?
>
> RT wise the irq_work is not handled in hardirq because of locks it
> acquires and is handled instead in a thread. Depending on the priority
> the task (receiving the event) it may run before the irq_work-thread.
> Therefore the task_work looked neat because the event would be handled
> _before_ the task returned to userland.

I see.

> Couldn't we either flush _or_ remove the task_work in perf_release()?

Right so the problem in perf_release() is that we may be dealing with task works
of other tasks than current. In that case, task_work_cancel() is fine if it
successes. But if it fails, you don't have the guarantee that the task work
isn't concurrently running or about to run. And you have no way to know about
that. So then you need some sort of flushing indeed.

Thanks.

> > Thanks.
> Sebastian