Re: [PATCH v4 2/6] perf: Enqueue SIGTRAP always via task_work.

From: Frederic Weisbecker
Date: Fri Nov 08 2024 - 08:12:10 EST


+Cc Oleg.

On Thu, Nov 07, 2024 at 03:46:17PM +0100, Sebastian Andrzej Siewior wrote:
> On 2024-10-30 16:46:22 [+0100], Frederic Weisbecker wrote:
> > This needs more thought. We must make sure that the parent is put _after_
> > the child because it's dereferenced on release, for example:
> >
> > put_event()
> >    free_event()
> >       irq_work_sync(&event->pending_irq);
> >       ====> IRQ or irq_workd
> >       perf_event_wakeup()
> >          ring_buffer_wakeup()
> >             event = event->parent;
> >             rcu_dereference(event->rb);
> >
> > And now after this patch it's possible that this happens after
> > the parent has been released.
> >
> > We could put the parent from the child's free_event() but some
> > places (inherit_event()) may call free_event() on a child without
> > having held a reference to the parent.
> >
> > Also note that with this patch the task may receive late irrelevant
> > signals after the event is removed. It's probably not that bad but
> > still... This could be a concern for exec(), is there a missing
> > task_work_run() there before flush_signal_handlers()?
>
> So if this causes so much pain, what about taking only one item at a
> time? The following passes the test, too:
>
> diff --git a/kernel/task_work.c b/kernel/task_work.c
> index c969f1f26be58..fc796ffddfc74 100644
> --- a/kernel/task_work.c
> +++ b/kernel/task_work.c
> @@ -206,7 +206,7 @@ bool task_work_cancel(struct task_struct *task, struct callback_head *cb)
>  void task_work_run(void)
>  {
>  	struct task_struct *task = current;
> -	struct callback_head *work, *head, *next;
> +	struct callback_head *work, *head;
>
>  	for (;;) {
>  		/*
> @@ -214,17 +214,7 @@ void task_work_run(void)
>  		 * work_exited unless the list is empty.
>  		 */
>  		work = READ_ONCE(task->task_works);
> -		do {
> -			head = NULL;
> -			if (!work) {
> -				if (task->flags & PF_EXITING)
> -					head = &work_exited;
> -				else
> -					break;
> -			}
> -		} while (!try_cmpxchg(&task->task_works, &work, head));
> -
> -		if (!work)
> +		if (!work && !(task->flags & PF_EXITING))
>  			break;
>  		/*
>  		 * Synchronize with task_work_cancel_match(). It can not remove
> @@ -232,13 +222,24 @@ void task_work_run(void)
>  		 * But it can remove another entry from the ->next list.
>  		 */
>  		raw_spin_lock_irq(&task->pi_lock);
> +		do {
> +			head = NULL;
> +			if (work) {
> +				head = READ_ONCE(work->next);
> +			} else {
> +				if (task->flags & PF_EXITING)
> +					head = &work_exited;
> +				else
> +					break;
> +			}
> +		} while (!try_cmpxchg(&task->task_works, &work, head));
>  		raw_spin_unlock_irq(&task->pi_lock);

And having more than one pending task work should be rare enough
that we don't care about doing the locking + cmpxchg() for each
of them.

I like it!

Thanks.

>
> -		do {
> -			next = work->next;
> -			work->func(work);
> -			work = next;
> +		if (!work)
> +			break;
> +		work->func(work);
> +
> +		if (head)
>  			cond_resched();
> -		} while (work);
>  	}
>  }
>
> Sebastian
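
To illustrate the one-item-at-a-time idea outside the kernel, here is a
minimal userspace sketch in C11. It is not the kernel code: struct work_item,
work_add(), work_pop_one(), work_run() and count_call() are hypothetical
names, <stdatomic.h> stands in for try_cmpxchg(), and both the pi_lock
serialization against task_work_cancel_match() and the PF_EXITING /
work_exited handling are left out, so it assumes a single consumer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct callback_head. */
struct work_item {
	struct work_item *next;
	void (*func)(struct work_item *);
};

/* Head of a LIFO list of pending work, analogous to task->task_works. */
static _Atomic(struct work_item *) pending;

/* Push one item onto the list with a compare-and-exchange loop. */
static void work_add(struct work_item *w)
{
	struct work_item *head = atomic_load(&pending);

	do {
		w->next = head;
	} while (!atomic_compare_exchange_weak(&pending, &head, w));
}

/*
 * Detach only the first item and leave the rest on the list.
 * Returns the detached item, or NULL if the list is empty.
 * Assumption: single consumer and no concurrent canceller, so the
 * locking used in the patch is omitted here.
 */
static struct work_item *work_pop_one(void)
{
	struct work_item *w = atomic_load(&pending);

	/* On failure, w is reloaded with the current head and we retry. */
	while (w && !atomic_compare_exchange_weak(&pending, &w, w->next))
		;
	return w;
}

/* Run pending items one at a time; returns how many callbacks ran. */
static int work_run(void)
{
	struct work_item *w;
	int ran = 0;

	while ((w = work_pop_one()) != NULL) {
		w->func(w);
		ran++;
	}
	return ran;
}

/* Demonstration callback: only counts how often it was invoked. */
static int calls;

static void count_call(struct work_item *w)
{
	(void)w;
	calls++;
}
```

With two items queued, work_pop_one() detaches only the most recently added
one and leaves the other on the list, where it remains visible to anything
that walks the list. work_run() then drains the list, paying one
compare-and-exchange per item instead of one per batch.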