Re: [PATCH v4 2/6] perf: Enqueue SIGTRAP always via task_work.

From: Frederic Weisbecker
Date: Thu Dec 05 2024 - 05:13:44 EST


Le Thu, Dec 05, 2024 at 10:20:16AM +0100, Oleg Nesterov a écrit :
> > Looking at task_work, it seems that most enqueues happen to the current task.
> > AFAICT, only io_uring() does remote enqueue. Would it make sense to have a light
> > version of task_work that is only ever used by current? This would be a very
> > simple flavour with easy queue and cancellation without locking/atomics/RmW
> > operations.
>
> Perhaps, but we also need to avoid the races with task_work_cancel() from
> another task. I mean, if a task T does task_work_add_light(work), it can race
> with task_work_cancel(T, ...) which can change T->task_works on another CPU.

I was thinking about two different lists.

>
>
> OK, I can't suggest a good solution for this problem, so I won't argue with
> the patch from Sebastian. Unfortunately, with this change we can never restore
> the fifo ordering. And note that the current ordering is not lifo, it is
> "unpredictable". Plus the extra lock/xchg for each work...

Good point it's unpredictable due to extraction before execution.

Another alternative is to maintain another head that points to the
head of the executing list. This way we can have task_work_cancel_current()
that completely cancels the work. That was my initial proposal here and it
avoids the lock/xchg for each work:

https://lore.kernel.org/all/Zx-B0wK3xqRQsCOS@localhost.localdomain/

>
> Hmm. I just noticed that task_work_run() needs a simple fix:
>
> --- x/kernel/task_work.c
> +++ x/kernel/task_work.c
> @@ -235,7 +235,7 @@
> raw_spin_unlock_irq(&task->pi_lock);
>
> do {
> - next = work->next;
> + next = READ_ONCE(work->next);
> work->func(work);
> work = next;
> cond_resched();
>
> Perhaps it makes sense before the patch from Sebastian even if that patch
> removes this do/while loop ?

Hmm, can work->next be modified concurrently here?

Thanks.

>
> Oleg.
>