Re: [PATCH v2 1/2] exit: change the release_task() paths to call flush_sigqueue() lockless
From: Frederic Weisbecker
Date: Thu Feb 06 2025 - 11:27:44 EST
On Thu, Feb 06, 2025 at 04:23:14PM +0100, Oleg Nesterov wrote:
> A task can block a signal, accumulate up to RLIMIT_SIGPENDING sigqueues,
> and exit. In this case __exit_signal()->flush_sigqueue() called with irqs
> disabled can trigger a hard lockup, see
> https://lore.kernel.org/all/20190322114917.GC28876@xxxxxxxxxx/
>
> Fortunately, after the recent posixtimer changes sys_timer_delete() paths
> no longer try to clear SIGQUEUE_PREALLOC and/or free tmr->sigq, and after
> the exiting task passes __exit_signal() lock_task_sighand() can't succeed
> and pid_task(tmr->it_pid) will return NULL.
>
> This means that after __exit_signal(tsk) nobody can play with tsk->pending
> or (if group_dead) with tsk->signal->shared_pending, so release_task() can
> safely call flush_sigqueue() after write_unlock_irq(&tasklist_lock).
>
> TODO:
> - we can probably shift posix_cpu_timers_exit() as well
Hmm, can't a timer be concurrently deleted between the point where
__exit_signal() sets tsk->sighand = NULL and releases the sighand lock, and
the actual call to posix_cpu_timers_exit()? Then posix_cpu_timers_exit() would
call timerqueue_del() on a node that doesn't exist anymore.
That would even trigger the warning in posix_cpu_timer_del().
> - do_sigaction() can hit the similar problem
>
> Signed-off-by: Oleg Nesterov <oleg@xxxxxxxxxx>
Reviewed-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
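
For anyone following the thread, the idea in the changelog (make the queue
unreachable under the lock, then free the entries after dropping it) can be
sketched in user-space C roughly as below. This is only an analogy under
stated assumptions, not the kernel code: struct owner, struct entry,
queue_entry(), owner_exit() and flush_pending() are hypothetical names, a
pthread mutex stands in for the sighand lock, and the "alive" flag stands in
for lock_task_sighand() failing after __exit_signal().

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins; none of these are kernel types. */
struct entry {
    struct entry *next;
    int signo;
};

struct owner {
    pthread_mutex_t lock;   /* plays the role of the sighand lock */
    bool alive;             /* false once the owner has passed its exit step */
    struct entry *pending;  /* singly linked list of queued entries */
};

/* Queue one entry. Fails once owner_exit() has run, so nobody can
 * touch o->pending after that point. */
bool queue_entry(struct owner *o, int signo)
{
    struct entry *e = malloc(sizeof(*e));

    if (!e)
        return false;
    e->signo = signo;

    pthread_mutex_lock(&o->lock);
    if (!o->alive) {
        pthread_mutex_unlock(&o->lock);
        free(e);
        return false;
    }
    e->next = o->pending;
    o->pending = e;
    pthread_mutex_unlock(&o->lock);
    return true;
}

/* Short critical section: only marks the owner as gone. */
void owner_exit(struct owner *o)
{
    pthread_mutex_lock(&o->lock);
    o->alive = false;
    pthread_mutex_unlock(&o->lock);
}

/* Free every queued entry without holding the lock. Only safe after
 * owner_exit(), when no other thread can reach o->pending anymore.
 * Returns the number of entries freed. */
size_t flush_pending(struct owner *o)
{
    size_t freed = 0;
    struct entry *e = o->pending;

    o->pending = NULL;
    while (e) {
        struct entry *next = e->next;

        free(e);
        e = next;
        freed++;
    }
    return freed;
}
```

The potentially long loop in flush_pending() runs outside the critical
section, so the time spent holding the lock no longer grows with the number
of queued entries. The sketch does not model the posix-cpu-timers question
raised above.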