Re: [PATCH 0/11] Short circuit delivery for coredump signals
From: Eric W. Biederman
Date: Mon Jun 29 2026 - 02:24:02 EST
Oleg Nesterov <oleg@xxxxxxxxxx> writes:
> Eric,
>
> Please rebase on top of Linus's tree, git am fails at 7/11.
This was built on v7.1.
Now that v7.2-rc1 is out I will be happy to rebase on top of that.
> So far I didn't try to read the individual patches, I've applied
> the whole series on top of 25fe708bbc59 to avoid the conflicts, and
> after the very quick glance I seem to see some problems.
>
> Please correct me.
>
> -------------------------------------------------------------------------
> complete_signal() does:
>
> if (sig_fatal(p, sig) && !sigismember(&t->real_blocked, sig) &&
> (sig == SIGKILL || !p->ptrace)) {
> /*
> * This signal will be fatal to the whole group.
> *
> * Start a group exit and wake everybody up.
> * This way we don't have other threads
> * running and doing things after a slower
> * thread has the fatal signal pending.
> */
> signal->flags = SIGNAL_GROUP_EXIT | SIGNAL_EXIT_DEQUEUE;
> signal->group_exit_code = sig;
> ... kill the thread group ...
>
> However, prepare_signal() still does:
>
> if (signal->flags & SIGNAL_GROUP_EXIT) {
> if (signal->core_state)
> return sig == SIGKILL;
> /*
> * The process is in the middle of dying, drop the signal.
> */
> return false;
>
> This means that if SIGKILL comes before coredump_begin() sets signal->core_state,
> it will be lost.
I will reexamine that. I used to have something to deal with this case
but somehow convinced myself it didn't matter.
> -------------------------------------------------------------------------
> dequeue_exit_signal:
>
> if (signal->flags & SIGNAL_EXIT_DEQUEUE) {
> struct sigpending *pending = NULL;
> struct sigqueue *timer_sigq;
> int signr = exit_code;
>
> signal->flags &= ~SIGNAL_EXIT_DEQUEUE;
>
> pending = sigismember(&tsk->pending.signal, signr) ?
> &tsk->pending : &signal->shared_pending;
>
> collect_signal(signr, pending, info, &timer_sigq);
>
> This looks obviously wrong. 2 threads, T1 and T2. SIGSEGV is sent to T1.
> T2 calls get_signal(), clears SIGNAL_EXIT_DEQUEUE and returns SIGSEGV.
> But collect_signal() won't find SIGSEGV, *info will be bogus.
Ugh.
I deliberately allowed cross-thread dumping so that whichever thread
won the race could just dump core. I failed to consider that this is
a problem for per-thread signals, which sit on the target thread's own
pending queue where the winning thread will not look.
I will have to think a little bit about how to know which queue to
remove the signal from. It is tempting to always place fatal signals
on the shared_pending queue.
Eric