Re: [PATCH 0/1] exit: kill signal_struct->quick_threads

From: Eric W. Biederman
Date: Mon Jun 10 2024 - 09:15:54 EST


Oleg Nesterov <oleg@xxxxxxxxxx> writes:

> Hello,
>
> Eric, I can't understand why the commit ("signal: Guarantee that
> SIGNAL_GROUP_EXIT is set on process exit") added the new
> quick_threads counter. And why, if we forget about --quick_threads,
> synchronize_group_exit() has to take siglock unconditionally.
> Did I miss something obvious?

At a minimum it is exactly the same locking used everywhere else that
sets signal->flags, signal->group_exit_code, and
signal->group_stop_count.

So it would probably require some significant reason to not use
the same locking and complicate reasoning about the code. I suspect
setting those values without siglock held is likely to lead to
interesting races.
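
As a rough userspace illustration of the convention described above (this
is not kernel code): the three fields are only ever written together,
with one lock held. A pthread mutex stands in for siglock, and
demo_signal, demo_signal_init(), demo_start_group_exit() and the
DEMO_GROUP_EXIT value are all hypothetical names made up for this sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag value for this sketch only. */
#define DEMO_GROUP_EXIT 0x4u

/* Hypothetical stand-in for the fields discussed above. */
struct demo_signal {
    pthread_mutex_t lock;      /* plays the role of siglock */
    unsigned int flags;
    int group_exit_code;
    int group_stop_count;
};

static void demo_signal_init(struct demo_signal *sig)
{
    assert(sig != NULL);
    pthread_mutex_init(&sig->lock, NULL);
    sig->flags = 0;
    sig->group_exit_code = 0;
    sig->group_stop_count = 0;
}

/*
 * Set all three fields under the lock. Returns true only for the
 * caller that actually started the group exit; later callers see the
 * flag already set and leave the recorded exit code alone.
 */
static bool demo_start_group_exit(struct demo_signal *sig, int code)
{
    bool first = false;

    pthread_mutex_lock(&sig->lock);
    if (!(sig->flags & DEMO_GROUP_EXIT)) {
        sig->flags = DEMO_GROUP_EXIT;
        sig->group_exit_code = code;
        sig->group_stop_count = 0;
        first = true;
    }
    pthread_mutex_unlock(&sig->lock);
    return first;
}
```

Because every writer takes the same lock, a second caller cannot
observe the flag set while the exit code is still unset, which is the
kind of race the paragraph above is worried about.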

May I ask which direction you are coming at this from? Are you trying
to reduce the cost of do_exit? Are you interested in untangling the
mess that is exiting threads in a process?

I have a branch lying around in which I was slowly working to untangle
the entire mess. And if you are interested I can dig it back up.
As I remember it, I had worked through all of the known issues, but I
still needed to feed the code through code review and merge it in small
steps to ensure I don't introduce regressions.

That is where signal->quick_threads comes from. In the work it is a
part of, I wind up moving the decrement much earlier, to the point where
individual threads decide to exit. The decrement of signal->live comes
much too late to be useful in that context.
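
A minimal userspace sketch of the idea of an early per-thread decrement
(not the kernel implementation): each thread decrements a counter at the
moment it decides to exit, and, as an assumption of this sketch, the
thread that brings the counter to zero marks the whole group as exiting
if nobody has done so yet. demo_group, demo_group_init(),
demo_thread_decides_to_exit() and DEMO_GROUP_EXIT are hypothetical
names, and a pthread mutex stands in for siglock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag value for this sketch only. */
#define DEMO_GROUP_EXIT 0x4u

struct demo_group {
    pthread_mutex_t lock;    /* plays the role of siglock */
    int quick_threads;       /* threads that have not yet decided to exit */
    unsigned int flags;
    int group_exit_code;
};

static void demo_group_init(struct demo_group *g, int nr_threads)
{
    assert(g != NULL && nr_threads > 0);
    pthread_mutex_init(&g->lock, NULL);
    g->quick_threads = nr_threads;
    g->flags = 0;
    g->group_exit_code = 0;
}

/*
 * Called by a thread at the point where it decides to exit.
 * Returns true if this call is the one that marked the group as
 * exiting (it was the last thread and the flag was not yet set).
 */
static bool demo_thread_decides_to_exit(struct demo_group *g, int code)
{
    bool marked = false;

    pthread_mutex_lock(&g->lock);
    g->quick_threads--;
    if (g->quick_threads == 0 && !(g->flags & DEMO_GROUP_EXIT)) {
        g->flags = DEMO_GROUP_EXIT;
        g->group_exit_code = code;
        marked = true;
    }
    pthread_mutex_unlock(&g->lock);
    return marked;
}
```

The counter and the flag are updated under the same lock, so "last
thread out" and "group is exiting" cannot be observed out of step with
each other in this sketch.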

It is also part of me wanting to be able to uniformly use
SIGNAL_GROUP_EXIT and signal->group_exit_code when talking about the
process state, and p->exit_code when talking about the per task state.

At the moment I am staring at wait_task_zombie and trying to understand
how:

	status = (p->signal->flags & SIGNAL_GROUP_EXIT)
		? p->signal->group_exit_code : p->exit_code;

works without any locks or barriers.
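
For comparison, here is a userspace sketch of what an explicitly ordered
version of that read could look like, using C11 atomics. It only
illustrates the ordering question and says nothing about what
wait_task_zombie actually relies on. demo_proc, demo_proc_init(),
demo_publish_group_exit(), demo_read_status() and DEMO_GROUP_EXIT are
hypothetical names, and the sketch assumes exit_code is already stable
by the time the reader runs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Illustrative flag value for this sketch only. */
#define DEMO_GROUP_EXIT 0x4u

struct demo_proc {
    atomic_uint flags;
    int group_exit_code;   /* published by the release store to flags */
    int exit_code;         /* per-task value, assumed stable for readers */
};

static void demo_proc_init(struct demo_proc *p, int exit_code)
{
    assert(p != NULL);
    atomic_init(&p->flags, 0);
    p->group_exit_code = 0;
    p->exit_code = exit_code;
}

/* Writer: store the code first, then publish the flag with release. */
static void demo_publish_group_exit(struct demo_proc *p, int code)
{
    p->group_exit_code = code;
    atomic_store_explicit(&p->flags, DEMO_GROUP_EXIT,
                          memory_order_release);
}

/*
 * Reader: the acquire load pairs with the release store above, so a
 * reader that sees the flag also sees the group_exit_code written
 * before it.
 */
static int demo_read_status(struct demo_proc *p)
{
    unsigned int flags =
        atomic_load_explicit(&p->flags, memory_order_acquire);

    return (flags & DEMO_GROUP_EXIT) ? p->group_exit_code
                                     : p->exit_code;
}
```

Without the release/acquire pair (or a lock), nothing in this sketch
would stop a reader from seeing the flag set and still reading a stale
group_exit_code.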

Eric