Re: [PATCH v3 1/3] pidfd: allow pidfd_open() on non-thread-group leaders
From: Christian Brauner
Date: Fri Jan 26 2024 - 05:19:17 EST
On Thu, Jan 25, 2024 at 06:51:14PM +0100, Oleg Nesterov wrote:
> On 01/25, Christian Brauner wrote:
> >
> > > > When it is reaped is "mostly unrelated".
> > >
> > > Then why pidfd_poll() can't simply check !task || task->exit_state ?
> > >
> > > Nevermind. So, currently pidfd_poll() succeeds when the leader can be
> >
> > Hm, the comment right above mentions:
> >
> > /*
> > * Inform pollers only when the whole thread group exits.
> > * If the thread group leader exits before all other threads in the
> > * group, then poll(2) should block, similar to the wait(2) family.
> > */
> > > reaped, iow the whole thread group has exited.
>
> Yes, but the comment doesn't contradict with what I have said?
No, it doesn't. I'm trying to understand what you are suggesting though.
Are you saying !task || task->exit_state is enough and we shouldn't use
the helper that was added in commit 38fd525a4c61 ("exit: Factor
thread_group_exited out of pidfd_poll")? If so, what does open-coding
the check buy us over using that helper? Is there an actual bug here?
> > > But even if you are the
> > > parent, you can't expect that wait(WNOHANG) must succeed, the leader
> > > can be traced. I guess it is too late to change this behaviour.
> >
> > Hm, why is that an issue though?
>
> Well, I didn't say this is a problem. I simply do not know how/why people
> use pidfd_poll().
Sorry, I just have a hard time understanding what you wanted then. :)
"I guess it is too late to change this behavior." made it sound like a)
there's a problem and b) that you would prefer to change behavior. Thus,
it seems that wait(WNOHANG) hanging when a traced leader of an empty
thread-group has exited is a problem in your eyes.
>
> I mostly tried to explain why do I think that do_notify_pidfd() should
> be always called from exit_notify() path, not by release_task(), even
> if the task is not a leader.
>
> > Because a program would rely on WNOHANG to hang on
> > a ptraced leader? That seems esoteric imho.
>
> To me it would be useful, but let's not discuss this now. The "patch"
Ok, that's good then. I would expect that at least stuff like rr makes
use of pidfd and they might rely on this behavior - although I haven't
checked their code.
> I sent doesn't change the current behaviour.
Yeah, I got that but it would still be useful to understand the wider
context you were addressing.
>
> > > What if we add the new PIDFD_THREAD flag? With this flag
> > >
> > > - sys_pidfd_open() doesn't require that the target must be a group leader
> >
> > Yes.
> >
> > >
> > > - pidfd_poll() succeeds when the task passes exit_notify() and
> > > becomes a zombie, even if it is a leader and has other threads.
> >
> > Iiuc, if an existing user creates a pidfd for a thread-group leader and
> > then polls that pidfd they would currently only get notified if the
> > thread-group is empty and the leader has exited.
> >
> > If we now start notifying when the thread-group leader exits but the
> > thread-group isn't empty then this would be a fairly big api change
>
> Hmm... again, this patch doesn't (shouldn't) change the current behavior.
>
> Please note "with this flag" above. If sys_pidfd_open() was called
> without PIDFD_THREAD, then sys_pidfd_open() still requires that the
> target task must be a group leader, and pidfd_poll() won't succeed
> until the leader exits and thread_group_empty() is true.
Yeah, I missed the PIDFD_THREAD flag suggestion. Sorry about that. Btw,
I'm not sure whether you remember that but when we originally did the
pidfd work you and I discussed thread support and already decided back
then that having a flag like PIDFD_THREAD would likely be the way to go.
The PIDFD_THREAD flag would be interesting because we could
make pidfd_send_signal() support this flag as well to allow sending a
signal to a specific thread. That's something that I had also wanted to
support. And I've been asked for this a few times already. What do you
think?
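
To make sure we mean the same thing by the two notification rules, here is
a rough userspace model of them. It is only a sketch and not kernel code:
struct task_model, its fields and pidfd_poll_ready_model() are made-up
names for illustration, a NULL task stands for an already reaped task, and
thread_mode stands for a pidfd opened with the proposed PIDFD_THREAD flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a task; not a kernel structure. */
struct task_model {
	int exit_state;   /* nonzero once the task has passed exit_notify() */
	bool group_empty; /* models thread_group_empty(): no other threads left */
};

/*
 * Model of the poll readiness rules discussed in this thread.
 *
 * task == NULL models a task that has already been reaped.
 * thread_mode == false models today's pidfd: ready only when the leader
 * has exited and the thread group is empty.
 * thread_mode == true models the proposed PIDFD_THREAD pidfd: ready as
 * soon as this task is a zombie, even if other threads are still alive.
 */
bool pidfd_poll_ready_model(const struct task_model *task, bool thread_mode)
{
	if (!task)
		return true;
	if (thread_mode)
		return task->exit_state != 0;
	return task->exit_state != 0 && task->group_empty;
}
```

With this model, a leader that has exited while other threads are still
running is not ready for a pidfd opened without the flag, but is ready for
one opened with it. Once the group is empty both variants report ready.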