Re: [RFC PATCH] pidfs: ensure consistent ENOENT/ESRCH reporting

From: Oleg Nesterov
Date: Thu Apr 10 2025 - 06:19:55 EST


On 04/09, Oleg Nesterov wrote:
>
> Christian,
>
> I will actually read your patch tomorrow, but at first glance
>
> On 04/09, Christian Brauner wrote:
> >
> > The seqcounter might be
> > useful independent of pidfs.
>
> Are you sure? ;) to me the new pid->pid_seq needs more justification...
>
> Again, can't we use pid->wait_pidfd->lock if we want to avoid the
> (minor) problem with the wrong ENOENT?

I mean

	int pidfd_prepare(struct pid *pid, unsigned int flags, struct file **ret)
	{
		int err = 0;

		spin_lock_irq(&pid->wait_pidfd.lock);

		if (!pid_has_task(pid, PIDTYPE_PID))
			err = -ESRCH;
		else if (!(flags & PIDFD_THREAD) && !pid_has_task(pid, PIDTYPE_TGID))
			err = -ENOENT;

		spin_unlock_irq(&pid->wait_pidfd.lock);

		return err ?: __pidfd_prepare(pid, flags, ret);
	}

As a reminder, detach_pid(pid, PIDTYPE_PID) does wake_up_all(&pid->wait_pidfd),
which takes pid->wait_pidfd.lock.

So if pid_has_task(PIDTYPE_PID) succeeds, __unhash_process() -> detach_pid(TGID)
is not possible until we drop pid->wait_pidfd.lock.

If detach_pid(PIDTYPE_PID) has already been called and has passed wake_up_all(),
pid_has_task(PIDTYPE_PID) can't succeed.
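
To make the two cases above concrete, here is a rough userspace model of the
check. It is only a sketch and not kernel code: struct model_pid, MODEL_THREAD,
MODEL_PID_INIT and the model_*() helpers are all made-up names, the lock is a
plain C11 atomic_flag standing in for pid->wait_pidfd.lock, and the model runs
single-threaded, so it shows the lock bracket and the resulting errno mapping
rather than the race itself.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_THREAD 0x1	/* hypothetical stand-in for PIDFD_THREAD */

/* Hypothetical stand-in for the parts of struct pid that matter here. */
struct model_pid {
	atomic_flag lock;	/* plays the role of pid->wait_pidfd.lock */
	bool has_pid_task;	/* pid_has_task(pid, PIDTYPE_PID) */
	bool has_tgid_task;	/* pid_has_task(pid, PIDTYPE_TGID) */
};

#define MODEL_PID_INIT(pid_task, tgid_task) \
	{ ATOMIC_FLAG_INIT, (pid_task), (tgid_task) }

static void model_lock(struct model_pid *p)
{
	while (atomic_flag_test_and_set(&p->lock))
		;	/* spin until the flag is clear */
}

static void model_unlock(struct model_pid *p)
{
	atomic_flag_clear(&p->lock);
}

/* Models detach_pid(PIDTYPE_PID): unhash, then wake_up_all() takes the lock. */
static void model_detach_pid(struct model_pid *p)
{
	p->has_pid_task = false;
	model_lock(p);		/* wake_up_all(&pid->wait_pidfd) */
	model_unlock(p);
}

/* Models detach_pid(PIDTYPE_TGID), which can only run after the above. */
static void model_detach_tgid(struct model_pid *p)
{
	p->has_tgid_task = false;
}

/* Models only the check in the proposed pidfd_prepare(), not fd creation. */
static int model_prepare(struct model_pid *p, unsigned int flags)
{
	int err = 0;

	model_lock(p);
	if (!p->has_pid_task)
		err = -ESRCH;
	else if (!(flags & MODEL_THREAD) && !p->has_tgid_task)
		err = -ENOENT;
	model_unlock(p);

	return err;
}
```

In this model a pid that is used as PID but not as TGID gets -ENOENT without
MODEL_THREAD and 0 with it, and once model_detach_pid() has returned every
later model_prepare() reports -ESRCH no matter what happened to TGID.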

Oleg.