Re: [PATCH] procfs: fix missing RCU protection when reading real_parent in do_task_stat()
From: Jinliang Zheng
Date: Wed Jan 28 2026 - 03:11:00 EST
On Wed, 28 Jan 2026 09:01:35 +0100, oleg@xxxxxxxxxx wrote:
> On 01/27, Mateusz Guzik wrote:
> >
> > On Tue, Jan 27, 2026 at 06:25:25PM +0100, Oleg Nesterov wrote:
> > > On 01/27, alexjlzheng@xxxxxxxxx wrote:
> > > > --- a/fs/proc/array.c
> > > > +++ b/fs/proc/array.c
> > > > @@ -528,7 +528,9 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
> > > >  	}
> > > >
> > > >  	sid = task_session_nr_ns(task, ns);
> > > > -	ppid = task_tgid_nr_ns(task->real_parent, ns);
> > > > +	rcu_read_lock();
> > > > +	ppid = task_tgid_nr_ns(rcu_dereference(task->real_parent), ns);
> > > > +	rcu_read_unlock();
> > >
> > > But this can't really help. If task->real_parent has already exited and
> > > it was reaped, then it is actually "Too late!" for rcu_read_lock().
> > >
> > > Please use task_ppid_nr_ns() which does the necessary pid_alive() check.
>
> Ah, I was wrong, I forgot about lock_task_sighand(task). So in this case
> pid_alive() is not necessary, and the patch is fine.
>
> But unless you have a strong opinion, I'd still suggest to use
> task_ppid_nr_ns(), see below.

I don't have a strong opinion on this. Your suggestion makes sense -
task_ppid_nr_ns() is more maintainable. I'm happy to update the patch as
you recommend.

Thanks,
Jinliang Zheng. :)
>
> > Suppose it fits the time window between the current parent exiting and
> > the task being reassigned to init. Then you transiently see 0 as the pid,
> > instead of 1 (or whatever). This reads like a bug to me.
>
> But we can't avoid this. Without tasklist_lock even
>
> 	task_tgid_nr_ns(current->real_parent, ns);
>
> can return zero if we race with reparenting. If ->real_parent is reaped
> right after we read the ->real_parent pointer, it has no pids. See
> __unhash_process() -> detach_pid().
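
For anyone watching this from userspace: the value discussed here is the
fourth field of /proc/<pid>/stat, so a reader of that file should be
prepared to see 0 there during the reparenting race. A minimal sketch of
such a reader follows. The helper name stat_line_ppid() is made up for
illustration, it only parses a line that has already been read, and it is
not taken from any existing tool.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: extract the ppid (4th field) from one line of
 * /proc/<pid>/stat. Returns -1 if the line is malformed.
 *
 * The comm field (2nd) is wrapped in parentheses and may itself contain
 * spaces and ')' characters, so parsing starts after the LAST ')'.
 */
static long stat_line_ppid(const char *line)
{
    const char *p = strrchr(line, ')');
    char state;
    long ppid;

    if (!p)
        return -1;

    /* After comm come: state (one char), then ppid. */
    if (sscanf(p + 1, " %c %ld", &state, &ppid) != 2)
        return -1;

    /* 0 is a legitimate result, e.g. when the read raced with reparenting. */
    return ppid;
}
```

A caller would treat 0 as "no parent visible right now" rather than as a
parse failure, which is why the error value here is -1 and not 0.
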
>
> > It probably should do precisely the same thing proposed in this patch,
> > as in:
> > 	rcu_read_lock();
> > 	ppid = task_tgid_nr_ns(rcu_dereference(task->real_parent), ns);
> > 	rcu_read_unlock();
>
> No, task_ppid_nr_ns(tsk) does need the pid_alive() check. If tsk exits,
> tsk->real_parent points to nowhere, rcu_read_lock() can't help.
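
To make the pid_alive() point concrete, here is a small userspace model of
the shape of that check. Everything in it (struct toy_task, toy_ppid(),
the alive flag) is a hypothetical stand-in, not kernel code, and it models
none of the real RCU or locking. It only shows why the liveness test has to
come before following ->real_parent.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in; NOT the kernel's struct task_struct. */
struct toy_task {
    int tgid;                     /* 0 models "pids already detached" */
    bool alive;                   /* models what pid_alive() would report */
    struct toy_task *real_parent; /* must not be followed once !alive */
};

/*
 * Models the shape of the check described above: only follow
 * ->real_parent while the child itself is still alive, otherwise
 * report 0 without touching the pointer.
 */
static int toy_ppid(const struct toy_task *t)
{
    int ppid = 0;

    /* In the kernel this section would sit under rcu_read_lock(). */
    if (t->alive)
        ppid = t->real_parent->tgid;

    return ppid;
}
```

In this model a reaped parent is represented by tgid == 0, so a live child
whose parent has just lost its pids still yields 0, matching the race
described earlier in the thread.
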
>
> This all needs cleanups. ->real_parent and ->group_leader need the helpers
> (probably with some CONFIG_PROVE_RCU checks) and they should be moved to
> signal_struct.
>
> So far I have only sent some trivial initial cleanups/preparations, see
> https://lore.kernel.org/all/aXY_h8i78n6yD9JY@xxxxxxxxxx/
>
> I'll try to do the next step this week. If I have time ;) I am on a
> forced PTO caused by renovations in our apartment.
>
> Oleg.