Re: [PATCH 3/3] proc: simplify proc_task_readdir/first_tid paths

From: Eric W. Biederman
Date: Wed May 29 2013 - 16:39:21 EST


Oleg Nesterov <oleg@xxxxxxxxxx> writes:

> On 05/28, Eric W. Biederman wrote:
>>
>> Oleg Nesterov <oleg@xxxxxxxxxx> writes:
>>
>> > proc_task_readdir() does not really need "leader", first_tid()
>> > has to revalidate it anyway. Just pass proc_pid(inode) to
>> > first_tid() instead, it can do pid_task(PIDTYPE_PID) itself
>> > and read ->group_leader only if necessary.
>> >
>> > Note: I am not sure proc_task_readdir() really needs the initial
>> > -ENOENT check, but this is what the current code does.
>>
>> This looks like a nice cleanup.
>>
>> We would need either -ENOENT or a return of 0 and an empty directory at
>> the least. We need the check so that empty directories don't have "."
>> and ".." entries.
>
> And this is not clear to me...
>
> Why the empty "." + ".." dir is bad if the task(s) has gone away after
> opendir?

Because the definition of a deleted directory that you are in is that
getdents will return -ENOENT.

You can reproduce this with any linux filesystem.
mkdir foo
cd foo
rmdir ../foo
strace -f ls .

open(".", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
getdents(3, 0x1851c88, 32768) = -1 ENOENT (No such file or directory)
close(3) = 0

Proc is no different in this regard, and the task could have gone away
before opendir if the process made the task directory it's current
working directory before opening the file.

>> > if (tid && (nr > 0)) {
>> > pos = find_task_by_pid_ns(tid, ns);
>> > - if (pos && (pos->group_leader == leader))
>> > + if (pos && same_thread_group(pos, task))
>>
>> Sigh this reminds me we need to figure out how to kill task->pid and
>> task->tgid,
>
> Yeah.
>
>> which I assume means fixing same_thread_group.
>
> Now that ->signal can't go away before task_struct, we can make it
>
> static inline
> int same_thread_group(struct task_struct *p1, struct task_struct *p2)
> {
> return p1->signal == p2->signal;
> }

Oh cool. I will review that patch in just a moment.

>> > + if (!pid_task(proc_pid(inode), PIDTYPE_PID))
>> > + return -ENOENT;
>>
>> Strictly speaking this call to pid_task needs to be in a rcu critical
>> section.
>
> Argh, thanks.
>
> we do not really need rcu, we are not going to dereference this pointer,
> but we should make __rcu_dereference_check() happy...
>
> I'll change this... but once again, can't we simply remove this check?

I can understand the desire but I think we need something to see if the
task for the directory has at least a zombie around.

> While you are here. Could you explain the ->d_inode check in
> proc_fill_cache() ? The code _looks_ wrong,
>
> if (!child || IS_ERR(child) || !child->d_inode)
> goto end_instantiate;
>
> If d_inode == NULL, who does dput() ?
>
> OTOH, if we ensure d_inode != NULL, why do we check "if (inode)" after
> inode = child->d_inode ?
>
> IOW, it seems that this check should be simply removed?

I seem to agree as I have a patch removing it on my development branch.
I think I wrote that before realizing that negative dentries don't
happen in proc.

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/