Re: broken do_each_pid_{thread,task}

From: Oleg Nesterov
Date: Mon Dec 15 2008 - 05:49:09 EST


On 12/14, Eric W. Biederman wrote:
>
> Jiri Slaby <jirislaby@xxxxxxxxx> writes:
> >
> > I'm getting
> > `if (type == PIDTYPE_PID)' is unreachable
> > warning from kernel/exit.c. The preprocessed code looks like:
> > do {
> > struct hlist_node *pos___;
> > if (pgrp != ((void *)0))
> > for (LIST ITERATION) {
> > {
> > if (!((p->state & 4) != 0))
> > continue;
> > retval = 1;
> > break;
> > }
> > if (PIDTYPE_PGID == PIDTYPE_PID)
> > break;
> > }
> > } while (0);
> > and it's obviously wrong.
>
> Actually the test:
> > if (PIDTYPE_PGID == PIDTYPE_PID)
> > break;
> Is technically ok. The compiler should optimize it out instead of warning.
> Although seeing the unexpected corner case it gets us into I think it would
> be good to reconsider this test.

Agreed. This check uglifies the code to fix the theoretical problem.

But, actually do_each_pid_task() is a bit ugly even without this check.
Lets forget about this check for a moment.

Firstly, all hlist_for_each_entry() helpers should be "fixed", we don't
need the second argument. For example, hlist_for_each_entry_rcu() could
be

#define hlist_for_each_entry_rcu(pos, head, member) \
for (pos = (void*)(head)->first; rcu_dereference(pos) && ({ \
prefetch(((struct hlist_node*)pos)->next); \
pos = hlist_entry((void*)pos, typeof(*pos), member); 1; \
}); pos = (void*)(pos)->member.next)

So we can define

#define for_each_pid_task(pid, type, task)
for (task = pid ? (void*)((pid)->tasks + type)->first : NULL; \
rcu_dereference(task) && ({ \
prefetch(((struct hlist_node*)task)->next); \
task = hlist_entry((void*)task, typeof(*task), pids[type].node); 1; \
}); task = (void*)(task)->pids[type].node.next)

Which can be used just

for_each_pid_task(pid, type, task)
do_something(task);

Not that I think it is worth to do, though ;)

We can even restore the ugly special case for PIDTYPE_PID:

#define for_each_pid_task(pid, type, task)
for (task = pid ? (void*)((pid)->tasks + type)->first : NULL; \
rcu_dereference(task) && ({ \
prefetch(((struct hlist_node*)task)->next); \
task = hlist_entry((void*)task, typeof(*task), pids[type].node); 1; }) \
task = (type != PIDTYPE_PID) ? (void*)(task)->pids[type].node.next : NULL)


> > For do_each_pid_thread(), even this code snippet from fs/ioprio.c is broken
> > due to double do {} while expansion:
> > do_each_pid_thread(pgrp, PIDTYPE_PGID, p) {
> > ret = set_task_ioprio(p, ioprio);
> > if (ret)
> > break;
> > } while_each_pid_thread(pgrp, PIDTYPE_PGID, p);
> >
> > Any idea how to get rid of this issue?
>
> The double loop there is certainly an issue. I'm not quite convinced that
> the error handling is correct even with the break statement. But the
> break statement was written when the code was just a single loop, so the
> behavior is definitely not what we intended.

Yes,

> With respect to error handling and IO priorities can we fix the error handling
> by doing what we do when we send a signal to a process group? That is note
> that there was an error, finish processing all of the other processes and then
> return the error?

Personally, I think you are right. But then we should change IOPRIO_WHO_USER
accordingly, imho.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/