Re: [PATCH] Update comment on find_task_by_pid_ns

From: Tetsuo Handa
Date: Wed Feb 10 2010 - 20:22:09 EST


Andrew Morton wrote:
> > What should we do? Add rcu_read_lock()/rcu_read_unlock() to each
> > caller? Or add rcu_read_lock()/rcu_read_unlock() inside
> > find_task_by_pid_ns()?
>
> Putting rcu_read_lock() in the callee isn't a complete solution.
> Because the function would still be returning a task_struct* without
> any locking held and without taking a reference against it. So that
> pointer is useless to the caller!
>
> We could add a new function which looks up the task and then takes a
> reference on it, inside suitable locks. The caller would then use the
> task_struct and then remember to call put_task_struct() to unpin it.
> This prevents the task_struct from getting freed while it's being
> manipulated, but it doesn't prevent fields within it from being altered
> - that's up to the caller to sort out.

The code around "struct task_struct" is too complicated for me to follow
fully, but my understanding is that an exiting task is torn down as follows:

(1) tasklist_lock is acquired for writing.

(2) The exiting task's "struct task_struct" is removed from the task list.

(3) tasklist_lock is released.

(4) Wait for RCU grace period.

(5) kfree() members of "struct task_struct".

(6) kfree() "struct task_struct" itself.

If the above sequence is correct, I think that in

	rcu_read_lock();
	task = find_task_by_pid_ns(nr, ns);
	if (task)
		do_something(task);
	rcu_read_unlock();

do_something() can safely access all members of task without
read_lock(&tasklist_lock), except task->prev (I don't know the exact member
name) and task->usage, because the RCU read-side critical section keeps the
grace period in (4) from ending, so do_something() finishes its work
before (5).
I think we need to call find_task_by_pid_ns() with both
read_lock(&tasklist_lock) and rcu_read_lock() held, as in

	read_lock(&tasklist_lock);
	rcu_read_lock();
	task = find_task_by_pid_ns(nr, ns);
	if (task)
		do_something(task);
	rcu_read_unlock();
	read_unlock(&tasklist_lock);

only when do_something() wants to access task->prev or task->usage.

>
> One fix is to go through all those callsites and add the rcu_read_lock.
> That kinda sucks. Perhaps writing the new function which returns a
> pinned task_struct is better?
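
A pinned lookup along those lines might look like the sketch below. It is
only a userspace model so that it compiles on its own: struct task_struct,
rcu_read_lock()/rcu_read_unlock(), find_task_by_pid_ns(), get_task_struct()
and put_task_struct() are simplified stand-ins and not the kernel's
definitions, and find_get_task_by_pid_ns() is a hypothetical name.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Userspace stand-ins so the sketch compiles alone; NOT the kernel's code. */
struct pid_namespace;			/* opaque, unused in this model */

struct task_struct {
	int pid;
	atomic_int usage;		/* reference count, modelled on task->usage */
};

void rcu_read_lock(void) { }		/* stub: real call opens an RCU read-side section */
void rcu_read_unlock(void) { }		/* stub: real call closes it */

static struct task_struct task_table[] = {
	{ .pid = 1,  .usage = 1 },
	{ .pid = 42, .usage = 1 },
};

/* Stub lookup: a linear scan standing in for the kernel's pid lookup. */
struct task_struct *find_task_by_pid_ns(int nr, struct pid_namespace *ns)
{
	(void)ns;
	for (size_t i = 0; i < sizeof(task_table) / sizeof(task_table[0]); i++)
		if (task_table[i].pid == nr)
			return &task_table[i];
	return NULL;
}

void get_task_struct(struct task_struct *task)
{
	atomic_fetch_add(&task->usage, 1);
}

void put_task_struct(struct task_struct *task)
{
	assert(atomic_load(&task->usage) > 0);
	atomic_fetch_sub(&task->usage, 1);	/* the kernel would free the task at zero */
}

/* Hypothetical helper: look the task up and pin it before leaving RCU. */
struct task_struct *find_get_task_by_pid_ns(int nr, struct pid_namespace *ns)
{
	struct task_struct *task;

	rcu_read_lock();
	task = find_task_by_pid_ns(nr, ns);
	if (task)
		get_task_struct(task);	/* take the reference inside the read-side section */
	rcu_read_unlock();
	return task;			/* caller must call put_task_struct() when done */
}
```

A caller would do task = find_get_task_by_pid_ns(nr, ns), use the task, and
then call put_task_struct(task). As noted in the quoted text, this only keeps
the structure from being freed; it does not stop its fields from changing.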