Re: [PATCH] Update comment on find_task_by_pid_ns

From: Andrew Morton
Date: Tue Feb 09 2010 - 17:09:00 EST


On Tue, 9 Feb 2010 06:42:45 +0900
Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:

> OK. I updated description.
>
> As of 2.6.32, the users below are missing rcu_read_lock().
>
> Users missing rcu_read_lock() when calling find_task_by_vpid():
>
> SYSCALL_DEFINE3(ioprio_set) in fs/ioprio.c
> SYSCALL_DEFINE2(ioprio_get) in fs/ioprio.c
> cap_get_target_pid() in kernel/capability.c

Actually, cap_get_target_pid() uses rcu_read_lock() and doesn't take
tasklist_lock.

> audit_prepare_user_tty() in kernel/audit.c
> audit_receive_msg() in kernel/audit.c
> check_clock() in kernel/posix-cpu-timers.c
> posix_cpu_timer_create() in kernel/posix-cpu-timers.c
> SYSCALL_DEFINE3(setpriority) in kernel/sys.c
> SYSCALL_DEFINE2(getpriority) in kernel/sys.c
> SYSCALL_DEFINE2(setpgid) in kernel/sys.c
> SYSCALL_DEFINE1(sched_getscheduler) in kernel/sched.c
> SYSCALL_DEFINE2(sched_getparam) in kernel/sched.c
> sched_setaffinity() in kernel/sched.c
> sched_getaffinity() in kernel/sched.c
> SYSCALL_DEFINE2(sched_rr_get_interval) in kernel/sched.c
> tomoyo_is_select_one() in security/tomoyo/common.c
> tomoyo_read_pid() in security/tomoyo/common.c
> SYSCALL_DEFINE6(move_pages) in mm/migrate.c
> SYSCALL_DEFINE4(migrate_pages) in mm/mempolicy.c
> find_process_by_pid() in arch/mips/kernel/mips-mt-fpaff.c
> pfm_get_task() in arch/ia64/kernel/perfmon.c
> cxn_pin_by_pid() in arch/frv/mm/mmu-context.c
>
> Users missing rcu_read_lock() when calling find_task_by_pid_ns():
>
> rest_init() in init/main.c
> getthread() in kernel/kgdb.c
> mconsole_stack() in arch/um/drivers/mconsole_kern.c
>
> What should we do? Add rcu_read_lock()/rcu_read_unlock() to each
> caller? Or add rcu_read_lock()/rcu_read_unlock() inside
> find_task_by_pid_ns()?

Putting rcu_read_lock() in the callee isn't a complete solution,
because the function would still return a task_struct* with no lock
held and no reference taken on it. So that pointer is useless to the
caller!

We could add a new function which looks up the task and then takes a
reference on it, inside suitable locks.  The caller would then use the
task_struct and then remember to call put_task_struct() to unpin it.
This prevents the task_struct from getting freed while it's being
manipulated, but it doesn't prevent fields within it from being altered
- that's up to the caller to sort out.
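
A rough sketch of what such a helper might look like follows. The name
find_get_task_by_vpid() and the example caller are made up for
illustration, and the struct, the RCU calls and the refcount helpers
are userspace stand-ins so the snippet compiles outside the kernel;
only the body of the helper is meant to resemble kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Userspace stand-ins (assumptions, not the real kernel definitions). */
struct task_struct {
	pid_t pid;
	int usage;	/* reference count; the kernel uses an atomic counter */
};

static int rcu_depth;	/* read-side nesting, tracked only for illustration */
static void rcu_read_lock(void)   { rcu_depth++; }
static void rcu_read_unlock(void) { assert(rcu_depth > 0); rcu_depth--; }

static struct task_struct task_table[] = { { 1, 1 }, { 42, 1 } };

static struct task_struct *find_task_by_vpid(pid_t nr)
{
	size_t i;

	assert(rcu_depth > 0);	/* caller must be in a read-side section */
	for (i = 0; i < sizeof(task_table) / sizeof(task_table[0]); i++)
		if (task_table[i].pid == nr)
			return &task_table[i];
	return NULL;
}

static void get_task_struct(struct task_struct *t) { t->usage++; }
static void put_task_struct(struct task_struct *t)
{
	assert(t->usage > 0);
	t->usage--;
}

/* Hypothetical helper: look the task up and pin it before unlocking. */
static struct task_struct *find_get_task_by_vpid(pid_t nr)
{
	struct task_struct *task;

	rcu_read_lock();
	task = find_task_by_vpid(nr);
	if (task)
		get_task_struct(task);
	rcu_read_unlock();
	return task;
}

/* Hypothetical caller: use the pinned task, then drop the reference. */
static int example_caller(pid_t nr)
{
	struct task_struct *task = find_get_task_by_vpid(nr);
	int ret;

	if (!task)
		return -1;	/* kernel code would likely return -ESRCH */
	ret = (int)task->pid;	/* task cannot be freed while pinned */
	put_task_struct(task);
	return ret;
}
```

The reference only keeps the task_struct memory alive; any field the
caller reads or writes still needs whatever locking that field requires.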

One fix is to go through all those callsites and add
rcu_read_lock()/rcu_read_unlock() around each lookup.  That kinda
sucks.  Perhaps writing the new function which returns a pinned
task_struct is better?
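
For comparison, the per-callsite fix would look roughly like the sketch
below: the lookup and every use of the returned pointer stay inside one
RCU read-side section. lookup_prio() is a made-up caller, and the
struct and RCU functions are userspace stand-ins so the snippet
compiles on its own; none of this is taken from an actual callsite.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Userspace stand-ins (assumptions, not the real kernel definitions). */
struct task_struct {
	pid_t pid;
	int prio;
};

static int rcu_depth;	/* read-side nesting, tracked only for illustration */
static void rcu_read_lock(void)   { rcu_depth++; }
static void rcu_read_unlock(void) { assert(rcu_depth > 0); rcu_depth--; }

static struct task_struct task_table[] = { { 1, 100 }, { 42, 120 } };

static struct task_struct *find_task_by_vpid(pid_t nr)
{
	size_t i;

	assert(rcu_depth > 0);	/* caller must be in a read-side section */
	for (i = 0; i < sizeof(task_table) / sizeof(task_table[0]); i++)
		if (task_table[i].pid == nr)
			return &task_table[i];
	return NULL;
}

/* Hypothetical callsite: the pointer never escapes the locked region. */
static int lookup_prio(pid_t who, int *prio)
{
	struct task_struct *p;
	int ret = -ESRCH;

	rcu_read_lock();
	p = find_task_by_vpid(who);
	if (p) {
		*prio = p->prio;	/* copy out what we need while locked */
		ret = 0;
	}
	rcu_read_unlock();
	return ret;
}
```

This only works when the caller does not sleep and does not need the
task after rcu_read_unlock(); otherwise it has to take a reference.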
