Re: [PATCH 1/1] move exit_task_namespaces() outside of exit_notify()

From: Eric W. Biederman
Date: Sat Apr 13 2013 - 21:53:11 EST


Oleg Nesterov <oleg@xxxxxxxxxx> writes:

> exit_notify() does exit_task_namespaces() after
> forget_original_parent(). This was needed to ensure that ->nsproxy
> can't be cleared prematurely, an exiting child we are going to
> reparent can do do_notify_parent() and use the parent's (ours) pid_ns.
>
> However, after 32084504 "pidns: use task_active_pid_ns in
> do_notify_parent" ->nsproxy != NULL is no longer needed, we rely
> on task_active_pid_ns().
>
> Move exit_task_namespaces() from exit_notify() to do_exit(), after
> exit_fs() and before exit_task_work().
>
> This solves the problem reported by Andrey, free_ipc_ns()->shm_destroy()
> does fput() which needs task_work_add(). And this allows us to simplify
> exit_notify(), we can avoid unlock/lock(tasklist) and we can change
> ->exit_state instead of PF_EXITING in forget_original_parent().

It feels like this ought to work. Certainly the pid namespace should not
need the late call, and the pid namespace was the motivating case for most
of the movement. However, we haven't called exit_task_namespaces() this
early since 2006.

Ugh. I goofed and used nsproxy->pid_ns in scm.c. Sigh. I will push a patch
to rename that field to nsproxy->childrens_pid_ns so it is harder to
make the mistake I just made.

None of the uses of nsproxy->net_ns look like they will be used on the
exit path.

The /proc/<pid>/ns/{uts,ipc,net,mnt,pid} files are fine, as nsproxy
itself is what becomes NULL and they test for that. Well, except that the
pid file uses task_active_pid_ns().

nsproxy->ipc_ns is isolated to files under ipc so it is probably fine.

Likewise the nsproxy->uts_ns uses look like they will be fine.

Likewise the nsproxy->mnt_ns uses look like they will be fine.

So in a quick skim through the uses, no problem cases stick out, nor
can I think of anything that would cause trouble. This looks like a
good patch.

Acked-by: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>

> Reported-by: Andrey Vagin <avagin@xxxxxxxxxx>
> Signed-off-by: Oleg Nesterov <oleg@xxxxxxxxxx>
>
> --- x/kernel/exit.c
> +++ x/kernel/exit.c
> @@ -649,7 +649,6 @@ static void exit_notify(struct task_stru
> 	 * jobs, send them a SIGHUP and then a SIGCONT. (POSIX 3.2.2.2)
> 	 */
> 	forget_original_parent(tsk);
> -	exit_task_namespaces(tsk);
>
> 	write_lock_irq(&tasklist_lock);
> 	if (group_dead)
> @@ -795,6 +794,7 @@ void do_exit(long code)
> 	exit_shm(tsk);
> 	exit_files(tsk);
> 	exit_fs(tsk);
> +	exit_task_namespaces(tsk);
> 	exit_task_work(tsk);
> 	check_stack_usage();
> 	exit_thread();