Re: [PATCH 2/2] pid_ns: Introduce ioctl to set vector of ns_last_pid's on ns hierarhy

From: Eric W. Biederman
Date: Tue May 02 2017 - 17:20:00 EST

Kirill Tkhai <ktkhai@xxxxxxxxxxxxx> writes:

> On 02.05.2017 19:33, Oleg Nesterov wrote:
>> sorry for delay, vacation...
>> On 04/28, Kirill Tkhai wrote:
>>> On 27.04.2017 19:22, Oleg Nesterov wrote:
>>>> Ah, OK, I didn't notice the ns->child_reaper check in pidns_for_children_get().
>>>> But note that it doesn't need tasklist_lock too.
>>> Hm, are there possible strange situations with memory ordering, when we see
>>> ns->child_reaper of already died ns, which was placed in the same memory?
>>> Do we have to use some memory barriers here?
>> Could you spell please? I don't understand your concerns...
>> I don't see how, say,
>> static struct ns_common *pidns_for_children_get(struct task_struct *task)
>> {
>> struct ns_common *ns = NULL;
>> struct pid_namespace *pid_ns;
>> task_lock(task);
>> if (task->nsproxy) {
>> pid_ns = task->nsproxy->pid_ns_for_children;
>> if (pid_ns->child_reaper) {
Oleg my apologies I missed this line earlier.
This does look like a valid way to skip read_lock(&tasklist_lock);
>> ns = &pid_ns->ns;
>> get_pid_ns(ns);
^^^^^^^^^^^^^ This needs to be:

>> }
>> }
>> task_unlock(task);
>> return ns;
>> }
>> can be wrong. It also looks more clean to me.
>> ->child_reaper is not stable without tasklist, it can be dead/etc, but
>> we do not care?
> I mean the following. We had a pid_ns1 with a child_reaper set. Then
> it became dead, and a new pid_ns2 were allocated in the same memory.

task->nsproxy->pid_ns_for_children is always changed with
task_lock(task) held. See switch_task_namespaces (used by unshare and
setns). This also gives us the guarantee that the pid_ns reference
won't be freed/reused in any for until task_lock(task) is dropped.

> A task on another cpu opens the pid_for_children file, and because
> of there is no memory ordering, it sees pid_ns1->child_reaper,
> when it opens pid_ns2.
> I forgot, what guarantees this situation is impossible? What guarantees,
> the renewed content of pid_ns2 on another cpu is seen not later, than
> we can't open it?