Re: [PATCH 3/3] pids: Make it possible to clone tasks with given pids

From: Pavel Emelyanov
Date: Fri Nov 11 2011 - 10:58:27 EST


On 11/11/2011 07:25 PM, Oleg Nesterov wrote:
> On 11/11, Pavel Emelyanov wrote:
>>
>>>> Unless: you are using CLONE_NEWPID along with CLONE_CHILD_USEPIDS and
>>>> this child_tidptr array has only one pid (before zero pid).
>>>
>>> And, if you do clone(CLONE_NEWPID | CLONE_CHILD_USEPIDS), then
>>> new_ns->child_reaper == NULL (unless you pass "1" in child_tidptr[]) ?
>>>
>>>> So, could you please explain what I have missed?
>>>
>>> please ;) I guess I misread this patch completely. Help!
>>
>> This is how I plan to use this functionality.
>>
>> When creating an init of a container being restored I call
>>
>> pids[0] = 1;
>> pids[1] = 0;
>>
>> clone(CLONE_NEWPID | CLONE_CHILD_USEPIDS, &pids)
>
> Yep, this is clear. In this case everything works because the pid_ns
> has no pids (and thus ->last_pid == 0).
>
> But. Let me repeat the question, what if you do the same with
> pids[0] = 2 /* anything != 1 */ ? In this case we create the new
> pid_ns, but its ->child_reaper is NULL. Unless I missed something.

Hm... You're right here. I've missed the fact that in recent kernels
child_reaper is set under the pid == 1 condition (it used to be
clone_flags & CLONE_NEWPID).

How about I fix it by disallowing the simultaneous use of CLONE_NEWPID and
CLONE_CHILD_USEPIDS, and by checking for last_pid != 1 in set_pidmap()?
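
Roughly, the two checks would look like the userspace model below. This is
only a sketch to illustrate the idea, not the actual patch: the struct, the
function names, the CLONE_CHILD_USEPIDS value and the error codes are all
made up for the example, and I'm assuming that set_pidmap() should refuse
to work unless init (pid 1) is the only pid allocated so far.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* CLONE_NEWPID as defined in the kernel headers. */
#define SKETCH_CLONE_NEWPID        0x20000000UL
/* Hypothetical value: CLONE_CHILD_USEPIDS exists only in this patch set. */
#define SKETCH_CLONE_CHILD_USEPIDS 0x02000000UL

/* Minimal stand-in for the one pid namespace field the checks look at. */
struct pid_ns_model {
	int last_pid;
};

/* Check 1: refuse CLONE_NEWPID together with CLONE_CHILD_USEPIDS. */
static int check_clone_flags(unsigned long flags)
{
	unsigned long both = SKETCH_CLONE_NEWPID | SKETCH_CLONE_CHILD_USEPIDS;

	if ((flags & both) == both)
		return -EINVAL;	/* assumed error code */
	return 0;
}

/*
 * Check 2: done before anything else. Explicit pids are only accepted
 * while init is the only task the namespace has allocated a pid for,
 * i.e. while last_pid is still 1.
 */
static int set_pidmap_model(const struct pid_ns_model *ns, int pid)
{
	assert(ns != NULL);

	if (ns->last_pid != 1)
		return -EBUSY;	/* assumed error code */
	if (pid <= 1)
		return -EINVAL;	/* pid 1 is already taken by init */
	return pid;
}
```

With this, init is always created the normal way (so child_reaper gets set),
and only its children are cloned with explicit pids.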

>> Then this created "init" task will have to read pids
>> from image files and call
>>
>> pids[0] = <pid>
>> pids[1] = 0
>>
>> clone(CLONE_CHILD_USEPIDS, &pids);
>>
>> one by one. At this point the last_pid is still 0
>
> Yes, understood. set_pidmap() bypasses the last_pid logic.
>
> Clever hack^Wtrick ;)

:)

> Maybe this deserves a comment above "if (pid_ns->last_pid != 0)",
> and perhaps it would be cleaner to do this check before anything
> else.

OK, will fix this.

> Hmm. It seems, we can make a simpler patch to achieve the (roughly)
> same effect. Without touching copy_process/alloc_pid paths. What if
> we simply add PR_SET_LAST_PID? (or something else).
>
> In this case the new init (created normally) reads the pids from the image
> file and does prctl(PR_SET_LAST_PID, pid-1) before the next fork.
>
> What do you think?

This will make it impossible to fork() children on restore in parallel. And
I don't want to lose this ability :(
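
To spell out the problem as I see it: setting last_pid and then forking are
two separate steps, so two restorers working in the same pid namespace can
interleave them. Below is a toy userspace model of that. It is a sketch
under assumptions: PR_SET_LAST_PID is only the proposed name, the functions
are made up, and the real allocator also handles wraparound and pids that
are already in use, which I ignore here.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for a pid namespace: only the allocation cursor. */
struct pid_ns_model {
	int last_pid;
};

/* Models the proposed prctl(PR_SET_LAST_PID, pid). */
static void set_last_pid_model(struct pid_ns_model *ns, int pid)
{
	assert(ns != NULL);
	ns->last_pid = pid;
}

/* Models fork(): the child simply gets last_pid + 1. */
static int fork_model(struct pid_ns_model *ns)
{
	assert(ns != NULL);
	return ++ns->last_pid;
}

/* One restorer working alone: the child gets the wanted pid. */
static int restore_sequential(struct pid_ns_model *ns, int want)
{
	set_last_pid_model(ns, want - 1);
	return fork_model(ns);
}

/*
 * Two restorers, A and B, in the same namespace. Both set last_pid
 * before either of them forks, so B's value overwrites A's.
 */
static void restore_interleaved(struct pid_ns_model *ns,
				int want_a, int want_b,
				int *got_a, int *got_b)
{
	set_last_pid_model(ns, want_a - 1);	/* A prepares */
	set_last_pid_model(ns, want_b - 1);	/* B prepares, A's value is lost */
	*got_a = fork_model(ns);		/* A forks, gets B's pid */
	*got_b = fork_model(ns);		/* B forks, gets the pid after it */
}
```

In the interleaved case neither child ends up with the pid from its image,
so the restorers would have to serialize around every fork. Passing the pid
to clone() itself keeps the "choose pid" and "create task" steps together.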

> Oleg.